Cross-Cluster Image Promotion Techniques

Many organizations decide to have multiple container clusters to segregate different environments. This leads to the problem of how to move container images created in one cluster to another cluster.

The need to move images across clusters typically arises when implementing a promotion process in which the next environment for a given app is not in the same cluster as the current environment. This situation is common regardless of the container cluster manager and delivery pipeline being used. In this article, I will assume OpenShift as the container cluster manager and Jenkins as the delivery pipeline tool.

The canonical workflow needed to move images across clusters can be described by the below picture:

This workflow can be implemented by the following list of commands:
1. docker login <source_reg>
2. docker pull <source_pull_spec>
3. docker tag <source_pull_spec> <dest_pull_spec>
4. docker login <dest_reg>
5. docker push <dest_pull_spec>
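The five steps above can be sketched as a small shell function. This is only an illustration: the names run, registry_of and promote_image and the DRY_RUN switch are hypothetical, and the sketch assumes that every pull spec starts with an explicit registry host, so that the registry is everything before the first slash.

```shell
#!/bin/sh
# Sketch only: promote an image using a local docker daemon.
# run, registry_of, promote_image and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assumption: the pull spec starts with the registry host (and optional port),
# so the registry is everything before the first "/".
registry_of() {
  echo "${1%%/*}"
}

# promote_image <source_pull_spec> <dest_pull_spec>
# Runs the five steps in order and stops at the first failure.
promote_image() {
  src="$1"
  dest="$2"
  run docker login "$(registry_of "$src")" &&
  run docker pull "$src" &&
  run docker tag "$src" "$dest" &&
  run docker login "$(registry_of "$dest")" &&
  run docker push "$dest"
}
```

Setting DRY_RUN=1 before calling promote_image prints the five docker commands instead of executing them, which is a cheap way to check the pull specs before touching either registry.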

What we need to make this workflow work is a docker server that can connect to both the source and the destination registry. The docker server must be configured with either the insecure-registry flag or the correct certificates in order to talk to the registries. We also need the ability to authenticate to and pull from the source registry, and to authenticate to and push to the destination registry.
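The certificate option can be sketched as follows, assuming the docker daemon's convention of reading per-registry CA certificates from /etc/docker/certs.d/<registry host:port>/ca.crt. The function name trust_registry_ca and its optional third argument are hypothetical and exist only to make the sketch easy to try out against a scratch directory.

```shell
#!/bin/sh
# Sketch only: make a docker daemon trust the CA that signed a registry's
# certificate. trust_registry_ca is a hypothetical helper name.

# trust_registry_ca <registry_host[:port]> <ca_file> [certs_root]
# certs_root defaults to the directory the docker daemon reads.
trust_registry_ca() {
  registry="$1"
  ca_file="$2"
  certs_root="${3:-/etc/docker/certs.d}"
  mkdir -p "$certs_root/$registry" &&
  cp "$ca_file" "$certs_root/$registry/ca.crt"
}
```

This would have to be done once for the source registry and once for the destination registry on the machine that runs the docker server.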

If we want to automate the process we can assume that the previous list of commands will be issued as one of the steps of our delivery pipeline.

The pipeline steps should look like:

stage 'pull'
sh 'oc login -p <password> -u <username> --insecure-skip-tls-verify=true <source-cluster>'
sh 'docker login -p `oc whoami -t` -u `oc whoami` <source-registry>'
sh 'docker pull <source-pull-spec>'
stage 'push'
sh 'docker tag <source-pull-spec> <dest-pull-spec>'
sh 'oc login -p <password> -u <username> --insecure-skip-tls-verify=true <dest-cluster>'
sh 'docker login -p `oc whoami -t` -u `oc whoami` <dest-registry>'
sh 'docker push <dest-pull-spec>'

There are several ways to implement this architecture; I will present three of them.

External Pipeline Tool and Docker Server

The pipeline tool is installed outside of OpenShift and has access to a local docker server (via the Unix socket, which is the default).

It is up to the system administrator of the external server to set up the docker server correctly.
Typically in these kinds of setups, the CI/CD tool is installed together with the docker daemon.
This article presents a step-by-step tutorial on how to set up this configuration.

This approach works fine, except that significant pieces of our system are not running in OpenShift. From a strategic standpoint this is an issue, because the roadmap for container adoption should include running the CI/CD pipeline inside the container cluster.

Internal Pipeline Tool and Docker Running in a Pod

In this implementation, the pipeline tool runs inside OpenShift as a pod and it connects to a docker server also running as a pod.

You can create the docker pod with this:

oc create -f https://raw.githubusercontent.com/raffaelespazzoli/containers-quickstarts/dind/dind/docker-is.yaml
oc create serviceaccount docker
oc adm policy add-scc-to-user privileged system:serviceaccount:`oc project -q`:docker
oc create -f https://raw.githubusercontent.com/raffaelespazzoli/containers-quickstarts/dind/dind/docker-dc.yaml
oc expose dc docker

Notice that this way of creating a docker pod is not officially supported in OpenShift and uses an Alpine-based image from Docker Hub. If you want to use this method, you will have to make sure you can support it.

If you use Jenkins as your pipeline tool, you can create a docker-jenkins-slave to be able to communicate to the docker server pod as follows:

oc new-build --code=https://github.com/raffaelespazzoli/containers-quickstarts#dind --context-dir=dind/jenkins-docker --strategy=docker --name=docker-jenkins-slave

You will also have to configure the docker-jenkins-slave here:

At this point you can code the pipeline as follows:

node('docker-jenkins-slave') {

  stage 'pull'
  sh 'oc login -p <pwd> -u <user> --insecure-skip-tls-verify=true <source cluster>'
  sh 'docker -H docker:2375 login -p `oc whoami -t` -u admin -e me@me.com <source registry>'
  sh 'docker -H docker:2375 pull <source pull spec>'

  stage 'push'
  sh 'oc login -p <pwd> -u <user> --insecure-skip-tls-verify=true <dest cluster>'
  sh 'docker -H docker:2375 login -p `oc whoami -t` -u admin -e me@me.com <dest registry>'
  sh 'docker -H docker:2375 tag <source pull spec> <dest pull spec>'
  sh 'docker -H docker:2375 push <dest pull spec>'
}

Notice that we have to define a node with the same name as the pod definition in the Jenkins configuration to make sure that the steps will be executed in the container that we have prepared with the docker client.

Notice also that now all the docker commands have the -H option which specifies where the docker server is.

With this approach, all our systems run in OpenShift. The issue with this approach is that the docker storage of the docker pod will tend to grow indefinitely, so a strategy should be devised to prevent it from filling the disk. Because we have used emptyDir for the docker storage, deleting the pod so that it gets recreated is enough to reclaim the space.
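As a complement to recreating the pod, the pipeline could remove the images it has just promoted. The following is a minimal sketch of that idea and is not part of the original setup: the names run, cleanup_promoted_image, DOCKER_POD_ADDR and DRY_RUN are hypothetical, and docker:2375 is the address of the docker service used in the pipeline above.

```shell
#!/bin/sh
# Sketch only: delete promoted images from the docker pod after the push.
# run, cleanup_promoted_image, DOCKER_POD_ADDR and DRY_RUN are hypothetical.

DOCKER_POD_ADDR="${DOCKER_POD_ADDR:-docker:2375}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# cleanup_promoted_image <source_pull_spec> <dest_pull_spec>
# Removes both tags so the image layers can be freed from the pod's storage.
cleanup_promoted_image() {
  run docker -H "$DOCKER_POD_ADDR" rmi "$1" "$2"
}
```

Calling cleanup_promoted_image as the last step of the push stage keeps the storage roughly constant between runs, at the cost of pulling every layer again on the next promotion.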

Use an OpenShift Build as a Means to Move Images

It is possible to trick an OpenShift build config into effectively becoming a means for moving images. Here is an example:

oc new-build --strategy=docker --dockerfile='FROM <source pull spec>' --to-docker=true --to=<dest-pull-spec> --name=image-mover

In this case, the docker server used for the move will be one of the docker servers running in the OpenShift nodes.

At first sight, this approach may look even more elegant than the previous one. In reality, because the build can be spawned on any of the cluster nodes, all the nodes will have to be configured to talk to the source registry (in this example). This type of configuration is usually done at cluster setup, so this approach requires a good amount of planning.
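Once the image-mover build config exists, each later promotion amounts to running that build again. A minimal sketch of how a pipeline step might do that is shown here; the names run, move_image and DRY_RUN are hypothetical, and the sketch assumes the build config already has whatever credentials it needs to push to the destination registry.

```shell
#!/bin/sh
# Sketch only: trigger the image-moving build and wait for its result.
# run, move_image and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# move_image [build_config_name]
# --wait makes oc return a non-zero exit code if the build fails,
# so a failed move also fails the pipeline step.
move_image() {
  run oc start-build "${1:-image-mover}" --wait
}
```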

All of the approaches presented above have pros and cons, but they all share a common issue: having access to a docker server is essentially equivalent to being root on the server (or pod) running that docker server.

Delivery pipelines are usually written by developers, so this setup is equivalent to giving root access on a machine to all the developers who are authorized to write pipelines. For many enterprises this represents a risk that they cannot tolerate. Although there are ways to minimize the risk in all of the above approaches, the following approach removes it altogether.

Overcoming the Security Issue

A registry exposes a REST API. We don’t need the docker server to consume that API; we can consume it directly. Removing the docker server from the picture removes the risk. There are tools such as skopeo that can talk to docker registries directly and perform download and upload operations.
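To illustrate what consuming the API directly means, here is a minimal sketch that fetches an image manifest with curl through the Docker Registry HTTP API V2 endpoint /v2/<name>/manifests/<reference>. The function names are hypothetical, the pull spec is assumed to have the form registry/name:tag, and whether a registry accepts user:token basic credentials on this endpoint (rather than requiring a separate token handshake) is an assumption that depends on the registry.

```shell
#!/bin/sh
# Sketch only: read an image manifest straight from a registry, no docker
# daemon involved. manifest_url and fetch_manifest are hypothetical names.

# manifest_url <pull_spec>
# Assumption: the pull spec looks like <registry>/<name>:<tag>.
manifest_url() {
  registry="${1%%/*}"   # everything before the first "/"
  rest="${1#*/}"        # <name>:<tag>
  name="${rest%:*}"     # strip the trailing :<tag>
  tag="${rest##*:}"     # keep only the tag
  echo "https://$registry/v2/$name/manifests/$tag"
}

# fetch_manifest <pull_spec> <user:token>
# Assumption: the registry accepts these credentials as HTTP basic auth.
fetch_manifest() {
  curl -sS -u "$2" \
    -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
    "$(manifest_url "$1")"
}
```

Moving an image also means transferring every blob referenced by the manifest and uploading the manifest to the destination, which is the work that a tool such as skopeo wraps behind a single command.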

The Jenkins pipeline would look like this:

node('skopeo-jenkins-slave') {
  stage 'move-image'
  // A single sh step is used because shell variables do not survive across separate sh steps
  sh '''
    oc login -p <pwd> -u <user> --insecure-skip-tls-verify=true <source cluster>
    src_creds=`oc whoami`:`oc whoami -t`
    oc login -p <pwd> -u <user> --insecure-skip-tls-verify=true <dest cluster>
    dest_creds=`oc whoami`:`oc whoami -t`
    skopeo --tls-verify=false copy --dest-creds $dest_creds --src-creds $src_creds <src-pull-spec> <dest-pull-spec>
  '''
}

If you are running Jenkins inside OpenShift you can create a skopeo-jenkins-slave as follows:

oc new-build --code=https://github.com/raffaelespazzoli/containers-quickstarts#dind --context-dir=dind/jenkins-skopeo --strategy=docker --name=skopeo-jenkins-slave

You will have to configure your Jenkins cloud plugin as you did for the docker-jenkins-slave.

I personally believe that this last approach is preferable in many situations, mainly because it solves the security problem introduced by docker.

Notice that skopeo as a tool is still maturing, but anyone can write a docker registry client, so I can foresee that in the future we will have several tools like skopeo.

  • Thanks for this great article!

    BTW, shouldn’t step 5 read “docker push…”?

  • Nick Strugnell

    Thanks for the article. A pattern that I see frequently with customers is a dev/test cluster, and a separate production cluster. In this scenario, one can dispense with the registry altogether in the production cluster, and simply follow an imagestream (tagged as ‘production’) from the registry in the dev/test cluster. Promoting to production is then simply a tagging operation and the deployment configs in the production cluster will pull from the registry in the dev/test cluster.