Using persistent volumes with docker as a Developer on OpenShift

We understand that developers want to try out OpenShift on their own laptop or desktop PC. To that end, we have made available an All-In-One virtual machine image. It is based on OpenShift Origin, and we have just updated it to version 1.1.3 of OpenShift Origin.

For this release of the VM image we have also updated the configuration to pre-provision some persistent volumes. This means that you can now easily experiment with persistent data, allowing you to work with more than just 12-factor or cloud-native applications.

A few examples of how you might make use of persistent volumes are:

  • As shared file system storage for large media data files that you then want to provide for download via an HTTP web server running on many containers.
  • As shared file system storage for the upload directory of a web application for the processing or sharing of data files.
  • To provide persistent storage for working with files in an interactive data analytics and visualisation environment such as Jupyter Notebook.
  • For holding the database files when running a SQL database such as MySQL or PostgreSQL, or a NoSQL database such as MongoDB.

Listing the available volumes

Setting up persistent volumes would normally be done by the administrator of your OpenShift environment. You may therefore not know what persistent volumes you have available to use.

The All-In-One VM image has been configured so that you can list the persistent volumes available to you with the oc get pv command. For the All-In-One VM image this command shows that we have pre-provisioned the following persistent volumes:

NAME      LABELS    CAPACITY   ACCESSMODES   STATUS      CLAIM     REASON    AGE
pv01      <none>    1Gi        RWO,RWX       Available                       3h
pv02      <none>    2Gi        RWO,RWX       Available                       3h
pv03      <none>    3Gi        RWO,RWX       Available                       3h
pv04      <none>    4Gi        RWO,RWX       Available                       3h
pv05      <none>    5Gi        RWO,RWX       Available                       3h

In a typical setup you are more likely to see multiple volumes of the same size. For the All-In-One VM image though, we have created a set of persistent volumes of different sizes. This makes it easier to see what happens when you make a persistent volume claim for a size that does not exactly match what is available.

When using these persistent volumes you will actually find that the configuration of the VM itself doesn’t impose file system quotas on how much can be stored. As a result, although the persistent volume has a size associated with it for the purposes of satisfying volume claims, you can actually store more data than that and are simply limited by the size of the disk volume allocated to the VM. In a production system an administrator would have set up file system quotas for any volumes to match the advertised amount.

Adding a volume to an application

There are various ways that persistent volumes can be associated with an application. To demonstrate in this post how volumes work we are going to use the oc volume command. Before we can do that though, we need to create an application to add our persistent volume to. For this we are going to deploy an instance of the nginx web server to host some files for download.

$ oc new-project download-server
Now using project "download-server" on server "".

You can add applications to this project with the 'new-app' command. For example, try:

    $ oc new-app centos/ruby-22-centos7~

to build a new hello-world application in Ruby.

$ oc new-app nginx
--> Found Docker image 6e36f46 (8 days old) from Docker Hub for "nginx"

    * An image stream will be created as "nginx:latest" that will track this image
    * This image will be deployed in deployment config "nginx"
    * Ports 443/tcp, 80/tcp will be load balanced by service "nginx"
      * Other containers can access this service through the hostname "nginx"
    * WARNING: Image "nginx" runs as the 'root' user which may not be permitted by your cluster administrator

--> Creating resources with label app=nginx ...
    imagestream "nginx" created
    deploymentconfig "nginx" created
    service "nginx" created
--> Success
    Run 'oc status' to view your app.

$ oc expose service nginx
route "nginx" exposed

Accessing the running instance of the nginx web server at the URL of the exposed route, we are greeted with the default ‘Welcome to nginx!’ home page. This is because we haven’t yet provided it with any files to actually host.

We could create a new Docker image derived from the nginx image and add our files to that, but then the files would be a part of the Docker image itself and we would end up with a copy on every node. We would also need to rebuild and redeploy our Docker image to make any updates.

The quicker and easier alternative to creating a new Docker image to hold the files is to use a persistent volume. All we need to work out is where in the Docker image nginx is looking for the files it is hosting. Looking at the documentation for the nginx Docker image, we find it is using the directory /usr/share/nginx/html. Therefore, all we need to do is mount our persistent volume at that location.

To claim a suitable persistent volume and mount it, we can run the oc volume command:

$ oc volume dc/nginx --add --claim-size 512M --mount-path /usr/share/nginx/html --name downloads

In this case we have said that we need a persistent volume which is at least 512M in size and that we want it mounted at the path /usr/share/nginx/html within the running containers of our application. We also name our volume to make it easier to detach the volume from the application if we need to.

We can see that the persistent volume has been allocated by using oc get pv.

$ oc get pv
NAME      LABELS    CAPACITY   ACCESSMODES   STATUS      CLAIM                       REASON    AGE
pv01      <none>    1Gi        RWO,RWX       Bound       download-server/pvc-rkly7             10h
pv02      <none>    2Gi        RWO,RWX       Available                                         10h
pv03      <none>    3Gi        RWO,RWX       Available                                         10h
pv04      <none>    4Gi        RWO,RWX       Available                                         10h
pv05      <none>    5Gi        RWO,RWX       Available                                         10h

You will note that there wasn’t actually a persistent volume available of the exact size we requested. Instead, the best fit that was available was used, and so we were granted a claim on the persistent volume of size 1Gi.
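
The matching described above can be sketched in shell. This is only an illustration of a smallest-volume-that-fits rule under stated assumptions, not the actual binding logic used by OpenShift. The volume list is hard-coded from the oc get pv output shown earlier, the helper names to_bytes and best_fit are hypothetical, and sizes are converted on the assumption that the M suffix means 10^6 bytes while the Mi and Gi suffixes mean 2^20 and 2^30 bytes.

```shell
#!/usr/bin/env bash
# Illustration only: pick the smallest volume that satisfies a claim.
# to_bytes and best_fit are hypothetical helpers, not oc commands.

# Convert a size such as 512M or 1Gi to bytes.
# Assumes M and G are decimal units, Mi and Gi are binary units.
to_bytes() {
    case "$1" in
        *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
        *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
        *G)  echo $(( ${1%G} * 1000 * 1000 * 1000 )) ;;
        *M)  echo $(( ${1%M} * 1000 * 1000 )) ;;
        *)   echo "$1" ;;
    esac
}

# Name and capacity of each volume, copied from the oc get pv listing.
VOLUMES="pv01 1Gi
pv02 2Gi
pv03 3Gi
pv04 4Gi
pv05 5Gi"

# Print the name of the smallest volume at least as large as the request.
# Prints an empty line when no volume is large enough.
best_fit() {
    local want best_name="" best_size=0 name cap size
    want=$(to_bytes "$1")
    while read -r name cap; do
        size=$(to_bytes "$cap")
        if [ "$size" -ge "$want" ]; then
            if [ -z "$best_name" ] || [ "$size" -lt "$best_size" ]; then
                best_name=$name
                best_size=$size
            fi
        fi
    done <<< "$VOLUMES"
    echo "$best_name"
}

best_fit 512M     # prints pv01
best_fit 1500Mi   # prints pv02
```

Under these assumptions a request for 512M is 512,000,000 bytes, which is less than the 1,073,741,824 bytes of a 1Gi volume, so pv01 is the smallest volume that can satisfy it.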

If you want to get a list of all the persistent volume claims you have, you can use oc get pvc.

$ oc get pvc
pvc-rkly7   <none>    Bound     pv01      1Gi        RWO,RWX       6m

To see which applications the persistent volumes are associated with, you can use the oc volume dc --all command.

$ oc volume dc --all
  pvc/pvc-rkly7 (allocated 1GiB) as downloads
    mounted at /usr/share/nginx/html

You will see here the name downloads, which we gave the volume when it was assigned to the application. If you hadn’t named the volume when adding it, this is also where you can find the automatically generated name, which you would later need if you wanted to detach the volume from the application.

Copying files into the volume

We have mounted the volume into the application but it is currently empty. We therefore need to populate it with some files. To do that we are going to copy them in using the oc rsh command and tar.

To do this we first need to identify one of the pods in which our application is running. We can do this using oc get pods.

$ oc get pods
nginx-12-6lfbo   1/1       Running   0          11m

We only have one pod, but if you had already scaled up your application, you could pick any pod as all will mount the same persistent volume.

We can now copy some files into the persistent volume using the command:

$ tar cf - . | oc rsh nginx-12-6lfbo tar xofC - /usr/share/nginx/html

We didn’t use the oc rsync command in this case, although that is what you would normally try to use. This is because the nginx image doesn’t contain the rsync binary. We will be addressing that problem though, so down the track you should be able to use oc rsync.
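
To see what the tar pipeline is doing without needing a cluster, the same pattern can be tried locally between two directories. This is a sketch which assumes a tar that accepts this old-style option bundling, such as GNU tar. The temporary directories and the file name index.html are made up for the illustration, and the second tar stands in for the one that oc rsh runs inside the container.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical source and destination directories for the illustration.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "hello" > "$src/index.html"

# First tar:  c     = create an archive
#             f -   = write the archive to standard output
# Second tar: x     = extract
#             o     = do not restore the original file ownership
#             f -   = read the archive from standard input
#             C DIR = change to DIR before extracting
(cd "$src" && tar cf - .) | tar xofC - "$dest"

# List what arrived in the destination directory.
ls "$dest"
```

The bundled options f and C each take an argument, and the arguments are consumed in the same order as the letters appear, which is why the - comes before the directory.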

With some files copied into the volume, we can then test it using curl.

$ curl -o /tmp/Nginx.HTTP.Server.pdf
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 7341k  100 7341k    0     0  54.3M      0 --:--:-- --:--:-- --:--:-- 54.3M

To test that the volume is being mounted into all containers when we scale up the application, we can increase the number of replicas using oc scale.

$ oc scale dc/nginx --replicas 5
deploymentconfig "nginx" scaled

$ oc get pods
nginx-12-0q8cd   1/1       Running   0          1m
nginx-12-1o7qi   1/1       Running   0          1m
nginx-12-6lfbo   1/1       Running   0          51m
nginx-12-6rxa3   1/1       Running   0          1m
nginx-12-zdowf   1/1       Running   0          1m

Keep running curl to confirm that it always works. Alternatively, use oc rsh to start an interactive shell in each container, then use the df command to verify that the volume is mounted and check the contents of the directory /usr/share/nginx/html.
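
The manual check across containers can be wrapped in a small loop. This is a sketch which assumes that you are logged in with oc, that every running pod in the current project belongs to the nginx application, and that the image provides the df command. The function name check_volume_mounts is hypothetical. It relies on the --no-headers option of oc get pods so that only pod rows are processed.

```shell
#!/usr/bin/env bash

# Run df against the given mount path in every running pod
# of the current project. Hypothetical helper, not an oc command.
check_volume_mounts() {
    local mount_path=$1 pod
    # A for loop is used rather than "while read" so that oc rsh
    # cannot consume the list of pod names from standard input.
    for pod in $(oc get pods --no-headers | awk '$3 == "Running" {print $1}'); do
        echo "== $pod"
        oc rsh "$pod" df -h "$mount_path"
    done
}
```

It would be invoked as check_volume_mounts /usr/share/nginx/html, and each pod should report the same mounted file system for that path.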

Detaching volumes from applications

A volume does not need to remain associated with the one application. It is possible to remove the volume association.

Because we named the volume downloads when we mounted it against the application, we can remove it by running:

$ oc volume dc/nginx --remove --name downloads

$ oc volume dc --all

As you can see, the oc volume dc --all command no longer shows the volume as associated with the application. This doesn’t mean that you have relinquished the volume completely. The volume claim is still held against your project.

$ oc get pvc
pvc-rkly7   <none>    Bound     pv01      1Gi        RWO,RWX       1h

In fact we can now mount the volume back into the same application, or a different application. This time though, as we already have a volume claim, we need to reference it by the volume claim name.

$ oc volume dc/nginx --add --type pvc --claim-name pvc-rkly7 --mount-path /usr/share/nginx/html --name downloads

Very importantly, if you use the volume again, all the files you originally copied into the volume are still there. Detaching the volume from the container does not cause the data it contains to be lost.

Releasing a persistent volume

If you are finished with the persistent volume and don’t care about the data it contains, then not only do you need to detach the volume from the application, you also need to delete the persistent volume claim.

$ oc volume dc/nginx --remove --name downloads

$ oc delete pvc/pvc-rkly7
persistentvolumeclaim "pvc-rkly7" deleted

$ oc get pvc

$ oc get pv
NAME      LABELS    CAPACITY   ACCESSMODES   STATUS      CLAIM     REASON    AGE
pv01      <none>    1Gi        RWO,RWX       Available                       11h
pv02      <none>    2Gi        RWO,RWX       Available                       11h
pv03      <none>    3Gi        RWO,RWX       Available                       11h
pv04      <none>    4Gi        RWO,RWX       Available                       11h
pv05      <none>    5Gi        RWO,RWX       Available                       11h

Because the persistent volumes in the All-In-One image are marked to be recycled, a volume released in this way is then automatically scrubbed of any data and made available for use by others.
