Deploying a PostgreSQL Pod in OpenShift V3

Crunchy Data Solutions

Today we have a guest post from our partner Crunchy Data Solutions.

At Crunchy Data Solutions (Crunchy) we were excited to hear the recent announcement from the OpenShift team regarding the new public Origin repo integrating the work Red Hat has been doing for over twelve months in OpenShift Origin and related projects like Docker, Kubernetes, GearD and Project Atomic.

Crunchy has been working with the OpenShift team for some time and has witnessed firsthand the value of combining PostgreSQL, a leading open source object-relational database, with OpenShift by working with enterprise users as they deployed PostgreSQL on OpenShift v2.

To advance these enterprise deployments of PostgreSQL on the OpenShift platform, Crunchy previously collaborated with the OpenShift team to build a High Availability PostgreSQL Cartridge and a standalone PostgreSQL 9.3 Cartridge for OpenShift v2 and, in anticipation of OpenShift v3, recently announced a private beta for Crunchy PostgreSQL Manager, a Docker-based tool enabling enterprises to build High Availability PostgreSQL Clusters.

As we began our deep dive into the new OpenShift Origin repo and Kubernetes orchestration, we wanted to pass along what we have learned so far about provisioning a simple PostgreSQL database as a Kubernetes Pod in OpenShift.

Specifically, in this blog post, we will provide an example of how to deploy a PostgreSQL Pod, running as a Docker container, in OpenShift v3 and end up with a deployed solution.

Background

For those of you not familiar with the fundamentals of OpenShift v3 and Kubernetes, we recommend the excellent blog post from Red Hat’s Ben Parees. Borrowing a few of the primary concepts from Ben’s blog:

Docker image: Defines a file system for running an isolated Linux process (typically an application)

Docker container: Running instance of a Docker image with its own isolated filesystem, network, and process spaces.

Pod: Kubernetes object that groups related Docker containers that need to share network, filesystem or memory together for placement on a node. Multiple instances of a Pod can run to provide scaling and redundancy.

Service: Kubernetes object that provides load balanced access to all instances of a Pod from another container in another Pod.

Template: An OpenShift Template is a set of instructions, defined in JSON, that you pass into OpenShift via the command line and that allows you to parameterize an application definition.

PostgreSQL Pod Definition

The first step is to define an OpenShift Template for the basic PostgreSQL Pod.

We will use the Template file standalone-pod-template.json, which defines the basic PostgreSQL Pod when applied to OpenShift.

(In this example we will use three sample Template files to deploy the PostgreSQL database. Each of the Templates used in this example can be found here: Crunchy OpenShift v3 PostgreSQL Templates)

Key values for this first Template include:

hostDir: This is the local path to which PostgreSQL will write its data files. This directory needs to be created prior to running the Pod. Importantly, the directory needs to be created with the PostgreSQL user as its owner. Additionally, if you are running on a system with SELinux enabled, the directory needs the correct SELinux settings, applied as follows:

sudo mkdir /var/lib/pgsql/exampleuser
sudo chown postgres:postgres /var/lib/pgsql/exampleuser
sudo chcon -Rt svirt_sandbox_file_t /var/lib/pgsql/exampleuser

env: This is where the environment variables that will be passed into the Container are set. In this example, we use environment variables to pass a sample PostgreSQL user name and password. By convention, the PostgreSQL image will use these environment variables to set up the PostgreSQL database, user, and password. The created PostgreSQL database will be named the same as the user name.

labels: The label’s value is ‘crunchy-node’. This label will be referenced by the Service object as a means of locating a specific Pod.
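Because the Pod depends on the hostDir directory existing with the right owner, a quick check before creating the Pod can save some debugging. The following is only a minimal sketch: the function name check_data_dir is our own invention for illustration, and it assumes GNU stat (as found on RHEL, CentOS, and Fedora).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the example repo): confirm that a
# directory exists and is owned by the expected user.
# Usage: check_data_dir <directory> <expected-owner>
check_data_dir() {
  local dir="$1" expected_owner="$2" actual_owner
  if [ ! -d "$dir" ]; then
    echo "missing: $dir" >&2
    return 1
  fi
  # GNU stat: %U prints the name of the owning user
  actual_owner=$(stat -c '%U' "$dir")
  if [ "$actual_owner" != "$expected_owner" ]; then
    echo "wrong owner for $dir: $actual_owner (expected $expected_owner)" >&2
    return 2
  fi
  echo "ok: $dir is owned by $expected_owner"
}
```

For the directory used in this post, you would call it as: check_data_dir /var/lib/pgsql/exampleuser postgres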

To create the Pod, you enter the following command:

openshift kube process -c ./standalone-pod-template.json | openshift kube apply -c -

You can verify that OpenShift has created the Pod and that it is running with the following command:

openshift kube list pods

You can find the IP address of the PostgreSQL Pod by issuing the following command:

openshift kube list --json pods

Look for the “podIP” value in the JSON output.
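If you would rather not scan the JSON by eye, a small filter can pull the value out. This is a rough sketch that assumes the address appears in the output as a "podIP": "..." string pair; the function name extract_pod_ips is hypothetical, and a real JSON parser would be more robust than grep and sed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every podIP value found in the JSON text
# read from standard input, one address per line.
extract_pod_ips() {
  # First isolate each "podIP": "..." pair, then keep only the quoted value.
  grep -o '"podIP"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*"\([^"]*\)"$/\1/'
}
```

You could then run: openshift kube list --json pods | extract_pod_ips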

At this point, you can use the following PostgreSQL psql command to connect to the PostgreSQL database:

psql -h 172.17.0.4 -W -U exampleuser exampleuser

As specified in the Pod Template file, the password for this example is: ‘example’

You now have a working PostgreSQL 9.3.5 database up and running as a Kubernetes Pod!

Note also that the Pod definition exposes a local port (9000) on which you can connect. This local port is established for demonstration purposes only; typically you would not want to consume a local port like this. Connect as follows:

psql -p 9000 -h localhost -U exampleuser exampleuser

Service Definition

Now that your PostgreSQL database Pod is running, the next step is to define a Kubernetes Service that will act as a proxy to our PostgreSQL Pod.

Similar to the process of defining a Pod for the PostgreSQL database, we will begin by defining a Template for this Kubernetes Service using the Template file standalone-service-template.json.

The following keys are important to call out within this template:

selector: This value is used to find or select which pods the service will proxy. In this example, we specify the value of ‘crunchy-node’, the same as what we used as the label in the PostgreSQL pod template.

port: This is the port the Service will proxy. In this case we specify the standard PostgreSQL port, 5432.

We deploy the Service into OpenShift with the following command:

openshift kube process -c ./standalone-service-template.json | openshift kube apply -c -

You can see the deployed Service with the following command:

openshift kube list services
pg-service template=pg-service-template name=crunchy-node 172.121.17.1 5432

Consumer Pod Definition

Now that our PostgreSQL Pod and Service are running, we are ready to demonstrate how a consumer or client of the PostgreSQL Pod will identify the PostgreSQL database by environment values supplied by the Service.

Once again, as we did with the PostgreSQL Pod and the Service, we will begin by defining a Template for the Consumer Pod: standalone-consumer-template.json

Interesting values within this Template include:

env: This is a set of environment variables that store the same PostgreSQL user name and password as we used earlier in the PostgreSQL Pod Template. These values get passed into the Consumer Pod, which uses them to form the PostgreSQL connection string.

ports: The example uses the hostPort and containerPort settings to keep the demonstration simple. These settings bind a localhost port (12000) to the container’s port (12000). Typically you would not use the host port settings for a container, since they consume a dedicated local port.

You deploy the Consumer Pod with the following command:

openshift kube process -c ./standalone-consumer-template.json | openshift kube apply -c -

You can verify the Consumer Pod by running the following command:

openshift kube list pods

At this point, you can test the Consumer Pod by issuing the following curl command:

curl localhost:12000/test

You should see output from the command similar to this:

[ { "Name": "exampleuser" },
{ "Name": "postgres" },
{ "Name": "template0" },
{ "Name": "template1" },
{ "Name": "testdb" } ]

The Consumer Pod merely connects to the PostgreSQL database Pod using the service-provided credentials, performs a ‘select datname from pg_stat_database’ and returns the results of the query.

You might be asking, how does the consumer know how to connect to the PostgreSQL instance? When a Service is created within OpenShift, it causes environment variables to be created within the Kubernetes environment that describe the host and port to be used by consumers that want to reach a given Pod (e.g. the PostgreSQL Pod). The names of these environment variables are prefixed with the Service’s id value (e.g. pg-service), converted to upper case with hyphens replaced by underscores. In this example, the following environment variables are created by the Service:

PG_SERVICE_SERVICE_HOST
PG_SERVICE_SERVICE_PORT
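The mapping from Service id to variable name can be sketched in a few lines of shell. The function name service_env_name is ours, for illustration only; the upper-casing and hyphen replacement follow the variable names shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the environment variable name that a
# Service with the given id produces for the given suffix (HOST or PORT).
# Usage: service_env_name <service-id> <suffix>
service_env_name() {
  local service_id="$1" suffix="$2" prefix
  # Upper-case the id and replace hyphens with underscores
  prefix=$(printf '%s' "$service_id" | tr 'a-z-' 'A-Z_')
  printf '%s_SERVICE_%s\n' "$prefix" "$suffix"
}

service_env_name pg-service HOST
service_env_name pg-service PORT
```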

The PostgreSQL user name and password values are provided directly in the Consumer Pod’s template definition so that when the consumer pod is created, OpenShift creates the following environment variables:

PG_USERNAME
PG_PASSWORD

When the consumer pod is started, these four environment variables are passed into the pod’s environment and are used to form the connection string to PostgreSQL. See the startadmin.sh script for an example of how the environment variables are passed into the example code.
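As a rough illustration of how the four variables fit together, the sketch below assembles them into a PostgreSQL connection URI. It is not taken from startadmin.sh or from the Go consumer, which may build the string differently; the function name build_pg_uri is hypothetical, the database name is assumed to match the user name as described earlier, and the password is not URL-encoded.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble a PostgreSQL connection URI from the
# Service-provided host and port plus the template-provided credentials.
build_pg_uri() {
  # ${VAR:?} stops with an error if the variable is unset or empty
  local host="${PG_SERVICE_SERVICE_HOST:?}" port="${PG_SERVICE_SERVICE_PORT:?}"
  local user="${PG_USERNAME:?}" password="${PG_PASSWORD:?}"
  # The database is named after the user, as in the PostgreSQL Pod
  printf 'postgres://%s:%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$user"
}
```

Inside the consumer container, where the four variables are already set, the result could be passed straight to psql: psql "$(build_pg_uri)"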

Of particular interest, Kubernetes creates a special IP subnet (with routing rules) that it uses to route traffic through the Service(s). So inside the consumer pod, the connection to the PostgreSQL database pod is accomplished with the following command:

psql -h 172.121.17.1 -W -U exampleuser exampleuser

Notice in this case that the host IP address is the Kubernetes provided IP subnet address and NOT the directly accessible PostgreSQL Pod IP address! Kubernetes performs this network ‘magic’ to give services and service consumers a level of indirection to the actual service implementation (Pod).

Development Notes

Note that we do not run ssh within the example Pods; instead, we use the nsenter command to ‘get inside’ the running containers. We use this simple bash function for this purpose:

function dockerenter() {
  # Look up the host PID of the container's main process
  local thispid
  thispid=$(docker inspect --format "{{ .State.Pid }}" "$1")
  echo "$thispid"
  # Join that process's namespaces to get a shell inside the container
  sudo nsenter --mount --uts --ipc --net --pid --target "$thispid"
}

From within the Pods / Containers, you can see the running processes and the environment variables that were provided to the Pods / Containers. We have printed these environment variables out to /tmp/envvars.out in case you want to take a look at them.

Building the Consumer

The Consumer Pod is a simple golang REST server that listens on port 12000 and only has a single resource /test. You can build the binary by running the example/Makefile located in the sample code.

Building Images

The Docker images used for this example are provided in the GitHub repo listed at the end of this blog. To build the Docker images, run the Makefiles located within the crunchy-admin and crunchy-node directories. While these images are based upon the Red Hat RHEL 7 base image, they have also been tested against the CentOS 7 and Fedora 20 base images.

The crunchy-node image includes the open source PostgreSQL 9.3.5 server, client, and contrib modules. I’ve also included the procps-ng utility so that I can run ps commands from within the deployed container. This is useful for debugging but is not required.

The crunchy-admin image includes the example golang REST server used for the demonstration as well as the PostgreSQL client modules. The PostgreSQL client is useful for debugging but is not required.

Private Docker Repository

Note that for this example, I run my own private Docker registry at the following URL:

http://registry:5000

You will see this registry URL referenced in the JSON template files. Adjust it according to your own environment.

Teardown

At this point, you can tear down the Pods and Services you have deployed by issuing the following commands:

$ openshift kube delete pods/pg-standalone-1
$ openshift kube delete pods/pg-consumer-1
$ openshift kube delete services/pg-service

Conclusion

The combination of OpenShift, Kubernetes, and Docker offers a sophisticated PaaS foundation for the deployment scenarios that advanced PostgreSQL installations require. As described above, in a few relatively simple steps, you can experiment with running PostgreSQL on the OpenShift v3 platform. In a future blog post we will provide additional information regarding more complex PostgreSQL deployments.

The source code for this example is found at:

https://github.com/crunchyds/openshift-3-postgres-example

Categories
OpenShift Origin, PostgreSQL
  • Josh Berkus

    I don’t see anything in the above about volume management. That is, PostgreSQL’s files still seem to be stored on a local directory which is unmanaged by Kubernetes. Am I missing something?

    • Jeff McCormick

      volume mgmt is something the Openshift dev team has scheduled to work on in a future sprint is my understanding, right now, provisioning the directory that postgres writes to is a manual step.