Improving Build Time of Java Builds on OpenShift

Since we released OpenShift 3 back in July 2015, one of the most common questions I get from developers is how to get better build times for Java based builds. In this post, I will guide you through the process of speeding up Java Maven based builds, and will explain some alternatives to the approach that I’ll be showing.

As you might know, OpenShift 3 Enterprise provides Middleware Services (xPaaS), which is a set of Java based images for JBoss EAP, JBoss EWS (Tomcat), JBoss Fuse Integration Services, JBoss A-MQ, JBoss Decision Server and JBoss Data Grid. OpenShift Origin also provides additional JBoss based images for Wildfly, our application server community project. All these images are source-to-image (S2I) enabled, which means your application source code will be built (using Maven) and layered into the application container.

When working with Maven, it is very common to use a central artifact repository manager in your organization for centralizing and managing all the required and generated dependencies. This also isolates you from the real location of the artifacts on the Internet and provides some security mechanisms, amongst other features.

During my life as a developer and consultant, I’ve been working with Nexus Artifact Manager for this purpose. I will not say that it’s the best or worst, but only that it is the one most familiar to me. Because of that, I will be using it in my OpenShift install.

It is important to note that everything I will describe can be executed in OpenShift Enterprise or OpenShift Origin. The only requirement is that if you’re using the Middleware Services images, you should have the corresponding subscriptions for running them.

The first thing we need to do is to lay out our OpenShift architecture. I’ve decided to deploy Nexus as a service in OpenShift. For that purpose I have created a Nexus image (not supported) that I will be building and deploying internally in my OpenShift instance, in a project that I’ve called ci. This project name is important as it will be used to reference the nexus instance. It is part of the service DNS name.

$ oc new-project ci --display-name="Continuous Integration for OpenShift" --description="This project holds all continuous integration required infrastructure, like Nexus, Jenkins,..."

$ oc create -f https://raw.githubusercontent.com/jorgemoralespou/nexus-ose/master/nexus/ose3/nexus-resources.json -n ci

The steps above will create a project called ci, and it will add some OpenShift resources to the project, namely:
  • A nexus ServiceAccount for use in builds.
  • A BuildConfig for building the Nexus image, based on CentOS 7, that will be published into a nexus ImageStream. When the BuildConfig gets created, a nexus build will be triggered.

I’ve used the official Sonatype Nexus image’s Dockerfile as a base. After that, I added my own requirements for the purpose of this blog, like making sure any user will be able to deploy the image with an OpenShift restricted policy, or adding configuration to use Red Hat’s JBoss Maven repositories.

 

The build will take some time, so be patient!

The templates that are provided as part of the loaded resources will allow you to deploy an instance of the Nexus image you just built, using the nexus ServiceAccount. The instance will also be configured to have a service on port 8081 and a route on whatever hostname you decide, for external access.
These templates also let you choose between a persistent instance of Nexus, using a PersistentVolume, and an ephemeral one, where if the nexus replica dies, you’ll lose all of your cached dependencies. For testing purposes, it’s much easier to set up the ephemeral instance, but for real usage you should only consider the persistent one.

All the instructions on how to set up the persistent volume, and all the requirements, are in the README file in the GitHub repository.

In this example, I will deploy the ephemeral version, with the following command:

$ oc new-app --template=nexus-ephemeral --param=APPLICATION_HOSTNAME=nexus.apps.10.2.2.2.xip.io

You can also deploy your nexus instance using the OpenShift console.

It is very important to understand that the nexus instance will not be deployed until the build process has finished, and this can take quite some time, so be patient!

The value provided to APPLICATION_HOSTNAME is dependent on your installation. The default application domain of my OpenShift environment is apps.10.2.2.2.xip.io.

We can access our nexus instance through the APPLICATION_HOSTNAME value we have provided, and check what repositories are in there. The default credentials for this nexus instance are admin/admin123. It is important to note that this Nexus server comes already configured with some Red Hat JBoss repositories, to allow our S2I images to fetch the appropriate dependencies.

What we need now is a way of instructing our JBoss S2I builder images to use this nexus instance as artifact repository manager. There are some alternatives to this, and I will show two of them.

Using the provided S2I builder

JBoss EAP S2I Builder Image version 1.2, which is the latest version of the builder image, is included in OpenShift Enterprise 3.1. It provides an environment variable that can be set to point to a Maven mirror URL; unsurprisingly, it is called MAVEN_MIRROR_URL. I will use that variable to get the Maven artifacts through our Nexus instance.

To check that our builds will use our internal nexus instance, we can browse to the public group page and verify that there is no dependency currently stored.

Let’s create a new project and create a sample application using nexus.

$ oc new-project eap-nexus-builds --display-name="EAP builds with Nexus" --description="Building Applications in EAP using Nexus for dependency management"

For the application, we will be using the EAP S2I Builder image, and we will use the default sample project. Then, we will set the MAVEN_MIRROR_URL environment variable on the build.

To do the previous configuration through the UI you need OpenShift Origin 1.1.1 or greater, or OpenShift Enterprise 3.1.1 or greater. If you instead use the JSON file approach to create the app, you will need OpenShift Origin 1.1.0 or greater, or OpenShift Enterprise 3.1 or greater. You can create the application using the following command:

$ oc create -f https://raw.githubusercontent.com/jorgemoralespou/nexus-ose/master/other/eap-nexus-resources/eap-nexus-resources-all.json

You should notice that I’ve used the internal DNS name of our nexus instance, nexus.ci.svc.cluster.local, which follows the pattern <service-name>.<project>.svc.cluster.local for services. This is a very powerful feature of OpenShift that provides DNS names for every service, and much more.
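As a rough sketch of how that DNS pattern turns into a mirror URL, the snippet below builds the service name and prints the environment entry as it could appear under the sourceStrategy env list of a BuildConfig. The service_dns helper is hypothetical, and the /content/groups/public path is an assumption based on the usual Nexus 2 layout, so check it against your own instance.

```shell
#!/bin/sh
# Hypothetical helper: builds the internal DNS name of a service,
# following <service-name>.<project>.svc.cluster.local.
service_dns() {
  printf '%s.%s.svc.cluster.local' "$1" "$2"
}

# Assumptions: 8081 is the service port mentioned earlier in this post;
# /content/groups/public is the usual Nexus 2 public group path and may
# differ in your install.
MAVEN_MIRROR_URL="http://$(service_dns nexus ci):8081/content/groups/public"

# Print the env entry as it could appear under
# spec.strategy.sourceStrategy.env in a BuildConfig.
cat <<EOF
"env": [
  { "name": "MAVEN_MIRROR_URL", "value": "${MAVEN_MIRROR_URL}" }
]
EOF
```

The printed fragment is only meant to be pasted into the BuildConfig by hand, for example while editing it with oc edit.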

When building the application, we will notice that maven dependencies are being pulled from our nexus instance, instead of the default public Red Hat JBoss' repositories.

Once our build is finished, we will also see how our nexus repository artifact group is filled with all the dependencies that have been pulled down.

And we will have our application running.

Here, we can have a historical view of the builds before and after setting MAVEN_MIRROR_URL. The first build in OpenShift always takes longer than any other build as it has to push all the base layers to the registry after the build. Successive builds will just push the application layer. From build #2 to #5 we can see the time a normal build takes without using Nexus, averaging 1 minute and 13 seconds.

Build #7 introduces the change with MAVEN_MIRROR_URL set, but as this is the first build after the environment variable has been set, it still took 1 minute and 8 seconds to complete. This build was populating Nexus with all the pulled down dependencies.

In builds #8 to #10 we can see that the average build time is now 42 seconds.

As can be seen, we get an average benefit of 31 seconds in building time after introducing our integration with an artifact repository manager, like Nexus.
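The arithmetic behind that figure can be reproduced with a few lines of shell. This is only a worked check of the averages quoted above; the variable names are mine.

```shell
#!/bin/sh
# Average build time without Nexus: 1 minute and 13 seconds.
before=$((1 * 60 + 13))
# Average build time with a warm Nexus cache: 42 seconds.
after=42

saving=$((before - after))
# Integer percentage of the original build time that was saved.
percent=$((saving * 100 / before))

echo "saving: ${saving}s (~${percent}% of the original build time)"
# prints: saving: 31s (~42% of the original build time)
```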

Modifying the S2I builder

We do not always have the comfort of working with S2I builder images that expose the ability to set a Maven mirror, like the Middleware Services images provided by Red Hat do. In those cases, you need to think of other mechanisms to integrate these images with an artifact repository manager.

The options vary, ranging from the most obvious, modifying or extending the builder image using incremental builds, up to creating a builder image from scratch. Since I do not like modifying existing images, especially those created by others, I will show how to extend the existing Wildfly S2I Builder images to make use of a Nexus artifact repository manager.
The same approach can be used with any other builder image, and with other technologies that use or can benefit from an artifact repository manager, especially since Nexus and Artifactory support storing dependencies for languages other than Java.

I have created a file that will install all the required resources needed to work with the Nexus instance provided in the OpenShift install. These resources are:

  • 3 BuildConfigs, for Wildfly 8, Wildfly 9 and Wildfly 10.
  • 6 ImageStreams, one for each of the original ImageStreams for every Wildfly version (8, 9 and 10) and another one for each of the modified S2I builder images for Wildfly integrated with nexus (8, 9 and 10).

The change that I’ve made to the default Wildfly S2I builder image is as simple as providing an overloaded settings.xml file in my custom S2I builder images that points to the nexus artifact repository manager. This change is the easiest way to prove this functionality, although a better option would probably be to provide an environment variable to customize the assembly process.
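To give an idea of what such an overloaded settings.xml could contain, here is a minimal sketch that writes one to a scratch file. It is not the file shipped in my builder images: the mirror id is hypothetical, and the URL assumes the nexus service in the ci project, port 8081, and the usual Nexus 2 public group path.

```shell
#!/bin/sh
# Write a sketch of an overloaded settings.xml to a scratch file.
settings_file="$(mktemp)"

cat > "${settings_file}" <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <!-- Hypothetical id; any unique value works. -->
      <id>nexus</id>
      <!-- Route requests for every repository through the mirror. -->
      <mirrorOf>*</mirrorOf>
      <!-- Assumed URL: service DNS name, port 8081, Nexus 2 public group. -->
      <url>http://nexus.ci.svc.cluster.local:8081/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
EOF

echo "settings.xml sketch written to ${settings_file}"
```

Using a mirrorOf value of * sends every repository request through Nexus, which is the simplest setup but assumes Nexus proxies every repository your builds need.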

To install the Wildfly version:

$ oc new-project wildfly-nexus-builds --display-name="Wildfly builds with Nexus" --description="Building Applications in Wildfly using Nexus for dependency management"

$ oc create -f https://raw.githubusercontent.com/jorgemoralespou/nexus-ose/master/builders/wildfly-nexus/wildfly-nexus-resources.json

Once we have our custom Wildfly S2I images built, we can just create a sample application with them.

$ oc new-app --docker-image=wildfly-nexus-9 --strategy=source --code=https://github.com/bparees/openshift-jee-sample.git --name='wildfly-nexus-sample'

Here, we see as well that our build process is fetching the required maven dependencies from the provided Nexus artifact repository manager.

This first build took 3 minutes and 11 seconds. It includes building with the plain wildfly-9 image available on GitHub, and the time needed to pull down the image. This image was not doing any dependency management.

In the second build, I updated the BuildConfig to use the wildfly-nexus-9 builder image, and this build took 1 minute and 24 seconds. It was still relatively slow because Nexus was caching all the dependencies for the first time, since I used a clean nexus instance.

On the third and fourth build, all the dependencies were already cached in Nexus and build time dropped to 37 and 35 seconds, respectively.

As in the previous example, with EAP, we get a benefit of more than 40 seconds in our build time by using an artifact repository manager, like Nexus.

Using incremental build

Another option we can use to improve Maven based Java builds in OpenShift is to enable incremental builds.
Unfortunately, not all images support this feature, since it requires the existence of a save-artifacts script responsible for saving the artifacts used during builds.
In our case these will be Maven dependencies. This has the same effect as having a local Maven repository in the build image itself, with the drawback of having to reach out for the previously built image and get the dependencies out of it.
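To make the role of save-artifacts more concrete, here is a minimal sketch of what such a script could do. S2I expects the script to stream a tar archive of the files to keep on standard output. The function name and the assumption that the Maven repository lives in .m2 under the home directory are mine, and the script in a real builder image may differ.

```shell
#!/bin/sh
# save_artifacts DIR: stream a tar archive of DIR/.m2 to standard output.
# Assumption: Maven's local repository lives in DIR/.m2 (DIR defaults to $HOME).
save_artifacts() {
  dir="${1:-$HOME}"
  if [ -d "${dir}/.m2" ]; then
    # Run in a subshell so the caller's working directory is untouched.
    ( cd "${dir}" && tar cf - .m2 )
  fi
}

# In a real save-artifacts script the last line would simply be:
#   save_artifacts "$HOME"
```

On the next incremental build, the archive is extracted into the new build container, so Maven finds the dependencies already in its local repository.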

To test this mode, I have created a sample resources file that can be easily tested.

$ oc new-project eap-incremental-builds --display-name="EAP incremental builds" --description="Building Applications in EAP using incremental build mode"

$ oc create -f https://raw.githubusercontent.com/jorgemoralespou/nexus-ose/master/other/eap-incremental/eap-incremental-resources.json

After we’ve created the resources, let’s do some builds and look at the times.

As can be seen in the image above, the times for the second and third build, which are the builds benefiting from the stored artifacts, are much shorter: 48 and 47 seconds.
But that is roughly the same time it takes when using the artifact repository manager. So, there is no additional benefit in time, although it is much simpler for those images that support incremental mode, as the developer will only need to specify a flag in the BuildConfig.
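For reference, the flag in question is the incremental field of the source strategy. The sketch below just prints the relevant BuildConfig fragment; the rest of the strategy (such as the builder image reference) is omitted and depends on your setup.

```shell
#!/bin/sh
# BuildConfig fragment that turns on incremental mode for an S2I build.
# Other sourceStrategy fields (for example the builder image) are omitted.
incremental_fragment='"strategy": {
  "type": "Source",
  "sourceStrategy": {
    "incremental": true
  }
}'

printf '%s\n' "${incremental_fragment}"
```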

 

In this example, the application and pulled down dependencies are not adding a big overhead in size to the initial eap64-openshift S2I image, only 7 MB.

 

But we need to be careful with this approach, as there are other images or applications that will have many more dependencies, and the size of the generated image can grow enormously: 130 MB in the following example using Fuse Integration Services.

Summary

For every application that we build, we will get a performance benefit by caching its dependencies in an artifact repository manager. Initially we will perceive this benefit in the second and subsequent builds of every application, but as the artifact repository manager stores more and more dependencies the benefit will also be seen in the initial builds of new applications, as most of the dependencies will already be cached.

Also, we can use incremental builds to get better performance on Java based builds, but it is important to understand that even if this approach is easier to set up, there are some drawbacks, like the need for the image to support incremental mode.
Note that in this scenario, the build process saves the dependencies within the image being built. This means that if successive builds run on different nodes, every node will first have to pull down the image from OpenShift’s Docker registry, which might take longer than pulling down the dependencies again.

The most important benefit of using Nexus, or any other artifact repository manager, is the security and the fact that dependencies downloaded by one developer or build will be reused across all the builds using the same dependencies, whereas with incremental builds only the dependencies downloaded during the previous build can be reused, and only by the same build. This might have a huge impact for any Java-based organization.

In this blog, I’ve highlighted how we can improve the build time of Maven based Java builds in OpenShift. Another very important topic is the use of internal DNS service names to reference services from one project to another. The only caveat is that if we are using the multi-tenant OVS networking plugin, our cluster administrators will have to make our ci project visible to all other projects:

$ oadm pod-network make-projects-global ci