Configuration management is a competitive field. Prior to OpenShift 3.0, OpenShift (and largely Red Hat as a whole) had mostly been in Puppet's camp, with Chef being the other major competitor. When we started working on the install and configuration story for OpenShift 3.0, it very quickly became clear that Puppet was no longer the obvious choice. So, after a large investment in our 2.x Puppet-based installer and operational tooling, we decided to start over with Ansible. I won't claim this route is correct for everyone, but I'll try to explain our thinking behind the switch.

Agents

Being agent-less was a big deal for OpenShift 3.0 because we wanted to work with Atomic Host. Requiring an agent means something must be installed on every node just so the configuration management master can talk to it. Ansible, unlike Puppet and Chef, is agent-less and only requires SSH access to the nodes. Atomic Host's agent-less requirement may not be a long-term limitation, but keeping the nodes as simple as possible will likely always be desirable.
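As a minimal sketch of what that looks like (the hostnames here are hypothetical), an Ansible inventory is just a text file listing the machines to manage; as long as they are reachable over SSH, no agent daemon has to be pre-installed for the control machine to address them:

    # inventory.ini -- hypothetical hosts, reached over plain SSH
    [masters]
    master1.example.com

    [nodes]
    node1.example.com
    node2.example.com

A quick connectivity check such as ansible nodes -i inventory.ini -m ping then exercises that SSH path with no agent involved.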

Declarative (vs. Imperative)

OpenShift has a strong preference for the declarative paradigm, both in our product and in our configuration management. Being declarative means you focus on the what instead of the how: for example, you state that a service should be in the running state, whereas the imperative alternative is to simply start the service. Both paradigms have their positives and negatives. Imperative systems are often easier to get started with, able to tackle really complex problems, and more efficient. The big advantage of declarative systems is that they are generally easier to make idempotent and resilient. In addition, it's easier to add logic to a declarative system without having to reason as much about the big picture. Both Ansible and Puppet are declarative systems. Chef, however, is imperative, which has always ruled it out for us.
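To make the contrast concrete, here is a hedged sketch of the same goal expressed both ways as Ansible tasks (the service name is only an example):

    # Declarative: describe the desired state; the module decides
    # whether anything actually needs to change.
    - name: Ensure iptables is running
      service:
        name: iptables
        state: started

    # Imperative: issue the command; it runs every time, whether or
    # not the service is already up.
    - name: Start iptables
      command: systemctl start iptables

The declarative form is idempotent for free, which is exactly the resilience property described above.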

Dependency Management and Performance

I can't speak to the performance of the various options beyond our own experience, but the facet that caused the most real-life performance issues for us with Puppet was dependency management. Puppet manages dependencies by asking you to declare them within resources. For example:

    package { 'iptables':
      ensure => present,
      before => Service['iptables'],
    }

It's a simple enough mechanism, but in practice perhaps too simple for us to manage effectively. After multiple years of working with Puppet in OpenShift Online, our state-of-the-art workflow was to run Puppet multiple times to ensure that everything was applied. We could never guarantee that all the dependencies were specified correctly, so on any given run some steps would execute before the steps they actually depended on; running Puppet repeatedly meant that everything was eventually applied. As you can imagine, this was very inefficient.

With Ansible, the dependency mechanism is simply the top-to-bottom ordering of its YAML-based playbooks, so each task naturally depends on everything before it. In practice, this has proven much easier to manage and has drastically improved our efficiency.
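For instance, a playbook equivalent of the Puppet snippet above needs no explicit dependency declaration; the package task simply appears before the service task (a minimal sketch, with the host group assumed):

    - hosts: nodes
      tasks:
        - name: Install iptables
          yum:
            name: iptables
            state: present

        - name: Ensure iptables is running
          service:
            name: iptables
            state: started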

Complexity of Extensions

This one certainly comes down to personal preference. I won't claim there is a concrete rule that makes any one configuration management system simpler than the others. But for us, a big win is writing Ansible modules vs. creating custom resource types with Puppet. The most obvious difference between the two is that Puppet resource types are Ruby-based while Ansible modules are written in Python. I won't count that as a pro or con, since it's almost entirely personal preference. The more substantive difference for us is that the amount of boilerplate and internal Puppet logic you need to learn before you can write effective Puppet resource types is significantly greater than what you need to learn for Ansible modules. I'm sure many Puppet aficionados see this as a positive for Puppet; it is providing more structure and features, after all. But for us, the simplicity of a module being a plain Python script is a net positive.
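As an illustration of how little scaffolding is involved, here is a hedged sketch of a bare-bones custom module (the module name and its argument are invented for the example):

    #!/usr/bin/python
    # hello.py -- a hypothetical minimal Ansible module. AnsibleModule
    # handles argument parsing and JSON output for us.
    from ansible.module_utils.basic import AnsibleModule

    def main():
        module = AnsibleModule(
            argument_spec=dict(
                name=dict(type='str', required=True),
            ),
        )
        # Report back to Ansible; changed=False because this example
        # does not modify the host.
        module.exit_json(changed=False, msg='Hello, %s' % module.params['name'])

    if __name__ == '__main__':
        main()

Dropped into a library/ directory next to a playbook, this is callable like any built-in module; there is no resource type, provider, or catalog machinery to wire up.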


After working with Ansible for over two years now, we have switched our operational infrastructure and installer completely over to it, and we honestly haven't had to look back very often. Admittedly, we probably gained a lot of confirmation bias when Red Hat acquired Ansible about a year into our adoption. But the open source community is rarely shy about pointing out problems in its own code, and we have truly had a great experience using Ansible thus far.
