One of Puppet's most powerful features is its dependency management. Because dependencies are declared, the Puppet agent can intelligently determine the order in which changes are applied and make them in the most efficient order possible. If you've worked in environments where configuration was done via scripts, you know it's not uncommon to see a service on a host get restarted multiple times by different scripts that are configuring different things. Not exactly what you'd like to see, especially if those scripts get run on a host that's already in production. Puppet is nice enough to save us from that trap, if we tell it what each of our resources needs...

Dependencies are also arguably the most maligned feature of Puppet. Many don't find it intuitive that each Puppet run can evaluate resources in a different order depending on how the agent resolved the dependency graph. This is especially frustrating when you forget to put a relationship in your manifest or create a dependency cycle. It can get confusing, and people don't like that. Let's see if we can make at least one use case easier to work with.

Avoid Cross Module Dependencies at all Costs

One of the things that makes this more difficult is needing a resource in the module you're working on to require or notify a resource in another module. You've got to know which class in the other module must be present for the dependency to resolve, and make sure it's included or defined somewhere else in the node's catalog. Frequently you didn't write that other module, so it can be frustrating to search through it and figure out how it works just to know what you need or don't need from it. Let's start with an example. We'll use Apache since it's something everyone is familiar with.

An Aside on Module Names

You'll be tempted to name your internal modules using generic names for what they manage. When you get to the point that you're doing more complex things, you'll want to check the Puppet Forge to see if anyone has already solved your problem. No sense reinventing the wheel, after all. The convention used on the Forge is that modules install into their generic names, so the puppetlabs/apache module will install to the apache directory under your modulepath. If you've already made an apache module that you are using in other places, this can be a problem, especially if you don't want to migrate everything all at once.

To avoid this issue, we've started moving our internal modules (those with site-specific data or that aren't complete enough to be published) to use a few prefixes on the module names for organization purposes. The first is a site-specific prefix for modules that contain data or configuration that's only useful to our site. We manage OpenShift Online, so oo_thing was pretty much a no-brainer. (I'll use ex_thing for the simple example modules I'm creating for this post.) The second is a lib_thing prefix for modules that are "generic", meaning they have no site-specific data and, time permitting, could be developed further and published to the Forge for others to use. That leaves the module names without prefixes (thing) open should we need to pull in a module from the Forge or GitHub.

Back to the Example

So for this example we've got a simple top-level class in the lib_httpd module, using the classic Package/File/Service pattern, that installs the httpd package, hardens the permissions of the configuration directories, and manages the httpd service. Basic stuff, and it will install a running web server, albeit a boring one.

class lib_httpd () {
  # Install the web server package
  package { 'httpd':
    ensure => present,
  }

  # Harden the configuration directories
  file { ['/etc/httpd/conf.d', '/etc/httpd/conf']:
    ensure  => directory,
    owner   => 'root',
    group   => 'root',
    mode    => '0640',
    require => Package['httpd'],
  }

  # Harden the main configuration file
  file { '/etc/httpd/conf/httpd.conf':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0640',
    require => Package['httpd'],
  }

  # Keep the service running and restart it when httpd.conf changes
  service { 'httpd':
    ensure    => running,
    enable    => true,
    subscribe => File['/etc/httpd/conf/httpd.conf'],
  }
}

Creating Dependency Problems

Now let's say we wanted to do something useful, like deploy a PHP "Hello World!" app. OK, not very useful, but good as an example. So let's assume your junior admin is working on this module and he's new to Puppet. He'll get a request from the developer that says something like "I need PHP installed on the host, this PHP file in /srv/hello_world and this config file in /etc/httpd/conf.d." Seems fairly straightforward, so he adds the following to the ex_hello module and includes the class on the new node.

class ex_hello () {
  package { 'php':
    ensure => present,
  }

  file { '/srv/hello_world':
    ensure => directory,
    owner  => 'root',
    group  => 'root',
    mode   => '0644',
  }

  # index.php is a regular file, so ensure must be file, not directory
  file { '/srv/hello_world/index.php':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    source  => 'puppet:///modules/ex_hello/index.php',
    require => Package['php'],
  }

  # Nothing here guarantees that /etc/httpd/conf.d exists yet
  file { '/etc/httpd/conf.d/hello.conf':
    ensure => file,
    owner  => 'root',
    group  => 'root',
    mode   => '0644',
    source => 'puppet:///modules/ex_hello/hello.conf',
  }
}

He sees the following error when he runs puppet agent -t on the host for the first time but notices that a second run completes just fine.

err: /File[/etc/httpd/conf.d/hello.conf]/ensure: change from absent to file failed: Cannot create /etc/httpd/conf.d/hello.conf; parent directory /etc/httpd/conf.d does not exist

The reason for this is that the httpd package hadn't been installed before the config file was evaluated. Remembering that he hadn't bothered to include the lib_httpd class, he reworked the file resource for the config file to fix that and to make sure Apache gets restarted after the file is in place, like so.

include ::lib_httpd

file { '/etc/httpd/conf.d/hello.conf':
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  source  => 'puppet:///modules/ex_hello/hello.conf',
  require => Package['httpd', 'php'],
  notify  => Service['httpd'],
}

Now this works OK, but it's a pain to remember every time you put a config file in place, which is something many members of your team are likely to want to do for various projects. Worse, what if Apache 3 came out and your team just had to have it? Because, you know, it's shiny. :-) So your web expert builds an httpd3 package, adds it to your internal repositories, and switches the lib_httpd module over to use it. Suddenly every manifest that refers to the old package and service names stops working. Not good.

Slightly Better

Sometimes the simplest thing to do is to require the entire containing class for your external dependencies, which puts a dependency on everything contained by that class. It's better than depending on specific resources in other modules. That causes a problem in this case, though, because the package and the service are in the same class, and you can't require something and notify it at the same time. The dreaded dependency cycle.
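To make the trade-off concrete, here is a minimal sketch of the class-level require, reusing the hello.conf resource from earlier. It assumes the lib_httpd class shown above; the commented-out notify line marks where the cycle would come from if it were enabled.

```puppet
include ::lib_httpd

file { '/etc/httpd/conf.d/hello.conf':
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  source  => 'puppet:///modules/ex_hello/hello.conf',
  # Orders this file after every resource contained in lib_httpd,
  # including Service['httpd'].
  require => Class['lib_httpd'],
  # Uncommenting the next line would ask for the file to be applied
  # before Service['httpd'] as well as after it, which is a cycle.
  # notify  => Service['httpd'],
}
```

With only the require in place the catalog applies cleanly, but Apache is not restarted when hello.conf changes.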

Fortunately, as of Puppet 2.7, resolving a dependency cycle got much easier. If you set graph = true in puppet.conf or run Puppet with the --graph option, the Puppet agent will generate a Graphviz dot file of the cycle at /var/lib/puppet/state/graphs/cycles.dot. You can run that file through the dot command to get a graphical representation of the cycle, making it much easier to see what's going on.

The Best Way to Help an Admin

You could have helped your junior admin out tremendously if you'd given him an easy tool to work with when you originally wrote the lib_httpd module. Adding and updating files in the conf.d directory is something people are going to do frequently when working with Apache, so you can anticipate this and provide a define that does it for them.

define lib_httpd::confd_file (
  $ensure  = 'file',
  $content = undef,
  $source  = undef
) {
  # Make sure the package and service this define depends on are in the catalog
  include ::lib_httpd

  # Exactly one of content or source must be supplied
  if $content and $source {
    fail('You may not supply both content and source parameters to lib_httpd::confd_file')
  } elsif $content == undef and $source == undef {
    fail('You must supply either the content or source parameter to lib_httpd::confd_file')
  }

  file { "/etc/httpd/conf.d/${name}":
    ensure  => $ensure,
    owner   => 'root',
    group   => 'root',
    mode    => '0640',
    content => $content,
    source  => $source,
    require => Package['httpd'],
    notify  => Service['httpd'],
  }
}

Now your junior admin can simplify his config file resource down to a single use of this define. He doesn't have to know what class the httpd package and service are in; he doesn't even need to know what the package and service are called. He just knows that the lib_httpd module is out there and, since you documented it well (you did, didn't you?), that there's a define that will handle this for him.

lib_httpd::confd_file { 'hello.conf':
  source  => 'puppet:///modules/ex_hello/hello.conf',
  require => File['/srv/hello_world/index.php'],
}

This pattern works well for pretty much any service that can be configured via a .d directory, and the defines can be more complex as needed. You'll notice the example above has an ensure parameter, which allows for removing the files as well as creating them. You could add parameters to control all the other aspects of the file. In Apache's case you could also have the define notify an exec which runs a config reload or a graceful restart instead of notifying the service to restart. This pattern also lets you easily port your module to other distributions and operating systems, where the package names, service names, and configuration locations are likely to be different.
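As a rough sketch of the exec idea, the manifest below adds a refresh-only exec and a variant of the define that notifies it. The names lib_httpd::reload, lib_httpd_graceful, and lib_httpd::confd_file_graceful are made up for illustration, and the apachectl path is an assumption based on a Red Hat-style layout, so adjust both to your own module and platform.

```puppet
# Hypothetical helper class; not part of the lib_httpd module shown above.
class lib_httpd::reload {
  include ::lib_httpd

  # refreshonly means this only runs when another resource notifies it.
  exec { 'lib_httpd_graceful':
    command     => '/usr/sbin/apachectl graceful', # assumed path
    refreshonly => true,
    require     => Service['httpd'],
  }
}

# Hypothetical variant of lib_httpd::confd_file that triggers a graceful
# reload rather than a full service restart. Parameter validation is
# omitted here to keep the sketch short.
define lib_httpd::confd_file_graceful (
  $ensure  = 'file',
  $content = undef,
  $source  = undef
) {
  include ::lib_httpd::reload

  file { "/etc/httpd/conf.d/${name}":
    ensure  => $ensure,
    owner   => 'root',
    group   => 'root',
    mode    => '0640',
    content => $content,
    source  => $source,
    require => Package['httpd'],
    notify  => Exec['lib_httpd_graceful'],
  }
}
```

The person using the define still writes the same few lines as before; only the module author needs to know that a reload is happening instead of a restart.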

Regardless, the point is that the person consuming your module doesn't need to know anything about the internal dependencies of your module or worry about managing them. It's all happening automatically as far as they know.