Puppet
Puppet is the main configuration management tool used on the Wikimedia clusters (see also "puppet for dummies" on the blog).
puppetd is the client daemon that runs on all servers. It manages each machine using configuration information retrieved from puppetmasterd, which runs on stafford.pmtpa.wmnet.
Note: Some of the information on this page (specifically "Making changes") is somewhat outdated since the migration to Git/Gerrit. The documentation for that is at labsconsole.wm.o.
puppetd
To install puppet on a single machine, simply run
# apt-get install puppet
On Solaris, the installation instructions for the Blastwave packages seem to work.
Communication with the puppetmaster server is over encrypted SSL and with signed certificates. To sign the certificate of the newly installed machine on the puppetmaster server, log in on sockpuppet.pmtpa.wmnet and run:
# puppetca -s clienthostname
For the first two runs (before, and after signing the certificate on the puppetmaster), use:
# puppetd --test --ca_server sockpuppet.pmtpa.wmnet
To check the list of outstanding, unsigned certificates, use:
# puppetca -l
Reinstalls
When a server gets reinstalled, the existing certs/keys on the puppetmaster will not match the freshly generated keys on the client, and puppet will not work.
Before the reinstalled server runs puppet for the first time, run the following command on the puppetmaster host to erase all history of that server:
# puppetca --clean clienthostname
However, if this is done after puppetd has already run and therefore has already generated new keys, this is not sufficient. To fix this situation on the client, use the following command to erase the newly generated keys/certificates:
# find /var/lib/puppet -name "$(hostname -f)*" -exec rm -f {} \;
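Because the find command above deletes files without asking for confirmation, it can help to preview the matches first. The following is a minimal sketch and not part of the original procedure: the function name list_host_ssl_files is hypothetical, and it only prints the files whose names start with the given FQDN.

```shell
#!/bin/sh
# Hypothetical helper (not an existing puppet tool): print the files below a
# directory whose names start with the given FQDN. These are the files the
# delete command above would remove. Nothing is deleted here.
list_host_ssl_files() {
    dir="$1"     # directory to search, e.g. /var/lib/puppet
    fqdn="$2"    # host name prefix, e.g. the output of: hostname -f
    find "$dir" -name "${fqdn}*" -print
}

# Example call, commented out so that sourcing this file has no side effects:
# list_host_ssl_files /var/lib/puppet "$(hostname -f)"
```

If the printed list contains only the stale keys and certificates of the client, the delete command can be run as documented.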
Misc
Sometimes you want to purge the information for a host from the puppet database. To do so, run the following on sockpuppet, passing the fully qualified domain name of the server:
puppetstoredconfigclean.rb <server fqdn>
All references, i.e. the host entry and all facts that go with it, will be removed.
Puppetmaster
The puppetmaster server in pmtpa is sockpuppet.pmtpa.wmnet.
Installation
Simply use the (backported) puppetmaster Ubuntu package:
# apt-get install puppetmaster puppetmaster-passenger
The default package install uses the WEBrick development web server. That works fine for a couple of nodes, but it is single-threaded. We therefore switched to Mongrel, and are now using a Passenger-based install from the package puppetmaster-passenger. This means that puppetmaster is started from Apache, and no longer by an independent daemon.
The installation basically follows these instructions, as well as the default configurations provided in the package.
Configuration
The default configuration is very usable, but we've made some tweaks here and there.
See /etc/puppet/site.pp for the basics. Puppet currently pushes out crontabs for the image scalers, ganglia binaries and conf files on hosts, and syncs user information including ssh keys on all hosts. It rereads its configuration instantly. Changes to any given host get pushed out every 30 minutes, but puppet is continually updating some host or other. See syslog on sockpuppet for details.
MD5 is broken, so use SHA1 for signing certificates:
ca_md=sha1
We use storeconfigs so hosts can exchange configuration (e.g. SSH host keys). To enable this, configure:
storeconfigs=true
dbadapter=sqlite3
dblocation=$vardir/clientconfigs/clientconfigs.sqlite3
The packages rails, sqlite3 and libsqlite3-ruby need to be installed. The directory /var/lib/puppet/clientconfigs must be created and owned by user and group puppet.
Making changes
See the documentation on Labs for this.
If you don't have a Labs account but you have root, you can link your SVN account to Labs; see the Gerrit documentation for this.
You can syntax check your changes with:
# puppet parser validate filename-here
Noop test run
You can do a dry run of your changes using:
# puppetd --noop --test --debug
This will give you (among other things) a list of all the changes it would make.
Trigger a run
Just run:
# puppetd --test
Debugging
To get maximum output from puppet, use:
# puppetd --test --trace --debug
You can see a list of classes that are being included on a given puppet host, by checking the file /var/lib/puppet/state/classes.txt.
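To check for one specific class rather than reading the whole file, a whole-line grep is usually enough. This is a minimal sketch that assumes classes.txt contains one class name per line; the function name has_puppet_class is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the exact class name appears
# on a line of its own in a classes.txt file.
# Assumption: the file lists one class name per line.
has_puppet_class() {
    class="$1"
    file="${2:-/var/lib/puppet/state/classes.txt}"
    # -F: treat the name as a fixed string, -x: match whole lines only,
    # -q: print nothing, report the result through the exit status.
    grep -Fxq -- "$class" "$file"
}

# Example call, commented out so that sourcing this file has no side effects:
# has_puppet_class cache::bits && echo "cache::bits is applied on this host"
```

The -x flag matters here: without it, a search for cache would also match cache::bits.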
Errors
Occasionally you may see puppet fill up disks, which then results in YAML errors during puppet runs. If so, you can run the following on the puppet master, but do so very, very carefully. To preview what would be removed, run the find command without -delete first.
cd /var/lib/puppet && find . -name "*<servername>*.yaml" -delete
Check .erb template syntax
"ERB files are easy to syntax check. For a file mytemplate.erb, run" (puppet templating):
erb -x -T '-' mytemplate.erb | ruby -c
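When a change touches several templates, the same pipeline can be applied to every .erb file below a directory. The following is a minimal sketch under that assumption; the function name check_erb_templates and the "syntax error:" message format are hypothetical, and only the erb and ruby invocations come from the command above.

```shell
#!/bin/sh
# Hypothetical helper: syntax check every .erb file below a directory with the
# erb | ruby -c pipeline. Prints one line per failing template and returns a
# non-zero status if any template fails the check.
check_erb_templates() {
    dir="$1"
    failed=0
    # Write the file list to a temporary file so that the loop below runs in
    # the current shell and the failed flag survives the loop.
    list=$(mktemp "${TMPDIR:-/tmp}/erbcheck.XXXXXX") || return 2
    find "$dir" -type f -name '*.erb' > "$list"
    while IFS= read -r template; do
        if ! erb -x -T '-' "$template" | ruby -c >/dev/null 2>&1; then
            echo "syntax error: $template"
            failed=1
        fi
    done < "$list"
    rm -f "$list"
    return "$failed"
}

# Example call, commented out so that sourcing this file has no side effects:
# check_erb_templates ./templates
```

The output of ruby -c is discarded, so to see the actual error message, rerun the single-file command on any template that is reported.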
Guidelines
- Always include the base class for every node
- For every service deployed, please use a system_role definition (defined in generic-definitions.pp) to indicate what a server is running. This will be put in the MOTD. As the definition name, you should normally use the relevant puppet class. For example:
system_role { "cache::bits": description => "bits Varnish cache server" }
Style Guide
- Each class that defines a certain service or system role, for example class cache::bits, should use a system_role definition with the containing class's $title. For example:
class misc::url-downloader {
    system_role { "misc::url-downloader": description => "Upload-by-URL proxy" }
}
Puppet will add a note to the system's MOTD, so it's immediately clear what services are on a given system when logging in.
- Files that are fully deployed by Puppet using the file type should generally use a read-only file mode (i.e., 0444 or 0555). This makes it more obvious that the file should not be modified by hand, as Puppet will overwrite it anyway.
- For each service, create a nested class with the name service::decommission (e.g. apache::decommission) which removes any configuration and prepares a host for decommissioning.
- For each service, create a nested class with the name service::monitoring (e.g. squid::monitoring) which sets up any required (Nagios) monitoring configuration on the monitoring server.
Useful global variables
These are useful variables you can refer to from anywhere in the Puppet manifests. Most of these get defined in realm.pp or base.pp.
- $::realm : The "realm" the system belongs to. Currently we have the realms production, fundraising and labs.
- $::site : Contains the 5-letter site name of the server, e.g. "pmtpa", "eqiad" or "esams".
Todo
- More secure certificate signing
- Better, more automated version control
- Better tools for adding/maintaining node definitions