Puppet
Revision as of 12:50, 6 October 2011
Puppet is the main configuration management tool used on the Wikimedia clusters.
puppetd is the client daemon that runs on all servers; it manages each machine using configuration information gathered from puppetmasterd, which runs on sockpuppet.pmtpa.wmnet.
Note: Some of the information on this page (specifically "Making changes") is a little outdated with the migration to Git/Gerrit. The docs for that are at labsconsole.wm.o
puppetd
To install puppet on a single machine, simply run
# apt-get install puppet
On Solaris, the installation instructions for the Blastwave packages seem to work.
Communication with the puppetmaster server is over encrypted SSL and with signed certificates. To sign the certificate of the newly installed machine on the puppetmaster server, log in on sockpuppet.pmtpa.wmnet and run:
# puppetca -s clienthostname
To check the list of outstanding, unsigned certificates, use:
# puppetca -l
Reinstalls
When a server gets reinstalled, the existing certs/keys on the puppetmaster will not match the freshly generated keys on the client, and puppet will not work.
Before a server runs puppet for the first time (again), on the puppetmaster host, the following command should be run to erase all history of a server:
# puppetca --clean clienthostname
However, if this is done after puppetd has already run and therefore has already generated new keys, this is not sufficient. To fix this situation on the client, use the following command to erase the newly generated keys/certificates:
# find /var/lib/puppet -name "$(hostname -f)*" -exec rm -f {} \;
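The find pattern above can be illustrated on a scratch directory (the base directory and hostname below are stand-ins; the real command targets /var/lib/puppet and uses the output of hostname -f):

```shell
# Demonstrate the key/cert cleanup pattern on a throwaway directory tree.
# In production the base directory is /var/lib/puppet and the name comes
# from "hostname -f"; both are hypothetical here.
tmp=$(mktemp -d)
host=client.example.net
mkdir -p "$tmp/ssl/certs" "$tmp/ssl/private_keys"
touch "$tmp/ssl/certs/$host.pem" \
      "$tmp/ssl/private_keys/$host.pem" \
      "$tmp/ssl/certs/ca.pem"
# Remove everything named after this host, leaving other files alone
find "$tmp" -name "$host*" -exec rm -f {} \;
```

Only files whose names start with the host's FQDN are removed; the CA certificate is left intact.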
Puppetmaster
The puppetmaster server in pmtpa is sockpuppet.pmtpa.wmnet.
Installation
Simply use the (backported) puppetmaster Ubuntu package:
# apt-get install puppetmaster puppetmaster-passenger
The default package install uses the Webrick development webserver. That works fine for a couple of nodes, but is single-threaded. We therefore switched to Mongrel for a while, and are now using a Passenger-based install from the package puppetmaster-passenger. This means puppetmaster is started by Apache, and no longer runs as an independent daemon.
The installation basically follows these instructions, as well as the default configurations provided in the package.
Configuration
The default configuration is very usable, but we've made some tweaks here and there.
See /etc/puppet/site.pp for the basics. Puppet currently pushes out crontabs for the image scalers, Ganglia binaries and configuration files on hosts, and syncs user information (including SSH keys) on all hosts. It rereads its configuration instantly. Changes to any given host are pushed out every 30 minutes, but puppet is continually updating some host or other. See syslog on sockpuppet for details.
MD5 is broken, so we use SHA1 for signing certificates:
ca_md=sha1
We use storeconfigs so hosts can exchange configuration (e.g. SSH host keys). To enable this, configure:
storeconfigs=true
dbadapter=sqlite3
dblocation=$vardir/clientconfigs/clientconfigs.sqlite3
Packages rails, sqlite3, libsqlite3-ruby need to be installed. The directory /var/lib/puppet/clientconfigs should be created and owned by user/group puppet.
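Putting the settings above together, the relevant part of /etc/puppet/puppet.conf on the puppetmaster would look roughly like this (a sketch; the section layout may differ from the actual file):

```ini
[master]
ca_md        = sha1
storeconfigs = true
dbadapter    = sqlite3
dblocation   = $vardir/clientconfigs/clientconfigs.sqlite3
```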
Making changes
See the documentation on Labs for this.
If you don't have a Labs account but you have root, you can link your SVN account to Labs; see the Gerrit documentation for this.
You can syntax-check your changes with:
# puppet --parseonly filename-here
Noop test run
You can do a dry run of your changes using:
# puppetd --noop --test --debug
This will give you (among other things) a list of all the changes it would make.
Trigger a run
Just run:
# puppetd --test
Debugging
To get maximum output from puppet, run:
# puppetd --test --trace --debug
You can see a list of classes that are being included on a given puppet host, by checking the file /var/lib/puppet/state/classes.txt.
Guidelines
- Always include the base class for every node
- For every service deployed, please use a system_role definition (defined in generic-definitions.pp) to indicate what a server is running. This will be put in the MOTD. As the definition name, you should normally use the relevant puppet class. For example:
system_role { "cache::bits": description => "bits Varnish cache server" }
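For reference, a minimal sketch of what such a system_role define could look like (the actual definition lives in generic-definitions.pp and may differ; the MOTD fragment path below is an assumption):

```puppet
define system_role($description) {
    # Hypothetical: drop an MOTD fragment announcing this host's role
    file { "/etc/update-motd.d/05-role-${name}":
        mode    => 0755,
        content => "#!/bin/sh\necho '${name}: ${description}'\n",
    }
}
```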
Style Guide
- Each class that defines a certain service or system role, for example class cache::bits, should use a system_role definition with the containing class's $title. For example:
class misc::url-downloader {
system_role { "misc::url-downloader": description => "Upload-by-URL proxy" }
}
Puppet will add a note to the system's MOTD, so it's immediately clear what services are on a given system when logging in.
- For each service, create a nested class with the name service::decommission (e.g. apache::decommission) which removes any configuration and prepares a host for decommissioning.
- For each service, create a nested class with the name service::monitoring (e.g. squid::monitoring) which sets up any required (Nagios) monitoring configuration on the monitoring server.
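The naming convention above can be sketched as follows (the class bodies are illustrative, not the actual manifests):

```puppet
class squid {
    # normal configuration for the service ...

    class decommission {
        # undo the configuration and prepare the host for removal
        package { "squid": ensure => absent }
    }

    class monitoring {
        # Nagios checks, set up on the monitoring server
        # ...
    }
}
```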
Todo
- More secure certificate signing
- Better, more automated version control
- Better tools for adding/maintaining node definitions