DNS
This page describes Wikimedia's DNS setup. Wikimedia use two separate kinds of DNS servers: authoritative nameservers (which respond to queries from third-party nameservers for our domains) and resolvers (which resolve DNS queries for our own servers).
Authoritative nameservers
In the newest DNS setup, Wikimedia have three authoritative DNS servers, all running PowerDNS. The three authoritative servers are:
- ns0.wikimedia.org - 66.230.200.16 (bayle)
- ns1.wikimedia.org - 203.212.189.252 (yf1019)
- ns2.wikimedia.org - 91.198.174.4 (lily)
Each server runs PowerDNS with two backends: the Bind backend (which reads Bind-style zonefiles) and geobackend (which is responsible for geographic DNS). The two backends overlap, meaning that for a given query the Bind backend is asked first, and if it declines to answer (because it doesn't have the requested record), the next backend, geobackend, is asked. Therefore, the usual contents of the zones are in regular Bind-style zonefiles, and the geodns record rr.wikimedia.org. is added by geobackend.
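The fall-through between the two backends can be pictured with a short sketch. This is an illustration only, not PowerDNS code: the `lookup` function, the function-based backends and the sample answers are all hypothetical.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for a backend: returns an answer, or None to decline.
Backend = Callable[[str], Optional[str]]

def bind_backend(qname: str) -> Optional[str]:
    # Illustrative static record, as it would come from a zonefile.
    zone = {"ns0.wikimedia.org.": "66.230.200.16"}
    return zone.get(qname)

def geo_backend(qname: str) -> Optional[str]:
    # Illustrative: only the geographic record is handled here.
    if qname == "rr.wikimedia.org.":
        return "answer chosen from the geomap"
    return None

def lookup(qname: str, backends: List[Backend]) -> Optional[str]:
    """Ask each backend in order; the first one that answers wins."""
    for backend in backends:
        answer = backend(qname)
        if answer is not None:
            return answer
    return None

# The Bind backend is listed first, so it is always asked first.
BACKENDS = [bind_backend, geo_backend]
```

With this ordering, `lookup("rr.wikimedia.org.", BACKENDS)` only reaches the geographic backend because the static backend has no such record.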
Bayle acts as the master nameserver, but the slaves do not use AXFR. Zonefiles and other configuration are replicated with rsync by an update script. In an emergency, the servers can also be synced from any of the others.
All configuration files can be found in
/etc/powerdns/
on all three hosts.
The main PowerDNS configuration file is /etc/powerdns/pdns.conf. Its configuration is documented on PowerDNS.
Additionally, there's a Bind backend configuration file, /etc/powerdns/bind.conf. It's compatible with Bind's own configuration format, but is only used to list the domains which the Bind backend has to serve. (Almost) all other options are ignored. In our setup, bind.conf is autogenerated from the domain templates.
Installation steps for a Wikimedia authoritative DNS server
Almost everything is done by the wikimedia-task-dns-auth package. Installing it will pull in PowerDNS and set up the directory structure.
Make sure PowerDNS is not running - we don't want to serve out outdated/missing info.
Then change the relevant settings in /etc/powerdns/pdns.conf. The following settings usually need changing:
- local-address
- query-local-address
These need to be set to the DNS service IP of the host (ns0.wikimedia.org etc.).
- default-soa-name
- geo-soa-values
These need to be set to the correct nameserver name.
Then run authdns-update on the master server.
Then, bind the service IP in /etc/network/interfaces:
up ip addr add 211.115.107.190/32 dev $IFACE
Then, start PowerDNS.
Important differences from the original Bind DNS setup
- PowerDNS is used exclusively, instead of a mixed Bind/PowerDNS setup
- The wildcard records have been removed. This means that the zonefiles and/or /home/wikipedia/common/langlist need to be kept up to date!
- Geographic DNS and static DNS have been integrated into the same nameservers and wikimedia.org zone, which improves query latency because only a single query/response is needed.
- Zonefiles are generated from zone templates.
- No AXFR is being used; zones are replicated through rsync, and SOA serials are purely cosmetic.
Domain templates
Because Wikimedia have a lot of zones that essentially contain the same records (aliases for wikipedia.org and other projects), the old DNS setup used a single zonefile for multiple zones. That has the advantage that just a single change in a zonefile affects many zones. Unfortunately, it doesn't permit the use of $ORIGIN lines in the zonefile. In the new DNS setup, each zone gets its own zonefile, but multiple zonefiles can be generated from a single zone template.
The zone templates are (regular) files in
/etc/powerdns/templates/
Each regular file in this directory corresponds to a zone with the same name. Each symbolic link to a regular file in this directory corresponds to a domain alias. So, in this example:
# ls -l templates/mediawiki*
lrwxrwxrwx 1 root root 13 Jun 19 15:52 templates/mediawiki.com -> mediawiki.org
lrwxrwxrwx 1 root root 13 Jun 19 15:52 templates/mediawiki.net -> mediawiki.org
-rw-r--r-- 1 root root 1500 Jun 19 15:12 templates/mediawiki.org
...one zone mediawiki.org is listed, with two alias zones, mediawiki.com and mediawiki.net.
Substitution variables
Within the zone template, a few predefined variables can be used; they are substituted when the actual zonefiles are generated from the template. These variables include:
- $zonename: The actual zone qname (FQDN) of the zonefile to be generated
- $serial: The SOA serial number, derived from the current date and hour in YYYYMMDDHH format
- $langlist: A list of language subdomain CNAMEs, i.e. a list of all language abbreviations for all languages any Wikimedia project has, generated from /home/wikipedia/common/langlist.
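As a rough illustration of how these three variables could be filled in, here is a minimal sketch using Python's `string.Template`, whose `$name` syntax happens to match. It is not the actual generator: the helper names, the layout of the CNAME lines and the CNAME target are assumptions made for the example.

```python
from datetime import datetime
from string import Template

def make_serial(now):
    """SOA serial in YYYYMMDDHH format, from the given date and hour."""
    return now.strftime("%Y%m%d%H")

def make_langlist(codes, target):
    """One CNAME line per language code (record layout is an assumption)."""
    return "\n".join(f"{code}\tIN\tCNAME\t{target}" for code in codes)

def render(template_text, zonename, now, codes, target):
    """Substitute $zonename, $serial and $langlist in a zone template."""
    values = {
        "zonename": zonename,
        "serial": make_serial(now),
        "langlist": make_langlist(codes, target),
    }
    # safe_substitute leaves unknown $names (for example $ORIGIN) untouched.
    return Template(template_text).safe_substitute(values)

# Hypothetical template text, much shorter than a real zone template.
SAMPLE = "$ORIGIN $zonename.\n; serial $serial\n$langlist\n"
```

Calling `render(SAMPLE, "mediawiki.org", datetime(2007, 6, 19, 15), ["en", "de"], "rr.example.")` fills in the zone name, a serial of 2007061915 and two CNAME lines, while the `$ORIGIN` directive passes through unchanged.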
gen-zones
The actual zonefiles are generated from the zone templates by a Python script, gen-zones. It simply reads all zone templates from the template directory, applies string substitutions, and writes the result to the
/etc/powerdns/zones
directory, where PowerDNS can read them as regular zonefiles.
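The directory walk described above might look roughly like the following sketch. It is not the real gen-zones script: the function names and the `values` dictionary are hypothetical, and `demo` only builds a throwaway directory to show that an alias symlink yields its own zonefile with its own zone name.

```python
import os
import tempfile
from string import Template

def gen_zones(template_dir, zone_dir, values):
    """Write one zonefile per template or alias; return the zone names written."""
    os.makedirs(zone_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(template_dir)):
        path = os.path.join(template_dir, name)
        if not os.path.isfile(path):  # isfile follows symlinks; skips directories
            continue
        with open(path) as f:  # opening an alias symlink reads the aliased template
            text = f.read()
        # Each output file gets its own zonename, so aliases differ from the original.
        zone_values = dict(values, zonename=name)
        with open(os.path.join(zone_dir, name), "w") as f:
            f.write(Template(text).safe_substitute(zone_values))
        written.append(name)
    return written

def demo():
    """Build a temporary template directory with one zone and one alias."""
    with tempfile.TemporaryDirectory() as root:
        templates = os.path.join(root, "templates")
        zones = os.path.join(root, "zones")
        os.mkdir(templates)
        with open(os.path.join(templates, "mediawiki.org"), "w") as f:
            f.write("$ORIGIN $zonename.\n; serial $serial\n")
        os.symlink("mediawiki.org", os.path.join(templates, "mediawiki.com"))
        names = gen_zones(templates, zones, {"serial": "2007061915"})
        with open(os.path.join(zones, "mediawiki.com")) as f:
            return names, f.read()
```

Because the zone name is taken from the directory entry rather than from the file the symlink points to, one template can serve any number of alias zones.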
gen-bind.conf
gen-bind.conf is a Python script that generates bind.conf by looking at the structure of the files and symlinks in /usr/local/etc/powerdns/templates/. For each regular file in that directory, it creates a corresponding block of zone statements for that zone and its aliases. For the example above, that would give:
# mediawiki.org aliases
zone "mediawiki.com" { type master; file "mediawiki.com"; };
zone "mediawiki.net" { type master; file "mediawiki.net"; };
zone "mediawiki.org" { type master; file "mediawiki.org"; };
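A sketch of how such output could be produced from the template directory is shown below. It is an approximation for illustration, not the actual gen-bind.conf: the function names are hypothetical, and the grouping logic is inferred only from the example output above.

```python
import os
import tempfile

def gen_bind_conf(template_dir):
    """Return bind.conf text: one zone statement per template file or symlink."""
    aliases = {}  # zone name -> list of alias names pointing at it
    for name in sorted(os.listdir(template_dir)):
        path = os.path.join(template_dir, name)
        if os.path.islink(path):
            # A symlink is an alias of the zone it points to.
            target = os.path.basename(os.readlink(path))
            aliases.setdefault(target, []).append(name)
        elif os.path.isfile(path):
            aliases.setdefault(name, [])
    lines = []
    for zone in sorted(aliases):
        lines.append(f"# {zone} aliases")
        for name in aliases[zone] + [zone]:
            lines.append(f'zone "{name}" {{ type master; file "{name}"; }};')
    return "\n".join(lines) + "\n"

def demo():
    """Recreate the mediawiki example in a temporary directory."""
    with tempfile.TemporaryDirectory() as templates:
        open(os.path.join(templates, "mediawiki.org"), "w").close()
        for alias in ("mediawiki.com", "mediawiki.net"):
            os.symlink("mediawiki.org", os.path.join(templates, alias))
        return gen_bind_conf(templates)
```

Run against the mediawiki example, `demo()` returns the same four lines as listed above.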
authdns-update
/usr/sbin/authdns-update is a simple shell script that automates the invocation of the scripts above. It goes through the following steps:
- generation of a (new) list of language subdomain CNAMEs from /home/wikipedia/common/langlist
- generation of the zonefiles from the zone templates
- generation of bind.conf
- reload of the local powerdns daemon (on ns0)
- synchronizing the slaves; for each slave:
  - copying the langlist-cnames, zone templates and geomaps to the slave (using rsync)
  - generation of the zonefiles on the slave
  - generation of bind.conf on the slave
  - reload of the remote powerdns daemon
Basically, authdns-update takes care of everything after you've edited the zone templates.
Geographic DNS
Geographic DNS makes sure that clients end up using the Wikimedia cluster closest to them, by varying DNS responses based on the (country of the) IP address of the querying resolver. Its configuration is still mostly the same as described on PowerDNS.
Geomaps are to be found in
/etc/powerdns/geomaps
However, this is now a symlink to one of several directories containing geomaps, called "scenarios". Several scenarios are provided for certain common cases, like "knams-down".
To see what scenario is currently used, and the available scenarios, use:
# authdns-scenario
To select a new scenario, use:
# authdns-scenario scenario
This changes the symlink /etc/powerdns/geomaps.
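Switching a scenario therefore amounts to repointing one symlink. The sketch below shows one way to do that safely; it is not the authdns-scenario script itself, and the function names, the temporary link name and the `scenarios` directory used in `demo` are assumptions.

```python
import os
import tempfile

def current_scenario(link):
    """Name of the scenario directory the geomaps symlink points to."""
    return os.path.basename(os.readlink(link))

def select_scenario(link, scenario_dir):
    """Point link at scenario_dir, replacing any existing symlink in one step."""
    if not os.path.isdir(scenario_dir):
        raise ValueError(f"unknown scenario: {scenario_dir}")
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(scenario_dir, tmp)
    os.replace(tmp, link)  # rename over the old symlink, so it never goes missing

def demo():
    """Switch a temporary geomaps link from 'normal' to 'knams-down'."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("normal", "knams-down"):
            os.makedirs(os.path.join(root, "scenarios", name))
        link = os.path.join(root, "geomaps")
        select_scenario(link, os.path.join(root, "scenarios", "normal"))
        before = current_scenario(link)
        select_scenario(link, os.path.join(root, "scenarios", "knams-down"))
        return before, current_scenario(link)
```

Creating the new link under a temporary name and renaming it over the old one means a reader of the geomaps path sees either the old scenario or the new one, never a missing link.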
The IP->Country RBLDNS zonefile is located in
/etc/powerdns/zz.countries.nerd.dk.rbldnsd
HOWTO
This section briefly explains how to do the most common DNS changes.
Change GeoDNS
For example, when a certain cluster is down/unreachable, and you want to move all traffic to the others.
The new setup now uses scenarios for this, which are effectively just directories with sets of geomaps for each common case.
To move all traffic away from e.g. knams when it's down, log in on ns0 (bayle) and do:
# authdns-scenario knams-down
# authdns-update -s "ns2.wikimedia.org"
The latter skips ns2.wikimedia.org (pascal.knams) in the update process since it'll likely be unreachable anyway. Also see below.
To see which scenario (default: normal) is currently used, use:
# authdns-scenario
Changing records in a zonefile
- Edit the template file /etc/powerdns/templates/zonename on ns0 (bayle)
- Run authdns-update
Adding a new zone
- First, decide whether the new zone will use a new, independent zonefile or will be an alias of another zone:
  - independent zonefile: create the new zone template as /etc/powerdns/templates/zonename (copy an existing, relatively clean zonefile like wiktionary.org to start with).
  - zone alias: make a symbolic link /etc/powerdns/templates/aliasname for the alias to the zone being aliased.
- Run authdns-update
Removing a zone
- Remove the corresponding file or symlink /etc/powerdns/templates/zonename
- Run authdns-update
Adding a new (language) wiki
- Add the language code to /home/wikipedia/common/langlist
- Run authdns-update
If a certain nameserver is unreachable
When a certain nameserver is unreachable, the others can still be updated from any of the other servers, by running authdns-update there. To skip the unreachable server in the update process, use:
# authdns-update -s "server list"
where server list is a space-separated list of FQDNs. Do not forget the quotes: the script only accepts one argument after -s.
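Why the quotes matter can be shown with a small sketch. It uses Python's `shlex` to mimic shell word splitting; the `servers_to_update` helper and the `NAMESERVERS` list are hypothetical and only model the skip behaviour described above, not the script's real implementation.

```python
import shlex

NAMESERVERS = ["ns0.wikimedia.org", "ns1.wikimedia.org", "ns2.wikimedia.org"]

def servers_to_update(all_servers, skip_arg=""):
    """Drop the servers named in a single space-separated skip argument."""
    skip = set(skip_arg.split())
    return [server for server in all_servers if server not in skip]

# With quotes, the whole server list arrives as one argument after -s.
quoted = shlex.split('authdns-update -s "ns1.wikimedia.org ns2.wikimedia.org"')

# Without quotes, only the first name follows -s; the second is a stray argument.
unquoted = shlex.split("authdns-update -s ns1.wikimedia.org ns2.wikimedia.org")
```

Here `quoted[2]` holds both names, so both servers are skipped, while `unquoted[2]` holds only ns1.wikimedia.org and ns2.wikimedia.org would still be contacted.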
If you only need to change the geomaps, e.g. when a certain cluster is unreachable, then it's also possible to run authdns-scenario on each server and restart PowerDNS manually.
Resolvers
Each cluster has its own set of recursive resolvers:
- pmtpa: bayle (66.230.200.17), mchenry (66.230.200.18)
- knams: lily (91.198.174.6), bayle (66.230.200.17)
- yaseo: yf1019 (203.212.189.251), bayle (66.230.200.17)
Each resolver runs the PowerDNS recursor, using package pdns-recursor in the Wikimedia APT repository (universe). The configuration file is:
/etc/powerdns/recursor.conf
Some runtime control is available through rec_control, see http://docs.powerdns.com/rec-control.html
The following settings have been modified from the default:
allow-from
Lists the IP ranges that are allowed to query this recursor. 127/8 and internal and external Wikimedia IP ranges are listed.
forward-zones
Forwards queries for the internal zones to the authoritative nameserver(s):
forward-zones= wmnet=66.230.200.16, 10.in-addr.arpa=66.230.200.16
local-address
Comma separated list of IPs on which the recursor should listen for queries. List the (external) service IP, e.g. 66.230.200.17.
setgid, setuid
Change uid/gid to pdns. Unfortunately this account is not created by the Debian package, so use:
# adduser --system --no-create-home --group --disabled-password pdns
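To make the structure of the forward-zones value listed above easier to see, the following sketch splits it into a mapping from zone to forwarder addresses. The parser is hypothetical and handles only the simple form shown on this page; splitting multiple forwarders on semicolons is an assumption about the recursor's syntax and is not taken from this page.

```python
def parse_forward_zones(value):
    """Map each zone in a forward-zones value to its list of forwarders."""
    zones = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        zone, _, servers = entry.partition("=")
        # Assumption: several forwarders for one zone are separated by ';'.
        zones[zone.strip()] = [s.strip() for s in servers.split(";") if s.strip()]
    return zones

# The setting exactly as it appears in the list above.
LINE = "forward-zones= wmnet=66.230.200.16, 10.in-addr.arpa=66.230.200.16"
setting, _, value = LINE.partition("=")
FORWARDS = parse_forward_zones(value)
```

Both the internal forward zone (wmnet) and the reverse zone (10.in-addr.arpa) end up pointing at the same authoritative server, 66.230.200.16.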
Statistics
To set up statistics for the recursor, use the following steps:
- Install rrdtool
- Copy the directory /usr/local/powerdnsstats off one of the other recursors (bayle, mchenry)
- Install lighttpd or apache if not already present
- mkdir /var/www/pdns as root
- Run cd /var/www/pdns && /usr/local/powerdnsstats/create && wget http://bayle.wikimedia.org/pdns/index.html as root
- Set up the following cron job, in /etc/cron.d/pdns-recursor:
*/5 * * * * root cd /var/www/pdns/ && /usr/local/powerdnsstats/update && /usr/local/powerdnsstats/makegraphs >/dev/null