Lists.wikimedia.org
HowTo
Create a mailing list
There are 2 ways to create a mailing list:
- Via the web interface at http://lists.wikimedia.org/mailman/create - a list's creator password is needed. The site password works as well.
- At the shell prompt on lists.wikimedia.org (2012-03: sodium). As root, run newlist.
In both cases, it's not necessary to add e-mail aliases anywhere!
Remove a mailing list
To remove a mailing list, at the shell prompt on sodium run:
# rmlist listname
To also remove all archives, use:
# rmlist -a listname
Remove a message from the mailing list archives
Sometimes it's necessary to remove a message from a mailing list archive, e.g. when someone complains about its public visibility in search engines. However, although the mailing list archives are public, they are no longer indexed by search engines, as they are excluded in robots.txt.
Export a listing of all subscribers to a mailing list
- Login to the mailing list server
/var/lib/mailman/bin/list_members -f -o <file to write to> <list name>
Remove an individual from all mailing lists
Occasionally we need to remove an individual from every mailing list we have, such as when an email address no longer works but we don't want Mailman to turn it off due to bounce detection. The remove_members command is the solution: it is a command line utility that removes one or more email addresses from a specific list or from all lists.
- remove an individual from a specific list
/var/lib/mailman/bin/remove_members mylist user@example.com
- remove two addresses from all lists
/var/lib/mailman/bin/remove_members --nouserack --fromall user1@example.com user2@example.com
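When the same removal has to be scripted, the two invocations above can be assembled programmatically. The following is a minimal Python sketch, not an existing tool: the helper names build_remove_members_command and remove_members are hypothetical, and only the path and flags already shown above (--nouserack, --fromall) are used.

```python
import subprocess

# Path of the Mailman utility, as used in the commands above.
REMOVE_MEMBERS = "/var/lib/mailman/bin/remove_members"


def build_remove_members_command(addresses, listname=None):
    """Return the argv list for remove_members (hypothetical helper).

    With a list name, the addresses are removed from that list only;
    without one, --nouserack --fromall removes them from all lists.
    """
    addresses = list(addresses)
    if not addresses:
        raise ValueError("at least one address is required")
    if listname is not None:
        return [REMOVE_MEMBERS, listname] + addresses
    return [REMOVE_MEMBERS, "--nouserack", "--fromall"] + addresses


def remove_members(addresses, listname=None):
    """Run the command; intended to be executed on the mailing list server."""
    subprocess.check_call(build_remove_members_command(addresses, listname))
```

Building the argument list separately from running it makes it possible to print and review the exact command before anything is removed.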
Rename a mailing list
Upgrade Mailman
The Mailman package insists on the Mailman queue being empty during the upgrade. As messages are constantly coming in, that's not easy, and stopping the Mailman process doesn't prevent Exim from delivering messages to Mailman either. Therefore, the best solution is to make Exim hold messages on the queue by putting the line
hold_domains = lists.wikimedia.org
in /etc/exim4/exim4.conf, in the main configuration section. Run
# /etc/init.d/exim4 reload
afterwards. Then perform the apt-get upgrade, and revert the Exim4 configuration change.
Configuration details
The new Mailman setup lives on sodium, and uses the standard Ubuntu package mailman. The mailing list state is under /var/lib/mailman/, the global configuration is in /etc/mailman/.
The mail server used is Exim, the web server used is lighttpd.
Mailman setup
Mailman has fairly reasonable default values, and doesn't need a lot of changes from the defaults. The following settings were modified in /etc/mailman/mm_cfg.py:
# If you change these, you have to configure your http server
# accordingly (Alias and ScriptAlias directives in most httpds)
DEFAULT_URL_PATTERN = 'http://%s/mailman/'
PRIVATE_ARCHIVE_URL = '/mailman/private'
# Default domain for email addresses of newly created MLs
DEFAULT_EMAIL_HOST = 'lists.wikimedia.org'
# Default host for web interface of newly created MLs
DEFAULT_URL_HOST = 'lists.wikimedia.org'
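As an illustration of how these settings combine: DEFAULT_URL_PATTERN is an ordinary Python format string, and the web host is substituted for the %s placeholder. A minimal sketch using the values above; the variable name base_url exists only for this illustration.

```python
DEFAULT_URL_PATTERN = 'http://%s/mailman/'
DEFAULT_URL_HOST = 'lists.wikimedia.org'

# The %s placeholder is filled with the web host name.
base_url = DEFAULT_URL_PATTERN % DEFAULT_URL_HOST
print(base_url)  # http://lists.wikimedia.org/mailman/
```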
Exim recognizes which lists exist under @lists.wikimedia.org, so aliases are only needed in other domains:
# Uncomment this if you configured your MTA such that it
# automatically recognizes newly created lists.
# (see /usr/share/doc/mailman/README.{EXIM,...})
# MTA=None # Misnomer, suppresses alias output on newlist
MTA=None
# Set Reply-To to the list by default
DEFAULT_REPLY_GOES_TO_LIST = 0
htDig integration
See Mailman-htdig for details. htDig has been integrated in the Wikimedia Mailman Ubuntu package, and just needs to be enabled in mm_cfg.py.
Due to performance problems we were not using/updating the htDig indexes anyway - htDig integration has therefore been removed during the distribution upgrade to Hardy.
Mail server setup
Near the top of the exim4.conf file, there are several macros related to Mailman. These define system-specific settings/locations used by the router(s) and transport(s) in the rest of the configuration file. For a Debian/Ubuntu Mailman package, the following macros are accurate:
# Mailman
MAILMAN_HOME = /usr/lib/mailman
MAILMAN_LISTS_HOME = /var/lib/mailman
MAILMAN_WRAP = MAILMAN_HOME/mail/mailman
MAILMAN_UID = list
MAILMAN_GID = list
There's a domain list that contains a list of all domains that can "contain" mailing lists, i.e. the domains for which the Mailman router(s) should run. This list is also used as part of the "local domains" list, the list for which this mail server accepts mail and handles it locally.
domainlist mailman_domains = lists.wikimedia.org
domainlist local_domains = +system_domains : +mailman_domains
Main configuration
Several tweaks have been made to the main configuration to make Mailman delivery go smoothly.
In case of high load / lots of incoming connections, mail from the local host (including Mailman) and other Wikimedia servers is given preference:
smtp_reserve_hosts = <; 127.0.0.1 ; ::1 ; +wikimedia_nets
For big mailing lists, Mailman needs to hand over a lot of recipients and messages per connection. By default, Exim starts immediate delivery for only the first 10 messages received over a single SMTP connection and merely queues the rest for a subsequent queue runner, which can cause significant delays. The default Mailman limit is 500 recipients per connection, so raise Exim's limit to match:
smtp_accept_queue_per_connection = 500
Allow Exim to do 50 deliveries to remote hosts in parallel (this means 50 processes):
remote_max_parallel = 50
Routers
In Exim, the routers determine if a certain e-mail address is accepted for delivery or mail transport, and how it's going to be handled (routed). For Mailman, the following list router accepts a recipient that:
- has a domain in the domain list mailman_domains
- has a Mailman configuration file matching the local part (i.e. the mailing list exists)
Certain suffixes of the local part, e.g. "-bounces", are accepted as well.
When the router accepts the recipient address, it's set up for delivery using the list transport (see below).
# Mailman list handling. Test the mailing list address without suffix
# first, as a mailing list like wikifi-admin is a valid list name.
list:
  driver = accept
  domains = +mailman_domains
  require_files = MAILMAN_LISTS_HOME/lists/$local_part/config.pck
  transport = list

list_suffix:
  driver = accept
  domains = +mailman_domains
  require_files = MAILMAN_LISTS_HOME/lists/$local_part/config.pck
  local_part_suffix = -bounces : -bounces+* : \
                      -confirm+* : -join : -leave : \
                      -owner : -request : -admin : \
                      -subscribe : -unsubscribe
  transport = list
If the conditions for this router fail (i.e. the router is not run) then the no_more makes sure that no subsequent routers will be tried (in the current configuration there are none that might accept), and the recipient address is failed.
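The acceptance logic of the two routers above can be sketched in Python for illustration. This is not how Exim is implemented, only a minimal model of the conditions described: the function names route and list_exists are hypothetical, and the suffix handling assumes that a trailing * in local_part_suffix stands for arbitrary trailing text.

```python
import os

# Suffixes from the list_suffix router that must match exactly.
PLAIN_SUFFIXES = ("-bounces", "-join", "-leave", "-owner", "-request",
                  "-admin", "-subscribe", "-unsubscribe")
# Suffixes written as -bounces+* and -confirm+* in the router.
PLUS_MARKERS = ("-bounces+", "-confirm+")


def list_exists(lists_home, name):
    """Mirror require_files: a list exists if its config.pck is present."""
    return os.path.isfile(
        os.path.join(lists_home, "lists", name, "config.pck"))


def route(local_part, domain, lists_home="/var/lib/mailman",
          mailman_domains=("lists.wikimedia.org",)):
    """Return (list name, suffix) if the address is accepted, else None."""
    if domain not in mailman_domains:
        return None
    # Router "list": the whole local part is tested first, so that a
    # list named e.g. wikifi-admin is not mistaken for wikifi + -admin.
    if list_exists(lists_home, local_part):
        return (local_part, "")
    # Router "list_suffix": strip a known suffix and test the remainder.
    for suffix in PLAIN_SUFFIXES:
        if local_part.endswith(suffix):
            name = local_part[:-len(suffix)]
            if list_exists(lists_home, name):
                return (name, suffix)
    for marker in PLUS_MARKERS:
        index = local_part.find(marker)
        if index > 0 and list_exists(lists_home, local_part[:index]):
            return (local_part[:index], local_part[index:])
    return None
```

With a list called mylist present, route("mylist-owner", "lists.wikimedia.org") yields the list name together with the -owner suffix, while an address for a list that does not exist yields None.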
Transports
An Exim transport configures a way of transporting a message, e.g. over the network (SMTP), to a file (MBOX/Maildir/etc) or using a pipe to a process. The following transport sets up delivery to Mailman:
# Mailman pipe transport
list:
  driver = pipe
  command = MAILMAN_WRAP \
            '${if def:local_part_suffix \
                  {${sg{$local_part_suffix}{-(\\w+)(\\+.*)?}{\$1}}} \
                  {post}}' \
            $local_part
  current_directory = MAILMAN_LISTS_HOME
  home_directory = MAILMAN_LISTS_HOME
  user = MAILMAN_UID
  group = MAILMAN_GID
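The command option above is dense, so here is a small Python sketch of what the string expansion computes: the suffix matched by the router is reduced to a Mailman command name, and "post" is used when there is no suffix. The function names mailman_command and pipe_argv are hypothetical, and MAILMAN_WRAP is spelled out using the macro values from this page.

```python
import re

# MAILMAN_HOME/mail/mailman with the macros expanded.
MAILMAN_WRAP = "/usr/lib/mailman/mail/mailman"


def mailman_command(local_part_suffix):
    """Map a local part suffix to the command passed to the wrapper."""
    if not local_part_suffix:
        # No suffix: an ordinary posting to the list.
        return "post"
    # Same regular expression as in ${sg{...}}: keep the word after the
    # leading "-" and drop an optional "+..." tail.
    return re.sub(r"-(\w+)(\+.*)?", r"\1", local_part_suffix)


def pipe_argv(list_name, local_part_suffix=""):
    """Arguments the pipe transport would run for one recipient."""
    return [MAILMAN_WRAP, mailman_command(local_part_suffix), list_name]


print(pipe_argv("mylist"))
print(pipe_argv("mylist", "-bounces+user=example.com"))
```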
For content scanning, temporary mbox files are written to /var/spool/exim4/scan, and deleted after scanning. Similarly, Exim keeps "hints" databases in /var/spool/exim4/db, which are non-essential caches. To improve performance somewhat, these directories are mounted as tmpfs filesystems, using the following lines in /etc/fstab:
tmpfs /var/spool/exim4/scan tmpfs defaults 0 0
tmpfs /var/spool/exim4/db tmpfs defaults 0 0
Mailing list privacy protection
It has happened in the past that, by hitting the Reply All button in one's e-mail client, private info from an internal list leaked to a public mailing list because that list was in the CC list and the imprudent sender did not notice. In order to try to catch these incidents, a little filter has been implemented.
- If the To: or CC: header of a message matches an item in a list of private mailing list addresses, and
- the list of recipients as known by the mailing list server contains a Wikimedia mailing list that's not a private mailing list, then
- the message is bounced with the message Message rejected for privacy protection: The list of recipients contains both private and public mailing lists.
It's possible to circumvent this restriction by sending to the private list as a BCC, Blind Carbon Copy.
This filter is implemented using an Exim system filter:
# Exim filter
# Mailing list privacy protection
if foranyaddress $h_To:,$h_Cc: ( $thisaddress matches "\\N^(internal-l|private-l)@(lists\.|mail\.)?wiki[mp]edia\.org$\\N" ) then
  if foranyaddress $recipients ( $thisaddress matches "\\N@lists\.wikimedia\.org$\\N" and $thisaddress does not match "\\N^(internal-l|private-l)@\\N" ) then
    fail text "Message rejected for privacy protection: The list of recipients contains both private and public mailing lists"
  endif
endif
This filter is enabled in the configuration file using
system_filter = CONFDIR/system_filter
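To make the two conditions of the filter easier to follow, here is a minimal Python model of the same check. It is only a sketch: the function name violates_privacy is hypothetical, addresses are assumed to be already extracted from the headers and the envelope, and the patterns are compiled case-insensitively on the assumption that the lowercase matches condition in Exim filters ignores case.

```python
import re

# A private list named in To: or Cc:, under old or new domains.
PRIVATE_IN_HEADER = re.compile(
    r"^(internal-l|private-l)@(lists\.|mail\.)?wiki[mp]edia\.org$", re.I)
# Any list on the mailing list server.
ANY_LIST = re.compile(r"@lists\.wikimedia\.org$", re.I)
# The private lists among the envelope recipients.
PRIVATE_RECIPIENT = re.compile(r"^(internal-l|private-l)@", re.I)


def violates_privacy(header_addresses, envelope_recipients):
    """True if a private list is visible in To:/Cc: while the message
    is also being delivered to a public list on this server."""
    names_private = any(
        PRIVATE_IN_HEADER.search(a) for a in header_addresses)
    reaches_public = any(
        ANY_LIST.search(r) and not PRIVATE_RECIPIENT.search(r)
        for r in envelope_recipients)
    return names_private and reaches_public
```

The BCC workaround mentioned above follows directly from this logic: a blind-copied private list is an envelope recipient but does not appear in To: or Cc:, so the first condition is false and the message passes.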
Address header rewriting
It turned out that, after the migration, many users kept sending mails to both the old and the new mailing list addresses, thereby causing duplicate messages. To reduce this, Exim has been configured to rewrite the old mailing list addresses to the new ones in the To: and CC: headers, using the following option on the list transport:
list:
...
# Rewrite body headers of old mailing list addresses to new ones
headers_rewrite = \N^.*@(mail\.)?wiki[mp]edia\.org$\N "${if exists{MAILMAN_LISTS_HOME/lists/$local_part/config.pck}{$local_part@lists.wikimedia.org}fail}" ct
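The effect of this rewrite rule can be illustrated with a short Python sketch. It is a simplified model, not Exim's implementation: the function name rewrite_address is hypothetical, and the set existing_lists stands in for the check that a config.pck file exists for the local part.

```python
import re

# Old-style addresses: anything@wikimedia.org, anything@mail.wikipedia.org, ...
OLD_ADDRESS = re.compile(r"^(.*)@(mail\.)?wiki[mp]edia\.org$")


def rewrite_address(address, existing_lists):
    """Return the new list address, or the address unchanged.

    Only addresses whose local part is an existing list are rewritten;
    this corresponds to the "fail" branch leaving the header as it is.
    """
    match = OLD_ADDRESS.match(address)
    if match and match.group(1) in existing_lists:
        return match.group(1) + "@lists.wikimedia.org"
    return address


lists = {"mylist"}
print(rewrite_address("mylist@wikimedia.org", lists))
print(rewrite_address("someone@wikimedia.org", lists))
```

Addresses that are not mailing lists, such as personal addresses in the old domains, are therefore left untouched.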
Web server setup
To get Mailman running with lighttpd, a couple of small changes had to be made to the default configuration file. mod_cgi and mod_redirect need to be loaded:
server.modules = (
    "mod_access",
    "mod_alias",
    "mod_accesslog",
    "mod_redirect",
    "mod_cgi",
)
To make path /mailman/ invoke the correct CGI scripts, use:
# Mailman
alias.url = (
    "/mailman/" => "/usr/lib/cgi-bin/mailman/",
    "/pipermail/" => "/var/lib/mailman/archives/public/",
    "/images/" => "/usr/share/images/",
)
url.redirect = (
    "^/(index\.html?)?$" => "http://meta.wikimedia.org/wiki/Mailing_lists/overview",
    "^/mailman/?$" => "/mailman/listinfo"
)
$HTTP["url"] =~ "^/mailman/" {
    cgi.assign = ( "" => "" )
}
See also http://www.gnu.org/software/mailman/mailman-install/node10.html
SpamAssassin
SpamAssassin is installed using the default Ubuntu spamassassin package. A couple of configuration changes were made.
By default, spamd, if enabled, runs as root. To change this:
# adduser --system --home /var/lock/spamassassin --group --disabled-password --disabled-login spamd
The following settings were modified in /etc/default/spamassassin:
# Change to one to enable spamd
ENABLED=1
User preferences are disabled, spamd listens on the loopback interface only, and runs as user/group spamd:
OPTIONS="--max-children 5 --nouser-config --listen-ip=127.0.0.1 -u spamd -g spamd"
Run spamd with nice level 10:
# Set nice level of spamd
NICE="--nicelevel 10"
Backups
sodium is backed up to mchenry using rdiff-backup. The daily cron job is in /etc/cron.daily/zz_backup, and makes use of a dedicated ssh key, /root/.ssh/rdiff-backup, through /root/.ssh/config. The list of in-/excluded directories is in /etc/rdiff-backup/includes. The data is backed up to /var/backup/sodium/.
Tested failure modes
Because mail delivery and transport should be reliable, I have tested what happens in certain failure modes, e.g. when SpamAssassin's spamd daemon is not running.
Spamd not running
Because of the /defer_ok modifiers in the Exim ACLs, Exim will act as if no spam filtering attempts are made when spamd is not running, and will accept the message. The following lines are logged:
spam acl condition: warning - spamd connection to 127.0.0.1, port 783 failed: Connection refused
spam acl condition: all spamd servers failed
H=xxx.xxxxxxx.xx [xx.xx.xx.xx]:xxxx I=[145.97.39.157]:25 U=exim Warning: ACL "warn" statement skipped: condition test deferred
Mailman not running
If the Mailman queue runner daemons are not running, incoming messages will still get delivered to the Mailman queue by Exim. However, nothing else will happen until the Mailman processes are started.
TODO
- Mail server configuration fine tuning
- Mail server configuration documentation
- Mailman configuration fine tuning
- Spam filtering (current config?)
- htdig
- Backup MX
- Automatic mailing list index script (also, 404 handlers, robots.txt...)
- Migrating existing mailing lists, with announcements
- Redirection of old URL to new
- DNS Resolver
- Search engine for archive messages (Including private messages)
- Monitoring
- Backups
Migration
Configuration files can be copied to sodium just fine. Variables that may need to be changed are:
- reply_to_address
- host_name
Most of these can probably be changed automatically. The list's URLs are not present in the dumped list configuration file; a fix_url script, run through withlist, is provided to change them.
Archives can be copied by just transferring the .mbox file, and then rebuilding the archive from scratch with arch --wipe.
Migration to sodium
- Files in dirs lists, archives and data should be rsynced again, with Mailman disabled on both sides. dpkg-reconfigure mailman should be rerun.