Mail

From Wikitech

Revision as of 13:07, 12 May 2007

HowTo

This section lists some commonly needed actions and how to perform them.

Modify aliases

Right now mchenry is the mail relay which also does all aliasing. All domains are now separate and don't share the same alias file. Each domain has its own alias file in /etc/exim4/aliases/.

To add/modify/remove an alias, simply edit the corresponding text file on mchenry, e.g. /etc/exim4/aliases/wikimedia.org. No additional steps are necessary.

To use the same aliases for multiple domains you can use symbolic links, however be careful because unqualified targets (i.e. mail addresses without a domain, like noc) that are not listed in the same alias file (for example, OTRS queues) may not work as they do not exist in the symbolically linked domain. Use fully qualified addresses in that case.

Adding / removing OTRS queues and mail addresses

Just add the queue in OTRS with appropriate mail addresses and be happy. mchenry will automatically see that the queue exists or has disappeared, and no involvement from Wikimedia admins is necessary.

Under some circumstances it's possible that, due to negative caching at the secondary MXes, a new mail address will only start working after up to two hours.

Adding / removing mail domains

Set up DNS MX records with mchenry.wikimedia.org as the primary MX, and lists.wikimedia.org as secondary, and things should already start to work. You'll probably want to add an alias file on the primary mail relay though, or no mail will be accepted.

If you don't want to rely on DNS MX records alone, you can also add the domain to the file /etc/exim4/local_domains on the primary mail relay, and /etc/exim4/relay_domains on the secondary mail relays, but this is not a requirement.

Searching the logs

Exim 4's main log file is /var/log/exim4/mainlog. Using exigrep instead of grep may be helpful, as it combines (scattered) log lines per mail transaction.
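
To illustrate what "combining log lines per mail transaction" means, here is a minimal Python sketch of the idea. It is not how exigrep is implemented; the message ID pattern and the sample log lines are assumptions made up for the example.

```python
import re
from collections import defaultdict

# Assumed shape of an Exim 4 message ID: 6-6-2 alphanumeric characters.
MESSAGE_ID = re.compile(r"\b[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}\b")

def group_by_message(lines, pattern):
    """Return every log line of each message that has a line matching pattern."""
    by_id = defaultdict(list)
    for line in lines:
        match = MESSAGE_ID.search(line)
        if match:
            by_id[match.group(0)].append(line)
    wanted = re.compile(pattern)
    return {
        message_id: group
        for message_id, group in by_id.items()
        if any(wanted.search(entry) for entry in group)
    }

# Made-up sample lines in the general shape of an Exim main log.
SAMPLE = [
    "2007-05-12 13:00:01 1HmAAA-0000aa-01 <= alice@example.org H=mx.example.org [192.0.2.1]",
    "2007-05-12 13:00:02 1HmBBB-0000bb-02 <= bob@example.net H=mx.example.net [192.0.2.2]",
    "2007-05-12 13:00:03 1HmAAA-0000aa-01 => noc@wikimedia.org R=aliases T=remote_smtp",
    "2007-05-12 13:00:04 1HmAAA-0000aa-01 Completed",
]

if __name__ == "__main__":
    for message_id, group in group_by_message(SAMPLE, r"noc@wikimedia\.org").items():
        print(message_id)
        for entry in group:
            print("  " + entry)
```

A plain grep for the address would only show the delivery line; grouping by message ID also shows the arrival and completion lines of the same message. On the mail relay itself, exigrep takes the pattern followed by the log file name.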

Design decisions

Reliable mail delivery first 
Spam filtering and other tricks are needed, but reliable mail delivery for genuine mail should have a higher priority. False positives in mail rejects and incompatibilities should be kept to a minimum.
Black box mail system, no user shell logins 
Few users would make good use of this anyway. Greatly simplifies network and host security, allows the use of some (non-critical) non-standardized extensions between software components for greater performance, interoperability and features because it doesn't have to support whatever shell users might install to access things directly.
IMAP only, no POP3 
IMAP has good client support nowadays, and for a large part solves the problem of having multiple clients. Also backups can be done centrally on the server side, and multiple folders with server side mail filtering might be supported.
Support for mail submission 
Through SMTP authentication we can allow our users to submit mails through the mail server, without them having to configure an outgoing mail server for whatever network they reside on. Can support multiple ports/protocols to evade firewalls.
SSL/TLS access only, no plain-text 
Although client support for this is not 100% yet, especially on mobile devices, the risk of using plain-text protocols is too high, especially with users visiting conferences and other locations with insecure wireless networks.
Quota support 
Although we can set quotas generously, especially for those who need it, quotas should be implemented to protect the system.
Spam and virus filtering 
Is unfortunately necessary. Whether this should be global or per-user is to be determined.
Multi-domain support 
We have many domains, and the mail setup should be able to distinguish between domains where necessary.
Web access 
Some form of web-mail would be nice, although not critical at first and can be implemented at later stages.
Backups 
At least daily, with snapshots.
Cold failover 
Setting up a completely redundant system is probably a bit overkill at this stage, but we should make it easy and quick to set up a new mail system on other hardware in case of major breakage.
Documentation 
Although not all aspects of the involved software can be described of course, the specifics of the Wikimedia setup should be properly documented and HOWTOs for commonly needed tasks should be provided.

Software

MTA
Exim : Great flexibility, very configurable, reliable, secure.
IMAP server
Dovecot : Fast, secure, flexible.

Formats used

Maildir 
Safe, convenient format, moderately good performance, good software support.
Password and user databases 
To be determined. Important aspects: easy maintenance, good software support, replication support. Possible options:
  • passwd-file - Simple field-separated text file, non-indexed. Supported by both Exim and Dovecot.
  • sqlite - Indexed file format, powerful SQL queries, no full-blown RDBMS needed. Also easy to change to MySQL/PostgreSQL should that ever be necessary. Supported by both Exim and Dovecot.
Other data lookups 
either flat-file for small lists, or cdb for larger, indexed lookups.

Mailbox storage and mail delivery

Ext3 as file system 
ReiserFS may be a bit faster, but Ext3 is more reliable. Make sure directory indexes are enabled.
LVM 
For easy resizing, moving of data to other disks, and snapshots for backups.
RAID-1 
The new mail servers have hardware RAID controllers, we'll probably use them.
Dovecot's "deliver" as LDA 
Though Exim has a good internal Maildir "transport", the use of Dovecot's LDA allows it to use and update the Dovecot specific indexing for greater performance.
fcntl() and dot-file locking 
The lowest common denominator.
Maildir++ quotas 
Standard, reasonably fast.

Authentication

PLAIN authentication 
Universally supported for both IMAP and SMTP. Encrypted connections are used exclusively, so no elaborate hashing schemes needed.
SMD5 or SSHA password scheme 
Salted hashing.
SMTP authentication through either Exim's Dovecot authenticator, or using direct lookups 
Exim 4.64 has support for directly authenticating against Dovecot's authenticator processes, though this version is not in Ubuntu Feisty yet, so needs backporting. If direct lookups from Exim's authenticators are easy enough, use that. Also depends on the security model.
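
As an illustration of the salted hashing mentioned above, the following Python sketch builds and checks a password entry in the SSHA form as it is commonly defined: base64 of SHA-1(password + salt) followed by the salt. The function names and the 8-byte salt length are assumptions for the example, not part of the Wikimedia setup.

```python
import base64
import hashlib
import hmac
import os

PREFIX = "{SSHA}"

def ssha_hash(password, salt=None):
    """Return a salted SHA-1 entry; a random 8-byte salt is used if none is given."""
    if salt is None:
        salt = os.urandom(8)
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return PREFIX + base64.b64encode(digest + salt).decode("ascii")

def ssha_verify(password, stored):
    """Check a password against a stored {SSHA} entry."""
    if not stored.startswith(PREFIX):
        return False
    raw = base64.b64decode(stored[len(PREFIX):])
    digest, salt = raw[:20], raw[20:]  # a SHA-1 digest is 20 bytes long
    candidate = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return hmac.compare_digest(candidate, digest)

if __name__ == "__main__":
    entry = ssha_hash("example password")
    print(entry, ssha_verify("example password", entry))
```

Because the salt is random, hashing the same password twice gives two different entries, which is the point of a salted scheme.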

Layout

The mail setup consists of 2 general mail servers, plus a mailing lists server (lily) and an OTRS server. The two general mail servers are mchenry and sanger.
Wikimedia mail setup

One server (mchenry) acts as relay; it accepts mail connections from outside, checks them for spam, viruses and other policy checks, and then queues and/or forwards to the appropriate internal mail server. It also accepts mail destined for outside domains from internal servers, including the application servers.

The other server, sanger, is the IMAP server. It accepts mail from mchenry and delivers it to local user mailboxes. Outgoing mail from SMTP authenticated accounts is also accepted on this server, and forwarded to mchenry, where it's queued and sent out. Web mail and other supportive applications related to user mail accounts and their administration will also run on sanger.

Lily, the mailing lists server, also acts as a secondary MX and forwards non-mailing list mail to mchenry. In case of downtime of mchenry, it might be able to send partial (IMAP account) mail to sanger directly, depending on the added complexity of the configuration. During major hardware failure of sanger, it should be possible to set up mchenry (with identical hardware) as the IMAP server.

Configuration details

Mail relay

The current mail relay is mchenry.

As a mail relay needs to do a lot of DNS lookups, it's a good place for a DNS resolver, and therefore mchenry is pmtpa's secondary DNS recursor - although mchenry uses its own resolver as primary.

mchenry uses Exim 4, the standard Ubuntu Feisty exim4 exim4-daemon-heavy package. This package does some stupid things like running under a Debian-exim user, but not enough to warrant running our own modified version. All configuration lives in /etc/exim4, where exim4.conf is Exim's main configuration file.

The following domain and host lists are defined near the top of the configuration file:

# Standard lists
hostlist wikimedia_nets = <; 66.230.200.0/24 ; 145.97.39.128/26 ; 211.115.107.128/26 ; 2001:610:672::/64
domainlist system_domains = @
domainlist relay_domains =
domainlist legacy_mailman_domains = wikimedia.org : wikipedia.org
domainlist local_domains = +system_domains : +legacy_mailman_domains : lsearch;CONFDIR/local_domains : @mx_primary/ignore=127.0.0.1

system_domains is a list for domains related to the functioning of the local system, e.g. mchenry.wikimedia.org and associated system users. It has little relevance to the rest of the Wikimedia mail setup, but makes sure that mail submitted by local software is handled properly.

relay_domains is a list for domains that are allowed to be relayed through this host.

local_domains is a compound list of all domains that are in some way processed locally. They are not routed using the standard dnslookup router. Besides the domains listed in /etc/exim4/local_domains, mail will also be accepted for any domain which has mchenry (or one of its interface IP addresses) listed in DNS as the primary MX. This could get abused by people having control over some arbitrary DNS zone of course, but since typically no alias file for it will exist, no mail address will be accepted in that case anyway.

For content scanning, temporary mbox files are written to /var/spool/exim4/scan, and deleted after scanning. To improve performance somewhat, this directory is mounted as a tmpfs filesystem, using the following line in /etc/fstab:

tmpfs   /var/spool/exim4/scan   tmpfs   defaults        0       0

Resource limits

To behave gracefully under load, some resource limits are applied in the main configuration section:

# Resource control
check_spool_space = 50M

No mail delivery if there's less than 50MB free.

deliver_queue_load_max = 75.0
queue_only_load = 50.0

No mail delivery if system load is > 75, and queue-only (without immediate delivery) when load is > 50.
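
The interplay of the two load thresholds can be summarised in a small Python sketch. It only models the decision described above; the function name and the returned labels are made up for the example.

```python
DELIVER_QUEUE_LOAD_MAX = 75.0  # deliver_queue_load_max from the configuration
QUEUE_ONLY_LOAD = 50.0         # queue_only_load from the configuration

def delivery_mode(load_average):
    """Describe how mail is handled at a given system load average."""
    if load_average > DELIVER_QUEUE_LOAD_MAX:
        # New mail is queued and queue runs do not deliver either.
        return "no delivery"
    if load_average > QUEUE_ONLY_LOAD:
        # New mail is queued; it is delivered later by a queue run.
        return "queue only"
    return "immediate delivery"

if __name__ == "__main__":
    for load in (3.0, 60.0, 80.0):
        print(load, delivery_mode(load))
```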

smtp_accept_max = 100
smtp_accept_max_per_host = 10

Accept at most 100 simultaneous SMTP connections, and at most 10 from the same host.

smtp_reserve_hosts = <; 127.0.0.1 ; ::1 ; +wikimedia_nets

Reserve SMTP connection slots for our own servers.

smtp_accept_queue_per_connection = 500

If more than 500 mails are sent in one connection, queue them without immediate delivery.

remote_max_parallel = 25

Invoke at most 25 parallel delivery processes.

smtp_connect_backlog = 32

TCP SYN backlog parameter.

Aliases

Each Wikimedia domain (wikimedia.org, wikipedia.org, wiktionary.org, etc...) is now distinct and has its own aliases file, under /etc/exim4/aliases/. Alias files use the standard format. Unqualified address targets in the alias file (local parts without domain) are qualified to the same domain. Special :fail: and :defer: targets and pipe commands are also supported, see http://www.exim.org/exim-html-4.66/doc/html/spec_html/ch22.html#SECTspecitredli.

The following router takes care of this. It's run for all domains in the +local_domains domain list defined near the top of the Exim configuration file. It checks whether the file /etc/exim4/aliases/$domain exists, and then uses it to do an alias lookup.

# Use alias files /etc/exim4/aliases/$domain for domains like
# wikimedia.org, wikipedia.org, wiktionary.org etc.

aliases:
       driver = redirect
       domains = +local_domains
       require_files = CONFDIR/aliases/$domain
       data = ${lookup{$local_part}lsearch{CONFDIR/aliases/$domain}}
       qualify_preserve_domain
       allow_fail
       allow_defer
       forbid_file
       include_directory = CONFDIR
       pipe_transport = address_pipe
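
To make the effect of qualify_preserve_domain concrete, here is a simplified Python model of the lookup. It is only a sketch: the parser handles plain "key: target, target" lines, not the full alias file syntax, and the sample file contents are hypothetical.

```python
# Hypothetical contents of an alias file such as aliases/wikimedia.org.
SAMPLE_ALIASES = """\
# comment lines and blank lines are ignored
noc: alice, bob@example.org
webmaster: noc
"""

def parse_aliases(text):
    """Parse simple 'local_part: target, target' lines into a dictionary."""
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        aliases[key.strip()] = [t.strip() for t in value.split(",") if t.strip()]
    return aliases

def resolve(local_part, domain, aliases):
    """Return the alias targets for local_part@domain, or None if there is no alias."""
    targets = aliases.get(local_part)
    if targets is None:
        return None  # in Exim terms: the router declines
    qualified = []
    for target in targets:
        if "@" in target or target.startswith((":", "|", "/")):
            qualified.append(target)  # already qualified, or a special target
        else:
            qualified.append(target + "@" + domain)  # same domain as the original address
    return qualified

if __name__ == "__main__":
    aliases = parse_aliases(SAMPLE_ALIASES)
    print(resolve("noc", "wikimedia.org", aliases))
    print(resolve("webmaster", "wikipedia.org", aliases))
```

The second call shows why symbolically linked alias files need care: the unqualified target noc becomes noc@wikipedia.org, which only works if that address exists in the linked domain as well.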

OTRS

For OTRS, the mail relay queries the OTRS MySQL servers directly to check the existence of an OTRS mail address. This implies that newly created OTRS queues / mail addresses will start to work immediately and no involvement from Wikimedia admins is needed.

The MySQL servers are specified near the top of the Exim configuration file:

# MySQL lookups (OTRS)
hide mysql_servers = srv7.wikimedia.org/otrs/exim/password : \
                     srv8.wikimedia.org/otrs/exim/password

These servers will be queried in turn. If neither of these servers responds, or they respond with an error, the mail will be deferred. A MySQL user account "exim" with (just) SELECT privileges on the system_address table of the otrs database needs to exist, which is accessible from the mail relay (mchenry.wikimedia.org).

The following router does the actual aliasing of the OTRS address to otrs@ticket.wikimedia.org, if the OTRS queue address exists in the database:

# Query the OTRS MySQL server(s) for the existence of the queue address
# $local_part@$domain, and alias to otrs@ticket.wikimedia.org if
# successful.

otrs:
       driver = redirect
       domains = +local_domains
       condition = ${lookup mysql{SELECT value0 FROM system_address WHERE value0='${quote_mysql:$local_part@$domain}'}{true}fail}
       data = otrs@ticket.wikimedia.org
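
The logic of this router can be tried out with the following Python sketch. As an assumption for the sake of a self-contained example, it uses an in-memory SQLite database instead of the OTRS MySQL servers, and the address in the table is invented. The parameterised query plays the role that quote_mysql plays in the Exim condition.

```python
import sqlite3

OTRS_TARGET = "otrs@ticket.wikimedia.org"

def demo_database():
    """Create a stand-in for the system_address table with one made-up queue address."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE system_address (value0 TEXT)")
    conn.execute("INSERT INTO system_address VALUES ('info-en@wikimedia.org')")
    return conn

def otrs_route(conn, local_part, domain):
    """Return the OTRS target if the address is a queue address, else None."""
    address = local_part + "@" + domain
    row = conn.execute(
        "SELECT value0 FROM system_address WHERE value0 = ?", (address,)
    ).fetchone()
    return OTRS_TARGET if row is not None else None

if __name__ == "__main__":
    conn = demo_database()
    print(otrs_route(conn, "info-en", "wikimedia.org"))
    print(otrs_route(conn, "nobody", "wikimedia.org"))
```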

SpamAssassin

SpamAssassin is installed using the default Ubuntu spamassassin package. A couple of configuration changes were made.

By default, spamd, if enabled, runs as root. To change this:

# adduser --system --home /var/lock/spamassassin --group --disabled-password --disabled-login spamd

The following settings were modified in /etc/default/spamassassin:

# Change to one to enable spamd
ENABLED=1

User preferences are disabled, spamd listens on the loopback interface only, and runs as user/group spamd:

OPTIONS="--max-children 5 --nouser-config --listen-ip=127.0.0.1 -u spamd -g spamd"

Run spamd with nice level 10:

# Set nice level of spamd
NICE="--nicelevel 10"

In Exim, SpamAssassin is called from the DATA ACL for domains in domain list spamassassin_domains. exim4.conf:

domainlist spamassassin_domains = *
acl_smtp_data = acl_check_data
acl_check_data:
        # Let's trust local senders to not send out spam
        accept hosts = +relay_from_hosts

        # Run through spamassassin
        accept endpass
               acl = spamassassin

spamassassin:

        # Only run through SpamAssassin if requested for this domain and
        # the message is not too large
        accept condition = ${if >{$message_size}{40K}}

        # Add spam headers if score >= 1
        warn spam = nonexistent:true
             condition = ${if >{$spam_score_int}{10}{1}{0}}
             add_header = X-Spam-Score: $spam_score ($spam_bar)
             add_header = X-Spam-Report: $spam_report

        # Reject spam at high scores (> 12)
        deny message = This message scored $spam_score spam points.
             spam = nonexistent/defer_ok
             condition = ${if >{$spam_score_int}{120}{1}{0}}

        accept

First, mail for domains not listed in spamassassin_domains is accepted, as is mail larger than 40 KB. Then a spam check is done using the local spamd daemons. $spam_score_int holds the spam score multiplied by ten, so if the score is above 1, X-Spam-Score: and X-Spam-Report: headers are added. If the spam score is above 12, the mail is rejected outright.
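
The thresholds can be checked with a small Python model of the ACL. It is a sketch only: it leaves out the domain check and the exemption for local senders, and the function name and return format are made up for the example.

```python
MAX_SCAN_SIZE = 40 * 1024  # 40K as used in the ACL condition

def spam_decision(message_size, spam_score):
    """Return (verdict, headers_to_add) for a message of the given size and score."""
    if message_size > MAX_SCAN_SIZE:
        return ("accept", [])  # too large, not scanned at all
    score_int = int(round(spam_score * 10))  # same scale as $spam_score_int
    if score_int > 120:
        return ("reject", [])
    if score_int > 10:
        return ("accept", ["X-Spam-Score", "X-Spam-Report"])
    return ("accept", [])

if __name__ == "__main__":
    for size, score in ((2000, 0.3), (2000, 5.4), (2000, 15.0), (100000, 15.0)):
        print(size, score, spam_decision(size, score))
```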

Mailing lists

Mailing lists now live on a dedicated mailing lists server (lily) on a dedicated mail domain lists.wikimedia.org. However, mail for the old addresses such as info-en@wikipedia.org still comes in and should be rewritten to the new addresses, and then forwarded to the mailing lists server.

Near the top of the Exim configuration file a domain list is defined, which contains mail domains that can contain these old addresses:

domainlist legacy_mailman_domains = wikimedia.org : wikipedia.org : mail.wikimedia.org : mail.wikipedia.org

The following router, near the end of the routers section, checks if a given local part exists in the file /etc/exim4/legacy_mailing_lists, and rewrites it to the new address if it does, to be routed via the normal DNS MX/SMTP routers/transports. Since Mailman does not distinguish between domains, only a single local parts file for all legacy mailman domains exists. This file only needs to contain the mailing list names; all suffixes are handled by the router.

# Alias old mailing list addresses to @lists.wikimedia.org on lily

legacy_mailing_lists:
       driver = redirect
       domains = +legacy_mailman_domains
       data = $local_part$local_part_suffix@lists.wikimedia.org
       local_parts = lsearch;CONFDIR/legacy_mailing_lists
       local_part_suffix = -bounces : -bounces+* : \
                               -confirm+* : -join : -leave : \
                               -owner : -request : -admin : \
                               -subscribe : -unsubscribe
       local_part_suffix_optional
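
The rewriting done by this router can be sketched in Python as follows. This is a simplified model, not Exim's matching code: it assumes a trailing * in a suffix matches any remaining characters, and the list names passed in stand for the contents of the legacy_mailing_lists file.

```python
from fnmatch import fnmatchcase

# Suffixes as listed in the router's local_part_suffix option.
SUFFIXES = [
    "-bounces", "-bounces+*", "-confirm+*", "-join", "-leave",
    "-owner", "-request", "-admin", "-subscribe", "-unsubscribe",
]

def rewrite_legacy(local_part, list_names):
    """Return the new lists.wikimedia.org address, or None if the router would not match."""
    for name in list_names:
        if local_part == name:
            return name + "@lists.wikimedia.org"
        if local_part.startswith(name):
            suffix = local_part[len(name):]
            if any(fnmatchcase(suffix, pattern) for pattern in SUFFIXES):
                return name + suffix + "@lists.wikimedia.org"
    return None

if __name__ == "__main__":
    lists = ["info-en"]
    print(rewrite_legacy("info-en", lists))
    print(rewrite_legacy("info-en-bounces+alice=example.org", lists))
    print(rewrite_legacy("noc", lists))
```

Only the bare list name has to be present in the file; the suffixed variants are recognised by stripping a known suffix first, which is what local_part_suffix together with local_part_suffix_optional is used for.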

Wiki mail

The application servers send out mail for wiki password reminders/changes, and e-mail notification on changes if enabled. These automated mass mailings are also accepted by the mail relay, mchenry, but are treated somewhat separately. To minimize the chance of external mail servers blocking mchenry's regular mail because of mass emails, these "wiki mails" are sent out using a separate IP.

Near the top of the configuration a macro is defined for the IP address to accept incoming wiki mail, and to use for sending it out to the world:

WIKI_INTERFACE=66.230.200.216

A hostlist is defined for the IP ranges that are allowed to relay from:

hostlist relay_from_hosts = <; @[] ; 66.230.200.0/24 ; 10.0.0.0/16

The rest of the configuration file uses the incoming interface address to distinguish wiki mail from regular mail. Therefore care must be taken that external hosts cannot connect using this interface address. A SMTP Connect ACL takes care of this:

# Policy control
acl_smtp_connect = acl_check_connect
acl_check_connect:
       # Deny external connections to the internal bulk mail submission
       # interface

       deny condition = ${if match_ip{$interface_address}{WIKI_INTERFACE}{true}{false}}
            ! hosts = +wikimedia_nets

       accept
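
The effect of this ACL can be modelled in a few lines of Python using the standard ipaddress module. The networks and the interface address are taken from the configuration above; the function name is made up, and the sketch ignores everything else that happens at connect time.

```python
import ipaddress

WIKI_INTERFACE = ipaddress.ip_address("66.230.200.216")
WIKIMEDIA_NETS = [
    ipaddress.ip_network(net)
    for net in (
        "66.230.200.0/24",
        "145.97.39.128/26",
        "211.115.107.128/26",
        "2001:610:672::/64",
    )
]

def connection_allowed(interface_address, sender_host_address):
    """Deny external hosts on the wiki mail interface, accept everything else."""
    interface = ipaddress.ip_address(interface_address)
    host = ipaddress.ip_address(sender_host_address)
    internal = any(host in net for net in WIKIMEDIA_NETS)
    if interface == WIKI_INTERFACE and not internal:
        return False
    return True

if __name__ == "__main__":
    print(connection_allowed("66.230.200.216", "66.230.200.10"))  # internal host
    print(connection_allowed("66.230.200.216", "192.0.2.1"))      # external host
```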

Wiki mail gets picked up by the first router, selecting on sender address and incoming interface address:

# Route mail generated by MediaWiki differently

wiki_mail:
       driver = dnslookup
       domains = ! +local_domains
       senders = wiki@wikimedia.org
       condition = ${if match_ip{$interface_address}{WIKI_INTERFACE}{true}{false}}
       transport = bulk_smtp
       ignore_target_hosts = <; 0.0.0.0 ; 127.0.0.0/8 ; 0::0/0 ; 10/8 ; 172.16/12 ; 192.168/16
       no_verify

The router directs to a separate SMTP transport, bulk_smtp. no_verify is set because mails from the application servers are not verified anyway, to be as liberal as possible with incoming mails and keep the queues on the application servers small. Queue handling should be done on the mail relay. For other mail, this router is not applicable so is not needed for verification either.

The bulk_smtp transport sets a different outgoing interface IP address, and a separate HELO string:

# Transport for sending out automated bulk (wiki) mail

bulk_smtp:
       driver = smtp
       hosts_avoid_tls = <; 0.0.0.0/0 ; 0::0/0
       interface = WIKI_INTERFACE
       helo_data = wiki-mail.wikimedia.org

Wiki mail also has a shorter retry/bounce time than regular mail; only 8 hours:

begin retry

*       *       senders=wiki@wikimedia.org      F,1h,15m; G,8h,1h,1.5
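
To see roughly what this retry rule means in practice, the following Python sketch lists approximate retry times in minutes since the first failure. It is a simplification: it assumes each retry happens exactly when it becomes due, whereas in reality the times also depend on when queue runs take place.

```python
# (kind, cutoff in minutes, first interval in minutes, factor)
# F,1h,15m     -> fixed 15 minute intervals during the first hour
# G,8h,1h,1.5  -> intervals start at 1 hour and grow by a factor of 1.5, until 8 hours
WIKI_MAIL_RULES = [("F", 60, 15, 1.0), ("G", 480, 60, 1.5)]

def retry_schedule(rules):
    """Return approximate retry times in minutes since the first failure."""
    elapsed = 0.0
    attempts = []
    for kind, cutoff, interval, factor in rules:
        step = float(interval)
        while elapsed + step <= cutoff:
            elapsed += step
            attempts.append(elapsed)
            if kind == "G":
                step *= factor  # geometric rule: each interval is longer than the last
    return attempts

if __name__ == "__main__":
    print(retry_schedule(WIKI_MAIL_RULES))
```

Under these assumptions a wiki mail is retried about seven times before the 8 hour cutoff is reached and the message is bounced.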

Postmaster

For any local domain, postmaster@ should be accepted even if it's forgotten in alias files. A special redirect router takes care of this:

# Redirect postmaster@$domain if it hasn't been accepted before

postmaster:
       driver = redirect
       domains = +local_domains
       local_parts = postmaster
       data = postmaster@$primary_hostname
       cannot_route_message = Address $local_part@$domain does not exist

Internal address rewriting

Internal servers in the .pmtpa.wmnet domain sometimes send out mail, which gets rejected by mail servers in the outside world. Sender domain address verification cannot resolve the domain .pmtpa.wmnet, and the mail gets rejected. To solve this, mchenry rewrites the Envelope From to root@wikimedia.org for any mail that has a .pmtpa.wmnet sender address:

#################
# Rewrite rules #
################# 

begin rewrite

# Rewrite the envelope From for mails from internal servers in *.pmtpa.wmnet,
# as they are usually rejected by sender domain address verification.
*@*.pmtpa.wmnet root@wikimedia.org      F
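
The pattern in this rewrite rule can be tried out with a short Python sketch. It approximates Exim's address pattern with a shell-style wildcard match, which is an assumption that holds for this simple pattern only; the host name srv1 in the example is made up.

```python
from fnmatch import fnmatchcase

INTERNAL_SENDER_PATTERN = "*@*.pmtpa.wmnet"
REWRITTEN_SENDER = "root@wikimedia.org"

def rewrite_envelope_from(sender):
    """Rewrite envelope senders in *.pmtpa.wmnet, leave all others untouched."""
    if fnmatchcase(sender.lower(), INTERNAL_SENDER_PATTERN):
        return REWRITTEN_SENDER
    return sender

if __name__ == "__main__":
    print(rewrite_envelope_from("root@srv1.pmtpa.wmnet"))
    print(rewrite_envelope_from("alice@example.org"))
```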

Secondary mail relay

lily is Wikimedia's secondary mail relay. It should do the same policy checks on incoming mail as the primary mail relay, so make sure its ACLs are equivalent for the relevant domains.

Lily does not have a copy/cache of the local parts which are accepted by the primary relay, as that is a dynamic process. Instead, it uses recipient address verification callouts, i.e. it asks the primary mail relay whether a recipient address would be accepted or not. In case the primary mail relay is unreachable, or does not respond within 5-30s, the address is assumed to exist and the mail is accepted - it is, after all, a backup MX. Callouts are cached, so resources are saved for frequently appearing destination addresses.

Relay domains

Secondary mail relays will relay for any domain for which the following holds:

  1. The domain is listed in a static text file of domains: /etc/exim4/relay_domains, or
  2. The secondary mail relay is listed as a secondary MX in DNS for the domain, and
  3. The higher priority MXes are in a configured list of allowed primaries

The latter is to prevent abuse; we don't really want people with control over a DNS zone abusing our mail servers as backup MXes.

Near the top of the configuration file, two domain lists are defined for domains to relay for:

domainlist relay_domains = lsearch;CONFDIR/relay_domains
domainlist secondary_domains = @mx_secondary/ignore=127.0.0.1

relay_domains contains domains explicitly listed in the text file /etc/exim4/relay_domains, and secondary_domains queries DNS whether the local host is listed as a secondary MX. Note: the two lists will usually overlap.

A host list is defined with accepted primary mail relays. This list should only contain IP addresses; these are the only addresses to which @mx_secondary domains will be relayed. For domains explicitly configured in relay_domains, it doesn't matter what the primary MX is.

@mx_secondary domains use a separate dnslookup router, to check the higher priority MX records:

# Relay @mx_secondary domains only to these hosts
hostlist primary_mx = 66.230.200.240
# Route relay domains only if the higher prio MXes are in the allowed list

secondary:
       driver = dnslookup
       domains = ! +relay_domains : +secondary_domains
       transport = remote_smtp
       ignore_target_hosts = ! +primary_mx
       cannot_route_message = Primary MX(s) for $domain not in the allowed list
       no_more

All relevant (= higher priority) MX records not in hostlist primary_mx are removed from the list for consideration by Exim. In case there are no higher priority MX records which coincide with the primary_mx list, the MX list will be empty and the router will decline. As this router is run during address verification in the SMTP session as well, the RCPT command will be rejected.
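
The filtering described here can be modelled with a short Python sketch. It is an illustration of the decision only: MX records are given as (preference, IP address) pairs, and the address used for the local host in the example is made up.

```python
PRIMARY_MX = {"66.230.200.240"}  # hostlist primary_mx from the configuration

def secondary_route(mx_records, local_host):
    """Return the hosts to relay to, or None if the router would decline."""
    local_prefs = [pref for pref, host in mx_records if host == local_host]
    if not local_prefs:
        return None  # the local host is not an MX for this domain at all
    # Only MX records with a higher priority (lower number) than our own are relevant.
    higher = [(pref, host) for pref, host in mx_records if pref < min(local_prefs)]
    allowed = [host for pref, host in sorted(higher) if host in PRIMARY_MX]
    return allowed or None

if __name__ == "__main__":
    local = "192.0.2.25"  # made-up address standing in for the secondary relay
    print(secondary_route([(10, "66.230.200.240"), (50, local)], local))
    print(secondary_route([(10, "198.51.100.1"), (50, local)], local))
```

The second call returns None: somebody listed the relay as backup MX for a domain whose primary is not one of ours, so the recipient is rejected.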

Exim's dnslookup router has a precondition check check_secondary_mx. However, the secondary_domains domainlist serves the same purpose, and using both at the same time in fact doesn't work, as by the time the check_secondary_mx check is run, Exim will already have removed the local host from the MX list (due to ignore_target_hosts), and the router will decline to run.

Note: this router should not be run for domains in domainlist relay_domains, as for those domains, the MX rules need not be as stringent. They can be handled by the regular dnslookup router:

# Route non-local domains (including +relay_domains) via DNS MX and A records

dnslookup:
       driver = dnslookup
       domains = ! +local_domains
       transport = remote_smtp
       ignore_target_hosts = <; 0.0.0.0 ; 127.0.0.0/8 ; 10/8 ; 172.16/12 ; 192.168/16
       cannot_route_message = Cannot route to remote domain $domain
       no_more

IMAP server

The IMAP server is sanger. It only receives e-mail destined for its IMAP accounts; other mail is handled by mchenry. Outgoing mail is not sent directly, but routed via the mail relays, so the IMAP server should never build up a large mail queue itself.

Mail is stored under the directory /var/vmail, which should be created with the correct permissions:

# mkdir /var/vmail
# chown root:vmail /var/vmail
# chmod g+s /var/vmail

Mail storage uses a single system user account vmail, which has been created with the command

# adduser --system --home /var/vmail --no-create-home --group --disabled-password --disabled-login vmail

User Debian-exim needs to be part of the vmail group to access the mail directories:

# gpasswd -a Debian-exim vmail

Smart host

The last Exim router in the configuration file handles (outgoing) mail not destined for the local server; it sends mail for all domains to mchenry.wikimedia.org, or lists.wikimedia.org if the former is down.

# Send all mail not destined for the local machine via a set of
# mail relays ("smart hosts")

smart_route:
       driver = manualroute
       transport = remote_smtp
       route_list = *  mchenry.wikimedia.org:lists.wikimedia.org

See also

External documentation
