Squid logging
Squid side
We use our own version of the squid pipe logging patch, [1] based on [2]. It is in the Wikimedia APT repository starting with release squid_2.6.7-wm3.
The pipe logging patch can only start a program without command-line arguments, so we use a helper script at /usr/bin/log2udp_wm to invoke /usr/bin/log2udp with the appropriate command line. To change the port or destination host, edit this script.
The program will be started when squid starts, and it will be restarted on a HUP signal. Under certain error conditions, it may exit, in which case squid will log a "pipe broken" error to cache.log and abort. In normal operation, it should be quite robust.
Aggregation
UDP packets are sent from all squids to the log host henbane.yaseo.wikimedia.org. A program called udp2log runs there. udp2log logs to an arbitrary number of destination files or pipes, specified in /etc/udp2log. The configuration file format is line based, with each line containing:
pipe <factor> <command line>
or
file <factor> <filename>
Unlike the squid pipe log, these pipes are full shell commands and may contain spaces. There is no quoting: the command runs from just after the factor to the end of the line.
<factor> is an integer sampling factor: one in every <factor> packets is sent to the designated destination.
Comments are lines starting with "#".
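The format above is simple enough to illustrate with a short parser. The following Python sketch is not the udp2log source; it only shows how the three fields can be split so that spaces in the command line survive. The class name Destination, the function name parse_config, the skipping of blank lines, and the counter-based sampling in wants() are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Destination:
    kind: str      # "pipe" or "file"
    factor: int    # sampling factor from the config line
    target: str    # shell command line (pipe) or filename (file)
    seen: int = 0  # packets counted since the last one that was sent

    def wants(self) -> bool:
        """Return True for one packet in every `factor` (assumed counter-based)."""
        self.seen += 1
        if self.seen >= self.factor:
            self.seen = 0
            return True
        return False


def parse_config(text: str) -> List[Destination]:
    """Parse lines of the form 'pipe <factor> <command>' or 'file <factor> <filename>'."""
    destinations = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Comment lines start with "#"; skipping blank lines is an assumption.
        if not line or line.startswith("#"):
            continue
        # Split at most twice: everything after the factor is the target,
        # spaces included, because the format has no quoting.
        parts = line.split(None, 2)
        if len(parts) != 3 or parts[0] not in ("pipe", "file"):
            raise ValueError(f"line {lineno}: expected 'pipe|file <factor> <target>'")
        factor = int(parts[1])
        if factor < 1:
            raise ValueError(f"line {lineno}: factor must be a positive integer")
        destinations.append(Destination(parts[0], factor, parts[2]))
    return destinations
```

With a factor of 1 every packet is passed on; with a factor of 10 the destination sees every tenth packet.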
Reconfiguration can be done by sending a HUP signal to udp2log. The sequence of events when HUP is received is as follows:
- HUP is received, a flag is set
- Wait for the next UDP packet
- Load the new configuration, open all files and pipes specified
- If all pipes and files were successfully opened, swap in the configuration and close all previously opened pipes and files. Otherwise, close the new pipes and files and retain the old configuration.
- Process the received packet using the new configuration
- Resume the recv() loop
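The steps above can be sketched in Python as follows. This is an illustration of the open-then-swap pattern, not the udp2log implementation: the function names and the load_config, open_dest and handle_packet callbacks are hypothetical, and the handles are assumed to be any objects with a close() method.

```python
import signal

reload_requested = False


def on_hup(signum, frame):
    # The handler only sets a flag; the reload itself happens in the main loop.
    global reload_requested
    reload_requested = True


def try_reload(current, load_config, open_dest):
    """Open every destination in the new config; swap only if all succeed."""
    opened = []
    try:
        for spec in load_config():
            opened.append(open_dest(spec))
    except (OSError, ValueError):
        # Something failed: close what was newly opened, keep the old set.
        for handle in opened:
            handle.close()
        return current
    # Everything opened: the old pipes and files are closed only now.
    for handle in current:
        handle.close()
    return opened


def serve(sock, load_config, open_dest, handle_packet):
    global reload_requested
    signal.signal(signal.SIGHUP, on_hup)
    destinations = try_reload([], load_config, open_dest)
    while True:
        # Python retries recv() after the signal handler returns,
        # so this keeps waiting for the next UDP packet.
        packet = sock.recv(65535)
        if reload_requested:
            reload_requested = False
            destinations = try_reload(destinations, load_config, open_dest)
        # The packet that triggered the reload is handled by the new config.
        handle_packet(packet, destinations)
```

Note that in try_reload the new handles exist before the old ones are closed, so for a short time both sets are open.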
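Because the new pipes are opened before the old ones are closed, the old and new instances of a pipe script briefly run at the same time. Pipe scripts should therefore not obtain any lock on a shared resource until after the first line is received on stdin.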
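A pipe script that follows this rule might look like the Python sketch below. It is an assumed example, not one of the scripts in use: the function name run, the lock file path passed to it, and the use of flock() as the locking mechanism are all hypothetical. The point is only that the lock is taken inside the loop, after the first line has arrived, rather than at startup.

```python
import fcntl


def run(lines, lock_path, out):
    """Copy lines to `out`, taking an exclusive lock only once input arrives."""
    lock_file = None
    try:
        for line in lines:
            if lock_file is None:
                # First line received: the previous instance has had its
                # stdin closed by now, so it will release the lock on exit.
                lock_file = open(lock_path, "w")
                fcntl.flock(lock_file, fcntl.LOCK_EX)
            out.write(line)
    finally:
        if lock_file is not None:
            # Closing the file releases the flock.
            lock_file.close()
```

A real script would call something like run(sys.stdin, "/path/to/lockfile", sys.stdout), with the lock path chosen to match the shared resource.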
udp2log is designed to run as a daemon for long periods of time with only HUP reloads, not restarts. Restarts should be avoided because they cause packet loss. I haven't quite gotten around to writing the daemonize code yet, so for now it runs in a root screen. It runs as the user "logger".
Log files
Log files are stored in /a/log.
This section should be updated with details of currently collected log files and aggregation data.