Multicast HTCP purging

From Wikitech
Revision as of 03:26, 26 January 2013 by LeslieCarr (Talk | contribs)

This page describes a proposal that was eventually implemented. It is still more or less current, but the essay style makes for a very verbose documentation page.

Warning: This page is old. It is kept as an archive. Do not expect anything on it to be relevant to the current day.

Multicast HTCP purging is a method of purging Squid caches using multicast HTCP packets.

Why

Previous methods of Squid purging implemented in MediaWiki, SquidUpdate::purge and SquidUpdate::fastPurge, used HTTP PURGE requests over unicast TCP connections from all Apaches to all Squids. This had a few drawbacks:

  • All Apaches needed to be able to connect to all Squids
  • Handling Squid's replies and setting up TCP connections added overhead

The biggest drawback was that it was plain slow.

Software modifications

Someone came up with the idea to write a single Squid purge daemon, to which all Apaches could connect and send a single HTTP PURGE request. This daemon would then multiplex this message to all Squids.

I started thinking of implementing this, but then came up with a different idea: making use of multicast. By sending the purge requests to a specific multicast group to which all Squids can subscribe, the network will take care of the multiplexing, which means that there is no need for a separate daemon, and it will be done very efficiently in hardware.

One problem, of course, is that TCP cannot be sent over multicast, as it requires two-way communication. UDP, however, does not (by itself). Furthermore, it turned out that one of the inter-cache protocols, HTCP, was designed with support for URI purging (HTCP CLR). So all we needed was HTCP client support in MediaWiki, with support for sending to multicast groups.

Squid

Of course, life was not that great. Squid's HTCP support turned out to be very incomplete at best, and even in direct violation of the RFC. It had no support for HTCP CLR at all. A patch to implement it was quickly found via Google [1], but it turned out to have issues as well.

I modified the patch to

  • work without requiring HTCP CLR responses
  • work at all when not requesting HTCP CLR responses
  • use a different store searching algorithm instead of htcpCheckHit(), which was intended for finding cache entries for URI hits instead of URI purges
  • allow the simultaneous removal of both HEAD and GET entries with a single HTCP request, by specifying NONE as the HTTP method

MediaWiki

MediaWiki was extended with a SquidUpdate::HTCPPurge method, which takes an HTCP multicast group address, an HTCP port number, and a multicast TTL (see DefaultSettings.php) and sends all URLs to be purged to that group. It can't make use of persistent sockets, but the overhead of setting up a UDP socket is minimal. It also doesn't have to worry about handling responses.
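
The sending side can be sketched in a few lines of Python (MediaWiki itself does this in PHP). This is only an illustration under stated assumptions, not the MediaWiki code: the function names open_purge_socket and send_purge are hypothetical, the payload is assumed to be an already encoded HTCP CLR packet, and the group address, port 4827 and TTL 2 are the values mentioned elsewhere on this page.

```python
import socket


def open_purge_socket(ttl: int = 2) -> socket.socket:
    """Create a UDP socket whose multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # The default multicast TTL is 1, which keeps packets on the local subnet.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_purge(sock, packet: bytes,
               group: str = "239.128.0.112", port: int = 4827) -> int:
    """Send one pre-built HTCP CLR packet to the group; no reply is read.

    `packet` is an assumption of this sketch: encoding the HTCP CLR
    message itself is left out.
    """
    return sock.sendto(packet, (group, port))
```

Because UDP is connectionless, the sender does nothing beyond a single sendto() per URL: there is no handshake to wait for and no response to parse.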

Some profiling runs show that the new method is about 8000 times faster than the older fastPurge method.

Networking changes

Of course, the network had to be set up to support multicast, especially as the Apaches and Squids are not all on the same subnet. The Florida network has now been configured to route multicast, which seems to work reliably.

Getting the multicast packets routed to other clusters around the world turned out to be tough. Native multicast routing is not really an option, as most networks (including the ISPs in Florida) don't support it. Tunneling is an option, but it didn't work reliably in tests, and it might also involve the use of non-free software.

udpmcast

Therefore, I wrote a small application-level multicast tool in Python, udpmcast. It joins a given multicast group on startup, listens on a specified UDP port, and then forwards all received packets to a given set of (unicast or multicast) destinations.

The program can be found in the MediaWiki CVS repository. Its usage is quite straightforward, and its options can be found by running it with the -h argument.
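
The behaviour described above (join a group, listen on a UDP port, resend every packet) can be sketched as follows. This is not the udpmcast source, only a minimal illustration under stated assumptions: the function names join_group, forward and run are hypothetical, and the sketch assumes packets are forwarded to the same port they were received on.

```python
import socket
import struct


def join_group(sock: socket.socket, group: str) -> None:
    """Ask the kernel to deliver packets for `group` on the default interface."""
    # struct ip_mreq: multicast group address followed by local interface address.
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)


def forward(sock, data: bytes, destinations, port: int) -> int:
    """Resend one datagram to every destination; return how many were sent."""
    count = 0
    for dest in destinations:
        sock.sendto(data, (dest, port))
        count += 1
    return count


def run(group: str, port: int, destinations) -> None:
    """Receive packets for `group` on `port` and forward them forever."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    join_group(sock, group)
    while True:
        data, _source = sock.recvfrom(65535)
        forward(sock, data, destinations, port)
```

A call such as run("239.128.0.112", 4827, ["212.85.150.133"]) would relay the purge stream to a single unicast destination.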

Current setup

As of January 2013, dobson is running udpmcast via /etc/rc.local and sending to hooft. The group is 239.128.0.112, port 4827.

The MediaWiki patches have been applied to MediaWiki CVS, both in HEAD and REL1_4, and are live on the site. All Apaches are configured through CommonSettings.php to send HTCP purge requests to the multicast group address 239.128.0.112. It uses multicast Time To Live 2 (instead of the default, 1) because the messages need to cross a single subnet/router.

The Florida squids have all been patched with my HTCP patch, and use this new method of purging. The French squids will follow soon. Until all Squids have been converted, the old HTTP purge method is still active for these caches. The only change in configuration on the Squids, besides having them listen on the HTCP port, is:

mcast_groups 239.128.0.112

to have them join the relevant multicast group, and receive all the purge requests.

udpmcast has been set up to forward all multicast HTCP squid purge requests to the three French squids individually, over unicast. It runs on dobson, because that's an external host that doesn't run Squid, and therefore doesn't have conflicts on binding HTCP port 4827:

./udpmcast.py -dj 239.128.0.112 212.85.150.133 212.85.150.132 212.85.150.131

Forwarding rules

Starting with version 1.5, udpmcast.py supports forwarding rules, where it selects the destination address list based on the source address that sent the packet. These forwarding rules can be specified as a Python dictionary on the command line. This is useful to support HTCP purge streams in both directions, for example between the pmtpa and yaseo clusters.

On dobson, it's configured as:

/usr/local/bin/udpmcast.py -u nobody -dj 239.128.0.112 "{ '211.115.107.158': ['239.128.0.112'] }" \
212.85.150.133 212.85.150.132 212.85.150.131 62.18.16.25 145.97.39.130 211.115.107.158

This means: "If the UDP packet was sent from 211.115.107.158 (amaryllis), send it to multicast group 239.128.0.112. If not, send it to the default address list."
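
The rule lookup amounts to a dictionary lookup with a fallback. The sketch below illustrates that logic only; it is not taken from udpmcast.py. The names parse_rules and select_destinations are hypothetical, and parsing the argument with ast.literal_eval is an assumption of this sketch, not necessarily what the tool does.

```python
import ast


def parse_rules(text: str) -> dict:
    """Parse a rules argument such as "{ '211.115.107.158': ['239.128.0.112'] }"."""
    # literal_eval accepts only Python literals, so the argument cannot run code.
    rules = ast.literal_eval(text)
    if not isinstance(rules, dict):
        raise ValueError("forwarding rules must be a dictionary")
    return rules


def select_destinations(source: str, rules: dict, default: list) -> list:
    """Return the rule's destination list for this source, else the default list."""
    return rules.get(source, default)
```

With the rule from the command line above, a packet from 211.115.107.158 is sent to the multicast group, while a packet from any other source goes to the default address list.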
