MediaWiki caching
Cache headers
(See the HTTP specs for more formal wording)
Headers are sent mainly in OutputPage.php, in the function sendCacheControl(), around line 317. Which headers are sent depends mainly on the action (setSquidMaxage = $wgSquidMaxage in index.php for view and history) and on whether the browser sends a cookie.
Headers explained
Last-modified
Without this header there is no client-side caching, as browsers don't know what to base their If-Modified-Since requests on. If the page hasn't changed, the squid answers those requests with just a 304 (Not Modified) status code, so only the headers are transferred.
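As a rough illustration of that conditional-request check, here is a minimal Python sketch. It is not MediaWiki or Squid code; the function name `conditional_status` is hypothetical, and it only assumes that the server knows the page's last modification time.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the client's cached copy is still current, else 200.

    last_modified: timezone-aware datetime of the page's last change.
    if_modified_since: raw If-Modified-Since header value, or None.
    (Hypothetical helper, for illustration only.)
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparsable date: ignore the condition
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # headers only, no body
    return 200


changed = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(changed, usegmt=True)
print(header, conditional_status(changed, header))
```

A browser that has no Last-modified value to echo back can only send the unconditional form, which always yields a full 200 response.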
Cache-control
s-maxage
Tells intermediate caches such as Squids how long they may consider the content valid without checking back. This needs to be hidden from caches we can't purge, otherwise users won't see changes. This is the reason for a header_access rule on the Squids, which replaces any Cache-control header with one that only allows client caching:
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
max-age
How long clients (browsers) should deem the content to be up to date. We allow clients to keep the page (the 'private' allows this), but with max-age=0 tell them to send a conditional If-Modified-Since request. This of course requires the Last-modified header; we set it to the last modification time or, if we don't have it, to the current time minus one hour.
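The fallback for a missing modification time can be sketched as follows. This is a minimal Python illustration, not the code in OutputPage.php; the function name `last_modified_header` and its parameters are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def last_modified_header(touched, now=None):
    """Build a Last-modified value (hypothetical helper).

    touched: UTC datetime of the last modification, or None if unknown.
    now: current UTC time; injectable so the example is reproducible.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Unknown modification time: fall back to "one hour ago".
    stamp = touched if touched is not None else now - timedelta(hours=1)
    return format_datetime(stamp.replace(microsecond=0), usegmt=True)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(last_modified_header(None, now))
```

With the fixed `now` above, the unknown-time case yields a timestamp of 11:00:00 GMT on the same day.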
private
Allows browsers, but not shared caches, to cache the content.
Putting it together
Cache-Control: s-maxage=($wgSquidMaxage), must-revalidate, max-age=0
Allows caching on the Squids (s-maxage), which will replace it with
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
for all anonymous visitors without a session, who don't send a cookie. Second-tier Squids are allowed to get the original headers through a special rule in squid.conf that matches their IPs. After the first visit to an edit page or a login, the user sends a cookie, and MediaWiki then sends no s-maxage to the Squids, so they don't cache the page:
Cache-Control: private, must-revalidate, max-age=0
This again allows browsers to cache the page while forcing them to check for changes on each page view.
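The decision described in this section can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not the actual sendCacheControl() logic or a Squid configuration: the function names `cache_control` and `edge_rewrite`, the `peer_is_squid` flag (standing in for the IP-matching rule in squid.conf), and the example value 300 are all hypothetical.

```python
# Header the first-tier Squids hand to everything they cannot purge.
CLIENT_HEADER = "private, s-maxage=0, max-age=0, must-revalidate"


def cache_control(has_cookie, squid_maxage):
    """Cache-Control value sent by the application (simplified model)."""
    if has_cookie:
        # Session or logged-in user: no s-maxage, so Squids don't cache.
        return "private, must-revalidate, max-age=0"
    # Anonymous visitor without a session: Squids may cache the page.
    return f"s-maxage={squid_maxage}, must-revalidate, max-age=0"


def edge_rewrite(header, peer_is_squid):
    """Model of the header replacement done on the Squids.

    peer_is_squid is a stand-in for the squid.conf rule that matches
    second-tier Squid IPs; those peers keep the original header.
    """
    return header if peer_is_squid else CLIENT_HEADER


origin = cache_control(has_cookie=False, squid_maxage=300)
print(edge_rewrite(origin, peer_is_squid=True))
print(edge_rewrite(origin, peer_is_squid=False))
```

In this model only purgeable caches ever see a non-zero s-maxage; browsers and foreign proxies always receive the client-only header.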
Vary
Tells downstream proxy caches to cache the content separately depending on the values of certain request headers: if those values differ, another page is served for the same URL. We use
Vary: Accept-Encoding, Cookie
to make sure that logged-in users (who send a cookie) get pages with their user name and prefs (the Cookie part), and that clients which don't support gzip content-encoding don't get compressed pages (the Accept-Encoding part). I think there's some support for transparent decompression in Squid3, so it might not be necessary to store different copies. See also: Vary in RFC 2616
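
One way to picture the effect of Vary is as an extension of the cache key. The following Python sketch is a simplified model, not how Squid is implemented; the function name `cache_key` and the example URL are hypothetical.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the headers named in Vary.

    request_headers: dict of request header names to values.
    vary: the Vary header value from the response, e.g.
          "Accept-Encoding, Cookie".
    (Hypothetical helper, for illustration only.)
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # Headers not listed in Vary never influence the key.
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


vary = "Accept-Encoding, Cookie"
anon = cache_key("/wiki/Main_Page", {"Accept-Encoding": "gzip"}, vary)
user = cache_key(
    "/wiki/Main_Page",
    {"Accept-Encoding": "gzip", "Cookie": "session=abc"},
    vary,
)
print(anon != user)
```

Two requests for the same URL share a cached copy only if every header listed in Vary matches, so a request carrying a cookie never receives the copy stored for anonymous visitors.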