MediaWiki caching

From Wikitech
Latest revision as of 22:53, 27 September 2011

FIXME: Add info on our custom X-Vary stuff

Cache headers

(See the HTTP specs for more formal wording: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html)

Headers are sent mainly in OutputPage.php, in the function sendCacheControl() (around line 317). Which headers are sent depends mainly on the action (setSquidMaxage = $wgSquidMaxage in index.php for view and history) and on whether the browser sends a cookie.

Headers explained

Last-modified

This is required for client-side caching: without it, browsers don't know what to base their If-Modified-Since requests on. If the page hasn't changed, the Squid responds to such requests with only a 304 (Not Modified) status code, so just the status line and headers are transferred.
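
The conditional-request flow can be sketched as follows (a simplification for illustration; the function and parameter names are invented here, not MediaWiki's or Squid's actual code):

```python
from email.utils import formatdate, parsedate_to_datetime

def respond(page_last_modified, if_modified_since):
    """Decide between a full 200 response and a short 304.

    page_last_modified: unix timestamp of the page's last change.
    if_modified_since: the client's If-Modified-Since header value,
    or None on an unconditional request.
    """
    if if_modified_since is not None:
        client_time = parsedate_to_datetime(if_modified_since).timestamp()
        if page_last_modified <= client_time:
            # Nothing changed: only the status line and headers go out.
            return 304, {}
    # Full response, with Last-Modified so the client can ask again later.
    headers = {"Last-Modified": formatdate(page_last_modified, usegmt=True)}
    return 200, headers
```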

Cache-control

s-maxage

Tells intermediate caches such as Squids how long they may consider the content valid without ever checking back. This needs to be hidden from caches we can't purge, otherwise users won't see changes. That is the reason for a header_access rule on the Squids which replaces any Cache-Control header with one that only allows client caching:

Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
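
Such a rule looks roughly like the following (Squid 2.x syntax; the ACL name and address range are made-up examples, not copied from the production squid.conf):

```
# squid.conf (Squid 2.x): strip s-maxage before it reaches the outside
acl purgeable_peers src 10.0.0.0/8              # caches we can purge (example range)
header_access Cache-Control allow purgeable_peers
header_access Cache-Control deny all
header_replace Cache-Control private, s-maxage=0, max-age=0, must-revalidate
```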

max-age

How long clients (browsers) should deem the content to be up to date. We allow clients to keep the page (the 'private' allows this), but tell them to send a conditional If-Modified-Since request. For this, of course, the Last-Modified header is needed; we set it to the last modification time or, if we don't have it, to the current time minus one hour. Images and stylesheets (including the generated ones that represent the user's preference selections) have max-age > 0 to avoid reloading them on each request. This is why users have to refresh their cache after changing their preferences. (Is there a way to force a client to re-request something using JavaScript?)
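
The Last-Modified fallback described above amounts to something like this (a sketch; the function and variable names are hypothetical, not MediaWiki's):

```python
import time
from email.utils import formatdate

def last_modified_header(touched=None, now=None):
    """Return a Last-Modified value: the page's last modification
    time if known, otherwise the current time minus one hour."""
    now = time.time() if now is None else now
    timestamp = touched if touched is not None else now - 3600
    return formatdate(timestamp, usegmt=True)
```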

private

Allows browsers to cache the content, while forbidding shared caches from storing it.

Putting it together

Cache-Control: s-maxage=($wgSquidMaxage), must-revalidate, max-age=0

This allows caching on the Squids (s-maxage), which replace it with

Cache-Control: private, s-maxage=0, max-age=0, must-revalidate

for all anonymous visitors without a session, i.e. those that don't send a cookie. Second-tier Squids are allowed to get the original headers via a special rule in squid.conf that matches their IPs. After the first visit to an edit page, or after logging in, the user sends a cookie, and MediaWiki also sends no s-maxage to the Squids so they don't cache the page:

Cache-Control: private, must-revalidate, max-age=0

This again allows browsers to cache the page while forcing them to check for changes on each page view.
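
The selection logic above can be sketched like this (a simplification of what sendCacheControl() does; the function and parameter names are invented for illustration):

```python
def cache_control(has_cookie, squid_maxage):
    """Pick the Cache-Control value the way described above.

    has_cookie: True once the user sends a cookie (after visiting
    an edit page or logging in).
    squid_maxage: $wgSquidMaxage, how long Squids may keep the page.
    """
    if has_cookie:
        # Never let Squids cache personalised pages; browsers may
        # keep a copy but must revalidate on every page view.
        return "private, must-revalidate, max-age=0"
    # Anonymous request: Squids may cache it for squid_maxage seconds;
    # they replace this header before it reaches outside clients.
    return f"s-maxage={squid_maxage}, must-revalidate, max-age=0"
```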

Vary

Tells downstream proxy caches to cache the content depending on the values of the listed request headers: if those values differ, a different copy is stored and served for the same URL. We use

Vary: Accept-Encoding, Cookie

to make sure logged-in users (who send a cookie) get pages with their user name and prefs (the Cookie part) and clients that don't support gzip transfer-encoding don't get compressed pages. I think there's some support for transparent decompression in Squid 3, so it might not require storing different copies. See also: Vary in RFC 2616 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44)
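
The effect of Vary on a cache can be pictured as the cache key including the request's value for every header named in Vary (a hypothetical simplification; real Squid key handling is more involved):

```python
def cache_key(url, request_headers, vary="Accept-Encoding, Cookie"):
    """Build a cache key from the URL plus the request's values for
    each header listed in Vary, so that a gzip-capable anonymous
    client and a logged-in user never share a cached copy."""
    parts = [url]
    for name in (h.strip() for h in vary.split(",")):
        parts.append(f"{name}={request_headers.get(name, '')}")
    return "|".join(parts)
```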

[[Category:Caching]]