Bits.wikimedia.org/Varnish testing


Revision as of 14:37, 11 January 2010

A few notes:

  • We hit a bug where all threads write to the acceptor pipe, but the acceptor thread doesn't seem to pick that up
  • We originally thought it was a 2.6.24 kernel problem; a 2.6.32.3 kernel was deployed, but we still got the same problem
  • This happens with both the poll and epoll acceptors (we managed to hit it much earlier with the poll acceptor, which may be a coincidence)
  • Currently we are running:
    • varnish trunk (standard configure options)
    • The following additional sysctl.conf changes loaded:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_fin_timeout = 3
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_no_metrics_save = 1
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
    • ulimit -s 128
    • ulimit -n 500000
    • varnishd -n /dev/shm -smalloc,1G -f /usr/local/etc/varnish/bits.vcl -T 127.0.0.1:6000 -w 2000 -a 0.0.0.0:80 -p thread_pool_add_delay=1 -p send_timeout=30 -p listen_depth=4096
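Before a test run it can help to confirm that the kernel settings listed above are actually in effect. The following is a minimal sketch, not part of the original setup: the function names (`normalize`, `expected_settings`, `check_settings`) and the `SYSCTL_READER` override are hypothetical and exist only for illustration. It assumes a Linux host where `sysctl -n KEY` prints the running value.

```shell
#!/bin/sh
# Sketch: report sysctl keys whose running value differs from the list above.
# All names here are illustrative assumptions, not part of the original setup.

# Command used to read one running value; override it for dry runs.
SYSCTL_READER=${SYSCTL_READER:-"sysctl -n"}

# Trim a string and collapse runs of spaces/tabs into single spaces
# (sysctl prints multi-value keys such as tcp_rmem tab-separated).
normalize() {
    set -f            # no glob expansion while word-splitting
    set -- $1
    set +f
    printf '%s\n' "$*"
}

# The settings from the notes above, in sysctl.conf syntax.
expected_settings() {
    cat <<'EOF'
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_fin_timeout = 3
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_no_metrics_save = 1
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
EOF
}

# Read "key = value" lines on stdin; print one line per mismatch.
# Returns 0 if every key matches, 1 otherwise.
check_settings() {
    mismatch=0
    while IFS='=' read -r key value; do
        key=$(normalize "$key")
        case "$key" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        want=$(normalize "$value")
        have=$(normalize "$($SYSCTL_READER "$key" 2>/dev/null </dev/null)")
        if [ "$want" != "$have" ]; then
            printf 'MISMATCH %s: want "%s", have "%s"\n' "$key" "$want" "$have"
            mismatch=1
        fi
    done
    return $mismatch
}
```

Running `expected_settings | check_settings` on the test host should print nothing if all settings are loaded. A key that is missing on the running kernel shows up as a mismatch with an empty current value.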

The last attempt was successful at half load and failed immediately at full load. It had:

  • trunk
  • send_timeout
  • listen_depth
  • sysctl changes

It was handling:

  • 4500 requests/s
  • 350 Mbps of traffic
  • ~65% of CPU, 326 MB of RES

When CPU usage reached 100%, Varnish melted down.
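The half-load figures give a rough idea of why full load failed. The sketch below is back-of-the-envelope arithmetic only. It assumes that "mbps" means 10^6 bits per second and that CPU usage scales linearly with request rate; neither assumption comes from the measurements themselves, and the function names are made up for illustration.

```shell
#!/bin/sh
# Rough capacity arithmetic from the half-load figures above.
# Assumptions (not from the source): 1 Mbps = 10^6 bits/s, and CPU usage
# grows linearly with request rate. Integer arithmetic, so results truncate.

REQS_PER_SEC=4500
TRAFFIC_MBPS=350
CPU_PERCENT=65

# $1 = traffic in Mbps, $2 = requests per second
avg_bytes_per_request() {
    echo $(( $1 * 1000000 / 8 / $2 ))
}

# $1 = CPU % at the measured load, $2 = load multiplier
projected_cpu_percent() {
    echo $(( $1 * $2 ))
}

# $1 = requests per second, $2 = CPU % at that rate
max_requests_at_full_cpu() {
    echo $(( $1 * 100 / $2 ))
}

echo "average bytes per request: $(avg_bytes_per_request "$TRAFFIC_MBPS" "$REQS_PER_SEC")"
echo "projected CPU at double load: $(projected_cpu_percent "$CPU_PERCENT" 2)%"
echo "requests/s at 100% CPU: $(max_requests_at_full_cpu "$REQS_PER_SEC" "$CPU_PERCENT")"
```

Under those assumptions this works out to about 9722 bytes per request, a projected 130% CPU at double the load, and roughly 6923 requests/s before CPU saturates. That is consistent with full load failing, though it says nothing about whether the acceptor pipe bug is also involved.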
