Eqiad Migration Planning/Steps

From Wikitech
Revision as of 14:21, 16 January 2013


Day 1: Tue Jan 22

Preparation (before maintenance window)

Check the LVS pools apaches, api and rendering for down or depooled machines. A few machines may be broken (and should be removed from the config for the time being), but all others should be up and passing their health checks.

# ipvsadm -l
# less /var/log/pybal.log
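To make missing machines easier to spot, the ipvsadm listing can be reduced to a count of real servers per virtual service and compared with the expected pool sizes. The following is only a sketch: count_realservers is a hypothetical helper, and it assumes the usual ipvsadm listing layout, where virtual service lines start with the protocol and real server lines start with "->".

```shell
# Sketch: count real servers per virtual service in `ipvsadm -l -n` output.
# count_realservers is a hypothetical helper, not part of ipvsadm or PyBal.
count_realservers() {
    awk '
        # Virtual service lines start with the protocol (TCP, UDP or FWM).
        $1 == "TCP" || $1 == "UDP" || $1 == "FWM" {
            svc = $1 " " $2
            order[++n] = svc
            count[svc] = 0
            next
        }
        # Real server lines start with "->"; skip the column header line.
        $1 == "->" && $2 != "RemoteAddress:Port" && svc != "" {
            count[svc]++
        }
        END {
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }
    '
}

# Usage on an LVS balancer (as root):
#   ipvsadm -l -n | count_realservers
```

A service that reports fewer real servers than expected is a hint to look for the missing hosts in /var/log/pybal.log.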

Check whether the Nagios check for these LVS pools exists and is up.

Check whether all pooled application servers have the right LVS service IPs bound to loopback.
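One way to run this check on a single application server is to look for the service IP in the loopback address list. This is a sketch under assumptions: has_loopback_ip is a hypothetical helper, 10.2.1.1 is a placeholder and not a real service IP, and the input is assumed to be the one-line-per-address output of `ip -o -4 addr show dev lo`.

```shell
# Sketch: check that a given LVS service IP is bound to the loopback interface.
# has_loopback_ip is a hypothetical helper; it reads the output of
# `ip -o -4 addr show dev lo` on stdin and takes the IP as its only argument.
has_loopback_ip() {
    awk -v ip="$1" '
        $3 == "inet" {
            split($4, a, "/")        # strip the prefix length, e.g. "/32"
            if (a[1] == ip) found = 1
        }
        END { exit found ? 0 : 1 }   # exit status 0 only if the IP was seen
    '
}

# Usage on an application server (10.2.1.1 is a placeholder service IP):
#   ip -o -4 addr show dev lo | has_loopback_ip 10.2.1.1 || echo "service IP missing"
```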

Check deployed MediaWiki revision / git status on all application servers

MySQL warm up?

Maintenance window

Migrate bits apaches to eqiad

Check whether the 4 bits apaches are healthy according to a bits Varnish server:

# varnishlog -i Backend_health -O

Test a few top bits URLs manually from the new bits app servers to see if valid content is being returned. To retrieve the most requested URLs, on a bits Varnish server:

# varnishtop -i RxURL

To test such a URL, use curl, or:

fenari: $ /home/mark/firstbyte.py apache_host_name 80 bits.wikimedia.org URI
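For the curl variant, a minimal sketch could look like the following. fetch_status and classify_status are hypothetical helpers; the request goes straight to the apache on port 80 with the bits.wikimedia.org Host header, mirroring the arguments of the firstbyte.py call above. It only reports the status code, so the returned content still needs to be inspected by hand.

```shell
# Sketch: fetch a bits URL directly from one apache and classify the result.
# fetch_status and classify_status are hypothetical helpers.

# Print the HTTP status code a backend returns for a bits URI.
# $1 = apache host name, $2 = URI (starting with "/").
fetch_status() {
    curl -s -o /dev/null -w '%{http_code}' \
        -H 'Host: bits.wikimedia.org' "http://$1$2"
}

# Map a status code to a coarse result class.
classify_status() {
    case "$1" in
        2??) echo ok ;;
        3??) echo redirect ;;
        4??) echo client-error ;;
        5??) echo server-error ;;
        *)   echo unknown ;;    # e.g. 000 when curl could not connect
    esac
}

# Usage (apache_host_name and URI as in the firstbyte.py example):
#   classify_status "$(fetch_status apache_host_name URI)"
```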

Run varnishtop for a histogram of HTTP status codes, and compare before/after migration:

# varnishtop -i TxStatus

Deploy Gerrit patch set 44251 (https://gerrit.wikimedia.org/r/#/c/44251/) and run Puppet for node group XXX. This changes the apache backends for the eqiad Varnish servers only, so we can quickly fall back on the pmtpa bits Varnish servers if needed.

Check whether the distribution of HTTP status codes changes drastically, especially HTTP 2xx vs. 4xx/5xx.
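To compare the distribution before and after the change, the status codes can be grouped into classes and expressed as percentages. This is a sketch, not an established procedure: status_class_share is a hypothetical helper, and it assumes one-shot varnishtop output (the -1 option) with lines of the form "count TxStatus code". Verify the actual output format on a bits Varnish server before relying on it.

```shell
# Sketch: summarise TxStatus counts by status class (1xx..5xx) as percentages.
# Assumes input lines of the form:  <count> TxStatus <code>
# status_class_share is a hypothetical helper.
status_class_share() {
    awk '
        $2 == "TxStatus" && $3 ~ /^[1-5][0-9][0-9]$/ {
            cls = substr($3, 1, 1) "xx"
            sum[cls] += $1
            total += $1
        }
        END {
            if (total == 0) exit 1      # no usable input
            n = split("1xx 2xx 3xx 4xx 5xx", classes, " ")
            for (i = 1; i <= n; i++) {
                c = classes[i]
                if (c in sum) printf "%s %.1f%%\n", c, 100 * sum[c] / total
            }
        }
    '
}

# Usage on a bits Varnish server:
#   varnishtop -1 -i TxStatus | status_class_share > before.txt
#   (deploy the change and run Puppet)
#   varnishtop -1 -i TxStatus | status_class_share > after.txt
#   diff before.txt after.txt
```

A noticeable shift from 2xx towards 4xx or 5xx in the diff is the signal to fall back.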

If bits@eqiad is confirmed to work correctly after a while, deploy Gerrit patch set 44252 (https://gerrit.wikimedia.org/r/#/c/44252/) and run Puppet for node group XXX. This will switch the pmtpa bits Varnish servers to use the eqiad bits appservers as well.
