How to do a schema change

From Wikitech

Revision as of 15:32, 12 April 2012

Whatever you do, do not run update.php on the cluster. Unless you want Roan to come slap you in the face:

RoanKattouw_away: And if anyone runs update.php I'll be on the first flight to SFO to slap them in the face

Major Migrations

Major migrations (such as those for new releases of MediaWiki, which often touch multiple large tables) are typically run via a custom script that automatically runs through every slave server defined in db.php (or db-secondary.php, to cover our backup datacenter). We've been creating a version of this script for each MediaWiki release since 1.17. It strives to be idempotent and fault-tolerant. See the version used for 1.19 here: https://www.mediawiki.org/wiki/Special:Code/MediaWiki/111122

The execution process for 1.19 was as follows:

  • mwscript maintenance/upgrade-1.19wmf1-1.php --wiki test2wiki --secondary
    • When --secondary is given, the script runs against the hosts defined in db-secondary.php. These are slaves not in use by the current live site, such as those in the alternate datacenter and those used for analytics.
  • mwscript maintenance/upgrade-1.19wmf1-1.php --wiki test2wiki
    • Runs on all slaves for all wikis (the --wiki option is required for mwscript but not used here)
  • All slaves are now updated, and the master for each shard needs to be rotated. See the Master switch page; the first procedure described there worked very smoothly for 1.19.
    • Run the upgrade script again after one or more masters have been switched; repeat until all masters are done.

sql.php

The sql.php script can be used to run .sql files with mwscript sql.php enwiki patchfile.sql. This is what you'll typically want to use for MediaWiki or extension schema changes, as these .sql files contain magic comments like /*$wgDBTableOptions*/ that are expanded by sql.php.
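For illustration, a patch file might look like the sketch below. The table and column names are hypothetical, but the magic comments are the ones sql.php expands: /*_*/ for the configured table prefix, /*$wgDBTableOptions*/ for the per-wiki table options, and /*i*/ for index name prefixes.

```sql
-- Hypothetical patch file (patch-example-log.sql); the table and
-- columns are invented for illustration. sql.php expands /*_*/ to the
-- table prefix and /*$wgDBTableOptions*/ to the per-wiki table options.
CREATE TABLE /*_*/example_log (
  el_id int unsigned NOT NULL PRIMARY KEY AUTO_INCREMENT,
  el_page int unsigned NOT NULL,
  el_timestamp binary(14) NOT NULL
) /*$wgDBTableOptions*/;

CREATE INDEX /*i*/el_page ON /*_*/example_log (el_page);
```

You would then apply it with something like mwscript sql.php testwiki patch-example-log.sql.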

sql command

The sql command can be used to quickly get a MySQL command line. Because different wikis' databases live on different clusters, you need to pass the database name as the first argument, like so: sql enwiki. Note that this will connect to the master database, which means you will have full write access to everything.

This means you need to be careful:

  • Check your write query and be sure you really mean it before pressing Enter
  • At the start of your session, open a new transaction with BEGIN;. After you've done your write queries, use COMMIT; to commit your changes or ROLLBACK; to undo them
  • Don't run write queries that will insert, update or delete more than 500 rows in one go. Instead, run them in batches
    • If you don't know how many rows your query will touch, find out first. This'll also help you verify you got your WHERE clause right
  • Don't run slow or complex read queries on the master. If you can, run them on the toolserver. If this is not possible, run them on a slave. You can find a list of slaves in wmf-config/db.php.
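Putting those rules together, a careful write session on the master might look like the following sketch. The table, property and values here are hypothetical; only the BEGIN/COMMIT/ROLLBACK discipline and the count-first check are the point.

```sql
BEGIN;

-- First find out how many rows the WHERE clause matches; this also
-- lets you verify the condition before any data changes.
SELECT COUNT(*) FROM user_properties
WHERE up_property = 'skin' AND up_value = 'oldskin';

-- If the count is what you expect (and under ~500 rows), do the write:
UPDATE user_properties SET up_value = 'vector'
WHERE up_property = 'skin' AND up_value = 'oldskin';

COMMIT;
-- ...or ROLLBACK; if anything looked wrong.
```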

runBatchedQuery.php

This is a fairly simple script that runs a write query over and over again until it touches zero rows. After each iteration, it waits for replication lag to go down if it's above a certain threshold (5 seconds by default) and prints a progress marker. It's designed to be used with UPDATE or DELETE queries with LIMIT 500 at the end (you need to add this yourself), which allows queries touching thousands of rows to be run without replication lag going through the roof as it would when running the query the normal way.

Usage: mwscript runBatchedQuery.php enwiki 'UPDATE ... LIMIT 500;'
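As a hypothetical example (table and condition invented for illustration), a query suitable for runBatchedQuery.php carries an explicit LIMIT 500, since the script simply reruns it until zero rows are affected:

```sql
-- Passed as the quoted query argument to runBatchedQuery.php. The
-- script repeats it, waiting out replication lag between iterations,
-- until an iteration touches no rows.
UPDATE example_log SET el_processed = 1
WHERE el_processed = 0 LIMIT 500;
```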

Running things on all wikis

To run maintenance scripts like sql.php and runBatchedQuery.php on all wikis, you can use: foreachwiki sql.php patchfile.sql
