MySQL

From Wikitech
Revision as of 19:59, 20 October 2009 by Tim (Talk | contribs)

Jump to: navigation, search

This page desperately needs to be filled out or brought up to date. If you're familiar with the operations of this part of the site, please help!

FIXME:

  • Packaging, patches, configuration, installation
  • How to change masters
  • How to debug replication lag
  • How to fix broken replication

Contents

Source

The patched MySQL 4.0 source is available in a special repository on svn.wikimedia.org:

http://svn.wikimedia.org/svnroot/mysql

Sun has officially stopped maintaining MySQL 4.0, but community development continues to some extent, mainly in the form of backported patches from MySQL 5.0. Domas was responsible for patch selection in Wikimedia's fork of MySQL, and he called the result "version 4.0.40", thinking the number suitably poetic.

Packaging

This source is packaged in the wikimedia-mysql4 package, available from apt.wikimedia.org and with debian directory checked in to /mediawiki/trunk/debs as per usual. The package version number contains the revision number of the mysql repository described above.

The package contains a Wikimedia-specific default configuration file. Note that this configuration file must be edited manually after installation to give it a unique server ID.

Installation

  1. Partition the hard drive with a small root partition, and the bulk space mounted at /a. XFS and JFS are considered to be appropriate filesystems for /a.
  2. Install the wikimedia-mysql4 package. This creates a mysql user.
  3. Create directory /a/tmp, owned by mysql:mysql.
  4. Create directory /a/sqldata.
    • /a/sqldata is generally copied from an existing MySQL server, so that the new server can be added to that replicated cluster.
    • For those rare cases when you need to start a data directory from scratch, the shell script /usr/mysql-wikimedia/bin/mysql_install_db should be used. This creates necessary administrative files in /a/sqldata.
  5. Edit /etc/my.cnf and change:
    • The server-id variable, to a unique ID derived from the IP address
    • The innodb_buffer_pool_size, to around 80% of physical memory.
    • Note that many settings need to be changed if MyISAM is to be used, e.g. for external storage.

Copying the data directory

  • Depool a slave from the same cluster, which will be used as the data source.
  • Shut down MySQL on it.
  • Find the following settings in /etc/my.cnf on the source server and copy them to the destination server:
    • innodb_data_file_path. The names and sizes of the data files must match.
    • innodb_log_file_size. Note that if this is small (<100MB) it can cause performance problems. You can change the log file size after a clean shutdown by deleting the ib_logfile* files.[1]
  • Copy the data. Various methods have been tried for this. Rsync is good but has some nasty performance cases that can cause problems. An uncompressed tarpipe over netcat is probably best. First start the receiver:
nc -l 9999 | tar -xv -C/a

Then start the sender:

tar -cv -C/a sqldata | nc <dest_server> 9999

This should work with the nc available on Ubuntu, other flavours may require other command-line options to avoid hanging on to the connection after EOF.

Editing db.php

db.php is MediaWiki's configuration file for database connections. It sets a variable called $wgLBFactoryConf.

Common tasks

Depooling 
If the server is down, comment it out in sectionLoads and (if necessary) groupLoadsByDB. To remove load from an overloaded but up server, set its load ratio in sectionLoads to zero. A server which is in sectionLoads but with zero load will delay maintenance scripts, they will wait for it to have low replication lag.
Load tuning 
The load numbers in sectionLoads are ratios. They can be converted to a load fraction by dividing by the sum of loads in that section. So for example if you have (db1 => 100, db2 => 200), then db1 will have 33% and db2 will have 67%. Tweak these values to balance the I/O load as seen in Ganglia.
Master switch 
The first server in a given section in sectionLoads is the master, so you can change the master by changing the order of servers or commenting out the first server. See switch master for the full procedure. The master should generally have zero read load, because the load from applications requiring a master connection will be substantial.

Comprehensive documentation

The primary source of documentation is includes/db/LBFactory_Multi.php, copied below:

sectionsByDB 
A map of database names to section names
sectionLoads 
A 2-d map. For each section, gives a map of server names to load ratios. For example: array( 'section1' => array( 'db1' => 100, 'db2' => 100 ) )
serverTemplate 
A server info associative array as documented for $wgDBservers. The host, hostName and load entries will be overridden.
groupLoadsBySection 
A 3-d map giving server load ratios for each section and group. For example: array( 'section1' => array( 'group1' => array( 'db1' => 100, 'db2' => 100 ) ) )
groupLoadsByDB 
A 3-d map giving server load ratios by DB name.
hostsByName 
A map of hostname to IP address.
externalLoads 
A map of external storage cluster name to server load map
externalTemplateOverrides 
A set of server info keys overriding serverTemplate for external storage
templateOverridesByServer 
A 2-d map overriding serverTemplate and externalTemplateOverrides on a server-by-server basis. Applies to both core and external storage.
templateOverridesByCluster 
A 2-d map overriding the server info by external storage cluster
masterTemplateOverrides 
An override array for all master servers.
readOnlyBySection 
A map of section name to read-only message. Missing or false for read/write.


Template:PD

Personal tools
Namespaces

Variants
Actions
Navigation
Ops documentation
Wiki
Toolbox