External storage/Update 2011-08

From Wikitech
Revision as of 14:07, 31 August 2011 by Tim (Talk | contribs)


A description of the project to bring the external storage service up to date.

This project is active.
External Storage Update
RT: 1300
Due: 2011-09-30


Background and context

The "External_storage" hosts store compressed old revisions of wiki pages. When a user requests a sufficiently old revision of a page (for example, as part of a diff), MediaWiki fetches the compressed content from the external store database and uses it to render the request.

External storage is made up of several (12 maybe?) clusters, each a set of 3 databases running MySQL. Most of these clusters hold three copies of the same content, are read-only, and are not set up with replication. One cluster receives all new writes. The sharding across clusters is done somewhere within MediaWiki and is currently a mystery, but not something we actually need to worry about. The clusters are defined in dbs.php. Many of the clusters are running MySQL on Apache nodes (srv###).
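For orientation, MediaWiki points at externally stored text with addresses of the form DB://&lt;cluster&gt;/&lt;id&gt;; the cluster name keys into the configuration to find the right set of hosts. A minimal sketch of parsing such an address (the helper name is ours, not MediaWiki's, and concatenated-blob addresses with an extra path component are not handled):

```python
def parse_external_address(url: str) -> tuple:
    """Split an external-store address like 'DB://cluster1/12345' into
    (cluster name, blob id).

    Hypothetical helper for illustration only; real addresses for
    concatenated blobs may carry an extra component, which this
    sketch does not handle.
    """
    scheme, rest = url.split("://", 1)
    if scheme != "DB":
        raise ValueError("not a DB external-store address: %r" % url)
    cluster, blob_id = rest.split("/", 1)
    return cluster, int(blob_id)
```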

There appear to be two types of compression in use on the clusters: one has a gzip header, the other does not. <fill in examples of each.>

I believe there may be (or may have been) a few other (old) types in use, but I don't know the details of that. Tim and Ariel would know. -- Mark Bergsma 10:07, 12 August 2011 (UTC)
Text storage row types are listed at Text storage data -- Tim 14:07, 31 August 2011 (UTC)
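To a first approximation the two formats can be told apart by their leading bytes: a gzip stream starts with the magic bytes 0x1f 0x8b, while zlib/raw deflate streams do not. A rough sketch (the function name is ours; see Text storage data for the authoritative list of row types):

```python
import gzip
import zlib

def blob_format(blob: bytes) -> str:
    """Rough classification of a compressed blob by its header bytes.

    Gzip-wrapped streams begin with the magic bytes 0x1f 0x8b; zlib
    and raw deflate streams do not. This is a heuristic, not a parser.
    """
    if blob[:2] == b"\x1f\x8b":
        return "gzip"
    return "headerless"
```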

Goals

  • get external storage replicated to eqiad
  • get external storage off of apache nodes and onto dedicated database hardware
  • recompress the existing clusters into a new one, for far better compression
  • document external storage's setup, maintenance requirements, and procedures

Additional research

  • Find and examine the 'recompression' job, figure out what it does, how it works, and how it will affect this project
  • Find out how much space is used on each of the clusters
  • Determine if we can consolidate the clusters onto fewer larger boxes, and if so, how many
  • Understand how the sharding works to understand how to adjust the shards if we can consolidate onto fewer boxes
    • the answer may be to simply maintain sharded tables on a consolidated host, but we'll see.
  • find out what hardware is currently available for this project
    • ms1-3 in pmtpa (currently in use)
    • es1001-1004 in eqiad
  • what machines currently back external store?
  • other things that will cause problems
  • what are all the compression methods?
    • see Text_storage_data. Re-run the queries that created this page to bring it up to date.
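For the space-usage question, one possible approach (assuming MySQL 5.0+ so information_schema is available; hostname and credentials below are placeholders, run against each cluster in turn):

```shell
# Approximate on-disk size per schema, in GiB (data + indexes).
# Requires a live connection to the cluster; not runnable standalone.
mysql -h ms3 -u ops -p -e "
  SELECT table_schema,
         ROUND(SUM(data_length + index_length)/1024/1024/1024, 1) AS size_gib
  FROM information_schema.TABLES
  GROUP BY table_schema
  ORDER BY size_gib DESC;"
```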

Overall intended approach

Data Migration

  • ensure the ms3 cluster is in the correct compression format; if not:
    • take one of the three hosts out of rotation (ms1)
      • record the replication details - we'll need to re-attach later to pick up new data
    • recompress from the other slave (ms2) onto ms1
    • slave a host in eqiad (es1001) off ms1 so there's a backup
    • slave ms1 off of ms3, using the previously recorded binlog position, to pick up new data
    • rotate the master role so ms1 receives new writes
    • re-slave ms2 off of ms1
    • wipe ms3
  • prep the second cluster
    • create 3 empty dbs: ms3, es1002, es1003
    • make ms3 the master, set up es1002 and es1003 to slave off ms3
  • get all apache-based dbs onto dedicated database hardware (note: this is all read-only data)
    • for each existing cluster:
      • recompress the data into a new cluster table on ms3
      • back up, then delete, the mysql data on the apache host
  • rearrange slaving for the second cluster
    • promote es1002 to master, es1003 and ms3 both slaving off es1002
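The "record replication details" and "re-slave" steps above amount to standard MySQL binlog-position replication. A sketch of the commands involved (hostnames from the plan; user, log file, and position values are placeholders to be filled in from the recorded details):

```shell
# On the current master (ms3): note the binlog coordinates before
# taking ms1 out of rotation.
mysql -h ms3 -e "SHOW MASTER STATUS\G"

# Later, on ms1: re-attach to ms3 at the recorded position to catch
# up on writes that happened during the recompression.
mysql -h ms1 -e "
  CHANGE MASTER TO
    MASTER_HOST='ms3',
    MASTER_USER='repl',
    MASTER_PASSWORD='********',
    MASTER_LOG_FILE='ms3-bin.000123',
    MASTER_LOG_POS=4;
  START SLAVE;"

# Verify the slave is catching up.
mysql -h ms1 -e "SHOW SLAVE STATUS\G"
```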

Documentation

  • Starting from http://wikitech.wikimedia.org/view/External_storage, expand and improve documentation until it sufficiently describes the current environment and how to work with it
    • architecture of the current cluster
    • how to move new inserts from one cluster to the other
    • how to age data off the current cluster
      • noop?
    • how to set up a new cluster

Database replication - desired end state

This section assumes we will be able to fit all external store data on two clusters. This is likely to be true but has not yet been validated.

We have 7 database machines between pmtpa and eqiad available to devote to the external store replication setup. The desired end state has two clusters, each replicated to both data centers: each cluster will have a master and a slave in one colo and a slave in the other, with the two masters in opposite colos. New inserts always go to the 'active' cluster, i.e. the one whose master is in the same colo as the main database. (Note: the active cluster will be in pmtpa until we are able to fail everything over to eqiad.) The seventh box will hang off the active cluster as a spare, available to swap in for failed hosts.

Here's a picture: External storage cluster-2.png

Tasks

  • rack and set up eqiad servers (rt-xxx)
  • set up puppet for external storage class (rt-91)