Dumps/Rerunning a job

From Wikitech
< Dumps
Revision as of 20:03, 19 January 2012 by ArielGlenn (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

(This needs cleanup.)

If you need to run dumps for one wiki, be the user backup on one of the snapshot hosts and then do

  • cd /backups
  • python ./worker.py name-of-wiki-here

For example,

  • python ./worker.py enwiki

If you have the stub file and want to generate the full XML file with text revisions from the stub file, be the user backup on one of the snapshot hosts, and run just that job for that date:

  • cd /backups
  • python ./worker.py --job metahistorybz2dump --date YYYYmmdd [--configfile configfilenamehere] name-of-wiki-here

Example:

  • python ./worker.py --job metahistorybz2dump --date 20100904 --configfile wikidump.conf.bigwikis ruwiki

For other types of page dumps, change the job name. If you give the command with --job help it will produce a list of the various jobs; they should be self explanatory.

Ordinarily this will run with prefetch. If you want no prefetch (i.e. you want to get every text revision directly from the database instead of looking at the files from the previous run to get what you can from there), you can give the addition option --noprefetch.

Note that the name of the wiki must always be last on the command line, and it is the name as seen in /home/wikipedia/common/all.dblist .


The old way to run jobs (still ok but you probably won't need it) is described below:

  • cd /backups
  • /usr/bin/php -q /apache/common/php-1.5/maintenance/dumpTextPass.php --wiki=name-of-wiki --stub=gzip:/mnt/dumps/public/name-of-wiki/timestamp/name-of-wiki-timestamp-stub-meta-history.xml.gz --prefetch=bzip2:/mnt/dumps/public/name-of-wiki/timestamp/name-of-wiki-timestamp-pages-meta-history.xml.bz2 --force-normal --report=1000 --server=10.0.0.234 --spawn=/usr/bin/php --output=bzip2:/mnt/dumps/public/name-of-wiki/timestamp/name-of-wiki-timestamp-pages-meta-history.xml.bz2 --full

Example:

  • /usr/bin/php -q /apache/common/php-1.5/maintenance/dumpTextPass.php --wiki=ruwiki --stub=gzip:/mnt/dumps/public/ruwiki/20100531/ruwiki-20100531-stub-meta-history.xml.gz --prefetch=bzip2:/mnt/dumps/public/ruwiki/20100331/ruwiki-20100331-pages-meta-history.xml.bz2 --force-normal --report=1000 --server=10.0.0.234 --spawn=/usr/bin/php --output=bzip2:/mnt/dumps/public/ruwiki/20100531/ruwiki-20100531-pages-meta-history.xml.bz2 --full

If you want to run against a different type of stub file and produce the text XML output for it, adjust the names appropriately for the stub, prefetch and output files, and change the --full option accordingly (to e.g. --current).

If you want to run without spawning the fetchText.php process for text revision retrievals, leave off the --spawn=/usr/bin/php option.

Personal tools
Namespaces

Variants
Actions
Navigation
Ops documentation
Wiki
Toolbox