Dumps/Dump servers

From Wikitech
 

Latest revision as of 23:25, 20 February 2012


XML Dump servers

Hardware

We have two hosts:

  • Dataset2 in Tampa, production:
    Hardware/OS: PowerEdge R410, Ubuntu 10.04, 2 MD1000 arrays, 16GB RAM, 4 6-core Xeon X5650 CPUs
    Disks: 144GB on the internal HDs in RAID 1; 48T on the arrays, as two RAID 6 partitions combined into one LVM volume
    Note: this host also serves other public datasets, such as some POTY files and the pagecount stats.
  • Dataset1001 in D.C., rsync/mirrors:
    Hardware/OS: PowerEdge R510, Ubuntu 10.04, 1 MD-something array, 16GB RAM, 1? quad-core Xeon E5640 CPUs
    Disks: 24 2TB disks in 2 12-disk RAID 6 volumes; 120GB partition for the OS, 1GB for swap, the rest combined into one 38T LVM volume
    Status: currently doing the initial rsync of data from dataset2
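
As a rough sanity check on the disk figures above, RAID 6 gives up two disks' worth of space per array to parity. The sketch below applies that rule to the Dataset1001 layout; the function name is made up for illustration, and the result is only an upper bound before the OS and swap partitions, unit conversion and filesystem overhead are taken out.

```shell
#!/bin/sh
# Usable capacity of one RAID 6 array: two disks' worth of space goes to parity.
# raid6_usable_tb is a hypothetical helper name, not an existing tool.
raid6_usable_tb() {
    disks=$1      # number of disks in the array
    size_tb=$2    # size of each disk, in TB
    echo $(( (disks - 2) * size_tb ))
}

# Dataset1001 layout from the list above: two 12-disk arrays of 2TB disks.
per_array=$(raid6_usable_tb 12 2)
total=$(( per_array * 2 ))
echo "per array: ${per_array} TB, total: ${total} TB"
```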

Services

The production host serves dump files and other datasets to the public.

It relies on lighttpd, which sometimes dies for no apparent reason. To restart it, run:

/etc/init.d/lighttpd restart
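
If the restarts become frequent, a small check could be run by hand or from cron. The following is a minimal sketch, not something installed on the hosts: the function name `ensure_running` and the `SERVICE`/`INIT_SCRIPT` variables are assumptions for illustration, and it presumes `pgrep` is available.

```shell
#!/bin/sh
# Restart a service through its init script if no matching process is running.
# SERVICE and INIT_SCRIPT are illustrative variables; the defaults match the
# restart command shown above.
SERVICE="${SERVICE:-lighttpd}"
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/$SERVICE}"

ensure_running() {
    # pgrep -x matches the exact process name and exits non-zero if none is found.
    if pgrep -x "$SERVICE" >/dev/null 2>&1; then
        echo "$SERVICE is running"
    else
        echo "$SERVICE is not running, restarting"
        "$INIT_SCRIPT" restart
    fi
}
```

Calling `ensure_running` as root performs the check; nothing happens until the function is called.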

Deploying a new host

You'll need to set up the RAID arrays by hand. We typically have two arrays, so set up two RAID 6 arrays and combine them with LVM into one giant volume, formatted as XFS.
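
The LVM and XFS part of that setup might look roughly like the sketch below. It assumes the two RAID 6 arrays already exist and show up as block devices; the device names, volume group name and logical volume name are all hypothetical placeholders. By default it only prints the commands, so the plan can be reviewed before anything is touched.

```shell
#!/bin/sh
# Sketch: join two RAID 6 block devices into one LVM volume and format it as XFS.
# All device and volume names below are hypothetical placeholders.
# With DRY_RUN=1 (the default) commands are only printed, never executed.
DRY_RUN="${DRY_RUN:-1}"
ARRAY1="${ARRAY1:-/dev/sdb}"
ARRAY2="${ARRAY2:-/dev/sdc}"
VG="${VG:-data}"
LV="${LV:-dumps}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_volume() {
    run pvcreate "$ARRAY1" "$ARRAY2"            # mark both arrays as LVM physical volumes
    run vgcreate "$VG" "$ARRAY1" "$ARRAY2"      # one volume group spanning both arrays
    run lvcreate -l 100%FREE -n "$LV" "$VG"     # one logical volume using all the space
    run mkfs.xfs "/dev/$VG/$LV"                 # format the logical volume as XFS
}

setup_volume
```

Setting `DRY_RUN=0` makes the script execute the commands for real, which destroys any data on the named devices.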

Install in the usual way: add the host to puppet by copying a pre-existing production dataset host stanza, set up everything for PXE boot, and go. You may or may not want to include the download mirror classes from puppet for the new host. If you replace the host that is the current download mirror, make sure you tweak the cron job that generates the mirror file list; see Dumps/Snapshot hosts#Other_tasks for that and other jobs you might need to check.

Space issues

If we run low on space, we can keep fewer rounds of XML dumps; see Dumps#Space for how to do that.
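
To see how close a volume is to full, something like the following could be used. It is a sketch only: the `MOUNT` and `THRESHOLD` values are illustrative and not taken from the real hosts, and `used_percent` is a made-up helper name.

```shell
#!/bin/sh
# Print the used-space percentage of the filesystem holding a given path.
used_percent() {
    # df -P gives stable POSIX output; field 5 of line 2 is e.g. "83%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# MOUNT and THRESHOLD are illustrative values, not taken from the real hosts.
MOUNT="${MOUNT:-/}"
THRESHOLD="${THRESHOLD:-90}"

pct=$(used_percent "$MOUNT")
if [ "$pct" -ge "$THRESHOLD" ]; then
    echo "WARNING: $MOUNT is ${pct}% full"
else
    echo "OK: $MOUNT is ${pct}% full"
fi
```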
