User:Bhartshorne/swift tasks 2012-08-13
From Wikitech
Revision as of 21:26, 13 August 2012
- upgrade to 1.5.0 (with ganglia statsd stuff disabled)
- test in labs
- test proxy, test storage
- test on eqiad
- test in labs
- sync content
- test between eqiad-prod cluster and ??? (eqiad-test? labs?)
- enable 1.5 statsd ganglia stuff
- disable ganglia-logtailer
- disable local logging?
- update ganglia view for new metrics
- redo zones in pmtpa
- improve reaction-based documentation (instead of feature-based documentation)
  - what to do when a host fails; what to do when a nagios alert triggers (for each nagios alert); etc.
- audit and replace disks across all backends
- improve dead disk detection methods, automate alerting and replacing
- document how to switch from pmtpa to eqiad
  - container synchronization is an eventually consistent thing; how to synchronize the change?
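
One possible way to approach the last question is to compare container listings from both clusters and only switch once they agree. The sketch below is an assumption, not an existing tool: `diff_containers`, `index_listing` and `SyncDiff` are hypothetical names, and it assumes each listing has already been fetched as the JSON form of a Swift container listing (a list of entries with `name` and `hash` fields).

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Mapping


@dataclass
class SyncDiff:
    """Differences between a source and a destination container listing."""
    missing_in_dest: List[str] = field(default_factory=list)
    extra_in_dest: List[str] = field(default_factory=list)
    mismatched: List[str] = field(default_factory=list)

    @property
    def in_sync(self) -> bool:
        # The containers agree only when no difference of any kind remains.
        return not (self.missing_in_dest or self.extra_in_dest or self.mismatched)


def index_listing(listing: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map object name -> content hash for one container listing."""
    return {entry["name"]: entry["hash"] for entry in listing}


def diff_containers(source: Iterable[Mapping[str, str]],
                    dest: Iterable[Mapping[str, str]]) -> SyncDiff:
    """Compare two listings (hypothetical helper, not part of Swift)."""
    src = index_listing(source)
    dst = index_listing(dest)
    return SyncDiff(
        missing_in_dest=sorted(src.keys() - dst.keys()),
        extra_in_dest=sorted(dst.keys() - src.keys()),
        mismatched=sorted(
            name for name in src.keys() & dst.keys() if src[name] != dst[name]
        ),
    )


if __name__ == "__main__":
    # Made-up listings for illustration only.
    pmtpa = [{"name": "a.jpg", "hash": "111"}, {"name": "b.jpg", "hash": "222"}]
    eqiad = [{"name": "a.jpg", "hash": "111"}]
    result = diff_containers(pmtpa, eqiad)
    print(result.in_sync, result.missing_in_dest)
```

Because synchronization is eventually consistent, a check like this would need to be repeated (with writes to the old cluster paused) until `in_sync` is true; a single clean pass while writes continue proves nothing.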