Server admin log/Archive 20
From Wikitech
Revision as of 19:24, 21 June 2011
June 21
- 19:26 river: taking db18 out of rotation to dump s7 for TS
- 18:56 Ryan_Lane: moving preilly from restricted to mortals, for deploy access
- 15:01 apergos: deleted the old ssh key for snap2.wm.o from the puppet db also, as fenari was whining about it similarly, hope that gets it. spence puppet run *finally* done, sheesh
- 13:17 apergos: after changing snapshot2 from external to internal subnet (oops forgot to log that huh), puppet on spence was complaining about the external resource Nagios_host[snapshot2] not being able to override the local resource, so in puppet db on db9 I deleted the old snapshot2 nagios resources, leaving only the new ones, puppet run is in process now on spence and is bleeping sloooow.
- 09:14 apergos: taking down snapshot2 in prep for os upgrade (reinstall)
- 01:52 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Fix checkuserwiki permissions back to how they were'
- 00:49 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Permission changes for checkuserwiki. Bug 28781.'
June 20
- 23:57 binasher: reconfiguring mobile1 for testing the new php mobile mediawiki extension
- 23:03 notpeter: added checks for google safe browsing for lots of WMF sites to nagios
- 22:53 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Missed Comment 1 in bug 28785.'
- 22:42 logmsgbot: reedy synchronized php-1.17/includes/api/ApiEditPage.php 'bug 29278 r90943'
- 21:57 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Changed language, sitename and namespace for checkuserwiki. Bug 28781.'
- 21:33 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Changed icons on checkuserwiki. Bug 28785.'
- 19:20 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/ext.articleFeedback/ext.articleFeedback.js 'r90480'
- 18:51 binasher: drop db29 (s6 master) weight to 0
- 18:51 logmsgbot: root synchronized php-1.17/wmf-config/db.php 'drop db29 weight to 0'
- 18:21 logmsgbot: root synchronized php-1.17/wmf-config/db.php 'adding db43 to s6 with a weight of 20'
- 18:21 binasher: increased db43 weight to 1500
- 18:07 binasher: just put db43 into rotation in the s6 cluster
- 18:04 logmsgbot: root synchronized php-1.17/wmf-config/db.php 'adding db43 to s6 with a weight of 20'
- 16:53 pdhanda: Upgrading wikitech.wikimedia.org to 1.17
- 15:43 apergos: started en pedia dumps run on snapshot 4, from root screen session as backup user. This will run as 32 jobs in parallel; we should keep a half an eye on the db in about 16-17 hours (previous runs had only 15 jobs going at once).
- 14:18 mark: Configured TiNet transit on cr2-eqiad (BGP sessions deactivated)
- 14:18 mark: Configured border ACL filtering on the eqiad border
- 00:46 logmsgbot: tstarling synchronized php-1.17/includes/db/LoadBalancer.php 'r90423'
June 17
- 20:39 RobH: email is working again, but address in question is not filtered
- 20:38 RobH: pushed a small exim change to try to sort out a sender address, but it didn't work, rolled back changes
- 16:44 RobH: upgraded blog.wikimedia.org to wordpress 3.1.3
- 16:44 RobH: restarted morebots.
- 00:33 Ryan_Lane: restarting pybal on amslvs4
- 00:32 Ryan_Lane: restarting pybal on amslvs2
- 00:31 Ryan_Lane: restarting pybal on amslvs3
- 00:30 Ryan_Lane: restarting pybal on amslvs1
June 16
- 23:51 Ryan_Lane: restarting pybal on amslvs3
- 23:51 Ryan_Lane: restarting pybal on amslvs1
- 23:50 Ryan_Lane: adding uploadsvc and uploadsecure services to amslvs1 and amslvs3
- 22:25 Ryan_Lane: restarting pybal on lvs1
- 21:47 RobH: puppet now works on eqiad hosts.
- 21:38 Ryan_Lane: enabled https for bits. still needs varnish changes.
- 21:38 RobH: updating dns with puppet cname in eqiad to sockpuppet
- 21:37 RobH: dispatched gilman to peter for puppetize testing of grosley
- 21:21 RobH: dns update for puppetmaster cname in eqiad
- 21:13 RobH: updated dns with carbon install server and cp server ranges
- 21:10 mark: Added all eqiad subnets to dhcpd.conf
- 21:10 mark: Prepared puppet's base.pp for eqiad
- 21:05 Ryan_Lane: restarting pybal on amslvs3
- 21:03 Ryan_Lane: killing and starting pybal on amslvs1
- 21:03 Ryan_Lane: restarting pybal on amslvs1
- 21:01 Ryan_Lane: adding bitssvc address to amslvs systems
- 20:52 notpeter: restarting drbd replication on nfs2
- 20:52 RobH: db1001 is our first wmf proper installed host in eqiad, huzzah
- 20:44 mark: Added static route for 10.2.1.23/32 to lvs2 on csw1-sdtpa
- 20:09 mark: Added 10.64.0.0/12 to squid http_access on brewster
- 18:51 Ryan_Lane: bound bitssvc address to lvs2, restarting pybal
- 18:33 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php
- 18:24 Ryan_Lane: restarting pybal on lvs2
- 18:24 Ryan_Lane: adding bitssvc and bitssecure services to pmtpa lvs
- 18:01 notpeter: reimaging nfs2
- 16:36 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php
- 14:54 andrewS: srv169 plugged in and powered on
- 14:26 RobH: updating again, there are db1001-1016 in a single rack, doh
- 14:24 RobH: updating dns with new dbs in eqiad
- 12:00 RoanKattouw: Set $wgInstantCommons back to true on prototype
- 11:59 RoanKattouw: Enabled WikiLove on prototype
- 11:15 andrewS: will be working on srv278 later today
- 11:12 logmsgbot: root synchronized php-1.17/wmf-config/mc.php
- 09:50 logmsgbot: ariel synchronized php-1.17/wmf-config/CommonSettings.php 'raise account creation throttle for el pedia workshop'
- 01:38 Tim: success, my SIP connection now works again
- 01:38 Tim: attempting to remove the SIP connection rate limit from sfoasterisk
June 15
- 21:32 awjr: killing epicly long running query in civicrm db on db9
- 21:28 logmsgbot: reedy synchronized php-1.17/wmf-config/codereview.php 'Couple more auto defers'
- 19:07 logmsgbot: catrope synchronized php-1.17/extensions/WikiLove/WikiLove.api.php 'r90138'
- 18:47 logmsgbot: catrope synchronized php-1.17/extensions/WikiLove/modules/ext.wikiLove/ext.wikiLove.core.js 'r90135'
- 18:45 logmsgbot: catrope synchronized php-1.17/extensions/WikiLove/WikiLove.hooks.php 'r90135'
- 18:18 logmsgbot: catrope synchronized php-1.17/resources/mediawiki.util/mediawiki.util.js 'r90128'
- 18:03 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enable WikiLove on officewiki'
- 18:02 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Add switch for WikiLove'
- 18:01 mark: Upgraded msw1-eqiad to JUNOS 10.4R3.4
- 17:59 RoanKattouw: Adding WikiLove table on officewiki (extensions/WikiLove/patches/WikiLoveLog.sql)
- 17:58 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Add $wmgUseWikiLove, false by default'
- 17:53 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 90124:
- 17:52 RoanKattouw: Syncing WikiLove code. Code sync only, not enabled anywhere yet
- 16:19 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Revert raised account creation throttle for Lennart on testwiki'
- 12:37 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29351 Set timezone to America New_York for pcdwiki'
- 05:17 apergos: phase 2 of integrity checks on dump files (bz2s) running from screen on sn1 as root
June 14
- 19:41 RobH: chad is crazy it seems.
- 19:41 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29215: import sources for arbcom_dewiki'
- 19:40 logmsgbot: catrope synchronized php-1.17/README 'Test sync to confirm whether Chad is crazy'
- 18:29 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'bug 29331 - Set e-mail capture sender to improve@wikimedia.org'
- 17:50 notpeter: restarting udp2log on locke after upgrade
- 15:10 mark: Upgraded asw-a-eqiad to JUNOS 10.4R3.4
- 14:42 mark: Upgraded asw-b-eqiad to JUNOS 10.4R3.4 (jtac recommended version)
- 14:26 apergos: running a bunch of rudimentary integrity checks on xml dumps, out of screen as root on snapshot1
- 06:09 logmsgbot: tstarling synchronized php-1.17/includes/HistoryBlob.php 'r90036'
June 13
- 23:40 notpeter: taking down nfs-home on nfs2
- 23:03 Ryan_Lane: ran sync-docroot - how does this not get logged?
- 19:16 RobH: snapshot4 properly boots to serial console display on bootloader and os
- 18:31 RobH: snapshot4 rebooting to troubleshoot console redirection
- 13:27 apergos: snapshot4 installed and *almost* correct (stray apache started up but the rest looks ok)
June 12
- 19:47 JeLuF: Software RAID on fenari was broken, mirror sdb1 was failed. Re-added sdb1. If the error is persistent, sdb needs to be replaced.
- 19:19 JeLuF: cron.{hourly,daily,...} on hume was not running due to a broken record in /etc/crontab, so logrotate was not running
- 12:58 logmsgbot: ariel synchronized php-1.17/includes/HistoryBlob.php 'check for existence of mhash function, not extension (need for lucid php build) (take 2, after svn up :-P)'
- 12:51 logmsgbot: ariel synchronized php-1.17/includes/HistoryBlob.php 'check for existence of mhash function, not extension (need for lucid php build)'
June 11
- 07:37 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Move section disabling minordefault out of the if($wmgUseUsabilityInitiative) conditional and document it'
- 07:12 RoanKattouw: Running script to fix preferences of users that can't turn off minordefault on enwiki. See bug 24313 comment 55. Script is in maintenance/fixMinorPrefs.php
June 10
- 21:26 binasher: mobile2 goes natty
- 18:08 Ryan_Lane: added ip and dns entry for internproxy.wikimedia.org in the sandbox vlan
- 17:28 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'See whether this fixes bug 29314'
- 15:21 RobH: copying old data from ms4 for solaris specific info. do not disrupt ms4.
- 02:24 RobH: ms4 repaired and back online, ready for reinstall per ticket 885
- 01:15 Ryan_Lane: starting services on virt1, since iptables rules are in
June 9
- 23:54 binasher: upgrading mobile2 to maverick (pre natty)
- 23:04 Ryan_Lane: added preilly account as a restricted user. Added him to mobile1-5
- 22:30 binasher: upgrading mobile3 to natty ice
- 21:38 binasher: upgrading mobile3 to maverick (then natty)
- 20:02 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 89789:
- 20:02 RoanKattouw: Running scap to deploy ArticleFeedback changes
- 18:59 binasher: mobile5 natty
- 18:42 binasher: mobile5 upgrade to maverick on the way to natty
- 15:28 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Raise account creation throttle for Lennart's IP on testwiki'
- 15:10 RobH: dns updated, all nameservers dig correctly.
- 15:10 RobH: pushing dns updates to add mgmt of all eqiad devices
- 01:22 Ryan_Lane: shutting off services on virt1 (including puppet) until I get a chance to firewall them properly
June 8
- 22:35 Ryan_Lane: rebooting virt1
- 21:58 hcatlin: deploying to mobile cluster
- 21:34 Ryan_Lane: changing controller.labs.wikimedia.org to a CNAME for virt1.wikimedia.org
- 21:27 mark: Started ircd on ekrem
- 21:20 mark: Moved routing of vlan 101 from csw1-sdtpa to csw5-pmtpa
- 21:06 binasher: changing pybal healthcheck url for mobile
- 20:47 Ryan_Lane: rebooting controller.labs.wikimedia.org
- 20:44 Ryan_Lane: restarting pdns on ns0, it's locked up
- 20:33 Ryan_Lane: make that deleting virt1.pmtpa.wmnet, and changing virt1.mgmt.pmtpa.wmnet to controller.labs.mgmt.pmtpa.wmnet
- 20:32 Ryan_Lane: changing virt1.pmtpa.wmnet to controller.labs.pmtpa.wmnet
- 20:27 Ryan_Lane: changing ip of controller.labs.wikimedia.org
- 20:17 mark: Rebooting ekrem
- 20:16 mark: Changed ekrem's ip
- 20:10 mark: Added 2nd ip to ekrem (208.80.152.178)
- 19:56 binasher: upgrade mobile4 to natty
- 19:53 mark: Shutdown ekrem's switch port and null routed the ip for a while
- 19:53 Ryan_Lane: deleting vmdev1 and vmdev2, and moving the controller.labs.wikimedia.org entry into the public services subnet
- 19:45 binasher: upgrading mobile4 to maverick on its way to natty
- 19:31 Ryan_Lane: reclaiming sandbox ips for owa.tesla, owa1.tesla, owa2.tesla, lvs1.tesla, and grid.tesla
- 18:10 Ryan_Lane: depooling mobile5
- 18:05 Ryan_Lane: changing weight for mobile5
- 17:42 Ryan_Lane: upping the weight of mobile5 to 24
- 17:27 Ryan_Lane: upping weight of mobile5 to 16
- 17:18 Ryan_Lane: depooling mobile4
- 17:16 Ryan_Lane: reducing weight of mobile4/5 to 8
- 17:13 Ryan_Lane: repooling mobile4/5 with a weight of 16
- 16:29 ^demon|away: irc feeds are down, somebody who has access should investigate.
- 15:39 mark: Rebooting sq71 to remove old temp eqiad tunnel config
- 14:49 mark: Setup PIM-SM on eqiad core routers and joined the pmtpa domain
- 14:49 mark: Changed pmtpa PIM-DM to PIM-SM
- 11:31 mark: Added A records for cr1-eqiad and cr2-eqiad loopback interfaces
- 10:47 mark: Set local-pref value for private peers on cr1-eqiad and cr2-eqiad
- 10:43 mark: Removed cr1-eqiad:ae1.666 configuration to remove the temporary tunneled link
- 10:39 mark: Configured OSPF and OSPFv3 on csw1-sdtpa:e16/1 and cr2-eqiad:xe-5/2/1
- 10:39 mark: Assigned IP addresses to csw1-sdtpa:e16/1 and cr2-eqiad:xe-5/2/1 to bring up the second 10G wave
- 00:37 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enabling CustomUserSignup on testwiki'
June 7
- 23:51 Ryan_Lane: rebooting virt1/2
- 23:21 Ryan_Lane: rebooting virt1
- 23:21 Ryan_Lane: moving nova-network to virt2, as it was causing issues accessing the public IP on virt1
- 23:12 Ryan_Lane: rebooting virt1
- 22:33 Ryan_Lane: removed nova-network from virt2-4, since multiple network nodes isn't supported. Added to virt1. Rebooting virt2-4.
- 22:11 hcatlin: deploying to mobile cluster
- 21:55 Ryan_Lane: restarting opendj on nfs2
- 19:56 logmsgbot: tfinc synchronized php-1.17/wmf-config/InitialiseSettings.php 'enabling collections extension on wikinews'
- 19:39 andrewS: ms4 ticket 729 updated, system is in testing
- 17:46 Ryan_Lane: rebooting virt1 to enable virtualization in the bios
- 17:42 Ryan_Lane: rebooting virt2 to enable virtualization in the bios
- 17:39 Ryan_Lane: rebooting virt3 to enable virtualization in the bios
- 17:35 Ryan_Lane: rebooting virt4 to enable virtualization in the bios
- 17:01 logmsgbot: awjrichards synchronized php-1.17/wmf-config/CommonSettings.php 'Disabling Tidy on collabwiki per bug 29295'
- 16:50 mark: Adjusted BGP filters on cr1-eqiad to allow advertisement of pmtpa aggregates to BGP peers
- 16:29 mark: Removed static routes from cr1-eqiad and cr2-eqiad
- 16:18 mark: Setup full mesh of iBGP sessions between csw5-pmtpa, csw1-sdtpa, cr1-eqiad, cr2-eqiad (separate v4/v6 sessions)
- 15:17 mark: Removed IPv6-only iBGP session between cr1-eqiad and cr2-eqiad and converted the v4 iBGP session into multi-address-family
- 14:22 mark: Setup OSPFv3 between csw1-sdtpa and csw5-pmtpa
- 14:08 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28292'
- 14:04 logmsgbot: reedy synchronized php-1.17/includes/api/ApiPageSet.php 'r89646'
- 13:58 mark: Setup OSPF and OSPFv3 on the first 10G wave circuit between csw1-sdtpa and cr1-eqiad
- 13:39 RobH: updated puppet giving andrew shields basic restricted shell access to fenari
- 11:08 mark: Setup quick log rotation of ipv6and4 on maerlant
- 04:51 logmsgbot: demon synchronized php-1.17/wmf-config/CommonSettings.php 'Let normal users test/signoff on revisions, just like we let them post comments. Restricting it to "coders" kind of defeats the purpose'
June 6
- 20:22 logmsgbot: hashar: ci.tesla: cleaned up backlog and /tmp (I really need to fix this)
- 17:00 logmsgbot: hashar: ran namespaceDupes.php frwikiversity --fix (bug 29015)
- 15:45 logmsgbot: catrope synchronized php-1.17/extensions/WikimediaIncubator/IncubatorTest.php 'r89571'
- 15:45 logmsgbot: catrope synchronized php-1.17/extensions/WikimediaIncubator/SpecialViewUserLang.php 'r89571'
- 14:54 logmsgbot: catrope synchronized php-1.17/extensions/WikimediaIncubator/SpecialRandomByTest.php 'r89565'
- 12:23 logmsgbot: catrope synchronized php-1.17/extensions/WikimediaIncubator/IncubatorTest.php 'r89561'
- 11:41 logmsgbot: catrope synchronized php-1.17/extensions/WikimediaIncubator/SpecialViewUserLang.php 'Attempted fix for fatal error'
- 11:36 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enable WikimediaIncubator extension on incubatorwiki'
- 11:28 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Add incubator loading code'
- 11:24 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Add $wmgUseIncubator, set to false'
- 11:14 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 89556:
- 11:13 RoanKattouw: Running scap to deploy WikimediaIncubator extension (code push only, not yet enabled anywhere)
- 09:46 logmsgbot: catrope synchronized php-1.17/includes/json/Services_JSON.php 'r89555'
- 06:18 Ryan_Lane: rebooted virt2/3, so that nova-network would reconfigure itself properly
- 05:26 Ryan_Lane: depooled mobile4/5
June 5
- 23:22 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29269 Enable wgRawHtml on collabwiki - it is locked down wiki'
- 18:06 logmsgbot: reedy synchronized php-1.17/includes/api/ApiPageSet.php 'r89514 for bug 25734'
- 15:32 logmsgbot: reedy synchronized php-1.17/wmf-config/codereview.php 'Reduce semantic autodefer regex'
- 09:18 mark: killed old stray lsearchd process, started new one on search7 (search pool 3)
June 4
- 11:17 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29270'
June 3
- 22:44 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php
- 22:43 logmsgbot: pdhanda synchronized closed.dblist
- 22:10 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 29007. Permission changed per comment 2'
- 21:50 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 29007. Added autopatrol to fawiki'
- 21:34 Ryan_Lane: upping weights of mobile4/5 to 24
- 21:28 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/ArticleFeedback.i18n.php 'r89444, r89445'
- 21:28 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/ArticleFeedback.hooks.php 'r89444, r89445'
- 21:27 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/jquery.articleFeedback/jquery.articleFeedback.css 'r89444, r89445'
- 21:27 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/jquery.articleFeedback/jquery.articleFeedback.js 'r89444, r89445'
- 21:27 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/ext.articleFeedback/ext.articleFeedback.js 'r89444, r89445'
- 20:43 Ryan_Lane: added temporary dns entry controller.labs.wikimedia.org until we figure out a proper DNS scheme for labs
- 19:43 Ryan_Lane: repooling mobile4/5 with a weight of 8
- 19:09 notpeter: pushing mobile 4 and 5 into pool
- 19:00 Ryan_Lane: reinstalling opendj on virt1 with a new domain; using labs instead of tesla
- 17:59 awjr: killing eternally running query on civicrm db on db9
- 17:28 pdhanda: running maintenance/namespaceDupes.php on frwikiversity
June 2
- 20:00 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/ext.articleFeedback/ext.articleFeedback.js 'Live hack to disable join CTA'
- 19:03 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Bump AFT tracking version to 8'
- 19:02 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/ext.articleFeedback/ext.articleFeedback.js 'r89355'
- 19:02 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/ext.articleFeedback/ext.articleFeedback.startup.js 'r89355'
- 19:01 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/jquery.articleFeedback/jquery.articleFeedback.css
June 1
- 21:46 logmsgbot: reedy synchronized php-1.17/includes/User.php 'bug 27514 Make unblockself available as a global permission'
- 21:24 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 26319 Enable collection on fawiki. Testing on rtl'
- 21:13 logmsgbot: reedy synchronized php-1.17/wmf-config/CommonSettings.php 'bug 27323 WMF Logo/Button not served via Bits'
- 19:07 Ryan_Lane: switching test.wikipedia.org back to text cname, since it is using a fake cert. (testing was successful)
- 18:47 Ryan_Lane: changing cname for test.wikipedia.org to wikipedia-lb, to test https
- 18:41 Ryan_Lane: changing scheduler for https services from wrr to sh on pmtpa lvs servers. restarting pybal on lvs1
- 18:38 Ryan_Lane: make that ns1
- 18:38 Ryan_Lane: restarting pdns on ns2, it locked up after authdns-update
- 18:36 Ryan_Lane: adding wikipedia-lb geodns entry
- 18:30 Ryan_Lane: restarting pybal on amslvs1
- 18:29 Ryan_Lane: adding wikipedia-lb service to amslvs1
- 16:11 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29228'
- 15:43 logmsgbot: reedy synchronized php-1.17/extensions/CentralAuth/SpecialCentralAuth.php 'bug 28767'
- 15:40 logmsgbot: reedy synchronized php-1.17/extensions/CentralAuth/SpecialCentralAuth.php 'bug 28767'
- 14:39 Reedy: Poking at CentralAuth (bug 28767) on testwiki. If things get more broken, it's probably me
- 09:32 apergos: there's a hung tcpdump on amssq37 that won't die. it doesn't seem to be doing much but it won't die either (that was me looking at purge issues)
- 08:18 thedj: Cache invalidations issues continue, but squid source unknown atm.
May 31
- 23:28 logmsgbot: asher synchronized php-1.17/wmf-config/db.php 'removing db7 from rotation'
- 23:19 Ryan_Lane: restarting pybal on lvs1
- 23:07 Ryan_Lane: adding forward entry for wikipedia-lb, for https testing
- 22:44 Ryan_Lane: restarting pdns on ns0, it's locked up
- 22:43 Ryan_Lane: changing cname for wikimania2005 to wikimedia-lb, for testing https
- 22:37 Ryan_Lane: adding geodns entry for wikimedia-lb
- 22:26 Ryan_Lane: adding wikimedia-lb forward entry for esams
- 21:56 Ryan_Lane: switching wikimania2005 back to text
- 21:45 Ryan_Lane: changing cname of wikimania2005 from text to wikimedia-lb to test https
- 20:51 hashar: wikitech 'ci.tesla: setup a cron for user ci to delete /tmp/mwParser* directories'
- 20:26 mark: Extended LVS prefix lists on csw1-esams and csw2-esams with 91.198.174.224/28
- 20:25 Ryan_Lane: restarting pybal on lvs1
- 19:52 Ryan_Lane: updating yvon and gurvin in dns, moving them to the squid-lvs address range
- 19:40 Ryan_Lane: adding wikimedia https service to pybal, and restarting pybal
- 18:47 mark: Disabled (redundant) PIM on csw5-pmtpa:ve3, so only csw1-sdtpa is running PIM now
- 18:20 Ryan_Lane: and restarting apache on it
- 18:20 Ryan_Lane: running sync common on srv186
- 16:07 logmsgbot: hashar: ci.tesla: was actually out of inodes on /. /tmp filled with mwParser-...-images directories
- 16:02 logmsgbot: hashar: ci.tesla: restarted cruisecontrol (no space left on device) error. It still has space though!
May 29
- 21:19 logmsgbot: demon synchronized php-1.17/extensions/LiquidThreads/classes/View.php 'r89136'
- 19:17 logmsgbot: hashar: ci.tesla installed imagemagick php-gd php5-gd
- 18:42 logmsgbot: hashar: resumed continuous integration on ci.tesla
- 18:37 logmsgbot: hashar: ci.tesla out of disk, cleaning up /dev/sda1 which is / (a good idea would be to use separate partitions)
- 17:12 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29119 Allow nlwiki crats to remove rollbacker group'
- 05:57 Ryan_Lane: restarted pdns on dobson, it wasn't responding to requests
- 03:07 Ryan_Lane: killed some long running civi queries on db9 that have been running for over 200,000 seconds.
- 02:54 mark: Killed oprofile daemon and unloaded oprofile module on db32
- 02:38 mark: Restarted squid-frontend on sq76
May 28
- 19:41 logmsgbot: reedy synchronized php-1.17/wmf-config/CommonSettings.php 'Set wgCodeReviewMaxDiffPaths = 30'
- 17:52 Andrew: Loaded board vote configuration from extensions/SecurePoll/cli/wm-setup/bv2011-*-final3-jump.xml
- 17:48 logmsgbot: andrew synchronized php-1.17/wmf-config/CommonSettings.php 'Hooks for SecurePoll'
- 15:01 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29119'
- 08:15 logmsgbot: andrew synchronized php-1.17/extensions/SecurePoll/includes/user/Auth.php 'deploy r89026'
- 07:49 logmsgbot: andrew synchronizing Wikimedia installation... Revision: 89008:
- 07:48 Andrew: Running scap to deploy SecurePoll updates
May 27
- 22:17 logmsgbot: demon synchronized closed.dblist 'bug 29064: cleanup wikimania wiki closures (2005-2010)'
- 22:17 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29064: cleanup wikimania wiki closures (2005-2010)'
- 22:00 logmsgbot: catrope synchronized php-1.17/includes/GlobalFunctions.php 'Remove wfMsgExt profiling again'
- 21:53 logmsgbot: catrope synchronized php-1.17/includes/MessageBlobStore.php 'Prospective fix for the UW bug'
- 21:38 logmsgbot: catrope synchronized php-1.17/includes/MessageBlobStore.php 'More profiling'
- 21:35 logmsgbot: catrope synchronized php-1.17/includes/GlobalFunctions.php 'Add profiling points in wfMsgExt()'
- 21:29 logmsgbot: catrope synchronized php-1.17/includes/MessageBlobStore.php 'Add more profiling'
- 21:15 logmsgbot: catrope synchronized php-1.17/StartProfiler.php 'Live hack for debugging UploadWizard RL weirdness'
- 21:13 logmsgbot: catrope synchronized php-1.17/load.php 'Revert live hack'
- 21:04 logmsgbot: catrope synchronized php-1.17/load.php 'Add live hack for profiling'
- 20:48 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 29129. Added the new namespace to be search by default for metawiki.'
- 20:30 logmsgbot: pdhanda synchronized php/wmf-config/InitialiseSettings.php
- 12:10 Andrew: Installing SecurePoll DB tables on wikis in ~/centralauth_wikis_missing_securepoll
- 07:57 apergos: the images Rl2-quality-full-003.jpg and Picture_602.11.85.5.jpg have been removed from our image server permanently, office action (legal request).
- 06:37 apergos: thanks tim
- 06:37 apergos: restarted irc bot on ekrem
- 06:09 apergos: er... on ekrem :-P
- 06:09 apergos: restarted irc daemon, hopefully properly
- 06:02 apergos: attempting to reboot ekrem (was not pingable)
May 26
- 23:20 notpeter: powercycling knsq5
- 21:34 Ryan_Lane: adding en.prototype, de.prototype and test.prototype cnames for prototype.wikimedia.org
- 18:58 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Raise AFT tracking percentage from 2% to 10% (restores pre-ramp-up volume) and put all users in the show-expertise bucket. Also bump tracking version to 7'
- 18:54 Ryan_Lane: adding wikimedialbsecure service to amslvs1, for testing https
- 18:52 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/jquery.articleFeedback/jquery.articleFeedback.js 'r88909'
- 18:47 Ryan_Lane: adding wikimedia-lb.esams address to ams and kn text squids, also adding lvs configuration for this address
- 18:31 Ryan_Lane: removing mediawiki-lb.esams and foundation-lb.esams reverse addresses, they were assigned to the same ips as text.esams and bits.esams respectively
- 17:48 Ryan_Lane: adding forward entry for wikimedia-lb.pmtpa.wikimedia.org
- 15:34 mark: Moved AS13030 connection configuration from port 2/11 to port 2/11 on br1-knams
May 25
- 22:39 awjr: svn up'd on civicrm.wikimedia.org to r189 - deploying further optimizations to contribution searching
- 21:22 Ryan_Lane: adding wikimedia-lb svc address to squids
- 21:22 Ryan_Lane: adding wikimedialbsecure service to lvs2, for testing https access to wikimedia.org sites.
- 21:21 Ryan_Lane: modified pybal configuration to add wikimedia-lb ip to text backend
- 19:52 logmsgbot: demon synchronized php-1.17/extensions/CentralAuth/SpecialWikiSets.php 'r88825'
- 18:32 Ryan_Lane: turning off the selenium grid virtual machines on tesla
- 17:04 RobH: updated bugzilla per rt 764 and 768. whine feature should work now
- 16:44 awjr: deployed patch to civicrm.wikimedia.org to optimize contribution search query (r185, wikimedia repo)
- 14:57 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Don't allow users to create accounts on closed wikis either'
- 14:54 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Cleanup wikimania200[58] entries: bug 29064'
- 14:53 logmsgbot: demon synchronized closed.dblist 'Add wikimania200[58] to closed.dblist: bug 29064'
- 14:45 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29064: close wikimania2009wiki'
- 14:43 logmsgbot: demon synchronized closed.dblist 'bug 29064: close wikimania2009wiki'
- 12:26 Tim: marked the deletion jobs in OTRS as "invalid" to avoid confusing error messages
- 12:25 Tim: re-enabled anti-deletion security patch in OTRS, disabled "temporarily" by Fred on June 4, 2009
- 11:58 domas: rebooting db9, looks unhealthy
- 11:06 apergos: so that was me on srv225: stopped apache, waited for puppet restart: it now syncs, then does the restart. oh btw srv225 isn't in the pybal apache conf file, weirdly... anyone know why?
- 09:15 apergos: restarted front and backend squids on sq75
- 08:47 apergos: restarted front and back end squids on sq81, front end was 20gb and back end was out to lunch
- 08:35 apergos: ekrem's root partition was full; cleared out some older logs from /var/log/apache2, compressed the most recently rotated (it had failed from disk full), reloaded puppet
- 07:20 Tim: set read_only=1 on db10, it's a slave so it should be read only. Nagios says it was broken for 6 months.
- 07:07 logmsgbot: midom synchronized php-1.17/wmf-config/db.php
- 05:09 Tim: restarted nagios IRC bot
- 01:05 Tim: created an account for myself in watchmouse
- 00:48 Andrew: Running populateBv2011EditCount.php in a screen on hume
- 00:32 Ryan_Lane: pruning binlogs on db9
May 24
- 23:12 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 29129. Added a namespace alias for metawiki.'
- 20:36 notpeter_: resyncing drbd nfs1 to nfs2
- 20:28 RobH: mobile4 awaiting final configurations for service, mobile5 installation done, not yet run puppet updates
- 20:20 RobH: mobile4 is running puppet updates, mobile5 is mid-install
- 19:46 RobH: mobile4 & mobile5 racked and accessible for mgmt. installing OS on mobile4 now
- 18:00 Ryan_Lane: added new reverse zone for svc ips in esams; added entries for bits.svc.esams.wmnet and upload.svc.esams.wmnet
- 17:31 Ryan_Lane: removing bits-lb and upload-lb public ip entries, and adding bits.svc.pmtpa.wmnet and upload.svc.pmtpa.wmnet entries for https testing
- 16:46 mark: Upgraded cr2-eqiad to JUNOS 10.4R4.5
- 16:44 mark: PIM configuration was missing on some virtual interfaces on csw1-sdtpa, which broke purging to esams. Put back in place
- 16:09 mark: Reloading re1.cr2-eqiad
- 16:01 mark: Fixed tunnel hack between pmtpa and eqiad
- 15:01 RobH: db20 networking restarted, back online
- 14:58 RobH: Miguel is rebooting db18
- 13:55 mark: reloading csw5
- 13:52 domas: ns2 has been down for 3 hours, will try restarting pdns
- 13:47 mark: reloading asw-a4
- 13:45 mark: reloading asw-d3
- 13:39 domas: db18/20/28 are still down, may need power cycle. Took them out of rotation.
- 13:37 RobH: restarted networking on db33-39.
- 13:30 mark: restarted networking on db30-32.
- 13:25 domas: DB server issue appears to be a linux kernel issue, forcedeth-related, can be fixed with /etc/init.d/networking restart. Doing so on db10-30, asking mark to do the rest.
- 13:23 mark: restarted slot 3 since that's where the problematic database servers are
- 13:12 mark: csw1-sdtpa back up, but a whole lot of database servers are unreachable
- 13:07 mark: upgraded and reloaded csw1-sdtpa
- 12:05 mark: Upgraded bootrom and software on asw-d1-sdtpa and asw-a5-sdtpa
- 10:53 mark: Copied existing 2.7.0.2 image to secondary flash, and installed 2.7.0.3a image on primary flash on csw5-pmtpa
- 10:42 mark: Copied existing 2.6.0 image to secondary flash, and installed 2.7.0.3a image on primary flash on csw1-sdtpa
- 04:20 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap_body.php 'r88707'
- 04:03 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php
- 04:03 Tim: enabling GoogleNewsSitemap on enwikinews
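The 13:25 entry above describes recovering the forcedeth-affected database servers by running /etc/init.d/networking restart across db10-30. A minimal sketch of how such a host range and per-host command line could be generated follows. The helper name, the use of ssh, and the unpadded hostname pattern are assumptions for illustration, not a record of what was actually run:

```python
def networking_restart_commands(first, last, prefix="db"):
    """Build one ssh command line per host in an inclusive numeric range.

    Nothing is executed here; the caller decides how to run the strings.
    The hostname pattern (prefix + number) is an illustrative assumption.
    """
    if first > last:
        raise ValueError("first must not exceed last")
    commands = []
    for n in range(first, last + 1):
        host = f"{prefix}{n}"
        commands.append(f"ssh {host} /etc/init.d/networking restart")
    return commands
```

Calling networking_restart_commands(10, 30) yields one command string per host from db10 through db30, which can be reviewed before anything is run.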
May 23
- 21:18 Ryan_Lane: added upload-lb.pmtpa.wm.o and bits-lb.pmtpa.wm.o addresses to reverse dns for https preparation
- 19:32 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29003'
- 18:45 apergos: reloading puppet on all hosts, with a minute sleep in between, running in screen on fenari as root. needed for enabling pluginsync on clients
- 18:20 Ryan_Lane: moved ipv6 proxy from iris to maerlant
- 17:35 RobH: db20 back online, needs mysql setup for replication on another cluster. created rt#836
- 17:29 RobH: db20 was offline from earlier hardware issues that appear to have been resolved. it is in no current deployment group. booting it online to see if it requires reinstallation
- 14:31 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/SitemapFeed.php
- 14:31 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/FeedSMItem.php 'r88640'
- 14:30 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap_body.php 'r88640'
- 14:23 hcatlin: Deploying code update to Mobile wikipedia
- 05:08 Tim: in watchmouse: updated subversion monitoring, was broken since May 7
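The 18:45 entry above describes reloading puppet on all hosts with a minute's sleep in between. A minimal sketch of that kind of rolling loop follows. The function name, the injected action callback, and the choice to continue past failing hosts are assumptions, not a description of the script that ran on fenari:

```python
import time


def rolling_run(hosts, action, pause_seconds=60, sleep=time.sleep):
    """Call action(host) for each host, pausing between hosts.

    Returns the list of hosts for which action raised an exception,
    so one failing host does not stop the rest of the run.
    """
    hosts = list(hosts)
    failed = []
    for index, host in enumerate(hosts):
        try:
            action(host)
        except Exception:
            failed.append(host)
        # No pause is needed after the final host.
        if index < len(hosts) - 1:
            sleep(pause_seconds)
    return failed
```

Passing the sleep function as a parameter keeps the loop testable without real delays; in real use the default time.sleep applies.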
May 22
- 17:47 logmsgbot: reedy synchronized php-1.17/extensions/CodeReview/ui/CodeStatusChangeListView.php 'r88585'
- 17:46 logmsgbot: reedy synchronized php-1.17/extensions/CodeReview/ui/CodeCommentsListView.php 'r88585'
- 00:09 Reedy: Commons is back up and editable
May 21
- 23:38 Reedy: Intermittent MySQL errors occurring on hits to 10.0.6.41, it is being looked into
- 23:35 Ryan_Lane: powercycling db31, it's dead
- 10:27 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '25963 - Enable of Recent changes patrol for Serbian Wikisource (sr.wikisource.org)'
May 20
- 22:30 Reedy: StewardWiki security issues fixed
- 22:22 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'Stewardwiki'
- 21:23 awjr: running contribution auditing scripts on civicrm.wikimedia.org
- 21:21 awjr: enabling log audit module on civicrm.wikimedia.org
- 21:18 awjr: svn up'ing to r183 of wikimedia branch on production instance of civicrm on grosley
- 20:57 logmsgbot: pdhanda ran sync-common-all 'Bug 28717. Group changes and flagged revs enabled for bnwiki'
- 20:50 Ryan_Lane: maerlant too
- 20:50 Ryan_Lane: rebooting maerlent
- 20:13 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28773. more rights and namespaces for stewardwiki'
- 17:16 RobH: updated dns for new mobile servers mgmt
- 17:12 RoanKattouw: Applying index changes on click_tracking_user_properties from r88462 to the cluster. Using /home/catrope/patch-click_tracking_user_properties-index.sql
- 15:50 logmsgbot: pdhanda synchronized php-1.17/wmf-config/abusefilter.php 'Bug 28461. Custom abuse filter settings for be_x_oldwiki'
- 15:50 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28461. Custom abuse filter settings for be_x_oldwiki'
- 14:41 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28614. Added suppressredirect to autopatrolled on svwikisource'
- 14:33 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28576. Changed upload settings for huwiki'
- 03:12 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Touch config'
- 03:12 logmsgbot: demon synchronized closed.dblist 'bug 29044: closure of tkwikibooks'
- 02:48 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29058: import sources for arwikinews'
- 02:04 logmsgbot: LocalisationUpdate failed
- 00:06 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 29057'
May 19
- 22:26 ^demon: add new 'tools' group to public svn authz
- 21:02 logmsgbot: hashar: cruisecontrol: tests should now run in roughly 1min30! Still need attention though
- 20:16 logmsgbot: hashar: cruisecontrol: archived build logs for 2010 and 2011 from 01 till 04
- 18:48 logmsgbot: hashar: relaunched CruiseControl service on ci
- 02:01 logmsgbot: LocalisationUpdate failed
May 18
- 19:44 Ryan_Lane: restarting squid processes on sq71
- 18:11 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Turning GNSM back off for now -- apache spike'
- 17:50 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Deploying GNSM for enwikinews'
- 17:29 logmsgbot: demon synchronizing Wikimedia installation... Revision: 88303:
- 16:06 ^demon: didn't work, had errors with ArticleAssessmentPilot, FlaggedRevs, LQT_alpha, and MessagesTp. Plus lots of rsync permission errors. Saved output to /home/demon/logs/l10nupdatefail.log
- 15:53 ^demon: doing manual run of l10nupdate
- 15:24 logmsgbot: demon synchronized php-1.17/wmf-config/extension-list
- 02:05 logmsgbot: LocalisationUpdate failed
May 17
- 18:33 logmsgbot: reedy synchronized php-1.17/wmf-config/abusefilter.php 'fix capitalisation fail'
- 18:30 logmsgbot: reedy synchronized php-1.17/wmf-config/abusefilter.php 'bug 29017 again doesn't seem to have pushed out'
- 13:24 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Re-enable CLDR'
- 13:23 logmsgbot: catrope synchronized php-1.17/languages/Language.php 'r88303'
- 13:09 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Disable CLDR because of fatal'
- 13:03 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enable CLDR extension on all wikis'
- 12:52 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 88299:
- 12:51 RoanKattouw: Running scap to deploy CLDR extension. Only enabled on test for now, will go site-wide in a minute
- 11:57 logmsgbot: catrope synchronized php-1.17/languages/messages/MessagesEn.php 'r88295'
- 11:34 logmsgbot: catrope synchronized php-1.17/extensions/UploadWizard/resources/jquery/jquery.validate.wmCommonsBlacklist.js 'r88293'
- 09:52 Tim: creating an account on wikitech for priyanka
- 09:51 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Re-enable UploadWizard on Commons, adding note about how to switch it back off if it explodes'
- 09:39 logmsgbot: catrope synchronized php-1.17/extensions/UploadWizard/SpecialUploadWizard.php 'Fix fatal in live hack'
- 09:37 logmsgbot: catrope synchronized php-1.17/extensions/UploadWizard/SpecialUploadWizard.php 'Live hack to debug UploadWizard issue'
- 09:17 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Re-enabling UploadWizard on test to hopefully reproduce ResourceLoader issue'
- 05:21 mark: Reduced passenger pool size to 200 on all mobile servers
- 04:46 mark: Depooled mobile2 and rebalanced LVS weights on m.wikipedia
May 16
- 21:56 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '28504 - Need to remove all tokipona settings from InitialiseSettings.php'
- 21:52 logmsgbot: hashar: cleaning up settings files for tokipona stuff (bug 28504)
- 19:03 logmsgbot: reedy synchronized php-1.17/wmf-config/abusefilter.php 'bug 29017'
- 09:37 logmsgbot: ariel synchronized php-1.17/wmf-config/CommonSettings.php 'raise account throttle for el wiki for workshop (maybe I can talk the community into upping the threshhold generally??)'
May 15
- 21:13 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'Closing 4 wikis: thwikinews (bug 28341), huwikinews (bug 28342), cowikibooks (bug 28644), astwikiquote (bug 28964)'
- 21:13 logmsgbot: demon synchronized closed.dblist 'Closing 4 wikis: thwikinews (bug 28341), huwikinews (bug 28342), cowikibooks (bug 28644), astwikiquote (bug 28964)'
- 21:02 logmsgbot: demon synchronized php-1.17/includes/DefaultSettings.php 'r88203'
- 21:01 logmsgbot: demon synchronized php-1.17/includes/User.php 'r88203'
- 12:27 logmsgbot: pdhanda ran sync-common-all
- 12:12 logmsgbot: reedy synchronized php-1.17/wmf-config/CommonSettings.php 'bug 28141'
- 11:48 logmsgbot: robh synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28987'
- 11:32 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '28727 - please create Portal namespace on am.wikipedia'
- 11:05 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '25172 - Enable DynamicPageList extension on it.wikisource'
- 10:35 logmsgbot: hashar: had to manually delete a row from the ptwiki.langlink table (ll_from: 162503, ll_lang: zh-classic, ll_title: Template:Link A) (bug 28805)
- 10:27 logmsgbot: hashar: running update ignore query on ptwiki for bug28805
- 10:23 logmsgbot: hashar: batch for bug 28805 finished.
- 10:17 logmsgbot: hashar: running batch queries on all wikis for bug 28805
- 08:44 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '28609 - Disable uploads for eu.wikipedia.org'
- 01:14 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '28914 - Adding 'ipblock-exempt' userright to the Bots usergroup in enwiki'
- 00:52 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '28959 - si.wikipedia $wgSitename'
May 14
- 22:46 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 27434'
- 22:08 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php 'wgMetanamespace for amwiki (bug 28727)'
- 22:05 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php '28727 - please create Portal namespace on am.wikipedia'
- 17:11 mark: Fixed IPv6 routing on csw5-pmtpa
- 16:34 logmsgbot: reedy synchronized php-1.17/wmf-config/abusefilter.php 'bug 28153'
- 16:30 logmsgbot: reedy synchronized php-1.17/wmf-config/abusefilter.php 'bug 28153'
- 16:16 logmsgbot: reedy synchronized php-1.17/wmf-config/CommonSettings.php 'bug 28141'
- 16:16 logmsgbot: reedy synchronized php-1.17/wmf-config/CommonSettings.php 'bug 28141'
- 16:16 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28141'
- 16:14 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28141'
- 15:57 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 27078'
- 15:52 RobH: added dns entry for wikimania2012.w.o
- 15:23 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28773'
- 15:22 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php
- 14:50 awjr: cleaned out hudson job files for 'Donations queue consume' on grosley and archived them to storage3
- 14:22 mark: Allocated service IPs for all projects for HTTPS in DNS
- 14:06 logmsgbot: tstarling synchronized php-1.17/lib/geshi
- 13:53 mark: Upgraded kernel on streber, rebooting it
- 13:41 logmsgbot: reedy synchronized php-1.17/lib/geshi 'bug 24383'
- 13:31 logmsgbot: reedy synchronized php-1.17/lib/geshi 'bug 24383'
- 13:17 logmsgbot: reedy synchronized php-1.17/lib/geshi 'bug 24383'
- 13:13 logmsgbot: reedy synchronized php-1.17/lib/geshi 'bug 24383'
- 13:11 logmsgbot: reedy synchronized php-1.17/lib/geshinew 'Test pushing symlink'
- 12:35 Ryan_Lane: shutting down owa1-3.tesla.usability.wikimedia.org
- 12:24 Ryan_Lane: adding maerlant.esams.wikimedia.org as a https/ipv6 proxy node
- 10:06 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php 'enabled enotif for user talk pages on all wikis'
- 09:54 logmsgbot: reedy synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28895'
- 09:43 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28612. Added the autopatrolled group to elwiktionary'
- 09:28 logmsgbot: hashar synchronized php-1.17/wmf-config/InitialiseSettings.php 'Fill in template for bug 28727 (Portal namespace creation for am.wikipedia)'
- 09:16 logmsgbot: tstarling synchronized php-1.17/includes/MessageCache.php 'removed message stats patch'
- 08:51 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28479. Added the patroller group to plwikiquote'
May 13
- 17:33 Ryan_Lane: yvon and gurvin configured as https servers
- 17:26 RobH: gurvin os installed, puppet not run
- 17:14 RobH: moved gurvin vlan from 2 to 101
- 17:06 RoanKattouw: Uncommented srv196 in the node list and ran sync-common on it. It was giving me colorless diffs on CodeReview
- 17:03 RobH: yvon reinstalled, won't run puppet initial run due to error from new https service entry
- 16:51 RobH: updating dns for hosts that didn't have public ips (gurvin, some corrections for yvon)
- 16:28 logmsgbot: catrope ran sync-common-all
- 16:22 RoanKattouw: Running sync-common-all because sync-docroot didn't work
- 16:12 RoanKattouw: Syncing docroot
- 16:12 RoanKattouw: Adding xml/api directory to mediawiki.org's docroot, with an index.html similar to the one in the export-0.4 directory
- 16:02 logmsgbot: reedy synchronized php-1.17/RELEASE-NOTES
- 12:14 logmsgbot: robh synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28347'
- 10:48 logmsgbot: robh synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 26037'
- 10:31 Tim: deployed a patch to gather i18n usage stats for siebrand
May 11
- 22:09 mark: Reloaded csw5-pmtpa on existing code base
- 01:16 logmsgbot: demon synchronized php-1.17/wmf-config/CommonSettings.php 'Temporarily disabling UploadWizard per Neil K while a bug is being worked out.'
- 01:15 logmsgbot: demon synchronized php-1.17/extensions/UploadWizard/SpecialUploadWizard.php 'r87865'
- 01:14 logmsgbot: demon synchronized php-1.17/extensions/UploadWizard/UploadWizard.i18n.php 'r87865'
- 00:10 notpeter: puppet back to normal on all boxes that are up except for knsq5.
May 10
- 23:36 Ryan_Lane: fixing torrus. why oh why must you corrupt yourself constantly, torrus?
- 23:06 notpeter: distributing new puppet certs to all boxes managed by puppet
- 22:51 logmsgbot: tstarling synchronized php-1.17/wmf-config/CommonSettings.php 'ED 1.17'
- 21:34 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enabling Article Feedback dashboard on enwiki'
- 21:30 logmsgbot: awjrichards synchronized php-1.17/extensions/ArticleFeedback/ArticleFeedback.php 'r87844'
- 21:29 logmsgbot: awjrichards synchronized php-1.17/extensions/ArticleFeedback/ArticleFeedback.i18n.php 'r87844'
- 21:28 logmsgbot: awjrichards synchronized php-1.17/extensions/ArticleFeedback/populateAFStatistics.php 'r87844'
- 21:28 logmsgbot: awjrichards synchronized php-1.17/extensions/ArticleFeedback/SpecialArticleFeedback.php 'r87844'
- 21:02 Ryan_Lane: powercycling srv281 since it's dead
- 21:01 Ryan_Lane: powercycling srv247 since it's dead
- 20:56 Ryan_Lane: mobile has stabilized, moving weights back to normal
- 20:50 Ryan_Lane: lowering weight on mobile1 and mobile2
- 20:45 Ryan_Lane: restarted apache on mobile2.
- 19:16 mark: Moved all inter-subnet routing from csw5-pmtpa to csw1-pmtpa
- 19:16 mark: Reconfigured pybal on lvs3 to BGP connect to csw1-sdtpa instead of csw5-pmtpa, and restarted it
- 17:53 mark: Moved vlan 2 routing from csw5-pmtpa:ve2 to csw1-pmtpa:ve2
- 16:27 apergos: or at least srv254.
- 16:25 apergos: nagios is whining about some services that are actually in operation (eg srv250)
- 16:07 Ryan_Lane: or not, mgmt interface isn't letting me in
- 16:06 Ryan_Lane: powercycling srv250
- 13:56 Tim: removed myself from SMS notification in nagios, since the configuration is so broken that it sends me a text every 10 minutes day or night
- 13:49 RobH: nagios sms pages seem to lag the actual service outage by a long, long time
- 13:10 Tim: did the above in both puppet and nagios directly since puppet seems to be still broken
- 13:07 Tim: increased the number of retries of payments.wikimedia.org before sending an SMS from 1 to 20
- 02:02 notpeter: accidentally deleted sockpuppet:/var/lib/puppet/ssl, am regenerating keys and will begin signing soon
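The 13:07 entry above raises the number of retries for payments.wikimedia.org before an SMS is sent from 1 to 20. A minimal sketch of why that quiets a flapping check follows. It is a simplified model in which an alert fires after a run of consecutive failed checks; it is an assumption for illustration, not Nagios's actual state machine, and the function name is hypothetical:

```python
def checks_until_alert(results, max_attempts):
    """Return the 1-based index of the check that triggers an alert, or None.

    results is an iterable of booleans (True means the check passed).
    In this simplified model an alert fires once max_attempts consecutive
    checks have failed; any passing check resets the counter.
    """
    consecutive = 0
    for index, ok in enumerate(results, start=1):
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= max_attempts:
            return index
    return None
```

In this model a service that alternates between failing and passing alerts on its first failure when max_attempts is 1, but never alerts when max_attempts is 20, while a sustained outage still alerts on the twentieth failed check.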
May 9
- 23:36 notpeter: so annoying. doubling puppet freshness check timing to 4 hours.
- 22:01 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Disabling dashboard on enwiki pending fixes to data collection script'
- 21:20 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87774:
- 21:20 RoanKattouw: Running scap to deploy UploadWizard changes
- 20:54 RoanKattouw: Running extensions/ArticleFeedback/populateAFStatistics.php on enwiki
- 20:49 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/populateAFStatistics.php 'r87770'
- 20:49 logmsgbot: catrope synchronized php-1.17/extensions/ArticleFeedback/modules/ext.articleFeedback/ext.articleFeedback.js 'r87770'
- 20:30 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enabling ArticleFeedback dashboard on testwiki'
- 20:28 RoanKattouw: Running populateAFStatistics.php on testwiki
- 19:39 logmsgbot: catrope synchronized php-1.17/resources/startup.js 'Dummy sync to push config change through in ResourceLoader'
- 19:39 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Re-enable AFT click tracking at 2%, and bump tracking version'
- 19:32 logmsgbot: catrope synchronized php-1.17/resources/startup.js 'Force config changes to update in startup module'
- 19:21 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Raise AFT lottery odds to the target value of 2.7%'
- 19:03 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Raise AFT lottery odds to 2%'
- 18:41 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Raise AFT lottery odds to 1%'
- 18:39 RobH: upgraded Reedy's cluster access to allow him to deploy and do shell requests, will take awhile to push out to the cluster
- 18:25 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Raise AFT lottery odds to 0.5% for real this time'
- 18:21 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Bump AFT tracking version, needed to actually change the click tracking percentage'
- 18:16 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 28495'
- 18:14 Ryan_Lane: adding a shell account for asher (Asher Feldman), and adding him to roots
- 18:13 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Temporarily disable click tracking for AFT'
- 18:11 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bump AFT lottery odds to 1%'
- 18:10 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enable AFT dashboard on enwiki'
- 18:09 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Bump ArticleFeedback tracking version'
- 18:01 RoanKattouw: Running schema change (adding index of aa_timestamp field on enwiki.article_feedback table) on s1 by directly running it on the master. Table size is ~275k rows
- 17:56 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Set $wmgArticleFeedbackLotteryOdds to 0.005 on enwiki. This enables AFT on 0.5% of all enwiki main namespace pages'
- 17:44 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87742:
- 17:43 RoanKattouw: Running scap to deploy ArticleFeedback changes
- 16:49 RoanKattouw: Deploying ArticleFeedback changes to testwiki
- 14:21 apergos: restarting amssq39 squid back-end (memory hog)
- 13:41 apergos: restarting squid back-end on amssq40 (memory hog)
- 13:27 logmsgbot: catrope synchronized php-1.17/resources/mediawiki/mediawiki.js 'r87712'
- 13:27 logmsgbot: catrope synchronized php-1.17/includes/OutputPage.php 'r87712'
- 13:26 logmsgbot: catrope synchronized php-1.17/includes/resourceloader/ResourceLoaderContext.php 'r87712'
- 13:26 RoanKattouw: Deploying r87712, which should fix the ResourceLoader breakage in IE
- 13:12 apergos: restarted squid back-end on amssq37 (hogging memory)
- 08:06 apergos: rebooting srv198 for testing
- 07:52 apergos: rebooting srv281 for testing
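The 17:56 entry above sets $wmgArticleFeedbackLotteryOdds to 0.005 and describes this as enabling AFT on 0.5% of enwiki main namespace pages; later entries raise the odds to 1%, 2% and 2.7%. A minimal sketch of a deterministic per-page lottery follows, showing how a fractional setting maps to a share of pages. The log does not say how ArticleFeedback actually picks pages, so the function name and the modulo-bucket scheme here are assumptions:

```python
def in_lottery(page_id, odds, buckets=1000):
    """Return True if page_id falls in the winning share of buckets.

    odds is a fraction between 0 and 1 (0.005 means 0.5%). The page id is
    reduced modulo the bucket count, and the lowest round(odds * buckets)
    buckets win. This scheme is illustrative only.
    """
    if not 0 <= odds <= 1:
        raise ValueError("odds must be between 0 and 1")
    return page_id % buckets < round(odds * buckets)
```

Because the result depends only on the page id, a given page stays in or out of the lottery across requests, and raising the odds only ever adds pages to the selected set.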
May 8
- 22:46 RobH: fixed deep linking issue for uploads from old techblog. merge did not copy, had to manually pull from singer and also add additional rewrite rule to apache via puppet
- 22:40 RobH: copied over techblog upload data, which upon further review was not copied by the merge plugin as expected
- 18:29 RobH: updated categories on blog to remove testblog url entry
- 18:20 RobH: updating dns to revert bad change on blog propagation, split wikimediafoundation template files to themselves, were soft linked to wiktionary.org
- 18:05 RobH: blog redirects are in place for projects, not sure why or what added them, as they do not belong. fixed hooper via puppet to handle blog.wikimediafoundation.org
- 15:46 RobH: added exim simple mail sender to puppet entry for hooper
May 7
- 20:31 RobH: hooper running back on puppet, with all config files being updated normally
- 18:11 RobH: tinkering with redirects on hooper. disabled puppet daemon while doing local apache config changes, once tested will push up to puppet and reenable it on hooper
- 18:08 RobH: blog dns updated enough for both myself and guillom to test basic blog function. all basic use seems fine. still testing more advanced use cases
- 16:58 RobH: updated dns for blog and techblog. may take an hour or so to propagate, cannot update new blog until it resolves for authors to it. (old blogs will slowly cycle out of dns)
- 16:40 RobH: blog migration moving along, all data migrated, basic rewrites and such are in place, final review and cleanup in progress before changing testblog.w.o to blog.w.o in database and updating DNS
- 16:03 RobH: testblog.w.o now running 'master' database for the wmf blog. online as testblog, can do quick db replace when it's ready to go online
- 15:57 RobH: starting the copy of the main blog, any changes to blog.w.o after this time will be lost in the migration to the new host
- 15:53 RobH: updating blog software on hooper, installing all extensions before connecting to database copy of newsblog (not yet copied)
- 15:37 RobH: wiping out testblog on hooper and getting basic software installed for new blog deployment. database not yet copied, old blogs still 'primary'
- 12:55 logmsgbot: ariel synchronized php-1.17/wmf-config/CommonSettings.php 'raise account throttle for el wiki for workshop'
- 09:20 rainman-sr: deleted some old logs on search11 to free up space
May 6
- 22:46 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28773. Modified wgAddGroups and enabled subpages for stewardwiki'
- 20:39 Ryan_Lane: added apaches::service to imagescalers class in puppet to ensure that sync-common will run before apache is started on the imagescalers
- 18:10 awjr: enabling drupal/civicrm logging on civicrm.wikimedia.org (forgot to turn this back on post-upgrade)
- 17:41 notpeter: killing the shit out of puppet to clear its cache.
- 17:38 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28791. Fixed logo for piwiki'
- 17:19 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28689 - enabled NewUserMessage on thwiki'
- 13:44 RobH: installed and activated wordpress importer on testblog
- 08:58 RoanKattouw: Ran sync-common on srv281 and restarted Apache. It was complaining about missing directories in docroot
- 08:55 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87486:
- 08:55 RoanKattouw: Uncommenting srv281 in /etc/dsh/group/mediawiki-installation and scapping again
- 08:53 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87486:
- 08:52 RoanKattouw: Running scap because srv281 is out of sync. I thought we had measures to prevent out-of-sync servers from being pooled?
- 05:03 apergos: powercycled srv281 (again)
- 01:55 Tim: stopped xinetd on fenari, running svn up to add REL1_17 branch to extdist snapshot
May 5
- 17:32 notpeter: taking down nfs1 to upgrade to lucid and upgrade udplog
- 05:47 logmsgbot: tstarling synchronized php-1.17/img_auth.php 'r87486'
- 05:47 logmsgbot: tstarling synchronized php-1.17/includes/WebRequest.php 'r87486'
- 05:46 logmsgbot: tstarling synchronized php-1.17/includes/User.php 'r87486'
May 4
- 23:11 awjr: all CiviCRM-related cron and hudson jobs re-enabled
- 22:19 awjr: CiviCRM upgrade complete, putting civicrm.wikimedia.org back on-line
- 22:17 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap_body.php 'r87462'
- 22:17 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap.php 'r87462'
- 22:16 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap.alias.php 'r87462'
- 21:33 awjr: performing CiviCRM upgrade (3.1.6->3.4.0) on grosley
- 21:24 awjr: upgrade to drupal 6.20 complete for civicrm.wikimedia.org
- 21:10 awjr: replacing /srv/org.wikimedia.civicrm dir on grosley with svn co from new deployment branch in wikimedia repo (/branches/deployment/fundraising-civicrm/d620c34)
- 20:51 awjr: disabling civicrm and drupal-related cron/hudson jobs
- 20:49 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87452:
- 20:43 logmsgbot: catrope synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap_body.php 'Fix GNSM, r87488'
- 20:37 awjr: backing up 'civicrm' database from db9
- 20:34 awjr: backing up 'drupal' database from db9
- 20:30 awjr: taking civicrm.wikimedia.org offline for drupal and civicrm upgrade
- 20:14 RoanKattouw: Deploying new UploadWizard code to testwiki
- 18:42 Ryan_Lane: created a restricted account for neilk
- 15:33 logmsgbot: demon synchronized php-1.17/extensions/CentralAuth/SpecialWikiSets.php
- 14:29 hcatlin: deploying mobile language update
May 3
- 22:38 logmsgbot: pdhanda synchronized php-1.17/wmf-config/abusefilter.php 'Bug 28502 - custom config for thwiki'
- 22:37 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28502 - enabled abusefilter for thwiki'
- 21:21 Ryan_Lane: powercycling srv281
- 21:12 logmsgbot: catrope synchronized php-1.17/README 'Test sync to see which hosts are borked'
- 21:04 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Bug 28257. Added requested NamespaceAlias to orwiki'
- 20:32 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87355:
- 20:31 RoanKattouw: Running scap to deploy 1.17wmf1 merges (r87353)
- 16:24 awjr: re-upgrading dev.civicrm.wikimedia.org from civicrm 3.1.6 to 3.4.0
- 15:59 RobH: testblog ops work done, no other services impacted
- 15:13 RobH: working on testblog on hooper and db9, no other services should be affected
- 03:30 RobH: issue of singer network spiking and plummeting, along with swap death, at ~3am gmt
- 03:25 RobH: singer blocked on db9, flushed hosts to allow to reconnect
- 01:30 logmsgbot: demon synchronized php-1.17/extensions/GoogleNewsSitemap/GoogleNewsSitemap_body.php 'r87311'
- 01:29 logmsgbot: demon synchronized php-1.17/extensions/FlaggedRevs/FlaggedRevs.hooks.php 'r87311'
- 00:44 tomaszf: killing long running CiviReport query on db9 due to watchdog being off for upgrade
May 2
- 20:02 logmsgbot: laner synchronized php-1.17/wmf-config/InitialiseSettings.php 'Adding Research namespace to meta, per bug #28742'
- 17:52 Ryan_Lane: powercycling srv281
- 17:50 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enabled LabeledSectionTransclusion for mediawiki - requested by guillom'
- 17:28 RobH: fixing issue with bad serial assignment leading to bad network config on ps1-a1/2-eqiad, restarting both power strips
- 15:47 mark: Powercycled amssq35
- 04:21 logmsgbot: tstarling synchronized php-1.17/includes/Wiki.php 'r87235'
- 03:58 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php
- 03:34 logmsgbot: tstarling synchronized php-1.17/includes/parser/ParserCache.php
- 03:31 logmsgbot: tstarling synchronized php-1.17/includes/parser/ParserCache.php
- 02:13 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php 'bug 27891 log'
- 02:00 logmsgbot: tstarling synchronized php-1.17/includes/parser/ParserCache.php 'r87234'
- 01:22 logmsgbot: demon synchronized php-1.17/extensions/FlaggedRevs/FlaggedRevs.hooks.php 'r87233'
- 01:21 logmsgbot: demon synchronized php-1.17/extensions/FlaggedRevs/FlaggedRevs.php 'r87233'
- 00:09 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php 'wgBlockDisablesLogin on checkuserwiki'
May 1
- 21:14 Ryan_Lane: ran flush host on db9 to bring blog.wikimedia.org back
April 30
- 11:38 logmsgbot: catrope synchronized php-1.17/includes/api/ApiQueryAllUsers.php 'Attempt to fix auprop=blockinfo bug'
- 11:15 Ryan_Lane: (make that blog.wikimedia.org ;) )
- 11:14 Ryan_Lane: ran flush hosts on db9 to bring blogs.wikimedia.org back up
April 29
- 23:27 ^demon: all of that was to deploy GoogleNewsSitemap to testwiki for the enwikinews folks.
- 23:23 logmsgbot: demon synchronized php-1.17/wmf-config/CommonSettings.php
- 23:22 logmsgbot: demon synchronized php-1.17/wmf-config/InitialiseSettings.php
- 23:18 logmsgbot: demon synchronizing Wikimedia installation... Revision: 87149: r87149
- 19:43 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Fix config bug that caused tracking to fail for AFT, and bump tracking version'
- 19:18 mark: powercycled ersch
- 15:40 RobH: typo asw-b8
- 15:39 RobH: swapping msw1-eqiad fan tray with asw-a8-eqiad fan tray
- 05:58 logmsgbot: midom synchronized php-1.17/wmf-config/db.php 'adding srv170 after hw reset'
- 05:55 logmsgbot: midom synchronized php-1.17/wmf-config/db.php
April 28
- 23:46 Tim: doing hard reset of alsted, it's not sending ssh version strings and not allowing logins by serial
- 22:17 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Disabled option to allow blocked users to edit their talk pages on itwiki. See Bug 9073'
- 21:02 Ryan_Lane: created Ambassador-announce-l list
- 20:52 richcole1: replacing memory in srv281
- 20:49 RobH: srv281 shutdown for memory replacement
- 20:41 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enable by default, option to enable blocked users to edit their own talk page'
- 20:34 richcole1: replacing bad drive in srv284
- 20:08 Ryan_Lane: powercycling srv281
- 20:03 RobH: ms4 is powered down, was already offline and unresponsive to console.
- 18:19 logmsgbot: demon synchronizing Wikimedia installation... Revision: 87091: r87090, r87091
- 18:02 mark: Restarted apache on srv235, probably corrupted apc cache
- 17:57 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Re-enabling EmailCapture on enwiki'
- 17:54 logmsgbot: awjrichards synchronizing Wikimedia installation... Revision: 87084:
- 17:47 logmsgbot: awjrichards synchronized php-1.17/wmf-config/CommonSettings.php 'Preparing conf for ArticleFeedback with way less click tracking to prevent cluster breakage'
- 17:40 RobH: etherpad.proxy to 000-etherpad.proxy on hooper apache2 enabled, should fix
- 17:34 RobH: hooper vhost for testblog back online
- 17:17 RobH: rolled back testblog vhost from hooper
- 17:01 RobH: testblog enabled for initial setup on hooper without disrupting etherpad service (on either of its url redirects)
- 16:56 RobH: updating etherpad.proxy apache config to include eiximenis serveralias and enabling the testblog vhost
- 16:48 Reedy: Manually pinged CR update script, had got stuck - 2 revisions. Let me know if it does it again
- 16:21 logmsgbot: catrope synchronizing Wikimedia installation... Revision: 87085:
- 16:21 RoanKattouw: Rolling back ArticleFeedback deployment
- 16:11 RoanKattouw: API cluster was briefly dead just now after enabling EmailCapture, disabling it seems to be bringing it back
- 16:09 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Disable EmailCapture, maybe this will bring our API Apaches back'
- 16:04 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Enabling EmailCapture on enwiki'
- 16:01 RoanKattouw: (that was for the ArticleFeedback deployment)
- 15:58 logmsgbot: awjrichards synchronizing Wikimedia installation... Revision: 87083:
- 14:08 nelson: Upped the max_connections for rsyncd from 2 to 5 across the test cluster.
- 13:56 nelson: Upped the concurrency of swift-object-replicator from default of 1 to 5
- 13:38 mark: Restored I/O scheduler deadline for device /dev/sda on alsted
- 13:34 mark: Temporarily set I/O scheduler cfq for device /dev/sda on alsted
- 12:29 mark: powercycled harmon
- 01:00 awjr: upgraded dev.civicrm.wikimedia.org to CiviCRM 3.4.0
April 27
- 20:24 nimish_g: nimishg enabling CustomUserSignup extension on enwiki
- 20:20 logmsgbot: nimishg synchronized php-1.17/wmf-config/InitialiseSettings.php 'enabling'
- 01:44 logmsgbot: nimishg synchronized php-1.17/extensions/ClickTracking/modules/ext.UserBuckets.js 'setting cookie expiry to 90 days'
- 01:44 logmsgbot: nimishg synchronized php-1.17/extensions/ClickTracking/ClickTracking.hooks.php 'setting cookie expiry to 90 days'
- 01:43 logmsgbot: nimishg synchronized php-1.17/extensions/CustomUserSignup/CustomUserSignup.hooks.php 'setting cooky expiry to 90 days'
- 01:03 Tim: updated sudoers in puppet for r83000
- 00:25 Tim: reniced sshd on sockpuppet to -2 so that I can log in while catalog updates are running
April 26
- 23:40 logmsgbot: nimishg synchronizing Wikimedia installation... Revision: 86895:
- 23:33 logmsgbot: tstarling synchronized php-1.17/wmf-config/CommonSettings.php
- 23:26 logmsgbot: root synchronizing Wikimedia installation... Revision: 86895:
- 23:15 Tim: disabled CustomUserSignup, was causing fatal error on mediawiki.org due to incorrect path
- 23:15 logmsgbot: tstarling synchronized php-1.17/wmf-config/CommonSettings.php
- 23:13 logmsgbot: nimishg synchronized php-1.17/wmf-config/InitialiseSettings.php 'turning'
- 23:00 logmsgbot: nimishg synchronized php-1.17/wmf-config/InitialiseSettings.php 'turning'
- 22:55 logmsgbot: nimishg synchronized php-1.17/extensions/CustomUserSignup/CustomUserSignup.hooks.php 'r86994'
- 22:49 notpeter: stopping nagios for a wee sec and then restarting.
- 22:09 logmsgbot: nimishg synchronizing Wikimedia installation... Revision: 86895:
- 22:08 nimish_g: nimishg pushing CustomUserSignup code to cluster, running on mediawikiwiki only
- 21:28 logmsgbot: awjrichards synchronized php-1.17/includes/CategoryPage.php 'r86988'
- 20:38 logmsgbot: pdhanda synchronized php-1.17/wmf-config/InitialiseSettings.php 'enabled CustomUserSignup extension for testwiki'
- 20:37 logmsgbot: pdhanda synchronized php-1.17/wmf-config/CommonSettings.php 'enabled CustomUserSignup extension for testwiki'
- 20:32 awjr: disabled wmf_premiums module on civicrm.wikimedia.org - it is currently not being used and potentially causes queue consumption to choke with certain contribution messages
- 19:51 logmsgbot: catrope ran sync-common-all
- 19:50 RoanKattouw: Running sync-common-all to deploy UploadWizard changes
- 17:52 pdhanda: Running maintenance/populateParentId.php on all wikis
- 08:21 Andrew: sync-common-all worked. scap still broken
- 08:21 logmsgbot: andrew ran sync-common-all
- 08:21 Andrew: trying sync-common-all
- 08:19 Andrew: syncs are broken, log littered with XXX: [sudo] password for andrew:
- 08:12 Andrew: re-scapping, typo in extension-list
- 08:12 logmsgbot: andrew synchronizing Wikimedia installation... Revision: 86895:
- 08:11 Andrew: Scapping to enable DisableAccount extension
- 08:11 logmsgbot: andrew synchronizing Wikimedia installation... Revision: 86895:
- 08:02 logmsgbot: andrew synchronizing Wikimedia installation... Revision: 86895:
- 08:02 Andrew: running scap to deploy the code itself
- 08:01 Andrew: deploying DisableAccount extension to checkuserwiki, stewardwiki, arbcom_enwiki since the special page was removed without consulting Philippe
- 02:15 logmsgbot: robh synchronized php-1.17/wmf-config/InitialiseSettings.php 'adding settings for checkuser and steward wikis'
April 25
- 23:33 Ryan_Lane: added python-mwclient to lucid repo
- 21:36 RobH: storage2 still offline, wont boot into os, but is remotely accessible
- 21:20 RobH: trying to fix storage2
- 20:16 notpeter: actually adding everyone on ops to watchmouse service... didn't know this had not already been done.
- 20:02 RobH: updated csw1 to remove labels and move to default vlan ports 11/12, 11/14, 11/19, & 11/21. old connection ports for dataset2, tridge, ms1, and ms5
- 19:53 RobH: the datacenter is looking awesome.
- 19:45 RobH: ms1 moved from temp network to permanent home, no downtime, responding fine
- 19:42 RobH: ms5 connection moved, no downtime, responds fine, less than 4 seconds
- 19:40 RobH: updated csw1-sdtpa 15/1,15/2 from vlan 105 to vlan 2, 15/3 and 15/4 from vlan 105 to 101
- 18:52 RobH: snapshot4 relocated to new home, ready for os install
- 18:42 RobH: db19 and db20 back online (not in services as they have other issues)
- 18:39 RobH: db19 and db20 powering back up
- 18:25 RobH: virt4 experienced an accidental reboot when rebalancing power in the rack, my fault, not the hardware
- 18:12 RobH: rack b2 power rebalanced
- 18:01 RobH: db19 set to slave, depooled in db.php, no other services evident, shutting down (mysql stopped cleanly)
- 18:00 RobH: db20 shutdown
- 18:00 RobH: didnt log that i setup ports 11/38-40 for db19, db20, and snapshot4 on csw1-sdtpa. tested out fine and all my major configuration changes on network should be complete
- 17:56 RobH: ok, db20 and db19 are coming offline to relocate their rack location due to power distro issues
- 15:47 RobH: delay, not coming down yet, need more cables
- 15:46 RobH: db19 is coming down as well, it is depooled anyhow
- 15:46 RobH: db20 is coming down, ganglia aggregation for those hosts may be delayed until it is back online.
- 15:21 RobH: relocating snapshot4 into rack c2, it will be offline during this process
- 15:20 RobH: db43-db47 network setup, sites not down, yay me
- 15:10 RobH: being on csw1 makes robh nervous.
- 15:09 RobH: labeling and setting up ports on 11/33 through 11/37 on csw1-sdtpa for db43 through db47
- 14:47 RobH: fixed storage2 serial console (set it to higher rate, magically works, or it just fears me) and also confirmed its remote power control is functioning
- 14:42 RobH: stealing dataset1's known good scs connection to test storage2. dataset1 service will remain unaffected.
April 24
- 21:30 Ryan_Lane: restarting apache on mobile1
- 15:35 RobH: swapping bad disk in db30, hotswap, should be fine
- 14:36 RobH: swapping out the management switch in c1-sdtpa. msw-c1-sdtpa will be offline, so the mgmt interfaces of servers in that rack will be offline. all normal services will remain unaffected.
April 23
- 22:31 RobH: required even.
- 22:31 RobH: no drives display error leds, further investigation requried
- 22:27 RobH: ms2 is having bad drive investigated. if we do this right, it wont go down. if we don't it will. is a slave es server.
- 22:00 RobH: singer returned to operation, blog, techblog, survey, and secure returned to normal operation
- 21:52 RobH: singer is once again coming back down for drive replacement. This will take offline blog.wikimedia.org, techblog.wikimedia.org, survey.wikimedia.org, and secure.wikipedia.org. Service will be returned as soon as possible.
- 21:19 RobH: singer back online, for awhile, will come back down for further repair shortly.
- 21:05 RobH: singer going down, blogs will be offline, so will secure, system will return to service as soon as possible
- 21:00 RobH: preparing to fix the dead drive in singer, this will offline secure, blog, techblog, and survey during the drive replacement process
- 19:49 mark: Upgrading mr1-pmtpa to junos 10.4R3.4
- 17:49 RobH: migrating searchidx1 & search1-search10 to new ports in same rack. moving one at a time and ensuring link lights between moves. (already tested with search10)
- 14:11 RobH: db19 is back online, seems to not have any mysql setup done.
- 14:02 RobH: restarting db19
- 14:02 RobH: arcconf checks out all drives on db19 are indeed working as rich found earlier
- 12:47 mark: Added (x121Address=1) condition to the LDAP query of the ldap_aliases router on mchenry's exim
- 00:32 hcatlin: Mobile: Deploying fix to an issue that kept the standard-style Main_Page from displaying on mobile
- 00:25 Ryan_Lane: restarting memcached on all of the mobile servers
- 00:23 Ryan_Lane: repooling mobile3, since mobile will die without it (fun!!)
- 00:17 Ryan_Lane: depooling mobile3
- 00:13 Ryan_Lane: restarting apache on mobile3
- 00:10 Ryan_Lane: puppet was broken on mobile1, reinstalled it
April 22
- 23:56 domas: detached gdb from srv193 apache, apparently it was used for something
- 23:14 notpeter: restarting nagios (again)
- 22:43 notpeter: restarting nagios
- 19:23 apergos: shot all stopped rsyncs on ms5 (that were copying from ms4 about two weeks ago), changed all perms on the directories they had reached so thumbs can be served/read from them.. oh. not me, someone else must have done it, I'm not here :-P
- 19:02 RobH: ms4 shutting down for memory troubleshooting
- 18:52 RobH: ms4 troubleshooting, disregard bounces
- 18:51 notpeter: restarting nagios
- 12:41 hcatlin: Restarting mobile cluster with April code update.
- 00:49 notpeter: restarting nagios. hopefully now with more sms!
April 21
- 23:32 logmsgbot: midom synchronized php-1.17/includes/ImagePage.php
- 20:54 logmsgbot: pdhanda synchronized live-1.5/404.php
- 19:02 domas: bumped up maxclients/serverlimit on singer to 350 (up from 150), set maxrequestsperchild to 30 to avoid heap blowup (down from 0), all governed via apache2/conf.d/maxrequests
- 18:25 Ryan_Lane: restarting apache on singer
- 18:19 Ryan_Lane: applying system patches to raskin
- 17:47 Ryan_Lane: restarting apache on singer
- 17:04 logmsgbot: pdhanda synchronized live-1.5/404.php
- 16:52 rainman-sr: copying stuff from /home/ariel/searchidx/ to searchidx2 as it didn't get copied
- 15:32 logmsgbot: awjrichards synchronized php-1.17/wmf-config/CommonSettings.php 'Fixing typo to properly check for wgUseEmailCapture'
- 15:29 logmsgbot: awjrichards ran sync-common-all
- 15:06 logmsgbot: awjrichards synchronized php-1.17/wmf-config/InitialiseSettings.php 'Added section for EmailCapture, enabling on testwiki'
- 15:05 logmsgbot: awjrichards synchronized php-1.17/wmf-config/CommonSettings.php 'Added section for EmailCapture'
- 14:50 logmsgbot: awjrichards synchronizing Wikimedia installation... Revision: 86464:
- 14:49 awjr: running scap on fenari to dark-deploy EmailCapture extension
- 03:44 Ryan_Lane: uninstalled python-json on wikitech, and installed python-simplejson
- 01:45 Ryan_Lane: testing statusnet library again
- 01:45 Ryan_Lane: testing statusnet library again
- 01:44 Ryan_Lane: testing statusnet library again
- 01:35 Ryan_Lane: testing a statusnet library for morebots that has a sane license
- 01:20 Ryan_Lane: installing ca-certificates on wikitech
- 01:11 Ryan_Lane: installing curl on wikitech
- 01:04 Ryan_Lane: installing python-json on wikitech
- 00:11 Ryan_Lane: added a sweet, special message to morebots
- 00:03 notpeter: moving some searchidx files into puppet's sphere of influence.
April 20
- 23:10 rainman-sr: search indexing is up and running again, might take a couple of days for things to catch up though
- 23:07 rainman-sr: doing a slow restart of the whole search cluster (2 minute intervals between host restarts)
- 22:52 rainman-sr: starting search indexer on searchidx2
- 16:29 mark: Squids finished rebuilding, removed iptables rules
- 16:21 mark: Removed iptables rules on sq77/78
- 16:09 mark: Downgraded sq65
- 16:06 mark: removed iptables rule on sq66
- 16:05 Ryan_Lane: downgrading squid on knsq27, knsq28, knsq29, knsq30, knsq26, knsq25, knsq24
- 16:02 Ryan_Lane: downgrading squid on amssq31
- 16:01 Ryan_Lane: downgrading squid on amssq38
- 16:01 Ryan_Lane: downgrading squid on amssq39
- 16:01 Ryan_Lane: downgrading squid on amssq40
- 16:01 Ryan_Lane: downgrading squid on amssq41
- 16:00 Ryan_Lane: downgrading squid on amssq37
- 15:59 Ryan_Lane: downgrading squid on amssq36
- 15:59 Ryan_Lane: downgrading squid on amssq32
- 15:59 Ryan_Lane: downgrading squid on amssq33
- 15:58 Ryan_Lane: downgrading squid on amssq34
- 15:58 Ryan_Lane: downgrading squid on amssq35
- 15:57 mark: Downgrading Squid on sq71-78
- 15:57 Ryan_Lane: downgrading squid on amssq42
- 15:56 Ryan_Lane: downgrading squid on amssq43
- 15:56 Ryan_Lane: downgrading squid on amssq45
- 15:54 Ryan_Lane: downgrading squid on amssq45
- 15:53 Ryan_Lane: downgrading squid on knsq23
- 15:33 RobH: lowered maxclients more on singer, it's going to slow down secure, but hopefully keep it online longer.
- 15:04 Ryan_Lane: downgraded sq71 to squid-2.7.7
- 14:51 mark: Pulled squid 2.7.9 packages from the lucid-wikimedia reprepro APT repository, reinstated 2.7.7
- 14:34 mark: Downgrading squid on sq66
- 14:15 mark: Stopping squid on sq72
- 14:14 RobH: reduced singer apache maxclients from 400 to 200, hopefully will reduce singer crashes
- 14:09 RobH: singer died, rebooting.
- 14:07 RobH: secure, blogs on singer, bouncing due to singer cpu max
- 13:53 RobH: singer threw invalid cert warning, cpu spike was occurring, restarting apache and load is normalizing
- 11:34 logmsgbot: catrope ran sync-common-all
- 11:33 RoanKattouw: Running sync-common-all to deploy r86464
- 11:11 logmsgbot: catrope synchronized php-1.17/includes/Exif.php 'Fix for bug 28615 (fatal on image description page)'
- 00:41 awjr: deployment of r86441 to payments servers complete
- 00:37 awjr: deploying changes to send full donor address information to PayflowPro (r86441) to all payments servers
- 00:31 awjr: re-deploying changes to send full donor address information to PayflowPro on payments2.wikimedia.org (suspected cause of previous issue: local browser caching)
- 00:14 logmsgbot: midom synchronized php-1.17/wmf-config/db.php
April 19
- 23:24 richcole1: rebooting db11 to check raid
- 22:57 awjr: reverting to r84367 of deployment branch on payments2.wikimedia.org due to bad address information appearing in civicrm
- 22:53 awjr: deploying changes to send full donor address information to PayflowPro payments2.wikimedia.org
- 22:20 Ryan_Lane: virt1-4 installed
- 20:54 Ryan_Lane: powercycling singer
- 20:53 Ryan_Lane: restarting apache on spence
- 20:52 mark: powercycled singer
- 20:35 Ryan_Lane: powercycling srv196
- 17:31 Ryan_Lane: finished ams and kn text squids
- 17:17 RobH: change live, etherpad still online and functional
- 17:16 RobH: pushing change to hooper so it doesnt kill etherpad when other apache vhosts exist
- 17:16 RobH: updated puppet entry for etherpad vhost
- 16:54 Ryan_Lane: upgrading squid on all squid systems, slowly, over the course of the day
- 15:09 mark: Killed all etherpad-user processes and started etherpad
- 15:09 RoanKattouw: Removing data for the section edit link experiment from the clicktracking table. Data is backed up in /home/catrope/sel . This will delete 8.4M of the 9.6M rows in the clicktracking table
- 15:05 RoanKattouw: Etherpad is broken, serving 500s. Have e-mailed Peter
- 14:06 logmsgbot: demon synchronized php-1.17/includes/diff/DifferenceEngine.php 'r86395'
- 12:50 mark: NICEd bayes processing cron job on mchenry
- 10:14 RoanKattouw: <logmsgbot> !log catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Bump $wgArticleFeedbackTrackingVersion to 1'
- 10:14 RoanKattouw: <logmsgbot> !log catrope synchronized php-1.17/extensions/ArticleFeedback/ArticleFeedback.hooks.php 'r86384'
- 10:14 RoanKattouw: <logmsgbot> !log catrope synchronized php-1.17/extensions/LocalisationUpdate/LocalisationUpdate.class.php 'r86299'
- 10:14 RoanKattouw: Restarted morebots
- 01:05 Tim: added an entry to /etc/sudoers on fenari and hume allowing wikidev users to run commands as apache
April 18
- 21:28 mark: Setup PIM-DM on cr1-eqiad:ae1.666 (pseudowire port to pmtpa) and csw5-pmtpa:ve10 (counterpart) to get multicast routing from tampa
- 21:24 mark: Setup PIM-SM on cr1-eqiad and cr2-eqiad, with static RPs and a local RP on cr1-eqiad. PIM is running on all production subnets.
- 16:16 pdhanda: running maintenance/CleanupImages on all.dblist
- 15:58 mark: Rolling restart of pmtpa text frontend squids; one every two minutes
- 14:27 mark: Powercycled db11
- 14:13 logmsgbot: mark synchronized php-1.17/wmf-config/db.php 'depool db11'
- 14:12 Reedy: DB11 related issues. Ops are looking at the issue
- 10:30 logmsgbot: root synchronizing Wikimedia installation... Revision: 86086:
- 10:05 logmsgbot: root synchronizing Wikimedia installation... Revision: 86086:
- 08:43 Tim: running /home/wikipedia/bin/l10nupdate to get r86294
April 17
- 20:27 mark: Setup dirty "pseudowire" between pmtpa and eqiad; a GRETAP tunnel inside a v4-in-v6 tunnel bridged to eth1/eth0.666 on 2 linux boxes (sq71 on pmtpa side). Now running OSPF on it to get full v4 interconnectivity
April 16
- 08:33 apergos: disabled snaps on searchidx in rainman's crontab, moving some en.wiki.* files in /a/search/indexes/snapshot off to fenari:/home/ariel/searchidx/....
April 15
- 18:41 mark: Rsynced private thumbs dir from ms4 to ms5
- 15:45 mark: Upgraded re0.cr1-eqiad to JUNOS 10.4R3.4
- 15:27 mark: Switching over routing engines from re0 to re1 on cr1-eqiad
- 15:23 mark: Upgraded re1.cr1-eqiad to JUNOS 10.4R3.4
- 14:44 mark: Upgraded re0.cr2-eqiad to JUNOS 10.4R3.4
- 14:34 mark: Upgraded re1.cr2-eqiad to JUNOS 10.4R3.4
- 09:57 apergos: moving some old (from 2009) dirs from /a/search/indexes/import on searchidx1 to /home/ariel/searchidx1/... on fenari, if I'm told we can toss em outright I'll do that later.
- 06:01 apergos: cleared out space on searchidx1 again. we only have 4.5gb free and it's still at 100%. out of tricks...
- 00:54 notpeter: py making some changes to nagios. subscribing the service to lots of files. also, lots more files being pushed out by puppet.
April 14
- 23:41 Ryan_Lane: adding reverse mappings for virt1-4
- 23:28 Ryan_Lane: adding dns entries for virt1-4
- 22:28 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Re-enable ArticleFeedback on enwiki, upgrade is complete'
- 22:09 RoanKattouw: Running populateAFRevisions.php on enwiki
- 22:06 RoanKattouw: Creating article_feedback_revisions table on enwiki
- 22:04 RoanKattouw: Clearing message blobs
- 22:03 logmsgbot: catrope ran sync-common-all
- 22:03 RoanKattouw: Deploying ArticleFeedback changes to the cluster with sync-common-all
- 22:02 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Disable ArticleFeedback on enwiki for upgrade'
- 21:42 mark: Added vlan 103 to ports e 15/5 to 15/8 (virt1-4 eth1) (tagged) on csw1-sdtpa
- 21:42 mark: Put ports e 15/1 to 15/4 (virt1-4 eth0) in vlan 105 (untagged) on csw1-sdtpa
- 21:40 mark: Created VLAN 105 ("virt-hosts") on csw1-sdtpa, assigned virtual interface ve 9, assigned ip 10.4.16.0/24
- 21:39 mark: Added virtual interface ve 7 ("virt") on csw1-sdtpa, assigned ip 10.4.0.1/24
- 21:39 mark: Removed virtual interface ve 7 ("virt") on csw5-pmtpa
- 21:24 Ryan_Lane: cleaning up searchidx1
- 21:11 RoanKattouw: Running extensions/ArticleFeedback/populateAFRevisions.php on testwiki
- 21:11 RoanKattouw: Creating article_feedback_revisions table on testwiki
- 21:08 RoanKattouw: Updating ArticleFeedback code on test
- 16:18 RobH: singer back online, blogs, survey, and secure should all be back in service
- 16:15 RobH: blogs offline due to singer outage, as well as survey, working on resolution
- 16:14 RobH: powercycling singer, unresponsive to ssh and to serial console
- 16:13 RobH: secure issue is being investigated
- 16:13 pdhanda: running cleanupTitles.php for all wikis in all.dblist
- 16:13 RobH: investigating singer cpu spike
- 15:36 mark: Implemented LDAP lookups on mail relay mchenry
- 14:13 Reedy: Delayed message. Ran repopulateCodePaths on mediawikiwiki
- 14:13 Reedy: Delayed message. Ran populateFollowupRevisions on mediawikiwiki
- 14:04 logmsgbot: catrope synchronized php-1.17/extensions/CodeReview/modules/ext.codereview.js 'Disable comment field autofocus'
- 07:54 logmsgbot: catrope synchronized php-1.17/extensions/CodeReview/CodeReview.php 'Fix CodeReview fatals'
- 07:50 logmsgbot: catrope ran sync-common-all
- 07:47 RoanKattouw: Running sync-common-all and clearMessageBlobs.php to deploy UploadWizard changes
- 07:35 RoanKattouw: Updating MediaWiki on fenari to r86034 for UploadWizard deployment. Only pushing changes to testwiki for now
- 07:14 logmsgbot: tstarling synchronized php-1.17/includes/WebRequest.php 'r86031'
- 07:14 logmsgbot: tstarling synchronized php-1.17/img_auth.php 'r86031'
April 13
- 23:18 notpeter: restarting nagios
- 21:15 notpeter: restarting nagios
- 19:54 awjr: disabling drupal/civicrm cron on contacts.wikimedia.org (singer) as part of an interim solution to civimail problems in CiviCRM 3.4beta1
- 18:31 ^demon: ran code review updates for mediawikiwiki
- 12:54 mark: Restarted pdns on linne
- 12:54 mark: Defined /64 neighbor blocks in 1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa in DNS
- 12:31 mark: Defined /31 neighbor blocks in 154.80.208.in-addr.arpa in DNS
- 09:15 logmsgbot: ariel synchronized php-1.17/wmf-config/CommonSettings.php 'increase account creation throttle value for el wiki for editing workshop (soon we have a two week break from these :-P)'
- 01:55 Ryan_Lane: upgrading squid to 2.7.9-5 on sq58 and sq73
- 01:35 Ryan_Lane: upgrading squid to 2.7.9-4 on sq58 and sq73
- 00:42 Ryan_Lane|food: also removing sheepdog from nova-*.tesla
- 00:39 Ryan_Lane|food: removing natty versions of libvirt and friends and re-installing the lucid versions
April 12
- 20:18 RobH: killed apache vhost for outreachcivi since it appears defunct in favor of contactscivi on singer
- 19:08 Ryan_Lane: upgrading nova-* on nova-*.tesla
- 19:05 Ryan_Lane: updating squid on sq73 to test new squid and patches
- 05:51 Ryan_Lane: purging nova-ajax-console-proxy from nova-compute1/2.tesla and adding to nova-controller.tesla
- 04:57 Ryan_Lane: purging nova-ajax-console-proxy from nova-controller.tesla and adding to nova-compute1/2.tesla
- 02:11 logmsgbot: tstarling synchronized php-1.17/includes/Sanitizer.php 'r85859'
- 01:30 logmsgbot: root synchronizing Wikimedia installation... Revision: 85853:
- 01:28 Tim: svn up/scap to deploy r85848
- 00:31 Ryan_Lane: installing glance on nova-controller.tesla
April 11
- 22:13 Ryan_Lane: installing python-nova.adminclient on nova-controller.tesla
- 21:19 notpeter: added lag check to storage3. restarting nagios
- 20:33 Ryan_Lane: pushing squid-2.7.9-1wm1 to wikimedia-lucid repo
- 05:15 Tim: did squid configuration update for bug 28235
- 04:40 logmsgbot: aaron synchronized php-1.17/wmf-config/flaggedrevs.php 'Made categories reviewable on kawiki'
- 03:23 logmsgbot: aaron synchronized php-1.17/wmf-config/flaggedrevs.php
- 03:21 logmsgbot: aaron synchronized php-1.17/wmf-config/InitialiseSettings.php 'flaggedrevs for kawiki'
- 03:13 AaronSchulz: Enabled FlaggedRevs for Georgian Wikipedia
- 03:11 logmsgbot: aaron ran sync-common-all
April 10
- 05:22 logmsgbot: ariel synchronized php-1.17/wmf-config/CommonSettings.php 'increase account creation throttle value for el wiki for editing workshop (can't wait for local sysadmins to be able to do this :-P)'
April 9
- 17:07 logmsgbot: ariel ran sync-common-all 'sync for elwikinews round 2, let's get the import right this time folks cause this is too nerve-wracking'
- 16:40 logmsgbot: ariel synchronized all.dblist 'remove elwikinews, need to drop and recreate after borked import'
- 16:29 logmsgbot: robh synchronized php-1.17/wmf-config/InitialiseSettings.php 'logo updates for a couple wikis for phillipe'
- 10:10 mark: Changed dataset1's clock source to HPET, synced it with ntpdate and restarted ntpd
- 07:58 apergos: power cycling searchidx1, load was at 60, unresponsive to commands after login from mgmt console
- 02:46 RobH: troubleshooting a couple new wikis, had to sync-apaches and restart them gracefully
- 01:01 notpeter: changed my.cnf on storage3 to replicate-do-db= drupal,mysql,civicrm
- 00:50 Ryan_Lane: installing nova-ajax-console-proxy on nova-controller.tesla
April 8
- 23:19 logmsgbot: laner ran sync-common-all
- 23:18 logmsgbot: laner synchronized php-1.17/wmf-config/InitialiseSettings.php
- 22:50 Ryan_Lane: that graceful was me
- 22:28 logmsgbot: laner synchronized php-1.17/wmf-config/InitialiseSettings.php
- 22:16 logmsgbot: laner synchronized php-1.17/wmf-config/InitialiseSettings.php
- 22:14 logmsgbot: laner synchronized php-1.17/wmf-config/InitialiseSettings.php
- 22:07 logmsgbot: laner ran sync-common-all
- 22:06 Ryan_Lane: gave myself deploy access in svn
- 17:10 notpeter: pushing out new dns zones. forgot to change ptr record for yvon...
- 15:09 RobH: updating dns with testblog info
- 13:36 mark: Added swap on /dev/sdc1 and /dev/sdd1 on ms5
- 13:34 mark: Stopped RAID10 array /dev/md2 again, sync takes too long
- 13:30 mark: Created RAID10 array for swap across first partition of 46 drives on ms5
- 13:21 mark: Stopped all rsyncs to investigate ms5's sudden kswapd system cpu load
- 07:57 apergos: assigned snapshot1 internal ip in dns
- 06:11 apergos: moving snapshot1 to internal vlan etc
- 04:15 notpeter: pushing new dns w/ eixamanius as cname for hooper and yvon as new name for box that was previously eixamanius
- 04:15 notpeter: stopping etherpad
April 7
- 21:43 notpeter: removed a silly check for hooper that I made and restarted nagios
- 19:06 Ryan_Lane: switching openstack deb repo back to trunk, and upgrading packages on nova-controller, since we are likely to target cactus now
- 15:40 mark: Restarted rsyncs
- 15:26 mark: Created a test btrfs snapshot of /export on ms6
- 15:12 mark: Temporarily stopped the rsyncs on ms5 to test zfs send performance
- 13:56 mark: Reenabled ms6 as backend on esams.upload squids
- 13:11 apergos: replaced ms4 in fstab on fenari with ms5 so we have thumbs mounted there
- 12:08 mark: Restarted rsyncs on ms5
- 12:07 apergos: nginx conf file change to "thumb" added to puppet
- 12:00 mark: Removed the test snapshot on ms5
- 11:47 apergos: edited in place /etc/nginx/sites-available/thumbs and /export/thumbs/scripts/thumb-handler.php to make thumbs generated on the fly return 200. they were returning 404
- 10:25 logmsgbot: catrope synchronized php-1.17/thumb.php 'Attempted fix for wrong temp/thumb paths'
- 10:20 apergos: (after reports from en vp that the search index has not been updated for over 4 days)
- 10:18 apergos: restarting search indexer on searchidx1
- 09:35 logmsgbot: catrope synchronized php-1.17/includes/specials/SpecialUploadStash.php 'debugging'
- 09:08 logmsgbot: catrope synchronized php-1.17/includes/specials/SpecialUploadStash.php 'r85612'
- 08:23 logmsgbot: catrope synchronized php-1.17/extensions/UploadWizard/UploadWizard.php 'Fix fatal due to missing API module'
- 08:17 logmsgbot: catrope ran sync-common-all
- 08:16 RoanKattouw: I meant srv196, not srv193
- 08:15 RoanKattouw: Deploying UploadWizard for real this time, forgot to svn up first. sync-common-all then clearMessageBlobs.php
- 08:14 RoanKattouw: Commenting out srv193 in mediawiki-installation node list because its timeouts take forever
- 08:10 RoanKattouw: srv196 is not responding to SSH or syncs from fenari (they time out after a looong time) but Nagios says SSH is fine. Should be fixed or temporarily depooled
- 08:08 RoanKattouw: Clearing message blobs
- 08:07 logmsgbot: catrope ran sync-common-all
- 08:04 RoanKattouw: Scap broke with sudo stuff AGAIN, running sync-common-all
- 08:01 RoanKattouw: Running scap to deploy UploadWizard changes
- 07:11 apergos: turned em off again, started seeing timeouts. bah
- 06:39 apergos: and two more...
- 06:31 apergos: restarted two of the 8 rsyncs on ms5, keeping an eye on them
- 01:31 domas: added nobarrier to xfs mount options on db32 and db37
April 6
- 20:38 RobH: updated puppet with a svn::client class (rt#721)
- 20:18 RobH: pulled wm09schols, wm10schols, and wm10reg out of enabled sites on singer
- 20:05 apergos: suspended all rsyncs on ms5, we were seeing nfs timeouts on the renderers all of a sudden
- 18:50 apergos: killed morebots and let the restart script start it up again
April 5
- 23:00 Ryan_Lane: restarting search indexer on searchidx to free space held by deleted logs
- 22:58 Ryan_Lane: clearing up some space on searchidx1
- 22:20 notpeter: crammed an etherpad db into db9's mysql hole.
- 17:57 Ryan_Lane: restarting llsearchd on all search boxes
- 17:45 RoanKattouw: Restarted morebots, running on wikitech as catrope
- 17:45 Ryan_Lane: changing the udp log location for search to emery
- 12:16 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Undo $wgForceUIMsgAsContentMsg change on incubator from last night per DannyB'
April 4
- 23:28 Ryan_Lane: uploading ircecho package to lucid-wikimedia repo, for nagios irc bot
- 22:22 Ryan_Lane: upgrading wikimedia-task-appserver package on srv281
- 22:22 Ryan_Lane: uploading new version of wikimedia-task-appserver to lucid-wikimedia repo; merges back in 1.17 changes that were missing
- 22:04 RobH: updated noc robots entry in its apache config on fenari
- 21:58 Ryan_Lane: srv281 is acting as a temporary scaling server for testing of lucid imagescalers, and to help with thumbs load.
- 21:27 Ryan_Lane: depooling srv281 from appservers
- 21:21 Ryan_Lane: syncing apaches to get configuration pushed to srv281
- 21:16 Ryan_Lane: rebooting srv281
- 21:01 Ryan_Lane: adding srv281 to rendering cluster in pybal via fenari
- 20:32 Ryan_Lane: uploading a new version of wikimedia-task-appserver fixing a problem with sync-common
- 20:13 logmsgbot: catrope synchronized php-1.17/wmf-config/InitialiseSettings.php 'Add mainpage to $wgForceUIMsgAsContentMsg for incubatorwiki'
- 19:55 Ryan_Lane: srv281 successfully ran imagescaler puppet class. ready for testing.
- 19:47 Ryan_Lane: adding php5-fss to lucid-wikimedia repo
- 19:11 Ryan_Lane: adding wikimedia-task-appserver to lucid-wikimedia repo
- 18:58 RobH: bugzilla updates complete
- 18:50 RobH: updating bugzilla per rt#718 bz#28409 bz#28402
- 18:42 notpeter: added cname etherpad for hooper.wikimedia.org
- 18:00 Ryan_Lane: added the wikimedia-fonts package to lucid-wikimedia repo
- 17:29 notpeter: adding self to nagios group. rebooterizing nagios.
- 05:58 apergos: cleaned up perms on commons/thumb/a/af, left over from interrupted rsync test last night
- 05:50 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php 'enabling pool counter on all wikis'
- 04:12 logmsgbot: tstarling synchronized php-1.17/wmf-config/InitialiseSettings.php 'enabling PoolCounter on testwiki and test2wiki'
- 01:22 Tim: apache CPU overload lasted ~10 mins, v. high backend request rate, don't know cause, seems to have stopped now
April 3
- 18:42 apergos: 8 rsyncs of ms4 thumbs restarted with better perms so scalers can write... in screen as root on ms5. If we start seeing nfs timeouts in the scaler logs please shoot a couple
- 17:14 mark: Deployed max-connections on all cache peers for esams.upload squids to their florida parents (current limit 200)
- 17:00 mark: Removed the carp weights on the esams backends again, as the weighting was completely screwed up
- 16:59 mark: Started knsq13 backend
- 14:27 logmsgbot: catrope ran sync-common-all
- 14:26 RoanKattouw: Running sync-common-all to deploy r85256
- 13:03 apergos: shot rsyncs on ms5, setting 777 dir perms on all thumbnail dirs (eg e/ef/blablah.jpg) so scalers can write into them
- 12:53 apergos: did same for rest of projects and subdirs (777 on hash dirs)
- 12:46 apergos: chmod 777 on commons/thumb/*/* on ms5 so that scalers can create directories in there (mismatch of uid apache vs www-data)
- 11:12 mark: Raised per-squid connection limit to ms5 from 200 to 400 connections
- 11:05 mark: Raised per-squid connection limit to ms5 from 100 to 200 connections
- 10:55 mark: Fixed squid loop, the pmtpa.upload squids were using the esams squids as "CARP parents for distant content"
- 10:29 mark: Fixed puppet on sq42/43
- 09:44 mark: Lowered FCGI thumb handlers from 90 to 60 again, to reduce concurrency
- 08:08 mark: Started 4 more rsyncs (8 total now)
- 07:49 mark: Removed mlocate from ms5, puppetising
- 07:42 mark: Started 4 rsyncs from ms4 to ms5 (--ignore-existing)
- 07:32 mark: increased thumb handler count from 60 to 90
- 07:11 mark: Doubled the amount of fcgi thumb handlers
- 07:08 mark: Turned off logging of 404s to nginx error.log
- 06:50 mark: Restarted Apache on the image scalers
- 06:49 mark: Reconfigured ms5 to use the 404 thumb handler
- 06:48 Ryan_Lane: disabling nfs on ms4
- 06:33 mark: Running puppet on all apaches to fix fstab and mount ms5.pmtpa.wmnet:/export/thumbs
- 06:32 mark: Unmounting /mnt/thumbs on all mediawiki-installation servers
- 06:30 mark: Remounted NFS /mnt/thumbs on the scalers to ms5
- 06:28 Ryan_Lane: bring nfs back up
- 06:28 Ryan_Lane: brought ms4 back up. stopping the web server service and nfs
- 06:20 mark: Set up NFS kernel server on ms5
- 06:18 Ryan_Lane: powercycling ms4
- 05:29 Ryan_Lane: rebooting ms4 with -d to get a coredump
- 05:14 apergos: re-enabling webserver on ms4 for testing
- 04:45 apergos: stopping web service on ms4 for the moment
- 04:29 apergos: shot webserver again
- 04:26 apergos: turned off hourly snaps on ms4, turned back on webserver and nfs
- 04:09 apergos: rebooted ms4, shut down webserver and nfsd temporarily for testing
- 02:58 apergos: still looking at kernel memory issues, still rebooting, ryan should be here in a few minutes to help out
- 02:03 apergos: a solaris advisor... also have zfs arc cache max to 2g which is ridiculously low but wtf right?
- 02:02 apergos: set tcp_time_wait_interval to 10000 at suggestion of
- 01:37 apergos: lowered zfs arc max to 2g (someone should reset this later)... will take effect on next reboot
- 00:29 apergos: rebooting with the new zfs arc cache max value, which will reduce the min value as well... dunno if this will give us enough breathing room or not
- 00:24 apergos: set zfs arc cache to ridiculously low value of 4gb, since when it's healthy it's using much less than that (1gb), this will take effect on reboot
- 00:22 Reedy: Still experiencing MS4 issues, thumb service is likely to be problematic for most users
April 2
- 23:47 apergos: rebooting ms4 from serial console, out to lunch and took the renderers down too
- 18:42 logmsgbot: catrope synchronized php-1.17/wmf-config/CommonSettings.php 'Per NeilK, change Category:Uploaded_by_UploadWizard to Category:Uploaded_with_UploadWizard'
- 17:59 mark: Upgrading varnish to 2.1.5
- 17:14 logmsgbot: demon synchronized php-1.17/includes/filerepo/LocalFile.php 'r85200'
- 14:19 mark: Implemented CARP weights for distant CARP parents on squid configurator (used to be all equal before)
- 11:36 mark: Created btrfs filesystem on ms6, striped (raid10 style) over 46 devices - very experimental
- 09:50 mark: Reinstalling ms6 with Ubuntu 10.04
- 09:50 mark: Fixed torrus again
- 06:02 mark: !wikipedia The image thumbnail servers appear stable now
- 04:59 mark: Increased nginx worker processes from 1 to 4, set file limit to 30k
- 04:40 mark: !wikipedia Image Thumbnail server outage, it's being worked on
- 04:34 mark: Power cycling ms4 again
- 04:06 mark: Power cycled ms4 again
- 04:02 mark: Removed ms4 from pmtpa.upload config, sending all thumbs to ms5
- 03:47 mark: Restarted rsyncs ms4->ms5
- 03:25 Ryan_Lane: powercycling ms4 again
- 02:59 Ryan_Lane: rebooting ms4
- 02:46 Ryan_Lane: seems ms4 is totally dead, powercycling it
- 01:09 Ryan_Lane: installing python-pyinotify on spence for an updated ircecho
April 1
- 21:35 Ryan_Lane: purging some binlogs on db9 to free up space
- 21:35 RobH: bugzilla now version 4
- 21:31 RobH: taking down bugzilla for a quick upgrade
- 18:48 Ryan_Lane: added ctwoo, brion, py, and reedy to the engineering alias
- 18:36 mark: Deployed ms5.pmtpa.wmnet as a special 'apache' for pmtpa squid uploads... now serving a small portion of commons thumbs
- 18:11 RobH: bugzilla back online, CRproxy was affected, and repaired
- 17:30 RobH: bugzilla.wikimedia.org going offline for database backup and upgrade
- 17:13 RobH: beginning upgrade process for bugzilla, its availability will be in question during this time
- 16:59 mark: Turned off Etag in the webserver7 configuration (/opt/webserver7/https-ms4/config/obj.conf) on ms4
- 16:50 notpeter: rm-ing old binlogs on db9 after confirming that there is no slave lag on db10
- 15:53 mark: Puppetised nginx and htcp purger setup on ms5
- 11:36 apergos: restarted lighty on dataset2 (but why did it die?)
- 00:06 logmsgbot: tstarling synchronized php-1.17/includes/specials/SpecialImport.php 'r85099'
Archives
- Server admin log/Archive 1 (2004 Jun - 2004 Sep)
- Server admin log/Archive 2 (2004 Oct - 2004 Nov)
- Server admin log/Archive 3 (2004 Dec - 2005 Mar)
- Server admin log/Archive 4 (2005 Apr - 2005 Jul)
- Server admin log/Archive 5 (2005 Aug - 2005 Oct)
- Server admin log/Archive 6 (2005 Nov - 2006 Feb)
- Server admin log/Archive 7 (2006 Mar - 2006 Jun)
- Server admin log/Archive 8 (2006 Jul - 2006 Sep)
- Server admin log/Archive 9 (2006 Oct - 2007 Jan)
- Server admin log/Archive 10 (2007 Feb - 2007 Jun)
- Server admin log/Archive 11 (2007 Jul - 2007 Dec)
- Server admin log/Archive 12 (2008 Jan - 2008 Jul)
- Server admin log/2008-08
- Server admin log/2008-09
- Server admin log/Archive 13 (2008 Oct - 2009 Jun)
- Server admin log/Archive 14 (2009 Jun - 2009 Dec)
- Server admin log/Archive 15 (2010 Jan - 2010 Jun)
- Server admin log/Archive 16 (2010 Jul - 2010 Oct)
- Server admin log/Archive 17 (2010 Nov - 2010 Dec)
- Server admin log/Archive 18 (2011 Jan - 2011 Mar)