Sartoris

From Wikitech
Revision as of 20:29, 4 December 2012

Deployment location

  • MediaWiki:
    /srv/deployment/mediawiki/common
    /srv/deployment/mediawiki/slot0
    /srv/deployment/mediawiki/slot1

Deploying

  1. git deploy start
    • At this point, do git pull, checkout, cherry-pick, commit, or whatever other repo changes you need to make.
  2. git deploy sync
    • Alternatively, if you wish to abort a deploy: git deploy abort.
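The start/sync/abort sequence above could be wrapped in a small helper, roughly as follows. This is only a sketch: the `deploy` function and its `update_repo` callback are hypothetical names introduced for illustration, and the only commands it relies on are the three `git deploy` subcommands named above.

```python
import subprocess


def deploy(update_repo, run=subprocess.check_call):
    """Run start, apply the caller's repo changes, then sync.

    update_repo is a hypothetical callback that performs the git pull,
    checkout, cherry-pick or commit steps. If it raises, the deploy is
    aborted instead of synced and the error is re-raised.
    """
    run(["git", "deploy", "start"])
    try:
        update_repo()
    except Exception:
        run(["git", "deploy", "abort"])
        raise
    run(["git", "deploy", "sync"])
```

The `run` parameter exists so the sequence can be exercised without a real git-deploy install, for example by passing a function that only records the commands.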

Design

Basic design

Git repositories sit on the deployment system behind a web server. git-deploy (a Perl script) initiates a deploy and writes out a lock file so that only one deploy can run at a time. The user then updates the repo as necessary, or aborts the deploy. Once the repo is ready, the user runs git deploy sync, which triggers a sync script. The sync script updates the repo and submodules so that the application servers can fetch properly, then calls a salt runner for fetch, followed by a salt runner for checkout to the deploy tag.
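The single-deploy guarantee rests on the lock file. git-deploy itself is Perl and its exact locking code is not described here, so the following Python is only an illustration of the general technique (atomic, exclusive file creation). The function names and the lock path are assumptions.

```python
import os
import tempfile


def acquire_deploy_lock(lock_path, owner):
    """Create the lock file atomically; return False if it already exists."""
    try:
        # O_EXCL makes the call fail if another deploy created the file first.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(owner + "\n")
    return True


def release_deploy_lock(lock_path):
    """Remove the lock file, as a finished or aborted deploy would."""
    os.remove(lock_path)


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "deploy.lock")
    print(acquire_deploy_lock(lock, "first-deployer"))
    print(acquire_deploy_lock(lock, "second-deployer"))
    release_deploy_lock(lock)
```

Writing the owner into the file is a design choice in this sketch: it lets a second deployer see who holds the lock.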

Sync hook

  • Location (on the deployment host): /var/lib/git-deploy/sync/shared.py
  • Managed in the puppet deployment module
  1. Get the repo and submodules ready for fetching:
    1. Update the repo: git update-server-info
    2. Tag all submodules with the same tag as parent repo: git submodule foreach "git tag <tag>"
    3. Update all submodules: <for each extension in <repo>/.git/modules/extension> git update-server-info
  2. Make the application servers do a fetch (via a salt runner)
  3. Make the application servers do a checkout (via a salt runner)
    1. Switch core to the tag
    2. Update the submodules
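The steps above can be sketched as a function that builds the ordered list of commands the sync hook would run. This is not the contents of shared.py. The function name is hypothetical, and passing the repo name as a positional argument to salt-run is an assumption based on the deploy.fetch(repo) and deploy.checkout(repo) signatures described on this page.

```python
def sync_hook_plan(repo, tag, submodule_git_dirs):
    """Return the commands for a sync, in the order listed above.

    submodule_git_dirs is a list of paths such as
    <repo>/.git/modules/extension/<name>, supplied by the caller.
    """
    plan = [
        # 1.1 Make the parent repo fetchable over dumb HTTP.
        ["git", "update-server-info"],
        # 1.2 Tag every submodule with the parent repo's tag.
        ["git", "submodule", "foreach", "git tag " + tag],
    ]
    # 1.3 Make each submodule fetchable as well.
    for git_dir in submodule_git_dirs:
        plan.append(["git", "--git-dir=" + git_dir, "update-server-info"])
    # 2. and 3. Ask the salt runner to fetch, then check out, everywhere.
    plan.append(["salt-run", "deploy.fetch", repo])
    plan.append(["salt-run", "deploy.checkout", repo])
    return plan
```

Returning the plan rather than executing it keeps the ordering easy to inspect; a caller could pass each entry to subprocess.check_call.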

Salt deploy runner

  • Location (on the salt master): /srv/runners/deploy.py
  • Managed in the puppet deployment module

A salt runner is a script that runs on the salt master and can combine many salt calls into a single function.

The salt deploy runner is called via salt-run deploy.<function>. It has two functions:

deploy.fetch(repo)
calls fetch (via a salt module) on all application servers for the specified repo
deploy.checkout(repo)
calls checkout (via a salt module) on all application servers for the specified repo

The runners return a report on which minions returned successfully, failed, or didn't return.
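A report of this kind could be assembled along the following lines. This sketch does not use the salt API. It assumes the runner already has the set of targeted minions and a mapping of minion name to return value, and that a return value of 0 means success. Both the function name and that convention are assumptions.

```python
def summarize_returns(expected_minions, returns):
    """Group minions into succeeded, failed, and no_return.

    expected_minions: iterable of minion names that were targeted.
    returns: dict of minion name -> return code (assumed 0 on success).
    """
    report = {"succeeded": [], "failed": [], "no_return": []}
    for minion in sorted(expected_minions):
        if minion not in returns:
            report["no_return"].append(minion)
        elif returns[minion] == 0:
            report["succeeded"].append(minion)
        else:
            report["failed"].append(minion)
    return report
```

Keeping "failed" separate from "no_return" matters for follow-up: a failed minion ran the command and reported an error, while a missing one may simply be down.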

Salt deploy module

  • Location (on the salt master): /srv/salt/_modules/deploy.py
  • Managed in the deployment puppet module

A salt module lives on every salt minion and can be called from the salt master or from any peer which is allowed access.

The salt deploy module is called via salt <matching-criteria> deploy.<function>. It has the following functions:

deploy.sync_all
sync all repositories configured. This will also fully clone repositories, if they are missing.
deploy.fetch(repo)
do a git fetch based on the repo location (repo_locations) and url (repo_urls) defined via salt pillars.
deploy.checkout(repo,reset=False)
do a checkout of a repo based on the repo location (repo_locations), and url (repo_urls) from salt pillars, and .deploy file defined on the deployment host. Checkout will also modify the .gitmodules file based on sed configuration defined in salt pillars (repo_regex).
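The .gitmodules rewrite done by checkout can be illustrated with plain regular-expression substitution. The real module applies sed configuration taken from the repo_regex pillar, whose format is not shown here, so the (pattern, replacement) pairs and both hostnames below are made up for the example.

```python
import re


def rewrite_gitmodules(text, rules):
    """Apply each (pattern, replacement) rule to the .gitmodules text in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    original = (
        '[submodule "extensions/Foo"]\n'
        "\tpath = extensions/Foo\n"
        "\turl = https://gerrit.example.org/r/p/mediawiki/extensions/Foo.git\n"
    )
    # Hypothetical rule: point submodule URLs at the deployment host.
    rules = [
        (
            r"https://gerrit\.example\.org/r/p/mediawiki/extensions/(\w+)\.git",
            r"http://deploy.example.org/mediawiki/slot0/.git/modules/extensions/\1",
        )
    ]
    print(rewrite_gitmodules(original, rules))
```

The point of the rewrite is that application servers fetch submodules from the deployment host rather than from the upstream repository.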

Salt deployment pillars

  • Location (on the salt master): /srv/pillars
  • Managed in the puppet repo: role::deployment::salt_masters::production

Salt pillars are a set of configuration data available on every salt minion (via salt-call pillar.data). Pillars are managed on the master and are distributed to all minions on update.

Naming

   slot0 <- current
   slot1 <- next
   slot2 <- next + 1
   ...

On the deployment system, we should symlink version numbers to the slots, so that it's easy to tell which version we are on, for instance:

   /srv/deployment/mediawiki/common/php-1.20wmf1 -> /srv/deployment/mediawiki/slot0
   /srv/deployment/mediawiki/common/php-1.20wmf2 -> /srv/deployment/mediawiki/slot1
   ...
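With symlinks laid out like this, finding the slot behind a version name is a matter of resolving the link. A minimal sketch follows; the helper name is hypothetical and the directory layout is taken from the example above.

```python
import os


def slot_for_version(common_dir, version_link):
    """Return the name of the slot directory that a version symlink points to."""
    link_path = os.path.join(common_dir, version_link)
    # realpath follows the symlink; the final path component is the slot name.
    return os.path.basename(os.path.realpath(link_path))
```

For example, slot_for_version("/srv/deployment/mediawiki/common", "php-1.20wmf1") would return "slot0" if the links are set up as shown.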

This may be a good place for something like Perl's Storable, which allows you to serialize/deserialize complex data structures for writing to disk or transfer. Depending on what we use slots for, it's an efficient way to store more data, e.g. metadata about deployment versions.

Python's equivalent is pickle, and in PHP we're already using cdb for version info (hetdeploy). The slots scheme would need to work with our hetdeploy stuff, which I think assumes versions. Either we'd need to sync the symlinks to the versions, or do a lot of work on hetdeploy.

Timeline for slots

   slot0=wmf1, slot1=wmf2
   move all wmf1 wikis to wmf2 over time
   replace wmf1 with wmf3
   slot0=wmf3 slot1=wmf2
   start deploying wmf3
   move all wmf2 wikis to wmf3 over time
   replace wmf2 with wmf4
   slot0=wmf3 slot1=wmf4
   etc etc

Or:

   slot0=wmf1, slot1=wmf2
   move all wmf1 wikis to wmf2 over time
   once all are moved, switch slot0 to wmf2, move wikis to slot0
   rinse/repeat for next cycle
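The first timeline above, where each new version replaces whichever slot holds the oldest one, can be modelled in a few lines. This is only a simulation of the naming scheme: the function names are hypothetical and versions are assumed to look like "wmf<N>".

```python
def version_number(version):
    """Extract N from a version string of the form 'wmfN'."""
    return int(version[len("wmf"):])


def rotate_slots(slots, new_version):
    """Return a copy of slots with the oldest version replaced by new_version."""
    oldest_slot = min(slots, key=lambda name: version_number(slots[name]))
    updated = dict(slots)
    updated[oldest_slot] = new_version
    return updated
```

Starting from slot0=wmf1, slot1=wmf2, rotating in wmf3 and then wmf4 reproduces the sequence in the first timeline: wmf3 lands in slot0, then wmf4 lands in slot1.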

Examples

Example deploy of a core change

cd /srv/deployment/mediawiki/common/php-1.20wmf1
git deploy start
git pull
git deploy sync

In the above scenario, 1.20wmf1 is the current version of MediaWiki we are running. /srv/deployment/mediawiki/common/php-1.20wmf1 is a symlink to /srv/deployment/mediawiki/slot0. When the deploy syncs to the application servers, they do a git fetch and switch to the deploy tag at /srv/deployment/mediawiki/slot0. After switching to the tag, they also update all submodules to the versions recorded at that tag.

Example of changing versions of MediaWiki

cd /srv/deployment/mediawiki/common
ln -s /srv/deployment/mediawiki/slot1 php-1.20wmf2
cd php-1.20wmf2
git deploy start
git branch --track wmf/1.20wmf2 origin/wmf/1.20wmf2
git checkout wmf/1.20wmf2
git submodule update --init
git deploy sync

This example does the same thing as the previous example, but it updates /srv/deployment/mediawiki/slot1 rather than /srv/deployment/mediawiki/slot0.

Example of an emergency live hack

cd /srv/deployment/mediawiki/common/php-1.20wmf1
git deploy start
<make changes>
git commit
git deploy sync

Trying it

tin.eqiad.wmnet is the eqiad deployment host. There are a few eqiad mw hosts configured and ready to be tested. Simply go into /srv/deployment/mediawiki/<repo> and try it out.

TODOs

Required

  • Create a sudo policy for wikidev users to be able to call the salt runners
    • When salt 0.10.3 is released, also add an ACL so that we can run this without a sudo policy
    • There's a sudo policy temporarily in place on tin. This needs to be puppetized
  • Add a finish script to git-deploy to write out to IRC
  • Add puppet exec to initialize repo, for new hosts
    • There's a function in the salt deploy module for this, but it needs to be puppetized: salt-call deploy.sync_all
  • Add puppet exec to bring repos up to date before apache starts
    • The above sync_all needs to replace the scap call