Build a new server

From Wikitech
Revision as of 18:04, 21 October 2011 by LeslieCarr (Talk | contribs)

This page covers all the steps needed to take a new (or repurposed) piece of hardware and turn it into a happy, functional server.

Before you begin

  • find the machine's MAC address (racadm getsysinfo)
  • decide on private / public IP address, and if it will need to exist in some special range
  • decide how the disks should be arranged (raid, partitioning, etc.)

Initial hardware setup

  • Get the hardware racked and cabled (RobH)
  • Get an IP and name (RobH)
  • follow DNS How-To section to add the name/ip to DNS
  • set up $name.mgmt.$loc.wmnet as well to access the management interface
  • set up $assettag.mgmt.$loc.wmnet to the same IP as $name.mgmt.$loc.wmnet.
  • Set up DHCP with the MAC address / name info
    • if it's a Dell, get the MAC address from the mgmt console by running racadm getsysinfo; we use the first interface
    • log into brewster and edit the relevant file under /etc/dhcp3/ (for any new server, linux-host-entries.ttyS1-115200), then run /etc/init.d/dhcpd3-server restart
  • Get the switch set up to pass traffic to the host (Mark)
  • set up the hardware raid (if it has it)
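
For the DHCP step above, the entry to add could look like the output of the sketch below. It assumes the host entries file uses standard ISC dhcpd host declarations, which this page does not confirm; the function name dhcp_host_entry and the example host, MAC address, and domain are hypothetical.

```shell
# Print an ISC dhcpd-style host stanza for one machine.
# Usage: dhcp_host_entry NAME MAC FQDN
# Assumption: the real file on the install server follows this layout;
# compare with the existing entries before pasting anything in.
dhcp_host_entry() {
    name=$1
    mac=$2
    fqdn=$3
    printf 'host %s {\n' "$name"
    printf '    hardware ethernet %s;\n' "$mac"
    printf '    fixed-address %s;\n' "$fqdn"
    printf '}\n'
}

# Hypothetical example values, not a real host.
dhcp_host_entry srv999 00:11:22:33:44:55 srv999.example.wmnet
```

The MAC address is the one reported for the first interface by racadm getsysinfo. Copying the shape of a neighbouring entry in the real file is safer than relying on this layout.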

PXE boot and the initial OS

  • if it's a cluster host, set up netboot to partition the disk
    • log into the install server (presently brewster), edit /srv/autoinstall/netboot.cfg
    • it's a bash case statement. Make sure your hostname is matched by a regex in there.
  • if it's a misc host, you'll have to partition by hand (it will prompt you)
    • lvm over raid 1 is a decent config if you don't have anything more specific you need.
  • ssh to root@$servername.mgmt.$loc.wmnet, force a restart and pxe boot (if it's a dell, copy/paste this: dell pxe boot)
    • get the password from someone in ops if you don't have it
    • powercycle the host: racadm serveraction powercycle
      • this takes about 15s
    • connect to the console: console com2
      • there might be no output for a little bit immediately after the powercycle. wait at least 30s or so
    • during boot, force netboot: F12
      • when connecting from OSX via the Terminal, escape-shift-2 sends F12
    • you can leave this running to watch it complete
      • This is where you'll be prompted for partition info
    • when you're done, ctrl-\ will disconnect you from the console
      • exit will disconnect from the mgmt interface
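
The hostname matching in netboot.cfg mentioned above can be pictured with the sketch below. It is not the contents of the real file: the function name and the recipe file names are invented for illustration. Note that patterns in a shell case statement are globs rather than full regular expressions, so a pattern such as srv[0-9]* is the usual way to express "srv followed by digits".

```shell
# Hypothetical sketch: choose a partitioning recipe from the hostname.
# Recipe file names are made up; check the real netboot.cfg for actual ones.
partman_recipe_for() {
    case "$1" in
        srv[0-9]*)  echo "cluster-raid1.cfg" ;;
        db[0-9]*)   echo "db-lvm.cfg" ;;
        *)          echo "" ;;   # no match: expect to partition by hand
    esac
}

partman_recipe_for srv123
```

If a new hostname falls through to the default branch, the installer has no recipe for it, which corresponds to the "partition by hand" path for misc hosts.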

Check your partitioning

If this is a cluster host (partitioned automatically by netboot), check the partitioning:

  • "df -h" will tell you what the mounted partitions are
  • "fdisk -l | grep Disk" will tell you the physical size of the disk
  • "sfdisk -l /dev/sdX" will tell you the size and type of the partitions in blocks

Make sure most of the disk is used, that is, the partitions add up to nearly the full physical size of the disk.
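
As a rough aid for that comparison, the sketch below computes what percentage of a disk is covered by partitions. The function names and the 90% default threshold are assumptions made for illustration; the two sizes are meant to be read off the fdisk and sfdisk output by hand and passed in the same unit.

```shell
# Usage: disk_coverage_pct DISK_SIZE PARTITIONED_SIZE
# Both sizes must use the same unit (bytes, blocks, ...). Integer result.
disk_coverage_pct() {
    echo $(( $2 * 100 / $1 ))
}

# Usage: check_disk_coverage DISK_SIZE PARTITIONED_SIZE [MIN_PCT]
# MIN_PCT defaults to 90, an arbitrary threshold chosen for this sketch.
check_disk_coverage() {
    pct=$(disk_coverage_pct "$1" "$2")
    if [ "$pct" -ge "${3:-90}" ]; then
        echo "ok: ${pct}% of disk partitioned"
    else
        echo "warning: only ${pct}% of disk partitioned"
    fi
}

# Hypothetical numbers: a 500 GB disk with 480 GB in partitions.
check_disk_coverage 500000000000 480000000000
```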

Get puppet running

Warning: if you are rebuilding a pre-existing server (rather than a brand new name), on sockpuppet, run puppetca --clean $fqdn to clear out the old certificate before beginning this process. If you already began, also run find /var/lib/puppet/ssl -type f -exec rm {} \; to clean out the client.

  • get a shell on both $server and the puppet master (sockpuppet)
    • Only one key has access to new installs.
    • from sockpuppet, ssh -o StrictHostKeyChecking=no -i ~/.ssh/new_install root@$servername.wikimedia.org to log into $server
  • on $server, run puppetd --test. It will fail, because the puppet master has not yet signed this host's certificate.
  • on sockpuppet, run puppetca -s $server-fqdn
  • on $server, run puppetd --test again. It should now succeed.
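
The order of operations above, including the extra cleanup step for rebuilt hosts, can be summarised with the dry-run sketch below. It only prints which command to run where and executes nothing; the function name puppet_signing_plan and the example hostname are hypothetical, while the commands themselves are the ones listed on this page.

```shell
# Usage: puppet_signing_plan FQDN [rebuild]
# Prints "where: command" lines in the order they should be run.
puppet_signing_plan() {
    fqdn=$1
    if [ "${2:-}" = "rebuild" ]; then
        # Only for a pre-existing name: drop the old certificate first.
        echo "sockpuppet: puppetca --clean $fqdn"
    fi
    echo "$fqdn: puppetd --test"        # first run, expected to fail
    echo "sockpuppet: puppetca -s $fqdn"
    echo "$fqdn: puppetd --test"        # second run, expected to succeed
}

# Hypothetical host name.
puppet_signing_plan srv999.example.org rebuild
```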

Set up puppet

  • add $server to site.pp, either by hostname or by making sure it matches a regex if it's part of a class of hosts (e.g. srv\d\d*)
  • do whatever puppet goodies you want to get the server to do what you want it to.
