Network design

From Wikitech
Revision as of 05:34, 20 February 2007 by Mark (Talk | contribs)


FIXME: Check for accuracy

This page gives an overview of the current network design for the Wikimedia servers and provides a place to develop a new and improved network scheme.


Overall system design

The following is the general system design plan that the network layer must accommodate efficiently.

  • Databases in a central pool with each serving a subset of the wikis, so each has high cache efficiency and the total number needed to handle any query load is minimised. Database servers cost US$5,000-$8,000 each, depending on exact equipment.
  • A central pair of old text database servers. These are part of the long-term storage growth plan for the databases: they move this high-volume, seldom-accessed data off costly and comparatively small disk systems.
  • Memcached caching spread on apaches across the whole cluster, producing one very large cache pool, accessible from any apache and stored on half or more of the apaches. Segmenting the pool would decrease the overall hit rate, increasing the number of apaches and database servers required for any given system load level.
  • Load balancing of squids and apaches, currently expected to use two or three systems between the internet and the squids and the same set between the squids and the apaches.
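The first bullet above (each database server serving a subset of the wikis) can be illustrated with a minimal sketch. The wiki names, server names and the assignment itself are hypothetical and are not taken from the actual configuration; the point is only that each wiki resolves to exactly one server in the central pool.

```python
# Hypothetical assignment of wikis to database servers in the central pool.
# Neither the server names nor the grouping come from the real cluster.
WIKI_TO_DB = {
    "enwiki": "db1",
    "dewiki": "db2",
    "frwiki": "db2",
    "nlwiki": "db3",
}


def db_for_wiki(wiki: str) -> str:
    """Return the database server that holds the given wiki."""
    try:
        return WIKI_TO_DB[wiki]
    except KeyError:
        raise LookupError(f"no database server assigned for {wiki!r}") from None


if __name__ == "__main__":
    for wiki in ("enwiki", "frwiki"):
        print(wiki, "->", db_for_wiki(wiki))
```

Because any apache may serve a request for any wiki, every apache needs a network path to every server named in such a table.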

A key network systems design requirement is efficient access from any apache to any apache running memcached (expected to be more than half of all apaches) and efficient access from any apache to any database server. Losing this capability would dramatically increase overall system cost.
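To illustrate why any apache must be able to reach any memcached host, here is a minimal sketch of key-to-server selection over a single shared pool. It uses plain hash-modulo selection as an assumption; the hashing scheme the real memcached client uses is not described on this page, and the host names are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_server(key: str, servers: Sequence[str]) -> str:
    """Map a cache key to one server in the pool using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Hypothetical pool: memcached instances running on some of the apaches.
POOL = ["apache01:11211", "apache02:11211", "apache03:11211", "apache04:11211"]

if __name__ == "__main__":
    # Every apache uses the same pool list, so a given key resolves to the
    # same memcached host no matter which apache performs the lookup.
    for key in ("enwiki:page:Main_Page", "dewiki:page:Hauptseite"):
        print(key, "->", pick_server(key, POOL))
```

Since a key can land on any host in the pool, a request handled by one apache may need to fetch from a memcached instance on any other, which is the access pattern the network has to support.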

Current situation

Each cluster name combines a two-character code for the colo provider with a three-character code for the city location. (Candidhosting is now Power Medium. :-)
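The naming scheme can be expressed as a small helper. This is only a sketch; the function name is made up for illustration.

```python
from typing import Tuple


def split_cluster_name(name: str) -> Tuple[str, str]:
    """Split a five-character cluster name into (provider code, city code)."""
    if len(name) != 5:
        raise ValueError(f"expected a five-character cluster name, got {name!r}")
    return name[:2], name[2:]


if __name__ == "__main__":
    for cluster in ("PMTPA", "KNAMS"):
        provider, city = split_cluster_name(cluster)
        print(f"{cluster}: provider={provider}, city={city}")
```

For the two clusters on this page, this yields PM and TPA for PMTPA, and KN and AMS for KNAMS.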

Florida cluster (PMTPA)

[Image: Floridaserversfront1.jpg]

Amsterdam cluster (KNAMS)

[Image: Knams-network.png]

The Kennisnet cluster's network follows a design similar to the old Florida cluster setup, with one Cisco 3560G-24 core switch connected via a Gigabit Ethernet port in routed mode to the Kennisnet uplink router. The L3 switch routes between "The Internet" and the Wikimedia VLANs:

  • VLAN 100: Public VLAN (145.97.39.128/27)
  • VLAN 101: Private VLAN (10.20.1.0/24)
  • VLAN 10: Installation VLAN

The uplink port sits in the subnet of 145.97.32.29/27 (network 145.97.32.0/27), with gateway 145.97.32.3.

There is a separate management network (145.97.34.224/29), with a separate uplink connected to a Kennisnet firewall for out-of-band access. It connects to the service processors of all Sun servers in a daisy chain.
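The addressing in this section can be sanity-checked with Python's standard ipaddress module. The sketch below uses only the prefixes and addresses quoted above; the dictionary labels are informal and chosen for illustration.

```python
import ipaddress

# Prefixes copied from the text above; the labels are informal.
NETWORKS = {
    "VLAN 100 (public)": ipaddress.ip_network("145.97.39.128/27"),
    "VLAN 101 (private)": ipaddress.ip_network("10.20.1.0/24"),
    "management": ipaddress.ip_network("145.97.34.224/29"),
}

# The uplink value as given in the text, treated as an address plus prefix.
UPLINK = ipaddress.ip_interface("145.97.32.29/27")
GATEWAY = ipaddress.ip_address("145.97.32.3")

if __name__ == "__main__":
    for label, net in NETWORKS.items():
        print(f"{label}: {net} has {net.num_addresses} addresses, "
              f"private={net.is_private}")
    print("uplink network:", UPLINK.network)
    print("gateway inside uplink network:", GATEWAY in UPLINK.network)
```

The /27 prefixes cover 32 addresses each and the /29 management network covers 8. The gateway 145.97.32.3 falls inside 145.97.32.0/27, the network that contains the uplink value.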
