Search/UDP Logger

From Wikitech
Revision as of 01:15, 1 August 2009 by Tomasz Finc (Talk | contribs)

UDP Logger

A UDP logger running on searchidx1 collects messages from each of the search servers. All search queries are logged in the following format:

db timestamp search

The UDP logger was added in r51097 and r51324.
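
As an illustration of the line format above, here is a minimal parsing sketch. It assumes the three fields are separated by single spaces, that db and timestamp contain no spaces, and that the search text may contain spaces; none of this is confirmed by the page. QueryRecord and parse_line are hypothetical names, and the timestamp is kept as an opaque string because its format is not documented here.

```python
from typing import NamedTuple, Optional


class QueryRecord(NamedTuple):
    """One logged search query (hypothetical structure)."""
    db: str
    timestamp: str  # kept as text; the real timestamp format is not documented
    search: str


def parse_line(line: str) -> Optional[QueryRecord]:
    """Split a 'db timestamp search' line into its three fields.

    Assumption: fields are separated by single spaces and only the search
    text may itself contain spaces. Returns None for malformed lines.
    """
    parts = line.rstrip("\n").split(" ", 2)
    if len(parts) != 3:
        return None
    return QueryRecord(*parts)
```

Splitting at most twice keeps multi-word searches intact in the last field.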

Host

searchidx1

Storage

Files are written to

/a/search/udplogger

Scripts

startlogger.sh / stoplogger.sh - start and stop the logger process
udplogger.py - main worker for collection
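
For orientation, a collector of this kind can be sketched as follows. This is not the actual udplogger.py; it is a minimal sketch that assumes each UDP datagram carries one log line as UTF-8 text. open_socket and collect are hypothetical names, and the real listening port is not given on this page.

```python
import socket
from typing import Optional, TextIO


def open_socket(host: str, port: int) -> socket.socket:
    """Create a UDP socket bound to host:port (port 0 picks a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect(sock: socket.socket, out: TextIO,
            max_messages: Optional[int] = None) -> int:
    """Append each received datagram to `out` as one line.

    Assumption: one datagram == one 'db timestamp search' line.
    Runs forever when max_messages is None; returns the number of
    lines written otherwise.
    """
    count = 0
    while max_messages is None or count < max_messages:
        data, _addr = sock.recvfrom(65535)
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        if not text:
            continue  # ignore empty datagrams
        out.write(text + "\n")
        out.flush()  # keep the file current in case the process is stopped
        count += 1
    return count
```

A long-running process would open the log file in append mode and call collect(sock, log_file) without max_messages; the limit exists mainly to make the loop testable.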

Logs

All queries are written to searchqueries, including queries from both public and private wikis.
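
Because public and private queries share one file, any published copy needs the private ones removed first. A minimal sketch of such a filter follows, assuming the db name is the first space-separated field of each line and that the caller supplies the set of private db names. public_lines and the example db names are hypothetical.

```python
from typing import Iterable, Iterator, Set


def public_lines(lines: Iterable[str], private_dbs: Set[str]) -> Iterator[str]:
    """Yield only the lines whose db field is not in private_dbs.

    Assumption: the db name is everything before the first space.
    """
    for line in lines:
        db = line.split(" ", 1)[0]
        if db not in private_dbs:
            yield line
```

Since it is a generator, it can stream a very large log file line by line without loading it into memory.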

Additions Needed

  • Public/Private filter - add a script to filter
  • Log rotation/archiving - suggestion from rainman_: ideally there would be another script that, at midnight, would 1) run stoplogger.sh, 2) rename searchqueries to searchqueries-CURRENTDATE, 3) upload/move this file to some public location, and 4) run startlogger.sh
    • Wrote a quick log splitter to break up the huge 40GB query file into per-day chunks. Discarded the private searches, archived the remaining chunks with 7za, and will put them up for download after getting approval. --Tomasz Finc 23:57, 31 July 2009 (UTC)
  • Publish data to download.wikimedia.org
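
The midnight rotation steps suggested above can be sketched as follows. This is a minimal sketch, not an existing script: rotate and rotated_name are hypothetical names, the stop and start steps are passed in as callables, and the date suffix format (ISO YYYY-MM-DD) is an assumption. Step 3 (upload/move to a public location) is left to the caller because no destination has been decided.

```python
import datetime
import os
from typing import Callable, Optional


def rotated_name(log_path: str, day: datetime.date) -> str:
    """Return e.g. 'searchqueries-2009-08-01' (suffix format is an assumption)."""
    return f"{log_path}-{day.isoformat()}"


def rotate(log_path: str, day: datetime.date,
           stop: Callable[[], None],
           start: Callable[[], None]) -> Optional[str]:
    """Stop the logger, rename the current log, then restart the logger.

    Returns the new file name, or None if there was no log to rotate.
    The logger is restarted even if the rename fails.
    """
    stop()
    try:
        if not os.path.exists(log_path):
            return None
        target = rotated_name(log_path, day)
        os.rename(log_path, target)
        return target
    finally:
        start()


# Possible wiring (script locations are hypothetical):
#   import subprocess
#   rotate("/a/search/udplogger/searchqueries", datetime.date.today(),
#          stop=lambda: subprocess.run(["./stoplogger.sh"], check=True),
#          start=lambda: subprocess.run(["./startlogger.sh"], check=True))
```

Restarting in a finally block keeps the downtime short and avoids leaving the logger stopped if the rename raises; queries sent while the logger is down are still lost, since UDP offers no redelivery.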