Fundraising Analytics/Impression Stats

From Wikitech
Revision as of 17:54, 6 September 2012

Banner impressions and landing page stats are collected from squid logs via udp2log, running on locke.
From there, log files are periodically moved to storage3 by file_mover@locke's crontab. Logs are moved uncompressed because excess CPU utilization on locke interferes with UDP log collection.
Once on storage3, the log files are compressed and archived by a cron job running as logmover@storage3.
Finally, files are parsed from storage3 by Faulkner's analytics scripts.


udp2log proxy log collection

To enable
SSH to locke and uncomment the fundraising-related lines in /etc/udp2log/squid so that they look like this:

...
## Fundraising
# Landing pages
pipe 1 /a/squid/fundraising/lp-filter >> /a/squid/fundraising/logs/landingpages.log

# Banner Impressions
pipe 1 /a/squid/fundraising/bi-filter >> /a/squid/fundraising/logs/bannerImpressions.log
...

Then HUP udp2log:

awjrichards@locke:~$ /home/file_mover/scripts/resetudp2log 
Sending SIGHUP to udp2log...

To disable
SSH to locke and comment out the fundraising-related lines in /etc/udp2log/squid.

Then HUP udp2log:

awjrichards@locke:~$ /home/file_mover/scripts/resetudp2log 
Sending SIGHUP to udp2log...
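
The enable and disable steps above amount to toggling a leading "#" on the fundraising pipe lines. The following is a minimal Python sketch of that edit, operating on the config text as a string rather than on the real file. The function name set_fundraising_pipes and the rule "a pipe line that mentions /a/squid/fundraising/" are assumptions made for illustration; they are not part of udp2log or of the scripts on locke.

```python
# Hypothetical helper: not part of udp2log or the file_mover scripts.
FUNDRAISING_PATH = "/a/squid/fundraising/"


def set_fundraising_pipes(config_text, enabled):
    """Return config_text with fundraising 'pipe' lines uncommented
    (enabled=True) or commented out (enabled=False).

    Plain comment lines such as '## Fundraising' are left untouched.
    """
    result = []
    for line in config_text.splitlines():
        # Strip any leading '#' characters and whitespace to see the bare directive.
        bare = line.lstrip("#").lstrip()
        if bare.startswith("pipe ") and FUNDRAISING_PATH in bare:
            line = bare if enabled else "#" + bare
        result.append(line)
    return "\n".join(result)


disabled = (
    "## Fundraising\n"
    "# Landing pages\n"
    "#pipe 1 /a/squid/fundraising/lp-filter >> /a/squid/fundraising/logs/landingpages.log"
)
print(set_fundraising_pipes(disabled, True))
```

Whichever way the file is edited, udp2log still needs the HUP shown above before the change takes effect.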

proxy log copy to hume

To enable:
Enable this crontab entry for file_mover@locke:

*/15 * * * * /home/file_mover/scripts/rotate_fundraising_logs

To disable:
Comment out this crontab entry for file_mover@locke:

#*/15 * * * * /home/file_mover/scripts/rotate_fundraising_logs

proxy log compression on storage3

To enable:
Enable this crontab entry for logmover@storage3:

*/5 * * * * /home/logmover/scripts/gzip_incoming_logs.pl

To disable:
Comment out this crontab entry for logmover@storage3:

#*/5 * * * * /home/logmover/scripts/gzip_incoming_logs.pl
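
For readers less familiar with cron step syntax, the two schedules above can be expanded into the minutes of each hour at which the jobs fire. This is a small Python sketch of the standard meaning of "*/N" in the minute field; the function name step_minutes is made up for this example.

```python
def step_minutes(step):
    """Minutes of the hour matched by a '*/step' entry in cron's minute field."""
    return list(range(0, 60, step))


print(step_minutes(15))  # [0, 15, 30, 45]  (rotate_fundraising_logs on locke)
print(step_minutes(5))   # twelve runs per hour (gzip_incoming_logs.pl on storage3)
```

Because every quarter-hour mark is also a multiple of five, both jobs start in the same minute four times an hour.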

monitoring and debugging

The cron script logs verbosely; locke:/var/log/syslog shows its actions and any errors.

Under normal operation, you should see this sequence:

Sep  6 17:45:01 locke CRON[28592]: (file_mover) CMD (/home/file_mover/scripts/rotate_fundraising_logs)
Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: move /a/squid/fundraising/logs/landingpages.log to /a/squid/fundraising/logs/buffer/2012/landingpages-20120906-174501.log
Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: move /a/squid/fundraising/logs/bannerImpressions-sampled100.log to /a/squid/fundraising/logs/buffer/2012/bannerImpressions-sampled100-20120906-174501.log
Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: reload udp2log
Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: gzip /a/squid/fundraising/logs/buffer/2012/bannerImpressions-sampled100-20120906-174501.log
Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: gzip /a/squid/fundraising/logs/buffer/2012/landingpages-20120906-174501.log
Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: rsync -ar /a/squid/fundraising/logs/buffer/ /a/squid/fundraising/logs/fr_archive/
Sep  6 17:45:02 locke rotate_fundraising_logs[28594]: done!

Note that /a/squid/fundraising/logs/fr_archive is the permanent storage location on the netapp.
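
When checking a long syslog by eye is impractical, the sequence above can be checked mechanically. The Python sketch below groups rotate_fundraising_logs messages by PID and reports any run whose last message is not "done!". It assumes the line format shown in the sample above and treats "done!" as the final message of a successful run, which is inferred from that sample rather than from the script itself. The function names are hypothetical.

```python
import re

# Matches e.g. "... locke rotate_fundraising_logs[28594]: reload udp2log".
# The CRON line is not matched, because there the script name is not followed by "[pid]:".
LINE_RE = re.compile(r"rotate_fundraising_logs\[(\d+)\]: (.*)$")


def summarize_runs(syslog_lines):
    """Map each rotate_fundraising_logs PID to the messages it logged, in order."""
    runs = {}
    for line in syslog_lines:
        match = LINE_RE.search(line)
        if match:
            runs.setdefault(match.group(1), []).append(match.group(2).strip())
    return runs


def incomplete_runs(syslog_lines):
    """Return PIDs of runs whose last logged message is not 'done!'."""
    return [
        pid
        for pid, messages in summarize_runs(syslog_lines).items()
        if messages[-1] != "done!"
    ]


sample = [
    "Sep  6 17:45:01 locke CRON[28592]: (file_mover) CMD (/home/file_mover/scripts/rotate_fundraising_logs)",
    "Sep  6 17:45:01 locke rotate_fundraising_logs[28594]: reload udp2log",
    "Sep  6 17:45:02 locke rotate_fundraising_logs[28594]: done!",
]
print(incomplete_runs(sample))  # []
```

PIDs are eventually reused, so this grouping is only reliable over a short window of the log, such as the last few runs.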
