Calculating bandwidth from a combined-format web server log

Given a combined-format web server access log, such as the ones generated by Apache, it can be useful to know the total amount of data transferred by all requests in that log. The task is simple: extract the field listing the number of bytes sent for each request, and add them all up. For something so simple, there is an odd lack of examples or pre-made scripts that do this. Or, at least, I couldn’t find any.
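
To make the field in question concrete, here is a minimal sketch that pulls the bytes-sent value out of a single line. The sample line is the illustrative one from Apache's log documentation; the names SAMPLE_LINE, FIELD_PATTERN and bytes_sent are hypothetical, and the pattern assumes the request string contains no embedded double quotes.

```python
import re

# Illustrative combined-format line (the example from Apache's log documentation).
SAMPLE_LINE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

# Layout: host ident user [timestamp] "request" status bytes "referer" "user-agent"
# Assumption: the request string itself contains no double quotes.
FIELD_PATTERN = re.compile(r'^\S+ \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) (\d+|-)')


def bytes_sent(line):
    """Return the bytes-sent field as an int; 0 if it is "-" or the line does not match."""
    match = FIELD_PATTERN.match(line)
    if match is None or match.group(2) == "-":
        return 0
    return int(match.group(2))


print(bytes_sent(SAMPLE_LINE))  # 2326
```

The bytes field is the number right after the status code. It is logged as "-" when no response body was sent, which is why that case is counted as zero here.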

I wrote my solution, calculate-data-transfer.py, in Python:

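import re
import sys

# Capture the bytes-sent field, which follows the quoted request and the status code.
# When the server logs "-" (no body sent), the group matches an empty string.
compiledExpression = re.compile(r'.*".*" [-0-9]* ([0-9]*)')

fileName = sys.argv[1]

totalBytes = 0

# errors="replace" keeps a stray non-decodable byte in the log from aborting the run
with open(fileName, errors="replace") as fpFullLog:
  for line in fpFullLog:
    matches = compiledExpression.match(line)

    if matches is None:
      continue

    byteCount = matches.group(1)

    if len(byteCount) > 0: # avoid zero-length matches
      totalBytes += int(byteCount)

print("%.2f MiB" % (totalBytes / 2.0**20))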

Usage is simple:

% python calculate-data-transfer.py access.log

The script prints the total data transfer in MiB, where 1 MiB is 2^20 bytes (a power of 2), rather than in MB, where 1 MB is 10^6 bytes (a power of 10).
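
The gap between the two units is easy to see with a small worked example. This is only a sketch; the helper names to_mib and to_mb are hypothetical and not part of the script above.

```python
def to_mib(total_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2^20 = 1,048,576 bytes)."""
    return total_bytes / 2.0**20


def to_mb(total_bytes):
    """Convert a byte count to megabytes (1 MB = 10^6 = 1,000,000 bytes)."""
    return total_bytes / 1e6


total = 5 * 2**20  # 5,242,880 bytes

print("%.2f MiB" % to_mib(total))  # 5.00 MiB
print("%.2f MB" % to_mb(total))    # 5.24 MB
```

The same byte count reads about 4.9% higher in MB than in MiB, which matters if the number is being compared against a hosting quota stated in one unit or the other.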


mmmm google

It is scary when you google search something and come across people you know. You saved me 5 minutes of work ;)


your regex appears to be off.

Also, it is much easier to do something like:

awk '{ sum += $10 } END { print sum }' access_log

From the command line. You are guaranteed that *nix will have awk. Never know if you’ll have python.


was it so easy? :))

i tried this script and saw that it is really working.

is bandwidth calculation so easy? :)

thanks!

