Question : Calculate Statistics From Log file
This is not a homework assignment.
I have log files for multiple servers that are kept in a repository. The log file's filename format is application.servername.timestampinYYYYMMDD.log.
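Since the filename already carries the application, server, and date, it can be split up before any file is opened. This is only a minimal sketch: the helper name `parse_log_filename` is made up for illustration, and it assumes the application name contains no dots while the server name may.

```perl
use strict;
use warnings;

# Split "application.servername.YYYYMMDD.log" into its three parts.
# Assumption: the application name has no dots; the server name may have some.
sub parse_log_filename {
    my ($filename) = @_;
    my ($app, $server, $date) = $filename =~ /^([^.]+)\.(.+)\.(\d{8})\.log$/
        or return;    # not a log file name in the expected format
    return { application => $app, server => $server, date => $date };
}

unless (caller) {
    my $info = parse_log_filename('application.servername.20030817.log');
    print "$info->{application} on $info->{server} for $info->{date}\n";
    # prints: application on servername for 20030817
}
```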
Each of the log files in the directory has the following format:
YYYYMMDD Time-TimeZone Servername Application PID; MeasuredStatistic (XX/YY) Value for TimeStamp

20030817 000216010-0400 servername application 4567;MeasuredStatisticOne(12/18) 5217
20030817 000216110-0400 servername application 4567;MeasuredStatisticTwo(12/14) 419276
20030817 000216110-0400 servername application 4567;MeasuredStatisticThree(12/31) 57912
20030817 000216110-0400 servername application 4567;MeasuredStatisticFour(12/12) 72
20030817 000216110-0400 servername application 4567;MeasuredStatisticFive(12/13) 1451718
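A single line in this layout could be taken apart with one regular expression, which avoids the problem that the PID and the statistic name share a whitespace-delimited field. The sketch below is an illustration only: the helper name `parse_log_line` and the hash keys are invented here, and the pattern assumes every line looks like the samples above.

```perl
use strict;
use warnings;

# Parse one log line into named fields.
# Returns a hash reference, or nothing if the line does not match.
sub parse_log_line {
    my ($line) = @_;
    my @f = $line =~ /
        ^(\d{8}) \s+          # date as YYYYMMDD
        (\S+) \s+             # time with time zone offset
        (\S+) \s+             # server name
        (\S+) \s+             # application
        (\d+) ; \s*           # PID followed by the semicolon
        (\w+) \s*             # statistic name
        \( ([^)]*) \) \s+     # the XX and YY pair in parentheses
        (-?\d+(?:\.\d+)?)     # measured value
        \s*$
    /x or return;

    my %rec;
    @rec{qw(date time server application pid metric ratio value)} = @f;
    return \%rec;
}

unless (caller) {
    my $rec = parse_log_line(
        '20030817 000216010-0400 servername application 4567;MeasuredStatisticOne(12/18) 5217'
    );
    print "$rec->{server} $rec->{metric} = $rec->{value}\n";
    # prints: servername MeasuredStatisticOne = 5217
}
```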
The log files roll over at midnight. Each statistic is measured at a random interval.
I am trying to write a Perl script that will do the following:
* For each server, determine the number of times that a given statistic appears on a day, and use that number to calculate the average of the MeasuredStatistic for the day.
* Sum the days to determine the totals and averages for the week for a server.
* Determine the average for all statistics for all servers and export it to a CSV file to be used in an Excel spreadsheet.
* Determine totals for all servers and export them to a CSV file to be used in an Excel spreadsheet.
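One possible reading of the first requirement, plus a CSV export, might look like the sketch below. It is not a definitive solution: it takes "average for the day" to mean sum divided by count, the names `add_line` and `csv_rows` are hypothetical, the second sample line is made up for the demonstration, and it assumes no field contains a comma, so no CSV quoting is done.

```perl
use strict;
use warnings;

# Accumulate count and sum per server, day, and statistic.
# Assumes the line layout shown in the question; returns 0 for lines it cannot parse.
sub add_line {
    my ($stats, $line) = @_;
    my ($date, $server, $metric, $value) =
        $line =~ /^(\d{8})\s+\S+\s+(\S+)\s+\S+\s+\d+;\s*(\w+)\s*\([^)]*\)\s+(-?\d+(?:\.\d+)?)\s*$/
        or return 0;
    my $cell = $stats->{$server}{$date}{$metric} ||= { count => 0, sum => 0 };
    $cell->{count}++;
    $cell->{sum} += $value;
    return 1;
}

# Turn the accumulated data into CSV rows, header first.
sub csv_rows {
    my ($stats) = @_;
    my @rows = ('server,date,statistic,count,sum,average');
    for my $server (sort keys %$stats) {
        for my $date (sort keys %{ $stats->{$server} }) {
            for my $metric (sort keys %{ $stats->{$server}{$date} }) {
                my $c = $stats->{$server}{$date}{$metric};
                push @rows, join ',', $server, $date, $metric, $c->{count},
                    $c->{sum}, sprintf('%.2f', $c->{sum} / $c->{count});
            }
        }
    }
    return @rows;
}

unless (caller) {
    my %stats;
    add_line(\%stats, $_) for (
        '20030817 000216010-0400 servername application 4567;MeasuredStatisticOne(12/18) 5217',
        # made-up second measurement, only to show the averaging
        '20030817 000316010-0400 servername application 4567;MeasuredStatisticOne(12/18) 5219',
    );
    print "$_\n" for csv_rows(\%stats);
}
```

Weekly figures could be built the same way by adding the daily counts and sums for the seven days of a week before dividing.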
I have written the following code, but it is not producing the desired results.
=====Begin code.pl
#! /usr/bin/perl

%module_count = ();
%module_sum = ();

while (<>) {
    chomp;
    next if (/^\s*$/);

    my ($date, $time, $host, $server, $pid, $metric, $value) = split(/\s/);

    $module_count{$module}++;
    $module_sum{$module} += $percent;
}

foreach $module (sort keys %module_count) {
    printf "%s %dx average is %d%%\n",
        $module,
        $module_count{$module},
        $module_sum{$module} / $module_count{$module};
=====End code.pl
How would I script this properly in Perl?
Answer : Calculate Statistics From Log file
I've included code below that should give you a good idea of how to accomplish what you want. I didn't do everything you asked for, mostly because I don't fully understand how some of the statistics should be calculated. For example, the average for a server is ambiguous: is it the average per day within a week, or the average per day over the whole period? However, the code below should give you enough detail to compute the statistics yourself. I also commented out some code I felt was not needed, and added "use strict", which is always a good idea.
#! /usr/bin/perl
use strict;

my (%module_count, %module_sum, %server_count, %all_count);
# Used as a convenience; should be a Sunday earlier than every date in the files.
# Note: subtracting YYYYMMDD values only gives a day count within a single month.
my $sunday = 20030803;

while (<>) {
    next if (/^\s*$/);
    chomp;

    # Fields: date, time, server name, application, "PID;Statistic(XX/YY)", value
    my ($date, $time, $server, $app, $tmp, $value) = split(/\s+/);
    my ($pid, $metric) = split(/;/, $tmp);
    my $week = int(($date - $sunday) / 7);    # parentheses needed: subtract first, then divide

    $server_count{$server}->{daystat}->{$date}++;     # each server's number of entries per day
    $server_count{$server}->{weekstat}->{$week}++;    # each server's total for each week
    $all_count{daystat}->{$date}++;     # all servers per day; not really needed, but saves adding the servers up
    $all_count{weekstat}->{$week}++;    # same as above, per week
    # $module_count{$module}++;
    # $module_sum{$module} += $percent;
}

foreach my $server (keys %server_count) {
    foreach my $week (keys %{ $server_count{$server}->{weekstat} }) {
        my $start = $sunday + 7 * $week;
        print "for server $server, weekly count for $start - " . ($start + 7)
            . " is $server_count{$server}->{weekstat}->{$week}\n";
    }
}

# foreach $module (sort keys %module_count) {
#     printf "%s %dx average is %d%%\n",
#         $module,
#         $module_count{$module},
#         $module_sum{$module} / $module_count{$module};
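Bucketing by week through plain subtraction of YYYYMMDD numbers stops working once the dates cross a month boundary (20030901 minus 20030803 is 98, not 29 days). One way around that, sketched here with the core `Time::Local` module, is to convert each date to a day count first. The helper names `days_since_epoch` and `week_index` are invented for this example, and the reference Sunday is the one used above.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Convert YYYYMMDD to whole days since the Unix epoch (UTC, so no DST shifts).
sub days_since_epoch {
    my ($yyyymmdd) = @_;
    my ($y, $m, $d) = $yyyymmdd =~ /^(\d{4})(\d{2})(\d{2})$/
        or die "bad date: $yyyymmdd";
    return int(timegm(0, 0, 0, $d, $m - 1, $y) / 86400);
}

# Zero-based week number of $date relative to the reference Sunday.
sub week_index {
    my ($date, $sunday) = @_;
    return int((days_since_epoch($date) - days_since_epoch($sunday)) / 7);
}

unless (caller) {
    print week_index(20030817, 20030803), "\n";    # 14 days later: prints 2
    print week_index(20030901, 20030803), "\n";    # 29 days later, across a month end: prints 4
}
```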