Brian Kendig wrote:
> I've got a copy of all my exim "mainlog" data for the past year. I'd
> like to analyze it to know how many spam attempts I got each day: not
> only stuff that was "rejected by local_scan()" (because I'm using
> SA-Exim together with SpamAssassin), but also times when an SMTP
> connection was rejected because "no IP address found for host", or there
> was "input sent without waiting for greeting", or a HELO or EHLO was
> "syntactically invalid", or the server "could not complete sender
> verify", or anything else like that.
>
> Yes, I could just write a tool myself to pick these lines out of the log
> files and tally them up, but I've got over 130,000 lines of log files
> and I'm bound to miss some kind of spam attempts. I'd rather not
> reinvent the wheel if someone else has already addressed this issue -
> are there any solutions that already exist?
Here are a couple of Perl scripts a friend gave me that work nicely.

#!/usr/bin/perl -w
#
# sa_score_graph.pl -- print a quick little graph of SA scores
#
# 12/08/03 by rpuhek@???
#
# Gives a graph of scores observed in your log file. May be useful in
# determining where to place your required_hits value.
#
# Usage: sa_score_graph.pl <logfile>
#
# Maximum desired width...
$width=72;
$min=0;
$max=0;
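# Left undefined until the first bucket is seen; starting at 0 would pin
# the reported minimum at 0, since every real count is at least 1.
$min_count=undef;
$max_count=0;
while (<>) {
next unless ( /identified spam \(([\d\.-]*)\/([\d\.]*)\)/ or /clean message \(([\d\.-]*)\/([\d\.]*)\)/ );
$score = $1;
# Skip lines where no score was captured (the pattern allows an empty match).
next unless $score =~ /\d/;
# Truncate the score to its integer part. "%d" also turns "-0.5" into "0",
# so it lands in the bucket that the graph loop below looks up.
$score = sprintf("%d", $score);
$scores{$score}++;
$min = $score if ( $score < $min );
$max = $score if ( $score > $max );
};
foreach $score (keys(%scores)) {
$count = $scores{$score};
$min_count = $count if ( !defined($min_count) or $count < $min_count );
$max_count = $count if ( $count > $max_count );
};
# No matching lines at all: report a minimum count of 0.
$min_count = 0 unless defined($min_count);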
$scale = 1 if ($max_count <= $width);
if ($max_count > $width) {
$scale = $width / $max_count;
};
print "SpamAssassin score summary:\n";
print "Minimum score: $min\n";
print "Maximum score: $max\n";
print "Minimum count: $min_count\n";
print "Maximum count: $max_count\n";
print "Scaling to: $scale\n";
for ( $i = $min; $i <= $max; $i++ ) {
print "$i\t:";
$score = $scores{$i} || 0;
#scale the graph to fit on one line.
$score = $score * $scale;
$score = 1 if ( ($score > 0 ) and ($score < 1) );
print "*" x $score;
print "\n";
};

#!/usr/bin/perl
#
# spam-stats.pl -- scan logfile and retrieve spam stats messages,
# this version uses logtail to save time
#
# 7/30/03 by rpuhek@???
#
$statfile="/var/local/spam-stats/counts";
$offsetfile="/var/local/spam-stats/offset";
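# NOTE: the shell expands this glob before logtail runs. As far as I know
# logtail takes a single log file, so check that the pattern matches only one.
$logfile="/var/log/syslog*";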
$tail_cmd="/usr/sbin/logtail";
#open STATS, $statfile or die "ERROR: Unable to open $statfile $!\n";
dbmopen %stats, $statfile, 0666 or die "ERROR: Unable to open statsfile: $statfile $!\n";
open (LOG, "$tail_cmd $logfile $offsetfile|") or die "ERROR: Unable to open logfile: $logfile $!\n";
while (<LOG>) {
$stats{spam}++ if /identified spam/;
$stats{clean}++ if /clean message/;
$stats{skipped}++ if /skipped large/;
$stats{total}++ if /connection from/;
$stats{processed}++ if /processing message/;
};
close LOG;
foreach $key (keys(%stats)) {
print "$key: $stats{$key}\n";
};
$uptime = `uptime`;
$hostname = `hostname -f`;
# chomp removes only a trailing newline; chop would eat any last character.
chomp $uptime;
chomp $hostname;
$uptime =~ s/.*up //;
$uptime =~ s/,\s*?[0-9]* user.*//;
print "uptime: $uptime\nhostname: $hostname\n";
dbmclose %stats;
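
Neither script above gives the per-day tally of rejected connections that Brian asked about, so here is a minimal sketch of one way to do it. It is not one of my friend's scripts. It assumes each mainlog line starts with a YYYY-MM-DD timestamp, and it matches only the phrases quoted in the question; the labels (local_scan, no_ip, and so on) and the name tally_rejections are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Phrases taken from the question above; extend the list as new kinds
# of rejection turn up. The labels on the left are hypothetical names.
my @patterns = (
    [ local_scan    => qr/rejected by local_scan\(\)/i ],
    [ no_ip         => qr/no IP address found for host/i ],
    [ early_input   => qr/input sent without waiting for greeting/i ],
    [ bad_helo      => qr/syntactically invalid/i ],
    [ sender_verify => qr/could not complete sender verify/i ],
);

# Takes a list of log lines, returns { date => { label => count } }.
sub tally_rejections {
    my %tally;
    for my $line (@_) {
        # Assumption: the line begins with a "YYYY-MM-DD " timestamp.
        my ($day) = $line =~ /^(\d{4}-\d\d-\d\d) / or next;
        for my $p (@patterns) {
            my ($label, $re) = @$p;
            if ($line =~ $re) {
                $tally{$day}{$label}++;
                $tally{$day}{total}++;
                last;    # count each line once
            }
        }
    }
    return \%tally;
}

# Usage: tally.pl mainlog [mainlog.1 ...]
if (@ARGV) {
    my $tally = tally_rejections(<>);
    for my $day (sort keys %$tally) {
        my $counts = $tally->{$day};
        print "$day ",
            join(", ", map { "$_=$counts->{$_}" } sort keys %$counts), "\n";
    }
}
```

Lines that match none of the patterns are silently skipped, so it will undercount until the list is complete. Printing the unmatched lines that contain "rejected" is probably the quickest way to find the phrases that are still missing.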