A program that gives a more comprehensive view of the SMTP traffic and other flows
Arnaud Taddei
Arnaud.Taddei at sun.com
Sun Mar 24 23:03:29 CET 2002
As you can see in the attached image
Orange-smtp-flow-model.jpg
we tried to represent the flows of traffic coming from some
specific ranges of IP addresses and machine names.
In this case we have a 3 tier network Mail Architecture and we need to
make sure that we understand the flows at each block and between all the
clients and servers.
So on each of these arrows we should be able to read:
- the production flow: how many From we received, how many rcpt mails we sent,
- the pathological flow: how many connections we rejected and at which
  position in the SMTP dialog:
  - we can reject by client IP address
    - block at EHLO level
  - we can reject by from
    - block at MAIL FROM
  - we can reject by sender
    - block at RCPT TO
  - we can block relay attempts
    - block after RCPT TO but taking MAIL FROM into account
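To make the pathological flow concrete, here is a minimal sketch of how rejections could be tallied per position in the SMTP dialog. The record format (a peer and a stage name), the stage labels and the sample data are all hypothetical; nothing here reflects the actual DLF layout.

```perl
use strict;
use warnings;

# Rejection points, in the order they occur in the SMTP dialog.
# These labels are assumptions made for this sketch only.
my @stages = ('EHLO', 'MAIL FROM', 'RCPT TO', 'relay check');

# Count rejections per stage from records of the form [peer, stage].
sub tally_rejections {
    my (@records) = @_;
    my %count = map { $_ => 0 } @stages;
    for my $rec (@records) {
        my ($peer, $stage) = @$rec;
        next unless exists $count{$stage};    # ignore unknown stages
        $count{$stage}++;
    }
    return \%count;
}

# Hypothetical sample data, for illustration only.
my $count = tally_rejections(
    ['10.13.0.5',   'EHLO'],
    ['154.15.51.9', 'RCPT TO'],
    ['10.14.2.1',   'RCPT TO'],
);
printf "%-12s %5d\n", $_, $count->{$_} for @stages;
```

Keeping the stages in a fixed list means the report always shows every rejection point, even those with a count of zero.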
So we could at least get the production flow more or less correct. We
then built a program that would read the DLF file and would compare the
IP addresses and client names with a table built by hand by the system
administrator.
We wrote a quick and dirty program called
smtp-flows
(short, and attached) that calls the
Logreport::Hosts
Perl package, which is attached too. This program reads a file called
$HOME/.lire/etc/clusters
which looks like this (I removed some of the lines on purpose):
127.0.0.1:DMZ Orange Mail Service
192.168.30.2:Service Mail Hub (iWeb)
192.168.20.2:Service Mail Hub (iWeb)
192.168.40.101:Webmail
192.168.40.103:Webmail Wap
localhost:DMZ Orange Mail Service
smtp.iorange.ch:Service Mail Hub (iWeb)
smtp.orangemail.ch:Service Mail Hub (iWeb)
212.215.1.67:Orange Corporate
154.15.51.:Fixed IP Customers
213.55.133.:HSCSD
10.13.:GPRS
10.14.:GPRS
When you cat a DLF file into smtp-flows you typically end up with:
> cat <DLFFILE> | smtp-flows
..............
From Table (with number of recipients)
======================================
SMTP Peer                         Total # Rcpt   # From
-                                          267      266
DMZ Orange Mail Service                    142      142
Fixed IP Customers                         669      539
GPRS                                        15       10
Orange Corporate                            23       23
Rest of the World                        11276    11116
Service Mail Hub (iWeb)                   1661     1643
---------------------------------------------------
TOTAL                                    14053    13739
To Table
=========
-                                         6008
DMZ Orange Mail Service                      1
Rest of the World                         2569
Service Mail Hub (iWeb)                   5475
---------------------------------------------------
TOTAL                                    14053
So this approach shows that one can reassemble a view which is much more
comprehensible to IT managers, marketing and other troops and which
gives a lot of information on the real usage. Indeed, we now know the
activity from the GPRS or Fixed IP lines, etc., which is key business
information.
This of course means that the clusters file has to be defined at each
level of the infrastructure but it shows how simple this is and how
useful such a report is.
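To illustrate how entries of the clusters file resolve, here is a minimal, self-contained sketch of the lookup: exact match first, then the "a.b.c." prefix, then the "a.b." prefix, and finally a catch-all. The table is a hand-copied subset of the file shown above and the function name cluster_for is hypothetical; the attached Logreport::Hosts package is the real implementation.

```perl
use strict;
use warnings;

# A few entries copied by hand from the clusters file (host-or-prefix => cluster).
my %clusters = (
    '192.168.40.101' => 'Webmail',
    '154.15.51.'     => 'Fixed IP Customers',
    '10.13.'         => 'GPRS',
    'localhost'      => 'DMZ Orange Mail Service',
);

# Return the cluster for a host: exact match first, then the three-octet
# prefix ("a.b.c."), then the two-octet prefix ("a.b."), else a catch-all.
sub cluster_for {
    my ($host) = @_;
    return $clusters{$host} if exists $clusters{$host};
    if ($host =~ /^(\d+\.\d+\.)(\d+\.)\d+$/) {
        my ($two, $three) = ($1, "$1$2");
        return $clusters{$three} if exists $clusters{$three};
        return $clusters{$two}   if exists $clusters{$two};
    }
    return 'Rest of the World';
}

for my $host (qw(192.168.40.101 154.15.51.7 10.13.200.9 localhost 8.8.8.8)) {
    printf "%-16s --> %s\n", $host, cluster_for($host);
}
```

With this table, 154.15.51.7 resolves through its three-octet prefix, 10.13.200.9 falls back to the two-octet prefix, and 8.8.8.8 ends up in "Rest of the World".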
Then Arnaud Gaillard zoomed in on each of these flows and showed the
evolution over time, for a one-month period, of some of them. Thus
we could take a drill-down approach and better understand the dynamics of
the site.
Now, among the actions to do:
1) incorporate that analysis into logreport (ok I really have to learn
the analyser and the way you compute things - sic)
2) detail the pathological flow further (which is probably what we read
from the '-' category in the above report)
3) make sure that we have more flexibility to import external 'clusters'
files depending on which log we are looking at and from which role in the
architecture
4) give a way to aggregate such outputs over time and across several machines
in the same area of the architecture
etc.
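As an illustration of point 4, here is a minimal sketch of how per-cluster counters from several runs (one per machine, or one per period) could be summed. The function name merge_counts and the sample figures are hypothetical; this is not part of the attached code.

```perl
use strict;
use warnings;

# Sum per-cluster counters from several runs.
# Each argument is a hash ref of cluster name => count.
sub merge_counts {
    my (@runs) = @_;
    my %total;
    for my $run (@runs) {
        $total{$_} += $run->{$_} for keys %$run;
    }
    return \%total;
}

# Hypothetical results from two machines in the same area.
my $hub1 = { 'GPRS' => 15, 'Orange Corporate'   => 23 };
my $hub2 = { 'GPRS' => 4,  'Fixed IP Customers' => 669 };

my $all = merge_counts($hub1, $hub2);
printf "%-30s %15d\n", $_, $all->{$_} for sort keys %$all;
```

Because the clusters are plain names, the same merge works whether the runs come from different machines or from different days on one machine.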
Let me know what you think about it.
A++
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Orange-smtp-flow-model.jpg
Type: image/jpeg
Size: 165569 bytes
Desc: not available
Url : http://lists.logreport.org/pipermail/development/attachments/20020324/83044b11/attachment.jpg
-------------- next part --------------
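#!/usr/bin/perl
# smtp-flows: read DLF records on STDIN and count them per cluster.
use lib "/homedir/m/march/lib";
use Logreport::Hosts;
$| = 1;    # unbuffered, so the progress dots show up immediately
$clusters = "$ENV{'HOME'}/.lire/etc/clusters";
$H2C = &Logreport::Hosts::Load_Clusters($clusters);
while (<STDIN>) {
    $lines++;
    chomp;
    @fields = split(/ /, $_);
    # Field 7 is mapped to the "from" cluster, field 13 to the "to" cluster
    $f_host = &Logreport::Hosts::Host2Cluster($fields[7], $H2C);
    #printf "%-30s --> %-30s\n", $fields[7], $f_host;
    $t_host = &Logreport::Hosts::Host2Cluster($fields[13], $H2C);
    #printf "%-30s --> %-30s\n", $fields[13], $t_host;
    $F_IP{$f_host}++;
    $T_IP{$t_host}++;
    # Remember each qid (field 2) once per "from" cluster. An exact string
    # comparison avoids substring and regex-metacharacter false matches.
    $qid = $fields[2];
    @l = grep { $_ eq $qid } @{$FU_IP{$f_host}};
    if ($#l < 0) {
        push(@{$FU_IP{$f_host}}, $qid);
    }
    # Print a progress dot every 1000 lines
    if ($lines =~ /000$/) {
        print ".";
    }
}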
print "\n\n";
print "From Table (with number of recipients)\n";
print "======================================\n";
printf "%-30s %15s %8s\n", "SMTP Peer", "Total # Rcpt", "# From";
print "\n";
foreach $ip (sort keys %F_IP) {
    $c = $c + $F_IP{$ip};    # running total of recipients
    @f = @{$FU_IP{$ip}};
    $g = $#f + 1;            # number of distinct qids for this cluster
    $o = $o + $g;
    printf "%-30s %15d %8d\n", $ip, $F_IP{$ip}, $g;
}
print "---------------------------------------------------\n";
printf "%-30s %15d %8d\n", 'TOTAL', $c, $o;
print "\n";
print "To Table \n";
print "=========\n";
print "\n";
foreach $ip (sort keys %T_IP) {
    $d = $d + $T_IP{$ip};
    printf "%-30s %15d\n", $ip, $T_IP{$ip};
}
print "---------------------------------------------------\n";
printf "%-30s %15d\n", 'TOTAL', $d;
-------------- next part --------------
package Logreport::Hosts;
=pod

=head1 NAME

Logreport::Hosts - Perl Module that allows manipulations on hosts for log analysers

=head1 SYNOPSIS

;

=head1 DESCRIPTION

This package intends to offer nice functions for mappings between IP addresses and hosts as well as hosts 'clusters'

=head1 FILES

Files to review

=head1 SEE ALSO

Other resources

=head1 COPYRIGHT

Copyright Sun - 2002

=head1 VERSION

$Revision: 0.1 $

=head1 DATE

$Date: 2000/06/27 09:04:00 $

=head1 AUTHOR

Arnaud Taddei <Arnaud.Taddei at sun.com>

=cut
use strict;
=pod

=head1 FUNCTIONS

=head2 Load_Clusters

=cut
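sub Load_Clusters {
    my($clusters) = @_;
    my(%host2cluster, @cluster);
    # On failure, warn and return an empty table
    open(O, $clusters) || do {
        warn "$clusters is not readable: $!";
        return \%host2cluster;
    };
    while (<O>) {
        chomp;
        next unless /:/;    # skip blank or malformed lines
        # A limit of 2 keeps any ':' inside the cluster name
        @cluster = split(/:/, $_, 2);
        $host2cluster{$cluster[0]} = $cluster[1];
    }
    close(O);
    return \%host2cluster;
}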
=pod

=head2 Host2Cluster

=cut
sub Host2Cluster {
    my($host,$h2c,$mode) = @_;
    my($orig_host) = $host;
    # Exact match first, then progressively shorter IP prefixes
    # ("a.b.c." and then "a.b."). Using the hash reference directly
    # avoids copying the whole table on every call.
    if (defined $h2c->{$host}) {
        return $h2c->{$host};
    } elsif ($host eq '-') {
        return '-';
    } elsif ($host =~ /(\d+\.\d+\.\d+\.)\d+/) {
        $host = $1;
        if (defined $h2c->{$host}) {
            return $h2c->{$host};
        } elsif ($host =~ /(\d+\.\d+\.)\d+/) {
            $host = $1;
            if (defined $h2c->{$host}) {
                return $h2c->{$host};
            }
        }
    }
    # No match: either pass the host through or use the catch-all
    if (defined $mode && $mode eq 'transparent') {
        return $orig_host;
    } else {
        return "Rest of the World";
    }
}
1;