Author: Jethro R Binks
Date:
To: exim-users
Subject: [exim] Connections vs messages vs recipients

From time to time I get asked some variant of "how many emails do the
University systems reject?" (meaning at the MXs).
My problem is that I never really know how to answer that accurately and
meaningfully. I run eximstats over the consolidated logs from the MXs,
which gives me a report. I get a count of connections made, and some of
those will get through and cause emails to be delivered, and others will
be rejected, at HELO time or RCPT time or DATA time as seems appropriate
for efficiency or best information purposes. (The report gives a breakdown
of the reasons for rejection, using custom patterns that look for strings
set with log_message per ACL clause.) Some connections may
just be shed off through the use of imposed SMTP delays and I never get
any more information than the remote IP.
The problem is, of course, that a connection could deliver one or more
email messages, and one email message may be addressed to one or more
recipients. This makes a direct comparison of connections vs deliveries
difficult.
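
To make the mismatch concrete, here is a minimal sketch in Python. The
Session record and the sample data are invented purely for illustration;
they are not an Exim log format or anything eximstats produces. It only
shows that the same traffic yields three different totals depending on the
unit counted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    """One SMTP connection (hypothetical record, not an Exim log format)."""
    remote_ip: str
    # Each inner list holds the accepted recipients of one message.
    messages: List[List[str]] = field(default_factory=list)


def tally(sessions: List[Session]) -> Dict[str, int]:
    """Count the same traffic in three different units."""
    return {
        "connections": len(sessions),
        "messages": sum(len(s.messages) for s in sessions),
        "recipients": sum(len(rcpts) for s in sessions for rcpts in s.messages),
    }


# Made-up sample: three connections, one of which delivered nothing.
sessions = [
    Session("192.0.2.1", [["a@example.org"]]),
    Session("192.0.2.2", [["a@example.org", "b@example.org"], ["c@example.org"]]),
    Session("192.0.2.3", []),  # rejected or dropped before any message
]

counts = tally(sessions)
print(counts)
```

With this sample the totals are 3 connections, 3 messages and 4
recipients, so "how many emails" has at least two defensible answers even
before rejections are considered.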
It's also hard to say whether a message for two recipients is one or two
emails. The MTA transports it as one message, but on a traditional (Unix)
email system a copy may get delivered to each recipient, so does it then
become two messages? On the other hand, if delivery is to a modern
'database' message store, like Exchange, in some cases at least, I think I
am right in saying there is just one 'copy' of the message in the
database, and each user just receives a pointer to it from their inbox.
In either case, I imagine each user would count the copy of the message in
their own inbox as separate from the one in another recipient's inbox -
each 'copy' is viewed as a unique message.
For the purposes of the report request, I usually end up giving the
relevant numbers with units, and include a rider saying that direct
comparisons of the numbers could be misleading with a brief explanation.
I just wondered if anyone had any neat ways of summarising this, or
resolving the issue of mismatched units (# connections vs # messages vs #
recipients). Or just calculate a conversion factor for '# messages
delivered from a connection' and multiply up, and not trouble whoever is
asking with the detail. Or perhaps I'm missing something obvious.
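
As a rough sketch of that conversion-factor idea, in Python: the function
names and all of the counts below are made up for illustration. It also
rests on a loud assumption, namely that rejected connections would have
carried messages at the same rate as accepted ones, which may well not
hold for spam sources.

```python
def messages_per_connection(accepted_connections: int, messages_received: int) -> float:
    """Average number of messages carried by each accepted connection."""
    if accepted_connections <= 0:
        raise ValueError("need at least one accepted connection")
    return messages_received / accepted_connections


def estimate_rejected_messages(rejected_connections: int, factor: float) -> float:
    """Scale rejected connections into an estimated message count.

    ASSUMPTION: rejected connections would have behaved like accepted ones.
    """
    return rejected_connections * factor


# Hypothetical figures, not taken from any real report.
factor = messages_per_connection(accepted_connections=8000, messages_received=10000)
estimate = estimate_rejected_messages(rejected_connections=40000, factor=factor)

print(f"factor: {factor} messages per connection")
print(f"estimated rejected messages: {estimate}")
```

The result is only an estimate in "message" units, so it would still want
the same rider about how it was derived.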
Jethro.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks
Computing Officer, IT Services
University Of Strathclyde, Glasgow, UK