[Exim] Mail lost when Network down

Author: John Boyle
Date:  
To: exim-users
Subject: [Exim] Mail lost when Network down
Hello

As postmaster I've inherited the Unix part of the following:

Our mail server is a SPARC 20 running Solaris 2.6 (called kyle1).
Our MTA is EXIM 3.16.

Our clients are mostly NT workstations getting services from an
NT4 server (called staffmail) which relays smtp mail through the
unix box. Local mail doesn't leave the NT server.

There are also some local mailboxes held on the unix box. These
are popped periodically by our antiquated Windows 3.11 users.

The MTA on the NT server is Mercury/32, v3.21c.

The problem occurs when our link to the outside world goes down.
Mail from the NT box that needs to be relayed by Exim to the 3.11
users gets dropped. Nothing gets queued - anywhere.


Errors in the Exim mainlog are like this:

2001-09-14 15:33:43 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8]
2001-09-14 15:33:43 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8] lost (error: Connection reset by peer)
2001-09-14 15:35:15 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8]
2001-09-14 15:35:15 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8] lost (error: Connection reset by peer)
2001-09-14 15:36:48 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8]
2001-09-14 15:36:48 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8] lost (error: Connection reset by peer)
2001-09-14 15:38:20 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8]
2001-09-14 15:38:20 SMTP connection from staffmail.au.sac.ac.uk
[194.83.148.8] lost (error: Connection reset by peer)

You get the idea.
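
To put a number on these drops, here is a minimal Python sketch that
counts "lost" connections per host. It assumes the mainlog format is
exactly as quoted above, and that an entry may be wrapped onto a second
line as it is in this message. The names unwrap and count_lost are
hypothetical helpers, not part of Exim.

```python
import re
from collections import Counter

# Assumed entry shape, taken from the excerpt quoted above:
# "<date> <time> SMTP connection from <host> [<ip>] lost (error: <text>)"
LOST_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) SMTP connection from (\S+)\s+"
    r"\[([0-9.]+)\] lost \(error: (.+)\)$"
)
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def unwrap(lines):
    """Join continuation lines (no leading timestamp) onto the previous entry."""
    entry = None
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP_RE.match(line):
            if entry is not None:
                yield entry
            entry = line
        elif entry is not None and line.strip():
            entry += " " + line.strip()
    if entry is not None:
        yield entry


def count_lost(lines):
    """Count lost connections, keyed by (host, ip, error text)."""
    counts = Counter()
    for entry in unwrap(lines):
        match = LOST_RE.match(entry)
        if match:
            _, host, ip, error = match.groups()
            counts[(host, ip, error)] += 1
    return counts
```

It could be run as count_lost(open(path)) where path points at the
mainlog; the location of that file depends on the local Exim build.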

Nothing in paniclog.

MercuryC SMTP module log entries aren't much more helpful.
The outgoing general log shows lots of these for the period of
disconnection (I didn't have the opportunity to enable session logging
so it's a bit sparse):

E 20010914 152351 187 Connection error during handshake with
kyle1.au.sac.ac.uk.
E 20010914 152351 187 Error connecting to kyle1.au.sac.ac.uk.

DNS looks to be ok. Kyle1 is our master DNS box as well and
nslookups of staffmail are ok. I think this is a red herring,
personally.
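
One way to back up the DNS check with timings is to measure forward
and reverse lookups from the mail server while the link is down. The
Python sketch below does that under stated assumptions: it uses only
the standard socket resolver calls, the names timed_lookup and
check_host are hypothetical, and it makes no claim about which lookups
Exim itself performs.

```python
import socket
import time


def timed_lookup(resolve, name):
    """Call resolve(name); return (result or None, error or None, seconds)."""
    start = time.monotonic()
    try:
        result, error = resolve(name), None
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result, error = None, exc
    return result, error, time.monotonic() - start


def check_host(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Time a forward lookup and, if it succeeds, a reverse lookup of the address."""
    report = {"forward": timed_lookup(forward, hostname)}
    address = report["forward"][0]
    if address is not None:
        report["reverse"] = timed_lookup(reverse, address)
    return report
```

For example, check_host("staffmail.au.sac.ac.uk") run on kyle1 with the
link up and again with it down would show whether either lookup fails
or becomes slow. The forward and reverse parameters are there only so
the resolver can be swapped out for testing.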

Responsibility for these servers is split between the solaris/exim
admin (me) and the NT/mercury admin (someone else). I have been
assured that this did not happen in the "good old days" of sendmail
- no comment :-&

Nothing in the archive (so appears to be our unique problem) or
generally on the net that describes this so far. Haven't stopped
looking though.

Any help/guidance will be appreciated.

regards

j





John Boyle    J.Boyle@???
Information Technology Support & Development Dept.
SAC, Cronin Building, Auchincruive, AYR KA6 5HW, Scotland
Tel 01292 525144   Fax 01292 525211