[exim] Deferred mail not getting processed by queue runners

Top Page
Delete this message
Reply to this message
Author: Dominic Hargreaves
Date:  
To: exim-users
Subject: [exim] Deferred mail not getting processed by queue runners
I have a recurring problem where mail to a certain domain takes a long
time to process. What happens is that the initial delivery to the domain
fails with a temporary SMTP error,

I hope I've included the right amount of information below.

The result of exim4 -brt on the domain is:

Retry rule: * * F,2h,15m; G,16h,1h,1.5; F,4d,6h;

The initial log is:

2006-06-19 14:55:50 1FsKEI-0005ue-L2 <= dom@??? U=dom P=local S=1205 id=20060619135550.GL22869@???
2006-06-19 14:55:51 1FsKEI-0005ue-L2 => thisrecipientworked@??? R=dnslookup T=remote_smtp H=mk-cpfront-9.mail.uk.tiscali.com [212.74.114.41]
2006-06-19 14:55:53 1FsKEI-0005ue-L2 == problemaddress@??? R=dnslookup T=remote_smtp defer (-44): SMTP error from remote mailer after RCPT TO:<problemaddress@???>: host inbound.oxfordbachchoir.org.emailmx.com [216.40.36.30]: 451 Exhausted MX records for domain oxfordbachchoir.org

Then, nothing happens in the main log with this message until
16:12:16, even though the retry rules, together with the daemons
running with -bd -q30m surely means that the message should have had
another delivery attempt earlier. The same deferal happens at 16:12.

I then, at 16:36, try an exim4 -q -d process. The relevant debug output
is here:

set_process_info: 9351 running queue: 1FsKEI-0005ue-L2-H
set_process_info: 15840 delivering 1FsKEI-0005ue-L2 (queue run pid 9351)
reading spool file 1FsKEI-0005ue-L2-H
user=dom uid=1002 gid=1002 sender=dom@???
sender_local=1 ident=dom
Non-recipients:
-->thisrecipientworked@??? [0]
---- End of tree ----
recipients_count=2
body_linecount=19 message_linecount=15
Delivery address list:
problemaddress@???
locking /var/spool/exim4/db/retry.lockfile
locked /var/spool/exim4/db/retry.lockfile
opened hints database /var/spool/exim4/db/retry: flags=0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Considering: problemaddress@???
unique = problemaddress@???
dbfn_read: key=R:example.com
dbfn_read: key=R:problemaddress@???
no domain retry record
post-process problemaddress@??? (1)
LOG: retry_defer MAIN
== problemaddress@??? routing defer (-51): retry time not reached
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

After routing:
  Local deliveries:
  Remote deliveries:
  Failed addresses:
  Deferred addresses:
    problemaddress@???
search_tidyup called

>>>>>>>>>>>>>>>> deliveries are done >>>>>>>>>>>>>>>>

changed uid/gid: post-delivery tidying
uid=113 gid=113 pid=15840
auxiliary group list: <none>
set_process_info: 15840 tidying up after delivering 1FsKEI-0005ue-L2
Processing retry items
Succeeded addresses:
Failed addresses:
Deferred addresses:
problemaddress@???: no retry items
end of retry processing
delivery deferred: update_spool=0 header_rewritten=0
end delivery of 1FsKEI-0005ue-L2
search_tidyup called
set_process_info: 9351 running queue: waiting for 1FsKEI-0005ue-L2 (15840)

All of which isn't particularly illuminating to the untrained eye,
except to say that it raises the question of why this process did not
try to retry the message (no other queue-runner touched this message
between 16:12:16 and now). In particular "no domain retry record".

So, what would the retry rule apparently be missing from this host? In
the time frame I discuss, no other deliveries or delivery attempts were
made to the problem host or domain. Are there any clues or other
debugging hints I've missed?

I've only noticed this problem to this particular domain
but that could be a coincidence (an unreliable destination and an
address I email quite frequently).

Thanks,

Dominic.

--
Dominic Hargreaves | http://www.larted.org.uk/~dom/
PGP key 5178E2A5 from the.earth.li (keyserver,web,email)