Re: [Exim] message expiration

Author: Nico Erfurth
Date:
To: Brian Bennett
CC: exim-users@exim.org
Subject: Re: [Exim] message expiration

On Sun, 27 Oct 2002, Brian Bennett wrote:

Please reply to the list, not personally to me.

> I'm up to 2600 messages and 300 exim processes. Most of them are:
> /bin/sh -c if [ -x /usr/sbin/exim -a -f /etc/exim/exim.conf ]; then
> /usr/sbin/exim -q ; fi

This is started via cron.

> The rest are:
> /usr/sbin/exim -q

These are the queue runners themselves.

> There's a few that are:
> /usr/sbin/exim -Mc 185oGZ-0004rS-00 (but with different mID's.)

These are deliveries.

> > 25 days is too much, are those messages all frozen?
>
> No, nothing is marked frozen. I don't know what I did but somehow
> nothing gets marked frozen anymore. It happened after I messed with
> the retry options. I think that somehow all messages are retried
> forever.
>
> > try to set these options (exim4)
> > ignore_errmsg_errors_after = 3d <--- this deletes undeliverable bounces
> >                                      after 3 days
>
> I already have:
> ignore_errmsg_errors = true
>
> > timeout_frozen_after = 7d <--- this removes frozen messages older
> >                                than 7 days from your queue, i'm not
> >                                sure if they are bounced or removed
>
> I already have:
> timeout_frozen_after = 24h
>
> > split_spool_directory <--- Reduces the load a little bit
> > auto_thaw = 12h       <--- to retry frozen messages after 12h
>
> Nothing is frozen :-(
>
> > > Instead those messages generate errors and cause the original message
> > > to sit in the queue attempting re-delivery. The bounce message also
> >
> > Please show us some log-entries and/or relevant parts of a debug-run.
>
> Is this what you mean?
>
> galaxy:/var/spool/exim/msglog# cat 185o2X-0003fA-00
> 2002-10-27 08:05:27 ckly-return-199@??? R=lookuphost defer (-1): host
>   lookup did not complete
> 2002-10-27 08:08:11 ckly-return-199@??? R=lookuphost defer (-1): host
>   lookup did not complete

You can use sender verification to prevent such things.

> galaxy:/var/spool/exim/msglog# cat 185o18-0003WZ-00
> 2002-10-27 08:03:50 june@??? T=local_delivery defer (-23): mailbox is full
>   (MTA-imposed quota exceeded while writing to file /var/spool/mail/june)
> 2002-10-27 08:08:41 june@??? T=local_delivery defer (-23): mailbox is full
>   (MTA-imposed quota exceeded while writing to file /var/spool/mail/june)
> galaxy:/var/spool/exim/msglog#
>
> > Why should it never happen?
>
> because I don't care about them at this point, I just need to get
> anything and everything out of the queue.

See below.

> > > 4. After 24 hours, expire messages and delete them entirely.
> >
> > They will be bounced, not deleted!
>
> What is supposed to happen when a message gets bounced? From what I can
> tell on my system, it apparently generates a bounce message and continues
> to attempt delivery on the original message. Then I have 2 messages in
> the spool, causing greater problems.

Deleting means the message just disappears. When you bounce a message,
the original message is removed from the queue and a bounce is sent to
the sender.
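
To make the difference concrete, here is a small sketch that only prints
the command for one message instead of running it. It assumes the binary
lives at /usr/sbin/exim (as in the cron line quoted above) and that your
version supports -Mg (give up on the message and send a bounce) and -Mrm
(remove it without a bounce); please check the man page of your version
first. The helper name dispose_cmd is made up for this example.

```shell
# Assumption: path to the exim binary, taken from the quoted cron line.
EXIM=${EXIM:-/usr/sbin/exim}

# Print (do not run) the command that disposes of one queued message.
#   bounce: -Mg  gives up on the message and sends a bounce to the sender
#   delete: -Mrm removes the message from the queue, no bounce is sent
dispose_cmd() {
    case "$1" in
        bounce) printf '%s -Mg %s\n'  "$EXIM" "$2" ;;
        delete) printf '%s -Mrm %s\n' "$EXIM" "$2" ;;
        *) echo "usage: dispose_cmd bounce|delete MESSAGE-ID" >&2
           return 2 ;;
    esac
}
```

For example, dispose_cmd bounce 17wS4L-0007gJ-00 prints the -Mg command
for the oldest message shown further down. Run the printed command as an
admin user only once you are sure which of the two you want.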

> > > 5. Above all, get it out of the queue at all costs!!!
> >
> > Test the above settings first, and do
> > exim -qff to enforce a queue-run for ALL messages.
> >
> > > In my exim.conf I have this in the RETRY CONFIGURATION section:
> > > # Domain Error Retries
> > > # ------ ----- -------
> > >
> > > * quota

> >
> > That's ok.
>
> But it doesn't work. Quota messages are retried for an indefinite
> period of time. I meant to say "quota errors should cause to fail
> fatally and permanently, by the way, delete that from the queue" but
> instead it seems to mean "keep trying to deliver quota messages
> until hell freezes over"

See below.

> > > * refused
> > > * timeout
> >
> > Both of these are BAD!
> > You should not use them, on today's internet you still get
> > delivery problems, hotmail comes to mind, or even mail.exim.org was
> > unreachable for 24h last week.
>
> At this point I don't care, I want my queue empty. If I see an empty
> queue then I may change things.
>
> > > I'm using exim 3.35-1 on Debian 3.0 (woody). Please please please
> > > someone help me. As far as I can tell everything I've done is right,
> > > but exim just doesn't seem to care.
> >
> > How is your queue runner started?
> > IIRC Debian starts it from crontab by default, which options does it use?
>
> I'm not exactly sure how my queue runs get started but incoming
> messages are handled by inetd.

You SHOULD run exim as a daemon. On Debian, incoming mail is accepted via
inetd by default, and queue runs are started by a cron script.

You can run exim -bd -q5m to start an exim daemon that will do a
(controlled) queue run every five minutes.

> > If you still have many OLD messages in your queue, provide parts of your
> > mainlog and the relevant parts of a debug-run (exim -d9 -Mc MESSAGEID).
>
> OK....Here's the output from my oldest message:
>
> Exim version 3.35 debug level 9 uid=0 gid=0
> Berkeley DB: Sleepycat Software: Berkeley DB 2.7.7: (08/20/99)
> galaxy.orcacom.net in local_domains? yes (matched galaxy.orcacom.net)
> Unable to create IPv6 socket to find interface addresses:
> error 97 Address family not supported by protocol
> Trying for an IPv4 socket
> Actual local interface address is 127.0.0.1 (lo)
> Actual local interface address is 65.166.175.2 (eth0)
> Actual local interface address is 65.166.175.3 (eth0:0)
> Caller is an admin user
> Caller is a trusted user
> set_process_info: 23066 delivering specified messages
> delivering message 17wS4L-0007gJ-00
> set_process_info: 23066 delivering 17wS4L-0007gJ-00
> Opened spool file 17wS4L-0007gJ-00-H
> user=mail uid=8 gid=8 sender=info@???
> sender_fullhost = ne-dsl-166-248.accessus.net (accessus) [209.145.166.248]
> sender_rcvhost = ne-dsl-166-248.accessus.net ([209.145.166.248] helo=accessus)
> sender_local=0 resent=no ident=unset
> Non-recipients:
> -->medirep@??? [0]
> ---- End of tree ----
> recipients_count=3
> body_linecount=556 message_linecount=17
> Delivery address list:
> John.Nance@???
> Todd.Snell@???
> locked /var/spool/exim/db/retry.lockfile
> LOG: 0 MAIN
> failed to open DB file /var/spool/exim/db/retry: Invalid argument

DING
DING
DING

;)
You have some kind of problem here: either the retry database or the db
directory has wrong permissions. Please check them. This COULD solve your
problems, but I'm not sure.
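
A minimal sketch of such a check, assuming the hints directory is
/var/spool/exim/db as in the log above. Which user should own it depends
on how exim was built and packaged, so the owner argument is an assumption
you have to fill in yourself; check_hints_owner is just a name made up for
this example.

```shell
# List everything under a hints directory that is NOT owned by the
# expected user. Empty output means every entry has the expected owner.
check_hints_owner() {
    dir=$1      # e.g. /var/spool/exim/db (path taken from the log above)
    owner=$2    # assumption: the user exim runs as on your system
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 2; }
    find "$dir" ! -user "$owner" -print
}
```

Something like check_hints_owner /var/spool/exim/db mail, followed by
ls -la on the same directory to see the modes, should show quickly whether
ownership or permissions are off.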

> checking status of beersskanska.com
> locked /var/spool/exim/db/retry.lockfile
> LOG: 0 MAIN
> failed to open DB file /var/spool/exim/db/retry: Invalid argument

DING again.

> added retry item for T:beersskanska.com:64.225.154.175: errno=110 65 flags=2
> all IP addresses skipped or deferred at least one address
> locked /var/spool/exim/db/wait-remote_smtp.lockfile
> LOG: 0 MAIN
> failed to open DB file /var/spool/exim/db/wait-remote_smtp: Invalid argument

...

> locked /var/spool/exim/db/retry.lockfile
> LOG: 0 MAIN
> failed to open DB file /var/spool/exim/db/retry: Invalid argument

....

> Well...What do you think?

That you should check the permissions on this directory and these files.
You can safely delete the files to test again.
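
If you would rather keep a copy than delete outright, here is a sketch
that moves the hints files aside instead. The helper name and the backup
location are made up for this example, and I assume no queue runner or
delivery process is running while you do it.

```shell
# Move all regular files out of a hints directory into a backup directory.
# Exim is expected to recreate the hints files on the next delivery attempt.
move_hints_aside() {
    dir=$1       # e.g. /var/spool/exim/db (path taken from the log above)
    backup=$2    # assumption: any directory outside the spool
    mkdir -p "$backup" || return 1
    for f in "$dir"/*; do
        # Skips subdirectories, and the unexpanded glob if dir is empty.
        [ -f "$f" ] || continue
        mv "$f" "$backup"/ || return 1
    done
}
```

For example: move_hints_aside /var/spool/exim/db /root/exim-db-old, then
repeat the debug run and see whether the "failed to open DB file" lines
are gone.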

ciao