Re: [exim] Exim4 zombie processes are not killed

Author: Alexander Nagel
Date:  
To: exim-users
CC: exim-users
Subject: Re: [exim] Exim4 zombie processes are not killed

>
> On 2008-03-12 at 15:46 +0100, Alexander Nagel wrote:
> > My problem is that I have quite a large number of exim4 processes in
> > state ZOMBIE during mail delivery. Sometimes, for reasons I don't
> > know, the ZOMBIE processes are not getting killed and they keep
> > accumulating until Nagios alerts.
>
> A zombie process is already dead. The only state which still exists is
> an entry in the process table (and some other kernel house-keeping
> structures) which records the state of the process, etc. A zombie
> process is, simply, a dead process which the parent hasn't reaped.
>
> There is one way, and only one way, for a zombie process to go away: its
> parent process reaps it. However, if you kill the parent process then
> the zombie, like all that process's children, gets re-parented to be a
> child of init (pid 1) which always reaps all its children. Thus, the
> parent process reaps it (but it's a different parent).
>
> Exim's model is to have a process handle each delivery, so if that
> process has zombie children then the parent is stuck waiting on
> something else. In your case, the spamc process is typically running.
> When the spamc process exits, Exim cleans up.
>
> The existence of the zombie processes is not your problem. The presence
> of other processes which are hanging and which Exim is waiting on is a
> possible problem and the zombie processes are a symptom of that, a
> side-effect.
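
The point above (a zombie is already dead, and only a wait by its parent removes it) can be reproduced with a small sketch. This is not Exim code; it assumes Linux with /proc mounted, and the helper names process_state and demo_zombie are made up for illustration.

```python
import os
import signal
import time


def process_state(pid):
    """Return the one-letter state field from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state is the first field after the ")" that closes the command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once, the parent has not called wait yet

    # Parent: poll until the kernel reports the exited child as a zombie ("Z").
    deadline = time.monotonic() + 5
    while process_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    state_before = process_state(pid)

    # Signalling a zombie changes nothing: the process is already dead.
    os.kill(pid, signal.SIGKILL)
    state_after_kill = process_state(pid)

    # Reaping with waitpid is what removes the process table entry.
    os.waitpid(pid, 0)
    gone = not os.path.exists(f"/proc/{pid}")
    return state_before, state_after_kill, gone


if __name__ == "__main__":
    print(demo_zombie())
```

The child stays in state Z even after SIGKILL is sent to it, and disappears only once the parent calls waitpid.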

I know what zombie processes are, but the problem is that the zombies are
not getting cleaned up by init (pid 1). They are still there and I can kill
them only with -9. I always try -15 first to prevent data loss ;-)
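
To illustrate why -15 is the safer first attempt, here is a minimal sketch of the difference, assuming a Unix system. It is not how Exim is implemented: the child below is a stand-in process, the function name terminate_child is hypothetical, and writing "cleaned" to a pipe stands in for whatever real cleanup a daemon would do.

```python
import os
import signal


def terminate_child(sig):
    """Fork a child that cleans up on SIGTERM, send it `sig`, report the outcome."""
    ready_r, ready_w = os.pipe()
    done_r, done_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(ready_r)
            os.close(done_r)

            def on_term(signum, frame):
                os.write(done_w, b"cleaned")  # stand-in for real cleanup work
                os._exit(0)

            signal.signal(signal.SIGTERM, on_term)
            os.write(ready_w, b"1")  # tell the parent the handler is installed
            while True:
                signal.pause()
        finally:
            os._exit(1)  # never fall back into the parent's code path

    os.close(ready_w)
    os.close(done_w)
    os.read(ready_r, 1)  # wait until the child is ready
    os.kill(pid, sig)
    _, status = os.waitpid(pid, 0)  # reap, so no zombie is left behind
    cleaned = os.read(done_r, 16) == b"cleaned"
    os.close(ready_r)
    os.close(done_r)
    return cleaned, os.WIFSIGNALED(status)


if __name__ == "__main__":
    print("SIGTERM:", terminate_child(signal.SIGTERM))
    print("SIGKILL:", terminate_child(signal.SIGKILL))
```

With SIGTERM the handler runs and the child exits on its own terms; SIGKILL cannot be caught, so the cleanup step never happens.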

>
> There is no need to kill the exim processes with -9 (SIGKILL), since
> then Exim doesn't have a chance to clean up after itself and you
> potentially risk leaving corrupted DB files around. If you really have
> to kill Exim, is it really true that SIGTERM (-15, the default) doesn't
> work for you?
>
> > Debian-  16817  0.0  0.0  3264  768 ?        S    15:27   0:00      \_ /usr/bin/spamc -t 10 -u XXXXXXXXXX

>
> spamc should be timing out after 10 seconds (-t 10). If it's not then
> there's the problem. Figure out why spamassassin is hanging.

I will check that.
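
As a rough way to test whether a filter command honours its deadline, a wrapper along these lines could be used. This is only a sketch: run_with_deadline is a hypothetical helper, and the command in the usage example is a stand-in for a hanging filter, not a real spamc invocation.

```python
import subprocess
import sys


def run_with_deadline(argv, seconds):
    """Run argv; return its exit code, or None if it exceeded the deadline.

    On timeout, subprocess.run kills the child and waits for it, so the
    child is reaped and does not linger as a zombie.
    """
    try:
        completed = subprocess.run(
            argv,
            timeout=seconds,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a hanging filter: a child that sleeps past the deadline.
    hanging = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_deadline(hanging, 0.5))
```

A result of None means the command had to be killed from outside, which is the situation worth investigating.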

Thanks for your answer
Alex
>
> -Phil
>
> --
> ## List details at http://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

