Author: Bill Hacker Date: To: exim-users Subject: Re: [exim] Re: [exiscanusers] Exiscan for Exim-4.50?
Ian Eiloart wrote:
*SNIP*
> 2. Exim stops after a fixed period if it doesn't have access to a
> specific IP address. I'd like to be able to specify the period in Exim's
> config, and even to specify retry intervals.
>
> I have a cluster of SMTP servers, using IPFailover on MacOSX. On
> startup, the host machine has to recover its IP address from a failover
> host. Typically exim starts within a few seconds of IP failback, but
> finds that its ports are not available. I presume it is doing some sort
> of negotiation with the switch, but I don't really understand what's
> going on here. What I do know is that exim doesn't retry for 30 seconds
> - so I get an interval of up to 40 seconds total where the IP address
> doesn't provide an SMTP service.
>
> If I could tell exim to retry every second for a couple of minutes, then
> I think the dead period would be roughly halved. There's a dead period
> during failover of about three seconds.
>
> More seriously, if the timeout occurs, and I fail back to a machine
> without exim running, then I'm in big trouble. Therefore, I'd like to be
> able to have exim keep retrying essentially forever. If I could
> specify retry intervals using the same syntax as for smtp retries, that
> would be excellent. It might be more flexibility than I need, but it
> might just be easier to reuse the code, and it might be better to stick
> with an existing syntax.
>
Have a look at 'monitord' and 'checkservice'.
When the failover 'trips', monitord could run a script to
validate that the fallback host is viable, and then reload Exim.
'checkservice' can do more than just test the port. See its 'plugins'.
The same applies on the recovery side, or for a secondary fallback.
All of it can be logged, or even (with another toolset) sent to you as an SMS message.
Both 'checkservice' and 'monitord' are really simple to configure and
can be as flexible as your imagination allows.
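
As a rough illustration of the kind of helper script such a monitor could run (this is not taken from monitord or checkservice; the names can_bind and wait_until_bindable, and the default one-second interval and two-minute limit, are assumptions made up for this sketch), the following keeps probing until the failed-back address can be bound and only then runs whatever start command it was given:

```python
import socket
import subprocess
import sys
import time


def can_bind(host, port):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Most daemons set SO_REUSEADDR before binding, so the probe does too.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
        except OSError:
            # Address not configured on this host yet, or port still in use.
            return False
    return True


def wait_until_bindable(host, port, interval=1.0, max_wait=120.0):
    """Probe every `interval` seconds; give up once `max_wait` seconds pass."""
    deadline = time.monotonic() + max_wait
    while True:
        if can_bind(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical usage: wait_bind.py <ip> <port> <start command ...>
    host, port, command = sys.argv[1], int(sys.argv[2]), sys.argv[3:]
    if wait_until_bindable(host, port):
        # The start command is supplied by the caller; none is assumed here.
        subprocess.run(command, check=True)
    else:
        sys.exit("address never became bindable")
```

Probing port 25 normally needs root, and the probe releases the socket before the MTA is started, so there is a small window in which something else could grab the port. A very large max_wait gives the "keep retrying essentially forever" behaviour.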
I don't doubt that a way can be found to have Exim do complex
conditional branching - but it may not be the most robust way to do it.
After all, it is the failure of another MTA that you are covering, is it
not?