Author: Jonathan Knight Date: To: exim-dev Subject: Re: [exim-dev] Old topic again: Option to avoid fsync()?
Michael Haardt wrote: > On Wed, Jan 10, 2007 at 04:26:29PM +0100, Florian Weimer wrote:
>
>> If you want to throw money at the problem, a RAID controller with a
>> battery-backed cache is a good option as well.
>>
>
> You completely miss the point, so let me rephrase it: I am _not_
> talking about regular operation. I am talking about cleaning up a mess,
> e.g. after an attack or double/triple fault that managed to kill all
> redundancy. Additionally, exotic applications benefit from disabling
> fsync().
>
> It's not economical to run systems at 10% of their maximum performance
> just to have enough if shit happens, unless of course you just run a
> small site, where the economic disadvantage of doing so can be tolerated.
>
Errrrrr. I am somewhat concerned about your last statement. I run the
mail system for the University here, which isn't really a big site, but
we see over a million attempts to deliver mail a day, which translates
into about 46,000 real mail messages after greylisting.
We have internal mail servers which accept email from local users and
handle all internal communications and we have a pair of external mail
servers which talk to the outside world. Our mail servers are running
at a fraction of their capacity just because bad things happen too often.
All it takes is some annoying spammer out on the internet to use one of
our users as a fake "From" address and we will see hundreds of thousands
of error messages heading our way.
We've also seen cases of a trojan getting on a local user's PC, which has
then sent hundreds of thousands of email messages off site.
We've also had cases where our ISP, or the firewalls, or some other
system admin type mistake has taken us down for a weekend, which means we
get three days of email on Monday.
So we do always plan for the unexpected, and even though a mess happens
several times a year, I don't need to do anything to fix it.
I have tried to run a mail system in the way that you are trying to and
I'm very happy that we have the resources here to run ours with lots of
spare capacity because it makes my life simpler.
Having said that, here's what I used to do.
1. Find a way to stop whatever was generating the mess.
2. Move the input queue out of the way and restart exim.
That at least gets you to the point where current email is flowing.
3. Move the valid email back into the real queue.
Easier said than done, but judicious use of "grep" on the header files
usually results in a short list of real email, and then it's just a case
of moving the header and data files back into the normal queue space.
4. Delete the old queue that is now full of junk.
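For what it's worth, step 3 could be scripted roughly like this. It is
only a sketch: it assumes the usual spool layout where each message is a
pair of files named <id>-H (headers) and <id>-D (data) in one flat
directory (no split spool), that nothing is working on the set-aside
queue while it runs, and that a single regular expression is enough to
pick out the real mail. The function name, paths and pattern are made up
for illustration.

```python
import re
import shutil
from pathlib import Path


def restore_matching(old_queue, live_queue, pattern):
    """Move -H/-D pairs whose header file matches `pattern` into live_queue.

    Returns the list of message ids that were moved.
    """
    old_queue, live_queue = Path(old_queue), Path(live_queue)
    wanted = re.compile(pattern)
    moved = []
    for header in sorted(old_queue.glob("*-H")):
        msg_id = header.name[:-2]            # strip the trailing "-H"
        data = header.with_name(msg_id + "-D")
        if not data.exists():
            continue                         # incomplete pair, leave it alone
        text = header.read_text(errors="replace")
        if wanted.search(text):
            # Move the data file first so the live queue never holds
            # a header file without its matching data file.
            shutil.move(str(data), str(live_queue / data.name))
            shutil.move(str(header), str(live_queue / header.name))
            moved.append(msg_id)
    return moved
```

Something like restore_matching("/var/spool/exim/input.old",
"/var/spool/exim/input", r"@example\.ac\.uk") would then pull back
anything mentioning the (hypothetical) local domain in its header file.
Check the list it returns before deleting the old queue in step 4.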
If you have more than one mail server then you could take the queue onto
another system and run it there rather than slowing down the main
server. In theory you could move it onto a tmpfs filesystem and perform
an exim queue run specifically on that input queue to avoid the fsync()
delays.
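
A rough sketch of that last idea, assuming you have written a second
configuration file whose spool_directory points at the copy of the queue
on tmpfs. The config path below is hypothetical, and depending on how
your exim was built, -C may only be honoured for root or for config
files in a trusted location, so treat this as untested.

```python
import subprocess


def queue_run_command(alt_config, exim_binary="exim"):
    """Build the argv for one queue run using an alternate config file."""
    # -C selects an alternate configuration file; -q starts a single queue run.
    return [exim_binary, "-C", str(alt_config), "-q"]


def run_side_queue(alt_config, exim_binary="exim"):
    """Run one pass over the side queue described by alt_config."""
    # check=True raises CalledProcessError if exim exits non-zero.
    subprocess.run(queue_run_command(alt_config, exim_binary), check=True)
```

For example, run_side_queue("/etc/exim/tmpfs-queue.conf") would make one
pass over the side queue without touching the main server's spool.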