Re: Locking and concurrency -- system considerations

Author: Jon Morby
Date:  
To: Piete Brooks
CC: exim-users
Subject: Re: Locking and concurrency -- system considerations
> > In the environment I envisage running Exim shared spools are essential.
>
> Hmm -- interesting to see how others think !
>
> NFS is fine for user filespaces, but it seems odd to me to have NFS mounted
> (I assume that's how you are sharing) spool dirs ...
>
> > we have several machines (relay-1 -> relay-15 currently) all of which share
> > one of two spools.
>
> On two fileservers ?
> What if one of the FS dies ? you lose 8 relays ??


If the FS dies we swap the head.

It's a dedicated NFS server with a RAID backend. Performance is pretty good
(several thousand NFS ops per second), and it beats anything and everything in
the LADDIS benchmarks.

>
> > Being restricted to one spool per delivery client would be a restriction I
> > wouldn't like to have to live with.
>
> Interesting ...
>
> > (What happens when one of your relays goes down, and you've got a couple of
> > thousand messages in its queue?)
>
> You move the disks to a spare CPU, and off you go !


Well yes, if the CPU were to go we have spares, and if we can't get those
working we plonk a completely new head on the disks.

>
> What happens to you if one of your two file servers die ??
>
> > The mail gets delayed until you can bring another box up and mount that
> > spool.
>
> Yup -- what about your fileservers ?


Well, we can live with a short outage (short enough to swap the heads). Disk
failures are much less of a problem than they were because the disks are RAIDed.

And in a perfect world they would be mirrored, so if one completely died (or a
bomb went off) we'd just mount the other half of the spool and carry on from
the other facilities center.

>
> > Unacceptable in today's market with consumers rather than techies coming
> > onto the Net.
>
> It depends what kind of support you have ...
> Assuming you have "outside hours" cover, someone just replugs the disks,
> and email is delayed for 5 mins or so.


24 hour manned FC.

> Is that *really* unacceptable ?


A 5 minute (or even 1 hour) delay is probably fine; 12 hours or so wouldn't
be, nor would losing all the mail.

> What is the MTBF for your servers ?


Well (touch wood) we've had various versions of them for over 12 months now
and none of them has ever failed, and that's a hell of a lot of disks.
Previously we could guarantee that at least one disk per array would fail each
week, and because we were using local disks and software RAID/mirroring the
machine was down for 12 hours while we rebuilt the disk array.

But then we had 7 or 8 separate servers all with their own disk arrays. It
wasn't ideal.

We're handling something like 10GB of mail (about 650,000 messages) on a
quiet day, and we're expecting a considerable increase in volumes over the
next 6 - 8 months.
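
For a rough sense of scale, those quiet-day figures can be turned into
average rates. This is only a back-of-envelope sketch: it assumes the 10
gigabyte figure means 10 * 10**9 bytes and that load is spread evenly over 24
hours, neither of which is stated above, and the function name is made up for
illustration.

```python
# Back-of-envelope averages from the quiet-day volumes quoted above.
# Assumption: the 10 gigabyte figure means 10 * 10**9 bytes.
# Assumption: traffic is spread evenly across the day (real peaks are higher).
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(total_bytes, messages):
    """Return (average message size in bytes, messages/sec, bytes/sec)."""
    avg_size = total_bytes / messages
    msg_rate = messages / SECONDS_PER_DAY
    byte_rate = total_bytes / SECONDS_PER_DAY
    return avg_size, msg_rate, byte_rate


avg_size, msg_rate, byte_rate = daily_load(10 * 10**9, 650_000)
print(f"average message size: {avg_size / 1000:.1f} kB")
print(f"average arrival rate: {msg_rate:.1f} messages/s")
print(f"average data rate:    {byte_rate / 1000:.0f} kB/s")
```

These are daily averages only; busy-hour rates, and the number of NFS
operations each message costs, would be considerably higher.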


>
>
> I'm not trying to knock the way you do things (I do things in *really* weird
> ways !), just trying to see if your configuration is really optimal for you,
> or whether some re-arrangement might improve your throughput and resilience.
> [ e.g. are your RAID boxes dual ported ? Do you need to replug, or can you
> just do a mount when the "other" server is noticed to be down ?
> ]
>


We've tried various methods, and this seems to be more reliable/stable than any
of the others (I've not had to stay up all night nursing backlogs since we
installed it, and previously I'd been known to put in a 36 hour day when things
went wrong - which they did frequently).

I'm still looking for the optimum solution (aren't we all) but at least I'm
getting a good night's sleep again :)
-- 
Jon Morby                                  mail: jon@???
Fidonet/Internet Gateway                   http: www.fido.net




--
* This is sent by the exim-users mailing list.  To unsubscribe send a
    mail with subject "unsubscribe" to exim-users-request@???
* Exim information can be found at http://www.exim.org/