Re: [exim] RHEL4 - Exim 4.x - cluster solution

Author: W B Hacker
To: exim users
Subject: Re: [exim] RHEL4 - Exim 4.x - cluster solution
Tom Brown wrote:
> Hi
>
> I need to design and then build a clustered setup that is scalable that
> will distribute our MTA's across 2 of our datacentres. I have 4 boxes so
> there will be 2 in each location, more should be able to be added later
> if required. I will be configuring this in a master/standby config
> rather than balancing the load between them. I am thinking about running
> a linux ha cluster on these boxes and just treating all 4 as 2 sets of 2.
>
> I guess my questions are does exim play nicely in a linux ha type
> situation and if not what other ways can be employed to maintain a ha
> cluster of mta's ?
>
> thanks
>


I cannot answer the Linux HA part. (FreeBSD here)

But w/r '...other ways', Exim is *golden* - especially w/r managing ssl/tls
certs and such.

The concept of hot & standby is sound, with or without an actual cluster.

When co-located and in the same IP block, it is dead-easy to manage two
'ordinary' boxen w/r failover & restoral, so a formal cluster has just not been
an issue in our camp.

- Syncing message store is the only real challenge, and that is not a show-stopper.

- Conventional secondary and subsequent MX are not 100% predictable as to where
inbound traffic may end up, and getting it to where it may be read by pop or
imap w/o need for the users to alter MUA settings can increase complexity, raise
box-count, and add latency. We have chosen to publish just one MX.

- It is faster and 'cleaner' to repoint BOTH smtp and pop/imap to the standby by
means of IP-takeover than by DNS changes. No MUA changes required.

- IMNSHO, maintenance of a 'prime' and 'secondary' is less work, especially when
each is really a 'prime' that can carry double and is in day-to-day service, so
you know it has not gone off - or out-of-date - while sitting on standby.

Our approach:

Each of two 'heavy' 2U servers (Tyan MB, dual Gig-E, Core-Duo CPU, 4 GB RAM,
triple RAID1 arrays) has an 'always mine' frontside IP - primarily for ssh access.

Each also has a 'public' IP which may be downed and taken over by the other box.
This is where the DNS points each 'set' of domains.

- These two (or more) IPs are aliased onto the same 'external' NIC (see the
rc.conf sketch after this list).

- On another NIC, each has an 'internal' IP on a backside LAN. Primary use is
data exchange & local storage, but it also allows ssh access from another box
if/as/when both frontside IPs are wanted offline.

- *Normally* each server handles its own separate set of virtual domains (per
what the DNS points to), ergo nothing is really 'hot' or 'standby' - just two
lightly-loaded servers, each with more than enough reserve to do the entire job
of both.
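
By way of illustration only - a FreeBSD /etc/rc.conf fragment for one box
(interface names and addresses are invented):

    # 'always mine' frontside IP - stays with this box
    ifconfig_em0="inet 203.0.113.10 netmask 255.255.255.0"
    # 'public' IP the DNS points at - may be downed and taken over by the peer
    ifconfig_em0_alias0="inet 203.0.113.20 netmask 255.255.255.255"
    # 'internal' IP on the backside LAN, second NIC
    ifconfig_em1="inet 192.168.10.1 netmask 255.255.255.0"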

The configs are identical, and both servers have each other's certs available.
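
One way to let a single Exim config serve whichever 'public' IP is currently
present is to key the certificate on the receiving interface address - a sketch,
assuming an invented per-IP directory layout:

    # main section of the Exim configure file
    tls_certificate = /usr/local/etc/exim/certs/${interface_address}/cert.pem
    tls_privatekey  = /usr/local/etc/exim/certs/${interface_address}/key.pem

$interface_address is the local address the connection arrived on, so after an
IP takeover the survivor automatically presents the 'right' cert.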

The virtual user DB (PostgreSQL in this case, but it need not be so) is also
identical, i.e. - each server has all the data and storage structures it needs
to do BOTH sets of domains and virtual users for smtp and IMAP.
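
The Exim side of that is just the standard pgsql lookup hooks - a rough sketch
with an invented table layout and placeholder password:

    # main section
    hide pgsql_servers = localhost/mail/exim/SECRET

    # in a router: stash the user's maildir path for the transport
    address_data = ${lookup pgsql{SELECT maildir FROM users \
                     WHERE localpart = '${quote_pgsql:$local_part}' \
                     AND domain = '${quote_pgsql:$domain}'}}

Since both boxes carry the full DB, the same query works no matter which server
the traffic lands on.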

The 'master' DB is on a third, 1U Via C3 single-RAID1 box, which does not have
to be on the same site (though ours are). Draws about 12W or less and lasts a
long, long time.

Changes to the user DB may be made here. If it is hors de combat, traffic is not
significantly affected, other than for new-user additions or spam-filter
preference changes.

Manual changes may be made directly to the two main boxen DB if there is a long
outage.
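
For modest churn, even a brute-force push from the master will do - a sketch
(host and DB names invented; proper replication, e.g. Slony-I, is the tidier
answer):

    # on the master DB box, cron-driven; -c emits DROPs before re-creating
    pg_dump -c mail | psql -h server-a mail
    pg_dump -c mail | psql -h server-b mail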

Day-to-day syncing:

For light loading, periodic rsync may be 'good enough' to keep both message
stores reasonably current. NB: Our users ordinarily also have local sync'ed IMAP
copies, so even if the mailstore on the servers is not current, they will still
be able to refer to older messages.
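
A minimal sketch of that periodic rsync (paths and host invented). Note rsync
alone won't merge genuinely two-way changes, so each box pushes only the
mailstore for the domains it is currently serving:

    # cron on server-a, pushing its live domains' mailstore to the peer
    rsync -az --delete /var/vmail/set-a/ server-b:/var/vmail/set-a/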

Shared external NAS with RAID storage can reduce that need, but becomes a
single-point-of-failure.

Failover:

- alias the 'public' IP of the offline box to the survivor. Make sure the
offline box does not come back on the net until you are ready for it.
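
In FreeBSD terms that takeover is one command on the survivor (interface and
address invented, per the rc.conf sketch above):

    # bring up the failed box's 'public' IP as an extra alias
    ifconfig em0 inet 203.0.113.21 netmask 255.255.255.255 alias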

Recovery:

- drop that alias when the offline box is ready to go back to work. It may be
tested on the 'private' IP before the 'public' IP is turned back on.
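
And the reverse when handing the work back, after smoke-testing the repaired box
over the backside LAN (again, invented addresses):

    # on the survivor: drop the borrowed alias
    ifconfig em0 inet 203.0.113.21 -alias

    # from elsewhere on the backside LAN: test the repaired box first, e.g.
    telnet 192.168.10.2 25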

(we allow ip_literal for postmaster and such ...)

All else is already in-place.

Given the reliability of modern hardware and RAID (a hard failure about once
every 3 to 5 years), this seems to make better use of the resources before they
go obsolete while on standby, and has allowed us to cut our server-count and UPS
power budget roughly in half.

One of the major driving factors was to not have any significant interruption in
IMAP access (we do not offer pop).

IOW - transparency, and de facto redundancy, but with minimal 'idle' investment.

I don't know if a Linux HA cluster would make this simpler - or more complex -
that probably depends on the expertise of the implementor. But at least I can
say that a cluster is not mandatory [1].

HTH,

Bill

[1] I've 'presumed' a Linux environment that need not be taken offline for
application of upgrades/patches, i.e. - is normally run through a 45-60 second
reboot only after an at-most quarterly or semi-annual 'make buildworld/kernel'
cycle - or whatever the Linux equivalent is.

Exim and other upgrades and patches are done without going offline; the listener
daemon is re-HUPped in a few seconds.
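
For instance (pid-file path varies per install; this assumes a common default):

    # after installing the new binary / config
    kill -HUP `cat /var/spool/exim/exim-daemon.pid`

On SIGHUP the Exim daemon re-execs itself, picking up the new binary and
configuration without dropping the listener for more than an instant.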