[ On Tuesday, November 16, 2004 at 21:36:32 (+0100), Marcin Owsiany wrote: ]
> Subject: Re: [exim] exim capabilities fo 10-30 K email accounts
>
> On Tue, Nov 16, 2004 at 02:05:51PM -0500, Greg A. Woods wrote:
> > Depending on how many limits you put on your users it's possible to
> > handle upwards of 15,000 users on a measly little PII/300Mhz with 512MB
> > of RAM and a decently fast disk subsystem. The particular system I have
> > in mind also serves all the personal web pages for those users (maybe
> > 25% of users have homepages, some quite busy but most never get hit).
>
> Could you please tell how many block reads/writes per second you get at
> peak load time? (sar/iostat/whatever)
>
> Because statements like yours make me wonder whether it is my hardware
> that underperforms, or it is my users that are so special. :-)
Undoubtedly it's your users! :-)
Remember we limit to 4MB max msg size and all spam & virus blocking
happens at SMTP time before any junk ever hits the disk.
I had expected that we would have added a hardware RAID storage array
with additional SCSI controller(s) to this system a few years ago, but
so far we have been able to keep the storage requirements down to a bare
minimum and have only ever added a fourth drive for logs and upgraded
the memory to the 512MB maximum the system supports. I had originally
hoped it would support 1GB but the motherboard rev. we have will not
allow it. An additional 512MB of RAM would have given much more
acceptable performance even now. :-)
Here's a snapshot from "systat vmstat" while the system is being pounded
on with too many POP connections and lots of incoming SMTP:
5 users Load 53.36 37.34 27.81 Tue Nov 16 16:39
Mem:KB REAL VIRTUAL PAGING SWAPPING Interrupts
Tot Share Tot Share Free in out in out 886 total
Act 63104 37441211820 784392328892 count 106 100 irq0
All187168 5419621856601550516 pages 9 irq1
irq3
Proc:r p d s w Csw Trp Sys Int Sof Flt 286 cow irq4
6 49144 1139 3550 5444 1006 345 3275 87 objlk irq6
55 objht 467 irq11
57.9% Sys 41.3% User 0.0% Nice 0.8% Idle 1447 zfod 248 irq14
| | | | | | | | | | | 22038 nzfod 71 irq15
=============================>>>>>>>>>>>>>>>>>>>>> 6.57 %zfod
kern
Namei Sys-cache Proc-cache 47784 wire
Calls hits % hits % 130080 act
4022 3786 94 8 0 6776 inact
328892 free
Discs sd0 sd1 sd2 sd4 ccd daefr
seeks 1663 prcfr
xfers 95 54 54 49 101 240 react
Kbyte 756 337 309 495 646 scan
sec 0.9 0.5 0.5 0.5 0.7 hdrev
intrn
("xfers" and "Kbyte" are per second -- "sec" is the time the device was
busy out of one second -- i.e. multiply by 100 to get percent busy)
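As a quick illustration of that conversion, here is a minimal Python
sketch using the "sec" row from the snapshot above. The dictionary and
function names are made up for this example; only the numbers come from
the systat output.

```python
# "sec" values copied from the "Discs" section of the systat snapshot:
# seconds each device was busy during a one-second sample.
busy_seconds = {"sd0": 0.9, "sd1": 0.5, "sd2": 0.5, "sd4": 0.5, "ccd": 0.7}


def percent_busy(sec, interval=1.0):
    """Convert busy time within a sample interval to percent busy."""
    return 100.0 * sec / interval


# Hypothetical helper table: device name -> percent busy.
busy_percent = {dev: percent_busy(sec) for dev, sec in busy_seconds.items()}

for dev, pct in busy_percent.items():
    print(f"{dev}: {pct:.0f}% busy")
```

With these values sd0 comes out at about 90% busy and the ccd stripe at
about 70%.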
The disks are all on the same aic7880 Ultra/Wide SCSI bus (onboard --
it's an IBM PC-325 system).
Once the system disk, sd0, gets to 90% busy, as it is above, then the
system gets sluggish because loading and paging in of executables is
slow. This is where more RAM would help -- more buffer cache! :-)
Note that the "load average" can easily hit 60 or more. This is partly
because we leave many SMTP connections hanging on error responses for 10
seconds as a form of D.o.S. protection against broken servers that just
open a new connection immediately after being told to bugger off. Only
very rarely under true attack conditions do we ever suffer situations
where we have to reject new incoming SMTP connections. I won't say
publicly how many we allow simultaneously, but it is a lot. :-)
Inetd's much more primitive rate limiting controls cause us far more
headaches with the constant POPping idiots, especially when a family
computer might have 8 or more mailboxes that it checks simultaneously
every five minutes all day long.
The CCD disk is a stripe of sd1 & sd2. It's where the mail and
homepages sit, and sd4 has /var on it with all the logs.
There are swap slices on all four disks but they don't normally get
used:
16:49 [29] $ /sbin/swapctl -lk
Device      1K-blocks     Used    Avail Capacity  Priority
/dev/sd0b       64968        4    64964     0%    0
/dev/sd1b      250000        4   249996     0%    0
/dev/sd2b      250000        4   249996     0%    0
/dev/sd4b      250000        4   249996     0%    0
Total          814968       16   814952     0%
The OS is too old to have UFS softdep support so filesystem metadata
operations can drag things down a bit too. This will certainly be
solved with the upgrade to the Alpha. :-)
16:45 [18] $ /usr/sbin/iostat -d -w 1 sd0 sd1 sd2 ccd0 sd4
       sd0                sd1                sd2                sd4                ccd
  KB/t  t/s  MB/s    KB/t  t/s  MB/s    KB/t  t/s  MB/s    KB/t  t/s  MB/s    KB/t  t/s  MB/s
  8.64   36  0.31    6.02   31  0.18    6.06   31  0.18    6.81   52  0.35    6.38   59  0.37
 12.59   70  0.86    5.97   53  0.31    5.44   41  0.22   16.99   83  1.38    6.05   89  0.53
  7.70   67  0.50    5.93   34  0.20    4.98   29  0.14   18.51   89  1.61    6.18   56  0.34
 11.13   66  0.72   16.98   25  0.41   10.41   31  0.31   16.31   83  1.32   15.63   48  0.73
  7.80  106  0.81    6.40   35  0.22    8.39   33  0.27    9.99  150  1.47    8.08   61  0.48
  8.58   72  0.60    5.45   11  0.06    7.42   19  0.14    8.67  127  1.08    6.70   30  0.20
 10.65   68  0.71    8.21   14  0.11    8.08   13  0.10   12.32  120  1.44    9.17   24  0.21
  7.92   53  0.41    8.95   20  0.17    6.56   18  0.12   15.95  101  1.57    8.74   34  0.29
  7.82   68  0.52    5.08   55  0.27    4.30   32  0.13   17.73   89  1.54    5.02   83  0.41
  8.26   92  0.74    4.53   18  0.08    7.37   15  0.11    9.70  143  1.35    6.00   32  0.19
  7.89  112  0.86    5.33    3  0.02    7.33    9  0.06    8.82  185  1.59    6.83   12  0.08
  8.00   88  0.69    5.67    9  0.05    5.00    7  0.03   13.39  101  1.32    5.38   16  0.08
  8.20  108  0.87    6.31   12  0.07    5.90    9  0.05   18.12   97  1.72    6.41   20  0.13
 10.79   82  0.86    5.83   32  0.18    4.62   24  0.11   17.24   89  1.50    5.40   55  0.29
  7.87  107  0.82    5.85   13  0.07    4.50   10  0.04   17.49   97  1.66    5.26   23  0.12
  7.78   96  0.73    5.90   10  0.06    6.60    5  0.03   23.06  100  2.25    6.13   15  0.09
  8.19   96  0.77    5.82   44  0.25    6.39   38  0.24   21.01   92  1.89    6.48   77  0.49
  8.00  100  0.78    6.55   10  0.06    4.72    9  0.04   23.89   84  1.96    6.75   16  0.11
  7.36   59  0.42    4.88   17  0.08    5.50   16  0.09   23.91   65  1.52    5.70   30  0.17
  7.68   80  0.60    6.09   33  0.19    4.88   29  0.14   15.40  103  1.55    5.86   57  0.33
The period just after school lets out, like right now, is the worst. :-)
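For anyone comparing these iostat numbers with their own: the three
columns per device are related, since MB/s should be roughly KB/t times
t/s divided by 1024. The Python sketch below checks that against two
samples from the table above. The function name is made up for this
example, and the match is only approximate because iostat rounds each
column it prints.

```python
def mb_per_sec(kb_per_transfer, transfers_per_sec):
    """Estimate throughput in MB/s from iostat's KB/t and t/s columns."""
    return kb_per_transfer * transfers_per_sec / 1024.0


# (device, KB/t, t/s, MB/s as printed) copied from two rows of the table.
samples = [
    ("sd0", 12.59, 70, 0.86),
    ("sd4", 23.06, 100, 2.25),
]

for dev, kbt, tps, printed in samples:
    estimate = mb_per_sec(kbt, tps)
    print(f"{dev}: estimated {estimate:.2f} MB/s, iostat printed {printed:.2f}")
```

The same arithmetic applied to the sd4 column shows why the log disk
moves the most data even though sd0 is the one that saturates first.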
Now back to work so I can actually complete this upgrade! ;-)
--
Greg A. Woods
+1 416 218-0098 VE3TCP RoboHack <woods@???>
Planix, Inc. <woods@???> Secrets of the Weird <woods@???>