On 25 Jan 2020, at 17:46, Jeremy Harris via Exim-users <exim-users@???> wrote:
> Of interest in that debug output - there's a consistent 6 or 7ms pause
> just after "Renaming spool header file". This is where exim has just
> written a spool file under a temporary name; it renames it to the final
> name and then does an fsync() on the directory.
>
> [ You can build exim with an option to disable that fsync(). I don't
> advise it if your mail is valuable to you. ]
>
> 6ms sounds like a 10k RPM disk rotation. Is your spool on rotating
> rust, and could you consider an SSD?
OK, so the environment here is a fairly large VMware virtual-server platform running on Cisco UCS blades with NetApp storage underneath it. The virtual disks used by the VMs are all on NFS datastores with fairly eye-watering performance (almost 300k random read/write IOPS in acceptance testing). Reads are usually served from cache, and writes land in a non-volatile flash cache before being committed to disk. I suspect the latency here is partly the VMware stack, partly the NFS mounts underneath it, with a little network latency on top.
We’re currently looking toward our next system, which may well be all flash, but that’s time and money away :)
On top of that, the sending VM here is very much not optimised for throughput; it's my general-purpose "throw stuff at it and see what sticks" box. Our MX and MTA farm have some sysctl tweaks for throughput, not that they're really resource-constrained in the first place.
> There's also a consistent 9 - 11ms pause just before that "Renaming"
> output line, after the previous line (which is a note about the last
> recipient entry in the file, just after it was written to the stdio
> buffer). The code between those two places flushes the stdio buffers
> and fsync()s to get the file data out on stable storage. That also
> sounds amenable to parallel operations.
Glad I could throw some data at you. If you need any more, please ask.