> > > If they all retry at once, they must all have the same queue runner
> > > interval. Instead of, say, 15 minutes for them all, does it help to have
> > > a spread such as 13, 15, 17 ?
> >
> > In fact, their configuration is identical, which is very convenient
> > for a cluster. I would like to keep it that way and avoid expanded
> > retry rules. ;)
>
> Queue runner start times are not specified in retry rules.

I misread your mail and thought about different retry intervals. Right
now queue runners start every 15s, but I am thinking about migrating
those machines to the parallel queue runner I run on my outgoing systems.
As I wrote, I don't see a sharply defined collision; it is more like
alternating waves of many and of few transferred mails. That's why it
took me a while to understand why the previously down machine was
sometimes not loaded as badly, even though there was still mail waiting.
There are a bunch of things that could avoid this problem, but thinking
about it, I found out that in fact things run more synchronised than
one would guess.
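
To illustrate the suggested spread of 13, 15, 17 against an identical
interval everywhere, here is a toy sketch in Python. It is not Exim
code and does not model retry rules or delivery times; it only assumes
three hypothetical hosts that all start a queue runner at minute 0 and
then repeat at a fixed interval, and counts how many runners start in
the same minute.

```python
from collections import Counter


def runner_starts(interval, horizon, offset=0):
    """Minutes at which one host starts a queue runner (toy model)."""
    return list(range(offset, horizon, interval))


def starts_per_minute(intervals, horizon):
    """Count how many hosts start a queue runner in each minute."""
    counts = Counter()
    for interval in intervals:
        counts.update(runner_starts(interval, horizon))
    return counts


def worst_overlap(counts):
    """Largest number of simultaneous starts, ignoring minute 0,
    where all hosts start together by construction."""
    return max(n for minute, n in counts.items() if minute > 0)


# Assumed values: three hosts, two hours of wall-clock time.
identical = starts_per_minute([15, 15, 15], horizon=120)
spread = starts_per_minute([13, 15, 17], horizon=120)

print("identical intervals:", worst_overlap(identical))
print("spread intervals:   ", worst_overlap(spread))
```

With identical intervals the hosts stay in lockstep for as long as they
run, while the spread intervals do not coincide again within the two
hours simulated here. The price is exactly what was objected to above:
the hosts no longer share one configuration.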