OK, so further comparison between working and non-working connections seems
to show HAProxy being the culprit rather than Exim: it seems it didn't always
send the proxy line straight away. HAProxy 1.5.2 is the version that
ships with the el7 distribution. I updated that to HAProxy 1.5.11
(the latest stable version) and, touch wood, have not seen the error since.
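
For anyone wanting to check the same thing without a full TCP dump, here is a
minimal Python sketch. It assumes the version 1 (text) form of the proxy
protocol as described in proxy-protocol.txt. The helper names
parse_proxy_v1 and time_first_bytes are made up for illustration and are
not part of Exim or HAProxy. The listener deliberately sends no banner, so
whatever arrives first is what the proxy sent unprompted.

```python
import socket
import time


def parse_proxy_v1(data: bytes):
    """Parse a PROXY protocol v1 header at the start of data.

    Returns a dict describing the header, or None if data does not start
    with a complete, well-formed v1 header line.
    """
    line, sep, _rest = data.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        return None
    fields = line.decode("ascii", errors="replace").split(" ")
    # "PROXY UNKNOWN ..." is valid; anything after UNKNOWN is ignored.
    if len(fields) >= 2 and fields[1] == "UNKNOWN":
        return {"family": "UNKNOWN"}
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        return None
    _, family, src, dst, sport, dport = fields
    if not (sport.isdigit() and dport.isdigit()):
        return None
    return {
        "family": family,
        "src": src,
        "dst": dst,
        "src_port": int(sport),
        "dst_port": int(dport),
    }


def time_first_bytes(listener: socket.socket, timeout: float = 5.0):
    """Accept one connection, send nothing, and time the first bytes.

    Returns (seconds_waited, data). data is empty if the peer stayed
    silent for the whole timeout or closed without sending anything.
    """
    conn, _peer = listener.accept()
    start = time.monotonic()
    with conn:
        conn.settimeout(timeout)
        try:
            data = conn.recv(256)
        except socket.timeout:
            data = b""
    return time.monotonic() - start, data
```

To try it, create a listener with something like
socket.create_server(("127.0.0.1", 2525)) (the port is an arbitrary choice),
point a test HAProxy backend that has send-proxy enabled at it, and call
time_first_bytes on the listener. A well-behaved proxy should yield data that
parse_proxy_v1 accepts almost immediately; a delay of seconds, or empty data,
would match the behaviour seen in the failing connections.
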
rgds
Matt B.
On 27/02/2015 9:17 pm, Phil Pennock wrote:
> On 2015-02-27 at 19:47 +1000, Matt Bryant wrote:
>> been going through the TCP dump i got from one of the failures ...
>>
>>
>> haproxy (syn) ---------------> exim
>> exim (syn,ack) ---------------> haproxy
>> haproxy (ack) ----------------> exim
>> exim (banner) 3s later ---------> haproxy
>> haproxy (proxy line) ---------> exim
>> exim -----------------------> haproxy
> Why is haproxy waiting for output from Exim? The proxy protocol says
> that the client sends first, immediately upon connection.
>
> http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt
>
>   "The receiver may apply a short timeout and decide to abort the
>   connection if the protocol header is not seen within a few seconds
>   (at least 3 seconds to cover a TCP retransmit)."
>
> Exim's approach to abort is to send the banner but then 503 fail all
> commands. See "experimental-spec.txt".
>
> You might be able to use an ACL plumbed into acl_smtp_connect to reject
> the connection if $proxy_session is "no". Something like (untested):
>
> ----------------------------8< cut here >8------------------------------
> hostlist proxy_hosts = <; 192.0.2.1 ; 192.0.2.7 ; 2001:db8::42
> proxy_required_hosts = +proxy_hosts
> acl_smtp_connect = acl_connect
>
> begin acl
>
> acl_connect_should_be_proxy:
>   drop    message   = Missing proxy protocol
>           condition = ${if !bool{$proxy_session}}
>
> acl_connect:
>   accept  hosts     = +proxy_hosts
>           endpass
>           acl       = acl_connect_should_be_proxy
> ----------------------------8< cut here >8------------------------------
>
> (and I know, we've deprecated endpass, but I'm about to go to bed so I'm
> not rewriting that further).
>
> If the problem is that haproxy has gotten slow and Exim responding as it
> does is confusing it, and holding connections open and slowing it down
> further, then this might help by dropping connections sooner and letting
> haproxy recover by managing less state.
>
> -Phil
>