
Author: Jeremy Harris
Date:  
To: exim-users
Subject: Re: [exim] 4.95-RC0 - SIGSEGV (maybe attempt to write to immutable memory) & other oddities
On 28/07/2021 14:58, Matthew Frost via Exim-users wrote:
> Core file '/var/spool/exim/core.exim-4.95-RC0-2.40.42328.core' (x86_64) was loaded.
> (lldb) bt
> exim-4.95-RC0-2 was compiled with optimization - stepping may behave oddly; variables may not be available.
> * thread #1, name = 'exim-4.95-RC0-2', stop reason = signal SIGBUS
>    * frame #0: 0x00000000002cdb46 exim-4.95-RC0-2`smtp_start_session [inlined] string_from_gstring(g=0x2000000801a469ec) at functions.h:919:4 [opt]
>      frame #1: 0x00000000002cdb41 exim-4.95-RC0-2`smtp_start_session at smtp_in.c:3090 [opt]


Something is very different between a FreeBSD jail and normality. This is very
basic stuff, well exercised (including in our FreeBSD buildfarm systems, which
run the regression testsuite, but do not use jails as far as I know).
We're just finishing off building the SMTP banner message prior to sending it.

I assume your config isn't doing anything odd in that region?
If not, we'd need a non-optimised build to get a coredump we can look
at C variables in.

> Core file '/var/spool/exim/core.exim-4.95-RC0-2.40.42413.core' (x86_64) was loaded.
> (lldb) bt
> exim-4.95-RC0-2 was compiled with optimization - stepping may behave oddly; variables may not be available.
> * thread #1, name = 'exim-4.95-RC0-2', stop reason = signal SIGSEGV
>   * frame #0: 0x00000000002cdb8b exim-4.95-RC0-2`smtp_start_session [inlined] tfo_in_check at smtp_in.c:2437:11 [opt]
>     frame #1: 0x00000000002cdb8b exim-4.95-RC0-2`smtp_start_session at smtp_in.c:3096 [opt]


This is doing less-common getsockopt() operations. If it weren't for the above
symptom showing up I'd be more suspicious. For now I suggest we ignore it
(we could hack the code, commenting out the single call to tfo_in_check()
if needed).

> Core file '/var/spool/exim/core.exim-4.95-RC0-2.40.42645.core' (x86_64) was loaded.
> (lldb) bt
> * thread #1, name = 'exim-4.95-RC0-2', stop reason = signal SIGSEGV
>   * frame #0: 0x0000000000000000
>     frame #1: 0x00000000002619e0 exim-4.95-RC0-2 at daemon.c:63:1
>     frame #2: 0x0000000000260ff1 exim-4.95-RC0-2`daemon_go at daemon.c:528:8 [opt]
>     frame #3: 0x0000000000260eca exim-4.95-RC0-2`daemon_go at daemon.c:2594 [opt]


daemon.c:2594   - daemon_go()        calls handle_smtp_call()
daemon.c:528:8  - handle_smtp_call() calls smtp_start_session()
daemon.c:63:1   - appears to be in sighup_handler()


I suppose that's a feasible stack if we really did get a HUP. But that
handler is as simple as they come; I'm really starting to distrust your
environment now.


I wonder if the introduction of readonly-config is a factor?
Test for this by adding
#define MISSING_POSIX_MEMALIGN
to your OS/os.h-FreeBSD and running that build.
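For anyone wanting to see the suspected failure mode in isolation, here is a minimal sketch. It assumes the readonly-config feature works roughly by putting data in page-aligned memory and then marking the pages read-only with mprotect(); the thread does not spell that out, and this is not Exim's code. The helper names make_readonly_page() and signal_on_write() are hypothetical. A child process writes to the protected page so that the parent can report which signal results.

```c
#include <assert.h>     /* for assert() in callers */
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>

/* Hypothetical helper (not an Exim function): copy text into one
   page-aligned page, then make that page read-only.
   Returns NULL on failure. */
static char *make_readonly_page(const char *text, size_t *len_out)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (pagesize <= 0) return NULL;
    if (posix_memalign(&p, (size_t)pagesize, (size_t)pagesize) != 0)
        return NULL;

    memset(p, 0, (size_t)pagesize);
    strncpy((char *)p, text, (size_t)pagesize - 1);

    /* From here on, any write to the page should fault. */
    if (mprotect(p, (size_t)pagesize, PROT_READ) != 0) {
        free(p);
        return NULL;
    }
    *len_out = (size_t)pagesize;
    return (char *)p;
}

/* Hypothetical helper: fork a child that writes to the page.
   Returns the signal that killed the child, 0 if it survived,
   or -1 if fork/waitpid failed. */
static int signal_on_write(char *page)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);   /* avoid leaving a core file */
        signal(SIGSEGV, SIG_DFL);
        signal(SIGBUS, SIG_DFL);
        *(volatile char *)page = 'X';       /* write to read-only memory */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling make_readonly_page() and then signal_on_write() from a small main() should report SIGSEGV or SIGBUS, depending on the platform. Reading the page still works as normal.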


And, if there's a FreeBSD aficionado out there willing to set up,
maintain and monitor a FreeBSD with-jails buildfarm animal
we might catch such issues earlier. Right now we have the choice
of delaying the release or declaring FreeBSD not-supported.
--
Cheers,
Jeremy