[exim-dev] [Bug 2274] exim 4.91: segfault ... error 4 in li…

Author: admin
Date:  
To: exim-dev
Subject: [exim-dev] [Bug 2274] exim 4.91: segfault ... error 4 in libc-2.17.so
https://bugs.exim.org/show_bug.cgi?id=2274

Phil Pennock <pdp@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pdp@???


--- Comment #12 from Phil Pennock <pdp@???> ---
The abort server referenced in the crash details has nothing at the claimed
URL, and it's three days later, so I would really expect
https://retrace.fedoraproject.org/faf/reports/bthash/8fe5b7ec11a15288157804580a2f41dd3fe868d9
to have something if it were ever going to work. So there's nothing there to
trace.

As far as I can tell, DirectAdmin is mostly restricted to those with license
keys, which makes it hard to tell what's going on. I tried to see how far I
could get in a Docker image, but (1) you're running CentOS 7.5.1804, which is
newer than the latest CentOS available on Docker Hub; and (2) the DirectAdmin
files mirror I'm looking at does not have packages for "7.5". Meanwhile the
crash report doesn't say which version of the package, installed for what
architecture, is in use.

So at this point we've got a third-party build of Exim, built for an unknown
OS release, being run on a different OS release, and we have crashes telling
us that the death is inside libc, but no stack traces and no sane way to
figure out which version of Exim it was.

The bug-report claims that it's Exim 4.91 segfaulting, but the crash report
says that it's Exim from the package "da_exim-4.89.1-1".

What sort of system is this, where things are failing? Is this running in
production somewhere, in this state?

We have from the dump a rather poor backtrace which can't tell us much without
the binary:

:        ,   "frames":
:              [ {   "address": 139775044603071
:                ,   "build_id": "cb4b7554d1adbef2f001142dd6f0a5139fc9aa69"
:                ,   "build_id_offset": 634047
:                ,   "function_name": "__strncpy_sse2_unaligned"
:                ,   "file_name": "/usr/lib64/libc-2.17.so"
:                }
:              , {   "address": 4352528
:                ,   "build_id": "bae74c686ca4940655bfcf0c4667e94956fa977b"
:                ,   "build_id_offset": 158224
:                ,   "function_name": "main"
:                ,   "file_name": "/usr/sbin/exim"
:                } ]
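
As a rough aid to reading those frames: assuming "build_id_offset" is the
offset of the address from the start of the mapped object (an assumption about
the report format, though the numbers above are consistent with it),
subtracting it from "address" gives the object's load base. A minimal C sketch
with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of any crash-reporting tool.
   Assumes build_id_offset is measured from the start of the mapped object,
   so the difference is the address at which that object was loaded. */
uint64_t load_base(uint64_t address, uint64_t build_id_offset)
{
    assert(address >= build_id_offset);
    return address - build_id_offset;
}
```

For the two frames above this gives 0x400000 for /usr/sbin/exim, the usual
base of a non-PIE x86-64 executable, and a page-aligned address for libc. The
main() frame would then sit at offset 0x26a10 in the binary, which is only
useful once the matching binary is available.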


Okay. So somewhere directly in main() there's a call to strncpy, though
there's no indication of whether Exim was compiled in a way that optimizes
away intermediate frames, for example. The only _direct_ call to strncpy (via
the Ustrncpy macro) is when debug-logging, where the initial_cwd is copied
into the big-buffer.

Perhaps if os_getcwd() fails? Could it be that the spam-checker has chdir()'d
to a directory which it unlinks, but calls Exim from inside, so that Exim's
os_getcwd() fails?
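
To make that guess concrete: on Linux, getcwd() returns NULL (errno ENOENT)
once the current directory has been removed, and handing a NULL source to
strncpy() is undefined behaviour that typically crashes inside libc. The
following is only a sketch of that failure mode in plain C, not Exim's code.
Both function names are hypothetical, and it does not reflect how os_getcwd()
or main() actually handle the failure.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for mkdtemp() and fchdir() */
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: chdir() into a fresh temporary directory, remove it,
   then ask for the cwd. Returns 1 if getcwd() fails, 0 if it still
   succeeds, -1 on a setup error. The original cwd is restored afterwards. */
int getcwd_fails_in_removed_dir(void)
{
    char tmpl[] = "/tmp/cwd-demo-XXXXXX";
    char buf[4096];
    int result;
    int saved = open(".", O_RDONLY);   /* handle on the original cwd */

    if (saved < 0)
        return -1;
    if (mkdtemp(tmpl) == NULL) {
        close(saved);
        return -1;
    }
    if (chdir(tmpl) != 0 || rmdir(tmpl) != 0) {
        rmdir(tmpl);                   /* best-effort cleanup */
        result = -1;
    } else {
        result = (getcwd(buf, sizeof buf) == NULL) ? 1 : 0;
    }
    if (fchdir(saved) != 0)
        result = -1;
    close(saved);
    return result;
}

/* Hypothetical guard: never pass a NULL cwd to strncpy(); substitute a
   placeholder instead. The destination is always NUL-terminated. */
void copy_cwd_checked(char *dst, size_t dstlen, const char *cwd)
{
    assert(dst != NULL && dstlen > 0);
    strncpy(dst, cwd != NULL ? cwd : "(unknown)", dstlen - 1);
    dst[dstlen - 1] = '\0';
}
```

If the first function returns 1 on the affected system, a caller that copies
the result of a failed cwd lookup without a NULL check would be a plausible
source of a crash in strncpy.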

Other than that wild shot in the dark, there's really too little here to chase
this any further.
