[exim] A study of failing tls certs, with valid certificate …

Author: Cyborg
Date:  
To: exim-users
Subject: [exim] A study of failing tls certs, with valid certificate files
Hi all,

Please take this text for what it is: a study of a failure you could
avoid. No finger-pointing, no flaming, only suggestions on what to look
for or change in your toolchains.

In early December 2022 the server in question switched its OS release
and was restarted (Exim included). In this upgrade, the following
switch was made:

FROM:

2022-11-28T20:46:24+0100 SUBDEBUG Upgraded: exim-4.96-5.fc35.x86_64
2022-11-28T20:46:32+0100 SUBDEBUG Upgraded: openssl-1:1.1.1q-1.fc35.x86_64

TO:

2022-11-28T20:41:00+0100 SUBDEBUG Upgrade: openssl-1:3.0.5-2.fc36.x86_64
2022-11-28T20:42:54+0100 SUBDEBUG Upgrade: exim-4.96-5.fc36.x86_64

Later there was an update to 4.96-6:

2022-12-01T08:01:27+0100 SUBDEBUG Upgrade: exim-4.96-6.fc36.x86_64
2022-12-01T08:01:45+0100 SUBDEBUG Upgraded: exim-4.96-5.fc36.x86_64

Certs are renewed by a cron job that runs every 5 days (so as not to
burden Let's Encrypt too much). That job restarts Apache, but not Exim.

At that time the Let's Encrypt certificate for exim and all other
services had these dates:

            Not Before: Oct 10 21:07:39 2022 GMT
            Not After : Jan  8 21:07:38 2023 GMT

On 11 December 2022 at 00:08 it was auto-renewed and switched to
these dates:

            Not Before: Dec 10 22:08:37 2022 GMT
            Not After : Mar 10 22:08:36 2023 GMT

-rw-r----- 1 root exim 1834 11. Dez 00:08 cert-1670713689.csr
-rw-r----- 1 root exim 2366 11. Dez 00:08 cert-1670713689.pem

Yesterday evening at around 22:25 CET (GMT+1), OpenSSL (via Exim)
started to spit out these messages on incoming connections:

2023-01-08 22:25:18 TLS error on connection from
vmi395689.contaboserver.net [5.189.157.109] (SSL_accept):
error:0A000415:SSL routines::sslv3 alert certificate expired

This was caused by the expiry of the cert loaded at the last update
(2022-12-01) and Exim not having been restarted since.
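The timeline can be checked with a little arithmetic. The following is a
minimal sketch, not part of the original analysis: it takes the dates
quoted above, assumes the log timestamp is local time at a fixed GMT+1
offset, and uses Python's ssl.cert_time_to_seconds() to parse the
OpenSSL-style dates.

```python
import ssl
from datetime import datetime, timedelta, timezone

# Fixed winter offset as stated in the post (CET = GMT+1); an assumption
# about how the server writes its log timestamps.
CET = timezone(timedelta(hours=1), "CET")


def cert_time(text):
    """Parse an OpenSSL 'Not Before'/'Not After' string into an aware UTC datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(text), tz=timezone.utc)


old_not_after = cert_time("Jan  8 21:07:38 2023 GMT")    # cert still held by the daemon
new_not_before = cert_time("Dec 10 22:08:37 2022 GMT")   # renewed cert on disk
first_error = datetime(2023, 1, 8, 22, 25, 18, tzinfo=CET)  # first logged TLS error

# How long after the old cert expired did the first error show up?
gap_after_expiry = first_error - old_not_after
# How long was the renewed cert already on disk before the old one expired?
unused_overlap = old_not_after - new_not_before

print("old cert expired (CET):", old_not_after.astimezone(CET))
print("first error after expiry:", gap_after_expiry)
print("renewed cert unused for:", unused_overlap)
```

The first error appears shortly after the old cert's Not After time,
while the renewed cert had been sitting on disk for weeks.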

This happened for the first time since Let's Encrypt was launched (we
have used it since then), so for years now.

At the moment this Exim is in use:

Name        : exim
Version     : 4.96
Release     : 6.fc36
Architecture: x86_64
Install Date: Do 01 Dez 2022 08:01:27 CET
Build Date  : Di 22 Nov 2022 15:25:30 CET

Name        : openssl
Version     : 3.0.5
Release     : 2.fc36
Architecture: x86_64
Install Date: Mo 28 Nov 2022 20:41:00 CET
Build Date  : Di 01 Nov 2022 17:26:57 CET

The original cert setup looks like this:

lrwxrwxrwx 1 root root 59 17. Sep 2018  /etc/pki/tls/certs/exim.pem ->
/etc/httpd/letsencrypt/certs/server.de/fullchain.pem
0 lrwxrwxrwx 1 root root 24 11. Dez 00:08 fullchain.pem ->
fullchain-1670713689.pem
8 -rw-r----- 1 root exim 6117 11. Dez 00:08 fullchain-1670713689.pem

/etc/pki/tls/certs/exim.pem is the default location for Fedora's Exim
package.

O== Are there more systems?

Yes, there are. This is just the one where we detected it first. So it's
not a one-off glitch.

O== Conclusion:

I can't remember any downstream patches to Exim inside Fedora's build,
so something must have changed in how Exim or OpenSSL 3 handles the
detection of the underlying certificate switch. As Exim had only a tiny
minor version change, OpenSSL 3 is my personal candidate for this.

O== Suggestions:

In this combination Exim needs to be restarted whenever the server cert
is renewed, as the auto-detection no longer works reliably.
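One way to wire this into a renewal cron job is sketched below. It is
only an illustration under assumptions: the state file path and the
restart command are hypothetical choices, not something the setup above
prescribes. The idea is to follow the symlink chain behind exim.pem and
run the restart command only when the final target file has changed
since the last run.

```python
import os
import subprocess


def resolved_target(link_path):
    """Follow every symlink level and return the final file path."""
    return os.path.realpath(link_path)


def restart_if_changed(link_path, state_file, restart_cmd, run=subprocess.run):
    """Run restart_cmd when the resolved cert path differs from the recorded one.

    Returns True if a restart was triggered. On the very first run there is
    no recorded state yet, so a restart is triggered once.
    """
    current = resolved_target(link_path)
    try:
        with open(state_file, encoding="utf-8") as fh:
            previous = fh.read().strip()
    except FileNotFoundError:
        previous = None

    if current == previous:
        return False

    # Restart first; record the new target only if the command succeeded.
    run(restart_cmd, check=True)
    with open(state_file, "w", encoding="utf-8") as fh:
        fh.write(current + "\n")
    return True
```

A call from the renewal job might look like this (state file path and
service name are assumptions, adjust to your system):

    restart_if_changed("/etc/pki/tls/certs/exim.pem",
                       "/var/lib/exim-cert.state",
                       ["systemctl", "restart", "exim"])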

It may be a good idea to look into a new solution inside Exim, such as
automatically reloading the cert in use every 24 hours while the server
is running, in case OpenSSL 3 is causing this "detection" bug.
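Independent of any change inside Exim, the stale state can be detected
from the outside by comparing the cert the daemon actually serves with
the one on disk. The following is a rough sketch under assumptions: it
presumes the daemon offers STARTTLS on port 25 and that the first cert
in the fullchain file is the server cert. Verification is switched off
on purpose so that an expired cert can still be fetched and compared.

```python
import hashlib
import re
import smtplib
import ssl

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def leaf_fingerprint_from_pem(pem_text):
    """SHA-256 hex digest of the first certificate in a PEM bundle."""
    match = PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no certificate found in PEM data")
    der = ssl.PEM_cert_to_DER_cert(match.group(0))
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host, port=25, timeout=10):
    """SHA-256 hex digest of the cert presented after STARTTLS."""
    context = ssl.create_default_context()
    # No validation: we only want the raw cert, even if it has expired.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.starttls(context=context)
        der = smtp.sock.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()


def is_stale(pem_path, host, port=25):
    """True if the running daemon serves a different cert than the file on disk."""
    with open(pem_path, encoding="ascii") as fh:
        on_disk = leaf_fingerprint_from_pem(fh.read())
    return on_disk != served_fingerprint(host, port)
```

Run from cron or a monitoring system, is_stale() returning True after a
renewal is the signal that the daemon has not picked up the new cert.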


best regards,
Marius