[exim-cvs] Handle error on close of spool data file

Top Page
Delete this message
Reply to this message
Author: Exim Git Commits Mailing List
Date:  
To: exim-cvs
Subject: [exim-cvs] Handle error on close of spool data file
Gitweb: https://git.exim.org/exim.git/commitdiff/f70940c9489d0ff5dc44db5ba5ed5258a8fe8772
Commit:     f70940c9489d0ff5dc44db5ba5ed5258a8fe8772
Parent:     2e930f258a070b0c6f2aaf36bed85c39c89f8ce6
Author:     Jeremy Harris <jgh146exb@???>
AuthorDate: Sat Oct 14 22:29:08 2023 +0100
Committer:  Jeremy Harris <jgh146exb@???>
CommitDate: Sat Oct 14 22:29:08 2023 +0100


    Handle error on close of spool data file
---
 doc/doc-txt/ChangeLog | 11 ++++++++++
 src/src/receive.c     | 59 ++++++++++++++++++++++++++++++++++++---------------
 src/src/smtp_in.c     |  2 +-
 test/log/0010         |  2 +-
 test/log/0900         | 10 ++++-----
 test/log/0901         |  6 +++---
 test/stdout/0900      |  6 +++---
 test/stdout/0901      |  2 +-
 8 files changed, 67 insertions(+), 31 deletions(-)


diff --git a/doc/doc-txt/ChangeLog b/doc/doc-txt/ChangeLog
index 4c22f649f..c1b577f62 100644
--- a/doc/doc-txt/ChangeLog
+++ b/doc/doc-txt/ChangeLog
@@ -2,6 +2,17 @@ This document describes *changes* to previous versions, that might
affect Exim's operation, with an unchanged configuration file. For new
options, and new features, see the NewStuff file next to this ChangeLog.

+Since 4.97
+-----------------
+
+JH/01 Handle error on close of the spool data file during reception.  Previously
+      This was only logged, on the assumption that errors would be seen for
+      a previous fflush().  However, a fuse filesystem has been reported as
+      showing this an error for the fclose().  The spool is now in an uncertain
+      state, and we have logged and responded acceptance.  Change this to
+      respond with a temp-reject, wipe spoolfiles, and log the error detail.
+
+
 Exim version 4.97
 -----------------


diff --git a/src/src/receive.c b/src/src/receive.c
index 3318ee4d2..dbb43e749 100644
--- a/src/src/receive.c
+++ b/src/src/receive.c
@@ -3251,10 +3251,9 @@ if (!ferror(spool_data_file) && !(receive_feof)() && message_ended != END_DOT)
     {
     Uunlink(spool_name);        /* Lose data file when closed */
     cancel_cutthrough_connection(TRUE, US"sender closed connection");
-    message_id[0] = 0;        /* Indicate no message_accepted */
     smtp_reply = handle_lost_connection(US"");
     smtp_yield = FALSE;
-    goto TIDYUP;                /* Skip to end of function */
+    goto NOT_ACCEPTED;                /* Skip to end of function */
     }
       break;


@@ -3661,7 +3660,7 @@ else
     /* Check the recipients count again, as the MIME ACL might have changed
     them. */


-    if (acl_smtp_data != NULL && recipients_count > 0)
+    if (acl_smtp_data && recipients_count > 0)
       {
       rc = acl_check(ACL_WHERE_DATA, NULL, acl_smtp_data, &user_msg, &log_msg);
       add_acl_headers(ACL_WHERE_DATA, US"DATA");
@@ -4030,11 +4029,6 @@ else


receive_messagecount++;

-/* Add data size to written header size. We do not count the initial file name
-that is in the file, but we do add one extra for the notional blank line that
-precedes the data. This total differs from message_size in that it include the
-added Received: header and any other headers that got created locally. */
-
 if (fflush(spool_data_file))
   {
   errmsg = string_sprintf("Spool write error: %s", strerror(errno));
@@ -4054,8 +4048,13 @@ if (fflush(spool_data_file))
     /* Does not return */
     }
   }
-fstat(data_fd, &statbuf);


+/* Add data size to written header size. We do not count the initial file name
+that is in the file, but we do add one extra for the notional blank line that
+precedes the data. This total differs from message_size in that it include the
+added Received: header and any other headers that got created locally. */
+
+fstat(data_fd, &statbuf);
msg_size += statbuf.st_size - spool_data_start_offset(message_id) + 1;

 /* Generate a "message received" log entry. We do this by building up a dynamic
@@ -4349,9 +4348,9 @@ if(!smtp_reply)
 #endif
   {
   log_write(0, LOG_MAIN |
-    (LOGGING(received_recipients) ? LOG_RECIPIENTS : 0) |
-    (LOGGING(received_sender) ? LOG_SENDER : 0),
-    "%Y", g);
+        (LOGGING(received_recipients) ? LOG_RECIPIENTS : 0) |
+        (LOGGING(received_sender) ? LOG_SENDER : 0),
+        "%Y", g);


/* Log any control actions taken by an ACL or local_scan(). */

@@ -4389,9 +4388,16 @@ a queue-runner could grab it in the window.

A fflush() was done earlier in the expectation that any write errors on the
data file will be flushed(!) out thereby. Nevertheless, it is theoretically
-possible for fclose() to fail - but what to do? What has happened to the lock
-if this happens? We can at least log it; if it is observed on some platform
-then we can think about properly declaring the message not-received. */
+possible for fclose() to fail - and this has been seen on obscure filesystems
+(probably one that delayed the actual media write as long as possible)
+but what to do? What has happened to the lock if this happens?
+It's a mes because we already logged the acceptance.
+We can at least log the issue, try to remove spoolfiles and respond with
+a temp-reject. We do not want to close before logging acceptance because
+we want to hold the lock until we know that logging worked.
+Could we make this less likely by doing an fdatasync() just after the fflush()?
+That seems like a good thing on data-security grounds, but how much will it hit
+performance? */


 goto TIDYUP;
@@ -4404,8 +4410,27 @@ process_info[process_info_len] = 0;            /* Remove message id */
 if (spool_data_file && cutthrough_done == NOT_TRIED)
   {
   if (fclose(spool_data_file))                /* Frees the lock */
-    log_write(0, LOG_MAIN|LOG_PANIC,
-      "spoolfile error on close: %s", strerror(errno));
+    {
+    log_msg = string_sprintf("spoolfile error on close: %s", strerror(errno));
+    log_write(0, LOG_MAIN|LOG_PANIC |
+          (LOGGING(received_recipients) ? LOG_RECIPIENTS : 0) |
+          (LOGGING(received_sender) ? LOG_SENDER : 0),
+          "%s", log_msg);
+    log_write(0, LOG_MAIN |
+          (LOGGING(received_recipients) ? LOG_RECIPIENTS : 0) |
+          (LOGGING(received_sender) ? LOG_SENDER : 0),
+          "rescind the above message-accept");
+
+    Uunlink(spool_name);
+    Uunlink(spool_fname(US"input", message_subdir, message_id, US"-H"));
+    Uunlink(spool_fname(US"msglog", message_subdir, message_id, US""));
+
+    /* Claim a data ACL temp-reject, just to get reject logging and resposponse */
+    smtp_handle_acl_fail(ACL_WHERE_DATA, rc, NULL, log_msg);
+    smtp_reply = US"";        /* Indicate reply already sent */
+
+    message_id[0] = 0;        /* no message accepted */
+    }
   spool_data_file = NULL;
   }


diff --git a/src/src/smtp_in.c b/src/src/smtp_in.c
index 1a66e1835..ffc7779f8 100644
--- a/src/src/smtp_in.c
+++ b/src/src/smtp_in.c
@@ -2936,7 +2936,7 @@ Returns:     0 in most cases
 */


int
-smtp_handle_acl_fail(int where, int rc, uschar *user_msg, uschar *log_msg)
+smtp_handle_acl_fail(int where, int rc, uschar * user_msg, uschar * log_msg)
{
BOOL drop = rc == FAIL_DROP;
int codelen = 3;
diff --git a/test/log/0010 b/test/log/0010
index 5a25a722f..341f32558 100644
--- a/test/log/0010
+++ b/test/log/0010
@@ -1,4 +1,4 @@
1999-03-02 09:44:33 10HmaX-000000005vi-0000 <= someone@??? U=CALLER P=local-bsmtp S=sss
1999-03-02 09:44:33 10HmaX-000000005vi-0000 => CALLER <CALLER@???> R=localuser T=local_delivery
1999-03-02 09:44:33 10HmaX-000000005vi-0000 Completed
-1999-03-02 09:44:33 SMTP connection from CALLER lost while reading message data
+1999-03-02 09:44:33 10HmaY-000000005vi-0000 SMTP connection from CALLER lost while reading message data
diff --git a/test/log/0900 b/test/log/0900
index c3e2bb8e7..00d13e443 100644
--- a/test/log/0900
+++ b/test/log/0900
@@ -7,12 +7,12 @@
2017-07-30 18:51:05.712 10HmbA-000000005vi-0000 <= some_ne@??? H=(tester) [127.0.0.1] Ci=p1239 P=esmtp K S=sss for CALLER@???
2017-07-30 18:51:05.712 10HmbB-000000005vi-0000 <= someone@??? H=(tester) [127.0.0.1] Ci=p1239 P=esmtp K S=sss for CALLER@???
2017-07-30 18:51:05.712 10HmbC-000000005vi-0000 SMTP data timeout (message abandoned) on connection from (tester) [127.0.0.1] F=<someone@???> D=q.qqqs
-2017-07-30 18:51:05.712 SMTP connection Ci=p1240 from (tester) [127.0.0.1] lost while reading message data
-2017-07-30 18:51:05.712 SMTP connection Ci=p1241 from (tester) [127.0.0.1] lost while reading message data
-2017-07-30 18:51:05.712 10HmbD-000000005vi-0000 <= someone@??? H=(tester) [127.0.0.1] Ci=p1242 P=esmtp K S=sss for CALLER@???
+2017-07-30 18:51:05.712 10HmbD-000000005vi-0000 SMTP connection Ci=p1240 from (tester) [127.0.0.1] lost while reading message data
+2017-07-30 18:51:05.712 10HmbE-000000005vi-0000 SMTP connection Ci=p1241 from (tester) [127.0.0.1] lost while reading message data
+2017-07-30 18:51:05.712 10HmbF-000000005vi-0000 <= someone@??? H=(tester) [127.0.0.1] Ci=p1242 P=esmtp K S=sss for CALLER@???
2017-07-30 18:51:05.712 H=(tester) [127.0.0.1] Ci=p1234 F=<someone@???> rejected RCPT <dummy@???>: relay not permitted
2017-07-30 18:51:05.712 H=(tester) [127.0.0.1] Ci=p1235 F=<some3ne@???> rejected RCPT <dummy@???>: relay not permitted
2017-07-30 18:51:05.712 H=(tester) [127.0.0.1] Ci=p1235 F=<some4ne@???> rejected RCPT <dummy@???>: relay not permitted
-2017-07-30 18:51:05.712 10HmbE-000000005vi-0000 <= some6ne@??? H=(tester) [127.0.0.1] Ci=p1243 P=esmtp K S=sss for CALLER@???
+2017-07-30 18:51:05.712 10HmbG-000000005vi-0000 <= some6ne@??? H=(tester) [127.0.0.1] Ci=p1243 P=esmtp K S=sss for CALLER@???
2017-07-30 18:51:05.712 rejected from <someone@???> H=(tester) [127.0.0.1]: Non-CRLF-terminated header, under CHUNKING: message abandoned
-2017-07-30 18:51:05.712 10HmbF-000000005vi-0000 <= someone@??? H=(tester) [127.0.0.1] Ci=p1244 P=esmtp K S=sss for CALLER@???
+2017-07-30 18:51:05.712 10HmbH-000000005vi-0000 <= someone@??? H=(tester) [127.0.0.1] Ci=p1244 P=esmtp K S=sss for CALLER@???
diff --git a/test/log/0901 b/test/log/0901
index 8f4f5086a..7d4a715b5 100644
--- a/test/log/0901
+++ b/test/log/0901
@@ -7,9 +7,9 @@
1999-03-02 09:44:33 10HmbA-000000005vi-0000 <= someone2A@??? H=(tester) [127.0.0.1] P=esmtp K S=sss for CALLER@???
1999-03-02 09:44:33 10HmbB-000000005vi-0000 <= someone3A@??? H=(tester) [127.0.0.1] P=esmtp K S=sss for CALLER@???
1999-03-02 09:44:33 10HmbC-000000005vi-0000 SMTP data timeout (message abandoned) on connection from (tester) [127.0.0.1] F=<someone4@???> D=qqs
-1999-03-02 09:44:33 SMTP connection from (tester) [127.0.0.1] lost while reading message data
-1999-03-02 09:44:33 SMTP connection from (tester) [127.0.0.1] lost while reading message data
-1999-03-02 09:44:33 10HmbD-000000005vi-0000 <= someone8@??? H=(tester) [127.0.0.1] P=esmtp K S=sss for CALLER@???
+1999-03-02 09:44:33 10HmbD-000000005vi-0000 SMTP connection from (tester) [127.0.0.1] lost while reading message data
+1999-03-02 09:44:33 10HmbE-000000005vi-0000 SMTP connection from (tester) [127.0.0.1] lost while reading message data
+1999-03-02 09:44:33 10HmbF-000000005vi-0000 <= someone8@??? H=(tester) [127.0.0.1] P=esmtp K S=sss for CALLER@???
1999-03-02 09:44:33 SMTP protocol synchronization error (next input sent too soon: pipelining was not advertised): rejected "BDAT 1" H=(tester) [127.0.0.1] next input="BDAT 87 last\r\no: Susan@???"...
1999-03-02 09:44:33 SMTP call from (tester) [127.0.0.1] dropped: too many syntax or protocol errors (last command was "From: Sam@???", C=EHLO,MAIL,RCPT,BDAT)
1999-03-02 09:44:33 SMTP connection from (tester) [127.0.0.1] lost while reading message data (header)
diff --git a/test/stdout/0900 b/test/stdout/0900
index fbebcf5f0..9fe0eb7de 100644
--- a/test/stdout/0900
+++ b/test/stdout/0900
@@ -317,7 +317,7 @@ Connecting to 127.0.0.1 port 1225 ... connected
??? 250-
<<< 250- 88 byte chunk, total 88
??? 250
-<<< 250 OK id=10HmbD-000000005vi-0000
+<<< 250 OK id=10HmbF-000000005vi-0000
>>> quit

??? 221
<<< 221 testhost.test.ex closing connection
@@ -454,7 +454,7 @@ Connecting to 127.0.0.1 port 1225 ... connected
??? 250- 6 byte chunk, total 93
<<< 250- 6 byte chunk, total 93
??? 250 OK
-<<< 250 OK id=10HmbE-000000005vi-0000
+<<< 250 OK id=10HmbG-000000005vi-0000
>>> QUIT

??? 221
<<< 221 testhost.test.ex closing connection
@@ -523,7 +523,7 @@ Connecting to 127.0.0.1 port 1225 ... connected
??? 250-
<<< 250- 98 byte chunk, total 100
??? 250
-<<< 250 OK id=10HmbF-000000005vi-0000
+<<< 250 OK id=10HmbH-000000005vi-0000
>>> quit

??? 221
<<< 221 testhost.test.ex closing connection
diff --git a/test/stdout/0901 b/test/stdout/0901
index b2797046a..efd578a7d 100644
--- a/test/stdout/0901
+++ b/test/stdout/0901
@@ -305,7 +305,7 @@ Connecting to 127.0.0.1 port 1225 ... connected
??? 250-
<<< 250- 88 byte chunk, total 88
??? 250
-<<< 250 OK id=10HmbD-000000005vi-0000
+<<< 250 OK id=10HmbF-000000005vi-0000
>>> quit

??? 221
<<< 221 testhost.test.ex closing connection

--
## subscription configuration (requires account):
## https://lists.exim.org/mailman3/postorius/lists/exim-cvs.lists.exim.org/
## unsubscribe (doesn't require an account):
## exim-cvs-unsubscribe@???
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/