On Thu, 7 May 1998, Dave Holland wrote:
> Occasionally when exim is receiving messages from my ISP (Demon) it
> will complain "Failed to create spool file" and close the smtp
> connection. This is annoying because I then have to wait online for the
> ISP to initiate another smtp connection which may be several minutes
> later. See the attached mainlog fragment for an example (id
> 0yWlZG-00005Y-00). From reading the source (accept.c), I think this can
> be caused by exim not being able to create its spool data file with
> exclusive access. Is that at all likely? Might it be caused by a queue
> run starting? In this example and another, a queue run started within a
> second before the error message being generated.

That's very odd, and I haven't had other reports of anything similar.
The message does mean that Exim has failed to create the file on the
disc for holding the message. It shouldn't interact with the queue
runner in any way. Hmm. The queue runner, when it starts up, uses
an opendir()/readdir() loop to read the names of all existing spool
files. Could opendir() be locking the directory somehow, so that new
files cannot be created in it while it is being read?
This must be very operating system specific. Any Linux experts reading
this who can comment?
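
One way to test the opendir() theory on a particular system is a small
stand-alone program along the following lines. This is only a sketch and
not Exim code: the function name create_during_scan() and its arguments
are made up for illustration, and it runs the scan and the file creation
in a single process, whereas the real case involves a queue runner and a
receiving process running separately.

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical test helper, not part of Exim. Opens a directory stream
   on `dir`, starts reading it, and then tries to create `name` inside
   that directory with exclusive access. Returns 0 if the file could be
   created while the stream was open, otherwise the errno value from the
   call that failed. */
static int create_during_scan(const char *dir, const char *name)
{
  char path[1024];
  int fd;
  int rc = 0;
  DIR *d = opendir(dir);

  if (d == NULL) return errno;
  (void)readdir(d);                 /* begin the scan, as a queue runner would */

  snprintf(path, sizeof(path), "%s/%s", dir, name);
  fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
  if (fd < 0)
    rc = errno;                     /* save before any later call changes it */
  else
    {
    close(fd);
    unlink(path);                   /* tidy up the test file */
    }

  closedir(d);
  return rc;
}
```

A non-zero result from a call such as create_during_scan("/tmp",
"scan-test") would suggest that an open directory stream interferes
with file creation on that system; a zero result would point away from
this theory.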
I see I have been remiss in not including the actual error code in the
log message. Silly me. If you look around line 1741 of accept.c (the
1.92 release, but it hasn't changed for a long time), you will see

  if (data_fd < 0)
    {
    if (errno == ENOENT)
      {
      char temp[16];
      sprintf(temp, "input/%s", message_subdir);
      if (message_subdir[0] == 0) temp[5] = 0;
      directory_make(spool_directory, temp, INPUT_DIRECTORY_MODE);
      data_fd = open(spool_name, O_RDWR|O_CREAT|O_EXCL, SPOOL_MODE);
      }
    if (data_fd < 0)
      log_write(0, LOG_MAIN | LOG_PANIC_DIE, "Failed to create spool file",
        spool_name);
    }

A first step would be to change the last few lines of that to read

    if (data_fd < 0)
      log_write(0, LOG_MAIN | LOG_PANIC_DIE,
        "Failed to create spool file %s: %s",
        spool_name, strerror(errno));

and then wait for it to happen again. The message should then tell us
why it failed, with any luck.
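
For anyone who wants to see the idea outside Exim, here is a minimal
stand-alone sketch of the same technique: attempt an exclusive create
and, on failure, report the file name together with the text for errno.
The helper name create_exclusive() and its parameters are invented for
this illustration and do not exist in Exim; the mode 0600 is likewise
just an assumed stand-in for SPOOL_MODE.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of Exim. Tries to create `path` with
   exclusive access. Returns the file descriptor on success. On failure
   it returns -1, stores the errno value in *saved_errno, and writes a
   message naming the file and the reason into msg. */
static int create_exclusive(const char *path, int *saved_errno,
                            char *msg, size_t msglen)
{
  int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
  if (fd < 0)
    {
    *saved_errno = errno;           /* capture before another call alters it */
    snprintf(msg, msglen, "Failed to create spool file %s: %s",
      path, strerror(*saved_errno));
    }
  return fd;
}
```

Calling this twice on the same path should make the second call fail
with EEXIST, which is the kind of detail the present log line hides.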
--
Philip Hazel University Computing Service,
ph10@??? New Museums Site, Cambridge CB2 3QG,
P.Hazel@??? England. Phone: +44 1223 334714
--
*** Exim information can be found at
http://www.exim.org/ ***