--
--
Howdy.
I'm a member of the NetOps team at osdn.com. Currently, I'm babysitting
our mail system, which uses exim of various versions. Our new
MX box (the one that is mail.sourceforge.net) is running exim 4.22 with
exiscan v10, using ClamAV and SpamAssassin for content scanning.
Since deployment we've been seeing this rather odd problem: it looks
like something in the processing chain is converting the carriage returns
(0x0d or \r) to newlines (0x0a or \n).
At first I suspected that it might be SpamAssassin or ClamAV, so I
disabled them. Nope. Then I thought exiscan might be the culprit. So I
recompiled exim on another box to test on, this time without exiscan.
The CRs still became LFs, such that CRLF in the body of the message
becomes LFLF, leading to double-spacing.
Interestingly enough, if you add a header that's delimited by \r\n
(there's a commented-out example of this in the perl script below), the
problem goes away, with \r\n -> \n and bare \r -> \n. Seems like the
parser is looking for \r in the header to switch translation modes or
something.
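To make the two behaviors concrete, here is a toy model in perl of what
the tests show. It only describes the observed input and output; it is not
exim's code, and I don't know that exim works this way internally. The
names observed_body and $crlf_header are made up for the illustration.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Toy model of the OBSERVED behavior only; this is not exim's code.
# $crlf_header is true when some header line ended in \r\n
# (an assumption about what triggers the second behavior).
sub observed_body {
    my ($body, $crlf_header) = @_;
    if ($crlf_header) {
        $body =~ s/\r\n/\n/g;   # CRLF collapses to a single LF
    }
    $body =~ s/\r/\n/g;         # every remaining CR becomes LF
    return $body;
}

# Short excerpt of the test body: two CRLF lines and one bare CR.
my $body = "This text\r\nshould not\r\nNot sure what\rthis will\r\n";
for my $mode (0, 1) {
    # Make the newlines visible before printing.
    (my $shown = observed_body($body, $mode)) =~ s/\n/\\n/g;
    print "CRLF header ", ($mode ? "present" : "absent "), ": $shown\n";
}
```

With no CRLF-terminated header, each CRLF comes out as two LFs (the
double-spacing); with one, each CRLF comes out as a single LF. The bare
CRs become LFs either way.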
Here's what I'm seeing. This message:
--- in plaintext
Subject: Newline test Thu Sep 25 22:09:56 PDT 2003
This is a test of the newlines.
This text
should not
appear to be
doublespaced
Neither should
This test
Not sure what^Mthis will^Mlook like.
End test
--- end plaintext
--- in hexdump
0000000 7553 6a62 6365 3a74 4e20 7765 696c 656e
0000000 S u b j e c t : N e w l i n e
0000010 7420 7365 2074 6854 2075 6553 2070 3532
0000010 t e s t T h u S e p 2 5
0000020 3220 3a32 3930 353a 2036 4450 2054 3032
0000020 2 2 : 0 9 : 5 6 P D T 2 0
0000030 3330 0a0a 540a 6968 2073 7369 6120 7420
0000030 0 3 \n \n \n T h i s i s a t
0000040 7365 2074 666f 7420 6568 6e20 7765 696c
0000040 e s t o f t h e n e w l i
0000050 656e 2e73 0a0a 6854 7369 7420 7865 0d74
0000050 n e s . \n \n T h i s t e x t \r
0000060 730a 6f68 6c75 2064 6f6e 0d74 610a 7070
0000060 \n s h o u l d n o t \r \n a p p
0000070 6165 2072 6f74 6220 0d65 640a 756f 6c62
0000070 e a r t o b e \r \n d o u b l
0000080 7365 6170 6563 0a64 4e0a 6965 6874 7265
0000080 e s p a c e d \n \n N e i t h e r
0000090 7320 6f68 6c75 0a64 6854 7369 7420 7365
0000090 s h o u l d \n T h i s t e s
00000a0 0a74 4e0a 746f 7320 7275 2065 6877 7461
00000a0 t \n \n N o t s u r e w h a t
00000b0 740d 6968 2073 6977 6c6c 6c0d 6f6f 206b
00000b0 \r t h i s w i l l \r l o o k
00000c0 696c 656b 0d2e 0a0a 6e45 2064 6574 7473
00000c0 l i k e . \r \n \n E n d t e s t
00000d0
--- end hexdump
is being transformed into this:
--- in plaintext
Subject: Re: Newline test Thu Sep 25 21:45:39 PDT 2003
This is a test of the newlines.
This text
should not
appear to be
doublespaced
Neither should
This test
Not sure what
this will
look like.
--- end plaintext
--- in hexdump (excerpted)
00002e0 7369 6920 2073 2061 6574 7473 6f20 2066
00002e0 i s i s a t e s t o f
00002f0 6874 2065 656e 6c77 6e69 7365 0a2e 540a
00002f0 t h e n e w l i n e s . \n \n T
0000300 6968 2073 6574 7478 0a0a 6873 756f 646c
0000300 h i s t e x t \n \n s h o u l d
0000310 6e20 746f 0a0a 7061 6570 7261 7420 206f
0000310 n o t \n \n a p p e a r t o
0000320 6562 0a0a 6f64 6275 656c 7073 6361 6465
0000320 b e \n \n d o u b l e s p a c e d
0000330 0a0a 654e 7469 6568 2072 6873 756f 646c
0000330 \n \n N e i t h e r s h o u l d
0000340 540a 6968 2073 6574 7473 0a0a 6f4e 2074
0000340 \n T h i s t e s t \n \n N o t
0000350 7573 6572 7720 6168 0a74 6874 7369 7720
0000350 s u r e w h a t \n t h i s w
0000360 6c69 0a6c 6f6c 6b6f 6c20 6b69 2e65 0a0a
0000360 i l l \n l o o k l i k e . \n \n
0000370 000a
0000370 \n
0000371
--- end hexdump
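For anyone who would rather not read hexdumps, a small perl sketch like
the following counts the line endings instead. The sub name
count_line_endings is made up for this example, and the two sample strings
are short excerpts of the bodies shown in the dumps above, not complete
messages.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Count CRLF pairs, bare CRs and bare LFs in a string.
sub count_line_endings {
    my ($text) = @_;
    my %n = (crlf => 0, cr => 0, lf => 0);
    # \r\n is listed first so a CRLF pair is counted once, as a pair.
    while ($text =~ /(\r\n|\r|\n)/g) {
        my $k = $1 eq "\r\n" ? 'crlf' : $1 eq "\r" ? 'cr' : 'lf';
        $n{$k}++;
    }
    return %n;
}

# Excerpt of the body as sent, and of the body as it came back.
my $sent = "This text\r\nshould not\r\nNot sure what\rthis will\rlook like.\r\n";
my $got  = "This text\n\nshould not\n\nNot sure what\nthis will\nlook like.\n\n";
for my $pair ([sent => $sent], [received => $got]) {
    my ($label, $text) = @$pair;
    my %n = count_line_endings($text);
    printf "%-8s CRLF=%d bareCR=%d bareLF=%d\n", $label, @n{qw(crlf cr lf)};
}
```

For the excerpts above this reports three CRLF pairs and two bare CRs on
the way in, and eight bare LFs with no CRs at all on the way out. To check
a real message, read the saved file into a string (with binmode set on the
filehandle) and pass it to the same sub.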
I've set up a reflector address, reflect@???, that will send
your email back to you. Currently, it's running 4.24 without exiscan.
You can, of course, test against your own machines, if you should so
choose. :)
Here's the perl script I use to generate test mails. If you invoke it
thusly you'll get a copy of the munged mail:
perl testnewlines.pl | sendmail reflect@???
--- start testnewlines.pl
#!/usr/local/bin/perl
$date = `date`;
# uncommenting this line will make the problem disappear
#print "X-Spoiler: this is here as a newline test.\r\n";
print "Subject: Newline test $date\n";
print "\r\n";
print "This is a test of the newlines.\n\n";
print "This text\r\n";
print "should not\r\n";
print "appear to be\r\n";
print "doublespaced\n\n";
print "Neither should\n";
print "This test\n\n";
print "Not sure what\rthis will\rlook like.\r\n";
print "\nEnd test";
--- end testnewlines.pl
Testing shows that the conversion happens only in the body of the
message, not in the headers.
This is a consistent problem. It showed up in our deployment, it showed
up in a deployment by VA Software (our parent company) last week, and it's
eminently repeatable in my tests.
This seems like a rather major thing to overlook... I'm wondering if this
isn't a configuration issue or a compile-time issue? I've scoured the
mailing lists and documentation and I don't see any references to this.
I've tested this against 4.22 both with and without the exiscan patch and
against 4.24. The behavior is consistent.
So in conclusion I have the following questions:
1) Is this behavior (pre or post fix) RFC specified? This is causing some
problems with messages generated by the SourceForge.net application. If
we're doing the wrong thing, we should fix it. (The bodies of Tracker
emails use \r\n as their newline delimiter, for those familiar with
SourceForge.net).
2) How/when was it fixed?
3) Can I get a patch to just fix this issue in 4.22 while we wait for
exiscan to catch up?
Thanks in advance for your help and for all the hard work that's gone into
making exim an amazing piece of software.
P.S. perl script attached for your convenience
--
Ari Gordon-Schlosberg <regs@???>, OSDN NetOps
http://sourceforge.net/
--
[ Content of type application/pgp-signature deleted ]
--