On 2011-09-11 at 18:49 +0200, Pawel Rutkowski wrote:
> Hello again,
> > That's ... rather worrying to see in a backtrace; smtp_read_response()
> > is very well-used code and we shouldn't be seeing surprises there.
> >
> > Any chance that you could compile Exim with debug information (-ggdb)
> > and _not_ strip it, so we can see more information in the backtrace?
> >
>
> Yes, now more information:
>
> (gdb) bt
> #0 0x000000000046b660 in smtp_read_response
> (inblock=0x7ffff18bad50,buffer=0x7ffff18b9d10 "220 proksima.home.pl ESMTP
> IdeaSmtpServer v0.70 ready.\r", size=4041, okdigit=50, timeout=300) at
> smtp_out.c:512
I failed to follow up on this; sorry for the silence. I've taken a
look.
You have bad memory in your system and had a bit flip. Invest in ECC
RAM.
│0x46b64e <smtp_read_response+350> callq 0x41e5fe <debug_printf>
│0x46b653 <smtp_read_response+355> cmp $0x2,%ebx
│0x46b656 <smtp_read_response+358> jle 0x46b699 <smtp_read_response+425>
│0x46b658 <smtp_read_response+360> callq 0x4152e8 <__ctype_b_loc@plt>
│0x46b65d <smtp_read_response+365> mov (%rax),%rdx
>│0x46b660 <smtp_read_response+368> movzbl (%r15),%eax
│0x46b664 <smtp_read_response+372> testb $0x8,0x1(%rdx,%rax,2)
At this point, we are in smtp_out.c at:
512 if (count < 3 ||
513 !isdigit(ptr[0]) ||
514 !isdigit(ptr[1]) ||
515 !isdigit(ptr[2]) ||
Line 512 has just been executed: that's the "cmp $0x2,%ebx", and if
count is less than or equal to 2, we jump away (< 3 became <= 2).
We've then called __ctype_b_loc() to fetch the character-class table
that isdigit() uses on line 513.
We're now trying to load the byte stored at the address in %r15 into
%eax, which is the read of ptr[0], but %r15 points to invalid memory.
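To make the evaluation order concrete, here is a minimal sketch in C of
the same short-circuit pattern. The function name starts_with_reply_code
is hypothetical and this is not Exim's code; it only mirrors the part of
the condition visible at lines 512-515. The count test runs first, and
ptr[0] is read only when that test passes, which corresponds to the load
that faults at +368.

```c
#include <ctype.h>

/* Hypothetical helper, not Exim's code: mirrors the visible part of the
   condition at smtp_out.c lines 512-515. */
int starts_with_reply_code(const unsigned char *ptr, int count)
{
    /* || short-circuits left to right: count is compared first, and
       ptr[0] is only read if at least three bytes are available. */
    if (count < 3 ||
        !isdigit(ptr[0]) ||
        !isdigit(ptr[1]) ||
        !isdigit(ptr[2]))
        return 0;
    return 1;
}
```

With a valid pointer this is harmless; with a corrupted pointer the very
first isdigit(ptr[0]) is where the process dies.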
(gdb) p ptr
$1 = <value optimized out>
but if you read the source, then:
488 uschar *ptr = buffer;
and nothing should be changing ptr after that. buffer itself is correct;
we can examine the data in it.
(gdb) p *(char *)$r15
Cannot access memory at address 0x1007fff33cfb150
(gdb) p buffer
$6 = (uschar *) 0x7fff33cfb150 "220 [....]
Compare and contrast:
0x1007fff33cfb150
0x7fff33cfb150
Voila. The two values differ in exactly one bit (0x100000000000000,
bit 56). Memory corruption. It just happens that when you deliver mail
to that one host, the memory ends up laid out such that you experience a
crash here.
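If you want to check a suspected bit flip like this yourself, XOR the
two values and count the set bits. The following is a small sketch in C;
the function names differing_bits and lowest_differing_bit are my own for
illustration and are not part of Exim or gdb.

```c
#include <stdint.h>

/* Count the bits that differ between two 64-bit values. */
int differing_bits(uint64_t a, uint64_t b)
{
    uint64_t diff = a ^ b;   /* set bits mark positions that differ */
    int n = 0;
    while (diff != 0) {
        diff &= diff - 1;    /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Return the position of the lowest differing bit, or -1 if equal. */
int lowest_differing_bit(uint64_t a, uint64_t b)
{
    uint64_t diff = a ^ b;
    int pos = 0;
    if (diff == 0)
        return -1;
    while ((diff & 1) == 0) {
        diff >>= 1;
        pos++;
    }
    return pos;
}
```

Fed the %r15 value 0x1007fff33cfb150 and the buffer address
0x7fff33cfb150 from the gdb session above, differing_bits() returns 1 and
lowest_differing_bit() returns 56.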
That this is repeatable means you have bad RAM.
Invest in a system using ECC RAM.
-Phil