Re: [exim] General protection error

Author: Phil Pennock
Date:  
To: Pawel Rutkowski
CC: exim-users
Subject: Re: [exim] General protection error
On 2011-09-11 at 18:49 +0200, Pawel Rutkowski wrote:
> Hello again,
> > That's ... rather worrying to see in a backtrace; smtp_read_response()
> > is very well-used code and we shouldn't be seeing surprises there.
> >
> > Any chance that you could compile Exim with debug information (-ggdb)
> > and _not_ strip it, so we can see more information in the backtrace?
> >
>
> Yes, now more information:
>
> (gdb) bt
> #0 0x000000000046b660 in smtp_read_response
> (inblock=0x7ffff18bad50,buffer=0x7ffff18b9d10 "220 proksima.home.pl ESMTP
> IdeaSmtpServer v0.70 ready.\r", size=4041, okdigit=50, timeout=300) at
> smtp_out.c:512


I failed to follow up on this; sorry for the silence. I've taken a
look.

You have bad memory in your system and had a bit flip. Invest in ECC
RAM.

   │0x46b64e <smtp_read_response+350>       callq  0x41e5fe <debug_printf>
   │0x46b653 <smtp_read_response+355>       cmp    $0x2,%ebx
   │0x46b656 <smtp_read_response+358>       jle    0x46b699 <smtp_read_response+425>
   │0x46b658 <smtp_read_response+360>       callq  0x4152e8 <__ctype_b_loc@plt>
   │0x46b65d <smtp_read_response+365>       mov    (%rax),%rdx

  >│0x46b660 <smtp_read_response+368>       movzbl (%r15),%eax

   │0x46b664 <smtp_read_response+372>       testb  $0x8,0x1(%rdx,%rax,2)


At this point, we are in smtp_out.c at:
512   if (count < 3 ||
513      !isdigit(ptr[0]) ||
514      !isdigit(ptr[1]) ||
515      !isdigit(ptr[2]) ||


The "count < 3" test on line 512 has just been executed as
"cmp $0x2,%ebx": if count is less than or equal to 2, we jump away
(the compiler turned "< 3" into "<= 2").

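We've called __ctype_b_loc() (via the PLT) to fetch the character-class
table that isdigit() indexes into, and loaded the table pointer into
%rdx. The isdigit() test itself is the testb that follows.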

We're now trying to load the byte stored at the address in %r15 into
%eax, but %r15 points to invalid memory.
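
For anyone not used to seeing glibc's ctype macros in a disassembly,
here is a rough Python model of that lookup. It is a sketch, not glibc's
code. It assumes little-endian x86-64, where glibc's _ISdigit flag is
0x0800 in a table of 16-bit entries. The table built here is a stand-in
covering byte values 0-255 only, and isdigit_like_asm is a made-up name.

```python
import struct

IS_DIGIT = 0x0800  # glibc's _ISdigit on little-endian (assumed platform)

# Stand-in for the table behind __ctype_b_loc(): one little-endian
# 16-bit entry per byte value, with the digit flag set for '0'..'9'.
table = b"".join(
    struct.pack("<H", IS_DIGIT if 0x30 <= c <= 0x39 else 0)
    for c in range(256)
)

def isdigit_like_asm(table, c):
    """Model of: testb $0x8, 0x1(%rdx,%rax,2) with %rdx=table, %rax=c."""
    # Entries are 2 bytes wide, so entry c starts at offset 2*c.
    # Offset +1 selects the high byte, where 0x0800 shows up as 0x08.
    return (table[2 * c + 1] & 0x08) != 0

print(isdigit_like_asm(table, ord("2")), isdigit_like_asm(table, ord("a")))
```

The point is that the load of ptr[0] from (%r15) has to succeed before
the table can be indexed at all, which is why the fault lands on the
movzbl and not on the testb.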

(gdb) p ptr
$1 = <value optimized out>

but if you read the source, then:
488 uschar *ptr = buffer;
and nothing should change ptr after that. buffer itself is correct: we
can examine the data in it.

(gdb) p *(char *)$r15
Cannot access memory at address 0x1007fff33cfb150
(gdb) p buffer
$6 = (uschar *) 0x7fff33cfb150 "220 [....]

Compare and contrast:

  0x1007fff33cfb150
     0x7fff33cfb150


Voila. Memory corruption: the value in %r15 is buffer with a single
extra bit (bit 56) set. It just happens that when you deliver mail to
that one host, the memory ends up laid out such that you experience a
crash here.
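
If you want to check the arithmetic, XORing the two values makes the
difference explicit. A small Python sketch, using only the two addresses
from the gdb session above (flipped_bits is just an illustrative helper
name):

```python
# Pointer gdb found in %r15, and the value it should have had (buffer).
corrupt = 0x1007fff33cfb150
expected = 0x7fff33cfb150

def flipped_bits(a, b):
    """Return the positions of the bits that differ between a and b."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(hex(corrupt ^ expected), flipped_bits(corrupt, expected))
# prints: 0x100000000000000 [56]
```

Exactly one bit differs. As far as I can tell that also accounts for
the "general protection" wording in the subject: with the usual 48-bit
virtual addresses on x86-64, an address with bit 56 set and the bits
around it clear is non-canonical, and touching one raises a general
protection fault instead of an ordinary page fault.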

That this is repeatable means you have bad RAM.

Invest in a system using ECC RAM.

-Phil