[pcre-dev] [Bug 1803] segfault in pcre jit when running twig…

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1803] segfault in pcre jit when running twig test suite (PHP7)
https://bugs.exim.org/show_bug.cgi?id=1803

--- Comment #16 from Nish Aravamudan <nish.aravamudan@???> ---
(In reply to Zoltan Herczeg from comment #15)
> (In reply to Nish Aravamudan from comment #14)
> > (gdb) break ext/pcre/php_pcre.c:1794 if strcmp(subject,
> > "\303\251\303\204\303\237\343\201\224a") == 0
>
> strcmp?


As mentioned earlier, it only reproduces if I run the entire test suite
(`phpunit` invocation). So I do that as the argument to php in gdb, but want to
break for this particular string as the known SEGV-inducing subject.

> Do you mean this line:
>
> count = pcre_exec(pce->re, extra, subject,
>                   subject_len, start_offset,
>                   exoptions|g_notempty, offsets, size_offsets);

>
> Actually the line 1794 is empty here, so I suspect there is an offset
> difference between your source code and the master:
>
> https://github.com/php/php-src/blob/master/ext/pcre/php_pcre.c#L1794


You're right. I have put two breakpoints in, one at the above pcre_exec line
and one at the count==0 check that follows; the first to get the values of
subject and start_offset passed to pcre_exec, the second to get the values of
offsets returned.

> > (gdb) c
> > ...
> > (gdb) print offsets[0]
> > $5 = 2
> > (gdb) print last_match
> > $6 = 0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a"
> > (gdb) print offsets[0]
> > $7 = 2
> > (gdb) print offsets[1]
> > $8 = 2
> > (gdb) c
> > ...
>
> So the first match is an empty match at offset 2.
>
> > (gdb) print last_match
> > $9 = 0x7fffed42e24a "\303\204\303\237\343\201\224a"
> > (gdb) print offsets[0]
> > $10 = -1
> > (gdb) print offsets[1]
> > $11 = -1
> > ...
>
> Is this a rerun because of:
>
> g_notempty = (offsets[1] == offsets[0])? PCRE_NOTEMPTY_ATSTART |
> PCRE_ANCHORED : 0;


Checking...

(gdb) print g_notempty
$70 = 268435472

which is 0x10000010, i.e. exactly these two flags OR'd together:

#define PCRE_NOTEMPTY_ATSTART   0x10000000
#define PCRE_ANCHORED           0x00000010


So seems likely?

<snip>

> > SIGSEGV
>
> At this point I suspect something is wrong with start_offset, but it needs a
> proof. The last_match seems to have been updated to offset 4 (substring
> "\303\237\343\201\224a"), but start_offset is below 4, and pcre returns
> the same 2-4 match again. A string from offsets 4-2 cannot be constructed,
> since the end is smaller than the start.
>
> Could you also print start_offset and subject as well?
>
> (gdb) print substring


I assume you mean subject here?

> (gdb) print last_match
> (gdb) print start_offset
> (gdb) print offsets[0]
> (gdb) print offsets[1]
>
> For all iterations?


Here you go (excluding some typos on my part):

(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/php /usr/bin/phpunit --bootstrap lib/Twig/autoload.php
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
PHPUnit 5.1.3 by Sebastian Bergmann and contributors.

....FF.......................................................   61 / 1172 (  5%)
.............................................................  122 / 1172 ( 10%)
.............................................................  183 / 1172 ( 15%)
.............................................................  244 / 1172 ( 20%)
.............................................................  305 / 1172 ( 26%)
...........................................
Breakpoint 9, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1786
1786            count = pcre_exec(pce->re, extra, subject,
(gdb) print subject
$44 = 0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a"
(gdb) print last_match
$45 = 0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a"
(gdb) print start_offset
$46 = 0
(gdb) c
Continuing.


Breakpoint 8, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1794
1794            if (count == 0) {
(gdb) print offsets[0]
$47 = 2
(gdb) print offsets[1]
$48 = 2
(gdb) c
Continuing.


Breakpoint 9, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1786
1786            count = pcre_exec(pce->re, extra, subject,
(gdb) print subject
$49 = 0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a"
(gdb) print last_match
$50 = 0x7fffed42e24a "\303\204\303\237\343\201\224a"
(gdb) print start_offset
$51 = 2
(gdb) c
Continuing.


Breakpoint 8, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1794
1794            if (count == 0) {
(gdb) print offsets[0]
$52 = -1
(gdb) print offsets[1]
$53 = -1
(gdb) c
Continuing.


Breakpoint 9, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1786
1786            count = pcre_exec(pce->re, extra, subject,
(gdb) print subject
$54 = 0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a"
(gdb) print last_match
$55 = 0x7fffed42e24a "\303\204\303\237\343\201\224a"
(gdb) print start_offset
$57 = 4
(gdb) c
Continuing.


Breakpoint 8, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1794
1794            if (count == 0) {
(gdb) print offsets[0]
$58 = 2
(gdb) print offsets[1]
$59 = 4
(gdb) c
Continuing.


Breakpoint 9, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1786
1786            count = pcre_exec(pce->re, extra, subject,
(gdb) print subject
$60 = 0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a"
(gdb) print last_match
$61 = 0x7fffed42e24c "\303\237\343\201\224a"
(gdb) print start_offset
$62 = 4
(gdb) c
Continuing.


Breakpoint 8, php_pcre_split_impl (pce=0x555555d33520, 
    subject=0x7fffed42e248 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1794
1794            if (count == 0) {
(gdb) print offsets[0]
$66 = 2
(gdb) print offsets[1]
$67 = 4
(gdb) c
Continuing.


Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned ()
    at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:273
273    ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: No such file or directory.


> I am sorry for so many debugging requests, but I am not a php developer and
> just doing guesses here.


Neither am I :) I appreciate your help!

-Nish
