Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1803] segfault in pcre jit when running twig test suite (PHP7)
https://bugs.exim.org/show_bug.cgi?id=1803

--- Comment #22 from Nish Aravamudan <nish.aravamudan@???> ---
(In reply to Zoltan Herczeg from comment #21)
> Another idea just came to my mind.
>
> It seems that all patterns are compiled by pcre_compile here:
>
> https://github.com/php/php-src/blob/master/ext/pcre/php_pcre.c#L433
>
> Would it be possible to dump all regex compilation to some file after this
> call?
>
> E.g.
>
> re = pcre_compile(pattern,
>           coptions,
>           &error,
>           &erroffset,
>           tables);
>
> FILE *f = fopen("dump_file", "a"); // appending at the end
> fprintf(f, "/%s/ 0x%x -> %p\n", pattern, coptions, re);
> fclose(f);
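
For anyone who does rebuild with this dump in place, here is a slightly more defensive sketch of the same idea. The helper name dump_compiled_regex and its path parameter are made up for illustration and are not part of php_pcre.c; the format string is the one quoted above. The only real change is that a failed fopen() no longer crashes in fprintf().

```c
#include <stdio.h>

/* Hypothetical helper (not in php_pcre.c): append one line per compiled
 * pattern to `path`, in the "/pattern/ 0xoptions -> pointer" format from
 * the quoted suggestion. Returns 0 on success, -1 on any I/O failure. */
int dump_compiled_regex(const char *path, const char *pattern,
                        int coptions, const void *re)
{
    FILE *f = fopen(path, "a");   /* append, so earlier entries survive */
    if (f == NULL)
        return -1;                /* do not hand a NULL stream to fprintf */

    int ok = fprintf(f, "/%s/ 0x%x -> %p\n",
                     pattern, (unsigned int)coptions, (void *)re) > 0;

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

It would be called right after pcre_compile() with the same pattern, coptions and re values, and the return value can be ignored if a missing dump is acceptable.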


Recompiling PHP7.0 is quite slow, so I tried doing this with gdb...

> It would be easy to find the offending pattern from this list. Just find the
> latest entry which has the same address as pce->re.


I set a breakpoint at ext/pcre/php_pcre.c:1720 or so, which is the
pce->refcount++ in (zif_)preg_split. I ignored the first 232 hits of it, and
on the last one I got:

Breakpoint 4, zif_preg_split (execute_data=<optimized out>, 
    return_value=0x7ffff381b240)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1720
1720        pce->refcount++;
(gdb) print pce
$48 = (pcre_cache_entry *) 0x555555d333f0
(gdb) print subject->val@10
$47 = {"\303", "\251", "\303", "\204", "\303", "\237", "\343", "\201", "\224", 
  "a"}
(gdb) print regex->val@14
$58 = {"/", "(", "?", "<", "!", "^", ")", "(", "?", "!", "$", ")", "/", "u"}
(gdb) print &pce->re
$60 = (pcre **) 0x555555d333f0
(gdb) cont
Continuing.


Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned ()
    at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:271
271    ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: No such file or directory.
(gdb) up
#1  0x00005555556798d8 in memcpy (__len=18446744073709551614, 
    __src=0x7fffed40b1ac, __dest=0x7fffed7a6348)
    at /usr/include/x86_64-linux-gnu/bits/string3.h:53
53      return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) up
#2  zend_string_init (persistent=0, len=18446744073709551614, 
    str=0x7fffed40b1ac "\303\237\343\201\224a")
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/Zend/zend_string.h:159
159        memcpy(ZSTR_VAL(ret), str, len);
(gdb) up
#3  php_pcre_split_impl (pce=0x555555d333f0, 
    subject=0x7fffed40b1a8 "\303\251\303\204\303\237\343\201\224a", 
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, 
    flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1808
1808                        ZVAL_STRINGL(&tmp, last_match, &subject[offsets[0]]-last_match);
(gdb) print pce
$61 = (pcre_cache_entry *) 0x555555d333f0
(gdb) print &pce->re
$62 = (pcre **) 0x555555d333f0


So the regex in question, I think, is:

/(?<!^)(?!$)/u

which does correspond to the output I got from the above printf in gdb. Does
that help narrow down where the bug might be?
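
One more observation from the backtrace, in case it is useful. The len=18446744073709551614 in frames #1 and #2 is what -2 looks like after conversion to a 64-bit size_t. In frame #3, last_match (0x7fffed40b1ac) is subject + 4, so if I am reading line 1808 right, offsets[0] must have been 2, i.e. the reported match start is behind last_match. The sketch below only reproduces that arithmetic; piece_len is a hypothetical name, and the value 2 for offsets[0] is my inference from the addresses, not something I printed in gdb.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the length expression at php_pcre.c:1808,
 *     &subject[offsets[0]] - last_match
 * converted to size_t the way it is when it reaches memcpy(). */
size_t piece_len(const char *subject, int match_start, const char *last_match)
{
    /* The pointer difference is signed (ptrdiff_t); when the match start
     * lies before last_match it is negative, and the cast wraps it around
     * to a huge unsigned value. */
    return (size_t)(&subject[match_start] - last_match);
}

/* The 10-byte subject from the gdb session, written with octal escapes. */
const char example_subject[] = "\303\251\303\204\303\237\343\201\224a";
```

With last_match = example_subject + 4 and match_start = 2, piece_len() returns (size_t)-2, which on x86_64 is the 18446744073709551614 seen in the memcpy call.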

Do you still want me to do the control flow analysis?

--
You are receiving this mail because:
You are on the CC list for the bug.