https://bugs.exim.org/show_bug.cgi?id=1803
--- Comment #22 from Nish Aravamudan <nish.aravamudan@???> ---
(In reply to Zoltan Herczeg from comment #21)
> Another idea just came to my mind.
>
> It seems that all patterns are compiled by pcre_compile here:
>
> https://github.com/php/php-src/blob/master/ext/pcre/php_pcre.c#L433
>
> Would it be possible to dump all regex compilation to some file after this
> call?
>
> E.g.
>
> re = pcre_compile(pattern,
> coptions,
> &error,
> &erroffset,
> tables);
>
> FILE *f = fopen("dump_file", "a"); // appending at the end
> fprintf(f, "/%s/ 0x%x -> %p\n", pattern, coptions, re);
> fclose(f);
Recompiling PHP7.0 is quite slow, so I tried doing this with gdb...
> It would be easy to find the offending pattern from this list. Just find the
> latest entry which has the same address as pce->re.
I set a breakpoint at ext/pcre/php_pcre.c:1720 or so, which is the
pce->refcount++ in (zif_)preg_split. I ignored the first 232 hits of it,
and on the last one got:
Breakpoint 4, zif_preg_split (execute_data=<optimized out>,
return_value=0x7ffff381b240)
at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1720
1720 pce->refcount++;
(gdb) print pce
$48 = (pcre_cache_entry *) 0x555555d333f0
(gdb) print subject->val@10
$47 = {"\303", "\251", "\303", "\204", "\303", "\237", "\343", "\201", "\224",
"a"}
(gdb) print regex->val@14
$58 = {"/", "(", "?", "<", "!", "^", ")", "(", "?", "!", "$", ")", "/", "u"}
(gdb) print &pce->re
$60 = (pcre **) 0x555555d333f0
(gdb) cont
Continuing.
Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned ()
at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:271
271 ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: No such file or directory.
(gdb) up
#1 0x00005555556798d8 in memcpy (__len=18446744073709551614,
__src=0x7fffed40b1ac, __dest=0x7fffed7a6348)
at /usr/include/x86_64-linux-gnu/bits/string3.h:53
53 return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) up
#2 zend_string_init (persistent=0, len=18446744073709551614,
str=0x7fffed40b1ac "\303\237\343\201\224a")
at /build/php7.0-WHFaJZ/php7.0-7.0.3/Zend/zend_string.h:159
159 memcpy(ZSTR_VAL(ret), str, len);
(gdb) up
#3 php_pcre_split_impl (pce=0x555555d333f0,
subject=0x7fffed40b1a8 "\303\251\303\204\303\237\343\201\224a",
subject_len=10, return_value=0x7ffff381b240, limit_val=-1,
flags=<optimized out>)
at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1808
1808 ZVAL_STRINGL(&tmp, last_match, &subject[offsets[0]]-last_match);
(gdb) print pce
$61 = (pcre_cache_entry *) 0x555555d333f0
(gdb) print &pce->re
$62 = (pcre **) 0x555555d333f0
So the regex in question, I think, is:
/(?<!^)(?!$)/u
which does correspond to the output I got from the above printf in gdb. Does
that help narrow down where the bug might be?
Do you still want me to do the control flow analysis?
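
A note on the bogus length in frames #1 to #3: len=18446744073709551614 is
2^64 - 2, i.e. -2 converted to size_t. In frame #2, str (which is last_match)
is 0x7fffed40b1ac, i.e. subject + 4, so the expression on line 1808 would have
to be evaluating with offsets[0] == 2. That assumes the values gdb shows are
not distorted by optimization. A minimal C sketch of that arithmetic follows;
piece_len is a hypothetical helper for illustration, not a PHP or PCRE
function:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the length expression in frame #3:
 *     &subject[offsets[0]] - last_match
 * The signed pointer difference is converted to size_t, which is what
 * happens when it is used as a string length.
 */
static size_t piece_len(const char *subject, int match_start,
                        const char *last_match)
{
    /* Negative when the match starts before last_match. */
    ptrdiff_t diff = &subject[match_start] - last_match;

    /* Conversion to an unsigned type wraps modulo SIZE_MAX + 1. */
    return (size_t)diff;
}
```

With the 10-byte subject from the trace, piece_len(subject, 2, subject + 4)
yields (size_t)-2, which on a 64-bit build matches the __len passed to memcpy
above. So the crash looks consistent with a match being reported at an offset
before the end of the previous piece.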