http://bugs.exim.org/show_bug.cgi?id=1437
Summary: Using PCRE-8.34 on x86-64 Linux with --enable-jit and --enable-utf,
grep -iP '^S' gets stuck on a binary file, consuming a lot of CPU for many
seconds
Product: PCRE
Version: 8.34
Platform: x86-64
URL: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=16499
OS/Version: Linux
Status: NEW
Severity: bug
Priority: medium
Component: Code
AssignedTo: ph10@???
ReportedBy: shlomif@???
CC: pcre-dev@???
Created an attachment (id=683)
--> (http://bugs.exim.org/attachment.cgi?id=683)
Offending file to be given as input.
Hi all,
I originally filed the bug in GNU grep. See:
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=16499
for more investigation.
Quoting from it:
Hi all,
After saving the attached file as 1.dat, I see that grep -iP '^Subject:' (or
just '^S') gets stuck in the en_US.UTF-8 locale. The same search completes
quickly with pcregrep and with ack.
[SHELL]
shlomif <at> telaviv1:~$ time LC_ALL=en_US.UTF-8
~/apps/TEST-grep-from-git-TO-DEL/bin/grep -iP '^Subject:' < 1.dat ^C
real 0m4.199s
user 0m4.195s
sys 0m0.003s
shlomif <at> telaviv1:~$ time LC_ALL=en_US.UTF-8
~/apps/TEST-grep-from-git-TO-DEL/bin/grep -iP '^S' < 1.dat ^C
real 0m3.486s
user 0m3.485s
sys 0m0.001s
shlomif <at> telaviv1:~$ time LC_ALL=en_US.UTF-8
~/apps/TEST-grep-from-git-TO-DEL/bin/grep -iE '^S' < 1.dat
real 0m0.002s
user 0m0.002s
sys 0m0.000s
shlomif <at> telaviv1:~$ time LC_ALL=en_US.UTF-8
~/apps/TEST-grep-from-git-TO-DEL/bin/grep -P '^S' < 1.dat ^C
real 0m1.887s
user 0m1.885s
sys 0m0.000s
shlomif <at> telaviv1:~$ time LC_ALL=en_US.UTF-8
~/apps/TEST-grep-from-git-TO-DEL/bin/grep -P '^Subject:' < 1.dat
real 0m0.003s
user 0m0.000s
sys 0m0.002s
shlomif <at> telaviv1:~$ time LC_ALL=C
~/apps/TEST-grep-from-git-TO-DEL/bin/grep -iP '^Subject:' < 1.dat
real 0m0.003s
user 0m0.001s
sys 0m0.001s
shlomif <at> telaviv1:~$ time LC_ALL=C pcregrep -i '^Subject:' < 1.dat
real 0m0.002s
user 0m0.001s
sys 0m0.000s
shlomif <at> telaviv1:~$ time LC_ALL=C ack -i '^Subject:' 1.dat
real 0m0.066s
user 0m0.059s
sys 0m0.007s
shlomif <at> telaviv1:~$ time LC_ALL=en_US.UTF-8 ack -i '^Subject:' 1.dat
real 0m0.070s
user 0m0.063s
sys 0m0.006s
[/SHELL]
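To reproduce the runs above without interrupting each one by hand with ^C, something like the following sketch might help. It assumes the `timeout` utility from GNU coreutils is available; the function name `check_hang` and the 5-second limit are illustrative choices of mine, not part of grep or PCRE.

```shell
#!/bin/bash
# Sketch: run a command with stdin taken from a file, under a time limit,
# and report whether it finished or had to be killed.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.

# check_hang LIMIT_SECONDS INPUT_FILE COMMAND [ARGS...]
check_hang() {
    local limit="$1" input="$2"
    shift 2
    timeout "$limit" "$@" < "$input" > /dev/null 2>&1
    local status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG (killed after ${limit}s): $*"
    else
        echo "done (exit status $status): $*"
    fi
}

# Example use against the attachment, skipped when 1.dat is not present.
if [ -f 1.dat ]; then
    for opts in -iP -P -iE; do
        check_hang 5 1.dat env LC_ALL=en_US.UTF-8 grep "$opts" '^S'
    done
fi
```

A "done" line with exit status 0 or 1 only means that grep terminated (with or without a match); a "HANG" line means the command was still running when the limit expired.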
The same thing happens with grep-2.16 built from the sources. I'm on Mageia
Linux x86-64 Cauldron (what will be Mageia 4).
shlomif <at> telaviv1:~$ ldd ~/apps/TEST-grep-from-git-TO-DEL/bin/grep
linux-vdso.so.1 (0x00007fff2a7fe000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f19ed302000)
libc.so.6 => /lib64/libc.so.6 (0x00007f19ecf4d000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f19ecd30000)
/lib64/ld-linux-x86-64.so.2 (0x00007f19ed568000)
shlomif <at> telaviv1:~$ rpm -qf /lib64/libpcre.so.1
lib64pcre1-8.33-2.mga4
Regards,
Shlomi Fish
After some investigation I discovered that the problem manifests on x86-64
systems only when PCRE-8.x is built with JIT support (and, naturally, with
--enable-utf as well). The hang occurs inside a JIT-generated function, which
has no debugging symbols.
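Since the hang depends on how libpcre was built, it can help to confirm what a given build supports. The following is a rough sketch that classifies the output of `pcretest -C`, which prints the compile-time options of a PCRE build. The function names are hypothetical, and the exact wording of the output lines is an assumption based on PCRE 8.x; the patterns may need adjusting for other versions.

```shell
#!/bin/bash
# Sketch: classify a PCRE build from the text printed by `pcretest -C`.
# Assumption: PCRE 8.x prints "Just-in-time compiler support: <arch>" when
# JIT is built in and "No just-in-time compiler support" otherwise, and
# likewise "UTF-8 support" / "No UTF-8 support".

# has_jit: reads `pcretest -C` output on stdin; succeeds if JIT is reported.
has_jit() {
    grep -q '^ *Just-in-time compiler support'
}

# has_utf8: reads the same output; succeeds if UTF-8 support is reported.
has_utf8() {
    grep -q '^ *UTF-8 support'
}

# Query the real tool only when it is installed.
if command -v pcretest > /dev/null 2>&1; then
    config=$(pcretest -C)
    if printf '%s\n' "$config" | has_jit; then
        echo "JIT: yes"
    else
        echo "JIT: no"
    fi
    if printf '%s\n' "$config" | has_utf8; then
        echo "UTF-8: yes"
    else
        echo "UTF-8: no"
    fi
fi
```

Note that this inspects whichever pcretest is first in PATH, which is not necessarily the libpcre that grep is linked against; the ldd output is the place to check that.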
When I built PCRE and GNU grep-2.16 as follows on a Debian Testing ("jessie")
x86-64 VM, running LC_ALL=en_US.UTF-8 ~/apps/grep/bin/grep -iP '^S' < 1.dat
caused it to hang:
BUILD_pcre.bash:
«
#!/bin/bash
CFLAGS="-g" ./configure --prefix="$HOME/apps/pcre" --enable-utf --enable-jit
»
BUILD_grep.bash:
«
#!/bin/bash
# Source this file.
export CPATH="/home/shlomif/apps/pcre/include/"
export LD_LIBRARY_PATH="/home/shlomif/apps/pcre/lib"
export LIBRARY_PATH="/home/shlomif/apps/pcre/lib"
CFLAGS="-g" ./configure --prefix="$HOME/apps/grep"
»
(Searching for «-iP '^Su'» was fine.)
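Because '^S' hangs while '^Su' does not, it may be worth trying every prefix of the original pattern in turn. Below is a minimal sketch of that idea; `pattern_prefixes` is a hypothetical helper name, and the 5-second limit and the use of GNU coreutils `timeout` are assumptions of the sketch.

```shell
#!/bin/bash
# Sketch: generate '^S', '^Su', '^Sub', ... and try each one against 1.dat.

# pattern_prefixes ANCHOR WORD: print ANCHOR followed by each prefix of WORD.
pattern_prefixes() {
    local anchor="$1" word="$2" i
    for (( i = 1; i <= ${#word}; i++ )); do
        printf '%s%s\n' "$anchor" "${word:0:$i}"
    done
}

# Try each prefix, skipped when the attachment is not present.
if [ -f 1.dat ]; then
    pattern_prefixes '^' 'Subject:' | while IFS= read -r pat; do
        LC_ALL=en_US.UTF-8 timeout 5 grep -iP -- "$pat" < 1.dat > /dev/null 2>&1
        if [ $? -eq 124 ]; then
            echo "$pat: timed out"
        else
            echo "$pat: finished"
        fi
    done
fi
```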
---------------
I'll attach the 1.dat file here.
Regards,
-- Shlomi Fish