[pcre-dev] [Bug 1437] Using PCRE-8.34 on x86-64 Linux with …

Author: Shlomi Fish
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1437] Using PCRE-8.34 on x86-64 Linux with --enable-jit and --enable-utf , grep -iP '^S' gets stuck on a binary file consuming a lot of CPU for many seconds

http://bugs.exim.org/show_bug.cgi?id=1437




--- Comment #4 from Shlomi Fish <shlomif@???> 2014-01-23 11:45:11 ---
Hi Zoltan,

(In reply to comment #3)
> > Well, I also tried it with libpcre-8.34 on x86-64 on Debian Testing, and was
> > able to reproduce the bug there. 8.34 is the latest version according to
> > ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/ .
>
> I tried 8.33 and 8.34 here as well, and no difference in JIT.


Where is "here"? What are your architecture, operating system and distribution,
and their versions? On which system are you deploying this?

> I forgot to
> compile the interpreter in release mode, so its speed is now 0.62 sec.
>


OK, but on my system it gets stuck and takes many seconds, so that cannot be
the issue.


> > OK, I realise it is a binary file, but the longer story is that it was part of
> > a file in a Claws-Mail directory tree, which I searched using grep -iPr, and
> > once grep got to that file, it got stuck and started consuming a lot of CPU and
> > wouldn't proceed any further.
>
> The pattern is simple and the input is small, so it is unlikely that matching
> it takes more than a second.
>
> Am I understand correctly, that you do a case-sensitive match, and there is no
> match:


What do you mean by "Am I understand correctly"? Do you mean "If I understand
correctly,"? Or do you mean "Am I understanding correctly,"?

>
> ~/apps/TEST-grep-from-git-TO-DEL/bin/grep -P '^S' < 1.dat
>


That's not how I invoked it. I specifically used GNU grep's "-i" flag and an
en_US.UTF-8 locale, so it was a case-insensitive search rather than a
case-sensitive one.

> real    0m1.887s
> user    0m1.885s
> sys     0m0.000s

>
> I still suspect that the interface between grep and pcre is not correct, either
> grep misinterpret a return value, or the JIT gives back something unexpected.
> That is why I need to number of times pcre_exec is called, its arguments and
> return values. Perhaps some uninitialized memory data (ovector?) is the
> culprit.


OK, I'll get this information for you.
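
One possible way to collect it, as a rough sketch only: assuming grep's source
can be patched at the point where it calls pcre_exec, a small helper could
print the call count, the integer arguments, the return value and the offset
pairs that ovector holds after a successful match. The helper name
log_exec_call and the counter exec_calls are hypothetical; they are not part
of PCRE or grep.

```c
#include <stdio.h>

/* Hypothetical running count of logged pcre_exec() calls. */
static unsigned long exec_calls;

/*
 * Hypothetical helper: log one pcre_exec() call and hand back its result.
 * The int parameters mirror the PCRE 8.x pcre_exec() arguments of the same
 * names; rc is the value pcre_exec() returned.
 */
static int log_exec_call(FILE *out, int length, int startoffset, int options,
                         const int *ovector, int ovecsize, int rc)
{
    fprintf(out,
            "pcre_exec #%lu: length=%d startoffset=%d options=0x%x "
            "ovecsize=%d -> rc=%d",
            ++exec_calls, length, startoffset, (unsigned)options,
            ovecsize, rc);

    /* On success rc > 0 and ovector starts with rc (start, end) pairs. */
    if (rc > 0 && ovector != NULL && 2 * rc <= ovecsize) {
        for (int i = 0; i < rc; i++)
            fprintf(out, " [%d,%d]", ovector[2 * i], ovector[2 * i + 1]);
    }
    fputc('\n', out);
    return rc;
}
```

At the call site, the pcre_exec call would be passed as the last argument,
along the lines of
rc = log_exec_call(stderr, len, off, opts, ovec, n, pcre_exec(re, extra, subj, len, off, opts, ovec, n));
where the variable names are placeholders for whatever the caller uses.
Negative return values (such as no match) are logged without any offset pairs.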

Regards,

— Shlomi Fish

