[pcre-dev] [Bug 1149] New: Classes beginning with dots cause…

Author: Justin Viiret
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1149] New: Classes beginning with dots cause pcre_compile to be very slow

http://bugs.exim.org/show_bug.cgi?id=1149
           Summary: Classes beginning with dots cause pcre_compile to be
                    very slow
           Product: PCRE
           Version: 8.13
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
        AssignedTo: ph10@???
        ReportedBy: justin.viiret@???
                CC: pcre-dev@???



We encountered an issue when using PCRE 8.13 on a pattern that contains
character classes that begin with the '.' character -- version 8.12 did not
have this issue.

An example pattern:

/tempests[.\s]+hockshop[.\s]+crimping[.\s]+Jul[.\s]+bobbling[.\s]+severing[.\s]+heedlessness[.\s]+nicking[.\s]+quad[.\s]+deceases[.\s]+Chanukah[.\s]+neck[.\s]+gremlin[.\s]+directors[.\s]+depositing[.\s]+knockouts[.\s]+squirreled[.\s]+Charmaine[.\s]+mortally[.\s]+temblor[.\s]+librettos[.\s]+nationality[.\s]+spelling[.\s]+Kurile[.\s]+reprehensible[.\s]+commonplace[.\s]+cabs[.\s]+Kandahar[.\s]+bulbous[.\s]+rebind[.\s]+perkier[.\s]+descents[.\s]+robbed[.\s]+rope[.\s]+deicing[.\s]+prefects[.\s]+fool[.\s]+Kevin[.\s]+enthroning[.\s]+gravel[.\s]+obtains[.\s]+woollier[.\s]+dahlia[.\s]+sublets[.\s]+gummier[.\s]+fairest[.\s]+firstborns[.\s]+Rodgers[.\s]+depilatories[.\s]*/

Passing this pattern to pcretest -d makes it hang for a very long time.
Profiling it, or interrupting it with a debugger, shows that all of that time
is being spent recursing in check_posix_syntax.
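
To see how the compile time grows with the number of such classes, it may help to generate patterns of the same shape at several sizes and feed each one to pcretest. The following is only a sketch: the helper name build_pattern and the short word list are illustrative assumptions, not part of the original report. It joins the words with '[.\s]+', ends with '[.\s]*', and wraps the result in the '/' delimiters that pcretest expects.

```python
def build_pattern(words, cls=r"[.\s]"):
    """Return a pcretest-style pattern: words joined by cls+, ending in cls*."""
    if not words:
        raise ValueError("need at least one word")
    # Same shape as the example pattern: word[.\s]+word[.\s]+...word[.\s]*
    return "/" + (cls + "+").join(words) + cls + "*/"


if __name__ == "__main__":
    # Hypothetical subset of the words in the example pattern.
    words = ["tempests", "hockshop", "crimping", "Jul", "bobbling"]
    # Print one pattern per size, so each can be timed separately.
    for n in range(1, len(words) + 1):
        print(build_pattern(words[:n]))
```

Each printed line can be pasted at the pcretest "re>" prompt. Passing cls=r"[\s.]" instead produces the variant in which the classes do not begin with a dot, which is useful as a comparison.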

My suspicion is that this is due to the patch for bug 1123. There's also
potentially a correctness issue -- if we shorten the pattern and swap one of
the later '[.\s]' for a '[\s.]', the pattern is rejected:

$ pcre/pcretest -d
PCRE version 8.13 2011-08-16

re> /tempests[.\s]+hockshop[\s.]/

Failed: POSIX collating elements are not supported at offset 8

Perl accepts this pattern and is able to match "tempests hockshop " against it.
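
Offset 8 is the '[' that opens the first class ("tempests" is eight characters long). This suggests that everything from that '[.' up to the closing '.]' of the second class was read as a single POSIX collating element. That reading is an inference from the reported offset and is not confirmed in the report. The intended semantics can be checked with any engine that treats '[.\s]' as an ordinary class. The sketch below uses Python's re module purely as a stand-in; it is a different engine from both PCRE and Perl, so it only illustrates the expected result.

```python
import re

# Shortened pattern from the report, without the pcretest '/' delimiters.
pattern = re.compile(r"tempests[.\s]+hockshop[\s.]")
subject = "tempests hockshop "

# Both classes are ordinary sets containing '.' and whitespace.
m = pattern.match(subject)
print(repr(m.group(0)) if m else "no match")
```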

