Re: [pcre-dev] (no subject)

Author: swati upadhyaya
Date:  
To: pcre-dev
Old-Topics: Re: [pcre-dev] (no subject)
Subject: Re: [pcre-dev] (no subject)
hi,
I am compiling
<<((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\)>((?:\\[<:>]|[^<:>])*)(\\\?)?(?<!\\)>
with pcre_compile, and it gives me the error "missing )". But when I use the
same pattern at http://regex101.com/r/mH7iW6/2#python against
the subject string
"Info;<<words:=:all>;>"
it gives me exactly the result I want.
So why does pcre_compile report a compilation error?

regards
Swati


On Thu, Apr 24, 2014 at 10:35 PM, <ph10@???> wrote:

> On Thu, 24 Apr 2014, swati upadhyaya wrote:
>
> > Thanks for your reply. It would be great if you could sort out my
> > problem. I have tried many patterns and found that PCRE takes less
> > time than any other regex library, which is why I want to use PCRE, but
> > there are some patterns, like the one above, that it is unable to match.

>
> Is this pattern generated by some process? It contains really silly
> sequences like \s*(?:(?:(?:\s+)))\s* and similar. I had a further look.
> I found it was failing at the \t in the sequence
>
> \s*\s*(?:(?:(?:[\t]+)))\s*\s*
>
> (another crazy sequence) because there were no tab characters in the
> data string. So I changed \t to \s (to match a space). The match then
> failed with
>
> Error -8 (match limit exceeded)
>
> In other words, the pattern makes a very large search tree, which takes
> a long time to scan. Sequences such as (?:(?:\w+\s?)+))) are dangerous
> because they contain nested unlimited repeats.
>
> This is such a crazy pattern that I really can't mess with it any more. Can
> you not find a way of creating a clean pattern without all the
> redundancy? It might then be easier to see why it runs for so long. I'm
> suspicious of all the .*? items: each of those is going to try the rest
> of the pattern after swallowing 0, 1, 2, 3, ... characters. The use of
> atomic groups (?>.....) would also stop a lot of the backtracking.
>
> Aha! I changed (?:(?:\w+\s?)+))) to (?:(?>\w+\s?)+))) that is, made it
> into an atomic group, and lo and behold, when I ran pcretest:
>
> PCRE version 8.35 2014-04-04
>
> "MSWinEventLog\s*(?:(?:(?:\s+)))\s*(?:\s*(?:(?:(?:\d\s+)))\s*)?\s*(?:(?P<event_log__string>(?:\S+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:\s+)))\s*\s*(?:(?P<event_id__0>(?:4610|4614|4622)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?P<event_source__all>(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?P<event_category__all>(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:(A|An).*?)))\s*\s*(?:(?P<object__words>(?:(?>\w+\s?)+)))\s*\s*(?:(?:(?:has
> been)))\s*\s*(?:(?P<action__0>(?:loaded)))\s*\s*(?:(?:(?: by
> the)))\s*\s*(?:(?:(?:.*?)))\s*Package
> Name\:\s*(?:(?P<package__0>(?:\S+)))\s*"
> <14>Mar 2 11:34:38 89.237.143.23 MSWinEventLog 1 Security 6500 Fri Mar 02
> 11:34:37 2012 4610 Microsoft-Windows-Security-Auditing    N/A    N/A
>  Success Audit prabhat.ImmuneAps.com    User Logoff    A authentication
> package has been loaded by the Local Security Authority. This
> authentication package will be used to authenticate logon attempts.
>  Authentication Package Name: C:\\Windows\\system32\\msv1_0.dll :
> MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
>  0: MSWinEventLog 1 Security 6500 Fri Mar 02 11:34:37 2012 4610
> Microsoft-Windows-Security-Auditing    N/A    N/A    Success Audit
> prabhat.ImmuneAps.com    User Logoff    A authentication package has been
> loaded by the Local Security Authority. This authentication package will be
> used to authenticate logon attempts.  Authentication Package Name:
> C:\Windows\system32\msv1_0.dll
>  1: Security
>  2: 4610
>  3: Microsoft-Windows-Security-Auditing
>  4: prabhat.ImmuneAps.com    User Logoff
>  5: A
>  6: authentication package
>  7: loaded
>  8: C:\Windows\system32\msv1_0.dll

>
> ... and this was pretty well instantaneous.
>
> Philip
>
> --
> Philip Hazel
>
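
The atomic-group change described in the quoted reply can be sketched outside PCRE as well. The snippet below is only an illustration and rests on several assumptions: it uses Python's re module rather than PCRE, it reduces the pattern to the one fragment under discussion followed by the literal "has been", and the names NESTED, ATOMIC, object_words and seconds_to_fail are made up for this sketch. Because Python accepts (?>...) only from version 3.11, the atomic group is emulated with the lookahead-plus-backreference idiom (?=(X))\N, which should behave like (?>X).

```python
import re
import time

# Nested unlimited repeats, modelled on the (?:(?:\w+\s?)+) fragment above.
NESTED = re.compile(r"((?:\w+\s?)+)has been")

# Inner repeat made atomic. Python's re only accepts (?>...) from 3.11 on,
# so this sketch uses the portable idiom (?=(X))\N in its place.
ATOMIC = re.compile(r"((?:(?=(\w+\s?))\2)+)has been")


def object_words(pattern, subject):
    """Return the text captured by group 1, or None if there is no match."""
    m = pattern.search(subject)
    return m.group(1) if m else None


def seconds_to_fail(pattern, length):
    """Time a search that cannot succeed: the subject never contains 'has been'."""
    subject = "a" * length
    start = time.perf_counter()
    assert pattern.search(subject) is None
    return time.perf_counter() - start


if __name__ == "__main__":
    subject = "A authentication package has been loaded"
    print(repr(object_words(NESTED, subject)))
    print(repr(object_words(ATOMIC, subject)))
    # Small lengths only: the nested form slows down sharply as length grows.
    for length in (10, 14, 18):
        print(length, seconds_to_fail(NESTED, length), seconds_to_fail(ATOMIC, length))
```

Both patterns should capture the same words when a match exists. The difference shows up only when the match fails: the nested form can split a run of word characters in many ways before giving up, while the atomic form commits to each \w+\s? chunk and can only give back whole chunks.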