Re: [pcre-dev] (no subject)

Author: swati upadhyaya
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] (no subject)
My code is:

const char* pattern =
"<<((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\)>((?:\\[<:>]|[^<:>])*)(\\\?)?(?<!\\)>";
pcre* re = pcre_compile(pattern,0,&error,&erroffset,NULL);

if (re == NULL){
cout << "PCRE compilation failed at offset " << erroffset << ": " << error << endl;
return NULL;
}
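
A likely cause, not confirmed anywhere in this thread: in an ordinary C++ string literal the compiler turns each "\\" into a single backslash before pcre_compile ever sees the pattern. The lookbehind (?<!\\) therefore arrives as (?<!\) where \) is an escaped parenthesis, so the group is never closed, which would explain "missing )". The sketch below only prints the strings and does not call PCRE; the names kAsPosted, kRaw, kDoubled, kFullPattern and show_patterns are made up for illustration.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// The lookbehind fragment written as in the post: an ordinary string
// literal, where the compiler turns each "\\" into a single backslash.
const std::string kAsPosted = "(?<!\\):";

// The same fragment as a C++11 raw string literal: no escape processing,
// so the regex engine receives both backslashes.
const std::string kRaw = R"re((?<!\\):)re";

// Equivalent ordinary literal: every backslash of the regex is doubled.
const std::string kDoubled = "(?<!\\\\):";

// The full pattern from the original question, copied unchanged into a
// raw string literal (assumes a C++11 compiler).
const std::string kFullPattern =
    R"re(<<((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\)>((?:\\[<:>]|[^<:>])*)(\\\?)?(?<!\\)>)re";

// Prints what the regex engine would actually receive in each case.
void show_patterns() {
    assert(kRaw == kDoubled);  // both spellings give the same characters
    std::cout << "as posted: " << kAsPosted << '\n'
              << "raw:       " << kRaw << '\n'
              << "doubled:   " << kDoubled << '\n';
}
```

Passing kFullPattern.c_str() to pcre_compile in place of the literal above should hand PCRE the same characters that were typed into regex101. Without C++11, doubling every backslash by hand (as in kDoubled) has the same effect.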


On Wed, Jul 9, 2014 at 10:32 AM, swati upadhyaya <swatiupadhyaya@???>
wrote:

> Hi,
> I am compiling
>
> <<((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\):((?:\\[<:>]|[^<:>])+?)(?<!\\)>((?:\\[<:>]|[^<:>])*)(\\\?)?(?<!\\)>
> with pcre_compile and it gives me the error "missing )". But when I use the
> same pattern at http://regex101.com/r/mH7iW6/2#python against
> the subject string
> "Info;<<words:=:all>;>"
> it gives me the result I want.
> So why is it showing a compilation error?
>
> regards
> Swati
>
>
> On Thu, Apr 24, 2014 at 10:35 PM, <ph10@???> wrote:
>
>> On Thu, 24 Apr 2014, swati upadhyaya wrote:
>>
>> > Thanks for your reply. It would be great if you could
>> > sort out my problem. I have tried many patterns and found that PCRE
>> > takes less time than any other regex library, which is why I want to
>> > use PCRE, but there are some patterns, like the one above, that it is
>> > unable to match.

>>
>> Is this pattern generated by some process? It contains really silly
>> sequences like \s*(?:(?:(?:\s+)))\s* and similar. I had a further look.
>> I found it was failing at the \t in the sequence
>>
>> \s*\s*(?:(?:(?:[\t]+)))\s*\s*
>>
>> (another crazy sequence) because there were no tab characters in the
>> data string. So I changed \t to \s (to match a space). The match then
>> failed with
>>
>> Error -8 (match limit exceeded)
>>
>> In other words, the pattern makes a very large search tree, which takes
>> a long time to scan. Sequences such as (?:(?:\w+\s?)+))) are dangerous
>> because they contain nested unlimited repeats.
>>
>> This is such a crazy pattern that I really can't mess with it any more. Can
>> you not find a way of creating a clean pattern without all the
>> redundancy? It might then be easier to see why it runs for so long. I'm
>> suspicious of all the .*? items: each of those is going to try the rest
>> of the pattern after swallowing 0, 1, 2, 3, ... characters. The use of
>> atomic groups (?>.....) would also stop a lot of the backtracking.
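
To make the point about nested unlimited repeats and atomic groups concrete, here is a toy sketch. It is not PCRE's algorithm, only a hand-written backtracking matcher for the two fragments (?:\w+\s?)+# and (?>\w+\s?)+#. The trailing # is an arbitrary tail chosen so that a match can fail, and match_from, is_word and the step counter are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: a rough stand-in for \w (ASCII only).
static bool is_word(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Toy matcher, anchored at `pos`, for
//   atomic == false:  (?:\w+\s?)+#
//   atomic == true:   (?>\w+\s?)+#
// `steps` counts how many times an iteration of the group is attempted.
bool match_from(const std::string& s, std::size_t pos, bool atomic,
                unsigned long& steps) {
    ++steps;
    std::size_t end = pos;
    while (end < s.size() && is_word(s[end])) ++end;
    if (end == pos) return false;  // \w+ needs at least one character

    // \w+ is greedy: longest run first, then shorter ones on backtracking.
    for (std::size_t w = end; w > pos; --w) {
        const bool has_space =
            w < s.size() && std::isspace(static_cast<unsigned char>(s[w]));
        // \s? is greedy too: take the whitespace first, then try without it.
        for (int take = has_space ? 1 : 0; take >= 0; --take) {
            const std::size_t next = w + static_cast<std::size_t>(take);
            // Greedy +: try another iteration of the group, then the tail '#'.
            if (match_from(s, next, atomic, steps)) return true;
            if (next < s.size() && s[next] == '#') return true;
            // An atomic group never reconsiders its own choices.
            if (atomic) return false;
        }
    }
    return false;
}
```

On a subject of 18 word characters followed by "!", the non-atomic version attempts the group 262144 times (2^18) before giving up, while the atomic version gives up after 2 attempts. Both still match a subject such as "authentication package #".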
>>
>> Aha! I changed (?:(?:\w+\s?)+))) to (?:(?>\w+\s?)+))) that is, made it
>> into an atomic group, and lo and behold, when I ran pcretest:
>>
>> PCRE version 8.35 2014-04-04
>>
>> "MSWinEventLog\s*(?:(?:(?:\s+)))\s*(?:\s*(?:(?:(?:\d\s+)))\s*)?\s*(?:(?P<event_log__string>(?:\S+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:\s+)))\s*\s*(?:(?P<event_id__0>(?:4610|4614|4622)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?P<event_source__all>(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?P<event_category__all>(?:.*?)))\s*\s*(?:(?:(?:[\s]+)))\s*\s*(?:(?:(?:(A|An).*?)))\s*\s*(?:(?P<object__words>(?:(?>\w+\s?)+)))\s*\s*(?:(?:(?:has
>> been)))\s*\s*(?:(?P<action__0>(?:loaded)))\s*\s*(?:(?:(?: by
>> the)))\s*\s*(?:(?:(?:.*?)))\s*Package
>> Name\:\s*(?:(?P<package__0>(?:\S+)))\s*"
>> <14>Mar 2 11:34:38 89.237.143.23 MSWinEventLog 1 Security 6500 Fri Mar 02
>> 11:34:37 2012 4610 Microsoft-Windows-Security-Auditing    N/A    N/A
>>  Success Audit prabhat.ImmuneAps.com    User Logoff    A authentication
>> package has been loaded by the Local Security Authority. This
>> authentication package will be used to authenticate logon attempts.
>>  Authentication Package Name: C:\\Windows\\system32\\msv1_0.dll :
>> MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
>>  0: MSWinEventLog 1 Security 6500 Fri Mar 02 11:34:37 2012 4610
>> Microsoft-Windows-Security-Auditing    N/A    N/A    Success Audit
>> prabhat.ImmuneAps.com    User Logoff    A authentication package has
>> been loaded by the Local Security Authority. This authentication package
>> will be used to authenticate logon attempts.  Authentication Package Name:
>> C:\Windows\system32\msv1_0.dll
>>  1: Security
>>  2: 4610
>>  3: Microsoft-Windows-Security-Auditing
>>  4: prabhat.ImmuneAps.com    User Logoff
>>  5: A
>>  6: authentication package
>>  7: loaded
>>  8: C:\Windows\system32\msv1_0.dll

>>
>> ... and this was pretty well instantaneous.
>>
>> Philip
>>
>> --
>> Philip Hazel
>>
>
>