Re: [pcre-dev] PCRE bug

Author: Srinivas R Thota
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] PCRE bug
Hi Philip,

Thanks for your response. In my use case, however, the regexp cannot be
written by hand because it is autogenerated, and it may not be possible
to determine which component of the regexp can consume the most input.
Could you point out whether there is any way I can change the PCRE code
to make it match the longest string?

Thanks,
Srinivas Thota
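
One possible workaround that leaves the PCRE code untouched, sketched here
under the assumption that the generator can emit the top-level alternatives
as separate patterns: match each alternative on its own and keep the
leftmost match, breaking ties by length. The helper name longest_match is
hypothetical, and Python's re module is used only as a stand-in for a
Perl-style matcher. Each alternative is still matched with first-match
semantics internally, so nested alternations may still give a match that is
not the longest.

```python
import re


def longest_match(alternatives, subject):
    """Return the (start, end) span of the leftmost match across all
    alternatives, preferring the longest when starts are equal.
    Returns None if no alternative matches."""
    best = None
    for alt in alternatives:
        m = re.search(alt, subject)
        if m is None:
            continue
        span = m.span()
        # Earlier start wins; for equal starts, the later end wins.
        if best is None or (span[0], -span[1]) < (best[0], -best[1]):
            best = span
    return best


# The two top-level alternatives from the reported pattern, kept separate.
ALTERNATIVES = [
    r"(0){1,}",
    r"((0)|(1)|(2)){1,}(\*)((0)|(1)|(2)){1,}",
]

span = longest_match(ALTERNATIVES, "000*1211")
print(span)
```

The cost is one match attempt per alternative instead of one for the whole
pattern, which may matter if the generated patterns have many alternatives.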



Philip Hazel wrote:
> On Wed, 9 Apr 2008, Srinivas R Thota wrote:
>
>
>> When I tried to match
>>
>> "000*1211" on regular exp
>>
>> ((0){1,})|((0)|(1)|(2)){1,}(\\*)((0)|(1)|(2)){1,}
>>
>> with PCRE, the first member of the match structure (regmatch_t)
>> should contain the whole matched string, which should be
>> "000*1211", but PCRE 7.0 actually matches only "000" (3 characters).
>>
>
> PCRE uses Perl semantics for matching, even if you use the POSIX
> interface, as I assume you have done (judging by your reference to
> regmatch_t). This is clearly documented: the pcreposix man page says
> this:
>
> "When PCRE is called via these functions, it is only the API that is
> POSIX-like in style. The syntax and semantics of the regular
> expressions themselves are still those of Perl..."
>
> Perl finds the *first* match, which is not necessarily the
> *longest* match. Your pattern starts like this (putting in some spaces
> for readability):
>
> ( (0){1,} ) |
>
> That alternative is tested first; it matches the 000, and so that is
> what is reported as the match. PCRE looks no further (neither does
> Perl).
>
> You probably want to write the top-level alternatives in the other
> order so that the more complicated one is tested first.
>
> Philip
>
>
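
The first-match behaviour described in the quoted reply can be reproduced
with a short sketch. It uses Python's re module as a stand-in, on the
assumption that its ordered alternation behaves like Perl's for this
pattern; it is not PCRE or the pcreposix interface. The "\\*" in the report
is assumed to be C string escaping for the regex \*.

```python
import re

SUBJECT = "000*1211"

# Pattern from the report, top-level alternatives in the reported order.
ORIGINAL = r"((0){1,})|((0)|(1)|(2)){1,}(\*)((0)|(1)|(2)){1,}"

# The same two alternatives swapped, so the more complicated one is tried first.
REORDERED = r"((0)|(1)|(2)){1,}(\*)((0)|(1)|(2)){1,}|((0){1,})"

# group(0) is the whole match, the counterpart of the first regmatch_t entry.
first = re.search(ORIGINAL, SUBJECT).group(0)
second = re.search(REORDERED, SUBJECT).group(0)

print(first)   # the first alternative succeeds and matching stops there
print(second)  # the longer alternative is now tried first
```

With the original order the match is "000"; with the alternatives swapped it
is the whole subject string.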


