Re: [pcre-dev] Problem with math (fwd)

Author: Philip Hazel
Date:  
To: Andrew Ho, Andrew Tarasov
CC: PCRE Developers
Subject: Re: [pcre-dev] Problem with math (fwd)
On Thu, 26 Apr 2007, Andrew Ho wrote:

> Here is a poorly formed error report.


Not that poorly formed! Took me 2 minutes to diagnose it.
This is not a bug.

> pattern: "^(.*?)(http:\/\/.+?|www\..+?)(\s|\)|'|\"|`|$)+"
> text: "(http://foxxxxxxxx.php?showforum=129)"
>
> After executing, we have this match:
> 0 - "(http://foxxxxxxxx.php?showforum=129)"
> 1 - "("
> 2 - "http://foxxxxxxxx.php?showforum=129"
> 3 - ""
>
> We have lost the 3rd match.


No, you have not. The reason is the final "+" in the pattern, which
makes the third group loop. The first time round the loop it matches
")". The next time round it matches "$" (i.e. an empty string at the end
of the subject). The time after that it would match "$" again, but there
is code in PCRE to stop a group that matches an empty string from going
round for ever (Perl has similar code).

Therefore, the last value matched by the 3rd group is "". A capturing
group that repeats keeps only the value from its final iteration, and
that is what you get at the end of the match.

If you try this in Perl, you get exactly the same result.
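
The behaviour can also be checked outside PCRE and Perl. The following is a minimal sketch using Python's `re` module, which is a different engine; it is assumed here only because it appears to treat an empty final iteration of a repeated group the same way for this input. The pattern and subject text are taken from the report. The second pattern, `VARIANT`, is a hypothetical rewrite that is not part of the original report: it makes the repeated group non-capturing and wraps the whole repetition in a capturing group, so that group 3 covers everything the loop consumed rather than just its last iteration.

```python
import re

# Pattern and subject text exactly as given in the report.
PATTERN = r'''^(.*?)(http:\/\/.+?|www\..+?)(\s|\)|'|\"|`|$)+'''
TEXT = "(http://foxxxxxxxx.php?showforum=129)"

m = re.match(PATTERN, TEXT)
# Print groups 0..3 in the same layout as the report.
for i in range(4):
    print(i, "-", repr(m.group(i)))

# Hypothetical variant (an assumption, not from the report): the inner
# group is non-capturing and the repetition as a whole is captured.
VARIANT = r'''^(.*?)(http:\/\/.+?|www\..+?)((?:\s|\)|'|\"|`|$)+)'''

v = re.match(VARIANT, TEXT)
print("variant group 3 -", repr(v.group(3)))
```

With the original pattern, group 3 should come back empty, as described above. With the variant, group 3 should hold the ")" that the first iteration consumed, because the outer group spans every iteration of the loop.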

Philip

--
Philip Hazel, University of Cambridge Computing Service.