Re: [pcre-dev] Unmatched subpattern become wildcard

Author: Zoltán Herczeg
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] Unmatched subpattern become wildcard
Just my two cents:

===================================================================
--- pcre_exec.c (revision 813)
+++ pcre_exec.c (working copy)
@@ -2634,6 +2634,8 @@

if (length == 0) continue;

+ if (length < 0) RRETURN(MATCH_NOMATCH);
+
/* First, ensure the minimum number of matches are present. We get back
the length of the reference string explicitly rather than passing the
address of eptr, so that eptr can be a register variable. */
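A minimal sketch of why a negative length matters in this code path, assuming an unset group shows up as length == -1 and the reference has consumed zero repetitions. The names rest_matches and backoff_span are hypothetical, and offsets stand in for the eptr/pp pointers; this is a model of the back-off step, not the PCRE code itself:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "the rest of the pattern": succeeds when the
   subject at offset pos begins with the literal "test". */
static int rest_matches(const char *subject, ptrdiff_t pos)
{
  return strncmp(subject + pos, "test", 4) == 0;
}

/* Simplified model of the maximizing back-off loop for a repeated back
   reference that consumed zero repetitions. Offsets replace the eptr/pp
   pointers. length == -1 models an unset group (an assumption); a non-zero
   guard applies the proposed "length < 0" check. Returns the number of
   subject characters the back reference ends up spanning, or -1 when this
   step reports no match. */
static ptrdiff_t backoff_span(const char *subject, ptrdiff_t start,
                              ptrdiff_t length, int guard)
{
  ptrdiff_t end = (ptrdiff_t)strlen(subject);
  ptrdiff_t pp = start;
  ptrdiff_t eptr = start;

  /* Mirrors "if (length == 0) continue;": just try the rest of the pattern. */
  if (length == 0) return rest_matches(subject, start) ? 0 : -1;

  /* The check added by the patch above. */
  if (guard && length < 0) return -1;

  /* The upper bound is added only to keep this sketch inside the string. */
  while (eptr >= pp && eptr <= end)
    {
    if (rest_matches(subject, eptr)) return eptr - pp;
    eptr -= length;   /* with length == -1 this moves forward */
    }
  return -1;
}
```

Without the check, backoff_span("hello world test", 0, -1, 0) walks forward one character at a time and stops at offset 12, which is the "hello world " capture reported in this thread. With the check enabled the loop is never entered and the step reports no match.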

Philip, are you sure this is correct in UTF-8 mode, where the matched length can change? I mean the maximizing part:

while (eptr >= pp)
  {
  RMATCH(eptr, ecode, offset_top, md, eptrb, RM15);
  if (rrc != MATCH_NOMATCH) RRETURN(rrc);
  eptr -= length;
  }
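To make the UTF-8 concern concrete: if the back reference is matched caselessly, the two case forms of a character need not encode to the same number of bytes, for example U+023A and its lower-case form U+2C65. A small sketch (the helper name utf8_encoded_length is hypothetical, not a PCRE function):

```c
/* Number of bytes needed to encode a Unicode code point in UTF-8
   (valid for code points up to U+10FFFF). */
static int utf8_encoded_length(unsigned long cp)
{
  if (cp < 0x80UL) return 1;
  if (cp < 0x800UL) return 2;
  if (cp < 0x10000UL) return 3;
  return 4;
}
```

Here utf8_encoded_length(0x023A) is 2 while utf8_encoded_length(0x2C65) is 3, so the bytes actually consumed in the subject can differ from the length of the captured string, and stepping eptr back by a fixed length may not retrace what was consumed.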

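Btw, the JIT is unaffected. The first run below uses /S+, which makes pcretest study the pattern with JIT compilation; the second run uses the interpreter: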

PCRE version 8.22 2011-12-12

re> /(another)?(\1?)test/S+
data> hello world test

0: test
1: <unset>
2:
data>
re> /(another)?(\1?)test/
data> hello world test

0: hello world test
1: <unset>
2: hello world
data>

Regards,
Zoltan

Tib <tiberius.teng@???> wrote:
> Hi,
>
> Now I confirmed that this behavior starts from 8.13.
> 8.12 only matches "test".
>
> Best regards,
> Tiberius Teng
>
> On Wed, Dec 21, 2011 at 3:10 PM, Tib <tiberius.teng@???> wrote:
> > Hi,
> >
> > I just upgraded from PCRE 8.10 to PCRE 8.21, and surprisingly found
> > the following behavior change:
> >
> > $t='hello world test';
> > preg_match('/(another)?(\1?)test/', $t, $m);
> > var_dump($m);
> >
> > on 8.10 I got:
> >
> > array(3) {
> > [0]=>
> > string(4) "test"
> > [1]=>
> > string(0) ""
> > [2]=>
> > string(0) ""
> > }
> >
> > Which is fine, but on 8.21 I got:
> >
> > array(3) {
> > [0]=>
> > string(16) "hello world test"
> > [1]=>
> > string(0) ""
> > [2]=>
> > string(12) "hello world "
> > }
> >
> > When (another)? didn't match, I would assume \1 also match an empty string,
> > however (\1?) part captured "hello world ", is this a bug or I
> > misunderstand the syntax?
> >
> > Best regards,
> > Tiberius Teng
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev