Re: [pcre-dev] What is the expected behavior of /(?=..(*MARK…

Author: Zoltán Herczeg
Date:  
To: Thanh Hong Dai
CC: pcre-dev
Subject: Re: [pcre-dev] What is the expected behavior of /(?=..(*MARK:a))(*SKIP:a)(*FAIL)|./g

Hi Thanh,

I think these questions are better suited to https://www.reddit.com/r/regex

Anyway, I think /g is what causes the regex to match every character. Without it, the pattern probably matches only one character, as you can see on regex101. The first half of your second assumption is correct: (*MARK:a) has no effect, since it is inside an assertion. However, (*SKIP:a) has no effect when the mark 'a' is not found, so it does not act as a (*SKIP) in that case!

Hence your pattern is the same as /(?=..(*MARK:a))(*FAIL)|./ since 'a' is not found during backtracking.

Because the first alternative fails, the second one is tried, and it matches a single character.
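
A rough way to check this reasoning without PCRE is sketched below. Python's re module has no backtracking verbs, so the sketch assumes that (*MARK:a) and the ignored (*SKIP:a) can simply be dropped, and it writes (*FAIL) as (?!), an empty negative lookahead that never matches. It therefore approximates the simplified pattern rather than running the original one; the names SIMPLIFIED and subject are only illustrative.

```python
import re

# Assumption: (*MARK:a) and the ignored (*SKIP:a) are dropped, and (*FAIL)
# is written as (?!), an empty negative lookahead that can never match.
SIMPLIFIED = re.compile(r"(?=..)(?!)|.")

subject = "aaaaaaaaaaaaaaabbbbbbbbbbbaa"

# Without /g: the search stops at the first successful match.
first = SIMPLIFIED.search(subject)

# With /g: matching restarts after each match, so the second alternative
# gets to consume one character at every position.
all_matches = [m.group() for m in SIMPLIFIED.finditer(subject)]

print(first.span())
print(len(all_matches) == len(subject))
```

In this approximation the first alternative fails at every position, either in the lookahead or at (?!), so the single "." is what matches each time.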

Regards,
Zoltan

Thanh Hong Dai <hdthanh@???> wrote:
>Hi,
>
>
>
>When testing the behavior of (*SKIP) to understand its underlying
>implementation, I constructed the following regex to verify my
>understanding:
>
>
>
>/(?=..(*MARK:a))(*SKIP:a)(*FAIL)|./g
>
>
>
>Test input:
>
>
>
>aaaaaaaaaaaaaaabbbbbbbbbbbaa
>
>
>
>With the assumption that (*SKIP) fails the attempt at the current starting
>position and bumps along to the position where (*SKIP) is at, I anticipated
>two cases:
>
>
>
>1)      (*MARK:a) somehow stores the position, so when (*SKIP) fails the
>current attempt, it bumps along 2 characters ahead.
>
>2)      (*MARK:a) is not backtracked into, so (*SKIP) fails the current
>attempt and bumps along by 1 character as per normal.
>
>
>
>In either case, I expect there can only be at most one match at the end,
>since it's the only place the look-ahead fails.
>
>
>
>However, as it turns out, all characters are matched. Running the debugger
>on regex101 (https://regex101.com/r/dA9tI1/1) reveals that it tries the
>first branch twice, and manages to try the second branch and succeeds.
>
>
>
>What is the expected behavior here?
>
>--
>## List details at https://lists.exim.org/mailman/listinfo/pcre-dev