Re: [pcre-dev] Strangely long matching times. Could anyone h…

Author: Philip Hazel
Date:  
To: Ralf Junker
CC: pcre-dev@exim.org
Subject: Re: [pcre-dev] Strangely long matching times. Could anyone help to explain?
The explanation is that the fast ones benefit from a start-up optimization.
For example, with the added x, PCRE2 knows there must be an x in the subject
and does a preliminary check for it. The same goes for y. It does not help
for an added a or b, because the subject already contains those characters,
so the check passes and the full match is attempted.
If you run with no_start_optimize, the fast ones become slow as well. The
slowness in all cases comes from checking a rather large search tree.
Regards,
Philip
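
The preliminary check described above can be illustrated with a small
sketch. This is not PCRE2's implementation; it only mimics the idea using
Python's re module. The helper name search_with_precheck is hypothetical,
the required character is passed in by hand rather than derived from the
pattern, and the number of spaces in the subject (33) is an assumption.

```python
import re

# Approximation of the pcre2test subject \[aa ... bb ]{200}; the exact
# number of spaces (33 here) is an assumption.
SUBJECT = ("aa" + " " * 33 + "bb ") * 200


def search_with_precheck(pattern, required_char, subject):
    """Run re.search, but first reject subjects lacking a required character.

    Returns (match_or_None, skipped), where skipped is True when the
    pre-check ruled out a match without running the regex at all.
    """
    if required_char not in subject:
        return None, True
    return re.search(pattern, subject), False


if __name__ == "__main__":
    for pattern, required in [(r"aa.*?bbx", "x"), (r"aa.*?bba", "a")]:
        match, skipped = search_with_precheck(pattern, required, SUBJECT)
        print(pattern,
              "matched" if match else "no match",
              "(regex skipped)" if skipped else "(full search ran)")
```

With "x" the subject is rejected by a single scan, while with "a" the
pre-check passes and the whole search has to run before failure is
reported.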


On Fri, 20 Nov 2020 at 11:29, Ralf Junker <ralfjunker@???> wrote:

> Below is a test file for pcre2test which shows matching times which I
> cannot explain. I run it on Windows like this:
>
>    pcre2test -tm 1 tests.txt

>
> If I understand this correctly, at least the atomic grouping pattern
> should run fast: https://www.regular-expressions.info/catastrophic.html
>
> Interestingly, with JIT all patterns run instantly in 0 ms:
>
>    pcre2test -jit -tm 1 tests.txt

>
> I am using PCRE2 SVN 1284.
>
> Could anyone jump in with an explanation?
>
> Many thanks,
>
> Ralf
>
> ----------
>
> # Match is instantaneous (0 milliseconds).
>
> /aa.*?bb/
> \[aa                                 bb ]{200}

>
> # Failure with extra letter "x" is also extremely fast (0 milliseconds).
>
> /aa.*?bbx/
> \[aa                                 bb ]{200}

>
> # Failure with extra letter "y" is just as fast (0 milliseconds).
>
> /aa.*?bby/
> \[aa                                 bb ]{200}

>
> # Failure with extra letter "a" is very slow (> 6000 ms).
>
> /aa.*?bba/
> \[aa                                 bb ]{200}

>
> # Failure with extra letter "b" is also very slow (> 6000 ms).
>
> /aa.*?bbb/
> \[aa                                 bb ]{200}

>
> # Atomic grouping does not help (each > 6000 ms).
>
> /aa(?>.*?bba)/
> \[aa                                 bb ]{200}

>
> /aa(?>.*?bbb)/
> \[aa                                 bb ]{200}

>
> # Is this expected behavior? Suggestions? Thank you!
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
>