Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2106] Please add support for parsing POSIX basic & extended regular expressions
https://bugs.exim.org/show_bug.cgi?id=2106

Zoltan Herczeg <hzmester@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hzmester@???


--- Comment #10 from Zoltan Herczeg <hzmester@???> ---
> If JIT is an extra step, then can you use the results of the initial pattern
> compilation step to guide a default "PCRE2_AUTOJIT" mode? Where if the result
> of the initial pattern compilation suggests JIT would be either 1) extremely
> beneficial for the pattern in question or 2) very low cost to compile for the
> pattern in question, it gets JITified (provided JIT's available).
> Future releases of PCRE2 could then improve on the metrics used to decide
> without need for any client adjustments.


Yes, it is an extra step. And the problem is: only users know how the compiled
pattern will be used. If the input is small (e.g. 10 bytes) and the pattern is
used only once, JIT is probably not worth it except in some extreme cases. When
the pattern is used 100 times, JIT is probably worth it even if the input is
only around 30-50 bytes long. For a megabyte of input, JIT is likely worth it
even if the pattern is used only once.

There are two unknown factors here: how many times the compiled pattern will be
used, and the average input length. These two are important for a good
prediction. In my experience the pattern itself does not provide enough
information for any prediction.
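
To illustrate why the decision belongs to the caller, here is a minimal sketch
of a heuristic built only on those two factors. Everything in it is an
assumption for illustration: jit_probably_worth_it() and JIT_BREAK_EVEN_BYTES
are hypothetical names, not part of PCRE2, and the cutoff of 3000 bytes is
simply 100 uses times 30 bytes from the rough figures above, not a measured
value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical break-even point: total bytes scanned over the lifetime of
 * the compiled pattern. 3000 = 100 uses * 30 bytes; an assumed figure, not
 * a benchmark result. */
#define JIT_BREAK_EVEN_BYTES UINT64_C(3000)

/* Hypothetical helper (not a PCRE2 function): returns true when the
 * expected total work, expected_uses * avg_input_bytes, reaches the assumed
 * break-even point. Both arguments must come from the caller, because the
 * pattern itself does not reveal them. */
bool jit_probably_worth_it(uint64_t expected_uses, uint64_t avg_input_bytes)
{
    if (expected_uses == 0 || avg_input_bytes == 0)
        return false;

    /* A single large input is already enough. */
    if (avg_input_bytes >= JIT_BREAK_EVEN_BYTES)
        return true;

    /* Compare against ceil(threshold / avg) instead of multiplying, so the
     * check cannot overflow for very large use counts. */
    uint64_t uses_needed =
        (JIT_BREAK_EVEN_BYTES + avg_input_bytes - 1) / avg_input_bytes;
    return expected_uses >= uses_needed;
}
```

A caller that knows its workload could consult such a check before deciding
to JIT compile. A library default has neither number available, which is the
core of the problem.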

> I'm inclined to auto JITify by default as you want to encourage folks to
> adopt PCRE2 and one good way to accomplish that is to provide an easy-to-use
> and familiar interface (i.e. pcreposix) that provides the fastest possible
> pattern matching by default (i.e. use of extra options not required).


The JIT stack is the problem here. The default 32K stack might not be enough,
and manual allocation might be required. That is not exactly an easy-to-use
interface. To support a better interface we would need to exclude Windows from
the supported platforms, but people would not be happy about that.
