Re: [pcre-dev] (*MARK:NAME) incompatibility

Author: Philip Hazel
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] (*MARK:NAME) incompatibility
On Tue, 24 Apr 2012, ND wrote:

> It seems that the maximum (*:NAME) length in PCRE is restricted to about 2^8
> bytes. My regular expression needs about 550 symbols in a MARK verb and works
> well in Perl. But I cannot port it to PCRE due to this size restriction.

MARK names are stored as a length, the name, and a zero. In the 8-bit
library, this does mean that there is a limit of 255 on the length of
the name.

There is a bug in that there is no check on name length, which is why
you are getting an internal error. This is easy to fix and I will do
that so that you get a better error, at compile time.
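
To make the layout and the missing check concrete, here is a minimal sketch in C. It is not the actual PCRE source: the function name encode_mark_name and its signature are hypothetical, and it only illustrates a one-byte length, followed by the name, followed by a zero, with the length test that should reject an over-long name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not PCRE's real code. Writes
 *   [1 length byte][name bytes][0]
 * into out and returns the number of bytes written.
 * Returns 0 if the name is too long for a one-byte length
 * or if the output buffer is too small. */
static size_t encode_mark_name(const char *name, size_t len,
                               unsigned char *out, size_t out_size)
{
    if (len > 255)
        return 0;               /* the check that was missing */
    if (out_size < len + 2)
        return 0;               /* need room for length byte and zero */

    out[0] = (unsigned char)len;    /* length prefix */
    memcpy(out + 1, name, len);     /* the name itself */
    out[len + 1] = 0;               /* terminating zero */
    return len + 2;
}
```

With this layout a 3-byte name occupies 5 bytes, and a 550-byte name is rejected up front instead of silently overflowing the length byte.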

The next question is: should the 8-bit library be upgraded to allow
longer names?

> What do you think about restricting the maximum MARK size to
> 2^(8*"with-link-size") bytes?

It never occurred to me that anybody would want MARK names longer than
255 bytes. I suspect that you are the only person on the planet who
needs this.

Zoltan (I assume you will read this): What do you think? Changing the
representation of MARK will need changes in the JIT code. Making the
default 2 bytes in the 8-bit library will penalize all users of MARK,
but I expect it will only be a small penalty, and it would make PCRE
just a little bit more Perl-compatible.
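
As a rough illustration of the trade-off, the sketch below stores the length in either one or two bytes. Everything here is an assumption for the sake of the example: the name encode_mark_wide, the len_width parameter, and the big-endian byte order are hypothetical and do not describe what PCRE or the JIT actually does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not PCRE's real code. Writes
 *   [len_width length bytes][name bytes][0]
 * where len_width is 1 or 2. The two-byte length is stored
 * high byte first (an assumption). Returns the number of bytes
 * written, or 0 on any error. */
static size_t encode_mark_wide(const char *name, size_t len,
                               unsigned len_width,
                               unsigned char *out, size_t out_size)
{
    size_t max_len;

    if (len_width == 1)
        max_len = 0xFF;
    else if (len_width == 2)
        max_len = 0xFFFF;
    else
        return 0;                       /* unsupported width */

    if (len > max_len || out_size < len_width + len + 1)
        return 0;

    if (len_width == 2) {
        out[0] = (unsigned char)(len >> 8);     /* high byte */
        out[1] = (unsigned char)(len & 0xFF);   /* low byte */
    } else {
        out[0] = (unsigned char)len;
    }
    memcpy(out + len_width, name, len);
    out[len_width + len] = 0;                   /* terminating zero */
    return len_width + len + 1;
}
```

Under these assumptions the cost of the wider length is one extra byte per MARK: a 3-byte name takes 6 bytes instead of 5, while a 550-byte name, which the one-byte form must reject, fits in 553 bytes.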


Philip Hazel