[pcre-dev] [Bug 2120] New: PCRE2_NO_UTF_CHECK does not disab…

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2120] New: PCRE2_NO_UTF_CHECK does not disable all checks
https://bugs.exim.org/show_bug.cgi?id=2120

            Bug ID: 2120
           Summary: PCRE2_NO_UTF_CHECK does not disable all checks
           Product: PCRE
           Version: 10.23 (PCRE2)
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: ph10@???
          Reporter: rob@???
                CC: pcre-dev@???


I'm using PCRE2 in UTF-8 mode for the JS interpreter of a custom web browser.
Works great. Thanks for the awesome library.

Unfortunately, some major websites use routine housekeeping regexes which test
for undesirable characters including \ud800-\udfff.

This causes the regex compile to fail on line 1473 of pcre2_valid_utf.c at ...

if (c >= 0xd800 && c <= 0xdfff) *errorcodeptr = ERR73;

PCRE2_NO_UTF_CHECK should disable this check but doesn't.
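
To make the request concrete, here is a minimal sketch of the behaviour being asked for. It is not PCRE2's actual code: the function name, the option bit SKETCH_NO_UTF_CHECK and the error constant SKETCH_ERR_SURROGATE are hypothetical stand-ins chosen for illustration, and the sketch only assumes that the surrogate test quoted above could be skipped when a "no UTF check" option is set.

```c
#include <stdint.h>

/* Hypothetical stand-ins, NOT PCRE2's real definitions. */
#define SKETCH_NO_UTF_CHECK  0x1u  /* illustrative option bit */
#define SKETCH_ERR_SURROGATE 73    /* mirrors the ERR73 number quoted above */

/*
 * Validate a code point taken from an escape such as \ud800.
 * Returns 0 if the code point is accepted, or SKETCH_ERR_SURROGATE
 * if it lies in the surrogate range 0xd800..0xdfff and checking
 * has not been switched off by the caller.
 */
int check_escaped_code_point(uint32_t c, uint32_t options)
{
    /* Requested behaviour: the option bypasses the surrogate test. */
    if ((options & SKETCH_NO_UTF_CHECK) != 0)
        return 0;

    /* Reported behaviour: this test runs regardless of the option. */
    if (c >= 0xd800 && c <= 0xdfff)
        return SKETCH_ERR_SURROGATE;

    return 0;
}
```

With options set to 0 the sketch rejects \ud800 as the library does today. With the hypothetical bit set, the same code point is accepted, which is what a pattern containing \ud800-\udfff needs in order to compile.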

This error means a significant number of websites can't be viewed properly, so
I'm hand-patching these lines of code with each new release of PCRE, but a
Google search shows plenty of other developers have been having problems with
this error for many years.

It would probably help a lot of people if PCRE2_NO_UTF_CHECK worked as expected
and disabled all UTF checks.

Thanks,
Rob
