[pcre-dev] [Bug 2674] New: Regex stop or skip (at) whitespac…

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2674] New: Regex stop or skip (at) whitespace but works perfectly using external regex tool
https://bugs.exim.org/show_bug.cgi?id=2674

            Bug ID: 2674
           Summary: Regex stop or skip (at) whitespace but works perfectly
                    using external regex tool
           Product: PCRE
           Version: 10.33 (PCRE2)
          Hardware: x86-64
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: Philip.Hazel@???
          Reporter: shiftag@???
                CC: pcre-dev@???


Hi,

I have an issue with libpcre2, and I think it is a bug.

I am using the following regex with libpcre2demo (I don't pass it on the
command line; I just copy/paste the regex into the demo source file):

static const char patter[] = "^(?:(?:(?<scheme>[^:/?#    \
]+)://)?(?<authority>(?:((?<userinfo>[^/?#               \
]*)@)?(?<host>[^/?#:                                     \
]*)(?::(?<port>\\d+))?)))?(?<path>[^?#                   \
]*)(?:\\?(?<query>[^#                                    \
]*))?(?:#(?<anchor>.*))?";



Then, I used the following string as a subject:

PCRE2_SPTR subject = "https://www.google.com/bar/browser features foo+1/#";


Using an external regex tool, all matches are correct:

https://regex101.com/r/kawOBH/1


My issue with libpcre2demo concerns the "path" named group. The output is the
following:

Named substrings
(9)    anchor: 
(2) authority: www.google.com
(5)      host: www.google.com
(7)      path: /bar/browser
(6)      port: 
(8)     query: 
(1)    scheme: https
(4)  userinfo: 


As you can see, the "path" named group is truncated at the first space.

Is there a workaround, or is this a bug?

Thanks
