http://bugs.exim.org/show_bug.cgi?id=1616
Summary: Line-start anchor (^) does not match at end of text if the last
character is a newline character
Product: PCRE
Version: 10.10 (PCRE2)
Platform: Other
OS/Version: Windows
Status: NEW
Severity: bug
Priority: low
Component: Code
AssignedTo: ph10@???
ReportedBy: david.gausmann@???
CC: pcre-dev@???
Hello there,
I've found a mysterious behaviour.
My regular expression has the following pattern: ^
The search option is PCRE2_MULTILINE.
The text I am searching through has the following content: \n\n\n
The new line behaviour is PCRE2_NEWLINE_ANYCRLF.
I would expect to get four results:
- Offset 0
- Offset 1
- Offset 2
- Offset 3
Instead I get only the first three results.
If I add a character to the end of my haystack text, then I get four results.
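
The two possible readings of "start of line" can be sketched without PCRE2 at all. The helper below is hypothetical (lineStartOffsets is not a PCRE2 function) and, for brevity, assumes that only \n counts as a newline. With matchAfterFinalNewline set to true it models the four offsets expected above; with false it models the three offsets actually observed. The observed result appears consistent with the PCRE2 pattern documentation, which says that in multiline mode a circumflex does not match after a newline that ends the subject string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of PCRE2: lists the offsets at which a
// multiline "^" could match, treating only '\n' as a newline.
// matchAfterFinalNewline = true models the expectation in this report;
// false models the behaviour that was observed.
std::vector<std::size_t> lineStartOffsets(const std::u16string& text,
                                          bool matchAfterFinalNewline)
{
    std::vector<std::size_t> offsets{0}; // "^" always matches at offset 0
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        if (text[i] != u'\n')
            continue;
        const bool isFinal = (i + 1 == text.size());
        if (!isFinal || matchAfterFinalNewline)
            offsets.push_back(i + 1); // position right after the newline
    }
    return offsets;
}
```

For the subject \n\n\n the two modes differ only in offset 3. Once a further character is appended, the third newline is no longer the last character, so both modes report offsets 0 to 3, as described above.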
My C++ code looks like this (I've removed some irrelevant details):
----------------------------------------------------------
while(nStart <= nLength)
{
    // Search for next match
    int nResult = pcre2_match(this->m_pRegEx,
        reinterpret_cast<PCRE2_SPTR16>(wszTextOffset), static_cast<size_t>(nLength),
        static_cast<size_t>(nStart), 0, pMatchData.get(), nullptr);
    if(nResult < 0)
    {
        switch(nResult)
        {
        case PCRE2_ERROR_NOMATCH:
            goto NoMatch;
        default:
            // Throw RegEx error
            // ...
        }
    }
    // Copy found match
    // ...
    if(!this->m_bGlobal)
        break;
    if(puOVector[0] == puOVector[1])
        nStart = puOVector[1] + 1; // Zero length match (otherwise we get an endless loop)
    else
        nStart = puOVector[1];
}
----------------------------------------------------------
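
The "+ 1" step after a zero-length match in the loop above can be isolated into a small helper. This is only a sketch under stated assumptions: advanceAfterEmptyMatch is a hypothetical name, not a PCRE2 function, and the idea of stepping over a whole \r\n pair when CRLF counts as a newline (as it does with PCRE2_NEWLINE_ANYCRLF) is loosely modelled on what the pcre2demo sample program does. Offsets are in 16-bit code units, and surrogate pairs are not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a PCRE2 function: returns the start offset to use
// for the next search after a zero-length match at `pos`.
std::size_t advanceAfterEmptyMatch(const std::u16string& text,
                                   std::size_t pos,
                                   bool crlfIsNewline)
{
    // At or past the end: step beyond the text so the caller's loop ends.
    if (pos >= text.size())
        return pos + 1;

    // Treat "\r\n" as one newline so the next search does not start
    // between the two characters.
    if (crlfIsNewline && text[pos] == u'\r' &&
        pos + 1 < text.size() && text[pos + 1] == u'\n')
        return pos + 2;

    return pos + 1; // ordinary case: advance by one code unit
}
```

In the loop, the zero-length branch would then read nStart = advanceAfterEmptyMatch(text, puOVector[1], true), where text is an assumed std::u16string holding the subject.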
The loop is executed four times, but in the fourth iteration pcre2_match returns
PCRE2_ERROR_NOMATCH.
Is this a bug, or must I do something differently to allow zero-length matches
like this at the end of the text?
Kind Regards
David Gausmann