Re: [pcre-dev] DOT with (PCRE_DOTALL|PCRE_NEWLINE_ANY) is a …

Author: Sheri
Date:  
To: Issaana
CC: pcre-dev
Subject: Re: [pcre-dev] DOT with (PCRE_DOTALL|PCRE_NEWLINE_ANY) is a unexpected result
Issaana@??? wrote:
> Hello,
>
> I tried the following code.
>
> re=pcre_compile("a.b",PCRE_DOTALL|PCRE_NEWLINE_ANY,&err,&erroff,NULL);
> rc=pcre_exec(re,NULL,"a\nb", 3,0,0,ov,3); //(A) rc=1
> rc=pcre_exec(re,NULL,"a\r\nb",4,0,0,ov,3); //(B) rc=PCRE_ERROR_NOMATCH
>
> (B) is an unexpected result. Is this a bug, or am I misunderstanding something?
>
> Thanks,
> Issaana
>
>

I believe that's intentional. Even where \r\n is a valid line-break
sequence, it is still two characters in the subject, not one, and a dot
matches exactly one character (PCRE_DOTALL only allows that character to
be a newline character). So in (B) the dot consumes the \r and the b
then fails against the \n. You can match the entire line break with \R.

Also, when an unanchored pattern fails to match at the start of a
newline sequence, PCRE advances past the entire sequence before retrying
the match. When the newline option in effect accepts CRLF as a valid
line break, however, PCRE does not skip the \n of a CRLF if the pattern
contains explicit \r or \n references.

Regards,
Sheri