Re: [pcre-dev] DOT with (PCRE_DOTALL|PCRE_NEWLINE_ANY) is a …

Author: Philip Hazel
Date:
To: Sheri
CC: Issaana, pcre-dev
Subject: Re: [pcre-dev] DOT with (PCRE_DOTALL|PCRE_NEWLINE_ANY) is a unexpected result
On Mon, 12 May 2008, Sheri wrote:

> Issaana@??? wrote:
> > Hello,
> >
> > I tried the following code.
> >
> > re=pcre_compile("a.b",PCRE_DOTALL|PCRE_NEWLINE_ANY,&err,&erroff,NULL);
> > rc=pcre_exec(re,NULL,"a\nb", 3,0,0,ov,3); //(A) rc=1
> > rc=pcre_exec(re,NULL,"a\r\nb",4,0,0,ov,3); //(B) rc=PCRE_ERROR_NOMATCH
> >
> > (B) is an unexpected result. Is this a bug, or am I misunderstanding something?
> >
> > Thanks,
> > Issaana
> >
> >
> I believe that's intentional. Even where \r\n is a valid line-break
> sequence, it is treated as two characters, not one, in the subject.


That is correct; "." never matches more than one character. This is an
extract from the pcrepattern man page:

The behaviour of dot with regard to newlines can be changed. If the
PCRE_DOTALL option is set, a dot matches any one character, without
exception. If the two-character sequence CRLF is present in the subject
string, it takes two dots to match it.
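
To make the counting concrete, here is a minimal sketch of the rule that
one dot consumes exactly one character. It is a toy matcher written for
this illustration, not PCRE code: the names toy_match and toy_examples
are hypothetical, it handles only literal bytes and ".", and it anchors
the pattern at both ends of the subject (pcre_exec does not, but for
these three cases the outcome should be the same).

```c
#include <assert.h>
#include <stddef.h>

/* Toy matcher, NOT the PCRE implementation (hypothetical helper).
   Supports only literal bytes and '.', anchored at both ends.
   '.' consumes exactly one byte, newline bytes included (DOTALL-like). */
int toy_match(const char *pattern, const char *subject, size_t len)
{
    size_t i = 0;
    for (; *pattern != '\0'; pattern++, i++) {
        if (i >= len)
            return 0;   /* subject exhausted before the pattern */
        if (*pattern != '.' && *pattern != subject[i])
            return 0;   /* literal byte mismatch */
        /* '.' falls through: it accepts any single byte, '\r' or '\n' too */
    }
    return i == len;    /* the whole subject must be consumed */
}

/* Mirrors cases (A) and (B) from the original post, plus the two-dot fix. */
void toy_examples(void)
{
    assert(toy_match("a.b",  "a\nb",   3));   /* (A): one dot, one byte    */
    assert(!toy_match("a.b", "a\r\nb", 4));   /* (B): CRLF is two bytes    */
    assert(toy_match("a..b", "a\r\nb", 4));   /* two dots cover CR and LF  */
}
```

With the real library the equivalent change would be to compile "a..b"
instead of "a.b" for subjects that contain CRLF. If the intent is to
match any newline sequence as a unit, the pcrepattern page also
describes \R, which I believe treats CRLF as a single match.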

Philip

--
Philip Hazel