Re: [pcre-dev] Dollar problem

Author: Philip Hazel
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] Dollar problem
On Fri, 20 Aug 2010, ND wrote:

> Subject: aa<CR><LF>b
> Pattern: a$
> Match: RegEx error - PCRE_ERROR_NOMATCH
>
> If
> Pattern: aa$
> then
> Match: aa
>
> Why 'PCRE_ERROR_NOMATCH' in first example?


When I test, I get NOMATCH in the second example as well.

You don't say what options you used, if any. You need

(1) PCRE_MULTILINE to tell PCRE that the subject is more than one line.
(2) PCRE_NEWLINE_CRLF or PCRE_NEWLINE_ANYCRLF or PCRE_NEWLINE_ANY so that
<CR><LF> is interpreted as a newline sequence. Testing with pcretest shows
that this works:

PCRE version 8.10 2010-06-25

/a$/m<crlf>
aa\r\nb
0: a

It also works if the pattern is aa$.

This is what the documentation says:

       PCRE_MULTILINE                                                       


     By  default,  PCRE  treats the subject string as consisting of a single
     line of characters (even if it actually contains newlines). The  "start
     of  line"  metacharacter  (^)  matches only at the start of the string,
     while the "end of line" metacharacter ($) matches only at  the  end  of
     the string, or before a terminating newline (unless PCRE_DOLLAR_ENDONLY
     is set). This is the same as Perl.                                     


     When PCRE_MULTILINE is set, the "start of line" and  "end  of  line"
     constructs  match  immediately following or immediately before internal
     newlines in the subject string, respectively, as well as  at  the  very
     start  and  end.
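
The rule quoted above can be modelled in a few lines of C to see why both
options matter for this subject. This is only an illustrative model of the
documented behaviour, not PCRE's own code: the names newline_kind,
newline_len_at and dollar_matches_at are made up for the sketch, only the
LF and CRLF conventions are covered, and PCRE_DOLLAR_ENDONLY is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Newline conventions modelled here; the names are illustrative only. */
enum newline_kind { NL_LF, NL_CRLF };

/* Length of the newline sequence starting at s[pos], or 0 if there is
   none under the chosen convention. */
static size_t newline_len_at(const char *s, size_t len, size_t pos,
                             enum newline_kind nl)
{
    if (nl == NL_CRLF)
        return (pos + 1 < len && s[pos] == '\r' && s[pos + 1] == '\n')
                   ? 2 : 0;
    return (pos < len && s[pos] == '\n') ? 1 : 0;
}

/* Would "$" be satisfied at offset pos of the subject s[0..len)? */
static bool dollar_matches_at(const char *s, size_t len, size_t pos,
                              bool multiline, enum newline_kind nl)
{
    size_t n;

    assert(pos <= len);
    if (pos == len)
        return true;            /* very end of the subject */

    n = newline_len_at(s, len, pos, nl);
    if (n == 0)
        return false;           /* not directly before a newline */
    if (multiline)
        return true;            /* any internal newline counts */
    return pos + n == len;      /* otherwise only a terminating one */
}
```

For the subject aa<CR><LF>b, the offset just after "aa" is 2. Without
multiline the model rejects it, because the newline there is not the last
thing in the subject. With multiline but an LF-only convention it still
rejects offset 2, because the next character is <CR>; offset 3 is accepted,
but the character before it is <CR>, not "a". Only multiline together with
CRLF accepts offset 2, which is where a$ needs the assertion to hold.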





Philip

--
Philip Hazel