Revision: 332
http://vcs.pcre.org/viewvc?view=rev&revision=332
Author: ph10
Date: 2008-04-05 17:11:05 +0100 (Sat, 05 Apr 2008)
Log Message:
-----------
Alan Lehotsky's patch for REG_STARTEND.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/doc/pcreposix.3
code/trunk/pcreposix.c
code/trunk/pcreposix.h
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2008-04-05 15:38:53 UTC (rev 331)
+++ code/trunk/ChangeLog 2008-04-05 16:11:05 UTC (rev 332)
@@ -10,7 +10,7 @@
2. Applied Craig's patch to pcrecpp.cc to restore ABI compatibility with
pre-7.6 versions, which defined a global no_arg variable instead of putting
- it in the RE class.
+ it in the RE class. (See also #8 below.)
3. Remove a line of dead code, identified by coverity and reported by Nuno
Lopes.
@@ -43,7 +43,10 @@
9. Applied Craig's patch to remove the use of push_back().
+10. Applied Alan Lehotsky's patch to add REG_STARTEND support to the POSIX
+ matching function regexec().
+
Version 7.6 28-Jan-08
---------------------
Modified: code/trunk/doc/pcreposix.3
===================================================================
--- code/trunk/doc/pcreposix.3 2008-04-05 15:38:53 UTC (rev 331)
+++ code/trunk/doc/pcreposix.3 2008-04-05 16:11:05 UTC (rev 332)
@@ -157,8 +157,9 @@
.rs
.sp
The function \fBregexec()\fP is called to match a compiled pattern \fIpreg\fP
-against a given \fIstring\fP, which is terminated by a zero byte, subject to
-the options in \fIeflags\fP. These can be:
+against a given \fIstring\fP, which is by default terminated by a zero byte
+(but see REG_STARTEND below), subject to the options in \fIeflags\fP. These can
+be:
.sp
REG_NOTBOL
.sp
@@ -169,6 +170,17 @@
.sp
The PCRE_NOTEOL option is set when calling the underlying PCRE matching
function.
+.sp
+ REG_STARTEND
+.sp
+The string is considered to start at \fIstring\fP + \fIpmatch[0].rm_so\fP and
+to have a terminating NUL located at \fIstring\fP + \fIpmatch[0].rm_eo\fP
+(there need not actually be a NUL at that location), regardless of the value of
+\fInmatch\fP. This is a BSD extension, compatible with but not specified by
+IEEE Standard 1003.2 (POSIX.2), and should be used with caution in software
+intended to be portable to other systems. Note that a non-zero \fIrm_so\fP does
+not imply REG_NOTBOL; REG_STARTEND affects only the location of the string, not
+how it is matched.
.P
If the pattern was compiled with the REG_NOSUB flag, no data about any matched
strings is returned. The \fInmatch\fP and \fIpmatch\fP arguments of
@@ -221,6 +233,6 @@
.rs
.sp
.nf
-Last updated: 06 March 2007
-Copyright (c) 1997-2007 University of Cambridge.
+Last updated: 05 April 2008
+Copyright (c) 1997-2008 University of Cambridge.
.fi
Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c 2008-04-05 15:38:53 UTC (rev 331)
+++ code/trunk/pcreposix.c 2008-04-05 16:11:05 UTC (rev 332)
@@ -263,7 +263,7 @@
regexec(const regex_t *preg, const char *string, size_t nmatch,
regmatch_t pmatch[], int eflags)
{
-int rc;
+int rc, so, eo;
int options = 0;
int *ovector = NULL;
int small_ovector[POSIX_MALLOC_THRESHOLD * 3];
@@ -296,7 +296,23 @@
}
}
-rc = pcre_exec((const pcre *)preg->re_pcre, NULL, string, (int)strlen(string),
+/* REG_STARTEND is a BSD extension, to allow for non-NUL-terminated strings.
+The man page from OS X says "REG_STARTEND affects only the location of the
+string, not how it is matched". That is why the "so" value is used to bump the
+start location rather than being passed as a PCRE "starting offset". */
+
+if ((eflags & REG_STARTEND) != 0)
+ {
+ so = pmatch[0].rm_so;
+ eo = pmatch[0].rm_eo;
+ }
+else
+ {
+ so = 0;
+ eo = strlen(string);
+ }
+
+rc = pcre_exec((const pcre *)preg->re_pcre, NULL, string + so, (eo - so),
0, options, ovector, nmatch * 3);
if (rc == 0) rc = nmatch; /* All captured slots were filled in */
Modified: code/trunk/pcreposix.h
===================================================================
--- code/trunk/pcreposix.h 2008-04-05 15:38:53 UTC (rev 331)
+++ code/trunk/pcreposix.h 2008-04-05 16:11:05 UTC (rev 332)
@@ -59,6 +59,7 @@
#define REG_DOTALL 0x0010 /* NOT defined by POSIX. */
#define REG_NOSUB 0x0020
#define REG_UTF8 0x0040 /* NOT defined by POSIX. */
+#define REG_STARTEND 0x0080 /* BSD feature: pass subject string by so,eo */
/* This is not used by PCRE, but by defining it we make it easier
to slot PCRE into existing programs that make POSIX calls. */
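
Usage note (not part of the commit): the documentation change above says that
with REG_STARTEND the subject runs from string + pmatch[0].rm_so up to, but not
including, string + pmatch[0].rm_eo, and need not be NUL-terminated. A minimal
sketch of a caller follows. The helper name match_in_range is hypothetical, and
the sketch assumes a <regex.h> that defines REG_STARTEND (BSD, macOS, glibc);
to use the PCRE wrapper instead, include pcreposix.h and link against the PCRE
POSIX library.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: match the compiled pattern
   against the bytes buf[start] .. buf[end-1]. The buffer does not have to be
   NUL-terminated, because REG_STARTEND takes the bounds from pmatch[0]. */
static int match_in_range(const regex_t *re, const char *buf,
                          size_t start, size_t end, regmatch_t *m)
{
    assert(start <= end);

    /* On input, pmatch[0] carries the subject bounds. */
    m->rm_so = (regoff_t)start;
    m->rm_eo = (regoff_t)end;

    /* On success, m is overwritten with the offsets of the match. */
    return regexec(re, buf, 1, m, REG_STARTEND);
}
```

For example, with the pattern "b+" compiled with REG_EXTENDED and the seven
bytes "aabbbcc" stored without a terminating NUL, match_in_range(&re, buf, 0,
4, &m) looks only at "aabb" and reports a match from offset 2 to offset 4. When
start is non-zero, check whether the library in use reports rm_so and rm_eo
relative to buf or to buf + start: the pcreposix.c change above passes
string + so to pcre_exec() and shows no adjustment of the returned offsets.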