Revision: 702
http://www.exim.org/viewvc/pcre2?view=rev&revision=702
Author: ph10
Date: 2017-03-24 18:20:34 +0000 (Fri, 24 Mar 2017)
Log Message:
-----------
Make \=find_limits apply to DFA matching, to find the minimum depth limit.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/doc/pcre2test.1
code/trunk/src/pcre2test.c
code/trunk/testdata/testinput6
code/trunk/testdata/testoutput6
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2017-03-24 16:53:38 UTC (rev 701)
+++ code/trunk/ChangeLog 2017-03-24 18:20:34 UTC (rev 702)
@@ -84,7 +84,11 @@
14. The alternative matching function, pcre2_dfa_match() misbehaved if it
encountered a character class with a possessive repeat, for example [a-f]{3}+.
+15. The depth (formerly recursion) limit now applies to DFA matching (as
+of 10.23/36); pcre2test has been upgraded so that \=find_limits works with DFA
+matching to find the minimum value for this limit.
+
Version 10.23 14-February-2017
------------------------------
Modified: code/trunk/doc/pcre2test.1
===================================================================
--- code/trunk/doc/pcre2test.1 2017-03-24 16:53:38 UTC (rev 701)
+++ code/trunk/doc/pcre2test.1 2017-03-24 18:20:34 UTC (rev 702)
@@ -1,4 +1,4 @@
-.TH PCRE2TEST 1 "21 March 2017" "PCRE 10.30"
+.TH PCRE2TEST 1 "24 March 2017" "PCRE 10.30"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@@ -1052,7 +1052,7 @@
copy=<number or name> copy captured substring
depth_limit=<n> set a depth limit
dfa use \fBpcre2_dfa_match()\fP
- find_limits find match and recursion limits
+ find_limits find match and depth limits
get=<number or name> extract captured substring
getall extract all captured substrings
/g global global matching
@@ -1297,23 +1297,26 @@
.SS "Finding minimum limits"
.rs
.sp
-If the \fBfind_limits\fP modifier is present, \fBpcre2test\fP calls
-\fBpcre2_match()\fP several times, setting different values in the match
-context via \fBpcre2_set_match_limit()\fP and \fBpcre2_set_depth_limit()\fP
-until it finds the minimum values for each parameter that allow
-\fBpcre2_match()\fP to complete without error.
+If the \fBfind_limits\fP modifier is present on a subject line, \fBpcre2test\fP
+calls the relevant matching function several times, setting different values in
+the match context via \fBpcre2_set_match_limit()\fP or
+\fBpcre2_set_depth_limit()\fP until it finds the minimum value for each
+parameter that allows the match to complete without error.
.P
If JIT is being used, only the match limit is relevant. If DFA matching is
-being used, only the depth limit is relevant, but at present this modifier is
-ignored (with a warning message).
+being used, only the depth limit is relevant.
.P
The \fImatch_limit\fP number is a measure of the amount of backtracking
that takes place, and learning the minimum value can be instructive. For most
simple matches, the number is quite small, but for patterns with very large
numbers of matching possibilities, it can become large very quickly with
-increasing length of subject string. The \fIdepth_limit\fP number is
-a measure of how much memory for recording backtracking points is needed to
-complete the match attempt.
+increasing length of subject string.
+.P
+For non-DFA matching, the minimum \fIdepth_limit\fP number is a measure of how
+much memory for recording backtracking points is needed to complete the match
+attempt. In the case of DFA matching, \fIdepth_limit\fP controls the depth of
+recursive calls of the internal function that is used for handling pattern
+recursion, lookaround assertions, and atomic groups.
.
.
.SS "Showing MARK names"
@@ -1765,6 +1768,6 @@
.rs
.sp
.nf
-Last updated: 21 March 2017
+Last updated: 24 March 2017
Copyright (c) 1997-2017 University of Cambridge.
.fi
Modified: code/trunk/src/pcre2test.c
===================================================================
--- code/trunk/src/pcre2test.c 2017-03-24 16:53:38 UTC (rev 701)
+++ code/trunk/src/pcre2test.c 2017-03-24 18:20:34 UTC (rev 702)
@@ -5258,8 +5258,20 @@
* Check match or depth limit *
*************************************************/
+/* This is used for DFA, normal, and JIT fast matching. For DFA matching it
should only be called with the third argument set to PCRE2_ERROR_DEPTHLIMIT.
+
+Arguments:
+ pp the subject string
+ ulen length of subject or PCRE2_ZERO_TERMINATED
+ errnumber defines which limit to test
+ msg string to include in final message
+
+Returns: the return from the final match function call
+*/
+
static int
-check_match_limit(uint8_t *pp, size_t ulen, int errnumber, const char *msg)
+check_match_limit(uint8_t *pp, PCRE2_SIZE ulen, int errnumber, const char *msg)
{
int capcount;
uint32_t min = 0;
@@ -5279,10 +5291,22 @@
{
PCRE2_SET_DEPTH_LIMIT(dat_context, mid);
}
-
- if ((pat_patctl.control & CTL_JITFAST) != 0)
+
+ if ((dat_datctl.control & CTL_DFA) != 0)
+ {
+ if (dfa_workspace == NULL)
+ dfa_workspace = (int *)malloc(DFA_WS_DIMENSION*sizeof(int));
+ if (dfa_matched++ == 0)
+ dfa_workspace[0] = -1; /* To catch bad restart */
+ PCRE2_DFA_MATCH(capcount, compiled_code, pp, ulen, dat_datctl.offset,
+ dat_datctl.options, match_data,
+ PTR(dat_context), dfa_workspace, DFA_WS_DIMENSION);
+ }
+
+ else if ((pat_patctl.control & CTL_JITFAST) != 0)
PCRE2_JIT_MATCH(capcount, compiled_code, pp, ulen, dat_datctl.offset,
dat_datctl.options, match_data, PTR(dat_context));
+
else
PCRE2_MATCH(capcount, compiled_code, pp, ulen, dat_datctl.offset,
dat_datctl.options, match_data, PTR(dat_context));
@@ -6243,12 +6267,6 @@
/* Handle matching via the native interface. Check for consistency of
modifiers. */
-if ((dat_datctl.control & (CTL_DFA|CTL_FINDLIMITS)) == (CTL_DFA|CTL_FINDLIMITS))
- {
- fprintf(outfile, "** Finding match limits is not relevant for DFA matching: ignored\n");
- dat_datctl.control &= ~CTL_FINDLIMITS;
- }
-
/* ALLUSEDTEXT is not supported with JIT, but JIT is not used with DFA
matching, even if the JIT compiler was used. */
@@ -6579,14 +6597,19 @@
(double)CLOCKS_PER_SEC);
}
- /* Find the match and depth limits if requested. The depth limit
- is not relevant for JIT. */
+ /* Find the match and depth limits if requested. The match limit is not
+ relevant for DFA matching and the depth limit is not relevant for JIT. */
if ((dat_datctl.control & CTL_FINDLIMITS) != 0)
{
- capcount = check_match_limit(pp, arg_ulen, PCRE2_ERROR_MATCHLIMIT, "match");
- if (FLD(compiled_code, executable_jit) == NULL)
- (void)check_match_limit(pp, arg_ulen, PCRE2_ERROR_DEPTHLIMIT,
+ if ((dat_datctl.control & CTL_DFA) == 0)
+ capcount = check_match_limit(pp, arg_ulen, PCRE2_ERROR_MATCHLIMIT,
+ "match");
+ else capcount = 0;
+ if (FLD(compiled_code, executable_jit) == NULL ||
+ (dat_datctl.options & PCRE2_NO_JIT) != 0 ||
+ (dat_datctl.control & CTL_DFA) != 0)
+ capcount = check_match_limit(pp, arg_ulen, PCRE2_ERROR_DEPTHLIMIT,
"depth");
}
Modified: code/trunk/testdata/testinput6
===================================================================
--- code/trunk/testdata/testinput6 2017-03-24 16:53:38 UTC (rev 701)
+++ code/trunk/testdata/testinput6 2017-03-24 18:20:34 UTC (rev 702)
@@ -4889,4 +4889,7 @@
/(02-)?[0-9]{3}-[0-9]{3}/
02-123-123
+/^(a(?2))(b)(?1)/
+ abbab\=find_limits
+
# End of testinput6
Modified: code/trunk/testdata/testoutput6
===================================================================
--- code/trunk/testdata/testoutput6 2017-03-24 16:53:38 UTC (rev 701)
+++ code/trunk/testdata/testoutput6 2017-03-24 18:20:34 UTC (rev 702)
@@ -7689,4 +7689,9 @@
02-123-123
0: 02-123-123
+/^(a(?2))(b)(?1)/
+ abbab\=find_limits
+Minimum depth limit = 2
+ 0: abbab
+
# End of testinput6
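
Illustrative sketch (not part of this commit): the search that \=find_limits performs can be pictured as a doubling phase followed by bisection over the limit value. The C code below is a simplified, self-contained model of that idea and is not the code in pcre2test.c. The names find_min_limit, try_fn, fake_match, and fake_try, the TRY_* codes, and the starting value of 64 are all assumptions made for the example. In pcre2test the role of the callback is played by pcre2_match(), pcre2_jit_match(), or (after this change) pcre2_dfa_match(), and the "limit hit" outcome corresponds to PCRE2_ERROR_MATCHLIMIT or PCRE2_ERROR_DEPTHLIMIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome codes standing in for PCRE2 return values. */
enum { TRY_OK = 0, TRY_LIMIT_HIT = -1, TRY_OTHER_ERROR = -2 };

/* Hypothetical callback: run one match attempt with the given limit. */
typedef int (*try_fn)(uint32_t limit, void *ctx);

/* Return the smallest limit for which try_limit() does not report
   TRY_LIMIT_HIT, or 0 if the search cannot complete (an unrelated error,
   or even UINT32_MAX is too small). */
uint32_t find_min_limit(try_fn try_limit, void *ctx)
{
  uint32_t min = 0;      /* largest value known (or assumed) to be too small */
  uint32_t mid = 64;     /* next value to try; 64 is an assumed start */
  uint32_t max = 0;      /* smallest value known to work, once have_max is set */
  int have_max = 0;

  assert(try_limit != NULL);

  for (;;)
    {
    int rc = try_limit(mid, ctx);

    if (rc == TRY_LIMIT_HIT)
      {
      min = mid;
      if (!have_max)
        {
        /* Doubling phase: no working value seen yet. */
        if (mid == UINT32_MAX) return 0;
        mid = (mid > UINT32_MAX / 2) ? UINT32_MAX : mid * 2;
        }
      else if (max - min == 1) return max;   /* max already known to work */
      else mid = min + (max - min) / 2;      /* bisect between min and max */
      }
    else if (rc == TRY_OK)
      {
      if (mid == min + 1) return mid;        /* mid works, mid - 1 does not */
      max = mid;
      have_max = 1;
      mid = min + (max - min) / 2;
      }
    else return 0;                           /* some other error: give up */
    }
}

/* Hypothetical stand-in for a real match call: it succeeds only when the
   limit is at least 'needed', and counts how often it was called. */
struct fake_match { uint32_t needed; unsigned calls; };

int fake_try(uint32_t limit, void *ctx)
{
  struct fake_match *f = ctx;
  f->calls++;
  return (limit >= f->needed) ? TRY_OK : TRY_LIMIT_HIT;
}
```

With a stand-in that needs a limit of 2 (struct fake_match f = { 2, 0 };), find_min_limit(fake_try, &f) tries 64, 32, 16, 8, 4, 2, and 1 and then returns 2. The callback design keeps the search independent of which matching function is being exercised, which is the property this commit relies on when it routes DFA matching through the same limit-checking function.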