Revision: 1443
http://vcs.pcre.org/viewvc?view=rev&revision=1443
Author: ph10
Date: 2014-01-12 19:20:27 +0000 (Sun, 12 Jan 2014)
Log Message:
-----------
Check alternative outputs for the locale test in RunTest. It should now work
for the 'fr' locale (which was broken).
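
The "pass if any alternative matches" check described in the log message can be sketched as a small standalone function. This is an illustration only, not the code in RunTest: the name matches_any is hypothetical, and it uses cmp -s where RunTest uses its own $cf comparison command.

```shell
#!/bin/sh
# Hypothetical helper (not part of RunTest): succeed if the actual output
# file is byte-identical to any one of the candidate expected-output files.
# Usage: matches_any ACTUAL EXPECTED...
matches_any() {
  actual=$1
  shift
  for expected in "$@"; do
    # cmp -s prints nothing and returns 0 only when the files are identical.
    if cmp -s "$expected" "$actual"; then
      return 0
    fi
  done
  # No candidate matched (or no candidates were given).
  return 1
}
```

A call such as `matches_any testtry testoutput3 testoutput3A testoutput3B` would then succeed as soon as one of the three files matches, which is why a further locale variant only needs one more argument.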
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/RunTest
code/trunk/testdata/testinput3
code/trunk/testdata/testoutput3
code/trunk/testdata/wintestoutput3
Added Paths:
-----------
code/trunk/testdata/testoutput3A
code/trunk/testdata/testoutput3B
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/ChangeLog 2014-01-12 19:20:27 UTC (rev 1443)
@@ -68,6 +68,18 @@
14. Improve pattern prefix search by a simplified Boyer-Moore algorithm in JIT.
The algorithm provides a way to skip certain starting offsets, and is usually
faster than linear prefix searches.
+
+15. Change 13 for 8.20 updated RunTest to check for the 'fr' locale as well
+ as for 'fr_FR' and 'french'. For some reason, however, it then used the
+ Windows-specific input and output files, which have 'french' screwed in.
+ So this could never have worked. One of the problems with locales is that
+ they aren't always the same. I have now updated RunTest so that it checks
+ the output of the locale test (test 3) against three different output
+ files, and it allows the test to pass if any one of them matches. With luck
+ this should make the test pass on some versions of Solaris where it was
+ failing. Because of the uncertainty, the script previously did not stop if
+ test 3 failed; it now does. If further versions of a French locale ever
+ come to light, they can now easily be added.
Version 8.34 15-December-2013
Modified: code/trunk/RunTest
===================================================================
--- code/trunk/RunTest 2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/RunTest 2014-01-12 19:20:27 UTC (rev 1443)
@@ -31,6 +31,11 @@
# except test 10. Whatever order the arguments are in, the tests are always run
# in numerical order.
#
+# The special argument "3S" runs test 3, stopping if it fails. Test 3 is the
+# locale test, and failure usually means there's an issue with the locale
+# rather than a bug in PCRE, so normally subsequent tests are run. "3S" is
+# useful when you want to debug or update the test.
+#
# Inappropriate tests are automatically skipped (with a comment to say so): for
# example, if JIT support is not compiled, test 12 is skipped, whereas if JIT
# support is compiled, test 13 is skipped.
@@ -458,8 +463,9 @@
# Locale-specific tests, provided that either the "fr_FR" or the "french"
# locale is available. The former is the Unix-like standard; the latter is
-# for Windows. Another possibility is "fr", which needs to be run against
-# the Windows-specific input and output files.
+# for Windows. Another possibility is "fr". Unfortunately, different versions
+# of the French locale give different outputs for some items. This test passes
+# if the output matches any one of the alternative output files.
if [ $do3 = yes ] ; then
locale -a | grep '^fr_FR$' >/dev/null
@@ -467,20 +473,28 @@
locale=fr_FR
infile=$testdata/testinput3
outfile=$testdata/testoutput3
+ outfile2=$testdata/testoutput3A
+ outfile3=$testdata/testoutput3B
else
infile=test3input
outfile=test3output
+ outfile2=test3outputA
+ outfile3=test3outputB
locale -a | grep '^french$' >/dev/null
if [ $? -eq 0 ] ; then
locale=french
sed 's/fr_FR/french/' $testdata/testinput3 >test3input
sed 's/fr_FR/french/' $testdata/testoutput3 >test3output
+ sed 's/fr_FR/french/' $testdata/testoutput3A >test3outputA
+ sed 's/fr_FR/french/' $testdata/testoutput3B >test3outputB
else
locale -a | grep '^fr$' >/dev/null
if [ $? -eq 0 ] ; then
locale=fr
- sed 's/fr_FR/fr/' $testdata/wintestinput3 >test3input
- sed 's/fr_FR/fr/' $testdata/wintestoutput3 >test3output
+ sed 's/fr_FR/fr/' $testdata/testinput3 >test3input
+ sed 's/fr_FR/fr/' $testdata/testoutput3 >test3output
+ sed 's/fr_FR/fr/' $testdata/testoutput3A >test3outputA
+ sed 's/fr_FR/fr/' $testdata/testoutput3B >test3outputB
else
locale=
fi
@@ -492,18 +506,20 @@
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $infile testtry
if [ $? = 0 ] ; then
- $cf $outfile testtry
- if [ $? != 0 ] ; then
- echo " "
- echo "Locale test did not run entirely successfully."
- echo "This usually means that there is a problem with the locale"
- echo "settings rather than a bug in PCRE."
- break;
- else
+ if $cf $outfile testtry >teststdout || \
+ $cf $outfile2 testtry >teststdout || \
+ $cf $outfile3 testtry >teststdout
+ then
if [ "$opt" = "-s" ] ; then echo " OK with study"
elif [ "$opt" = "-s+" ] ; then echo " OK with JIT study"
else echo " OK"
fi
+ else
+ echo "** Locale test did not run successfully. The output did not match"
+ echo " $outfile, $outfile2 or $outfile3."
+ echo " This may mean that there is a problem with the locale settings rather"
+ echo " than a bug in PCRE."
+ exit 1
fi
else exit 1
fi
@@ -989,6 +1005,6 @@
done
# Clean up local working files
-rm -f test3input test3output testNinput testsaved* teststderr teststdout testtry
+rm -f test3input test3output test3outputA test3outputB testNinput testsaved* teststderr teststdout testtry
# End
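
The locale selection in the RunTest hunk above tries "fr_FR", then "french", then "fr". A minimal sketch of that fallback order as a standalone function is given here. The name pick_french_locale is hypothetical and does not appear in RunTest; the function reads the list of locale names from standard input so that it can be tried without depending on which locales are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of RunTest): read locale names on stdin, one
# per line (for example from `locale -a`), and print the first of "fr_FR",
# "french" and "fr" that is present. Prints nothing and returns 1 if none is.
pick_french_locale() {
  available=$(cat)
  for candidate in fr_FR french fr; do
    # -F treats the name as a fixed string and -x requires a whole-line
    # match, so "fr_FR.UTF-8" does not count as "fr_FR".
    if printf '%s\n' "$available" | grep -Fqx -- "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

With this sketch, `locale=$(locale -a | pick_french_locale)` would leave the variable empty when no French locale is available, which corresponds to the `locale=` fallback in the script.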
Modified: code/trunk/testdata/testinput3
===================================================================
--- code/trunk/testdata/testinput3 2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/testdata/testinput3 2014-01-12 19:20:27 UTC (rev 1443)
@@ -1,7 +1,10 @@
-/-- This set of tests checks local-specific features, using the fr_FR locale.
- It is not Perl-compatible. There is different version called wintestinput3
- for use on Windows, where the locale is called "french". --/
-
+/-- This set of tests checks local-specific features, using the "fr_FR" locale.
+ It is not Perl-compatible. When run via RunTest, the locale is edited to
+ be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+ different version of this file called wintestinput3 for use on Windows,
+ where the locale is called "french" and the tests are run using
+ RunTest.bat. --/
+
< forbid 8W
/^[\w]+/
Modified: code/trunk/testdata/testoutput3
===================================================================
--- code/trunk/testdata/testoutput3 2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/testdata/testoutput3 2014-01-12 19:20:27 UTC (rev 1443)
@@ -1,7 +1,10 @@
-/-- This set of tests checks local-specific features, using the fr_FR locale.
- It is not Perl-compatible. There is different version called wintestinput3
- for use on Windows, where the locale is called "french". --/
-
+/-- This set of tests checks local-specific features, using the "fr_FR" locale.
+ It is not Perl-compatible. When run via RunTest, the locale is edited to
+ be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+ different version of this file called wintestinput3 for use on Windows,
+ where the locale is called "french" and the tests are run using
+ RunTest.bat. --/
+
< forbid 8W
/^[\w]+/
Added: code/trunk/testdata/testoutput3A
===================================================================
--- code/trunk/testdata/testoutput3A (rev 0)
+++ code/trunk/testdata/testoutput3A 2014-01-12 19:20:27 UTC (rev 1443)
@@ -0,0 +1,174 @@
+/-- This set of tests checks local-specific features, using the "fr_FR" locale.
+ It is not Perl-compatible. When run via RunTest, the locale is edited to
+ be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+ different version of this file called wintestinput3 for use on Windows,
+ where the locale is called "french" and the tests are run using
+ RunTest.bat. --/
+
+< forbid 8W
+
+/^[\w]+/
+ *** Failers
+No match
+ \xC9cole
+No match
+
+/^[\w]+/Lfr_FR
+ \xC9cole
+ 0: \xC9cole
+
+/^[\w]+/
+ *** Failers
+No match
+ \xC9cole
+No match
+
+/^[\W]+/
+ \xC9cole
+ 0: \xc9
+
+/^[\W]+/Lfr_FR
+ *** Failers
+ 0: ***
+ \xC9cole
+No match
+
+/[\b]/
+ \b
+ 0: \x08
+ *** Failers
+No match
+ a
+No match
+
+/[\b]/Lfr_FR
+ \b
+ 0: \x08
+ *** Failers
+No match
+ a
+No match
+
+/^\w+/
+ *** Failers
+No match
+ \xC9cole
+No match
+
+/^\w+/Lfr_FR
+ \xC9cole
+ 0: \xC9cole
+
+/(.+)\b(.+)/
+ \xC9cole
+ 0: \xc9cole
+ 1: \xc9
+ 2: cole
+
+/(.+)\b(.+)/Lfr_FR
+ *** Failers
+ 0: *** Failers
+ 1: ***
+ 2: Failers
+ \xC9cole
+No match
+
+/\xC9cole/i
+ \xC9cole
+ 0: \xc9cole
+ *** Failers
+No match
+ \xE9cole
+No match
+
+/\xC9cole/iLfr_FR
+ \xC9cole
+ 0: \xC9cole
+ \xE9cole
+ 0: \xE9cole
+
+/\w/IS
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+ Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
+
+/\w/ISLfr_FR
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+ Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
+ \xAA \xB5 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2
+ \xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC \xFD \xFE \xFF
+
+/^[\xc8-\xc9]/iLfr_FR
+ \xC9cole
+ 0: \xC9
+ \xE9cole
+ 0: \xE9
+
+/^[\xc8-\xc9]/Lfr_FR
+ \xC9cole
+ 0: \xC9
+ *** Failers
+No match
+ \xE9cole
+No match
+
+/\W+/Lfr_FR
+ >>>\xaa<<<
+ 0: >>>
+ >>>\xba<<<
+ 0: >>>
+
+/[\W]+/Lfr_FR
+ >>>\xaa<<<
+ 0: >>>
+ >>>\xba<<<
+ 0: >>>
+
+/[^[:alpha:]]+/Lfr_FR
+ >>>\xaa<<<
+ 0: >>>
+ >>>\xba<<<
+ 0: >>>
+
+/\w+/Lfr_FR
+ >>>\xaa<<<
+ 0: \xAA
+ >>>\xba<<<
+ 0: \xBA
+
+/[\w]+/Lfr_FR
+ >>>\xaa<<<
+ 0: \xAA
+ >>>\xba<<<
+ 0: \xBA
+
+/[[:alpha:]]+/Lfr_FR
+ >>>\xaa<<<
+ 0: \xAA
+ >>>\xba<<<
+ 0: \xBA
+
+/[[:alpha:]][[:lower:]][[:upper:]]/DZLfr_FR
+------------------------------------------------------------------
+ Bra
+ [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
+ [a-z\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
+ [A-Z\xc0-\xd6\xd8-\xde]
+ Ket
+ End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+
+/-- End of testinput3 --/
Added: code/trunk/testdata/testoutput3B
===================================================================
--- code/trunk/testdata/testoutput3B (rev 0)
+++ code/trunk/testdata/testoutput3B 2014-01-12 19:20:27 UTC (rev 1443)
@@ -0,0 +1,174 @@
+/-- This set of tests checks local-specific features, using the "fr_FR" locale.
+ It is not Perl-compatible. When run via RunTest, the locale is edited to
+ be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+ different version of this file called wintestinput3 for use on Windows,
+ where the locale is called "french" and the tests are run using
+ RunTest.bat. --/
+
+< forbid 8W
+
+/^[\w]+/
+ *** Failers
+No match
+ \xC9cole
+No match
+
+/^[\w]+/Lfr_FR
+ \xC9cole
+ 0: \xC9cole
+
+/^[\w]+/
+ *** Failers
+No match
+ \xC9cole
+No match
+
+/^[\W]+/
+ \xC9cole
+ 0: \xc9
+
+/^[\W]+/Lfr_FR
+ *** Failers
+ 0: ***
+ \xC9cole
+No match
+
+/[\b]/
+ \b
+ 0: \x08
+ *** Failers
+No match
+ a
+No match
+
+/[\b]/Lfr_FR
+ \b
+ 0: \x08
+ *** Failers
+No match
+ a
+No match
+
+/^\w+/
+ *** Failers
+No match
+ \xC9cole
+No match
+
+/^\w+/Lfr_FR
+ \xC9cole
+ 0: \xC9cole
+
+/(.+)\b(.+)/
+ \xC9cole
+ 0: \xc9cole
+ 1: \xc9
+ 2: cole
+
+/(.+)\b(.+)/Lfr_FR
+ *** Failers
+ 0: *** Failers
+ 1: ***
+ 2: Failers
+ \xC9cole
+No match
+
+/\xC9cole/i
+ \xC9cole
+ 0: \xc9cole
+ *** Failers
+No match
+ \xE9cole
+No match
+
+/\xC9cole/iLfr_FR
+ \xC9cole
+ 0: \xC9cole
+ \xE9cole
+ 0: \xE9cole
+
+/\w/IS
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+ Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
+
+/\w/ISLfr_FR
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+ Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
+ \xAA \xB5 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2
+ \xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC \xFD \xFE \xFF
+
+/^[\xc8-\xc9]/iLfr_FR
+ \xC9cole
+ 0: \xC9
+ \xE9cole
+ 0: \xE9
+
+/^[\xc8-\xc9]/Lfr_FR
+ \xC9cole
+ 0: \xC9
+ *** Failers
+No match
+ \xE9cole
+No match
+
+/\W+/Lfr_FR
+ >>>\xaa<<<
+ 0: >>>
+ >>>\xba<<<
+ 0: >>>
+
+/[\W]+/Lfr_FR
+ >>>\xaa<<<
+ 0: >>>
+ >>>\xba<<<
+ 0: >>>
+
+/[^[:alpha:]]+/Lfr_FR
+ >>>\xaa<<<
+ 0: >>>
+ >>>\xba<<<
+ 0: >>>
+
+/\w+/Lfr_FR
+ >>>\xaa<<<
+ 0: \xAA
+ >>>\xba<<<
+ 0: \xBA
+
+/[\w]+/Lfr_FR
+ >>>\xaa<<<
+ 0: \xAA
+ >>>\xba<<<
+ 0: \xBA
+
+/[[:alpha:]]+/Lfr_FR
+ >>>\xaa<<<
+ 0: \xAA
+ >>>\xba<<<
+ 0: \xBA
+
+/[[:alpha:]][[:lower:]][[:upper:]]/DZLfr_FR
+------------------------------------------------------------------
+ Bra
+ [A-Za-z\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
+ [a-z\x83\x9a\x9c\x9e\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
+ [A-Z\x8a\x8c\x8e\x9f\xc0-\xd6\xd8-\xde]
+ Ket
+ End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+
+/-- End of testinput3 --/
Modified: code/trunk/testdata/wintestoutput3
===================================================================
--- code/trunk/testdata/wintestoutput3 2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/testdata/wintestoutput3 2014-01-12 19:20:27 UTC (rev 1443)
@@ -84,7 +84,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
/\w/ISLfrench
@@ -93,7 +93,7 @@
No first char
No need char
Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
\x83 \x8A \x8C \x8E \x9A \x9C \x9E \x9F \xAA \xB2 \xB3 \xB5 \xB9 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6
\xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2 \xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC