[Pcre-svn] [1443] code/trunk: Check alternative outputs for …

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [1443] code/trunk: Check alternative outputs for the locale test in RunTest.
Revision: 1443
          http://vcs.pcre.org/viewvc?view=rev&revision=1443
Author:   ph10
Date:     2014-01-12 19:20:27 +0000 (Sun, 12 Jan 2014)


Log Message:
-----------
Check alternative outputs for the locale test in RunTest. It should now work
for the 'fr' locale (which was broken).
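
A minimal sketch of the "pass if any alternative matches" check described in
the log message, written as a standalone shell function. The name matches_any
is hypothetical and is not part of RunTest; RunTest itself chains its own $cf
comparison command with ||. The sketch assumes that a byte-for-byte comparison
with cmp -s is an acceptable stand-in for $cf.

```shell
#!/bin/sh
# Hypothetical helper (not part of RunTest): succeed if the actual output
# file is identical to at least one of the candidate expected-output files.
# Usage: matches_any ACTUAL EXPECTED...
matches_any() {
  actual=$1
  shift
  for expected in "$@"; do
    # cmp -s prints nothing and exits 0 only when the two files are identical.
    if cmp -s "$expected" "$actual"; then
      return 0
    fi
  done
  # No candidate matched.
  return 1
}
```

For the locale test this would be called along the lines of
"matches_any testtry testoutput3 testoutput3A testoutput3B"; supporting a
further variant of the French locale then only means adding one more file
name to the list.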

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/RunTest
    code/trunk/testdata/testinput3
    code/trunk/testdata/testoutput3
    code/trunk/testdata/wintestoutput3


Added Paths:
-----------
    code/trunk/testdata/testoutput3A
    code/trunk/testdata/testoutput3B


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/ChangeLog    2014-01-12 19:20:27 UTC (rev 1443)
@@ -68,6 +68,18 @@
 14. Improve pattern prefix search by a simplified Boyer-Moore algorithm in JIT.
     The algorithm provides a way to skip certain starting offsets, and usually
     faster than linear prefix searches.
+    
+15. Change 13 for 8.20 updated RunTest to check for the 'fr' locale as well
+    as for 'fr_FR' and 'french'. For some reason, however, it then used the
+    Windows-specific input and output files, which have 'french' hard-wired
+    in, so this could never have worked. One of the problems with locales is
+    that they aren't always the same. I have now updated RunTest so that it
+    checks the output of the locale test (test 3) against three different
+    output files, and it allows the test to pass if any one of them matches.
+    With luck this should make the test pass on some versions of Solaris
+    where it was failing. Because of the uncertainty, the script previously
+    did not stop if test 3 failed; it now does. If further versions of a
+    French locale ever come to light, they can now easily be added.



Version 8.34 15-December-2013

Modified: code/trunk/RunTest
===================================================================
--- code/trunk/RunTest    2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/RunTest    2014-01-12 19:20:27 UTC (rev 1443)
@@ -31,6 +31,11 @@
 # except test 10. Whatever order the arguments are in, the tests are always run
 # in numerical order.
 #
+# The special argument "3S" runs test 3, stopping if it fails. Test 3 is the
+# locale test, and failure usually means there's an issue with the locale 
+# rather than a bug in PCRE, so normally subsequent tests are run. "3S" is
+# useful when you want to debug or update the test.
+#
 # Inappropriate tests are automatically skipped (with a comment to say so): for
 # example, if JIT support is not compiled, test 12 is skipped, whereas if JIT
 # support is compiled, test 13 is skipped.
@@ -458,8 +463,9 @@


 # Locale-specific tests, provided that either the "fr_FR" or the "french"
 # locale is available. The former is the Unix-like standard; the latter is
-# for Windows. Another possibility is "fr", which needs to be run against
-# the Windows-specific input and output files.
+# for Windows. Another possibility is "fr". Unfortunately, different versions
+# of the French locale give different outputs for some items. This test passes
+# if the output matches any one of the alternative output files.

 if [ $do3 = yes ] ; then
   locale -a | grep '^fr_FR$' >/dev/null
@@ -467,20 +473,28 @@
     locale=fr_FR
     infile=$testdata/testinput3
     outfile=$testdata/testoutput3
+    outfile2=$testdata/testoutput3A 
+    outfile3=$testdata/testoutput3B 
   else
     infile=test3input
     outfile=test3output
+    outfile2=test3outputA 
+    outfile3=test3outputB 
     locale -a | grep '^french$' >/dev/null
     if [ $? -eq 0 ] ; then
       locale=french
       sed 's/fr_FR/french/' $testdata/testinput3 >test3input
       sed 's/fr_FR/french/' $testdata/testoutput3 >test3output
+      sed 's/fr_FR/french/' $testdata/testoutput3A >test3outputA
+      sed 's/fr_FR/french/' $testdata/testoutput3B >test3outputB
     else
       locale -a | grep '^fr$' >/dev/null
       if [ $? -eq 0 ] ; then
         locale=fr
-        sed 's/fr_FR/fr/' $testdata/wintestinput3 >test3input
-        sed 's/fr_FR/fr/' $testdata/wintestoutput3 >test3output
+        sed 's/fr_FR/fr/' $testdata/testinput3 >test3input
+        sed 's/fr_FR/fr/' $testdata/testoutput3 >test3output
+        sed 's/fr_FR/fr/' $testdata/testoutput3A >test3outputA
+        sed 's/fr_FR/fr/' $testdata/testoutput3B >test3outputB
       else
         locale=
       fi
@@ -492,18 +506,20 @@
     for opt in "" "-s" $jitopt; do
       $sim $valgrind ./pcretest -q $bmode $opt $infile testtry
       if [ $? = 0 ] ; then
-        $cf $outfile testtry
-        if [ $? != 0 ] ; then
-          echo " "
-          echo "Locale test did not run entirely successfully."
-          echo "This usually means that there is a problem with the locale"
-          echo "settings rather than a bug in PCRE."
-          break;
-        else
+        if $cf $outfile testtry >teststdout || \
+           $cf $outfile2 testtry >teststdout || \
+           $cf $outfile3 testtry >teststdout 
+        then
           if [ "$opt" = "-s" ] ; then echo "  OK with study"
           elif [ "$opt" = "-s+" ] ; then echo "  OK with JIT study"
           else echo "  OK"
           fi
+        else
+          echo "** Locale test did not run successfully. The output did not match"
+          echo "   $outfile, $outfile2 or $outfile3."
+          echo "   This may mean that there is a problem with the locale settings rather"
+          echo "   than a bug in PCRE."
+          exit 1
         fi
       else exit 1
       fi
@@ -989,6 +1005,6 @@
 done


 # Clean up local working files
-rm -f test3input test3output testNinput testsaved* teststderr teststdout testtry
+rm -f test3input test3output test3outputA test3outputB testNinput testsaved* teststderr teststdout testtry

# End
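
The locale fallback in the RunTest hunk above (fr_FR, then french, then fr)
can be sketched as a single lookup. The function name pick_first_locale is
hypothetical and is not part of RunTest; the sketch assumes the list of
installed locales is passed in as text, one name per line, as printed by
"locale -a".

```shell
#!/bin/sh
# Hypothetical helper (not part of RunTest): print the first candidate name
# that appears as a whole line in the list of available locales.
# Usage: pick_first_locale "AVAILABLE-LIST" CANDIDATE...
pick_first_locale() {
  available=$1
  shift
  for name in "$@"; do
    # -F treats the name as a fixed string, -x requires a whole-line match,
    # so "fr" does not match "fr_FR" or "fr_CA".
    if printf '%s\n' "$available" | grep -Fx -- "$name" >/dev/null; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  # None of the candidates is installed.
  return 1
}
```

A caller might write locale=$(pick_first_locale "$(locale -a)" fr_FR french fr)
and skip the locale test when the function returns a non-zero status.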

Modified: code/trunk/testdata/testinput3
===================================================================
--- code/trunk/testdata/testinput3    2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/testdata/testinput3    2014-01-12 19:20:27 UTC (rev 1443)
@@ -1,7 +1,10 @@
-/-- This set of tests checks local-specific features, using the fr_FR locale. 
-    It is not Perl-compatible. There is different version called wintestinput3
-    for use on Windows, where the locale is called "french". --/
-  
+/-- This set of tests checks local-specific features, using the "fr_FR" locale. 
+    It is not Perl-compatible. When run via RunTest, the locale is edited to
+    be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+    different version of this file called wintestinput3 for use on Windows,
+    where the locale is called "french" and the tests are run using
+    RunTest.bat. --/
+
 < forbid 8W 


 /^[\w]+/

Modified: code/trunk/testdata/testoutput3
===================================================================
--- code/trunk/testdata/testoutput3    2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/testdata/testoutput3    2014-01-12 19:20:27 UTC (rev 1443)
@@ -1,7 +1,10 @@
-/-- This set of tests checks local-specific features, using the fr_FR locale. 
-    It is not Perl-compatible. There is different version called wintestinput3
-    for use on Windows, where the locale is called "french". --/
-  
+/-- This set of tests checks local-specific features, using the "fr_FR" locale. 
+    It is not Perl-compatible. When run via RunTest, the locale is edited to
+    be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+    different version of this file called wintestinput3 for use on Windows,
+    where the locale is called "french" and the tests are run using
+    RunTest.bat. --/
+
 < forbid 8W 


 /^[\w]+/

Added: code/trunk/testdata/testoutput3A
===================================================================
--- code/trunk/testdata/testoutput3A                            (rev 0)
+++ code/trunk/testdata/testoutput3A    2014-01-12 19:20:27 UTC (rev 1443)
@@ -0,0 +1,174 @@
+/-- This set of tests checks local-specific features, using the "fr_FR" locale. 
+    It is not Perl-compatible. When run via RunTest, the locale is edited to
+    be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+    different version of this file called wintestinput3 for use on Windows,
+    where the locale is called "french" and the tests are run using
+    RunTest.bat. --/
+
+< forbid 8W 
+
+/^[\w]+/
+    *** Failers
+No match
+    \xC9cole
+No match
+
+/^[\w]+/Lfr_FR
+    \xC9cole
+ 0: \xC9cole
+
+/^[\w]+/
+    *** Failers
+No match
+    \xC9cole
+No match
+
+/^[\W]+/
+    \xC9cole
+ 0: \xc9
+
+/^[\W]+/Lfr_FR
+    *** Failers
+ 0: *** 
+    \xC9cole
+No match
+
+/[\b]/
+    \b
+ 0: \x08
+    *** Failers
+No match
+    a
+No match
+
+/[\b]/Lfr_FR
+    \b
+ 0: \x08
+    *** Failers
+No match
+    a
+No match
+
+/^\w+/
+    *** Failers
+No match
+    \xC9cole
+No match
+
+/^\w+/Lfr_FR
+    \xC9cole
+ 0: \xC9cole
+
+/(.+)\b(.+)/
+    \xC9cole
+ 0: \xc9cole
+ 1: \xc9
+ 2: cole
+
+/(.+)\b(.+)/Lfr_FR
+    *** Failers
+ 0: *** Failers
+ 1: *** 
+ 2: Failers
+    \xC9cole
+No match
+
+/\xC9cole/i
+    \xC9cole
+ 0: \xc9cole
+    *** Failers
+No match
+    \xE9cole
+No match
+
+/\xC9cole/iLfr_FR
+    \xC9cole
+ 0: \xC9cole
+    \xE9cole
+ 0: \xE9cole
+
+/\w/IS
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
+
+/\w/ISLfr_FR
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
+  \xAA \xB5 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2 
+  \xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC \xFD \xFE \xFF 
+
+/^[\xc8-\xc9]/iLfr_FR
+    \xC9cole
+ 0: \xC9
+    \xE9cole
+ 0: \xE9
+
+/^[\xc8-\xc9]/Lfr_FR
+    \xC9cole
+ 0: \xC9
+    *** Failers 
+No match
+    \xE9cole
+No match
+
+/\W+/Lfr_FR
+    >>>\xaa<<<
+ 0: >>>
+    >>>\xba<<< 
+ 0: >>>
+
+/[\W]+/Lfr_FR
+    >>>\xaa<<<
+ 0: >>>
+    >>>\xba<<< 
+ 0: >>>
+
+/[^[:alpha:]]+/Lfr_FR
+    >>>\xaa<<<
+ 0: >>>
+    >>>\xba<<< 
+ 0: >>>
+
+/\w+/Lfr_FR
+    >>>\xaa<<<
+ 0: \xAA
+    >>>\xba<<< 
+ 0: \xBA
+
+/[\w]+/Lfr_FR
+    >>>\xaa<<<
+ 0: \xAA
+    >>>\xba<<< 
+ 0: \xBA
+
+/[[:alpha:]]+/Lfr_FR
+    >>>\xaa<<<
+ 0: \xAA
+    >>>\xba<<< 
+ 0: \xBA
+    
+/[[:alpha:]][[:lower:]][[:upper:]]/DZLfr_FR 
+------------------------------------------------------------------
+        Bra
+        [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
+        [a-z\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
+        [A-Z\xc0-\xd6\xd8-\xde]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+
+/-- End of testinput3 --/


Added: code/trunk/testdata/testoutput3B
===================================================================
--- code/trunk/testdata/testoutput3B                            (rev 0)
+++ code/trunk/testdata/testoutput3B    2014-01-12 19:20:27 UTC (rev 1443)
@@ -0,0 +1,174 @@
+/-- This set of tests checks local-specific features, using the "fr_FR" locale. 
+    It is not Perl-compatible. When run via RunTest, the locale is edited to
+    be whichever of "fr_FR", "french", or "fr" is found to exist. There is
+    different version of this file called wintestinput3 for use on Windows,
+    where the locale is called "french" and the tests are run using
+    RunTest.bat. --/
+
+< forbid 8W 
+
+/^[\w]+/
+    *** Failers
+No match
+    \xC9cole
+No match
+
+/^[\w]+/Lfr_FR
+    \xC9cole
+ 0: \xC9cole
+
+/^[\w]+/
+    *** Failers
+No match
+    \xC9cole
+No match
+
+/^[\W]+/
+    \xC9cole
+ 0: \xc9
+
+/^[\W]+/Lfr_FR
+    *** Failers
+ 0: *** 
+    \xC9cole
+No match
+
+/[\b]/
+    \b
+ 0: \x08
+    *** Failers
+No match
+    a
+No match
+
+/[\b]/Lfr_FR
+    \b
+ 0: \x08
+    *** Failers
+No match
+    a
+No match
+
+/^\w+/
+    *** Failers
+No match
+    \xC9cole
+No match
+
+/^\w+/Lfr_FR
+    \xC9cole
+ 0: \xC9cole
+
+/(.+)\b(.+)/
+    \xC9cole
+ 0: \xc9cole
+ 1: \xc9
+ 2: cole
+
+/(.+)\b(.+)/Lfr_FR
+    *** Failers
+ 0: *** Failers
+ 1: *** 
+ 2: Failers
+    \xC9cole
+No match
+
+/\xC9cole/i
+    \xC9cole
+ 0: \xc9cole
+    *** Failers
+No match
+    \xE9cole
+No match
+
+/\xC9cole/iLfr_FR
+    \xC9cole
+ 0: \xC9cole
+    \xE9cole
+ 0: \xE9cole
+
+/\w/IS
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
+
+/\w/ISLfr_FR
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+  Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 
+  \xAA \xB5 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2 
+  \xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC \xFD \xFE \xFF 
+
+/^[\xc8-\xc9]/iLfr_FR
+    \xC9cole
+ 0: \xC9
+    \xE9cole
+ 0: \xE9
+
+/^[\xc8-\xc9]/Lfr_FR
+    \xC9cole
+ 0: \xC9
+    *** Failers 
+No match
+    \xE9cole
+No match
+
+/\W+/Lfr_FR
+    >>>\xaa<<<
+ 0: >>>
+    >>>\xba<<< 
+ 0: >>>
+
+/[\W]+/Lfr_FR
+    >>>\xaa<<<
+ 0: >>>
+    >>>\xba<<< 
+ 0: >>>
+
+/[^[:alpha:]]+/Lfr_FR
+    >>>\xaa<<<
+ 0: >>>
+    >>>\xba<<< 
+ 0: >>>
+
+/\w+/Lfr_FR
+    >>>\xaa<<<
+ 0: \xAA
+    >>>\xba<<< 
+ 0: \xBA
+
+/[\w]+/Lfr_FR
+    >>>\xaa<<<
+ 0: \xAA
+    >>>\xba<<< 
+ 0: \xBA
+
+/[[:alpha:]]+/Lfr_FR
+    >>>\xaa<<<
+ 0: \xAA
+    >>>\xba<<< 
+ 0: \xBA
+    
+/[[:alpha:]][[:lower:]][[:upper:]]/DZLfr_FR 
+------------------------------------------------------------------
+        Bra
+        [A-Za-z\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
+        [a-z\x83\x9a\x9c\x9e\xaa\xb5\xba\xdf-\xf6\xf8-\xff]
+        [A-Z\x8a\x8c\x8e\x9f\xc0-\xd6\xd8-\xde]
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+
+/-- End of testinput3 --/


Modified: code/trunk/testdata/wintestoutput3
===================================================================
--- code/trunk/testdata/wintestoutput3    2014-01-12 17:17:29 UTC (rev 1442)
+++ code/trunk/testdata/wintestoutput3    2014-01-12 19:20:27 UTC (rev 1443)
@@ -84,7 +84,7 @@
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P 
   Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z 


 /\w/ISLfrench
@@ -93,7 +93,7 @@
 No first char
 No need char
 Subject length lower bound = 1
-Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
+Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
   Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
   \x83 \x8A \x8C \x8E \x9A \x9C \x9E \x9F \xAA \xB2 \xB3 \xB5 \xB9 \xBA \xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 \xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF \xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6
   \xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \xE0 \xE1 \xE2 \xE3 \xE4 \xE5 \xE6 \xE7 \xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF8 \xF9 \xFA \xFB \xFC