Re: [pcre-dev] Powerpc optimisation

Author: Frederic Bonnard
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] Powerpc optimisation
And result file :)
Foreword

This page was taken from [1]here. The pcre directory has been updated to
use the latest 8.37 instead of 8.32. The following is from the original
author, reproduced as-is from the above page as a reminder. I used three
VMs installed with Ubuntu 14.04 LTS: one x86_64, one ppc64el and one
ppc64. F.

Participants

   The following popular engines were chosen:
     * [2]PCRE 8.32
     * [3]tre 0.8.0
     * [4]Oniguruma 5.9.3
     * [5]re2 by Google [source tree: 29.10.2012]
     * [6]PCRE 8.37 with sljit JIT compiler support


   Before anyone jumps to any conclusions, I should note the following:
     * The engines were not fine tuned (because of my lack of knowledge
       about their internal workings). I just compiled them with the
       default options. I know enabling or disabling some features can
       heavily affect the results. If you feel that you have a better
       configuration just drop me an e-mail and I will update the results
       (hzmester(at)freemail(dot)hu).
     * The regular expression engines are compiled with -O3 to allow the
       best performance.
     * This comparison page was inspired by the work of John Maddock (see
       his own regex comparison [7]here). The input is the same one he
       used: [8]mtent12.zip, a text file (e-book) of about 20 Mbytes.
     * Only common patterns were selected; they are not pathological
       cases and do not use any Perl-specific features. Matching was
       caseful (case-sensitive).
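
   The measurement described in the notes above (scan the whole input
   once per pattern, caseful, and record the elapsed time) can be
   sketched roughly as follows. This is only an illustration: it uses
   Python's built-in re module, which is not one of the engines
   benchmarked here, and the helper name time_pattern, the sample text
   and the use of multi-line mode for ^ and $ are assumptions of this
   sketch, not details taken from the original benchmark program.

```python
import re
import time
from typing import Tuple


def time_pattern(pattern: str, text: str) -> Tuple[int, float]:
    """Return (match count, elapsed milliseconds) for one scan of text."""
    # Assumption: ^ and $ are meant to match at line boundaries, so the
    # pattern is compiled in multi-line mode. Matching stays caseful
    # because re.IGNORECASE is not set.
    compiled = re.compile(pattern, re.MULTILINE)
    start = time.perf_counter()
    count = sum(1 for _ in compiled.finditer(text))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return count, elapsed_ms


if __name__ == "__main__":
    # Small stand-in for the e-book; the real input would be the
    # unpacked mtent12 text file read into a single string.
    sample = "Mark Twain\nTwain wrote Huckleberry Finn\n" * 1000
    for pattern in ("Twain", "^Twain", "Twain$"):
        count, ms = time_pattern(pattern, sample)
        print(f"{pattern:10s} {count:6d} matches {ms:8.2f} ms")
```

   Absolute timings from such a sketch are not comparable with the
   tables below; it only shows the shape of the measurement loop.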


Results

x86-64 4x2.3GHz 4G (gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2)

Regular expression                         PCRE   PCRE-DFA        TRE  Oniguruma        RE2   PCRE-JIT
Twain                                     13 ms      13 ms     392 ms      18 ms       2 ms      17 ms
^Twain                                   104 ms     118 ms     193 ms      18 ms      77 ms      24 ms
Twain$                                    12 ms      13 ms     407 ms      18 ms       2 ms      17 ms
Huck[a-zA-Z]+|Finn[a-zA-Z]+               16 ms      17 ms     603 ms      40 ms      71 ms      22 ms
a[^x]{20}b                                64 ms     382 ms     649 ms     353 ms     359 ms      56 ms
Tom|Sawyer|Huckleberry|Finn               22 ms      26 ms    1011 ms      47 ms      73 ms      40 ms
.{0,3}(Tom|Sawyer|Huckleberry|Finn)     3912 ms    5172 ms    3408 ms     111 ms      87 ms     412 ms
[a-zA-Z]+ing                             739 ms    1666 ms     587 ms     872 ms     134 ms     182 ms
^[a-zA-Z]{0,4}ing[^a-zA-Z]               127 ms     153 ms     300 ms      39 ms      78 ms      25 ms
[a-zA-Z]+ing$                            773 ms    1771 ms     577 ms     883 ms     117 ms     183 ms
^[a-zA-Z ]{5,}$                          165 ms     328 ms     382 ms     289 ms      88 ms      49 ms
^.{16,20}$                               150 ms     242 ms     343 ms     494 ms      78 ms      30 ms
([a-f](.[d-m].){0,2}[h-n]){2}            511 ms     737 ms     859 ms     493 ms     151 ms     114 ms
([A-Za-z]awyer|[A-Za-z]inn)[^a-zA-Z]     737 ms    1035 ms    1035 ms     178 ms     117 ms      29 ms
"[^"]{0,30}[?!\.]"                        19 ms      39 ms     442 ms      67 ms       8 ms      15 ms
Tom.{10,25}river|river.{10,25}Tom         48 ms      67 ms     674 ms      79 ms      81 ms      40 ms

ppc64el 4x3GHz 4G (gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2)

Regular expression                         PCRE   PCRE-DFA        TRE  Oniguruma        RE2   PCRE-JIT
Twain                                     26 ms      22 ms     425 ms      17 ms       3 ms      13 ms
^Twain                                   162 ms     189 ms     251 ms      17 ms      58 ms      13 ms
Twain$                                    26 ms      22 ms     439 ms      17 ms       2 ms      13 ms
Huck[a-zA-Z]+|Finn[a-zA-Z]+               26 ms      27 ms     769 ms     122 ms      58 ms      17 ms
a[^x]{20}b                               113 ms     442 ms     645 ms     685 ms     718 ms      68 ms
Tom|Sawyer|Huckleberry|Finn               38 ms      39 ms    1459 ms     133 ms      59 ms      37 ms
.{0,3}(Tom|Sawyer|Huckleberry|Finn)     7382 ms    7761 ms    5049 ms     279 ms      59 ms    1224 ms
[a-zA-Z]+ing                            1412 ms    2097 ms     660 ms    1343 ms      82 ms     431 ms
^[a-zA-Z]{0,4}ing[^a-zA-Z]               206 ms     236 ms     426 ms      41 ms      59 ms      52 ms
[a-zA-Z]+ing$                           1478 ms    2207 ms     649 ms    1380 ms      59 ms     513 ms
^[a-zA-Z ]{5,}$                          248 ms     429 ms     541 ms     489 ms      68 ms     135 ms
^.{16,20}$                               237 ms     331 ms     440 ms     713 ms      59 ms      64 ms
([a-f](.[d-m].){0,2}[h-n]){2}            752 ms    1119 ms    1105 ms     708 ms      76 ms     146 ms
([A-Za-z]awyer|[A-Za-z]inn)[^a-zA-Z]    1149 ms    1533 ms    1300 ms     286 ms      58 ms      23 ms
"[^"]{0,30}[?!\.]"                        44 ms      60 ms     527 ms     136 ms       8 ms      43 ms
Tom.{10,25}river|river.{10,25}Tom         73 ms      99 ms     796 ms     164 ms      58 ms      21 ms

ppc64 4x3GHz 4G (gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2)

Regular expression                         PCRE   PCRE-DFA        TRE  Oniguruma        RE2   PCRE-JIT
Twain                                     27 ms      21 ms     543 ms      17 ms      12 ms      13 ms
^Twain                                   174 ms     189 ms     450 ms      17 ms      58 ms      13 ms
Twain$                                    27 ms      21 ms     565 ms      17 ms      12 ms      13 ms
Huck[a-zA-Z]+|Finn[a-zA-Z]+               27 ms      28 ms     866 ms      57 ms      58 ms      18 ms
a[^x]{20}b                               114 ms     438 ms     805 ms     806 ms     567 ms      73 ms
Tom|Sawyer|Huckleberry|Finn               41 ms      40 ms    1541 ms      74 ms      58 ms      37 ms
.{0,3}(Tom|Sawyer|Huckleberry|Finn)     9107 ms    8621 ms    5184 ms     246 ms      59 ms    1265 ms
[a-zA-Z]+ing                            1825 ms    2348 ms     773 ms    2678 ms      83 ms     456 ms
^[a-zA-Z]{0,4}ing[^a-zA-Z]               223 ms     238 ms     692 ms      51 ms      59 ms      53 ms
[a-zA-Z]+ing$                           1897 ms    2403 ms     763 ms    2789 ms      59 ms     507 ms
^[a-zA-Z ]{5,}$                          307 ms     500 ms     810 ms     609 ms      69 ms     141 ms
^.{16,20}$                               284 ms     329 ms     662 ms    1260 ms      59 ms      64 ms
([a-f](.[d-m].){0,2}[h-n]){2}           1206 ms    1290 ms    1191 ms     996 ms      77 ms     147 ms
([A-Za-z]awyer|[A-Za-z]inn)[^a-zA-Z]    2085 ms    1868 ms    1316 ms     250 ms      58 ms      22 ms
"[^"]{0,30}[?!\.]"                        46 ms      59 ms     649 ms     129 ms      16 ms      43 ms
Tom.{10,25}river|river.{10,25}Tom         91 ms      99 ms     893 ms     143 ms      58 ms      22 ms
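
One way to read the three tables side by side is to compute, per
platform, how many times faster PCRE-JIT is than interpreted PCRE on
the same pattern. The sketch below does this for three rows; the
timings are copied from the tables above, while the names TIMES_MS,
jit_speedup and speedups are made up for this illustration.

```python
# (PCRE ms, PCRE-JIT ms) per platform, copied from the result tables.
TIMES_MS = {
    "[a-zA-Z]+ing": {
        "x86-64": (739, 182), "ppc64el": (1412, 431), "ppc64": (1825, 456),
    },
    "^.{16,20}$": {
        "x86-64": (150, 30), "ppc64el": (237, 64), "ppc64": (284, 64),
    },
    "a[^x]{20}b": {
        "x86-64": (64, 56), "ppc64el": (113, 68), "ppc64": (114, 73),
    },
}


def jit_speedup(pcre_ms: float, jit_ms: float) -> float:
    """How many times faster the JIT run is than the interpreted run."""
    return pcre_ms / jit_ms


def speedups(times: dict) -> dict:
    """Map pattern -> platform -> speedup, rounded to two decimals."""
    return {
        pattern: {
            platform: round(jit_speedup(pcre_ms, jit_ms), 2)
            for platform, (pcre_ms, jit_ms) in per_platform.items()
        }
        for pattern, per_platform in times.items()
    }


if __name__ == "__main__":
    for pattern, per_platform in speedups(TIMES_MS).items():
        cells = "  ".join(f"{p}: {s:.2f}x" for p, s in per_platform.items())
        print(f"{pattern:14s} {cells}")
```

For `^.{16,20}$`, for example, this gives 5.00x on x86-64, 3.70x on
ppc64el and 4.44x on ppc64. Each figure rests on a single timing per
cell, so the ratios are indicative only.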

References

1. http://sljit.sourceforge.net/regex_perf.html
2. http://www.pcre.org/
3. http://laurikari.net/tre/
4. http://laurikari.net/tre/
5. http://code.google.com/p/re2/
6. file:///tmp/pcre.html
7. http://www.boost.org/doc/libs/1_41_0/libs/regex/doc/gcc-performance.html
8. http://www.gutenberg.org/files/3200/old/mtent12.zip