http://bugs.exim.org/show_bug.cgi?id=1049
--- Comment #35 from Zoltan Herczeg <hzmester@???> 2011-12-17 07:11:30 ---
I ran some measurements with the 16-bit PCRE on UTF-8 and UTF-16 (same test).
On average, UTF-16 is about 6% slower in the interpreter (1241 ms vs. 1166 ms),
but the JIT is basically unaffected (397 ms in both cases).
The utf8_input.txt (size: 1392505 x 4) is loaded.
Pattern: 'die der' Matches: 236
8 bit: Int runtime: 20 ms JIT runtime: 10 ms 2.00 as fast 50.0% save
16 bit: Int runtime: 20 ms JIT runtime: 10 ms 2.00 as fast 50.0% save
Pattern: 'ist|der|die|und' Matches: 118816 Caseless
8 bit: Int runtime: 100 ms JIT runtime: 30 ms 3.33 as fast 70.0% save
16 bit: Int runtime: 120 ms JIT runtime: 40 ms 3.00 as fast 66.7% save
Pattern: '\b\w+\b' Matches: 803324
8 bit: Int runtime: 260 ms JIT runtime: 120 ms 2.17 as fast 53.8% save
16 bit: Int runtime: 260 ms JIT runtime: 120 ms 2.17 as fast 53.8% save
Pattern: '(?:da|ge|om)+(?:n|me)*' Matches: 93640 Caseless
8 bit: Int runtime: 80 ms JIT runtime: 30 ms 2.67 as fast 62.5% save
16 bit: Int runtime: 100 ms JIT runtime: 30 ms 3.33 as fast 70.0% save
Pattern: '\b(?(?=\w+ro)\w+pa|\w+lle)\w+\b' Matches: 7972
8 bit: Int runtime: 580 ms JIT runtime: 260 ms 2.23 as fast 55.2% save
16 bit: Int runtime: 640 ms JIT runtime: 250 ms 2.56 as fast 60.9% save
Pattern: '\b(\W)\1+\b|(^(?=.*kl)(?=.*no).{15,40}$)' Matches: 148
8 bit: Int runtime: 650 ms JIT runtime: 180 ms 3.61 as fast 72.3% save
16 bit: Int runtime: 800 ms JIT runtime: 210 ms 3.81 as fast 73.8% save
Pattern: '^.{4,32}(\P{N})\1{2,}.{4,32}(?<![nuk])$' Matches: 264
8 bit: Int runtime: 220 ms JIT runtime: 50 ms 4.40 as fast 77.3% save
16 bit: Int runtime: 240 ms JIT runtime: 50 ms 4.80 as fast 79.2% save
Pattern: '^(\w{3,})(?!\1).*\h.*\1$' Matches: 576 Caseless
8 bit: Int runtime: 4560 ms JIT runtime: 1890 ms 2.41 as fast 58.6% save
16 bit: Int runtime: 4290 ms JIT runtime: 1890 ms 2.27 as fast 55.9% save
Pattern: '((\w{2,8},?(\P{Z}|\R)){1,2}\.\s?)$' Matches: 5040 Caseless
8 bit: Int runtime: 4500 ms JIT runtime: 1270 ms 3.54 as fast 71.8% save
16 bit: Int runtime: 5090 ms JIT runtime: 1220 ms 4.17 as fast 76.0% save
Pattern: '\b\w*?((.){1,3}\w*\2)\w*?(?1)' Matches: 251012
8 bit: Int runtime: 1710 ms JIT runtime: 620 ms 2.76 as fast 63.7% save
16 bit: Int runtime: 1900 ms JIT runtime: 610 ms 3.11 as fast 67.9% save
Pattern: '\w*?(b{2,3})\w*?c' Matches: 16
8 bit: Int runtime: 1240 ms JIT runtime: 320 ms 3.88 as fast 74.2% save
16 bit: Int runtime: 1250 ms JIT runtime: 320 ms 3.91 as fast 74.4% save
Pattern: '\P{Lu}\P{L&}{0,12}[\s\-]{1,4}..[\P{L}\P{N}]{4}' Matches: 469547
8 bit: Int runtime: 300 ms JIT runtime: 90 ms 3.33 as fast 70.0% save
16 bit: Int runtime: 370 ms JIT runtime: 110 ms 3.36 as fast 70.3% save
Pattern: '\b(\B([c-h])\B|[a-z]+?(?1)[a-z])' Matches: 297816
8 bit: Int runtime: 950 ms JIT runtime: 300 ms 3.17 as fast 68.4% save
16 bit: Int runtime: 1060 ms JIT runtime: 310 ms 3.42 as fast 70.8% save
Average:
8 bit: Int runtime: 1166 ms JIT runtime: 397 ms 3.04 as fast 65.2% save
16 bit: Int runtime: 1241 ms JIT runtime: 397 ms 3.22 as fast 66.9% save
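
For readers checking the numbers: the "as fast" and "save" columns appear to be
derived from the two runtimes as Int/JIT and (1 - JIT/Int) * 100. This is an
assumption inferred from the rows above, not taken from the benchmark tool, and
the helper names below are hypothetical. A minimal Python sketch:

```python
# Hypothetical helpers: the names are illustrative and not part of the
# benchmark tool. The formulas are inferred from the rows above.

def speedup(int_ms, jit_ms):
    """How many times as fast the JIT run is compared with the interpreter."""
    return int_ms / jit_ms


def percent_saved(int_ms, jit_ms):
    """Share of the interpreter runtime that the JIT run saves, in percent."""
    return (1.0 - jit_ms / int_ms) * 100.0


# (Int runtime, JIT runtime) in ms, copied from three rows of the results.
rows = [(100, 30), (120, 40), (650, 180)]

for int_ms, jit_ms in rows:
    print(f"Int runtime: {int_ms} ms JIT runtime: {jit_ms} ms "
          f"{speedup(int_ms, jit_ms):.2f} as fast "
          f"{percent_saved(int_ms, jit_ms):.1f}% save")
```

The "Average" ratios seem to be the mean of the per-pattern values rather than
the ratio of the averaged runtimes (for 8 bit, 1166/397 would give about 2.94,
not 3.04).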