Re: [pcre-dev] JIT fails with NEON instructions

Author: Sebastian Pop
Date:  
To: Zoltán Herczeg
CC: pcre-dev@exim.org
Subject: Re: [pcre-dev] JIT fails with NEON instructions
Hi Zoltán,

please find attached a patch that fixes the problem.
For UTF-16 and UTF-32, the vectorization factor is adjusted to 8 and 4
respectively, following PCRE2_CODE_UNIT_WIDTH.
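
As a rough illustration of where 8 and 4 come from, here is a minimal sketch that derives the number of code units per vector from the code unit width. It assumes 128-bit NEON Q registers; VECTOR_BITS and vector_factor are hypothetical names for this example and are not taken from the patch.

```c
#include <stddef.h>

/* Assumption: one NEON Q register is 128 bits wide. */
#define VECTOR_BITS 128

/* Hypothetical helper (not from the patch): number of code units that
   fit in one vector for a code unit width of 8, 16 or 32 bits, i.e. the
   values PCRE2_CODE_UNIT_WIDTH can take. */
static inline size_t vector_factor(unsigned code_unit_width)
{
  return VECTOR_BITS / code_unit_width;
}
```

Under that assumption, vector_factor(16) gives 8 and vector_factor(32) gives 4, matching the factors mentioned above.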
The patch factors the 6 functions into a single function skeleton
that is instantiated 6 times under different preprocessor defines.
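
To show the general idea of one skeleton instantiated several times by the preprocessor, here is a small scalar sketch. It is not the code from the patch: the macro DEFINE_FIND_FIRST and the find_first_* names are made up for this example, and it uses a function-generating macro where real code may instead re-include a file under different defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical skeleton: return a pointer to the first code unit equal
   to c in [p, end), or NULL if there is none. The body is written once
   and expanded for each code unit type. */
#define DEFINE_FIND_FIRST(NAME, UNIT_TYPE)                          \
  static const UNIT_TYPE *NAME(const UNIT_TYPE *p,                  \
                               const UNIT_TYPE *end, UNIT_TYPE c)   \
  {                                                                 \
    for (; p < end; p++)                                            \
      if (*p == c)                                                  \
        return p;                                                   \
    return NULL;                                                    \
  }

/* Three instantiations, one per code unit width. */
DEFINE_FIND_FIRST(find_first_8, uint8_t)
DEFINE_FIND_FIRST(find_first_16, uint16_t)
DEFINE_FIND_FIRST(find_first_32, uint32_t)
```

The benefit of this layout is that a fix in the skeleton reaches every instantiation at once.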
The patch also implements matching of character pairs on the same
function skeleton. Only the most frequent pair-matching cases have been
special-cased; the default automaton is used in all other cases.
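
For readers unfamiliar with pair matching, the following scalar sketch shows what such a search computes, assuming a pair means two known characters a fixed distance apart. The function name find_pair and its parameters are hypothetical; the patch does this with NEON vectors and this is only a reference for the semantics.

```c
#include <stddef.h>

/* Hypothetical scalar reference: return the first index i such that
   s[i] == c1 and s[i + offset] == c2, or -1 if no such index exists. */
static ptrdiff_t find_pair(const unsigned char *s, size_t len,
                           size_t offset, unsigned char c1,
                           unsigned char c2)
{
  if (len <= offset)
    return -1;                       /* subject too short for the pair */
  for (size_t i = 0; i + offset < len; i++)
    if (s[i] == c1 && s[i + offset] == c2)
      return (ptrdiff_t)i;
  return -1;
}
```

For example, searching "Mark Twain" for 'T' followed by 'w' at offset 1 returns index 5. Checking two characters at once rejects more candidate positions before the full matcher runs than a single-character search does.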

Tested on Ubuntu 18.04 running on a Graviton A1 instance (A72 core).

Benchmark https://github.com/rust-leipzig/regex-performance
shows the following on an A1 instance:

base = svn rev. 1171, pcre-jit time in ms.
patch-1 = svn rev. 1172, pcre-jit time in ms.
patch-2 = attached patch, pcre-jit time in ms.
diff = (base - patch-2) / patch-2, in %
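
The diff formula can be written as a small helper, which may be handy for rechecking the figures. The name diff_percent is hypothetical.

```c
/* Hypothetical helper: speedup of patch-2 relative to base, in percent,
   computed as (base - patch-2) / patch-2 * 100. Both times are in ms. */
static double diff_percent(double base_ms, double patch2_ms)
{
  return (base_ms - patch2_ms) / patch2_ms * 100.0;
}
```

With base = 18 ms and patch-2 = 4 ms this gives 350, so a diff of 350% means the base run takes 4.5 times as long as the patched run.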

| Benchmark                             | base  | patch-1 | patch-2 | diff |
| 'Twain'                               | 18    | 18      | 4       | 350% |
| '(?i)Twain'                           | 18.6  | 18.6    | 5.3     | 251% |
| '[a-z]shing'                          | 16.9  | 16.8    | 5.1     | 231% |
| 'Huck[a-zA-Z]+|Saw[a-zA-Z]+'          | 21.9  | 4.9     | 5.9     | 271% |
| '\\b\\w+nn\\b'                        | 117.1 | 117     | 117     | 0%   |
| '[a-q][^u-z]{13}x'                    | 227.9 | 228.2   | 228     | 0%   |
| 'Tom|Sawyer|Huckleberry|Finn'         | 37.1  | 37.1    | 37.1    | 0%   |
| '(?i)Tom|Sawyer|Huckleberry|Finn'     | 100.6 | 100.5   | 100.6   | 0%   |
| '.{0,2}(Tom|Sawyer|Huckleberry|Finn)' | 552.5 | 552.5   | 552.4   | 0%   |
| '.{2,4}(Tom|Sawyer|Huckleberry|Finn)' | 613.7 | 613.7   | 613.7   | 0%   |
| 'Tom.{10,25}river|river.{10,25}Tom'   | 32.3  | 26.6    | 10.2    | 217% |
| '[a-zA-Z]+ing'                        | 94.9  | 94.8    | 94.9    | 0%   |
| '\\s[a-zA-Z]{0,12}ing\\s'             | 127.9 | 128.1   | 128     | 0%   |
| '([A-Za-z]awyer|[A-Za-z]inn)\\s'      | 62.2  | 61.2    | 27.2    | 129% |
| '[\"'][^\"']{0,30}[?!\\.][\"']'       | 35.4  | 19.1    | 19      | 86%  |
| '∞|✓'                                 | 21    | 1.7     | 4.6     | 357% |
| '\\p{Sm}'                             | 112   | 112     | 112.1   | 0%   |
| Total Results:                        | 2210  | 2150.9  | 2065.1  | 7%   |