[pcre-dev] PCRE2 10.21 JIT matching causes Valgrind errors

Author: Tavian Barnes
Date:  
To: pcre-dev
Subject: [pcre-dev] PCRE2 10.21 JIT matching causes Valgrind errors
The following test case produces Valgrind errors on x86-64 with 10.21,
whereas it did not with 10.20. I suspect it has to do with the
introduction of SSE support.

$ cat foo.c
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
  char regex[4], subject[4];
  strcpy(regex, "a");
  strcpy(subject, "b");
  /* regex[2..3] and subject[2..3] are uninitialized */

  int errorcode;
  PCRE2_SIZE erroroffset;
  pcre2_code *code = pcre2_compile(
    (PCRE2_SPTR)regex,
    PCRE2_ZERO_TERMINATED,
    PCRE2_UCP | PCRE2_UTF,
    &errorcode,
    &erroroffset,
    NULL
  );
  pcre2_jit_compile(code, PCRE2_JIT_COMPLETE);


  pcre2_match_data *match_data =
    pcre2_match_data_create_from_pattern(code, NULL);
  int err = pcre2_match(
    code,
    (PCRE2_SPTR)subject,
    PCRE2_ZERO_TERMINATED,
    0,
    0,
    match_data,
    NULL
  );


  return err == PCRE2_ERROR_NOMATCH ? EXIT_SUCCESS : EXIT_FAILURE;
}
$ gcc foo.c -lpcre2-8 -o foo
$ valgrind -q ./foo
==18980== Conditional jump or move depends on uninitialised value(s)
==18980==    at 0x40230C8: ???
==18980==    by 0xFFEFFFD9F: ???
==18980==    by 0xFFEFF7A4F: ???


I suspect that it's not really a bug in the generated code; it's
probably just doing a wide load on the subject string and masking away
(or otherwise ignoring) the characters past the end. It looks like
the load is always aligned, so there is no risk of touching an
unreadable page that sits right after the subject string.
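
To spell out why an aligned wide load cannot fault, here is a minimal
sketch of the address arithmetic. It assumes 16-byte (SSE-sized) loads
and 4096-byte pages; I have not checked what the JIT actually emits, and
all of the names below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration: 16-byte loads, 4096-byte pages. */
#define BLOCK_SIZE 16u
#define PAGE_SIZE  4096u

/* Start address of the aligned 16-byte block that contains addr. */
uintptr_t block_start(uintptr_t addr)
{
    return addr & ~(uintptr_t)(BLOCK_SIZE - 1);
}

/* Index of the page that contains addr. */
uintptr_t page_of(uintptr_t addr)
{
    return addr / PAGE_SIZE;
}

/* Returns 1 if the whole aligned block around addr lies in addr's page. */
int block_stays_in_page(uintptr_t addr)
{
    uintptr_t first = block_start(addr);
    uintptr_t last = first + BLOCK_SIZE - 1;
    return page_of(first) == page_of(addr) && page_of(last) == page_of(addr);
}

/* Number of bytes in the block that lie after the last valid byte.
   An aligned load reads them; the matcher has to ignore them. */
unsigned bytes_past_end(uintptr_t last_valid)
{
    return (unsigned)(block_start(last_valid) + BLOCK_SIZE - 1 - last_valid);
}

/* Checks every offset within one page (the page base is arbitrary). */
void check_one_page(void)
{
    uintptr_t base = 0x10000;
    for (uintptr_t off = 0; off < PAGE_SIZE; off++)
        assert(block_stays_in_page(base + off));
}
```

Since 4096 is a multiple of 16, an aligned block never straddles a page
boundary, so a load that covers at least one valid byte stays on a
readable page. Under these assumptions, a two-byte subject ("b" plus the
terminator) sitting at the start of a block leaves 14 bytes that are
loaded but never meaningfully used, and those are presumably the bytes
Valgrind sees as uninitialised.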

If you use valgrind --gen-suppressions=yes, the generated suppression
will be too aggressive, suppressing uninitialized value errors
*everywhere*. To suppress just those from the JIT code, use:

$ cat valgrind.supp
{
   PCRE2 JIT wide loads
   Memcheck:Cond
   obj:???
}

--
Tavian Barnes