Re: [pcre-dev] PCRE2 10.35-RC1 testing release is available

Top Page
Delete this message
Author: Petr Pisar
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] PCRE2 10.35-RC1 testing release is available
On Fri, Apr 24, 2020 at 09:17:47AM +0200, Petr Pisar via Pcre-dev wrote:
> On Thu, Apr 23, 2020 at 04:43:07PM +0100, ph10@??? wrote:
> > On Thu, 16 Apr 2020, Petr Pisar via Pcre-dev wrote:
> >
> > > I noticed a new warning with GCC 10:
> > >
> > > gcc -DHAVE_CONFIG_H -I. -I./src  "-I./src"     -pthread -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mshstk  -c -o src/pcre2test-pcre2test.o `test -f 'src/pcre2test.c' || echo './'`src/pcre2test.c
> > > src/pcre2test.c: In function 'copy_and_get':
> > > src/pcre2test.c:3062:27: warning: array subscript -1 is outside array bounds of 'uint32_t[256]' {aka 'unsigned int[256]'} [-Warray-bounds]
> > >  3062 | if (length < 0) length = p[-1];

> >
> > I have committed some revised code. Does this solve the issue?
> >
> It does not. The compiler is too smart (or dumb).


I think it's a mistake in PCRE2 code.

I focused on the first warning from this code:

/* Test copy strings by number */

for (i = 0; i < MAXCPYGET && dat_datctl.copy_numbers[i] >= 0; i++)
  {
  int rc;
  PCRE2_SIZE length, length2;
  uint32_t copybuffer[256];
  uint32_t n = (uint32_t)(dat_datctl.copy_numbers[i]);
  length = sizeof(copybuffer)/code_unit_size;
  PCRE2_SUBSTRING_COPY_BYNUMBER(rc, match_data, n, copybuffer, &length);
  if (rc < 0)
    {
    fprintf(outfile, "Copy substring %d failed (%d): ", n, rc);
    if (!print_error_message(rc, "", "\n")) return FALSE;
    }
  else
    {
    PCRE2_SUBSTRING_LENGTH_BYNUMBER(rc, match_data, n, &length2);
    if (rc < 0)
      {
      fprintf(outfile, "Get substring %d length failed (%d): ", n, rc);
      if (!print_error_message(rc, "", "\n")) return FALSE;
      }
    else if (length2 != length)
      {
      fprintf(outfile, "Mismatched substring lengths: %"
        SIZ_FORM " %" SIZ_FORM "\n", SIZ_CAST length, SIZ_CAST length2);
      }
    fprintf(outfile, "%2dC ", n);
→   PCHARSV(copybuffer, 0, length, utf, outfile);
    fprintf(outfile, " (%" SIZ_FORM ")\n", SIZ_CAST length);
    }
  }   


It is triggered by calling then marked PCHARSV() macro. I first thought that
length variable could be uninitialized in some cases, but that's not true.
It's always set in length = sizeof(copybuffer)/code_unit_size assignment.

The macro end with in pchars32() call:

static int pchars32(PCRE2_SPTR32 p, int length, BOOL utf, FILE *f)
{
int yield = 0;
(void)(utf); /* Avoid compiler warning */
if (length < 0)
{
PCRE2_SPTR32 pp = p - 1;
length = *pp;
}
while (length-- > 0)
{
uint32_t c = *p++;
yield += pchar(c, utf, f);
}
return yield;
}

The offending code is reachable only if length < 0, thus I wrapped the
PCHARSV() call into an inverse condition to elude the offendig code like this:

for (i = 0; i < MAXCPYGET && dat_datctl.copy_numbers[i] >= 0; i++)
  {
  int rc;
  PCRE2_SIZE length, length2;
  uint32_t copybuffer[256];
  uint32_t n = (uint32_t)(dat_datctl.copy_numbers[i]);
  length = sizeof(copybuffer)/code_unit_size;
  PCRE2_SUBSTRING_COPY_BYNUMBER(rc, match_data, n, copybuffer, &length);
  if (rc < 0)
    {
    fprintf(outfile, "Copy substring %d failed (%d): ", n, rc);
    if (!print_error_message(rc, "", "\n")) return FALSE;
    }
  else
    {
    PCRE2_SUBSTRING_LENGTH_BYNUMBER(rc, match_data, n, &length2);
    if (rc < 0)
      {
      fprintf(outfile, "Get substring %d length failed (%d): ", n, rc);
      if (!print_error_message(rc, "", "\n")) return FALSE;
      }
    else if (length2 != length)
      {
      fprintf(outfile, "Mismatched substring lengths: %"
        SIZ_FORM " %" SIZ_FORM "\n", SIZ_CAST length, SIZ_CAST length2);
      }
    fprintf(outfile, "%2dC ", n);
→   if (length >= 0)
      {
→     PCHARSV(copybuffer, 0, length, utf, outfile);
      fprintf(outfile, " (%" SIZ_FORM ")\n", SIZ_CAST length);
      }
    }
  }


I copiled it and the warning was still there! Then I reset the length variable
to 0 or 1 or 100 inside the condition and the warning disappeared. Then
I realized the bug: a type of the variable differs. It's PCRE2_SIZE outside
and int inside pchars32. They probably differ in signess and a type coercion
applied by the compiler when passing an argument to the function wraps the
value to negative numbers on half the the PCRE2_SIZE domain.

And indeed type casting it explicitly helps:

→    if ((int)length >= 0)
      {
      PCHARSV(copybuffer, 0, length, utf, outfile);
      fprintf(outfile, " (%" SIZ_FORM ")\n", SIZ_CAST length);
      }


So the compiler was right at the end, although there are many false negatives
and positives reported against GCC
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456>.

I hope it will help you in fixing this issue.

-- Petr