On Fri, Apr 24, 2020 at 09:17:47AM +0200, Petr Pisar via Pcre-dev wrote:
> On Thu, Apr 23, 2020 at 04:43:07PM +0100, ph10@??? wrote:
> > On Thu, 16 Apr 2020, Petr Pisar via Pcre-dev wrote:
> >
> > > I noticed a new warning with GCC 10:
> > >
> > > gcc -DHAVE_CONFIG_H -I. -I./src "-I./src" -pthread -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mshstk -c -o src/pcre2test-pcre2test.o `test -f 'src/pcre2test.c' || echo './'`src/pcre2test.c
> > > src/pcre2test.c: In function 'copy_and_get':
> > > src/pcre2test.c:3062:27: warning: array subscript -1 is outside array bounds of 'uint32_t[256]' {aka 'unsigned int[256]'} [-Warray-bounds]
> > > 3062 | if (length < 0) length = p[-1];
> >
> > I have committed some revised code. Does this solve the issue?
> >
> It does not. The compiler is too smart (or dumb).
I think it's a mistake in PCRE2 code.
I focused on the first warning from this code:
/* Test copy strings by number */
for (i = 0; i < MAXCPYGET && dat_datctl.copy_numbers[i] >= 0; i++)
{
int rc;
PCRE2_SIZE length, length2;
uint32_t copybuffer[256];
uint32_t n = (uint32_t)(dat_datctl.copy_numbers[i]);
length = sizeof(copybuffer)/code_unit_size;
PCRE2_SUBSTRING_COPY_BYNUMBER(rc, match_data, n, copybuffer, &length);
if (rc < 0)
{
fprintf(outfile, "Copy substring %d failed (%d): ", n, rc);
if (!print_error_message(rc, "", "\n")) return FALSE;
}
else
{
PCRE2_SUBSTRING_LENGTH_BYNUMBER(rc, match_data, n, &length2);
if (rc < 0)
{
fprintf(outfile, "Get substring %d length failed (%d): ", n, rc);
if (!print_error_message(rc, "", "\n")) return FALSE;
}
else if (length2 != length)
{
fprintf(outfile, "Mismatched substring lengths: %"
SIZ_FORM " %" SIZ_FORM "\n", SIZ_CAST length, SIZ_CAST length2);
}
fprintf(outfile, "%2dC ", n);
→ PCHARSV(copybuffer, 0, length, utf, outfile);
fprintf(outfile, " (%" SIZ_FORM ")\n", SIZ_CAST length);
}
}
It is triggered by calling then marked PCHARSV() macro. I first thought that
length variable could be uninitialized in some cases, but that's not true.
It's always set in length = sizeof(copybuffer)/code_unit_size assignment.
The macro end with in pchars32() call:
static int pchars32(PCRE2_SPTR32 p, int length, BOOL utf, FILE *f)
{
int yield = 0;
(void)(utf); /* Avoid compiler warning */
if (length < 0)
{
PCRE2_SPTR32 pp = p - 1;
length = *pp;
}
while (length-- > 0)
{
uint32_t c = *p++;
yield += pchar(c, utf, f);
}
return yield;
}
The offending code is reachable only if length < 0, thus I wrapped the
PCHARSV() call into an inverse condition to elude the offendig code like this:
for (i = 0; i < MAXCPYGET && dat_datctl.copy_numbers[i] >= 0; i++)
{
int rc;
PCRE2_SIZE length, length2;
uint32_t copybuffer[256];
uint32_t n = (uint32_t)(dat_datctl.copy_numbers[i]);
length = sizeof(copybuffer)/code_unit_size;
PCRE2_SUBSTRING_COPY_BYNUMBER(rc, match_data, n, copybuffer, &length);
if (rc < 0)
{
fprintf(outfile, "Copy substring %d failed (%d): ", n, rc);
if (!print_error_message(rc, "", "\n")) return FALSE;
}
else
{
PCRE2_SUBSTRING_LENGTH_BYNUMBER(rc, match_data, n, &length2);
if (rc < 0)
{
fprintf(outfile, "Get substring %d length failed (%d): ", n, rc);
if (!print_error_message(rc, "", "\n")) return FALSE;
}
else if (length2 != length)
{
fprintf(outfile, "Mismatched substring lengths: %"
SIZ_FORM " %" SIZ_FORM "\n", SIZ_CAST length, SIZ_CAST length2);
}
fprintf(outfile, "%2dC ", n);
→ if (length >= 0)
{
→ PCHARSV(copybuffer, 0, length, utf, outfile);
fprintf(outfile, " (%" SIZ_FORM ")\n", SIZ_CAST length);
}
}
}
I copiled it and the warning was still there! Then I reset the length variable
to 0 or 1 or 100 inside the condition and the warning disappeared. Then
I realized the bug: a type of the variable differs. It's PCRE2_SIZE outside
and int inside pchars32. They probably differ in signess and a type coercion
applied by the compiler when passing an argument to the function wraps the
value to negative numbers on half the the PCRE2_SIZE domain.
And indeed type casting it explicitly helps:
→ if ((int)length >= 0)
{
PCHARSV(copybuffer, 0, length, utf, outfile);
fprintf(outfile, " (%" SIZ_FORM ")\n", SIZ_CAST length);
}
So the compiler was right at the end, although there are many false negatives
and positives reported against GCC
<
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456>.
I hope it will help you in fixing this issue.
-- Petr