[pcre-dev] [Bug 2707] New: pcre2-posix library provides rege…

Author: admin
To: pcre-dev
Subject: [pcre-dev] [Bug 2707] New: pcre2-posix library provides regex symbols which clash with system regex if a program links to pcre2-posix indirectly

            Bug ID: 2707
           Summary: pcre2-posix library provides regex symbols which clash
                    with system regex if a program links to pcre2-posix
                    indirectly
           Product: PCRE
           Version: 10.36 (PCRE2)
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: Philip.Hazel@???
          Reporter: ppisar@???
                CC: pcre-dev@???

I got a report that pacemaker, which links to the ncurses library, which in
turn links to the pcre2-posix library, has problems calling regex functions. I
don't have a reproducer at hand, but here is how I understand it:

pacemaker does not use the pcre2-posix library at all. It includes <regex.h>,
calls regexec(), uses the REG_NOMATCH constant, and does not link to the
pcre2-posix library. Thus the pacemaker program is compiled with
REG_NOMATCH = 1 (see /usr/include/regex.h) and has an undefined regexec symbol
which is expected to be resolved to the regexec function of libc.

But pacemaker also uses and links to the ncurses library, which can optionally
be compiled against pcre2-posix. Thus when the pacemaker program is loaded, the
ncurses library and its dependencies are mapped and their symbols are available
to the pacemaker process. One of those dependencies is pcre2-posix, which
provides its own regexec symbol.

So pacemaker ends up with two regexec symbols (from libc and from pcre2-posix)
in its symbol space and the dynamic linker must decide which to use. If it
binds pacemaker's reference to libc's regexec(), everything is fine. But if it
binds the reference to pcre2-posix's regexec(), bad things happen.

Namely, pcre2-posix's REG_NOMATCH = 17 (/usr/include/pcre2posix.h) does not
match pacemaker's REG_NOMATCH = 1, so pacemaker would not recognize a "no
match" return value coming from pcre2-posix's regexec().
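
To illustrate the mismatch, here is a minimal sketch of how a caller compiled against the system <regex.h> would classify regexec() return values. The enum and function names are hypothetical and are not taken from pacemaker; the value 17 is the pcre2posix.h REG_NOMATCH quoted in this report, and the sketch assumes the system REG_NOMATCH differs from it (it is 1 with glibc).

```c
#include <regex.h>   /* system header; supplies the caller's REG_NOMATCH */

/* REG_NOMATCH as defined in pcre2posix.h, per this report. */
#define PCRE2POSIX_REG_NOMATCH 17

/* Hypothetical result type, for illustration only. */
enum match_result { MATCH_FOUND, MATCH_NONE, MATCH_ERROR };

/* Classify a regexec() return value the way a caller built against
   <regex.h> would: 0 is a match, REG_NOMATCH is "no match", and
   anything else is treated as an error. */
enum match_result classify_regexec_result(int rc)
{
    if (rc == 0)
        return MATCH_FOUND;
    if (rc == REG_NOMATCH)
        return MATCH_NONE;
    return MATCH_ERROR;
}
```

If the call were bound to pcre2-posix's regexec(), a plain "no match" would come back as 17, and classify_regexec_result(PCRE2POSIX_REG_NOMATCH) would report an error instead of "no match".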

All this happens because pcre2-posix decided to keep the regex functions
defined. From pcre2posix(3):

       Although they are not defined as prototypes in pcre2posix.h, the
       library does contain functions with the POSIX names regcomp() etc.
       These simply pass their arguments to the PCRE2 functions. These
       functions are provided for backwards compatibility with earlier
       versions of PCRE2, so that existing programs do not have to be [...]
and at the same time libc's and pcre2-posix's ABIs differ.

We already tackled a related problem in bug #1830 and a similar issue with
PCRE1 was reported in bug #2654.

I must confess that I cannot reproduce this issue because my libc (glibc-2.33)
versions the regex symbols:

$ nm -D /usr/lib64/libc.so.6 | grep regexec
000000000016b530 T regexec@GLIBC_2.2.5
00000000000e4f10 T regexec@@GLIBC_2.3.4

So they differ from pcre2-posix:

$ nm -D /usr/lib64/libpcre2-posix.so.2 | grep regexec
0000000000001590 T pcre2_regexec
0000000000001750 T regexec

But in general, the standard library does not have to version the symbols and
the problem can emerge.

I propose removing the POSIX regex functions from src/pcre2posix.c:

#undef regexec
PCRE2POSIX_EXP_DECL int regexec(const regex_t *, const char *, size_t,
  regmatch_t *, int);

PCRE2POSIX_EXP_DEFN int PCRE2_CALL_CONVENTION
regexec(const regex_t *preg, const char *string, size_t nmatch,
  regmatch_t pmatch[], int eflags)
{
return pcre2_regexec(preg, string, nmatch, pmatch, eflags);
}

while retaining the redefinitions in src/pcre2posix.h:

#define regexec pcre2_regexec

That would mean that programs could still build against PCRE2 by including
<pcre2posix.h> without rewriting regex function calls, but old programs which
happened to include <regex.h> would stop working against PCRE2. I actually
wonder whether any such programs exist and work against PCRE2, given the
ABI differences.
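
The effect of keeping only the header redefinition can be sketched as follows. All demo_* names are hypothetical stand-ins, nothing here is PCRE2 code, and a substring search replaces real pattern matching purely to keep the sketch self-contained. The point is only that a macro in the header rewrites the call at compile time, so the caller's object file refers to the prefixed symbol and never to the unprefixed one.

```c
#include <string.h>

/* Hypothetical stand-in for the prefixed function a library exports.
   Returns 0 on a match and 17 otherwise (17 mirrors the REG_NOMATCH
   value from pcre2posix.h quoted in this report). */
int demo_prefixed_regexec(const char *pattern, const char *text)
{
    return strstr(text, pattern) != NULL ? 0 : 17;
}

/* Hypothetical stand-in for the header-side redirect; the real header
   line discussed here is "#define regexec pcre2_regexec". */
#define demo_regexec demo_prefixed_regexec

/* Caller code written against the unprefixed name. After preprocessing,
   the call below targets demo_prefixed_regexec. */
int demo_search(const char *pattern, const char *text)
{
    return demo_regexec(pattern, text);
}
```

A caller that never includes the redirecting header keeps its reference to the unprefixed name, which is why such programs would stop resolving against the library once the unprefixed functions are removed.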

If you agree, the only open question is whether we should bump the pcre2-posix
SONAME or not. Technically the library would lose the symbols and change its
ABI, but since the regex functions were never part of the (current) API, they
could be perceived as a non-public internal implementation detail.
