Re: [pcre-dev] regarding use regcomp/regexec on multiple str…

Author: Nuno Lopes
Date:  
To: john guo, pcre-dev
Subject: Re: [pcre-dev] regarding use regcomp/regexec on multiple strings
> I use the following pseudo-code to better describe my problem:
>
>  line 1:    char *str1 = "test";
>       2:    char *str2 = "example";
>       3:    char *str3 = "this is an example that I run in my test program";
>
>       4:    regex_t        str1_re, str2_re;
>
>             // I would like to compile str1 and str2 to regular expression.
>             // But I notice problem with line 6 already
>       5:    regcomp(&str1_re, str1, 0);
>       6:    regcomp(&str2_re, str2, 0);
>             // when I watch str1_re and str2_re on ddd debugger, I saw
>             // str1_re got changed after line 6. Is this right behavior?


No. regcomp() only changes the regex_t structure that it receives. Make
sure you are really using PCRE (run ldd on your binary and check that it is
linked against PCRE); otherwise you may be calling the regcomp() from your
system's C library instead.
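
To illustrate the point about independent regex_t structures, here is a
minimal sketch. It assumes the system POSIX <regex.h> so that it builds
anywhere; with PCRE's POSIX wrapper you would include <pcreposix.h> and link
against the wrapper library instead. The helper name both_match is
hypothetical, not part of any library.

```c
#include <assert.h>
#include <regex.h>   /* assumption: system POSIX regex, not PCRE's wrapper */
#include <stddef.h>

/* Hypothetical helper: returns 1 if both patterns match subject,
 * 0 if at least one does not, and -1 if either pattern fails to compile. */
int both_match(const char *pat1, const char *pat2, const char *subject)
{
    regex_t re1, re2;   /* two separate structures, one per pattern */

    assert(pat1 != NULL && pat2 != NULL && subject != NULL);

    if (regcomp(&re1, pat1, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    if (regcomp(&re2, pat2, REG_EXTENDED | REG_NOSUB) != 0) {
        regfree(&re1);
        return -1;
    }

    /* re1 is still usable here: compiling re2 only wrote into re2. */
    int m1 = regexec(&re1, subject, 0, NULL, 0) == 0;
    int m2 = regexec(&re2, subject, 0, NULL, 0) == 0;

    regfree(&re1);
    regfree(&re2);
    return m1 && m2;
}
```

Each compiled regex_t can be passed to regexec() as many times as needed
before it is released with regfree().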


> Basically, as I mentioned in the above C++ like comment, I would like to
> be able to compile 2 strings both into regular expressions then use them
> repeatedly to match other strings. Is this a feasible usage of PCRE
> library? I don't know what happens in 2nd regcomp (line 6) that changes
> the regular expression structure content I just compiled on line 5. Is
> there a way to get those two regex_t completely separated? Do I need to
> use some flag during compilation time?


You need to do it by hand. You can do it by building a new regex like:
"(" + str1 + ")|(" + str2 + ")"
(no spaces around the "|", since they would be matched literally).
This way you can also pray that PCRE will optimize some stuff and
perform better than calling it twice :)
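
A minimal sketch of building such a combined pattern in C. It assumes the
system POSIX <regex.h> (swap in <pcreposix.h> for PCRE's wrapper); the helper
name either_matches and the 256-byte buffer size are hypothetical choices
for illustration.

```c
#include <assert.h>
#include <regex.h>   /* assumption: system POSIX regex, not PCRE's wrapper */
#include <stdio.h>

/* Hypothetical helper: returns 1 if subject matches pat1 or pat2,
 * 0 if it matches neither, and -1 on error. */
int either_matches(const char *pat1, const char *pat2, const char *subject)
{
    char combined[256];   /* illustrative fixed size */
    regex_t re;

    assert(pat1 != NULL && pat2 != NULL && subject != NULL);

    /* Build "(pat1)|(pat2)" with no spaces around the '|'. */
    int n = snprintf(combined, sizeof combined, "(%s)|(%s)", pat1, pat2);
    if (n < 0 || (size_t)n >= sizeof combined)
        return -1;        /* pattern would not fit in the buffer */

    if (regcomp(&re, combined, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int matched = regexec(&re, subject, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

Note that the combined pattern reports a match when either alternative
matches. If you need to know which one matched, drop REG_NOSUB and inspect
the regmatch_t entries for the two groups.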

Nuno