[pcre-dev] pcre not matching unicode characters

Author: Jayaprakasam, Kannan
Date:
To: pcre-dev@exim.org
Subject: [pcre-dev] pcre not matching unicode characters
I'm compiling a PCRE pattern with the PCRE_UTF8 flag enabled and trying to match a UTF-8 char* string against it, but it does not match: pcre_exec returns a negative value. I'm passing 65 as the subject length to pcre_exec, which is the number of characters in the string. Please help.

(If I compile without the PCRE_UTF8 flag, it does match, but offset vector[1] is 30, which is the index of the character just before a Unicode character in my input string.)
#include "stdafx.h"
#include <pcre.h>               /* PCRE lib        NONE  */
#include <stdio.h>              /* I/O lib         C89   */
#include <stdlib.h>             /* Standard Lib    C89   */
#include <string.h>             /* Strings         C89   */

int main(int argc, char *argv[])
{
  pcre *reCompiled;
  int pcreExecRet;
  int subStrVec[30];
  const char *pcreErrorStr;
  int pcreErrorOffset;
  const char *aStrRegex = "(\\?\\w+\\?\\s*=)?\\s*(call|exec|execute)\\s+(?<spName>\\w+)("
      // params can be an empty pair of parentheses or have parameters inside them as well.
      "\\(\\s*(?<params>[?\\w,]+)\\s*\\)"
      // paramList along with its parentheses is optional below, so an SP call can be just "exec sp_name" for a stored proc call without any parameters.
      ")?";

  // Options are 0 in this listing; the failing case passes PCRE_UTF8 here instead.
  reCompiled = pcre_compile(aStrRegex, 0, &pcreErrorStr, &pcreErrorOffset, NULL);
  if (reCompiled == NULL) {
    printf("ERROR: Could not compile '%s': %s\n", aStrRegex, pcreErrorStr);
    exit(1);
  }

  const char *line = "?rt?=call SqlTxFunctionTesting(?înFîéld?,?outField?,?inOutField?)";
  pcreExecRet = pcre_exec(reCompiled,
                          NULL,
                          line,
                          65,         // length of string (number of characters)
                          0,          // Start looking at this point
                          0,          // OPTIONS
                          subStrVec,
                          30);        // Length of subStrVec

  printf("ret=%d\n", pcreExecRet);

  pcre_free(reCompiled);
  return 0;
}


Thanks,
Kannan