[pcre-dev] [Bug 1848] pcregrep outputs duplicate matches

Author: admin
Date:  
To: pcre-dev
Old Topics: [pcre-dev] [Bug 1848] New: pcregrep outputs duplicate matches
Subject: [pcre-dev] [Bug 1848] pcregrep outputs duplicate matches
https://bugs.exim.org/show_bug.cgi?id=1848

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED


--- Comment #4 from Philip Hazel <ph10@???> ---
I have committed a patch that fixes this. It is not your patch, because yours
(and my previous buggy one) did not take account of further matches starting
later on the line where the previous one ended. Consider, for example, a file
such as

123\n
456\n
789\n
---abc\n
def\n
xyz\n

This should produce 2 matches, the second one starting "abc...". I hope I've
got it right this time.

Here's my patch:

Index: pcregrep.c
===================================================================
--- pcregrep.c  (revision 1677)
+++ pcregrep.c  (working copy)
@@ -1804,11 +1804,6 @@
         if (line_buffered) fflush(stdout);
         rc = 0;                      /* Had some success */

-        /* If the current match ended past the end of the line (only possible
-        in multiline mode), we are done with this line. */
-
-        if ((unsigned int)offsets[1] > linelength) goto END_ONE_MATCH;
-
         startoffset = offsets[1];    /* Restart after the match */
         if (startoffset <= oldstartoffset)
           {
@@ -1818,6 +1813,21 @@
           if (utf8)
             while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
           }
+
+        /* If the current match ended past the end of the line (only possible
+        in multiline mode), we must move on to the line in which it did end
+        before searching for more matches. */                                
+                                                          
+        while (startoffset > (int)linelength)
+          {                                                                  
+          matchptr = ptr += linelength + endlinelength;                      
+          filepos += (int)(linelength + endlinelength);                        
+          linenumber++;                    
+          startoffset -= (int)(linelength + endlinelength);
+          t = end_of_line(ptr, endptr, &endlinelength);
+          linelength = t - ptr - endlinelength;
+          }              
+
         goto ONLY_MATCHING_RESTART;
         }
       }
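
The line-advancing step added by the second hunk can be tried out in isolation. What follows is only a minimal sketch, not the pcregrep code: the line_cursor struct and the line_len() and advance_to_match_end() helpers are hypothetical names, line_len() is a simplified stand-in for end_of_line() that recognises only "\n", and the nl_len > 0 guard against running off the end of the buffer is an assumption added for the sketch. The field names echo the variables in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor; the fields echo the variables used in the patch. */
typedef struct {
  const char *ptr;     /* start of the current line */
  size_t linelength;   /* length of the current line, without its newline */
  int linenumber;      /* 1-based number of the current line */
  int startoffset;     /* restart offset, relative to ptr */
} line_cursor;

/* Simplified stand-in for end_of_line(): returns the length of the line at p
   and sets *nl_len to 1 if a '\n' terminates it, or 0 at end of buffer. */
static size_t line_len(const char *p, const char *end, size_t *nl_len)
{
  const char *q = p;
  while (q < end && *q != '\n') q++;
  *nl_len = (q < end) ? 1 : 0;
  return (size_t)(q - p);
}

/* While the restart offset lies beyond the current line, step forward one
   line and rebase the offset, as the loop in the patch does. The nl_len > 0
   check is an extra guard assumed for this sketch. */
static void advance_to_match_end(line_cursor *c, const char *end)
{
  size_t nl_len;
  assert(c != NULL && c->ptr != NULL);
  c->linelength = line_len(c->ptr, end, &nl_len);
  while (c->startoffset > (int)c->linelength && nl_len > 0)
    {
    c->ptr += c->linelength + nl_len;
    c->startoffset -= (int)(c->linelength + nl_len);
    c->linenumber++;
    c->linelength = line_len(c->ptr, end, &nl_len);
    }
}
```

With the six-line example file above held in a buffer, and assuming a first match that starts on line 1 and ends just after "---" (offset 15 from the start of line 1), the cursor ends up on line 4 with startoffset 3, which is the "a" of "abc", so the next search starts there rather than being skipped.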

