[pcre-dev] [Bug 1848] pcregrep outputs duplicate matches

From: admin
Date:
To: pcre-dev
Old subjects: [pcre-dev] [Bug 1848] New: pcregrep outputs duplicate matches
Subject: [pcre-dev] [Bug 1848] pcregrep outputs duplicate matches
https://bugs.exim.org/show_bug.cgi?id=1848

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED


--- Comment #4 from Philip Hazel <ph10@???> ---
I have committed a patch that fixes this. It is not your patch, because yours
(and my previous buggy one) did not take account of further matches starting
later on the line where the previous one ended. Consider, for example, a file
such as

123\n
456\n
789\n
---abc\n
def\n
xyz\n

This should produce 2 matches, the second one starting "abc...". I hope I've
got it right this time.

Here's my patch:

Index: pcregrep.c
===================================================================
--- pcregrep.c  (revision 1677)
+++ pcregrep.c  (working copy)
@@ -1804,11 +1804,6 @@
         if (line_buffered) fflush(stdout);
         rc = 0;                      /* Had some success */

-        /* If the current match ended past the end of the line (only possible
-        in multiline mode), we are done with this line. */
-
-        if ((unsigned int)offsets[1] > linelength) goto END_ONE_MATCH;
-
         startoffset = offsets[1];    /* Restart after the match */
         if (startoffset <= oldstartoffset)
           {
@@ -1818,6 +1813,21 @@
           if (utf8)
             while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
           }
+
+        /* If the current match ended past the end of the line (only possible
+        in multiline mode), we must move on to the line in which it did end
+        before searching for more matches. */                                
+                                                          
+        while (startoffset > (int)linelength)
+          {                                                                  
+          matchptr = ptr += linelength + endlinelength;                      
+          filepos += (int)(linelength + endlinelength);                        
+          linenumber++;                    
+          startoffset -= (int)(linelength + endlinelength);
+          t = end_of_line(ptr, endptr, &endlinelength);
+          linelength = t - ptr - endlinelength;
+          }              
+
         goto ONLY_MATCHING_RESTART;
         }
       }
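
For readers who want to see the effect of the new loop in isolation, here is a minimal standalone sketch of the same idea. It is not the pcregrep source: the line_state struct, find_end_of_line() and advance_to_match_end() are hypothetical names invented for this illustration, only "\n" is treated as a line ending, and the end-of-buffer guard is an extra safety check that does not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pcregrep's end_of_line(). Returns a pointer just
   past the newline that ends the line starting at p (or endptr if there is
   none) and stores the newline length (1 or 0) in *lenptr. Only "\n" is
   recognised in this sketch. */
static const char *find_end_of_line(const char *p, const char *endptr,
                                    size_t *lenptr)
{
  const char *nl = memchr(p, '\n', (size_t)(endptr - p));
  if (nl == NULL) { *lenptr = 0; return endptr; }
  *lenptr = 1;
  return nl + 1;
}

/* Hypothetical bundle of the variables the patch updates. */
typedef struct {
  const char *ptr;        /* start of the current line */
  size_t linelength;      /* length of the line, excluding the newline */
  size_t endlinelength;   /* length of the newline sequence */
  int linenumber;         /* 1-based number of the current line */
  int startoffset;        /* restart offset, relative to ptr */
} line_state;

/* While the restart offset lies beyond the current line, step forward one
   line at a time, rebasing the offset so that it stays relative to ptr. */
static void advance_to_match_end(line_state *s, const char *endptr)
{
  assert(s->ptr <= endptr);
  while (s->startoffset > (int)s->linelength)
    {
    const char *next = s->ptr + s->linelength + s->endlinelength;
    const char *t;
    if (next >= endptr) break;     /* guard added for this sketch only */
    s->startoffset -= (int)(s->linelength + s->endlinelength);
    s->ptr = next;
    s->linenumber++;
    t = find_end_of_line(s->ptr, endptr, &s->endlinelength);
    s->linelength = (size_t)(t - s->ptr) - s->endlinelength;
    }
}
```

Taking the six-line file from the comment above and assuming, purely for illustration, that the first match began on line 1 and ended at offset 15 (just after "---"), starting from line 1 with linelength 3 and startoffset 15 leaves the state at line 4 with startoffset 3, so the next search begins at "abc".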


--
You are receiving this mail because:
You are on the CC list for the bug.