http://bugs.exim.org/show_bug.cgi?id=1130
Summary: pcregrep doesn't copy entire lines to output when they are long (> 25000 chars)
Product: PCRE
Version: 8.12
Platform: Other
OS/Version: Linux
Status: NEW
Severity: bug
Priority: medium
Component: Code
AssignedTo: ph10@???
ReportedBy: peter@???
CC: pcre-dev@???
I created a file containing a single long line: an 'a', many 'x's, and a 'b'.
Running pcregrep for a or for b produces output that differs from the original
file, which I didn't expect.
This happens with 25000 'x's, but with 20000 'x's the output is fine, as this
shell snippet illustrates:
> perl -e 'print "a", "x"x25000, "b\n"' > long
> perl -e 'print "a", "x"x20000, "b\n"' > short
> for ls in long short ; do for ab in a b ; do pcregrep $ab $ls > ${ls}${ab} ; done ; done
> md5sum long* short*
a02b4cbbb437eaf52997832952a1d052 long
1dfac8b938bfaec4c6bd727ffae356fd longa
1d5d6df30c643aed4a626dd8ab36f2ec longb
27ee48c18be91ac0038ba8d9a3988625 short
27ee48c18be91ac0038ba8d9a3988625 shorta
27ee48c18be91ac0038ba8d9a3988625 shortb
> ls -l long* short*
-rw-r--r-- 1 pvm pvm 25003 2011-07-12 14:33 long
-rw-r--r-- 1 pvm pvm 24576 2011-07-12 14:33 longa
-rw-r--r-- 1 pvm pvm 427 2011-07-12 14:33 longb
-rw-r--r-- 1 pvm pvm 20003 2011-07-12 14:33 short
-rw-r--r-- 1 pvm pvm 20003 2011-07-12 14:33 shorta
-rw-r--r-- 1 pvm pvm 20003 2011-07-12 14:33 shortb
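The sizes in the listing add up: longa and longb together are exactly as long as long, and longa is an exact multiple of 8192. Here is a quick check of that arithmetic as a sketch. The variable names are only for illustration, and reading 24576 as 3 x 8192 (a fixed-size buffer) is a guess from the numbers, not something confirmed in the pcregrep source.

```shell
# Sizes copied from the ls -l listing above.
long_size=25003
longa_size=24576
longb_size=427

# The two partial outputs together account for every byte of the input.
echo $((longa_size + longb_size))                     # 25003

# longa is an exact multiple of 8192, which hints at a fixed-size buffer
# (an assumption drawn from the numbers only).
echo $((longa_size / 8192)) $((longa_size % 8192))    # 3 0
```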
I expected
> pcregrep a long
and
> pcregrep b long
to generate the same output, identical to long, just as is the case with the
short* files.
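That expectation can be written as a small check that avoids comparing md5sums by eye. This is only a sketch: check_same is a hypothetical helper name, and it assumes the tool is called as TOOL PATTERN FILE, the way pcregrep and grep are used in this report.

```shell
# check_same TOOL PATTERN FILE
# Succeeds (exit 0) only if TOOL's output for PATTERN is byte-for-byte
# identical to FILE, which is what should happen for a one-line file
# whose only line matches.
check_same() {
    "$1" "$2" "$3" | cmp -s - "$3"
}

# Usage with the files from this report (needs pcregrep installed):
#   for f in long short; do
#       for p in a b; do
#           check_same pcregrep "$p" "$f" || echo "mismatch: $p in $f"
#       done
#   done
```

Passing the tool name as an argument lets the same check run against grep for comparison.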
I ran into this while running pcregrep over minified JavaScript files (each of
which is a single line) and noticed that the output also lacks a terminating
newline:
> (cat long ; cat long) > 2long
> pcregrep a 2long | wc -l
0
> grep a 2long | wc -l
2
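wc -l counts newline characters rather than lines of text, so the 0 from pcregrep means its output contained no newline at all. Here is a small illustration that needs neither pcregrep nor the test files (count_newlines is just a helper name for this sketch):

```shell
# count_newlines: print the number of newline characters read from stdin.
# tr strips the padding that some wc implementations add.
count_newlines() {
    wc -l | tr -d ' '
}

printf 'axxb' | count_newlines            # 0: no terminating newline
printf 'axxb\n' | count_newlines          # 1: one terminated line
printf 'axxb\naxxb\n' | count_newlines    # 2: two terminated lines, like 2long
```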
> pcregrep --version
pcregrep version 8.12 2011-01-15
This is on Ubuntu natty.