https://bugs.exim.org/show_bug.cgi?id=1841
Bug ID: 1841
Summary: Pcre using a lot of cpu and time to match
Product: PCRE
Version: 8.34
Hardware: x86-64
OS: Windows
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: pankaj-g@???
CC: pcre-dev@???
Created attachment 890
  --> https://bugs.exim.org/attachment.cgi?id=890&action=edit
Very Sleepy Data
Hi,
We are using PCRE in our application to match regexes against log files.
A log file of 352 MB (369,827,143 bytes) and 1,819,451 lines takes close to
5 hours to match with our application.
We used Very Sleepy (http://www.codersnotes.com/sleepy/) to check what was
hogging the CPU and why it was taking so much time.
In 60 minutes (3600 sec) of profiling data, pcre_exec accounts for
59.4 minutes (3565 sec) in total, about 99% of the sampled time.
The regex used is:
/(consider|minute|accord|evident|practice|intend|concern|commit|issue|approach|establish|utter|conduct|engage|obtain|scarce|straight|stock|apparent|fancy|concept|court|appoint|passage|vain|coast|project|commission|constant|circumstances|constitute|affect|institute|render|appeal|theory|range|campaign|league|labor|confer|grant|dwell|entertain|contract|earnest|yield|wander|insist|knight|convince|inspire|convention|skill|harry|financial|novel|furnish|compel|venture|territory|temper|bent|intimate|undertake|majority|assert|crew|chamber|humble|keen|liberal|despair|tide|attitude|justify|flag|merit|manifest|notion|scale|formal|persist|contempt|tour|plead|weigh|distinction|inclined|exert|oppress|contend|stake|toil|perish|disposition|rail|cardinal|boast|advocate|bestow|allege|notwithstanding|lofty|multitude|steep|heed|modest|apt|esteem|credible|provoke|tread|ascertain|fare|perpetual|decree|contrive|elaborate|substantial|frontier|facile|cite|warrant|sob|rider|dense|afflict|flourish|ordain|pious|vex|gravity|suspended|conspicuous|retort|bolt|assent|purse|plus|sanction|proceeding|exalt|siege|malice|extravagant|wax|throng|venerate|assail|sublime|exploit|exertion|kindle|endow|imposed|humiliate|suffrage|ensue|brook|gale|muse|satire|intrigue|indication|dispatch|cower|wont|canon|impel|latitude|vacate|undertaking|slay|predecessor|delicacy|forsake|beseech|philosophical|grove|frustrate|illustrious|pomp|entreat|impart|propriety|consecrate|proceeds|fathom|objective|clad|partisan|faction|contrived|venerable|restrained|besiege|manifestation|rebuke|insurgent|rhetoric|scrupulous|ratify|stump|discreet|imposing|wistful|mortify|ripple|premise|subside|adverse|caprice|muster|comprehensive|accede|fervent|cohere|tribunal|austere|recovering|stratum|conscientious|arbitrary|exasperate|conjure|ominous|edifice|elude|pervade|foster|admonish|repeal|retiring|incidental|acquiesce|slew|usurp|sentinel|precision|depose|wanton|odium|precept|deference|fray|candid|enduring|impertinent|bland|insinuate|nominal|supplia
nt|languid|rave|monetary|headlong|infallible|coax|explicate|gaunt|morbid|ranging|pacify|pastoral|dogged|aide|appease|stipulate|recourse|constrained|bate|aversion|conceit|loath|rampart|extort|tarry|perpetrate|decorum|luxuriant|cant|enjoin|avarice|disconcert|symmetry|capitulate|arbitrate|cleave|append|visage|horde|parable|chastise|foil|veritable|grapple|gentry|projection|prowess|dingy|semblance|tout|fortitude|asunder|rout|staid|beguile|purport|deprave|bequeath|enigma|assiduous|vassal|quail|outskirts|bulwark|swerve|gird|betrothed|prospective|advert|peremptory|rudiment|deduce|halting|ignominy|ideology|pallid|chagrin|obtrude|audacious|construe|ford|repast|stint|fresco|dutiful|hew|parity|affable|interminable|pillage|foreboding|rend|livelihood|deign|capricious|stupendous|chaff|innate|reverie|wrangle|crevice|ostensible|craven|vestige|plumb|reticent|propensity|chide|espouse|raiment|intrepid|seemly|allay|fitful|erode|unaffected|canto|docile|patronize|teem|estrange|spat|warble|mien|sate|constituency|patrician|parry|practitioner|ravel|infest|actuate|surly|convalesce|demoralize|devolve|alacrity|waive|unwonted|seethe|scrutinize|diffident|execrate|implacable|pique|mite|encumber|uncouth|petulant|expiate|cavalier|banter|bluster|debase|retainer|subjugate|extol|fraught|august|fissure|knoll|callous|inculcate|nettle|blanch|inscrutable|tenacious|thrall|exigency|disconsolate|impetus|imposition|auspices|sonorous|exploitation|bane|dint|ignominious|amicable|onset|conservatory|zenith|voluble|yeoman|levity|rapt|sultry|pinion|axiom|descry|retinue|functionary|imbibe|diversified|maraud|grudging|partiality|philology|wry|caucus|permeate|propitious|salient|propitiate|excise|betoken|palatable|upbraid|renegade|hoary|pedantic|coy|troth|encroachment|belie|armada|succor|imperturbable|irresolute|knack|unseemly|accentuate|divulge|brawn|burnish|palpitate|promiscuous|dissemble|flotilla|invective|hermitage|despoil|sully|malevolent|irksome|prattle|subaltern|welt|wreak|tenable|inimitable|depredation|amalgamate|
immutable|proxy|dote|reactionary|rationalism|endue|discriminating|brooch|disembark|trappings|abet|clandestine|distend|glib|pucker|rejoinder|spangle|blighted|nicety|aggrieve|vestment|urbane|defray|spectral|munificent|dictum|scabbard|adulterate|beleaguer|gripe|remission|exorbitant|invocation|cajole|inclusive|interdict|abase|obviate|hurtle|unanimity|mettle|interpolate|surreptitious|dissimulate|ruse|specious|revulsion|hale|palliate|obtuse|querulous|vagary|incipient|obdurate|grovel|refractory|dregs|ascendancy|supercilious|pundit|commiserate|alcove|assay|parochial|conjugal|abjure|frieze|ornate|inflammatory|machination|mendicant|meander|bullion|diffidence|makeshift|husbandry|podium|dearth|granary|whet|imposture|diadem|fallow|hubbub|dispassionate|harrowing|askance|lancet|rankle|ramify|gainsay|polity|credence|indemnify|ingratiate|declivity|importunate|whittle|repine|flay|larder|threadbare|grisly|untoward|idiosyncrasy|quip|blatant|stanch|incongruity|perfidious|platitude|revelry|delve|extenuate|polemic|enrapture|virtuoso|glower|mundane|fatuous|incorrigible|postulate|vociferous|purvey|baleful|gibe|dyspeptic|prude|luminary|amenable|willful|overbearing|dais|automate|enervate|wheedle|gusto|bouillon|omniscient|apostate|carrion|emolument|ungainly|impiety|decadence|homily|avocation|circumvent|syllogism|collation|haggle|waylay|savant|cohort|unction|adjure|acrimony|clarion|turbid|cupidity|disaffected|preternatural|eschew|expatiate|didactic|sinuous|rancor|puissant|homespun|embroil|pathological|resonant|libretto|flail|bandy|gratis|upshot|aphorism|redoubtable|corpulent|benighted|sententious|cabal|paraphernalia|vitiate|adulation|quaff|unassuming|libertine|maul|adage|expostulation|tawdry|trite|hireling|ensconce|egregious|cogent|incisive|errant|sedulous|incandescent|derelict|entomology|execrable|sluice|moot|evanescent|vat|dapper|asperity|flair|circumspect|inimical|apropos|gruel|gentility|disapprobation|cameo|gouge|oratorio|inclement|scintilla|confluence|squalor|stricture|emblazon|augury|abut
|banal|congeal|pilfer|malcontent|sublimate|eugenic|lineament|firebrand|fiasco|foolhardy|retrench|ulterior|equable|inured|invidious|unmitigated|concomitant|cozen|phlegmatic|dormer|pontifical|disport|apologist|abeyance|enclave|improvident|disquisition|categorical|placate|redolent|felicitous|gusty|natty|pacifist|buxom|heyday|herculean|burgeon|crone|prognosticate|lout|simper|iniquitous|rile|sentient|garish|readjustment|erstwhile|aquiline|bilious|vilify|nuance|gawk|refectory|palatial|mincing|trenchant|emboss|proletarian|careen|debacle|sycophant|crabbed|archetype|cryptic|penchant|bauble|mountebank|fawning|hummock|apotheosis|discretionary|pithy|comport|checkered|ambrosia|factious|disgorge|filch|wraith|demonstrable|pertinacious|emend|laggard|waffle|loquacious|venial|peon|effulgence|fanfare|dilettante|pusillanimous|ingrained|quagmire|mannered|squeamish|proclivity|miserly|vapid|mercurial|perspicuous|nonplus|enamor|hackneyed|spate|pedagogue|acme|masticate|sinecure|indite|emetic|temporize|unimpeachable|genesis|mordant|smattering|suavity|stentorian|junket|appurtenance|nostrum|immure|astringent|unfaltering|tutelage|testator|elysian|fulminate|fractious|pummel|manumit|unexceptionable|triumvirate|sybarite|jibe|magisterial|roseate|obloquy|hoodwink|striate|arrogate|rarefied|chary|credo|superannuated|impolitic|aspersion|abysmal|poignancy|stilted|effete|provender|endemic|jocund|procedural|rakish|skittish|peroration|nonentity|abstemious|viscid|doggerel|sleight|rubric|plenitude|rebus|wizened|whorl|fracas|iconoclast|saturnine|madrigal|discursive|zealot|moribund|modicum|connotation|adventitious|recondite|zephyr|countermand|captious|cognate|forebear|cadaverous|foist|dotage|nexus|choleric|garble|bucolic|denouement|animus|overweening|tyro|preen|largesse|retentive|unconscionable|badinage|insensate|sherbet|beatific|bemuse|microcosm|factitious|gestate|traduce|sextant|coiffure|malleable|rococo|fructify|nihilist|ellipsis|accolade|codicil|roil|grandiloquent|inconsequential|effervescence|stultify|tur
een|pellucid|euphony|apocryphal|apocryphal|veracious|pendulous|exegesis|effluvium|apposite|viscous|misanthrope|vintner|halcyon|anthropomorphic|turgid|malaise|polemical|gadfly|atavism|contusion|parsimonious|dulcet|reprise|anodyne|bemused)/
This is just one scenario; in general, our regexes can have more than 1000
alternatives joined with | (OR) to match against the log file.
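To make the scenario concrete, the matching loop is roughly equivalent to the sketch below. This is only an illustration in Python, with the standard `re` module standing in for PCRE (so timings will not be comparable). The names `WORDS`, `build_pattern` and `count_matching_lines` are made up for the example, and the word list is cut down to five entries from the pattern above.

```python
import re
import time

# Hypothetical stand-in for the real word list (the actual pattern has ~1000 words).
WORDS = ["consider", "minute", "accord", "evident", "practice"]


def build_pattern(words):
    """Join the words into one unanchored alternation: (w1|w2|...)."""
    return re.compile("(" + "|".join(re.escape(w) for w in words) + ")")


def count_matching_lines(pattern, lines):
    """Return how many lines contain at least one of the words."""
    return sum(1 for line in lines if pattern.search(line))


if __name__ == "__main__":
    pattern = build_pattern(WORDS)
    # Assumed sample input; the real input is a 352 MB log file read line by line.
    sample = ["the minute hand moved", "nothing here", "common practice"]
    start = time.perf_counter()
    hits = count_matching_lines(pattern, sample)
    elapsed = time.perf_counter() - start
    print(f"{hits} of {len(sample)} lines matched in {elapsed:.6f} s")
```

Because the pattern is unanchored, the alternation is tried at every character position of every line, so nearly all of the work is in the match call itself.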
We are using 8.34 in our application. We cannot upgrade the library because
this is a multi-platform application, so an upgrade would take a huge amount
of time and a lot of approvals as well.
Is there a workaround, or anything else we can do, to reduce the time and CPU
usage?
Let me know if you want to check the log file as well.
Regards,
Pankaj