[pcre-dev] PCRE Regex for compression

Author: Era Scarecrow
Date:  
To: pcre-dev
Subject: [pcre-dev] PCRE Regex for compression
Dear Dev team,

Glancing through the API, I'm not seeing a way to get the exact details of how a particular regex matched. The idea I have in mind involves building several patterns that are applied to a text and used for compression. If this kind of feature is already available, a pointer to the relevant set of functions would be nice. If not, here are some examples:

'(Wyoming|Idaho|Alaska|...' etc. This could easily cover all 50 states (or month names, days of the week, etc.). When a state is matched, an escape character is written along with compressed data recording exactly how the regex matched (in this case an index from 0 to 49, maybe 3-4 bytes to store all that information). In more complicated regexes, the fixed parts of a match would of course not need any such intermediate information stored.
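To make the alternation idea concrete, here is a minimal sketch. It uses Python's re module as a stand-in rather than the PCRE C API, and the three-entry STATES list, the ESC character, and the compress/decompress helpers are all assumptions made up for illustration. Each alternative gets its own capture group, so the match itself reports which branch was taken.

```python
import re

# Hypothetical short list standing in for the full set of alternatives.
STATES = ["Wyoming", "Idaho", "Alaska"]

# Assumed escape character; this sketch breaks if it occurs in the input text.
ESC = "\x1b"

# One capture group per alternative, so the match records which branch was taken.
PATTERN = re.compile("|".join("(" + re.escape(s) + ")" for s in STATES))


def compress(text):
    # m.lastindex is the 1-based number of the group that matched;
    # store it as a single character after the escape character.
    return PATTERN.sub(lambda m: ESC + chr(m.lastindex - 1), text)


def decompress(data):
    # Read the index character after each escape and look the word up again.
    return re.sub(
        re.escape(ESC) + "(.)",
        lambda m: STATES[ord(m.group(1))],
        data,
        flags=re.DOTALL,
    )


if __name__ == "__main__":
    original = "From Idaho to Alaska"
    packed = compress(original)
    print(len(original), len(packed))
    print(decompress(packed) == original)
```

With up to 50 alternatives the index still fits in the single character written after the escape, so each matched name shrinks to two characters in this sketch.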

'I just bought \d+ apples? (to|yester)day\.' would only save the digit data (how many apples) and then the exact combination of choices needed to rebuild the text. Since the fixed, required text involves no choices, it doesn't need any notes.
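The same idea can be sketched for the apples sentence. Again this uses Python's re module rather than PCRE, and the group names and the encode/decode helpers are hypothetical. The pattern is the one from the example with named groups added around the parts where the regex has a choice; the fixed text lives only in the pattern and in the rebuild template.

```python
import re

# Named groups mark only the places where the regex had a choice.
PATTERN = re.compile(
    r"I just bought (?P<count>\d+) apple(?P<plural>s?) (?P<day>to|yester)day\."
)


def encode(sentence):
    """Return only the variable parts of the sentence, or None if it does not match."""
    m = PATTERN.fullmatch(sentence)
    if m is None:
        return None
    # Keep the digits as a string so leading zeros survive the round trip.
    return (m.group("count"), m.group("plural") == "s", m.group("day") == "yester")


def decode(fields):
    """Rebuild the full sentence from the stored choices."""
    count, plural, yesterday = fields
    return "I just bought {} apple{} {}day.".format(
        count, "s" if plural else "", "yester" if yesterday else "to"
    )


if __name__ == "__main__":
    sentence = "I just bought 12 apples yesterday."
    fields = encode(sentence)
    print(fields)
    print(decode(fields) == sentence)
```

Here the stored data is the digit string plus two yes/no choices. How compactly those are packed into bytes is left open.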

I see this as having potential, though it is not specifically a regex issue. However, I would hate to have to build my own regex engine from scratch, which would have far fewer features and many more bugs, when I could instead find a few functions that give me the information I need to experiment with this.

Era

---
"I synchronize and I specialize and I classify so much. Don’t worry about dreaming because I don’t sleep. I wish I could have least 30 percent, maybe 50 for pleasure then skip all the rest. If I only was more human I would count every single second the rest of my life" - Be Human