Hi there,
I was wondering if anybody has experience with extracting patterns from
the Exim mainlog, for example for Graylog or Elastic.
The only thing I found were these patterns
https://github.com/vjeantet/grok/blob/master/patterns/exim
but I'd like to have a more extensive, more complete list. For example,
vjeantet's GROK patterns miss the "message fakereject" flag documented in
Ch52.5 (
https://www.exim.org/exim-html-current/doc/html/spec_html/ch-log_files.html).
The basic idea would be to have one or a few giant GROK patterns to match
every possible field in any log line.
I have a few problems with doing this myself:
1. Different kinds of log lines:
If you exigrep some log lines there are those grouped together under one
message id and those which come before message delivery and thus cannot be
grouped to a specific message. No idea how many different "syntaxes" there
are for such lines.
2. Doubly used fields:
At the moment I only know of one field which is used for different purposes
on different log lines: "T", which means "Topic" on inbound lines and
"Transport" on outbound lines. You can tell the two apart by the log line
flags (Ch52.5). You can find most fields in Ch52.13, but it is incomplete
because at least the second "T" is not explained. So there could be more
undocumented fields.
3. Complex fields:
If you look at the H field, it contains several pieces of information:
hostname, HELO, IP and port, separated by spaces and a colon. Sometimes you
get every one of those; sometimes there is no hostname because the PTR and
A records do not match (at least I think this is the reason); sometimes
there is no HELO because the server sent its hostname as HELO (which is
great). The linked GROK patterns do account for this correctly, but I do
not know whether other fields behave similarly.
4. Different keywords and extra messages:
Most fields have a clear prefix such as "F=" or "H=", but there are also
bare keywords like "from" and "for".
I can only do so much to generate the most complete log lines myself. It
would be helpful to have, for each type of log line, the set of every field
that might or might not appear, in the right order. Or at least to know what
different kinds of log lines there are. There are at least inbound
( <=, (= ), outbound (**, =>, *>, >>, etc.), completed, and rejects before
delivering/submitting a mail.
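
To make the line kinds above concrete, here is a rough Python sketch of how
a line could first be classified by its flag before any field-level pattern
is applied. It is only a starting point under stated assumptions: the
default timestamp format (no time zone, milliseconds or pid), message ids
in either the 6-6-2 or the longer 6-11-4 layout, and the "T" reading from
point 2 (topic on inbound lines, transport on outbound lines). The names
FLAGS, LINE, INBOUND and classify are made up for this example.

```python
import re

# Flag meanings as listed in the log files chapter of the Exim spec.
FLAGS = {
    "<=": "message arrival",
    "(=": "message fakereject",
    "=>": "normal message delivery",
    "->": "additional address in same delivery",
    ">>": "cutthrough message delivery",
    "*>": "delivery suppressed by -N",
    "**": "delivery failed; address bounced",
    "==": "delivery deferred; temporary problem",
}

# Assumption: default timestamp, then an optional message id, then an
# optional two-character flag, then whatever is left of the line.
LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?:(?P<msgid>[0-9A-Za-z]{6}-[0-9A-Za-z]{6,11}-[0-9A-Za-z]{2,4}) "
    r"(?:(?P<flag><=|\(=|=>|->|>>|\*>|\*\*|==) )?)?"
    r"(?P<rest>.*)$"
)

# Flags treated as inbound, following the grouping in this mail.
INBOUND = {"<=", "(="}


def classify(line):
    """Return a dict describing the line, or None if it does not parse."""
    m = LINE.match(line)
    if m is None:
        return None
    info = m.groupdict()
    flag = info["flag"]
    if info["msgid"] is None:
        info["kind"] = "no message id"
    elif flag is None:
        info["kind"] = "other per-message event"  # e.g. "Completed"
    else:
        info["kind"] = FLAGS[flag]
    # How a T= field on this line should be read (assumption from point 2).
    if flag is None:
        info["t_meaning"] = None
    elif flag in INBOUND:
        info["t_meaning"] = "topic"
    else:
        info["t_meaning"] = "transport"
    return info
```

The "rest" group is what a second, per-kind pattern would then have to
pick apart; lines without a message id end up in a catch-all kind that
would still need its own set of patterns.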
Does anybody know of a complete pattern for mainlog (or even reject and
panic log as well)? Or are there maybe already GROK/Regex patterns I did
not stumble upon that satisfy my requirements or at least come close?
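
For the H field from point 3, the kind of regex I have in mind looks
roughly like the Python sketch below. It assumes the three shapes described
above (hostname and HELO, HELO only, hostname only) plus an optional
":port" after the bracketed IP. The names H_FIELD and parse_h_field are
invented for this example, and the addresses are documentation
placeholders, not real log data.

```python
import re

# Assumed layout: H=[hostname ][(helo) ][ip][:port]
# Hostname and HELO are each optional; the bracketed IP is required.
H_FIELD = re.compile(
    r"(?<!\S)H="                            # H= at the start of a token
    r"(?:(?P<hostname>[^\s()\[\]]+) )?"     # verified hostname, if any
    r"(?:\((?P<helo>[^)]*)\) )?"            # HELO name in parentheses, if any
    r"\[(?P<ip>[^\]]+)\]"                   # IPv4 or IPv6 address in brackets
    r"(?::(?P<port>\d+))?"                  # port, if it is logged
)


def parse_h_field(line):
    """Return the parts of the first H field in the line, or None."""
    m = H_FIELD.search(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    samples = [
        "H=mail.example.org (helo.example.org) [192.0.2.1]:25",
        "H=(helo.example.org) [192.0.2.1]",
        "H=mail.example.org [2001:db8::1]",
    ]
    for sample in samples:
        print(parse_h_field(sample))
```

Parts that are missing come back as None, so downstream code can tell
"no hostname" apart from an empty one.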
Cheers and have a nice weekend,
Christian Kugler