Author: Kevin P. Fleming Date: To: exim-users Subject: [Exim] [RFC] Callout caching version 2 discussion
I've been running with my callout caching patch for four days now, with
no problems so far. However, after reviewing the database this morning
I can see some room for improvement. On a busy mail server, the callout
cache could grow without bound. Granted, someone could run exim_tidydb
on a daily basis and delete records older than (for example) 48 hours,
but some automatic pruning would be better. Two issues stand out:
1) Sender addresses can vary in case, even in the domain portion. As it
stands, this causes multiple records in the cache for what is actually
the same address, and can result in unnecessary callouts.
2) There is currently no automatic purging of the database (other than
old records being deleted if that address is verified again and fails).
There are also incoming sender addresses that are always unique (some
mailing lists, and things like Yahoo! Groups). Caching these addresses
is a waste of space, since a new sender address is generated for every
message.
To solve item number 1, I can make some changes to use Exim's
deliver_split_address function to break the address apart using Exim's
existing logic (including percent hacks and all the rest) and then use
only the lowercased domain name as my cache key.
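As a rough illustration of the key normalization only (in Python, not
Exim's C, and without any of deliver_split_address's percent-hack
handling), something along these lines is what I have in mind. The
function name cache_key and the fold_local_part flag are hypothetical;
the sketch assumes the address is split at the last '@' and that the
domain is always lowercased, with folding of the local part left as a
policy choice.

```python
def cache_key(address: str, fold_local_part: bool = False) -> str:
    """Return a normalized cache key for an email address (sketch only)."""
    # Split at the LAST '@' so that an '@' inside the local part is kept.
    local_part, sep, domain = address.rpartition("@")
    if not sep:
        # No '@' at all: treat the whole string as a bare local part.
        local_part, domain = address, ""
    if fold_local_part:
        local_part = local_part.lower()
    if not domain:
        return local_part
    # Domains are case-insensitive, so always fold them.
    return local_part + "@" + domain.lower()
```

With this, "User@Example.COM" and "User@example.com" map to the same
key, so they would share one cache record instead of two.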
Philip: This could be done one of two ways: copy the incoming
address_item structure into a temporary structure on the stack, or
modify the original structure. After looking at the code, the only two
places that call verify_address with callout enabled don't care about
the domain/cc_local_part/local_part fields in the address_item structure
they're passing, so modifying them should not cause a problem. However,
if you'd rather verify_address had no new side effects on the incoming
structure, I can just copy it and modify the copy.
To solve item number 2, I'm thinking of adding a new configuration
option, callout_cache_ignore, which could be a list of
regex items. If the (now lowercased) key matches one of those items,
then don't bother entering the key into the database (for either a
positive or negative verification). I haven't looked at the difficulty
of doing this yet, but I'm always up for a challenge.
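
A minimal sketch of the matching logic, again in Python rather than
Exim's C, and with nothing taken from the actual patch: the helper
names, the dict standing in for the callout database, and the example
pattern are all hypothetical. It assumes the key has already been
lowercased and that a match against any item in the list means the
result is not stored, whether the verification succeeded or failed.

```python
import re

def compile_ignore_list(patterns):
    """Compile the regex items of a (hypothetical) callout_cache_ignore list."""
    return [re.compile(p) for p in patterns]

def should_cache(key, ignore_list):
    """True unless the key matches at least one ignore pattern."""
    return not any(rx.search(key) for rx in ignore_list)

def record_result(cache, key, verified, ignore_list):
    """Store a positive or negative result unless the key is ignored."""
    if should_cache(key, ignore_list):
        cache[key] = verified

# Illustrative pattern only: a list that generates a unique sender
# address per message.
ignore = compile_ignore_list([r"^bounce-[0-9]+@lists\.example\.com$"])
cache = {}
record_result(cache, "bounce-12345@lists.example.com", True, ignore)
record_result(cache, "alice@example.org", False, ignore)
```

After those two calls only alice@example.org has a record; the
per-message bounce address never reaches the database.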