Of course it's not going to generate a unique filename when you filter, but
that's not the intent of the "tainting" system. Its intent is to prevent
command injection, escaping out of directories, and similar threats that
can occur when concatenating user-submitted data with other data.
SQL injection is another such case.
The tainting system is not designed to be foolproof. It is meant to catch
mistakes by system administrators, such as accidentally changing a variable
into a user-submitted one, which would allow a remote attacker to write to
arbitrary locations in the filesystem and the like.
That's why I suggest this "detaint" operator. It can be used on tainted data
to detaint and filter it in one place, and also as a foolproof safe filter
in other situations where you want to restrict the permitted characters to
avoid security issues.
This also means the "detaint" operator must filter in such a secure way
that it's impossible to inject, or cause to be injected, any character
other than those listed in <charlist>, even with buffer overflows,
malicious UTF-8 characters or other tricks that might bypass the <charlist>
filter.
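
To make the intended semantics concrete, here is a minimal sketch in Python
(not Exim code, and not how Exim would have to implement it). The function
name detaint and the constant DOMAIN_CHARS are just illustrative. It works on
whole decoded characters rather than raw bytes, which is one way to make sure
a multi-byte sequence cannot smuggle a forbidden byte past the filter.

```python
def detaint(value: str, charlist: str) -> str:
    """Return value with every character not listed in charlist deleted.

    Illustrative model only: the comparison is done per decoded character,
    so nothing outside charlist can appear in the result.
    """
    allowed = frozenset(charlist)
    return "".join(ch for ch in value if ch in allowed)


# Assumed allowlist, the same one as in my original example.
DOMAIN_CHARS = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_-."
)

print(detaint("sebbe.eu/../../../../etc/passwd", DOMAIN_CHARS))
# prints: sebbe.eu........etcpasswd
```

The output can only ever be built from characters in the list, whatever the
input looks like. That is the property I am asking for.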
Domain names with Greek/Cyrillic characters would of course get filtered
UNLESS the system administrator has configured the charlist to include
those characters. That's why I suggest the charlist be user-configurable,
so it can be tailored both to the locale and to the context in which the
tainted data is to be used.
For example, in a filesystem call, / and \ are dangerous characters; in a
pipe call, " " (space) might be dangerous since it moves the parser on to
the next argument; and in SQL, ' and - are dangerous characters.
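
As a rough illustration of per-context charlists, again in Python rather than
Exim syntax: the context names and the exact lists below are my own
assumptions, not a recommendation, and a real administrator would pick them
for the actual use case.

```python
import string

ALNUM = string.ascii_letters + string.digits

# Hypothetical allowlists, one per context in which the data is used.
CHARLISTS = {
    "filename": ALNUM + "_-.",     # excludes "/" and "\"
    "shell_arg": ALNUM + "_-.",    # excludes space, quotes, ";" and "|"
    "sql_literal": ALNUM + "_.@",  # excludes "'" and "-"
}


def detaint_for(context: str, value: str) -> str:
    """Delete every character not allowed in the given context."""
    allowed = frozenset(CHARLISTS[context])
    return "".join(ch for ch in value if ch in allowed)


print(detaint_for("filename", "..\\..\\boot.ini"))    # ....boot.ini
print(detaint_for("shell_arg", "a.txt; rm -rf /"))    # a.txtrm-rf
print(detaint_for("sql_literal", "x' OR 1=1 --"))     # xOR11
```

The same input can be acceptable in one context and dangerous in another,
which is why a single fixed list would not be enough.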
And yes, hashing, or for example converting to URL-safe Base64 or hex,
should be safe operations to perform on unsafe data, and the result can
safely be fed directly into a shell, filesystem path, database query or
similar. But like you, I don't know whether such operations untaint the
data.
After all, these output formats exclude the characters that can be used to
escape out of a command, query or filesystem path, and are thus safe in
that regard.
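
A small Python sketch of what I mean, using only the standard library. The
helper name safe_forms is made up for illustration, and this says nothing
about whether Exim's own hash or encoding operators untaint their result.

```python
import base64
import hashlib


def safe_forms(value: str) -> dict:
    """Encode untrusted text into restricted output alphabets."""
    raw = value.encode("utf-8")
    return {
        # Alphabet: 0-9 and a-f. Reversible, so distinct inputs stay distinct.
        "hex": raw.hex(),
        # Alphabet: A-Z, a-z, 0-9, "-", "_" and "=" padding. Reversible.
        "base64url": base64.urlsafe_b64encode(raw).decode("ascii"),
        # Alphabet: 0-9 and a-f. Fixed length of 64, not reversible.
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


for name, encoded in safe_forms("sebbe.eu/../../../../etc/passwd").items():
    print(name, encoded)
```

Whatever the input contains, none of the three results can include /, \,
space or a quote character. One caveat: URL-safe Base64 can produce a string
that starts with "-", which some commands would read as an option, so hex is
the more conservative choice for a shell argument.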
-----Original Message-----
From: Andrew C Aitchison via Exim-users <exim-users@???>
Sent: 3 June 2020 16:03
To: Sebastian Nielsen <sebastian@???>
Cc: exim-users@???
Subject: Re: [exim] Suggestion: detainting via string exp
On Wed, 3 Jun 2020, Sebastian Nielsen via Exim-users wrote:
> I have a suggestion, and that is to allow detainting of data via a
> new string expansion called detaint:
>
>   ${detaint{<string>}{<charlist>}}
>
> The idea is that you supply the string you want detainted, and a
> "permitted character list"; all characters not on that list will be
> deleted. However, <charlist> must be untainted. It would be even
> better to completely disable string expansion for <charlist>.
>
> Example: if $domain contains "sebbe.eu/../../../../etc/passwd" then
>
>   ${detaint{$domain}{abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-.}}
>
> will return: sebbe.eu........etcpasswd in detainted form, which
> will be safe to use as a filename.
>
> OF COURSE it's the system administrator's responsibility to supply a
> character list that is "safe" for the use case of said tainted data.
>
> Because of the new stricter taint rules, system administrators
> shouldn't need to specify complete domain whitelists; it's better to
> allow for "any domain" as long as this "any domain" is scrubbed of
> any unsafe data.
"Safe" as in it will create a valid file, but not necessarily a unique one,
e.g. 3.com.com and 3com.com will use the same file
(these are both real addresses:
# host -t mx 3.com.com
3.com.com mail is handled by 10 mx203.inbound-mx.net.
3.com.com mail is handled by 10 mx203.inbound-mx.org.
# host -t mx 3com.com
3com.com mail is handled by 10 mxa-00010e01.gslb.pphosted.com.
3com.com mail is handled by 10 mxb-00010e01.gslb.pphosted.com.
).
And what about domain names with special (i.e. non-ASCII) characters?
Did you know that .eu also maintains Greek and Cyrillic top-level domains?
See
https://eurid.eu/en/register-a-eu-domain/domain-names-with-special-characters-idns/
-----------------------
Exim has several hash functions. A hash of $domain is not as human readable
as your suggestion, but it would be a reasonable alternative filename,
although I have not verified that hashing untaints a string.
--
Andrew C. Aitchison Kendal, UK
andrew@???