[Exim] rewriting full name in From-header with non-ascii characters

Author: Meik Hellmund
Date:
To: exim-users
Subject: [Exim] rewriting full name in From-header with non-ascii characters

Hi,

First, let me say many thanks for exim, which is a terrific program!
Let me also mention that spec.txt is the best description of how an
MTA works and how it should be configured that I have ever found.
(And I have the O'Reilly sendmail book lurking on my bookshelves.)

I am configuring the central mail hub for a university institute with dozens
of half-, self-, or unadministered Linux/Windows machines, and I would like to
rewrite not only the sender address on outgoing mails but also the
"real name" as it appears in the From: header. The nice thing is that
exim 4.12 allows this using the "w" flag in a rewrite rule.
But what I needed was something to encode names with all the German
umlauts according to RFC 2047. Such a function already exists in the internal
exim code: parse_fix_phrase() in parse.c. So I decided to make this
function available as a new expansion operator by a small modification
of src/expand.c (see the attached diff):

${rfc2047:<string>}  applies parse_fix_phrase() to <string> to construct valid
                     From: lines with non-ASCII characters

This allows a rewrite rule of the form

*@my.domain \
"${lookup{$1}lsearch{/etc/mailnames}{${rfc2047:$value}}fail} <$1@???>" fw

which produces a from-line like

from: =?iso-8859-1?Q?Otto_K=E4=DFmann?= <otto@???>

when /etc/mailnames contains the line

otto: Otto Käßmann
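The Q encoding seen in that From: line can be sketched in standalone C. This
is only a minimal illustration and not exim's parse_fix_phrase(): the function
name q_encode_latin1() is hypothetical, the input is assumed to be ISO-8859-1
already, only letters and digits are left unencoded, and the sketch ignores
the 75-character limit that RFC 2047 places on a single encoded-word.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of exim): encode an ISO-8859-1 phrase as one
   RFC 2047 "Q" encoded-word. Returns the number of bytes written (excluding
   the terminating NUL), or -1 if the output buffer is too small. */
int q_encode_latin1(const unsigned char *in, char *out, size_t outsize)
{
    static const char hex[] = "0123456789ABCDEF";
    static const char prefix[] = "=?iso-8859-1?Q?";
    size_t plen = sizeof(prefix) - 1;
    size_t o;

    /* Room for the prefix, the closing "?=" and the NUL is the bare minimum. */
    if (outsize < plen + 3) return -1;
    memcpy(out, prefix, plen);
    o = plen;

    for (; *in != '\0'; in++) {
        unsigned char c = *in;

        /* Worst case for one character is "=XX" (3 bytes); keep a further
           3 bytes in reserve for the closing "?=" and the NUL. */
        if (o + 6 > outsize) return -1;

        if (c == ' ') {
            out[o++] = '_';             /* a space is written as underscore */
        } else if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                   (c >= '0' && c <= '9')) {
            out[o++] = (char)c;         /* always safe inside a phrase */
        } else {
            out[o++] = '=';             /* everything else becomes =XX */
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }

    out[o++] = '?';
    out[o++] = '=';
    out[o] = '\0';
    return (int)o;
}
```

Called with the ISO-8859-1 bytes of "Otto Käßmann" (ä is 0xE4, ß is 0xDF), the
function fills the buffer with =?iso-8859-1?Q?Otto_K=E4=DFmann?=, which is the
phrase part of the From: line shown above.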

What do you think about including something like this in exim?
Another possibility would be to call parse_fix_phrase() automatically when a
rewriting with the w flag takes place, but I didn't try that.

The next problem was that I wanted to use OpenLDAP as the database. It stores
non-ASCII strings in UTF-8 encoding. Since adding expansion operators is so
easy, I added another one:

${utf8:<string>}  converts <string> from UTF-8 to ISO-8859-1

This is only a hack; it needs the iconv functions (iconv_open(), iconv(),
iconv_close()) as provided by the GNU libc. But it allows such nice rewriting
rules as:

*@MAINDOMAIN \
  "${lookup ldap{user=LDAPUSER \
              ldap:///ou=People,LDAPDN\
              ?displayName?sub?(&(objectClass=inetOrgPerson)\
              (uid=${quote_ldap:$1}))}\
    {${rfc2047:${utf8:$value}}}fail} <$1@MAINDOMAIN>" fw
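The UTF-8 to ISO-8859-1 step can also be sketched outside exim. This is a
standalone illustration under stated assumptions, not the code from the
attached diff: utf8_to_latin1() is a hypothetical name, and the sketch assumes
that the platform's iconv accepts the encoding names "UTF-8" and "ISO-8859-1".
It checks the return values and takes the output length from how far the
output pointer has moved.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of exim): convert a NUL-terminated UTF-8
   string to ISO-8859-1. Returns the number of bytes written (excluding the
   terminating NUL), or -1 on any failure: unsupported conversion, invalid
   or unconvertible input, or an output buffer that is too small. */
long utf8_to_latin1(const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv() wants char **; the input bytes are not modified */
    char *outp = out;
    size_t inleft = strlen(in);
    size_t outleft;
    size_t rc;

    if (outsize == 0) return -1;
    outleft = outsize - 1;      /* keep one byte for the terminating NUL */

    cd = iconv_open("ISO-8859-1", "UTF-8");     /* (to, from) */
    if (cd == (iconv_t)-1) return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    /* (size_t)-1 signals an error; a positive count means that some
       characters were converted non-reversibly. Treat both as failure. */
    if (rc != 0) return -1;

    *outp = '\0';
    /* outleft is the space still free, so the length of the result is the
       distance the output pointer has advanced. */
    return (long)(outp - out);
}
```

With the UTF-8 bytes of "Otto Käßmann" and a 100-byte buffer this should
return 12 and leave the ISO-8859-1 form of the name in the buffer. A buffer
that is too small makes the call fail instead of silently truncating the name.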

Another remark:
parse_fix_phrase() depends - via mac_isprint() - on the
print_topbitchars configuration option, which seems wrong to me. Is this a bug?

Thanks, Meik

--- src/expand.c.bak    2003-01-13 11:36:18.000000000 +0100
+++ src/expand.c        2003-01-13 18:01:18.000000000 +0100
@@ -27,7 +27,7 @@
 #include "lookups/ldap.h"
 #endif

-
+#include <iconv.h>

/* Recursively called function */

@@ -97,11 +97,13 @@
US"nh",
US"nhash",
US"quote",
+ US"rfc2047",
US"rxquote",
US"s",
US"sha1",
US"substr",
- US"uc" };
+ US"uc",
+ US"utf8" };

enum {
EOP_ADDRESS,
@@ -121,11 +123,13 @@
EOP_NH,
EOP_NHASH,
EOP_QUOTE,
+ EOP_RFC2047,
EOP_RXQUOTE,
EOP_S,
EOP_SHA1,
EOP_SUBSTR,
- EOP_UC };
+ EOP_UC,
+ EOP_UTF8 };

@@ -3565,6 +3569,35 @@
         continue;
         }

+      case EOP_RFC2047:
+        {
+        uschar *t, *big_buf;
+        /* we need a new big buffer for parse_fix_phrase() to avoid
+           trouble with outer regexps */
+        big_buf = big_buffer;
+        big_buffer=malloc(1024);
+        t = parse_fix_phrase(sub);
+        yield = string_cat(yield, &size, &ptr, t, Ustrlen(t));
+        free(big_buffer);
+        big_buffer=big_buf;
+        continue;
+        }
+
+      case EOP_UTF8:
+        {
+       iconv_t iv;
+       size_t in,out=99;
+       char outbuf[100],* outp;
+       iv = iconv_open("iso-8859-1","utf8");
+       in = Ustrlen(sub);
+       outp=outbuf;
+       iconv(iv,&sub,&in,&outp,&out);
+       iconv_close(iv);
+       *outp='\0';
+       yield = string_cat(yield, &size, &ptr, outbuf, (int)(outp - outbuf));
+        continue;
+       }
+
       /* Unknown operator */

       default:
########################END diff #######################
