[exim-dev] Bad utf-8 in pgsql lookup and mainlog

Author: Axel Rau
Date:  
To: exim-dev
Subject: [exim-dev] Bad utf-8 in pgsql lookup and mainlog
Recording UTF-8 characters from headers in the mainlog and in a PostgreSQL DB (via lookup) usually works flawlessly.

Occasionally, PostgreSQL complains about an invalid byte sequence during the INSERT of header items or main log events (our log host uses PostgreSQL as its backend), as here:

[1\3] 1V085d-00067H-9X H=mail03.noris.net [62.128.1.223] Warning: ACL "warn" statement skipped: condition test deferred: PGSQL: query failed: ERROR: invalid byte sequence for encoding "UTF8": 0xfc

2013-07-19T10:39:10.005396+00:00 db1 rsyslogd: db error (22021): invalid byte sequence for encoding "UTF8": 0xfc
2013-07-19T10:39:10.005415+00:00 db1 rsyslogd: db error (event): |2013-07-19t10:39:09.991124+00:00|6|2|mx4|exim| [2\3] (PGRES_FATAL_ERROR) (SELECT * FROM record_Reception( '1525916', '1V085d-00067H-9X', 'Staatstheater Nürnberg <info@???>', 'Newsletter Staatstheater Nürnberg', 'none', 'N/A'))

Does this come from bad encoding of original mail headers?
Is there an easy solution to skip bad characters before sending them to the DB?
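
To make the second question concrete: 0xfc is "ü" in ISO-8859-1, so the header may have arrived as raw Latin-1 rather than UTF-8. Below is a rough sketch of the kind of filter I have in mind, not anything Exim provides. The names utf8_seq_len() and sanitize_utf8() are made up for illustration, and replacing each offending byte with '?' is just one possible policy.

```c
#include <stddef.h>

/* Hypothetical helper: return the length (1-4) of the well-formed UTF-8
   sequence starting at s, or 0 if the bytes there do not form one.
   n is the number of bytes available and must be at least 1. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned char c = s[0];
    unsigned char lo = 0x80, hi = 0xBF;  /* allowed range for the 2nd byte */
    size_t need, i;

    if (c < 0x80) return 1;                       /* plain ASCII */
    else if (c >= 0xC2 && c <= 0xDF) need = 2;    /* C0/C1 would be overlong */
    else if (c >= 0xE0 && c <= 0xEF) {
        need = 3;
        if (c == 0xE0) lo = 0xA0;                 /* reject overlong forms */
        if (c == 0xED) hi = 0x9F;                 /* reject UTF-16 surrogates */
    } else if (c >= 0xF0 && c <= 0xF4) {
        need = 4;
        if (c == 0xF0) lo = 0x90;                 /* reject overlong forms */
        if (c == 0xF4) hi = 0x8F;                 /* reject > U+10FFFF */
    } else return 0;                              /* 0x80-0xC1, 0xF5-0xFF */

    if (n < need) return 0;                       /* truncated sequence */
    if (s[1] < lo || s[1] > hi) return 0;
    for (i = 2; i < need; i++)
        if (s[i] < 0x80 || s[i] > 0xBF) return 0;
    return need;
}

/* Hypothetical filter: copy len bytes from in to out, replacing every byte
   that is not part of a well-formed UTF-8 sequence with '?'.
   out must have room for len + 1 bytes. Returns the output length. */
size_t sanitize_utf8(const char *in, size_t len, char *out)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t i = 0, o = 0;

    while (i < len) {
        size_t k = utf8_seq_len(s + i, len - i);
        if (k == 0) {
            out[o++] = '?';                       /* drop the bad byte */
            i++;
        } else {
            while (k--) out[o++] = in[i++];       /* copy the valid sequence */
        }
    }
    out[o] = '\0';
    return o;
}
```

With this, the Latin-1 bytes "N\xfcrnberg" would come out as "N?rnberg", while correctly encoded UTF-8 ("N\xc3\xbcrnberg") would pass through unchanged.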

In lookups/pgsql.c:258 I see:
PQsetClientEncoding(pg_conn, "SQL_ASCII");

but I think it's not related.

Thanks, Axel
---
PGP-Key:29E99DD6 ☀ +49 151 2300 9283 ☀ computing @ chaos claudius