[exim] "PGSQL: query failed: " randomly/mysteriously without…

Top Page
Delete this message
Reply to this message
Author: Alexander Neumann
Date:  
To: exim-users
New-Topics: [exim] exim randomly closes database connection (was: "PGSQL: query failed: "...)
Subject: [exim] "PGSQL: query failed: " randomly/mysteriously without any (real) error message
Hi,

I added support for virtual domains stored in a postgresql database to the
basic Debian exim4 config, with support for sa-exim. The sa-exim reject and
tempreject scores and a global variable indicating if spamassassin scanning
should be performed are looked up in the database.

Exim is version 4.50, postgresql version 7.4.7.

General setup ("|" indicates lines from the configuration, the SELECTs
should either extract the values for the mailbox or for a catchall):

| ALIAS_MAILBOX_TABLE = exim_virtual_addresses


[...]

| # tempreject score value for or catchall
| SELECT_TEMPREJECT_SCORE = SELECT sa_tempreject_score FROM ALIAS_MAILBOX_TABLE WHERE domain = '${quote_pgsql:$domain}' AND (mailbox = '${quote_pgsql:$local_part}' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
|
| # reject score value for given recipient or catchall
| SELECT_REJECT_SCORE = SELECT sa_reject_score FROM ALIAS_MAILBOX_TABLE WHERE domain = '${quote_pgsql:$domain}' AND (mailbox = '${quote_pgsql:$local_part}' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1


[...]

| acl_spamcheck_deny:
|   accept
|       condition = ${tr {${lookup pgsql{SELECT_SPAMCHECK}}}{tf}{10}}



For using these values in sa-exim, some special headers are added in the
RCPT acl:

|  # otherwise (if spamcheck enabled) include scores in header
|  warn
|    message = X-SA-Tempreject-Score: ${lookup pgsql{SELECT_TEMPREJECT_SCORE}}
|    log_message = tempreject_score for $local_part@$domain is ${lookup pgsql{SELECT_TEMPREJECT_SCORE}}
|    acl = acl_spamcheck_deny
| 
|  warn
|    message = X-SA-Reject-Score: ${lookup pgsql{SELECT_REJECT_SCORE}}
|    log_message = reject_score for $local_part@$domain is ${lookup pgsql{SELECT_REJECT_SCORE}}
|    acl = acl_spamcheck_deny



This works in most cases, but from time to time, (perhaps) depending on the
remote Mailserver, some of these queries fail mysteriously. An example
(from mainlog):

=================<snip>========================
2005-09-01 18:19:01 H=www.xxxxxxxxx.de [XXX.XXX.XXX.XXX]:40692 I=[XX.XX.XXX.XX]:25 Warning: tempreject_score for xxxxxxxxx@??? is 2
2005-09-01 18:19:01 failed to expand ACL message "X-SA-Reject-Score: ${lookup pgsql{SELECT sa_reject_score FROM exim_virtual_addresses WHERE domain = '${quote_pgsql:$domain}' AND (mailbox = '${quote_pgsql:$local_part}' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1}}": lookup of "SELECT sa_reject_score FROM exim_virtual_addresses WHERE domain = 'xxxx.org' AND (mailbox = 'xxxxxxxxx' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1" gave DEFER: PGSQL: query failed:

2005-09-01 18:19:01 failed to expand ACL message "reject_score for $local_part@$domain is ${lookup pgsql{SELECT sa_reject_score FROM exim_virtual_addresses WHERE domain = '${quote_pgsql:$domain}' AND (mailbox = '${quote_pgsql:$local_part}' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1}}": lookup of "SELECT sa_reject_score FROM exim_virtual_addresses WHERE domain = 'xxxx.org' AND (mailbox = 'xxxxxxxxt' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1" gave DEFER: PGSQL: query failed:

2005-09-01 18:19:01 H=www.xxxxxxxxx.de [XXX.XXX.XXX.XXX]:40692 I=[XX.XX.XXX.XX]:25 F=<xxxxxxx@www.xxxxxxxxx.de> temporarily rejected RCPT <xxxxxxxxx@???>: PGSQL: query failed:
=================<snip>========================


When executing this query manually or sniffing the SMTP connection and doing
a replay (eithen using -bh[c] or directly via SMTP), everything works
exactly as expected. I discovered two odd things (odd at least to me):
* the space character at the end of the log line
* the following empty line without timestamp

The relevant lines in the postgresql server logfile (with statements):
=================<snip>========================
2005-09-01 18:18:56 [7142] LOG: statement: SELECT home, mail, system_user, uid, gid FROM dovecot_mailboxes WHERE mailbox = 'codebreaker' AND domain = 'bumpern.de'
2005-09-01 18:19:01 [5462] LOG: connection received: host=127.0.0.1 port=41446
2005-09-01 18:19:01 [5462] LOG: connection authorized: user=exim database=pfauadmin
2005-09-01 18:19:01 [5462] LOG: statement: SELECT domain || ' : ' FROM exim_virtual_domains
2005-09-01 18:19:01 [5462] LOG: statement: SELECT sa_enable FROM exim_virtual_addresses WHERE domain = 'p23q.org' AND (mailbox = 'cyberport' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
2005-09-01 18:19:01 [5462] LOG: statement: SELECT sa_tempreject_score FROM exim_virtual_addresses WHERE domain = 'p23q.org' AND (mailbox = 'cyberport' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
2005-09-01 18:19:01 [5462] LOG: unexpected EOF on client connection
=================<snip>========================


I ran exim4 -d+all-memory, the relevant lines are:

=================<snip>========================
18:19:01  1700 database lookup required for SELECT sa_reject_score FROM exim_virtual_addresses WHERE domain = 'xxxx.org' AND (mailbox = 'xxxxxxxxx' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
18:19:01  1700 PGSQL query: SELECT sa_reject_score FROM exim_virtual_addresses WHERE domain = 'xxxx.org' AND (mailbox = 'xxxxxxxxx' OR mailbox = '*') ORDER BY mailbox DESC LIMIT 1
18:19:01  1700 PGSQL using cached connection for localhost/pfauadmin/exim
18:19:01  1700 PGSQL: query failed:                                                                                                                                      18:19:01  1700 
18:19:01  1700 lookup deferred: PGSQL: query failed: 
18:19:01  1700 
=================<snip>========================




Any idea, what causes these fails and how to resolve or workaround this?


Thanks for reading this long mail! :)

- Alexander