Re: [exim-dev] Embedding Python

Author: Phil Pennock
Date:  
To: jgh
CC: exim-dev
Subject: Re: [exim-dev] Embedding Python
On 2013-05-02 at 12:01 +0100, Jeremy Harris wrote:
> So presumably if the exim process modifies its own environment in that
> way before starting Python, we get what we need?


Messing with the environment seems bad: it interferes with the
sysadmin's ability to set those variables if needed.

Todd should just be able to use PyList_Insert() to modify sys.path,
after importing sys. Note that this has to happen after the interpreter
is loaded, since Py_GetPath() is used to provide the default value.
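
As a rough illustration of the effect, here is the Python-level
equivalent of that PyList_Insert() call.  This is a sketch only: the
helper name and the directory are made up for the example, and the real
change would be made from C once the interpreter has been initialised.

```python
import sys

def prepend_search_dir(directory, search_path=None):
    """Put `directory` at the front of the module search path.

    Hypothetical helper: mirrors inserting at index 0 of sys.path,
    which is what PyList_Insert() would do from the C side.
    """
    if search_path is None:
        search_path = sys.path  # only meaningful once the interpreter is up
    if directory not in search_path:
        search_path.insert(0, directory)
    return search_path

# Made-up directory for illustration; the process environment is untouched.
prepend_search_dir("/usr/local/lib/exim/python")
```

Because only the in-process list is changed, whatever the sysadmin has
set in the environment stays as it was.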

The main things to figure out and document are:

 1. Does site initialisation get loaded?  If not, does another fixed
    module name get loaded instead, letting administrators symlink it to
    pull in any site customisations of sys.path done therein?  This is
    `site.py` inside the main Python system modules directory, and it
    affects, among other things, whether or not distutils-provided
    packages can be used.


 2. Is threading initialised?  If not, threading calls will all succeed
    but the other threads will never actually be scheduled.  It's
    GIL-based, and OS-level threads are unaffected, since effectively
    it's just a single real thread.  But given that Exim is
    traditionally single-threaded, even just explaining this and
    helping people debug it might cause issues.


    I suspect we should explicitly document that threading is not
    initialised, because of Exim's forking model, and that folks who
    want to use threaded Python should architect a separate long-lived
    daemon that runs entirely outside of Exim, and then perhaps use the
    Python-in-Exim support to talk to that daemon.


 3. Signal handler registration?  Yes or no?  Probably not.  Still needs
    to be documented as a limitation.


 4. What do we do about character sets, and what data is passed to the
    Python code from Exim strings?  How does this change with Python 3
    versus Python 2?  Py3 uses Unicode extensively and a lot of existing
    interfaces pass around byte arrays instead.


 5. What happens with the lifetime and scope of globals?  Is there any
    persistence within a process between calls, or is there a fresh
    global dict at each invocation, snapshotted from any initialisation
    atstart-style function?


 6. What happens with garbage collection?  Do we GC at every return to
    top-level Exim config?  Do we never GC?  This affects when, or even
    whether, return values have to be copied to Exim's storage pools,
    and whether subsequent calls to Python might invalidate existing
    returned string pointers if we don't copy (and perhaps GC).
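
To make the character-set question in item 4 concrete, here is one
possible policy for the Exim/Python boundary under Python 3, offered as
a sketch and not as a decision.  Both function names are hypothetical,
and the UTF-8-then-Latin-1 fallback is an assumption chosen only because
Latin-1 can decode any byte sequence.

```python
def exim_bytes_to_text(raw):
    """Decode bytes handed over from Exim (hypothetical boundary function).

    Assumption: try UTF-8 first, then fall back to Latin-1, which maps
    every byte to a code point and therefore never fails.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")

def text_to_exim_bytes(text):
    """Encode a Python 3 str for return to Exim, always as UTF-8."""
    return text.encode("utf-8")
```

Whichever policy is picked, the point is that Python 3 code sees str
only after an explicit decode, so the choice has to be documented.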

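The two options in item 5 can be compared in plain Python.  This is a
sketch under assumptions: run_snippet() and the snippet itself are
invented for the example, and exec() stands in for however Exim would
end up invoking user code.

```python
def run_snippet(source, namespace):
    """Execute `source` with `namespace` as its globals; return `result`."""
    exec(source, namespace)
    return namespace.get("result")

# Hypothetical user code: bump a counter kept in the global namespace.
SNIPPET = "counter = globals().get('counter', 0) + 1\nresult = counter"

# Option A: one dict kept for the life of the process, so state persists.
persistent = {}
first = run_snippet(SNIPPET, persistent)    # counter becomes 1
second = run_snippet(SNIPPET, persistent)   # counter becomes 2

# Option B: a fresh copy of a start-up snapshot per call, so state resets.
snapshot = {"counter": 10}
third = run_snippet(SNIPPET, dict(snapshot))    # starts again from 10
fourth = run_snippet(SNIPPET, dict(snapshot))   # starts again from 10
```

With option A the second call sees what the first one left behind; with
option B every call starts from the same snapshot.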

Those are the issues that come immediately to mind. There may be
others.

-Phil