Python Bindings

The main goals of the things listed on this page are:

  1. Providing a way to call C/C++ code from Python (Python "Bindings" really only applies to this, which is the focus of this page)
  2. Speeding up Python code

Example Binding Projects

Ctypes

The ctypes library is the only option that doesn't require compiling against the Python API, so it's Python-version-agnostic. However, it can only call pure C functions, and I believe that the overhead is fairly high compared to compiled extensions.
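
For example, a minimal sketch of calling a plain C function through ctypes (this assumes a Linux-like system where the C math library can be located by name):

import ctypes
import ctypes.util

# Load the shared C math library (the lookup assumes a typical Linux system).
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare argument/return types so ctypes converts the doubles correctly.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # prints 1.0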

Ctypes also works with PyPy. As of PyPy 1.6, the ctypes implementation was said to be 10 times as fast as in CPython.

There is a method briefly described in this StackOverflow question for using a C++ class with ctypes, but it seems like a lot of work, and there are unanswered questions like how to delete the object when done (some ideas in another question). It seems like this could be done automatically, but I haven't found anything that does (except this mailing list post, and I'm still not sure what that does). Is there a particular reason for this, or has no one simply done it?
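
Below is a rough sketch of the Python side of that pattern, assuming the C++ class has already been wrapped in extern "C" functions compiled into a shared library (Foo_new/Foo_bar/Foo_delete and libfoo.so are made-up names for illustration):

import ctypes

lib = ctypes.CDLL("./libfoo.so")  # hypothetical library wrapping the C++ class

# The extern "C" wrappers pass the C++ object around as an opaque pointer.
lib.Foo_new.restype = ctypes.c_void_p
lib.Foo_bar.argtypes = [ctypes.c_void_p]
lib.Foo_delete.argtypes = [ctypes.c_void_p]

obj = lib.Foo_new()
try:
    lib.Foo_bar(obj)
finally:
    lib.Foo_delete(obj)  # one answer to the "how do you delete it" question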

Parsing C++: pygccxml, C++ Header Parser

CFFI

http://cffi.readthedocs.org/en/latest/index.html

Recently (2012-06) released: http://morepypy.blogspot.com/2012/06/release-01-of-cffi.html

Intended to work with both CPython and PyPy, it appears to be something like an easier-to-use version of ctypes.
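
As a rough illustration of the ctypes-like (ABI-level) usage, something like this should work on a POSIX system:

from cffi import FFI

ffi = FFI()
# Declarations are written as plain C, copied (or trimmed) from the header.
ffi.cdef("size_t strlen(const char *s);")

C = ffi.dlopen(None)  # None loads the standard C library / global namespace
print(C.strlen(b"hello"))  # prints 5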

Distribution

When distributing modules with CFFI (pre-building the extensions, as indicated in the documentation), the arguments to verify must be exactly the same on the machine where you built the extension and on the machine where you run it; otherwise it will try to rebuild.

If the arguments won't be exactly the same, one alternative is to set the modulename manually, but then it won't automatically rebuild during development.
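
As a sketch of that idea using the old (pre-1.0) verify() API (the header, library, and module names below are made up for illustration):

from cffi import FFI

ffi = FFI()
ffi.cdef("double my_func(double x);")

# Pinning modulename keeps the generated extension's name stable, so a
# pre-built module can be shipped and picked up instead of being rebuilt.
lib = ffi.verify('#include "my_header.h"',
                 libraries=["my_lib"],
                 modulename="_my_cffi_module")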

Some more info: https://bitbucket.org/cffi/cffi/issue/53/use-extension-modules-without-building

This article explains some more of the trickiness around distributing CFFI projects, using techniques from the cryptography project: https://caremad.io/2014/11/distributing-a-cffi-project/

Python C/C++ Extensions

These are the options that I know of at the moment (I haven't tried many of them):

  • Writing extensions directly using Python's C API
  • pybind11 - new-ish binding generator using C++11; similar to Boost.Python but smaller
    • Header-only library; you only need to link against Python
    • Optional CMake functions are included for compiling modules
    • https://github.com/RosettaCommons/binder - binding generator for pybind11 using Clang LibTooling
  • cppyy - automatic C++ bindings using run-time reflection (see also the PyPy section below)
    • Works with CPython and PyPy
    • Mentioned on the CFFI website as a recommendation for C++ code
    • It's not clear (without trying it) if generated bindings would require the large cppyy backend packages
  • SWIG - long-established binding generator for C/C++
    • This is the only one (that I know of) that is able to generate bindings for languages besides Python; Java might potentially be useful
    • So far I've had the most luck with SWIG for dealing with complicated legacy code, primarily because it's easy to only parse the bits you need and ignore the rest
  • Interrogate - generates Python bindings for C++ code; looks simple to use
  • robin - another binding generator for C++ code
  • Boost.Python
    • Possibly slower than other options (ref 1, ref 2)
    • PySide started with Boost.Python but switched to their own ("Shiboken") because the libraries were too big (however, Qt is a very big and complicated project)
  • Py++ (generates Boost.Python code)
    • Uses GCC-XML to read C++ headers
    • Also has some facility for using ctypes; it's not entirely clear right now. See documentation and message.
  • https://pypi.org/project/PyBindGen/ - supposed to generate "small and fast" modules; I'm pretty sure it uses the C API directly instead of requiring Boost.Python
    • Doesn't require GCC-XML, but can use it to generate binding code from C++ headers
    • Might be able to use it to generate C code which is then compiled against Python 2.3
  • SIP (used by PyQt)
    • From their basic example, it looks like a lot of manual work
    • TGEA/TGB, GCC-XML, and Python - the author wrote code to read headers using GCC-XML and then used that to generate the SIP bindings; he indicated that SIP generated good, efficient bindings that were still able to do things like override virtual methods from Python
  • Shiboken (used by PySide bindings for Qt)
    • Recommends the use of CMake in the FAQ
    • Was/is fairly specific to Qt4-based C++ code; last I saw they didn't recommend it for generic bindings (update 2011-05-11: I don't see this warning anymore; the FAQ says you can wrap non-Qt libraries)
    • Appears to have some powerful stuff (supporting the complexities of Qt), and possibly automatically generates some amount of binding code (in addition to allowing customizing with "typesystem" XML files)
    • There is documentation, but no clear place to get started
    • There are a lot of broken links on the wiki pages
  • PyCXX - looks like a nice way to write extensions in C++
  • PyD - creating bindings for the D programming language
    • Uses CppHeaderParser, a pure-Python module
    • Generates C wrappers for C++ classes, then ctypes code to use the C code 1)
    • I'm not sure how to use it to generate C wrappers (as of 2013-01-11)

Hopefully in the future there will be some binding generators that use LLVM/Clang to parse C++.

Python-to-C (Cython)

With Pyrex/Cython, you write Python-like code (with optional C data types), which is translated into C code and compiled as a Python extension. They can also wrap C/C++ code, and it's possible to write C/C++ functions that are callable from other C/C++ code (e.g. callbacks). They look like they would be extremely useful if you needed to optimize a slow Python program (e.g. profile the program and rewrite the worst 5% in Pyrex/Cython).
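
For instance, a small illustrative Cython sketch (the function and file names are made up); the cdef declarations let the loop compile down to plain C:

# integrate.pyx -- illustrative only; Cython compiles this to a C extension
def integrate_x_squared(double a, double b, int n):
    """Simple Riemann sum of x**2; the cdef types keep the loop in C."""
    cdef double dx = (b - a) / n
    cdef double total = 0.0
    cdef int i
    for i in range(n):
        total += (a + i * dx) * (a + i * dx) * dx
    return total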

I've only used Cython. See Cython for more information.

PyPy Extensions

PyPy natively supports CFFI (currently seems to be the preferred method; see above) and ctypes; both are supposed to be very fast on PyPy.

PyPy supports a subset of the Python C API (and this will probably get better, and of course they'd accept patches of anything missing). They call this "cpyext".

I haven't seen whether or not Cython works with PyPy, but it seems like it could theoretically work, since it just generates C API code.

There is also work being done on wrapping C++ using reflection (e.g. with the Clang compiler), called "cppyy". Last time I checked this was still in early stages: http://morepypy.blogspot.com/2011/08/wrapping-c-libraries-with-reflection.html (update: http://morepypy.blogspot.com/2012/06/architecture-of-cppyy.html). They intend to use Clang in the future.

C Extension Problems

Interrupting

If a C function called from Python runs for a long time, the application won't respond to Ctrl+C. This is because the interrupt handler just sets a flag, which is checked by the Python interpreter, allowing it to throw a KeyboardInterrupt exception. Since the C function never re-enters the interpreter, the application doesn't respond.

This is a general problem; I don't think there's any one solution for all cases.

How Existing Extension Modules Handle Interrupts

It would be good to check how this is handled across various extension modules.

It looks like PyQt4 does not do anything to handle it 2).

Sage (uses Cython) has some special functions for this (appears to use some setjmp trickery): http://www.sagemath.org/doc/developer/coding_in_cython.html#interrupt-and-signal-handling

Redesign C API

One option is to make sure that C functions always return quickly. If there's a C main loop, for example, it could be called one iteration at a time inside a Python loop. For some types of functions, a timeout parameter could be helpful.
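
A sketch of the idea (wrapped_lib, step, and finished are hypothetical names for a library redesigned this way):

# Drive the C main loop from Python one iteration at a time, so control
# returns to the interpreter (and Ctrl+C works) between calls.
while not wrapped_lib.finished():
    wrapped_lib.step(timeout_ms=100)  # an optional timeout keeps each call short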

I believe there are also some functions in the Python C API (PyErr_CheckSignals?) that can check for signals (probably need to acquire the GIL first, if it was released). However, this requires making modifications to the library being wrapped, which in many cases is probably not desirable.

Ignore the Problem, Kill It Manually

If you're okay with Ctrl+C not working, you can just kill the process (SIGTERM). On Linux, this can usually be done by backgrounding it (Ctrl+Z) and then killing it (kill %1 if it's job #1).

Also, Ctrl+\ (Ctrl+Backslash) will send a SIGQUIT (which may cause the process to create a core dump).

Set the Signal Handler

You can set the signal handler to the special value signal.SIG_DFL, which causes the program to terminate instead of trying to raise a KeyboardInterrupt:

signal.signal(signal.SIGINT, signal.SIG_DFL)

I'm not sure if there are any implications to be considered when doing this, but it seems like it shouldn't be any different from sending a SIGTERM (i.e. killing the process). Presumably some things might not be cleaned up properly (e.g. finally blocks).

You could, for example, do this with a context manager, if you want to reset the handler after the call:

from contextlib import contextmanager
import signal
 
 
@contextmanager
def allow_c_interrupt():
    """
    Temporarily override the SIGINT handler (which would normally raise
    a KeyboardInterrupt exception) to terminate the application immediately.
    """
    orig_handler = signal.getsignal(signal.SIGINT)
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    try:
        yield
    finally:
        # Restore the original handler even if the wrapped call raises.
        signal.signal(signal.SIGINT, orig_handler)


with allow_c_interrupt():
    my_c_function()  # placeholder: the long-running C extension call


Run in a Separate Thread/Process

Running in a separate thread/process can allow you to put a timeout on a C function, though there are still some wrinkles.

There's not a good way to kill a thread 3), but you might want to do something while waiting, such as make a log entry, or give some message to the user; or you could kill the entire application.

One easy way to run a function like this in a separate thread/process (note that running in a process requires that all arguments/return values be pickle-able) is to use the concurrent.futures module (added in Python 3.2, backported to Python 2.5+). For example:

from concurrent import futures
 
with futures.ThreadPoolExecutor(max_workers=1) as e:
    future = e.submit(my_c_function)
    try:
        future.result(timeout=5)
    except futures.TimeoutError:
        print("Timed out; the C call is still running in the worker thread")

If you want to terminate your application after a timeout, the only way I've found so far is to have it send a SIGTERM to itself (sys.exit will just hang). The following example kills itself if it times out, or on Ctrl+C:

import os
import signal
from concurrent import futures

with futures.ThreadPoolExecutor(max_workers=1) as e:
    future = e.submit(rtdb_init)
    try:
        future.result(timeout=5)
    except futures.TimeoutError:
        print("Timed out")
        os.kill(os.getpid(), signal.SIGTERM)
    except KeyboardInterrupt:
        print("Interrupted")
        os.kill(os.getpid(), signal.SIGTERM)

Note that if you have multiple processes (subprocess / multiprocessing), then on Linux you should use os.killpg (instead of os.kill), which will kill the entire process group.
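
For example (POSIX-only):

import os
import signal

# Kill this process's whole process group (the process plus any children).
os.killpg(os.getpgrp(), signal.SIGTERM)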

Other Speedup Methods

Other methods of speeding up Python code:

  • Psyco - sort of a "just-in-time" compiler for Python; only works on the i386 architecture (PyPy pretty much replaces this)
  • Stackless Python
  • PyPy - now also has a JIT compiler
  • Nuitka - translates Python to C++ and executes using libpython; does some optimizations
    • Works on RHEL5 using Python 2.7 and the devtoolset software collection (Red Hat Developer Toolset)