Misadventures in Python Packaging: Optional C Extensions
I began an unlikely adventure into Python packaging this week when I made what I thought were some innocuous modifications to the source distribution and setup.py script for the peewee database library. Over the course of a day, the setup.py more than doubled in size and underwent five major revisions as I worked to fix problems arising from differences in users' environments. This was tracked in issue #1676, may it always bear witness to the complexities of Python packaging!
In this post I'll explain what happened, the various things I tried, and how I ended up resolving the issue.
What happened?
Peewee is a Python library that contains three optional C extensions, written in an intermediate language (Cython), which are converted to C and then compiled into shared libraries. Prior to 3.6.0, Peewee did not ship with the C source files generated from the Cython code, and so the build process would only attempt to compile the C extensions if you had already installed Cython. Presumably this was a good arrangement for most people, as I don't recall receiving many reports of issues.
Shortly after releasing 3.6.0, which included the C sources, I received a ticket (#1676) indicating a user was seeing a fatal error from the compiler when attempting to install peewee:
fatal error: sqlite3.h: No such file or directory
My first attempt at fixing this was to use the ctypes.util.find_library() function to detect whether libsqlite3 was available, and to build the SQLite-specific extensions only if it was. After this, I pushed a new release and let everyone know the bug had been fixed.
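A minimal sketch of that kind of check is shown here. The function name is hypothetical and this is not the code that shipped in peewee; it only illustrates what ctypes.util.find_library() can tell you. Note that it reports on the shared library alone and says nothing about whether the sqlite3.h header is installed.

```python
from ctypes.util import find_library


def sqlite_library_available():
    """Return True if a libsqlite3 shared library can be located.

    Hypothetical helper for illustration. It does NOT check for sqlite3.h.
    """
    # find_library() returns a name or path string, or None if nothing is found.
    return find_library('sqlite3') is not None
```

A setup script could call sqlite_library_available() before adding the SQLite-specific extensions to its list of modules to build.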
What do you mean you don't have a compiler?
I received a new report from another user saying that they were seeing a different error because they didn't have a C compiler installed. I dug around in the distutils code searching for a way to detect whether a compiler was installed, but did not find anything that I felt confident would work on Linux, Mac and Windows. I ended up temporarily adding the following code, which checks for a compiler by attempting to compile a small C source file:
    def have_compiler():
        from distutils.ccompiler import new_compiler
        from distutils.errors import CompileError
        from distutils.errors import DistutilsExecError
        import os
        import tempfile
        import warnings

        # Write a trivial C program to a temporary file.
        fd, fname = tempfile.mkstemp('.c', text=True)
        f = os.fdopen(fd, 'w')
        f.write('int main(int argc, char** argv) { return 0; }')
        f.close()

        # Try to compile it; a failure here means we cannot build extensions.
        compiler = new_compiler()
        try:
            compiler.compile([fname])
        except (CompileError, DistutilsExecError):
            warnings.warn('compiler not installed')
            return False
        except Exception as exc:
            warnings.warn('unexpected error encountered while testing if compiler '
                          'available: %s' % exc)
            return False
        else:
            return True
Now there were two layers of checks:
- Does the user have a compiler? If so, we'll build extensions.
- Does the user also have libsqlite3? If so, we'll build the SQLite-specific extensions.
I pushed a new release and once again informed everyone that the bug had been fixed.
If at first you fail, try again
It didn't take long before a new report came up: a user had a C compiler but not the Python headers, and so couldn't compile the extension. I also received a fresh report that the original "missing sqlite3.h" error was still occurring for some users. I had assumed that if Python were installed, the headers would be as well, and similarly that if libsqlite3 were available, then sqlite3.h would be present. Apparently this is not the case on many distributions. Back to the drawing board... How to detect the presence of a header file as well?
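One naive way to answer that question is to look for the header on disk. The sketch below is only an illustration with hypothetical helper names, and it is not the route this post ends up taking. It assumes the headers live in the directories reported by sysconfig.get_paths(), which a compiler is free to ignore or supplement, so a check like this can be wrong in both directions.

```python
import os
import sysconfig


def header_exists(header_name, include_dirs):
    """Return True if header_name is a regular file in any of include_dirs."""
    return any(os.path.isfile(os.path.join(directory, header_name))
               for directory in include_dirs)


def python_headers_present():
    """Look for Python.h in the interpreter's configured include directories."""
    paths = sysconfig.get_paths()
    candidates = [paths.get('include'), paths.get('platinclude')]
    return header_exists('Python.h', [d for d in candidates if d])
```

Because the compiler has its own search path, actually attempting a compile tends to be a more trustworthy test than scanning directories.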
I got inspiration from the simplejson project, which, like Peewee, has an optional C extension. It does a very simple thing: first it tries to build the project with the C extensions, and if that fails, it falls back to a pure-Python installation. Given all the problems I was having, this seemed like the best approach, so I removed the have_compiler() function and just wrapped the setup() call in a conditional.
The first attempt looked something like this:
    def _do_setup(c_extensions, sqlite_extensions):
        if c_extensions:
            ext_modules = [speedups_ext_module]
            if sqlite_extensions:
                ext_modules.extend([sqlite_udf_module, sqlite_ext_module])
        else:
            ext_modules = None

        setup(
            name='peewee',
            # ... other arguments ...
            ext_modules=cythonize(ext_modules))

    if extension_support:
        try:
            _do_setup(extension_support, sqlite_extension_support)
        except (CompileError, DistutilsExecError, LinkError):
            print('#' * 75)
            print('Error compiling C extensions, C extensions will not be built.')
            print('#' * 75)
            _do_setup(False, False)
    else:
        _do_setup(False, False)
When I went to test this on a Docker image that didn't have a compiler installed (I was getting smarter by this point), I found that the installation aborted if the first call to _do_setup() failed. I thought I had been catching the appropriate exceptions, but it turns out that the distutils build_ext command will raise a SystemExit exception upon failure, and so I'd have to catch that if I wanted to try again.
Catching a SystemExit seemed extreme. Referring back to simplejson, I saw that it implemented a custom build_ext command class which raised a custom error class. I had wondered why they did this the first time I looked at the code, and now it made sense: this allowed them to circumvent distutils raising a SystemExit.
The code now looked like this:
    class BuildFailure(Exception): pass

    class _PeeweeBuildExt(build_ext):
        def run(self):
            try:
                build_ext.run(self)
            except DistutilsPlatformError:
                raise BuildFailure()

        def build_extension(self, ext):
            try:
                build_ext.build_extension(self, ext)
            except (CCompilerError, DistutilsExecError, DistutilsPlatformError):
                raise BuildFailure()

    def _do_setup(c_extensions, sqlite_extensions):
        # everything the same except for the inclusion of my custom command class.
        setup(
            # ...
            cmdclass={'build_ext': _PeeweeBuildExt})

    if extension_support:
        try:
            _do_setup(...)
        except BuildFailure:  # NOW we can catch the build failure!
            # ...
I tested this new script and finally it appeared to be working!
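The reason the custom exception helps can be shown with a toy model that does not use distutils at all. It assumes the framework's setup() converts only its own error classes into SystemExit, which is how distutils.core appears to behave. FrameworkError, toy_setup() and install() are hypothetical names invented for this sketch.

```python
class FrameworkError(Exception):
    """Stand-in for the distutils error classes (hypothetical name)."""


class BuildFailure(Exception):
    """Deliberately NOT a FrameworkError, so toy_setup() leaves it alone."""


def toy_setup(build_step):
    """Toy model of a setup() that turns its own errors into SystemExit."""
    try:
        build_step()
    except FrameworkError as exc:
        raise SystemExit('error: %s' % exc)


def failing_build():
    """Simulates a build step that fails with a framework error."""
    raise FrameworkError('compiler not found')


def wrapped_build():
    """Mirrors the custom build_ext: translate the framework error into ours."""
    try:
        failing_build()
    except FrameworkError:
        raise BuildFailure()


def install(build_step):
    """Try building with extensions; fall back to pure Python on BuildFailure."""
    try:
        toy_setup(build_step)
    except BuildFailure:
        return 'pure-python'
    return 'with-extensions'
```

Here install(wrapped_build) falls back cleanly, while install(failing_build) exits the process with SystemExit, which matches the aborted installation seen with the first attempt.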
Everyone's happy but me
At this point we were at release 3.6.4, and people were reporting that the project was installing successfully again. I am very grateful to them for their persistence in uncovering these issues and their patience while I fixed them. The end result, though, isn't very aesthetically satisfying.
In a fit of pique, I decided to make one final addition to the script. I was bothered by the fact that my SQLite3 detection was flawed and wanted a more robust way to differentiate whether a build failure was due to a general inability to compile Python C extensions, or specifically missing the SQLite headers.
So I removed the ctypes.util.find_library() check (which didn't work anyway) and replaced it with a small function that actually attempts to include "sqlite3.h" and link against libsqlite3. The function looks like this (inspired by this StackOverflow answer):
    def _have_sqlite_extension_support():
        import os
        import shutil
        import tempfile
        from distutils.ccompiler import new_compiler
        from distutils.errors import CCompilerError
        from distutils.errors import DistutilsExecError
        from distutils.errors import DistutilsPlatformError
        from distutils.sysconfig import customize_compiler

        libraries = ['sqlite3']
        c_code = ('#include <sqlite3.h>\n\n'
                  'int main(int argc, char **argv) { return 0; }')
        tmp_dir = tempfile.mkdtemp(prefix='tmp_pw_sqlite3_')
        bin_file = os.path.join(tmp_dir, 'test_pw_sqlite3')
        src_file = bin_file + '.c'
        with open(src_file, 'w') as fh:
            fh.write(c_code)

        compiler = new_compiler()
        customize_compiler(compiler)
        success = False
        try:
            # Compile the test program, then link it against libsqlite3.
            compiler.link_executable(
                compiler.compile([src_file], output_dir=tmp_dir),
                bin_file,
                libraries=libraries)
        except CCompilerError:
            print('unable to compile sqlite3 C extensions - missing headers?')
        except DistutilsExecError:
            print('unable to compile sqlite3 C extensions - no c compiler?')
        except DistutilsPlatformError:
            print('unable to compile sqlite3 C extensions - platform error')
        else:
            success = True
        shutil.rmtree(tmp_dir)
        return success
You can view the full setup.py in all its baroque glory on GitHub.
What have we learned?
Here are some assumptions you might want to check if you're packaging a library with optional C extensions:
- Is the package being built on a different interpreter than CPython (pypy, for instance)?
- Is there a C compiler?
- Are the Python headers present?
- Are the shared libraries you're linking against installed?
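The first item on that list can be checked with the standard library alone. The sketch below uses a hypothetical function name, and the policy it encodes (skip the C extensions on anything other than CPython) is an assumption made for illustration rather than a rule every project should follow.

```python
import platform


def should_attempt_c_extensions(implementation=None):
    """Return True only when running on CPython.

    The implementation argument exists purely to make the function testable;
    by default the running interpreter is inspected.
    """
    if implementation is None:
        # Returns a string such as 'CPython' or 'PyPy'.
        implementation = platform.python_implementation()
    return implementation == 'CPython'
```

The remaining items are harder to answer by inspection, which is why attempting the build and falling back on failure ends up being the more dependable approach.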