Compiling SQLite for use with Python Applications
The upcoming SQLite 3.25.0 release adds support for one of my favorite SQL language features: window functions. Over the past few years SQLite has released many changes to improve the efficiency of the query planner and virtual machine, plus many extension modules which can provide additional functionality. For example:
- Window function support,
which are available in
trunkand will be included in the next release (3.25.0).
- Postgresql-style UPSERT support, which was released on 2018-06-04 (3.24.0).
- FTS5 full-text search extension, which improves on the previous full-text search modules (3.9.0).
- json1 extension, which brings support for JSON to SQLite (3.9.0).
- lsm1 extension and virtual table, while not officially released, is included in the source.
- Eponymous-only virtual tables, or table-valued functions (3.9.0).
- csv virtual table for browsing CSVs directly with SQLite (3.14.0).
Unfortunately, many distributions and operating systems include very old versions of SQLite. When they do include a newer release, typically many of these additional modules are not compiled and are therefore unavailable for use in your Python applications.
In this post I'll walk through obtaining the latest version of SQLite's source code and how to compile it so it includes these exciting features. We'll be doing all of this with the goal of being able to use these features in our Python applications, so we'll also be discussing how to integrate Python with our custom SQLite library.
Determining which features we want to enable
Some features, like window functions are enabled by default. Others, like full-text search and JSON, need to be enabled at compile-time. You can view the full list of SQLite compile-time options but for simplicity I'll provide a list that should be a good starting point:
- FTS3, FTS4 and the latest FTS5 full-text search extensions.
- JSON1 extension to allow JSON support.
- Keep database statistics to help the query-planner choose better plans.
- Include an implementation of the
SOUNDEXfunction, which is good for fuzzy or phonetic text search.
- Always use memory for the storage of temporary data.
These options give a good mix of features and performance.
Getting the latest SQLite
To get started with this feature, we'll need to have the latest version of SQLite installed. We can pull down the source code and compile it directly from SQLite's version control:
$ wget https://www.sqlite.org/src/tarball/sqlite.tar.gz $ tar xzf sqlite.tar.gz $ cd sqlite/
At this point we now have a directory containing the most up-to-date SQLite source code. Now we should decide how we want to compile this. I'll present two options:
- System-wide install, replacing the system SQLite. Python's
sqlite3module will then use this new
libsqlite3that we compile and we're all set.
- Installation into a virtualenv, which is accomplished by compiling a
sqlite3driver and statically-linking the latest SQLite code.
If you're not sure, go with option two.
Installing SQLite system-wide will allow all your Python projects to utilize
the latest SQLite, since the standard library
sqlite3 module typically links
against the system SQLite. Installing the latest SQLite system-wide is pretty
straightforward. We'll be using the compile-time options described above.
$ export CFLAGS="-DSQLITE_ENABLE_FTS3 \ -DSQLITE_ENABLE_FTS3_PARENTHESIS \ -DSQLITE_ENABLE_FTS4 \ -DSQLITE_ENABLE_FTS5 \ -DSQLITE_ENABLE_JSON1 \ -DSQLITE_ENABLE_LOAD_EXTENSION \ -DSQLITE_ENABLE_RTREE \ -DSQLITE_ENABLE_STAT4 \ -DSQLITE_ENABLE_UPDATE_DELETE_LIMIT \ -DSQLITE_SOUNDEX \ -DSQLITE_TEMP_STORE=3 \ -DSQLITE_USE_URI \ -O2 \ -fPIC" $ export PREFIX="/usr/local" $ LIBS="-lm" ./configure --disable-tcl --enable-shared --enable-tempstore=always --prefix="$PREFIX" $ make $ sudo make install # Install system-wide.
We can verify that our Python
sqlite3 driver is using the new version by
running the following:
>>> import sqlite3 >>> sqlite3.sqlite_version '3.25.0'
Standalone install to virtual environment
Rather than overwrite our system
libsqlite3, we can create a virtual
environment with a completely self-contained, statically-linked Python SQLite
driver. We'll use pysqlite3, which is
essentially the standard library
sqlite3 packaged as a standalone library.
To build a standalone
pysqlite3, we'll need to create an amalgamation of
the SQLite source code. This simply means we need to produce a single C source
file that contains all the relevant SQLite code, and it's associated header
file. The SQLite
Makefile provides a target for just that, so it's quite easy
$ ./configure # Run the configure script in the SQLite source directory. $ make sqlite3.c # Generate source amalgamation and header file.
Now we'll leave that for a moment and go set-up our virtual environment for our Python project. We'll use Python 3.6 and create a virtualenv named "sqlite325":
$ virtualenv -p /usr/bin/python3.6 sqlite325 # Create virtualenv for testing. $ cd sqlite325 && source bin/activate # Activate the virtualenv.
We have our virtualenv set up, so we'll now obtain a copy of the
source code so we can build the
pysqlite3 module and compile it with the latest SQLite source code.
The following steps should be done with the virtualenv activated:
$ git clone https://github.com/coleifer/pysqlite3 $ cd pysqlite3 $ mkdir amalgamation # Directory to hold SQLite3 source amalgamation. $ cp /path/to/sqlite/sqlite3.[ch] amalgamation/ # Copy source and header into dir. $ python setup.py build_amalgamation # Compile self-contained C extension. $ python setup.py install # Install into virtualenv.
Congratulations! You now have a virtualenv with a self-contained
We can test the installation and verify that our SQLite driver is using the
latest version of SQLite:
>>> from pysqlite3 import dbapi2 as sqlite3 # Just like "import sqlite3". >>> sqlite3.sqlite_version '3.25.0'
Quick note on LSM1
If you're curious about the LSM1 extension and how to use that in your project,
check out my post LSM Key-Value Storage in SQLite3,
which contains additional, brief, instructions for compiling the
extension as a loadable module and then using it from Python.
Thanks for reading
I hope you found this post helpful. Please let me know if you have any questions about compiling SQLite or run into problems with the instructions.
- List of all my posts about SQLite
- Going fast with SQLite and Python, which lists numerous techniques for effectively working with SQLite in your Python applications.
- Instructions for compiling BerkeleyDB with SQLite, for use with Python. SQLite backed by BerkeleyDB for the storage layer.
- SQLite table-valued functions with Python, describes how to write Python functions to generate tabular data on-the-fly, which can then be queried with SQLite.
- Using the SQLite JSON1 and FTS5 extensions with Python. Slightly outdated, but plenty of code demonstrating how to use these extension modules with Python.