Compiling SQLite for use with Python Applications
The upcoming SQLite 3.25.0 release adds support for one of my favorite SQL language features: window functions. Over the past few years SQLite has released many changes to improve the efficiency of the query planner and virtual machine, plus many extension modules which can provide additional functionality. For example:
- Window function support,
which are available in
trunk
and will be included in the next release (3.25.0). - Postgresql-style UPSERT support, which was released on 2018-06-04 (3.24.0).
- FTS5 full-text search extension, which improves on the previous full-text search modules (3.9.0).
- json1 extension, which brings support for JSON to SQLite (3.9.0).
- lsm1 extension and virtual table, while not officially released, is included in the source.
- Eponymous-only virtual tables, or table-valued functions (3.9.0).
- csv virtual table for browsing CSVs directly with SQLite (3.14.0).
Unfortunately, many distributions and operating systems include very old versions of SQLite. When they do include a newer release, typically many of these additional modules are not compiled and are therefore unavailable for use in your Python applications.
In this post I'll walk through obtaining the latest version of SQLite's source code and how to compile it so it includes these exciting features. We'll be doing all of this with the goal of being able to use these features in our Python applications, so we'll also be discussing how to integrate Python with our custom SQLite library.
Determining which features we want to enable
Some features, like window functions are enabled by default. Others, like full-text search and JSON, need to be enabled at compile-time. You can view the full list of SQLite compile-time options but for simplicity I'll provide a list that should be a good starting point:
- FTS3, FTS4 and the latest FTS5 full-text search extensions.
- JSON1 extension to allow JSON support.
- Keep database statistics to help the query-planner choose better plans.
- Allow
LIMIT
clause onUPDATE
andDELETE
queries. - Include an implementation of the
SOUNDEX
function, which is good for fuzzy or phonetic text search. - Always use memory for the storage of temporary data.
These options give a good mix of features and performance.
Getting the latest SQLite
To get started with this feature, we'll need to have the latest version of SQLite installed. We can pull down the source code and compile it directly from SQLite's version control:
$ wget https://www.sqlite.org/src/tarball/sqlite.tar.gz
$ tar xzf sqlite.tar.gz
$ cd sqlite/
At this point we now have a directory containing the most up-to-date SQLite source code. Now we should decide how we want to compile this. I'll present two options:
- System-wide install, replacing the system SQLite. Python's
sqlite3
module will then use this newlibsqlite3
that we compile and we're all set. - Installation into a virtualenv, which is accomplished by compiling a
standalone Python
sqlite3
driver and statically-linking the latest SQLite code.
If you're not sure, go with option two.
System-wide install
Installing SQLite system-wide will allow all your Python projects to utilize
the latest SQLite, since the standard library sqlite3
module typically links
against the system SQLite. Installing the latest SQLite system-wide is pretty
straightforward. We'll be using the compile-time options described above.
$ export CFLAGS="-DSQLITE_ENABLE_FTS3 \
-DSQLITE_ENABLE_FTS3_PARENTHESIS \
-DSQLITE_ENABLE_FTS4 \
-DSQLITE_ENABLE_FTS5 \
-DSQLITE_ENABLE_JSON1 \
-DSQLITE_ENABLE_LOAD_EXTENSION \
-DSQLITE_ENABLE_RTREE \
-DSQLITE_ENABLE_STAT4 \
-DSQLITE_ENABLE_UPDATE_DELETE_LIMIT \
-DSQLITE_SOUNDEX \
-DSQLITE_TEMP_STORE=3 \
-DSQLITE_USE_URI \
-O2 \
-fPIC"
$ export PREFIX="/usr/local"
$ LIBS="-lm" ./configure --disable-tcl --enable-shared --enable-tempstore=always --prefix="$PREFIX"
$ make
$ sudo make install # Install system-wide.
We can verify that our Python sqlite3
driver is using the new version by
running the following:
>>> import sqlite3
>>> sqlite3.sqlite_version
'3.25.0'
Standalone install to virtual environment
Rather than overwrite our system libsqlite3
, we can create a virtual
environment with a completely self-contained, statically-linked Python SQLite
driver. We'll use pysqlite3, which is
essentially the standard library sqlite3
packaged as a standalone library.
To build a standalone pysqlite3
, we'll need to create an amalgamation of
the SQLite source code. This simply means we need to produce a single C source
file that contains all the relevant SQLite code, and it's associated header
file. The SQLite Makefile
provides a target for just that, so it's quite easy
to do:
$ ./configure # Run the configure script in the SQLite source directory.
$ make sqlite3.c # Generate source amalgamation and header file.
Now we'll leave that for a moment and go set-up our virtual environment for our Python project. We'll use Python 3.6 and create a virtualenv named "sqlite325":
$ virtualenv -p /usr/bin/python3.6 sqlite325 # Create virtualenv for testing.
$ cd sqlite325 && source bin/activate # Activate the virtualenv.
We have our virtualenv set up, so we'll now obtain a copy of the pysqlite3
source code so we can build the pysqlite3
module and compile it with the latest SQLite source code.
The following steps should be done with the virtualenv activated:
$ git clone https://github.com/coleifer/pysqlite3
$ cd pysqlite3
$ cp /path/to/sqlite/sqlite3.[ch] . # Copy source and header into dir.
$ python setup.py build_static # Compile self-contained C extension.
$ python setup.py install # Install into virtualenv.
Congratulations! You now have a virtualenv with a self-contained pysqlite3
.
We can test the installation and verify that our SQLite driver is using the
latest version of SQLite:
>>> from pysqlite3 import dbapi2 as sqlite3 # Just like "import sqlite3".
>>> sqlite3.sqlite_version
'3.25.0'
Quick note on LSM1
If you're curious about the LSM1 extension and how to use that in your project,
check out my post LSM Key-Value Storage in SQLite3,
which contains additional, brief, instructions for compiling the lsm1
extension as a loadable module and then using it from Python.
Thanks for reading
I hope you found this post helpful. Please let me know if you have any questions about compiling SQLite or run into problems with the instructions.
Further reading:
- List of all my posts about SQLite
- Going fast with SQLite and Python, which lists numerous techniques for effectively working with SQLite in your Python applications.
- Instructions for compiling BerkeleyDB with SQLite, for use with Python. SQLite backed by BerkeleyDB for the storage layer.
- SQLite table-valued functions with Python, describes how to write Python functions to generate tabular data on-the-fly, which can then be queried with SQLite.
- Using the SQLite JSON1 and FTS5 extensions with Python. Slightly outdated, but plenty of code demonstrating how to use these extension modules with Python.
Comments (3)
Charles | jul 15 2018, at 12:34pm
You need python 3.6.
Karl Bartel | jul 15 2018, at 12:19pm
Great post! I've been looking for a good way to get a recent, featureful pysqlite for a while. Unfortunately, the final result for option two does not work for me, yet. Do you have any idea what might cause the following error?
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pysqlite3 import dbapi2 as sqlite3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/karl/gdrive/code/sqlite/sqlite325/lib/python3.5/site-packages/pysqlite3-0.2.0-py3.5-linux-x86_64.egg/pysqlite3/__init__.py", line 23, in <module>
from pysqlite3.dbapi2 import *
File "/home/karl/gdrive/code/sqlite/sqlite325/lib/python3.5/site-packages/pysqlite3-0.2.0-py3.5-linux-x86_64.egg/pysqlite3/dbapi2.py", line 28, in <module>
from pysqlite3._sqlite3 import *
ImportError: /home/karl/gdrive/code/sqlite/sqlite325/lib/python3.5/site-packages/pysqlite3-0.2.0-py3.5-linux-x86_64.egg/pysqlite3/_sqlite3.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _PyObject_CallNoArg
Commenting has been closed.
Karl Bartel | jul 18 2018, at 10:27am
Thanks, works great, now! I've put my results at https://github.com/karlb/pysqlite3 , that I can pip install a full featured pysqlite3 whenever I need it.