Updated instructions for compiling BerkeleyDB with SQLite for use with Python
About three years ago I posted some instructions for building the Python SQLite driver for use with BerkeleyDB. While those instructions still work, they have the unfortunate consequence of stomping on any other SQLite builds you've installed in /usr/local. I haven't been able to build pysqlite with BerkeleyDB compiled in, because the source amalgamation generated by BerkeleyDB doesn't compile. That leaves dynamic linking, which requires using the BerkeleyDB libsqlite, and that is exactly what the previous post described.
In this post I'll describe a better approach. Instead of building a modified version of libsqlite3, we'll modify pysqlite to use the BerkeleyDB libdb_sql library.
Why use BerkeleyDB at all?
Briefly, it's worth mentioning why anyone would use BerkeleyDB's SQLite in the first place. There are a couple of advantages, but the one most likely to be relevant is that BerkeleyDB uses page-level locking, while SQLite locks the whole database during writes. As a result, BerkeleyDB has much higher transaction throughput when multiple threads are writing. To read more about BerkeleyDB and SQLite, check out this document, which compares the behaviors of the two databases. You can also learn more by reading the BerkeleyDB SQLite API doc.
To sum up, BerkeleyDB might be a good option if you have many concurrent writers.
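To make the locking difference concrete, here is a minimal sketch that uses Python's standard-library sqlite3 module against stock SQLite (not the BerkeleyDB build). It assumes the default rollback-journal mode, and the function name and table are made up for illustration. While one connection holds the write lock, a second connection's attempt to start a write transaction fails right away because the busy timeout is set to zero.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked():
    """Return True if a second connection cannot start a write
    transaction while the first one holds the write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None means we issue BEGIN/COMMIT ourselves;
        # timeout=0 makes a locked database fail immediately.
        first = sqlite3.connect(path, timeout=0, isolation_level=None)
        second = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            first.execute("CREATE TABLE t (x INTEGER)")
            first.execute("BEGIN IMMEDIATE")  # take the write lock
            first.execute("INSERT INTO t VALUES (1)")
            try:
                second.execute("BEGIN IMMEDIATE")  # second writer
            except sqlite3.OperationalError:
                blocked = True
            else:
                blocked = False
                second.execute("ROLLBACK")
            first.execute("COMMIT")
            return blocked
        finally:
            first.close()
            second.close()
```

With stock SQLite the second writer is turned away no matter which rows it wants to touch, which is the behavior that page-level locking is meant to relax.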
Getting the code
Start by navigating to the BerkeleyDB downloads page and checking to see what the latest version is. At the time of writing, it is 6.1.26.
Next, decide where you want to install the BerkeleyDB source. I put mine in ~/bin/berkeleydb. Now we can grab the code:
export BDB="$HOME/bin/berkeleydb"
mkdir -p "$BDB"
cd "$BDB"
export BVER='6.1.26'
wget http://download.oracle.com/berkeley-db/db-$BVER.tar.gz
tar xzf db-$BVER.tar.gz
Compiling BerkeleyDB with SQLite
To compile BerkeleyDB with SQLite support, we'll specify the define macros we want and run configure -> make -> make install. The final step installs the libraries and headers into /usr/local/ so that we can dynamically link our pysqlite driver against them.
cd db-$BVER/build_unix
export CFLAGS="-DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS=1 -DSQLITE_ENABLE_COLUMN_METADATA=1 -DSQLITE_ENABLE_UPDATE_DELETE_LIMIT=1 -DSQLITE_SOUNDEX=1 -DSQLITE_TEMP_STORE=1 -fPIC"
../dist/configure --enable-static --enable-shared --enable-sql --prefix="$BDB" --with-cryptography
make
make prefix="$BDB/" install
sudo make prefix="/usr/local/" install
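Before moving on, it can be worth checking that the two files the next section depends on actually landed under the install prefix. The following is a rough sketch, not part of the BerkeleyDB tooling: the helper name is hypothetical, and it assumes a Linux-style layout with include/dbsql.h and lib/libdb_sql.so under the prefix.

```python
import os


def missing_artifacts(prefix="/usr/local"):
    """Return the expected BerkeleyDB SQL files that are absent under prefix.

    The include/ and lib/ layout and the .so suffix are assumptions
    that match a typical Linux install.
    """
    expected = [
        os.path.join(prefix, "include", "dbsql.h"),
        os.path.join(prefix, "lib", "libdb_sql.so"),
    ]
    return [path for path in expected if not os.path.exists(path)]
```

An empty list from missing_artifacts() suggests the install step put the header and shared library where pysqlite will be told to look.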
Compiling pysqlite
Building pysqlite
is very easy, but we will need to hand-hack the source a little bit so that we're building it against the BerkeleyDB SQLite libraries (as opposed to the system SQLite). In the previous step, we installed dbsql.h
and libdb_sql.so
into directories under /usr/local/
, so the first step will be to tell pysqlite
where to find these files. The second step will be to modify the source code references to libsqlite3
and sqlite3.h
. Finally we'll compile everything.
To begin, we'll grab the latest version of pysqlite from GitHub. I put the repo at the root of our BerkeleyDB install, which, if you are using the same directories as me, is ~/bin/berkeleydb/pysqlite/.
cd $BDB
git clone https://github.com/ghaering/pysqlite
cd pysqlite
Now we'll run some commands to update the SQLite3 references in the source code:
# Fix setup.cfg
sed -i "s|sqlite3|db_sql|g" setup.cfg
echo -e "library_dirs=/usr/local/lib" >> setup.cfg
echo -e "include_dirs=/usr/local/include" >> setup.cfg
# Fix setup.py
sed -i "s|\"sqlite3\"|\"db_sql\"|g" setup.py
# Fix source files
find src/ -name "*.h" -exec sed -i "s|sqlite3\.h|dbsql.h|g" {} \;
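The sed -i invocations above follow GNU sed; BSD and macOS sed expect a backup suffix argument after -i. If that bites you, the header rewrite can be sketched in Python instead. This is only an illustration under a few assumptions: the helper name is made up, it assumes the headers are UTF-8 text, and it needs Python 3 to run even if you are building pysqlite for Python 2.

```python
import os


def rewrite_includes(src_dir, old="sqlite3.h", new="dbsql.h"):
    """Replace the literal string old with new in every .h file under src_dir.

    Returns the list of files that were modified.
    """
    changed = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if not name.endswith(".h"):
                continue
            path = os.path.join(root, name)
            with open(path, "r", encoding="utf-8") as fh:
                text = fh.read()
            updated = text.replace(old, new)
            if updated != text:
                with open(path, "w", encoding="utf-8") as fh:
                    fh.write(updated)
                changed.append(path)
    return changed
```

Calling rewrite_includes("src") from the pysqlite checkout would cover the same files as the find command; the setup.cfg and setup.py edits would still need to be done separately.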
With the changes in place, we can build pysqlite:
python setup.py build
Verifying everything worked
To check that the install worked, you can cd into the newly-created build directory in the pysqlite checkout and open up a Python interpreter:
cd build/lib.linux-x86_64-2.7 # Yours may be slightly different
python
>>> from pysqlite2 import dbapi2 as sqlite
>>> conn = sqlite.connect(':memory:')
>>> conn.execute('pragma compile_options;').fetchall()
[(u'BERKELEY_DB',),
(u'ENABLE_COLUMN_METADATA',),
(u'ENABLE_FTS3',),
(u'ENABLE_FTS3_PARENTHESIS',),
(u'ENABLE_FTS4',),
(u'ENABLE_RTREE',),
(u'ENABLE_UPDATE_DELETE_LIMIT',),
(u'HAS_CODEC',),
(u'SOUNDEX',),
(u'SYSTEM_MALLOC',),
(u'TEMP_STORE=1',),
(u'THREADSAFE=1',)]
At the very top you can see that BERKELEY_DB is one of the compilation options. Success! pysqlite is using the BerkeleyDB version of SQLite.
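If you'd rather check this from a script than eyeball the interpreter output, the same pragma can be wrapped in a small helper. This is just a sketch: the function name is hypothetical, and it assumes the connection object has the execute() shortcut that both pysqlite2 and the standard sqlite3 module provide.

```python
def uses_berkeleydb(conn):
    """Return True if the SQLite library behind conn reports BERKELEY_DB
    among its compile options."""
    rows = conn.execute("PRAGMA compile_options").fetchall()
    return any(row[0] == "BERKELEY_DB" for row in rows)
```

With the build from this post, uses_berkeleydb(sqlite.connect(':memory:')) should come back True; against a stock SQLite build I'd expect False.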
From here, you can either install pysqlite using python setup.py install or symlink it into your PYTHONPATH.
Thanks for reading
Thanks for reading this post. I hope you found these instructions helpful! If you use Peewee ORM, there is a playhouse.berkeleydb extension module that you can use with the custom version of pysqlite.