Updated instructions for compiling BerkeleyDB with SQLite for use with Python

photos/sqlite-bdb.png

About three years ago I posted some instructions for building the Python SQLite driver for use with BerkeleyDB. While those instructions still work, they have the unfortunate consequence of stomping on any other SQLite builds you've installed in /usr/local. I haven't been able to build pysqlite with BerkeleyDB compiled in, because the source amalgamation generated by BerkeleyDB doesn't compile. So that leaves us with dynamically linking, and that requires that we use the BerkeleyDB libsqlite, which is exactly what the previous post described.

In this post I'll describe a better approach. Instead of building a modified version of libsqlite3, we'll modify pysqlite to use the BerkeleyDB libdb_sql library.

Why use BerkeleyDB at all?

Really briefly I think it's worth mentioning why anyone would use BerkeleyDB's SQLite in the first place. There are a couple of advantages, but the one most likely to be relevant is that BerkeleyDB uses page-level locking while SQLite locks the database during writes. As a result, BerkeleyDB has much higher transaction throughput when multiple threads are involved. To read more about BerkeleyDB and SQLite, definitely check out this document, which compares the different behaviors of each database. You can also learn more by reading the BerkeleyDB SQLite API doc.

To sum up, BerkeleyDB might be a good option if you have many concurrent writers.

Getting the code

Start by navigating to the BerkeleyDB downloads page and checking to see what the latest version is. At the time of writing, it is 6.1.26.

Next, decide where you want to install the BerkeleyDB source. I put mine in ~/bin/berkeleydb. Now we can grab the code:

export BDB="$HOME/bin/berkeleydb"
mkdir -p $BDB
cd $BDB
export BVER='6.1.26'
wget http://download.oracle.com/berkeley-db/db-$BVER.tar.gz
tar xzf db-$BVER.tar.gz

Compiling BerkeleyDB with SQLite

To compile BerkeleyDB with SQLite support, we'll specify the define macros we want and do configure -> make -> make install. The final step will install the system libraries and headers into /usr/local/ so that we can dynamically link our pysqlite driver to them.

cd db-$BVER/build_unix
export CFLAGS="-DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS=1 -DSQLITE_ENABLE_COLUMN_METADATA=1 -DSQLITE_ENABLE_UPDATE_DELETE_LIMIT=1 -DSQLITE_SOUNDEX=1 -DSQLITE_TEMP_STORE=1 -fPIC"
../dist/configure --enable-static --enable-shared --enable-sql --prefix="$BDB" --with-cryptography
make
make prefix="$BDB/" install
sudo make prefix="/usr/local/" install

Compiling pysqlite

Building pysqlite is very easy, but we will need to hand-hack the source a little bit so that we're building it against the BerkeleyDB SQLite libraries (as opposed to the system SQLite). In the previous step, we installed dbsql.h and libdb_sql.so into directories under /usr/local/, so the first step will be to tell pysqlite where to find these files. The second step will be to modify the source code references to libsqlite3 and sqlite3.h. Finally we'll compile everything.

To begin, we'll grab the latest version of pysqlite from GitHub. I put the repo at the root of our BerkeleyDB install, which if you were using the same directories as me, would be ~/bin/berkeleydb/pysqlite/.

cd $BDB
git clone https://github.com/ghaering/pysqlite
cd pysqlite

Now we'll run some commands to update the SQLite3 references in the source code:

# Fix setup.cfg
sed -i "s|sqlite3|db_sql|g" setup.cfg
echo -e "library_dirs=/usr/local/lib" >> setup.cfg
echo -e "include_dirs=/usr/local/include" >> setup.cfg

# Fix setup.py
sed -i "s|\"sqlite3\"|\"db_sql\"|g" setup.py

# Fix source files
find src/ -name "*.h" -exec sed -i "s|sqlite3.h|dbsql.h|g" {} \;

With the changes in place, we can build pysqlite:

python setup.py build

Verifying everything worked

To check that the install worked, you can cd into the newly-created build directory in the pysqlite checkout and open up a Python interpreter:

cd build/lib.linux-x86_64-2.7  # Yours may be slightly different
python
>>> from pysqlite2 import dbapi2 as sqlite
>>> conn = sqlite.connect(':memory:')
>>> conn.execute('pragma compile_options;').fetchall()
[(u'BERKELEY_DB',),
 (u'ENABLE_COLUMN_METADATA',),
 (u'ENABLE_FTS3',),
 (u'ENABLE_FTS3_PARENTHESIS',),
 (u'ENABLE_FTS4',),
 (u'ENABLE_RTREE',),
 (u'ENABLE_UPDATE_DELETE_LIMIT',),
 (u'HAS_CODEC',),
 (u'SOUNDEX',),
 (u'SYSTEM_MALLOC',),
 (u'TEMP_STORE=1',),
 (u'THREADSAFE=1',)]

At the very top you can see that BERKELEY_DB is one of the compilation options. Success! pysqlite is using the BerkeleyDB version of SQLite.

From here, you can either install pysqlite using python setup.py install or symlink it into your PYTHONPATH.

Thanks for reading

Thanks for reading this post, I hope you found these instructions helpful! If you use Peewee ORM, there is a playhouse.berkeleydb extension module that you can use with the custom version of pysqlite.

Comments (0)


Commenting has been closed.