Encrypted SQLite Databases with Python and SQLCipher

photos/p1414470640.98.png

SQLCipher, created by Zetetic, is an open-source library that provides transparent 256-bit AES encryption for your SQLite databases. SQLCipher is used by a large number of organizations, including Nasa, SalesForce, Xerox and more. The project is open-source and BSD licensed. Best of all, there are open-source python bindings.

In this post, I'll show how to get started writing Python scripts that interact with encrypted SQLite databases. For users of the peewee ORM, I will demonstrate the usage of the sqlcipher playhouse module. Finally, I'll show how to convert your existing SQLite databases into encrypted databases suitable for use with SQLCipher.

Building SQLCipher

Let's get started by cloning the most recent version of the SQLCipher library and installing it on our system.

$ git clone https://github.com/sqlcipher/sqlcipher
$ cd sqlcipher

To compile SQLCipher, we will link against OpenSSL's libcrypto, so make sure you have OpenSSL installed before proceeding. I've also specified that we want to enable the full-text search extension. For the adventurous, the SQLite documentation contains a comprehensive list of compile options.

$ ./configure \
  --enable-tempstore=yes \
  CFLAGS="-DSQLITE_HAS_CODEC -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS" \
  LDFLAGS="-lcrypto"

$ make
$ sudo make install

You should now be able to fire up the sqlcipher shell, which by default is connected to an in-memory database:

$ sqlcipher
SQLCipher version 3.8.6 2014-08-15 11:46:33
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite>

Kicking the tires on SQLCipher

To create an encrypted database, we can use the SQLCipher shell, specifying a key using a special PRAGMA command:

sqlite> .open testing.db
sqlite> PRAGMA key='testing';
sqlite> create table people (name text primary key);
sqlite> insert into people (name) values ('charlie'), ('huey');
sqlite> .quit

If we take a look at the data in testing.db, we'll find that it is completely garbled:

$ hexdump -C testing.db
0000  04 37 1e 64 12 fb a2 0b  8d 88 2f 72 fd c6 4b e6  |.7.d....../r..K.|
0010  7f 80 14 ec 74 68 83 00  e9 d2 4f 2e 80 5d 05 da  |....th....O..]..|
0020  f0 44 f3 83 23 5e 29 e4  73 fc 29 1b 2d 6a 1d bc  |.D..#^).s.).-j..|
0030  be 94 e6 12 6e 7a 28 32  15 cd 7b 1e a5 3c f7 52  |....nz(2..{..<.R|
0040  1a 51 37 40 28 70 3e fe  5d d9 0f 06 cc 76 4c 98  |.Q7@(p>.]....vL.|
...

If we try to open the database using the normal SQLite client, or if we specify the incorrect key, the data will be unreadable:

$ sqlite3 testing.db
SQLite version 3.8.7 2014-10-17 11:24:17
Enter ".help" for usage hints.
sqlite> .schema
Error: file is encrypted or is not a database
sqlite> .quit

$ sqlcipher testing.db
SQLCipher version 3.8.6 2014-08-15 11:46:33
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> pragma key='wrong';
sqlite> .schema
Error: file is encrypted or is not a database

SQLCipher supports a number of special commands besides PRAGMA key. For the full list, check out the API documentation.

Building pysqlcipher

Run the following commands to install the latest version of pysqlcipher globally on your system:

$ git clone https://github.com/leapcode/pysqlcipher/
$ cd pysqlcipher
$ python setup.py build_sqlcipher  # Build against the system libsqlcipher
$ sudo python setup.py install

If you are installing into a virtualenv, I would not necessarily recommend installing pysqlcipher using pip, simply because it may fetch the SQLCipher source from a different remote source. The default installation will not link against your system sqlcipher, but will link against the downloaded amalgamation.

To install in a virtualenv and build against the system libsqlcipher, you can clone the source tree into your virtualenv and build there:

$ cd my_env && source bin/activate
$ git clone https://github.com/leapcode/pysqlcipher/
$ cd pysqlcipher
$ python setup.py build_sqlcipher
$ python setup.py install

Connecting to an encrypted database from Python

Let's see how to use SQLCipher from a Python script. pysqlcipher implements the db-api 2.0 spec, so if you've worked with databases in Python before, you'll feel right at home.

>>> from pysqlcipher import dbapi2 as sqlcipher
>>> db = sqlcipher.connect('testing.db')

In order to actually make queries, we need to specify a passphrase using the PRAGMA key statement. Additionally, we need to specify the key derivation iterations using PRAGMA kdf_iter, which has a default value of 64000.

>>> db.executescript('pragma key="testing"; pragma kdf_iter=64000;')
<pysqlcipher.dbapi2.Cursor object at 0x7f2a77be40a0>
>>> db.execute('select * from people;').fetchall()
[(u'charlie',), (u'heuy',)]

If we attempt to connect with the incorrect passphrase, we will receive a DatabaseError:

>>> db = sqlcipher.connect('testing.db')
>>> db.execute('pragma key="wrong"')
<pysqlcipher.dbapi2.Cursor object at 0x7f167ec2d0a0>
>>> db.execute('select * from people;')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pysqlcipher.dbapi2.DatabaseError: file is encrypted or is not a database

Using SQLCipher with Peewee ORM

If you do not have peewee installed, feel free to install it now:

$ pip install peewee

The peewee SQLCipher extension allows you to use peewee with encrypted SQLite databases. To create an encrypted diary, we might write the following code:

import datetime

from playhouse.sqlcipher_ext import *

db = SqlCipherDatabase('diary.db', passphrase='my secret passphrase')

class Note(Model):
    content = TextField()
    timestamp = DateTimeField(default=datetime.datetime.now)

    class Meta:
        database = db

If the above code is in a model named diary.py, we can interact with it from the command-line in the following way:

>>> from diary import Note
>>> Note.create_table(fail_silently=True)
>>> Note.create(content='Dear diary, today I had a good day!')
>>> Note.create(content='Dear diary, huey threw up on the floor.')
>>> for note in Note.select():
...     print note.timestamp, note.content
...
2014-10-27 21:05:58.488291 Dear diary, today I had a good day!
2014-10-27 21:06:16.663230 Dear diary, huey threw up on the floor.

Hard-coding the passphrase in your database might not be a good idea. To retrieve the passphrase at run-time, we can use the standard library getpass module to prompt the user:

import datetime
import getpass

from playhouse.sqlcipher_ext import *

db = SqlCipherDatabase(None)  # Defer initialization of the database.

class Note(Model):
    content = TextField()
    timestamp = DateTimeField(default=datetime.datetime.now)

    class Meta:
        database = db

passphrase = getpass.getpass('Enter the diary password: ')
db.init('cipher.db', passphrase=passphrase)

Alternatively you can use environment variables or look into a library like python-keyring.

Encrypting an existing SQLite Database

If, like me, you have some existing SQLite databases you wish to convert over to SQLCipher, the following commands should get you started. These commands, and other examples, can be found in the SQLCipher documentation:

$ sqlcipher plaintext.db
sqlite> ATTACH DATABASE 'encrypted.db' AS encrypted KEY 'my password';
sqlite> SELECT sqlcipher_export('encrypted');
sqlite> DETACH DATABASE encrypted;

That's it! Now encrypted.db will contain an encrypted copy of the data in plaintext.db.

Thanks for taking the time to read this post, I hope you found it interesting. If you'd like to read more, here are some links which you may find useful:

Comments (5)

Charles Leifer | oct 31 2014, at 03:38pm

According to the project page, it should work on windows, but the installation may be different: https://www.zetetic.net/sqlcipher/

jspblm | oct 31 2014, at 03:36pm

Does this work on windows platform?

Charles Leifer | oct 31 2014, at 03:14pm

Marcin, the best source for that information is on the project's documentation: https://www.zetetic.net/sqlcipher/design/

It looks like AES-256 in CBC mode. Each page of data is encrypted individually, and each page has its own random initialization vector. There are many more details, though, so please check the docs if you have questions.

Marcin Pohl | oct 31 2014, at 02:33pm

Which mode of AES256 are they're using?

Ikem | oct 31 2014, at 08:18am

Your blog post need a Google+ button.

I would had liked/shared them.


Commenting has been closed, but please feel free to contact me