charles leifer | Search entries

Migrating to SQLite

may 8, 2014 6:42pm / peewee python sqlite / 0 comments

Small. Fast. Reliable. Choose any three.

I made the decision this week to migrate my personal sites and several other sites I host onto SQLite. Previously almost everything I hosted had been using Postgresql. The move was motivated by a couple factors:

SQLite is awesome!
Self-contained: does not require a separate server process
Data is stored in a single file, simplifying backups
Excellent Python (and peewee) support
Full-text search

At times it has seemed to me that there is a tacit agreement within the Flask / Django communities that if you're using SQL you should be using Postgresql. Postgresql is an amazing piece of engineering. I have spent the last five years of my career working exclusively with it, and I am continually impressed by its performance and the constant stream of great new features.

So why change things?

Well, as my list indicates, there are a handful of reasons. But the primary reason was that I wanted something lightweight. I'm running a fairly low-traffic, read-heavy site, so Postgresql was definitely overkill. My blog is deployed on a VPS with very limited resources, so every MB of RAM counts. Additionally, I wasn't using any special Postgresql features so there was nothing holding me back.

(continued...)

SQLite: Small. Fast. Reliable. Choose any three.

july 14, 2014 12:37pm / berkeleydb peewee python sqlite / 10 comments

SQLite is a fantastic database and in this post I'd like to explain why I think that, for many scenarios, SQLite is actually a great choice. I hope to also clear up some common misconceptions about SQLite.

Extending SQLite with Python

december 2, 2014 9:33pm / peewee python sqlite / 1 comments

SQLite is an embedded database, which means that instead of running as a separate server process, the actual database engine resides within the application. This makes it possible for the database to call directly into the application when it would be beneficial to add some low-level, application-specific functionality. SQLite provides numerous hooks for inserting user code and callbacks, and, through virtual tables, it is even possible to construct a completely user-defined table. By extending the SQL language with Python, it is often possible to express things more elegantly than if we were to perform calculations after the fact.

In this post I'll describe how to extend SQLite with Python, adding functions and aggregates that will be callable directly from any SQL queries you execute. We'll wrap up by looking at SQLite's virtual table mechanism and seeing how to expose a SQL interface over external data sources.

Web-based SQLite Database Browser, powered by Flask and Peewee

november 13, 2014 8:28pm / flask python sqlite / 1 comments

For the past week or two I've been spending some of my spare time working on a web-based SQLite database browser. I thought this would be a useful project, because I've switched all my personal projects over to SQLite and foresee using it for pretty much everything. It also dovetailed with some work I'd been doing lately on peewee regarding reflection and code generation. So it seemed like some pretty good bang/buck, especially given my perception that there weren't many SQLite browsers out there (it turns out there are quite a few, however). I'm sharing it in the hopes that other devs (and non-devs?) find it useful.

Multi-threaded SQLite without the OperationalErrors

january 30, 2017 10:36pm / peewee python sqlite / 3 comments

SQLite's write lock and pysqlite's clunky transaction state-machine are a toxic combination for multi-threaded applications. Unless you are very diligent about keeping your write transactions as short as possible, you can easily wind up with one thread accidentally holding a write transaction open for an unnecessarily long time. Threads that are waiting to write will then have a much greater likelihood of timing out while waiting for the lock, giving the illusion of poor performance.

In this post I'd like to share a very effective technique for performing writes to a SQLite database from multiple threads.

Using SQLite Full-Text Search with Python

may 12, 2014 7:12pm / peewee python search sqlite / 0 comments

In this post I will show how to use SQLite full-text search with Python (and a lot of help from peewee ORM). We will see how to index content for searching, and how to order search results using two ranking algorithms.

Last week I migrated my site from Postgresql to SQLite. I had been using Redis to power my site's search, but since SQLite has an awesome full-text search extension, I decided to give it a try. I am really pleased with the results, and being able to specify boolean search queries is an added plus. Here is a brief overview of the types of search queries SQLite supports:

Simple phrase: peewee would return all docs containing the word peewee.
Prefix queries: py* would return docs containing Python, pypi, etc.
Quoted phrases: "sqlite extension"
NEAR: peewee NEAR sqlite would return docs containing the words peewee and sqlite with no more than 10 intervening words. You can also specify the max number of intervening words, e.g. peewee NEAR/3 sqlite.
AND, OR, NOT: sqlite OR postgresql AND NOT mysql would return docs about high-quality databases (just trollin).

Check out the full post for details on adding full-text search to your project.

Encrypted SQLite Databases with Python and SQLCipher

october 27, 2014 11:20pm / peewee python sqlcipher sqlite / 5 comments

SQLCipher, created by Zetetic, is an open-source library that provides transparent 256-bit AES encryption for your SQLite databases. SQLCipher is used by a large number of organizations, including Nasa, SalesForce, Xerox and more. The project is open-source and BSD licensed, and there are open-source python bindings.

A GitHub user known as The Dod was kind enough to contribute a sqlcipher playhouse module, making it a snap to use Peewee with SQLCipher.

In this post, I'll show how to compile SQLCipher and the sqlcipher3 python bindings, then use peewee ORM to work with an encrypted SQLite database.

Five reasons you should use SQLite in 2016

january 6, 2016 8:49pm / berkeleydb python sqlite / 17 comments

If you haven't heard, SQLite is an amazing database capable of doing real work in real production environments. In this post, I'll outline 5 reasons why I think you should use SQLite in 2016.

Using the SQLite JSON1 and FTS5 Extensions with Python

november 11, 2015 8:01am / peewee python search sqlite / 3 comments

Back in September, word started getting around trendy programming circles about a new file that had appeared in the SQLite fossil repo named json1.c. I originally wrote up a post that contained some gross hacks in order to get pysqlite to compile and work with the new json1 extension. With the release of SQLite 3.9.0, those hacks are no longer necessary.

SQLite 3.9.0 is a fantastic release. In addition to the much anticipated json1 extension, there is a new version of the full-text search extension called fts5. fts5 improves performance of complex search queries and provides an out-of-the-box BM25 ranking implementation. You can also boost the significance of particular fields in the ranking. I suggest you check out the release notes for the full list of enhancements

This post will describe how to compile SQLite with support for json1 and fts5. We'll use the new SQLite library to compile a python driver so we can use the new features from python. Because I really like pysqlite and apsw, I've included instructions for building both of them. Finally, we'll use peewee ORM to run queries using the json1 and fts5 extensions.

Querying Tree Structures in SQLite using Python and the Transitive Closure Extension

november 22, 2014 8:52pm / peewee python sqlite / 0 comments

I recently read a good write-up on tree structures in PostgreSQL. Hierarchical data is notoriously tricky to model in a relational database, and a variety of techniques have grown out of developers' attempts to optimize for certain types of queries.

In his post, Graeme describes several approaches to modeling trees, including:

Adjancency models, in which each node in the tree contains a foreign key to its parent row.
Materialized path model, in which each node stores its ancestral path in a denormalized column. Typically the path is stored as a string separated by a delimiter, e.g. "{root id}.{child id}.{grandchild id}".
Nested sets, in which each node defines an interval that encompasses a range of child nodes.
PostgreSQL arrays, in which the materialized path is stored in an array, and general inverted indexes are used to efficiently query the path.

In the comments, some users pointed out that the ltree extension could also be used to efficiently store and query materialized paths. LTrees support two powerful query languages (lquery and ltxtquery) for pattern-matching LTree labels and performing full-text searches on labels.

One technique that was not discussed in Graeme's post was the use of closure tables. A closure table is a many-to-many junction table storing all relationships between nodes in a tree. It is related to the adjacency model, in that each database row still stores a reference to its parent row. The closure table gets its name from the additional table, which stores each combination of ancestor/child nodes.

Search: "peewee AND sqlite"

Small. Fast. Reliable. Choose any three.