Search: "peewee AND sqlite"
SQLite is a fantastic database and in this post I'd like to explain why I think that, for many scenarios, SQLite is actually a great choice. I hope to also clear up some common misconceptions about SQLite.
Small. Fast. Reliable. Choose any three.
I made the decision this week to migrate my personal sites and several other sites I host onto SQLite. Previously almost everything I hosted had been using Postgresql. The move was motivated by a couple factors:
- SQLite is awesome!
- Self-contained: does not require a separate server process
- Data is stored in a single file, simplifying backups
- Excellent Python (and peewee) support
- Full-text search
At times it has seemed to me that there is a tacit agreement within the Flask / Django communities that if you're using SQL you should be using Postgresql. Postgresql is an amazing piece of engineering. I have spent the last five years of my career working exclusively with it, and I am continually impressed by its performance and the constant stream of great new features.
So why change things?
Well, as my list indicates, there are a handful of reasons. But the primary reason was that I wanted something lightweight. I'm running a fairly low-traffic, read-heavy site, so Postgresql was definitely overkill. My blog is deployed on a VPS with very limited resources, so every MB of RAM counts. Additionally, I wasn't using any special Postgresql features so there was nothing holding me back.
In this post I will show how to use SQLite full-text search with Python (and a lot of help from peewee ORM). We will see how to index content for searching, and how to order search results using two ranking algorithms.
Last week I migrated my site from Postgresql to SQLite. I had been using Redis to power my site's search, but since SQLite has an awesome full-text search extension, I decided to give it a try. I am really pleased with the results, and being able to specify boolean search queries is an added plus. Here is a brief overview of the types of search queries SQLite supports:
- Simple phrase: peewee would return all docs containing the word peewee.
- Prefix queries: py* would return docs containing Python, pypi, etc.
- Quoted phrases: "sqlite extension"
NEAR: peewee NEAR sqlite would return docs containing the words peewee and sqlite with no more than
10intervening words. You can also specify the max number of intervening words, e.g. peewee NEAR/3 sqlite.
NOT: sqlite OR postgresql AND NOT mysql would return docs about high-quality databases (just trollin).
Check out the full post for details on adding full-text search to your project.
For the past month or so I've been working on writing my own ORM in Python. The project grew out of a need for a lightweight persistence layer for use in Flask web apps. As I've grown so familiar with the Django ORM over the past year, many of the ideas in Peewee are analagous to the concepts in Django. My goal from the beginning has been to keep the implementation simple without sacrificing functionality, and to ultimately create something hackable that others might be able to read and contribute to.
SQLCipher, created by Zetetic, is an open-source library that provides transparent 256-bit AES encryption for your SQLite databases. SQLCipher is used by a large number of organizations, including Nasa, SalesForce, Xerox and more. The project is open-source and BSD licensed. Best of all, there are open-source python bindings.
In this post, I'll show how to compile SQLCipher and the pysqlcipher bindings, then use peewee ORM to work with an encrypted SQLite database.
SQLite is an embedded database, which means that instead of running as a separate server process, the actual database engine resides within the application. This makes it possible for the database to call directly into the application when it would be beneficial to add some low-level, application-specific functionality. SQLite provides numerous hooks for inserting user code and callbacks, and, through virtual tables, it is even possible to construct a completely user-defined table. By extending the SQL language with Python, it is often possible to express things more elegantly than if we were to perform calculations after the fact.
In this post I'll describe how to extend SQLite with Python, adding functions and aggregates that will be callable directly from any SQL queries you execute. We'll wrap up by looking at SQLite's virtual table mechanism and seeing how to expose a SQL interface over external data sources.
For the past week or two I've been spending some of my spare time working on a web-based SQLite database browser. I thought this would be a useful project, because I've switched all my personal projects over to SQLite and foresee using it for pretty much everything. It also dovetailed with some work I'd been doing lately on peewee regarding reflection and code generation. So it seemed like some pretty good bang/buck, especially given my perception that there weren't many SQLite browsers out there (it turns out there are quite a few, however). I'm sharing it in the hopes that other devs (and non-devs?) find it useful.
In my last post, I wrote about how to work with encrypted SQLite databases with Python. As an example application of these libraries, I showed some code fragments for a fictional diary program. Because I was thinking the examples directory of the peewee repo was looking a little thin, I decided to flesh out the diary program and include it as an example.
In this post, I'll go over the diary code in the hopes that you may find it interesting or useful. The code shows how to use the peewee SQLCipher extension. I've also implemented a simple command-line menu loop. All told, the code is less than 100 lines!
SQLite's write lock and pysqlite's clunky transaction state-machine are a toxic combination for multi-threaded applications. Unless you are very diligent about keeping your write transactions as short as possible, you can easily wind up with one thread accidentally holding a write transaction open for an unnecessarily long time. Threads that are waiting to write will then have a much greater likelihood of timing out while waiting for the lock, giving the illusion of poor performance.
In this post I'd like to share a very effective technique for performing writes to a SQLite database from multiple threads.
In preparation for the fourth and final installment in the "Flask Note-taking app" series, I found it necessary to improve the search feature of the note-taking app. In this post we will use SQLite's full-text search extension to improve the search feature.
In part 2, we added email reminders and check-able task lists to the note-taking app. We also converted the backend to use flask-peewee's REST API extension, which made it easy to add pagination and search. And that is how I've left it for the last three months or so.
Below is a screenshot of the latest version of the notes app. The UI is much cleaner thanks to a stylesheet from bootswatch. The bootswatch stylesheet works as a drop-in replacement for the default bootstrap CSS file.
All together, the note-taking app has the following features:
- Flexible pinterest-style tiled layout that looks great on a variety of screen sizes.
- Easy to create notes and reminders from the phone.
- Notes support markdown and there is also a simple WYSIWYM markdown editing toolbar.
- Links are converted to rich media objects where possible (e.g. a YouTube URL becomes an embedded player).
- To-do lists (or task lists) can be embedded in notes.
- Email reminders can be scheduled for a given note.
- Simple full-text search.
In case you were curious, I've been using the notes app for things like:
- Bookmarking interesting sites to read later.
- Creating short to-do lists or writing down particular items to get from the store, etc.
- Writing down interesting dreams or ideas I get in the middle of the night.
- Appointment reminders, reminders to call people, etc.
- Saving funny cat pics.
- Writing down ideas for programming projects.
- Saving code snippets or useful commands.