Entries from 2014

Extending SQLite with Python


SQLite is an embedded database, which means that instead of running as a separate server process, the actual database engine resides within the application. This makes it possible for the database to call directly into the application when it would be beneficial to add some low-level, application-specific functionality. SQLite provides numerous hooks for inserting user code and callbacks, and, through virtual tables, it is even possible to construct a completely user-defined table. By extending the SQL language with Python, it is often possible to express things more elegantly than if we were to perform calculations after the fact.

In this post I'll describe how to extend SQLite with Python, adding functions and aggregates that will be callable directly from any SQL queries you execute. We'll wrap up by looking at SQLite's virtual table mechanism and seeing how to expose a SQL interface over external data sources.
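As a rough illustration of the idea, here is a minimal sketch using the standard library's `sqlite3` module, which exposes `create_function` and `create_aggregate` for registering Python callables. The `hostname` function, the `Mode` aggregate and the `visits` table are made-up names for this example, not code from the post.

```python
import sqlite3
from urllib.parse import urlparse


def hostname(url):
    """Scalar function: return the host part of a URL, or None."""
    if url is None:
        return None
    return urlparse(url).hostname


class Mode:
    """Aggregate: return the most frequently seen value in a group."""

    def __init__(self):
        self.counts = {}

    def step(self, value):
        # SQLite calls step() once for every row in the group.
        self.counts[value] = self.counts.get(value, 0) + 1

    def finalize(self):
        # SQLite calls finalize() once to obtain the aggregate's result.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)


conn = sqlite3.connect(":memory:")
# Register the callables under the names SQL will use, with their arity.
conn.create_function("hostname", 1, hostname)
conn.create_aggregate("mode", 1, Mode)

# Hypothetical sample data.
conn.execute("CREATE TABLE visits (url TEXT)")
conn.executemany(
    "INSERT INTO visits (url) VALUES (?)",
    [
        ("http://example.com/a",),
        ("http://example.com/b",),
        ("http://example.org/",),
    ],
)

hosts = [row[0] for row in conn.execute(
    "SELECT DISTINCT hostname(url) FROM visits ORDER BY 1")]
most_common = conn.execute(
    "SELECT mode(hostname(url)) FROM visits").fetchone()[0]
```

Once registered, both names can be used in any query on that connection, so the grouping and counting happen inside SQLite rather than in a Python loop after the fact.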


Querying Tree Structures in SQLite using Python and the Transitive Closure Extension


I recently read a good write-up on tree structures in PostgreSQL. Hierarchical data is notoriously tricky to model in a relational database, and a variety of techniques have grown out of developers' attempts to optimize for certain types of queries.

In his post, Graeme describes several approaches to modeling trees, including:

  • Adjacency models, in which each node in the tree contains a foreign key to its parent row.
  • Materialized path model, in which each node stores its ancestral path in a denormalized column. Typically the path is stored as a string separated by a delimiter, e.g. "{root id}.{child id}.{grandchild id}".
  • Nested sets, in which each node defines an interval that encompasses a range of child nodes.
  • PostgreSQL arrays, in which the materialized path is stored in an array, and general inverted indexes are used to efficiently query the path.

In the comments, some users pointed out that the ltree extension could also be used to efficiently store and query materialized paths. LTrees support two powerful query languages (lquery and ltxtquery) for pattern-matching LTree labels and performing full-text searches on labels.

One technique that was not discussed in Graeme's post was the use of closure tables. A closure table is a many-to-many junction table storing all relationships between nodes in a tree. It is related to the adjacency model, in that each database row still stores a reference to its parent row. The closure table gets its name from the additional table, which stores every ancestor/descendant pair in the tree, not just the direct parent/child links.
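To make the closure table idea concrete, here is a minimal sketch that maintains one by hand with the standard `sqlite3` module. It does not use SQLite's transitive closure extension. The `category` and `category_closure` tables, the `depth` column and the helper functions are assumptions made for this example. Storing a depth-0 row linking each node to itself is a common convention rather than a requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Adjacency part: every row points at its parent.
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        parent_id INTEGER REFERENCES category(id)
    );
    -- Closure part: one row per ancestor/descendant pair.
    CREATE TABLE category_closure (
        ancestor_id INTEGER NOT NULL REFERENCES category(id),
        descendant_id INTEGER NOT NULL REFERENCES category(id),
        depth INTEGER NOT NULL,
        PRIMARY KEY (ancestor_id, descendant_id)
    );
""")


def add_node(name, parent_id=None):
    """Insert a node and the closure rows linking it to its ancestors."""
    cur = conn.execute(
        "INSERT INTO category (name, parent_id) VALUES (?, ?)",
        (name, parent_id))
    node_id = cur.lastrowid
    # Each node is recorded as its own ancestor at depth 0.
    conn.execute(
        "INSERT INTO category_closure VALUES (?, ?, 0)", (node_id, node_id))
    if parent_id is not None:
        # Copy every closure row that ends at the parent, one level deeper.
        conn.execute(
            "INSERT INTO category_closure (ancestor_id, descendant_id, depth) "
            "SELECT ancestor_id, ?, depth + 1 FROM category_closure "
            "WHERE descendant_id = ?",
            (node_id, parent_id))
    return node_id


def descendants(node_id):
    """Return the names of all descendants, nearest levels first."""
    rows = conn.execute(
        "SELECT c.name FROM category_closure AS cc "
        "JOIN category AS c ON c.id = cc.descendant_id "
        "WHERE cc.ancestor_id = ? AND cc.depth > 0 "
        "ORDER BY cc.depth, c.name",
        (node_id,))
    return [name for (name,) in rows]


# Hypothetical sample tree.
books = add_node("Books")
fiction = add_node("Fiction", books)
scifi = add_node("Sci-Fi", fiction)
history = add_node("History", books)
```

With the closure rows in place, fetching a whole subtree is a single join with no recursion. The cost is extra writes on every insert, plus more bookkeeping when a node is moved or deleted.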


Web-based SQLite Database Browser, powered by Flask and Peewee


For the past week or two I've been spending some of my spare time working on a web-based SQLite database browser. I thought this would be a useful project, because I've switched all my personal projects over to SQLite and foresee using it for pretty much everything. It also dovetailed with some work I'd been doing lately on peewee regarding reflection and code generation. So it seemed like some pretty good bang/buck, especially given my perception that there weren't many SQLite browsers out there (it turns out there are quite a few, however). I'm sharing it in the hopes that other devs (and non-devs?) find it useful.


Dear Diary, an Encrypted Command-Line Diary with Python


In my last post, I wrote about how to work with encrypted SQLite databases with Python. As an example application of these libraries, I showed some code fragments for a fictional diary program. Because I was thinking the examples directory of the peewee repo was looking a little thin, I decided to flesh out the diary program and include it as an example.

In this post, I'll go over the diary code in the hopes that you may find it interesting or useful. The code shows how to use the peewee SQLCipher extension. I've also implemented a simple command-line menu loop. All told, the code is less than 100 lines!


Saturday morning hacks: Building an Analytics App with Flask


A couple years back I wrote about building an Analytics service with Cassandra. As fun as that project was to build, the reality was that Cassandra was completely unsuitable for my actual needs, so I decided to switch to something simpler. I'm happy to say the replacement app has been running without a hitch for the past 5 months taking up only about 20 MB of RAM! In this post I'll show how to build a lightweight Analytics service using Flask.


Encrypted SQLite Databases with Python and SQLCipher


SQLCipher, created by Zetetic, is an open-source library that provides transparent 256-bit AES encryption for your SQLite databases. It is used by a large number of organizations, including NASA, Salesforce, Xerox and more. The project is BSD-licensed, and there are open-source Python bindings.

A GitHub user known as The Dod was kind enough to contribute a sqlcipher playhouse module, making it a snap to use Peewee with SQLCipher.

In this post, I'll show how to compile SQLCipher and the sqlcipher3 Python bindings, then use the peewee ORM to work with an encrypted SQLite database.


Saturday morning hacks: Adding full-text search to the flask note-taking app


In preparation for the fourth and final installment in the "Flask Note-taking app" series, I found it necessary to improve the search feature of the note-taking app. In this post we will use SQLite's full-text search extension to improve the search feature.
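For readers who have not used SQLite's full-text search, here is a minimal sketch of the general mechanism using the standard `sqlite3` module. It assumes the underlying SQLite library was compiled with the FTS3/FTS4 module, which is common but not guaranteed. The `note_index` table and the `search` helper are made-up names for this example, not the app's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A virtual table backed by the FTS4 module (an assumption about the build).
conn.execute("CREATE VIRTUAL TABLE note_index USING fts4(content)")

# Hypothetical notes; docid is the FTS table's integer row key.
conn.executemany(
    "INSERT INTO note_index (docid, content) VALUES (?, ?)",
    [
        (1, "Buy milk and eggs from the store"),
        (2, "Idea: write a Pong clone with JavaScript canvas"),
        (3, "Reminder: call the store about the order"),
    ],
)


def search(query):
    """Return the ids of notes whose content matches the FTS query."""
    rows = conn.execute(
        "SELECT docid FROM note_index WHERE note_index MATCH ? "
        "ORDER BY docid",
        (query,))
    return [docid for (docid,) in rows]


store_notes = search("store")   # plain term lookup
pong_notes = search("pong")     # the default tokenizer folds ASCII case
prefix_notes = search("java*")  # trailing * performs a prefix match
```

Unlike a `LIKE '%term%'` filter, `MATCH` consults the full-text index, so lookups stay fast as the number of notes grows.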

To recap, the note-taking app provides a lightweight interface for storing markdown-formatted notes. Because I frequently find myself wanting to take notes on the spur of the moment, the note-taking app needed to be very mobile-friendly. By using Twitter Bootstrap and a hefty dose of JavaScript, we made an app that matches our spec and manages to look good doing it!

In part 2, we added email reminders and check-able task lists to the note-taking app. We also converted the backend to use flask-peewee's REST API extension, which made it easy to add pagination and search. And that is how I've left it for the last three months or so.

Below is a screenshot of the latest version of the notes app. The UI is much cleaner thanks to a stylesheet from bootswatch. The bootswatch stylesheet works as a drop-in replacement for the default bootstrap CSS file.


All together, the note-taking app has the following features:

  • Flexible Pinterest-style tiled layout that looks great on a variety of screen sizes.
  • Easy to create notes and reminders from the phone.
  • Notes support markdown and there is also a simple WYSIWYM markdown editing toolbar.
  • Links are converted to rich media objects where possible (e.g. a YouTube URL becomes an embedded player).
  • To-do lists (or task lists) can be embedded in notes.
  • Email reminders can be scheduled for a given note.
  • Simple full-text search.
  • Pagination.

You can browse or download the finished code from part 2 in this gist. If you're in a hurry, you can find all the code from this post in this gist.

In case you were curious, I've been using the notes app for things like:

  • Bookmarking interesting sites to read later.
  • Creating short to-do lists or writing down particular items to get from the store, etc.
  • Writing down interesting dreams or ideas I get in the middle of the night.
  • Appointment reminders, reminders to call people, etc.
  • Saving funny cat pics.
  • Writing down ideas for programming projects.
  • Saving code snippets or useful commands.


SQLite: Small. Fast. Reliable. Choose any three.


SQLite is a fantastic database and in this post I'd like to explain why I think that, for many scenarios, SQLite is actually a great choice. I hope to also clear up some common misconceptions about SQLite.


JavaScript Canvas Fun: Pong

Earlier this week I rediscovered some old games I'd written, and I realized that I had not yet done a JavaScript version of Pong. I did versions of Tetris and Snake, perennial favorites of mine to implement, but somehow I'd forgotten about Pong. I think Pong was probably the first game I ever tried to copy, and it has a special place in my early-programmer's memory.

So I set out last night to put together a JavaScript canvas version of Pong. You can find a playable version in the post.


Completely un-scientific benchmarks of some embedded databases with Python

I've spent some time over the past couple weeks playing with the embedded NoSQL databases Vedis and UnQLite. Vedis, as its name might indicate, is an embedded data-structure database modeled after Redis. UnQLite is a JSON document store (like MongoDB, I guess??). Beneath the higher-level APIs, both Vedis and UnQLite are key/value stores, which puts them in the same category as BerkeleyDB, KyotoCabinet and LevelDB. The Python standard library also includes some dbm-style databases, including gdbm.

For fun, I thought I would put together a completely un-scientific benchmark showing the relative speeds of these various databases for storing and retrieving simple keys and values.

Here are the databases and drivers that I used for the test:

I'm running these tests with:

  • Linux 3.14.4
  • Python 2.7.7 (Py2K 4 lyfe!)
  • SSD

For the test, I simply recorded the time it took to store 100K simple key/value pairs (no collisions). Then I recorded the time it took to read back all these values. The results are in seconds elapsed:
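The timing procedure described above could be sketched roughly as follows. This is not the author's benchmark script and it produces no figures from the post: the `benchmark` helper and its key format are assumptions, and it is written for Python 3 rather than the Python 2.7 used in the tests. It accepts any object that supports item assignment and lookup, and a plain dict stands in for a real database here.

```python
import time


def benchmark(store, n=100000):
    """Time n writes, then n reads, against a dict-like store.

    Returns (write_seconds, read_seconds).
    """
    # Zero-padded keys are unique for n up to one million (no collisions).
    keys = ["k%06d" % i for i in range(n)]

    start = time.perf_counter()
    for key in keys:
        store[key] = key
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for key in keys:
        store[key]  # read the value back and discard it
    read_seconds = time.perf_counter() - start

    return write_seconds, read_seconds


# A plain dict is only a placeholder for a real key/value store.
store = {}
write_seconds, read_seconds = benchmark(store, n=1000)
```

To compare real databases, pass a handle from the driver under test in place of the dict, provided it supports the same `store[key]` syntax.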
