Welcome! I'm a software engineer in Topeka, KS, and I like blogging about Python and programming in general. I'm the author of several open-source Python libraries, including Peewee ORM, Huey task queue, and lots more. Below you can find a list of my most recent blog posts.
If you don't know what you're doing here, check out some posts I'm not too embarrassed of.
In this post I'll provide instructions for building a standalone Python
sqlite3-compatible module which is powered by BerkeleyDB. This is possible because BerkeleyDB provides a SQL frontend, which essentially is the SQLite we all know and love with the b-tree code ripped out and replaced with BerkeleyDB.
I began an unlikely adventure into Python packaging this week when I made what I thought were some innocuous modifications to the source distribution and
setup.pyscript for the peewee database library. Over the course of a day, the
setup.pymore than doubled in size and underwent five major revisions as I worked to fix problems arising out of various differences in users environments. This was tracked in issue #1676, may it always bear witness to the complexities of Python packaging!
In this post I'll explain what happened, the various things I tried, and how I ended up resolving the issue.
The upcoming SQLite 3.25.0 release adds support for one of my favorite SQL language features: window functions. Over the past few years SQLite has released many changes to improve the efficiency of the query planner and virtual machine, plus many extension modules which can provide additional functionality. For example:
- Window function support,
which are available in
trunkand will be included in the next release (3.25.0).
- Postgresql-style UPSERT support, which was released on 2018-06-04 (3.24.0).
- FTS5 full-text search extension, which improves on the previous full-text search modules (3.9.0).
- json1 extension, which brings support for JSON to SQLite (3.9.0).
- lsm1 extension and virtual table, while not officially released, is included in the source.
- Eponymous-only virtual tables, or table-valued functions (3.9.0).
- csv virtual table for browsing CSVs directly with SQLite (3.14.0).
Unfortunately, many distributions and operating systems include very old versions of SQLite. When they do include a newer release, typically many of these additional modules are not compiled and are therefore unavailable for use in your Python applications.
In this post I'll walk through obtaining the latest version of SQLite's source code and how to compile it so it includes these exciting features. We'll be doing all of this with the goal of being able to use these features in our Python applications, so we'll also be discussing how to integrate Python with our custom SQLite library.
- Window function support, which are available in
I recently open-sourced a Python library for working with the networked key/value databases Kyoto Tycoon and Tokyo Tyrant. These databases sit atop Kyoto Cabinet and Tokyo Cabinet, respectively, and provide fast DBM implementations.
The Cabinet libraries expose a familiar key/value API backed by a number of different storage options, including persistent hash-tables and b-trees, as well as in-memory variants. A cabinet database can only be accessed by a single process at any given time, though they can be used safely in multi-threaded environments. For this reason, the author created an evented network layer that can expose these storage interfaces to multiple processes. These database servers are Kyoto Tycoon and Tokyo Tyrant, and in addition to providing a fast front-end to the underlying storage engines, they can add features like lua scripting, multi-master replication and LRU cache eviction.
Some scenarios in which you might find these databases useful:
- Memcached or Redis replacement. The LRU eviction provided by Kyoto Tycoon, combined with being scriptable with Lua, and backed with persistent storage, allows these databases to go beyond traditional key/value database roles.
- Time-series and event-logging. The B-Tree storage engine supports fast range scans and ordered key traversal, for rolling-up events, map/reduce workflows, and reporting.
- Full-text search, using the hash storage engine.
- Secondary indexes for external data-stores.
- Document database using lua tables in place of JSON objects.
Features of the kt library:
- Binary protocol support implemented as a C extension.
- Thread-safe and greenlet-safe.
- Simple and memorable APIs.
- Full-featured implementation of the protocols.
- Multiple serialization schemes, including JSON, msgpack and pickle.
If you're interested in trying out these fantastic databases with Python, the documentation for kt can be found here: http://kt-lib.readthedocs.io/en/latest/
On Monday of this week I merged in the
3.0abranch of peewee, a lightweight Python ORM, marking the official 3.0.0 release of the project. Today as I'm writing this, the project is at 3.0.9, thanks to so many helpful people submitting issues and bug reports. Although this was pretty much a complete rewrite of the 2.x codebase, I have tried to maintain backwards-compatibility for the public APIs.
In this post I'll discuss a bit about the motivation for the rewrite and some changes to the overall design of the library. If you're thinking about upgrading, check out the changes document and, if you are wondering about any specific APIs, take a spin through the rewritten (and much more thorough) API documentation.