Entries tagged with python
The SQLite source tree is full of wonders. There is the the lemon parser generator, a btree implementation (well, kind of not surprising), multiple search engines, a json library, and more. It looks like Dr. Hipp is also experimenting with integrating the LSM key/value store from SQLite4 as a standalone virtual table.
I've written about the json and full-text search extensions, the lsm key/value store, and the transitive closure extension (useful when querying hierarchical data). In this post I'll be covering another interesting extension, the
spellfix1 extension (documentation).
One of the benefits of running an embedded database like SQLite is that you can configure SQLite to call into your application's code. SQLite provides APIs that allow you to create your own scalar functions, aggregate functions, collations, and even your own virtual tables. In this post I'll describe how I used the virtual table APIs to expose a nice API for creating table-valued (or, multi-value) functions in Python. The project is called
sqlite-vtfunc and is hosted on GitHub.
I've been working on some new features for Scout and thought they might be worth a short blog post. The super-short version is that Scout now supports complex filtering on metadata, adding another layer of filtering besides the full-text search. Additionally, I've added support for SQLite FTS5, using it by default if it's available otherwise falling back to FTS4.
Back in September, word started getting around trendy programming circles about a new file that had appeared in the SQLite fossil repo named json1.c. I originally wrote up a post that contained some gross hacks in order to get pysqlite to compile and work with the new
json1 extension. With the release of SQLite 3.9.0, those hacks are no longer necessary.
SQLite 3.9.0 is a fantastic release. In addition to the much anticipated
json1 extension, there is a new version of the full-text search extension called
fts5 improves performance of complex search queries and provides an out-of-the-box BM25 ranking implementation. You can also boost the significance of particular fields in the ranking. I suggest you check out the release notes for the full list of enhancements
This post will describe how to compile SQLite with support for
fts5. We'll use the new SQLite library to compile a python driver so we can use the new features from python. Because I really like
apsw, I've included instructions for building both of them. Finally, we'll use peewee ORM to run queries using the
This post is going to be a greatest hits of my open-source libraries and blog posts concerning the use of SQLite with Python. I'll also share a list of some other neat SQLite projects that you may not have heard of before.
SQLite and Key/Value databases are two of my favorite topics to blog about. Today I get to write about both, because in this post I will be demonstrating a Python wrapper for SQLite4's log-structured merge-tree (LSM) key/value store.
I don't actively follow SQLite's releases, but the recent release of SQLite 3.8.11 drew quite a bit of attention as the release notes described massive performance improvements over 3.8.0. While reading the release notes I happened to see a blurb about a new, experimental full-text search extension (which I wrote about in a different post), and all this got me to wondering what was going on with SQLite4.
As I was reading about SQLite4, I saw that one of the design goals was to provide an interface for pluggable storage engines. At the time I'm writing this, SQLite4 has two built-in storage backends, one of which is an LSM key/value store. Over the past month or two I've been having fun with Cython, writing Python wrappers for the embedded key/value stores UnQLite and Vedis. I figured it would be cool to use Cython to write a Python interface for SQLite4's LSM storage engine.
Read the rest of the post for examples of how to use the library.
SQLite 126.96.36.199 contains a new, experimental version of the full-text search extension named FTS5. Reviewing the documentation for FTS5, I saw that it includes a couple cool enhancements, namely a more sophisticated query language, and built-in BM25 result ranking.
I decided to give it a try and thought I'd share my notes on compiling the extension in case anyone else is curious.
About a year ago, I blogged about some Python bindings I wrote for the embedded NoSQL document store UnQLite. One year later I'm happy to announce that I've rewritten the library using Cython and operations are, in most cases, an order of magnitude faster.
This was my first real attempt at using Cython and the experience was just the right mix of challenging and rewarding. I bought the O'Reilly Cython Book which came in super handy, so if you're interested in getting started with Cython I recommend picking up a copy.
In this post I'll quickly touch on the features of UnQLite, then show you how to use the Python bindings. When you're done reading you should hopefully be ready to use UnQLite in your next Python project.
Redis is one of the more unique NoSQL offerings to have become popular over the past five years. It seems that there is no limit to the use-cases one can find for Redis. It's fantastic as a cache, doubles as a task-queue, can provide fast type-ahead search, and much more. The idea that you can store data-structures instead of rows and columns, keys and values, or JSON documents strikes me as particularly innovative. A while back I released walrus, a collection of Python utilities I'd built to simplify some of these use-cases and provide Pythonic APIs for the data-structures Redis natively supports. If you're a Python developer you might check it out.
Recently I've learned about a few new Redis-like databases: Rlite, Vedis and LedisDB. Each of these projects offers a slightly different take on the data-structure server you find in Redis, so I thought that I'd take some time and see how they worked. In this post I'll share what I've learned, and also show you how to use these databases with Walrus, as I've added support for them in the latest 0.3.0 release.
In this post I'll describe how to implement tagging with a relational database. What I mean by tagging are those little labels you see at the top of this blog post, which indicate how I've chosen to categorize the content. There are many ways to solve this problem, and I'll try to describe some of the more popular methods, as well as one unconventional approach using bitmaps. In each section I'll describe the database schema, try to list the benefits and drawbacks, and present example queries. I will use Peewee ORM for the example code, but hopefully these examples will easily translate to your tool-of-choice.