Walrus: Lightweight Python utilities for working with Redis

A couple weekends ago I got it into my head that I would build a thin Python wrapper for working with Redis. Andy McCurdy's redis-py is a fantastic low-level client library with built-in support for connection-pooling and pipelining, but it does little more than provide an interface to Redis' built-in commands (and rightly so). I decided to build a project on top of redis-py that exposed pythonic containers for the Redis data-types. I went on to add a few extras, including a cache and a declarative model layer. The result is walrus.

Installation

If you'd like to try it out, you can install walrus using pip. Note that you will also need to install redis-py and have a Redis server running.

$ pip install walrus

At the time of writing, the current version of walrus is 0.1.9.

Containers

Redis supports five data-types, and each of these types supports a number of special-purpose commands. To make working with these types easier, I wrote container objects that look like their built-in analogues. For instance, walrus hashes look like dict objects: they have familiar methods like keys(), items(), and update(), and they support item access using square brackets. Walrus sets behave like Python sets, and so on.

walrus comes with support for the five data-types, as well as an additional Array type implemented using Lua scripts (as opposed to Redis' linked-list implementation).

Working with containers is easy, as they can be instantiated by calling the corresponding method on the walrus database instance. Let's see how it works:

>>> from walrus import *
>>> db = Database(host='localhost', db=0)
>>> huey = db.Hash('huey')
>>> huey.update(color='white', temperament='ornery', type='kitty')
<Hash "huey": {'color': 'white', 'type': 'kitty', 'temperament': 'ornery'}>

>>> huey.keys()
['color', 'type', 'temperament']
>>> 'color' in huey
True
>>> huey['color']
'white'
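
To give a feel for how a container can look like a dict, here is a rough pure-Python sketch. It is not walrus's actual code: FakeRedis and HashSketch are hypothetical names, and FakeRedis is an in-memory stand-in so the example runs without a server. The idea is only that each dict operation maps onto one Redis hash command.

```python
from collections.abc import MutableMapping


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (hash commands only)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hdel(self, key, field):
        # Return the number of fields removed, like the real command.
        fields = self._hashes.get(key, {})
        if field in fields:
            del fields[field]
            return 1
        return 0

    def hkeys(self, key):
        return list(self._hashes.get(key, {}))

    def hlen(self, key):
        return len(self._hashes.get(key, {}))


class HashSketch(MutableMapping):
    """Dict-like view over one hash; each method issues one command."""

    def __init__(self, client, key):
        self.client = client
        self.key = key

    def __getitem__(self, field):
        value = self.client.hget(self.key, field)
        if value is None:
            raise KeyError(field)
        return value

    def __setitem__(self, field, value):
        self.client.hset(self.key, field, value)

    def __delitem__(self, field):
        if not self.client.hdel(self.key, field):
            raise KeyError(field)

    def __iter__(self):
        return iter(self.client.hkeys(self.key))

    def __len__(self):
        return self.client.hlen(self.key)


huey = HashSketch(FakeRedis(), 'huey')
# update(), keys(), items() and "in" all come for free from MutableMapping.
huey.update(color='white', temperament='ornery', type='kitty')
print(sorted(huey.keys()), 'color' in huey, huey['color'])
```

Subclassing MutableMapping means only five methods need to be written by hand; the rest of the dict interface is derived from them.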

There are similar APIs for the other data-types, which you can read about in the documentation.

Originally these containers were all I had planned on implementing, but I had such a good time working on this project that I just kept going.

Models

I thought it would be cool to add a lightweight layer for modelling structured data, something with a declarative API like Django's or peewee's. To that end, walrus supports declarative model classes and a number of field types for things like text, dates, integers, floats, and more.

Here is how I modeled a twitter-like app (which you can find in the examples):

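import datetime

# BaseModel is assumed here to be a walrus Model subclass bound to the
# database; see the example app for its definition.
class User(BaseModel):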
    username = TextField(primary_key=True)
    password = TextField(index=True)
    email = TextField()

    followers = ZSetField()
    following = ZSetField()


class Message(BaseModel):
    username = TextField(index=True)
    content = TextField(fts=True)
    timestamp = DateTimeField(default=datetime.datetime.now)

    def get_user(self):
        return User.load(self.username)

There are already a number of projects that do this, some of them quite well, such as Stdnet. Redisco, Rom, and limpyd are also similar projects. Stdnet looks to be the most sophisticated, but it relies on a ton of Lua scripts. Redisco, Rom, and limpyd (wtf is a limpyd?) all seem to offer only very basic column filters. The goal for walrus models was to support flexible, composable filtering using a combination of secondary indexes and set operations.

Walrus's model layer is built on top of the Redis hash, but all the interesting stuff happens in the indexes. Each field can have a number of secondary indexes which provide different ways to filter/query. For instance, the default index type is simply a big set of all values, and can be used to perform equality/inequality tests. For scalar values, the index is a sorted set, which can be sliced by value to perform greater-than and less-than queries. By combining filter options with set operations, walrus is able to support arbitrarily complex queries.
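
To make the index idea concrete, here is a small pure-Python sketch of the two kinds of index described above. It is only an illustration and does not reflect how walrus lays out its keys: EqualityIndex, RangeIndex, and the sample rows are hypothetical, Python sets stand in for Redis sets, and a sorted list stands in for a Redis sorted set.

```python
import bisect
from collections import defaultdict


class EqualityIndex:
    """Maps each field value to the set of object IDs holding that value."""

    def __init__(self):
        self.ids_by_value = defaultdict(set)

    def add(self, obj_id, value):
        self.ids_by_value[value].add(obj_id)

    def eq(self, value):
        return set(self.ids_by_value.get(value, ()))


class RangeIndex:
    """Keeps (value, id) pairs sorted, standing in for a sorted set."""

    def __init__(self):
        self.pairs = []

    def add(self, obj_id, value):
        bisect.insort(self.pairs, (value, obj_id))

    def gt(self, value):
        # Find the first pair whose value is strictly greater, then slice.
        values = [v for v, _ in self.pairs]
        start = bisect.bisect_right(values, value)
        return {obj_id for _, obj_id in self.pairs[start:]}


username_index = EqualityIndex()
timestamp_index = RangeIndex()

# Hypothetical rows: (id, username, timestamp).
rows = [(1, 'huey', 10), (2, 'mickey', 20), (3, 'huey', 30), (4, 'zaizee', 40)]
for obj_id, username, timestamp in rows:
    username_index.add(obj_id, username)
    timestamp_index.add(obj_id, timestamp)

# (username == 'huey') AND (timestamp > 15) is a set intersection.
huey_recent = username_index.eq('huey') & timestamp_index.gt(15)

# (username == 'huey') OR (username == 'mickey') is a set union.
huey_or_mickey = username_index.eq('huey') | username_index.eq('mickey')

print(huey_recent, huey_or_mickey)
```

Because every filter produces a set of IDs, filters compose freely: any combination of AND and OR reduces to intersections and unions over those sets.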

Full-text search

I'd like to take a quick detour to discuss the full-text search feature, since I think it's kind of neat. The full-text index is a basic inverted index where tokens correspond to sets of matching document IDs. The full-text search index implements the Porter stemming algorithm and also supports the double-metaphone algorithm and automatic stop-word removal.
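
For readers who have not met an inverted index before, here is a minimal pure-Python sketch of one. It is an illustration rather than walrus's implementation: the STOP_WORDS list, the tokenize and build_index helpers, and the sample documents are all made up, and stemming and double-metaphone are left out to keep it short.

```python
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {'a', 'an', 'and', 'is', 'the', 'this'}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop-words."""
    words = (word.strip('.,!?;:()"\'').lower() for word in text.split())
    return [word for word in words if word and word not in STOP_WORDS]


def build_index(documents):
    """Return a mapping of token -> set of IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


documents = {
    1: 'This is a message about Python and Redis.',
    2: 'Walrus is a Python library.',
    3: 'The walrus is an animal.',
}
index = build_index(documents)

print(sorted(index['python']), sorted(index['walrus']))
```

Looking up a word is then a single dictionary access, and multi-word queries become set operations over the results.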

The cool part is that I built a very simple search query parser that executes boolean expressions against the full-text index. This makes it possible to write things like:

expr = Message.content.search('python AND (walrus OR redis)')
messages = Message.query(expr)

This translates into roughly the following sequence of Redis commands:

"ZINTERSTORE" "temp.629a" "1" "message:content.fts.python"
"ZINTERSTORE" "temp.ebe2" "1" "message:content.fts.walru"
"ZINTERSTORE" "temp.bb37" "1" "message:content.fts.redi"
"ZUNIONSTORE" "temp.72a8" "2" "temp.ebe2" "temp.bb37"
"ZINTERSTORE" "temp.7fc3" "2" "temp.629a" "temp.72a8"
"ZREVRANGE" "temp.7fc3" "0" "-1"

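This sequence of operations means: the index sets for the stemmed tokens python, walru, and redi are each copied into a temporary key; the walru and redi sets are unioned; that union is intersected with the python set; and the members of the final set are read back from highest score to lowest.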

All of this is handled transparently by the backend!
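
If you are curious what such a query parser might look like, here is a small recursive-descent sketch in pure Python. It is not the parser walrus uses: the search function and the sample index are hypothetical, Python sets replace the temporary Redis keys, there is no stemming, and I have assumed that AND binds more tightly than OR when no parentheses are given.

```python
import re


def search(index, query):
    """Evaluate a query such as 'a AND (b OR c)' against token -> ID sets."""
    tokens = re.findall(r'\(|\)|[^\s()]+', query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        # OR is a set union, like ZUNIONSTORE.
        result = parse_and()
        while peek() == 'OR':
            advance()
            result = result | parse_and()
        return result

    def parse_and():
        # AND is a set intersection, like ZINTERSTORE.
        result = parse_atom()
        while peek() == 'AND':
            advance()
            result = result & parse_atom()
        return result

    def parse_atom():
        if peek() is None or peek() in (')', 'AND', 'OR'):
            raise ValueError('expected a word or "(", got %r' % peek())
        token = advance()
        if token == '(':
            result = parse_or()
            if peek() != ')':
                raise ValueError('expected closing parenthesis')
            advance()
            return result
        return set(index.get(token.lower(), ()))

    result = parse_or()
    if peek() is not None:
        raise ValueError('unexpected token: %r' % peek())
    return result


# Hypothetical index: token -> set of matching document IDs.
index = {
    'python': {1, 2, 4},
    'walrus': {2, 3},
    'redis': {4, 5},
}
matches = search(index, 'python AND (walrus OR redis)')
print(sorted(matches))
```

The evaluation order mirrors the command sequence above: the parenthesized OR is resolved into a union first, and that union is then intersected with the set for python.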

Creating objects and performing queries

The model layer is hopefully easy to work with and understand. Walrus makes use of operator overloads to create the query tree, which is then translated into a series of Redis statements and set operations.

Message.create(content='this is a message', username='huey')
msg = Message(content='this is another message', username='mickey')
msg.save()

# Get messages by "huey".
messages = Message.query(
    Message.username == 'huey',
    order_by=Message.timestamp.desc())

# Get messages by huey or mickey.
messages = Message.query(
    (Message.username == 'huey') | (Message.username == 'mickey'),
    order_by=Message.timestamp.desc())

# Find messages by huey matching a search query.
search_expr = Message.content.search('python AND (peewee OR huey OR walrus)')
messages = Message.query(
    search_expr & (Message.username == 'huey'),
    order_by=Message.timestamp)
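
For anyone wondering how operator overloading can produce a query tree, here is a stripped-down sketch. The Node, Compare, Combine, and Field classes are hypothetical and far simpler than what walrus does internally; the point is only that == and > return expression objects instead of booleans, and that & and | combine those objects into a tree.

```python
class Node:
    """Base class: combining nodes with & or | yields a larger tree."""

    def __and__(self, other):
        return Combine('AND', self, other)

    def __or__(self, other):
        return Combine('OR', self, other)


class Compare(Node):
    """Leaf node recording a single comparison against a field."""

    def __init__(self, field, op, value):
        self.field, self.op, self.value = field, op, value

    def __repr__(self):
        return '(%s %s %r)' % (self.field, self.op, self.value)


class Combine(Node):
    """Inner node joining two sub-expressions with AND or OR."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __repr__(self):
        return '(%r %s %r)' % (self.left, self.op, self.right)


class Field:
    """Comparison operators return Compare nodes rather than booleans."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Compare(self.name, '==', value)

    def __gt__(self, value):
        return Compare(self.name, '>', value)

    def __lt__(self, value):
        return Compare(self.name, '<', value)


username = Field('username')
timestamp = Field('timestamp')

tree = ((username == 'huey') | (username == 'mickey')) & (timestamp > 100)
print(tree)
```

This also explains the parentheses in the queries above: in Python, & and | bind more tightly than ==, so each comparison has to be wrapped before it can be combined.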

If you'd like to see more examples, check out the model documentation, the example twitter app, or the example diary app.

Caching

The final component of walrus is a caching API. The cache implements the standard get and set operations, and also provides a decorator which can be used to wrap expensive / cache-friendly functions or methods.

Here is how you might use the cache:

cache = db.cache(default_timeout=600)

@cache.cached()
def get_recommendations(person):
    # Perform some expensive calculation that can be cached.
    return RecommendationEngine(person).get_recommendations()
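
To show roughly what a decorator like this does, here is an in-memory sketch. CacheSketch and slow_square are hypothetical, a plain dict stands in for Redis, and the way the cache key is built from the function name and arguments is my own assumption rather than walrus's scheme.

```python
import functools
import time


class CacheSketch:
    """In-memory stand-in for a Redis-backed cache with per-key expiry."""

    def __init__(self, default_timeout=600):
        self.default_timeout = default_timeout
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, timeout=None):
        ttl = self.default_timeout if timeout is None else timeout
        self._data[key] = (time.time() + ttl, value)

    def get(self, key, default=None):
        expires_at, value = self._data.get(key, (0, default))
        if expires_at <= time.time():
            # Missing or expired: forget the entry and report a miss.
            self._data.pop(key, None)
            return default
        return value

    def cached(self, timeout=None):
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                # Assumed key scheme: function name plus its arguments.
                key = repr((fn.__name__, args, sorted(kwargs.items())))
                missing = object()
                value = self.get(key, missing)
                if value is missing:
                    value = fn(*args, **kwargs)
                    self.set(key, value, timeout)
                return value
            return wrapper
        return decorator


cache = CacheSketch(default_timeout=600)
calls = []


@cache.cached()
def slow_square(n):
    calls.append(n)  # Record each real invocation.
    return n * n


first = slow_square(4)
second = slow_square(4)  # Served from the cache; the function is not re-run.
print(first, second, calls)
```

The second call returns the stored result, so the wrapped function body runs only once per distinct set of arguments until the entry expires.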

If you're curious about the Cache, check out the documentation.

Reading more

The documentation can be found online, and the source code is available on GitHub.

Walrus is still very new, so if you find bugs or have feature requests, feel free to create an issue on GitHub.

Thanks for taking the time to read this post. I hope you enjoyed it!

Impostor walrus

photos/p1421038498.96.jpg

Accept no substitutes!

Comments (5)

Charlie | jan 12 2015, at 01:37pm

Geo -- Here is a link to my Xcolors and the stylesheet. The theme is called "Ivory". I created a matching pygments stylesheet for the code-blocks on the website, but have since replaced it with a different one.

https://gist.github.com/coleifer/8adf75a4b6c784567ad5

Geo | jan 12 2015, at 01:30pm

color scheme mainly, bash, vim, etc

Charlie | jan 12 2015, at 11:33am

You mind replying with that setup?

Which setup are you referring to?

Geo | jan 12 2015, at 10:33am

Nice write-up. Distracted by the color theme in the code blocks. (I do that sometimes.)

You mind replying with that setup?

Anonyme | jan 12 2015, at 07:09am

Wow such walrus model, lolled.

