Entries tagged with nosql

Examples of using Walrus, a lightweight Redis Toolkit

walrus is my go-to toolkit for working with Redis in Python, and hopefully this post will convince you that it can be your go-to as well. I've tried to include lots of high-level Python APIs built on Redis primitives and the result is quite a lot of functionality. In this post I'll take you on a tour of the library and show examples of how it might be useful in your next project.

Read more...

Announcing sophy: fast Python bindings for Sophia Database

Sophia is a powerful key/value database with loads of features packed into a simple C API. In order to use this database in some upcoming projects I've got planned, I decided to write some Python bindings and the result is sophy. In this post, I'll describe the features of Sophia database, and then show example code using sophy, the Python wrapper.

Here is an overview of the features of the Sophia database:

  • Append-only MVCC database
  • ACID transactions
  • Consistent cursors
  • Compression
  • Ordered key/value store
  • Range searches
  • Prefix searches
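To make the last three bullet points concrete, here is a minimal sketch of what range and prefix searches over an ordered key/value store look like. It is a toy in-memory model built on the standard library's `bisect` module. It is not Sophia's implementation and does not use the sophy API; the `OrderedStore` class and its method names are made up for illustration.

```python
import bisect


class OrderedStore:
    """Toy in-memory ordered key/value store (illustration only, not sophy)."""

    def __init__(self):
        self._keys = []  # keys, always kept in sorted order
        self._data = {}  # key -> value

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def range(self, start, stop):
        """Yield (key, value) pairs with start <= key <= stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._data[key]

    def prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        idx = bisect.bisect_left(self._keys, prefix)
        while idx < len(self._keys) and self._keys[idx].startswith(prefix):
            key = self._keys[idx]
            yield key, self._data[key]
            idx += 1


store = OrderedStore()
for key in ['user:2', 'post:1', 'user:1', 'post:2', 'tag:1']:
    store.put(key, key.upper())

user_items = list(store.prefix('user:'))
range_items = list(store.range('post:1', 'tag:1'))
print(user_items)
print(range_items)
```

Because the keys are kept sorted, both searches find their starting point with a binary search and then read forward, which is why ordered stores can support these queries cheaply.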

Read more...

Using SQLite4's LSM Storage Engine as a Stand-alone NoSQL Database with Python

SQLite and Key/Value databases are two of my favorite topics to blog about. Today I get to write about both, because in this post I will be demonstrating a Python wrapper for SQLite4's log-structured merge-tree (LSM) key/value store.

I don't actively follow SQLite's releases, but the recent release of SQLite 3.8.11 drew quite a bit of attention as the release notes described massive performance improvements over 3.8.0. While reading the release notes I happened to see a blurb about a new, experimental full-text search extension (which I wrote about in a different post), and all this got me to wondering what was going on with SQLite4.

As I was reading about SQLite4, I saw that one of the design goals was to provide an interface for pluggable storage engines. At the time I'm writing this, SQLite4 has two built-in storage backends, one of which is an LSM key/value store. Over the past month or two I've been having fun with Cython, writing Python wrappers for the embedded key/value stores UnQLite and Vedis. I figured it would be cool to use Cython to write a Python interface for SQLite4's LSM storage engine.

After pulling down the SQLite4 source code and reading through the LSM header file (it's very small!), I started coding and the result is python-lsm-db (docs).

Read the rest of the post for examples of how to use the library.

Read more...

Introduction to the fast new UnQLite Python Bindings

About a year ago, I blogged about some Python bindings I wrote for the embedded NoSQL document store UnQLite. One year later I'm happy to announce that I've rewritten the library using Cython and operations are, in most cases, an order of magnitude faster.

This was my first real attempt at using Cython and the experience was just the right mix of challenging and rewarding. I bought the O'Reilly Cython Book which came in super handy, so if you're interested in getting started with Cython I recommend picking up a copy.

In this post I'll quickly touch on the features of UnQLite, then show you how to use the Python bindings. When you're done reading you should hopefully be ready to use UnQLite in your next Python project.

Read more...

Alternative Redis-Like Databases with Python

Redis is one of the more unique NoSQL offerings to have become popular over the past five years. It seems that there is no limit to the use-cases one can find for Redis. It's fantastic as a cache, doubles as a task-queue, can provide fast type-ahead search, and much more. The idea that you can store data-structures instead of rows and columns, keys and values, or JSON documents strikes me as particularly innovative. A while back I released walrus, a collection of Python utilities I'd built to simplify some of these use-cases and provide Pythonic APIs for the data-structures Redis natively supports. If you're a Python developer you might check it out.

Recently I've learned about a few new Redis-like databases: Rlite, Vedis and LedisDB. Each of these projects offers a slightly different take on the data-structure server you find in Redis, so I thought that I'd take some time and see how they worked. In this post I'll share what I've learned, and also show you how to use these databases with Walrus, as I've added support for them in the latest 0.3.0 release.

Read more...

Naive Bayes Classifier using Python and Kyoto Cabinet

In this post I will describe how to build a simple naive Bayes classifier with Python and the Kyoto Cabinet key/value database. I'll begin with a short description of how a probabilistic classifier works, then we will implement a simple classifier and put it to use by writing a spam detector. The training and test data will come from the Enron spam/ham corpora, which contain several thousand emails that have been pre-categorized as spam or ham.
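As a rough preview of the idea, here is a minimal sketch of a naive Bayes text classifier, written for Python 3. It keeps its counts in plain in-memory dictionaries rather than Kyoto Cabinet, uses add-one smoothing (my assumption, the full post may handle unseen words differently), and trains on four made-up messages instead of the Enron data. The `NaiveBayes` class and its method names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        return re.findall(r'[a-z]+', text.lower())

    def train(self, label, text):
        self.doc_counts[label] += 1
        for word in self.tokenize(text):
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def score(self, label, text):
        """Log of P(label) times the product of P(word | label)."""
        total_docs = sum(self.doc_counts.values())
        log_prob = math.log(self.doc_counts[label] / total_docs)
        # Add-one smoothing so an unseen word does not zero out the product.
        denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
        for word in self.tokenize(text):
            log_prob += math.log((self.word_counts[label][word] + 1) / denom)
        return log_prob

    def classify(self, text):
        return max(self.doc_counts, key=lambda label: self.score(label, text))


# Made-up training messages, for illustration only.
clf = NaiveBayes()
clf.train('spam', 'win cash now, claim your free prize')
clf.train('spam', 'free offer, click now to win')
clf.train('ham', 'meeting notes attached for the project review')
clf.train('ham', 'can we move the project meeting to friday')

first = clf.classify('claim your free cash')
second = clf.classify('project review meeting on friday')
print(first, second)
```

Working with log-probabilities instead of multiplying raw probabilities avoids floating-point underflow on long messages. Since the model is just a set of counters, moving it into a key/value database is mostly a matter of swapping the dictionaries for database reads and writes.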

Read more...

Walrus: Lightweight Python utilities for working with Redis

A couple of weekends ago I got it into my head that I would build a thin Python wrapper for working with Redis. Andy McCurdy's redis-py is a fantastic low-level client library with built-in support for connection-pooling and pipelining, but it does little more than provide an interface to Redis' built-in commands (and rightly so). I decided to build a project on top of redis-py that exposed Pythonic containers for the Redis data-types. I went on to add a few extras, including a cache and a declarative model layer. The result is walrus.
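To give a feel for what a "Pythonic container" over a Redis data-type means, here is a minimal sketch of a dict-like wrapper around a Redis hash. This is not walrus's actual code: the `Hash` class is written from scratch for illustration, and `FakeClient` is a hypothetical in-memory stand-in so the example runs without a Redis server. The client methods it calls (`hset`, `hget`, `hdel`, `hexists`, `hlen`) do exist in redis-py under those names.

```python
class Hash:
    """Dict-like view over a single Redis hash (illustrative only)."""

    def __init__(self, client, key):
        self.client = client
        self.key = key

    def __setitem__(self, field, value):
        self.client.hset(self.key, field, value)

    def __getitem__(self, field):
        value = self.client.hget(self.key, field)
        if value is None:
            raise KeyError(field)
        return value

    def __delitem__(self, field):
        if not self.client.hdel(self.key, field):
            raise KeyError(field)

    def __contains__(self, field):
        return bool(self.client.hexists(self.key, field))

    def __len__(self):
        return self.client.hlen(self.key)


class FakeClient:
    """Hypothetical in-memory stand-in for the few client methods used above."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        fields = self._hashes.setdefault(name, {})
        created = key not in fields
        fields[key] = value
        return int(created)

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)

    def hdel(self, name, *keys):
        fields = self._hashes.get(name, {})
        removed = 0
        for key in keys:
            if key in fields:
                del fields[key]
                removed += 1
        return removed

    def hexists(self, name, key):
        return key in self._hashes.get(name, {})

    def hlen(self, name):
        return len(self._hashes.get(name, {}))


client = FakeClient()  # with a real server: redis.Redis(decode_responses=True)
profile = Hash(client, 'user:1')
profile['name'] = 'charlie'
profile['lang'] = 'python'
del profile['lang']
print(profile['name'], 'lang' in profile, len(profile))
```

The wrapper holds no data of its own: every operation turns into one Redis command, so the container always reflects whatever is stored under that key.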

Read more...

Completely un-scientific benchmarks of some embedded databases with Python

I've spent some time over the past couple weeks playing with the embedded NoSQL databases Vedis and UnQLite. Vedis, as its name might indicate, is an embedded data-structure database modeled after Redis. UnQLite is a JSON document store (like MongoDB, I guess??). Beneath the higher-level APIs, both Vedis and UnQLite are key/value stores, which puts them in the same category as BerkeleyDB, KyotoCabinet and LevelDB. The Python standard library also includes some dbm-style databases, including gdbm.

For fun, I thought I would put together a completely un-scientific benchmark showing the relative speeds of these various databases for storing and retrieving simple keys and values.

Here are the databases and drivers that I used for the test:

I'm running these tests with:

  • Linux 3.14.4
  • Python 2.7.7 (Py2K 4 lyfe!)
  • SSD

For the test, I simply recorded the time it took to store 100K simple key/value pairs (no collisions). Then I recorded the time it took to read back all these values. The results are in seconds elapsed:

p1404105100.86.png
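The timing procedure described above can be sketched roughly as follows. This is not the script I used for the numbers in the chart: it is a simplified Python 3 version, the `benchmark` function is a made-up helper, and it is shown against a plain dict and the standard library's `dbm.dumb` module with a small number of keys so that it runs quickly anywhere.

```python
import dbm.dumb
import os
import tempfile
import time


def benchmark(db, n=100000):
    """Return (write_seconds, read_seconds) for n distinct key/value pairs.

    `db` can be any object that supports db[key] = value and db[key].
    """
    keys = [('k%d' % i).encode() for i in range(n)]  # distinct keys

    start = time.perf_counter()
    for key in keys:
        db[key] = key
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for key in keys:
        if db[key] != key:
            raise ValueError('unexpected value for %r' % key)
    read_seconds = time.perf_counter() - start

    return write_seconds, read_seconds


# Baseline: a plain in-memory dict.
dict_times = benchmark({}, n=1000)

# A dbm-style database from the standard library, stored in a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    db = dbm.dumb.open(os.path.join(tmp, 'bench'), 'c')
    try:
        dumb_times = benchmark(db, n=1000)
    finally:
        db.close()

print('dict     write=%.4fs read=%.4fs' % dict_times)
print('dbm.dumb write=%.4fs read=%.4fs' % dumb_times)
```

Any of the embedded databases can be dropped in the same way as long as its driver exposes dict-style item access. Drivers that do not would need a small adapter.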

Read more...

Python bindings for UnQLite, an embedded NoSQL database/JSON document store

Note, Aug 2015: I've rewritten the UnQLite Python bindings using Cython, and they are much faster. Here is a blog post announcing the update.

Original post:

I'm happy to write that I've just released some Python bindings for UnQLite, an embedded NoSQL database and JSON document store. UnQLite might be characterized as the SQLite of NoSQL databases, though its JSON document store and Jx9 scripting language make it a pretty unique offering. UnQLite is created by Symisc Systems, who are also responsible for Vedis, an embedded Redis-like database (I also wrote some Python bindings for Vedis). Here is a quick overview of some of UnQLite's features, as described on the project homepage:

  • Embedded, zero-config database
  • Transactional (ACID)
  • Single file or in-memory database
  • Key/value store
  • Cursor support and linear record traversal
  • JSON document store
  • Thread-safe
  • Terabyte-sized databases

In the rest of this post I will show some basic usage of the unqlite-python library. If you'd like to follow along, you can use pip to install unqlite:

pip install unqlite

You can find the project source code hosted on GitHub and the documentation is available on readthedocs.

Read on for the details!

Read more...

Python bindings for Vedis, the Embedded NoSQL Database

Over the past week I've been writing some Python bindings for the embedded NoSQL database Vedis, a transactional data-store modeled after Redis. Like Redis, Vedis could be characterized as an advanced key-value store that supports hash, set and list data-structures. Vedis has over 70 available commands for working with the various data types. Unlike Redis, which is run as a separate server process, Vedis is embedded in the host process like SQLite. Vedis works with either in-memory databases or on-disk databases. Vedis is transactional (ACID) and also thread-safe. If you'd like more information, check out the Vedis FAQ.

Vedis-python

Vedis-python allows you to use Vedis in your Python apps. Vedis-python supports all the Vedis data-types, and also allows you to extend Vedis by writing your own commands in Python. As I mentioned, this project is very new, so while I have written pretty extensive unit tests, the library has certainly not been battle-tested yet.

If you'd like to give it a try, you can use pip to install vedis-python. At the time of writing the current version is 0.1.5.

$ pip install vedis-python

Just a word of caution: I've tested the installation on various flavors of Linux (including on my Raspberry Pi) and on Mac OS X, but have not tested on Windows.

Read the rest of the post for the details.

Read more...