Introduction to the fast new UnQLite Python Bindings
About a year ago, I blogged about some Python bindings I wrote for the embedded NoSQL document store UnQLite. One year later I'm happy to announce that I've rewritten the library using Cython and operations are, in most cases, an order of magnitude faster.
This was my first real attempt at using Cython and the experience was just the right mix of challenging and rewarding. I bought the O'Reilly Cython book, which came in super handy, so if you're interested in getting started with Cython, I recommend picking up a copy.
In this post I'll quickly touch on the features of UnQLite, then show you how to use the Python bindings. When you're done reading you should hopefully be ready to use UnQLite in your next Python project.
What is UnQLite?
UnQLite is a serverless JSON document store built on a fast key/value database. The key/value features make UnQLite kin to DBM-style databases (BerkeleyDB, KyotoCabinet), while the JSON document store is closer to something like MongoDB. UnQLite occupies a unique place in the NoSQL world, though, through its use of a special scripting language, Jx9, to manage the JSON document store. To make an analogy, UnQLite is to MongoDB what SQLite is to Postgres, and the Jx9 scripting language serves the same purpose in UnQLite as SQL does in SQLite. (Note that although UnQLite sounds like SQLite, the projects are not affiliated.)
Here is a quick run-down of some of the features UnQLite's creators, Symisc Systems, decided were worth putting on the project's homepage:
- Serverless, NoSQL database.
- Transactional (ACID) database.
- Zero config.
- Single database file, no temporary files.
- Cross-platform file format.
- Self-contained C library without dependencies.
- Standard key/value store with powerful disk storage engine supporting O(1) lookup time.
- Document store (JSON) database via Jx9.
- Cursors for linear record traversal.
- Supports Terabyte sized databases.
- BSD licensed.
Installing UnQLite
To get started, let's create a virtualenv and install unqlite-python. unqlite-python comes with a pre-generated C source-code file for the extension, but if you'd like, you can install Cython and a new source file will be generated.
$ virtualenv unqlite-demo
New python executable in unqlite-demo/bin/python2
Also creating executable in unqlite-demo/bin/python
Installing setuptools, pip...done.
$ cd unqlite-demo
$ source bin/activate
(unqlite-demo) $ pip install Cython unqlite
...
Successfully built Cython unqlite
Installing collected packages: Cython, unqlite
Successfully installed Cython-0.22.1 unqlite-0.4.1
You can verify your install worked by running the following, which should produce no output:
$ python -c "import unqlite; unqlite.UnQLite()"
Key/Value Features
If UnQLite were only a key/value store, it would still be a fantastic database thanks to its speed, cursors and transaction support. In this section we'll take a look at how to use the key/value features of UnQLite.
UnQLite databases can reside in a single file on disk, or entirely in memory. To begin working with UnQLite, the first step is to create a database object:
>>> from unqlite import UnQLite
>>> db = UnQLite()
The above statements will create an in-memory database. To use a file, you would instead pass in the filename when instantiating your db object.
unqlite-python implements a similar API to Python's dict object, so it should feel pretty familiar:
>>> db['foo'] = 'bar'
>>> print db['foo']
bar
>>> 'foo' in db
True
>>> del db['foo']
>>> len(db)
0
>>> db.update({'huey': 'kitty', 'mickey': 'puppy'})
>>> print [item for item in db]
[('huey', 'kitty'), ('mickey', 'puppy')]
As shown in the example above, you can iterate directly over the database, which will yield key/value pairs. unqlite-python databases also support keys() and values() methods.
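Because the bindings follow the dict protocol, small helpers can be written against that protocol alone. Here is a minimal sketch of a read-through cache helper. The name get_or_compute is hypothetical (it is not part of unqlite-python), and it relies only on the membership test, item assignment and item lookup shown above. A plain dict stands in for the database so the snippet runs without the extension installed; passing an UnQLite() instance instead should behave the same way, although stored values may come back as byte strings depending on your Python and library versions.

```python
def get_or_compute(db, key, compute):
    """Return db[key], computing and storing the value on a miss.

    `db` is anything supporting `in`, `db[key]` and `db[key] = value`,
    i.e. the subset of the dict protocol shown in the post.
    This helper is illustrative and not part of unqlite-python.
    """
    if key not in db:
        db[key] = compute()
    # Always read back from the store so hits and misses return the same type.
    return db[key]


calls = []


def expensive():
    calls.append(1)  # Track how often the computation actually runs.
    return 'kitty'


cache = {}  # Stand-in; an UnQLite() instance is the intended argument.
first = get_or_compute(cache, 'huey', expensive)
second = get_or_compute(cache, 'huey', expensive)
print('%s %s (computed %d time)' % (first, second, len(calls)))
```

Reading the value back after a miss, rather than returning the computed object directly, keeps the return type consistent between the first call and later ones.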
For finer-grained iteration, you can use Cursors.
>>> for i in range(7):
...     db['k%s' % i] = 'v%s' % i
...
>>> with db.cursor() as cursor:
...     cursor.seek('k4')
...     print cursor.value()
...     for key, value in cursor:  # Cursors are also iterable.
...         print (key, value)
...
v4
('k4', 'v4')
('k5', 'v5')
('k6', 'v6')
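The seek-then-iterate pattern above can be wrapped in a small helper that reads a bounded number of records. This is only a sketch: scan_from is a hypothetical name, and it assumes nothing beyond what the example shows, namely that db.cursor() is a context manager and that the cursor supports seek() and iteration from the current position.

```python
from itertools import islice


def scan_from(db, start_key, limit):
    """Return up to `limit` (key, value) pairs, starting at `start_key`.

    Assumes `db.cursor()` yields a cursor with `seek()` that can be
    iterated from its current position, as in the example above.
    This helper is illustrative and not part of unqlite-python.
    """
    with db.cursor() as cursor:
        cursor.seek(start_key)
        # islice stops early, so the cursor is not walked to the end.
        return list(islice(cursor, limit))
```

Given the seven keys inserted above and the ordering shown in that output, scan_from(db, 'k4', 2) would be expected to return the k4 and k5 pairs.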
If you're using a file-backed database, UnQLite supports transactions. The simplest way to use transactions is as a context manager:
>>> db = UnQLite('/tmp/test.udb')
>>> with db.transaction():
...     db['foo'] = 'bar'
...
>>> print db['foo']
bar
>>> with db.transaction():
...     db['foo'] = 'baze'
...     db.rollback()  # Undo the changes.
...
>>> print db['foo'] # Prints the original value.
bar
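Where transactions really pay off is when several keys must change together. The following is a rough sketch of that idea, using only the transaction() context manager and rollback() call shown above. The transfer function and the idea of storing integer balances as strings are assumptions made for illustration, and db is assumed to be a file-backed database.

```python
def transfer(db, source, target, amount):
    """Move `amount` between two integer balances stored as strings.

    Both keys are written inside one transaction; if the source balance
    would go negative, the writes are undone with db.rollback().
    This helper is illustrative and not part of unqlite-python.
    """
    with db.transaction():
        new_source = int(db[source]) - amount
        new_target = int(db[target]) + amount
        db[source] = str(new_source)
        db[target] = str(new_target)
        # The check happens after writing purely to demonstrate rollback;
        # real code could just as well validate before writing anything.
        if new_source < 0:
            db.rollback()
            return False
    return True
```

A caller would use it as transfer(db, 'alice', 'bob', 5) and check the boolean result; the key names here are made up.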
JSON Document Store
UnQLite has this crazy scripting language baked-in, which is used to query the JSON document store. Jx9 serves the same purpose in UnQLite as SQL does in SQLite, but it can also do a whole lot of crazy stuff.
I took some care to make it really easy to pass Python values into the Jx9 scripts, and pull them back out after execution. Here is a silly example:
>>> script = """
... $my_data = {
...     os_name: uname(),  // jx9 builtin function
...     date: __DATE__,  // another builtin
...     foo: $py_value  // just a simple key/value.
... };
... """
>>> with db.vm(script) as jx9_vm:
...     jx9_vm['py_value'] = {'baze': 'nugget'}  # Set the value of $py_value
...     jx9_vm.execute()
...     print jx9_vm['my_data']  # Extract $my_data from the executed script.
...
{'date': '2015-07-21', 'os_name': 'Linux 4.0.7-2-ARCH #1 ... lambda x86_64', 'foo': {'baze': 'nugget'}}
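The set, execute, extract sequence above is easy to fold into one function. Here is a minimal sketch; run_jx9 is a hypothetical helper name, and it assumes only the VM behavior shown in the example: db.vm(script) works as a context manager, item assignment sets a Jx9 variable, execute() runs the script, and item lookup reads a variable back.

```python
def run_jx9(db, script, inputs=None, outputs=()):
    """Run a Jx9 script and return the requested variables as a dict.

    `inputs` maps Jx9 variable names (without the `$`) to Python values;
    `outputs` lists the variable names to read back after execution.
    This helper is illustrative and not part of unqlite-python.
    """
    with db.vm(script) as vm:
        for name, value in (inputs or {}).items():
            vm[name] = value  # Becomes $name inside the script.
        vm.execute()
        # Read the results while the VM is still open.
        return dict((name, vm[name]) for name in outputs)
```

With the script defined above, run_jx9(db, script, {'py_value': {'baze': 'nugget'}}, ['my_data']) should return a dictionary with a single my_data key.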
This procedural scripting language is used to work with Collections of JSON documents. Rather than forcing you to write Jx9 scripts, unqlite-python abstracts away some of the most common operations behind a Collection class:
>>> users = db.collection('users')
>>> users.create() # Create the collection.
>>> users.store({
...     'name': 'Charlie',
...     'pets': [{'name': 'mickey'}, {'name': 'huey'}],
...     'best friends': ['Leslie', 'Connor'],
... })
0
When we store an object, UnQLite returns the __id of the newly-created document. Multiple objects can be stored by passing in a list of dictionaries. To update an object, simply specify the __id and the dictionary of new data:
>>> users.store([{'name': 'Leslie'}, {'name': 'Connor', 'type': 'baby'}])
2
>>> users.update(1, {'name': 'Leslie', 'favorite_color': 'green'})
True
To view the documents in a collection, you can call Collection.all():
>>> users.all()
[{'__id': 0,
  'best friends': ['Leslie', 'Connor'],
  'name': 'Charlie',
  'pets': [{'name': 'mickey'}, {'name': 'huey'}]},
 {'__id': 1, 'favorite_color': 'green', 'name': 'Leslie'},
 {'__id': 2, 'name': 'Connor', 'type': 'baby'}]
Things get neat when it comes to filtering collections. To filter a collection, you write a filter function in Python, and unqlite-python will expose it to a Jx9 script that performs the filtering:
>>> def babies(document):
...     return document.get('type') == 'baby'
...
>>> users.filter(babies)
[{'__id': 2, 'name': 'Connor', 'type': 'baby'}]
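Since a filter is just a Python function that takes a document and returns a truthy or falsy value, filters can be built on the fly. The sketch below shows two hypothetical helpers, field_equals and all_of, neither of which is part of unqlite-python. It assumes that Collection.filter() accepts any callable of one argument, and it is exercised here on plain dictionaries so it runs without a database.

```python
def field_equals(field, expected):
    """Build a filter matching documents whose `field` equals `expected`."""
    def predicate(document):
        return document.get(field) == expected
    return predicate


def all_of(*predicates):
    """Build a filter that passes only when every given filter passes."""
    def predicate(document):
        return all(p(document) for p in predicates)
    return predicate


# Plain dictionaries standing in for documents from a collection.
documents = [
    {'__id': 0, 'name': 'Charlie'},
    {'__id': 2, 'name': 'Connor', 'type': 'baby'},
]
is_baby = field_equals('type', 'baby')
baby_connor = all_of(is_baby, field_equals('name', 'Connor'))
matches = [doc for doc in documents if baby_connor(doc)]
```

Under the assumption stated above, users.filter(field_equals('type', 'baby')) would be the equivalent of the babies example.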
For more information about collections, check out the unqlite-python documentation.
Thanks for reading
Thanks for taking the time to read this post. I hope you found the content interesting. I think UnQLite is one of those weird, quirky projects that would be fun to build all sorts of little apps with. It could be used as a cache, of course, but you could go deep into learning Jx9 and write all sorts of crazy stuff.
If you have any questions or comments, please leave a comment or contact me. If you are trying out unqlite-python and believe you've found a bug, don't hesitate to create a ticket on GitHub.
Links
Here are some links you may find useful:
- unqlite-python documentation and source code
- UnQLite homepage
- vedis-python, Python bindings for vedis, an embedded Redis-like database from the makers of UnQLite.
Here are some related blog posts that you may enjoy:
- Using SQLite4's LSM storage engine as a stand-alone NoSQL database with Python
- Using the SQLite JSON extension with Python
- Alternative Redis-like databases with Python
- Completely un-scientific benchmark of some embedded databases with Python
Comments (2)
Depado | jul 23 2015, at 08:51am
That was a very nice article ! I'll give your binding a try for sure, and thanks for making me dicover UnQLite !
sirkonst | jul 27 2015, at 02:59am
How about thread/processes safe?