Python's yield from

April 13, 2014 12:13 / 0 comments

The yield from syntax, introduced in PEP 380, is getting a lot of attention lately due to its important role in the new asyncio package. I did not immediately understand what this syntax provides, but I have found a handy way of thinking about it that I thought I'd share.

Imagine you have an arbitrarily nested list structure like so:

lists = [
    1, 2, 3,
    [4, 5, [6, 7], 8],
    [[[9, 10], 11]],
    [[]],
    12,
]

You can flatten this data structure by writing a recursive generator, thanks to the new yield from syntax:

def flatten(items):
    for item in items:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

The output would then be:

>>> [item for item in flatten(lists)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

To achieve this using Python 2.x, which does not have yield from, you would instead write the recursive call like this:

if isinstance(item, (list, tuple)):
    for subitem in flatten(item):
        yield subitem
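As a quick check, here is a minimal sketch that puts both versions side by side and confirms they produce the same output for the nested list above. The name flatten_loop is made up for this illustration; it is just the explicit-loop form written out in full.

```python
def flatten(items):
    """Recursive generator that delegates with yield from (Python 3.3+)."""
    for item in items:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def flatten_loop(items):
    """Same traversal, but re-yielding each sub-item with an explicit loop.

    flatten_loop is a hypothetical name used only for this comparison.
    """
    for item in items:
        if isinstance(item, (list, tuple)):
            for subitem in flatten_loop(item):
                yield subitem
        else:
            yield item


lists = [1, 2, 3, [4, 5, [6, 7], 8], [[[9, 10], 11]], [[]], 12]

result = list(flatten(lists))
result_loop = list(flatten_loop(lists))
print(result == result_loop)
```

For plain iteration like this the two forms are interchangeable. Per PEP 380, yield from additionally forwards values passed in with send() and captures the sub-generator's return value, which the explicit loop does not do.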

In Honor of Spring...

March 17, 2014 11:18 / 0 comments

Two new themes.


Lawrence, KS

February 26, 2014 11:18 / 5 comments

I am proud to live in Lawrence, KS, a college town of about 100,000 which has been my home for the majority of my life. Perhaps the most striking feature about my home is the amazing sky here -- nowhere else I've lived comes close:

Being in the tech industry, I'm often asked if I have plans to move away to a place with more jobs. I always answer simply and somewhat apologetically that I intend to stay in Kansas. Answering that way is so much less embarrassing than explaining why I love Kansas. My home is very much a part of me, though, and I'd like to write just once about why I am so happy to live here.


How do you use peewee?

February 22, 2014 13:11 / 5 comments

When I first wrote peewee I set out to accomplish a simple task: make it easy to execute queries in my Flask apps. I was a bit familiar with SQLAlchemy, but wanted something lightweight and thought it would be a quick project. While the first version only took a couple of days to write, over the past two or three years peewee has been my favorite project to work on. I've been very surprised to see that its user base has grown, and would like to ask anyone who is using peewee:

How do you use peewee?

I'd like to add a "testimonials" section to the documentation that describes the interesting projects people have written using peewee. If you don't mind sharing, I'd love to hear about your project.


Window functions, case statements, and savepoints in peewee

February 21, 2014 10:44 / 0 comments

In case you've missed the last few releases, I've been busy adding some fun new features to peewee. While the changelog and the docs explain the new features and describe their usage, I thought I'd write a blog post to provide a bit more context.

Most of these features were requested by peewee users. I depend heavily on users like you to help me improve peewee, so thank you very much! Not only have your feature requests helped make peewee a better library, they've helped me become a better programmer.

So what's new in peewee? Here is something of an overview:

Hopefully some of those things sound interesting. In this post I will not be discussing everything, but will hit some of the highlights.


Ricing the Desktop: "Brown rice"

January 15, 2014 10:42 / 0 comments

I've redone my desktop again and thought I'd post some screenshots. The image below is "curated" for maximum visual appeal, but usually I just work with the browser and a few terminals.

Rice


Working from home

December 10, 2013 12:19 / 0 comments

December marks my 9th month working remotely for Counsyl and I thought I would write about my experience working from home.


"djpeewee": use the peewee ORM with your Django models

November 19, 2013 22:40 / 5 comments

I sat down and started working on a new library shortly after posting about Django's missing API for generating SQL. djpeewee is the result, and provides a simple translate() function that will recursively translate a Django model graph into a set of "peewee equivalents". The peewee versions can then be used to construct queries which can be passed back into Django as a "raw query".

Here are a few scenarios where this might be useful:

  • Joining on fields that are not related by foreign key (for example UUID fields).
  • Performing filters on calculated values.
  • Performing aggregate queries on calculated values.
  • Using SQL statements that Django does not support such as CASE.
  • Utilizing SQL functions that Django does not support, such as SUBSTR.
  • Replacing nearly identical SQL queries with reusable, composable data structures.

I've included this module in peewee's playhouse, which is bundled with peewee.


The search for the missing link: what lies between SQL and Django's ORM?

November 12, 2013 11:54 / 7 comments

I had the opportunity this week to write some fairly interesting SQL queries. I don't write "raw" SQL too often, so it was fun to use that part of my brain (by the way, does it bother anyone else when people call SQL "raw"?). At Counsyl we use Django for pretty much everything so naturally we also use the ORM. Every place I've worked there's a strong bias against using SQL when you've got an ORM on board, which makes sense -- if you choose a tool you should standardize on it if for no other reason than it makes maintenance easier.

So as I was saying, I had some pretty interesting queries to write and I struggled to think how to shoehorn them into Django's ORM. I've already written about some of the shortcomings of Django's ORM so I won't rehash those points. I'll just say that Django fell short and I found myself writing SQL. The queries I was working on joined models from very disparate parts of our codebase. The joins were on values that weren't necessarily foreign keys (think UUIDs) and this is something that Django just doesn't cope with. Additionally I was interested in aggregates on calculated values, and it seems like Django can only do aggregates on a single column.

As I was prototyping, I found several mistakes in my queries and decided to run them in the postgres shell before translating them into my code. I started to think that some of these errors could have been avoided if I could find an abstraction that sat between the ORM and a string of SQL. By leveraging the Python interpreter, the obvious syntax errors could have been caught at module import time. By using composable data structures, methods I wrote that used similar table structures could have been more DRY. When I write less code, I think I generally write fewer bugs as well.

That got me started on my search for the "missing link" between SQL (represented as a string) and Django's ORM.


Using peewee to explore CSV files

November 07, 2013 06:19 / 3 comments

I recently heard a talk from a coworker wherein one of the things he discussed was automatically converting CSV data for use with a SQLite database. I thought this would be a great thing to add to peewee, especially as lately I've found myself on several occasions working with CSV and battling with it in a spreadsheet. It would be much easier to load it into a database and then query it using a tool I'm familiar with.

Which brings me to playhouse.csv_loader, a new module I've added to the playhouse package of extras. It's hopefully really easy to use. Here is an example of how you might use it:

>>> from playhouse.csv_loader import *
>>> db = SqliteDatabase(':memory:')  # Create an in-memory sqlite database

# Load the CSV file into the in-memory database and return a Model suitable
# for querying the data.
>>> ZipToTZ = load_csv(db, 'zipcode_to_timezone.csv')

# Get the timezone for a zipcode.
>>> ZipToTZ.get(ZipToTZ.zip == 66047).timezone
'US/Central'

# Get all the zipcodes for my town.
>>> [row.zip for row in ZipToTZ.select().where(
...     (ZipToTZ.city == 'Lawrence') & (ZipToTZ.state == 'KS'))]
[66044, 66045, 66046, 66047, 66049]
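To illustrate the general idea of loading CSV data into SQLite and querying it, here is a rough sketch using only the standard library's csv and sqlite3 modules. This is not how playhouse.csv_loader is implemented: the helper load_csv_into_sqlite, the table name zip_to_tz and the sample rows are all made up for the example, and every column is stored as untyped text rather than being converted to a suitable field type.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for zipcode_to_timezone.csv.
SAMPLE = """zip,city,state,timezone
66044,Lawrence,KS,US/Central
66047,Lawrence,KS,US/Central
10001,New York,NY,US/Eastern
"""


def load_csv_into_sqlite(conn, table, fileobj):
    """Create `table` from the CSV header row and insert the remaining rows.

    Hypothetical helper: columns are left untyped, so values stay as strings.
    """
    reader = csv.reader(fileobj)
    header = next(reader)
    columns = ', '.join('"%s"' % name for name in header)
    placeholders = ', '.join('?' for _ in header)
    conn.execute('CREATE TABLE "%s" (%s)' % (table, columns))
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), reader)


conn = sqlite3.connect(':memory:')  # In-memory database, as in the session above.
load_csv_into_sqlite(conn, 'zip_to_tz', io.StringIO(SAMPLE))

# Get the timezone for a zipcode (compared as text in this sketch).
timezone = conn.execute(
    'SELECT timezone FROM zip_to_tz WHERE zip = ?', ('66047',)).fetchone()[0]

# Get all the zipcodes for one town.
lawrence_zips = [row[0] for row in conn.execute(
    'SELECT zip FROM zip_to_tz WHERE city = ? AND state = ? ORDER BY zip',
    ('Lawrence', 'KS'))]

print(timezone, lawrence_zips)
```

Because nothing here infers column types, the zipcodes come back as strings and the queries are plain SQL. Returning a Model, as load_csv does in the session above, is what makes it possible to write those queries as peewee expressions instead.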