Rewriting Huey for a better API May 15, 2013 11:50 / 0 comments

For a while I've been itching to rewrite Huey, and just last week released 0.4 which is an almost total rewrite. I initially started Huey for performing tasks like checking comments for spam, sending emails, generating thumbnails, and basically anything that would slow down the pagespeed on my sites. This is still what I see as the primary use-case for huey -- performing small tasks outside the request/response cycle and running jobs on a schedule (I have a site that scrapes the county sheriff's site and keeps a log of arrests in my town). The goal for the rewrite was not to change the purpose of Huey, rather it was to change the API.


Structuring flask apps, a how-to for those coming from Django April 27, 2013 13:21 / 0 comments

The other day a friend of mine was trying out flask-peewee and he had some questions about the best way to structure his app to avoid triggering circular imports. For someone new to flask, this can be a bit of a puzzler, especially if you're coming from django which automatically imports your modules. In this post I'll walk through how I like to structure my flask apps to avoid circular imports. In my examples I'll be showing how to use "flask-peewee", but the same technique should be applicable for other flask plugins.

I'll walk through the modules I commonly use in my apps, then show how to tie them all together and provide a single entrypoint into your app.


"wallfix", using python to set my wallpaper April 22, 2013 09:54 / 0 comments

I had fun writing about my "cd" helper, so I thought I'd share another productivity helper I wrote for setting my wallpaper. It's a little silly, but I insist on my wallpaper being used for my lockscreen and my login window as well -- that way the entire time I'm on my computer the background is "seamless". Before I wrote this script it used to take me probably 3 or 4 minutes to change wallpapers!


Creating a personal password manager April 14, 2013 09:26 / 1 comments

My password "system" used to be that I had three different passwords, all of which were variations on the same theme. I maintained a list of sites I had accounts on and for each site gave a hint which of the three passwords I used. What a terrible scheme.

A couple weeks ago I decided to do something about it. I wanted, above all, to only have to remember a single password. Being lately security-conscious, I also recognized the need for a unique password on every site.

In this post I'll show how I used python to create a password management system that allows me to use a single "master" password to generate unique passwords for all the sites and services I use.
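
To give a flavor of the idea, here is a minimal sketch and not necessarily the scheme the full post describes: stretch the master password with a key-derivation function, salted by the site name, and map the result onto a set of allowed characters. The name `generate_password`, the alphabet, the length, and the PBKDF2 iteration count are all assumptions made for illustration.

```python
import hashlib
import string

# Hypothetical alphabet and length; real sites impose their own rules.
ALPHABET = string.ascii_letters + string.digits
LENGTH = 16

def generate_password(master, site, length=LENGTH):
    """Derive a repeatable per-site password from a master password."""
    # PBKDF2 stretches the master password, salted with the site name,
    # so the same inputs always yield the same bytes.
    raw = hashlib.pbkdf2_hmac(
        'sha256', master.encode('utf-8'), site.encode('utf-8'), 100000,
        dklen=length)
    # Map each derived byte onto the alphabet (slightly biased, which is
    # acceptable for a sketch but worth fixing in anything real).
    return ''.join(ALPHABET[b % len(ALPHABET)] for b in bytearray(raw))
```

Because nothing is stored, losing the master password means losing every derived password, and changing it changes all of them at once.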


"j" for switching directories - hacking "cd" with python April 12, 2013 22:05 / 7 comments

Everyone uses cd a lot, I'm no exception. Because I use virtualenvs for my python projects, I'm often "cutting" through several layers of crap to get to what I actually want to edit. This was a good opportunity for a helper script!

The two biggest annoyances I was trying to alleviate were:

  1. There are directories I use a lot, but making bash aliases for them is not maintainable. I should be able to get to them quickly.
  2. I have to keep a mental map of the directory tree to go from one nested directory to another -- e.g. cd ../../some-other-dir/foo/. It would be nice to just type the part that matters and not the whole thing.

The solution I came up with stores directories I use (the entire path), and then I can perform a search of that history using a partial path.
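
The core of that idea can be sketched in a few lines. This is an illustration under assumptions, not the actual script: the history is kept as an in-memory list here (a real helper would persist it to a file), and `add_directory` and `search` are hypothetical names.

```python
import os

def add_directory(history, path):
    """Record the full path of a directory, most recent last, no duplicates."""
    path = os.path.abspath(path)
    if path in history:
        history.remove(path)
    history.append(path)

def search(history, partial):
    """Return the most recently used directory whose path contains `partial`."""
    for path in reversed(history):
        if partial in path:
            return path
    return None
```

A small shell function could then `cd` to whatever path the script prints, since a child process cannot change the directory of the shell that launched it.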


Raspberry Pi Mobile January 14, 2013 09:31 / 0 comments

My Raspberry Pi got a new case this weekend:

[Image: Raspberry Pi Mobile]


Shortcomings in the Django ORM and a look at Peewee, a lightweight alternative December 15, 2012 14:27 / 13 comments

In this post I'd like to talk about some of the shortcomings of the Django ORM, the ways peewee approaches things differently, and how this resulted in peewee having an API that is both more consistent and more expressive.


Sharing Screenshots with Dropbox and Imgur November 28, 2012 11:09 / 2 comments

I saw a post on hackernews this morning where a guy had built a little screenshot uploader for dropbox. Unfortunately, his script is for Mac OS and I use linux.

So, for linux folks out there, here is a little wrapper around scrot, a linux screenshot utility. It will allow you to capture the full screen, the current window, or free-select a region, then take the resulting image and put it in your dropbox folder or upload it to Imgur:

#!/usr/bin/env python
import base64
import json
import optparse
import os
import subprocess
import sys
import time
import urllib
import urllib2

BINARY = 'scrot'
HOME = os.environ['HOME']

# Imgur API -- register your app and paste the client id and secret:
CLIENT_ID = ''
CLIENT_SECRET = ''

# Location of your dropbox folder and your dropbox user id:
DROPBOX_DIR = os.path.join(HOME, 'Dropbox/Public/screens/')
DROPBOX_URL_TEMPLATE = 'http://dl.dropbox.com/u/%s/screens/%s'
DROPBOX_UID = ''

def upload_file(filename):
    with open(filename, 'rb') as fh:
        contents = fh.read()
    payload = urllib.urlencode((
        ('image', base64.b64encode(contents)),
        ('key', CLIENT_SECRET),
    ))
    request = urllib2.Request('https://api.imgur.com/3/image', payload)
    request.add_header('Authorization', 'Client-ID ' + CLIENT_ID)
    try:
        resp = urllib2.urlopen(request)
    except urllib2.HTTPError as exc:
        return False, 'Returned status: %s' % exc.code
    except urllib2.URLError as exc:
        return False, exc.reason
    resp_data = resp.read()
    try:
        resp_json = json.loads(resp_data)
    except ValueError:
        return False, 'Error decoding response: %s' % resp_data
    if resp_json['success']:
        return True, resp_json['data']['link']
    return False, 'Imgur failure: %s' % resp_data

def get_parser():
    parser = optparse.OptionParser('Screenshot helper')
    parser.add_option('-s', '--select', action='store_true', default=True,
                      dest='select', help='Select region to capture')
    parser.add_option('-f', '--full', action='store_true', dest='full',
                      help='Capture entire screen')
    parser.add_option('-c', '--current', action='store_true', dest='current',
                      help='Capture currently selected window')
    parser.add_option('-d', '--delay', default=0, dest='delay', type='int',
                      help='Seconds to wait before capture')
    parser.add_option('-p', '--public', action='store_true', dest='dropbox',
                      help='Store in dropbox public folder')
    parser.add_option('-x', '--no-upload', action='store_false', default=True,
                      dest='upload', help='Do not upload to imgur')
    parser.add_option('-k', '--keep-local', action='store_true', default=False,
                      dest='keep', help='Keep local copy after upload')
    return parser

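def get_scrot_command(filename, options):
    # Build the argument list for scrot based on the parsed options.
    args = [BINARY]
    if options.current:
        args.append('-u')
    elif not options.full:
        args.append('-s')
    if options.delay:
        # Pass the flag and its value as separate arguments.
        args.extend(['-d', str(options.delay)])
    # Use the parameter rather than relying on the global "dest".
    args.append(filename)
    return args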

if __name__ == '__main__':
    parser = get_parser()
    options, args = parser.parse_args()

    filename = 's%s.png' % time.time()
    if options.dropbox:
        dest = os.path.join(DROPBOX_DIR, filename)
    else:
        dest = os.path.join(HOME, 'tmp', filename)

    if not options.current and not options.full:
        print 'Select a region to capture...'

    scrot_args = get_scrot_command(dest, options)
    p = subprocess.Popen(scrot_args)
    p.wait()

    if options.dropbox:
        print DROPBOX_URL_TEMPLATE % (DROPBOX_UID, filename)

    if options.upload:
        success, res = upload_file(dest)
        if not success:
            print 'Error uploading image: %s' % res
            print 'Image stored in: %s' % dest
            sys.exit(1)
        else:
            if not options.keep:
                os.unlink(dest)
            print res
    else:
        print dest

Using python and k-means to find the dominant colors in images October 23, 2012 17:23 / 17 comments

I'm working on a little photography website for my Dad and thought it would be neat to extract color information from photographs. I tried a couple of different approaches before finding one that works pretty well. This approach uses k-means clustering to cluster the pixels into groups based on their color. The centers of the resulting clusters are then the "dominant" colors. k-means is a great fit for this problem because it is (usually) fast.
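
As a rough illustration of the technique, here is a small pure-python k-means over RGB tuples. It is a sketch under assumptions and not the code from the full post: the pixels are assumed to already be a list of `(r, g, b)` tuples, the initial centers are picked at random from the pixels, and the function names are made up for this example.

```python
import random

def distance_sq(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Channel-wise average of a non-empty list of colors."""
    n = float(len(points))
    return tuple(sum(channel) / n for channel in zip(*points))

def kmeans(pixels, k, iterations=10, seed=0):
    """Cluster RGB tuples and return the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)
    for _ in range(iterations):
        # Assign every pixel to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: distance_sq(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        new_centers = [mean(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # Assignments have stabilized.
        centers = new_centers
    return centers
```

Like any k-means, this can settle on a poor local optimum depending on the starting centers, so running it a few times with different seeds is a reasonable precaution.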


Peewee was baroque, so I rewrote it October 08, 2012 09:25 / 7 comments

Today I merged in the "unstable/2.0" branch of peewee. I'm very excited about the changes and I hope you will be, too.

I have written a documentation page on upgrading which gives the rationale behind the rewrite and some examples of the new querying API. Please feel free to take a look but much of the information presented in this post is lifted directly from the docs.

Goals for the rewrite

  • consistent: there is one way of doing things
  • expressive: things can be done that I never thought of

What changed?

The biggest changes between 1.0 and 2.0 are in the syntax used for constructing queries. The first iteration of peewee I threw up on github was about 600 lines. I was passing around strings and dictionaries and as time went on and I added features, those strings turned into tuples and objects. This meant, though, that I needed code to handle all the possible ways of expressing something. Look at the code for parse_select.

I learned a valuable lesson: keep data in datastructures until the absolute last second.

With the benefit of hindsight and experience, I decided to rewrite and unify the API a bit. The result is a tradeoff. The newer syntax may be a bit more verbose at times, but at least it will be consistent.


My 2004 Yamaha R6 September 12, 2012 14:47 / 2 comments

About two months ago I became the proud owner of a 2004 Yamaha R6! Previously I had been riding an '02 Honda Shadow (pic) and the change has been a revelation.


Web-based encrypted file storage using Flask and AWS September 12, 2012 10:34 / 6 comments

The other day I noticed I had a couple thumbdrives kicking around with various versions of my "absolutely do not lose" files...stuff like my private keys, tax documents, zips of papers I wrote in college, etc. These USB drives were all over the house, and many contained duplicate versions of the same files. I thought it would be neat to write a little app to give me a web-based interface to store and manage these files securely. In this post I'll talk about how I built a web-based file storage app using flask, pycrypto, and amazon S3.


A picture is worth a thousand words... using Google Images to spice up IRC September 11, 2012 14:07 / 1 comments

I wrote a little python IRC bot library a while back. It comes with a few silly examples like a google search bot, an ASCII art bot, even an example botnet. Today I was lurking around in a channel with a bunch of other local developers and noticed that we often are pasting links of images to "contextualize" things other folks have said.


Gedit port of vim theme "candycode" September 10, 2012 13:07 / 0 comments

I decided to port my favorite vim theme, candycode, to gedit.


The missing library: ad-hoc queries for your models September 05, 2012 11:41 / 5 comments

I think it would be great if more sites allowed users (or consumers of their APIs) to produce and execute ad-hoc queries against their data. In this post I'll talk a little bit about some ways sites are currently doing this, some of the challenges involved, my experience trying to build something "reusable", and finally invite you to share your thoughts.


Experimenting with an analytics web-service using python and cassandra August 02, 2012 10:48 / 2 comments

The other day I was poking around my google analytics account and thought it would be a fun project to see if I could collect "analytics"-type data myself. I recalled that the Apache Cassandra project was supposed to use a data model similar to Google's BigTable so I decided to use it for this project. The BigTable data model turned out to be a good fit for this project once I got over some of the intricacies of dealing with time-series data in Cassandra. In this post I'll talk about how I went about modelling, collecting, and finally analyzing basic page-view data I collected from this very blog.


Building the python SQLite driver for use with BerkeleyDB July 10, 2012 15:36 / 0 comments

As of Oracle's 11gR2 release of BerkeleyDB, the library has included a SQL API which is fully compatible with SQLite3. BerkeleyDB can even be compiled with support for SQLite's full-text search and r-tree extensions. There's a good whitepaper that Oracle published detailing the performance of their implementation versus SQLite. To summarize, since Berkeley supports page-level locking as opposed to database-level locking, it can push quite a few more transactions per second, making it a better fit for write-heavy applications (the one area sqlite suffers, IMO). Additionally, Berkeley makes fewer syscalls due to the way it blocks as opposed to SQLite's use of busy-locks.

This post is partly for me so I remember how I did this, and partly for anyone else interested in trying out Berkeley's SQL support. plug: I enjoy using SQLite for small projects, if you're interested you might check out peewee, a lightweight ORM with some fun SQLite extensions.


Powerful autocomplete with Redis in under 200 lines of Python July 06, 2012 16:29 / 0 comments

In this post I'll present how I built a (reasonably) powerful autocomplete engine with Redis and python. For those who are not familiar with Redis, it is a fast, in-memory, single-threaded database that is capable of storing structured data (lists, hashes, sets, sorted sets). I chose Redis for this particular project because of its sorted set data type, which is a good fit for autocomplete. The engine I'll describe relies heavily on Redis' sorted sets and its set operations, but can easily be translated to a pure-python solution (links at bottom of post).


Using advanced database features with Peewee, a python ORM July 01, 2012 01:30 / 0 comments

I've developed an interest in some of the more advanced features of SQLite 3.7 after reading the O'Reilly title Using SQLite (Small. Fast. Reliable. Choose Any Three). For personal projects I like using SQLite for prototyping or for simple applications, and when I need something more powerful I turn to Postgresql. Because peewee supports both of these databases (as well as MySQL), it is limited to a lowest-common-denominator feature set. While this encompasses a broad range of features, each database engine has its own extensions and I've been interested in adding some pythonic support for the cooler extensions.

Currently, I've got support for:

This post will show the usage of the hstore and full-text search extensions. I will also show how I went about writing these extension modules so if you're interested in writing your own you will have a good foundation.

All of these extensions live in the playhouse package, included with the current master branch of peewee.

To follow along at home, feel free to install peewee:

pip install -e git+https://github.com/coleifer/peewee.git#egg=peewee

Working around Django's ORM to do interesting things with GFKs May 03, 2012 00:05 / 0 comments

In this post I want to discuss how to work around some of the shortcomings of django's ORM when dealing with Generic Foreign Keys (GFKs).

At the end of the post I'll show how to work around django's failure to CAST correctly when the generic foreign key is of a different column type than the objects it may point to.


Micawber, a python library for extracting rich content from URLs April 19, 2012 11:13 / 0 comments

A while ago I wrote about an awesome API for retrieving metadata about URLs called oembed. I'm writing to announce a new project I've been working on called micawber, which is very similar but with a cleaner API and not restricted to django projects.


Using Redis Pub/Sub and IRC for Error Logging with Python April 15, 2012 11:23 / 0 comments

I recently rewrote my personal site using flask and peewee, breaking a good amount of stuff in the process. I was trying to track down the errors by tailing log files, but that didn't help alert me to new errors that someone visiting the site might stir up. I thought about setting up error emails a-la django, which is a tried and true method...but then I happened on a different approach. I won't say it's the most elegant solution, but it was a quick hack and the results have been awesome. I wrote a custom logging handler that pushes JSON-encoded log record data to a redis pub/sub channel. I then have an IRC bot that subscribes to this channel and when it receives a message generates a paste of the traceback and pings me with a link to the traceback.
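
The handler side of this can be sketched roughly as follows. This is an illustration under assumptions, not the handler from the full post: `PubSubHandler` is a made-up name, the fields included in the JSON are arbitrary, and `publish` is any callable taking a channel and a message (for example, the `publish` method of a redis client).

```python
import json
import logging

class PubSubHandler(logging.Handler):
    """Logging handler that publishes JSON-encoded records to a channel."""

    def __init__(self, publish, channel, level=logging.ERROR):
        logging.Handler.__init__(self, level)
        self.publish = publish  # callable(channel, message)
        self.channel = channel

    def emit(self, record):
        data = {
            'name': record.name,
            'level': record.levelname,
            'message': record.getMessage(),
            'traceback': None,
        }
        if record.exc_info:
            # Render the traceback to text so it survives JSON encoding.
            formatter = self.formatter or logging.Formatter()
            data['traceback'] = formatter.formatException(record.exc_info)
        try:
            self.publish(self.channel, json.dumps(data))
        except Exception:
            # Never let a logging failure take down the application.
            self.handleError(record)
```

Taking a plain callable rather than a client object keeps the handler easy to test: a lambda that appends to a list stands in for redis.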


Nautilus script to push files to S3 in python April 09, 2012 16:58 / 0 comments

Sometimes I want to push a file on my hard drive to S3 for safekeeping. I wrote a little script for nautilus which appears in the context menu to push files to a specific S3 bucket.


Building a bookmarking service with python and phantomjs March 29, 2012 19:16 / 1 comments

Using python and phantomjs, a headless webkit browser, it is a snap to build a self-hosted bookmarking service that can capture images of entire pages. Combine this with a simple javascript bookmarklet and you end up with a really convenient way of storing bookmarks. The purpose of this post will be to walk through the steps to getting a simple bookmarking service up and running.

http://media.charlesleifer.com/images/photos/import_playground-182916.png


Model code generation with peewee March 20, 2012 15:03 / 0 comments

For fun I put together a small script that is capable of introspecting databases and generating peewee models. I borrowed the crucial bits from django's codebase, which has methods for introspecting column types and foreign key constraints.

The code is hopefully rather straightforward - it simply grabs the list of tables, maps each column's type information to a peewee field type, then finally resolves foreign keys. The generated models are then dumped to standard out, along with a database declaration.


So long, djangosnippets, and thanks for all the fish March 06, 2012 09:38 / 0 comments

After two years of maintaining djangosnippets.org, I am pleased to announce that the guys from django-de are going to be taking over and you can expect to see some real improvements.


Huey, a lightweight task queue for python February 02, 2012 15:22 / 0 comments

At my job we've been doing a quarterly hackday for almost a year now. My coworkers have made some amazing stuff, and it's nice to have an entire day dedicated to hacking on ... well, whatever you want. Tomorrow marks the 4th hackday and I need to scrounge up a good project, but in the meantime I thought I'd write a post about what I did last time around -- a lightweight python task queue that has an API similar to celery.

I've called it huey (which also turns out to be the name of my kitten).

Design goals

The goal of the project was to keep it simple while not skimping on features. At the moment the project does the following:

Backend storages implement a simple API. Currently the only implementation uses Redis, but adding one that uses the database would be a snap.

The other main goal of the project was to have it work easily for any python application (I've been into using flask lately), but come with baked-in support for django. Because of django's centralized configuration and conventions for loading modules, the django API is simpler than the python one, but hopefully both are reasonably straightforward.


Building a markov-chain IRC bot with python and Redis January 24, 2012 22:59 / 0 comments

As an IRC bot enthusiast and tinkerer, I would like to describe the most enduring and popular bot I've written, a markov-chain bot. Markov chains can be used to generate realistic text, and so are great fodder for IRC bots. The bot I am writing of has been hanging out in my town's channel for the past year or so and has amassed a pretty awesome corpus from which it generates messages. Here are a few of his greatest hits:


Don't sweat the small stuff - use flask blueprints October 30, 2011 15:29 / 2 comments

For a change, I've been doing all of my new app development using flask, a python web framework built atop the werkzeug WSGI toolkit. Having used django for the last two years it's been fun to do something different, but at the same time stick with python.

In this post I'd like to show a couple of the small projects I've written using flask over the past few weeks.


Redesign of flask-peewee admin October 28, 2011 15:44 / 0 comments

Recently I stumbled across the twitter bootstrap project, which is a set of cross-browser compliant stylesheets and scripts. I liked them so much that I've ported the admin templates to use bootstrap. Here's a little screenshot of the design refresh taken from the example app:

http://media.charlesleifer.com/images/photos/flask-peewee-admin.jpg

I hope this will make the admin easier to work with in the long-run!


Integrating the flask microframework with the peewee ORM September 27, 2011 10:52 / 5 comments

I'd like to write a post about a project I've been working on for the past month or so. I've had a great time working on it and am excited to start putting it to use. The project is called flask-peewee -- it is a set of utilities that bridges the python microframework flask and the lightweight ORM peewee. It is packaged as a flask extension and comes with the following batteries included:


peewee now supports postgresql (and mysql and sqlite) July 24, 2011 10:07 / 2 comments

Over the past month I've been working on adding support for both MySQL and PostgreSQL to peewee. I'm happy to say that after a couple weekend hack sessions all tests are now passing.


Suggesting tags with django-taggit and jQuery UI June 29, 2011 12:20 / 2 comments

One of the problems mentioned by a couple people when I asked for suggestions on improving djangosnippets.org was the proliferation of tags. This is a well-known problem on sites that allow users to enter their own tags, where misspellings are frequent and it's sometimes unclear whether a tag should be plural or singular.

To try and reduce the number of different tags on djangosnippets I ended up using the jQuery UI autocomplete tools to provide users with hints when they enter tags for their snippets.


Solr on Ubuntu, revisited June 17, 2011 17:11 / 1 comments

It's been a while since I first wrote about setting up Solr on Ubuntu. Since then I've opted for a different approach that is both simpler and lighter-weight. This post describes briefly the steps to setting up Solr on Ubuntu.


Updates to peewee, now supports MySQL June 08, 2011 15:56 / 0 comments

I'm pleased to announce that I've added support for MySQL to peewee. All tests are now passing. In the process I uncovered a few small bugs which have also been fixed.

I also added some new reference documentation which describes succinctly how to do basic configuration and querying with peewee.


A simple botnet written in Python April 20, 2011 16:54 / 2 comments

As of this week we instituted a regular "hackday" at my office -- anything goes, you can work on whatever you like, so at 11:30 the night before the hackday started I decided on writing a simple IRC-powered botnet.


Connecting anything to anything with Django February 17, 2011 19:18 / 0 comments

I'm writing this post to introduce a new project I've released, django-generic-m2m, which as its name would indicate is a generic ManyToMany implementation for django models. The goal of this project was to provide a uniform API for both creating and querying generically-related content in a flexible manner. One use-case for this project would be creating semantic "tags" between diverse objects in the database.


Autocompletion for Django models using Solr, Redis or SQL December 30, 2010 18:08 / 2 comments

One of the nicest UIs around when dealing with a large dataset is a good autocomplete. Facebook's search is a great example, same for Netflix, and recently Google launched "Google Instant", which returns search results as you type. Autocomplete can really complement hierarchical drill-down search (which is useful for discovery), as the goal of autocomplete is more for helping users find something they already know about with a minimum of effort.


Peewee, a lightweight Python ORM - Original Post November 28, 2010 15:01 / 15 comments

For the past month or so I've been working on writing my own ORM in Python. The project grew out of a need for a lightweight persistence layer for use in Flask web apps. As I've grown so familiar with the Django ORM over the past year, many of the ideas in Peewee are analogous to the concepts in Django. My goal from the beginning has been to keep the implementation simple without sacrificing functionality, and to ultimately create something hackable that others might be able to read and contribute to.


Even more Canvas fun - Tetris in JavaScript November 05, 2010 14:56 / 1 comments

Tetris in JavaScript using the Canvas element, 'nuff said!


More fun with Canvas - a JavaScript Starfield! November 04, 2010 17:44 / 7 comments

The canvas element is awesome. JavaScript is fast enough that you can run some pretty computationally intensive stuff (I've seen 3D games, a NES emulator, and much more all done with JS!). This script shouldn't push your CPU to the limit, but it does show how easy it is to create cool effects with just a small amount of code.


Nokia Snake with JavaScript + Canvas November 03, 2010 23:35 / 4 comments

Keeping with the theme of yesterday's post - "a stroll down memory lane" - I thought I'd re-create the Nokia Snake game (a distant relative of Nibbles) using JavaScript and the canvas element.


A Stroll Down Memory Lane: Scripting AOL November 02, 2010 23:53 / 14 comments

When I started working at my current job I was surprised to see that everyone used IRC as their primary means of communication - much more so than email or IM. I recently wrote a small irc bot library in python - it was a ton of fun and reminded me of some of the first programs I wrote that were bots and scripts for America Online.


Search on djangosnippets.org November 01, 2010 21:07 / 0 comments

Users of djangosnippets.org may have noticed the addition of a few search-related features over the past several months. I'd like to highlight some of the additions that have been made and show how you can implement similar functionality on your sites. All of djangosnippets' search leans on Apache Solr, a powerful search engine built on top of Apache Lucene (full-text search). Haystack is the search solution for Django apps - it provides a querying interface similar to Django's ORM, handles indexing your models for you, and supports advanced features like "more-like-this" and faceting.


Django Patterns: Model Inheritance October 09, 2010 14:41 / 5 comments

This post discusses the two flavors of model inheritance supported by Django, some of their use-cases as well as some potential gotchas.


Django Patterns: View Decorators September 23, 2010 01:25 / 5 comments

Sites often have many views that operate with a similar set of assumptions. Maybe there are entire areas that the user must be logged-in to visit, or there is some repetitive boilerplate functionality that a group of views shares like being rate-limited. This post looks at ways to make this kind of functionality less repetitive by using a common Django pattern, view decorators.
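
The shape of the pattern, stripped of any framework, looks roughly like this. It is a sketch under assumptions: `require_user` is a hypothetical decorator, the request is any object with a `user` attribute, and the returned string stands in for a real redirect response.

```python
import functools

def require_user(view):
    """Wrap a view so it only runs when the request carries a user."""
    @functools.wraps(view)
    def inner(request, *args, **kwargs):
        if not getattr(request, 'user', None):
            # Placeholder for a real redirect response.
            return 'redirect:/login/'
        return view(request, *args, **kwargs)
    return inner

@require_user
def dashboard(request):
    return 'hello %s' % request.user
```

The check now lives in one place, and `functools.wraps` keeps the wrapped view's name and docstring intact for debugging and URL tooling.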


Django Patterns: Pluggable Backends September 15, 2010 11:20 / 5 comments

As the first installment in a series on common patterns in Django development, I'd like to discuss the Pluggable Backend pattern. The pattern addresses the common problem of providing extensible support for multiple implementations of a lower level function, be it caching, database querying, etc.
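
At the heart of the pattern is usually a small loader that turns a dotted path from a settings file into a class. Here is a minimal sketch of such a loader; `load_backend` is a hypothetical name, and the dotted-path convention is an assumption about how the backend would be configured.

```python
import importlib

def load_backend(dotted_path):
    """Import 'package.module.ClassName' and return the class."""
    module_path, _, class_name = dotted_path.rpartition('.')
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Calling code then depends only on the interface the backends share, so swapping implementations becomes a one-line configuration change.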


Quick shell command to add CSRF token June 16, 2010 23:18 / 2 comments

To get the benefit of Django 1.2's new CSRF protection, all POST forms will need a special token. Here is a quick command that runs through templates adding the token:

find . -type f -name "*.html" -exec sed -i \
's|\(<form[^>]*method="post"[^>]*>\)\({% csrf_token %}\)\?|\1{% csrf_token %}|g' \
{} \;

Generating aggregate data across generic relations May 22, 2010 19:22 / 7 comments

Aggregation support was added to django's ORM in version 1.1, allowing you to generate Sums, Counts, and more without having to write any SQL. According to the docs aggregation is not supported for generic relations. This entry describes how to work around this using the .extra() method.


Announcing django-relationships March 27, 2010 19:00 / 7 comments

I recently posted on writing an app that allows you to create flexible and descriptive relationships between Django's built-in auth.users. django-relationships is the result.