In this post I want to discuss how to work around some of the shortcomings of django's ORM when dealing with Generic Foreign Keys (GFKs).
At the end of the post I'll show how to work around django's failure to CAST correctly when the generic foreign key is of a different column type than the objects it may point to.
A while ago I wrote about an awesome API for retrieving metadata about URLs called oembed. I'm writing to announce a new project I've been working on called micawber, which is very similar but with a cleaner API and not restricted to django projects.
I recently rewrote my personal site using flask and peewee, breaking a good amount of stuff in the process. I was trying to track down the errors by tailing log files, but that didn't help alert me to new errors that someone visiting the site might stir up. I thought about setting up error emails à la django, which is a tried and true method... but then I happened on a different approach. I won't say it's the most elegant solution, but it was a quick hack and the results have been awesome. I wrote a custom logging handler that pushes JSON-encoded log record data to a redis pub/sub channel. I then have an IRC bot that subscribes to this channel and, when it receives a message, generates a paste of the traceback and pings me with a link to it.
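A minimal sketch of what such a handler might look like. The class name `PubSubHandler`, the channel name `logs`, and the payload keys are assumptions made for illustration, not the code from the post. To keep the example self-contained, the handler takes any `publish(channel, message)` callable; with redis-py this would presumably be the `publish` method of a `redis.Redis` client.

```python
import json
import logging


class PubSubHandler(logging.Handler):
    """Publish each log record as JSON to a pub/sub channel (hypothetical)."""

    def __init__(self, publish, channel="logs"):
        super().__init__()
        self.publish = publish    # any callable taking (channel, message)
        self.channel = channel    # assumed channel name

    def emit(self, record):
        try:
            traceback_text = None
            if record.exc_info:
                # Render the exception the same way the stdlib formatter does.
                traceback_text = logging.Formatter().formatException(record.exc_info)
            payload = {
                "name": record.name,
                "level": record.levelname,
                "message": record.getMessage(),
                "traceback": traceback_text,
            }
            self.publish(self.channel, json.dumps(payload))
        except Exception:
            # Never let a logging failure take down the application.
            self.handleError(record)


# Stand-in for a redis client: collect published messages in a list.
published = []
logger = logging.getLogger("pubsub_demo")
logger.propagate = False
logger.addHandler(PubSubHandler(lambda channel, message: published.append((channel, message))))

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("something broke")
```

A subscriber, such as the IRC bot, would then decode each message with `json.loads` and decide what to do with the `traceback` field.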
Sometimes I want to push a file on my hard drive to S3 for safekeeping. I wrote a little script for nautilus which appears in the context menu to push files to a specific S3 bucket.
Using python and phantomjs, a headless webkit browser, it is a snap to build a self-hosted bookmarking service that can capture images of entire pages. Combine this with a simple javascript bookmarklet and you end up with a really convenient way of storing bookmarks. The purpose of this post will be to walk through the steps to getting a simple bookmarking service up and running.
http://media.charlesleifer.com/images/photos/import_playground-182916.png
For fun I put together a small script that is capable of introspecting databases and generating peewee models. I borrowed the crucial bits from django's codebase, which has methods for introspecting column types and foreign key constraints.
The code is hopefully rather straightforward: it grabs the list of tables, reads the column type information and maps it to peewee field types, then finally resolves foreign keys. The generated models are then dumped to standard out, along with a database declaration.
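The three steps can be sketched against SQLite using only the standard library. This is an illustration under stated assumptions, not the script from the post: it uses SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list` rather than the introspection code borrowed from django, and the `FIELD_TYPES` mapping and the `generate_models` helper are hypothetical. The output is plain source text that merely resembles peewee model definitions.

```python
import sqlite3

# Assumed mapping from SQLite declared column types to peewee field names.
FIELD_TYPES = {"INTEGER": "IntegerField", "TEXT": "TextField", "REAL": "FloatField"}


def generate_models(conn):
    """Return source text with one model class per table in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    lines = []
    for table in tables:
        # Map each foreign key column to the table it references.
        foreign_keys = {
            row[3]: row[2]
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        }
        lines.append(f"class {table.title()}(Model):")
        for _cid, name, col_type, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
            if name in foreign_keys:
                field = f"ForeignKeyField({foreign_keys[name].title()})"
            else:
                # Fall back to a text field for unrecognized column types.
                field = FIELD_TYPES.get(col_type.upper(), "TextField") + "()"
            lines.append(f"    {name} = {field}")
        lines.append("")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES blog (id),
        body TEXT
    );
""")
source = generate_models(conn)
print(source)
```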
After two years of maintaining djangosnippets.org, I am pleased to announce that the guys from django-de are going to be taking over and you can expect to see some real improvements.
At my job we've been doing a quarterly hackday for almost a year now. My coworkers have made some amazing stuff, and it's nice to have an entire day dedicated to hacking on ... well, whatever you want. Tomorrow marks the 4th hackday and I need to scrounge up a good project, but in the meantime I thought I'd write a post about what I did last time around -- a lightweight python task queue that has an API similar to celery.
I've called it huey (which also turns out to be the name of my kitten).
The goal of the project was to keep it simple while not skimping on features. At the moment the project does the following:
Backend storages implement a simple API. Currently the only implementation uses Redis, but adding one that uses the database would be a snap.
The other main goal of the project was to have it work easily for any python application (I've been into using flask lately), but come with baked-in support for django. Because of django's centralized configuration and conventions for loading modules, the django API is simpler than the python one, but hopefully both are reasonably straightforward.
As an IRC bot enthusiast and tinkerer, I would like to describe the most enduring and popular bot I've written, a markov-chain bot. Markov chains can be used to generate realistic text, and so are great fodder for IRC bots. The bot I am writing about has been hanging out in my town's channel for the past year or so and has amassed a pretty awesome corpus from which it generates messages. Here are a few of his greatest hits:
Over the last two months I've spent a lot of time working on improvements to peewee, a lightweight ORM written in python.
Some of these features are present in Django and were added for better parity, some I found a need for while working on other projects, and others were requested by users who opened an issue on GitHub or brought it up on IRC (#peewee on freenode). If you're interested in trying peewee out, it ships with an example app which is described here.
Here's a rundown on what has been added recently:
For a change, I've been doing all of my new app development using flask, a python web framework built atop the werkzeug WSGI toolkit. Having used django for the last two years it's been fun to do something different, but at the same time stick with python.
In this post I'd like to show a couple of the small projects I've written using flask over the past few weeks.
Recently I stumbled across the twitter bootstrap project, which is a set of cross-browser compliant stylesheets and scripts. I liked them so much that I've ported the admin templates to use bootstrap. Here's a little screenshot of the design refresh taken from the example app:
http://media.charlesleifer.com/images/photos/flask-peewee-admin.jpg
I hope this will make the admin easier to work with in the long-run!
I'd like to write a post about a project I've been working on for the past month or so. I've had a great time working on it and am excited to start putting it to use. The project is called flask-peewee -- it is a set of utilities that bridges the python microframework flask and the lightweight ORM peewee. It is packaged as a flask extension and comes with the following batteries included:
Over the past month I've been working on adding support for both MySQL and PostgreSQL to peewee. I'm happy to say that after a couple weekend hack sessions all tests are now passing.
With the 0.3.0 release of django-relationships, I've made a couple backwards-incompatible changes which I thought I'd mention.
One of the problems mentioned by a couple people when I asked for suggestions on improving djangosnippets.org was the proliferation of tags. This is a well-known problem on sites that allow users to enter their own tags, where misspellings are frequent and it's sometimes unclear whether a tag should be plural or singular.
To try and reduce the number of different tags on djangosnippets I ended up using the jQuery UI autocomplete tools to provide users with hints when they enter tags for their snippets.
Describing some of the improvements made to the django snippets site over the past couple weeks and asking for user feedback on additional improvements they'd like to see.
It's been a while since I first wrote about setting up Solr on Ubuntu. Since then I've opted for a different approach that is both simpler and lighter-weight. This post describes briefly the steps to setting up Solr on Ubuntu.
I'm pleased to announce that I've added support for MySQL to peewee. All tests are now passing. In the process I uncovered a few small bugs which have also been fixed.
I also added some new reference documentation which describes succinctly how to do basic configuration and querying with peewee.
After several months of running the task queue bundled with django-utils, I decided to re-evaluate certain aspects of the design. This post describes those changes.
Just a quick heads-up to anyone out there using django-completion, I've released a couple important updates this weekend and you may be interested in updating your checkouts. These changes are purely additive, so don't worry about having to update your own code.
There are three important updates:
As of this week we instituted a regular "hackday" at my office -- anything goes, you can work on whatever you like. At 11:30 the night before the hackday started, I decided on writing a simple IRC-powered botnet.
I'd been scrounging around for a smallish project, when I happened on the idea of writing a spider with a simple web interface. I had recently released a task queue, so I wanted to incorporate that to do the actual crawling, while a django view served up the results as they arrived in the database. The end result is a new project I'm calling django-spider, which you can check out on GitHub. This post will discuss some of the aspects of the design.
It's quite common when building out a website to trigger actions during the normal request/response cycle that may be time-consuming. Examples of these actions might be:
I remember last year about this time my coworkers and I got pretty excited about Celery, a distributed task queue, that provided a really nice API for executing tasks out-of-process. Basically just decorate functions with the @task decorator and so long as everything is configured properly, they will execute out of process. Celery is an actively-developed project with great documentation and an incredibly rich feature-set, but all those features come with the added cost of lots of configuration and the need for integration with a number of projects (celery, django-celery, kombu, django-kombu, pyparsing, mailer).
I needed a lightweight task queue for some side-projects and rather than trying to integrate all the various celery dependencies (and pinning all the correct versions) I did what anyone would do: rolled my own.
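To make the decorator idea concrete, here is a toy sketch of a `@task` decorator. It is an assumption-laden illustration, not the API of celery or of the queue described in the post: the names `task`, `task_queue`, and `run_pending` are hypothetical, and the queue lives in memory in the same process, whereas a real task queue would hand the work to a separate consumer through a store such as Redis or a database.

```python
import functools
import queue

# In-memory stand-in for a real backend such as Redis or a database table.
task_queue = queue.Queue()


def task(func):
    """Make calls to `func` enqueue the work instead of running it."""
    @functools.wraps(func)
    def enqueue(*args, **kwargs):
        task_queue.put((func, args, kwargs))
    return enqueue


def run_pending():
    """Play the role of the worker: run every queued task in order."""
    results = []
    while not task_queue.empty():
        func, args, kwargs = task_queue.get()
        results.append(func(*args, **kwargs))
    return results


@task
def add(a, b):
    return a + b


# Calling the decorated function returns immediately; nothing runs yet.
add(1, 2)
add(3, 4)
results = run_pending()
```

The caller only ever sees the decorated function, so the queueing machinery can be swapped out without touching application code.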
I'm writing this post to introduce a new project I've released, django-generic-m2m, which as its name would indicate is a generic ManyToMany implementation for django models. The goal of this project was to provide a uniform API for both creating and querying generically-related content in a flexible manner. One use-case for this project would be creating semantic "tags" between diverse objects in the database.
One of the nicest UIs around when dealing with a large dataset is a good autocomplete. Facebook's search is a great example, same for Netflix, and recently Google launched "Google Instant", which returns search results as you type. Autocomplete can really complement hierarchical drill-down search (which is useful for discovery), as the goal of autocomplete is more for helping users find something they already know about with a minimum of effort.
For the past month or so I've been working on writing my own ORM in Python. The project grew out of a need for a lightweight persistence layer for use in Flask web apps. As I've grown so familiar with the Django ORM over the past year, many of the ideas in Peewee are analogous to the concepts in Django. My goal from the beginning has been to keep the implementation simple without sacrificing functionality, and to ultimately create something hackable that others might be able to read and contribute to.
You may not know it, but djangoembed can be used to OEmbed your own site's static media. We use it at work to allow users to embed photos they upload through the site.
Tetris in JavaScript using the Canvas element, 'nuff said!
The canvas element is awesome. JavaScript is fast enough that you can run some pretty computationally intensive stuff (I've seen 3D games, a NES emulator, and much more all done with JS!). This script shouldn't push your CPU to the limit, but it does show how easy it is to create cool effects with just a small amount of code.
Keeping with the theme of yesterday's post - "a stroll down memory lane" - I thought I'd re-create the Nokia Snake game (a distant relative of Nibbles) using JavaScript and the canvas element.
When I started working at my current job I was surprised to see that everyone used IRC as their primary means of communication - much more so than email or IM. I recently wrote a small irc bot library in python - it was a ton of fun and reminded me of some of the first programs I wrote that were bots and scripts for America Online.
Users of djangosnippets.org may have noticed the addition of a few search-related features over the past several months. I'd like to highlight some of the additions that have been made and show how you can implement similar functionality on your sites. All of djangosnippets.org's search leans on Apache Solr, a powerful search engine built on top of Apache Lucene (full-text search). Haystack is the search solution for Django apps - it provides a querying interface similar to Django's ORM, handles indexing your models for you, and supports advanced features like "more-like-this" and faceting.
This post discusses the two flavors of model inheritance supported by Django, some of their use-cases as well as some potential gotchas.
In this post I'll show how I used Hookbox, a comet server/message queue, and Flask, a lightweight python web framework, to create a simple real-time chat app. This post will walk through creating a bare-bones example, then discuss ways to add additional functionality.
Sites often have many views that operate with a similar set of assumptions. Maybe there are entire areas that the user must be logged-in to visit, or there is some repetitive boilerplate functionality that a group of views shares like being rate-limited. This post looks at ways to make this kind of functionality less repetitive by using a common Django pattern, view decorators.
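The shape of the pattern can be shown without django at all. In the sketch below, the request is a plain dict and the responses are strings, both assumptions made so the example runs on its own. The `login_required` name mirrors django's decorator of the same name, but this is a simplified stand-in and not django's implementation.

```python
import functools


def login_required(view):
    """Run the wrapped view only when the request carries a user."""
    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        if not request.get("user"):
            # Short-circuit before the view body ever runs.
            return "302 redirect to /login/"
        return view(request, *args, **kwargs)
    return wrapper


@login_required
def dashboard(request):
    return "200 welcome " + request["user"]


anonymous_response = dashboard({})
member_response = dashboard({"user": "charles"})
```

The check lives in one place, and each view opts in with a single line above its definition.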
As the first installment in a series on common patterns in Django development, I'd like to discuss the Pluggable Backend pattern. The pattern addresses the common problem of providing extensible support for multiple implementations of a lower level function, be it caching, database querying, etc.
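A small sketch of the idea, using serialization as the lower-level function. The `SETTINGS` dict and the `SERIALIZER_BACKEND` key are hypothetical names chosen for illustration. The standard library's `json` and `pickle` modules serve as the two interchangeable backends because both expose `dumps` and `loads`.

```python
import importlib

# Hypothetical project setting: the import path of the backend to use.
SETTINGS = {"SERIALIZER_BACKEND": "json"}


def get_backend(settings=SETTINGS):
    """Import the configured backend by name at call time."""
    return importlib.import_module(settings["SERIALIZER_BACKEND"])


def save(data, settings=SETTINGS):
    # Calling code depends only on the shared dumps/loads interface.
    return get_backend(settings).dumps(data)


def load(blob, settings=SETTINGS):
    return get_backend(settings).loads(blob)


json_blob = save({"a": 1})

# Switching implementations is a configuration change, not a code change.
pickle_settings = {"SERIALIZER_BACKEND": "pickle"}
pickle_blob = save([1, 2], pickle_settings)
```

Django-style settings usually name a class by dotted path rather than a whole module, but the lookup works the same way: import by string, then rely on an agreed interface.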
All about the lightweight jQuery-powered markup editing toolkit markItUp, which I used recently to implement basic reStructuredText markup support.
I'm pleased to announce the release of djangoembed, a django app for consuming and providing rich media.
OEmbed is a format for allowing a rich representation of a url. If you've used Facebook you've probably seen this feature before -- linking a YouTube video will embed an actual video player in the news feed, automatically. The player is represented by some HTML, plus there may be additional metadata like the author, a link to their channel, the title of the video, or even a thumbnail.
Last week, after several false starts, I moved all the sites I maintain into virtualenvs, with their own pip requirements files. My reasons for doing so are pretty simple:
There are quite a few great tutorials out there for getting started with these tools. I will only discuss how I got over some of the hurdles involved in using these tools, as well as a tool for automating the creation of "skeleton" django sites.
Apache Solr is a fast, open-source search solution. People are doing some very cool things with Solr. I personally have only begun to scratch the surface of what is possible with Solr, but have seen amazing returns with a relatively small investment (thanks entirely to Daniel Lindsley's excellent search framework, django-haystack). There are instructions for getting up and running with Solr + Jetty -- the purpose of this blog entry is to walk through setting up multi-core Solr with Apache Tomcat.
To get the benefit of Django 1.2's new CSRF protection, all POST forms will need a special token. Here is a quick command that runs through templates adding the token:
find . -type f -name "*.html" -exec sed -i \
's|\(<form[^>]*method="post"[^>]*>\)\({% csrf_token %}\)\?|\1{% csrf_token %}|g' \
{} \;
This post will be very brief, but I want to show a little trick I'm using on my different servers so I can tell them apart at a glance. I use a custom bash prompt which gives the hostname of each server a different color:
[charles@alpha ~] $
[charles@beta ~] $
Generating aggregate data across generic relations May 22, 2010 19:22 / 7 comments
Aggregation support was added to django's ORM in version 1.1, allowing you to generate Sums, Counts, and more without having to write any SQL. According to the docs aggregation is not supported for generic relations. This entry describes how to work around this using the .extra() method.
django-site-gen, a tool for automating site creation May 03, 2010 18:14 / 2 comments
Most django sites I create have quite a lot in common. Beyond the handful of files generated by django-admin startproject, my projects all have a database, wsgi file, apache and nginx confs, static media and templates. All these building blocks of a site vary very little from project-to-project. django-site-gen allows you to automate the creation of the stuff that doesn't vary.
Idea for a simple task queue May 01, 2010 12:45 / 1 comments
cue is a simple queue abstraction along the lines of the command pattern, and provides django apps with a way to decouple the creation of a request from its execution.
A handful of snippets April 24, 2010 14:36 / 4 comments
A collection of a few snippets I've made use of recently.
Topeka and Google April 01, 2010 10:02 / 0 comments
Topeka, the state capital of Kansas, is about 30 minutes down the road from Lawrence, KS where I work. In a recent publicity stunt, they've been talking about renaming the capital to 'Google' in order to bring Google fiber-optics to the city. Here's Google's response.
Announcing django-relationships March 27, 2010 19:00 / 7 comments
I recently posted on writing an app that allows you to create flexible and descriptive relationships between Django's built-in auth.users. django-relationships is the result.