Entries tagged with python
Saturday morning hacks: Adding full-text search to the flask note-taking app
In preparation for the fourth and final installment in the "Flask Note-taking app" series, I found it necessary to improve the search feature of the note-taking app. In this post we will use SQLite's full-text search extension to improve the search feature.
To recap, the note-taking app provides a lightweight interface for storing markdown-formatted notes. Because I frequently find myself wanting to take notes on the spur of the moment, the note-taking app needed to be very mobile-friendly. By using twitter bootstrap and a hefty dose of JavaScript, we made an app that matches our spec and manages to look good doing it!
In part 2, we added email reminders and check-able task lists to the note-taking app. We also converted the backend to use flask-peewee's REST API extension, which made it easy to add pagination and search. And that is how I've left it for the last three months or so.
Below is a screenshot of the latest version of the notes app. The UI is much cleaner thanks to a stylesheet from bootswatch. The bootswatch stylesheet works as a drop-in replacement for the default bootstrap CSS file.
All together, the note-taking app has the following features:
- Flexible pinterest-style tiled layout that looks great on a variety of screen sizes.
- Easy to create notes and reminders from the phone.
- Notes support markdown and there is also a simple WYSIWYM markdown editing toolbar.
- Links are converted to rich media objects where possible (e.g. a YouTube URL becomes an embedded player).
- To-do lists (or task lists) can be embedded in notes.
- Email reminders can be scheduled for a given note.
- Simple full-text search.
- Pagination.
You can browse or download the finished code from part 2 in this gist. If you're in a hurry, you can find all the code from this post in this gist.
In case you were curious, I've been using the notes app for things like:
- Bookmarking interesting sites to read later.
- Creating short to-do lists or writing down particular items to get from the store, etc.
- Writing down interesting dreams or ideas I get in the middle of the night.
- Appointment reminders, reminders to call people, etc.
- Saving funny cat pics.
- Writing down ideas for programming projects.
- Saving code snippets or useful commands.
SQLite: Small. Fast. Reliable. Choose any three.
SQLite is a fantastic database and in this post I'd like to explain why I think that, for many scenarios, SQLite is actually a great choice. I hope to also clear up some common misconceptions about SQLite.
Completely un-scientific benchmarks of some embedded databases with Python
I've spent some time over the past couple weeks playing with the embedded NoSQL databases Vedis and UnQLite. Vedis, as its name might indicate, is an embedded data-structure database modeled after Redis. UnQLite is a JSON document store (like MongoDB, I guess??). Beneath the higher-level APIs, both Vedis and UnQLite are key/value stores, which puts them in the same category as BerkeleyDB, KyotoCabinet and LevelDB. The Python standard library also includes some dbm-style databases, including gdbm.
For fun, I thought I would put together a completely un-scientific benchmark showing the relative speeds of these various databases for storing and retrieving simple keys and values.
Here are the databases and drivers that I used for the test:
- UnQLite: unqlite-python (ctypes)
- Vedis: vedis-python (ctypes)
- GDBM: standard library (C)
- BerkeleyDB (b-tree and hash-table): bsddb3 (C)
- KyotoCabinet (b-tree and hash-table): kyotocabinet (C++)
- LevelDB: plyvel (Cython)
- RocksDB: pyrocksdb (Cython)
- SQLite: standard library (C and Python)
- Redis: redis-py (Python) -- this one is just for fun!
I'm running these tests with:
- Linux 3.14.4
- Python 2.7.7 (Py2K 4 lyfe!)
- SSD
For the test, I simply recorded the time it took to store 100K simple key/value pairs (no collisions). Then I recorded the time it took to read back all these values. The results are in seconds elapsed:
Saturday morning hacks: Revisiting the notes app
My post from last month, Saturday Morning Hack, a Little Note-Taking App with Flask, was pretty well-received. Since I've made a number of improvements to the app, I thought I would write one more post to share some of the updates I've made to this project, in the hopes that they may be of interest to you.
A live demo is up and running on Python Anywhere, so feel free to check that out before continuing on with the post: http://beetlejuicer.pythonanywhere.com/
To briefly recap the previous post, I discussed how I built a lightweight note-taking app which I could use from my phone or desktop. It has a nice ajax-ey interface and some simple markdown helpers written with javascript. In addition to supporting markdown, it also supports oembed for automatically embedding YouTube videos and the like. Here is what it looked like when we left off a few weeks ago:
And this is how it looks now!
So what's new? Well, I've made a couple changes under-the-hood, and added some entirely new features to the UI.
- Allow creation of Task Lists with checkbox inputs.
- Create reminders that will send me an email at the appointed time.
- Built a RESTful API to interact with the
Note
model. Thanks to flask-peewee everything comes "for free". - Added search.
- Added pagination (using Ajax).
This was super fun to hack on so I thought I'd share the new code and describe how I added these features. Honestly, I didn't really end up adding much in terms of implementation. Huey handles scheduling and sending the email reminders, even automatically retrying messages that fail to send. Similarly, Flask-Peewee's REST API provides search and pagination out-of-the-box, so all I had to do was write the JavaScript to communicate with it. Thanks to these libraries, I was able to focus on the things that made this project unique, and hopefully you enjoy reading about the code.
Read the rest of the post for the details.
Using SQLite Full-Text Search with Python
In this post I will show how to use SQLite full-text search with Python (and a lot of help from peewee ORM). We will see how to index content for searching, and how to order search results using two ranking algorithms.
Last week I migrated my site from Postgresql to SQLite. I had been using Redis to power my site's search, but since SQLite has an awesome full-text search extension, I decided to give it a try. I am really pleased with the results, and being able to specify boolean search queries is an added plus. Here is a brief overview of the types of search queries SQLite supports:
- Simple phrase: peewee would return all docs containing the word peewee.
- Prefix queries: py* would return docs containing Python, pypi, etc.
- Quoted phrases: "sqlite extension"
NEAR
: peewee NEAR sqlite would return docs containing the words peewee and sqlite with no more than10
intervening words. You can also specify the max number of intervening words, e.g. peewee NEAR/3 sqlite.AND
,OR
,NOT
: sqlite OR postgresql AND NOT mysql would return docs about high-quality databases (just trollin).
Check out the full post for details on adding full-text search to your project.
Saturday morning hack: personalized news digest with boolean query parser
Because I had so much fun writing my last Saturday morning hack, I thought I would share another little hack. I was thinking that I really enjoy my subscription to Python weekly and wouldn't it be great if I had a personal email digest containing just the types of things that interest me? I regularly cruise reddit and hacker hater news but in my opinion there's a pretty low signal-to-noise ratio. Occasionally I stumble on fascinating content and that's what keeps me coming back.
I wanted to write an app that would search the sites I read and automatically create an email digest based on search terms that I specified. I recently swapped my blog over to SQLite and I love that the SQLite full-text search extension lets you specify boolean queries. With that in mind, I decided that I would have a curated list of boolean search queries which would be used to filter content from the various sites I read. Any articles that match my search would then be emailed to me.
Here are some of my search terms, which I am viewing in the flask-peewee admin interface:
If you're interested in learning how to build your own version of this project, check out the rest of the post.
Migrating to SQLite
Small. Fast. Reliable. Choose any three.
I made the decision this week to migrate my personal sites and several other sites I host onto SQLite. Previously almost everything I hosted had been using Postgresql. The move was motivated by a couple factors:
- SQLite is awesome!
- Self-contained: does not require a separate server process
- Data is stored in a single file, simplifying backups
- Excellent Python (and peewee) support
- Full-text search
At times it has seemed to me that there is a tacit agreement within the Flask / Django communities that if you're using SQL you should be using Postgresql. Postgresql is an amazing piece of engineering. I have spent the last five years of my career working exclusively with it, and I am continually impressed by its performance and the constant stream of great new features.
So why change things?
Well, as my list indicates, there are a handful of reasons. But the primary reason was that I wanted something lightweight. I'm running a fairly low-traffic, read-heavy site, so Postgresql was definitely overkill. My blog is deployed on a VPS with very limited resources, so every MB of RAM counts. Additionally, I wasn't using any special Postgresql features so there was nothing holding me back.
(continued...)
Saturday morning hack: a little note-taking app with Flask
A couple Saturdays ago I spent the morning hacking together a note-taking app. I'm really pleased with the result, so I thought I'd share the code in case anyone else might find it useful.
The note-taking project idea came about out of necessity -- I wanted something that worked well from my phone. While I have a personal wiki site I've used for things like software installation notes or salsa recipes, I've also noticed that because it's so cumbersome to use from my phone, I often end up emailing things to myself. Plus a wiki implies a kind of permanence to the content, making it not a great fit for these impromptu notes. I also like to use markdown to format notes, but markdown isn't too easy on a phone because of the special characters or the need to indent blocks of text. With these considerations in mind, I set out to build a note-taking app that would be easy to use from my phone.
Here is how the app appears on a narrow screen like my phone:
And here it is on my laptop:
Because markdown is a bit difficult to use when you're not in a nice text editor like vim, I've added some simple toolbar buttons to the editor:
Read the full post for all the details!
The search for the missing link: what lies between SQL and Django's ORM?
I had the opportunity this week to write some fairly interesting SQL queries. I don't write "raw" SQL too often, so it was fun to use that part of my brain (by the way, does it bother anyone else when people call SQL "raw"?). At Counsyl we use Django for pretty much everything so naturally we also use the ORM. Every place I've worked there's a strong bias against using SQL when you've got an ORM on board, which makes sense -- if you choose a tool you should standardize on it if for no other reason than it makes maintenance easier.
So as I was saying, I had some pretty interesting queries to write and I struggled to think how to shoehorn them into Django's ORM. I've already written about some of the shortcomings of Django's ORM so I won't rehash those points. I'll just say that Django fell short and I found myself writing SQL. The queries I was working on joined models from very disparate parts of our codebase. The joins were on values that weren't necessarily foreign keys (think UUIDs) and this is something that Django just doesn't cope with. Additionally I was interested in aggregates on calculated values, and it seems like Django can only do aggregates on a single column.
As I was prototyping, I found several mistakes in my queries and decided to run them in the postgres shell before translating them into my code. I started to think that some of these errors could have been avoided if I could find an abstraction that sat between the ORM and a string of SQL. By leveraging the python interpreter, the obvious syntax errors could have been caught at module import time. By using composable data structures, methods I wrote that used similar table structures could have been more DRY. When I write less code, I think I generally write less bugs as well.
That got me started on my search for the "missing link" between SQL (represented as a string) and Django's ORM.
Using python to generate awesome linux desktop themes
I remember spending hours when I was younger cycling through the various awesome color themes on my 386, in the glory days of windows 3.1. Remember hotdog stand?
Well, I haven't changed much. I still enjoy making tweaks to the colors and appearance of my desktop. In this post I'll talk about a script I wrote that makes it easy for me to modify all the various colors and configuration files which control the appearance of my desktop.