Archive for the ‘Python’ Category.

Django and jquery.tmpl

Lately, I’ve been finding Django increasingly inappropriate for the web applications I develop. I have several complaints: the forms library doesn’t extend well to ajax requests; any extensive customization of the admin requires obtuse inspection of the admin source code; many admin customizations simply aren’t possible; the “reusable apps” philosophy has added a layer of complexity to a lot of things that really should not be there; and there are no obvious best practices for ajax support.

In spite of all this, Django is still better than the other frameworks (Python or not) that I have investigated or tested. I’ve considered writing my own web framework, but I wouldn’t maintain interest in it long enough to get it off the ground. So I’m letting my complaints bounce around in the back of my mind with the hope that I can improve Django so that it continues to work for me as a platform.

I’m currently trying to come up with a better system for ajax requests. I have nothing concrete in mind, but I’ve started with the premise that ajax requests should never return rendered html; they should return only json (hence my issue with the Django forms library). With that in mind, I need a templating library for json data. jQuery is a must, and the officially supported jQuery templating library is jquery.tmpl: http://api.jquery.com/category/plugins/templates/
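To make the premise concrete, the kind of view I have in mind returns bare data instead of rendered html. A minimal sketch (the view name and the hard-coded list are invented stand-ins for a real queryset):

import json
from django.http import HttpResponse

def project_list(request):
    # Return bare JSON; the client-side template does the rendering.
    projects = ["Great Big Crane", "psyclone"]
    return HttpResponse(json.dumps({"projects": projects}),
                        content_type="application/json")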

The problem with jquery.tmpl is that it uses django-like syntax. The following is a valid block that may be rendered in a jquery.tmpl page:

<script id="project_tmpl" type="text/x-jquery-tmpl">
    {{each projects}}<li>${$value}</li>{{/each}}
</script>

If you try to include this in a Django template, the {{ and }} tags will be replaced with (probably empty) variables. Django has {% templatetag %} to render these individual items, but what we really need is a way to tell the Django templating system to leave entire sections of code alone. So I wrote the jqtmpl template tag. It allows us to wrap code in a block that tells Django not to render that block as template code. So the above would show up in a Django template as follows:

<script id="project_tmpl" type="text/x-jquery-tmpl">
{% jqtmpl %}
    {{each projects}}<li>${$value}</li>{{/each}}
{% endjqtmpl %}
</script>

Here’s the template tag:

from django.template import Library, TextNode, TOKEN_BLOCK, TOKEN_VAR

register = Library()

@register.tag
def jqtmpl(parser, token):
    """Emit everything up to {% endjqtmpl %} as literal, unrendered text."""
    nodes = []
    t = parser.next_token()
    while not (t.token_type == TOKEN_BLOCK and t.contents == "endjqtmpl"):
        if t.token_type == TOKEN_BLOCK:
            # Reconstruct block tags ({% ... %}) verbatim.
            nodes.extend(["{%", t.contents, "%}"])
        elif t.token_type == TOKEN_VAR:
            # Reconstruct variable tags ({{ ... }}) verbatim.
            nodes.extend(["{{", t.contents, "}}"])
        else:
            # Plain text tokens pass through untouched.
            nodes.append(t.contents)

        t = parser.next_token()

    return TextNode(''.join(nodes))
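To use it, this code goes in a templatetags module of an installed app (say, templatetags/jqtmpl.py) and gets pulled into the template with {% load jqtmpl %}.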

This doesn’t handle Django’s {# comments #}, as token.contents doesn’t return a valid value for comment items. As far as I know, you wouldn’t use the comment construct inside a jquery.tmpl template anyway, so it’s still functional.

Next on my list is a better forms validation library to suit my theory that validation should be done client side. I’ve got a server-side system in mind that returns json responses, and does not return user-facing error messages, since those should have been caught client side. With these tools, and hopes for Grappelli to eventually create a decent admin, Django may continue to serve me.

Updating m2m after saving a model in Django admin

I wanted to ensure that a many-to-many field on my model contained the value of a foreign key field on the same model whenever the model was saved in the admin, like so:

            obj.participants.add(obj.instructor)

I thought this was a trivial task. I added this code to a post_save signal connected to the model, but the participants list was not being updated. I was pretty sure the reason it didn’t work was that form.save_m2m() must be called somewhere in the admin after the object was saved, which would override my m2m changes with the values from the form. Reading the Django admin source code confirmed this, but it didn’t show an obvious way to circumvent the behavior.

It is possible to override the change_view and add_view methods on the ModelAdmin object, but I did not want to do that. I would have had to copy the entire change_view body into the subclass, as there is no way to make a super() call do what I wanted here. Here’s the section of code I needed to modify (it’s in django.contrib.admin.options.ModelAdmin.change_view if you want to look):

            if all_valid(formsets) and form_validated:
                self.save_model(request, new_object, form, change=True)
                form.save_m2m()
                # Right here is where I want to insert my call
                for formset in formsets:
                    self.save_formset(request, form, formset, change=True)
 
                change_message = self.construct_change_message(request, form, formsets)
                self.log_change(request, new_object, change_message)
                return self.response_change(request, new_object)

Obviously, I can’t just override save_model, because save_m2m() is called after it and would still wipe out my changes. I really need a self.after_m2m() call at the point I have commented in the above code. But I don’t have one.

I really didn’t want to copy this entire method into my ModelAdmin subclass (in admin.py) just to add that one call… so instead, I overrode another method that happens to have access to the new object and is called after save_m2m(). See that call to self.log_change a few lines later? That method updates the admin log db table, but it also happens to receive the newly created object. I want to emphasize that this is an ugly hack:

# in admin.py
from django.contrib import admin

class ClassListAdmin(admin.ModelAdmin):
    # ...
    def log_change(self, request, object, message):
        super(ClassListAdmin, self).log_change(request, object, message)
        # The add and change views happen to call log_addition and
        # log_change after the object has been saved. I update the
        # m2m at this point (in update_participants) because the add
        # and change views call form.save_m2m(), which wipes out the
        # changes if I put them in self.save_model().
        self.update_participants(object)

self.update_participants(), of course, contains the code I originally wanted to run.
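For the record, in this case that method is just the one-liner from the top of this post, hung off the same ModelAdmin:

    def update_participants(self, obj):
        # Make sure the instructor is always among the participants.
        obj.participants.add(obj.instructor)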

This isn’t the most proper way to do this, but if you’re looking for a quick, dirty but DRY hack, it might save you some time.
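One caveat: the add view calls log_addition rather than log_change, so covering newly created objects needs a matching second override. A sketch mirroring the method above (in this era of Django, log_addition takes just the request and the object):

    def log_addition(self, request, object):
        super(ClassListAdmin, self).log_addition(request, object)
        self.update_participants(object)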

Get your head out of the clouds: Local web applications

I spent this weekend with two friends crazy enough to join me in a 48 hour coding sprint for the Django Dash. We competed in the dash last year and placed 5th. Our goal was to move up in the rankings this year (competition is stiff, wish us luck!). Our team had the highest number of commits, but I can’t say how many of them can be justified as quality commits… especially since we keep track of our TODO file inside the git repository!

This year, we created a project called Great Big Crane. (I don’t know why we called it this.) The code is stored on Github, and we set up a splash page at greatbigcrane.com. We don’t have a live demo for reasons I’ll get into shortly.

This project’s primary purpose is to help manage buildouts for Python projects, especially Django projects. It helps take care of some of the confusing boilerplate in buildout configuration. It also allows one-click access to common commands, like running bootstrap or buildout, manage.py commands such as syncdb and migrate, and the test suite associated with a buildout. It performs most of these actions as background jobs and pops up a notification when each completes. It even keeps track of the results of the latest test suite run, so you can see at a glance which of your projects are failing their tests.

One of the most intriguing things this application does is open a text editor, such as gvim, to edit a buildout if you need more control than our interface provides. It does this by queuing a job that executes the text editor command on the server.
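Conceptually, the job just shells out. Something like this minimal sketch (not the actual Great Big Crane code; the editor command and path are placeholder values):

import subprocess

def edit_buildout(path, editor="gvim"):
    # The server runs on localhost, so this opens a window on the
    # developer's own desktop rather than on some remote machine.
    subprocess.Popen([editor, path])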

Wait, what? It can be a bit creepy when clicking a button in a web application fires up an arbitrary program on your computer.

This entire app is designed to run on localhost. It’s set up for developers to manage their own projects. It doesn’t support authentication (this is why we don’t have a live demo), and the server has full access to the local filesystem. It’s meant to support your local IDE, not to provide an online IDE. The entire app is therefore super fast (no network delay), and switching from it to my text editor to several terminals became quite normal as I was developing on it (yes, the buildout for Great Big Crane runs just fine from inside Great Big Crane ;-)).

So yes, you’re expected to run this web app locally. Why would anybody want to do this? Is it a sensible thing to do?

The alternative to what we’ve done here would be to code the whole thing up as a GUI application of some sort. I have experience with most of the Python GUI toolkits, and I can’t say that I “enjoy” working in any of them. I’m not sure I enjoy working in HTML either, but I do a lot of it. HTML 5 with CSS 3 is certainly a powerful and reasonable alternative to modern graphical toolkits.

I’ve been coding HTML for so long that I don’t know what the learning curve is for it, but I’m definitely more comfortable working with it than I am with Tk, Qt, GTK, or wxWidgets, all of which take a long time to learn to use properly. Possibly I’m just stagnating, but I think I’d prefer to develop my next “desktop” app as a webapp intended to run locally, rather than study these toolkits again. Indeed, because I suspected I’d prefer that, I started coding my last project in PyQt, just to fight the stagnation tendency. PyQt is an incredibly sensible toolkit, once you have learned how to make sense of it, but it’s not as sensible as the new web standards. Another advantage is that if you ever decide to make the app network enabled, you’re already running an app server and using standard web technologies, so pushing it to the cloud is straightforward.

So my gut feeling at this point is that yes, it is sensible to design “traditional” desktop apps using HTML 5, CSS, and javascript for the interface, and your choice of webserver and web framework for the backend. Perhaps it’s not any more sensible than using a GUI toolkit, but it’s certainly not insane.

If it makes sense to replace local desktop apps with a local server, does it also make sense to replace web applications with a local system?

I’m not a huge fan of web applications because they are slow for me. I have a good connection (by Canadian standards, which aren’t high…). Yet Gmail is slower than Thunderbird, Freshbooks is too slow for me to justify paying for it, and github, while fascinating, is also slow compared to local access. The only webapp I have tested that I consider responsive is Remember The Milk, a popular todo list. I’m not certain what they do to make it so responsive, but I suspect Google Gears or HTML 5 localstorage must be involved.

Local storage. I’ve written about this before (I must be getting repetitive). My idea then was that offline enabled webapps are just as responsive as desktop apps. But the current available paradigm, using HTML5 localstorage, requires a lot of overhead normally involving manual syncing between the browser data and the server. What if I was running the app locally instead? Then I could just design it as a “normal” web app, without having to put extra thought into designing and maintaining local storage in the browser. It would be super responsive when I access it from my computer. More interestingly, it would also be available from remote computers. If I accessed it across my LAN using another laptop or my phone’s wifi, it would still be acceptably responsive. And if I happen to need access from the library or my friend’s computer, I can log in remotely, and still have approximately the same level of responsiveness that I currently get by logging into a server in the cloud.

This isn’t a new idea. It’s been presented as a “gain control of your own data” alternative to the privacy and control fears that Google, Facebook, and Apple (among others) have been creating (this interview with Eben Moglen is a nice discussion: http://www.h-online.com/open/features/Interview-Eben-Moglen-Freedom-vs-the-Cloud-Log-955421.html). There are a lot of clear advantages to moving data local, but there are also disadvantages. The nice thing about cloud storage is not having to worry about data backup. The “access anywhere” paradigm is nice too, although that is not ruled out by running a home webserver. Zero install, and end users not having to think about dependencies, is also nice.

Overall, I’m finding more and more reasons to bring our apps home, where we have control of them. Such cycles are common in the technology industry: dumb terminals and mainframes, then personal computers; business networks, then the Internet, then the cloud; off-board video, then on-board video; network cards, then on-board NICs; hardware modems, then software modems. Personally, I think the cycle away from the cloud is just beginning. I think the company that finally conquers Google will do it by giving you back control of your data. I’ve never been totally comfortable with the whole web application idea (as a user; they’re fine to develop!). I’m still trying to identify what my reasons are, but in the meantime, we experimented with the idea by developing Great Big Crane as a local web application.

Converting a Django project for zc.buildout

Jacob Kaplan-Moss has written an article on using zc.buildout to develop a django application. My goal is slightly different: I want to deploy an entire django project, with numerous dependencies, using zc.buildout. The documentation seems scarce, so I’m trying to keep track of each step as I go, in the hopes that it may be useful to someone someday.

I have an existing Django project that I’m having trouble deploying and sharing with other developers. It’s located in a private github repository. So my goal is not only to manage a Django project, but to manage an already mature project. This is, of course, harder than starting from scratch.

I do my development on Arch Linux, which is currently running Python 2.6 (and 3.1, but Django isn’t supported, so I’m using 2.6 for this project). I have git version 1.7.1, and my project is using Django version 1.2.1.

Since I didn’t know what I was doing, I started by doing some exploring. I created an empty directory and ran:

wget http://svn.zope.org/*checkout*/zc.buildout/trunk/bootstrap/bootstrap.py

to install the buildout bootstrap. I then created a very simple buildout.cfg file based on the djangorecipe example:

[buildout]
parts = django
eggs = ipython
 
[django]
recipe = djangorecipe
version = 1.2.1
eggs = ${buildout:eggs}
project = my_project

I then ran:

python bootstrap.py
./bin/buildout

Suddenly, my directory, which had contained only two files (bootstrap.py and buildout.cfg), looked like this:

bin
bootstrap.py
buildout.cfg
develop-eggs
downloads
eggs
my_project
parts

Jacob’s article has an excellent description of all these files. The main question for me was “where does my source go?” This example shows that the project source code goes in my_project. Djangorecipe had created the following structure in that directory:

development.py
__init__.py
media
production.py
settings.py
templates
urls.py

The development.py and production.py files both do from my_project.settings import *, and then customize some variables. My habit has always been to keep a localsettings.py in my .gitignore and do from localsettings import * in my main settings.py. For my project I had to decide whether to stick with my old habits, or modify my setup to be parallel to the djangorecipe version.
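For reference, my old pattern is just a guarded import at the bottom of settings.py (the try/except is an embellishment for this post, so fresh checkouts work before localsettings.py exists):

# at the bottom of settings.py
try:
    from localsettings import *
except ImportError:
    pass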

I see that djangorecipe has a way to select the settings to use for a given buildout, but if buildout.cfg is under version control, wouldn’t that make selecting settings a pain? And if each developer has a different database setup, would we require a different settings module for each developer? In my experience, it is better to do things the way the examples in the documentation say it should be done, because they know what they’re doing and I don’t. But in this case, I decided to keep my layout as is. I can always change it later.

The thing I wanted to learn from that experiment was where my source goes; apparently it goes in a folder with my project’s name at the same level as buildout.cfg and bootstrap.py. Looks like I’m going to have to move my code around in my project’s version control.

First I checked out a new branch, because that is the thing to do in git, and specifically because I want it to be easy to go back to the status quo if I decide, halfway through the process, that buildout is a pain to configure.

git checkout -b buildout

The first thing I want to do is move all my files into a new subdirectory with my project’s name, so buildout can have the top of the git tree for its own files:

mkdir my_project
git mv -k !(my_project) my_project
mv localsettings.py my_project
rm *.pyc
git commit

The git mv command essentially says “move anything that isn’t my_project into my_project“. The -k switch says “just ignore anything that isn’t under version control.” This left my localsettings.py and a few .pyc files in the main directory, since those files are in .gitignore, so I cleaned them up manually. Finally, I committed the changes, so the move happened in a single commit.

Now it’s time to start creating a new buildout, this time in the version-controlled directory. I ran the wget command to get bootstrap.py, and I copied the buildout.cfg from my exploration directory. Then I ran the bootstrap and bin/buildout commands to see what would happen. They did the same thing as before, except for printing django: Skipping creating of project: my_project since it exists. That’s what I wanted. Running git status showed several patterns that needed to be added to my .gitignore:

.installed.cfg
bin
develop-eggs
downloads
eggs
parts

I also had to change the .gitignore file to ignore my_project/static/uploads instead of just static/uploads.

At this point, I decided to commit bootstrap.py and buildout.cfg:

git add bootstrap.py buildout.cfg
git commit

Now, I know I’m missing dozens of dependencies, but I wanted to see what happens if I run bin/django. My understanding is that this is supposed to be a wrapper similar to manage.py, but using the buildout’s django environment. It failed, telling me that the development settings.py file didn’t exist. I modified the buildout.cfg to add settings = settings to the django recipe. Then I ran bin/django again, and nothing had changed.

Whenever you change buildout.cfg, you have to run bin/buildout again to rebuild the environment (rant: I hate compile steps!).

I was worried that my custom management commands (in my case, for py.test testing and running South migrations) would not show up, but there they were, listed in the help output that bin/django provided. This is especially surprising, since I had not installed South inside the buildout yet! It appears that bin/django is a drop-in replacement for manage.py.

Next, I ran bin/django shell expecting to enter dependency hell. Not yet! Instead, I got the error “no module named my_project.settings”. Looking at the bin/django script, it prepends the project name to the settings module. I have a habit of not including an __init__.py in my project directory, preferring to think of a Django project as a collection of apps, rather than an independent package. I don’t want to write from my_project.my_app import something, because then the apps are no longer reusable. In my world, the project is not a package. Apparently, djangorecipe thinks it is. So touch my_project/__init__.py had to happen, since I definitely don’t want to start hacking the recipe at this point!

Now I have “no module named …” errors for each of my INSTALLED_APPS, because I list my apps as “x” instead of “my_project.x”. To fix this, I added extra-paths = my_project to the [django] part, which inserts the project directory into the path.

Then I ran bin/django shell and bin/django runserver, only to discover that everything was working! Apparently my buildout had not installed into a private environment, and was still accessing the default site-packages on my system. Not quite what I wanted. I thought zc.buildout created an isolated environment, much like virtualenv, only portable across systems. My mistake.

zc.buildout does not create an isolated sandboxed environment by default.

I had to do a lot of google searching to come to this conclusion. There are many statements out there suggesting that zc.buildout can and does create an isolated environment, but none of them turned out to be true. zc.buildout is all about reproducibility, while virtualenv is about isolation. They are not competing products, and the ideal environment uses both of them.

So I removed all the temp files and directories (including the hidden .installed.cfg) that buildout had created for me and started over, this time inside a virtualenv:

virtualenv -p python2.6 --no-site-packages .
source bin/activate
python bootstrap.py
bin/buildout

I temporarily removed IPython from the eggs because it was refusing to download; the server must have been down. This time, when I ran bin/django shell, I got a proper dependency error for psycopg2. Looks like I was finally on the right track. I also had to add several directories virtualenv had created to my .gitignore.

Before buildout, I had a rather complicated dependencies.sh file that installed all my dependencies using a combination of easy_install, git checkout, hg checkout, etc. I started with the easy_install items: things that can be installed from PyPI. I created a new eggs part in my buildout. The entire file now looked like this:

[buildout]
parts = eggs django
 
[eggs]
recipe = zc.recipe.egg
interpreter = python
eggs =
    psycopg2
    south==0.7
    django-attachments
    pil==1.1.7
    Markdown
    recaptcha-client
    django-registration-paypal
    python-dateutil
 
[django]
settings = settings
recipe = djangorecipe
version = 1.2.1
eggs = ${eggs:eggs}
project = my_project
extra-paths = my_project

Trying to run bin/buildout now causes a “Text file busy” error. At this point, I’m seriously considering that buildout is more of a pain than a help. It’s poorly documented and broken (some might say poorly documented IS broken). And I know I have an even harder task coming up when I have to patch a git clone.

But, I’m obstinate, and I persevered. Google was quick to confirm my hypothesis that virtualenv and buildout were both trying to claim the bin/python file. The solution was to change the interpreter = python line in my recipe; I called the buildout interpreter “py” instead.

This time, when I ran bin/django shell I got an error pertaining to a module that needs to be installed from git. Time to look for a git recipe! Here’s how it eventually looked:

[django-mailer]
recipe = zerokspot.recipe.git
repository = git://github.com/jtauber/django-mailer.git
as_egg = True

I also had to add django-mailer to my parts in the [buildout] section, and arranged the [django] extra-paths section as follows:

extra-paths =
    ${buildout:directory}/my_project
    ${buildout:directory}/parts/django-mailer

I had a second git repository to add, and this one was messy because the code in that project was not working and my dependencies.sh had been applying a patch to it. I was considering whether I would have to hack the git recipe to support applying patches when I realized a much simpler solution: fork it on github. So I did that, applied my patch, and rejoiced at how simple it was.

Finally, I had to install an app from a mercurial repository (because we can’t all use the One True DVCS, can we?). I found MercurialRecipe, but no examples of how to use it. It’s not terribly difficult:

[django-registration]
recipe = mercurialrecipe
repository = http://bitbucket.org/ubernostrum/django-registration

With all my dependencies set up, I was finally able to run bin/django shell without any errors.

Now I have to figure out how to make this thing work in production, but that’s another post. I hope it works flawlessly on my co-developer’s Mac, and that the new pain will be less than the old pain. This was a huge amount of work (several hours went into it), and I won’t know for a while whether it was worth it.

Django Registration Paypal Backend

One of my clients needed integration with Paypal for his site. The django-paypal module is great for this sort of thing, but it didn’t quite suit our needs, because we wanted to disallow account logins until payment had been collected. We had been using django-registration to support registration, and I decided to stick with it. The development version of django-registration has support for different registration backends. The default backend sends an e-mail to verify the user’s e-mail address before they can get in; a simpler backend included with the project allows direct registrations with no verification. We needed a backend that required not only e-mail address verification, but also confirmation that a paypal payment had been made.

And so, django-registration-paypal was born:

http://pypi.python.org/pypi/django-registration-paypal/0.1.1

http://github.com/buchuki/django-registration-paypal

It is, like all my first releases, pretty rough around the edges, and contains at least one glaring security hole. But it’s out in the wild now, and patches, as always, are welcome.

Python 3 Object Oriented Programming

For the past eight months, I’ve been working hard on a project that’s a little out of the ordinary for me. It’s the reason there’s been such a drastic reduction in the number of blog posts here. It’s the reason I haven’t been earning enough money to cover my expenses each month. And it’s my biggest accomplishment to date.

I’ve written a book (an entire book!) on object oriented programming, with a focus on syntax and libraries supported in the exciting new Python version 3. It’s designed for beginner to intermediate Python developers who are more familiar with Python as a scripting language than as an object oriented programming language.

As a byproduct, it also introduces Python 3 syntax, and will be a great reference for programmers wanting to upgrade their Python 2 skills. For the most part, Python 3 is a simpler, more elegant language. The learning curve is shallow, but it takes some getting used to.

It also summarizes the state of the most exciting libraries available for Python 3 at this time. If you’ve been wondering when it’s time to start migrating to the new language, it is now!

I’m currently in the rewrite phase of the book (it’s time consuming!), but it’s already available for preorder directly from my publisher:

https://www.packtpub.com/python-3-object-oriented-programming/book

I’m not great at marketing, so to put it bluntly: I hope you all buy a copy! I’ve put a great deal of effort into this project, and I’m very proud of the result. This book is a great resource and fills a void in the available references. It also fills a void in my available writings, as my blog posts tapered off over the past few months!

Web Applications in Python 3

Everyone knows that there aren’t any decent web stacks in Python 3, right? Nobody seems interested in porting Django or Turbogears or Pylons or Plone or Zope to the new, improved, much more fun to write language.

Almost true. There are a few less popular frameworks for Python 3, including my own half-finished psyclone: http://archlinux.me/dusty/2010/01/17/python-3-web-framework/

There aren’t any full stack frameworks, though. But it’s not too hard to hack one together using available tools.

In my mind, a full stack web framework requires several features:

  • The core server
  • A database engine
  • A templating language

In addition, it’d be nice if it supported sessions out of the box and had a built-in auth framework. Basically, the more like Django it is, the better.

The good news is, we can hack one together with popular Python 3 ready tools including:

CherryPy provides a working web application engine. SQLAlchemy, which supports sqlite and postgres under Python 3, provides an extremely flexible and robust ORM. Jinja is an advanced Django-like templating system. I implemented a very basic blog (sans authentication) with these tools in an hour or so, and it wasn’t any worse to work with than Turbogears. This isn’t a big surprise, since Turbogears is built on CherryPy and SQLAlchemy as well. We don’t have access to the extensive library support Turbogears has to offer, but it is now accurate to say that Python 3 supports a (nearly) full stack web framework.

You’ll need the svn version of CherryPy, and the newly released 0.6 version of SQLAlchemy. Jinja 2 cooperates well.
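To give a flavor of how these pieces snap together, here is a rough single-file sketch in the same spirit (all names, the sqlite URL, and the inline template are invented for illustration; this is not the code from our blog experiment):

import cherrypy
from jinja2 import Environment, DictLoader
from sqlalchemy import create_engine, Column, Integer, Text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Entry(Base):
    __tablename__ = "entries"
    id = Column(Integer, primary_key=True)
    body = Column(Text)

engine = create_engine("sqlite:///blog.db")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

templates = Environment(loader=DictLoader({
    "index.html": "<ul>{% for e in entries %}<li>{{ e.body }}</li>{% endfor %}</ul>",
}))

class Blog:
    @cherrypy.expose
    def index(self):
        # Query through the ORM, render with the Django-like template.
        session = Session()
        try:
            entries = session.query(Entry).all()
            return templates.get_template("index.html").render(entries=entries)
        finally:
            session.close()

if __name__ == "__main__":
    cherrypy.quickstart(Blog())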

I don’t find it as pleasant as working with Django (I’m actually not a huge fan of SQLAlchemy’s syntax), but it’s certainly a working solution. It wouldn’t take overly much library support to turn it into a really nice system.

Image Manipulation in Python 3

Enough libraries have been ported to Python 3 to finally make it seriously possible to write real-world code in this modern Python interpreter. Sure, we don’t have Django or really any decent web framework yet (CherryPy runs, but it’s not a full web stack), and database support is limited (SQLAlchemy supports postgres and sqlite3), but for the most part, if you need to do something in Python 3, you can.

The major exception I’ve discovered is image manipulation. The Python Imaging Library has not been ported to Python 3, and there is no indication of when it will be. The latest version of PIL, 1.1.7, was released in late 2009 with an indication that it would be “made available for Python 3,” but no estimate as to timeline. There are no mailing list posts answering the increasingly popular question, “when will PIL be available for Python 3?” I found no source repositories indicating that a port has been started. There is a patch floating around that can supposedly be applied to PIL 1.1.6 to make it Python 3 compatible, but it didn’t work for me.

I tried doing a port of PIL 1.1.7 myself, but was unable to find documentation for the modifications to the C extension API in Python 3. My C is pretty rusty, and my schedule is way too full for the next two months, so I gave up on the task. Because of the lack of support from the official PIL developers (I hold nothing against them; we’re all busy in the open source world, and can only contribute what we have time and interest to contribute), I’m hoping this post will motivate someone to attempt a port of PIL, making Python 3 that much more attractive.

If you came to this post looking for some kind of Python 3 image manipulation library and were disappointed, don’t be! The Pygame image module allows us to load a few image formats, and we can manipulate the resulting surfaces in a variety of ways. It’s lower level and not as comprehensive as the Python Imaging Library, but it is one useful alternative if you need to do image manipulation in Python 3.
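For example, a PIL-style thumbnail operation looks something like this with pygame (a sketch, assuming a pygame build that runs under your interpreter; the filenames are placeholders):

import pygame

# Load an image, scale it down, mirror it, and write it back out,
# all through pygame surfaces rather than PIL Image objects.
surface = pygame.image.load("photo.png")
thumbnail = pygame.transform.smoothscale(surface, (128, 128))
flipped = pygame.transform.flip(thumbnail, True, False)
pygame.image.save(flipped, "thumb.png")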

A Python 3 Powered Blog

Last week, I posted my intent to port the Tornado web framework to Python 3. Not only have I done that (sort of), but I’ve hacked it to pieces; soon it will be unrecognizable to the original developers, and possibly, to me!

It didn’t take long to get the Hello World example included with Tornado running. The blog demo took quite a bit more work, but I now have the template, auth, and httpclient modules working, along with the core modules for a working async server. I was able to log into my example blog with my Google account, compose some entries, and view and edit them, including the feed.

That doesn’t sound like much if you’re coding blog software for an existing framework (10 minutes in Django, probably about 12 in web.py). But if you’re coding a web framework for existing blog software, it’s an accomplishment. I’m proud to have a “working” framework (even though it’s full of bugs and “working” just means “those things I’ve fixed”) for Python 3 in such a short amount of time.

I’ve named the project psyclone, a play on ‘tornado’ -> ‘cyclone,’ and the fact that Python projects oughta have a ‘p’ and ‘y’ in them somewhere. The code is on github for all to play with. Patches welcome! :-)

I’m having a lot of fun with this project, so it’s taking more of my time than I ought to be devoting to it… on the positive side, it’s progressing rapidly!

My plans:

  • Go over the existing code and improve some of the rather messy unicode/str -> str/unicode hacks I had to make to get it working.
  • Write some tests. (The Tornado team seems not to value tests.) I’ll use py.test and may need to write a test client.
  • Write a session framework and an auth framework; the current auth framework supports OpenID only, but I’d like a more local solution to be available as well.
  • Consider writing an ORM. Likely, I’ll discard this idea, arguing that Judd was right to design a framework with SQL only. The truth behind the argument will be laziness, of course.

The Utility Of Python Coroutines

Coroutines are a mysterious aspect of the Python programming language that many programmers don’t understand. When they first came out I thought, “Cool, now you can send values into generators to reset the sequence… when would I use that?” The examples in most books and tutorials are academic and unhelpful.

Last year, I attended David Beazley’s course A Curious Course On Coroutines along with a fellow Archer. We agreed that it was an exceptionally interesting course (Beazley built an OS scheduler in Python with just a minimal amount of code: how cool is that?), but that we didn’t see any practical application of it in our regular work.

Yesterday, I started working with the Tornado code to port it to Python 3. Tornado uses an async framework; I hate async because I hate working with code like this:

def somemethod(self):
    # ...
    self.stream.read_until("\r\n\r\n", self.callback)

def callback(self, content):
    # handle content read from the stream
    ...

I understand the utility of this code: while the stream is being read, the app can take care of other stuff, like accepting new connections, until the read completes. You get high speed concurrency without the overhead of threads or the confusion of the GIL. When the read is complete, the framework calls the callback function. It makes perfect sense, but when you read code with a lot of such callbacks, you’re constantly trying to figure out where the code goes next.

In my mind, the above code is really saying:

def somemethod(self):
    # ...
    self.stream.read_until("\r\n\r\n")
    # give up the CPU to let other stuff happen
    # but let me know as soon as the stream has finished reading
    # handle content read from the stream

I find this paradigm much easier to read; everything I want to do surrounding content is in one place. After pondering different ways to write a language in which this was possible, it hit me that this is what coroutines are for, and it’s possible in my preferred language.

Because coroutines use generator syntax, I thought they had something to do with iterators. They don’t, really. The above code can be written like so:

def somemethod(self):
    # ...
    self.stream.read_until("\r\n\r\n")
    content = (yield)
    # handle the content

The calling code would create the generator with gen = somemethod(), call next(gen) to advance it to the yield, and eventually, when content is available, call gen.send(content) to drive it.
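Here is a toy, self-contained version of that protocol (invented names, no real I/O; the string stands in for data arriving on a stream):

def somemethod():
    print("waiting for data...")
    content = (yield)              # suspend until the driver sends data
    print("handling:", content)

gen = somemethod()
next(gen)                          # run up to the yield
try:
    gen.send("GET / HTTP/1.1")     # resume with the "read" result
except StopIteration:
    pass                           # the coroutine ran to completion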

A generator compiles to an object with an iterator interface. The coroutine above (sort of, but not really, at all) compiles to a function with a callback interface (you could say it is an iterator over callbacks). You can use yield multiple times in one method to receive more data (or to send it; put the value on the right side of yield, like in a generator).

The mainloop that called this code would still be at least as complicated to read as it is using a callback syntax, but the objects on the async loop are now much easier to read.

This paradigm has been implemented in the Diesel web framework. I’ve looked at it before and thought it was an extremely bizarre way to design a web framework. I still do, but now I understand what their goals were. If you’ve ever struggled with the, “why would I ever use this?” question when it comes to coroutines, now you understand too.

I have no immediate plans to rewrite my Tornado port using coroutines, but maybe someday, if I’m bored, I’ll give it a try.