
The Utility Of Python Coroutines

Coroutines are a mysterious aspect of the Python programming language that many programmers don’t understand. When they first came out I thought, “Cool, now you can send values into generators to reset the sequence… when would I use that?” The examples in most books and tutorials are academic and unhelpful.

Last year, I attended David Beazley’s course A Curious Course On Coroutines along with a fellow Archer. We agreed that it was an exceptionally interesting course (Beazley built an OS scheduler in Python with just a minimal amount of code: how cool is that?), but that we didn’t see any practical application of it in our regular work.

Yesterday, I started working with the Tornado code to port it to Python 3. Tornado is an async framework; I hate async because I hate working with code like this:

def somemethod(self):
    #
    self.stream.read_until("\r\n\r\n", self.callback)

def callback(self, content):
    # handle content read from the stream
    pass

I understand the utility of this code; while the stream is being read, the app can take care of other stuff, like accepting new connections. You get high-speed concurrency without the overhead of threads or the confusion of the GIL. When the read is complete, the loop calls the callback function. It makes perfect sense, but when you read code with a lot of such callbacks, you’re constantly trying to figure out where the code goes next.

In my mind, the above code is really saying:

def somemethod(self):
    #
    self.stream.read_until("\r\n\r\n")
    # give up the CPU to let other stuff happen
    # but let me know as soon as the stream has finished reading
    # handle content read from the stream

I find this paradigm much easier to read; everything I want to do with the content is in one place. After pondering different ways a language could make this possible, it hit me that this is exactly what coroutines are for, and it’s already possible in my preferred language.

Because coroutines use generator syntax, I thought they had something to do with iterators. They don’t, really. The above code can be written like so:

def somemethod(self):
    #
    self.stream.read_until("\r\n\r\n")
    content = (yield)
    # handle the content

The calling code would create the generator by calling somemethod(), advance it with next(), and eventually, when content is available, call send(content) on that same generator to drive it.
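
Concretely, the driving side might look something like this (a minimal sketch of my own, written Python 2 style to match the next() call; handler and the hand-delivered content are placeholders, not Tornado’s API):

coro = handler.somemethod()   # create the generator object; nothing runs yet
coro.next()                   # run up to the (yield); next(coro) in Python 3
# ... the loop services other connections while the read is in flight ...
try:
    coro.send(content)        # resume somemethod; (yield) evaluates to content
except StopIteration:
    pass                      # the handler ran off the end after handling the content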

A generator compiles to an object with an iterator interface. The coroutine above (sort of, but not really, at all) compiles to an object with a callback interface: its send() method is, in effect, the callback (you could say it is an iterator over callbacks). You can use yield multiple times in one method to receive more data (or to send it out; put the value on the right side of yield, just as in a generator).
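
As a toy illustration of that (my own example, not something from Tornado or Beazley’s course), here is a coroutine that receives two values and then hands one back:

def lengths():
    # each bare (yield) receives whatever the caller passes to send()
    header = (yield)
    body = (yield)
    # a value on the right of yield is handed back out to the caller
    yield len(header) + len(body)

coro = lengths()
coro.next()                        # advance to the first (yield)
coro.send("HTTP/1.0 200 OK")       # fills in header, pauses at the second (yield)
total = coro.send("hello world")   # fills in body; send() returns the yielded sum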

The mainloop that called this code would still be at least as complicated to read as it is using a callback syntax, but the objects on the async loop are now much easier to read.
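
To give a flavour of what that mainloop might do (purely a sketch of my own; none of these names come from Tornado’s actual API), the callback it registers can simply be a thin wrapper around send():

def drive(coro):
    # hypothetical glue between a callback-style loop and a coroutine handler
    coro.next()                    # run the handler up to its first (yield)

    def resume(content):
        try:
            coro.send(content)     # drop the data into the waiting (yield)
        except StopIteration:
            pass                   # the handler finished; nothing left to resume

    return resume                  # the loop would register this as the read callback

The loop’s plumbing stays callback-based, but each handler reads top to bottom.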

This paradigm has been implemented in the Diesel web framework. I’ve looked at it before and thought it was an extremely bizarre way to design a web framework. I still do, but now I understand what their goals were. If you’ve ever struggled with the “why would I ever use this?” question when it comes to coroutines, now you understand too.

I have no immediate plans to rewrite my Tornado port using coroutines, but maybe someday, if I’m bored, I’ll give it a try.