Posts tagged ‘pypy’

Pushing Python Past the Present

This is the first time I’ve inserted myself into an exchange between bloggers I don’t know. This topic interests me and I have something to add. Most importantly, I found an alliterative title. So I thought I’d give it a go.

I first saw Calvin Spealman’s article “I am worried about the future of Python”. I suspect that upon reading this article, Guido got into his famous time machine to give his Pycon 2012 keynote speech, accusing Calvin of trolling.

This article was shortly followed by Tim McNamara’s “Python is doing just fine”, which (likely unintentionally) summarizes Guido’s talk.

Finally, Nick Coghlan came out with an extensive summary article called “Python’s Future: A Global Perspective”.

And now there’s me. All I want to do is push people to start supporting PyPy. The PyPy developers have worked on a variety of technologies that can be made into production-ready products that address most issues (real or imaginary) that people see with Python.

They’ve already solved speed and I suspect the next release of PyPy will solve memory. They’ve made huge but mostly incomplete progress in a lot of other areas, including the ones the above bloggers have mentioned.

Concurrency

I have had a lot of trouble dealing with parallelism in Python. Everything I have tried has either been a hack or required a hack to work with it. In my experience, the multiprocessing module does not work in large-scale production. However, it works a lot better on PyPy than it does on CPython, at least for me. The various async solutions can only use one processor and are therefore, in my opinion, no more exciting than GIL-throttled threads.
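For readers who have not used it, here is a minimal sketch of what the multiprocessing module does. The count_primes function and the limits are made up for illustration, and this toy says nothing about how the module behaves in large-scale production. The point is only that each task runs in its own process, so the work is not serialized behind one GIL.

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [2000, 3000, 4000, 5000]  # arbitrary illustrative workloads
    # Each limit is handed to a separate worker process, so the tasks can
    # run on different cores instead of taking turns under a single GIL.
    with multiprocessing.Pool() as pool:
        results = pool.map(count_primes, limits)
    for limit, primes in zip(limits, results):
        print(limit, primes)
```

A thread-based or async version of the same loop would run count_primes on one core at a time, which is the limitation described above.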

In addition to better multiprocessing, PyPy also supports stackless extensions out of the box. And I haven’t even mentioned Armin Rigo’s mysterious Software Transactional Memory implementation.

Concurrency is a sore point in Python. There are solutions. PyPy is capable of doing those solutions better.

Web

I think it’s unfortunate that the only way to code for a web browser is to use JavaScript or a language that compiles to JavaScript. I feel there is no real reason browsers can’t support arbitrary <script type=""> tags. I discussed this last March.

With development, PyPy can support Python in the browser. The mostly-finished but untested sandboxing feature of PyPy can be adapted for in-browser execution. I know some experimenting was done on this in 2009 that proved it’s possible. We just need to take it up and make it happen.

Mobile

PyPy’s speed makes it theoretically possible to run it effectively on mobile platforms. Further, it can be compiled to arbitrary execution environments including JVM and .Net. There is no reason that with some development, PyPy couldn’t run on the Dalvik JVM. I’m sure it’s even possible to run it on iOS.

Interestingly, one of the trolls Guido mentioned in his talk was “PyPy should become the default Python implementation”. He debunked this readily by asking the audience how many people use PyPy in production. Nobody moved.

PyPy is ready for production use, but it is not widely used. I think this is largely because the primary PyPy developers are way more excited about creating new features than they are about marketing the current platform or polishing up nearly finished features like sandboxing or a Dalvik backend. They are visionaries, not finishers.

I say this with a great deal of respect. I am not calling for these guys to change their focus. What I want is for other people from the Python community to join the PyPy project as finishers. It needs people that can make sure these nearly-finished features are working, production-ready and more importantly: documented.

I’ve been meaning, for months, to become one of these people. Unfortunately, I’ve prioritized other things including my job and my upcoming book. It’s still on my radar, though, and I hope that after reading this article, you, too, are thinking about making PyPy the future of Python.

So, what can you do?

Use PyPy
PyPy is capable of being your default Python 2 interpreter for most common tasks. Use it. If it doesn’t work for a given project, get on #pypy; they will help you fix it. It’s even more exciting if you can use PyPy features that are not currently available in CPython, such as the stackless extensions.
Evangelize PyPy
Tell people how well PyPy is working for you. Write articles like this one.
Document PyPy
The PyPy website contains a lot of documentation, but it’s rather intimidating and unreadable to the uninitiated. PyPy is just Python, but it’s also much more than Python. Somebody with some writing skills needs to get in there and make it digestible.
Design PyPy
Seriously, pypy.org does not need to look like something out of 2005.
Develop PyPy
I’ve hacked on PyPy. It’s not scary. Work on features like numpy or Python 3 support that are the current developers’ focus. Better yet, work on finalizing features like sandboxing or alternative backends that are finished but not quite tested.

If you have an hour or two free this weekend, instead of writing about how Python is great or not great, or not going to continue to be great, do something with PyPy. Make it happen.

Hacking on PyPy

In another great Pycon 2012 keynote, David Beazley asked the question, “is PyPy easily hackable?” After a great talk, he answered with a decisive, “I still don’t know.” Having sprinted on PyPy, I’d like to answer his question in a bit more detail.

I love David’s presentation style. He has a novel method of using phrases like, “blow your mind” and “this is really scary” repeatedly until they lose their meaning and you no longer feel mindblown or scared. A variety of factors, including Beazley’s thorough keynote address, motivated me to join the PyPy team during the Pycon developer sprints.

I’d like to clear up one oversight in Dave’s otherwise unimpeachable talk. One of the PyPy devs, Holger Krekel, explained to me that PyPy does not have over 1 million lines of code. I don’t have exact numbers, but for “historical reasons”, a non-python file containing Base64 encoded data was given a .py extension. When excluding this file from the line count, around half a million lines of actual Python code exist, and about a quarter of these are tests.

I was surprised how trivial it was to get started hacking on PyPy. I don’t really grok the many layers of the translation toolset and PyPy interpreter, but it’s pretty clear that the layers are well separated. I was hacking on the py3k branch of PyPy. I am happy to admit I was working primarily on changing print statements to print() functions and commas in exceptions to the as keyword.
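As a rough illustration of those two mechanical changes (parse_port is a made-up function, not code from the PyPy tree), the Python 3 spellings look like this, with the Python 2 forms they replace noted in comments:

```python
def parse_port(text):
    """Return `text` as an integer port number, or None if it is not numeric."""
    try:
        return int(text)
    except ValueError as exc:
        # Python 2 also accepted the comma form: except ValueError, exc:
        # Python 2 statement form of the next line: print "bad port:", exc
        print("bad port:", exc)
        return None
```

Each fix is tiny, but there are a great many of them, which is what makes this a good way to get started.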

Here are the steps to start hacking on PyPy. Notice that the hour-long translation step is not part of the procedure. PyPy has a solid test framework, and the PyPy crew are focused on a 100% test-driven-development paradigm.

  1. Clone pypy (this takes a while):
    hg clone https://bitbucket.org/pypy/pypy/
  2. Pick a branch to work on. There are about 80 branches. I don’t know what they all do. Popular ones during the sprints included py3k and numpy-ufuncs2.
  3. Pick a feature to work on. For py3k support, the list of failing tests in the buildbot is a good place to start. Numpy programmers had a list of functions that needed implementing, but I can’t find the link. Ask on IRC; the PyPy crew are very helpful. The bug tracker contains many features and issues that need addressing.
  4. Add /path/to/pypy/ to your path so you can run the pytest.py command
  5. cd into the directory indicated in the buildbot output and run pytest.py path/to/test.py -k testname
  6. The test will likely fail. Hack away and fix it.
  7. When the test passes, commit, push to a bitbucket repo, and issue a pull request.
  8. Repeat!

There are quite a few cons to working on this project. If you run hg in the pypy/modules/ directory, it will try to pick standard library modules from pypy and choke horribly. The pypy developers don’t really believe in documenting their code. Being able to tell the difference between RPython and Python (which have identical syntax) is important. In general, if a module starts with “interp_” it contains RPython, but if it starts with “app_” it contains Python.
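
The naming rule of thumb can be written down as a tiny helper. This is only a sketch of the convention as described here; guess_dialect and the file names in the usage note are hypothetical, and the prefix is a hint rather than a guarantee.

```python
import os


def guess_dialect(path):
    """Guess whether a module file holds RPython or ordinary Python.

    Relies only on the file-name prefix convention, so treat the
    answer as a hint and check the file itself when it matters.
    """
    name = os.path.basename(path)
    if name.startswith("interp_"):
        return "rpython"
    if name.startswith("app_"):
        return "python"
    return "unknown"
```

For example, guess_dialect("some/dir/interp_example.py") gives "rpython", while guess_dialect("some/dir/app_example.py") gives "python".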

If you are hacking on Python 3 support, you need to bear in mind that the PyPy interpreter is written in Python 2. You are working on a Python 2 application that executes Python 3 bytecode!

On the positive side, RPython and Python are much easier to read and write than C. The PyPy devs are brilliant, but not intimidating. They are so confident in their test suite that they are comfortable programming in a “cowboy coding” style, hacking randomly until all the tests pass. Any one layer in the toolchain is easy to understand and develop. The IRC channel is full of friendly, knowledgeable, helpful people at any time of day.

Overall, I am much less intimidated by this project than I was before I started the dev sprints. I still can’t fully answer the “Is PyPy easily hackable?” question. It’s certainly easy to get started, but I don’t know how easy it is to become intimate with the project. Dave Beazley’s keynote made PyPy more approachable, and I approached it. Hopefully this article will encourage you to do the same.

Py.test funcargs and Django

Holger Krekel just released the 1.0 version of py.test. Py.test is a functional and unit testing tool that cuts out a lot of the annoying boilerplate found in the unittest module included in the Python standard library.

I do a lot of Django coding. Django has a built-in test engine based on unittest. It’s annoying, but it does a few things (such as capturing e-mails and creating a test database) automatically so I’ve tended to use it rather than setting up py.test to take care of these things. Today I decided I’d rather use py.test for my latest project. Turns out it’s not that complicated:

# there are better ways to do these first three lines
import os, sys
sys.path.append('.')
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
from django.conf import settings
from django.core import mail
from django.test.client import Client
from django.test.utils import setup_test_environment, teardown_test_environment
from django.core.management import call_command

def pytest_funcarg__django_client(request):
    # Session scope: the test database is created once per test run.
    old_name = settings.DATABASE_NAME
    def setup():
        setup_test_environment()
        settings.DEBUG = False
        from django.db import connection
        connection.creation.create_test_db(1, True)
        return Client()
    def teardown(client):
        teardown_test_environment()
        from django.db import connection
        connection.creation.destroy_test_db(old_name, 1)
    return request.cached_setup(setup, teardown, "session")

def pytest_funcarg__client(request):
    # Function scope: reuse the session client, but clean up after each test.
    def setup():
        return request.getfuncargvalue('django_client')
    def teardown(client):
        # Empty the database and the captured e-mails between tests.
        call_command('flush', verbosity=0, interactive=False)
        mail.outbox = []
    return request.cached_setup(setup, teardown, "function")

Put that in your conftest.py and you can write tests like this:

def test_something(client):
    response = client.post('/some_url', {'someparam': 'somevalue'})
    assert "somestring" in response.content
    # Other assertions...

This uses the innovative py.test ‘funcarg’ mechanism to create a test database when testing starts, and refer to that database throughout the test run. The ‘django_client’ funcarg sets up a database when the session starts and deletes it when it finishes. The ‘client’ funcarg returns that same client, but also flushes the database and empties the mail outbox after each test is run to ensure there are no interaction effects between tests.

I haven’t fully tested it yet, but it’s nice to know I can get the most useful Django test functionality so cheaply in py.test.

Offline-Enabled Web Apps: The Future

I was reluctant to join the world of web development. I started in high school with a few sites and realized several things: Javascript sucks, Internet Explorer sucks; therefore web development sucks.

Fast-forward through a couple academic degrees. Job hunting with one requirement: Python. Python jobs all require Django.

So I learned Django, assuming, incorrectly, that if I was developing python backends, I wouldn’t need to work with the horrors of Javascript or Internet Explorer. I earned money. I relearned Javascript and became a first rate web developer.

In the back of my mind I still felt that web development sucks. So a few weeks back when deciding on a platform for a personal project, I thought I’d try something new. The Android platform was in my hands and I gave it a whirl.

I didn’t enjoy it much and I am now rewriting the app as an offline enabled webapp using Google Gears.

Then Chrome OS was announced and I realized that I’ll probably be doing a lot of offline enabled webapps using Google Gears and/or HTML 5. Like it or not, it’s the future. Me, I like it. There are a lot of advantages to this kind of setup: I can access the apps from my phone, my laptop, my parent’s desktop, or Phrakture’s hacked computer whenever and wherever I want. I don’t have to write a different client for each one. It’s true ‘write once, run anywhere’. I can upgrade each of those clients automatically as long as there’s a network connection.

On that note, you don’t need a network connection to run HTML 5 or Google Gears based apps. They both provide a ‘localserver’ that caches pages and javascripts, and give you an SQLite database for data caching. Typically offline versions of apps are not as powerful as their networked counterparts, but they do not require network access to run. Further, because they are locally cached, they can be made to run as fast as a “standard” (old fashioned) desktop app. The apps run in the browser, but the browser is just a container, a window manager, to hold the application.

In traditional webapps, you code most of the logic on the server side. In this new model, you end up coding most of the logic in the client, because the app needs to run without a guaranteed server connection. For me, this has a massive, nearly show-stopping drawback: A large portion of the app must be written in Javascript. JQuery makes Javascript suck less, but it still sucks. I’m a Python programmer.

For years, I’ve dreamed of browsers supporting tags that allow me to write my DOM manipulation scripts in Python rather than the ubiquitous and annoying Javascript. This wasn’t possible because python can’t be adequately sandboxed such that arbitrary scripts running on the web don’t have access to, say, your entire hard drive.

This is no longer true. The PyPy project finally has a complete Python 2.5 interpreter that can be safely sandboxed. Since discovering this at Pycon 2009, I’ve been thinking about interfacing it with a web browser.

I figured “somebody must have started this already”. Google didn’t help much, but when I logged into #pypy on freenode I was told “fijal started doing that with webkit yesterday”. I’ve been following up trying to get the project to build (I was warned that the build process is a mess and was invited to wait until it is cleaned up a bit). So far, no luck, but I am optimistic that python support is finally coming to the browser. Granted, it won’t be much use for public webapps (at first) since browsers won’t want to be distributing pypy, but a lot of my projects are personal, and satisfying the general public will be far lower on my priorities list than ‘developing in my preferred language’.

I’ll have to install a pypy interpreter into Chrome Lite under Android before this is useful to me. That may be tricky.