The Economics Of Cooperation

I’d like to spread the word about a project whose goal is to let open source developers focus on the projects they have a passion for.

Gittip allows people to set up recurring payments to specific programmers, to encourage them to work on whatever projects they are passionate about. The long-term idea is that maybe these developers could quit their day jobs working for companies that pay well but don’t necessarily produce products they are fascinated with, and spend their time on whatever open source project is currently taking their fancy. This model has worked for a few projects, such as the massively successful Kickstarter campaign for Diaspora (http://www.kickstarter.com/projects/196017994/diaspora-the-personally-controlled-do-it-all-distr) or the donation bars for PyPy development (http://pypy.org/). However, the simple fact is that most open source developers are either a) paid by companies that have ulterior motives or b) working on open source projects only in their free time.

Gittip was developed by one of my favourite coworkers. He’s leaving a kickass job at our company because he believes the world can be made more personal. I am writing this article on my own initiative, not at his request, to support that goal. Perhaps one day I, too, will be able to return to open source development and focus on my passions.

Instead of paying Google to pay its developers to make awesome stuff for us (and allowing Google’s wealthy shareholders — who did nothing — to skim a little off the top), we can pay developers directly. Google employees have a pretty awesome life, but they still have to spend 80% of their time working on Google products. We can make them happier by allowing them to quit Google and focus on their passions. And they can make us happier by creating brilliant products for us without our having to sell our souls and personal data to use them. This story, of course, could apply to any software company and any brilliant open source developer out there today.

Gittip reminds me of a project I developed a few years ago called Arch Bounty (http://archlinux.me/dusty/2009/09/07/introducing-arch-bounty/). Arch Bounty failed before it got off the ground, partially because Arch itself wasn’t big enough to support it, partially because I failed to promote it, and partially because I didn’t do a good job of explaining what kinds of projects would be suitable for it. I hope Gittip does better.

I am fascinated by how effective cooperative communities can be. I love how my tiniest contributions to Arch Linux are repaid in full by the tiny contributions of numerous other users. I love how hosting one person in the Couch Surfing community gives me the chance to stay anywhere in the world and meet wonderful people.

And yet, I am saddened by the way consumerism is able to overwhelm many similar communities. Google is very intent on making Mozilla irrelevant. We are socially conditioned to compete for resources instead of cooperating to build shared resources that benefit each of us more than individual competition ever could.

Cooperation and competition need not be mutually exclusive. When I play ultimate twice a week, I have a very strong competitive desire, as do my teammates and opponents. However, we have a culture of cheering good plays, regardless of which team makes them, and commiserating over missed opportunities. When two of us are competing for the disc, we’ll high-five each other for having made a good run, regardless of who caught it. The world would be a better place if this cooperative-competitive culture could be applied universally, with all of us attempting to excel at our individual passions while also nourishing and encouraging others in their pursuits. I believe this dual nature can be applied to the marketplace. Everyone talks about the relative merits and problems of capitalism versus communism; over the last hundred years, both systems have shown fatal flaws. What if we had both?

Can such a coopero-competitive culture be financially stable? A successful Gittip would be a great first step. I encourage you to discover the github usernames of your favourite Arch Linux developers and make a small weekly contribution. It would be amazing if enough of us could do this to motivate even one developer to quit his day job and work on Arch full time. Is it possible?

Yes.

ArchCon 2012

In 2010, I helped organize the first ArchCon, held in Toronto. We had a blast, and I considered it a success. Unfortunately, life overtook me and I did not have time to organize a second one in 2011. I heard that a Polish-language conference used the ArchCon name that year, but I didn’t know about it until it was over.

This year, the Polish group is going all out with the first truly international ArchCon. You can view their site in English at http://2012.archcon.pl/en/ (other languages are also available). They’re expecting up to 250 people and are planning a terrific after-party and workshops. The conference will be held in Warsaw on July 28 and 29.

Sadly, I won’t be able to make it, but I am very excited that the community has grown enough to create this international event. If you’re going to be near Poland at that time, I encourage you to register. ArchCon 2010 was a small event, but it was a lot of fun and, to the best of my knowledge, everyone was glad they attended. There’s something special about being surrounded by Archers.

Trembling Aspen

Aspen is a unique Python web framework developed by Chad Whitacre. I am still evaluating it, but one thing I can say unequivocally is that it’s different. Very different. Aspen divorces itself from the MVC pattern that every major — and almost every minor — web framework has relied on for the past decade. While each framework has its perks and annoyances, they all share this underlying MVC-like interface. Aspen throws all that away.

Aspen lodges itself in a small niche between PHP-like spaghetti code and a complicated package structure full of Python models, views, controllers, and templates. In Aspen, template and control code live in the same file, called a simplate, but in separate sections, called pages. In a rebellion against regular-expression URL parsing, Aspen uses the filesystem to define URL structure and paths, as in the old days of statically served HTML files.
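To give a flavour of the format, here is a sketch of what a simplate might look like. This is illustrative only: I’m writing ^L to stand for the form-feed character that separated pages when I evaluated it, and the exact page layout and template syntax may differ between Aspen versions, so check the docs.

import datetime
^L
# this page runs on every request
greeting = "Hello, world"
now = datetime.datetime.now()
^L
<html>
  <body>{{ greeting }}! The time is {{ now }}.</body>
</html>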

One thing I love about Aspen is how easy it is to get up and running. Aspen will serve a single HTML file containing no Python code without any boilerplate. It serves static files for CSS, JavaScript, and images just as easily. To add a dynamic page to your Aspen-powered site, you simply use the aforementioned simplates, which combine Python controllers and Tornado-inspired templates in the same file. Aspen provides numerous service connectors, so deployment can also be trivial.
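In practice, getting a static page served looks something like this (a sketch from memory; the exact CLI defaults may vary between Aspen versions):

pip install aspen
mkdir mysite && cd mysite
echo '<h1>Hello, world!</h1>' > index.html
aspen    # serve the current directory; browse to the address it prints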

Another great feature of Aspen is its documentation. At first glance, the docs seem sparse and rather limited. However, they are well organized, and every page answers my questions on its topic succinctly. Chad has a terrific minimalist policy of telling you exactly what you want to know in as few words as possible. As a writer, I know just how difficult this is to accomplish without repeating oneself.

On the negative side, I do have misgivings about how well Aspen’s filesystem URLs can scale. Chad has assured me that Aspen is up for running large sites. Still, I appreciate the power of regex URLs, and I don’t necessarily like having arbitrary file extensions in URLs; I think locators should be succinct and divorced from implementation details. Ideally, I believe there may be a happy medium between Aspen’s enforced folder structure and the overly abstract, overly complicated URL schemes Django and its kin provide.

Another downside is that I haven’t figured out a good way to unit test Aspen simplates. You can refactor much of the code into backend libraries and test those, of course, or you can start the service running and mock requests. However, I miss the TestClient feature from Django, which allows me to mock a request without running the server; it provides better code-coverage analysis and exercises something closer to the actual code.
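In the meantime, the refactoring approach works well enough. Here is a minimal sketch, with hypothetical module and function names; the point is that the logic lives in a plain module the simplate imports:

# logic.py -- importable from a simplate, testable without a server
def make_greeting(name):
    return "Hello, %s!" % name

# test_logic.py -- ordinary unittest; no Aspen, no running service
import unittest
from logic import make_greeting

class GreetingTest(unittest.TestCase):
    def test_make_greeting(self):
        self.assertEqual(make_greeting("Arch"), "Hello, Arch!")

if __name__ == "__main__":
    unittest.main()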

Now, with the review over, I’d like to introduce a little project I’ve started to provide some vital features that Aspen core is missing.

Trembling is kind of a catch-all project with a few basic helpers and useful features. Namely, trembling provides:

  • Basic sessions
  • Basic user authentication
  • Less than basic single-request messages
  • A simple redirect response

Aspen itself does not define or require a model layer; you’re free to use the ORM, DBAPI connector, or NoSQL database of your choice. Trembling arbitrarily chooses mongoengine for the models. I made no effort to make pluggable backends like Django does; Aspen does not encourage excessive layers of abstraction.
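For a flavour of what that looks like, here is a minimal mongoengine model. This is a hypothetical example, not necessarily trembling’s actual schema:

from mongoengine import Document, StringField, connect

connect("trembling_example")  # assumes a local MongoDB instance

class User(Document):
    username = StringField(required=True, unique=True)
    password_hash = StringField(required=True)

# usage:
#   User(username="dusty", password_hash=hashed).save()
#   user = User.objects(username="dusty").first()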

I also don’t know how secure trembling’s session and auth features are. If you find a security hole, pull requests are welcome. Overall, I think the combination of Trembling and Aspen will be very useful for my future basic sites. I would also like to try it on a more complicated project just to see how well it performs.

Arch Linux Lanyards Are Back!

Demand for Arch Linux lanyards has been growing steadily. I had intended to have a new order in by New Year’s, but I ended up dealing with a different company and decided to do a completely new design, inspired by the updated Arch Linux website. There is a subtle gradient from dark grey to black in the background that looks very dynamic, and the logo itself is crisp and clear. Each lanyard is thin and very light to wear.

The lanyards are $6 for singles, and can be purchased from The Arch Schwag Store.

Add prominent links to your project in your project’s documentation

I’ve spent this weekend researching a wide variety of different Python libraries for a pet project that may never come to fruition. To my delight, most of the projects I was interested in were documented with Sphinx, and the documentation had been posted online using Read The Docs or a self-hosted site.

Some of the projects only include API documentation, while others also include helpful guides or tutorials. However, none of the projects I visited had prominent links back to the project website and source code repository in their documentation. At best, such links were hidden on a contributing or download page.

A lot of the time, my Duck Duck Go results link me directly to the documentation for a project I haven’t heard of. I read the APIs and think, “hey, I’d like to try this out.” Then I have a bit of trouble actually finding the project’s home page.

So, documentation authors: please add prominent links to your project’s home page, source code repository, and issue tracker to your documentation.
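In Sphinx, this can be as simple as a short section near the top of your index.rst (the URLs here are placeholders, of course):

Project links
-------------

* `Home page <http://example.com/project>`_
* `Source code <https://github.com/example/project>`_
* `Issue tracker <https://github.com/example/project/issues>`_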

Easily changing virtualenvs

The workon script available in virtualenvwrapper is a useful tool for switching virtual environments on the fly. However, it fails for the workflow I prefer, so I wrote my own.

I prefer to keep my virtualenvs in the same folder as whatever source code or project they apply to. Normally, I put them in a folder named venv, although sometimes I use different names, most often when I have multiple virtualenvs applying to the same source tree (e.g. venv27, venvpypy).

My directory structure therefore looks like this:

/home/dusty/code/
+ project1
| + src
| | + pkg1
| | + pkg2
| + doc
| + venv
+ project2
| + src
| + venv27
| + venvpypy
+ project3
| + venv

... and so on

The problem is, I may be anywhere in the folder tree when I want to activate a virtualenv. This means I'm doing stuff like . ../../venv/bin/activate one time and . venv/bin/activate another time. Trying to remember which parent directory I want often requires several tab completions to see the directory listing. This is annoying, so I wrote a bash function to activate the optionally named virtualenv in the current directory or any parent directory.

I avoid bash whenever I can, so this may not be the prettiest bash script you've ever seen. ;-)

function v {
  # activate a virtualenv found in the current directory or any parent
  # usage: v
  #   activate the virtualenv named venv
  # usage: v venvname
  #   activate the virtualenv named venvname
  name=venv
  if [ -n "$1" ] ; then
    name=$1
  fi
  olddir=$(pwd)
  quit=0
  deactivate &>/dev/null
  cwd=$(pwd)
  while [ $quit -eq 0 ]
  do
    cd "$cwd"
    if [ "$cwd" == '/' ] ; then
      quit=1
    fi
    if [ -e "$name" ] ; then
      source "$name/bin/activate"
      quit=1
    fi
    # step up to the parent directory for the next iteration
    cwd=$(dirname "$cwd")
  done
  cd "$olddir"
}

Put this in your ~/.bashrc. Typing v on the command line will look in the current and all parent directories for a directory named venv, and will activate that venv. Typing v [venvname] will walk the same tree looking for a directory named [venvname] to activate.
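For example, given the directory tree above:

cd ~/code/project1/src/pkg1
v          # walks up to ~/code/project1 and sources venv/bin/activate
cd ~/code/project2/src
v venv27   # deactivates the old venv, then finds and activates ../venv27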

If you want to make it prettier, feel free to fork it on github.

Contributing to git projects on github

I spent some time this week helping a colleague learn the ins and outs of Git rebase. He mentioned that he could not find a git introduction that summarized the basic workflow you might use when contributing to an open source project on Github. This article attempts to fill the gap.

Note that there are many possible ways to interact with git and github, and none of them is canonically correct. The steps here will help a newcomer to git, or to collaborative development with it. I outline the steps in short form first, then expand on some of them in the notes below.

  1. Fork the remote project on github.
  2. Clone your new fork.
    git clone ssh://git@github.com/[yourname]/[reponame]/
    cd [reponame]
  3. Add a shortcut to upstream.
    git remote add upstream https://git@github.com/[developername]/[reponame]/
  4. Optionally choose a branch to hack on.
    git checkout development_branch_name
  5. Create a local copy of that branch.
    git checkout -b working_branch_name
  6. Hack.
  7. Commit your changes.
    git add [-p] filenames
    git commit

    Repeat the hack, add, commit steps until your commits look the way you want them to.

  8. Grab any commits that were added upstream while you were working.
    git checkout development_branch_name
    git pull upstream development_branch_name
  9. Rebase your work onto the end of the development branch.
    git checkout working_branch_name
    git rebase development_branch_name
  10. Push your commits to your github fork for the world to see.
    git push origin working_branch_name
  11. Make a pull request on github. Pick development_branch_name as the one you want your code pulled into, and working_branch_name as the one you are pulling from in your repo.
  12. Oh nos! The upstream devs didn’t like your changes. That’s fine. Repeat the hack/commit steps as needed. You may be able to just append commits to your previous commits. If you need to alter your history (e.g. because they don’t like how one of your commits looks), check the man pages for git commit --amend and git rebase -i (there’s a short sketch after the notes below).
  13. Once you have the working_branch in a state you like, repeat the remote pull and rebase steps to ensure your commits “look like” they happened at the end of the remote developer’s branch.
  14. Push your changes to your github fork, forcing an overwrite of any existing changes.
    git push -f origin working_branch_name
  15. Issue another pull request on github; your old pull request will be updated to the new working branch.
  16. Your code has been accepted upstream, congratulations on your contribution to an important open source project!

Notes

  1. (Step 4) This represents a branch the upstream developer has pushed to their remote repository. The branch you choose really depends on where you want to develop. Some projects do all their development on master. Others push a remote branch named ‘devel’, or branches named after the feature being developed on that branch. Either way, your ultimate goal is to have your commits added to this branch on the remote repository.
  2. (Step 5) It is possible to do your development on a single branch and rearrange commits as needed. However, branching is free in git, and storing your changes on a separate logical branch will really help you identify where everything fits together.
  3. (Step 7) Each commit should contain one logical change. A commit message should not contain the word “and”. If you have been working on two logical changes at the same time, you can index only the relevant hunks using git add -p.
  4. (Step 8) You’re checking out the original branch here. You don’t want to pull all the foreign commits into the branch you’ve been developing on; that branch should only contain your commits at this time.
  5. (Step 9) This is the same as “moving” all your commits to the end of development. The idea is to have your commits as a “tail” that comes after the upstream changes. Upstream will love this because they can do a fast-forward merge of your commits. Hopefully there aren’t any conflicts; if there are, resolve them so that your commits apply cleanly. You’ll want to review the rebase documentation before continuing. Also, use three-way merging; it makes life so much easier.
  6. (Step 14) There is a philosophical debate as to whether it is acceptable to rewrite public history, as Dieter has discussed in detail. In this case, since you have communicated (via the pull request) an intent to rewrite history, it’s not only acceptable but desirable. Most upstream developers will consider it good form to edit older commits so that they reflect what the history “should have” looked like, making it easier for anyone reading your history to understand it.
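As a concrete example of the history rewriting mentioned in step 12 and note 6, squashing a fixup commit into the commit before it looks like this (the hashes are made up):

git checkout working_branch_name
git rebase -i development_branch_name
# git opens your editor with one line per commit, oldest first:
#
#   pick 1a2b3c4 Add frobnicator
#   pick 5d6e7f8 Fix typo in frobnicator
#
# Change the second "pick" to "squash" (or "fixup"), then save and quit;
# git rewrites the two commits into one.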

Hacking on PyPy

In another great PyCon 2012 keynote, David Beazley asked the question, “Is PyPy easily hackable?” After a great talk, he answered with a decisive, “I still don’t know.” Having sprinted on PyPy, I’d like to answer his question in a bit more detail.

I love David’s presentation style. He has a novel method of using phrases like “blow your mind” and “this is really scary” repeatedly until they lose their meaning and you no longer feel mindblown or scared. A variety of factors, including Beazley’s thorough keynote address, motivated me to join the PyPy team during the PyCon developer sprints.

I’d like to clear up one oversight in Dave’s otherwise unimpeachable talk. One of the PyPy devs, Holger Krekel, explained to me that PyPy does not have over 1 million lines of code. I don’t have exact numbers, but for “historical reasons”, a non-Python file containing Base64-encoded data was given a .py extension. When that file is excluded from the line count, around half a million lines of actual Python code exist, and about a quarter of those are tests.

I was surprised how trivial it was to get started hacking on PyPy. I don’t really grok the many layers of the translation toolset and PyPy interpreter, but it’s pretty clear that the layers are well separated. I was hacking on the py3k branch of PyPy. I am happy to admit I was working primarily on changing print statements to print() functions and commas in exceptions to the as keyword.

Here are the steps to start hacking on PyPy. Notice that the hour-long translation step is not part of the procedure. PyPy has a solid test framework, and the PyPy crew are focused on a 100% test-driven-development paradigm.

  1. Clone pypy (this takes a while):
    hg clone https://bitbucket.org/pypy/pypy/
  2. Pick a branch to work on. There are about 80 branches; I don’t know what they all do. Popular ones during the sprints included py3k and numpy-ufuncs2.
  3. Pick a feature to work on. For py3k support, the list of failing tests in the buildbot is a good place to start. Numpy programmers had a list of functions that needed implementing, but I can’t find the link. Ask on IRC; the PyPy crew are very helpful. The bug tracker contains many features and issues that need addressing.
  4. Add /path/to/pypy/ to your path so you can run the pytest.py command.
  5. cd into the directory indicated in the buildbot output and run pytest.py path/to/test.py -k testname
  6. The test will likely fail. Hack away and fix it.
  7. When the test passes, commit, push to a bitbucket repo, and issue a pull request.
  8. Repeat!
There are quite a few cons to working on this project. If you run hg in the pypy/modules/ directory, it will try to pick standard library modules from pypy and choke horribly. The PyPy developers don’t really believe in documenting their code. Being able to tell the difference between RPython and Python (which have identical syntax) is important; in general, if a module starts with “interp_” it contains RPython, while if it starts with “app_” it contains Python.

If you are hacking on Python 3 support, you need to bear in mind that the PyPy interpreter is written in Python 2. You are working on a Python 2 application that executes Python 3 bytecode!

On the positive side, RPython and Python are much easier to read and write than C. The PyPy devs are brilliant, but not intimidating. They are so confident in their test suite that they are comfortable programming in a “cowboy coding” style, hacking randomly until all the tests pass. Any one layer in the toolchain is easy to understand and develop. The IRC channel is full of friendly, knowledgeable, helpful people at any time of day.

Overall, I am much less intimidated by this project than I was before I started the dev sprints. I still can’t answer the “Is PyPy easily hackable?” question fully. It’s certainly easy to get started, but I don’t know how easy it is to become intimate with the project. Dave Beazley’s keynote made PyPy more approachable, and I approached it. Hopefully this article will encourage you to do the same.

Three-way merge for hg using vim

I accidentally blew away my .hgrc, and it took me some time to recover my three-way merge setup, inspired by Dan McGee’s article on the subject. So, for my own and your future reference, here’s what you need in your .hgrc to get hg merge to give you a three-way merge window in vim:

[ui]
merge = vimdiff

[merge-tools]
vimdiff.executable = vim
vimdiff.args = -f -d -c "wincmd J" "$output" "$local" "$base" "$other"

When running hg merge, the effect you get is best described by Dan McGee himself:

The end result is three panes along the top (local, base, and remote respectively), and the merge work-in-progress is on the bottom. In addition, all four panes are still locked together for scrolling.

Why we need Python in the Browser

In his PyCon 2012 keynote speech on Sunday, Guido van Rossum covered many of the open “Troll” questions related to the Python community. I’ve had occasion to either complain about or defend all of the topics he covered, and he did a wonderful job of addressing them. The issues could largely be divided into two categories: those that come from a fundamental misunderstanding of why Python is wonderful (e.g. whitespace), and those that are not an issue for the core Python dev team but are being actively worked on outside of core Python (e.g. the event loop).

And then there’s the question of supporting Python in the web browser. I honestly thought I was the only one that cared about this issue, but apparently enough people have complained about it that Guido felt a need to address it. His basic assertion is that the browsers aren’t going to support this because nobody uses it and that nobody uses it because the browsers don’t support it.

This is a political problem. Politics shouldn’t impact technical awesomeness.

The fallacious underlying assumption here is that modern HTML applications must be supported on all web browsers in order to be useful. This is no longer true. Web browser applications are not necessarily deployed to myriad unknown clients. In a way, HTML 5, CSS 3, and DOM manipulation have emerged as a de facto standard MVC and GUI system. For example, many mobile apps are developed with HTML 5 interfaces that are rendered by a packaged web library rather than an unknown browser. Client-side local storage has enabled fully JavaScript applications for which network connectivity is optional or unnecessary. There are even situations where it may not be necessary to sandbox the code, because it is trusted. Many developers create personal or private projects using HTML 5 because it’s convenient.

Convenient. It would be more convenient if we could code these projects in Python. Web browsers can be viewed as a zero-install interface, a virtual machine for these applications. Such a VM has no reason to be language-dependent.

It is simply unfair to all the other programming languages, and the coders of those languages, to say, “we can’t displace JavaScript, so we won’t try.” Web browsers have evolved into a virtualization layer more like operating systems than single programs. While it is true that the most restrictive operating systems only permit us to code in Objective C, in general it is not considerate to restrict your developers to a single language or environment.

It is time (in fact, long overdue) for Python to be supported in the browser, not necessarily as an equal to JavaScript, but as an alternative. The web is a platform, and we must take Guido’s statement as a call to improve this platform, not to give up on it.

Update: I just stumbled across http://www.gnu.org/software/pythonwebkit/ and I can’t wait to play with it!

Update 2: From the number of comments on this article, it appears that my article has hit some web aggregator. To those users suggesting Python-to-JavaScript compilers: I’ve been a minor contributor to the pyjaco project, a rewrite of the pyjs library. It has the potential to be a great tool, and the head developer, Christian Iversen, is very open to outside contributions. Let’s make it better!