Posts tagged ‘python’

My Linux Rig Interview

It’s pretty rare for me to link to other people’s articles in this blog. I don’t believe in regurgitating information. However, this link is not regurgitated: it’s an interview I gave to http://mylinuxrig.com/, so it’s in my own words:

http://www.mylinuxrig.com/post/9557009605/the-linux-setup-dusty-phillips-developer

When you use a Django query keyword as a field name

I need to model a location in the Alberta Township System coordinate space. The model is extremely simple:

class Location(models.Model):
    project = models.ForeignKey(Project)
    lsd = models.PositiveIntegerField(null=True, blank=True)
    section = models.PositiveIntegerField(null=True, blank=True)
    township = models.PositiveIntegerField(null=True, blank=True)
    range = models.PositiveIntegerField(null=True, blank=True)
    meridian = models.PositiveIntegerField(null=True, blank=True)

There’s a rather subtle problem with this model that came up months after I originally defined it. When querying the foreign key model by a join on location, having a field named range causes Django to choke:

>>> Project.objects.filter(location__range=5)
------------------------------------------------------------
Traceback (most recent call last):
  File "", line 1, in
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/manager.py", line 141, in filter
    return self.get_query_set().filter(*args, **kwargs)
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/query.py", line 556, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/query.py", line 574, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1152, in add_q
    can_reuse=used_aliases)
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1092, in add_filter
    connector)
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/sql/where.py", line 67, in add
    value = obj.prepare(lookup_type, value)
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/sql/where.py", line 316, in prepare
    return self.field.get_prep_lookup(lookup_type, value)
  File "/home/dusty/code/egetime/venv/lib/python2.7/site-packages/django/db/models/fields/related.py", line 136, in get_prep_lookup
    return [self._pk_trace(v, 'get_prep_lookup', lookup_type) for v in value]
TypeError: 'int' object is not iterable

That’s a pretty exotic looking error in Django’s internals, but it didn’t take long to figure out that using location__range is making Django think I want to use the range field lookup on Location.id instead of the field I defined in the model. I expect a similar problem would arise if I had a field named “in”, “gt”, or “exact”, for example.

The solution is simple enough, but it didn’t occur to me until after searching Google and the Django documentation, and ultimately scouring the Django source code, had all failed to yield any clues. If you ever encounter this problem, simply specify an exact lookup explicitly:

>>> Project.objects.filter(location__range__exact=5)
[<Project: abc>, <Project: def>]
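
As a rough illustration of why the explicit suffix helps, here is a toy model of how a filter keyword might be split into a field path and a lookup type. This is not Django's actual code: the function name split_lookup and the short KNOWN_LOOKUPS set are assumptions made up for the sketch, and the real list of lookups is longer.

```python
LOOKUP_SEP = "__"

# Illustrative subset only; a hypothetical stand-in for the real lookup names.
KNOWN_LOOKUPS = {"exact", "in", "gt", "lt", "range", "contains"}


def split_lookup(key):
    """Split a filter keyword into (field path, lookup type).

    If the last segment looks like a lookup name it is treated as one,
    even when a model field happens to share that name.
    """
    parts = key.split(LOOKUP_SEP)
    if len(parts) > 1 and parts[-1] in KNOWN_LOOKUPS:
        return parts[:-1], parts[-1]
    return parts, "exact"


# The trailing "range" is taken as the lookup, not as the field.
ambiguous = split_lookup("location__range")

# With an explicit lookup appended, "range" stays in the field path.
explicit = split_lookup("location__range__exact")
```

In this toy version the first call yields a path of just location with a range lookup, which matches the symptom in the traceback: a range lookup expects an iterable of two values, so passing 5 fails.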

Prickle v0.4: CouchDB to MongoDB

I took some free time to port Prickle, my stay-out-of-your-way time tracking tool, from CouchDB to MongoDB. I originally wrote it to use Pylons and CouchDB specifically because I felt like studying a couple of technologies I hadn’t tried before. I found CouchDB to be a bit unwieldy to code with, and it didn’t take long for it to get slow on my Prickle datasets. Rather than figure out how to optimize CouchDB, I decided to port Prickle to MongoDB.

The port was relatively straightforward. I was able to use pretty much the same model layout (Prickle has a very simple data layer). I tightened it up a bit to use actual references between models instead of implicit foreign keys by ids.

I chose to use MongoEngine for the database layer. I started with pymongo, but decided that a more abstract object-document-manager would be more useful.

I found querying in MongoEngine to be much simpler than the map-reduce queries in CouchDB. It is very similar to the Django ORM with which I am very familiar. Document creation and mapping is also familiar. Overall, I think Prickle will be a lot easier to maintain and extend using MongoDB than it was with CouchDB.

Prickle v0.4 does not contain any new features over 0.3. It does contain one bugfix to the date validation, submitted by Shelby Munsch.

Django and jquery.tmpl

Lately, I’ve been finding Django increasingly inappropriate for the web applications I develop. I have several complaints: the forms library doesn’t extend well to ajax requests; any extensive customization to the admin requires obtuse inspection of the admin source code; many admin customizations simply aren’t possible; the “reusable apps” philosophy has added a layer of complexity to a lot of things that really should not be there; and there are no obvious best practices for ajax support.

In spite of all this Django is still better than other frameworks (Python or not) that I have investigated or tested. I’ve considered writing my own web framework, but I wouldn’t maintain interest in it long enough to get it off the ground. So I’m letting my complaints bounce around in the back of my mind with the hopes that I can improve Django so that it continues to work for me as a platform.

I’m currently trying to come up with a better system for ajax requests. I have nothing concrete in mind, but I’ve started with the premise that ajax requests should never return rendered html, but should return only json (hence my issue with the django forms library). With that in mind, I need a templating library for json data. JQuery is a must, and the officially supported JQuery templating library is jquery.tmpl http://api.jquery.com/category/plugins/templates/

The problem with jquery.tmpl is that it uses django-like syntax. The following is a valid block that may be rendered in a jquery.tmpl page:

<script id="project_tmpl" type="text/x-jquery-tmpl">
    {{each projects}}<li>${$value}</li>{{/each}}
</script>

If you try to include this in a django template, the {{ and }} tags will be replaced with (probably) empty variables. Django has {% templatetag %} to render these individual items, but what we really need is a way to tell the django templating system to leave complete sections of code alone. So I wrote the jqtmpl template tag. It allows us to wrap code in a block that tells Django not to render that block as template code. So the above would show up in a Django template as follows:

<script id="project_tmpl" type="text/x-jquery-tmpl">
{% jqtmpl %}
    {{each projects}}<li>${$value}</li>{{/each}}
{% endjqtmpl %}
</script>

Here’s the template tag:

from django.template import Library, TextNode, TOKEN_BLOCK, TOKEN_VAR
 
register = Library()
 
@register.tag
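def jqtmpl(parser, token):
    """Emit everything up to {% endjqtmpl %} as literal text."""
    nodes = []
    t = parser.next_token()
    # Consume raw tokens until the closing tag, instead of parsing them.
    while not (t.token_type == TOKEN_BLOCK and t.contents == "endjqtmpl"):
        if t.token_type == TOKEN_BLOCK:
            # Put back the {% %} delimiters the lexer stripped.
            nodes.extend(["{%", t.contents, "%}"])
        elif t.token_type == TOKEN_VAR:
            # Put back the {{ }} delimiters the lexer stripped.
            nodes.extend(["{{", t.contents, "}}"])
        else:
            nodes.append(t.contents)
 
        t = parser.next_token()
 
    return TextNode(''.join(nodes))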

This doesn’t handle Django’s {# comments #}, as token.contents doesn’t return a valid value for comment items. As far as I know, you wouldn’t use the comment construct inside a jquery.tmpl template anyway, so it’s still functional.
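
To make the re-assembly step easier to follow outside of Django, here is a standalone sketch of the same idea. Everything in it is a stand-in invented for illustration: the Token tuple, the regex-based tokenize function, and reassemble_until are not Django APIs, and the tokenizer is far cruder than Django's lexer. It only shows how delimiters that were stripped during tokenizing get glued back on until the closing tag is reached.

```python
import re
from collections import namedtuple

# Hypothetical stand-ins for the lexer's token objects and type constants.
Token = namedtuple("Token", "token_type contents")
TOKEN_TEXT, TOKEN_VAR, TOKEN_BLOCK = "text", "var", "block"

# Matches {% ... %} or {{ ... }}; the capture group keeps them in re.split output.
TAG_RE = re.compile(r"(\{%.*?%\}|\{\{.*?\}\})")


def tokenize(source):
    """Split template source into text, variable, and block tokens."""
    tokens = []
    for bit in TAG_RE.split(source):
        if not bit:
            continue
        if bit.startswith("{%"):
            tokens.append(Token(TOKEN_BLOCK, bit[2:-2].strip()))
        elif bit.startswith("{{"):
            tokens.append(Token(TOKEN_VAR, bit[2:-2].strip()))
        else:
            tokens.append(Token(TOKEN_TEXT, bit))
    return tokens


def reassemble_until(tokens, end_tag):
    """Rebuild the raw source of every token before the end_tag block."""
    out = []
    for t in tokens:
        if t.token_type == TOKEN_BLOCK and t.contents == end_tag:
            break
        if t.token_type == TOKEN_BLOCK:
            out.extend(["{%", t.contents, "%}"])
        elif t.token_type == TOKEN_VAR:
            out.extend(["{{", t.contents, "}}"])
        else:
            out.append(t.contents)
    return "".join(out)


source = "{{each projects}}<li>${$value}</li>{{/each}}{% endjqtmpl %}after"
raw = reassemble_until(tokenize(source), "endjqtmpl")
```

One side effect worth knowing about in this sketch: because the token contents are stripped, whitespace just inside the delimiters is not preserved, so {{ name }} comes back as {{name}}.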

Next on my list is a better forms validation library to suit my theory that validation should be done client side. I’ve got a server-side system in mind that returns json responses, and does not return user-facing error messages, since those should have been caught client side. With these tools, and hopes for Grappelli to eventually create a decent admin, Django may continue to serve me.

Another release of Prickle, the stay out of your way time tracker

I just released version 0.3 of Prickle, a time tracking tool I’ve mentioned before. Prickle is a simple stay-out-of-your-way time tracking program that I find very useful for my day-to-day work as a consultant. In originally developing it, I modeled it after Freckle. With version 0.3, I consider Prickle to be more usable and more powerful than Freckle for my specific purposes. It may be useful for you as well.

The major improvement in this release is the ability to have multiple types of work for a given project. Each type is represented in the invoice by a different line item, and can optionally have a different rate. So if you bill admin work at a different rate than development, Prickle can handle it on one invoice. This required adding one more field to the “log it” form, which means one extra keypress for logging line items, but the new field is perfectly comfortable staying blank. It also autocompletes the types for a given project, so it is still stay-out-of-your-way convenient.

The other major change is the ability to edit timesheet rows. I find this useful when I occasionally bill a timesheet for the wrong date or project. This is done using inline editing and is very ajaxy. At a user request, I’ve also changed the timesheet view to use hours and minutes rather than decimal representation.

Prickle is now working pretty much as I want it. I may add some reportlab pdf generation at some point so I can automatically e-mail pdf invoices, and a little extra customizability may be in order. The entire interface really wants some design love, but I am really hoping another user will jump up and do that. I’m not what you’d call a design guru. But overall, I think the current feature set is what I need, and I believe it may be useful for a few other people in the consulting field as well.

Updating m2m after saving a model in Django admin

I wanted to ensure that a many-to-many field on my model contained a value that was on a foreign key field on the model whenever that model was saved in the admin, like so:

            obj.participants.add(obj.instructor)

I thought this was a trivial task. I added this code to a post_save signal connected to the model, but the participants list was not being updated. I was pretty sure that the reason it didn’t work was that form.save_m2m() must be called somewhere in the admin after the object was saved, which would overwrite my m2m changes with the empty values from the form. Reading the Django admin source code confirmed this. But it didn’t show an obvious way to circumvent the behavior.

It is possible to override the change_view and add_view functions on the ModelAdmin object, but I did not want to do that. I would have had to copy the entire change_view contents into the subclass, as there is no way to make a super call do what I wanted here. Here’s the section of code I needed to modify (it’s in django.contrib.admin.options.ModelAdmin.change_view if you want to look):

            if all_valid(formsets) and form_validated:
                self.save_model(request, new_object, form, change=True)
                form.save_m2m()
                # Right here is where I want to insert my call
                for formset in formsets:
                    self.save_formset(request, form, formset, change=True)
 
                change_message = self.construct_change_message(request, form, formsets)
                self.log_change(request, new_object, change_message)
                return self.response_change(request, new_object)

Obviously, I can’t override save_model because save_m2m() is called after that, which would still wipe out my changes. I really need to have a self.after_m2m() call at the point I have commented in the above code. But I don’t.

I really didn’t want to have to copy this entire method into my ModelAdmin subclass (in admin.py) just to add that one call… so instead, I overrode another method that happens to have access to the new object and is called after save_m2m(). See that call to self.log_change a few lines later? That method updates the admin log db table. But it also happens to have access to the newly saved object. I want to emphasize that this is an ugly hack:

# in admin.py
class ClassListAdmin(admin.ModelAdmin):
    #
    def log_change(self, request, object, message):
        super(ClassListAdmin, self).log_change(request, object, message)
        # The add and change views happen to call log_addition and
        # log_change after the object has been saved. I update the
        # m2m at this point (in update_participants) because the add
        # and change views call form.save_m2m() which wipes out the
        # changes if I put it in self.save_model().
        self.update_participants(object)

self.update_participants(), of course, contains the code I originally wanted to run.
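
The ordering problem can be reproduced without Django at all. The following toy sketch mimics only the call order of the change view quoted above; ToyAdmin, ToyObject, and the dictionary standing in for the form are all hypothetical names invented for the illustration, and the line that replaces obj.participants is just a stand-in for what form.save_m2m() does.

```python
class ToyObject:
    def __init__(self, instructor):
        self.instructor = instructor
        self.participants = set()


class ToyAdmin:
    """Mimics only the order of calls in the change view."""

    def save_model(self, obj, form):
        pass  # the real method saves the object

    def log_change(self, obj):
        pass  # the real method writes an admin log entry

    def change_view(self, obj, form):
        self.save_model(obj, form)
        # Stand-in for form.save_m2m(): replace the m2m values with the form's.
        obj.participants = set(form["participants"])
        self.log_change(obj)
        return obj


class SaveModelHook(ToyAdmin):
    def save_model(self, obj, form):
        super().save_model(obj, form)
        obj.participants.add(obj.instructor)  # wiped out by the next step


class LogChangeHook(ToyAdmin):
    def log_change(self, obj):
        super().log_change(obj)
        obj.participants.add(obj.instructor)  # runs last, so it survives


form = {"participants": []}  # the form was submitted with no participants
early = SaveModelHook().change_view(ToyObject("alice"), form)
late = LogChangeHook().change_view(ToyObject("alice"), form)
```

In this sketch the hook in save_model leaves the participants empty, while the hook in log_change leaves the instructor in place, which is the whole reason the hack works.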

This isn’t the most proper way to do this, but if you’re looking for a quick, dirty but DRY hack, it might save you some time.

Prickle version 0.2

Yesterday, I sent out my first invoices using Prickle. Having tested it in production (I had to fix a couple of sort order issues), I decided it was time, already, to release the second version (release early and often, catb says).

In case you missed the introduction: Prickle is a simple time tracking tool I whipped up to avoid fighting with spreadsheets or paying for Freckle. Source is available from github. Dependencies include pylons and couchdb.

So what’s new in v0.2? Mostly minor bug fixes or data display. The invoice summary is a little more useful, most of the display tables have meaningful sort orders (descending by date, usually), and timesheets can be deleted. One bigger feature is a simple javascript timer with start/stop/pause buttons. When you stop the timer, it automatically adds the time to the duration field to allow you to log it. Another is the ability to mark timesheets as “invoiced” without actually creating an invoice. I use this to clean out timesheets that I’m not actually billing for, such as time spent working on prickle, or arch schwag, or editing for my Dad’s new book.

I’m finding Prickle exceptionally useful; if you do a lot of time tracking for freelance work or other tasks, I hope it might be useful to you as well. Please try it out, and if you have any questions or problems, get in touch!

Here’s a screenshot of Prickle in action:
[Screenshot: Prickle in action]

Introducing Prickle, a time tracking tool

I’ve been doing a lot of Django coding lately, and I’ve been doing it very well. But I’ve felt a bit of stagnation settling in. I’ve only been coding what I know. I needed to explore, experiment.

So I picked a couple of technologies I’ve heard about but haven’t tried, and started coding. The goal wasn’t to make something useful, it was to learn some new APIs and frameworks. I accidentally made something useful.

So here I present Prickle. Prickle is a time tracking tool for freelancers written using Pylons, CouchDB, and Jinja2. (Those are the technologies I chose to play with.) I decided to write an app to replace Freckle, a stay-out-of-your-way time tracking tool that has been highly recommended to me by my favorite alternate freelancer, Dan McGee. Freckle is a great service, but I have a couple of problems with it: I hate hosting my data on other people’s machines, and I hate paying money to use a service. Granted, at $12 a month, it’s going to take me years to earn back the hours I put into developing Prickle, but that’s not the point. Replacing Freckle in my daily use was just a little bonus I got while studying Pylons and CouchDB.

All Prickle does is allow me to enter the number of hours I’ve worked on a project, along with a description. It makes entering hours as simple as possible; the date is selected semi-automatically. Hours can be entered in a variety of formats (0:30, 30, 0.5, and .5 all map to 30 minutes; 1, 1:00, and 1.0 all map to 1 hour), so it tends to “just work”. Project names are autocompleted, so it’s just a few keypresses to get to the description box. Now I type a description, submit, and I’m done.
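
For the curious, duration parsing along those lines can be sketched in a few lines of Python. This is not Prickle's actual code. The function name parse_minutes is made up, and so is the rule for bare integers: the examples above only say that 30 means minutes and 1 means an hour, so the cutoff of 10 used here is purely an assumption.

```python
# Assumption: bare integers below this value are read as hours, the rest as
# minutes. The post only gives "1" (an hour) and "30" (minutes) as examples.
BARE_INT_HOURS_CUTOFF = 10


def parse_minutes(text):
    """Convert a loosely formatted duration string to whole minutes."""
    text = text.strip()
    if ":" in text:
        # h:mm form, e.g. "0:30" or "1:00"
        hours, minutes = text.split(":", 1)
        return int(hours or 0) * 60 + int(minutes or 0)
    if "." in text:
        # decimal hours, e.g. "0.5", ".5" or "1.0"
        return round(float(text) * 60)
    value = int(text)
    if value < BARE_INT_HOURS_CUTOFF:
        return value * 60  # small bare number: treat as hours
    return value  # larger bare number: treat as minutes


half_hour = [parse_minutes(s) for s in ("0:30", "30", "0.5", ".5")]
one_hour = [parse_minutes(s) for s in ("1", "1:00", "1.0")]
```

With that cutoff, all four half-hour spellings come out as 30 minutes and all three one-hour spellings as 60.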

There are a bunch of views of the timesheets that allow me to see how much uninvoiced time has been spent on a particular project or during a particular month or day. If I set a rate for a project, I also get a summary of how much money I’ve earned on a given day or month, or how much I’m going to have to bill a client for a specific project.

And Prickle does invoicing. It’s very simple invoicing, but very simple is all that I need. It summarizes the hours since the last invoice was created and displays them in an invoice template. (I currently use print-to-PDF to generate the invoice; maybe later I’ll automate sending an e-mail to clients.) Even here, Prickle has some handy helpers to speed up the process. It keeps track of billing addresses for a given project, so I don’t have to enter it each time, and it guesses what the next invoice number should be so I often don’t have to type it in.

Prickle doesn’t have all the fancy graphs and reports that Freckle has, but it quickly answers the primary questions that I ask of my data: How many hours have I billed today? How much money have I earned this month? Am I on target for this project’s budget?

Prickle is open source, and I’m hoping some people will find it useful enough to contribute back to it. I use it daily already, and don’t really have any complaints. Some things I intend to add include:

* A timer with a pause button. I hated this when I first tried Freckle, but it grew on me.
* improved historic view of invoices
* editing or deleting time entries

Some things I invite others to add include:
* authentication. I like to run things locally, so auth wasn’t important to me. It’s also confusing to implement in Pylons.
* styling. I applied some semi-random CSS rules, but I know it’s ugly.
* browser support. This thing works great in chromium, but I’m using some of the most experimental stuff: html 5 inputs, CSS 3, etc. It’d be nice to add some Javascript to make it work on other browsers.

So, Dusty, now that you’ve played with Pylons and CouchDB, what’s your take?

CouchDB is pretty cool to work with. Map/Reduce based queries take some getting used to, but once you’ve learned them, that’s all you have to know. There’s no tuning SQL, the ORM is a very thin layer, it just works. If CouchDB is as scalable as they say it is, I think I’d like to use it some more.

Pylons is kinda nice. Formencode sucks, but other than that, the libraries bundled with or suggested for pylons are pretty intuitive. I’m finding Django is a bit over-engineered these days, or maybe I’ve just been pushing its limits. I was hoping Pylons would be a less bossy solution, but I don’t think I’ll be switching to it or suggesting it to my clients anytime soon. It seems a bit rough around the center, and doesn’t seem any less complicated than Django, in practice.

Next on my list of tools to play with is node.js. I also want to play with web2py, and I may try a zope 3 app just for fun someday, too.

Great Big Crane now supports pip

This week, I started using Great Big Crane in real life to manage some of my buildout projects. I was surprised to discover how useful, slick, and bug-free it is. When we wrote it in a 48 hour sprint, I did not realize how functional and complete our final product turned out to be.

I filed about a dozen issues on the project as I used it, but surprisingly few of them were bugs; just feature requests and minor enhancements to make it more usable. I don’t think any of us were expecting to maintain this project when the contest was over. However, now that I see how useful it is, and because winning the dash has garnered a lot of interest in the project, I sat down for a few hours and added the one thing people have been requesting for it: pip support.

This new feature is fairly simple, and not fully tested, but the general idea is to be able to run virtualenv, manage your requirements.txt, and install the dependencies from inside greatbigcrane. This required a fairly invasive refactor of certain commands that we had implemented as buildout specific, but overall, it wasn’t a terribly difficult task.

What I have so far is certainly usable, but I suspect in the long run, it’s just a start!

Have a look at the sources here: http://github.com/pnomolos/Django-Dash-2010/

Get your head out of the clouds: Local web applications

I spent this weekend with two friends crazy enough to join me in a 48 hour coding sprint for the Django Dash. We competed in the dash last year and placed 5th. Our goal was to move up in the rankings this year (competition is stiff, wish us luck!). Our team had the highest number of commits, but I can’t say how many of them can be justified as quality commits… especially since we keep track of our TODO file inside the git repository!

This year, we created a project called Great Big Crane. (I don’t know why we called it this.) The code is stored on Github, and we set up a splash page at greatbigcrane.com. We don’t have a live demo for reasons I’ll get into shortly.

This project’s primary purpose is to help manage buildouts for Python projects, most especially Django projects. It helps take care of some of the confusing boilerplate in buildout configuration. It also allows one-click access to common commands like running bootstrap or buildout, syncdb, manage.py, or migrate, and running the test suite associated with a buildout. It does most of these actions as jobs in the background, and pops up a notification when each one completes. It even keeps track of the results of the latest test suite run so you can see at a glance which of your projects are failing their tests.

One of the most intriguing things this application does is open a text editor, such as gvim, to edit a buildout if you need more control than our interface provides. It does this by queuing a job that executes the text editor command on the server.
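
The general pattern of queuing a job that runs a command in the background can be sketched with nothing but the standard library. This is an illustration of the idea, not Great Big Crane’s implementation: the names jobs, results, worker, and enqueue are invented here, and the demo runs a harmless Python one-liner where the real thing would launch an editor such as gvim.

```python
import queue
import subprocess
import sys
import threading

jobs = queue.Queue()
results = {}  # job id -> (exit code, captured output)


def worker():
    """Run queued commands one at a time, recording how each one ended."""
    while True:
        job_id, command = jobs.get()
        try:
            completed = subprocess.run(command, capture_output=True, text=True)
            results[job_id] = (completed.returncode, completed.stdout.strip())
        finally:
            jobs.task_done()


def enqueue(job_id, command):
    """Called by the web request: returns immediately, the work happens later."""
    jobs.put((job_id, command))


threading.Thread(target=worker, daemon=True).start()

# Stand-in command; a real job might be ["gvim", "buildout.cfg"].
enqueue("demo", [sys.executable, "-c", "print('editing buildout.cfg')"])
jobs.join()  # only the demo waits; a web view would not block here
```

The recorded exit code is the kind of thing a notification or a "latest test run" indicator could be built from.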

Wait, What? It can be a bit creepy when clicking a button in a web application fires up an arbitrary program on your computer.

This entire app is designed to run on localhost. It’s set up for developers to manage their own projects. It doesn’t support authentication (this is why we don’t have a live demo), and the server has full access to the local filesystem. It’s meant to support your local IDE, not to provide an online IDE. The entire app is therefore super fast (no network delay), and switching from it to my text editor to several terminals became quite normal as I was developing on it (yes, the buildout for Great Big Crane runs just fine from inside Great Big Crane ;).

So yes, you’re expected to run this web app locally. Why would anybody want to do this? Is it a sensible thing to do?

The alternative to what we’ve done here would be to code the whole thing up as a GUI application of some sort. I have experience with most of the Python GUI toolkits, and I can’t say that I “enjoy” working in any of them. I’m not sure I enjoy working in HTML either, but I do a lot of it. HTML 5 with CSS 3 is certainly a powerful and reasonable alternative to modern graphical toolkits.

I’ve been coding HTML for so long that I don’t know what the learning curve is for it, but I’m definitely more comfortable working with it than I am with Tk, Qt, GTK, or wxWidgets, all of which take a long time to learn how to code properly. Possibly I’m just stagnating, but I think I’d prefer to develop my next “desktop” app as a webapp intended to run locally, rather than study these toolkits again. Indeed, because I think I’d prefer to do that, I started coding my last project in PyQt, just to fight the stagnation tendency. PyQt is an incredibly sensible toolkit, after you have learned how to make sense of it, but it’s not as sensible as the new web standards. Another advantage of the web approach is that if you ever decide you want to make the app network enabled, you’re already running an app server and using standard web technologies, so pushing it to the cloud is straightforward.

So my gut feeling at this point is that yes, it is sensible to design “traditional” desktop apps using HTML 5, CSS, and javascript for the interface, and your choice of webserver and web framework for the backend. Perhaps it’s not any more sensible than using a GUI toolkit, but it’s certainly not insane.

If it makes sense to replace local desktop apps with a local server, does it also make sense to replace web applications with a local system?

I’m not a huge fan of web applications because they are slow for me. I have a good connection (by Canadian standards, which aren’t high…). Yet Gmail is slower than Thunderbird, Freshbooks is too slow for me to justify paying for it, and github, while fascinating, is also slow compared to local access. The only webapp I have tested that I consider responsive is Remember The Milk, a popular todo list. I’m not certain what they do to make it so responsive, but I suspect Google Gears or HTML 5 localstorage must be involved.

Local storage. I’ve written about this before (I must be getting repetitive). My idea then was that offline enabled webapps are just as responsive as desktop apps. But the current available paradigm, using HTML5 localstorage, requires a lot of overhead normally involving manual syncing between the browser data and the server. What if I was running the app locally instead? Then I could just design it as a “normal” web app, without having to put extra thought into designing and maintaining local storage in the browser. It would be super responsive when I access it from my computer. More interestingly, it would also be available from remote computers. If I accessed it across my LAN using another laptop or my phone’s wifi, it would still be acceptably responsive. And if I happen to need access from the library or my friend’s computer, I can log in remotely, and still have approximately the same level of responsiveness that I currently get by logging into a server in the cloud.

This isn’t a new idea. It’s been presented as a “gain control of your own data” alternative to the privacy and control fears that Google, Facebook, and Apple (among others) have been creating. (This is a nice discussion: http://www.h-online.com/open/features/Interview-Eben-Moglen-Freedom-vs-the-Cloud-Log-955421.html) There are a lot of clear advantages to moving data local, but there are also disadvantages. The nice thing about cloud storage is not having to worry about data backup. The “access anywhere” paradigm is nice, too, although that is not ruled out by running a home webserver. Zero install, with end users not having to think about dependencies, is also nice.

Overall, I’m finding more and more reasons to bring our apps home, where we have control of them. Such cycles are common in the technology industry. Dumb terminals and mainframes. Personal computers. Business networks. The Internet. The cloud. Off-board video, on-board video. Network cards, on-board NICs. Hardware modems, software modems. Personally, I think the cycle away from the cloud is just beginning. I think the company that finally conquers Google will be doing it by giving you back control of your data. I’ve never been totally comfortable with the whole web application idea (as a user; they’re fine to develop!). I’m still trying to identify what my reasons are, but in the meantime, we experimented with the idea by developing Great Big Crane as a local web application.