
Pushing Python Past the Present

This is the first time I’ve inserted myself into an exchange between bloggers I don’t know. This topic interests me and I have something to add. Most importantly, I found an alliterative title. So I thought I’d give it a go.

I first saw Calvin Spealman’s article I am worried about the future of Python. I suspect that upon reading this article, Guido got into his famous time machine to give his PyCon 2012 Keynote speech, accusing Calvin of trolling.

This article was shortly followed by Tim McNamara’s Python is doing just fine which (likely unintentionally) summarizes Guido’s talk.

Finally, Nick Coghlan came out with an extensive summary article called Python’s Future: A Global Perspective.

And now there’s me. All I want to do is push people to start supporting PyPy. The PyPy developers have worked on a variety of technologies that can be made into production-ready products that address most issues (real or imaginary) that people see with Python.

They’ve already solved speed and I suspect the next release of PyPy will solve memory. They’ve made huge but mostly incomplete progress in a lot of other areas, including the ones the above bloggers have mentioned.

Concurrency

I have had a lot of trouble dealing with parallelism in Python. Everything I have tried has either been a hack or required a hack to work with it. In my experience, the multiprocessing module does not work in large-scale production. However, it works a lot better on PyPy than it does on CPython, at least for me. The various async solutions can only use one processor and are therefore, in my opinion, no more exciting than GIL-throttled threads.

In addition to better multiprocessing, PyPy also supports stackless extensions out of the box. And I haven’t even mentioned Armin Rigo’s mysterious Software Transactional Memory implementation.

Concurrency is a sore point in Python. There are solutions. PyPy is capable of doing those solutions better.

Web

I think it’s unfortunate that the only way to code for a web browser is to use JavaScript or a language that compiles to JavaScript. I feel there is no real reason browsers can’t support arbitrary <script type=""> tags. I discussed this last March.

With development, PyPy can support Python in the browser. The mostly-finished but untested sandboxing feature of PyPy can be adapted for in-browser execution. I know some experimenting was done on this in 2009 that proved it’s possible. We just need to take it up and make it happen.

Mobile

PyPy’s speed makes it theoretically possible to run it effectively on mobile platforms. Further, it can be compiled to arbitrary execution environments including the JVM and .NET. There is no reason that, with some development, PyPy couldn’t run on Android’s Dalvik VM. I’m sure it’s even possible to run it on iOS.

Interestingly, one of the trolls Guido mentioned in his talk was “PyPy should become the default Python implementation”. He debunked this readily by asking the audience how many people use PyPy in production. Nobody moved.

PyPy is ready for production use, but it is not widely used. I think this is largely because the primary PyPy developers are way more excited about creating new features than they are about marketing the current platform or polishing up nearly finished features like sandboxing or a Dalvik backend. They are visionaries, not finishers.

I say this with a great deal of respect. I am not calling for these guys to change their focus. What I want is for other people from the Python community to join the PyPy project as finishers. It needs people that can make sure these nearly-finished features are working, production-ready and more importantly: documented.

I’ve been meaning, for months, to become one of these people. Unfortunately, I’ve prioritized other things including my job and my upcoming book. It’s still on my radar, though, and I hope that after reading this article, you, too, are thinking about making PyPy the future of Python.

So, what can you do?

Use PyPy
PyPy is capable of being your default Python 2 interpreter for most common tasks. Use it. If it doesn’t work for a given project, get on #pypy; they will help you fix it. It’s even more exciting if you can use PyPy features that are not currently available in CPython, such as the stackless extensions.
Evangelize PyPy
Tell people how well PyPy is working for you. Write articles like this one.
Document PyPy
The PyPy website contains a lot of documentation, but it’s rather intimidating and unreadable to the uninitiated. PyPy is just Python, but it’s also much more than Python. Somebody with some writing skills needs to get in there and make it digestible.
Design PyPy
Seriously, pypy.org does not need to look like something out of 2005.
Develop PyPy
I’ve hacked on PyPy. It’s not scary. Work on features like numpy or Python 3 support that are the current developers’ focus. Better yet, work on finalizing features like sandboxing or alternative backends that are finished but not quite tested.

If you have an hour or two free this weekend, instead of writing about how Python is great or not great, or not going to continue to be great, do something with PyPy. Make it happen.

How (not) to screw up timezone processing in Python

If you use pytz without reading the documentation, you might think you can do this:

import datetime
import pytz

cltime = datetime.datetime(
        2012,9,26,1,15,0,
        tzinfo=pytz.timezone('America/Santiago'))

This is an easy way to utterly screw up your timezones.

Never pass a tzinfo into the datetime constructor. Here’s why:

>>> cltime = datetime.datetime(
        2012,9,26,1,15,0,
        tzinfo=pytz.timezone('America/Santiago'))
>>> cltime.astimezone(pytz.timezone('US/Pacific')).isoformat()
'2012-09-25T22:58:00-07:00'

That’s simply the wrong time. 1:15 Chilean time should be 21:15 Pacific the previous day, not 22:58.

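The wrong tzinfo is attached to the object. The constructor gets the zone’s default tzinfo, which in pytz is the earliest offset on record (here Santiago Mean Time, 4 hours 43 minutes behind UTC), not the offset in effect on that date. Compare: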

>>> cltime.tzinfo
<DstTzInfo 'America/Santiago' SMT-1 day, 19:17:00 STD>
>>> datetime.datetime.now(pytz.timezone('America/Santiago')).tzinfo
<DstTzInfo 'America/Santiago' CLST-1 day, 21:00:00 DST>
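
The 22:58 result is just offset arithmetic, and it can be reproduced without pytz. What follows is a minimal sketch, not pytz’s internals: it assumes fixed offsets of -4:43 (the SMT value in the repr above), -3:00 (CLST) and -7:00 (Pacific daylight time), and the names SMT, CLST and PDT are mine, defined with the standard library’s datetime.timezone.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the pytz zones (assumptions for illustration).
SMT = timezone(timedelta(hours=-4, minutes=-43), "SMT")  # Santiago mean time
CLST = timezone(timedelta(hours=-3), "CLST")             # Chile summer time
PDT = timezone(timedelta(hours=-7), "PDT")               # Pacific daylight time

naive = datetime(2012, 9, 26, 1, 15, 0)

# Same wall-clock time, interpreted with each Santiago offset, then converted.
wrong = naive.replace(tzinfo=SMT).astimezone(PDT)
right = naive.replace(tzinfo=CLST).astimezone(PDT)

print(wrong.isoformat())  # 2012-09-25T22:58:00-07:00
print(right.isoformat())  # 2012-09-25T21:15:00-07:00
print(wrong - right)      # 1:43:00
```

The 1 hour 43 minute gap is exactly the difference between the two Santiago offsets.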

The correct way is to use the localize() method of the pytz timezone object:

>>> correct_cltime = pytz.timezone('America/Santiago'
        ).localize(datetime.datetime(2012,9,26,1,15,0))
>>> correct_cltime.astimezone(pytz.timezone('US/Pacific')).isoformat()
'2012-09-25T21:15:00-07:00'
>>> correct_cltime.tzinfo
<DstTzInfo 'America/Santiago' CLST-1 day, 21:00:00 DST>

The pytz documentation does not indicate if datetime.datetime.now() creates the correct timezone, so I tested it:

>>> for tz in pytz.all_timezones:
...     assert datetime.datetime.now(pytz.timezone(tz)
...         ).tzinfo == pytz.UTC.localize(datetime.datetime.utcnow()
...         ).astimezone(pytz.timezone(tz)).tzinfo
...     
>>>

The assertion never failed, so it is safe to use datetime.datetime.now(pytz.timezone("[timezonename]")) to generate the current date in a specific timezone.

In general, you should store dates in UTC and convert them to the user’s timezone at the latest possible time. Try to never store naive datetimes. If you need the current time in UTC, use

pytz.UTC.localize(datetime.datetime.utcnow())

because utcnow() returns a naive datetime. Localizing it to UTC adds the correct timezone to it.

pytz.UTC.localize(datetime.datetime(2012,9,26,1,15))

is the correct way to construct a datetime in the UTC timezone. Technically, you CAN pass tzinfo to the datetime constructor if tzinfo is UTC. This is because UTC does not have daylight saving time. However, for consistency, do not do this; do not ever pass tzinfo into a datetime constructor.

Converting to or from UTC is simple, using astimezone:

>>> utctime = correct_cltime.astimezone(pytz.UTC)
>>> utctime
datetime.datetime(2012, 9, 26, 4, 15, tzinfo=<UTC>)
>>> utctime.astimezone(pytz.timezone('America/Santiago'))
datetime.datetime(2012, 9, 26, 1, 15, tzinfo=<DstTzInfo 'America/Santiago' CLST-1 day, 21:00:00 DST>)
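
The store-in-UTC, convert-late advice can be sketched with a pair of small helpers. This is only an illustration under stated assumptions: to_utc and for_display are hypothetical names, and a fixed -03:00 standard-library offset stands in for America/Santiago so the example runs without pytz.

```python
from datetime import datetime, timedelta, timezone

def to_utc(aware_dt):
    """Normalize an aware datetime to UTC for storage; reject naive input."""
    if aware_dt.tzinfo is None or aware_dt.utcoffset() is None:
        raise ValueError("refusing to store a naive datetime")
    return aware_dt.astimezone(timezone.utc)

def for_display(stored_utc, user_tz):
    """Convert a stored UTC datetime to the user's zone at the last moment."""
    return stored_utc.astimezone(user_tz)

# Assumption: a fixed -03:00 offset stands in for Santiago in September 2012.
# A fixed-offset zone has only one offset, so attaching it directly is safe;
# with a pytz zone, use its localize() method instead.
user_tz = timezone(timedelta(hours=-3), "CLST")
local = datetime(2012, 9, 26, 1, 15).replace(tzinfo=user_tz)

stored = to_utc(local)
print(stored.isoformat())                        # 2012-09-26T04:15:00+00:00
print(for_display(stored, user_tz).isoformat())  # 2012-09-26T01:15:00-03:00
```

Rejecting naive input at the storage boundary is one way to enforce the “never store naive datetimes” rule; whether to raise or to assume UTC is a design choice.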

Arch Linux Handbook for Kindle

I have had a handful of requests that the Arch Linux Handbook be made available for the Kindle platform. It seemed like an odd request, given that the latest version of the Beginners’ Guide is already freely available in electronic format online. However, I had some free time this week and tried the conversion. It wasn’t difficult and I uploaded a version of the Handbook to the Kindle app store. I’ve helped publish other books to Kindle, so I already knew the process.

I received an e-mail from Amazon Customer Service to the following effect:

During a review of your KDP submission(s), we found content that is freely available on the web. You can do an online search for the content inside your book(s) to discover which sites are offering the content for free. Copyright is important to us – we want to make sure that no author or other copyright holder has their work claimed and sold by anyone else.

To confirm you have publishing rights to and control where you distribute the book(s), please provide all of the following information:

1. The URLs for all websites where this content is published
2. An explanation as to why the content is available online

If the books are in the public domain, please confirm this and include the information you used to make this determination. We may request additional information to confirm the public domain status.

Please respond within 5 days to title-submission@amazon.com, and include the title and ID of your books in your reply. Your book has been moved to a blocked status on your bookshelf and will not be available for sale in the Kindle store until we receive the documentation requested.

Sure, no problem. I responded to explain that it was freely redistributable under the Free Documentation License:

This content is indeed freely available on the web at https://wiki.archlinux.org/index.php/Beginners%27_Guide although I have done a certain amount of editing to get it into its current format.

However, this freely available content is published under the GNU Free Documentation License 1.3 or later. (http://www.gnu.org/copyleft/fdl.html) which explicitly states:

“The purpose of this License is to make a manual, textbook, or other functional and useful document “free” in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.”

Their response makes me sound like some kind of criminal:

Hello,

We’ve reviewed the information you provided and have decided to block these books from being sold in the Kindle Store. The books closely match content that is freely available on the web and we are not confident that you hold exclusive publishing rights. This type of content can create a poor customer experience, and is not accepted. As a result, we have blocked the books listed below from being sold in the Kindle Store.

Arch Linux Handbook 3.0 by Phillips, Dusty (EDITOR) (ID: 2884216)

Please be advised that you must hold exclusive publishing rights for books that closely match content that is freely available on the web. If your catalog continues to contain books that fail to comply with these conditions or do not meet our Content Guidelines, your account may be terminated.

The content guidelines applicable to all Kindle Direct Publishing (KDP) publishers can be found here: https://kdp.amazon.com/self-publishing/help?topicId=A2TOZW0SV7IR1U

Best regards,

Megan B.
Amazon.com
Your feedback is helping us build Earth’s Most Customer-Centric Company.

I was insulted and hurt by this message. It’s arrogant and confrontational. It makes no effort to address the specifics of my e-mail. But I cheered up at the end, or at least I laughed: “Customer-Centric Company”? Don’t advertise what you aren’t.

The original message made no mention of the fact that I should have “exclusive” publishing rights. It even said public domain work was acceptable. I have previously published Creative Commons books on the Kindle that are freely available online (although they are under a non-commercial Creative Commons license, so we still have “exclusive” rights to publish on Kindle).

In other news, the Arch Linux Handbook can be downloaded in .mobi format, free of charge, from http://archlinux.ca/arch_linux_handbook_3.mobi.

UPDATE:

After this article unexpectedly hit Reddit and Hacker News (my target audience was the Arch Linux Community), Amazon let the title go through and it is now available from http://www.amazon.com/dp/B0098R71V2. They have been in contact with me but have not been able to explain why the book has been allowed to be published! The best explanation I got was:

We’ve re-reviewed your content and have determined it may be published to the Kindle Store. We generally can’t accept content that closely matches content that is freely available on the web, for which you do not hold the sole publishing rights, or that which is not in the public domain. For example, content from Wikipedia and content with private label rights are not allowed since it disappoints our customers to pay for content that is freely available on the web.

Since the Arch Linux Beginners’ Guide is a wiki-developed article no different from Wikipedia, I’m not certain why the Handbook is now allowed.

I originally posted this article to explain why the handbook is not available in the Kindle store. My intent was not to complain about Amazon’s policies (although I was not happy with the suggestion that I am prone to criminal activity). The new purpose is to mention its availability through the store, although you can still download my quick one-off conversion if that suits you better. In all honesty, I can’t imagine why anyone would want to purchase something that is already freely available in electronic format, but at least now that choice is open.

Copyresponsibility

There has been a great deal of discussion about copyrights in the new world, where the ability to copy or distribute any piece of information without destroying the original is virtually infinite.

Much effort has been put into preserving the “old model”, where the right to copy is restricted based on the author (or more often, the publisher/distributor), in the form of DRM, civil lawsuits, new laws, and intellectual property. Overall, this effort has failed, and continues to fail. The laws are becoming more complex, but enforcement is still impossible.

We live in an exciting time. Society has not yet decided whether copying information is moral or not. We already know what a world in which copying data is immoral is like: the “old model”. But what would happen in a world where it was both accepted and expected that anyone who has access to a piece of information has the legal and moral right to redistribute it?

Personally, I think we must plan for such a world, because it seems inevitable. Perhaps the various organizations attempting to hold onto the old model will succeed, but not forever. I think coming generations will demand the right to access and distribute information as they see fit.

Ultimately, this would mean that anyone who has access to (“read”) a piece of information has an equal right to share that information as the author. We already see models where this is the case; very few people know or care who the authors of Wikipedia articles are, for example. When a YouTube video goes viral, you rarely look to see who uploaded it. Even if you do, you don’t know if they actually created the original.

I think this will place a great deal of responsibility on people who have privileged knowledge. The author of a document has exclusive right to its content until she shares it with another person. At that point, those two people would have an equal share in the right to the content. This is implicit and obvious, but something that has not been discussed much to date is that they also have an equal share in the responsibility of deciding who else may view that document.

If every single person who has access to the information agrees that no-one else should see it, then the information is private to that group. However, if the weakest link chooses to share it with outsiders, nobody else in the group would have the right to stop them.

Thus, in the copy-responsibility world, a huge amount of trust is required before sharing information that you consider sensitive. I know people who claim, “if it’s on the Internet then it’s public”, even if it’s inside a private Facebook group or Google Plus circle. This claim simply acknowledges that every member of that circle has an equal right to share that data, or equal responsibility to keep it within the cartel. (They are overlooking the fact that employees at Facebook and Google are also implicit members of that circle that may not be so reliable.)

In summary, I think it’s fair to say we live in (or will soon live in) a world where we no longer have a right to privacy. Instead, we will have a responsibility to privacy. We are responsible for our own “secrets” as well as the secrets of anyone who has shared their secrets with us.

Arch Linux Handbook 3.0

After some unexpected delays, I am proud to present the latest version of the Arch Linux Handbook: https://www.createspace.com/3904652. This Handbook is based off the Arch Linux Beginners’ Guide as it was on June 10, 2012. Due to complaints with previous editions, I put extra effort into the interior of this version, removing the styling on things that look like links to other wiki pages, and ensuring that such links at least give a hint as to what you should search for on the Arch Wiki for further information.

Even so, the handbook is little more than a printout of the Beginners’ Guide. It does not contain any new information aside from a short foreword I authored. However, if you are looking for a paper copy of the Beginners’ Guide in a convenient handbook size, this book is exactly what you are looking for.

As with version 2.0 of the handbook, the cover design for version 3.0 was done by Branko Vukelic, a gifted artist, designer, and web developer.

The Economics Of Cooperation

I’d like to spread the word about a project that has the goal of allowing open source developers to focus on the projects they have passion for.

Gittip allows people to set up recurring payments to specific programmers to encourage them to work on whatever projects they are passionate about. The long term idea is that maybe these developers could quit their day jobs working for companies that pay well but don’t necessarily produce products they are fascinated with, and spend their time on whatever open source project is currently taking their fancy. This model has been used somewhat effectively for a few projects such as the massively successful Kickstarter project for Diaspora (http://www.kickstarter.com/projects/196017994/diaspora-the-personally-controlled-do-it-all-distr) or the donation bars for PyPy development (http://pypy.org/). However, the simple fact is that most open source developers are either a) paid by companies that have ulterior motives or b) working on open source projects only in their free time.

Gittip was developed by one of my favourite coworkers. He’s leaving a kickass job at our company because he believes that the world can be made more personal. I am writing this article, on my own initiative and not at his request, to support that goal. Perhaps I, too, one day, will be able to return to open source development and focus on my passions.

Instead of paying Google to pay its developers to make awesome stuff for us (and allowing Google’s wealthy shareholders, who did nothing, to skim a little off the top), we can pay developers directly. Google employees have a pretty awesome life, but they still have to spend 80% of their time working on Google products. We can make them happier by allowing them to quit Google and focus on their passions. And they can make us happier by creating brilliant products for us without our having to sell our souls and personal data to use them. This story, of course, could apply to every software company and any brilliant open source developer out there today.

Gittip reminds me of a project I developed a few years ago called Arch Bounty (http://archlinux.me/dusty/2009/09/07/introducing-arch-bounty/). Arch Bounty failed before it got off the ground, partially because Arch itself wasn’t big enough to support it, partially because I failed to promote it, and partially because I didn’t do a good job of explaining what kinds of projects would be suitable for it. I hope Gittip does better.

I am fascinated by how effective cooperative communities can be. I love how my tiniest contributions to Arch Linux are paid in full by the tiny contributions of numerous other users. I love how hosting one person in the Couch Surfing community gives me the chance to stay anywhere in the world and meet wonderful people.

And yet, I am saddened by the way consumerism is able to overwhelm many similar communities. Google is very intent on making Mozilla irrelevant. We are socially conditioned to compete for resources instead of cooperating to build shared resources that benefit each of us more than individual competition could ever pay off.

Cooperation and competition need not be mutually exclusive. When I play ultimate, twice a week, I have a very strong competitive desire, as do my teammates and opponents. However, we have a culture of cheering good plays, regardless of which team makes them, and commiserating over missed opportunities. When two of us are competing for the disc, we’ll high five each other for having made a good run, regardless of who caught it. The world would be a better place if this cooperative-competitive culture could be applied universally, if we could all attempt to excel at our individual passions while also nourishing and encouraging others in their pursuits. I believe this dual nature can be applied to the marketplace. Everyone talks about the relative merits and problems of capitalism vs communism. Both systems seem to have shown fatal flaws over the last hundred years. What if we had both?

Can such a coopero-competitive culture be financially stable? A successful Gittip would be a great first step. I encourage you to discover the GitHub usernames of your favourite Arch Linux developers and make a small weekly contribution. It would be amazing if enough of us could do this to motivate even one developer to quit his day job and work on Arch full time. Is it possible?

Yes.

ArchCon 2012

In 2010, I helped organize the first ArchCon, held in Toronto. We had a blast, and I considered it a success. Unfortunately life overtook me and I did not have time to organize a second one in 2011. I heard that there was a Polish language conference that used the ArchCon name that year, but didn’t know about it until it was over.

This year, the Polish group is going all out with the first truly international ArchCon. You can view their site in English at http://2012.archcon.pl/en/ (other languages also available). They’re expecting up to 250 people and are planning a terrific after party and workshops. The conference will be held in Warsaw on July 28 and 29.

Sadly, I won’t be able to make it, but I am very excited that the community has grown enough to create this international event. If you’re going to be near Poland at that time, I encourage you to register. ArchCon 2010 was a small event, but it was a lot of fun and to the best of my knowledge, everyone was glad they attended. There’s something special about being surrounded by Archers.

Trembling Aspen

Aspen is a unique Python web framework developed by Chad Whitacre. I am still evaluating it, but one thing I can say unequivocally is that it’s different. Very different. Aspen divorces itself from the MVC pattern that every major — and almost every minor — web framework has been relying on for the past decade. While each framework has its perks and annoyances, they all have this underlying MVC-like interface. Aspen throws that all away.

Aspen lodges itself in a small niche between PHP-like spaghetti code and a complicated package structure full of Python models, views, controllers, and templates. In Aspen, template and control code live in the same file, called a simplate, but in separate pages. In a rebellion against regular expression URL parsing, Aspen uses the filesystem to define URL structure and paths, like in the old days of statically served HTML files.

One thing I love about Aspen is how easy it is to get up and running. Aspen will serve a single HTML file with no Python code and no boilerplate. It serves static files for CSS, JavaScript, and images just as easily. To add a dynamic page to your Aspen-powered site, you simply use the aforementioned simplates, which include Python controllers and Tornado-inspired templates in the same file. Aspen provides numerous service connectors, so deployment can also be trivial.

Another great feature of Aspen is its documentation. At first glance, the docs seem sparse and rather limited. However, they are well organized, and every page seems to answer every question I have on the topic succinctly and quickly. Chad seems to have a terrific minimalist policy of telling you exactly what you want to know in as few words as possible. As a writer, I know just how difficult this is to accomplish without repeating oneself.

On the negative side, I do have misgivings as to how well Aspen’s filesystem URLs can scale. Chad has assured me that Aspen is up for running large sites. I appreciate the power of regex URLs, and I don’t necessarily like having arbitrary file extensions in URLs. I think locators should be succinct and divorced from implementation details. Ideally, I believe there may be a happy medium between Aspen’s enforced folder structure and the overly abstract and overly complicated URL scheme Django and its kin provide.

Another downside is that I haven’t figured out a good way to unit test Aspen simplates. You can refactor much of the code into backend libraries and test that, of course, or you can start the service running and mock requests. However, I miss the TestClient feature from Django that allows me to mock a request without running the server. It provides a better analysis of code coverage, and is closer to the actual code.

Now, with the review over, I’d like to introduce a little project I’ve started to provide some vital features that Aspen core is missing.
Trembling is kind of a catch-all project with a few basic helpers and a few useful projects. Namely, trembling provides:

  • Basic sessions
  • Basic user authentication
  • Less than basic single request messages
  • A simple redirect response

Aspen itself does not define or require a model layer; you’re free to use the ORM, DBAPI connector, or NoSQL database of your choice. Trembling arbitrarily chooses mongoengine for the models. I made no effort to make pluggable backends like Django does; Aspen does not encourage excessive layers of abstraction.

I also don’t know how secure trembling’s session and auth features are. If you find a security hole, pull requests are welcome. Overall, I think the combination of Trembling and Aspen will be very useful for my future basic sites. I would also like to try it on a more complicated project just to see how well it performs.

Arch Linux Lanyards Are Back!

Demand for Arch Linux lanyards has been growing steadily. I had intended to have a new order by New Year’s, but I ended up dealing with a different company, and decided to do a completely new design inspired by the updated Arch Linux website. There is a subtle gradient from dark grey to black in the background that looks very dynamic, and the logo itself is crisp and clear. Each lanyard is thin and very light to wear.

The lanyards are $6 for singles, and can be purchased from The Arch Schwag Store.

Add prominent links to your project in your project’s documentation

I’ve spent this weekend researching a wide variety of Python libraries for a pet project that may never come to fruition. To my delight, most of the projects I was interested in were documented in Sphinx, and the documentation had been posted online using Read The Docs or a self-hosted site.

Some of the projects only include API documentation, while others also include helpful guides or tutorials. However, almost none of the projects I visited had prominent links back to the project website and source code repository in their documentation. Some had such links hidden on a contributing or download page.

A lot of the time, my Duck Duck Go results link me directly to the documentation for a project I haven’t heard of. I read the APIs and think, “hey, I’d like to try this out.” Then I have a bit of trouble actually finding the project’s home page.

So documentation authors: Please add prominent links to your project’s home page, as well as source code repository and issue trackers to your documentation.