<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Reliably Broken &#187; python</title>
	<atom:link href="http://reliablybroken.com/b/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://reliablybroken.com/b</link>
	<description>It&#039;s a blog: let&#039;s do funch!</description>
	<lastBuildDate>Sat, 31 Jul 2010 01:07:08 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Django-style routing for Bottle</title>
		<link>http://reliablybroken.com/b/2010/07/django-style-routing-for-bottle/</link>
		<comments>http://reliablybroken.com/b/2010/07/django-style-routing-for-bottle/#comments</comments>
		<pubDate>Mon, 26 Jul 2010 21:57:40 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[bottle]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=510</guid>
		<description><![CDATA[Bottle provides the @route decorator to associate URL paths with view functions. This is very convenient, but if you are a Django-reject like me then you may prefer having all your URLs defined in one place, the advantage being it is easy to see at a glance all the different URLs your application will match.

Updated: [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://bottle.paws.de/">Bottle</a> provides the <a href="http://bottle.paws.de/docs/0.8/api.html#bottle.route"><code>@route</code> decorator</a> to associate URL paths with view functions. This is very convenient, but if you are a <a href="http://www.djangoproject.com/">Django</a>-reject like me then you may prefer having all your URLs defined in one place, the advantage being it is easy to see at a glance <a href="http://docs.djangoproject.com/en/1.2/topics/http/urls/#example">all the different URLs your application will match</a>.</p>

<p><em>Updated: I have re-written this post and the example to make it simpler following Marcel Hellkamp&#8217;s comments (Marcel is the primary author of Bottle). My original example was needlessly complicated.</em></p>

<p>It is possible to have <a href="http://docs.djangoproject.com/en/dev/topics/http/urls/">a Django-style urlpatterns stanza</a> with a Bottle app. Here&#8217;s how it can work:</p>

<pre><code>from bottle import route

# Assuming your *_page view functions are defined above somewhere
urlpatterns = (
    # (path, func, name)
    ('/', home_page, 'home'),
    ('/about', about_page, 'about'),
    ('/contact', contact_page, 'contact'),
)

for path, func, name in urlpatterns:
    route(path, name=name)(func)
</code></pre>

<p>Here we run through a list where each item is a triple of URL path, view function and a name for the route. For each we simply call the <code>route</code> function, which returns a decorator, and then invoke that decorator with the function object. This is not as flexible as using the decorator on a function (because the <code>@route</code> decorator can take additional keyword arguments) but at least you can have all the routes in one place at the end of the module.</p>
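
<p>If you do want per-route keyword arguments, one option is to carry a dictionary of them in each entry. This is only a sketch: the <code>route</code> function below is a stand-in I made up so the snippet runs without Bottle installed (it just records what was registered), and the view functions are placeholders. With Bottle available you would import the real <code>route</code> instead.</p>

```python
# Stand-in for bottle.route (an assumption for this sketch, not Bottle's code).
# It records each registration instead of wiring up a real application.
registered = []

def route(path, **options):
    def decorator(func):
        registered.append((path, func.__name__, options))
        return func
    return decorator

# Placeholder view functions
def home_page():
    return "home"

def contact_page():
    return "contact"

urlpatterns = (
    # (path, func, keyword arguments passed on to route)
    ("/", home_page, {"name": "home"}),
    ("/contact", contact_page, {"name": "contact", "method": "POST"}),
)

for path, func, options in urlpatterns:
    # Same two-step call as before: build the decorator, then apply it
    route(path, **options)(func)
```

<p>The loop is unchanged apart from unpacking the dictionary, so the routes still live in one place.</p>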

<p>Then again if you have so many routes that you need to keep them in a pretty list you probably aren&#8217;t writing the simple application that Bottle was intended for.</p>

<p>(This was tested with Bottle&#8217;s 0.8 and 0.9-dev branches.)</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/07/django-style-routing-for-bottle/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>More Python features that I really like</title>
		<link>http://reliablybroken.com/b/2010/05/more-python-features-that-i-really-like/</link>
		<comments>http://reliablybroken.com/b/2010/05/more-python-features-that-i-really-like/#comments</comments>
		<pubDate>Fri, 28 May 2010 15:00:00 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Ben Dodd]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=446</guid>
		<description><![CDATA[Another thing that makes using Python pleasing is decorators. A decorator is a wrapper for a function (or method) that takes a function (or method) as an argument and returns a new function (or&#8230;) which is then bound to the name for the original function.

The newly-decorated function can then do things like checking the called [...]]]></description>
			<content:encoded><![CDATA[<p>Another thing that makes using <a href="http://www.python.org">Python</a> pleasing is decorators. <a href="http://docs.python.org/reference/compound_stmts.html#function">A decorator is a wrapper for a function</a> (or method) that takes a function (or method) as an argument and returns a new function (or&#8230;) which is then bound to the name for the original function.</p>

<p>The newly-decorated function can then do things like checking the called arguments before invoking the original un-decorated function.</p>

<p><a href="http://docs.djangoproject.com/en/dev/topics/auth/#django.contrib.auth.decorators.user_passes_test">Django provides decorators for authentication</a> so that you can wrap a view function with a check for client credentials before deciding whether to return the original response or to deny access.</p>

<p>In this manner Django&#8217;s authentication decorators encourage orthogonal code: the logic for displaying a view is separated from the logic for deciding whether you should be permitted to see the view&#8217;s output. By keeping them separate, it becomes simpler to re-use the authentication logic and apply it to other views.</p>

<p>Suppose you have a view that accepts <a href="http://docs.djangoproject.com/en/dev/ref/request-response/">a Django request object</a> and checks whether the user is signed in:</p>

<pre><code>def administration_page(request):
    if request.user.is_authenticated():
        return HttpResponse("Welcome, dear user.")
    else:
        return HttpResponseRedirect("/signin/")
</code></pre>

<p>With a decorator you can simplify and clarify things:</p>

<pre><code>@login_required
def administration_page(request):
    return HttpResponse("Welcome, dear user.")
</code></pre>

<p>For older versions of Python (pre 2.4) <a href="http://docs.python.org/whatsnew/2.4.html#pep-318-decorators-for-functions-and-methods">which don&#8217;t understand the <code>@</code> syntax</a> one must explicitly decorate the view function like so:</p>

<pre><code>def administration_page(request):
    return HttpResponse("Welcome, dear user.")

administration_page = login_required(administration_page)
</code></pre>

<p>Note in the example that the original <code>administration_page</code> function is passed to the decorator. The <code>@</code> syntax in the previous example makes that implicit but the two are equivalent.</p>

<p>The implementation of a decorator is interesting. It takes the function itself as an argument and returns a new function which does the actual checking. Here is how the decorator used above might do its stuff:</p>

<pre><code>def login_required(view_function):
    def decorated_function(request):
        if request.user.is_authenticated():
            return view_function(request)
        else:
            return HttpResponseRedirect("/signin/")

    return decorated_function
</code></pre>

<p><em>The actual <a href="http://code.djangoproject.com/browser/django/tags/releases/1.2.1/django/contrib/auth/decorators.py">implementation of Django&#8217;s <code>login_required</code> decorator</a> is considerably less idiotic. Python&#8217;s <a href="http://docs.python.org/library/functools.html">functools module</a> has helpers for writing well-behaved decorators.</em></p>
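
<p>As a rough illustration of what &#8220;well-behaved&#8221; means, here is a sketch of the same decorator using <code>functools.wraps</code>, which copies the name and docstring of the original function onto the wrapper. The <code>FakeUser</code> and <code>FakeRequest</code> classes are made up so the example runs without Django, and the redirect is just a string rather than a real response object.</p>

```python
import functools

class FakeUser(object):
    """Made-up stand-in for a Django user object."""
    def __init__(self, signed_in):
        self.signed_in = signed_in

    def is_authenticated(self):
        return self.signed_in

class FakeRequest(object):
    """Made-up stand-in for a Django request object."""
    def __init__(self, signed_in):
        self.user = FakeUser(signed_in)

def login_required(view_function):
    @functools.wraps(view_function)  # copy __name__, __doc__ etc. to the wrapper
    def decorated_function(request, *args, **kwargs):
        if request.user.is_authenticated():
            return view_function(request, *args, **kwargs)
        return "redirect to /signin/"
    return decorated_function

@login_required
def administration_page(request):
    """Show the administration page."""
    return "Welcome, dear user."

# The decorated function keeps its original identity
print(administration_page.__name__)
print(administration_page(FakeRequest(signed_in=False)))
```

<p>Without the <code>functools.wraps</code> line the decorated view would report its name as <code>decorated_function</code>, which makes tracebacks and introspection harder to follow.</p>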

<p>Because functions in Python are themselves objects the decorator can accept a function reference, construct a new function that checks for authentication and then return a reference to that new function.</p>

<p>Simples!</p>

<p>(Simples gets less simples when you want to write a decorator that accepts configuration arguments because you then need either another layer of nested function definitions or a class whose instances can be called directly, but I&#8217;m going to ignore you for a bit and <em>wow is that Concorde&#8230;?</em>)</p>
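
<p>For the record, here is roughly what those two options look like. This is a sketch under my own assumptions: the <code>Request</code> class is a made-up stand-in with a single <code>signed_in</code> flag, the redirect is a plain string, and the <code>signin_url</code> argument is hypothetical rather than anything taken from Django.</p>

```python
import functools

class Request(object):
    """Hypothetical stand-in for a request; it only carries a signed_in flag."""
    def __init__(self, signed_in):
        self.signed_in = signed_in

# Option 1: another layer of nested functions.
def login_required(signin_url):
    def decorator(view_function):
        @functools.wraps(view_function)
        def decorated_function(request):
            if request.signed_in:
                return view_function(request)
            return "redirect to " + signin_url
        return decorated_function
    return decorator

# Option 2: a class whose instances can be called directly.
class LoginRequired(object):
    def __init__(self, signin_url):
        self.signin_url = signin_url

    def __call__(self, view_function):
        @functools.wraps(view_function)
        def decorated_function(request):
            if request.signed_in:
                return view_function(request)
            return "redirect to " + self.signin_url
        return decorated_function

@login_required("/signin/")
def administration_page(request):
    return "Welcome, dear user."

@LoginRequired("/staff/signin/")
def staff_page(request):
    return "Welcome, dear staff."
```

<p>In both cases the expression after the <code>@</code> is evaluated first, and whatever it returns is then called with the view function.</p>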
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/05/more-python-features-that-i-really-like/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Split a file on any character in Python</title>
		<link>http://reliablybroken.com/b/2010/04/split-a-file-on-any-character-in-python/</link>
		<comments>http://reliablybroken.com/b/2010/04/split-a-file-on-any-character-in-python/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 10:38:12 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[unix]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=437</guid>
		<description><![CDATA[I need to split a big text file on a certain character. I expect I am being thick about this, but split doesn&#8217;t quite do what I want because it includes the matching line, whereas I want to split right on the matching character.

My Python answer:

def readlines(filename, endings, chunksize=4096):
    """Returns a generator [...]]]></description>
			<content:encoded><![CDATA[<p>I need to split a big text file on a certain character. I expect I am being thick about this, but <a href="http://developer.apple.com/Mac/library/documentation/Darwin/Reference/ManPages/man1/split.1.html"><code>split</code></a> doesn&#8217;t quite do what I want because it includes the matching line, whereas I want to split right on the matching character.</p>

<p>My Python answer:</p>

<pre><code>def readlines(fileobj, endings, chunksize=4096):
    """Returns a generator that splits an open file into pieces, each
    piece finishing with the given line-ending.
    """
    line = fileobj.read(0)  # empty str or bytes, matching the file's mode
    while True:
        buf = fileobj.read(chunksize)
        if not buf:
            if line:  # don't yield an empty final piece
                yield line
            break

        line = line + buf

        while endings in line:
            idx = line.index(endings) + len(endings)
            yield line[:idx]
            line = line[idx:]

if __name__ == "__main__":
    import sys, os

    FORMFEED = b'\x0c' # ASCII 12
    basename = os.path.basename(sys.argv[1])
    # Read and write in binary mode so the data passes through untouched
    source = open(sys.argv[1], 'rb')
    for num, data in enumerate(readlines(source, endings=FORMFEED)):
        filename = basename + '-' + str(num)
        out = open(filename, 'wb')
        out.write(data)
        out.close()
    source.close()
</code></pre>

<p>This is also useful when reading data exported from some old-fashioned Mac application like <a href="http://www.filemaker.com/support/downloads/downloads_prev_versions.html">Filemaker 5</a> where the line-endings are ASCII 13 not ASCII 10.</p>
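
<p>Here is a small sketch of that carriage-return case. It is written for Python 3 and uses an in-memory <code>io.StringIO</code> instead of a real export file; the <code>split_on</code> name and the sample records are made up for the example, and the tiny chunk size is only there to show that a piece can span several reads.</p>

```python
import io

def split_on(fileobj, ending, chunksize=4096):
    """Yield pieces of fileobj, each finishing with ending (except perhaps the last)."""
    pending = ""
    while True:
        buf = fileobj.read(chunksize)
        if not buf:
            break
        pending += buf
        while ending in pending:
            idx = pending.index(ending) + len(ending)
            yield pending[:idx]
            pending = pending[idx:]
    if pending:
        yield pending  # whatever is left after the last ending

CR = chr(13)  # ASCII 13, the old Mac line-ending
exported = io.StringIO("first record" + CR + "second record" + CR + "third")

# Strip the trailing carriage return from each piece as it arrives
records = [piece.rstrip(CR) for piece in split_on(exported, CR, chunksize=5)]
print(records)
```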

<p>This post was inspired by <a href="http://www-01.ibm.com/software/lotus/products/notes/">Lotus Notes</a> version 8.5, which is so advanced that to save a message in a file on disk you have to export it as structured text. And if you want to save a whole bunch of messages as individual files you must forget that <a href="http://www.mactech.com/articles/mactech/Vol.10/10.06/DragAndDrop/index.html">drag-and-drop was introduced with System 7</a>; that would be too obvious.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/04/split-a-file-on-any-character-in-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Django AdminForm objects and templates</title>
		<link>http://reliablybroken.com/b/2010/04/django-adminform-objects-and-templates/</link>
		<comments>http://reliablybroken.com/b/2010/04/django-adminform-objects-and-templates/#comments</comments>
		<pubDate>Sun, 04 Apr 2010 21:21:54 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=425</guid>
		<description><![CDATA[I can&#8217;t find documentation for the context of a Django admin template. In particular, where is the form and how does one access the fields? This post describes the template context for a generic admin model for Django 1.1.

Django uses an instance of ModelAdmin (defined in django.contrib.admin.options) to handle the request for a model object [...]]]></description>
			<content:encoded><![CDATA[<p>I can&#8217;t find documentation for the context of a Django admin template. In particular, where is the form and how does one access the fields? This post describes the template context for a generic admin model for <a href="http://code.djangoproject.com/browser/django/tags/releases/1.1">Django 1.1</a>.</p>

<p>Django uses an instance of <code>ModelAdmin</code> (defined in <a href="http://code.djangoproject.com/browser/django/tags/releases/1.1/django/contrib/admin/options.py#L175"><code>django.contrib.admin.options</code></a>) to handle the request for a model object add / change view in the admin site. <code>ModelAdmin.add_view</code> and <code>ModelAdmin.change_view</code> are responsible for populating the template context when rendering the add object and change object pages respectively.</p>

<p>Here are the keys common to add and change views:</p>

<ul>
<li><strong>title</strong>, &#8216;Add &#8216; or &#8216;Change &#8216; + your model class&#8217; <code>_meta.verbose_name</code></li>
<li><strong>adminform</strong> is an instance of <code>AdminForm</code></li>
<li><strong>is_popup</strong>, a boolean which is true when <code>_popup</code> is passed as a request parameter</li>
<li><strong>media</strong> is an instance of <a href="http://docs.djangoproject.com/en/dev/topics/forms/media/"><code>django.forms.Media</code></a></li>
<li><strong>inline_admin_formsets</strong> is a list of <a href="http://code.djangoproject.com/browser/django/tags/releases/1.1/django/contrib/admin/helpers.py#L102"><code>InlineAdminFormSet</code></a> objects</li>
<li><strong>errors</strong> is an instance of <a href="http://code.djangoproject.com/browser/django/tags/releases/1.1/django/contrib/admin/helpers.py#L198"><code>AdminErrorList</code></a></li>
<li><strong>root_path</strong> is the <code>root_path</code> attribute of the <code>AdminSite</code> object</li>
<li><strong>app_label</strong> is your model class&#8217; <code>_meta.app_label</code> attribute</li>
</ul>

<p>The way that Django renders a form in the admin view is to iterate over the <code>adminform</code> instance and then iterate over each <a href="http://code.djangoproject.com/browser/django/tags/releases/1.1/django/contrib/admin/helpers.py#L50"><code>Fieldset</code></a> which in turn yields <a href="http://code.djangoproject.com/browser/django/tags/releases/1.1/django/contrib/admin/helpers.py#L82"><code>AdminField</code></a> instances. All I want to do is lay out the form fields, ignoring the fieldset groupings which may or may not be defined in the model&#8217;s <code>ModelAdmin.fieldsets</code> attribute.</p>

<p>This turns out to be easy once you know how. The regular form is an attribute of the <code>adminform</code> object. So if your model has a field named &#8220;<code>king_of_pop</code>&#8221; you can refer to the form field in your template like so:</p>

<pre><code>{{ adminform.form.king_of_pop.label_tag }}: {{ adminform.form.king_of_pop }}
</code></pre>

<p>Or if you want to save your finger tips you can use the <a href="http://docs.djangoproject.com/en/dev/ref/templates/builtins/#with"><code>with</code> template tag</a>:</p>

<pre><code>{% with adminform.form as f %}
{{ f.king_of_pop.label_tag }}: {{ f.king_of_pop }}
{% endwith %}
</code></pre>

<p>Delving through the Django source while I tried to understand all of this I was struck by how <a href="http://docs.python.org/reference/datamodel.html#emulating-container-types">Python defines hook functions for iteration and accessing attributes</a>. Half of Python&#8217;s attraction is in how easy it is from the program author&#8217;s point of view to treat objects as built-in types like lists, dicts, etc.; the other half is the responsibility of the author of a Python module to encourage that same ease of use by implementing the related iteration protocols. It is harder to write a good Python module than it is to write a good Python program that uses a good module.</p>
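
<p>To make that concrete, here is a toy class that implements a few of those hooks. It is a sketch of the general technique, not how Django&#8217;s <code>AdminForm</code> is written; the <code>FieldGroup</code> name and its contents are made up.</p>

```python
class FieldGroup(object):
    """Made-up container that can be looped over, indexed and dotted into."""

    def __init__(self, **fields):
        self._fields = fields

    def __iter__(self):
        # Called by for-loops, list(), sorted() and friends
        return iter(sorted(self._fields))

    def __len__(self):
        return len(self._fields)

    def __getitem__(self, name):
        # Called for group["name"]
        return self._fields[name]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, e.g. group.name
        if name == "_fields":
            raise AttributeError(name)  # avoid recursing before __init__ has run
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)

group = FieldGroup(king_of_pop="Michael", email="mj@example.com")

for name in group:
    print(name, group[name])

print(len(group), group.king_of_pop)
```

<p>The caller gets to use a plain <code>for</code> loop, <code>len()</code> and dotted access, and all the work of making that possible sits with whoever wrote the class.</p>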
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/04/django-adminform-objects-and-templates/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Using MacPorts behind a firewall</title>
		<link>http://reliablybroken.com/b/2010/03/using-macports-behind-a-firewall/</link>
		<comments>http://reliablybroken.com/b/2010/03/using-macports-behind-a-firewall/#comments</comments>
		<pubDate>Wed, 31 Mar 2010 11:37:39 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[macports]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=420</guid>
		<description><![CDATA[I failed to persuade MySQLdb to build on a Mac OS X Server 10.5.8 install using the system Python + MySQL installation. So I turned to MacPorts where I know I can get Django + all the bits working without much hassle (but with much patience).

The next problem was that MacPorts couldn&#8217;t update because rsync [...]]]></description>
			<content:encoded><![CDATA[<p>I failed to persuade <a href="http://mysql-python.sourceforge.net/MySQLdb.html">MySQLdb</a> to build on a <a href="http://www.apple.com/server/macosx/">Mac OS X Server 10.5.8</a> install using the system <a href="http://www.python.org/">Python</a> + <a href="http://www.mysql.com/">MySQL</a> installation. So I turned to <a href="http://www.macports.org/">MacPorts</a> where I know I can get <a href="http://www.djangoproject.com/">Django</a> + all the bits working without much hassle (but with much patience).</p>

<p>The next problem was that MacPorts couldn&#8217;t update because <a href="http://samba.anu.edu.au/rsync/">rsync</a> was blocked by the corporate access policy. Fortunately plain HTTP is permitted outbound. Here&#8217;s how to use a local ports tree.</p>

<p>Install MacPorts using the disk image for 10.5.</p>

<pre><code>curl -O http://distfiles.macports.org/MacPorts/MacPorts-1.8.2-10.5-Leopard.dmg
hdiutil attach MacPorts-1.8.2-10.5-Leopard.dmg
sudo installer -pkg /Volumes/MacPorts-1.8.2/MacPorts-1.8.2.pkg -target /
hdiutil detach /Volumes/MacPorts-1.8.2
</code></pre>

<p>If the MacPorts install directories are not in your <code>PATH</code> environment variable, you can add them to your <code>.profile</code>. This change will not take effect until you start a new terminal session.</p>

<pre><code>cat &gt;&gt; ~/.profile &lt;&lt;'EOF'
export PATH=/opt/local/bin:/opt/local/sbin:${PATH}
export MANPATH=/opt/local/share/man:${MANPATH}
EOF
</code></pre>

<p>After you have installed MacPorts, create a directory for the ports tree and check it out using <a href="http://subversion.tigris.org/">Subversion</a>.</p>

<pre><code>sudo mkdir -p /opt/local/var/macports/sources/svn.macports.org/trunk/dports
cd /opt/local/var/macports/sources/svn.macports.org/trunk/dports
sudo svn co http://svn.macports.org/repository/macports/trunk/dports/ .
</code></pre>

<p>N.B. In the last line beginning <code>svn co ...</code> the trailing directory separator is significant!</p>

<p>Now tell MacPorts to use the local checkout rather than rsync. Edit <code>/opt/local/etc/macports/sources.conf</code> and add a new line to the end with the path to the ports tree, then comment out the previous line that uses rsync. Here are the last lines from my configuration:</p>

<pre><code>#rsync://rsync.macports.org/release/ports/ [default]
file:///opt/local/var/macports/sources/svn.macports.org/trunk/dports/ [default]
</code></pre>

<p>Finally you must create an index for the tree (otherwise you will see messages saying &#8220;Warning: No index(es) found!&#8221;).</p>

<pre><code>cd /opt/local/var/macports/sources/svn.macports.org/trunk/dports
sudo portindex
</code></pre>

<p>Now go do great things.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/03/using-macports-behind-a-firewall/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ModelForms good for importing too</title>
		<link>http://reliablybroken.com/b/2010/01/modelforms-good-for-importing-too/</link>
		<comments>http://reliablybroken.com/b/2010/01/modelforms-good-for-importing-too/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 20:36:25 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=395</guid>
		<description><![CDATA[If you have exported data from one database in plain text format and you want to import it to Django, you should use a ModelForm class to do a lot of the heavy lifting for you.

A suitable ModelForm for your Django model will consume each row and do the conversion of each field to an [...]]]></description>
			<content:encoded><![CDATA[<p>If you have exported data from one database in plain text format and you want to import it to <a href="http://www.djangoproject.com">Django</a>, you should use a <a href="http://docs.djangoproject.com/en/dev/topics/forms/modelforms/"><code>ModelForm</code> class</a> to do a lot of the heavy lifting for you.</p>

<p>A suitable <code>ModelForm</code> for your Django model will consume each row and do the conversion of each field to an appropriate Python type. Much simpler than explicitly converting each value yourself before creating a new model instance.</p>

<p>Suppose you have a model for an address book entry and its associated <code>ModelForm</code> (this works for Django 1.1):</p>

<pre><code># myapp/models.py
from django.db import models
from django import forms

class Contact(models.Model):
    first_name = models.CharField(max_length=100)
    second_name = models.CharField(max_length=100)
    telephone = models.CharField(max_length=50, blank=True)
    email = models.EmailField(blank=True)

class ContactForm(forms.ModelForm):
    class Meta:
        model = Contact
</code></pre>

<p>Here&#8217;s a script to run through a comma-separated list of contacts where each line looks something like &#8220;Smits, Jimmy, jimmy@example.com, 555-1234&#8221;:</p>

<pre><code>from myapp.models import ContactForm

# Map columns to fields, adjusting the order as necessary
column_map = (
    'second_name',
    'first_name',
    'email',
    'telephone',
)

for line in open('comma-separated-data.txt'):
    row = dict(zip(column_map, (field.strip() for field in line.split(','))))
    form_obj = ContactForm(row)
    try:
        form_obj.save()
    except ValueError:
        for k, v in form_obj.errors.items():
            print k, row[k], ', '.join(map(unicode, v))
</code></pre>

<p>If a line doesn&#8217;t validate the script prints the validation errors and moves to the next line. If your data has columns you want to ignore then just name them in the <code>column_map</code> &#8211; the form class will ignore extra keys in the dictionary.</p>
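
<p>The row-building step is worth a closer look on its own. Here is a sketch for Python 3 that leaves Django out entirely and swaps the plain <code>split(',')</code> for the standard library <code>csv</code> module, which copes with quoted values that contain commas. The second sample line is made up to show that case.</p>

```python
import csv
import io

# Map columns to fields, adjusting the order as necessary
column_map = ("second_name", "first_name", "email", "telephone")

# In-memory stand-in for the exported file; the second line is made up
sample = io.StringIO(
    'Smits, Jimmy, jimmy@example.com, 555-1234\n'
    '"Doe, Jr.", Jane, jane@example.com, 555-9876, some trailing column\n'
)

rows = []
for values in csv.reader(sample, skipinitialspace=True):
    # zip stops at the shorter sequence, so columns beyond column_map are dropped
    rows.append(dict(zip(column_map, (value.strip() for value in values))))

for row in rows:
    print(row)
```

<p>Each dictionary in <code>rows</code> is the kind of thing you would hand to the form class in the script above.</p>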
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/01/modelforms-good-for-importing-too/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Notes on Radmind&#8217;s checksum</title>
		<link>http://reliablybroken.com/b/2010/01/radminds-checksum/</link>
		<comments>http://reliablybroken.com/b/2010/01/radminds-checksum/#comments</comments>
		<pubDate>Fri, 01 Jan 2010 12:04:05 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[radmind]]></category>
		<category><![CDATA[watchedinstall]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=376</guid>
		<description><![CDATA[It would be nice to do a pure-Python implementation of Radmind&#8217;s fsdiff output for watchedinstall, which consists of several white-space separated fields describing the filename&#8217;s attributes and an optional checksum for the file.

These are notes on how Radmind generates checksums for files on Mac OS X.

The fsdiff format is documented, however for files with Mac [...]]]></description>
			<content:encoded><![CDATA[<p>It would be nice to do a <a href="http://www.python.org/">pure-Python</a> implementation of <a href="http://rsug.itd.umich.edu/software/radmind/">Radmind</a>&#8217;s fsdiff output for <a href="http://bitbucket.org/ptone/watchedinstall/">watchedinstall</a>, which consists of several white-space separated fields describing the filename&#8217;s attributes and an optional checksum for the file.</p>

<p>These are notes on how Radmind generates checksums for files on <a href="http://www.apple.com/macosx/">Mac OS X</a>.</p>

<p>The <a href="http://linux.die.net/man/1/fsdiff">fsdiff format is documented</a>, however for files with Mac Finder info or a resource fork the checksum is for an <a href="http://users.phg-online.de/tk/netatalk/doc/Apple/v2/AppleSingle_AppleDouble.pdf">AppleSingle</a>-encoded representation of the file, which means a Python implementation needs to produce an equivalent AppleSingle-encoded byte stream for the file. Bummer.</p>

<p>Python 2.6 on Mac OS X includes a <a href="http://www.python.org/doc/2.6.2/library/undoc.html#module-applesingle">(deprecated) applesingle module</a> that can read the format but cannot write it (and the module has been removed for Python 3). Therefore a pure Python implementation of Radmind&#8217;s checksum has to implement a compatible AppleSingle encoding routine too.</p>

<p>Radmind&#8217;s fsdiff command is written in C, which I can just about get the gist of, but I am missing something because my attempts at emulating Radmind&#8217;s checksums are wrong.</p>

<p>The meat of Radmind&#8217;s checksum is the <a href="http://radmind.cvs.sourceforge.net/viewvc/radmind/radmind/cksum.c"><code>do_acksum()</code> function in <code>cksum.c</code></a>. The algorithm appears to be as follows:</p>

<ol>
<li>Initialize a digest using the default cipher (<a href="http://www.openssl.org/docs/crypto/md5.html">MD5</a> I think).</li>
<li>Add the AppleSingle header, consisting of a magic number and version number and some padding.</li>
<li>Add the AppleSingle entry table, which has 3 entries for the Finder info, the resource fork info and the data fork info (in that order). Each entry is 12 bytes &#8211; an unsigned long for the entry type, an unsigned long for an offset into the file where the data will start and an unsigned long for the data length.</li>
<li>Add the Finder info data.</li>
<li>Add the resource fork data.</li>
<li>Add the data fork data.</li>
<li>Return a base64 encoded version of the final digest.</li>
</ol>

<p>Because the entry table in the AppleSingle header specifies data offsets and lengths you need to know the size of the Finder info data (always 32 bytes) and the size of the resource fork and the size of the data fork before you pass that data to the digest function.</p>

<p>So a working Python implementation needs to know the size of the resource fork and data fork before feeding that same data to the digest. It seems to me that this requirement might imply huge memory allocations while slurping file data &#8211; my wrong attempt tried counting bytes and later feeding the same data to the digest in manageable chunks.</p>
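
<p>For what it is worth, here is a sketch of the steps listed above, assuming the sizes are known up front (from a stat call, say) so the fork data can still be fed to the digest in chunks. The magic number, version, filler and entry IDs are my reading of the AppleSingle specification, SHA-1 is used as the digest, and none of this has been checked against real fsdiff output, so treat it as a starting point rather than a working port.</p>

```python
import base64
import hashlib
import struct

# Constants as I read them from the AppleSingle specification (assumptions)
AS_MAGIC = 0x00051600
AS_VERSION = 0x00020000
ENTRY_DATA_FORK = 1
ENTRY_RESOURCE_FORK = 2
ENTRY_FINDER_INFO = 9
FINDER_INFO_SIZE = 32

def applesingle_header(rsrc_size, data_size):
    """Build the header plus a 3-entry table: Finder info, resource fork, data fork."""
    # magic, version, 16 bytes of filler, number of entries (all big-endian)
    header = struct.pack(">II16xH", AS_MAGIC, AS_VERSION, 3)
    offset = len(header) + 3 * 12  # the first data starts after the entry table
    entries = b""
    for entry_id, size in (
        (ENTRY_FINDER_INFO, FINDER_INFO_SIZE),
        (ENTRY_RESOURCE_FORK, rsrc_size),
        (ENTRY_DATA_FORK, data_size),
    ):
        entries += struct.pack(">III", entry_id, offset, size)
        offset += size
    return header + entries

def applesingle_digest(finder_info, rsrc_chunks, rsrc_size, data_chunks, data_size):
    """Feed header, Finder info and both forks to SHA-1; return it base64 encoded."""
    digest = hashlib.sha1()
    digest.update(applesingle_header(rsrc_size, data_size))
    digest.update(finder_info)
    for chunk in rsrc_chunks:  # chunks can come from reading a file bit by bit
        digest.update(chunk)
    for chunk in data_chunks:
        digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")

# Toy in-memory data standing in for a real file's forks
finder = b"\0" * FINDER_INFO_SIZE
rsrc = b"resource bytes"
data = b"data fork bytes"
checksum = applesingle_digest(
    finder, [rsrc[:4], rsrc[4:]], len(rsrc), [data], len(data))
print(checksum)
```

<p>The point of the sketch is that only the sizes are needed before hashing starts, so nothing forces the whole file into memory.</p>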

<p>Anyway&#8230;</p>

<p>Advice much appreciated. The workaround is to leave it to fsdiff to generate the checksum and parse the value from the output.</p>

<p>David</p>

<p>P.S. I still intend running <a href="http://www.aux-penelope.com/">A/UX 3.0.1</a> on my Centris 660av one day.</p>

<p>Update: using my eyes and brains and the <code>fsdiff -V</code> command I was able to read the fsdiff man page and deduce the preferred checksum cipher is actually sha1. My code is still wrong.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2010/01/radminds-checksum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Context managers</title>
		<link>http://reliablybroken.com/b/2009/12/context-managers/</link>
		<comments>http://reliablybroken.com/b/2009/12/context-managers/#comments</comments>
		<pubDate>Sun, 20 Dec 2009 15:10:45 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[with]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=367</guid>
		<description><![CDATA[I was re-writing the excellent watchedinstall tool and needed to simplify a particularly gnarly chunk of code that required three sub-processes to be started and then killed after invoking another process. It occurred to me I could make these into context managers.

Previously the code was something like&#8230;

start(program1)
try:
    start(program2)
except:
    stop(program1)
 [...]]]></description>
			<content:encoded><![CDATA[<p>I was re-writing the excellent <a href="http://bitbucket.org/ptone/watchedinstall/">watchedinstall</a> tool and needed to simplify a particularly gnarly chunk of code that required three sub-processes to be started and then killed after invoking another process. It occurred to me I could make these into context managers.</p>

<p>Previously the code was something like&#8230;</p>

<pre><code>start(program1)
try:
    start(program2)
except:
    stop(program1)
    raise

try:
    start(program3)
except:
    stop(program2)
    stop(program1)
    raise

try:
    mainprogram()
finally:
    stop(program3)
    stop(program2)
    stop(program1)
</code></pre>

<p>Of course that could have been written with nested try / except / else / finally blocks as well, which is what I started with, but it was not much shorter and almost incomprehensible.</p>

<p><a href="http://docs.python.org/library/stdtypes.html#typecontextmanager">With context managers</a> the whole thing was written as&#8230;</p>

<pre><code># from __future__ import with_statement, Python 2.5

with start(program1):
    with start(program2):
        with start(program3):
            mainprogram()
</code></pre>

<p>So much more comprehensible! Here&#8217;s the implementation of the context manager (using the <code>contextlib.contextmanager</code> decorator for a triple word score):</p>

<pre><code>import contextlib
import os
import signal
import subprocess


@contextlib.contextmanager
def start(program_args):
    prog = subprocess.Popen(program_args)
    if prog.poll(): # Anything other than None or 0 is BAD
        raise subprocess.CalledProcessError(prog.returncode, program_args[0])

    try:
        yield
    finally:
        if prog.poll() is None:
            os.kill(prog.pid, signal.SIGTERM)
</code></pre>

<p>For bonus points I might have used <a href="http://docs.python.org/library/contextlib.html"><code>contextlib.nested()</code></a> to put the three <code>start()</code> calls on one line but then what would I do for the rest of the day?</p>
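
<p>On Python 2.7 and Python 3 the <code>with</code> statement itself accepts several context managers separated by commas, which gives the one-line version without any helper. Here is a sketch; the <code>start()</code> below is a stand-in that only records events in a list rather than launching real processes, so the ordering is easy to see.</p>

```python
import contextlib

events = []

@contextlib.contextmanager
def start(name):
    # Stand-in for launching a process: just note what happened and when
    events.append("start " + name)
    try:
        yield name
    finally:
        events.append("stop " + name)

def mainprogram():
    events.append("main")

# Managers are entered left to right and exited in reverse order
with start("program1"), start("program2"), start("program3"):
    mainprogram()

print(events)
```

<p>The programs are stopped in the reverse of the order they were started, the same as with the three nested <code>with</code> blocks.</p>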
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/12/context-managers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I am very bad at writing tests</title>
		<link>http://reliablybroken.com/b/2009/11/i-am-very-bad-at-writing-tests/</link>
		<comments>http://reliablybroken.com/b/2009/11/i-am-very-bad-at-writing-tests/#comments</comments>
		<pubDate>Sun, 22 Nov 2009 08:00:29 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=280</guid>
		<description><![CDATA[&#8230; but I think I might be getting a little better.

At least these days when I am writing some script (almost certainly in Python) I start out by intending to write tests. I usually fail because I haven&#8217;t learnt to think in terms of writing code that can be easily tested.

Mark Pilgrim&#8217;s Dive Into Python [...]]]></description>
			<content:encoded><![CDATA[<p>&#8230; but I <em>think</em> I might be getting a little better.</p>

<p>At least these days when I am writing some script (almost certainly in <a href="http://www.python.org/">Python</a>) I start out by intending to write tests. I usually fail because I haven&#8217;t learnt to think in terms of writing code that can be easily tested.</p>

<p><a href="http://diveintomark.org/">Mark Pilgrim</a>&#8217;s <a href="http://www.diveintopython.org/">Dive Into Python</a> has great stuff on how to approach a problem by <a href="http://diveintopython.org/unit_testing/stage_1.html">defining the tests first and gradually filling in the code</a> that satisfies the test suite. One day I may be able to work like that; until then I work by writing a concise docstring, then stubbing out the function. Once the function is in a state where it might actually return a meaningful result I can play with it in the Python interpreter and start adding useful <a href="http://docs.python.org/library/doctest.html">doctests</a> to the <a href="http://www.python.org/dev/peps/pep-0257/">docstring</a>.</p>

<p>What really helps is to break the logic out into tiny pieces where ideally each piece returns the result of transforming the input (which I think is known as a <a href="http://en.wikipedia.org/wiki/Functional_programming">functional approach</a>). By doing this I can have tests for most of the code, and the functions with a lot of conditional logic, which are harder to write tests for, will at least be relying on sub-routines that are themselves well tested.</p>
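
<p>A small sketch of that workflow, assuming Python 3: one tiny function that only transforms its input, with doctests in the docstring, and a second function that leans on it. Both function names are invented for the illustration:</p>

```python
def normalise_name(raw):
    """Collapse runs of whitespace and title-case a name.

    >>> normalise_name("  ada   lovelace ")
    'Ada Lovelace'
    >>> normalise_name("")
    ''
    """
    return " ".join(raw.split()).title()


def greeting(raw):
    """Build a greeting from a messy name, relying on the tested helper.

    >>> greeting("grace  hopper")
    'Hello, Grace Hopper!'
    """
    return "Hello, %s!" % normalise_name(raw)
```

<p>Saved to a file, the examples in the docstrings can be checked with <code>python -m doctest thefile.py</code>, which prints nothing when they all pass.</p>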

<p>I can dream.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/11/i-am-very-bad-at-writing-tests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Crazy Acrobat installers love Python</title>
		<link>http://reliablybroken.com/b/2009/11/crazy-acrobat-installers-love-python/</link>
		<comments>http://reliablybroken.com/b/2009/11/crazy-acrobat-installers-love-python/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 22:06:02 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[adobe]]></category>
		<category><![CDATA[hateful]]></category>
		<category><![CDATA[installer]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=331</guid>
		<description><![CDATA[Looking through the updaters for Adobe Acrobat 9 for Mac I came across a bunch of scripts written in Python. My favourite was called FindAndKill.py:

#!/usr/bin/python
"""
    Search for and kill app. 
"""
import os, sys
import commands
import signal


def main():
    if len(sys.argv) != 2:
        print 'Missing [...]]]></description>
			<content:encoded><![CDATA[<p>Looking through the updaters for <a href="http://www.adobe.com/products/acrobatpro/">Adobe Acrobat</a> 9 for Mac I came across a bunch of scripts written in <a href="http://www.python.org">Python</a>. My favourite was called <code>FindAndKill.py</code>:</p>

<pre><code>#!/usr/bin/python
"""
    Search for and kill app. 
"""
import os, sys
import commands
import signal


def main():
    if len(sys.argv) != 2:
        print 'Missing or too many arguments.'
        print 'One argument and only one argument is required.'
        print 'Pass in the app name to find and kill (i.e. "Safari").'
        return 0

    psCmd = '/bin/ps -x -c | grep ' + sys.argv[1]
    st, output = commands.getstatusoutput( psCmd )

    if st == 0:
        appsToKill = output.split('\n')
        for app in appsToKill:
            parts = app.split()
            killCmd = 'kill -s 15 ' + parts[0]
            #print killCmd
            os.system( killCmd )

if __name__ == "__main__":
    main()
</code></pre>

<p>(You can <a href="http://www.adobe.com/support/downloads/detail.jsp?ftpID=4538">download the Acrobat 9.1.3 update</a> and find this script at <code>Acrobat 9 Pro Patch.app/Contents/Resources/FindAndKill.py</code>.)</p>

<p>Was the author not aware of the <code>killall</code> command for sending a kill signal to a named process? The <a href="http://www.manpagez.com/man/1/killall/"><code>killall</code> man page</a> says it appeared in <a href="http://www.freebsd.org/releases/2.1R/announce.html">FreeBSD 2.1, which was released in November 1995</a>. Adobe CS4 was <a href="http://www.adobe.com/aboutadobe/pressroom/pressreleases/200809/092308AdobeCS4Family.html">released about 13 years later</a>. How is it Adobe&#8217;s product managers approve these things for release?</p>

<p>What is particularly galling about Adobe&#8217;s Acrobat 9 updaters is that they seem to re-implement so much of what the Apple installer application does, even down to their use of gzipped cpio archives for the payload.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/11/crazy-acrobat-installers-love-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Migrating a Filemaker database to Django</title>
		<link>http://reliablybroken.com/b/2009/11/migrating-a-filemaker-database-to-django/</link>
		<comments>http://reliablybroken.com/b/2009/11/migrating-a-filemaker-database-to-django/#comments</comments>
		<pubDate>Sat, 07 Nov 2009 08:00:48 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[filemaker]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=320</guid>
		<description><![CDATA[At work we have several Filemaker Pro databases. I have been slowly working through these, converting them to Web-based applications using the Django framework. My primary motive is to replace an overly-complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running Apache on FreeBSD.

At some point in the process of re-writing [...]]]></description>
			<content:encoded><![CDATA[<p>At work we have several <a href="http://www.filemaker.com/">Filemaker Pro</a> databases. I have been slowly working through these, converting them to Web-based applications using <a href="http://www.djangoproject.com/">the Django framework</a>. My primary motive is to replace an overly-complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running <a href="http://httpd.apache.org/">Apache</a> on <a href="http://www.freebsd.org/">FreeBSD</a>.</p>

<p>At some point in the process of re-writing each database for use with Django I have needed to convert all the records from Filemaker to Django. There exist good <a href="http://www.python.org/">Python</a> libraries for <a href="http://code.google.com/p/pyfilemaker/">talking to Filemaker</a> but they rely on the XML Web interface, meaning that you need Filemaker running and set to publish the database on the Web while you are running an import.</p>

<p>In my experience <a href="http://www.filemaker.com/support/technologies/xml">Filemaker&#8217;s built-in XML publishing interface</a> is too slow when you want to migrate tens of thousands of records. During development of a Django-based application I find I frequently need to re-import the records as the new database schema evolves &#8211; doing this by communicating with Filemaker is tedious when you want to re-import the data several times a day.</p>

<p>So my approach has been to export the data from Filemaker as XML using <a href="http://www.filemaker.com/help/html/import_export.16.30.html#1029660">Filemaker&#8217;s FMPXMLRESULT</a> format. The Filemaker databases at work are <em>old</em> (Filemaker 5.5) and perhaps things have improved in more recent versions but Filemaker 5/6 is a very poor XML citizen. When using the FMPDSORESULT format (which has been dropped from more recent versions) it will happily generate invalid XML all over the shop. The FMPXMLRESULT format is better but even then it will emit invalid XML if the original data happens to contain funky characters.</p>

<p>So here is <a href="http://reliablybroken.com/b/wp-content/uploads/2009/11/filemaker.py">filemaker.py, a Python module for parsing an XML file produced by exporting to FMPXMLRESULT</a> format from Filemaker.</p>

<p>To use it you create a sub-class of the <code>FMPImporter</code> class and over-ride the <code>FMPImporter.import_node</code> method. This method is called for each row of data in the XML file and is passed an XML node instance for the row. You can convert that node to a more useful dictionary where keys are column names and values are the column values. You would then convert the data to your Django model object and save it.</p>

<p>A trivial example:</p>

<pre><code>import filemaker

class MyImporter(filemaker.FMPImporter):
    def import_node(self, node):
        node_dict = self.format_node(node)
        print node['RECORDID'], node_dict

importer = MyImporter(datefmt='%d/%m/%Y')
filemaker.importfile('/path/to/data.xml', importer=importer)
</code></pre>

<p>The <code>FMPImporter.format_node</code> method converts values to an appropriate Python type according to the Filemaker column type. Filemaker&#8217;s <code>DATE</code> and <code>TIME</code> types are converted to Python <a href="http://docs.python.org/library/datetime.html#date-objects"><code>datetime.date</code></a> and <a href="http://docs.python.org/library/datetime.html#time-objects"><code>datetime.time</code></a> instances respectively. <code>NUMBER</code> types are converted to Python <code>float</code> instances. Everything else is left as strings, but you can customize the conversion by over-riding the appropriate methods in your sub-class (see the source for the appropriate method names).</p>

<p>In the case of Filemaker <code>DATE</code> values you can pass the <code>datefmt</code> argument to your sub-class to specify the date format string. See Python&#8217;s <a href="http://docs.python.org/library/time.html#time.strftime">time.strptime documentation</a> for the complete list of the format specifiers.</p>

<p>The code uses <a href="http://docs.python.org/library/xml.sax.html">Python&#8217;s built-in SAX parser</a> so that it is efficient when importing huge XML files (the process uses a constant 15 megabytes for any size of data on my Mac running Python 2.5).</p>
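
<p>To show why a SAX parser keeps memory flat, here is a rough sketch written for Python 3. It is not the code in <code>filemaker.py</code>: the <code>RowHandler</code> class, the <code>on_row</code> callback and the trimmed-down sample document are all assumptions made for the illustration, and it makes no attempt at type conversion or repeating fields.</p>

```python
import io
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Collect field names from METADATA, then call on_row for each ROW."""

    def __init__(self, on_row):
        xml.sax.ContentHandler.__init__(self)
        self.on_row = on_row
        self.fields = []      # column names, in document order
        self.values = None    # values of the ROW being read, if any
        self.text = None      # character chunks of the current DATA element

    def startElement(self, name, attrs):
        if name == "FIELD":
            self.fields.append(attrs["NAME"])
        elif name == "ROW":
            self.values = []
        elif name == "DATA":
            self.text = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks.
        if self.text is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "DATA":
            self.values.append("".join(self.text))
            self.text = None
        elif name == "ROW":
            # Hand over one row, then drop it so memory use stays flat.
            self.on_row(dict(zip(self.fields, self.values)))
            self.values = None


# Cut-down sample in the FMPXMLRESULT layout (an assumption for this demo).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
<METADATA>
<FIELD NAME="name" TYPE="TEXT"/>
<FIELD NAME="joined" TYPE="DATE"/>
</METADATA>
<RESULTSET FOUND="2">
<ROW RECORDID="1"><COL><DATA>Ada</DATA></COL><COL><DATA>10/12/1815</DATA></COL></ROW>
<ROW RECORDID="2"><COL><DATA>Grace</DATA></COL><COL><DATA>09/12/1906</DATA></COL></ROW>
</RESULTSET>
</FMPXMLRESULT>
"""

rows = []
xml.sax.parse(io.BytesIO(SAMPLE), RowHandler(rows.append))
print(rows)
```

<p>The demo collects rows in a list only so there is something to print. A real import would save each row inside the callback and let it go, so only one row is held at a time.</p>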

<p>Fortunately I haven&#8217;t had to deal with Filemaker&#8217;s repeating fields so I have no idea how the code works on repeating fields. Please let me know if it works for you. Or not.</p>

<p><a href="http://reliablybroken.com/b/wp-content/uploads/2009/11/filemaker.py">Download filemaker.py</a>. This code is released under a 2-clause BSD license.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/11/migrating-a-filemaker-database-to-django/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Working with Active Directory FILETIME values in Python</title>
		<link>http://reliablybroken.com/b/2009/09/working-with-active-directory-filetime-values-in-python/</link>
		<comments>http://reliablybroken.com/b/2009/09/working-with-active-directory-filetime-values-in-python/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 16:27:16 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[active directory]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=282</guid>
		<description><![CDATA[How To Convert a UNIX time_t to a Win32 FILETIME or SYSTEMTIME:


  Under Win32 platforms, file times are maintained primarily in the form of
  a 64-bit FILETIME structure, which represents the number of 100-nanosecond
  intervals since January 1, 1601 UTC (coordinate universal time).


It just so happens that Microsoft Active Directory uses the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://support.microsoft.com/kb/167296">How To Convert a UNIX time_t to a Win32 FILETIME or SYSTEMTIME</a>:</p>

<blockquote>
  <p>Under Win32 platforms, file times are maintained primarily in the form of
  a 64-bit FILETIME structure, which represents the number of 100-nanosecond
  intervals since January 1, 1601 UTC (coordinate universal time).</p>
</blockquote>

<p>It just so happens that <a href="http://www.microsoft.com/windowsserver2008/en/us/active-directory.aspx">Microsoft Active Directory</a> uses the same 64-bit value to store some time values. For example <a href="http://msdn.microsoft.com/en-us/library/ms675098(VS.85).aspx">the <code>accountExpires</code> attribute</a> is in this format. Linked below is a module for Python with utility functions for converting between <a href="http://docs.python.org/library/datetime.html">Python&#8217;s datetime instances</a> and Microsoft&#8217;s FILETIME values.</p>

<p>Very handy if you enjoy querying Active Directory for login accounts that are due to expire. And who wouldn&#8217;t enjoy that? On a Monday.</p>

<p><a href="/b/wp-content/uploads/2009/09/filetimes.py">Download filetimes.py module for converting between FILETIME and <code>datetime</code> objects.</a> This code is released under a 2-clause BSD license.</p>

<p>Example usage:</p>

<pre><code>&gt;&gt;&gt; from datetime import datetime
&gt;&gt;&gt; from filetimes import filetime_to_dt, dt_to_filetime, utc
&gt;&gt;&gt; filetime_to_dt(116444736000000000)
datetime.datetime(1970, 1, 1, 0, 0)
&gt;&gt;&gt; filetime_to_dt(128930364000000000)
datetime.datetime(2009, 7, 25, 23, 0)
&gt;&gt;&gt; "%.0f" % dt_to_filetime(datetime(2009, 7, 25, 23, 0))
'128930364000000000'
&gt;&gt;&gt; dt_to_filetime(datetime(1970, 1, 1, 0, 0, tzinfo=utc))
116444736000000000L
&gt;&gt;&gt; dt_to_filetime(datetime(1970, 1, 1, 0, 0))
116444736000000000L
</code></pre>

<p>I even remembered to write tests for once!</p>
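
<p>For the curious, the arithmetic behind such a conversion can be sketched as below. This is an independent illustration for Python 3, not the contents of <code>filetimes.py</code>: the function names are made up, and naive datetimes are assumed to be in UTC.</p>

```python
from datetime import datetime, timedelta

UNIX_EPOCH = datetime(1970, 1, 1)
EPOCH_AS_FILETIME = 116444736000000000  # 1970-01-01 in 100 ns ticks since 1601
TICKS_PER_SECOND = 10000000


def filetime_to_datetime(ticks):
    """Convert a FILETIME integer to a naive datetime in UTC."""
    microseconds = (ticks - EPOCH_AS_FILETIME) // 10
    return UNIX_EPOCH + timedelta(microseconds=microseconds)


def datetime_to_filetime(moment):
    """Convert a naive datetime (assumed to be UTC) to a FILETIME integer."""
    delta = moment - UNIX_EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return (EPOCH_AS_FILETIME
            + seconds * TICKS_PER_SECOND
            + delta.microseconds * 10)


print(filetime_to_datetime(128930364000000000))
```

<p>Sticking to integer arithmetic avoids the rounding that creeps in when 18-digit tick counts pass through floating point.</p>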
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/09/working-with-active-directory-filetime-values-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>BBC iCalendar schedules</title>
		<link>http://reliablybroken.com/b/2009/08/bbc-icalendar-schedules/</link>
		<comments>http://reliablybroken.com/b/2009/08/bbc-icalendar-schedules/#comments</comments>
		<pubDate>Fri, 07 Aug 2009 20:18:42 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[bbc]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[time zone]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=262</guid>
		<description><![CDATA[Jon Udell recently wrote about accessing the BBC programming schedules but was put off by the lack of time zone information in the iCalendar feeds, which prompted me to fix the quick-and-dirty script I have that generates iCalendar files for the BBC. (I wrote the first, time zone-blind version of my script in England&#8217;s Winter and [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.jonudell.net/">Jon Udell</a> recently <a href="http://blog.jonudell.net/2009/08/05/curation-meta-curation-and-live-net-radio/">wrote about accessing the BBC programming schedules</a> but was put off by the lack of time zone information in the iCalendar feeds, which prompted me to fix the quick-and-dirty script I have that generates <a href="http://reliablybroken.com/guide/">iCalendar files for the BBC</a>. (I wrote the first, time zone-blind version of my script in England&#8217;s Winter and it worked just perfick back then!)</p>

<p>So <a href="http://reliablybroken.com/guide/bbcguidetz.py">I fix it</a>. The updated iCalendar files have events with time zone information.</p>

<p>Everyone&#8217;s happy.</p>

<p>Jon Udell&#8217;s use of Python to explore data manipulation on the Web was one of the reasons I thought I really ought to get stuck into <a href="http://www.python.org/">Python</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/08/bbc-icalendar-schedules/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tuples to dicts, toot sweet!</title>
		<link>http://reliablybroken.com/b/2009/06/tuples-to-dicts-toot-sweet/</link>
		<comments>http://reliablybroken.com/b/2009/06/tuples-to-dicts-toot-sweet/#comments</comments>
		<pubDate>Fri, 19 Jun 2009 07:12:42 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[trac]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=201</guid>
		<description><![CDATA[Looking through Trac&#8217;s search internals I came across a chunk where a list of tuples is converted to a list of dictionaries for the convenience of the template engine. Each tuple has five fields: href, title, date, author and excerpt.

for idx, result in enumerate(results):
    results[idx] = {'href': result[0], 'title': result[1],
   [...]]]></description>
			<content:encoded><![CDATA[<p>Looking through <a href="http://trac.edgewall.org">Trac</a>&#8217;s search internals I came across <a href="http://trac.edgewall.org/browser/tags/trac-0.11.4/trac/search/web_ui.py#L111">a chunk where a list of tuples is converted to a list of dictionaries</a> for the convenience of the template engine. Each tuple has five fields: <em>href</em>, <em>title</em>, <em>date</em>, <em>author</em> and <em>excerpt</em>.</p>

<pre><code>for idx, result in enumerate(results):
    results[idx] = {'href': result[0], 'title': result[1],
                    'date': format_datetime(result[2]),
                    'author': result[3], 'excerpt': result[4]}
</code></pre>

<p>This allows the template author to use nice names for the fields in a row, like <code>${result.href}</code> etc. Looking at this reminded me of another approach that uses <a href="http://docs.python.org/tutorial/datastructures.html#list-comprehensions">list comprehension</a>, <a href="http://docs.python.org/library/functions.html#zip"><code>zip</code></a> and <a href="http://docs.python.org/library/stdtypes.html#typesmapping"><code>dict</code></a>.</p>

<pre><code>keys = ('href', 'title', 'date', 'author', 'excerpt')
results = [dict(zip(keys, row)) for row in results]
for row in results:
    row['date'] = format_datetime(row['date'])
</code></pre>

<p>The second line in this snippet is where the list of dictionaries is created, but one still has to go back and format the datetime values (the third and fourth lines). There&#8217;s no advantage in speed (the majority of the execution time is spent in <code>format_datetime</code>) but I like it a little better.</p>

<p>Maybe if Trac used the second approach I would like Trac a little better too.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/06/tuples-to-dicts-toot-sweet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>os.walk and UnicodeDecodeError</title>
		<link>http://reliablybroken.com/b/2009/06/oswalk-and-unicodedecodeerror/</link>
		<comments>http://reliablybroken.com/b/2009/06/oswalk-and-unicodedecodeerror/#comments</comments>
		<pubDate>Tue, 09 Jun 2009 15:32:57 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[unicode]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=175</guid>
		<description><![CDATA[My Python program was raising a UnicodeDecodeError when using os.walk to traverse a directory containing files and folders with UTF-8 encoded names. What had me baffled was the exact same program was working perfectly on the exact same hardware just minutes earlier.

Turns out the difference was between me starting my program as root from an [...]]]></description>
			<content:encoded><![CDATA[<p>My Python program was raising a <code>UnicodeDecodeError</code> when using <a href="http://docs.python.org/library/os.html#os.walk"><code>os.walk</code></a> to traverse a directory containing files and folders with UTF-8 encoded names. What had me baffled was the exact same program was working perfectly on the exact same hardware just minutes earlier.</p>

<p>Turns out the difference was between me starting my program as root from an interactive bash shell, versus the program getting started as part of the boot sequence by init (on a <a href="http://www.debian.org/">Debian Lenny</a> system). When started interactively, the locale was set to en_GB.UTF-8 and so names on the filesystem were assumed to be UTF-8 encoded. When started by init, the locale was set to ASCII.</p>

<p>The fix, as described in this article <a href="http://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/"><em>Python: how is sys.stdout.encoding chosen?</em></a>, was to wrap my program in a script that set the LC_CTYPE environment variable.</p>

<pre><code>#!/bin/sh
export LC_CTYPE='en_GB.UTF-8'
/path/to/program.py
</code></pre>
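
<p>When chasing this kind of difference between an interactive shell and init, it can help to log what Python thinks the encodings are. A small sketch for Python 3 (the <code>describe_encodings</code> name is made up for the example):</p>

```python
import locale
import os
import sys


def describe_encodings():
    """Return the settings that decide how file names are decoded."""
    return {
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
    }


for key, value in sorted(describe_encodings().items()):
    print("%-10s %r" % (key, value))
```

<p>Running it once from the shell and once from the boot script should show whether the two environments disagree.</p>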
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/06/oswalk-and-unicodedecodeerror/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python features for a PHP refugee</title>
		<link>http://reliablybroken.com/b/2009/04/python-features-for-a-php-refugee/</link>
		<comments>http://reliablybroken.com/b/2009/04/python-features-for-a-php-refugee/#comments</comments>
		<pubDate>Fri, 10 Apr 2009 22:11:21 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Ben Dodd]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=130</guid>
		<description><![CDATA[These are things that particularly impressed me
when I decided I had had enough of PHP and I really ought to look at
the crazy white-space language called Python that was used by
Plone, Trac and Django.

The Zen of Python states most of this in 19 lines, for all you
tl;dr types.

Name-spaces and a minimal set of built-ins

I like [...]]]></description>
			<content:encoded><![CDATA[<p>These are things that particularly impressed me
when I decided I had had enough of <a href="http://www.php.net/">PHP</a> and I really ought to look at
<a href="http://www.python.org/">the crazy white-space language called Python</a> that was used by
<a href="http://plone.org/">Plone</a>, <a href="http://trac.edgewall.org/">Trac</a> and <a href="http://www.djangoproject.com/">Django</a>.</p>

<p>The <a href="http://www.python.org/dev/peps/pep-0020/">Zen of Python</a> states most of this in 19 lines, for all you
<em>tl;dr</em> types.</p>

<h2>Name-spaces and a minimal set of built-ins</h2>

<p>I like that the set of keywords is small, and that the set of built-in
functions is not much larger. This leaves you with an unpolluted name-space (and if you
enjoy confusing people you can always override the built-ins).</p>

<h2>Explicit versus implicit</h2>

<p>Related to name-spaces is the notion that Python is explicit: there is very
little magic in a Python script. Perhaps the closest thing to magic are the
various special methods that define a behaviour, for example the <code>__getattr__</code>
/ <code>__setattr__</code> / <code>__delattr__</code> methods on a class to control attribute access.
But even then Python makes it obvious those methods have special meaning by
using a double-underscore for the method names.</p>

<p>See <a href="http://docs.python.org/reference/datamodel.html#special-method-names">the page on the data model</a> in the Python documentation for
a description of these methods and their purpose.</p>

<h2>Generators and list comprehensions</h2>

<p>I never realized how much I missed these until I went back to PHP for a small
web project. So much of my code seems to be looping through lists and
accumulating a result or applying a function to each member of the list. I
don&#8217;t think the syntax is particularly obvious, but then I can&#8217;t think of a
better way to do it. At first glance generators looked to be the same as
list comprehensions, but eventually I began to understand the difference
between needing a finite list of objects (<a href="http://docs.python.org/tutorial/datastructures.html#list-comprehensions">list comprehension</a>)
and consuming a sequence of objects (<a href="http://docs.python.org/tutorial/classes.html#generators">generators</a>).</p>
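
<p>A short sketch of that difference, assuming Python 3 (the <code>squares</code> function is just an example):</p>

```python
def squares(limit):
    """Generator: yields one square at a time, computed on demand."""
    for n in range(limit):
        yield n * n


# List comprehension: every value is built and held in memory at once.
as_list = [n * n for n in range(5)]

# Generator: nothing is computed until something consumes it.
lazy = squares(5)
first = next(lazy)          # computes only the first value
rest = list(lazy)           # consumes the remainder
leftover = list(lazy)       # a generator can only be consumed once

print(as_list, first, rest, leftover)
```

<p>The list can be indexed and walked as often as you like. The generator gives up each value once and is then empty.</p>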

<h2>Named arguments for methods and functions</h2>

<p>Gosh, not having named arguments is painful. As a consumer, named arguments
allow one to forget a function&#8217;s precise argument signature, and as a
designer it allows one to provide sensible defaults and flexibility.</p>
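
<p>For example, with a hypothetical <code>connect</code> function (invented here purely to show defaults and keywords):</p>

```python
def connect(host, port=5432, timeout=10.0, use_ssl=True):
    """Hypothetical function, used only to show defaults and keywords."""
    scheme = "ssl" if use_ssl else "tcp"
    return "%s://%s:%d (timeout %.1fs)" % (scheme, host, port, timeout)


# The caller names only what differs from the defaults, in any order.
default = connect("db.example.com")
custom = connect("db.example.com", use_ssl=False, timeout=2.5)
print(default)
print(custom)
```

<p>The second call skips <code>port</code> entirely and never has to remember where <code>timeout</code> sits in the signature.</p>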

<h2>Dates and times as a native type</h2>

<p>Well, not <em>native</em>, but readily available.</p>

<p><a href="http://docs.python.org/library/datetime.html">Python&#8217;s <code>datetime</code> module</a> provides representations of calendar
dates and times and a bunch of obvious behaviour for comparing two moments.
PHP 5 introduced a proper DateTime class, but I had jumped ship a while
before then &#8211; my affection for Python&#8217;s date handling is born of a time
when one had to rely on PEAR for useful date functions. Converting everything
to seconds since the Unix epoch was never fun.</p>

<p>The greatest annoyance in Python&#8217;s date implementation is its shrugging
support for timezones &#8211; you nearly always need to resort to a third-party
module (<a href="http://pytz.sourceforge.net/">such as pytz</a> or <a href="http://labix.org/python-dateutil">python-dateutil</a>) to handle
timezones without jeopardizing one&#8217;s sanity.</p>

<h2>Batteries included</h2>

<p>It is odd that one <em>does</em> need an additional module to handle timezones
seeing as the Python standard library includes so many useful modules for
common tasks.</p>

<p>Need to work with <a href="http://docs.python.org/library/csv.html">CSV files</a>? Or <a href="http://docs.python.org/library/getopt.html">command-line arguments</a>?
Or <a href="http://docs.python.org/library/plistlib.html">Mac OS X-style .plist</a> files? Or configuration files in
<a href="http://docs.python.org/library/configparser.html">INI format</a>? Or <a href="http://docs.python.org/library/tarfile.html">tar archives</a> (with gzip or bzip2 compression)?</p>

<p>Oh golly so much tedious work has been done for you in the Python standard
library. I suppose this reflects PHP&#8217;s emphasis as a scripting language for
the Web versus Python&#8217;s use as a general purpose language, but I am very
grateful for that distinction.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/04/python-features-for-a-php-refugee/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Q and operator.or_</title>
		<link>http://reliablybroken.com/b/2009/04/q-and-operatoror_/</link>
		<comments>http://reliablybroken.com/b/2009/04/q-and-operatoror_/#comments</comments>
		<pubDate>Thu, 09 Apr 2009 18:56:12 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=123</guid>
		<description><![CDATA[I&#8217;ve finally settled on a nice syntax for OR-ing Django Q objects.

For a simple site search feature I needed to search for a term across several
fields in a model. Suppose the model looks like this:

class BlogPost(models.Model):
    title = models.CharField(max_length=100)
    body = models.TextField()
    summary = models.TextField()


And you [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve finally settled on a nice syntax for <code>OR</code>-ing <a href="http://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects">Django Q objects</a>.</p>

<p>For a simple site search feature I needed to search for a term across several
fields in a model. Suppose the model looks like this:</p>

<pre><code>class BlogPost(models.Model):
    title = models.CharField(max_length=100)
    body = models.TextField()
    summary = models.TextField()
</code></pre>

<p>And you have a view method that accepts a parameter <code>q</code> for searching across
the <code>title</code>, <code>body</code> and <code>summary</code> fields. I want to find objects that contain
the <code>q</code> phrase in any of those fields. I need to build a <a href="http://docs.djangoproject.com/en/dev/ref/models/querysets/"><code>QuerySet</code></a>
with a filter that is the equivalent of</p>

<pre><code>queryset = BlogPost.objects.filter(
    Q(title__icontains=q) | Q(body__icontains=q) | Q(summary__icontains=q)
)
</code></pre>

<p>That&#8217;s not too much of a hassle for this simple example, but in cases where
the fields you are searching are chosen dynamically, or where you just have
an awful lot of fields to search against, I think it is nicer to do it like so:</p>

<pre><code>import operator

from django.db.models import Q

search_fields = ('title', 'body', 'summary')
q_objects = [Q(**{field + '__icontains':q}) for field in search_fields]
queryset = BlogPost.objects.filter(reduce(operator.or_, q_objects))
</code></pre>

<p>Nice one! The list comprehension gives me a list of <code>Q</code> objects generated from
the names in <code>search_fields</code>, so it is easy to change the fields to be searched.
And using <a href="http://docs.python.org/library/functions.html#reduce"><code>reduce</code></a> and <a href="http://docs.python.org/library/operator.html"><code>operator.or_</code></a> gives me the
required <code>OR</code> filter in one line.</p>

<p>I see for Python 3 <code>reduce</code> has been <a href="http://docs.python.org/library/functools.html">moved to the <code>functools</code> module</a>.</p>
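
<p>To see what the <code>reduce</code> call does without needing a Django project, here is a sketch for Python 3. The <code>Cond</code> class is a hypothetical stand-in for <code>Q</code>: it supports <code>|</code> and simply records how the pieces were combined.</p>

```python
import operator
from functools import reduce  # a builtin in Python 2, in functools in Python 3


class Cond(object):
    """Hypothetical stand-in for Django's Q: supports | and shows the result."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return Cond("(%s OR %s)" % (self.text, other.text))


search_fields = ("title", "body", "summary")
conditions = [Cond(field + "__icontains") for field in search_fields]

# reduce folds the list pairwise: ((a | b) | c)
combined = reduce(operator.or_, conditions)
print(combined.text)
```

<p>One thing to watch: <code>reduce</code> raises a <code>TypeError</code> on an empty list when no initial value is given, so an empty <code>search_fields</code> needs handling separately.</p>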

<p>This stuff never used to be that obvious to me. It kind of isn&#8217;t even now.</p>

<p>P.S. I promise I am not writing a blog engine at this time, it was just for
the example.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/04/q-and-operatoror_/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using an object for Django&#8217;s ChoiceField choices</title>
		<link>http://reliablybroken.com/b/2009/03/using-an-object-for-djangos-choicefield-choices/</link>
		<comments>http://reliablybroken.com/b/2009/03/using-an-object-for-djangos-choicefield-choices/#comments</comments>
		<pubDate>Sun, 29 Mar 2009 11:07:32 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=109</guid>
		<description><![CDATA[I had another thought about per-instance choices for forms.ChoiceField.
Instead of overriding the __init__ method of your form class, you could use
an object with an __iter__ method that returns a fresh iterable each time
it is called.

from django import forms


class LetterChoices(object):
    """Return a random list of max_choices letters of the alphabet."""
    [...]]]></description>
			<content:encoded><![CDATA[<p>I had another thought about <a href="http://reliablybroken.com/b/2009/03/per-instance-choices-for-djangos-formschoicefield/">per-instance choices for <code>forms.ChoiceField</code></a>.
Instead of overriding the <code>__init__</code> method of your form class, you could use
<a href="http://python.org/doc/current/library/stdtypes.html#typeiter">an object with an <code>__iter__</code> method</a> that returns a fresh iterable each time
it is called.</p>

<pre><code>import random
import string

from django import forms


class LetterChoices(object):
    """Return a random list of max_choices letters of the alphabet."""
    def __init__(self, max_choices=3):
        self.max_choices = max_choices

    def __iter__(self):
        # A generator expression is already an iterator, so iter() is not needed.
        letters = random.sample(string.ascii_uppercase, self.max_choices)
        return ((letter, letter) for letter in letters)


class LetterForm(forms.Form):
    """Pick a letter from a small, random set."""
    letter = forms.ChoiceField(choices=LetterChoices())
</code></pre>

<p>I don&#8217;t know if I prefer that style to having a simple function. Having to
instantiate the class seems wrong to me; I&#8217;d much rather be able to pass any
callable as the <code>choices</code> argument.</p>
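
<p>To make the difference concrete, here is a Django-free sketch of the Python side of things. <code>letter_choices</code> is a hypothetical function version I made up for comparison. The sketch only shows how iteration behaves; whether a given Django version actually iterates the object again for each form is something to check for yourself.</p>

```python
import random
import string


class LetterChoices(object):
    """Re-iterable: every call to __iter__ draws a new random sample."""

    def __init__(self, max_choices=3):
        self.max_choices = max_choices

    def __iter__(self):
        letters = random.sample(string.ascii_uppercase, self.max_choices)
        return ((letter, letter) for letter in letters)


def letter_choices(max_choices=3):
    """Hypothetical function version: returns a one-shot generator."""
    letters = random.sample(string.ascii_uppercase, max_choices)
    return ((letter, letter) for letter in letters)


reusable = LetterChoices()
first = list(reusable)     # list() calls __iter__ and gets a new sample
second = list(reusable)    # and again, independently of the first pass

one_shot = letter_choices()
consumed = list(one_shot)  # the generator is used up here
leftover = list(one_shot)  # so a second pass yields nothing

print(first, second)
print(consumed, leftover)
```

<p>The object can be iterated as many times as you like, while the result of calling the function is exhausted after one pass, which is why a plain generator is no good as a reusable set of choices.</p>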
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/03/using-an-object-for-djangos-choicefield-choices/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Django test database runner as a context manager</title>
		<link>http://reliablybroken.com/b/2009/03/django-test-database-runner-as-a-context-manager/</link>
		<comments>http://reliablybroken.com/b/2009/03/django-test-database-runner-as-a-context-manager/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 00:07:45 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[with]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=98</guid>
		<description><![CDATA[In my last post I mentioned it might be an idea to wrap up the Django test
database setup / teardown in a context manager for use with Python&#8217;s
with statement. Here&#8217;s my first stab, which seems to work.

from contextlib import contextmanager


@contextmanager
def test_db_connection():
    """A context manager for Django's test runner.

    For [...]]]></description>
			<content:encoded><![CDATA[<p>In my last post I mentioned it might be an idea to <a href="http://reliablybroken.com/b/2009/03/creating-a-django-test-database-for-unit-testing/">wrap up the Django test
database setup / teardown in a context manager</a> for use with <a href="http://docs.python.org/reference/datamodel.html#context-managers">Python&#8217;s
<code>with</code> statement</a>. Here&#8217;s my first stab, which seems to work.</p>

<pre><code>from contextlib import contextmanager


@contextmanager
def test_db_connection():
    """A context manager for Django's test runner.

    For Python 2.5 you will need
        from __future__ import with_statement
    """

    from django.conf import settings
    from django.test.utils import setup_test_environment, teardown_test_environment
    from django.db import connection

    setup_test_environment()

    settings.DEBUG = False
    verbosity = 0
    interactive = False

    old_name = settings.DATABASE_NAME
    connection.creation.create_test_db(verbosity, autoclobber=not interactive)

    try:
        yield connection
    finally:
        # Always drop the test database, even if the with block raises.
        connection.creation.destroy_test_db(old_name, verbosity)
        teardown_test_environment()
</code></pre>

<p>All of this requires Python 2.5 or later.</p>

<p>So with that snippet you could write a test something like so:</p>

<pre><code>import unittest


class MyTestCase(unittest.TestCase):
    def test_myModelTest(self):
        with test_db_connection():
            from myproject.myapp.models import MyModel

            obj = MyModel()
            obj.save()
            self.assertTrue(obj.pk)
</code></pre>

<p>&#8230; and just as with Django&#8217;s <code>manage.py test</code> command the objects would be
created within the test database then destroyed when the
<code>with test_db_connection()</code> block is finished.</p>
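
<p>One thing to watch with <code>@contextmanager</code>: the code after the <code>yield</code> only runs on an exception if the <code>yield</code> sits inside <code>try</code> / <code>finally</code>. Here is a small sketch with no Django involved, where <code>fake_test_db</code> is a made-up stand-in that records what happens when the body of the <code>with</code> block fails.</p>

```python
from contextlib import contextmanager

events = []


@contextmanager
def fake_test_db():
    """Hypothetical stand-in for a test database: records setup and teardown."""
    events.append('create')
    try:
        yield 'connection'
    finally:
        # Runs whether the with block finishes normally or raises.
        events.append('destroy')


try:
    with fake_test_db() as conn:
        events.append('using ' + conn)
        raise ValueError('a failing test')
except ValueError:
    events.append('error seen by caller')

print(events)
```

<p>The teardown step is recorded before the caller sees the error, so a failing test cannot leave the test database lying around.</p>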

<p>Everything&#8217;s going to be hunky dory.</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/03/django-test-database-runner-as-a-context-manager/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating a Django test database for unit testing</title>
		<link>http://reliablybroken.com/b/2009/03/creating-a-django-test-database-for-unit-testing/</link>
		<comments>http://reliablybroken.com/b/2009/03/creating-a-django-test-database-for-unit-testing/#comments</comments>
		<pubDate>Fri, 27 Mar 2009 11:02:02 +0000</pubDate>
		<dc:creator>david</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://reliablybroken.com/b/?p=94</guid>
		<description><![CDATA[I needed to run tests involving a Django application but without using the
manage.py test management command. So I need my own test suite that
sets up the test database and drops it after, leaving my real database untouched.

As of Django 1.0.2 the default behaviour for the test runner is the run_tests
function in django.test.simple. Here is the [...]]]></description>
			<content:encoded><![CDATA[<p>I needed to run tests involving a Django application but without using the
<code>manage.py test</code> management command. So I needed my own test suite that
sets up the test database and drops it after, leaving my real database untouched.</p>

<p>As of Django 1.0.2 the default behaviour for the test runner is the <a href="http://code.djangoproject.com/browser/django/tags/releases/1.0.2/django/test/simple.py#L102"><code>run_tests</code>
function in <code>django.test.simple</code></a>. Here are the bones of that function
with the required setup and teardown calls.</p>

<pre><code>from django.conf import settings
from django.test.utils import setup_test_environment, teardown_test_environment


verbosity = 1
interactive = True

setup_test_environment()
settings.DEBUG = False
old_name = settings.DATABASE_NAME

from django.db import connection
connection.creation.create_test_db(verbosity, autoclobber=not interactive)

# Here you run tests using the test database and with mock SMTP objects

connection.creation.destroy_test_db(old_name, verbosity)
teardown_test_environment()
</code></pre>

<p>Hmmm&#8230; Wouldn&#8217;t this be a good candidate to be wrapped up for use with
<a href="http://docs.python.org/reference/datamodel.html#context-managers">Python 2.5&#8217;s <code>with</code> statement</a>?</p>
]]></content:encoded>
			<wfw:commentRss>http://reliablybroken.com/b/2009/03/creating-a-django-test-database-for-unit-testing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
