Archive

Posts Tagged ‘python’

Django AdminForm objects and templates

April 4th, 2010 david 1 comment

I can’t find documentation for the context of a Django admin template. In particular, where is the form and how does one access the fields? This post describes the template context for a generic admin model for Django 1.1.

Django uses an instance of ModelAdmin (defined in django.contrib.admin.options) to handle the request for a model object add / change view in the admin site. ModelAdmin.add_view and ModelAdmin.change_view are responsible for populating the template context when rendering the add object and change object pages respectively.

Here are the keys common to add and change views:

  • title, ‘Add ’ or ‘Change ’ + your model class’s _meta.verbose_name
  • adminform is an instance of AdminForm
  • is_popup, a boolean which is true when _popup is passed as a request parameter
  • media is an instance of django.forms.Media
  • inline_admin_formsets is a list of InlineAdminFormSet objects
  • errors is an instance of AdminErrorList
  • root_path is the root_path attribute of the AdminSite object
  • app_label is your model class’ _meta.app_label attribute

Django renders a form in the admin view by iterating over the adminform instance and then iterating over each Fieldset, which in turn yields AdminField instances. All I want to do is lay out the form fields myself, ignoring the fieldset groupings which may or may not be defined in the model’s ModelAdmin.fieldsets attribute.

This turns out to be easy once you know how. The regular form is an attribute of the adminform object. So if your model has a field named “king_of_pop” you can refer to the form field in your template like so:

{{ adminform.form.king_of_pop.label_tag }}: {{ adminform.form.king_of_pop }}

Or if you want to save your finger tips you can use the with template tag:

{% with adminform.form as f %}
{{ f.king_of_pop.label_tag }}: {{ f.king_of_pop }}
{% endwith %}

Delving through the Django source while I tried to understand all of this I was struck by how Python defines hook functions for iteration and accessing attributes. Half of Python’s attraction is in how easy it is from the program author’s point of view to treat objects as built-in types like lists, dicts, etc.; the other half is the responsibility of the author of a Python module to encourage that same ease of use by implementing the related iteration protocols. It is harder to write a good Python module than it is to write a good Python program that uses a good module.
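As a minimal sketch of those hooks, here is a pair of made-up classes (FieldGroup and GroupedForm are hypothetical names, not Django's) that support for-loops through __iter__ and subscript access through __getitem__:

```python
class FieldGroup(object):
    """A hypothetical stand-in for a fieldset: a named group of field names."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = list(fields)

    def __iter__(self):
        # Lets callers write "for field in group".
        return iter(self.fields)

    def __len__(self):
        return len(self.fields)


class GroupedForm(object):
    """A hypothetical form wrapper that yields FieldGroup objects."""

    def __init__(self, groups):
        self.groups = [FieldGroup(name, fields) for name, fields in groups]

    def __iter__(self):
        # Iterating the wrapper yields each group in turn.
        for group in self.groups:
            yield group

    def __getitem__(self, name):
        # Subscript access looks a group up by its name.
        for group in self.groups:
            if group.name == name:
                return group
        raise KeyError(name)


form = GroupedForm([
    ("Personal", ["first_name", "second_name"]),
    ("Contact", ["telephone", "email"]),
])

# Flatten the groups into one list of field names, ignoring the grouping.
all_fields = [field for group in form for field in group]
```

The caller never needs to know how the groups are stored; the two hook methods are the whole public surface.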


Using MacPorts behind a firewall

March 31st, 2010 david 4 comments

I failed to persuade MySQLdb to build on a Mac OS X Server 10.5.8 install using the system Python + MySQL installation. So I turned to MacPorts where I know I can get Django + all the bits working without much hassle (but with much patience).

The next problem was that MacPorts couldn’t update because rsync was blocked by the corporate access policy. Fortunately plain HTTP is permitted outbound. Here’s how to use a local ports tree.

Install MacPorts using the disk image for 10.5.

curl -O http://distfiles.macports.org/MacPorts/MacPorts-1.8.2-10.5-Leopard.dmg
hdiutil attach MacPorts-1.8.2-10.5-Leopard.dmg
sudo installer -pkg /Volumes/MacPorts-1.8.2/MacPorts-1.8.2.pkg -target /
hdiutil detach /Volumes/MacPorts-1.8.2

If the MacPorts install directories are not in your PATH environment variable, you can add them to your .profile. This change will not take effect until you start a new terminal session. The quoted 'EOF' stops the shell from expanding ${PATH} before the lines are written to the file.

cat >> ~/.profile <<'EOF'
export PATH=/opt/local/bin:/opt/local/sbin:${PATH}
export MANPATH=/opt/local/share/man:${MANPATH}
EOF

After you have installed MacPorts, create a directory for the ports tree and check it out using Subversion.

sudo mkdir -p /opt/local/var/macports/sources/svn.macports.org/trunk/dports
cd /opt/local/var/macports/sources/svn.macports.org/trunk/dports
sudo svn co http://svn.macports.org/repository/macports/trunk/dports/ .

N.B. In the last line beginning svn co ... the trailing directory separator is significant!

Now tell MacPorts to use the local checkout rather than rsync. Edit /opt/local/etc/macports/sources.conf and add a new line to the end with the path to the ports tree, then comment out the previous line that uses rsync. Here are the last lines from my configuration:

#rsync://rsync.macports.org/release/ports/ [default]
file:///opt/local/var/macports/sources/svn.macports.org/trunk/dports/ [default]

Finally you must create an index for the tree (otherwise you will see messages saying “Warning: No index(es) found!”).

cd /opt/local/var/macports/sources/svn.macports.org/trunk/dports
sudo portindex

Now go do great things.


ModelForms good for importing too

January 26th, 2010 david No comments

If you have exported data from one database in plain text format and you want to import it to Django, you should use a ModelForm class to do a lot of the heavy lifting for you.

A suitable ModelForm for your Django model will consume each row and do the conversion of each field to an appropriate Python type. Much simpler than explicitly converting each value yourself before creating a new model instance.

Suppose you have a model for an address book entry and its associated ModelForm (this works for Django 1.1):

# myapp/models.py
from django.db import models
from django import forms

class Contact(models.Model):
    first_name = models.CharField(max_length=100)
    second_name = models.CharField(max_length=100)
    telephone = models.CharField(max_length=50, blank=True)
    email = models.EmailField(blank=True)

class ContactForm(forms.ModelForm):
    class Meta:
        model = Contact

Here’s a script to run through a comma-separated list of contacts where each line looks something like “Smits, Jimmy, jimmy@example.com, 555-1234”:

from myapp.models import ContactForm

# Map columns to fields, adjusting the order as necessary
column_map = (
    'second_name',
    'first_name',
    'email',
    'telephone',
)

for line in open('comma-separated-data.txt'):
    row = dict(zip(column_map, (field.strip() for field in line.split(','))))
    form_obj = ContactForm(row)
    try:
        form_obj.save()
    except ValueError:
        for k, v in form_obj.errors.items():
            print k, row[k], ', '.join(map(unicode, v))

If a line doesn’t validate the script prints the validation errors and moves to the next line. If your data has columns you want to ignore then just name them in the column_map – the form class will ignore extra keys in the dictionary.
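Here is a sketch of just the column-mapping step, using the standard csv module instead of str.split so that quoted values containing commas survive. The helper name rows_from_lines and the second sample line are made up for illustration, and nothing here needs Django to run:

```python
import csv

COLUMN_MAP = ('second_name', 'first_name', 'email', 'telephone')


def rows_from_lines(lines, column_map=COLUMN_MAP):
    """Yield one dict per input line, keyed by the names in column_map."""
    # skipinitialspace drops the blank that follows each comma.
    for values in csv.reader(lines, skipinitialspace=True):
        if not values:
            continue  # skip blank lines
        yield dict(zip(column_map, (value.strip() for value in values)))


sample = [
    'Smits, Jimmy, jimmy@example.com, 555-1234',
    '"Davis, Jr.", Sammy, sammy@example.com, 555-9876',
]
rows = list(rows_from_lines(sample))
```

Each dict can then be handed to the form class in the same way as row in the script above.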


Notes on Radmind’s checksum

January 1st, 2010 david No comments

It would be nice to do a pure-Python implementation of Radmind’s fsdiff output for watchedinstall, which consists of several white-space separated fields describing the filename’s attributes and an optional checksum for the file.

These are notes on how Radmind generates checksums for files on Mac OS X.

The fsdiff format is documented; however, for files with Mac Finder info or a resource fork the checksum is for an AppleSingle-encoded representation of the file, which means a Python implementation needs to produce an equivalent AppleSingle-encoded byte stream for the file. Bummer.

Python 2.6 on Mac OS X includes a (deprecated) applesingle module that can read the format but cannot write it (and the module has been removed for Python 3). Therefore a pure Python implementation of Radmind’s checksum has to implement a compatible AppleSingle encoding routine too.

Radmind’s fsdiff command is written in C, which I can just about get the gist of, but I am missing something because my attempts at emulating Radmind’s checksums are wrong.

The meat of Radmind’s checksum is the do_acksum() function in cksum.c. The algorithm appears to be as follows:

  1. Initialize a digest using the default cipher (MD5 I think).
  2. Add the AppleSingle header, consisting of a magic number and version number and some padding.
  3. Add the AppleSingle entry table, which has 3 entries for the Finder info, the resource fork info and the data fork info (in that order). Each entry is 12 bytes – an unsigned long for the entry type, an unsigned long for an offset into the file where the data will start and an unsigned long for the data length.
  4. Add the Finder info data.
  5. Add the resource fork data.
  6. Add the data fork data.
  7. Return a base64 encoded version of the final digest.

Because the entry table in the AppleSingle header specifies data offsets and lengths you need to know the size of the Finder info data (always 32 bytes) and the size of the resource fork and the size of the data fork before you pass that data to the digest function.

So a working Python implementation needs to know the size of the resource fork and data fork before feeding that same data to the digest. It seems to me that this requirement might imply huge memory allocations while slurping file data – my wrong attempt tried counting bytes and later feeding the same data to the digest in manageable chunks.
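One way around the memory worry is to ask the filesystem for the fork sizes first (for example with os.path.getsize for the data fork), build the header from those numbers alone, and only then stream the data through the digest in chunks. The sketch below follows the steps listed above and the AppleSingle layout as I understand it (a 26-byte header followed by 12-byte entries, with entry IDs 9, 2 and 1 for Finder info, resource fork and data fork). The function names are hypothetical and the output has not been checked against Radmind, so treat it as a starting point rather than a compatible implementation:

```python
import base64
import hashlib
import struct

AS_MAGIC = 0x00051600      # AppleSingle magic number
AS_VERSION = 0x00020000    # AppleSingle version 2
ENTRY_FINDER_INFO = 9      # AppleSingle entry IDs
ENTRY_RESOURCE_FORK = 2
ENTRY_DATA_FORK = 1
FINDER_INFO_SIZE = 32


def applesingle_header(rsrc_size, data_size):
    """Build the header and 3-entry table from the fork sizes alone."""
    # Magic, version, 16 filler bytes, then the number of entries.
    header = struct.pack('>II16xH', AS_MAGIC, AS_VERSION, 3)
    offset = len(header) + 3 * 12  # the data starts after the entry table
    entries = b''
    for entry_id, size in ((ENTRY_FINDER_INFO, FINDER_INFO_SIZE),
                           (ENTRY_RESOURCE_FORK, rsrc_size),
                           (ENTRY_DATA_FORK, data_size)):
        entries += struct.pack('>III', entry_id, offset, size)
        offset += size
    return header + entries


def applesingle_digest(finder_info, rsrc_size, rsrc_chunks,
                       data_size, data_chunks, algorithm='sha1'):
    """Return a base64 digest of the AppleSingle-style byte stream."""
    digest = hashlib.new(algorithm)
    digest.update(applesingle_header(rsrc_size, data_size))
    digest.update(finder_info)
    for chunk in rsrc_chunks:   # e.g. blocks from repeated file.read(8192)
        digest.update(chunk)
    for chunk in data_chunks:
        digest.update(chunk)
    return base64.b64encode(digest.digest()).decode('ascii')
```

Because the chunks can be any iterable, each fork is read once and never held in memory whole.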

Anyway…

Advice much appreciated. The workaround is to leave it to fsdiff to generate the checksum and parse the value from the output.

David

P.S. I still intend running A/UX 3.0.1 on my Centris 660av one day.

Update: using my eyes and brains and the fsdiff -V command I was able to read the fsdiff man page and deduce the preferred checksum cipher is actually sha1. My code is still wrong.

Context managers

December 20th, 2009 david No comments

I was re-writing the excellent watchedinstall tool and needed to simplify a particularly gnarly chunk of code that required three sub-processes to be started and then killed after invoking another process. It occurred to me I could make these into context managers.

Previously the code was something like…

start(program1)
try:
    start(program2)
except:
    stop(program1)
    raise

try:
    start(program3)
except:
    stop(program2)
    stop(program1)
    raise

try:
    mainprogram()
finally:
    stop(program3)
    stop(program2)
    stop(program1)

Of course that could have been written with nested try / except / else / finally blocks as well, which is how I started, but the result was not much shorter and almost incomprehensible.

With context managers the whole thing was written as…

# Python 2.5 needs: from __future__ import with_statement

with start(program1):
    with start(program2):
        with start(program3):
            mainprogram()

So much more comprehensible! Here’s the implementation of the context manager (using the contextlib.contextmanager decorator for a triple word score):

import contextlib
import os
import signal
import subprocess


@contextlib.contextmanager
def start(program_args):
    prog = subprocess.Popen(program_args)
    if prog.poll(): # Anything other than None or 0 is BAD
        raise subprocess.CalledProcessError(prog.returncode, program_args[0])

    try:
        yield
    finally:
        if prog.poll() is None:
            os.kill(prog.pid, signal.SIGTERM)

For bonus points I might have used contextlib.nested() to put the three start() calls on one line but then what would I do for the rest of the day?
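On Python 2.7 and 3.1 or later the one-line form needs no helper, because a single with statement accepts several context managers (contextlib.nested() was deprecated in 2.7 and is absent from current Python 3). The sketch below swaps the subprocess-based start() for a hypothetical stand-in that only records events, to show that the managers are entered left to right and torn down in reverse even when the main program fails:

```python
import contextlib

events = []


@contextlib.contextmanager
def start(name):
    """Hypothetical stand-in for the subprocess-based start() above."""
    events.append('start ' + name)
    try:
        yield name
    finally:
        events.append('stop ' + name)


def mainprogram():
    events.append('main')
    raise RuntimeError('main program failed')


try:
    # One with statement, three managers, entered left to right.
    with start('program1'), start('program2'), start('program3'):
        mainprogram()
except RuntimeError:
    events.append('error propagated')
```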


I am very bad at writing tests

November 22nd, 2009 david No comments

… but I think I might be getting a little better.

At least these days when I am writing some script (almost certainly in Python) I start out by intending to write tests. I usually fail because I haven’t learnt to think in terms of writing code that can be easily tested.

Mark Pilgrim’s Dive Into Python has great stuff on how to approach a problem by defining the tests first and gradually filling in the code that satisfies the test suite. One day I may be able to work like that; until then I work by writing a concise docstring, then stubbing out the function. Once the function is in a state where it might actually return a meaningful result I can play with it in the Python interpreter and start adding useful doctests to the docstring.

What really helps is to break the logic out into tiny pieces where ideally each piece returns the result of transforming the input (which I think is known as a functional approach). By doing this I can have tests for most of the code, and the functions with a lot of conditional logic, which are harder to write tests for, will at least rely on sub-routines that are themselves well tested.
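A tiny example of the kind of piece that is easy to test this way: a pure function whose docstring carries its own doctests. The function normalise_name is made up for illustration:

```python
import doctest


def normalise_name(raw):
    """Return a name with extra white space removed and words capitalised.

    >>> normalise_name('  jimmy   smits ')
    'Jimmy Smits'
    >>> normalise_name('')
    ''
    """
    return ' '.join(word.capitalize() for word in raw.split())


# Run the examples in the docstring; nothing is printed unless one fails.
doctest.run_docstring_examples(
    normalise_name, {'normalise_name': normalise_name}, name='normalise_name')
```

For a whole file, running python -m doctest on it checks every docstring in one go.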

I can dream.


Crazy Acrobat installers love Python

November 9th, 2009 david No comments

Looking through the updaters for Adobe Acrobat 9 for Mac I came across a bunch of scripts written in Python. My favourite was called FindAndKill.py:

#!/usr/bin/python
"""
    Search for and kill app. 
"""
import os, sys
import commands
import signal


def main():
    if len(sys.argv) != 2:
        print 'Missing or too many arguments.'
        print 'One argument and only one argument is required.'
        print 'Pass in the app name to find and kill (i.e. "Safari").'
        return 0

    psCmd = '/bin/ps -x -c | grep ' + sys.argv[1]
    st, output = commands.getstatusoutput( psCmd )

    if st == 0:
        appsToKill = output.split('\n')
        for app in appsToKill:
            parts = app.split()
            killCmd = 'kill -s 15 ' + parts[0]
            #print killCmd
            os.system( killCmd )

if __name__ == "__main__":
    main()

(You can download the Acrobat 9.1.3 update and find this script at Acrobat 9 Pro Patch.app/Contents/Resources/FindAndKill.py.)

Was the author not aware of the killall command for sending a kill signal to a named process? The killall man page says it appeared in FreeBSD 2.1, which was released in November 1995. Adobe CS4 was released about 14 years later. How is it Adobe’s product managers approve these things for release?

What is particularly galling about Adobe’s Acrobat 9 updaters is that they seem to re-implement so much of what the Apple installer application does, even down to their use of gzipped cpio archives for the payload.

Migrating a Filemaker database to Django

November 7th, 2009 david 2 comments

At work we have several Filemaker Pro databases. I have been slowly working through these, converting them to Web-based applications using the Django framework. My primary motive is to replace an overly-complicated Filemaker setup running on four Macs with a single 2U rack-mounted server running Apache on FreeBSD.

At some point in the process of re-writing each database for use with Django I have needed to convert all the records from Filemaker to Django. There exist good Python libraries for talking to Filemaker but they rely on the XML Web interface, meaning that you need Filemaker running and set to publish the database on the Web while you are running an import.

In my experience Filemaker’s built-in XML publishing interface is too slow when you want to migrate tens of thousands of records. During development of a Django-based application I find I frequently need to re-import the records as the new database schema evolves – doing this by communicating with Filemaker is tedious when you want to re-import the data several times a day.

So my approach has been to export the data from Filemaker as XML using Filemaker’s FMPXMLRESULT format. The Filemaker databases at work are old (Filemaker 5.5) and perhaps things have improved in more recent versions but Filemaker 5/6 is a very poor XML citizen. When using the FMPDSORESULT format (which has been dropped from more recent versions) it will happily generate invalid XML all over the shop. The FMPXMLRESULT format is better but even then it will emit invalid XML if the original data happens to contain funky characters.

So here is filemaker.py, a Python module for parsing an XML file produced by exporting to FMPXMLRESULT format from Filemaker.

To use it you create a sub-class of the FMPImporter class and over-ride the FMPImporter.import_node method. This method is called for each row of data in the XML file and is passed an XML node instance for the row. You can convert that node to a more useful dictionary where keys are column names and values are the column values. You would then convert the data to your Django model object and save it.

A trivial example:

import filemaker

class MyImporter(filemaker.FMPImporter):
    def import_node(self, node):
        node_dict = self.format_node(node)
        print node['RECORDID'], node_dict

importer = MyImporter(datefmt='%d/%m/%Y')
filemaker.importfile('/path/to/data.xml', importer=importer)

The FMPImporter.format_node method converts values to an appropriate Python type according to the Filemaker column type. Filemaker’s DATE and TIME types are converted to Python datetime.date and datetime.time instances respectively. NUMBER types are converted to Python float instances. Everything else is left as strings, but you can customize the conversion by over-riding the appropriate methods in your sub-class (see the source for the appropriate method names).

In the case of Filemaker DATE values you can pass the datefmt argument to your sub-class to specify the date format string. See Python’s time.strptime documentation for the complete list of the format specifiers.
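To illustrate what a date format string does, here is a small stand-alone sketch using the standard library. The helper parse_fm_date is hypothetical and is not part of filemaker.py:

```python
import datetime


def parse_fm_date(value, datefmt='%d/%m/%Y'):
    """Convert an exported date string to a datetime.date, or None if empty."""
    value = value.strip()
    if not value:
        return None
    return datetime.datetime.strptime(value, datefmt).date()


# The same day, exported with day-first and month-first date settings.
uk_date = parse_fm_date('25/07/2009')
us_date = parse_fm_date('07/25/2009', datefmt='%m/%d/%Y')
```

A string that does not match the format raises ValueError, which is a useful early warning that the export used different date settings.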

The code uses Python’s built-in SAX parser so that it is efficient when importing huge XML files (the process uses a constant 15 megabytes for any size of data on my Mac running Python 2.5).
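The reason SAX keeps memory flat is that each row can be handed off and forgotten as soon as its closing tag arrives. The following is a stripped-down sketch of that idea, not the code in filemaker.py: the RowCollector class is hypothetical, the sample document is hand-written, and it assumes the FMPXMLRESULT layout of METADATA/FIELD elements followed by RESULTSET/ROW/COL/DATA elements with exactly one DATA per COL (so no repeating fields):

```python
import xml.sax


class RowCollector(xml.sax.ContentHandler):
    """Hypothetical handler: calls on_row with a dict for each ROW element."""

    def __init__(self, on_row):
        xml.sax.ContentHandler.__init__(self)
        self.on_row = on_row
        self.field_names = []
        self.values = None   # column values of the row being read
        self.text = None     # character data of the current DATA element

    def startElement(self, name, attrs):
        if name == 'FIELD':
            self.field_names.append(attrs['NAME'])
        elif name == 'ROW':
            self.values = []
        elif name == 'DATA':
            self.text = []

    def characters(self, content):
        # The parser may deliver one DATA value in several pieces.
        if self.text is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == 'DATA':
            self.values.append(''.join(self.text))
            self.text = None
        elif name == 'ROW':
            # Hand the finished row over; nothing is kept afterwards.
            self.on_row(dict(zip(self.field_names, self.values)))
            self.values = None


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
<METADATA>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="name" TYPE="TEXT"/>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="joined" TYPE="DATE"/>
</METADATA>
<RESULTSET FOUND="1">
<ROW MODID="0" RECORDID="1">
<COL><DATA>Jimmy Smits</DATA></COL>
<COL><DATA>25/07/2009</DATA></COL>
</ROW>
</RESULTSET>
</FMPXMLRESULT>"""

rows = []
xml.sax.parseString(SAMPLE, RowCollector(rows.append))
```

Here the callback just appends to a list for demonstration; a real import would save each row and let it go.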

Fortunately I haven’t had to deal with Filemaker’s repeating fields so I have no idea how the code works on repeating fields. Please let me know if it works for you. Or not.

Download filemaker.py. This code is released under a 2-clause BSD license.

Working with Active Directory FILETIME values in Python

September 7th, 2009 david 1 comment

How To Convert a UNIX time_t to a Win32 FILETIME or SYSTEMTIME:

Under Win32 platforms, file times are maintained primarily in the form of a 64-bit FILETIME structure, which represents the number of 100-nanosecond intervals since January 1, 1601 UTC (coordinate universal time).

UPDATED New version with fixes by Tim Williams for preserving microseconds. See here for details.

It just so happens that Microsoft Active Directory uses the same 64-bit value to store some time values. For example the accountExpires attribute is in this format. Linked below is a module for Python with utility functions for converting between Python’s datetime instances and Microsoft’s FILETIME values.

Very handy if you enjoy querying Active Directory for login accounts that are due to expire. And who wouldn’t enjoy that? On a Monday.
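The arithmetic behind the conversion is short enough to sketch. The functions below are hypothetical illustrations, not the contents of filetimes.py, and they treat naive datetimes as UTC. The constant is the number of 100-nanosecond intervals between 1 January 1601 and the UNIX epoch:

```python
from datetime import datetime, timedelta

# 100-nanosecond intervals between 1601-01-01 and 1970-01-01.
EPOCH_AS_FILETIME = 116444736000000000
HUNDREDS_OF_NANOSECONDS = 10000000  # per second


def filetime_to_datetime(ft):
    """Convert a FILETIME integer to a naive UTC datetime."""
    # Integer division of the 100 ns ticks by 10 gives whole microseconds.
    microseconds = (ft - EPOCH_AS_FILETIME) // 10
    return datetime(1970, 1, 1) + timedelta(microseconds=microseconds)


def datetime_to_filetime(dt):
    """Convert a naive UTC datetime to a FILETIME integer."""
    delta = dt - datetime(1970, 1, 1)
    seconds = delta.days * 86400 + delta.seconds
    return (seconds * HUNDREDS_OF_NANOSECONDS + delta.microseconds * 10
            + EPOCH_AS_FILETIME)
```

Working in integers throughout avoids the rounding that creeps in when such large values pass through floats.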

Download filetimes.py module for converting between FILETIME and datetime objects. This code is released under a 2-clause BSD license.

Example usage:

>>> from datetime import datetime
>>> from filetimes import filetime_to_dt, dt_to_filetime, utc
>>> filetime_to_dt(116444736000000000)
datetime.datetime(1970, 1, 1, 0, 0)
>>> filetime_to_dt(128930364000000000)
datetime.datetime(2009, 7, 25, 23, 0)
>>> "%.0f" % dt_to_filetime(datetime(2009, 7, 25, 23, 0))
'128930364000000000'
>>> dt_to_filetime(datetime(1970, 1, 1, 0, 0, tzinfo=utc))
116444736000000000L
>>> dt_to_filetime(datetime(1970, 1, 1, 0, 0))
116444736000000000L

I even remembered to write tests for once!

BBC iCalendar schedules

August 7th, 2009 david 1 comment

Jon Udell recently wrote about accessing the BBC programming schedules but was put off by the lack of time zone information in the iCalendar feeds, which prompted me to fix the quick-and-dirty script I have that generates iCalendar files for the BBC. (I wrote the first, time zone-blind version of my script in England’s Winter and it worked just perfick back then!)

So I fixed it. The updated iCalendar files have events with time zone information.

Everyone’s happy.

Jon Udell’s use of Python to explore data manipulation on the Web was one of the reasons I thought I really ought to get stuck into Python.
