Monthly Archives: January 2010

watchedinstall is useful

Very satisfying to use [watchedinstall][wi] at work the other day to see exactly what a tricksy meta-package was doing during installation. Now that I [fixed a stupid bug involving dtrace][bug], watchedinstall works a treat for recording exactly what goes where.

Many thanks to [Preston Holmes][ptone] for releasing watchedinstall in the first place.

My goal is to replace the functionality of the fsevents helper application with a [dtrace][dtrace] script that can list filesystem changes. A single python script would be simpler to install and use – you wouldn’t need to install it at all, just run it from the directory you downloaded it to. No effing about with setting PATH environment variables, no worry about compiling a C program for whatever architecture.

Hey Esther!

[wi]: http://bitbucket.org/davidbuxton/watchedinstall/
[bug]: http://bitbucket.org/davidbuxton/watchedinstall/changeset/d97aaae628c3/
[ptone]: http://www.ptone.com/
[dtrace]: http://www.sun.com/bigadmin/content/dtrace/

AJAX-ified result paging is not good

An annoying trend in Web design: using [AJAX][ajax] to load results when there is more than one page.

Apple does this for their [search results][airport]. Netgear does this when [searching their knowledge base][netgear]. Microsoft does this for their [Mactopia discussion forums][microsoft]. All three ostensibly good, clean designs fail to consider what the hell a visitor wants in the first place, which is to see the next damn page of results.

The first problem with using AJAX to load results is that the browser view does not change when the new results are loaded. Suppose you have read the first ten results, you scroll to the last result on the page and the first result scrolls up and out of view. Then you click the link for the next page of results. The fancy AJAX loader replaces the existing list of results with the next page’s list of results, but does not move the view, leaving you staring at the last result on the second page when what you want is to see the first result of the second page, so you have to scroll back to the top of the page.

The script to load the results should scroll the view so the first result of the subsequent page is visible – I have yet to see an example of this behaviour.

The second problem is the URL does not change between one page and the next, which means you cannot bookmark any page other than the first. URLs and hyperlinks are the very stuff of the Web, it is mad not to make use of them.

My guess is that the Web designer in each of these cases was so pleased by the effect of updating the visitor’s view of a page without changing the browser location that she figured it was an improvement over the established technique of passing query parameters in a URL.

It is not. Please go back to the old-fashioned use of a query parameter to indicate the offset into a list of results.

[netgear]: http://kb.netgear.com/app/answers/list/
[airport]: http://support.apple.com/downloads/#airport
[microsoft]: http://www.officeformac.com/ProductForums/Entourage/
[ajax]: http://en.wikipedia.org/wiki/Ajax_(programming)

Tea Timer widget

Finally found a use for [Dashboard][dashboard] with [Stefan Scherfke][stefan]’s [Tea Timer][tea].

[stefan]: http://stefan.sofa-rockers.org
[tea]: http://stefan.sofa-rockers.org/teatimer/
[dashboard]: http://www.apple.com/downloads/dashboard/

ModelForms good for importing too

If you have exported data from one database in plain text format and you want to import it to [Django][django], you should use a [`ModelForm` class][modelform] to do a lot of the heavy lifting for you.

A suitable `ModelForm` for your Django model will consume each row and do the conversion of each field to an appropriate Python type. Much simpler than explicitly converting each value yourself before creating a new model instance.

Suppose you have a model for an address book entry and its associated `ModelForm` (this works for Django 1.1):

# myapp/models.py
from django.db import models
from django import forms

class Contact(models.Model):
first_name = models.CharField(max_length=100)
second_name = models.CharField(max_length=100)
telephone = models.CharField(max_length=50, blank=True)
email = models.EmailField(blank=True)

class ContactForm(forms.ModelForm):
class Meta:
model = Contact

Here’s a script to run through a comma-separated list of contacts where each line looks something like “Smits, Jimmy, jimmy@example.com, 555-1234”:

from myapp.models import ContactForm

# Map columns to fields, adjusting the order as necessary
column_map = (
‘second_name’,
‘first_name’,
’email’,
‘telephone’,
)

for line in open(‘tab-separated-data.txt’):
row = dict(zip(column_map, (field.strip() for field in line.split(‘,’))))
form_obj = ContactForm(row)
try:
form_obj.save()
except ValueError:
for k, v in form_obj.errors.items():
print k, row[k], ‘, ‘.join(map(unicode, v))

If a line doesn’t validate the script prints the validation errors and moves to the next line. If your data has columns you want to ignore then just name them in the `column_map` – the form class will ignore extra keys in the dictionary.

[django]: http://www.djangoproject.com
[modelform]: http://docs.djangoproject.com/en/dev/topics/forms/modelforms/

Removing printing restrictions in 10.5

I am trying to write a good one-liner for removing all restrictions on printing for Mac OS X 10.5. I had thought that [`sed`][sed] would be perfect for this, but I can’t arrive at a simple syntax for appending new lines that works well when pasted into a terminal window. Here’s what I ended up with:

perl -p -0 -i ‘.bak’ -e ‘s/(Policy default).*(Policy)/$1>\n\nOrder deny,allow\nAllow from all\n<\/Limit>\n<$2/s' /private/etc/cups/cupsd.conf Rather brutal, it just guts the default policy and replaces it with the following:

Order deny,allow
Allow from all

Greg Neagle has [a useful article about printing in the enterprise][mactech]. Apple suggests [adding the network group to the local lpadmin group][ht3511], but points out that mobile users would need to be added individually. In my case most accounts are mobile accounts and we trust everyone to manage print queues on a Mac, so removing all restrictions is acceptable.

[mactech]: http://www.mactech.com/articles/mactech/Vol.24/24.06/2406MacEnterprise-LeopardPrinting/index.html
[ht3511]: http://support.apple.com/kb/HT3511
[sed]: http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man1/sed.1.html

Notes on Radmind’s checksum

It would be nice to do a [pure-Python][python] implementation of [Radmind][radmind]’s fsdiff output for [watchedinstall][watchedinstall], which consists of several white-space separated fields describing the filename’s attributes and an optional checksum for the file.

These are notes on how Radmind generates checksums for files on [Mac OS X][macosx].

The [fsdiff format is documented][manfsdiff], however for files with Mac Finder info or a resource fork the checksum is for an [AppleSingle][applesingle]-encoded representation of the file, which means a Python implementation needs to produce an equivalent AppleSingle-encoded byte stream for the file. Bummer.

Python 2.6 on Mac OS X includes a [(deprecated) applesingle module][applesinglemod] that can read the format but cannot write it (and the module has been removed for Python 3). Therefore a pure Python implementation of Radmind’s checksum has to implement a compatible AppleSingle encoding routine too.

Radmind’s fsdiff command is written in C, which I can just about get the gist of, but I am missing something because my attempts at emulating Radmind’s checksums are wrong.

The meat of Radmind’s checksum is the [`do_acksum()` function in `cksum.c`][cksum]. The algorithm appears to be as follows:

1. Initialize a digest using the default cipher ([MD5][md5] I think).
2. Add the AppleSingle header, consisting of a magic number and version number and some padding.
3. Add the AppleSingle entry table, which has 3 entries for the Finder info, the resource fork info and the data fork info (in that order). Each entry is 12 bytes – an unsigned long for the entry type, an unsigned long for an offset into the file where the data will start and an unsigned long for the data length.
4. Add the Finder info data.
5. Add the resource for data.
6. Add the data fork data.
7. Return a base64 encoded version of the final digest.

Because the entry table in the AppleSingle header specifies data offsets and lengths you need to know the size of the Finder info data (always 32 bytes) and the size of the resource fork and the size of the data fork before you pass that data to the digest function.

So a working Python implementation needs to know the size of the resource fork and data fork before feeding that same data to the digest. It seems to me that this requirement might imply huge memory allocations while slurping file data – my wrong attempt tried counting bytes and later feeding the same data to the digest in manageable chunks.

Anyway…

Advice much appreciated. The workaround is to leave it to fsdiff to generate the checksum and parse the value from the output.

David

P.S. I still intend running [A/UX 3.0.1][aux] on my Centris 660av one day.

Update: using my eyes and brains and the `fsdiff -V` command I was able to read the fsdiff man page and deduce the preferred checksum cipher is actually sha1. My code is still wrong.

[radmind]: http://rsug.itd.umich.edu/software/radmind/
[python]: http://www.python.org/
[macosx]: http://www.apple.com/macosx/
[manfsdiff]: http://linux.die.net/man/1/fsdiff
[applesingle]: http://users.phg-online.de/tk/netatalk/doc/Apple/v2/AppleSingle_AppleDouble.pdf
[applesinglemod]: http://www.python.org/doc/2.6.2/library/undoc.html#module-applesingle
[md5]: http://www.openssl.org/docs/crypto/md5.html
[aux]: http://www.aux-penelope.com/
[cksum]: http://radmind.cvs.sourceforge.net/viewvc/radmind/radmind/cksum.c
[watchedinstall]: http://bitbucket.org/ptone/watchedinstall/