Managing the Release of a Large Python Project

Abstract

Twisted is a Python networking framework. At last count, the project contained nearly 60,000 lines of effective code (not counting comments or blank lines). When preparing a release, many details must be checked and many steps must be followed. We describe here the technologies and tools we use, and explain how we built tools on top of them that help us make releasing as painless as possible.

Introduction

One of the virtues of Python is the ease of distributing code. Its module system and the fact that no compilation is required make this possible. For simple Python projects, nothing more complicated than tar is needed to prepare a distribution of a library. However, Twisted has auto-generated documentation in several formats, including docstring-generated documentation, HOWTOs written in HTML, and manpages written in nroff. As Twisted grew more complex and popular, a detailed procedure for putting out a release became necessary. Human fallibility being what it is, we decided that most of these steps should be automated.

Overview of Steps

Despite heavy automation, there are still a number of manual steps involved in the release process. We have reduced the number of manual steps quite a bit, and most of what is left cannot be fully automated, although the process could be made easier (see Future Directions below).

Testing

Twisted has three categories of tests: unit, acceptance, and pre-release. Testing is an important part of releasing quality software, so each category is explained below.

Unit tests are run as often as possible by each of the developers as they write code, and must pass before they commit any changes to CVS. While the Twisted team tries to follow the XP practice of ensuring all code is releasable, this isn't always true. Thus, running the unit tests on several platforms before releasing is necessary. Our BuildBot runs the unit tests constantly on several hosts and multiple platforms, so the status page is simply checked for green lights before a release.

Acceptance tests (which, unfortunately, are not quite the same as Extreme Programming's Acceptance Tests) are simply interactive tests of various Twisted services. There is a script that executes several system commands that use the Twisted end-user executables and start several clients (web browsers, IRC clients, etc) to allow the user to interactively test the different services that Twisted offers. These are only routinely run before a release, but we also encourage developers to run these before they make major changes.

The pre-release tests ensure that the web server (one of the most popular parts of Twisted, and the one that the twistedmatrix.com web site uses) runs correctly in a semi-production environment. The script starts up a web server on twistedmatrix.com, similar to the one on port 80, but on an out-of-the-way port. lynx is then run several times, with URLs strategically chosen to test different features of the web server. Afterwards, the log of the web server is displayed and the user checks it for errors.

The release-twisted Script

Like many other build/release systems, the automated parts of our release system started out as a number of small shell scripts. Eventually these became a single Python script, which was a large improvement but still had many problems, especially as our release process became more complex (documentation generation, different types of archive formats, etc.). When a step in the middle of the process broke, the release manager would need to restart the entire thing or enter the remaining commands manually.

The solution that we came up with was a simple framework for pseudo-transactions: every step of the process is implemented with a class that has doIt and undoIt methods. Each step also has a command-line argument associated with it, so a typical run of the script looks something like this:

$SOMEWHERE/admin/release-twisted -V $VERSION -o $LASTVERSION --checkout \
--release=/twisted/Releases --upver --tag --exp --dist --docs --balls \
--rel --deb --debi
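
The doIt/undoIt scheme described above can be sketched roughly as follows. This is not the actual release-twisted code: the run_transactions helper and the two demonstration steps are hypothetical, and only the doIt and undoIt signatures are taken from the Export example in this paper. The sketch uses Python 3 syntax and assumes that, when a step fails, only that step is undone and the run stops, leaving earlier steps in place.

```python
class Transaction:
    """Base class for one release step (hypothetical sketch)."""

    def doIt(self, opts):
        raise NotImplementedError

    def undoIt(self, opts, fail):
        # By default there is nothing to clean up.
        pass


def run_transactions(transactions, opts):
    """Run steps in order and return the names of the steps that completed.

    Assumption: when a step fails, only that step is undone and the run
    stops; steps that already completed are left in place.
    """
    completed = []
    for step in transactions:
        try:
            step.doIt(opts)
        except Exception as fail:
            step.undoIt(opts, fail)
            break
        completed.append(type(step).__name__)
    return completed


class Good(Transaction):
    """Hypothetical step that always succeeds."""

    def doIt(self, opts):
        opts["log"].append("did Good")


class Bad(Transaction):
    """Hypothetical step that always fails."""

    def doIt(self, opts):
        raise RuntimeError("simulated failure")

    def undoIt(self, opts, fail):
        opts["log"].append("undid Bad")


opts = {"log": []}
done = run_transactions([Good(), Bad(), Good()], opts)
```

Here the third step never runs, and the release manager can see from the returned list which steps do not need to be repeated.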

Transactions

As stated above, our transaction system is very simple. One of our rather simple transaction classes is Export.

class Export(Transaction):
    def doIt(self, opts):
        print "Export"
        root = opts['cvsroot']
        ver = opts['release-version']
        sh('cvs -d%s export -r release-%s Twisted' % (root, ver.replace('.', '_')))

    def undoIt(self, opts, fail):
        sh('rm -rf Twisted')

One useful feature to note is the sensitiveUndo attribute on Transaction classes. If a transaction has this set, the user will be prompted before the undoIt method is run. This is useful for very long-running processes, like documentation generation, Debian package building, and uploading to SourceForge. If something goes wrong in the middle of one of these processes, we want to give the user a chance to fix the problem manually rather than redo the entire transaction. They can then continue from the next command by omitting the commands that have already been completed from the release-twisted arguments.
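
A minimal sketch of how such a prompt might work is shown below, in Python 3 syntax. Only the sensitiveUndo attribute name comes from the text; the undo_with_confirmation helper, its ask parameter, the prompt wording, and the SlowStep class are hypothetical and are not taken from release-twisted.

```python
class Transaction:
    # Attribute name from the text; the default value is an assumption.
    sensitiveUndo = False

    def undoIt(self, opts, fail):
        pass


def undo_with_confirmation(step, opts, fail, ask=input):
    """Undo a failed step, prompting first if the step is marked sensitive.

    Returns True if undoIt ran, False if the user declined.
    """
    if getattr(step, "sensitiveUndo", False):
        answer = ask("Undo %s? [y/N] " % type(step).__name__)
        if answer.strip().lower() not in ("y", "yes"):
            return False
    step.undoIt(opts, fail)
    return True


class SlowStep(Transaction):
    """Hypothetical long-running step whose undo should be confirmed."""

    sensitiveUndo = True

    def undoIt(self, opts, fail):
        opts["undone"] = True
```

Passing the prompt function as a parameter keeps the helper testable: a caller can supply a function that returns a fixed answer instead of reading from the terminal.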

A list of all of the transactions defined in release-twisted follows.

CheckOut
checks out the latest revision of Twisted from CVS and puts it in the Twisted.CVS directory.
UpdateVersion
changes the version number of the current release -- updating twisted/copyright.py (the canonical location for the current version) and a few other text files where the current version is mentioned.
Tag
tags the revisions in the current source tree with the version passed in on the command line.
Export
runs the cvs export command, which is similar to checkout, but leaves out CVS support directories; this is what we package up in the archives.
PrepareDist
simply copies the directory containing the version of Twisted to be released to a new directory specifically for the release process. The reason that we have this extra copy is that sometimes one will want to create a release from a directory that wasn't created from the Export command; having the release script munge that directory in-place would be impolite.
GenerateDocs
generates the various documentation: HTML API documentation (via Epydoc), HTML, PostScript, and PDF howto documentation (via twisted.lore), and HTML man-pages (via lore, converted from the nroff source).
CreateTarballs
creates the various archives that each Twisted release involves: tarred and gzipped or bzip2ed versions of archives with code plus documentation, code without documentation, and only documentation.
Release
copies all of the archives to a directory specified by the --release parameter. This is meant to be a publicly accessible directory, thus the name Release.
MakeDebs
creates the .deb packages and support files for the Twisted Debian packages.
InstallDebs
creates an apt-gettable Debian package repository in the (unfortunately hard-coded) /twisted/Debian directory.
Sourceforge
uploads the archives and Debian packages to Twisted's SourceForge mirror at http://twisted.sourceforge.net/.
UpgradeDebian
installs the recently generated Debian packages via dpkg on the local machine.
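
To make one of these steps concrete, the archive creation performed by CreateTarballs might look something like the sketch below. It is written in Python 3 with the standard tarfile module and is not the code used by release-twisted; the create_tarballs function, its parameters, and the demonstration paths are all hypothetical.

```python
import os
import tarfile
import tempfile


def create_tarballs(source_dir, dest_dir, base_name):
    """Write base_name.tar.gz and base_name.tar.bz2 from source_dir.

    Returns the list of archive paths that were created.
    """
    archives = []
    for suffix, mode in ((".tar.gz", "w:gz"), (".tar.bz2", "w:bz2")):
        path = os.path.join(dest_dir, base_name + suffix)
        with tarfile.open(path, mode) as tar:
            # Store everything under a top-level directory named after
            # the release, so unpacking does not scatter files.
            tar.add(source_dir, arcname=base_name)
        archives.append(path)
    return archives


# Demonstration on a throwaway tree (paths and names are made up).
work = tempfile.mkdtemp()
src = os.path.join(work, "src")
os.mkdir(src)
with open(os.path.join(src, "README"), "w") as f:
    f.write("example\n")
created = create_tarballs(src, work, "Example-1.0.0")
```

The code-only and documentation-only archives mentioned above could be produced the same way by pointing source_dir at a tree with the unwanted parts removed.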

setup.py

Twisted has an extensive and heavily customized setup.py script. We have a number of C extension modules and try to ensure that they all build, or at least fail gracefully, on win32, Mac OS X, Linux, and other popular Unix-style OSes.

We have overridden three of the distutils command classes: build_ext, install_scripts, and install_data.

Building C extensions

build_ext_twisted detects, based on various features of the platform, which C extensions to build. It overrides the build_extensions method to first check which C extensions are appropriate for the current platform before proceeding as normal (by calling the superclass's build_extensions). The module detection consists of several simple tests for platform features and conditional additions to the `extensions' attribute. One especially useful feature is the _check_header method, which takes the name of an arbitrary header file and tries to compile (via the distutils C compiler interface) a simple C file that only #includes it.
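
The header test can be illustrated with the following sketch. It is not Twisted's _check_header: instead of the distutils compiler interface, it calls a C compiler directly through subprocess, and it assumes a compiler named cc that accepts the -c and -o options. The check_header and choose_extensions names are hypothetical. The sketch uses Python 3 syntax.

```python
import os
import subprocess
import tempfile


def check_header(header, compiler="cc"):
    """Return True if a C file that only #includes `header` compiles."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "check.c")
        with open(source, "w") as f:
            f.write("#include <%s>\n" % header)
        try:
            result = subprocess.run(
                [compiler, "-c", source, "-o", os.path.join(tmp, "check.o")],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # No usable compiler: treat the header as unavailable.
            return False
        return result.returncode == 0


def choose_extensions(candidates, have_header=check_header):
    """Keep only the extensions whose required header is present.

    `candidates` maps an extension name to the header it needs.
    """
    return [name for name, header in sorted(candidates.items())
            if have_header(header)]
```

Treating a missing compiler the same as a missing header matches the goal of failing gracefully: the extension is simply skipped rather than aborting the build.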

Installing data files

install_data_twisted ensures that the data files are installed alongside the Python modules in the twisted package. This is accomplished with the incantation:

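class install_data_twisted(install_data):
    def finalize_options(self):
        # Take install_dir from the install command's install_lib option,
        # so data files land next to the Python modules.
        self.set_undefined_options('install',
            ('install_lib', 'install_dir')
        )
        install_data.finalize_options(self)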

Windows Releases

Packaging software for Windows involves a unique set of problems. The problem of clickability is especially acute, and several customizations to the distutils setup had to be made.

The first customization was to make the scripts end with a .py extension, since Windows relies on the file extension rather than a shebang line to determine which interpreter should execute a file. This was accomplished by overriding the install_scripts command, like so:

class install_scripts_twisted(install_scripts):
    """Renames scripts so they end with '.py' on Windows."""

    def run(self):
        install_scripts.run(self)
        if os.name == "nt":
            for file in self.get_outputs():
                if not file.endswith(".py"):
                    os.rename(file, file + ".py")

We also wanted to have a Start-menu group with a number of icons for running different Twisted programs. This was accomplished with a post-install script specified with the command-line parameter --install-script=twisted_postinstall.py.

Future Directions

The theme is, of course, automation, and there are still many manual steps involved in a Twisted release. Currently, the most annoying step is updating the documentation and downloads sections of the twistedmatrix.com website. Automating this would significantly shorten the time between running the release script and having a fully completed release.

Another major improvement will involve further integration with BuildBot. Currently we have BuildBot running unit tests, building C extensions, and generating documentation on several hosts. Eventually we would like to have it constantly generating full release archives, and have an additional web form for finalizing any particular build that we deem releasable. The result would be uploading the release to the mirrors and updating the website.

The tagging scheme used by the release-twisted script can sometimes be problematic. If we find serious problems in the code base after the Tag command is executed (which is fairly early in the process), we are forced to fix the bug and increase the version number. This could be avoided by applying an unofficial tag, releasing-$version (as opposed to release-$version), at that early stage instead of the official one. The official tag would be made only once most of the steps are complete. If something goes wrong in between, we can simply re-use the unofficial releasing-$version tag without worrying about users trying to use it.
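
The two tag names could be derived as in the small sketch below (Python 3 syntax). The function names are hypothetical. Replacing dots with underscores follows the Export example earlier in this paper; applying the same rule to the releasing- tag is an assumption.

```python
def official_tag(version):
    """Tag applied once the release is essentially complete."""
    return "release-" + version.replace(".", "_")


def provisional_tag(version):
    """Unofficial tag applied early; it may be moved if a step fails."""
    return "releasing-" + version.replace(".", "_")
```

For a hypothetical version 1.0.6, these give release-1_0_6 and releasing-1_0_6 respectively.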