PEP: 339
Title: How to Change CPython's Bytecode
Version: 1970
Last-Modified: 2005-02-12 14:02:05 -0800 (Sat, 12 Feb 2005)
Author: Brett Cannon <brett at python.org>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 02-Feb-2005
Post-History: 02-Feb-2005

Abstract

Python source code is compiled down to something called bytecode. This bytecode must implement enough semantics to perform the actions required by the Language Reference [1]. Knowing how to properly add, remove, or change bytecode is therefore important when changing the abilities of the Python language. This PEP covers how to accomplish this in the CPython implementation of the language (referred to as simply "Python" for the rest of this PEP).
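To make the idea of bytecode concrete, the following minimal sketch compiles a one-line statement and lists the names of the instructions the compiler emits. It assumes a modern Python 3 interpreter (``dis.get_instructions`` does not exist in the Python 2.4 era this PEP describes), and the helper name ``opcode_names`` is hypothetical. The exact instruction names differ between Python versions, so no expected output is shown.

```python
import dis


def opcode_names(source):
    """Compile source and return the names of the bytecode instructions it produces."""
    # compile() turns source text into a code object holding the bytecode.
    code = compile(source, "<example>", "exec")
    # dis.get_instructions() decodes that bytecode one instruction at a time.
    return [instruction.opname for instruction in dis.get_instructions(code)]


names = opcode_names("x = 1 + 2")
print(names)
```

Running the same snippet before and after a compiler change is one informal way to see which instructions a given statement now produces.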

Warning

The guidelines outlined in this PEP apply to Python 2.4 and earlier. Current plans for Python 2.5 will lead to a significant change in how Python's bytecode is handled. This PEP will be updated once these planned changes are committed into CVS.

Rationale

While changing Python's bytecode is not a frequent occurrence, it still happens. Since the steps needed to change the bytecode are not necessarily obvious, documenting them in a single location should make experimentation with the bytecode easier.

This PEP, paired with PEP 306 [2], should provide enough basic guidelines for handling any syntactic change to the Python language that introduces new semantics.

Checklist

This is a rough checklist of the files that need to change and how they are involved with the bytecode. All paths are relative to /cvsroot/python/dist/src in CVS. This list should not be considered exhaustive, nor does it cover all possible situations.

Suggestions for bytecode development

A few things can be done to make sure that development goes smoothly when experimenting with Python's bytecode. One is to delete all .py(c|o) files after each semantic change to Python/compile.c. That way every file is recompiled and picks up the bytecode changes.
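The cleanup step can be automated. The following is a minimal sketch, not part of the official tooling, that walks a directory tree and removes every .pyc and .pyo file. The function name ``remove_compiled_files`` is hypothetical, and the code is written for a modern Python 3 interpreter.

```python
import os


def remove_compiled_files(root):
    """Delete every .pyc and .pyo file under root and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only compiled bytecode files are touched; .py sources are left alone.
            if name.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Calling the function on the root of the source checkout after each change to Python/compile.c forces the next import of each module to recompile it with the modified compiler.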

Make sure to run the entire test suite [5]. Since the regrtest.py driver recompiles all source code before a test is run, it acts as a good check that no existing semantics are broken.

Running parrotbench [7] is also a good way to make sure existing semantics are not broken; this benchmark is practically a compliance test.

Previous experiments

This section lists known bytecode experiments that have not gone into Python.

Skip Montanaro presented a paper at a Python workshop on a peephole optimizer [8].

Michael Hudson has an inactive SourceForge project named Bytecodehacks [9] that provides functionality for playing with bytecode directly.

An opcode named CALL_ATTR [10] was created to combine the functionality of LOAD_ATTR and CALL_FUNCTION. It currently works only for classic classes; for new-style classes, rough benchmarking showed an actual slowdown caused by having to support both classic and new-style classes.

References

[1] Python Language Reference, van Rossum & Drake (http://docs.python.org/ref/ref.html)
[2] PEP 306, How to Change Python's Grammar, Hudson (http://www.python.org/peps/pep-0306.html)
[3] dis Module (http://docs.python.org/lib/module-dis.html)
[4] 'compiler' Package (http://docs.python.org/lib/module-compiler.html)
[5] 'test' Package (http://docs.python.org/lib/module-test.html)
[6] Python Byte Code Instructions (http://docs.python.org/lib/bytecodes.html)
[7] Parrotbench (ftp://ftp.python.org/pub/python/parrotbench/parrotbench.tgz, http://mail.python.org/pipermail/python-dev/2003-December/041527.html)
[8] Skip Montanaro's Peephole Optimizer Paper (http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html)
[9] Bytecodehacks Project (http://bytecodehacks.sourceforge.net/bch-docs/bch/index.html)
[10] CALL_ATTR opcode (http://www.python.org/sf/709744)