a neat/ugly hack

Guido van Rossum (Guido.van.Rossum@cwi.nl)
Thu, 16 Jan 92 01:07:00 +0100

After reading an article by adam@visix.com in comp.lang.misc, the
following occurred to me. Is this neat or ugly? I kinda like it...

It will probably only make sense to you if you've read the file
misc/CLASSES in the Python distribution; it depends on details of
Python's class implementation (which are simple enough, just different
from other OO languages). It also requires that class objects are
writable (which they are since 0.9.3).

Suppose we have a class whose objects have some kind of state which
modifies the meaning of one or more methods. Traditionally this is
done by changing a flag which is tested by the affected methods.

Perhaps more efficiency or elegance can be gained by swapping method
pointers. A naive implementation of this would be the following:

class C:
    def init(self): ...; return self
    def var1(self): ...
    def var2(self): ...
    def setflag(self): self.var = self.var1
    def clrflag(self): self.var = self.var2
    def usevar(self): ...; self.var(); ...

The problem with this is that it creates circular references, which in
the current version of Python cause uncollectable garbage if the
object becomes unreachable. (The data attribute self.var is either
self.var1 or self.var2; these aren't plain functions but *method*
objects which, among other things, contain a pointer (accessible as
im_self) to the object. If this is still unclear, write me and I'll
explain it in more detail.)
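
The back-pointer can be made visible with a small sketch. It assumes
modern Python 3 rather than the 0.9.x interpreter discussed here: the
attribute called im_self above is spelled __self__ there, and the
method bodies are placeholders. (Later CPython versions gained a cycle
collector, so the sketch shows only that the cycle exists, not that it
leaks.)

```python
# Sketch for modern Python 3 (an assumption; the post is about 0.9.x).
class C:
    def var1(self):
        return "var1"   # placeholder body

    def var2(self):
        return "var2"   # placeholder body

    def setflag(self):
        # Stores a *bound method* in the instance's own attribute dict.
        self.var = self.var1

    def clrflag(self):
        self.var = self.var2


obj = C()
obj.setflag()

# The stored method object points back at the instance,
# giving the cycle obj -> obj.var -> obj.
points_back = obj.var.__self__ is obj
stored_in_instance = "var" in vars(obj)
```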

Now the fact that class declarations are executable becomes useful.
We use a function that is supposed to return a new "C" object. This
function derives a new empty class "C1" from C, instantiates it, and
initializes the instance as if it were a C. The object can then assign
a function to an attribute of its __class__ and thereby change its
methods:

class C:
    def init(self): ...; return self
    def var1(self): ...
    def var2(self): ...
    def setflag(self): self.__class__.var = C.var1
    def clrflag(self): self.__class__.var = C.var2
    def usevar(self): ...; self.var(); ...

def newC():
    class C1(C): pass
    return C1().init()

Each call to newC() creates a new class object C1. This is essential;
if we were to move the definition of C1 out of newC(), calling
setflag() for one C object would change the meaning of var to var1 for
*all* C objects!
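
To see the difference, here is a rough sketch comparing the two
arrangements. It assumes modern Python 3; SharedC1, new_shared() and
new_private() are hypothetical names introduced only for this
comparison, and the method bodies are placeholders.

```python
class C:
    def init(self):
        return self

    def var1(self):
        return "var1"   # placeholder body

    def var2(self):
        return "var2"   # placeholder body

    def setflag(self):
        self.__class__.var = C.var1

    def clrflag(self):
        self.__class__.var = C.var2


class SharedC1(C):      # hypothetical: one subclass shared by all objects
    pass


def new_shared():
    return SharedC1().init()


def new_private():
    class C1(C):        # a fresh class object on every call
        pass
    return C1().init()


a, b = new_shared(), new_shared()
a.clrflag()
b.setflag()             # rebinds SharedC1.var, so a is affected too
shared_result = a.var()

x, y = new_private(), new_private()
x.clrflag()
y.setflag()             # rebinds var only on y's own class
private_result = x.var()
```

With the shared subclass, a ends up calling var1 although only b asked
for it; with one class per object, x keeps var2.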

There are no circular references in this version: C.var1 and C.var2
are not method objects, they are functions that only get turned into
methods when called as self.var (or self.var1).
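
A minimal sketch of that distinction, assuming modern Python 3 (where
C.var1 looked up on the class is a plain function) and a placeholder
method body. It checks only that the instance itself stores nothing;
whether a per-call class is otherwise free of cycles in a modern
interpreter is not examined here.

```python
import types


class C:
    def var1(self):
        return "var1"   # placeholder body

    def setflag(self):
        # A plain function is stored on the class, not on the instance.
        self.__class__.var = C.var1


def newC():
    class C1(C):
        pass
    return C1()


obj = newC()
obj.setflag()

# What is stored is a function; a method object only appears,
# temporarily, when the attribute is looked up through the instance.
class_attr_is_function = isinstance(C.var1, types.FunctionType)
stored_is_same_function = vars(type(obj))["var"] is C.var1
lookup_is_method = isinstance(obj.var, types.MethodType)
instance_dict_empty = vars(obj) == {}
```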

A disadvantage of this approach is that it won't work for derived
classes of C. For some cases this can be solved by passing the class
to be instantiated as a parameter to newC() (which would then be better
named new()). It gets trickier when some derived class also uses this
hack; then that class must take care of the needs of its base class(es).
(This is probably trivial if the hack is used exactly as shown here,
since all base classes can place their switching methods in the same
"C1" class.)

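Passing the class as a parameter might look like the sketch below. It
assumes modern Python 3; D and its extra() method are hypothetical,
added only to have a derived class to instantiate, and the method
bodies are placeholders.

```python
class C:
    def init(self):
        return self

    def var1(self):
        return "C.var1"     # placeholder body

    def var2(self):
        return "C.var2"     # placeholder body

    def setflag(self):
        self.__class__.var = C.var1

    def clrflag(self):
        self.__class__.var = C.var2

    def usevar(self):
        return self.var()


class D(C):                 # hypothetical derived class
    def extra(self):
        return "D.extra"


def new(cls):
    # Derive the private, empty class from whatever class was requested.
    class C1(cls):
        pass
    return C1().init()


d = new(D)
d.setflag()
other = new(D)
other.clrflag()             # does not disturb d
```

Here d is still a D (and a C), so it keeps D's methods while owning a
private class for the switched attribute.
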
Another disadvantage is that all other method calls become epsilon
slower, since the chain of base classes is one longer. I haven't
measured this, but it should cost no more than one dictionary lookup
in a tiny dictionary.
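
The longer chain can be inspected directly. This sketch assumes modern
Python 3, where the lookup order is exposed as __mro__ and every class
also inherits from object; other() is a hypothetical unswitched method.
It measures nothing, and modern interpreters may cache lookups, so it
says nothing about the actual cost.

```python
class C:
    def other(self):
        return "other"      # hypothetical method that is never switched


def newC():
    class C1(C):
        pass
    return C1()


plain = C()
hacked = newC()

# Names of the classes searched, in order, when looking up a method.
plain_chain = [k.__name__ for k in type(plain).__mro__]
hacked_chain = [k.__name__ for k in type(hacked).__mro__]

# other() is not found in C1's own dictionary, so the search moves on to C.
found_in_c1 = "other" in vars(type(hacked))
```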

A variant would require that an instance's __class__ attribute be
writable (this is currently not the case but it would be a one-line
change to the interpreter); here we would just have two classes C1 and
C2 derived from C and let the setflag() methods switch the instance's
__class__ attribute between C1 and C2. In fact the mere possibility
of this makes me doubt the read-only-ness of __class__ (or __bases__,
for that matter).
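
For illustration, the variant could look like this in modern Python 3,
which does permit assigning to __class__ for ordinary classes like
these. Starting every object in C2 (flag cleared) is an assumption of
the sketch, and the method bodies are placeholders.

```python
class C:
    def var1(self):
        return "var1"       # placeholder body

    def var2(self):
        return "var2"       # placeholder body

    def setflag(self):
        self.__class__ = C1     # switch this one instance to C1

    def clrflag(self):
        self.__class__ = C2     # switch this one instance to C2

    def usevar(self):
        return self.var()


class C1(C):
    var = C.var1


class C2(C):
    var = C.var2


def newC():
    return C2()             # assumption: objects start with the flag cleared


a = newC()
b = newC()
a.setflag()                 # only a changes class; b stays a C2
```

Only two extra classes are needed no matter how many objects exist,
since the state lives in which class each instance points to.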

(I wonder if this hack allows one to write Python programs in the
"delegation" paradigm rather than the "inheritance" paradigm. Or am I
completely off base here?)

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"If this is Bolton, I shall return to the pet shop"