Introduction
It is possible (and even relatively common) in Python code and
in extension modules to "trap" when an instance's client code
attempts to set an attribute and execute code instead. In other
words it is possible to allow users to use attribute assignment/
retrieval/deletion syntax even though the underlying implementation
is doing some computation rather than directly modifying a
binding.
This PEP describes a feature that makes it easier, more efficient
and safer to implement these handlers for Python instances.
Justification
cenario 1:
You have a deployed class that works on an attribute named
"stdout". After a while, you think it would be better to
check that stdout is really an object with a "write" method
at the moment of assignment. Rather than change to a
setstdout method (which would be incompatible with deployed
code) you would rather trap the assignment and check the
object's type.
Scenario 2:
You want to be as compatible as possible with an object
model that has a concept of attribute assignment. It could
be the W3C Document Object Model or a particular COM
interface (e.g. the PowerPoint interface). In that case
you may well want attributes in the model to show up as
attributes in the Python interface, even though the
underlying implementation may not use attributes at all.
Scenario 3:
A user wants to make an attribute read-only.
In short, this feature allows programmers to separate the
interface of their module from the underlying implementation
for whatever purpose. Again, this is not a new feature but
merely a new syntax for an existing convention.
Current Solution
To make some attributes read-only:
class foo:
def __setattr__( self, name, val ):
if name=="readonlyattr":
raise TypeError
elif name=="readonlyattr2":
raise TypeError
...
else:
self.__dict__["name"]=val
This has the following problems:
1. The creator of the method must be intimately aware of whether
somewhere else in the class hiearchy __setattr__ has also been
trapped for any particular purpose. If so, she must specifically
call that method rather than assigning to the dictionary. There
are many different reasons to overload __setattr__ so there is a
decent potential for clashes. For instance object database
implementations often overload setattr for an entirely unrelated
purpose.
2. The string-based switch statement forces all attribute handlers
to be specified in one place in the code. They may then dispatch
to task-specific methods (for modularity) but this could cause
performance problems.
3. Logic for the setting, getting and deleting must live in
__getattr__, __setattr__ and __delattr__. Once again, this can be
mitigated through an extra level of method call but this is
inefficient.
Proposed Syntax
Special methods should declare themselves with declarations of the
following form:
class x:
def __attr_XXX__(self, op, val ):
if op=="get":
return someComputedValue(self.internal)
elif op=="set":
self.internal=someComputedValue(val)
elif op=="del":
del self.internal
Client code looks like this:
fooval=x.foo
x.foo=fooval+5
del x.foo
Semantics
Attribute references of all three kinds should call the method.
The op parameter can be "get"/"set"/"del". Of course this string
will be interned so the actual checks for the string will be
very fast.
It is disallowed to actually have an attribute named XXX in the
same instance as a method named __attr_XXX__.
An implementation of __attr_XXX__ takes precedence over an
implementation of __getattr__ based on the principle that
__getattr__ is supposed to be invoked only after finding an
appropriate attribute has failed.
An implementation of __attr_XXX__ takes precedence over an
implementation of __setattr__ in order to be consistent. The
opposite choice seems fairly feasible also, however. The same
goes for __del_y__.
Proposed Implementation
There is a new object type called an attribute access handler.
Objects of this type have the following attributes:
name (e.g. XXX, not __attr__XXX__
method (pointer to a method object
In PyClass_New, methods of the appropriate form will be detected and
converted into objects (just like unbound method objects). These are
stored in the class __dict__ under the name XXX. The original method
is stored as an unbound method under its original name.
If there are any attribute access handlers in an instance at all,
a flag is set. Let's call it "I_have_computed_attributes" for
now. Derived classes inherit the flag from base classes. Instances
inherit the flag from classes.
A get proceeds as usual until just before the object is returned.
In addition to the current check whether the returned object is a
method it would also check whether a returned object is an access
handler. If so, it would invoke the getter method and return
the value. To remove an attribute access handler you could directly
fiddle with the dictionary.
A set proceeds by checking the "I_have_computed_attributes" flag. If
it is not set, everything proceeds as it does today. If it is set
then we must do a dictionary get on the requested object name. If it
returns an attribute access handler then we call the setter function
with the value. If it returns any other object then we discard the
result and continue as we do today. Note that having an attribute
access handler will mildly affect attribute "setting" performance for
all sets on a particular instance, but no more so than today, using
__setattr__. Gets are more efficient than they are today with
__getattr__.
The I_have_computed_attributes flag is intended to eliminate the
performance degradation of an extra "get" per "set" for objects not
using this feature. Checking this flag should have miniscule
performance implications for all objects.
The implementation of delete is analogous to the implementation
of set.
Caveats
1. You might note that I have not proposed any logic to keep
the I_have_computed_attributes flag up to date as attributes
are added and removed from the instance's dictionary. This is
consistent with current Python. If you add a __setattr__ method
to an object after it is in use, that method will not behave as
it would if it were available at "compile" time. The dynamism is
arguably not worth the extra implementation effort. This snippet
demonstrates the current behavior:
>>> def prn(*args):print args
>>> class a:
... __setattr__=prn
>>> a().foo=5
(<__main__.a instance at 882890>, 'foo', 5)
>>> class b: pass
>>> bi=b()
>>> bi.__setattr__=prn
>>> b.foo=5
2. Assignment to __dict__["XXX"] can overwrite the attribute
access handler for __attr_XXX__. Typically the access handlers will
store information away in private __XXX variables
3. An attribute access handler that attempts to call setattr or getattr
on the object itself can cause an infinite loop (as with __getattr__)
Once again, the solution is to use a special (typically private)
variable such as __XXX.
Note
The descriptor mechanism described in PEP 252 is powerful enough
to support this more directly. A 'getset' constructor may be
added to the language making this possible:
class C:
def get_x(self):
return self.__x
def set_x(self, v):
self.__x = v
x = getset(get_x, set_x)
Additional syntactic sugar might be added, or a naming convention
could be recognized.