Abstract
This proposal puts forth an extensible cooperative mechanism for
the adaptation of an incoming object to a context which expects an
object supporting a specific protocol (say a specific type, class,
or interface).
This proposal provides a built-in "adapt" function that, for any
object X and any protocol Y, can be used to ask the Python
environment for a version of X compliant with Y. Behind the
scenes, the mechanism asks object X: "Are you now, or do you know
how to wrap yourself to provide, a supporter of protocol Y?".
And, if this request fails, the function then asks protocol Y:
"Does object X support you, or do you know how to wrap it to
obtain such a supporter?" This duality is important, because
protocols can be developed after objects are, or vice-versa, and
this PEP lets either case be supported non-invasively with regard
to the pre-existing component[s].
Lastly, if neither the object nor the protocol know about each
other, the mechanism may check a registry of adapter factories,
where callables able to adapt certain objects to certain protocols
can be registered dynamically. This part of the proposal is
optional: the same effect could be obtained by ensuring that
certain kinds of protocols and/or objects can accept dynamic
registration of adapter factories, for example via suitable custom
metaclasses. However, this optional part allows adaptation to be
made more flexible and powerful in a way that is not invasive to
either protocols or other objects, thereby gaining for adaptation
much the same kind of advantage that Python standard library's
"copy_reg" module offers for serialization and persistence.
This proposal does not specifically constrain what a protocol
_is_, what "compliance to a protocol" exactly _means_, nor what
precisely a wrapper is supposed to do. These omissions are
intended to leave this proposal compatible with both existing
categories of protocols, such as the existing system of type and
classes, as well as the many concepts for "interfaces" as such
which have been proposed or implemented for Python, such as the
one in PEP 245 [1], the one in Zope3 [2], or the ones discussed in
the BDFL's Artima blog in late 2004 and early 2005 [3]. However,
some reflections on these subjects, intended to be suggestive and
not normative, are also included.
Motivation
Currently there is no standardized mechanism in Python for
checking if an object supports a particular protocol. Typically,
existence of certain methods, particularly special methods such as
__getitem__, is used as an indicator of support for a particular
protocol. This technique works well for a few specific protocols
blessed by the BDFL (Benevolent Dictator for Life). The same can
be said for the alternative technique based on checking
'isinstance' (the built-in class "basestring" exists specifically
to let you use 'isinstance' to check if an object "is a [built-in]
string"). Neither approach is easily and generally extensible to
other protocols, defined by applications and third party
frameworks, outside of the standard Python core.
Even more important than checking if an object already supports a
given protocol can be the task of obtaining a suitable adapter
(wrapper or proxy) for the object, if the support is not already
there. For example, a string does not support the file protocol,
but you can wrap it into a StringIO instance to obtain an object
which does support that protocol and gets its data from the string
it wraps; that way, you can pass the string (suitably wrapped) to
subsystems which require as their arguments objects that are
readable as files. Unfortunately, there is currently no general,
standardized way to automate this extremely important kind of
"adaptation by wrapping" operations.
Typically, today, when you pass objects to a context expecting a
particular protocol, either the object knows about the context and
provides its own wrapper or the context knows about the object and
wraps it appropriately. The difficulty with these approaches is
that such adaptations are one-offs, are not centralized in a
single place of the users code, and are not executed with a common
technique, etc. This lack of standardization increases code
duplication with the same adapter occurring in more than one place
or it encourages classes to be re-written instead of adapted. In
either case, maintainability suffers.
It would be very nice to have a standard function that can be
called upon to verify an object's compliance with a particular
protocol and provide for a wrapper if one is readily available --
all without having to hunt through each library's documentation
for the incantation appropriate to that particular, specific case.
Requirements
When considering an object's compliance with a protocol, there are
several cases to be examined:
a) When the protocol is a type or class, and the object has
exactly that type or is an instance of exactly that class (not
a subclass). In this case, compliance is automatic.
b) When the object knows about the protocol, and either considers
itself compliant, or knows how to wrap itself suitably.
c) When the protocol knows about the object, and either the object
already complies or the protocol knows how to suitably wrap the
object.
d) When the protocol is a type or class, and the object is a
member of a subclass. This is distinct from the first case (a)
above, since inheritance (unfortunately) does not necessarily
imply substitutability, and thus must be handled carefully.
e) When the context knows about the object and the protocol and
knows how to adapt the object so that the required protocol is
satisfied. This could use an adapter registry or similar
approaches.
The fourth case above is subtle. A break of substitutability can
occur when a subclass changes a method's signature, or restricts
the domains accepted for a method's argument ("co-variance" on
arguments types), or extends the co-domain to include return
values which the base class may never produce ("contra-variance"
on return types). While compliance based on class inheritance
_should_ be automatic, this proposal allows an object to signal
that it is not compliant with a base class protocol.
If Python gains some standard "official" mechanism for interfaces,
however, then the "fast-path" case (a) can and should be extended
to the protocol being an interface, and the object an instance of
a type or class claiming compliance with that interface. For
example, if the "interface" keyword discussed in [3] is adopted
into Python, the "fast path" of case (a) could be used, since
instantiable classes implementing an interface would not be
allowed to break substitutability.
Specification
This proposal introduces a new built-in function, adapt(), which
is the basis for supporting these requirements.
The adapt() function has three parameters:
- `obj', the object to be adapted
- `protocol', the protocol requested of the object
- `alternate', an optional object to return if the object could
not be adapted
A successful result of the adapt() function returns either the
object passed `obj', if the object is already compliant with the
protocol, or a secondary object `wrapper', which provides a view
of the object compliant with the protocol. The definition of
wrapper is deliberately vague, and a wrapper is allowed to be a
full object with its own state if necessary. However, the design
intention is that an adaptation wrapper should hold a reference to
the original object it wraps, plus (if needed) a minimum of extra
state which it cannot delegate to the wrapper object.
An excellent example of adaptation wrapper is an instance of
StringIO which adapts an incoming string to be read as if it was a
textfile: the wrapper holds a reference to the string, but deals
by itself with the "current point of reading" (from _where_ in the
wrapped strings will the characters for the next, e.g., "readline"
call come from), because it cannot delegate it to the wrapped
object (a string has no concept of "current point of reading" nor
anything else even remotely related to that concept).
A failure to adapt the object to the protocol raises an
AdaptationError (which is a subclass of TypeError), unless the
alternate parameter is used, in this case the alternate argument
is returned instead.
To enable the first case listed in the requirements, the adapt()
function first checks to see if the object's type or the object's
class are identical to the protocol. If so, then the adapt()
function returns the object directly without further ado.
To enable the second case, when the object knows about the
protocol, the object must have a __conform__() method. This
optional method takes two arguments:
- `self', the object being adapted
- `protocol, the protocol requested
Just like any other special method in today's Python, __conform__
is meant to be taken from the object's class, not from the object
itself (for all objects, except instances of "classic classes" as
long as we must still support the latter). This enables a
possible 'tp_conform' slot to be added to Python's type objects in
the future, if desired.
The object may return itself as the result of __conform__ to
indicate compliance. Alternatively, the object also has the
option of returning a wrapper object compliant with the protocol.
If the object knows it is not compliant although it belongs to a
type which is a subclass of the protocol, then __conform__ should
raise a LiskovViolation exception (a subclass of AdaptationError).
Finally, if the object cannot determine its compliance, it should
return None to enable the remaining mechanisms. If __conform__
raises any other exception, "adapt" just propagates it.
To enable the third case, when the protocol knows about the
object, the protocol must have an __adapt__() method. This
optional method takes two arguments:
- `self', the protocol requested
- `obj', the object being adapted
If the protocol finds the object to be compliant, it can return
obj directly. Alternatively, the method may return a wrapper
compliant with the protocol. If the protocol knows the object is
not compliant although it belongs to a type which is a subclass of
the protocol, then __adapt__ should raise a LiskovViolation
exception (a subclass of AdaptationError). Finally, when
compliance cannot be determined, this method should return None to
enable the remaining mechanisms. If __adapt__ raises any other
exception, "adapt" just propagates it.
The fourth case, when the object's class is a sub-class of the
protocol, is handled by the built-in adapt() function. Under
normal circumstances, if "isinstance(object, protocol)" then
adapt() returns the object directly. However, if the object is
not substitutable, either the __conform__() or __adapt__()
methods, as above mentioned, may raise an LiskovViolation (a
subclass of AdaptationError) to prevent this default behavior.
If none of the first four mechanisms worked, as a last-ditch
attempt, 'adapt' falls back to checking a registry of adapter
factories, indexed by the protocol and the type of `obj', to meet
the fifth case. Adapter factories may be dynamically registered
and removed from that registry to provide "third party adaptation"
of objects and protocols that have no knowledge of each other, in
a way that is not invasive to either the object or the protocols.
Intended Use
The typical intended use of adapt is in code which has received
some object X "from the outside", either as an argument or as the
result of calling some function, and needs to use that object
according to a certain protocol Y. A "protocol" such as Y is
meant to indicate an interface, usually enriched with some
semantics constraints (such as are typically used in the "design
by contract" approach), and often also some pragmatical
expectation (such as "the running time of a certain operation
should be no worse than O(N)", or the like); this proposal does
not specify how protocols are designed as such, nor how or whether
compliance to a protocol is checked, nor what the consequences may
be of claiming compliance but not actually delivering it (lack of
"syntactic" compliance -- names and signatures of methods -- will
often lead to exceptions being raised; lack of "semantic"
compliance may lead to subtle and perhaps occasional errors
[imagine a method claiming to be threadsafe but being in fact
subject to some subtle race condition, for example]; lack of
"pragmatic" compliance will generally lead to code that runs
``correctly'', but too slowly for practical use, or sometimes to
exhaustion of resources such as memory or disk space).
When protocol Y is a concrete type or class, compliance to it is
intended to mean that an object allows all of the operations that
could be performed on instances of Y, with "comparable" semantics
and pragmatics. For example, a hypothetical object X that is a
singly-linked list should not claim compliance with protocol
'list', even if it implements all of list's methods: the fact that
indexing X[n] takes time O(n), while the same operation would be
O(1) on a list, makes a difference. On the other hand, an
instance of StringIO.StringIO does comply with protocol 'file',
even though some operations (such as those of module 'marshal')
may not allow substituting one for the other because they perform
explicit type-checks: such type-checks are "beyond the pale" from
the point of view of protocol compliance.
While this convention makes it feasible to use a concrete type or
class as a protocol for purposes of this proposal, such use will
often not be optimal. Rarely will the code calling 'adapt' need
ALL of the features of a certain concrete type, particularly for
such rich types as file, list, dict; rarely can all those features
be provided by a wrapper with good pragmatics, as well as syntax
and semantics that are really the same as a concrete type's.
Rather, once this proposal is accepted, a design effort needs to
start to identify the essential characteristics of those protocols
which are currently used in Python, particularly within the
standard library, and to formalize them using some kind of
"interface" construct (not necessarily requiring any new syntax: a
simple custom metaclass would let us get started, and the results
of the effort could later be migrated to whatever "interface"
construct is eventually accepted into the Python language). With
such a palette of more formally designed protocols, the code using
'adapt' will be able to ask for, say, adaptation into "a filelike
object that is readable and seekable", or whatever else it
specifically needs with some decent level of "granularity", rather
than too-generically asking for compliance to the 'file' protocol.
Adaptation is NOT "casting". When object X itself does not
conform to protocol Y, adapting X to Y means using some kind of
wrapper object Z, which holds a reference to X, and implements
whatever operation Y requires, mostly by delegating to X in
appropriate ways. For example, if X is a string and Y is 'file',
the proper way to adapt X to Y is to make a StringIO(X), *NOT* to
call file(X) [which would try to open a file named by X].
Numeric types and protocols may need to be an exception to this
"adaptation is not casting" mantra, however.
Guido's "Optional Static Typing: Stop the Flames" Blog Entry
A typical simple use case of adaptation would be:
def f(X):
X = adapt(X, Y)
# continue by using X according to protocol Y
In [4], the BDFL has proposed introducing the syntax:
def f(X: Y):
# continue by using X according to protocol Y
to be a handy shortcut for exactly this typical use of adapt, and,
as a basis for experimentation until the parser has been modified
to accept this new syntax, a semantically equivalent decorator:
@arguments(Y)
def f(X):
# continue by using X according to protocol Y
These BDFL ideas are fully compatible with this proposal, as are
other of Guido's suggestions in the same blog.
Reference Implementation and Test Cases
The following reference implementation does not deal with classic
classes: it consider only new-style classes. If classic classes
need to be supported, the additions should be pretty clear, though
a bit messy (x.__class__ vs type(x), getting boundmethods directly
from the object rather than from the type, and so on).
-----------------------------------------------------------------
adapt.py
-----------------------------------------------------------------
class AdaptationError(TypeError):
pass
class LiskovViolation(AdaptationError):
pass
_adapter_factory_registry = {}
def registerAdapterFactory(objtype, protocol, factory):
_adapter_factory_registry[objtype, protocol] = factory
def unregisterAdapterFactory(objtype, protocol):
del _adapter_factory_registry[objtype, protocol]
def _adapt_by_registry(obj, protocol, alternate):
factory = _adapter_factory_registry.get((type(obj), protocol))
if factory is None:
adapter = alternate
else:
adapter = factory(obj, protocol, alternate)
if adapter is AdaptationError:
raise AdaptationError
else:
return adapter
def adapt(obj, protocol, alternate=AdaptationError):
t = type(obj)
# (a) first check to see if object has the exact protocol
if t is protocol:
return obj
try:
# (b) next check if t.__conform__ exists & likes protocol
conform = getattr(t, '__conform__', None)
if conform is not None:
result = conform(obj, protocol)
if result is not None:
return result
# (c) then check if protocol.__adapt__ exists & likes obj
adapt = getattr(type(protocol), '__adapt__', None)
if adapt is not None:
result = adapt(protocol, obj)
if result is not None:
return result
except LiskovViolation:
pass
else:
# (d) check if object is instance of protocol
if isinstance(obj, protocol):
return obj
# (e) last chance: try the registry
return _adapt_by_registry(obj, protocol, alternate)
-----------------------------------------------------------------
test.py
-----------------------------------------------------------------
from adapt import AdaptationError, LiskovViolation, adapt
from adapt import registerAdapterFactory, unregisterAdapterFactory
import doctest
class A(object):
'''
>>> a = A()
>>> a is adapt(a, A) # case (a)
True
'''
class B(A):
'''
>>> b = B()
>>> b is adapt(b, A) # case (d)
True
'''
class C(object):
'''
>>> c = C()
>>> c is adapt(c, B) # case (b)
True
>>> c is adapt(c, A) # a failure case
Traceback (most recent call last):
...
AdaptationError
'''
def __conform__(self, protocol):
if protocol is B:
return self
class D(C):
'''
>>> d = D()
>>> d is adapt(d, D) # case (a)
True
>>> d is adapt(d, C) # case (d) explicitly blocked
Traceback (most recent call last):
...
AdaptationError
'''
def __conform__(self, protocol):
if protocol is C:
raise LiskovViolation
class MetaAdaptingProtocol(type):
def __adapt__(cls, obj):
return cls.adapt(obj)
class AdaptingProtocol:
__metaclass__ = MetaAdaptingProtocol
@classmethod
def adapt(cls, obj):
pass
class E(AdaptingProtocol):
'''
>>> a = A()
>>> a is adapt(a, E) # case (c)
True
>>> b = A()
>>> b is adapt(b, E) # case (c)
True
>>> c = C()
>>> c is adapt(c, E) # a failure case
Traceback (most recent call last):
...
AdaptationError
'''
@classmethod
def adapt(cls, obj):
if isinstance(obj, A):
return obj
class F(object):
pass
def adapt_F_to_A(obj, protocol, alternate):
if isinstance(obj, F) and issubclass(protocol, A):
return obj
else:
return alternate
def test_registry():
'''
>>> f = F()
>>> f is adapt(f, A) # a failure case
Traceback (most recent call last):
...
AdaptationError
>>> registerAdapterFactory(F, A, adapt_F_to_A)
>>> f is adapt(f, A) # case (e)
True
>>> unregisterAdapterFactory(F, A)
>>> f is adapt(f, A) # a failure case again
Traceback (most recent call last):
...
AdaptationError
>>> registerAdapterFactory(F, A, adapt_F_to_A)
'''
doctest.testmod()
Relationship To Microsoft's QueryInterface
Although this proposal has some similarities to Microsoft's (COM)
QueryInterface, it differs by a number of aspects.
First, adaptation in this proposal is bi-directional, allowing the
interface (protocol) to be queried as well, which gives more
dynamic abilities (more Pythonic). Second, there is no special
"IUnknown" interface which can be used to check or obtain the
original unwrapped object identity, although this could be
proposed as one of those "special" blessed interface protocol
identifiers. Third, with QueryInterface, once an object supports
a particular interface it must always there after support this
interface; this proposal makes no such guarantee, since, in
particular, adapter factories can be dynamically added to the
registried and removed again later.
Fourth, implementations of Microsoft's QueryInterface must support
a kind of equivalence relation -- they must be reflexive,
symmetrical, and transitive, in specific senses. The equivalent
conditions for protocol adaptation according to this proposal
would also represent desirable properties:
# given, to start with, a successful adaptation:
X_as_Y = adapt(X, Y)
# reflexive:
assert adapt(X_as_Y, Y) is X_as_Y
# transitive:
X_as_Z = adapt(X, Z, None)
X_as_Y_as_Z = adapt(X_as_Y, Z, None)
assert (X_as_Y_as_Z is None) == (X_as_Z is None)
# symmetrical:
X_as_Z_as_Y = adapt(X_as_Z, Y, None)
assert (X_as_Y_as_Z is None) == (X_as_Z_as_Y is None)
However, while these properties are desirable, it may not be
possible to guarantee them in all cases. QueryInterface can
impose their equivalents because it dictates, to some extent, how
objects, interfaces, and adapters are to be coded; this proposal
is meant to be not necessarily invasive, usable and to "retrofit"
adaptation between two frameworks coded in mutual ignorance of
each other without having to modify either framework.
Transitivity of adaptation is in fact somewhat controversial, as
is the relationship (if any) between adaptation and inheritance.
The latter would not be controversial if we knew that inheritance
always implies Liskov substitutability, which, unfortunately we
don't. If some special form, such as the interfaces proposed in
[4], could indeed ensure Liskov substitutability, then for that
kind of inheritance, only, we could perhaps assert that if X
conforms to Y and Y inherits from Z then X conforms to Z... but
only if substitutability was taken in a very strong sense to
include semantics and pragmatics, which seems doubtful. (For what
it's worth: in QueryInterface, inheritance does not require nor
imply conformance). This proposal does not include any "strong"
effects of inheritance, beyond the small ones specifically
detailed above.
Similarly, transitivity might imply multiple "internal" adaptation
passes to get the result of adapt(X, Z) via some intermediate Y,
intrinsically like adapt(adapt(X, Y), Z), for some suitable and
automatically chosen Y. Again, this may perhaps be feasible under
suitably strong constraints, but the practical implications of
such a scheme are still unclear to this proposal's authors. Thus,
this proposal does not include any automatic or implicit
transitivity of adaptation, under whatever circumstances.
For an implementation of the original version of this proposal
which performs more advanced processing in terms of transitivity,
and of the effects of inheritance, see Phillip J. Eby's
PyProtocols [5]. The documentation accompanying PyProtocols is
well worth studying for its considerations on how adapters should
be coded and used, and on how adaptation can remove any need for
typechecking in application code.
Questions and Answers
Q: What benefit does this proposal provide?
A: The typical Python programmer is an integrator, someone who is
connecting components from various suppliers. Often, to
interface between these components, one needs intermediate
adapters. Usually the burden falls upon the programmer to
study the interface exposed by one component and required by
another, determine if they are directly compatible, or develop
an adapter. Sometimes a supplier may even include the
appropriate adapter, but even then searching for the adapter
and figuring out how to deploy the adapter takes time.
This technique enables suppliers to work with each other
directly, by implementing __conform__ or __adapt__ as
necessary. This frees the integrator from making their own
adapters. In essence, this allows the components to have a
simple dialogue among themselves. The integrator simply
connects one component to another, and if the types don't
automatically match an adapting mechanism is built-in.
Moreover, thanks to the adapter registry, a "fourth party" may
supply adapters to allow interoperation of frameworks which
are totally unaware of each other, non-invasively, and without
requiring the integrator to do anything more than install the
appropriate adapter factories in the registry at start-up.
As long as libraries and frameworks cooperate with the
adaptation infrastructure proposed here (essentially by
defining and using protocols appropriately, and calling
'adapt' as needed on arguments received and results of
call-back factory functions), the integrator's work thereby
becomes much simpler.
For example, consider SAX1 and SAX2 interfaces: there is an
adapter required to switch between them. Normally, the
programmer must be aware of this; however, with this
adaptation proposal in place, this is no longer the case --
indeed, thanks to the adapter registry, this need may be
removed even if the framework supplying SAX1 and the one
requiring SAX2 are unaware of each other.
Q: Why does this have to be built-in, can't it be standalone?
A: Yes, it does work standalone. However, if it is built-in, it
has a greater chance of usage. The value of this proposal is
primarily in standardization: having libraries and frameworks
coming from different suppliers, including the Python standard
library, use a single approach to adaptation. Furthermore:
0. The mechanism is by its very nature a singleton.
1. If used frequently, it will be much faster as a built-in.
2. It is extensible and unassuming.
3. Once 'adapt' is built-in, it can support syntax extensions
and even be of some help to a type inference system.
Q: Why the verbs __conform__ and __adapt__?
A: conform, verb intransitive
1. To correspond in form or character; be similar.
2. To act or be in accord or agreement; comply.
3. To act in accordance with current customs or modes.
adapt, verb transitive
1. To make suitable to or fit for a specific use or
situation.
Source: The American Heritage Dictionary of the English
Language, Third Edition
Backwards Compatibility
There should be no problem with backwards compatibility unless
someone had used the special names __conform__ or __adapt__ in
other ways, but this seems unlikely, and, in any case, user code
should never use special names for non-standard purposes.
This proposal could be implemented and tested without changes to
the interpreter.
Credits
This proposal was created in large part by the feedback of the
talented individuals on the main Python mailing lists and the
type-sig list. To name specific contributors (with apologies if
we missed anyone!), besides the proposal's authors: the main
suggestions for the proposal's first versions came from Paul
Prescod, with significant feedback from Robin Thomas, and we also
borrowed ideas from Marcin 'Qrczak' Kowalczyk and Carlos Ribeiro.
Other contributors (via comments) include Michel Pelletier, Jeremy
Hylton, Aahz Maruch, Fredrik Lundh, Rainer Deyke, Timothy Delaney,
and Huaiyu Zhu. The current version owes a lot to discussions
with (among others) Phillip J. Eby, Guido van Rossum, Bruce Eckel,
Jim Fulton, and Ka-Ping Yee, and to study and reflection of their
proposals, implementations, and documentation about use and
adaptation of interfaces and protocols in Python.
References and Footnotes
[1] PEP 245, Python Interface Syntax, Pelletier
http://www.python.org/peps/pep-0245.html
[2] http://www.zope.org/Wikis/Interfaces/FrontPage
[3] http://www.artima.com/weblogs/index.jsp?blogger=guido
[4] http://www.artima.com/weblogs/viewpost.jsp?thread=87182
[5] http://peak.telecommunity.com/PyProtocols.html
Copyright
This document has been placed in the public domain.