PEP 208 -- Reworking the Coercion Model

PEP:	208
Title:	Reworking the Coercion Model
Version:	$Revision: 1368 $
Author:	Neil Schemenauer <nas at arctrix.com>, Marc-André Lemburg <mal at lemburg.com>
Status:	Final
Type:	Standards Track
Created:	04-Dec-2000
Post-History:
Python-Version:	2.1

Abstract

    Many Python types implement numeric operations.  When the arguments of
    a numeric operation are of different types, the interpreter tries to
    coerce the arguments into a common type.  The numeric operation is
    then performed using this common type.  This PEP proposes a new type
    flag to indicate that arguments to a type's numeric operations should
    not be coerced.  Operations that do not support the supplied types
    indicate it by returning a new singleton object.  Types which do not
    set the type flag are handled in a backwards compatible manner.
    Allowing operations handle different types is often simpler, more
    flexible, and faster than having the interpreter do coercion.

Rationale

    When implementing numeric or other related operations, it is often
    desirable to provide not only operations between operands of one type
    only, e.g. integer + integer, but to generalize the idea behind the
    operation to other type combinations as well, e.g. integer + float.

    A common approach to this mixed type situation is to provide a method
    of "lifting" the operands to a common type (coercion) and then use
    that type's operand method as execution mechanism.  Yet, this strategy
    has a few drawbacks:

        * the "lifting" process creates at least one new (temporary)
          operand object,

        * since the coercion method is not being told about the operation
          that is to follow, it is not possible to implement operation
          specific coercion of types,

        * there is no elegant way to solve situations were a common type
          is not at hand, and

        * the coercion method will always have to be called prior to the
          operation's method itself.

    A fix for this situation is obviously needed, since these drawbacks
    make implementations of types needing these features very cumbersome,
    if not impossible.  As an example, have a look at the DateTime and
    DateTimeDelta[1] types, the first being absolute, the second
    relative.  You can always add a relative value to an absolute one,
    giving a new absolute value.  Yet, there is no common type which the
    existing coercion mechanism could use to implement that operation.

    Currently, PyInstance types are treated specially by the interpreter
    in that their numeric methods are passed arguments of different types.
    Removing this special case simplifies the interpreter and allows other
    types to implement numeric methods that behave like instance types.
    This is especially useful for extension types like ExtensionClass.

Specification

    Instead of using a central coercion method, the process of handling
    different operand types is simply left to the operation.  If the
    operation finds that it cannot handle the given operand type
    combination, it may return a special singleton as indicator.

    Note that "numbers" (anything that implements the number protocol, or
    part of it) written in Python already use the first part of this
    strategy - it is the C level API that we focus on here.

    To maintain nearly 100% backward compatibility we have to be very
    careful to make numbers that don't know anything about the new
    strategy (old style numbers) work just as well as those that expect
    the new scheme (new style numbers).  Furthermore, binary compatibility
    is a must, meaning that the interpreter may only access and use new
    style operations if the number indicates the availability of these.

    A new style number is considered by the interpreter as such if and
    only it it sets the type flag Py_TPFLAGS_CHECKTYPES.  The main
    difference between an old style number and a new style one is that the
    numeric slot functions can no longer assume to be passed arguments of
    identical type.  New style slots must check all arguments for proper
    type and implement the necessary conversions themselves.  This may seem
    to cause more work on the behalf of the type implementor, but is in
    fact no more difficult than writing the same kind of routines for an
    old style coercion slot.

    If a new style slot finds that it cannot handle the passed argument
    type combination, it may return a new reference of the special
    singleton Py_NotImplemented to the caller.  This will cause the caller
    to try the other operands operation slots until it finds a slot that
    does implement the operation for the specific type combination.  If
    none of the possible slots succeed, it raises a TypeError.

    To make the implementation easy to understand (the whole topic is
    esoteric enough), a new layer in the handling of numeric operations is
    introduced.  This layer takes care of all the different cases that need
    to be taken into account when dealing with all the possible
    combinations of old and new style numbers.  It is implemented by the
    two static functions binary_op() and ternary_op(), which are both
    internal functions that only the functions in Objects/abstract.c
    have access to.  The numeric API (PyNumber_*) is easy to adapt to
    this new layer.

    As a side-effect all numeric slots can be NULL-checked (this has to be
    done anyway, so the added feature comes at no extra cost).


    The scheme used by the layer to execute a binary operation is as
    follows:

      v        | w          | Action taken
      ---------+------------+----------------------------------
      new      | new        | v.op(v,w), w.op(v,w)
      new      | old        | v.op(v,w), coerce(v,w), v.op(v,w)
      old      | new        | w.op(v,w), coerce(v,w), v.op(v,w)
      old      | old        | coerce(v,w), v.op(v,w)

    The indicated action sequence is executed from left to right until
    either the operation succeeds and a valid result (!=
    Py_NotImplemented) is returned or an exception is raised.  Exceptions
    are returned to the calling function as-is.  If a slot returns
    Py_NotImplemented, the next item in the sequence is executed.

    Note that coerce(v,w) will use the old style nb_coerce slot methods
    via a call to PyNumber_Coerce().

    Ternary operations have a few more cases to handle:

      v   | w   | z   | Action taken
      ----+-----+-----+------------------------------------
      new | new | new | v.op(v,w,z), w.op(v,w,z), z.op(v,w,z)
      new | old | new | v.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
      old | new | new | w.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
      old | old | new | z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
      new | new | old | v.op(v,w,z), w.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
      new | old | old | v.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
      old | new | old | w.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
      old | old | old | coerce(v,w,z), v.op(v,w,z)

    The same notes as above, except that coerce(v,w,z) actually does:

        if z != Py_None:
            coerce(v,w), coerce(v,z), coerce(w,z)
        else:
            # treat z as absent variable
            coerce(v,w)


    The current implementation uses this scheme already (there's only one
    ternary slot: nb_pow(a,b,c)).

    Note that the numeric protocol is also used for some other related
    tasks, e.g. sequence concatenation.  These can also benefit from the
    new mechanism by implementing right-hand operations for type
    combinations that would otherwise fail to work.  As an example, take
    string concatenation: currently you can only do string + string.  With
    the new mechanism, a new string-like type could implement new_type +
    string and string + new_type, even though strings don't know anything
    about new_type.

    Since comparisons also rely on coercion (every time you compare an
    integer to a float, the integer is first converted to float and then
    compared...), a new slot to handle numeric comparisons is needed:

        PyObject *nb_cmp(PyObject *v, PyObject *w)

    This slot should compare the two objects and return an integer object
    stating the result.  Currently, this result integer may only be -1, 0,
    1. If the slot cannot handle the type combination, it may return a
    reference to Py_NotImplemented.  [XXX Note that this slot is still
    in flux since it should take into account rich comparisons
    (i.e. PEP 207).]

    Numeric comparisons are handled by a new numeric protocol API:

        PyObject *PyNumber_Compare(PyObject *v, PyObject *w)

    This function compare the two objects as "numbers" and return an
    integer object stating the result.  Currently, this result integer may
    only be -1, 0, 1.  In case the operation cannot be handled by the given
    objects, a TypeError is raised.

    The PyObject_Compare() API needs to adjusted accordingly to make use
    of this new API.

    Other changes include adapting some of the built-in functions (e.g.
    cmp()) to use this API as well.  Also, PyNumber_CoerceEx() will need to
    check for new style numbers before calling the nb_coerce slot.  New
    style numbers don't provide a coercion slot and thus cannot be
    explicitly coerced.

Reference Implementation

    A preliminary patch for the CVS version of Python is available through
    the Source Forge patch manager[2].

Credits

    This PEP and the patch are heavily based on work done by Marc-André
    Lemburg[3].

Copyright

    This document has been placed in the public domain.

References

    [1] http://www.lemburg.com/files/python/mxDateTime.html
    [2] http://sourceforge.net/patch/?func=detailpatch&patch_id=102652&group_id=5470
    [3] http://www.lemburg.com/files/python/CoercionProposal.html