PEP 225 -- Elementwise/Objectwise Operators

PEP:	225
Title:	Elementwise/Objectwise Operators
Version:	$Revision: 876 $
Author:	Huaiyu Zhu <hzhu at users.sourceforge.net>, Gregory Lielens <gregory.lielens at fft.be>
Status:	Draft
Type:	Standards Track
Python-Version:	2.1
Created:	19-Sep-2000
Post-History:

Introduction

    This PEP describes a proposal to add new operators to Python which
    are useful for distinguishing elementwise and objectwise
    operations, and summarizes discussions in the newsgroup
    comp.lang.python on this topic.  See the Credits and archives
    section at the end.  Issues discussed here include:

    - Background.
    - Description of proposed operators and implementation issues.
    - Analysis of alternatives to new operators.
    - Analysis of alternative forms.
    - Compatibility issues
    - Description of wider extensions and other related ideas.

    A substantial portion of this PEP describes ideas that do not go
    into the proposed extension.  They are presented because the
    extension is essentially syntactic sugar, so its adoption must be
    weighed against various possible alternatives.  While many
    alternatives may be better in some aspects, the current proposal
    appears to be overall advantageous.

    The issues concerning elementwise-objectwise operations extend to
    wider areas than numerical computation.  This document also
    describes how the current proposal may be integrated with more
    general future extensions.

Background

    Python provides six binary infix math operators: + - * / % **
    hereafter generically represented by "op".  They can be overloaded
    with new semantics for user-defined classes.  However, for objects
    composed of homogeneous elements, such as arrays, vectors and
    matrices in numerical computation, there are two essentially
    distinct flavors of semantics.  The objectwise operations treat
    these objects as points in multidimensional spaces.  The
    elementwise operations treat them as collections of individual
    elements.  These two flavors of operations are often intermixed in
    the same formulas, thereby requiring syntactical distinction.

    Many numerical computation languages provide two sets of math
    operators.  For example, in MatLab, the ordinary op is used for
    objectwise operation while .op is used for elementwise operation.
    In R, op stands for elementwise operation while %op% stands for
    objectwise operation.
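
    To make the two flavors concrete, the sketch below applies both
    to the same pair of 2x2 matrices stored as nested lists.  The
    helper names elementwise_mul and matrixwise_mul are illustrative
    only and are not taken from any package mentioned in this PEP.

```python
def elementwise_mul(a, b):
    """Multiply corresponding elements of two equally shaped nested lists."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def matrixwise_mul(a, b):
    """Multiply a and b as matrices (rows of a times columns of b)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]

print(elementwise_mul(a, b))   # [[0, 2], [3, 0]]
print(matrixwise_mul(a, b))    # [[2, 1], [4, 3]]
```

    The same operands and the same symbol "multiply" give different
    results, which is why formulas mixing both flavors need two
    spellings.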

    In Python, there are other methods of representation, some of
    which are already used by available numerical packages, such as

    - function:   mul(a,b)
    - method:     a.mul(b)
    - casting:    a.E*b 

    In several aspects these are not as adequate as infix operators.
    More details will be shown later, but the key points are

    - Readability: Even for moderately complicated formulas, infix
      operators are much cleaner than alternatives.

    - Familiarity: Users are familiar with ordinary math operators.

    - Implementation: New infix operators will not unduly clutter
      Python syntax.  They will greatly ease the implementation of
      numerical packages.

    While it is possible to assign current math operators to one
    flavor of semantics, there are simply not enough infix operators
    to overload for the other flavor.  It is also impossible to maintain
    visual symmetry between these two flavors if one of them does not
    contain symbols for ordinary math operators.

Proposed extension

    - Six new binary infix operators ~+ ~- ~* ~/ ~% ~** are added to
      core Python.  They parallel the existing operators + - * / % **.

    - Six augmented assignment operators ~+= ~-= ~*= ~/= ~%= ~**= are
      added to core Python.  They parallel the operators += -= *= /=
      %= **= available in Python 2.0.

    - Operator ~op retains the syntactical properties of operator op,
      including precedence.

    - Operator ~op retains the semantical properties of operator op on
      built-in number types.

    - Operator ~op raises a syntax error on non-number built-in types.
      This is temporary until the proper behavior can be agreed upon.

    - These operators are overloadable in classes with names that
      prepend "t" (for tilde) to names of ordinary math operators.
      For example, __tadd__ and __rtadd__ work for ~+ just as __add__
      and __radd__ work for +.

    - As with existing operators, the __r*__() methods are invoked when
      the left operand does not provide the appropriate method.

    It is intended that one set of op or ~op is used for elementwise
    operations, the other for objectwise operations, but it is not
    specified which version of operators stands for elementwise or
    objectwise operations, leaving the decision to applications.
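
    Since the ~+ syntax does not exist, the lookup rules above can
    only be emulated.  The following is a minimal sketch in which a
    hypothetical helper tilde_add(a, b) stands in for a ~+ b.  The
    class Vec and its choice of flavors are invented for illustration,
    and the handling of NotImplemented is an assumption modeled on how
    existing reflected methods behave, not something this proposal
    specifies.

```python
def tilde_add(a, b):
    """Emulate the proposed `a ~+ b` lookup: __tadd__ first, then __rtadd__."""
    method = getattr(type(a), "__tadd__", None)
    if method is not None:
        result = method(a, b)
        if result is not NotImplemented:
            return result
    reflected = getattr(type(b), "__rtadd__", None)
    if reflected is not None:
        result = reflected(b, a)
        if result is not NotImplemented:
            return result
    raise TypeError("unsupported operand type(s) for ~+")


class Vec:
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):       # + treats the operands as whole objects
        return Vec(self.items + other.items)

    def __tadd__(self, other):      # ~+ is the elementwise flavor in this class
        if isinstance(other, Vec):
            return Vec(x + y for x, y in zip(self.items, other.items))
        return Vec(x + other for x in self.items)

    def __rtadd__(self, other):     # used when the left operand lacks __tadd__
        return Vec(other + x for x in self.items)


print((Vec([1, 2]) + Vec([3, 4])).items)           # [1, 2, 3, 4]
print(tilde_add(Vec([1, 2]), Vec([3, 4])).items)   # [4, 6]
print(tilde_add(10, Vec([1, 2])).items)            # [11, 12]
```

    A package could equally assign the flavors the other way round;
    the lookup itself does not care.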

    The proposed implementation is to patch several files relating to
    the tokenizer, parser, grammar and compiler to duplicate the
    functionality of corresponding existing operators as necessary.
    All new semantics are to be implemented in the classes that
    overload them.

    The symbol ~ is already used in Python as the unary "bitwise not"
    operator.  Currently it is not allowed for binary operators.  The
    new operators are completely backward compatible.

Prototype Implementation

    Greg Lielens implemented the infix ~op as a patch against Python
    2.0b1 source[1].

    To allow ~ to be part of binary operators, the tokenizer would
    treat ~+ as one token.  This means that the currently valid
    expression ~+1 would be tokenized as ~+ 1 instead of ~ + 1.  The
    parser would then treat ~+ as a composite of ~ +.  The effect is
    invisible to applications.
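
    That ~+1 is a valid expression today can be checked with the
    standard tokenize module.  The sketch below runs on a current
    Python 3 interpreter rather than on the 2.0b1 source that the
    patch targets, so it shows only the existing tokenization, not
    the patched one.

```python
import io
import tokenize

source = "~+1\n"
tokens = [tok.string
          for tok in tokenize.generate_tokens(io.StringIO(source).readline)
          if tok.type in (tokenize.OP, tokenize.NUMBER)]

print(tokens)        # ['~', '+', '1']
print(eval("~+1"))   # -2, i.e. ~(+1)
```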

    Notes about current patch:

    - It does not include ~op= operators yet.

    - The ~op behaves the same as op on lists, instead of raising
      exceptions.

    These should be fixed when the final version of this proposal is
    ready.

    - It reserves xor as an infix operator with the semantics
      equivalent to:

        def __xor__(a, b):
            if not b: return a
            elif not a: return b
            else: return 0

      This preserves the true value as much as possible, and otherwise
      preserves the left hand side value if possible.

      This is done so that bitwise operators could be regarded as
      elementwise logical operators in the future (see below).

Alternatives to adding new operators

    The discussions on comp.lang.python and the python-dev mailing
    list explored many alternatives.  Some of the leading alternatives
    are listed here, using the multiplication operator as an example.

    1. Use function mul(a,b).

       Advantage:
       -  No need for new operators.

       Disadvantage: 
       - Prefix forms are cumbersome for composite formulas.
       - Unfamiliar to the intended users.
       - Too verbose for the intended users.
       - Unable to use natural precedence rules.

    2. Use method call a.mul(b)

       Advantage:
       - No need for new operators.

       Disadvantage:
       - Asymmetric for both operands.
       - Unfamiliar to the intended users.
       - Too verbose for the intended users.
       - Unable to use natural precedence rules.

    3. Use "shadow classes".  For matrix class define a shadow array
       class accessible through a method .E, so that for matrices a
       and b, a.E*b would be a matrix object that is
       elementwise_mul(a,b).

       Likewise define a shadow matrix class for arrays accessible
       through a method .M so that for arrays a and b, a.M*b would be
       an array that is matrixwise_mul(a,b).

       Advantage:
       - No need for new operators.
       - Benefits of infix operators with correct precedence rules.
       - Clean formulas in applications.

       Disadvantage:
       - Hard to maintain in current Python because ordinary numbers
         cannot have user defined class methods; i.e. a.E*b will fail
         if a is a pure number.
       - Difficult to implement, as this will interfere with existing
         method calls, like .T for transpose, etc.
       - Runtime overhead of object creation and method lookup.
       - The shadowing class cannot replace a true class, because it
         does not return its own type.  So there needs to be an M
         class with a shadow E class, and an E class with a shadow M
         class.
       - Unnatural to mathematicians.
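
       A minimal sketch of the shadow class idea follows.  The class
       names Matrix and _ElementwiseShadow are hypothetical, and .E
       is written as a property here although the text calls it a
       method.  The sketch also shows the first disadvantage: a pure
       number has no .E attribute.

```python
class Matrix:
    """Tiny matrix on nested lists; * is the matrixwise product."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, other):
        cols = list(zip(*other.rows))
        return Matrix([[sum(x * y for x, y in zip(r, c)) for c in cols]
                       for r in self.rows])

    @property
    def E(self):
        # Each access builds a new shadow object (the runtime overhead noted above).
        return _ElementwiseShadow(self)


class _ElementwiseShadow:
    """Shadow of a Matrix whose * is elementwise; the result is a Matrix again."""

    def __init__(self, matrix):
        self._matrix = matrix

    def __mul__(self, other):
        return Matrix([[x * y for x, y in zip(ra, rb)]
                       for ra, rb in zip(self._matrix.rows, other.rows)])


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])

print((a * b).rows)     # [[2, 1], [4, 3]]
print((a.E * b).rows)   # [[0, 2], [3, 0]]

try:
    (2).E * b           # a pure number cannot be given the shadow attribute
except AttributeError as exc:
    print("fails for a pure number:", exc)
```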

    4. Implement matrixwise and elementwise classes with easy casting
       to the other class.  So matrixwise operations for arrays would
       be like a.M*b.M and elementwise operations for matrices would
       be like a.E*b.E.  For error detection a.E*b.M would raise
       exceptions.

       Advantage:
       - No need for new operators.
       - Similar to infix notation with correct precedence rules.

       Disadvantage:
       - Similar difficulty due to lack of user-methods for pure numbers.
       - Runtime overhead of object creation and method lookup.
       - More cluttered formulas
       - Switching of flavor of objects to facilitate operators
         becomes persistent.  This introduces long range context
         dependencies in application code that would be extremely hard
         to maintain.

    5. Use a mini-parser to parse formulas written in arbitrary
       extensions placed in quoted strings.

       Advantage:
       - Pure Python, without new operators

       Disadvantage:
       - The actual syntax is within the quoted string, which does not
         resolve the problem itself.
       - Introducing zones of special syntax.
       - Demanding on the mini-parser.

    6. Introduce a single operator, such as @, for matrix
       multiplication.

       Advantage:
       - Introduces fewer operators

       Disadvantage:
       - The distinctions for operators like + - ** are equally
         important.  Their meaning in matrix or array-oriented
         packages would be reversed (see below).
       - The new operator occupies a special character.
       - This does not work well with more general object-element issues.

    Among these alternatives, the first and second are used in current
    applications to some extent, but are found inadequate.  The third
    is the most favored for applications, but it would incur huge
    implementation complexity.  The fourth would make application
    code very context-sensitive and hard to maintain.  The third and
    fourth also share significant implementation difficulties due to
    the current type/class split.  The fifth appears to create more
    problems than it would solve.  The sixth does not cover the same
    range of applications.

Alternative forms of infix operators

    Two major forms and several minor variants of new infix operators
    were discussed:

    - Bracketed form

        (op)
        [op]
        {op}
        <op>
        :op:
        ~op~
        %op%

    - Meta character form

        .op
        @op
        ~op

      Alternatively the meta character is put after the operator.

    - Less consistent variations of these themes.  These are
      viewed unfavorably.  For completeness some are listed here:

        - Use @/ and /@ for left and right division
        - Use [*] and (*) for outer and inner products
        - Use a single operator @ for multiplication.

    - Use __call__ to simulate multiplication.
      a(b)  or (a)(b)

    Criteria for choosing among the representations include:

        - No syntactical ambiguities with existing operators.  

        - Higher readability in actual formulas.  This makes the
          bracketed forms unfavorable.  See examples below.

        - Visually similar to existing math operators.

        - Syntactically simple, without blocking possible future
          extensions.

    With these criteria the overall winner in bracket form appears to
    be {op}.  A clear winner in the meta character form is ~op.
    Comparing these it appears that ~op is the favorite among them
    all.

    Some analyses are as follows:

        - The .op form is ambiguous: 1.+a would be different from 1 .+a

        - The bracket type operators are most favorable when standing
          alone, but not in formulas, as they interfere with visual
          parsing of parentheses for precedence and function
          arguments.  This is so for (op) and [op], and somewhat less
          so for {op} and <op>.

        - The <op> form has the potential to be confused with < > and =

        - The @op is not favored because @ is visually heavy (dense,
          more like a letter): a@+b is more readily read as a@ + b
          than a @+ b.

        - For choosing meta-characters: Most of the existing ASCII
          symbols have already been used.  The only three unused are
          @ $ ?.
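
    The ambiguity of the .op form comes from the way a trailing dot
    binds to a numeric literal.  A small check with the standard
    tokenize module (run on Python 3; the helper name lex is
    illustrative) shows the two readings:

```python
import io
import tokenize


def lex(source):
    """Return (token name, text) pairs for the numbers, names and operators."""
    keep = (tokenize.NUMBER, tokenize.NAME, tokenize.OP)
    reader = io.StringIO(source + "\n").readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(reader)
            if tok.type in keep]


print(lex("1.+a"))    # [('NUMBER', '1.'), ('OP', '+'), ('NAME', 'a')]
print(lex("1 .+a"))   # [('NUMBER', '1'), ('OP', '.'), ('OP', '+'), ('NAME', 'a')]
```

    In the first spelling the dot is consumed by the float literal
    1., so no .+ operator is ever seen.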

Semantics of new operators

    There are convincing arguments for using either set of operators
    as objectwise or elementwise.  Some of them are listed here:

    1. op for element, ~op for object

       - Consistent with current multiarray interface of Numeric package
       - Consistent with some other languages
       - Perception that elementwise operations are more natural
       - Perception that elementwise operations are used more frequently

    2. op for object, ~op for element

       - Consistent with current linear algebra interface of MatPy package
       - Consistent with some other languages
       - Perception that objectwise operations are more natural
       - Perception that objectwise operations are used more frequently
       - Consistent with the current behavior of operators on lists
       - Allow ~ to be a general elementwise meta-character in future
         extensions.

    It is generally agreed upon that

       - there is no absolute reason to favor one or the other
       - it is easy to cast from one representation to another in a
         sizable chunk of code, so the other flavor of operators is
         always in the minority
       - there are other semantic differences that favor the existence
         of array-oriented and matrix-oriented packages, even if their
         operators are unified
       - whatever decision is taken, code using existing interfaces
         should not be broken for a very long time.

    Therefore not much is lost, and much flexibility retained, if the
    semantic flavors of these two sets of operators are not dictated
    by the core language.  The application packages are responsible
    for making the most suitable choice.  This is already the case for
    NumPy and MatPy which use opposite semantics.  Adding new
    operators will not break this.  See also observation after
    subsection 2 in the Examples below.

    The issue of numerical precision was raised, but if the semantics
    are left to the applications, the actual precision should also be
    decided there.

Examples

    Following are examples of the actual formulas that will appear
    using various operators or other representations described above.

    1. The matrix inversion formula:

       - Using op for object and ~op for element:

         b = a.I - a.I * u / (c.I + v/a*u) * v / a

         b = a.I - a.I * u * (c.I + v*a.I*u).I * v * a.I

       - Using op for element and ~op for object:

         b = a.I @- a.I @* u @/ (c.I @+ v@/a@*u) @* v @/ a

         b = a.I ~- a.I ~* u ~/ (c.I ~+ v~/a~*u) ~* v ~/ a

         b = a.I (-) a.I (*) u (/) (c.I (+) v(/)a(*)u) (*) v (/) a

         b = a.I [-] a.I [*] u [/] (c.I [+] v[/]a[*]u) [*] v [/] a

         b = a.I <-> a.I <*> u </> (c.I <+> v</>a<*>u) <*> v </> a

         b = a.I {-} a.I {*} u {/} (c.I {+} v{/}a{*}u) {*} v {/} a

       Observation: For linear algebra using op for object is preferable.

       Observation: The ~op type operators look better than (op) type
       in complicated formulas.

       - using named operators

         b = a.I @sub a.I @mul u @div (c.I @add v @div a @mul u) @mul v @div a

         b = a.I ~sub a.I ~mul u ~div (c.I ~add v ~div a ~mul u) ~mul v ~div a

       Observation: Named operators are not suitable for math formulas.
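
       As a sanity check of the two objectwise spellings in this
       example, the sketch below evaluates them in the scalar case,
       where every matrix is 1x1 and .I is simply the reciprocal.
       Reading the formula as the Woodbury identity for the inverse of
       a + u*c*v is an assumption made here; the values 2, 3, 5 and 7
       are arbitrary.

```python
from fractions import Fraction

# 1x1 "matrices": every quantity is a scalar, so .I is just the reciprocal.
a, u, c, v = (Fraction(n) for n in (2, 3, 5, 7))


def inv(x):
    return 1 / x


# b = a.I - a.I * u * (c.I + v*a.I*u).I * v * a.I
b_inverse_form = inv(a) - inv(a) * u * inv(inv(c) + v * inv(a) * u) * v * inv(a)

# b = a.I - a.I * u / (c.I + v/a*u) * v / a
b_division_form = inv(a) - inv(a) * u / (inv(c) + v / a * u) * v / a

print(b_inverse_form)                          # 1/107
print(b_inverse_form == b_division_form)       # True
print(b_inverse_form == inv(a + u * c * v))    # True
```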

    2. Plotting a 3d graph

       - Using op for object and ~op for element:

         z = sin(x~**2 ~+ y~**2);    plot(x,y,z)

       - Using op for element and ~op for object:

         z = sin(x**2 + y**2);   plot(x,y,z)

       Observation: Elementwise operations with broadcasting allow a
       much more efficient implementation than MatLab.

       Observation: It is useful to have two related classes with the
       semantics of op and ~op swapped.  Using these the ~op
       operators would only need to appear in chunks of code where
       the other flavor dominates, while maintaining consistent
       semantics of the code.

    3. Using + and - with automatic broadcasting

         a = b - c;  d = a.T*a

       Observation: This would silently produce hard-to-trace bugs if
       one of b or c is a row vector while the other is a column vector.
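
       The failure mode can be reproduced without any numerical
       package.  The sketch below assumes a broadcasting rule in
       which any dimension of length 1 is stretched to match the
       other operand; broadcast_sub and shape are illustrative
       helpers, and no shape validation is attempted.

```python
def broadcast_sub(b, c):
    """Subtract nested-list matrices, stretching any dimension of size 1."""
    rows = max(len(b), len(c))
    cols = max(len(b[0]), len(c[0]))

    def pick(m, i, j):
        return m[i if len(m) > 1 else 0][j if len(m[0]) > 1 else 0]

    return [[pick(b, i, j) - pick(c, i, j) for j in range(cols)]
            for i in range(rows)]


def shape(m):
    return (len(m), len(m[0]))


column = [[1], [2], [3]]      # 3x1
row = [[1, 2, 3]]             # 1x3, e.g. a vector that was never transposed

print(shape(broadcast_sub(column, column)))   # (3, 1): the intended difference
print(shape(broadcast_sub(column, row)))      # (3, 3): no error is raised
```

       With the 3x1 result a.T*a is a 1x1 matrix; with the 3x3
       result it is a 3x3 matrix, and nothing signals the mistake.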

Miscellaneous issues

    - Need for the ~+ ~- operators.  The objectwise + - are important
      because they provide important sanity checks as per linear
      algebra.  The elementwise + - are important because they allow
      broadcasting that is very efficient in applications.

    - Left division (solve).  For matrices, a*x is not necessarily
      equal to x*a.  The solution of a*x==b, denoted x=solve(a,b), is
      therefore different from the solution of x*a==b, denoted
      x=div(b,a).  There are discussions about finding a new symbol
      for solve.  [Background: MatLab uses b/a for div(b,a) and a\b
      for solve(a,b).]

      It is recognized that Python provides a better solution without
      requiring a new symbol: the inverse method .I can be made to be
      delayed so that a.I*b and b*a.I are equivalent to MatLab's a\b
      and b/a.  The implementation is quite simple and the resulting
      application code clean.
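
      One way the delayed inverse could look is sketched below for
      2x2 matrices.  The names Mat, _DelayedInverse and _solve are
      hypothetical, and the 2x2 Cramer's rule solver is a stand-in
      for whatever solver a real package would call.

```python
from fractions import Fraction


def _solve(a, b):
    """Solve a*x == b for 2x2 nested lists by Cramer's rule."""
    (p, q), (r, s) = a
    det = Fraction(p * s - q * r)
    return [[(s * b[0][j] - q * b[1][j]) / det for j in range(2)],
            [(p * b[1][j] - r * b[0][j]) / det for j in range(2)]]


def _transpose(m):
    return [list(col) for col in zip(*m)]


class Mat:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    @property
    def I(self):
        return _DelayedInverse(self)      # nothing is computed yet

    def __mul__(self, other):
        if isinstance(other, _DelayedInverse):
            # b * a.I: solve x*a == b via the transposed system a^T x^T == b^T.
            xt = _solve(_transpose(other.base.rows), _transpose(self.rows))
            return Mat(_transpose(xt))
        cols = list(zip(*other.rows))
        return Mat([[sum(x * y for x, y in zip(r, c)) for c in cols]
                    for r in self.rows])


class _DelayedInverse:
    def __init__(self, base):
        self.base = base

    def __mul__(self, b):
        # a.I * b: solve a*x == b.
        return Mat(_solve(self.base.rows, b.rows))


a = Mat([[2, 1], [1, 1]])
b = Mat([[1, 2], [3, 4]])

x = a.I * b      # plays the role of a\b
y = b * a.I      # plays the role of b/a
print((a * x).rows == b.rows, (y * a).rows == b.rows)   # True True
```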

    - Power operator.  Python's use of a**b as pow(a,b) has two
      perceived disadvantages:

      - Most mathematicians are more familiar with a^b for this purpose.
      - It results in long augmented assignment operator ~**=.

      However, this issue is distinct from the main issue here.

    - Additional multiplication operators.  Several forms of
      multiplication are used in (multi-)linear algebra.  Most can be
      seen as variations of multiplication in the linear algebra sense
      (such as Kronecker product).  But two forms appear to be more
      fundamental: outer product and inner product.  However, their
      specification includes indices, which can be either

      - associated with the operator, or
      - associated with the objects.

      The latter (the Einstein notation) is used extensively on paper,
      and is also the easier one to implement.  By implementing a
      tensor-with-indices class, a general form of multiplication
      would cover both outer and inner products, and specialize to
      linear algebra multiplication as well.  The index rule can be
      defined as class methods, like,

          a = b.i(1,2,-1,-2) * c.i(4,-2,3,-1)   # a_ijkl = b_ijmn c_lnkm

      Therefore one objectwise multiplication is sufficient.
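
      A rough sketch of such a tensor-with-indices class follows.
      It reads the example above as: positive labels name the axes
      of the result in increasing order, and negative labels are
      summed over.  That reading, the class names Tensor and
      _Indexed, and the dict-based storage are all assumptions made
      for illustration.

```python
from itertools import product


class Tensor:
    """Dense tensor stored as a dict from index tuples to numbers."""

    def __init__(self, shape, values):
        self.shape = tuple(shape)
        self.values = dict(values)

    def i(self, *labels):
        if len(labels) != len(self.shape):
            raise ValueError("one label per axis is required")
        return _Indexed(self, labels)


class _Indexed:
    def __init__(self, tensor, labels):
        self.tensor, self.labels = tensor, labels

    def __mul__(self, other):
        size = {}
        for operand in (self, other):
            for label, n in zip(operand.labels, operand.tensor.shape):
                if size.setdefault(label, n) != n:
                    raise ValueError("axis length mismatch")
        free = sorted(l for l in size if l > 0)      # axes of the result
        summed = sorted(l for l in size if l < 0)    # axes summed away
        out = {}
        for idx in product(*(range(size[l]) for l in free + summed)):
            at = dict(zip(free + summed, idx))
            term = 1
            for operand in (self, other):
                term *= operand.tensor.values[tuple(at[l] for l in operand.labels)]
            key = tuple(at[l] for l in free)
            out[key] = out.get(key, 0) + term
        return Tensor([size[l] for l in free], out)


def matrix(rows):
    return Tensor((len(rows), len(rows[0])),
                  {(r, c): v for r, row in enumerate(rows) for c, v in enumerate(row)})


b = matrix([[1, 2], [3, 4]])
c = matrix([[0, 1], [1, 0]])

prod = b.i(1, -1) * c.i(-1, 2)       # linear algebra multiplication
print([[prod.values[r, k] for k in range(2)] for r in range(2)])   # [[2, 1], [4, 3]]

inner = b.i(-1, -2) * c.i(-1, -2)    # inner product: every index summed
print(inner.values[()])              # 5

outer = b.i(1, 2) * c.i(3, 4)        # outer product: no index summed
print(outer.shape)                   # (2, 2, 2, 2)
```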

    - Bitwise operators. 

      - The proposed new math operators use the symbol ~ that is the
        "bitwise not" operator.  This poses no compatibility problem
        but somewhat complicates implementation.

      - The symbol ^ might be better used for pow than bitwise xor.
        But this depends on the future of bitwise operators.  It does
        not immediately impact on the proposed math operator.

      - The symbol | was suggested to be used for matrix solve.  But
        the new solution of using delayed .I is better in several
        ways.

      - The current proposal fits in a larger and more general
        extension that will remove the need for special bitwise
        operators.  (See elementization below.)

    - Alternative to special operator names used in definition,

          def "+"(a, b)      in place of       def __add__(a, b)

      This appears to require greater syntactical change, and would
      only be useful when arbitrary additional operators are allowed.

Impact on general elementization

    The distinction between objectwise and elementwise operations is
    meaningful in other contexts as well, where an object can be
    conceptually regarded as a collection of elements.  It is
    important that the current proposal does not preclude possible
    future extensions.

    One general future extension is to use ~ as a meta operator to
    "elementize" a given operator.  Several examples are listed here:

    1. Bitwise operators.  Currently Python assigns six operators to
       bitwise operations: and (&), or (|), xor (^), complement (~),
       left shift (<<) and right shift (>>), with their own precedence
       levels.

       Among them, the & | ^ ~ operators can be regarded as
       elementwise versions of lattice operators applied to integers
       regarded as bit strings.

           5 and 6                # 6
           5 or 6                 # 5

           5 ~and 6               # 4
           5 ~or 6                # 7

       These can be regarded as general elementwise lattice operators,
       not restricted to bits in integers.

       In order to have named operators for xor ~xor, it is necessary
       to make xor a reserved word.
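
       The correspondence can be checked in current Python by
       applying "and" and "or" bit by bit.  The helpers bits and
       from_bits are illustrative names, and a width of four bits is
       an arbitrary choice that is enough for 5 and 6.

```python
def bits(n, width=4):
    """Bits of a non-negative integer, most significant first."""
    return [(n >> k) & 1 for k in reversed(range(width))]


def from_bits(bs):
    value = 0
    for bit in bs:
        value = (value << 1) | bit
    return value


a, b = bits(5), bits(6)       # [0, 1, 0, 1] and [0, 1, 1, 0]

elementwise_and = [x and y for x, y in zip(a, b)]
elementwise_or = [x or y for x, y in zip(a, b)]

print(from_bits(elementwise_and), 5 & 6)   # 4 4
print(from_bits(elementwise_or), 5 | 6)    # 7 7
```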

    2. List arithmetics. 

           [1, 2] + [3, 4]        # [1, 2, 3, 4]
           [1, 2] ~+ [3, 4]       # [4, 6]

           ['a', 'b'] * 2         # ['a', 'b', 'a', 'b']
           'ab' * 2               # 'abab'

           ['a', 'b'] ~* 2        # ['aa', 'bb']
           [1, 2] ~* 2            # [2, 4]

       It is also consistent with the Cartesian product

           [1,2]*[3,4]            # [(1,3),(1,4),(2,3),(2,4)]
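
       None of the ~ forms above is valid today, but the intended
       results can be reproduced with a small helper.  The function
       name elementwise and its scalar broadcasting rule are
       assumptions for illustration; the Cartesian product is shown
       with itertools.product, since [1,2]*[3,4] raises TypeError in
       current Python.

```python
import operator
from itertools import product


def elementwise(op, a, b):
    """Apply op to paired elements; a non-list operand is broadcast."""
    if not isinstance(b, list):
        return [op(x, b) for x in a]
    if not isinstance(a, list):
        return [op(a, y) for y in b]
    return [op(x, y) for x, y in zip(a, b)]


print(elementwise(operator.add, [1, 2], [3, 4]))   # [4, 6]
print(elementwise(operator.mul, ['a', 'b'], 2))    # ['aa', 'bb']
print(elementwise(operator.mul, [1, 2], 2))        # [2, 4]
print(list(product([1, 2], [3, 4])))               # [(1, 3), (1, 4), (2, 3), (2, 4)]
```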

    3. List comprehension.

           a = [1, 2]; b = [3, 4]
           ~f(a,b)                # [f(x,y) for x, y in zip(a,b)]
           ~f(a*b)                # [f(x,y) for x in a for y in b]
           a ~+ b                 # [x + y for x, y in zip(a,b)]

    4. Tuple generation (the zip function in Python 2.0)

          [1, 2, 3], [4, 5, 6]   # ([1, 2, 3], [4, 5, 6])
          [1, 2, 3]~,[4, 5, 6]   # [(1, 4), (2, 5), (3, 6)]

    5. Using ~ as generic elementwise meta-character to replace map

          ~f(a, b)               # map(f, a, b)
          ~~f(a, b)              # map(lambda *x:map(f, *x), a, b)

       More generally,

          def ~f(*x): return map(f, *x)
          def ~~f(*x): return map(~f, *x)
          ...
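
       In current Python the meta-character can be approximated by an
       ordinary higher-order function.  The name elementize is
       hypothetical; list() is needed because map returns a lazy
       iterator in Python 3.

```python
def elementize(f):
    """Return the function that the text writes as ~f."""
    def mapped(*sequences):
        return list(map(f, *sequences))
    return mapped


def add(x, y):
    return x + y


print(elementize(add)([1, 2], [3, 4]))     # [4, 6]

nested = elementize(elementize(add))       # the text's ~~f
print(nested([[1, 2], [3, 4]], [[10, 20], [30, 40]]))   # [[11, 22], [33, 44]]
```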

    6. Elementwise format operator (with broadcasting)

          a = [1,2,3,4,5]
          print ["%5d "] ~% a 
          a = [[1,2],[3,4]]
          print ["%5d "] ~~% a

    7. Rich comparison

          [1, 2, 3]  ~< [3, 2, 1]  # [1, 0, 0]
          [1, 2, 3] ~== [3, 2, 1]  # [0, 1, 0]

    8. Rich indexing

          [a, b, c, d] ~[2, 3, 1]  # [c, d, b]
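
       The results shown for items 7 and 8 correspond to the
       following list comprehensions in current Python.  The letters
       are taken as strings here, and indexing is zero-based as in
       the example above.

```python
a = [1, 2, 3]
b = [3, 2, 1]

less = [int(x < y) for x, y in zip(a, b)]
equal = [int(x == y) for x, y in zip(a, b)]
print(less, equal)        # [1, 0, 0] [0, 1, 0]

letters = ['a', 'b', 'c', 'd']
picked = [letters[k] for k in [2, 3, 1]]
print(picked)             # ['c', 'd', 'b']
```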

    9. Tuple flattening

          a = (1,2);  b = (3,4)
          f(~a, ~b)                # f(1,2,3,4)      

    10. Copy operator

          a ~= b                   # a = b.copy()

        There can be specific levels of deep copy

          a ~~= b                  # a = b.copy(2)
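
        For comparison, items 9 and 10 can be approximated in
        current Python with argument unpacking and the standard copy
        module.  Treating ~= as a one-level copy is an assumption
        about the text's b.copy(), and copy.deepcopy copies every
        level rather than a chosen number of levels as in b.copy(2).

```python
import copy


def f(*args):
    return args


a = (1, 2)
b = (3, 4)
print(f(*a, *b))          # (1, 2, 3, 4)

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)      # copies the outer list only
deep = copy.deepcopy(original)     # copies every level

original[0].append(99)
print(shallow[0])         # [1, 2, 99]: the inner lists are still shared
print(deep[0])            # [1, 2]
```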

    Notes:

    1. There are probably many other similar situations.  This general
       approach seems well suited for most of them, in place of
       several separated extensions for each of them (parallel and
       cross iteration, list comprehension, rich comparison, etc).

    2. The semantics of "elementwise" depends on applications.  For
       example, an element of matrix is two levels down from the
       list-of-list point of view.  This requires more fundamental
       change than the current proposal.  In any case, the current
       proposal will not negatively impact on future possibilities of
       this nature.

    Note that this section describes a type of future extension that
    is consistent with the current proposal, but may present
    additional compatibility or other problems.  Such extensions are
    not tied to the current proposal.

Impact on named operators

    The discussions made it generally clear that infix operators are a
    scarce resource in Python, not only in numerical computation, but
    in other fields as well.  Several proposals and ideas were put
    forward that would allow infix operators to be introduced in ways
    similar to named functions.  We show here that the current
    extension does not negatively impact on future extensions in this
    regard.

    1. Named infix operators.

        Choose a meta character, say @, so that for any identifier
        "opname", the combination "@opname" would be a binary infix
        operator, and

        a @opname b == opname(a,b)

        Other representations mentioned include .name ~name~ :name:
        (.name) %name% and similar variations.  The pure bracket based
        operators cannot be used this way.

        This requires a change in the parser to recognize @opname, and
        parse it into the same structure as a function call.  The
        precedence of all these operators would have to be fixed at
        one level, so the implementation would be different from
        additional math operators which keep the precedence of
        existing math operators.

        The current proposed extension does not limit possible future
        extensions of such form in any way.
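
        Without any parser change, a named infix operator can be
        imitated today by overloading an existing operator.  The
        sketch below uses | on a wrapper class so that a |dot| b
        calls dot(a, b).  The class name Infix and the function dot
        are illustrative.  As the text anticipates, every operator
        built this way shares one fixed precedence, here that of |.

```python
class Infix:
    """Wrap a two-argument function so that  a |name| b  calls name(a, b)."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Handles `left | self`: remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Handles `(left | self) | right`: supply the right operand.
        return self.func(right)


@Infix
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


print([1, 2, 3] |dot| [4, 5, 6])    # 32
```

        The trick relies on the left operand not handling | itself,
        so it is a workaround rather than a substitute for real
        syntax.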

    2. More general symbolic operators.

        One additional form of future extension is to use meta
        character and operator symbols (symbols that cannot be used in
        syntactical structures other than operators).  Suppose @ is
        the meta character.  Then

            a + b,    a @+ b,    a @@+ b,  a @+- b

        would all be operators with a hierarchy of precedence, defined by

            def "+"(a, b)
            def "@+"(a, b)
            def "@@+"(a, b)
            def "@+-"(a, b)

        One advantage compared with named operators is greater
        flexibility for precedences based on either the meta character
        or the ordinary operator symbols.  This also allows operator
        composition.  The disadvantage is that they are more like
        "line noise".  In any case the current proposal does not
        impact its future possibility.

        These kinds of future extensions may not be necessary when
        Unicode becomes generally available.

        Note that this section discusses compatibility of the proposed
        extension with possible future extensions.  The desirability
        or compatibility of these other extensions themselves is
        specifically not considered here.

Credits and archives

    The discussions mostly happened in July to August of 2000 on the
    newsgroup comp.lang.python and the mailing list python-dev.  There
    are altogether several hundred postings, most of which can be
    retrieved from these two pages (by searching for the word
    "operator"):

        http://www.python.org/pipermail/python-list/2000-July/
        http://www.python.org/pipermail/python-list/2000-August/

    The names of contributors are too numerous to mention here;
    suffice it to say that a large proportion of ideas discussed here
    are not our own.

    Several key postings (from our point of view) that may help to
    navigate the discussions include:

        http://www.python.org/pipermail/python-list/2000-July/108893.html
        http://www.python.org/pipermail/python-list/2000-July/108777.html
        http://www.python.org/pipermail/python-list/2000-July/108848.html
        http://www.python.org/pipermail/python-list/2000-July/109237.html
        http://www.python.org/pipermail/python-list/2000-July/109250.html
        http://www.python.org/pipermail/python-list/2000-July/109310.html
        http://www.python.org/pipermail/python-list/2000-July/109448.html
        http://www.python.org/pipermail/python-list/2000-July/109491.html
        http://www.python.org/pipermail/python-list/2000-July/109537.html
        http://www.python.org/pipermail/python-list/2000-July/109607.html
        http://www.python.org/pipermail/python-list/2000-July/109709.html
        http://www.python.org/pipermail/python-list/2000-July/109804.html
        http://www.python.org/pipermail/python-list/2000-July/109857.html
        http://www.python.org/pipermail/python-list/2000-July/110061.html
        http://www.python.org/pipermail/python-list/2000-July/110208.html
        http://www.python.org/pipermail/python-list/2000-August/111427.html
        http://www.python.org/pipermail/python-list/2000-August/111558.html
        http://www.python.org/pipermail/python-list/2000-August/112551.html
        http://www.python.org/pipermail/python-list/2000-August/112606.html
        http://www.python.org/pipermail/python-list/2000-August/112758.html

        http://www.python.org/pipermail/python-dev/2000-July/013243.html
        http://www.python.org/pipermail/python-dev/2000-July/013364.html
        http://www.python.org/pipermail/python-dev/2000-August/014940.html

    These are earlier drafts of this PEP:

        http://www.python.org/pipermail/python-list/2000-August/111785.html
        http://www.python.org/pipermail/python-list/2000-August/112529.html
        http://www.python.org/pipermail/python-dev/2000-August/014906.html

    There is an alternative PEP (officially, PEP 211) by Greg Wilson,
    titled "Adding New Linear Algebra Operators to Python".

    Its first (and current) version is at:

        http://www.python.org/pipermail/python-dev/2000-August/014876.html
        http://www.python.org/peps/pep-0211.html

Additional References

    [1] http://MatPy.sourceforge.net/Misc/index.html