PEP 224 -- Attribute Docstrings

PEP:	224
Title:	Attribute Docstrings
Version:	$Revision: 757 $
Author:	Marc-Andre Lemburg <mal at lemburg.com>
Status:	Rejected
Type:	Standards Track
Python-Version:	2.1
Created:	23-Aug-2000
Post-History:

Introduction

    This PEP describes the "attribute docstring" proposal for Python
    2.0.  This PEP tracks the status and ownership of this feature.
    It contains a description of the feature and outlines changes
    necessary to support the feature.  The CVS revision history of
    this file contains the definitive historical record.

Rationale

    This PEP proposes a small addition to the way Python currently
    handles docstrings embedded in Python code.

    Python currently only handles the case of docstrings which appear
    directly after a class definition, a function definition or as
    first string literal in a module.  The string literals are added
    to the objects in question under the __doc__ attribute and are
    from then on available for introspection tools which can extract
    the contained information for help, debugging and documentation
    purposes.

    Docstrings appearing in locations other than the ones mentioned
    are simply ignored and don't result in any code generation.

    Here is an example:

        class C:
            "class C doc-string"

            a = 1
            "attribute C.a doc-string (1)"

            b = 2
            "attribute C.b doc-string (2)"

    The docstrings (1) and (2) are currently being ignored by the
    Python byte code compiler, but could obviously be put to good use
    for documenting the named assignments that precede them.
    
    This PEP proposes to also make use of these cases by proposing
    semantics for adding their content to the objects in which they
    appear under new generated attribute names.

    The original idea behind this approach which also inspired the
    above example was to enable inline documentation of class
    attributes, which can currently only be documented in the class's
    docstring or using comments which are not available for
    introspection.

Implementation

    Docstrings are handled by the byte code compiler as expressions.
    The current implementation special cases the few locations
    mentioned above to make use of these expressions, but otherwise
    ignores the strings completely.

    To enable use of these docstrings for documenting named
    assignments (which is the natural way of defining e.g. class
    attributes), the compiler will have to keep track of the last
    assigned name and then use this name to assign the content of the
    docstring to an attribute of the containing object by means of
    storing it in as a constant which is then added to the object's
    namespace during object construction time.

    In order to preserve features like inheritance and hiding of
    Python's special attributes (ones with leading and trailing double
    underscores), a special name mangling has to be applied which
    uniquely identifies the docstring as belonging to the name
    assignment and allows finding the docstring later on by inspecting
    the namespace.

    The following name mangling scheme achieves all of the above:

        __doc_<attributename>__

    To keep track of the last assigned name, the byte code compiler
    stores this name in a variable of the compiling structure.  This
    variable defaults to NULL.  When it sees a docstring, it then
    checks the variable and uses the name as basis for the above name
    mangling to produce an implicit assignment of the docstring to the
    mangled name.  It then resets the variable to NULL to avoid
    duplicate assignments.

    If the variable does not point to a name (i.e. is NULL), no
    assignments are made.  These will continue to be ignored like
    before.  All classical docstrings fall under this case, so no
    duplicate assignments are done.

    In the above example this would result in the following new class
    attributes to be created:

        C.__doc_a__ == "attribute C.a doc-string (1)"
        C.__doc_b__ == "attribute C.b doc-string (2)"

    A patch to the current CVS version of Python 2.0 which implements
    the above is available on SourceForge at [1].

Caveats of the Implementation

    Since the implementation does not reset the compiling structure
    variable when processing a non-expression, e.g. a function
    definition, the last assigned name remains active until either the
    next assignment or the next occurrence of a docstring.

    This can lead to cases where the docstring and assignment may be
    separated by other expressions:

        class C:
            "C doc string"

            b = 2

            def x(self):
                "C.x doc string"
                y = 3
                return 1

            "b's doc string"

    Since the definition of method "x" currently does not reset the
    used assignment name variable, it is still valid when the compiler
    reaches the docstring "b's doc string" and thus assigns the string
    to __doc_b__.

    A possible solution to this problem would be resetting the name
    variable for all non-expression nodes in the compiler.

Possible Problems

    Even though highly unlikely, attribute docstrings could get
    accidentally concatenated to the attribute's value:

    class C:
	  x = "text" \
	  "x's docstring"

    The trailing slash would cause the Python compiler to concatenate
    the attribute value and the docstring.

    A modern syntax highlighting editor would easily make this
    accident visible, though, and by simply inserting emtpy lines
    between the attribute definition and the docstring you can avoid
    the possible concatenation completely, so the problem is
    negligible.

    Another possible problem is that of using triple quoted strings as
    a way to uncomment parts of your code. 

    If there happens to be an assignment just before the start of the
    comment string, then the compiler will treat the comment as
    docstring attribute and apply the above logic to it.

    Besides generating a docstring for an otherwise undocumented
    attribute there is no breakage.

Comments from our BDFL

    Early comments on the PEP from Guido:

        I "kinda" like the idea of having attribute docstrings (meaning
        it's not of great importance to me) but there are two things I
        don't like in your current proposal:

        1. The syntax you propose is too ambiguous: as you say,
        stand-alone string literal are used for other purposes and could
        suddenly become attribute docstrings.

        2. I don't like the access method either (__doc_<attrname>__).

    The author's reply:

        > 1. The syntax you propose is too ambiguous: as you say, stand-alone
        >    string literal are used for other purposes and could suddenly
        >    become attribute docstrings.

        This can be fixed by introducing some extra checks in the
        compiler to reset the "doc attribute" flag in the compiler
        struct.

        > 2. I don't like the access method either (__doc_<attrname>__).

        Any other name will do. It will only have to match these
        criteria:

        * must start with two underscores (to match __doc__)
        * must be extractable using some form of inspection (e.g. by using
          a naming convention which includes some fixed name part)
        * must be compatible with class inheritence (i.e. should be
          stored as attribute)

    Later on in March, Guido pronounced on this PEP in March 2001 (on
    python-dev). Here are his reasons for rejection mentioned in
    private mail to the author of this PEP:

        ...

        It might be useful, but I really hate the proposed syntax.

            a = 1
            "foo bar"
            b = 1

        I really have no way to know whether "foo bar" is a docstring
        for a or for b.

        ...
        
        You can use this convention:

            a = 1
            __doc_a__ = "doc string for a"

        This makes it available at runtime.

        > Are you completely opposed to adding attribute documentation
        > to Python or is it just the way the implementation works ? I
        > find the syntax proposed in the PEP very intuitive and many
        > other users on c.l.p and in private emails have supported it
        > at the time I wrote the PEP.

        It's not the implementation, it's the syntax.  It doesn't
        convey a clear enough coupling between the variable and the
        doc string.

Copyright

    This document has been placed in the Public Domain.

References

    [1] http://sourceforge.net/patch/?func=detailpatch&patch_id=101264&group_id=5470