Python Programming Environment - Session Overview
Ken Manheimer, ken.manheimer@nist.gov, 19-May-1995
Session Purpose
As a Python programmer, i need tools that will help me capitalize on
python's versatility, and on the growing, increasingly diverse
contingent of software libraries. As Python grows, i wish to help
promote the development of such tools. That's what i see as the focus
of this session.
(Two personal traits particularly shape my own programming environment
concerns:
- I'm sorta lazy - like most people, i want to work effectively,
and avoid unnecessary work.
- My memory for names, locations, and such, is pretty feeble,
particularly when compared with the reliability of the computer's
"memory". So i would like, as much as possible, to depend on the
computer to keep track of the location of things...
So my interests tend towards tools and conventions that help organize
and track new and existing software components and tools, to help me
find and reuse what's already there.)
Some Current Programming Environment Issues
The following list is not meant to be a comprehensive survey of the
important issues in software management tools. Rather, it includes
some items that i see as being of current relevance. Suggestions of
other items to discuss are welcome.
(The list specifically includes a few issues that i promised to pursue
during the last workshop. Other items are motivated partly by my own
continuing software mgmt tools concerns and interests, and partly by
suggestions and questions that i have seen go by on the list.)
Proposals that we discussed at the last workshop have made their way,
with refinements, into the language. Also, a new implementation of
the 'dir()' builtin has been added to the language as an alternative,
within the module 'newdir.py'. Ultimately, these are the sorts of
things that are necessary in the development of a comprehensive object
examination system, as part of an environment browser.
Docstrings
Docstrings are brief descriptive strings that can be associated with
many of the complex python objects, by assigning to those objects'
'__doc__' attribute. In addition, the definition syntax for callables
(functions, methods, and classes) and modules includes optional, formal
slots for docstring declarations. In particular, when the first
statement in the body of a callable or a module is a string literal,
that string becomes the value of that object's '__doc__' attribute.
We have only started to formulate conventions about the format of
docstrings. As with GNU emacs docstrings, we suggest that the first
line of the string be a self-contained, very terse description of the
purpose of the object, and subsequent lines describe the public
properties and behaviors of the object.
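A minimal sketch of the suggested layout: a terse, self-contained first line, then the public details. The names count_words, Mailbox, summary, and undocumented are hypothetical, made up for this illustration.

```python
def count_words(text):
    """Return the number of whitespace-separated words in text.

    Leading and trailing whitespace is ignored; an empty string
    yields 0.
    """
    return len(text.split())


class Mailbox:
    """Hold messages in arrival order.

    Public methods: add(msg) appends a message; count() returns
    how many messages are held.
    """

    def __init__(self):
        self.messages = []

    def add(self, msg):
        """Append msg to the mailbox."""
        self.messages.append(msg)

    def count(self):
        """Return the number of messages held."""
        return len(self.messages)


def summary(obj):
    """Return the first line of obj's docstring, or '' if it has none."""
    doc = obj.__doc__ or ""
    return doc.strip().split("\n")[0]


# A docstring can also be attached after the fact, by plain assignment.
def undocumented():
    pass

undocumented.__doc__ = "Do nothing; exists only to show __doc__ assignment."

print(summary(count_words))
print(summary(Mailbox))
print(summary(undocumented))
```

Keeping the first line self-contained is what lets a tool like summary() show something useful without reading the rest of the string.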
For an extensive example of the use of docstrings, see the newimp.py
module which is in the distribution library for Python 1.2.
Where to go from here with docstrings? Several items suggest
themselves:
- Use them!
- Organize an effort to create docstrings for existing important
objects, like the sys module and its components.
- Identify format conventions for docstrings that make them most
informative (but, also, do not make them cumbersome!).
- Incorporate recognition of docstrings in existing python
programming tools, like debuggers, attribute identifiers ('dir()'), etc.
- Identify other object aspects needed as the basis for
comprehensive programming environment browsers.
Of course, there are more. Suggestions?
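As one illustration of docstring-aware tooling, here is a small sketch of a 'dir()'-like helper that pairs each public attribute name with the first line of its docstring. The helper name describe and the class Counter are hypothetical; this is not how 'newdir.py' works, just one possible shape for such a tool.

```python
def describe(obj):
    """Return {name: first docstring line} for obj's public attributes."""
    result = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and special names
        value = getattr(obj, name)
        # Plain data values report the docstring of their type, which
        # is harmless here but worth knowing about.
        doc = getattr(value, "__doc__", None) or ""
        result[name] = doc.strip().split("\n")[0]
    return result


class Counter:
    """Count upward from zero."""

    def bump(self):
        """Increase the count by one and return it."""
        self.count = getattr(self, "count", 0) + 1
        return self.count

    def reset(self):
        # Deliberately left without a docstring.
        self.count = 0


for name, line in describe(Counter).items():
    print("%-8s %s" % (name, line or "(no docstring)"))
```

An attribute with no docstring shows up with an empty description, which makes the gaps easy to spot.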
Also as promised at the last workshop, i have designed, proposed, and
contributed a working prototype which implements module-nesting
"packages". The prototype is included in the Python 1.2 distribution,
in the module 'newimp.py'.
Why packages? Packages enable module nesting and sibling module
imports. 'Til now, the python module namespace was flat, which
means every module had to have a unique name, in order to not
conflict with names of other modules on the load path. Furthermore,
suites of modules could not be structurally affiliated with one
another.
With packages, a suite of, eg, email-oriented modules can include a
module named 'mailbox', without conflicting with the, eg, 'mailbox'
module of a shared-memory suite - 'email.mailbox' vs
'shmem.mailbox'. Packages also enable modules within a suite to
load other modules within their package without having the package
name hard-coded. Similarly, package suites of modules can be loaded
as a unit, by loading the package that contains them.
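The name-collision point can be sketched as follows. This does not use newimp; it assumes the package mechanism of current Python, where a directory becomes a package by containing an '__init__.py' file. The package names 'mailsuite' and 'shmsuite' are hypothetical stand-ins for the 'email' and 'shmem' suites mentioned above, chosen to avoid clashing with real library modules.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, package, module, source):
    """Create root/package/module.py plus an empty __init__.py."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir, exist_ok=True)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(source)


# Two unrelated suites, each with its own module named 'mailbox'.
root = tempfile.mkdtemp()
make_package(root, "mailsuite", "mailbox", "KIND = 'email mailbox'\n")
make_package(root, "shmsuite", "mailbox", "KIND = 'shared-memory mailbox'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import mailsuite.mailbox
import shmsuite.mailbox

# Both modules coexist because each is reached through its package.
print(mailsuite.mailbox.KIND)
print(shmsuite.mailbox.KIND)
```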
Usage: once installed (newimp.install(); newimp.revert() to revert to
the prior __import__ routine), 'import ...' and 'from ... import ...'
can be used to:
- import modules from the search path, as before.
- import modules from within other directory "packages" on the search
path using a '.' dot-delimited nesting syntax. The nesting is fully
recursive.
- import siblings from modules within a package, using '__.' as a shorthand
prefix to refer to the parent package. This enables referential
transparency - package modules need not know their package name.
The '__' package references are actually names assigned within
modules, to refer to their containing package. This means that
variable references can be made to imported modules, or to variables
defined via 'from ... import', also using the '__.var' shorthand
notation. This establishes a proper equivalence between the import
reference '__.sibling' and the var reference '__.sibling'.
- import an entire package as a unit, by importing the package directory.
If there is a module named '__main__.py' in the package, it controls the
load. Otherwise, all the modules in the directory, including packages, are
implicitly loaded into the package module's namespace.
- perform any combination of the above - have a package that contains
packages, etc.
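The sibling-import idea can be sketched with current Python, which spells the parent-package shorthand as a leading dot ('from . import sibling'). Treating that as the analogue of the '__.' prefix is an assumption of this sketch, and the package name 'toolkit_demo' and its modules are hypothetical.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "toolkit_demo")
os.makedirs(pkg_dir)

files = {
    "__init__.py": "",
    "helper.py": "def greet():\n    return 'hello from helper'\n",
    # The sibling is reached through '.', never through the package name.
    "client.py": "from . import helper\n\ndef run():\n    return helper.greet()\n",
}
for filename, source in files.items():
    with open(os.path.join(pkg_dir, filename), "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

import toolkit_demo.client
print(toolkit_demo.client.run())

# Renaming the package directory requires no edits to client.py,
# because client.py never mentions the package name.
os.rename(pkg_dir, os.path.join(root, "renamed_demo"))
importlib.invalidate_caches()

import renamed_demo.client
print(renamed_demo.client.run())
```

The rename step is the referential transparency described above: the modules inside the package keep working under whatever name the package is given.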
As Guido mentions in his Works In Progress page, Sun's new web language, Java, provides a very
similar module-nesting capability. However, there is a subtle difference
that turns out to be important.
As Guido succinctly puts it:
Java got this wrong -- their "import A.B" creates a local name B, so
it is hard to use two unrelated packages that happen to define a
module with the same name, like "P.main" and "Q.main"
... because both P.main and Q.main would try to populate the current
namespace with the name 'main', and hence conflict.
In the new python mechanism, 'import P.main' would establish only
module P in the current namespace. You would then reach P's
component 'main' as 'P.main', without colliding with Q's 'main'
component.
Furthermore, Python's 'from X import Y' form allows you to
deliberately import a component of a module directly into the current
namespace, when you specifically wish. I'm not sure whether or not
Java has an equivalent of the 'from' form, which might explain why
they made the choice they did.
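The binding behaviour can be checked directly in current Python. This sketch uses two standard library packages, 'email' and 'html', which both happen to contain a module named 'parser'; the imports are run inside a scratch namespace so the bound names can be listed.

```python
# Run the dotted imports in an empty namespace and see what gets bound.
namespace = {}
exec("import email.parser\nimport html.parser", namespace)

bound = sorted(name for name in namespace if not name.startswith("__"))
print(bound)  # ['email', 'html']

# Only the top-level package names are bound, so the two 'parser'
# modules do not collide; each is reached through its own package.
email_parser = namespace["email"].parser
html_parser = namespace["html"].parser
print(email_parser is html_parser)  # False

# The 'from' form is the deliberate way to bind the short name.
exec("from html import parser", namespace)
print(namespace["parser"] is html_parser)  # True
```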
I would like to have the import mechanism make it easy to hook in
alternate search and load mechanisms, to, eg, enable easy hookup with
the web and other media. This would essentially entail making the
relationship of the load mechanisms to the module name, or file-type,
keyed through a table lookup. Developers of new import types could
then hook in their special load mechanisms by adding new entries to
the appropriate places in the tables. For instance:
- Structure the module search and access mechanism so it is
table-driven, according to module name. This would facilitate
incorporation of foreign interfaces to recognize, eg, URL's, and do
web fetches instead of file-system search.
- Similarly, make the file-load mechanism table-driven, to
expose the load-mechanisms for the various module types
(eg, .py, .so, directory-packages, etc), and enable easy incorporation
of new types, or to change the search-order precedences, or whatever.
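A rough sketch of the table-driven load idea, under stated assumptions: the table LOADERS, the loader functions, and find_and_load are all hypothetical names, and this is a standalone illustration rather than a change to the real import machinery. A JSON "module type" stands in for whatever new type a developer might want to hook in.

```python
import json
import os
import tempfile
import types


def load_python_source(name, path):
    """Build a module by executing a .py file."""
    module = types.ModuleType(name)
    module.__file__ = path
    with open(path) as f:
        exec(compile(f.read(), path, "exec"), module.__dict__)
    return module


def load_json_data(name, path):
    """Build a module whose attributes are the keys of a JSON object."""
    module = types.ModuleType(name)
    module.__file__ = path
    with open(path) as f:
        module.__dict__.update(json.load(f))
    return module


# The table: file suffix -> loader. List order is the search precedence,
# so reordering or appending entries changes behaviour without new code.
LOADERS = [
    (".py", load_python_source),
    (".json", load_json_data),
]


def find_and_load(name, search_path, loaders=LOADERS):
    """Try each directory and each suffix in table order."""
    for directory in search_path:
        for suffix, loader in loaders:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return loader(name, candidate)
    raise ImportError("no loader found for module %r" % name)


# Usage: one source module and one data "module" in a scratch directory.
scratch = tempfile.mkdtemp()
with open(os.path.join(scratch, "tools.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")
with open(os.path.join(scratch, "settings.json"), "w") as f:
    json.dump({"retries": 3, "host": "example.invalid"}, f)

tools = find_and_load("tools", [scratch])
settings = find_and_load("settings", [scratch])
print(tools.double(21))
print(settings.retries)
```

A loader for a new medium (a web fetch, say) would be one more function and one more table entry, which is the point of keying the mechanism through a lookup.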
Restructuring the code to accomplish this sort of thing would take
some attention, however, and i would like to find a volunteer with
time and motivation to take it on.
In thinking about the prospects for advanced environment browsers for
python, i've reached an interesting branch-point. I see two primary
and divergent options for embedding the user interface for such tools:
- With all the attention on it, we could plan to use a python GUI
mechanism, as it emerges, much as Guido embedded the wdb window
debugger interface in stdwin.
- Alternately, we could work immediately on embedding such
interfaces in emacs.
I may be biased, because i am both an intensive emacs user and emacs
hacker, but i see some very compelling advantages to using emacs as
the runtime-tools interface substrate.
- It is already here, and ported to just about all the platforms we
could wish. (It may soon be better supported on Apple platforms,
since the FSF has dropped the Apple boycott.)
- It provides X windows, DOS/Windows, and character-cell
screen presentations.
- It is available to everyone.
- It provides a basis for exquisite integration of these tools with
each other and with the rest of the user's operating environment.
- As any contemporary Unix-associated GUI must, it "comes with" an
extensive emacs-like text editing widget!-)
The primary disadvantage is the same one that always comes with emacs
involvement - it's complicated for the user.
It also is sort-of one-way - it would be a good (i think) substrate
for embedding our programming-environment tools, but would not be so
good as a basis for a python GUI toolkit. (Or, at least, not without
some mondo hacking, to make an emacs "API" available from within
python!) (Did i hear someone gasp?-)
My inclination is to recommend both avenues - emacs immediately, and
elegant-and-easy-to-use GUI interfaces as they coalesce. I don't
think the different substrates will interfere with each other, each
having somewhat divergent audiences - user-friendly point-and-click
with the GUI's, vs programmer-friendly (and user fiendish:-)
customize-and-automate-out-the-wazoo of emacs.
With the subject broached, i figure i might as well take the
opportunity to mention some of the emacs interfaces which i see as
useful, exciting prospects:
- Editing - continue to use the wild and wonderful python-mode
(thanks to Tim Peters, wherever you are, and to Barry Warsaw, for your
recent maintenance efforts...)
- Debugger - hook up with the emacs debugging interface protocol,
GUD ("Grand Unified Debugger"), and get a truly comprehensive python
debugger.
- Environment Browser (!!) - investigate hooking up with
the maturing emacs Object-Oriented Browser utility.
(See Browser Features, or ftp the
recent sources.)
- Code and document indexing - emacs TAGS facilities
I have to mention one other, non-emacs-affiliated option, re code and
document indexing: the Glimpse indexing and query system. It looks
very promising and useful in its own right, and is easy to use. And
it's even more intriguing when you consider the potentials inherent in
its connection with the Harvest network information discovery and
access system.
Documentation formats and efforts
1. Do we need, and can we muster, a documentation project, a la Linux?
2. What format should we primarily support?
I haven't thought about this very much, but have increasingly hit up
against the second issue. I figure the issue is quite ripe, and just
yearning to be resolved. Some offhand thoughts:
- HTML means pretty-looking format on a web-based viewing interface.
- texinfo means more immediate ability to search across nodes
(important, to me!!), but not inherently web distributed.
- Many others to consider? SGML, linux doc, etc
- We probably need to focus on a baseline format from which we can
derive the others. What?
Programming style practices
Here are the beginnings of a grab-bag of emerging coding style
conventions, from a discussion with Guido.
- Var Names
  - exposed internal routines, vars, etc: leading _underscore
  - constants: all UPPERCASE
  - classes: Capitalized
  - global names: not_abbreviated
  - boolean vars: long_elaborate_informative_names?
  - name groups lacking name space: common short prefix + '_'
    (eg, sys.exc_type)
- Misc Python programming practices
  - Avoid globals
  - Avoid mutable globals - use classes or parameters+defaulting instead
  - modules that contain mostly a single class are named after the class
  - Put all imports near top
  - Use doc strings, including module doc and version
  - Use a version string, but only when you will use it with discipline
Perhaps we can start accumulating a list. I'd be willing to include
it in my Python bestiary, if i ever get time to update it...
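To make the conventions concrete, here is a small sketch of a module written along those lines. Everything in it (the Stack class, MAX_DEPTH, the method names) is hypothetical, chosen only to show one convention per line, and under the naming rule above the file would be called 'Stack.py'.

```python
"""Stack: a small last-in, first-out container (illustrative module)."""

__version__ = "0.1"   # version string; keep it current or leave it out

MAX_DEPTH = 100       # constant: all uppercase


class Stack:
    """Hold items in last-in, first-out order.

    Public methods: push(item), pop(), is_at_maximum_depth().
    """

    def __init__(self, max_depth=MAX_DEPTH):
        # State lives on the instance and the limit arrives as a
        # defaulted parameter, instead of a mutable global.
        self._items = []          # internal: leading underscore
        self.max_depth = max_depth

    def push(self, item):
        """Add item to the top; raise OverflowError when full."""
        if self.is_at_maximum_depth():
            raise OverflowError("stack is full")
        self._items.append(item)

    def pop(self):
        """Remove and return the top item."""
        return self._items.pop()

    def is_at_maximum_depth(self):
        """Return true when no more items can be pushed."""
        return len(self._items) >= self.max_depth


stack = Stack(max_depth=2)
stack.push("a")
stack.push("b")
print(stack.is_at_maximum_depth())
print(stack.pop())
```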