This is a first experiment in whether I can make a useful, interesting, somewhat coherent summary of python-dev activity. If reaction is favorable, and time permits, it may become a biweekly posting.

--amk

==================

The 2-week period started with reactions to Guido's June 30 announcement that the 2.0b1 release would be delayed for an indefinite period due to legal wrangling. This gave everyone a second chance to contribute more patches while waiting for the release, and the activity level remained high.

The two dominant topics in this period were Unicode and list comprehensions.

The Unicode issues, as usual, turned on the question of where strings and Unicode strings should be interchangeable. A discussion in the thread "Minidom and Unicode" considered whether it's legal to return a Unicode string from __repr__. The consensus was that it should be legal, despite fears of breaking code that expects only an 8-bit string, and the CVS tree was patched accordingly. Python's interactive mode uses repr() to display the results of expressions, and it will convert Unicode strings to ASCII, using the unicode-escape encoding. The following code, typed into the interpreter, will print 'abc\u3456'.

    class C:
        def __repr__(self):
            return u'abc\u3456'

    print repr( C() )

Hashing also presented a problem. As M.-A. Lemburg explained in http://www.python.org/pipermail/python-dev/2000-July/006843.html:

    The problem comes from the fact that the default encoding can be changed to a locale specific value (site.py does the lookup for you), e.g. given you have defined LANG to be us_en, Python will default to Latin-1 as default encoding. This results in 'äöü' == u'äöü', but hash('äöü') != hash(u'äöü'), which is in conflict with the general rule about objects having the same hash value if they compare equal.

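
The general rule cited in the quote, that objects which compare equal must also hash equal, can be illustrated with a small sketch. It is written for a current Python 3 interpreter and uses hypothetical case-insensitive key classes (BrokenKey and ConsistentKey are invented for this illustration) rather than the 8-bit/Unicode string pair from the thread, whose behaviour depended on the interpreter and default encoding of the time.

```python
# Illustrative only: hypothetical classes, not code from the python-dev thread.

class BrokenKey:
    """Compares case-insensitively but hashes case-sensitively."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        # Equality ignores case.
        return isinstance(other, BrokenKey) and self.text.lower() == other.text.lower()

    def __hash__(self):
        # The hash does not ignore case, so two equal objects can hash differently.
        return hash(self.text)


class ConsistentKey(BrokenKey):
    """Same equality, but the hash is computed from the value that __eq__ compares."""

    def __hash__(self):
        return hash(self.text.lower())


def lookup(key_type):
    # Store under one spelling, then look up under another spelling that compares equal.
    table = {key_type("abc"): "found"}
    return table.get(key_type("ABC"), "missing")
```

With ConsistentKey the lookup succeeds. With BrokenKey the two keys compare equal, yet the dictionary will in practice fail to find the entry because it consults the hash first, which is the kind of inconsistency the quote describes.
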
The resolution seems to be simply to remove the ability to change the default encoding and adopt ASCII as the fixed default; if you want to use other encodings, you must specify them explicitly.

List comprehensions originated as a patch from Greg Ewing that's now being kept up-to-date versus the CVS tree by Skip Montanaro. Originally they weren't on the roadmap for 1.6, but with the larger jump in version number to 2.0, GvR is more willing to incorporate larger changes. Augmented assignment, as in 'a += 1', and range literals, so [0:10] is the same as range(10), may also make their way into 2.0.

List comprehensions provide a more concise way to create lists in situations where map() and filter() would currently be used. To take some examples from the patched-for-list-comprehensions version of the Python Tutorial:

    >>> spcs = [" Apple", " Banana ", "Coco nut "]
    >>> print [s.strip() for s in spcs]
    ['Apple', 'Banana', 'Coco nut']
    >>> vec1 = [2, 4, 6]
    >>> vec2 = [4, 3, -9]
    >>> print [x*y for x in vec1 for y in vec2]
    [8, 6, -18, 16, 12, -36, 24, 18, -54]

A lengthy subthread about intuitiveness sprang from the second example, and from a patch from Thomas Wouters that implements parallel 'for' loops. The patch makes "for x in [1,2]; y in ['a','b']" cause x,y to be 1,'a', and then 2,'b'. The thread revolved around whether people would expect this syntax to produce the Cartesian product of the two lists: (1,'a'), (1, 'b'), (2, 'a'), (2, 'b'). No clear answer or final syntax has emerged yet, though GvR seems to be leaning toward adding new built-ins such as zip() instead of new syntax. Greg Wilson has been trying out syntaxes on Python-unaware people and asking them what they'd expect:
http://www.python.org/pipermail/python-dev/2000-July/006427.html

The alternative to new syntax is to add a new built-in function for parallel 'for' loops, so you would code 'for x,y in zip([1,2], ['a','b']):'.
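
The difference between the two readings discussed above (parallel iteration versus the Cartesian product) can be sketched as follows. This is written for a current Python 3 interpreter, where zip() is available as a built-in and returns an iterator (hence the list() calls); it is not the syntax or the patch under discussion in the thread, and the variable names are chosen only for illustration.

```python
# Sketch for a current Python 3 interpreter; names are illustrative only.

spcs = [" Apple", " Banana ", "Coco nut "]

# A list comprehension and the equivalent map() call give the same list.
stripped_lc = [s.strip() for s in spcs]
stripped_map = list(map(str.strip, spcs))

vec1 = [2, 4, 6]
vec2 = [4, 3, -9]

# Two 'for' clauses in a comprehension nest like two 'for' statements:
# every x is combined with every y.
products = [x * y for x in vec1 for y in vec2]

nested = []
for x in vec1:
    for y in vec2:
        nested.append(x * y)

# Parallel iteration pairs items position by position...
pairs_parallel = list(zip([1, 2], ['a', 'b']))

# ...whereas the Cartesian product pairs every item with every other item.
pairs_cartesian = [(x, y) for x in [1, 2] for y in ['a', 'b']]
```

Here pairs_parallel holds two tuples and pairs_cartesian holds four, which is exactly the ambiguity that people in the thread worried a new 'for' syntax would create.
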
A lengthy and very dull discussion ensued about the name 'zip': should it be 'plait', 'knit', 'parallel', or even 'marry'?

Some new procedures for Python development were set out:

Tim Peters wrote some guidelines for using SourceForge's patch manager:
http://www.python.org/pipermail/python-dev/2000-July/005923.html

Barry Warsaw announced a series of Python Extension Proposal (PEP) documents, which will play the role of RFCs for significant changes to Python:
http://www.python.org/pipermail/python-dev/2000-July/006347.html

Mark Hammond gave the first glimpse of a fourth Python implementation:

"This new compiler could be compared, conceptually, with JPython - it is a completely new implementation of Python. It has a compiler that generates native Windows .DLL/.EXE files. It uses a runtime that consists of a few thousand lines of C# (C-Sharp) code. The Python programs can be debugged at the source level with Visual Studio 7, as well as stand-alone debuggers for this environment. Python can sub-class VB or C# classes, and vice-versa."
http://www.python.org/pipermail/python-dev/2000-July/006307.html

Other bits:

Skip Montanaro experimented with using code coverage tools to measure the effectiveness of the Python test suite, by seeing which lines of code (both C and Python) are exercised by the tests. Start browsing at:
http://www.musi-cal.com/~skip/python/Python/dist/src/

Skip also added support to the readline module for saving and loading command histories.

ESR suggested adding a standard lexer to the core, and /F suggested an extension to regular expressions that would make them more useful for tokenizing:
http://www.python.org/pipermail/python-dev/2000-July/005320.html

CVS problems were briefly a distraction, with dangling locks preventing commits to the Lib/ and Modules/ subdirectories for a few days. Despite such glitches, the move to SourceForge has accelerated development overall, as more people can make check-ins and review them.

For some time Tim Peters has been suggesting removing the Py_PROTO macro and making the sources require ANSI C; mostly this is because the macro breaks the C cross-referencing support in Tim's editor. :) The ball finally started rolling on this, and snowballed into a massive set of patches to use ANSI C prototypes everywhere. Fred Drake and Peter Schneider-Kamp rose to the occasion and edited the prototypes in dozens of files.

Jeremy Hylton pointed out that "Tuple, List, String, and Dict have a Py*_Size method. The abstract object interface uses PySequence_Length. This is inconsistent and hard to remember," and suggested that *_Size be made the standard form and that *_Length be deprecated.

Just before the cutoff date, Paul Prescod proposed a new help() function for interactive use, and began implementing it:
http://www.python.org/pipermail/python-dev/2000-July/006634.html

Huaiyu Zhu suggested adding new operators to support matrix math:
http://www.python.org/pipermail/python-dev/2000-July/006652.html

A slew of minor patches and bugfixes were made, too. Some highlights:

* Ka-Ping Yee improved the syntax error messages.
* ESR made various changes to ConfigParser.py.
* Some of Sam Rushing's patches from Medusa were applied to add os.setreuid() and friends; AMK is working on adding the poll() system call.
* /F was his usual "patching machine" self, integrating PythonWin's win32popen function so that os.popen will now work correctly on Windows as well as Unix, writing PyErr_SafeFormat() to prevent buffer overflows, and proposing some patches to reduce the 600K size of the Unicode character database.

Some fun posts came up during the near-endless zip()/plait()/whatever naming thread:

http://www.python.org/pipermail/python-dev/2000-July/006208.html:

"BTW: How comes, that Ping very often invents or introduces very clever ideas and concepts, but also very often chooses unclear names for them? Is it just me not being a native english speaker?"

"I don't know.
Perhaps my florx bignal zupkin isn't very moognacious?"
-- Peter Funk and Ka-Ping Yee, 12 Jul 2000

http://www.python.org/pipermail/python-dev/2000-July/006338.html, while everyone was trying to think up alternative names for zip():

"Let me throw one more out, in honor of our fearless leader's recent life change: marry(). Usually only done in pairs, and with two big sequences, I get the image of a big Unification Church event :)"

"Isn't it somewhat of a political statement to allow marriages of three or more items? I always presumed that this function was n-ary, like map."
-- Barry Warsaw and Paul Prescod