This is a summary of traffic on the python-dev mailing list between November 1, 2002 and November 15, 2002 (inclusive). It is intended to inform the wider Python community of ongoing developments on the list that might be of interest. To comment on anything mentioned here, just post to firstname.lastname@example.org or comp.lang.python in the usual way; give your posting a meaningful subject line, and if it's about a PEP, include the PEP number (e.g. Subject: PEP 201 - Lockstep iteration). All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!
This is the fifth summary written by Brett Cannon (after a relaxing two week hiatus; thanks to Raymond Hettinger for doing the Summary while I was gone).
All summaries are now archived at http://www.python.org/dev/summary/ (thanks to A.M. Kuchling for setting that up).
Please note that this summary is written using reStructuredText which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST; you can safely ignore it (although I suggest learning reST; it's simple and is accepted for PEP markup). Also, because of the wonders of programs that like to reformat, I cannot guarantee you will be able to run the text version of this summary through Docutils as-is. If you want to do that, get an original copy of the text file.
Not much to say for this summary. The only thread skipped that someone out there might care about was one on getting the PEPs to display properly for IE 6 when the PEP was written in reST.
Thanks go to Raymond Hettinger for covering the Summary while I was away on vacation. Thanks also go out to Laura Creighton and Guido for suggesting graduate schools that are Python-friendly.
Michael Chermside was the only person to respond directly to my question as to whether anyone had issue with me injecting my personality into the summary. But he said to go ahead and let my personality permeate throughout this thing, so it will. Let that be a lesson to anyone who wanted me to shut up and be drab; had you and a friend spoken up this little tirade would not be happening. =)
Gustavo Niemeyer asked how he should go about getting his python-bz2 module into the standard library. He was basically told to submit a patch complete with the module, docs, regression tests, etc.; everything a healthy module needs. It was also suggested that he provide a MSVC project file.
That didn't work for Gustavo (who now has CVS write privileges; congrats) since he doesn't use Windows. This was a slight problem because if the extension file doesn't build under Windows it can't be included in the PythonLabs Windows distro. So keep in mind that if your module won't directly compile for 6 different versions of Windows it won't be included in the Windows distro.
Gustavo Niemeyer asked about how he could help contribute to Python. He made the observation that "Guido and others [have been] bothered a few times because of the lack of man power" which has led to a "small core of very busy developers working on core/essential/hard stuff and in code reviewing as well" (and I can attest that this is very true; I am amazed the guys at PythonLabs have any form of a life outside of work with the amount of time they put in). Gustavo felt "that the Python development is currently overly centralized".
Martin v. Loewis responded first. He said that "the most important aspect I'd like to hand off is the review of patches; to Tim, it is the analysis of bug reports". Martin then listed various points on how to be able to review a patch and how to handle bug reports; the email is at http://mail.python.org/pipermail/python-dev/2002-November/029831.html . I highly recommend reading the email because Martin's points are all good and more patch reviewers would be rather nice.
M.A. Lemburg commented next, saying that he "wouldn't mind if other developers with some time at hand jump in on already assigned patches and bug reports to help out". Just because a patch has been assigned to someone doesn't mean the patch or the assignee couldn't use more help. Having the patch assigned to someone just means that they take responsibility to apply the patch if it is worthy of being accepted or to reject it. It should not stop other people from making comments or helping out so that the assignee can have a little bit of time saved for other things... like another patch to assign to themselves. MAL also suggested that we have more maintainers that are in charge of chunks of code, e.g. Martin handles all locale code.
Martin disagreed with this idea, though. Jack Jansen agreed with Martin. He thought that if something came up that was not within the realm of a specific person that it "either get[s] ignored, or passed on to Guido, or picked up by yourself or Michael [Hudson] or one of the very few other people who do general firefighting". Martin commented later that he would hope that the stewardship of specific code does not get any more formal.
Martin then stated how one goes about getting commit privileges for Python on SF: "just step forward and say that you want. In the past, Guido has set a policy that people who's commit privilege is fresh will still have to use SF, but can perform the checkin themselves". Tradition has stated that "fresh" is your first two or three SF patches. But please only step forward if you are known to python-dev or PythonLabs since commit privileges won't be given to people who just wander in off the proverbial street and ask for them. Martin basically states this in a later email by saying that "people should not produce a burst of patches just to get commit privileges. Instead, they should contribute patches steadily (and should have done so in the past), and then get CVS write access as a simplification for the rest of the maintainers".
Neal Norwitz pointed out that if a bug ends up with a fix it is best to submit a separate patch instead of attaching it to the bug report. That way there is a bigger chance of the patch being seen and dealt with. But please make sure to mention that it fixes a bug so that the bug can be closed! Martin even suggested mentioning this fact in the title of the patch submission.
This is a splinter thread of the 'Becoming a python contributor' thread in which Michael Hudson asked how using Roundup for replacing SF was coming along. Guido said things had come up that were holding it up. One was that the test server running at http://www.python.org:8080 had died when the box was restarted (it is now back up). There were also some changes to Roundup that needed to be dealt with in order to get everything over from SF on to Roundup. Guido also just ran out of time to review it more, although he did like what he had reviewed so far. Guido asked for a volunteer.
It was asked what was needed. Guido said that Roundup had moved over to Zope-style templating so all the old templates that Gordon (I am assuming this is Gordon McMillan) wrote needed to be changed. There were also some bugs that needed to be dealt with that are being tracked at Roundup hosted at http://www.python.org:8080 . And there also will need to be preparations for the day that development is moved over from SF to the Roundup setup; that will require transferring everything over from SF, shutting down SF, and handling any bugs that crop up from the heavy use that the new setup is going to get.
So if you have any dislike for SF, then please contribute to Roundup and help get Python off of there!
Neal Norwitz went through the SF bug tracker and counted 325 bugs and generated a very easy to read page listing all the bugs with their relevant info. You can find the HTML version at http://www.metaslash.com/py/sf.data.html . If you find a bug there you think you can help out on, then go to SF and do so!
Neal Norwitz generated a list of what he thought were easily fixable bugs and put them in this thread (it's the first email so just go to the link for this thread). So if you have a little bit of free time and want to help out why don't you try to tackle one of these bugs?
The only reason I am mentioning this thread here is to help get the word out about the Snake Farm, hosted by Lysator and sponsored by the Python Business Forum. It is a compile farm that downloads from CVS, compiles, and runs the test suite of Python daily. It has caught a bunch of bugs and has been a great help.
The majority of the thread was spent trying to get FreeBSD 4.4 to compile Python and trying to work out a possible bug in pymalloc.
David Goodger has somehow gotten suckered into becoming a PEP editor (I suspect Barry Warsaw had something to do with this since he used to do all of the PEP editing). So you can all welcome David to his new responsibility by flooding him with all of those PEPs you have lying around and were not sure were good enough to submit. =)
Michael Hudson posted the question as to how to get writing __bases__ for new-style classes to do the "right thing" since the mro does not seem to be updated. Kevin Jacob gave it a go but ran into a bug.
Guido said that currently there is no way to touch the MRO from Python code. All of that info is stored in the tp_mro slot in the object's C representation and is stored as a tuple whose members are expected to be either types or classic classes. Guido said he would accept a patch for assigning to the mro if a check was included to make sure the previously mentioned constraint was maintained. He also said he would probably accept a patch that allowed for assignable __bases__ and writing __name__ .
Michael Hudson commented about the difficulty of all of these patches. One thing that came up was the connection between __bases__ and __base__ and how assignment to __base__ should not be taken lightly; Guido commented that "the old and new base must be layout compatible, exactly like for assignment to __class__".
Guido later pointed out that __base__ becomes the built-in type that you derive from; whether it be object, list, dict, etc. It was agreed that __base__ shouldn't be writable.
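For readers who want to see what "assignable __bases__" looks like in practice, here is a minimal sketch. It assumes a later Python than the one under discussion in this thread (one where the patch Guido said he would accept exists), and the class names are made up for illustration. It relies on both candidate bases being plain, layout-compatible classes.

```python
# Hypothetical classes for illustration; both bases are plain classes,
# so they are layout compatible with each other.
class Walker:
    def move(self):
        return "walking"


class Swimmer:
    def move(self):
        return "swimming"


class Animal(Walker):
    pass


pet = Animal()
before = pet.move()

# Reassigning __bases__ makes the interpreter recompute the MRO.
Animal.__bases__ = (Swimmer,)
after = pet.move()
mro_names = [cls.__name__ for cls in Animal.__mro__]

# The MRO itself is still not directly writable from Python code.
try:
    Animal.__mro__ = (Animal, Walker, object)
    mro_writable = True
except (AttributeError, TypeError):
    mro_writable = False
```

Existing instances pick up the new lookup order immediately, which is the "right thing" Michael Hudson was after.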
The point was made that nested classes are not picklable since only things at the top level of a module can be pickled. Guido said he considered this a flaw although he couldn't think of why someone would want to embed a class within a class or a function. This spawned comments on how to deal with this. It seemed the best solution was to change __name__ for the inner class to a fully dotted name (e.g. X.Y.__name__ = "X.Y") and then make a simple change to either pickle or getattr(). It was eventually agreed upon that setting __name__ to the full dotted name of the class was the best solution and it was filed as bug #633930 .
As part of trying to give Guido good examples of why nested classes are good (the best attempt was Walter Dorwald in an email at http://mail.python.org/pipermail/python-dev/2002-November/029906.html that elicited a "that's cool" comment from Guido), the idea of having __iter__ contain a class definition that returned an instance of that class came up. That was shot down because of the performance hit of dealing with the class definition on every call to __iter__. But the idea of defining __iter__ as a generator was pointed out by Just van Rossum. Doing that usually simplifies the code a good deal since the generator handles all .next() calls and you just need to have the generator stop when you want your iterator to stop. Apparently this is not a widely used idiom, so I am mentioning it here since it is a great idea that I can personally attest to as being a rather nice way to handle iterators.
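To make the "fully dotted name" idea concrete, here is a rough sketch of the kind of lookup that pickle (or a getattr() variant) would need in order to find a nested class from a name like "X.Y". The helper resolve_dotted and the SimpleNamespace standing in for a module are assumptions for illustration only, not the change that was actually filed.

```python
import functools
import types


class Outer:
    class Inner:
        pass


def resolve_dotted(root, dotted_name):
    """Hypothetical helper: walk "Outer.Inner" one attribute at a time."""
    return functools.reduce(getattr, dotted_name.split("."), root)


# A simple namespace stands in for the module that holds the classes.
fake_module = types.SimpleNamespace(Outer=Outer)
found = resolve_dotted(fake_module, "Outer.Inner")
```

A plain getattr(fake_module, "Outer.Inner") would fail because no attribute has that literal name, which is why a dotted __name__ alone is not enough without the matching lookup change.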
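Since the generator-as-__iter__ idiom is apparently not widely known, here is a minimal sketch of it. The Countdown class is a made-up example, and the code is written for a current Python rather than the 2.2-era interpreter discussed in the thread.

```python
class Countdown:
    """Hypothetical container that counts down from a starting value."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Because this method contains yield, calling it returns a fresh
        # generator; no separate iterator class is needed.
        n = self.start
        while n > 0:
            yield n
            n -= 1
        # Falling off the end stops the iteration.


counter = Countdown(3)
first_pass = list(counter)
second_pass = list(counter)  # each call to __iter__ starts over
```

Each loop over the object gets its own generator, so iteration state is never shared between loops.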
I am mentioning this thread not because the bug is that big of a deal, but because of the PEP that was brought up when dealing with the bug; PEP 291 . This informational PEP, among other things, lists modules that must be kept compatible with certain versions of Python; in this case sre has to be kept compatible with Python 1.5.2. Except for sre, they are all packages that have made their way into the library. You might want to have a look at the list if you are hacking on any packages in the stdlib.
Once again this is not meant to mention directly what was discussed in the thread but a point that was made. The bug that was discovered was an issue with 64-bit machines and the 32-bit limitation of lists and slicing. The 32-bit limit currently is hard-coded into the C code. Obviously some people would like to see this changed.
Neal Norwitz laid down a rough outline on how one could go about changing this at http://mail.python.org/pipermail/python-dev/2002-November/029953.html . Guido then made a pronouncement later stating that he is willing to break binary compatibility once for Python 2.3 or 2.4 to get this done. He also mentioned some other things to make sure to do.
As of this writing no one has stepped forward to take this on.
While doing some work on Modules/unicodedata.c , Martin v. Loewis noticed that the indentation style didn't follow PEP 7 and he wondered if it would be okay to re-indent the file. This brought up two points.
One was that PEP 7 was not stringently followed. The PEP says to "Use single-tab indents, where a tab is worth 8 spaces". That goes against Python coding style, where you are supposed to use 4-space indents, so the question of whether one should still use the tab style in new C code came up. Guido said he wished new C code would, but for files he doesn't touch very often he doesn't feel he can enforce it. Barry and Martin came up with an Emacs "local variables" stanza for the bottom of any file that uses a non-PEP style which can be found at http://mail.python.org/pipermail/python-dev/2002-November/030067.html . Following the PEP is still the "officially" supported style, though, so you should try to follow PEP 7 for all new C code and PEP 8 for Python code.
The other point was when to re-indent. It was agreed upon to only do that when a major change in the code was occurring. And when you do re-indent, do it as a separate check-in for CVS.
Martin v. Loewis brought up a point made by Henry Thompson on c.l.py asking why printing ignores __unicode__. Martin thought it shouldn't and listed a bunch of options on how to make printing work with __unicode__. The winner was to have "A file indicates "unicode-awareness" somehow. For a Unicode-aware file, it tries __unicode__, __str__, and __repr__, in order". The agreed solution was to add an .encoding attribute. This attribute can be set to None when the stream is in Unicode and never converts to a byte-stream.
But then M.A. Lemburg chimed in. He was fine with the addition of the .encoding; "this attribute is already available on stream objects created with codecs.open()". What he didn't like was having .encoding set to None mean the stream would accept Unicode. Martin asked then if StringIO should have .encoding.
MAL replied that "StringIO should be considered a non-Unicode aware stream, so it should not implement .encoding". He thought that if someone wanted StringIO to be Unicode-aware they could use "the tools in codecs.py [since they] can be used for this (basically by doing the same kind of wrapping as codecs.open() does)". But then Martin pointed out that StringIO is already Unicode-aware.
This debate continued between MAL, Martin, and Guido. But then Guido just said he gave up since the current behavior was relied upon too much.
Guido sent an email saying that he would like to get Python 2.3a out the door by X-mas. This means trying to come up with a new name for Optik , Greg Ward's command-line options parser. The complaint about the current name is that it's too cute. Guido started it by suggesting the name options.
Peter Funk brought up two things. One was that Docutils already used the module under its Optik name. To this Guido responded that they could just change the name in the code. Peter also said that Greg liked OptionsParser. There was also the point that module names are preferred to be short and lowercase.
Ka-Ping Yee and I brought up the point that options is very generic. Guido agreed with this point while also mentioning that many people already have their own modules named options.py. He then suggested optlib since "It's short, un-cute, and follows the *lib pattern used all over the Python stdlib". Greg Ward liked this suggestion.
I personally suggested ArgParser, trying to get a tie-in for sys.argv. Greg Ewing built off of this and suggested argvparse, similar to urlparse. David Ascher preferred argparse since "The v is archaic and so silent it fades away =)". David Abrahams disagreed, stating how the "v" deals with any ambiguity. David Abrahams also said he would vote for argvparse to break a tie.
Ka-Ping Yee suggested cmdline and cmdopts. David Abrahams liked both.
Steve Holden built off of Guido's optlib and pushed for optionlib.
Raymond Hettinger threw OptionParser into the ring.
But then Guido asked for a call to votes between optlib and argvparse. I tallied the votes at one point with it kind of split between them. Then Guido announced that optparse won (and no, that is not a typo).
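For anyone who has not used the module under its new name, here is a small sketch of what parsing a command line with optparse looks like. The option names and the argument list are invented for illustration; only OptionParser, add_option, and parse_args come from the module itself.

```python
from optparse import OptionParser

parser = OptionParser()
# Hypothetical options, chosen only to show a value option and a flag.
parser.add_option("-f", "--file", dest="filename",
                  help="write output to FILE", metavar="FILE")
parser.add_option("-q", "--quiet", action="store_false", dest="verbose",
                  default=True, help="suppress status messages")

# Pass an explicit list instead of relying on sys.argv[1:].
options, args = parser.parse_args(["-f", "out.txt", "-q", "extra"])
```

After the call, options carries the parsed values as attributes and args holds whatever positional arguments were left over.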
A.M. Kuchling asked if anyone would mind if bdist_dumb was removed from distutils (the conversation also happened with distutils-sig ). Apparently it is rather broken in terms of the paths of the files because it makes them all relative. M.A. Lemburg, though, spoke up stating that he uses bdist_dumb. It still wasn't fully resolved as of this writing.
The point of being able to build bdist_wininst on non-Windows platforms came up during this discussion as well. It was decided to move the binary files needed for bdist_wininst into CVS (now filed as bug #638595 ).
Paul Dubois decided to cause himself some grief and attempt to comprehend the miracle that is descriptors (more info can be found in the 2.2.2 What's New doc and in PEP 252). He was wondering if there was more documentation than the signatures of the functions for the C API. He also wanted to know if there was a way to play with them in the Python world since all of his attempts fell short of them being "first-class citizen from Python".
Guido admitted that the docs were lacking. He then asked for some help in writing the docs. He responded to Paul's second question by saying that he thought it should work as long as you avoided classic classes.
Thanks to Guido saying that it should work, Paul realized that he misread the PEP. To be nice he emailed out an example of how a descriptor could know where it came from at http://mail.python.org/pipermail/python-dev/2002-November/030141.html . Phillip Ebey also chimed in on how to write a metaclass that said the name of the descriptor at http://mail.python.org/pipermail/python-dev/2002-November/030213.html .
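As a rough illustration of playing with descriptors from the Python side, here is a minimal sketch of a descriptor that reports which class it was reached through. It is not Paul's or Phillip's code from the linked emails; the Traced, Config, and SubConfig names are made up, and the code targets a current Python.

```python
class Traced:
    """Hypothetical descriptor that reports the class it was reached through."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: hand back the descriptor.
            return self
        # Accessed on an instance: owner is the instance's class.
        return (owner.__name__, self.value)


class Config:
    level = Traced(3)


class SubConfig(Config):
    pass


via_base = Config().level
via_sub = SubConfig().level
via_class = Config.level
```

The owner argument is what lets a single descriptor object tell a subclass instance apart from a base class instance.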
Patrick O'Brien (author of PyCrust ) wondered how IDLE managed to keep its local scope so clean once it finished launching. Guido said that IDLE (or at least the GRPC version which Patrick O'Brien believes stands for Generalized Remote Procedure Call) runs the shell in a subprocess that is careful not to pollute the namespace.
Patrick thanked Guido and then presented his real question: how to avoid polluting the namespace while getting around a pickling issue. He "used to just pass a regular dictionary to code.InteractiveInterpreter, which worked well enough", but there was an issue with pickling. So then he tried passing sys.modules['__main__'].__dict__, which worked but cluttered the namespace.
Guido's response: "Remove the clutter". Guido said that this would most likely require a minimalistic main program that bootstrapped using __import__('run').main() which would run the code without adding to the namespace.
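The "regular dictionary" approach Patrick started from can be sketched as follows. This only shows how a private dictionary keeps user code out of the shell's own namespace; it does not address the pickling issue, and the variable names are chosen for illustration.

```python
import code

# A dedicated dictionary serves as the interpreter's namespace, so
# nothing the user types lands in this module's globals.
namespace = {}
interp = code.InteractiveInterpreter(locals=namespace)

# runsource returns False once a complete statement has been executed.
needs_more = interp.runsource("answer = 6 * 7")

# Ignore bookkeeping entries such as __builtins__ added by exec.
user_names = sorted(name for name in namespace if not name.startswith("__"))
```

Only the names the user defined show up in user_names, which is the kind of clean namespace Patrick was trying to preserve.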