python-dev Summary for 2003-01-15 through 2003-01-31

This is a summary of traffic on the python-dev mailing list from January 16, 2003 to January 31, 2003. It is intended to inform the wider Python community of on-going developments on the list that might be of interest. To comment on anything mentioned here, just post to python-list@python.org or comp.lang.python with subject line delineating what you discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!

This is the tenth summary written by Brett Cannon (double digits, baby!).

All summaries are archived at http://www.python.org/dev/summary/ .

Please note that this summary is written using reStructuredText which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST; you can safely ignore it (although I suggest learning reST; its simple and is accepted for PEP markup). Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils as-is unless it is from the original text file.

Contents

Summary Announcements
Adding Japanese Codecs to the distro
make install failing with current cvs
logging package -- rename warn to warning?
sys.exit and PYTHONINSPECT
disable writing .py[co]
Re: GadflyDA in core? Or as add-on-product?
Mixed-type datetime comparisons
Itertools
Idea for avoiding exception masking
Extended Function syntax
Capabilities in Python
Skipped Threads

Summary Announcements

To make the Summaries more thorough I have decided to make a list of threads that I have decided not to summarize and give my reasoning behind why I decided to skip it which should also explain what the thread is about. I doubt anyone will have complaints about doing this, but if you do let me know. You can find this section entitled "Skipped threads" at the end of the summary.

Some PyCon announcements to make. Registration is scheduled to start next week. Also, the list of accepted papers is expected to be announced on-time. The conference is shaping up and getting organized. It should be a lot of fun and educational, so make sure you come to PyCon !

Adding Japanese Codecs to the distro

MA Lemburg brought up a patch on SourceForge by Hisao SUZUKI (I am assuming the capitalization of the last name is a Japanese thing so I will stick with it in the summary) to add "codecs for the Japanese encodings EUC-JP, Shift_JIS and ISO-2022-JP" and donate the code to the Python Software Foundation. MAL's reason for wanting to use these codecs over the current ones most commonly used with Python (written by Tamito KAJIYAMA whose codecs can be found at http://www.asahi-net.or.jp/~rd6t-kjym/python/ ) was that Hisao's were smaller and written in pure Python.

Martin v. Lwis thought it was fine to add the codecs but thought Tamito's should be the defaults if any codecs were added to Python. His reasoning was that they have been in active development for a while and thus have been independently verified, in a way, to be correct. He also didn't like the "absence of the cp932 encoding in Suzuki's codecs" since the "suggestion to equate this to "mbcs" on Windows is not convincing". He also thought it would be best to add the C version only. He also suggested "that it might be more worthwhile to expose platform codecs, which would give us all CJK codecs on a number of major platforms, with a minimum increase in the size of the Python distribution, and with very good performance".

In response to Martin's comments, MAL thought that the benefit of the Python versions of the codecs was that they were easier to modify. He suggested that the C version then be made available to download as a separate entity (which apparently has been suggested multiple times). As for the independent verification, Hisao built his based on Java's codecs which MAL thought was pretty good verification that they should work. MAL agreed that exposing more platform codecs would be good.

Martin didn't go for the "easier editing" argument. He noted that since the table was autogenerated you would need to know the algorithm that made the table in order to be able to edit it.

Hye-Shik Chang responded with the support of the C version since it allows for much easier interprocess communication. It also beat the Python version when it came to memory consumption by a large margin.

Guido asked how Tamito would feel about adding these codecs, whether both sets outputted the same, and whether Tamito's codecs would still be preferred by people doing a lot of Japanese work.

Martin reponded that he didn't think they produced the same result and that people would continue to use JapaneseCodecs (Tamito's codecs) as long as Python didn't include the cp932 codec.

Barry Warsaw commented that Mailman used Tamito's codecs and no one has ever complained. He said he would love to see one of the codecs included in Python. Apparently their tend to be license issues with the various Asian codecs. The KorenCodecs (or KoCo) are licensed under LGPL, but Hye-Shik Chang said he would "like to make an essence of KoreanCodecs in PSF License if python needs it".

Atsuo Ishimoto voted for Tamito's codecs. He said that the Japanese community would continue to use them regardless of what went into Python. And in regards to the question of maintaining the codecs, Atsuo said the Japanese Python community would be willing to help Tamito with his but did not expect that kind of support for Hisao's codecs (Hisao has volunteered to maintain his codecs, though).

MAL pointed out that "Tamito's codecs have an installed size of 1790kB while Hisao's codecs are around 81kB" and that was an issue. Martin said that it was only 690kB if you only counted the C version. The issue of size became a major point. It was pointed out that Tamito's covered more codecs and being in C meant it only needed to be loaded in memory once for interprocess use.

Tamito chimed in finally and commented that he was busy and would not be able to maintain the codecs until he graduated from graduate school. MAL suggested that he start maintaining it when he had the time and Tamito agreed.

I am not sure where the situation stands as of this moment.

make install failing with current cvs

Jack Jansen was having issues with make install because he had various packages in site-python that had inconsistent tab usage and thus was raising a TabError during the compileall() step. Jack asked if site-packages could either be skipped by this step or at least not lead to the abortion of the install. All of this was from a change in compileall() that caused it to exit with a non-zero value if any errors were raised (although it continued to run). That feature was left in; the Makefile was modified so as to skip site-packages.

logging package -- rename warn to warning?

Guido said that he would "like to change 'warn' to 'warning' (and WARN to WARNING)" in the logging package. His argument that nothing else was a verb in the package.

Vinay Sajip (who wrote the logging package) said he was fine with the change, but warned it would break existing code that already used the package. He suggested putting both names in with a warning that warn was deprecated. This was shot down since a new package does not need to start with anything deprecated.

Then the discussion of whether to ditch fatal/FATAL since they are synonyms for critical/CRITICAL came up. Guido said the reverse should happen; keep FATAL and ditch CRITICAL (for the rest of this summary I will only refer to the capitalized version to keep things simple). Some opposition came up, holding the view that CRITICAL was closer to the truth. Vinay suggested SEVERE as a replacement. Guido said it didn't "tickle" his "fancy like fatal/FATAL does"; CRITICAL was okay if no agreed upon replacement could be found. A quick grepping of the package by me shows that CRITICAL is staying as well was WARN (will most likely not be documented, though, so don't use it for new code).

sys.exit and PYTHONINSPECT

Kevin Altis discovered that invoking the interpreter with the -i argument does not prevent the interpreter from exiting if SystemExit is raised (sys.exit() will raise the exception). He thought it would be good if -i would override even the SystemExit exception. Bug #670311 was created for this, but no patch has been written yet (anyone care to step up and do this?).

disable writing .py[co]

Splinter threads:

Proto-PEP regarding writing bytecode files

In the thread Parallel pyc construction (as covered in the last summary), Paul Dubois brought up how he would like to be able to disable the writing of .py[co] files. Well, this led to a long discussion that culminated into the creation of PEP 304 by Skip Montanaro. If this topic interests you then read the PEP.

Re: GadflyDA in core? Or as add-on-product?

The idea was proposed to add Gadfly to the stdlib. Two issues came up about doing this. One was that it would require having to deal with another license. Aaron Watters (the original author of Gadfly) was emailed to ask if he would be willing to donate the code to the Python Software Foundation, but he did not respond (at least to the list).

The other issue was the use of kjbuckets. Guido was -1 on letting it into the stdlib because he had "heard it's some of .the hairiest C code ever written". But Richard Jones said that Anthony Baxter is working on removing the dependency on kjbuckets by rewriting the code using the Sets module introduced in Python 2.3.

As of this writing the decision as to whether to add Gadfly has been put on hold until both of the above issues are resolved.

Mixed-type datetime comparisons

Tim Peters sent an email to the list saying that he had made changes to the datetime type so that it returned NotImplemented if the thing being compared to had a timetuple() method so as to allow other implementations to provide the functionality.

He also mentioned how you still can't compare datetimes and timedeltas. This caused MA Lemburg to ask why not. Tim said it just wouldn't make any sense since they don't even measure time in the same units. Tim ended on saying that if MAL and Fredrik Lundh could come up with a good proposal Tim would get it in, otherwise his time on datetime for Python 2.3 is spent.

Itertools

Raymond Hettinger checked in his itertools module into the sandbox for testing and asked for feedback. He got it and has subsequently checked the module into the stdlib.

Idea for avoiding exception masking

Raymond Hettinger discovered that "trapping an exception (a TypeError for a mutable argument passed to a dictionary) resulted in masking other errors that should have been passed through (potential TypeErrors in the called iterator for example)". Raymond proposed a new function named PyErr_FormatAppend that would append the extra info. Ka-Ping Yee suggested adding the original exception to the new exception as an attribute. This was generally viewed as a good solution. Questions of how to store traceback information and how to handle this in the C code has yet to be resolved.

Extended Function syntax

Splinter threads:

This threads wins the award for longest thread summarized here (and possibly ever for a summary written by me). Fair warning if you decide to read the actual thread.

Also, none of this applies to Python 2.3; Guido promised syntactic stability for this next release and you will get it.

Guido remembered that a proposal for extending the syntax of Python to make using things such as property(), staticmethod(), etc. much easier and wanted a reminder on where more info was. The suggeted syntax was:

def name(arg, ...) [expr1, expr2, expr3, ...]:
    ...body...

where expr, ... would be evaluated like name = expr3(expr2(expr1(name))). Michael Hudson wrote up a patch which can be found at http://starship.python.net/crew/mwh/hacks/meth-syntax-sugar-3.diff with the original python-dev thread on this at http://mail.python.org/pipermail/python-dev/2002-February/020005.html .

All of this led to an explosion of other syntax suggestions. One that garnered some support was to make something like:

class Foo:
    def Foo.bar(self, a, b, c):
        ...body...

work by basically making a def statement work like Foo.bar=new.function(CODE, GLOBS, 'Foo.bar'). If properties were made assignable then you could just use this syntax to assign to a property you instantiated. People also came up with metaclass implementations of some things (which Guido didn't like because they "all abuse the class keyword for something that's definitely not a class").

An idea that Walter Drwald suggested was:

class Foo(object):
   property myprop:
       """Doc string for myprop."""
       def __get__(self):
           ...body...
       def __set__(self, arg):
           ...body...
       def __delete__(self):
           ...body...

Guido's objection was that it turned property into a keyword. But he admitted that a new keyword would be okay if an elegant solution came about.

Guido then made the suggestion of:

foo = property:
    ...

This also garnered some support. The problem with this is that it would require the parser to be able to go from '"expression parsing mode" to "block parsing mode"' in a new way. But Guido seemed to solve it. It would also allow the generalization of this to something along the lines of:

v = e:  # Possibility for adding args -- v = e:(x, y):
   S  # Creates a thunk

which would be equivalent to:

v = e(T)  # T is a thunk created from S

None of this is meant to replace the suggested [expr, ...] syntax that started this whole thread.

Another syntactic suggestion was:

def foo as property:
    def __get__(self):
        ...
    def __set__(self, x):
        ...

Which seemed to grab a lot of people's fancies. This would allow for "... as class", "... as generator", etc.

But the problem with all of these proposals is how to deal with namespaces. Should, for instance, the thunks Guido suggested have their own namespace or use the one of their parent? Should their be only one type of thunk or multiple types with different scoping rules based on the situation? The namespace issue (especially for thunks) is still playing out.

Capabilities in Python

This thread actually came out of the whole property/thunk thing summarized directly above but was so tangential that I thought it deserved its own summary.

Ben Laurie had apparently suggested that capabilities be added to Python. Jeremy Hylton gave a pretty good description of a capability as "a ticket. The ticket is good for some event and possession of the ticket is sufficient to attend the event"; limit abilities of objects based on what capabilities you possess. The problem is that Python's introspection lets you see everything and thus you can get around restrictions; you can get into the event by sneaking out back and crawling under the fence all without having a ticket. Jeremy mentioned Zope's proxy which fully wraps an object and forces all interaction through the proxy.

Then Ka-Ping Yee stepped up to discuss the topic. He made the point that "creating a capability is equivalent to creating an object -- which you can only do if you hold the constructor". Ping said that "To build a capability system, all you need to do is to constrain the transfer of object references such that they can only be transmitted along other object references. That's all." In order to pull this off you would need to restrict access to only the declared interface of an object and have "no global namespace of modules". In other words have something like a private declaration for things and get rid of sys.modules. Ping suggested coming up with a "secure-mode" Python where these restrictions would be turned on. Neil Schemenauer liked this idea.

Skipped Threads

test_logging failing on Windows 2000: Since it dealt with a bug that received a patch I didn't worry about it.
PyConDC sprints: Meant for the people on python-dev directly, so no point in telling the general community about it.
String method: longest common prefix of two strings: Didn't go anywhere; Guido said he doesn't want to add every useful thing as a string method since there are already so many.
very slow compare of recursive objects: Discussed a patch about recursive compares. Basically led to no change and dealt with how to make a tuple recursive.
test_email always supposed to be skipped?: Glitch in Makefile.pre.in which got fixed.
extending readline functionality (patch): Someone sending a patch directly to python-dev; big no-no! Use SourceForge for any all patches.
Re: [Python-checkins] python/dist/src Makefile.pre.in,1.111,1.112: No one ever responded to the email; problem was probably solved.
Cookie.py too strict: Barry Warsaw wanted to make the Cookie module less strict in acceptance. Michael Chermside suggested adding an argument to be able to specify whether the acceptance of data should be strict or not.
HTMLParser patches: Mentions two patches for HTMLParser to fix some parsing problems.
Name space pollution (fwd): Someone on c.l.py didn't like the naming of a function destructor in the core because it is such a common name and thus can clash with other code. Guido said submit a patch or use some #define tricks.
logging package -- spurious package contents: Guido didn't like the config file for the logging package nor the extra handlers. It was agreed to ditch the config file but keep the handlers.
logging package -- documentation: Better docs for the logging package were uploaded and are ready to be applied.
Re: [Zope3-dev] PosixTimeZone tzinfo implementation: Single email mentioning the tm_gmtoff attribute of the datetime type.
test_ossaudiodev hangs: The ossaudiodev module was hanging in tests. It has been disabled in setup.py.
Looking for a Windows bsddb fan: Tim Peters upgraded the Windows distribution to berkeleydb 4.1.25; doing so caused a huge amount of test failures. Turned out they were race conditions in the tests inherit in such complicated tests.
Extension modules, Threading, and the GIL: Mention of how the GIL puts a cramp into writing PyGTK. This thread started last month ( http://www.python.org/dev/summary/2003-01-01_2003-01-15.html#extension-modules-threading-and-the-gil ).
the new 2.3a1 settimeout() with httplib and SSL: There was a problem with the new timeout mechanism in socket and SSL. There is a a patch at http://www.python.org/sf/676472 to fix the problem.
non-blocking sockets and sendall: Brian Ellin noticed that socket.sendall() and non-blocking sockets don't work together.
Re: [Python-checkins] python/dist/src/Lib/bsddb dbshelve.py,1.4,1.4.4.1: Skip Montanaro asking whether bsddb185 should become "a fallback to the fallback".
xmlparse.c no longer compiles on Windows: Tim Peters discovered xmlparse.c had some #define lines for the non-existent case of no memmove() function.
fixing traceback.extract_stack() line numbers: Greg Klanderman noticed that traceback.extract_stack() only gives the first line of the method and not of the actual line that caused an issue.
python2.3 on m68k-nommu uClinux, really slow: Brad Clements got Python working on an m68k processor and performance was horrible. He asked if anyone had any ideas on how to speed it up.
native code compiler? (or, OCaml vs. Python): Someone asked if Python was ever going to have a native code compiler. The answer is "if you want to write one" and to ask the question nicely. It also became a debate over how difficult a native code compiler would be.
Re: RHSA-2002:202-25: Skip Montanaro sent an email about a security hole found in os.exec*p() method that was patched back in August. It has subsequently been back-patched to 2.1.
map, filter, reduce, lambda: Andrew Koenig commenting on the discussion from the last summary.
Global interpreter lock/import mechanism: Joerg Budischewski asked some questions about the proper way of doing some things in a C extension.
Should Cygwin Python use --enable-runtime-pseudo-reloc?: Title of this thread makes what it is about obvious.
New PEP 306 needs proof reading.: Another thread whose content is obvious by the title.
StringIO not an iterator?: Misunderstanding initially. Guido then suggested that (c?)StringIO become its own iterator which became a bug report.
SF patch #677429 (PEP293 callbacks): Walter Drwald asking for someone to look over a patch since Martin v. Lwis in on vacation (and this explains why I have not been bringing up the character palette so much in this summary for inserting umlauts =).
rexec/bastion: Someone asking an improper question on the list (Mailman lists this in February, but I am on Pacific time so I say it is in January =).
PEP 42: sizeof(obj) builtin: Raymond Hettinger asked if people still wanted sizeof() in Python. One person said yes. The difficulty of doing this correctly was also pointed out along with the suggestion that if this function is written it be a part of the sys module.
PEP 305 - CSV File API: "A new PEP (305), "CSV File API", is available for reader feedback".
New version of PEP 304: A new version of PEP 304 is up.
import.c:load_source_module() confusion: Skip Montanaro had a question about returning when a possible error was signalled on an import.