python-dev Summary for 2003-03-01 through 2003-03-15

This is a summary of traffic on the python-dev mailing list from March 1, 2003 through March 15, 2003. It is intended to inform the wider Python community of on-going developments on the list and to have an archived summary of each thread started on the list. To comment on anything mentioned here, just post to python-list@python.org or comp.lang.python with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!

This is the thirteenth summary written by Brett Cannon (same number as my predecessor, Michael Hudson =).

All summaries are archived at http://www.python.org/dev/summary/ .

Please note that this summary is written using reStructuredText, which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST (else it is probably regular expression syntax); you can safely ignore it (although I suggest learning reST; it's simple and is accepted for PEP markup). Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils as-is unless it is from the original text file.

Contents

Summary Announcements
Ridiculously minor tweaks?
Capabilities in Python
Quickies

Summary Announcements

As I am sure most readers of this summary know by now, I am going to PyCon. This means that I will be occupied the whole last week of this month. I suspect python-dev traffic will be light since I believe most of PythonLabs will be at PyCon and thus not working. =) But still, I will be occupied myself and thus won't have a chance to work on the summary until I come home. This means you should expect the next summary to be rather late. I will get to it, though, at some point.

And in case you haven't yet, register for PyCon.

Ridiculously minor tweaks?

Splinter threads:

The original point of this thread was Jeremy Fincher asking whether patches changing lists to tuples, where the list was not mutated, would be accepted for a minuscule performance boost (the answer was no). But this wasn't the interesting knowledge that came out of this thread. This thread led to Guido stating his intended uses of tuples and lists.

And you might be going, "lists are for mutable sequences of objects while tuples are for immutable sequences of objects". Well, that is not what Guido thinks of lists and tuples (and don't feel bad if you thought otherwise; Christian Tismer didn't even know what Guido had in mind and Python does not exactly require you to agree with Guido on this). Turns out that tuples, in Guido's view of the world, are "for heterogeneous data" and "list[s] are for homogeneous data"; "Tuples are not read-only lists".

Guido spelled out his thinking on this in a later email. He basically said that he viewed lists as "a sequence of items of type X" while tuples are more like "a sequence of length N with items of type X1, X2, X3, ..." This makes sense since lists can be sorted while tuples can't; sorting a sequence of differing types doesn't necessarily result in a sequence sorted the way you think about it.

And if you are still having trouble wrapping your head around this, just view tuples as structs and lists as arrays, as in C.
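
As a small illustration of the struct-versus-array view, here is a minimal sketch; the names and values (release, thread_lengths) are made up for this example and do not come from the thread.

```python
# A tuple used like a C struct: fixed length, each position has its own
# meaning and its own type (hypothetical example data).
release = ("Python", 2, 3, "alpha")
name, major, minor, level = release  # unpack by position, like struct fields

# A list used like a C array: any length, every item is the same kind of thing.
thread_lengths = [42, 7, 19]
thread_lengths.append(3)   # growing the sequence is natural for a list
thread_lengths.sort()      # sorting makes sense because the items are alike

print(name, major, minor, level)
print(thread_lengths)
```

Nothing in the language enforces this split; it is only a convention for deciding which type to reach for.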

This thread then led to another topic of comparisons in Python. Guido ended up mentioning how he wished == and != worked on all types (with disparate types always being !=) while all of the other comparisons only worked on similar types for the interpreter's default comparison abilities.
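
For readers on a much later interpreter: Python 3 behaves roughly the way Guido wished for here. The sketch below assumes Python 3; the helper name try_less_than is invented for the example.

```python
def try_less_than(left, right):
    """Return left < right, or the string "TypeError" if ordering is unsupported."""
    try:
        return left < right
    except TypeError:
        return "TypeError"


# Equality tests work across disparate types and simply report "not equal".
equal = (1 == "1")
not_equal = (1 != "1")

# Ordering comparisons only work for similar types in Python 3.
mixed_ordering = try_less_than(1, "1")
same_type_ordering = try_less_than(1, 2)

print(equal, not_equal, mixed_ordering, same_type_ordering)
```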

This then led to Guido saying how he wished the __cmp__() magic method and the cmp() built-in didn't exist. This is because there are currently two ways to do comparisons: __cmp__(), and all of the rich comparison magic methods. You can implement the same functionality as __cmp__() using just __lt__() and __eq__(). There can also be an unneeded performance penalty for __cmp__() since (using the previously mentioned way of re-implementing __cmp__()) you might have to do some unneeded comparisons when all you need is __eq__().
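
To make the "just __lt__() and __eq__()" point concrete, here is a minimal sketch written for Python 3. The Version class and the three_way_compare() helper are hypothetical names invented for this example, not anything proposed in the thread.

```python
class Version:
    """Hypothetical value type that defines only __eq__ and __lt__."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


def three_way_compare(left, right):
    """Return -1, 0 or 1 like cmp(), using only == and <."""
    if left == right:
        return 0
    if left < right:
        return -1
    return 1


older = Version(2, 2)
newer = Version(2, 3)
print(three_way_compare(older, newer))
print(three_way_compare(newer, older))
print(three_way_compare(older, Version(2, 2)))
```

Note that three_way_compare() can end up making two comparisons; a caller who only wants to know whether two objects are equal needs just the single __eq__() call, which is the unneeded work mentioned above.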

This discussion is still going on.

Capabilities in Python

Splinter threads:

This is a continuation of a discussion covered in the last summary.

This was definitely the thread from hell for this summary. =) It is very long and there was confusion at multiple points over terminology. You have been warned.

Three things were constantly being discussed in this thread: restricted execution, capabilities, and proxies. We discuss them in this order.

Restricted execution basically cuts out access to certain objects at execution time. Currently, if you replace the global __builtins__ with something other than what __builtin__.__dict__ has then you enable restricted execution in Python. This cuts off access to built-in objects so as to prevent you from circumventing security code by, for instance, importing the sys module so you can replace a module's code in sys.modules. Both capabilities and proxies are worthless without restricted execution since they could be circumvented without it.
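
The effect of swapping out __builtins__ can still be seen in a modern interpreter. The sketch below assumes Python 3 and uses a made-up helper, is_blocked(); it only demonstrates the name lookup behaviour and should not be read as a secure sandbox or as the restricted execution machinery discussed in the thread.

```python
def is_blocked(source):
    """Run source with an empty builtins namespace; report whether it was refused."""
    restricted_globals = {"__builtins__": {}}  # no built-in names available
    try:
        exec(source, restricted_globals)
    except (NameError, ImportError):
        # NameError: a built-in such as len() could not be found.
        # ImportError: the import statement could not find __import__.
        return True
    return False


print(is_blocked("import sys"))   # needs __import__ from the builtins
print(is_blocked("len([1, 2])"))  # needs the len built-in
print(is_blocked("x = 1 + 1"))    # needs no built-ins at all
```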

Capabilities can be viewed as references or bound methods (but not both at the same time, necessarily). Security with capabilities is done based on possession; if you hold a reference to an object you can use that object without issue. The distribution of capabilities becomes the issue with this system.
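
A minimal sketch of the bound-method flavor of a capability follows; Counter and untrusted_client() are hypothetical names for this example. The client is handed only counter.increment, so by possession alone it can bump the counter but has been given no reference to reset().

```python
class Counter:
    """Hypothetical object whose methods we want to hand out selectively."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value

    def reset(self):
        self.value = 0


def untrusted_client(bump):
    """Receives a single capability: a callable that increments the counter."""
    return [bump() for _ in range(3)]


counter = Counter()
results = untrusted_client(counter.increment)  # pass only the bound method
print(results)
```

In plain Python a bound method still exposes its object through attributes such as __self__, which is one way to see why capabilities depend on restricted execution.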

Proxies are wrappers around objects that restrict access to the wrapped object. This restriction extends all the way to the core; even core code such as built-ins can't get access to parts of a proxied object that the proxy doesn't want any object to get a hold of. The trick with proxies is making them secure on their own: they get passed around without worrying about who gets them, because it is the proxy that provides security, not possession as with capabilities. Two examples of proxies are Zope proxies and mxProxy.
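
The general shape of a proxy can be sketched in pure Python as below. AttributeProxy and Account are hypothetical names, and this is not how Zope proxies or mxProxy are implemented; a pure-Python wrapper like this one leaks (for instance through proxy._target), so it only illustrates the idea of filtering access.

```python
class AttributeProxy:
    """Hypothetical wrapper that exposes only an allowed set of attributes."""

    def __init__(self, target, allowed_names):
        # Bypass our own __setattr__, which refuses all assignments.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_allowed_names", frozenset(allowed_names))

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        if name in self._allowed_names:
            return getattr(self._target, name)
        raise AttributeError(f"access to {name!r} is not permitted")

    def __setattr__(self, name, value):
        raise AttributeError("proxied objects are read-only")


class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


account = Account("alice", 100)
proxy = AttributeProxy(account, allowed_names={"owner"})

print(proxy.owner)                # allowed attribute is forwarded
print(hasattr(proxy, "balance"))  # disallowed attribute looks absent
```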

There was talk of a PEP on all of this but one has not appeared yet; it is currently being worked on by Ben Laurie, though.

Quickies

Codec registry: Gustavo Niemeyer asked someone to review a patch.
Changes to logging in CVS: Vinay Sajip asked if changes someone had checked in to the logging package could be rolled back, since they broke compatibility with Python 1.5.2, which the logging package tries to keep (as mentioned in PEP 291). The changes were removed.

__slots__ for metatypes: Christian Tismer asked Guido and the list to take a look at a patch that would allow meta-types to have a __slots__. The patch was accepted and applied.
new bytecode results: Damien Morton continues on his quest to get performance boosts from fiddling with the eval loop contained in ceval.c and trying out various opcode ideas. It was pointed out that pystone is a good indicator of how Zope will perform on a new box. Tim Peters also stated that because pystone is such an atypical test, it helps to make sure any improvements you make really are improvements. Damien also requested that more people contribute statistical information to Skip Montanaro's stat server (more info at http://manatee.mojam.com/~skip/python/ ).

module extension search order - can it be changed?: This was discussed in the last summary. Tim Peters mentioned how he doesn't use linecache often and that its printing out-of-date info is not of any great use for tracebacks.

JUMP_IF_X opcodes: Damien Morton, still on the prowl for better opcodes, suggested introducing opcodes that combined branching opcodes and POP_TOP (which pops the top of the interpreter stack) and did the pop based on the truth value of what was being tested. Neal Norwitz suggested that instead the branching instructions just always pop the stack. But then Raymond Hettinger came up with prediction macros that give a great speed boost to loops and conditionals by checking if the next opcode is one that normally follows the current opcode. If it does, it skips going through the eval loop for the next opcode, thus saving on overhead. If all of this cool opcode stuff that Damien keeps doing interests you, you will want to read opcode.h, ceval.c, and learn how to use the dis module.

Fun with timeit.py: A new module named timeit was added to the stdlib at the request of Jim Fulton. The module times the execution of code snippets. Guido timed the execution of going through a 'for' loop a million times with interpreters from Python 1.3 up to the current CVS (2.3a2 with patches up to that point). The result was that CVS was the fastest by a large margin.

Pre-PyCon sprint ideas: I asked the list to suggest ideas to sprint on at PyCon.
More Zen: Words of wisdom from Raymond Hettinger that everyone should read. And if you have never read Raymond's School of Hard Knocks email you owe it to yourself to stop whatever you are doing and read it now. I can personally vouch that the email is right on the money; I have experienced (or suffered, depending on your view =) every single thing on that list sans writing a PEP (although writing the Summary is starting to be enough writing to be equal =).

xmlrpclib : xmlrpclib: Apology: Bill Bumgarner, the "hillbilly from the midwest of the US", asked if the xmlrpclib module was being maintained. The lesson was also learned to not call Fredrik Lundh "Fred" on the list since Fred L. Drake, Jr. tends to be associated with the name. =)

httplib SSLFile broken in CVS: Something got broken and fixed.
super() bug (?): Samuele Pedroni thought he may have found a bug with super() but it turned out it wasn't.
test_popen broken on Win2K: Win2k does not like quoting of commands when there is no space in the command, as Tim Peters discovered. There were discussions on how to deal with this. One suggestion was to come up with an sh-like syntax that works on all platforms (like what Tcl's exec command has).
Change in int() behavior: David Abrahams rediscovered the joys of the road that leads to int/long unification when he noticed that isinstance(int(sys.maxint*2), int) returns False. This will not be an issue once we are farther down this road.
acceptability of asm in python code?: Damien Morton popped his optimizing head back up on python-dev asking if assembly code was acceptable in the core. As of right now there is none, but Tim Peters stated that if there was some that had "a huge speedup, on all programs" then it would be considered, although "on the weak end of maybe". Christian Tismer (who plays with assembly in Stackless) warned against it because it could cause a compiler to not use optimizations.

Internationalizing domain names: Martin v. Löwis asked someone to look over his patches to implement IDNA (International Domain Names in Applications) which allows non-ASCII characters in domain names.
VERSION in getpath.c: Guido explained to someone what compile variables are used to generate some compile-based search paths.
Where is OSS used?: Greg Ward asked what OSs use OSS.

Audio devices: Greg Ward asked for opinions on some API issues for ossaudiodev.

bsddb3 test errors - are these expected?: Skip Montanaro asked if some errors from the testing of bsddb3 on OS X were expected.

os.path.dirname misleading?: Kevin Altis was surprised to discover that os.path.dirname would return the tail end of a directory instead of an empty string when the argument to the function was just a directory name.

Care to sprint on the core at PyCon?: Me asking the world if they wanted to sprint on the core at the pre-PyCon sprint (if you do, read the email for details).
Iterable sockets?: Andrew McNamara wished that socket objects were iterable on a per-line basis without having to call makefile(). Guido said he would rather come up with a better abstraction for Python 3 and prototype it in Python 2.4 or later.
More int/long integration issues: David Abrahams noticed that range() and xrange() couldn't accept a long. It basically led to Guido stating he considers xrange() a mistake and wished it didn't exist now that we have iterators. But since getting rid of it would break code he can at least prevent it from gaining abilities. It also led to Guido mentioning again how he would like to prohibit shadowing of built-ins.
tzset: A new function, time.tzset(), was added to Python and the tests had failed under Windows. The tests and the ./configure check were changed as needed.
PyObject_New vs PyObject_NEW: Lesson of the thread: PyObject_NEW is only to be used in the core; use PyObject_New() for extension modules.

are NULL checks in Objects/abstract.c really needed?: ... They are not required, but they are there to protect you against poorly written extensions. Skip Montanaro subsequently suggested a --without-null-checks compile option.
PyEval_GetFrame() revisited: A possible API for manipulating the current frame was still being discussed.
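
For anyone who wants to try the timeit module mentioned in the quickies above, here is a minimal sketch. The loop body and the repetition count are arbitrary choices for this example and are not the benchmark Guido ran.

```python
import timeit

# Statement to time: an empty 'for' loop over a small range (arbitrary choice).
loop_source = "for i in range(1000):\n    pass"

# Run the statement 100 times and get the total elapsed time in seconds.
elapsed = timeit.timeit(loop_source, number=100)

print(f"100 runs of the loop took {elapsed:.6f} seconds")
```

The number reported depends entirely on the machine and interpreter, so it is only meaningful when compared against another run made under the same conditions.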