python-dev Summary from 2003-02-01 through 2003-02-15
This is a summary of traffic on the python-dev mailing list from February 1, 2003 through February 15, 2003. It is intended to inform the wider Python community of on-going developments on the list that might be of interest to them. To comment on anything mentioned here, just post to email@example.com or comp.lang.python with a subject line delineating what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join python-dev!
This is the eleventh summary written by Brett Cannon (and still sane even after over 800 emails for this summary).
All summaries are archived at http://www.python.org/dev/summary/ .
Please note that this summary is written using reStructuredText which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST (else it is probably regular expression syntax); you can safely ignore it (although I suggest learning reST; its simple and is accepted for PEP markup). Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils as-is unless it is from the original text file.
I just realized how much regex syntax I use in the summaries. You might help you read the summaries if you know it. =)
I renamed the 'Skipped Threads' section that I introduced in the last section to 'Quickies' since they are not really skipped but are just "quickie" summaries that I don't feel warrant a full-blown summary (or I was just feeling lazy =).
As part of the massive thread that caused some hoopla on the list about how to try to keep info manageable on the list, a suggested syntax for an acquire/release setup came up. It basically defines a way to have a method called before the beginning of the execution of a block of code and then a method to be called when the block is finished. It uses a suggested keyword 'with'.
Read PEP 310 for all the details (thanks to Paul Moore and Michael Hudson for writing the PEP; I would have gone insane summarizing this!).
Thanks goes to Samuele Pedroni for writing up a technical summary of all of this and sending me a copy.
This whole thread started up over a suggested extended synatax for functions:
def fxn(args) [fun1, fun2, ..., funX]: ...code...
This was to be equivalent to:
fxn = funX(...(fun2(fun1(fxn)))...)
This came up as a way to make it easier for using things such as property(), classmethod(), and staticmethod() since you could define things as:
def fxn(args) [staticmethod]: ...code...
A variant on all of this that was proposed was:
def fxn(args) as fun1, fun2, ..., funX: ...code...
def fxn(args) is fun1, fun2, ..., funX: ...code...
The two proposed syntaxes above were suggested to be made very general so that you could do something like:
def fxn(args) as function: ...code...
It would be difficult and kludgy, though, to make the above syntax work for both function and staticmethod. And since this was all about how make using property() easier, property-specific suggestions included:
class klass(object): def prup as property: ...code...
The problem with the former is that it requires turning "property" into a keyword.
None of these are PEPs yet nor have had BDFL pronouncement.
Thanks to Samuele Pedroni for providing me a technical summary that I was able to use to help write this summary.
The concept of adding thunks to Python came up through the extended function syntax thread (in case you don't know what a thunk is, you can think of it as a chunk of code that is compiled and ready to use but is not executed until a later time). Some suggested syntaxes were:
lvalue = thunk(args): ...code...
do lvalue = thunk(args): ...code...
The lvalues would be optional. Samuele Pedroni suggested of thinking of thunk(args) more like thunk_consumer_maker(args) since the thunk would "be a reification of the ...code... suite, i.e. an object for ...code...".
As with the extended function syntax, no PEPs have been done nor any BDFL pronouncements.
After comments were made that the way capabilities were described earlier in the thread were strange, Ben Laurie got to the point and said that if he gets "secure bound methods" he will have what he wants. Going with the ticket analogy that was used earlier, Ben said that he didn't like having an intermediate ticket checker like the proxies used by Zope; the "ticket is the method".
Jeremy commented on some of Ben and Ka-Ping Yee's comments. It seems that security code is sprinkled throughout the interpreter; centralizing it would help secure Python.
The thread stopped on python-dev (the emails suggested the discussion was taken off-list) with no specific plans on changes.
Ignoring all the emails that were in direct relation in dealing with the whole mini-flame war that this thread started on, it did lead to a bet between Guido and Dan Sugalski over whether Parrot or CPython would run a standard Python benchmark suite the fastest. The stakes are $10, a round of beer for the winning development team, and a pie in the face of the loser.
Since python-dev's honor is partially at stake, various ideas came up about how to speed up the core. One idea was to inline built-in functions by giving them their own bytecode. So, for instance, len() might be inlined in bytecode as BUILTIN_LEN instead of having to bother with a function call to the actual len() code in the core.
Another suggestion was to continue work on namespace access. These are covered in PEP 266, PEP 267, and PEP 280. Guido also suggested "not to allow binding new globals if they shadow certain builtins" so as to save on the name lookup.
Skip Montanaro mentioned his peephole optimizer.
There was also a mention of rattlesnake; a register-based VM that Skip started and Neil Schemenauer picked up.
Taking a fresh look at PyEval_EvalCodeEx was also mentioned since a great expense is setting up a function call.
During the whole with/do/thunk thread and another thread from a disgruntled person who didn't like how the list responded to his email, Samuele Pedroni let the list know that Fredrik Lundh (a.k.a. effbot) had unsubscribed from python-dev, disillusioned with how things had ended up.
There was a suggestion to spin a list off of python-dev that handled "blue-sky" discussions. This was basically killed because most people who would subscribe to the new list would be subscribed to python-dev and thus would basically a moot point.
There are two things to be learned from this thread. One is to make sure you change the subject line of an email when you are replying to an email but changing the focus. The other is to make sure not to say anything in an email that comes off as accusatory. One of the reasons this whole email thread turned negative was that a statement made in the initial email was received as condescending and questioning the abilities of python-dev when trying to guess as to why something had not been implemented. It is best to ask why something has not been done then ask and then immediately wager an answer to your own question.
Michael McLay brought up the point that although FixedPoint had received BDFL pronouncement for inclusion into the stdlib, it still had not happened. It was then pointed out that no one had put the effort into doing the work necessary to make sure that the package was ready to be added.
A discussion then started amongst some people about some details of the package. There was a discussion of naming of certain methods and such. One conclusion that was reached before the discussion was taken off-list was that the package's interface should be minimized and kept as simple as possible while still being useful.
MA Lemburg brought up the question of what the ramifications will be with PyArg_Parse() issuing a DeprecationWarning instead of TypeError when a float is passed as an argument where "i" is used as the argument type. What causes this to not be cut and dry is that floats have an __int__ method that actually loses information so you can't just use int() to do conversions and still have the same value.
Guido said that, in hindsight, there should have been a function that converts an int-like objects to real ints and another to convert floats to ints; this separates conversions that might lose information. He also said that eventually "i" will raise a TypeError when less code will be broken by this behavior.
The history of how the "i" argument marker was given. Apparently, way back when, someone said that "i" should not only accept true ints as it did in its first incarnation. So having it use __int__ was added. Unfortunately no one realized that float implemented __int__ and obviously floats for things such as indexing are not reasonable. Raising a warning is a step towards eliminating this wart.
The acronym YAWTM (You Are Worrying Too Much) and its snarky younger brother YADWTM (You are DEFINITELY Worrying Too Much) were also brought into this world in this thread.
Robert Ledwith wished that Python 2.3 would support the --without-cycle-gc compile option that is scheduled to be eliminated; objects would have a smaller footprint and was faster.
The reason the option was removed in the first place was that the code for the trashcan (handles the deallocation of objects) "is nearly obvious when we can rely on gc being there". With GC turned off extension modules cannot use the trashcan without editing the source of Python itself.
Christian Tismer (who wrote the trashcan code) said that it could actually choose whether or not to take in an object based on its type. So what could happen is there could be a non-GC type that would cause the object to not even interact the trashcan and thus not complicate the code. The leading idea was to have a nogcobject that acted as the "true" root object which 'object' inheriting from nogcobject while causing GC to be used.
It was also pointed out that __slots__ saves a lot of space on objects when used. It was then suggested that you be able to specify the type of each thing in __slots__ since that would save even more space when the type is a simple one. Guido said that it seemed a new name would be needed for this attribute. He also said that the idea allowing you to specify the order of the objects was good.
Just van Rossum posted a rough PEP proposing a new protocol to allow mutable objects as keys in dictionaries. His suggestion was that if an object was mutable the dictionary's __setitem__() would call the key's __immutable_copy__() and use that as the key. The dict's __contains__() would use __temporary_immutable__() from the object (this is needed so as to return something that defines __eq__() and __hash__() in order to be able to do the comparison against other keys). Just claimed that putting this in 'would add no overhead for the "normal" case (when keys _are_ directly hashable)'.
The question of whether this would be thread-safe was raised. Just said it would be for Python code thanks "to the wonders of the Global Interpreter Lock".
Guido responded that he didn't think implementing this would have no performance hit. If anything, he thought that it "may reduce code locality and hence cause the code to miss the cache more often than before". He suggested just coming up with a subclass of dict that implemented this. That was not an option, though, because the entire reason for this was to make dicts work more easily with PyObjC and thus would need to be built in. So Guido ended up asking Just to implement it and benchmark it to see what kind of hit was taken.
A thread on comp.lang.python sprung up about adding a ternary operator to Python (yes, the title says "Trinary Operators", but that is apparently not technically correct). In case you are not a C programmer, the ternary operator can be thought of an if/else statement that acts like an expression (it's the <cond>?<true stmt>:<false stmt> syntax from C). Another way of thinking of it is as an if/else statement that is inlined.
Guido was sick of this debate coming up, so he wrote PEP 308 on the subject and has left it up to comp.lang.python to argue this one out (he actually explicitly said not to discuss it on python-dev) and decide on the final fate of this idea. If you have an opinion voice it on comp.lang.python since this will also be used as a test to see if more things should be put on c.l.py for discussion instead of python-dev.
As of this writing, Samuele Pedroni updated the list on posting stats for those of us who don't follow c.l.py. Two syntactic choices are currently in the running (this is ignoring objections to the whole idea). One is if <cond>: <true stmt> else: <false stmt> and the other is <cond> ?? <true stmt> || <false stmt>.
PEP 307 has been unleashed upon the world. Guido and Tim Peters came up with a new pickling protocol (aptly named protocol 2; I am sure it took a lot of Guido's magical pickles to come up with that name). The reason for doing this was to have the pickle for new-style classes not be so immense (this was brought to the forefront by datetime and the hell that Tim and Guido went through to get it to pickle reasonably). Everything is backwards compatible since it is a new protocol, so there is no need to panic. Still interesting in terms of what they are doing, though, so you might want to read it.
Paul DuBois asked if it would be possible at some point to be able to pickle a class or function definition. Paul was informed that pickling is meant for data only. Jeremy Hylton, though, has managed to pull something like what Paul wants off in Zope 3 by subclassing Pickler and UnPickler from the pickle module (so doesn't work with cPickle).
Arne Koewing asked if pickling weak references could be done. Short answer, no. Long answer, not unless you want great pain trying to do implement it.
Jack van Rossum was "tempted to set Py_FileSystemDefaultEncoding to "utf8" for OSX". He tried it and it works well, but he noticed that os.listdir() returns strings and not Unicode objects. So obviously the question came up as to whether to change this so that it returned Unicode when Py_FileSystemDefaultEncoding is set to something that is non-NULL.
The issue is round-trip passing of strings. If you get something from os.listdir() you would like to be able to pass to file() with no issues. The question is how to handle this need without breaking backwards-compatibility.
As was summarized before, there was talk of adding Gadfly to the stdlib. Two issues with adding it was its license and having kjbuckets as a dependency. The first one was solved when Aaron Watters said he would be happy to transfer the license to the PSF. The second is currently being worked on.
Another issue came up, though. Gadfly is approximately 11,000 lines of Python code (Richard Jones thinks 1,200 lines could be removed by taking out the parser builder). That is an immense chunk of code with no one stepping forward to be the active maintainer of it. Without someone willing to make sure the code does not suffer from bit rot there is a chance Gadfly won't be allowed into the stdlib. If you happen to think you can take on maintenance responsibilities, please step forward and let it be known.
In an attempt to make it easier for C code to access the GIL in complicated, threaded situations where knowledge about the state of the Python interpreter is limited, Mark Hammond has come up with a new API that is covered in PEP 311. In essence, Mark has come up with a way to allow external C code to get the GIL to do work with Python code in a threaded environment without having to know too much about the status of the interpreter in regards to threading.
Glyph Lefkowitz suggested creating a new bytecode instruction for doing method calls since the current setup takes 3 separate bytecode calls and is about 20% slower than a function call according to Glyph. Guido said he was all for a special bytecode as long as the semantic meaning, in the end, is the same.
Skip Montanaro suggested caching the results of getattr(). Guido said it could only work if the getattr() for each object did the caching. Glyph clarified, though, that he was not going after caching but just cutting the bytecode instruction count for a method count down 'to mean "call this method" rather than "get this attribute, call the result"'.