Re: sort() method

Guido van Rossum (Guido.van.Rossum@cwi.nl)
Thu, 30 Jul 1992 21:38:33 +0200

Here are some comments on the issue of a proposed extension to
list.sort() (an optional comparison function that determines how the
elements of a list are to be compared).

There's nothing wrong with this wish, and it has come up before
(although I wouldn't call it perl-ish but C-ish, see qsort(3)).

The value of such an extension, however, is limited, and usually
the alternative method suggested by Tim Peters is better: create a
temporary list containing tuples containing some function of the items
to be compared that determines the right ordering. E.g., to sort
strings independent of case, create a list containing tuples with the
string converted to lowercase first and the original string second,
and sort this list.

I'll try to explain why I think this way of sorting is better in many
cases.

Doing the equivalent in C would be wasteful of space and probably not
significantly faster than a cleverly-coded version that does the
lowercase conversion on the fly. However, the situation in Python is
different. True, we are wasting some space, but we are gaining speed,
since a "cleverly-coded" comparison function in Python has to be much
slower than Python's built-in tuple or string comparison operation.
(In general, many techniques for "clever coding" in C don't work in
Python because the relative speed of operations are vastly different.)

Knuth tells us that sorting takes at least, roughly, N log2 N
comparisons. Using the "comparison-function" method to sort a
1000-item list would mean roughly 10000 calls to this function, and
thus 20000 calls to string.lower(). On the other hand, the "temp
list" method would require only 1000 calls!

There are, however, situations in which the ordering of items is not
easily converted to an equivalent ordering on some function of the
items, while comparing any two items would be a piece of cake. A
specia case, not unimportant for Python, is the situation of a
user-defined class whose objects have a cmp() method.

It is easy to add an optional comparison function to list.sort() now,
so that I might just do that. (Maybe it wasn't so easy when I wrote
the original code; calling Python back from C is a "recent" addition
to the tools available to the extension writer.)

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Oh, so you want to fly an AEROPLANE?"