The iterator feature introduced in Python 2.2 and the itertools module make it easier to write programs that loop through large data sets without having the entire data set in memory at one time. List comprehensions don't fit into this picture very well because they produce a Python list object containing all of the items. This unavoidably pulls all of the objects into memory, which can be a problem if your data set is very large. When trying to write a functionally-styled program, it would be natural to write something like:
    links = [link for link in get_all_links() if not link.followed]
    for link in links:
        ...
instead of
    for link in get_all_links():
        if link.followed:
            continue
        ...
The first form is more concise and perhaps more readable, but if you're dealing with a large number of link objects you'd have to write the second form to avoid having all link objects in memory at the same time.
Generator expressions work similarly to list comprehensions but don't materialize the entire list; instead they create a generator that will return elements one by one. The above example could be written as:
    links = (link for link in get_all_links() if not link.followed)
    for link in links:
        ...
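To make the "one by one" behaviour concrete, here is a minimal sketch. The Link class, the sample data, and the fetched list are hypothetical stand-ins invented for illustration, and get_all_links() is assumed to be a generator function. The sketch uses the next() built-in found in later Python versions; in Python 2.4 you would call links.next() instead.

```python
class Link(object):
    """Hypothetical stand-in for the link objects discussed in the text."""
    def __init__(self, url, followed):
        self.url = url
        self.followed = followed

fetched = []  # records which links the source has produced so far

def get_all_links():
    """Hypothetical source that yields links one at a time."""
    for url, followed in [("a", True), ("b", False), ("c", False)]:
        fetched.append(url)
        yield Link(url, followed)

links = (link for link in get_all_links() if not link.followed)
before = list(fetched)   # nothing has been produced yet

first = next(links)      # pulls links only until one passes the filter
after = list(fetched)    # "a" was skipped, "b" was returned, "c" is untouched

print(first.url)
```

Only as many links as are needed to satisfy each request are ever produced, which is why the generator form avoids holding the whole data set in memory.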
Generator expressions always have to be written inside parentheses, as in the above example. The parentheses signalling a function call also count, so if you want to create an iterator that will be immediately passed to a function you could write:
    print sum(obj.count for obj in list_all_objects())
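A self-contained version of this call might look like the sketch below. The Obj class and the body of list_all_objects() are hypothetical, made up only so the example runs. It also shows that the function call's parentheses suffice only when the generator expression is the sole argument; with a second argument, the expression needs its own parentheses.

```python
class Obj(object):
    """Hypothetical object with a count attribute."""
    def __init__(self, count):
        self.count = count

def list_all_objects():
    """Hypothetical helper returning a few sample objects."""
    return [Obj(2), Obj(3), Obj(5)]

# Sole argument: the call's own parentheses are enough.
total = sum(obj.count for obj in list_all_objects())

# With a second argument, the generator expression must be parenthesized.
total_with_start = sum((obj.count for obj in list_all_objects()), 100)

print(total)
print(total_with_start)
```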
Generator expressions differ from list comprehensions in various small ways. Most notably, the loop variable (obj in the above example) is not accessible outside of the generator expression. List comprehensions leave the variable assigned to its last value; future versions of Python will change this, making list comprehensions match generator expressions in this respect.
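The scoping rule for generator expressions can be checked with a short sketch like the one below. The variable names are arbitrary, and the sketch assumes no variable called value already exists in the enclosing scope. It tests only the generator expression side, since whether a list comprehension leaks its variable depends on the Python version.

```python
squares = sum(value * value for value in range(4))

# The loop variable lives inside the generator expression only,
# so looking it up afterwards raises NameError.
try:
    value
    leaked = True
except NameError:
    leaked = False

print(leaked)
```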