PEP: 343 Title: The "with" Statement Version: $Revision: 41350 $ Last-Modified: $Date: 2005-10-28 23:08:12 -0700 (Fri, 28 Oct 2005) $ Author: Guido van Rossum, Nick Coghlan Status: Draft Type: Standards Track Content-Type: text/plain Created: 13-May-2005 Post-History: 2-Jun-2005, 16-Oct-2005, 29-Oct-2005 Abstract This PEP adds a new statement "with" to the Python language to make it possible to factor out standard uses of try/finally statements. The PEP was approved in principle by the BDFL, but there were still a couple of implementation details to be worked out (see the section on Resolved Issues). It's still at Draft status until Guido gives a final blessing to the updated PEP. Author's Note This PEP was originally written in first person by Guido, and subsequently updated by Nick Coghlan to reflect later discussion on python-dev. Any first person references are from Guido's original. Introduction After a lot of discussion about PEP 340 and alternatives, I decided to withdraw PEP 340 and proposed a slight variant on PEP 310. After more discussion, I have added back a mechanism for raising an exception in a suspended generator using a throw() method, and a close() method which throws a new GeneratorExit exception; these additions were first proposed on python-dev in [2] and universally approved of. I'm also changing the keyword to 'with'. On-line discussion of this PEP should take place in the Python Wiki [3]. If this PEP is approved, the following PEPs will be rejected due to overlap: - PEP 310, Reliable Acquisition/Release Pairs. This is the original with-statement proposal. - PEP 319, Python Synchronize/Asynchronize Block. Its use cases can be covered by the current PEP by providing suitable with-statement controllers: for 'synchronize' we can use the "locking" template from example 1; for 'asynchronize' we can use a similar "unlocking" template. I don't think having an "anonymous" lock associated with a code block is all that important; in fact it may be better to always be explicit about the mutex being used. (PEP 340 and PEP 346 have already been withdrawn.) Motivation and Summary PEP 340, Anonymous Block Statements, combined many powerful ideas: using generators as block templates, adding exception handling and finalization to generators, and more. Besides praise it received a lot of opposition from people who didn't like the fact that it was, under the covers, a (potential) looping construct. This meant that break and continue in a block-statement would break or continue the block-statement, even if it was used as a non-looping resource management tool. But the final blow came when I read Raymond Chen's rant about flow-control macros[1]. Raymond argues convincingly that hiding flow control in macros makes your code inscrutable, and I find that his argument applies to Python as well as to C. I realized that PEP 340 templates can hide all sorts of control flow; for example, its example 4 (auto_retry()) catches exceptions and repeats the block up to three times. However, the with-statement of PEP 310 does *not* hide control flow, in my view: while a finally-suite temporarily suspends the control flow, in the end, the control flow resumes as if the finally-suite wasn't there at all. Remember, PEP 310 proposes rougly this syntax (the "VAR =" part is optional): with VAR = EXPR: BLOCK which roughly translates into this: VAR = EXPR VAR.__enter__() try: BLOCK finally: VAR.__exit__() Now consider this example: with f = opening("/etc/passwd"): BLOCK1 BLOCK2 Here, just as if the first line was "if True" instead, we know that if BLOCK1 completes without an exception, BLOCK2 will be reached; and if BLOCK1 raises an exception or executes a non-local goto (a break, continue or return), BLOCK2 is *not* reached. The magic added by the with-statement at the end doesn't affect this. (You may ask, what if a bug in the __exit__() method causes an exception? Then all is lost -- but this is no worse than with other exceptions; the nature of exceptions is that they can happen *anywhere*, and you just have to live with that. Even if you write bug-free code, a KeyboardInterrupt exception can still cause it to exit between any two virtual machine opcodes.) This argument almost led me to endorse PEP 310, but I had one idea left from the PEP 340 euphoria that I wasn't ready to drop: using generators as "templates" for abstractions like acquiring and releasing a lock or opening and closing a file is a powerful idea, as can be seen by looking at the examples in that PEP. Inspired by a counter-proposal to PEP 340 by Phillip Eby I tried to create a decorator that would turn a suitable generator into an object with the necessary __enter__() and __exit__() methods. Here I ran into a snag: while it wasn't too hard for the locking example, it was impossible to do this for the opening example. The idea was to define the template like this: @contextmanager def opening(filename): f = open(filename) try: yield f finally: f.close() and used it like this: with f = opening(filename): ...read data from f... The problem is that in PEP 310, the result of calling EXPR is assigned directly to VAR, and then VAR's __exit__() method is called upon exit from BLOCK1. But here, VAR clearly needs to receive the opened file, and that would mean that __exit__() would have to be a method on the file. While this can be solved using a proxy class, this is awkward and made me realize that a slightly different translation would make writing the desired decorator a piece of cake: let VAR receive the result from calling the __enter__() method, and save the value of EXPR to call its __exit__() method later. Then the decorator can return an instance of a wrapper class whose __enter__() method calls the generator's next() method and returns whatever next() returns; the wrapper instance's __exit__() method calls next() again but expects it to raise StopIteration. (Details below in the section Optional Generator Decorator.) So now the final hurdle was that the PEP 310 syntax: with VAR = EXPR: BLOCK1 would be deceptive, since VAR does *not* receive the value of EXPR. Borrowing from PEP 340, it was an easy step to: with EXPR as VAR: BLOCK1 Additional discussion showed that people really liked being able to "see" the exception in the generator, even if it was only to log it; the generator is not allowed to yield another value, since the with-statement should not be usable as a loop (raising a different exception is marginally acceptable). To enable this, a new throw() method for generators is proposed, which takes one to three arguments representing an exception in the usual fashion (type, value, traceback) and raises it at the point where the generator is suspended. Once we have this, it is a small step to proposing another generator method, close(), which calls throw() with a special exception, GeneratorExit. This tells the generator to exit, and from there it's another small step to proposing that close() be called automatically when the generator is garbage-collected. Then, finally, we can allow a yield-statement inside a try-finally statement, since we can now guarantee that the finally-clause will (eventually) be executed. The usual cautions about finalization apply -- the process may be terminated abruptly without finalizing any objects, and objects may be kept alive forever by cycles or memory leaks in the application (as opposed to cycles or leaks in the Python implementation, which are taken care of by GC). Note that we're not guaranteeing that the finally-clause is executed immediately after the generator object becomes unused, even though this is how it will work in CPython. This is similar to auto-closing files: while a reference-counting implementation like CPython deallocates an object as soon as the last reference to it goes away, implementations that use other GC algorithms do not make the same guarantee. This applies to Jython, IronPython, and probably to Python running on Parrot. Use Cases See the Examples section near the end. Specification: The 'with' Statement A new statement is proposed with the syntax: with EXPR as VAR: BLOCK Here, 'with' and 'as' are new keywords; EXPR is an arbitrary expression (but not an expression-list) and VAR is a single assignment target. It can *not* be a comma-separated sequence of variables, but it *can* be a *parenthesized* comma-separated sequence of variables. (This restriction makes a future extension possible of the syntax to have multiple comma-separated resources, each with its own optional as-clause.) The "as VAR" part is optional. The translation of the above statement is: abc = (EXPR).__context__() exc = (None, None, None) VAR = abc.__enter__() try: try: BLOCK except: exc = sys.exc_info() raise finally: abc.__exit__(*exc) Here, the variables 'abc' and 'exc' are internal variables and not accessible to the user; they will most likely be implemented as special registers or stack positions. The above translation is fairly literal - if any of the relevant methods are not found as expected, the interpreter will raise AttributeError. The call to the __context__() method serves a similar purpose to that of the __iter__() method of iterator and iterables. An object with with simple state requirements (such as threading.RLock) may provide its own __enter__() and __exit__() methods, and simply return 'self' from its __context__ method. On the other hand, an object with more complex state requirements (such as decimal.Context) may return a distinct context manager object each time its __context__ method is invoked. If the "as VAR" part of the syntax is omitted, the "VAR =" part of the translation is omitted (but abc.__enter__() is still called). The calling convention for abc.__exit__() is as follows. If the finally-suite was reached through normal completion of BLOCK or through a non-local goto (a break, continue or return statement in BLOCK), abc.__exit__() is called with three None arguments. If the finally-suite was reached through an exception raised in BLOCK, abc.__exit__() is called with three arguments representing the exception type, value, and traceback. The motivation for this API to __exit__(), as opposed to the argument-less __exit__() from PEP 310, was given by the transactional() use case, example 3 below. The template in that example must commit or roll back the transaction depending on whether an exception occurred or not. Rather than just having a boolean flag indicating whether an exception occurred, we pass the complete exception information, for the benefit of an exception-logging facility for example. Relying on sys.exc_info() to get at the exception information was rejected; sys.exc_info() has very complex semantics and it is perfectly possible that it returns the exception information for an exception that was caught ages ago. It was also proposed to add an additional boolean to distinguish between reaching the end of BLOCK and a non-local goto. This was rejected as too complex and unnecessary; a non-local goto should be considered unexceptional for the purposes of a database transaction roll-back decision. Generator Decorator With PEP 342 accepted, it is possible to write a decorator that makes it possible to use a generator that yields exactly once to control a with-statement. Here's a sketch of such a decorator: class GeneratorContextManager(object): def __init__(self, gen): self.gen = gen def __context__(self): return self def __enter__(self): try: return self.gen.next() except StopIteration: raise RuntimeError("generator didn't yield") def __exit__(self, type, value, traceback): if type is None: try: self.gen.next() except StopIteration: return else: raise RuntimeError("generator didn't stop") else: try: self.gen.throw(type, value, traceback) except (type, StopIteration): return else: raise RuntimeError("generator caught exception") def contextmanager(func): def helper(*args, **kwds): return GeneratorContextManager(func(*args, **kwds)) return helper This decorator could be used as follows: @contextmanager def opening(filename): f = open(filename) # IOError is untouched by GeneratorContext try: yield f finally: f.close() # Ditto for errors here (however unlikely) A robust builtin implementation of this decorator will be made part of the standard library. Just as generator-iterator functions are very useful for writing __iter__() methods for iterables, generator-context functions will be very useful for writing __context__() methods for contexts. These methods will still need to be decorated using the contextmanager decorator. To ensure an obvious error message if the decorator is left out, generator-iterator objects will NOT be given a native context - if you want to ensure a generator is closed promptly, use something similar to the duck-typed "closing" context manager in the examples. Optional Extensions It would be possible to endow certain objects, like files, sockets, and locks, with __enter__() and __exit__() methods so that instead of writing: with locking(myLock): BLOCK one could write simply: with myLock: BLOCK I think we should be careful with this; it could lead to mistakes like: f = open(filename) with f: BLOCK1 with f: BLOCK2 which does not do what one might think (f is closed before BLOCK2 is entered). OTOH such mistakes are easily diagnosed; for example, the generator-context decorator above raises RuntimeError when a second with-statement calls f.__enter__() again. A similar error can be raised if __enter__ is invoked on a closed file object. For Python 2.5, the following candidates have been identified for native context managers: - file - decimal.Context - thread.LockType - threading.Lock - threading.RLock - threading.Condition Standard Terminology Discussions about iterators and iterables are aided by the standard terminology used to discuss them. The protocol used by the for statement is called the iterator protocol and an iterator is any object that properly implements that protocol. The term "iterable" then encompasses all objects with an __iter__() method that returns an iterator (this means that all iterators are iterables, but not all iterables are iterators). This PEP proposes that the protocol used by the with statement be known as the "context management protocol", and that objects that implement that protocol be known as "context managers". The term "context" then encompasses all objects with a __context__() method that returns a context manager (this means that all context managers are contexts, but not all contexts are context managers). The term "context" is based on the concept that the context object defines a context of execution for the code that forms the body of the with statement. In cases where the general term "context" would be ambiguous, it can be made explicit by expanding it to "manageable context". Resolved Issues The following issues were resolved either by BDFL approval, consensus on python-dev, or a simple lack of objection to proposals in the original version of this PEP. 1. The __exit__() method of the GeneratorContextManager class catches StopIteration and considers it equivalent to re-raising the exception passed to throw(). Is allowing StopIteration right here? This is so that a generator doing cleanup depending on the exception thrown (like the transactional() example below) can *catch* the exception thrown if it wants to and doesn't have to worry about re-raising it. I find this more convenient for the generator writer. Against this was brought in that the generator *appears* to suppress an exception that it cannot suppress: the transactional() example would be more clear according to this view if it re-raised the original exception after the call to db.rollback(). I personally would find the requirement to re-raise the exception an annoyance in a generator used as a with-template, since all the code after yield is used for is cleanup, and it is invoked from a finally-clause (the one implicit in the with-statement) which re-raises the original exception anyway. 2. What exception should GeneratorContextManager raise when the underlying generator-iterator misbehaves? The following quote is the reason behind Guido's choice of RuntimeError for both this and for the generator close() method in PEP 342 (from [8]): "I'd rather not introduce a new exception class just for this purpose, since it's not an exception that I want people to catch: I want it to turn into a traceback which is seen by the programmer who then fixes the code. So now I believe they should both raise RuntimeError. There are some precedents for that: it's raised by the core Python code in situations where endless recursion is detected, and for uninitialized objects (and for a variety of miscellaneous conditions)." 3. After this PEP was originally approved, a subsequent discussion on python-dev [4] settled on the term "context manager" for objects which provide __enter__ and __exit__ methods, and "context management protocol" for the protocol itself. With the addition of the __context__ method to the protocol, a natural extension is to call all objects which provide a __context__ method "contexts" (or "manageable contexts" in situations where the general term "context" would be ambiguous). This is now documented in the "Standard Terminology" section. 4. The originally approved version of this PEP did not include a __context__ method - the method was only added to the PEP after Jason Orendorff pointed out the difficulty of writing appropriate __enter__ and __exit__ methods for decimal.Context [5]. This approach allows a class to define a native context manager using generator syntax. It also allows a class to use an existing independent context manager as its native context manager by applying the independent context manager to 'self' in its __context__ method. It even allows a class written in C to use a generator context manager written in Python. The __context__ method parallels the __iter__ method which forms part of the iterator protocol. An earlier version of this PEP called this the __with__ method. This was later changed to match the name of the protocol rather than the keyword for the statement [9]. 5. The suggestion was made by Jason Orendorff that the __enter__ and __exit__ methods could be removed from the context management protocol, and the protocol instead defined directly in terms of the enhanced generator interface described in PEP 342 [6]. Guido rejected this idea [7]. The following are some of benefits of keeping the __enter__ and __exit__ methods: - it makes it easy to implement a simple context manager in C without having to rely on a separate coroutine builder - it makes it easy to provide a low-overhead implementation for context managers which don't need to maintain any special state between the __enter__ and __exit__ methods (having to use a generator for these would impose unnecessary overhead without any compensating benefit) - it makes it possible to understand how the with statement works without having to first understand the mechanics of how generator context managers are implemented. 6. The decorator to make a context manager from a generator will be a builtin called "contextmanager". The shorter term "context" was considered too ambiguous and potentially confusing [9]. The different flavours of generators can then be described as: - A "generator function" is an undecorated function containing the 'yield' keyword, and the objects produced by such functions are "generator-iterators". The term "generator" may refer to either a generator function or a generator-iterator depending on the situation. - A "generator context function" is a generator function to which the "contextmanager" decorator is applied and the objects produced by such functions are "generator-context- managers". The term "generator context" may refer to either a generator context function or a generator-context-manager depending on the situation. 7. A generator function used to implement a __context__ method will need to be decorated with the contextmanager decorator in order to have the correct behaviour. Otherwise, you will get an AttributeError when using the class in a with statement, as normal generator-iterators will NOT have __enter__ or __exit__ methods. Getting deterministic closure of generators will require a separate context manager such as the closing example below. As Guido put it, "too much magic is bad for your health" [10]. 8. It is fine to raise AttributeError instead of TypeError if the relevant methods aren't present on a class involved in a with statement. The fact that the abstract object C API raises TypeError rather than AttributeError is an accident of history, rather than a deliberate design decision [11]. Examples The generator based examples rely on PEP 342. Also, some of the examples are likely to be unnecessary in practice, as the appropriate objects, such as threading.RLock, will be able to be used directly in with statements. The tense used in the names of the example context managers is not arbitrary. Past tense ("-ed") is used when the name refers to an action which is done in the __enter__ method and undone in the __exit__ method. Progressive tense ("-ing") is used when the name refers to an action which is to be done in the __exit__ method. 1. A template for ensuring that a lock, acquired at the start of a block, is released when the block is left: @contextmanager def locked(lock): lock.acquire() try: yield finally: lock.release() Used as follows: with locked(myLock): # Code here executes with myLock held. The lock is # guaranteed to be released when the block is left (even # if via return or by an uncaught exception). PEP 319 gives a use case for also having an unlocked() template; this can be written very similarly (just swap the acquire() and release() calls). 2. A template for opening a file that ensures the file is closed when the block is left: @contextmanager def opened(filename, mode="r"): f = open(filename, mode) try: yield f finally: f.close() Used as follows: with opened("/etc/passwd") as f: for line in f: print line.rstrip() 3. A template for committing or rolling back a database transaction: @contextmanager def transaction(db): db.begin() try: yield None except: db.rollback() else: db.commit() 4. Example 1 rewritten without a generator: class locked: def __init__(self, lock): self.lock = lock def __context__(self): return self def __enter__(self): self.lock.acquire() def __exit__(self, type, value, tb): self.lock.release() (This example is easily modified to implement the other relatively stateless examples; it shows that it is easy to avoid the need for a generator if no special state needs to be preserved.) 5. Redirect stdout temporarily: @contextmanager def stdout_redirected(new_stdout): save_stdout = sys.stdout sys.stdout = new_stdout try: yield None finally: sys.stdout = save_stdout Used as follows: with opened(filename, "w") as f: with stdout_redirected(f): print "Hello world" This isn't thread-safe, of course, but neither is doing this same dance manually. In single-threaded programs (for example, in scripts) it is a popular way of doing things. 6. A variant on opened() that also returns an error condition: @contextmanager def opened_w_error(filename, mode="r"): try: f = open(filename, mode) except IOError, err: yield None, err else: try: yield f, None finally: f.close() Used as follows: with opened_w_error("/etc/passwd", "a") as (f, err): if err: print "IOError:", err else: f.write("guido::0:0::/:/bin/sh\n") 7. Another useful example would be an operation that blocks signals. The use could be like this: import signal with signal.blocked(): # code executed without worrying about signals An optional argument might be a list of signals to be blocked; by default all signals are blocked. The implementation is left as an exercise to the reader. 8. Another use for this feature is the Decimal context. Here's a simple example, after one posted by Michael Chermside: import decimal @contextmanager def extra_precision(places=2): c = decimal.getcontext() saved_prec = c.prec c.prec += places try: yield None finally: c.prec = saved_prec Sample usage (adapted from the Python Library Reference): def sin(x): "Return the sine of x as measured in radians." with extra_precision(): i, lasts, s, fact, num, sign = 1, 0, x, 1, x, 1 while s != lasts: lasts = s i += 2 fact *= i * (i-1) num *= x * x sign *= -1 s += num / fact * sign # The "+s" rounds back to the original precision, # so this must be outside the with-statement: return +s 9. Here's a proposed native context manager for decimal.Context: # This would be a new decimal.Context method @contextmanager def __context__(self): # We set the thread context to a copy of this context # to ensure that changes within the block are kept # local to the block. This also gives us thread safety # and supports nested usage of a given context. newctx = self.copy() oldctx = decimal.getcontext() decimal.setcontext(newctx) try: yield newctx finally: decimal.setcontext(oldctx) Sample usage: def sin(x): with decimal.getcontext() as ctx: ctx.prec += 2 # Rest of sin calculation algorithm # uses a precision 2 greater than normal return +s # Convert result to normal precision def sin(x): with decimal.ExtendedContext: # Rest of sin calculation algorithm # uses the Extended Context from the # General Decimal Arithmetic Specification return +s # Convert result to normal context 10. A generic "object-closing" template: @contextmanager def closing(obj): try: yield obj finally: try: close = obj.close except AttributeError: pass else: close() This can be used to deterministically close anything with a close method, be it file, generator, or something else. It can even be used when the object isn't guaranteed to require closing (e.g., a function that accepts an arbitrary iterable): # emulate opening(): with closing(open("argument.txt")) as contradiction: for line in contradiction: print line # deterministically finalize an iterator: with closing(iter(data_source)) as data: for datum in data: process(datum) 11. Native contexts for objects with acquire/release methods: # This would be a new method of e.g., threading.RLock def __context__(self): return locked(self) def released(self): return unlocked(self) Sample usage: with my_lock: # Operations with the lock held with my_lock.released(): # Operations without the lock # e.g. blocking I/O # Lock is held again here 12. A "nested" context manager that automatically nests the supplied contexts from left-to-right to avoid excessive indentation: class nested(object): def __init__(*contexts): self.contexts = contexts self.entered = None def __context__(self): return self def __enter__(self): if self.entered is not None: raise RuntimeError("Context is not reentrant") self.entered = deque() vars = [] try: for context in self.contexts: mgr = context.__context__() vars.append(mgr.__enter__()) self.entered.appendleft(mgr) except: self.__exit__(*sys.exc_info()) raise return vars def __exit__(self, *exc_info): # Behave like nested with statements # first in, last out # New exceptions override old ones ex = exc_info for mgr in self.entered: try: mgr.__exit__(*ex) except: ex = sys.exc_info() self.entered = None if ex is not exc_info: raise ex[0], ex[1], ex[2] Sample usage: with nested(a, b, c) as (x, y, z): # Perform operation Is equivalent to: with a as x: with b as y: with c as z: # Perform operation References [1] http://blogs.msdn.com/oldnewthing/archive/2005/01/06/347666.aspx [2] http://mail.python.org/pipermail/python-dev/2005-May/053885.html [3] http://wiki.python.org/moin/WithStatement [4] http://mail.python.org/pipermail/python-dev/2005-July/054658.html [5] http://mail.python.org/pipermail/python-dev/2005-October/056947.html [6] http://mail.python.org/pipermail/python-dev/2005-October/056969.html [7] http://mail.python.org/pipermail/python-dev/2005-October/057018.html [8] http://mail.python.org/pipermail/python-dev/2005-June/054064.html [9] http://mail.python.org/pipermail/python-dev/2005-October/057520.html [10] http://mail.python.org/pipermail/python-dev/2005-October/057535.html [11] http://mail.python.org/pipermail/python-dev/2005-October/057625.html Copyright This document has been placed in the public domain.