11.23 distutils.text_file -- The TextFile class

This module provides the TextFile class, which gives an interface to text files that (optionally) takes care of stripping comments, ignoring blank lines, and joining lines with backslashes.

class TextFile( [filename=None, file=None, **options])
This class provides a file-like object that takes care of all the things you commonly want to do when processing a text file that has some line-by-line syntax: strip comments (as long as # is your comment character), skip blank lines, join adjacent lines by escaping the newline (i.e., a backslash at the end of the line), and strip leading and/or trailing whitespace. All of these are optional and independently controllable.

The class provides a warn() method so you can generate warning messages that report the physical line number, even if the logical line in question spans multiple physical lines. It also provides unreadline() for implementing line-at-a-time lookahead.

TextFile instances are created with either filename, file, or both. RuntimeError is raised if both are None. filename should be a string, and file a file object (or something that provides readline() and close() methods). It is recommended that you supply at least filename, so that TextFile can include it in warning messages. If file is not supplied, TextFile creates its own using the open() built-in function.
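
As a minimal sketch of the construction rules above, the following wraps an in-memory io.StringIO object (which provides readline() and close()) and supplies a filename purely for use in messages. The name example.txt and the sample text are invented for illustration. The fallback import is an assumption about the environment: distutils was removed from the standard library in Python 3.12, and setuptools bundles a copy.

```python
import io

try:
    from distutils.text_file import TextFile
except ImportError:  # assumption: setuptools' bundled copy is installed instead
    from setuptools._distutils.text_file import TextFile

# A file-like object plus a filename that is used only for reporting.
source = io.StringIO("first line\nsecond line\n")
tf = TextFile(filename="example.txt", file=source)
lines = tf.readlines()
tf.close()

# Supplying neither filename nor file is rejected with RuntimeError.
try:
    TextFile()
    raised = False
except RuntimeError:
    raised = True

print(lines)
print(raised)
```

Because rstrip_ws defaults to true, the returned strings carry no trailing newline.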

The options are all boolean, and affect the values returned by readline(). Each option is listed with its default, followed by its description:

strip_comments (default: true)
strip from "#" to end-of-line, as well as any whitespace leading up to the "#", unless it is escaped by a backslash

lstrip_ws (default: false)
strip leading whitespace from each line before returning it

rstrip_ws (default: true)
strip trailing whitespace (including line terminator!) from each line before returning it

skip_blanks (default: true)
skip lines that are empty *after* stripping comments and whitespace. (If both lstrip_ws and rstrip_ws are false, then some lines may consist of solely whitespace: these will *not* be skipped, even if skip_blanks is true.)

join_lines (default: false)
if a backslash is the last non-newline character on a line after stripping comments and whitespace, join the following line to it to form one logical line; if N consecutive lines end with a backslash, then N+1 physical lines will be joined to form one logical line

collapse_join (default: false)
strip leading whitespace from lines that are joined to their predecessor; only matters if "(join_lines and not lstrip_ws)"
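
The effect of the options can be illustrated by reading the same text under three configurations. This is only a sketch: the file name demo.cfg, the sample text, and the helper read_with() are hypothetical, and the fallback import assumes setuptools' bundled copy of distutils is installed where the standard-library module is absent.

```python
import io

try:
    from distutils.text_file import TextFile
except ImportError:  # assumption: setuptools' bundled copy is installed instead
    from setuptools._distutils.text_file import TextFile

TEXT = (
    "name = value   # trailing comment\n"
    "\n"
    "long = one \\\n"
    "    two\n"
)

def read_with(**options):
    """Hypothetical helper: read TEXT with the given TextFile options."""
    tf = TextFile(filename="demo.cfg", file=io.StringIO(TEXT), **options)
    try:
        return tf.readlines()
    finally:
        tf.close()

# Defaults: comments and trailing whitespace stripped, blank line skipped,
# but the backslash is left alone because join_lines is false.
defaults = read_with()

# join_lines glues the continuation on, keeping its leading whitespace.
joined = read_with(join_lines=True)

# collapse_join additionally strips leading whitespace from the joined line.
collapsed = read_with(join_lines=True, collapse_join=True)

for result in (defaults, joined, collapsed):
    print(result)
```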

Note that since rstrip_ws can strip the trailing newline, the semantics of readline() must differ from those of the built-in file object's readline() method! In particular, readline() returns None for end-of-file: an empty string might just be a blank line (or an all-whitespace line), if rstrip_ws is true but skip_blanks is not.
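
A read loop therefore has to test for None rather than for an empty string. The following sketch, with invented sample text and file name, turns skip_blanks off so that a blank line comes back as "" in the middle of the file. The fallback import assumes setuptools' bundled copy of distutils where the standard-library module is absent.

```python
import io

try:
    from distutils.text_file import TextFile
except ImportError:  # assumption: setuptools' bundled copy is installed instead
    from setuptools._distutils.text_file import TextFile

tf = TextFile(
    filename="blank.txt",
    file=io.StringIO("alpha\n\nbeta\n"),
    skip_blanks=False,  # rstrip_ws stays at its default (true)
)

results = []
while True:
    line = tf.readline()
    if line is None:  # end-of-file; "" would only mean a blank line
        break
    results.append(line)
tf.close()

print(results)
```

Testing `if not line:` instead would stop this loop at the blank second line and miss "beta".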

open( filename)
Open a new file filename. This overrides any file or filename constructor arguments.

close( )
Close the current file and forget everything we know about it (including the filename and the current line number).
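
One way to reuse a single instance across files, sketched under the assumption that each file is closed before the next is opened. The file names and contents are invented, and the fallback import assumes setuptools' bundled copy of distutils where the standard-library module is absent.

```python
import os
import tempfile

try:
    from distutils.text_file import TextFile
except ImportError:  # assumption: setuptools' bundled copy is installed instead
    from setuptools._distutils.text_file import TextFile

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first.txt")
    second = os.path.join(tmp, "second.txt")
    with open(first, "w") as f:
        f.write("from first\n")
    with open(second, "w") as f:
        f.write("from second\n")

    # No file object given, so TextFile opens the named file itself.
    tf = TextFile(filename=first)
    first_lines = tf.readlines()
    tf.close()

    # open() points the same instance at a different file.
    tf.open(second)
    second_lines = tf.readlines()
    tf.close()

print(first_lines, second_lines)
```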

warn( msg[,line=None])
Print (to stderr) a warning message tied to the current logical line in the current file. If the current logical line in the file spans multiple physical lines, the warning refers to the whole range, such as "lines 3-5". If line is supplied, it overrides the current line number; it may be a list or tuple to indicate a range of physical lines, or an integer for a single physical line.
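
The sketch below captures stderr to show how warn() reports line numbers after a joined logical line has been read. The file name setup.cfg, the sample text, and the messages are invented. The text above does not specify the rest of the message layout, so only the line-number portion is relied on here. The fallback import assumes setuptools' bundled copy of distutils where the standard-library module is absent.

```python
import contextlib
import io

try:
    from distutils.text_file import TextFile
except ImportError:  # assumption: setuptools' bundled copy is installed instead
    from setuptools._distutils.text_file import TextFile

# Three physical lines joined into one logical line, then a fourth line.
tf = TextFile(
    filename="setup.cfg",
    file=io.StringIO("a \\\nb \\\nc\nd\n"),
    join_lines=True,
)
logical = tf.readline()

captured = io.StringIO()
with contextlib.redirect_stderr(captured):
    tf.warn("suspicious value")             # refers to the lines just read
    tf.warn("explicit line", line=7)        # integer: one physical line
    tf.warn("explicit range", line=(2, 3))  # tuple: a range of lines
tf.close()

messages = captured.getvalue().splitlines()
for message in messages:
    print(message)
```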

readline( )
Read and return a single logical line from the current file (or from an internal buffer if lines have previously been "unread" with unreadline()). If the join_lines option is true, this may involve reading multiple physical lines concatenated into a single string. Updates the current line number, so calling warn() after readline() emits a warning about the physical line(s) just read. Returns None on end-of-file, since the empty string can occur if rstrip_ws is true but skip_blanks is not.

readlines( )
Read and return the list of all logical lines remaining in the current file. This updates the current line number to the last line of the file.

unreadline( line)
Push line (a string) onto an internal buffer that will be checked by future readline() calls. Handy for implementing a parser with line-at-a-time lookahead. Note that lines that are "unread" with unreadline() are not subsequently re-cleansed (whitespace stripped, or whatever) when read with readline(). If multiple calls are made to unreadline() before a call to readline(), the lines will be returned in most-recent-first order.
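
A small lookahead parser can be sketched with unreadline(): it reads lines into a section until it meets the next header, then pushes that header back for the following call. The bracketed section syntax, the file name, and the function read_section() are hypothetical. The fallback import assumes setuptools' bundled copy of distutils where the standard-library module is absent.

```python
import io

try:
    from distutils.text_file import TextFile
except ImportError:  # assumption: setuptools' bundled copy is installed instead
    from setuptools._distutils.text_file import TextFile

def read_section(tf):
    """Hypothetical parser step: return (header, body) or None at end-of-file."""
    header = tf.readline()
    if header is None:
        return None
    body = []
    while True:
        line = tf.readline()
        if line is None:
            break
        if line.startswith("["):
            # This line belongs to the next section, so push it back.
            tf.unreadline(line)
            break
        body.append(line)
    return header, body

tf = TextFile(filename="sections.txt",
              file=io.StringIO("[a]\nx\ny\n[b]\nz\n"))
sections = []
while True:
    section = read_section(tf)
    if section is None:
        break
    sections.append(section)

# Several unread lines come back most recent first.
tf.unreadline("one")
tf.unreadline("two")
replayed = [tf.readline(), tf.readline()]
tf.close()

print(sections)
print(replayed)
```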
