4.2 Built-in Module re

 

 

This module provides regular expression matching operations similar to those found in Perl. It's 8-bit clean: both patterns and strings may contain null bytes and characters whose high bit is set. It is always available.

Regular expressions use the backslash character (\) to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python's usage of the same character for the same purpose in string literals; for example, to match a literal backslash, one might have to write \\\\ as the pattern string, because the regular expression must be \\, and each backslash must be expressed as \\ inside a regular Python string literal.

The solution is to use Python's raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two character string containing a backslash and the letter 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.





guido@python.org