The module defines the following functions and constants, and an exception:
The expression's behaviour can be modified by specifying a flags value. Values can be any of the following variables, combined using bitwise OR (the | operator).
Perform case-insensitive matching; expressions like [A-Z] will match lowercase letters, too. This is not affected by the current locale.
Make \w, \W, \b, \B, dependent on the current locale.
When specified, the pattern character ^ matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character $ matches at the end of the string and at the end of each line (immediately preceding each newline). By default, ^ matches only at the beginning of the string, and $ only at the end of the string and immediately before the newline (if any) at the end of the string.
Make the . special character any character at all, including a newline; without this flag, . will match anything except a newline.
Ignore whitespace within the pattern except when in a character class or preceded by an unescaped backslash, and, when a line contains a # neither in a character class or preceded by an unescaped backslash, all characters from the leftmost such # through the end of the line are ignored.
The sequence
prog = re.compile(pat) result = prog.match(str)
result = re.match(pat, str)
but the version using compile() is more efficient when the expression will be used several times in a single program.
>>> re.split('[\W]+', 'Words, words, words.') ['Words', 'words', 'words', ''] >>> re.split('([\W]+)', 'Words, words, words.') ['Words', ', ', 'words', ', ', 'words', '.', ''] >>> re.split('[\W]+', 'Words, words, words.', 1) ['Words', 'words, words.']
>>> def dashrepl(matchobj): ... if matchobj.group(0) == '-': return ' ' ... else: return '-' >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files') 'pro--gram files'
sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'.
The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer, and the default value of 0 means to replace all occurrences.
Empty matches for the pattern are replaced only when not adjacent to a previous match, so sub('x*', '-', 'abc') returns '-a-b-c-'.