The module defines the following functions and constants, and an exception:
The expression's behaviour can be modified by specifying a flags value. Values can be any of the following variables, combined using bitwise OR (the | operator).
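As a minimal sketch of combining flags, the example below assumes the module's IGNORECASE and MULTILINE flags; the pattern and sample text are made up for illustration.

```python
import re

# Hypothetical sample text; the pattern and names are illustrative only.
text = "Bar\nbaz\nfoo"

# IGNORECASE lets "b" match "B"; MULTILINE lets "^" match at the start of each line.
prog = re.compile(r"^b\w+", re.IGNORECASE | re.MULTILINE)
matches = prog.findall(text)
print(matches)  # ['Bar', 'baz']
```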
The sequence
prog = re.compile(pat)
result = prog.match(str)
is equivalent to
result = re.match(pat, str)
but the version using compile() is more efficient when the expression will be used several times in a single program.
Note: If you want to locate a match anywhere in string, use search() instead.
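The difference can be seen in a short sketch; the pattern and string here are arbitrary illustrations.

```python
import re

# match() only succeeds if the pattern matches at the start of the string.
at_start = re.match("c", "abcdef")

# search() scans forward and reports the first location where the pattern matches.
anywhere = re.search("c", "abcdef")

print(at_start)          # None
print(anywhere.start())  # 2
```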
>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split(r'(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split(r'\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
This function combines and extends the functionality of the old regsub.split() and regsub.splitx().
>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
The pattern may be a string or a regex object. If you need to specify regular expression flags, you must either use a regex object or use embedded modifiers in the pattern; e.g. sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'.
The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer, and the default value of 0 means to replace all occurrences.
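A small sketch of the effect of count follows; the input string is an arbitrary illustration, and count is passed by keyword here for clarity.

```python
import re

# Replace at most two occurrences of "a".
limited = re.sub("a", "x", "aaaa", count=2)

# The default of 0 replaces every occurrence.
unlimited = re.sub("a", "x", "aaaa", count=0)

print(limited)    # xxaa
print(unlimited)  # xxxx
```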
Empty matches for the pattern are replaced only when not adjacent to a previous match, so "sub('x*', '-', 'abc')" returns '-a-b-c-'.
If repl is a string, any backslash escapes in it are processed. That is, "\n" is converted to a single newline character, "\r" is converted to a carriage return, and so forth. Unknown escapes such as "\j" are left alone. Backreferences, such as "\6", are replaced with the substring matched by group 6 in the pattern.
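The following sketch illustrates escape processing and numeric backreferences in a string replacement; the patterns and inputs are invented for the example.

```python
import re

# The two-character escape \n in the replacement becomes a real newline.
with_newline = re.sub("-", r"\n", "a-b")

# \2 and \1 refer to the second and first groups of the pattern.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

print(repr(with_newline))  # 'a\nb'
print(swapped)             # example at user
```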
In addition to character escapes and backreferences as described above, "\g<name>" will use the substring matched by the group named "name", as defined by the (?P<name>...) syntax. "\g<number>" uses the corresponding group number; "\g<2>" is therefore equivalent to "\2", but isn't ambiguous in a replacement such as "\g<2>0". "\20" would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character "0".
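As a brief sketch of both forms, the example below uses made-up group names ("first", "second") and sample text.

```python
import re

# Named groups are referenced in the replacement with \g<name>.
named = re.sub(r"(?P<first>\w+) (?P<second>\w+)",
               r"\g<second> \g<first>", "hello world")

# \g<2>0 means "group 2, then a literal 0"; \20 would ask for group 20.
numbered = re.sub(r"(\w+) (\w+)", r"\g<2>0", "hello world")

print(named)     # world hello
print(numbered)  # world0
```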