Changed in version 1.5.2.
This module defines a class XMLParser which serves as the basis for parsing text files formatted in XML (Extensible Markup Language).
This class provides the following interface methods and instance variables:
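The attributes variable is a mapping of element names to mappings. The inner mapping maps each attribute name that is valid for the element to the attribute's default value, or to None if there is no default. The default value is the empty dictionary. This variable is meant to be overridden, not extended, since the default is shared by all instances of XMLParser.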
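The elements variable is a mapping of element names to tuples. Each tuple contains the functions that handle the start tag and the end tag of the element, respectively, or None if the method unknown_starttag() or unknown_endtag() is to be called. The default value is the empty dictionary. This variable is meant to be overridden, not extended, since the default is shared by all instances of XMLParser.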
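The warning about overriding rather than extending follows from how Python shares class-level dictionaries. The sketch below illustrates it with ParserBase, a hypothetical stand-in for XMLParser (the real class is not used here), and two invented subclasses.

```python
class ParserBase:
    # Hypothetical stand-in for XMLParser: the default mapping lives on the class,
    # so every instance sees the same dictionary object.
    elements = {}


class OverridingParser(ParserBase):
    def __init__(self):
        # Override: bind a fresh dictionary on the instance.
        self.elements = {'A': (self.start_a, self.end_a)}

    def start_a(self, attributes):
        pass

    def end_a(self):
        pass


class ExtendingParser(ParserBase):
    def __init__(self):
        # Extend: this mutates the dictionary shared through ParserBase.
        self.elements['A'] = (None, None)


OverridingParser()
leaked_after_override = 'A' in ParserBase.elements

ExtendingParser()
leaked_after_extend = 'A' in ParserBase.elements
```

After the overriding subclass is instantiated the shared default is still empty. After the extending subclass is instantiated, every other parser that relies on the default also sees an entry for 'A'.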
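The entitydefs variable is a mapping of entity names to their values. The default value contains definitions for 'lt', 'gt', 'amp', 'quot', and 'apos'.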
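handle_xml(encoding, standalone) is called when the <?xml ...?> declaration is processed. The arguments are the values of the encoding and standalone attributes in the declaration. Both are optional; the values passed to handle_xml() default to None and the string 'no' respectively.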
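handle_doctype(tag, pubid, syslit, data) is called when the <!DOCTYPE ...> declaration is processed. The arguments are the tag name of the root element, the Formal Public Identifier (or None if not specified), the system identifier, and the uninterpreted contents of the internal DTD subset as a string (or None if not present).
handle_starttag(tag, method, attributes) is called to handle start tags for which a start tag handler is defined in the instance variable elements. The tag argument is the name of the tag, and the method argument is the function (method) that should be used to support semantic interpretation of the start tag. The attributes argument is a dictionary of attributes, the key being the name and the value being the value of the attribute found inside the tag's <> brackets. Character and entity references in the value have been interpreted. For instance, for the start tag <A HREF="http://www.cwi.nl/">, this method would be called as handle_starttag('A', self.elements['A'][0], {'HREF': 'http://www.cwi.nl/'}). The base implementation simply calls method with attributes as the only argument.
handle_endtag(tag, method) is called to handle end tags for which an end tag handler is defined in the instance variable elements. The tag argument is the name of the tag, and the method argument is the function (method) that should be used to support semantic interpretation of the end tag. For instance, for the end tag </A>, this method would be called as handle_endtag('A', self.elements['A'][1]). The base implementation simply calls method.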
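The start and end tag dispatch described here can be sketched without the module itself. SketchParser below is a hypothetical stand-in, not XMLParser; its handle_starttag() and handle_endtag() only mirror the stated base behaviour of calling method, and start_a, end_a, and log are names invented for illustration.

```python
class SketchParser:
    """Hypothetical stand-in that mimics the documented dispatch."""

    def __init__(self):
        self.log = []
        # Element name -> (start tag handler, end tag handler).
        self.elements = {'A': (self.start_a, self.end_a)}

    def handle_starttag(self, tag, method, attributes):
        # Base behaviour: call the handler with the attributes as the only argument.
        method(attributes)

    def handle_endtag(self, tag, method):
        # Base behaviour: call the handler with no arguments.
        method()

    def start_a(self, attributes):
        self.log.append(('start', attributes.get('HREF')))

    def end_a(self):
        self.log.append(('end', None))


parser = SketchParser()
start, end = parser.elements['A']
parser.handle_starttag('A', start, {'HREF': 'http://www.cwi.nl/'})
parser.handle_endtag('A', end)
```

A subclass that wants to wrap every handler call (for logging or error handling, say) could override the two handle_ methods instead of each individual handler.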
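handle_charref(ref) is called to process a character reference of the form &#ref;. ref can be either a decimal number or, when preceded by an 'x', a hexadecimal number. In the base implementation, ref must be a number in the range 0-255; the method translates the character to ASCII and calls handle_data() with the character as argument. If ref is invalid or out of range, the method unknown_charref(ref) is called to handle the error. A subclass must override this method to provide support for character references outside of the ASCII range.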
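As a rough illustration of the valid versus unknown split for character references, the sketch below decodes the text of a reference. It assumes the XML convention that the reference text is a decimal number, or a hexadecimal number prefixed with 'x', and that the base behaviour accepts only values below 256. The function decode_charref and its limit parameter are hypothetical names, not part of the module.

```python
def decode_charref(ref, limit=256):
    """Return the character for ref, or None if ref is invalid or out of range.

    A None result marks the case where unknown_charref(ref) would be called.
    """
    try:
        if ref.startswith('x'):
            code = int(ref[1:], 16)   # hexadecimal form, e.g. 'x41'
        else:
            code = int(ref, 10)       # decimal form, e.g. '65'
    except ValueError:
        return None
    if 0 <= code < limit:
        return chr(code)
    return None


narrow = decode_charref('8364')                  # out of the assumed base range
wide = decode_charref('8364', limit=0x110000)    # a subclass might widen the range
```

Raising the limit is one way a subclass could support references beyond the narrow default range.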
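handle_entityref(ref) is called to process a general entity reference of the form &ref;, where ref is the entity name. It looks for ref in the instance (or class) variable entitydefs, which should be a mapping from entity names to corresponding translations. If a translation is found, it calls handle_data() with the translation; otherwise, it calls unknown_entityref(ref). The default entitydefs defines translations for &amp;, &apos;, &gt;, &lt;, and &quot;.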
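The lookup with a fallback for unknown names can be sketched as a plain dictionary lookup. The table below maps the five predefined names directly to their characters for simplicity, which is an assumption about the values and not a copy of the module's table. resolve_entityref and on_unknown are hypothetical names.

```python
# Illustrative table: the five predefined XML entity names and their characters.
ENTITYDEFS = {'lt': '<', 'gt': '>', 'amp': '&', 'quot': '"', 'apos': "'"}


def resolve_entityref(ref, entitydefs, on_unknown):
    """Return the translation for ref, or defer to on_unknown(ref)."""
    try:
        return entitydefs[ref]
    except KeyError:
        # Plays the role of unknown_entityref(ref).
        return on_unknown(ref)


unknown_refs = []
known = resolve_entityref('amp', ENTITYDEFS, unknown_refs.append)
missing = resolve_entityref('nbsp', ENTITYDEFS, unknown_refs.append)
```

Here the fallback simply records the unknown name; a real handler might raise an error or substitute a placeholder.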
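handle_comment(comment) is called when a comment is encountered. The comment argument is a string containing the text between the <!-- and --> delimiters, but not the delimiters themselves. For example, the comment <!--text--> will cause this method to be called with the argument 'text'. The default method does nothing.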
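handle_cdata(data) is called when a CDATA section is encountered. The data argument is a string containing the text between the <![CDATA[ and ]]> delimiters, but not the delimiters themselves. For example, the section <![CDATA[text]]> will cause this method to be called with the argument 'text'. The default method does nothing, and is intended to be overridden.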
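handle_proc(name, data) is called when a processing instruction (PI) is encountered. The name argument is the PI target, and the data argument is a string containing the text between the PI target and the closing delimiter, but not the delimiter itself. For example, the instruction <?XML text?> will cause this method to be called with the arguments 'XML' and 'text'. The default method does nothing. Note that if a document starts with "<?xml ..?>", handle_xml() is called to handle it.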
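handle_special(data) is called when a declaration is encountered. The data argument is a string containing the text between the <! and > delimiters, but not the delimiters themselves. For example, the entity declaration <!ENTITY text> will cause this method to be called with the argument 'ENTITY text'. The default method does nothing. Note that "<!DOCTYPE ...>" is handled separately if it is located at the start of the document.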
See Also:
The XML specification, published by the World Wide Web Consortium (W3C), is available online at http://www.w3.org/TR/REC-xml. References to additional material on XML are available at http://www.w3.org/XML/.
The Python XML Topic Guide provides a great deal of information on using XML from Python and links to other sources of information on XML. It's located on the Web at http://www.python.org/topics/xml/.
The Python XML Special Interest Group is developing substantial support for processing XML from Python. See http://www.python.org/sigs/xml-sig/ for more information.