"""
This module contains the core classes of version 2.0 of SAX for Python.
This file provides only default classes with absolutely minimum
functionality, from which drivers and applications can be subclassed.

Many of these classes are empty and are included only as documentation
of the interfaces.

$Id$
"""

version = '2.0beta'

#============================================================================
#
# HANDLER INTERFACES
#
#============================================================================

# ===== ERRORHANDLER =====

class ErrorHandler:
    """Basic interface for SAX error handlers.

    If you create an object that implements this interface and
    register it with your XMLReader, the parser will call the
    methods in your object to report all warnings and errors. There
    are three levels of errors available: warnings, (possibly)
    recoverable errors, and unrecoverable errors. All methods take a
    SAXParseException as the only parameter."""

    def error(self, exception):
        "Handle a recoverable error."
        raise exception

    def fatalError(self, exception):
        "Handle a non-recoverable error."
        raise exception

    def warning(self, exception):
        "Handle a warning."
        print(exception)


# ===== CONTENTHANDLER =====

class ContentHandler:
    """Interface for receiving logical document content events.

    This is the main callback interface in SAX, and the one most
    important to applications. The order of events in this interface
    mirrors the order of the information in the document."""

    def __init__(self):
        self._locator = None

    def setDocumentLocator(self, locator):
        """Called by the parser to give the application a locator for
        locating the origin of document events.

        SAX parsers are strongly encouraged (though not absolutely
        required) to supply a locator: if a parser does so, it must
        supply the locator to the application by invoking this method
        before invoking any of the other methods in the ContentHandler
        interface.

        The locator allows the application to determine the end
        position of any document-related event, even if the parser is
        not reporting an error. Typically, the application will use
        this information for reporting its own errors (such as
        character content that does not match an application's
        business rules). The information returned by the locator is
        probably not sufficient for use with a search engine.

        Note that the locator will return correct information only
        during the invocation of the events in this interface. The
        application should not attempt to use it at any other time."""
        self._locator = locator

    def startDocument(self):
        """Receive notification of the beginning of a document.

        The SAX parser will invoke this method only once, before any
        other methods in this interface or in DTDHandler (except for
        setDocumentLocator)."""

    def endDocument(self):
        """Receive notification of the end of a document.

        The SAX parser will invoke this method only once, and it will
        be the last method invoked during the parse. The parser shall
        not invoke this method until it has either abandoned parsing
        (because of an unrecoverable error) or reached the end of
        input."""

    def startPrefixMapping(self, prefix, uri):
        """Begin the scope of a prefix-URI Namespace mapping.

        The information from this event is not necessary for normal
        Namespace processing: the SAX XML reader will automatically
        replace prefixes for element and attribute names when the
        http://xml.org/sax/features/namespaces feature is true (the
        default).

        There are cases, however, when applications need to use
        prefixes in character data or in attribute values, where they
        cannot safely be expanded automatically; the
        start/endPrefixMapping events supply the information the
        application needs to expand prefixes in those contexts itself,
        if necessary.

        Note that start/endPrefixMapping events are not guaranteed to
        be properly nested relative to each other: all
        startPrefixMapping events will occur before the corresponding
        startElement event, and all endPrefixMapping events will occur
        after the corresponding endElement event, but their order is
        not guaranteed."""

    def endPrefixMapping(self, prefix):
        """End the scope of a prefix-URI mapping.

        See startPrefixMapping for details. This event will always
        occur after the corresponding endElement event, but the order
        of endPrefixMapping events is not otherwise guaranteed."""

    def startElement(self, name, attrs):
        """Signals the start of an element in non-namespace mode.

        The name parameter contains the raw XML 1.0 name of the
        element type as a string and the attrs parameter holds an
        instance of the Attributes class containing the attributes of
        the element."""

    def endElement(self, name):
        """Signals the end of an element in non-namespace mode.

        The name parameter contains the name of the element type, just
        as with the startElement event."""

    def startElementNS(self, name, qname, attrs):
        """Signals the start of an element in namespace mode.

        The name parameter contains the name of the element type as a
        (uri, localname) tuple, the qname parameter the raw XML 1.0
        name used in the source document, and the attrs parameter
        holds an instance of the Attributes class containing the
        attributes of the element.

        The uri part of the name tuple is None for elements which have
        no namespace."""

    def endElementNS(self, name, qname):
        """Signals the end of an element in namespace mode.

        The name parameter contains the name of the element type, just
        as with the startElementNS event."""

    def characters(self, content):
        """Receive notification of character data.

        The Parser will call this method to report each chunk of
        character data. SAX parsers may return all contiguous
        character data in a single chunk, or they may split it into
        several chunks; however, all of the characters in any single
        event must come from the same external entity so that the
        Locator provides useful information."""

    def ignorableWhitespace(self, whitespace):
        """Receive notification of ignorable whitespace in element content.

        Validating Parsers must use this method to report each chunk
        of ignorable whitespace (see the W3C XML 1.0 recommendation,
        section 2.10): non-validating parsers may also use this method
        if they are capable of parsing and using content models.

        SAX parsers may return all contiguous whitespace in a single
        chunk, or they may split it into several chunks; however, all
        of the characters in any single event must come from the same
        external entity, so that the Locator provides useful
        information."""

    def processingInstruction(self, target, data):
        """Receive notification of a processing instruction.

        The Parser will invoke this method once for each processing
        instruction found: note that processing instructions may occur
        before or after the main document element.

        A SAX parser should never report an XML declaration (XML 1.0,
        section 2.8) or a text declaration (XML 1.0, section 4.3.1)
        using this method."""

    def skippedEntity(self, name):
        """Receive notification of a skipped entity.

        The Parser will invoke this method once for each entity
        skipped. Non-validating processors may skip entities if they
        have not seen the declarations (because, for example, the
        entity was declared in an external DTD subset). All processors
        may skip external entities, depending on the values of the
        http://xml.org/sax/features/external-general-entities and the
        http://xml.org/sax/features/external-parameter-entities
        features."""


# ===== DTDHandler =====

class DTDHandler:
    """Handle DTD events.

    This interface specifies only those DTD events required for basic
    parsing (unparsed entities and attributes)."""

    def notationDecl(self, name, publicId, systemId):
        "Handle a notation declaration event."

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        "Handle an unparsed entity declaration event."


# ===== ENTITYRESOLVER =====

class EntityResolver:
    """Basic interface for resolving entities. If you create an object
    implementing this interface and register it with your Parser,
    the parser will call the method in your object to resolve all
    external entities. Note that DefaultHandler implements this
    interface with the default behaviour."""

    def resolveEntity(self, publicId, systemId):
        """Resolve the system identifier of an entity and return either
        the system identifier to read from as a string, or an InputSource
        to read from."""
        return systemId


#============================================================================
#
# CORE FEATURES
#
#============================================================================

feature_namespaces = "http://xml.org/sax/features/namespaces"
# true: Perform Namespace processing (default).
# false: Optionally do not perform Namespace processing
#        (implies namespace-prefixes).
# access: (parsing) read-only; (not parsing) read/write

feature_namespace_prefixes = "http://xml.org/sax/features/namespace-prefixes"
# true: Report the original prefixed names and attributes used for Namespace
#       declarations.
# false: Do not report attributes used for Namespace declarations, and
#        optionally do not report original prefixed names (default).
# access: (parsing) read-only; (not parsing) read/write

feature_string_interning = "http://xml.org/sax/features/string-interning"
# true: All element names, prefixes, attribute names, Namespace URIs, and
#       local names are interned using the built-in intern function.
# false: Names are not necessarily interned, although they may be (default).
# access: (parsing) read-only; (not parsing) read/write

feature_validation = "http://xml.org/sax/features/validation"
# true: Report all validation errors (implies external-general-entities and
#       external-parameter-entities).
# false: Do not report validation errors.
# access: (parsing) read-only; (not parsing) read/write

feature_external_ges = "http://xml.org/sax/features/external-general-entities"
# true: Include all external general (text) entities.
# false: Do not include external general entities.
# access: (parsing) read-only; (not parsing) read/write

feature_external_pes = "http://xml.org/sax/features/external-parameter-entities"
# true: Include all external parameter entities, including the external
#       DTD subset.
# false: Do not include any external parameter entities, even the external
#        DTD subset.
# access: (parsing) read-only; (not parsing) read/write

all_features = [feature_namespaces,
                feature_namespace_prefixes,
                feature_string_interning,
                feature_validation,
                feature_external_ges,
                feature_external_pes]


#============================================================================
#
# CORE PROPERTIES
#
#============================================================================

property_lexical_handler = "http://xml.org/sax/properties/lexical-handler"
# data type: xml.sax.sax2lib.LexicalHandler
# description: An optional extension handler for lexical events like comments.
# access: read/write

property_declaration_handler = "http://xml.org/sax/properties/declaration-handler"
# data type: xml.sax.sax2lib.DeclHandler
# description: An optional extension handler for DTD-related events other
#              than notations and unparsed entities.
# access: read/write

property_dom_node = "http://xml.org/sax/properties/dom-node"
# data type: org.w3c.dom.Node
# description: When parsing, the current DOM node being visited if this is
#              a DOM iterator; when not parsing, the root DOM node for
#              iteration.
# access: (parsing) read-only; (not parsing) read/write

property_xml_string = "http://xml.org/sax/properties/xml-string"
# data type: String
# description: The literal string of characters that was the source for
#              the current event.
# access: read-only

property_encoding = "http://www.python.org/sax/properties/encoding"
# data type: String
# description: The name of the encoding to assume for input data.
# access: write: set the encoding, e.g. established by a higher-level
#                protocol. May change during parsing (e.g. after
#                processing a META tag)
#         read:  return the current encoding (possibly established through
#                auto-detection).
# initial value: UTF-8
#

property_interning_dict = "http://www.python.org/sax/properties/interning-dict"
# data type: Dictionary
# description: The dictionary used to intern common strings in the document
# access: write: Request that the parser uses a specific dictionary, to
#                allow interning across different documents
#         read:  return the current interning dictionary, or None
#

all_properties = [property_lexical_handler,
                  property_dom_node,
                  property_declaration_handler,
                  property_xml_string,
                  property_encoding,
                  property_interning_dict]
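
# Usage sketch (not part of the module). The classes above are meant to be
# subclassed, so the example below shows one possible way an application
# might do that, assuming the module is importable as xml.sax.handler and
# that xml.sax.make_parser() returns a working XMLReader. The names
# TitleCollector, CollectingErrorHandler and collect_titles are hypothetical
# and exist only for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler, feature_namespaces


class TitleCollector(ContentHandler):
    """Hypothetical handler: gathers the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None while outside a <title> element

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


class CollectingErrorHandler(ErrorHandler):
    """Hypothetical handler: records warnings instead of printing them.

    error() and fatalError() are inherited, so they still raise.
    """

    def __init__(self):
        self.warnings = []

    def warning(self, exception):
        self.warnings.append(exception)


def collect_titles(xml_bytes):
    """Parse xml_bytes and return the text of each <title> element."""
    handler = TitleCollector()
    parser = xml.sax.make_parser()
    # Non-namespace mode: the parser calls startElement/endElement
    # rather than startElementNS/endElementNS.
    parser.setFeature(feature_namespaces, False)
    parser.setContentHandler(handler)
    parser.setErrorHandler(CollectingErrorHandler())
    parser.parse(io.BytesIO(xml_bytes))
    return handler.titles
```

# Calling collect_titles(b"<lib><title>A &amp; B</title></lib>") returns
# ["A & B"] regardless of how the parser splits the character data, because
# characters() only appends to a buffer that endElement() joins. Malformed
# input reaches the inherited fatalError(), which raises the
# SAXParseException it is given.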