"""
This module contains the core classes of version 2.0 of SAX for Python.
This file provides only default classes with absolutely minimum
functionality, from which drivers and applications can be subclassed.

Many of these classes are empty and are included only as documentation
of the interfaces.

$Id$
"""

version = '2.0beta'

#============================================================================
#
# HANDLER INTERFACES
#
#============================================================================

# ===== ERRORHANDLER =====

class ErrorHandler:
    """Basic interface for SAX error handlers.

    If you create an object that implements this interface and then
    register the object with your XMLReader, the parser will call the
    methods in your object to report all warnings and errors. There
    are three levels of errors available: warnings, (possibly)
    recoverable errors, and unrecoverable errors. All methods take a
    SAXParseException as the only parameter."""

    def error(self, exception):
        "Handle a recoverable error."
        raise exception

    def fatalError(self, exception):
        "Handle a non-recoverable error."
        raise exception

    def warning(self, exception):
        "Handle a warning."
        print(exception)


# ===== CONTENTHANDLER =====

class ContentHandler:
    """Interface for receiving logical document content events.

    This is the main callback interface in SAX, and the one most
    important to applications. The order of events in this interface
    mirrors the order of the information in the document."""

    def __init__(self):
        self._locator = None

    def setDocumentLocator(self, locator):
        """Called by the parser to give the application a locator for
        locating the origin of document events.

        SAX parsers are strongly encouraged (though not absolutely
        required) to supply a locator: if a parser does so, it must
        supply the locator to the application by invoking this method
        before invoking any of the other methods in the ContentHandler
        interface.

        The locator allows the application to determine the end
        position of any document-related event, even if the parser is
        not reporting an error. Typically, the application will use
        this information for reporting its own errors (such as
        character content that does not match an application's
        business rules). The information returned by the locator is
        probably not sufficient for use with a search engine.

        Note that the locator will return correct information only
        during the invocation of the events in this interface. The
        application should not attempt to use it at any other time."""
        self._locator = locator

    def startDocument(self):
        """Receive notification of the beginning of a document.

        The SAX parser will invoke this method only once, before any
        other methods in this interface or in DTDHandler (except for
        setDocumentLocator)."""

    def endDocument(self):
        """Receive notification of the end of a document.

        The SAX parser will invoke this method only once, and it will
        be the last method invoked during the parse. The parser shall
        not invoke this method until it has either abandoned parsing
        (because of an unrecoverable error) or reached the end of
        input."""

    def startPrefixMapping(self, prefix, uri):
        """Begin the scope of a prefix-URI Namespace mapping.

        The information from this event is not necessary for normal
        Namespace processing: the SAX XML reader will automatically
        replace prefixes for element and attribute names when the
        http://xml.org/sax/features/namespaces feature is true (the
        default).

        There are cases, however, when applications need to use
        prefixes in character data or in attribute values, where they
        cannot safely be expanded automatically; the
        start/endPrefixMapping event supplies the information to the
        application to expand prefixes in those contexts itself, if
        necessary.

        Note that start/endPrefixMapping events are not guaranteed to
        be properly nested relative to each other: all
        startPrefixMapping events will occur before the corresponding
        startElement event, and all endPrefixMapping events will occur
        after the corresponding endElement event, but their order is
        not guaranteed."""

    def endPrefixMapping(self, prefix):
        """End the scope of a prefix-URI mapping.

        See startPrefixMapping for details. This event will always
        occur after the corresponding endElement event, but the order
        of endPrefixMapping events is not otherwise guaranteed."""

    def startElement(self, name, attrs):
        """Signals the start of an element in non-namespace mode.

        The name parameter contains the raw XML 1.0 name of the
        element type as a string and the attrs parameter holds an
        instance of the Attributes class containing the attributes of
        the element."""

    def endElement(self, name):
        """Signals the end of an element in non-namespace mode.

        The name parameter contains the name of the element type, just
        as with the startElement event."""

    def startElementNS(self, name, qname, attrs):
        """Signals the start of an element in namespace mode.

        The name parameter contains the name of the element type as a
        (uri, localname) tuple, the qname parameter the raw XML 1.0
        name used in the source document, and the attrs parameter
        holds an instance of the Attributes class containing the
        attributes of the element.

        The uri part of the name tuple is None for elements which have
        no namespace."""

    def endElementNS(self, name, qname):
        """Signals the end of an element in namespace mode.

        The name parameter contains the name of the element type, just
        as with the startElementNS event."""

    def characters(self, content):
        """Receive notification of character data.

        The Parser will call this method to report each chunk of
        character data. SAX parsers may return all contiguous
        character data in a single chunk, or they may split it into
        several chunks; however, all of the characters in any single
        event must come from the same external entity so that the
        Locator provides useful information."""

    def ignorableWhitespace(self, whitespace):
        """Receive notification of ignorable whitespace in element content.

        Validating Parsers must use this method to report each chunk
        of ignorable whitespace (see the W3C XML 1.0 recommendation,
        section 2.10): non-validating parsers may also use this method
        if they are capable of parsing and using content models.

        SAX parsers may return all contiguous whitespace in a single
        chunk, or they may split it into several chunks; however, all
        of the characters in any single event must come from the same
        external entity, so that the Locator provides useful
        information."""

    def processingInstruction(self, target, data):
        """Receive notification of a processing instruction.

        The Parser will invoke this method once for each processing
        instruction found: note that processing instructions may occur
        before or after the main document element.

        A SAX parser should never report an XML declaration (XML 1.0,
        section 2.8) or a text declaration (XML 1.0, section 4.3.1)
        using this method."""

    def skippedEntity(self, name):
        """Receive notification of a skipped entity.

        The Parser will invoke this method once for each entity
        skipped. Non-validating processors may skip entities if they
        have not seen the declarations (because, for example, the
        entity was declared in an external DTD subset). All processors
        may skip external entities, depending on the values of the
        http://xml.org/sax/features/external-general-entities and the
        http://xml.org/sax/features/external-parameter-entities
        features."""


# ===== DTDHandler =====

class DTDHandler:
    """Handle DTD events.

    This interface specifies only those DTD events required for basic
    parsing (unparsed entities and attributes)."""

    def notationDecl(self, name, publicId, systemId):
        "Handle a notation declaration event."

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        "Handle an unparsed entity declaration event."


# ===== ENTITYRESOLVER =====

class EntityResolver:
    """Basic interface for resolving entities. If you create an object
    implementing this interface and then register the object with your
    Parser, the parser will call the method in your object to
    resolve all external entities. Note that DefaultHandler implements
    this interface with the default behaviour."""

    def resolveEntity(self, publicId, systemId):
        """Resolve the system identifier of an entity and return either
        the system identifier to read from as a string, or an InputSource
        to read from."""
        return systemId


#============================================================================
#
# CORE FEATURES
#
#============================================================================

feature_namespaces = "http://xml.org/sax/features/namespaces"
# true: Perform Namespace processing (default).
# false: Optionally do not perform Namespace processing
#        (implies namespace-prefixes).
# access: (parsing) read-only; (not parsing) read/write

feature_namespace_prefixes = "http://xml.org/sax/features/namespace-prefixes"
# true: Report the original prefixed names and attributes used for Namespace
#       declarations.
# false: Do not report attributes used for Namespace declarations, and
#        optionally do not report original prefixed names (default).
# access: (parsing) read-only; (not parsing) read/write

feature_string_interning = "http://xml.org/sax/features/string-interning"
# true: All element names, prefixes, attribute names, Namespace URIs, and
#       local names are interned using the built-in intern function.
# false: Names are not necessarily interned, although they may be (default).
# access: (parsing) read-only; (not parsing) read/write

feature_validation = "http://xml.org/sax/features/validation"
# true: Report all validation errors (implies external-general-entities and
#       external-parameter-entities).
# false: Do not report validation errors.
# access: (parsing) read-only; (not parsing) read/write

feature_external_ges = "http://xml.org/sax/features/external-general-entities"
# true: Include all external general (text) entities.
# false: Do not include external general entities.
# access: (parsing) read-only; (not parsing) read/write

feature_external_pes = "http://xml.org/sax/features/external-parameter-entities"
# true: Include all external parameter entities, including the external
#       DTD subset.
# false: Do not include any external parameter entities, even the external
#        DTD subset.
# access: (parsing) read-only; (not parsing) read/write

all_features = [feature_namespaces,
                feature_namespace_prefixes,
                feature_string_interning,
                feature_validation,
                feature_external_ges,
                feature_external_pes]


#============================================================================
#
# CORE PROPERTIES
#
#============================================================================

property_lexical_handler = "http://xml.org/sax/properties/lexical-handler"
# data type: xml.sax.sax2lib.LexicalHandler
# description: An optional extension handler for lexical events like comments.
# access: read/write

property_declaration_handler = "http://xml.org/sax/properties/declaration-handler"
# data type: xml.sax.sax2lib.DeclHandler
# description: An optional extension handler for DTD-related events other
#              than notations and unparsed entities.
# access: read/write

property_dom_node = "http://xml.org/sax/properties/dom-node"
# data type: org.w3c.dom.Node
# description: When parsing, the current DOM node being visited if this is
#              a DOM iterator; when not parsing, the root DOM node for
#              iteration.
# access: (parsing) read-only; (not parsing) read/write

property_xml_string = "http://xml.org/sax/properties/xml-string"
# data type: String
# description: The literal string of characters that was the source for
#              the current event.
# access: read-only

property_encoding = "http://www.python.org/sax/properties/encoding"
# data type: String
# description: The name of the encoding to assume for input data.
# access: write: set the encoding, e.g. established by a higher-level
#                protocol. May change during parsing (e.g. after
#                processing a META tag)
#         read:  return the current encoding (possibly established through
#                auto-detection).
# initial value: UTF-8
#

property_interning_dict = "http://www.python.org/sax/properties/interning-dict"
# data type: Dictionary
# description: The dictionary used to intern common strings in the document
# access: write: Request that the parser uses a specific dictionary, to
#                allow interning across different documents
#         read:  return the current interning dictionary, or None
#

all_properties = [property_lexical_handler,
                  property_dom_node,
                  property_declaration_handler,
                  property_xml_string,
                  property_encoding,
                  property_interning_dict]
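

#============================================================================
#
# USAGE SKETCH
#
#============================================================================

# A minimal, hedged illustration of how the ContentHandler interface is
# meant to be subclassed. It is not part of the SAX interfaces themselves.
# The names _collect_element_names and _NameCollector are hypothetical and
# exist only for this example. The sketch assumes the standard-library
# xml.sax package is importable and uses its parseString() function to
# drive the handler.

def _collect_element_names(xml_text):
    """Return the element names found in xml_text, in document order."""
    # Imported lazily so that loading this code does not itself import
    # the xml.sax package.
    import xml.sax
    from xml.sax.handler import ContentHandler

    class _NameCollector(ContentHandler):
        def __init__(self):
            super().__init__()
            self.names = []

        def startElement(self, name, attrs):
            # Invoked in non-namespace mode, which is what the standard
            # expat-based reader uses unless feature_namespaces is
            # enabled; in namespace mode startElementNS is called instead.
            self.names.append(name)

    collector = _NameCollector()
    # parseString() is given bytes here; the default ErrorHandler raises
    # on errors and fatal errors, so malformed input propagates as an
    # exception to the caller.
    xml.sax.parseString(xml_text.encode("utf-8"), collector)
    return collector.names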