Lexical Events
Lexical information consists of CDATA sections, comments, and references to parsed entities: meta-information about the XML itself, rather than its information content. Lexical event handling is an optional parser feature, enabled through the LexicalHandler API (org.xml.sax.ext.LexicalHandler).
A LexicalHandler receives lexical information through the following methods:

comment(char[] ch, int start, int length)
    Passes comments to the application.

startCDATA()
endCDATA()
    Tell when a CDATA section is starting and ending, which tells the application what kind of characters to expect the next time characters() is called.

startEntity(String name)
endEntity(String name)
    Give the name of a parsed entity.

startDTD(String name, String publicId, String systemId)
endDTD()
    Tell when a DTD is being processed, and identify it.

To receive these events, use an instance of the Echo class itself as the SAX event handler:
    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    import org.xml.sax.ext.LexicalHandler;
    ...
    // DefaultHandler is the SAX2 base class; it replaces the deprecated HandlerBase
    public class Echo extends DefaultHandler
            implements LexicalHandler
    {
        public static void main(String argv[])
        {
            ...
            Echo handler = new Echo();
            ...

Get the XMLReader that the parser delegates to, and configure it to send lexical events to the lexical handler:
    public static void main(String argv[])
    {
        ...
        try {
            ...
            // Parse the input
            SAXParser saxParser = factory.newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            xmlReader.setProperty(
                "http://xml.org/sax/properties/lexical-handler",
                handler
            );
            saxParser.parse(new File(argv[0]), handler);
        } catch (SAXParseException spe) {
            ...

Here, the XMLReader is configured using the setProperty() method defined in the XMLReader class. The property name, defined as part of the SAX standard, is the URL http://xml.org/sax/properties/lexical-handler.
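The registration steps above can be tried end to end with a small self-contained program. The following is a minimal sketch, not the Echo program itself: the class name LexicalDemo, the helper lexicalEvents(), and the inline sample document are assumptions made for illustration. It also assumes the default JAXP parser in use supports the lexical-handler property.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records each lexical event as a string.
class LexicalDemo extends DefaultHandler implements LexicalHandler {
    final List<String> events = new ArrayList<>();

    public void comment(char[] ch, int start, int length) {
        events.add("COMMENT: " + new String(ch, start, length));
    }
    public void startCDATA() { events.add("START CDATA"); }
    public void endCDATA()   { events.add("END CDATA"); }
    public void startEntity(String name) { events.add("START ENTITY: " + name); }
    public void endEntity(String name)   { events.add("END ENTITY: " + name); }
    public void startDTD(String name, String publicId, String systemId) {
        events.add("START DTD: " + name);
    }
    public void endDTD() { events.add("END DTD"); }

    // Parses the given XML text and returns the lexical events seen.
    static List<String> lexicalEvents(String xml) {
        try {
            LexicalDemo handler = new LexicalDemo();
            SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
            // Register the handler on the underlying XMLReader,
            // using the standard SAX property name.
            XMLReader xmlReader = saxParser.getXMLReader();
            xmlReader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
            saxParser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption): one internal entity,
        // one comment, one CDATA section.
        String xml = "<!DOCTYPE doc [<!ENTITY products \"WonderWidgets\">]>"
                   + "<doc><!-- a comment -->&products;<![CDATA[<raw>]]></doc>";
        for (String event : lexicalEvents(xml)) {
            System.out.println(event);
        }
    }
}
```

Collecting events in a list rather than printing them directly keeps the handler easy to inspect. A parser that does not recognize the property throws SAXNotRecognizedException from setProperty(), which this sketch simply rethrows.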
Echoing Comments and Events
To echo comments in the XML file:
    public void comment(char[] ch, int start, int length)
    throws SAXException
    {
        String text = new String(ch, start, length);
        nl();
        emit("COMMENT: " + text);
    }

To see events that occur while the DTD is being processed, use org.xml.sax.ext.DeclHandler.
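A DeclHandler is registered the same way as a LexicalHandler, through a standard SAX property. The sketch below is illustrative only: the class name DeclDemo, the helper declarations(), and the output strings are assumptions, and the exact text of an element's content model depends on the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records DTD declarations as strings.
class DeclDemo extends DefaultHandler implements DeclHandler {
    final List<String> decls = new ArrayList<>();

    public void elementDecl(String name, String model) {
        decls.add("ELEMENT " + name + " " + model);
    }
    public void attributeDecl(String eName, String aName, String type,
                              String mode, String value) {
        decls.add("ATTLIST " + eName + " " + aName + " " + type);
    }
    public void internalEntityDecl(String name, String value) {
        decls.add("ENTITY " + name + " = " + value);
    }
    public void externalEntityDecl(String name, String publicId,
                                   String systemId) {
        decls.add("EXTERNAL ENTITY " + name + " -> " + systemId);
    }

    // Parses the given XML text and returns the declarations reported.
    static List<String> declarations(String xml) {
        try {
            DeclDemo handler = new DeclDemo();
            SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            // Standard SAX property name for declaration events.
            xmlReader.setProperty(
                "http://xml.org/sax/properties/declaration-handler", handler);
            saxParser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.decls;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption) with an internal DTD subset.
        String xml = "<!DOCTYPE doc [<!ELEMENT doc (#PCDATA)>"
                   + "<!ENTITY products \"WonderWidgets\">]>"
                   + "<doc>&products;</doc>";
        for (String decl : declarations(xml)) {
            System.out.println(decl);
        }
    }
}
```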
Here is some of the additional output you see when the internally defined products entity is processed with the latest version of the program:
    START ENTITY: products
    CHARS: WonderWidgets
    END ENTITY: products

And here is the additional output you see as a result of processing the external copyright entity:
    START ENTITY: copyright
    CHARS: This is the standard copyright message that our lawyers make us put everywhere so we don't have to shell out a million bucks every time someone spills hot coffee in their lap...
    END ENTITY: copyright

Finally, you get output that shows when the CDATA section was processed:
    START CDATA SECTION
    CHARS: Diagram:
    frobmorten <--------------fuznaten
        |            <3>          ^
        |  <1>                    |   <1> = fozzle
        V                         |   <2> = framboze
      staten----------------------+   <3> = frenzle
                 <2>
    END CDATA SECTION

To accurately echo the input, you would modify the characters() method to echo the text it sees in the appropriate fashion, depending on whether or not the program was in CDATA mode.
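A CDATA-aware characters() method could look roughly like the following. This is a sketch under stated assumptions, not the Echo program's actual code: the class name CdataEcho, the inCDATA flag, the escape() helper, and the echo() driver are all hypothetical, and attributes are ignored for brevity.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: echoes elements and text, preserving CDATA sections.
class CdataEcho extends DefaultHandler implements LexicalHandler {
    private final StringBuilder out = new StringBuilder();
    private boolean inCDATA = false;   // true between startCDATA and endCDATA

    public void startCDATA() { inCDATA = true;  out.append("<![CDATA["); }
    public void endCDATA()   { out.append("]]>"); inCDATA = false; }

    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length);
        // Inside CDATA, text is written verbatim; elsewhere markup
        // characters must be escaped to keep the output well-formed.
        out.append(inCDATA ? text : escape(text));
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        out.append('<').append(qName).append('>');   // attributes omitted
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    // Remaining LexicalHandler methods are not needed for this sketch.
    public void comment(char[] ch, int start, int length) { }
    public void startEntity(String name) { }
    public void endEntity(String name) { }
    public void startDTD(String name, String publicId, String systemId) { }
    public void endDTD() { }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Parses the given XML text and returns the echoed document.
    static String echo(String xml) {
        try {
            CdataEcho handler = new CdataEcho();
            SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            xmlReader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
            saxParser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echo("<doc>a &lt; b<![CDATA[<1> = fozzle]]></doc>"));
    }
}
```

The flag is needed because characters() receives the same kind of callback for ordinary text and for CDATA content; only the surrounding startCDATA() and endCDATA() events distinguish them.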