Lexical Events


Lexical information consists of CDATA section boundaries, comments, and references to parsed entities: meta-information about the XML itself, rather than its information content. Lexical event handling is an optional parser feature, enabled through the LexicalHandler interface (org.xml.sax.ext.LexicalHandler).

To view lexical information, implement a LexicalHandler. It defines the following methods:

comment(char[] ch, int start, int length)
Passes comments to the application.
startCDATA()
endCDATA()
Tell when a CDATA section is starting and ending, which tells the application what kind of characters to expect the next time characters() is called.
startEntity(String name)
endEntity(String name)
Give the name of a parsed entity as its expansion starts and ends.
startDTD(String name, String publicId, String systemId)
endDTD()
Tell when a DTD is being processed, and identify it.
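The methods listed above can be sketched as a small recording handler. This is a minimal illustration, not the tutorial's Echo program: the class name LexicalRecorder, the record() helper, and the event labels are assumptions made for this example, and it relies on the JDK's default SAX parser recognizing the lexical-handler property.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records every lexical event as a line of text.
class LexicalRecorder extends DefaultHandler implements LexicalHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void comment(char[] ch, int start, int length) {
        events.add("COMMENT: " + new String(ch, start, length));
    }

    @Override
    public void startCDATA() { events.add("START CDATA SECTION"); }

    @Override
    public void endCDATA() { events.add("END CDATA SECTION"); }

    @Override
    public void startEntity(String name) { events.add("START ENTITY: " + name); }

    @Override
    public void endEntity(String name) { events.add("END ENTITY: " + name); }

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        events.add("START DTD: " + name);
    }

    @Override
    public void endDTD() { events.add("END DTD"); }

    // Parses an XML string and returns the lexical events that were reported.
    static List<String> record(String xml) throws Exception {
        LexicalRecorder handler = new LexicalRecorder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!ENTITY products \"WonderWidgets\">]>"
                   + "<doc><!-- hi -->&products;<![CDATA[x < y]]></doc>";
        record(xml).forEach(System.out::println);
    }
}
```

Recording into a list rather than printing directly keeps the handler easy to inspect; a real application would more likely act on each event as it arrives.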

Use an instance of the Echo class itself as the SAX event handler, and have it implement LexicalHandler:

import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.ext.LexicalHandler;
...
public class Echo extends DefaultHandler
  implements LexicalHandler
{
  public static void main(String argv[])
    {
      ...
      
      Echo handler = new Echo();
      ...

Get the XMLReader that the parser delegates to, and configure it to send lexical events to the lexical handler:

public static void main(String argv[])
{
  ...
  try {
    ...
    // Parse the input
    SAXParser saxParser = factory.newSAXParser();
    XMLReader xmlReader = saxParser.getXMLReader();
    xmlReader.setProperty(
      "http://xml.org/sax/properties/lexical-handler",
      handler
      ); 
    saxParser.parse( new File(argv[0]), handler);
  } catch (SAXParseException spe) {
    ...

Here, you configured the XMLReader using the setProperty() method defined in the XMLReader interface. The property name, defined as part of the SAX standard, is the URL http://xml.org/sax/properties/lexical-handler.
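Because lexical event handling is optional, setProperty() may reject the property: it is declared to throw SAXNotRecognizedException and SAXNotSupportedException. A minimal sketch of guarding the call follows; the class name LexicalSupport and the install() helper are assumptions made for this example, and DefaultHandler2 is used only as a convenient do-nothing LexicalHandler.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.ext.LexicalHandler;

// Hypothetical helper: registers a lexical handler if the parser allows it.
class LexicalSupport {
    static final String LEXICAL_HANDLER =
        "http://xml.org/sax/properties/lexical-handler";

    // Returns true if the reader accepted the handler, false if the
    // property is not recognized or not supported by this parser.
    static boolean install(XMLReader reader, LexicalHandler handler) {
        try {
            reader.setProperty(LEXICAL_HANDLER, handler);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        boolean ok = install(reader, new DefaultHandler2());
        System.out.println("lexical events available: " + ok);
    }
}
```

Returning a boolean lets the caller fall back to plain content events when the parser does not report lexical information.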

 

Echoing Comments and Events

To echo comments in the XML file:

public void comment(char[] ch, int start, int length)
  throws SAXException
{
  String text = new String(ch, start, length);
  nl(); 
  emit("COMMENT: "+text);
}

To see events that occur while the DTD is being processed, use org.xml.sax.ext.DeclHandler.
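A DeclHandler is registered the same way as a LexicalHandler, through the SAX property http://xml.org/sax/properties/declaration-handler. The sketch below is an illustration under assumptions: the class name DeclRecorder, the record() helper, and the output labels are invented for this example, and the exact content-model string passed to elementDecl() depends on the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records DTD declarations as lines of text.
class DeclRecorder extends DefaultHandler implements DeclHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void elementDecl(String name, String model) {
        events.add("ELEMENT " + name + " " + model);
    }

    @Override
    public void attributeDecl(String eName, String aName, String type,
                              String mode, String value) {
        events.add("ATTRIBUTE " + eName + "/" + aName + " " + type);
    }

    @Override
    public void internalEntityDecl(String name, String value) {
        events.add("ENTITY " + name + " = " + value);
    }

    @Override
    public void externalEntityDecl(String name, String publicId, String systemId) {
        events.add("EXTERNAL ENTITY " + name + " -> " + systemId);
    }

    // Parses an XML string and returns the declarations that were reported.
    static List<String> record(String xml) throws Exception {
        DeclRecorder handler = new DeclRecorder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!ELEMENT doc (#PCDATA)>"
                   + "<!ENTITY products \"WonderWidgets\">]><doc>&products;</doc>";
        record(xml).forEach(System.out::println);
    }
}
```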

Here is some of the additional output you see when the internally defined products entity is processed with the latest version of the program:

START ENTITY: products
CHARS:   WonderWidgets
END ENTITY: products

And here is the additional output you see as a result of processing the external copyright entity:

  START ENTITY: copyright
  CHARS: 
This is the standard copyright message that our lawyers
make us put everywhere so we don't have to shell out a
million bucks every time someone spills hot coffee in their
lap...

  END ENTITY: copyright

Finally, you get output that shows when the CDATA section was processed:

  START CDATA SECTION
  CHARS:   Diagram:

frobmorten <--------------fuznaten
     |          <3>          ^
     | <1>                  |   <1> = fozzle
    V                  |   <2> = framboze 
  staten----------------------+   <3> = frenzle
           <2>


  END CDATA SECTION

To accurately echo the input, you would modify the characters() method to echo the text it sees in the appropriate fashion, depending on whether or not the program was in CDATA mode.
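One way to do that is to keep a flag that startCDATA() and endCDATA() toggle, and have characters() consult it. The following is a minimal sketch rather than the tutorial's Echo code: the class name CdataAwareEcho, the echoText() helper, and the choice to escape only & and < outside CDATA sections are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: echoes character data, preserving CDATA sections.
class CdataAwareEcho extends DefaultHandler implements LexicalHandler {
    private final StringBuilder out = new StringBuilder();
    private boolean inCdata = false;

    @Override
    public void startCDATA() {
        inCdata = true;
        out.append("<![CDATA[");
    }

    @Override
    public void endCDATA() {
        out.append("]]>");
        inCdata = false;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length);
        if (inCdata) {
            out.append(text);  // CDATA content is echoed verbatim
        } else {
            // Ordinary text must have markup characters escaped again
            out.append(text.replace("&", "&amp;").replace("<", "&lt;"));
        }
    }

    // The remaining lexical events are not needed for this sketch.
    @Override public void comment(char[] ch, int start, int length) { }
    @Override public void startEntity(String name) { }
    @Override public void endEntity(String name) { }
    @Override public void startDTD(String name, String publicId, String systemId) { }
    @Override public void endDTD() { }

    // Parses an XML string and returns its character data as re-echoed text.
    static String echoText(String xml) throws Exception {
        CdataAwareEcho handler = new CdataAwareEcho();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoText("<doc>a &lt; b<![CDATA[x < y]]></doc>"));
    }
}
```

The CDATA delimiters are written in startCDATA() and endCDATA() rather than in characters(), because a parser may deliver the content of one section in several characters() calls.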


 
