The XSL based DOM XML Parser enables TDI to parse XML documents in any format, using an XSL style sheet supplied by the user, into attribute-value pairs stored in the Entry object. Because the style sheet controls the mapping, the parser can read any kind of XML format. In particular, when only a specific part of the XML is needed, you can write an XSL that selects just that part. The parser creates an in-memory parse tree (a DOM Document) to represent the input XML. The XSL transforms this DOM Document and produces an output DOM in the TDI internal format. The parser uses the javax transformation libraries to carry out the transformations.
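The DOM-to-DOM transformation described above can be sketched with the standard JAXP classes. This is a minimal illustration, not the parser's actual implementation: the class name XslDomSketch, the sample people/person input and the sample style sheet are all hypothetical, and the Attribute/Value element names in the output simply follow the birds.xsl example in this section.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch class; not part of TDI.
final class XslDomSketch {

    // Hypothetical input document.
    static final String SAMPLE_XML =
        "<people><person><name>Ann</name></person>"
      + "<person><name>Bob</name></person></people>";

    // Hypothetical style sheet: one Entry per person, one Attribute per entry.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:template match=\"/people\"><DocRoot>"
      + "<xsl:for-each select=\"person\"><Entry>"
      + "<Attribute name=\"name\"><Value><xsl:value-of select=\"name\"/></Value></Attribute>"
      + "</Entry></xsl:for-each>"
      + "</DocRoot></xsl:template></xsl:stylesheet>";

    private static DocumentBuilder newBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // give the XSLT processor a namespace-aware DOM
        return factory.newDocumentBuilder();
    }

    /** Parses XML text into an in-memory DOM Document. */
    static Document parse(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    /** Applies the style sheet to the input DOM and returns the output DOM. */
    static Document transform(Document input, String xsl) {
        try {
            // Compiling the style sheet first reports style sheet errors
            // before any input is processed.
            Templates templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xsl)));
            Document output = newBuilder().newDocument();
            templates.newTransformer().transform(new DOMSource(input), new DOMResult(output));
            return output;
        } catch (TransformerException | ParserConfigurationException e) {
            throw new IllegalArgumentException("Style sheet could not be applied", e);
        }
    }

    public static void main(String[] args) {
        Document out = transform(parse(SAMPLE_XML), SAMPLE_XSL);
        System.out.println(out.getDocumentElement().getNodeName() + " with "
                + out.getElementsByTagName("Entry").getLength() + " Entry element(s)");
    }
}
```

Compiling the style sheet into a Templates object before transforming is a design choice of this sketch: a malformed style sheet then fails up front with a TransformerConfigurationException rather than partway through a read.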
The XSL based DOM XML Parser provides the following parameters:
This Parser extends the Simple XML Parser; therefore, the same notices with regard to Character Encoding apply.
The parser can be used with the Filesystem Connector in Iterator or AddOnly mode. The XSL based DOM XML parser requires the user to specify:
In an XSL transformation, an XSLT processor reads both an XML document and an XSLT style sheet. Based on the instructions it finds in the style sheet, the processor outputs a new XML document or a fragment of one. The parser performs basic validation of the XSL files. It can also optionally perform Document and namespace validation of the file supplied by the Connector. The parser can be used in conjunction with the Filesystem Connector. It supports both reading and writing: XML files are read and written in the format specified by the respective XSL. The following optional validations are provided:
<DocRoot>
  <Entry>
    <attribute_name>
      <value_tag>attribute_value</value_tag>
      <value_tag>attribute_value</value_tag>
      <value_tag>attribute_value</value_tag>
    </attribute_name>
    <attribute_name>
      <value_tag>attribute_value</value_tag>
    </attribute_name>
    - - -
  </Entry>
  <Entry>
    - - -
  </Entry>
  -
</DocRoot>
birds.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Class>
  <Order Name="TINAMIFORMES">
    <Family Name="TINAMIDAE">
      <Species Scientific_Name="Tinamus major">Great Tinamou.</Species>
      <Species Scientific_Name="Nothocercus">Highland Tinamou.</Species>
      <Species Scientific_Name="Crypturellus soui">Little Tinamou.</Species>
      <Species Scientific_Name="Crypturellus cinnamomeus">Thicket Tinamou.</Species>
      <Species Scientific_Name="Crypturellus boucardi">Slaty-breasted Tinamou.</Species>
      <Species Scientific_Name="Crypturellus kerriae">Choco Tinamou.</Species>
    </Family>
  </Order>
  <Order Name="GAVIIFORMES">
    <Family Name="GAVIIDAE">
      <Species Scientific_Name="Gavia stellata">Red-throated Loon.</Species>
      <Species Scientific_Name="Gavia arctica">Arctic Loon.</Species>
      <Species Scientific_Name="Gavia pacifica">Pacific Loon.</Species>
      <Species Scientific_Name="Gavia immer">Common Loon.</Species>
      <Species Scientific_Name="Gavia adamsii">Yellow-billed Loon.</Species>
    </Family>
  </Order>
</Class>
birds.xsl:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="Class">
    <DocRoot>
      <xsl:for-each select="Order">
        <xsl:variable name="order"><xsl:value-of select="@Name"/></xsl:variable>
        <xsl:for-each select="Family">
          <Entry>
            <Attribute name="Order">
              <Value><xsl:value-of select="$order"/></Value>
            </Attribute>
            <Attribute name="Family">
              <Value><xsl:value-of select="@Name"/></Value>
            </Attribute>
            <Attribute name="Species">
              <xsl:for-each select="Species">
                <Value><xsl:value-of select="."/></Value>
              </xsl:for-each>
            </Attribute>
          </Entry>
        </xsl:for-each>
      </xsl:for-each>
    </DocRoot>
  </xsl:template>
</xsl:stylesheet>
birds.xsl transforms birds.xml into the TDI internal format, from which an Entry object with attribute-value pairs can be formed.
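As a rough illustration of that last step, the sketch below applies birds.xsl (written with a lowercase xsl prefix) to a shortened copy of birds.xml and collects each Entry element into a map from attribute name to values. The class BirdsEntriesSketch and the use of a plain Map in place of the TDI Entry object are assumptions made for this example; the real parser builds TDI Entry objects, and its internals are not shown in this section.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch class; a Map stands in for the TDI Entry object.
final class BirdsEntriesSketch {

    // Shortened copy of birds.xml: one order, one family, two species.
    static final String BIRDS_XML =
        "<Class><Order Name=\"GAVIIFORMES\"><Family Name=\"GAVIIDAE\">"
      + "<Species Scientific_Name=\"Gavia stellata\">Red-throated Loon.</Species>"
      + "<Species Scientific_Name=\"Gavia arctica\">Arctic Loon.</Species>"
      + "</Family></Order></Class>";

    // birds.xsl with insignificant whitespace removed.
    static final String BIRDS_XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"xml\" indent=\"yes\"/>"
      + "<xsl:template match=\"Class\"><DocRoot>"
      + "<xsl:for-each select=\"Order\">"
      + "<xsl:variable name=\"order\"><xsl:value-of select=\"@Name\"/></xsl:variable>"
      + "<xsl:for-each select=\"Family\"><Entry>"
      + "<Attribute name=\"Order\"><Value><xsl:value-of select=\"$order\"/></Value></Attribute>"
      + "<Attribute name=\"Family\"><Value><xsl:value-of select=\"@Name\"/></Value></Attribute>"
      + "<Attribute name=\"Species\"><xsl:for-each select=\"Species\">"
      + "<Value><xsl:value-of select=\".\"/></Value></xsl:for-each></Attribute>"
      + "</Entry></xsl:for-each></xsl:for-each>"
      + "</DocRoot></xsl:template></xsl:stylesheet>";

    /** Runs the style sheet over the XML text and returns the output DOM. */
    static Document transform(String xml, String xsl) {
        try {
            Document out = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)))
                    .transform(new StreamSource(new StringReader(xml)), new DOMResult(out));
            return out;
        } catch (TransformerException | ParserConfigurationException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    /** Turns each Entry element into a map of attribute name to list of values. */
    static List<Map<String, List<String>>> toEntries(Document internal) {
        List<Map<String, List<String>>> entries = new ArrayList<>();
        NodeList entryNodes = internal.getElementsByTagName("Entry");
        for (int i = 0; i < entryNodes.getLength(); i++) {
            Map<String, List<String>> entry = new LinkedHashMap<>();
            NodeList attrs = ((Element) entryNodes.item(i)).getElementsByTagName("Attribute");
            for (int j = 0; j < attrs.getLength(); j++) {
                Element attr = (Element) attrs.item(j);
                NodeList valueNodes = attr.getElementsByTagName("Value");
                List<String> values = new ArrayList<>();
                for (int k = 0; k < valueNodes.getLength(); k++) {
                    values.add(valueNodes.item(k).getTextContent().trim());
                }
                entry.put(attr.getAttribute("name"), values);
            }
            entries.add(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        // One map per Family element in the input.
        System.out.println(toEntries(transform(BIRDS_XML, BIRDS_XSL)));
    }
}
```

With the shortened input, the result is a single entry holding one Order value, one Family value and two Species values. The full birds.xml would yield two entries, one per Family element.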
Simple XML Parser
The XML Bible (the chapter on XSL)
http://www.ibiblio.org/xml/books/bible2/chapters/ch17.html
W3C Document Object Model
http://www.w3.org/DOM/
Effective XML processing with DOM and XPath in Java
http://www.ibm.com/developerworks/xml/library/x-domjava/