XML Overview

The following sections provide an overview of XML technology and the WebLogic Server XML subsystem:

 


What Is XML?

Extensible Markup Language (XML) is a markup language used to describe the content and structure of data in a document. It is a simplified version of Standard Generalized Markup Language (SGML). XML is an industry standard for delivering content on the Internet. Because it provides a facility to define new tags, XML is also extensible.

Like HTML, XML uses tags to describe content. However, rather than focusing on the presentation of content, the tags in XML describe the meaning and hierarchical structure of data. This functionality allows for the sophisticated data types that are required for efficient data interchange between different programs and systems. Further, because XML enables separation of content and presentation, the content, or data, is portable across heterogeneous systems.

The XML syntax uses matching start and end tags (such as <name> and </name>) to mark up information. Information delimited by tags is called an element. Every XML document has a single root element, which is the top-level element that contains all the other elements. Elements that are contained by other elements are often referred to as sub-elements. An element can optionally have attributes, structured as name-value pairs, that are part of the element and are used to further define it.

The following sample XML file describes the contents of an address book:

<?xml version="1.0"?>
<address_book>
  <person gender="f">
    <name>Jane Doe</name>
    <address>
      <street>123 Main St.</street>
      <city>San Francisco</city>
      <state>CA</state>
      <zip>94117</zip>
    </address>
    <phone area_code="415">555-1212</phone>
  </person>
  <person gender="m">
    <name>John Smith</name>
    <phone area_code="510">555-1234</phone>
    <email>johnsmith@somewhere.com</email>
  </person>
</address_book>

The root element of the XML file is address_book. The address book currently contains two entries in the form of person elements: Jane Doe and John Smith. Jane Doe's entry includes her address and phone number; John Smith's includes his phone and email address. Note that the structure of the XML document defines the phone element as storing the area code using the area_code attribute rather than a sub-element in the body of the element. Also note that not all sub-elements are required for the person element.

 


How Do You Describe an XML Document?

There are two ways to describe an XML document: DTDs and XML Schemas.

Document Type Definitions (DTDs) define the basic requirements for the structure of a particular XML document. A DTD describes the elements and attributes that are valid in an XML document, and the contexts in which they are valid. In other words, a DTD specifies which tags are allowed within certain other tags, and which tags and attributes are optional.

The following example shows a DTD that describes the preceding address book sample XML document:

<!DOCTYPE address_book [
<!ELEMENT address_book (person*)>
<!ELEMENT person (name, address?, phone?, email?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (street, city, state, zip)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)> 

<!ATTLIST person gender CDATA #REQUIRED>
<!ATTLIST phone area_code CDATA #REQUIRED>
]>

XML Schemas are a more recent development than DTDs and are intended to supersede them. They describe XML documents with more flexibility and detail than DTDs do and, unlike DTDs, are XML documents themselves. The XML Schema specification, a World Wide Web Consortium (W3C) Recommendation since May 2001, is intended to address many limitations of DTDs. For detailed information on XML schemas, see http://www.w3.org/TR/xmlschema-0/.

The following example shows a schema that describes the preceding address book sample XML document:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element     name="address_book" type="bookType"/>
<xsd:complexType name="bookType">
  <xsd:sequence>
    <xsd:element name="person" type="personType" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="personType">
  <xsd:sequence>
    <xsd:element name="name"     type="xsd:string"/>
    <xsd:element name="address"  type="addressType" minOccurs="0"/>
    <xsd:element name="phone"    type="phoneType"   minOccurs="0"/>
    <xsd:element name="email"    type="xsd:string"  minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="gender"   type="xsd:string" use="required"/>
</xsd:complexType>

<xsd:complexType name="addressType">
  <xsd:sequence>
    <xsd:element name="street"  type="xsd:string"/>
    <xsd:element name="city"    type="xsd:string"/>
    <xsd:element name="state"   type="xsd:string"/>
    <xsd:element name="zip"     type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

<!-- phone has text content plus an attribute, so it is a complex type with simple content -->
<xsd:complexType name="phoneType">
  <xsd:simpleContent>
    <xsd:extension base="xsd:string">
      <xsd:attribute name="area_code" type="xsd:string" use="required"/>
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

</xsd:schema>

An XML document can include a DTD as part of the document itself, reference an external DTD using the DOCTYPE declaration, reference an external Schema using the xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute, or not include or reference a DTD or Schema at all. The following excerpt from an XML document shows how to reference an external DTD called address.dtd:

<?xml version="1.0"?>
<!DOCTYPE address_book SYSTEM "address.dtd">
<address_book>
...

XML documents only need to be accompanied by a DTD or Schema if they need to be validated by a parser or if they contain complex types. An XML document is considered valid if 1) it has an associated DTD or Schema, and 2) it complies with the constraints expressed in the associated DTD or Schema. If, however, an XML document only needs to be well-formed, then the document does not have to be accompanied by a DTD or Schema. A document is considered well-formed if it follows all the rules in the W3C Recommendation for XML 1.0. For the full XML 1.0 specification, see http://www.w3.org/XML/.
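The difference between a well-formed document and a valid one can be sketched with the standard JAXP classes. This is a minimal illustration, not WebLogic-specific code: the class name ValidityCheck and the cut-down DTD are assumptions made for the example, and the DTD is embedded as an internal subset so that no external file is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidityCheck {

    // A cut-down address book DTD (hypothetical), embedded as an internal subset.
    static final String DTD =
        "<!DOCTYPE address_book [\n"
      + "<!ELEMENT address_book (person*)>\n"
      + "<!ELEMENT person (name, email?)>\n"
      + "<!ELEMENT name (#PCDATA)>\n"
      + "<!ELEMENT email (#PCDATA)>\n"
      + "<!ATTLIST person gender CDATA #REQUIRED>\n"
      + "]>\n";

    // Well-formed and valid.
    static final String VALID = DTD
      + "<address_book><person gender=\"f\"><name>Jane Doe</name></person></address_book>";

    // Well-formed, but person lacks the required name element and gender attribute.
    static final String INVALID = DTD
      + "<address_book><person><email>jane@somewhere.com</email></person></address_book>";

    /** Parses the document with validation on and returns the validity errors reported. */
    static List<String> validityErrors(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<String>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity violations arrive here; parsing continues.
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            // Well-formedness violations arrive here; parsing cannot continue.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Errors in valid document:   " + validityErrors(VALID).size());
        System.out.println("Errors in invalid document: " + validityErrors(INVALID).size());
    }
}
```

A document that is not even well-formed (for example, one with mismatched tags) does not produce a list of errors in this sketch; the parser reports a fatal error and the parse() call throws an exception.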

 


Why Use XML?

An industry typically uses data exchange methods that are meaningful and specific to that industry. With the advent of e-commerce, businesses conduct an increasing number of relationships with a variety of industries and, therefore, must develop expert knowledge of the various protocols used by those industries for electronic communication.

The extensibility of XML makes it a very effective tool for standardizing the format of data interchange among various industries. For example, when message brokers and workflow engines must coordinate transactions among multiple industries or departments within an enterprise, they can use XML to combine data from disparate sources into a format that is understandable by all parties.

 


What Are XSL and XSLT?

The Extensible Stylesheet Language (XSL) is a W3C standard for describing presentation rules that apply to XML documents. XSL includes both a transformation language (XSLT) and a formatting language. These two languages function independently of each other. XSLT is an XML-based language and W3C specification that describes how to transform an XML document into another XML document, or into HTML, PDF, or some other document format.

An XSLT transformer accepts as input an XML document and an XSLT document. The template rules contained in an XSLT document include patterns that specify the XML tree to which the rule applies. The XSLT transformer scans the XML document for patterns that match the rule, and then it applies the template to the appropriate section of the original XML document.
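As a rough illustration of template rules and patterns, the following sketch applies a two-rule style sheet to a shortened address book. The style sheet, the class name TemplateRuleDemo, and the choice of HTML list output are assumptions made for the example; the transformer is obtained through the standard JAXP classes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplateRuleDemo {

    static final String XML =
        "<address_book>"
      + "<person gender=\"f\"><name>Jane Doe</name></person>"
      + "<person gender=\"m\"><name>John Smith</name></person>"
      + "</address_book>";

    // Two template rules: one whose pattern matches the root element,
    // and one whose pattern matches each person element.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/address_book\">"
      + "<ul><xsl:apply-templates select=\"person\"/></ul>"
      + "</xsl:template>"
      + "<xsl:template match=\"person\">"
      + "<li><xsl:value-of select=\"name\"/></li>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the style sheet to the XML text and returns the result as a string. */
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XML, XSLT));
    }
}
```

Each person element in the input matches the second rule, so the output contains one list item per person.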

 


What Are DOM and SAX?

DOM and SAX are two standard Java application programming interfaces (APIs) for parsing XML data. Both are supported by the WebLogic Server built-in parser. The two APIs differ in their approach to parsing, with each API having its strengths and weaknesses.

 

SAX

SAX stands for the Simple API for XML. It is a platform-independent, language-neutral standard interface for event-based XML parsing. SAX defines events that can occur as a parser is reading through an XML document, such as the start or the end of an element. Programmers provide handlers to deal with different events as the document is parsed.

Programmers that use the SAX API to parse XML documents have full control over what happens when these events occur and can, as a result, customize the parsing process extensively. For example, a programmer might decide to stop parsing an XML document as soon as the parser encounters an error that indicates that the document is invalid, rather than waiting until the entire document is parsed, thus improving performance.
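The following sketch shows one way a handler can take control of parsing. It collects the text of the first name element and then stops the parse by throwing an exception, so the rest of the document is never read. The class names SaxFirstName, NameHandler, and StopParsing are hypothetical; the parser is obtained through standard JAXP.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxFirstName {

    static final String XML =
        "<address_book>"
      + "<person gender=\"f\"><name>Jane Doe</name></person>"
      + "<person gender=\"m\"><name>John Smith</name></person>"
      + "</address_book>";

    /** Thrown by the handler to end parsing once the answer is known. */
    static class StopParsing extends SAXException {
        StopParsing() { super("first name found"); }
    }

    /** Event handler that captures the text of the first name element. */
    static class NameHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inName;
        String firstName;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("name".equals(qName)) {
                inName = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so accumulate it.
            if (inName) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("name".equals(qName)) {
                firstName = text.toString();
                throw new StopParsing(); // no need to read the rest of the document
            }
        }
    }

    static String firstName(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        NameHandler handler = new NameHandler();
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Rethrow real parse errors; swallow only the deliberate early stop.
            if (!(e instanceof StopParsing)) {
                throw e;
            }
        }
        return handler.firstName;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstName(XML));
    }
}
```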

The WebLogic Server built-in parser (Apache Xerces) supports SAX Version 2.0. Programmers who have created programs that use Version 1.0 of SAX to parse XML documents should read about the changes between the two versions and update their programs accordingly. For detailed information about the differences between the two versions, refer to http://www.saxproject.org/.

 

DOM

DOM stands for the Document Object Model. It is a platform- and language-neutral interface that allows programs and scripts to access and update the content, structure, and style of XML documents dynamically. DOM reads an XML document into memory and represents it as a tree; each node of the tree represents a particular piece of data from the original XML document. Because the tree structure is a standard programming mechanism for representing data, traversing and manipulating the tree using Java is relatively easy, fast, and efficient. The main drawback, however, is that the entire XML document has to be read into memory for DOM to create the tree, which might decrease the performance of an application as the XML documents get larger.

The WebLogic Server built-in parser (Apache Xerces) supports DOM Level 2.0 Core. Programmers who have created programs that use Level 1.0 of DOM to parse XML documents should read about the changes between the two versions and update their programs accordingly. For detailed information about the differences, refer to http://www.w3.org/DOM/DOMTR.
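A minimal sketch of tree traversal with DOM follows. It reads a shortened address book into memory and walks the phone elements, combining the area_code attribute with the element text. The class name DomPhoneList is hypothetical, and only DOM Level 2 calls are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomPhoneList {

    static final String XML =
        "<address_book>"
      + "<person gender=\"f\"><name>Jane Doe</name>"
      + "<phone area_code=\"415\">555-1212</phone></person>"
      + "<person gender=\"m\"><name>John Smith</name>"
      + "<phone area_code=\"510\">555-1234</phone></person>"
      + "</address_book>";

    /** Builds the DOM tree and returns every phone number with its area code. */
    static List<String> phoneNumbers(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The whole document is read into memory as a tree at this point.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        List<String> result = new ArrayList<String>();
        NodeList phones = doc.getElementsByTagName("phone");
        for (int i = 0; i < phones.getLength(); i++) {
            Element phone = (Element) phones.item(i);
            // The number is the text node child; the area code is an attribute.
            String number = phone.getFirstChild().getNodeValue();
            result.add("(" + phone.getAttribute("area_code") + ") " + number);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String phone : phoneNumbers(XML)) {
            System.out.println(phone);
        }
    }
}
```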

 


What Is XML Streaming?

In addition to SAX and DOM, you can also parse an XML document using the XML streaming API.

The WebLogic XML Streaming API provides an easy and intuitive way to parse and generate XML documents. It is based upon the SAX API, but enables a procedural, stream-based handling of XML documents rather than requiring you to write SAX event handlers, which can get complicated when you work with complex XML documents. In other words, the streaming API gives you more control over parsing than the SAX API.
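The procedural style described above can be illustrated as follows. Note the substitution: this sketch does not use the WebLogic XML Streaming API, whose classes are WebLogic-specific and are not reproduced here. It uses the standard StAX API (javax.xml.stream) instead, which follows the same pull-based idea, where the application asks for the next event in a loop rather than registering callbacks. The class name PullParsingSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PullParsingSketch {

    static final String XML =
        "<address_book>"
      + "<person gender=\"f\"><name>Jane Doe</name></person>"
      + "<person gender=\"m\"><name>John Smith</name></person>"
      + "</address_book>";

    /** Counts the start tags with the given name by pulling events one at a time. */
    static int countElements(String xml, String name) throws Exception {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            // The application drives the loop; no event handler class is needed.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements(XML, "person"));
    }
}
```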

The XML Streaming API uses the WebLogic FastParser when parsing documents.

For detailed information on using the WebLogic XML Streaming API, see Using the WebLogic XML Streaming API.

Note: Unlike DOM and SAX, XML Streaming is not yet part of the Java API for XML Processing (JAXP).

 


What Is JAXP?

An earlier section discusses two APIs, SAX and DOM, that programmers can use to parse XML data. The Java API for XML Processing (JAXP) provides a means to get to these parsers. JAXP also defines a pluggability layer that allows programmers to use any compliant parser or transformer.

WebLogic Server implements JAXP to facilitate XML application development and the work required to move XML applications built on WebLogic Server to other Web application servers. JAXP was developed by Sun Microsystems to make XML applications portable; it provides basic support for parsing and transforming XML documents through a standardized set of Java platform APIs. JAXP 1.1, included in the WebLogic Server distribution, is configured to use the built-in parser. Therefore, by default, XML applications built using WebLogic Server use JAXP.

The WebLogic Server distribution contains the interfaces and classes needed for JAXP 1.1. JAXP 1.1 contains explicit support for SAX Version 2 and DOM Level 2. The Javadoc for JAXP is included with the WebLogic Server online reference documentation.

 

JAXP Packages

JAXP contains the following two packages:

  • javax.xml.parsers
  • javax.xml.transform

The javax.xml.parsers package contains the classes to parse XML data in SAX Version 2.0 and DOM Level 2.0 mode. To parse an XML document in SAX mode, a programmer first instantiates a new SAXParserFactory object with the newInstance() method. This method looks up the specific implementation of the parser to load based on a well-defined list of locations. The programmer then obtains a SAXParser instance from the SAXParserFactory and executes its parse() method, passing it the XML document to be parsed. Parsing an XML document in DOM mode is similar, except that the programmer uses the DocumentBuilder and DocumentBuilderFactory classes instead.
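The three steps can be sketched as follows. The application code names only JAXP classes; which parser implementation newInstance() actually loads depends on the environment, so the sketch returns the implementation class name rather than assuming one. The class name WhichParser is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class WhichParser {

    /** Parses the document in SAX mode and reports which parser class did the work. */
    static String parseAndReport(String xml) throws Exception {
        // Step 1: look up a factory implementation.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Step 2: obtain a parser from the factory.
        SAXParser parser = factory.newSAXParser();
        // Step 3: parse; DefaultHandler ignores every event, which is enough here.
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
        return parser.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Parser implementation: " + parseAndReport("<address_book/>"));
    }
}
```

Because no implementation class appears in the source, the same code runs unchanged when a different compliant parser is plugged in.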

For detailed information on using JAXP to parse XML documents, see Parsing XML Documents.

The javax.xml.transform package contains classes to transform XML data, such as an XML document, a DOM tree, or SAX events, into a different format. The transformer classes work similarly to the parser classes. To transform an XML document, a programmer first instantiates a TransformerFactory object with the newInstance() method. This method looks up the specific implementation of the XSLT transformer to load based on a well-defined list of locations. The programmer then instantiates a new Transformer object based on a specific XSLT style sheet and executes its transform() method, passing it the XML object to transform. The XML object might be an XML file, a DOM tree, and so on.
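The steps above can be sketched as follows, this time with a DOM tree as the XML object to transform. The style sheet, which emits the names as plain text, and the class name JaxpTransformSketch are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class JaxpTransformSketch {

    static final String XML =
        "<address_book>"
      + "<person gender=\"f\"><name>Jane Doe</name></person>"
      + "<person gender=\"m\"><name>John Smith</name></person>"
      + "</address_book>";

    // Hypothetical style sheet: writes each name followed by a semicolon, as text.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:for-each select=\"address_book/person\">"
      + "<xsl:value-of select=\"name\"/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String namesAsText(String xml, String xslt) throws Exception {
        // Build the DOM tree that will be the input of the transformation.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT processing expects namespace-aware trees
        Document tree = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        // Step 1: look up a transformer factory implementation.
        TransformerFactory factory = TransformerFactory.newInstance();
        // Step 2: create a Transformer for one specific style sheet.
        Transformer transformer = factory.newTransformer(new StreamSource(new StringReader(xslt)));
        // Step 3: transform the XML object, here a DOM tree.
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(tree), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(namesAsText(XML, XSLT));
    }
}
```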

For detailed information on using JAXP to transform XML objects, see Using JAXP to Transform XML Data.

 


Common Uses of XML and XSLT

How you use XML and XSLT depends on your particular business needs.

 

Using XML and XSLT to Separate Content from Presentation

XML and XSLT are often used in applications that support multiple client types. For example, suppose you have a Web-based application that supports both browser-based clients and Wireless Application Protocol (WAP) clients. These clients understand different markup languages, HTML and Wireless Markup Language (WML), respectively, but your application must deliver content that is appropriate for both.

To accomplish this goal, you can write your application to first produce an XML document that represents the data it is sending to the client. Then the application can transform the XML document that represents the data into HTML or WML, depending on the client's browser type. Your application can determine the client browser type by examining the User-Agent request header of an HTTP request. Once the application knows the client browser type, it uses the appropriate XSLT style sheet to transform the document into the correct markup language. See the SnoopServlet example included in the examples/servlets directory of your WebLogic Server distribution for an example of how to access this type of header information.
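The style sheet selection step might look like the following sketch. The file names html.xsl and wml.xsl and the User-Agent tokens are illustrative assumptions, not values defined by WebLogic Server; real client detection usually needs a more complete list of tokens.

```java
import java.util.Locale;

public class StylesheetChooser {

    // Hypothetical style sheet names, one per supported markup language.
    static final String HTML_STYLESHEET = "html.xsl";
    static final String WML_STYLESHEET = "wml.xsl";

    /** Picks a style sheet from the value of the User-Agent request header. */
    static String stylesheetFor(String userAgent) {
        if (userAgent == null) {
            return HTML_STYLESHEET; // no header: fall back to HTML
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        // Assumed heuristic: treat agents that advertise these tokens as WAP clients.
        if (ua.contains("up.browser") || ua.contains("wap")) {
            return WML_STYLESHEET;
        }
        return HTML_STYLESHEET;
    }

    public static void main(String[] args) {
        System.out.println(stylesheetFor("Mozilla/4.0 (compatible; MSIE 6.0)"));
        System.out.println(stylesheetFor("UP.Browser/3.1"));
    }
}
```

In a servlet, the header value would typically come from request.getHeader("User-Agent"), and the returned name would be used to load the style sheet passed to the XSLT transformer.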

This method of rendering the same XML document in a different markup language for each client type concentrates the effort required to support multiple client types into the development of the appropriate XSLT style sheets. Additionally, it allows your application to adapt to other client types easily, if necessary.

For additional information about XSLT, see Other XML Specifications and Information.

 

XML as a Message Format for Business-to-Business Communication

In a business-to-business (B2B) environment, Company A and Company B want to exchange information about e-commerce transactions in which both are involved. Company A is a major e-commerce site. Company B is a small affiliate that sells Company A's products to a niche group of customers. When Company B sends customers to Company A, Company B is compensated in two ways: it receives, from Company A, both money and information about other customers that make the same sort of purchases as those made by the customers referred by Company B. To exchange information, Company A and Company B must agree on a data format for information that is machine readable and that operates with systems from both companies easily. XML is the logical data format to use in this scenario, but selecting this format is only the first step. The companies must then agree on the format of the XML messages to be exchanged. Because Company A has a one-to-many relationship with its affiliates, Company A must define the format of the XML messages that will be exchanged.

To define the format of XML messages, or XML documents, Company A creates two document type definitions (DTDs): one that describes the information that A will provide about customers and one that describes the information that A wants to receive about a newly affiliated company. Company B must also create two DTDs: one to process the XML documents received from Company A and one to prepare an XML document in a format that can be processed by Company A.

 


WebLogic Server XML Features

WebLogic Server consolidates XML technologies applicable to WebLogic Server and XML applications based on WebLogic Server. The WebLogic Server XML subsystem allows customers to use standard parsers, the WebLogic FastParser, XSLT transformers, and DTDs and XML Schemas to process and convert XML files.

The WebLogic Server XML subsystem includes the following features:

 

XML Document Parsers

WebLogic Server includes the following two parsers:

  • Built-in: A validating parser based on the Apache Xerces parser version 2.1.0. You can use the built-in parser in either Simple API for XML (SAX) mode or Document Object Model (DOM) mode using the JAXP API. The package name of the built-in WebLogic Server parser is weblogic.apache.xerces.*. For detailed information on this parser, see its Javadoc. If you have not used the XML Registry to configure a different parser for WebLogic Server, and you use JAXP in your application to obtain a parser, this built-in parser is the one you get.
  • WebLogic FastParser: A high-performance non-validating XML parser specifically designed for processing small to medium size documents, such as SOAP and WSDL files associated with WebLogic Web services. The FastParser supports SAX-style parsing only. Configure WebLogic Server to use FastParser if your application mostly handles small to medium size (up to 10,000 elements) XML documents. For detailed information on using WebLogic FastParser, refer to Using the WebLogic FastParser.

You can also use any other XML parser of your choice by using the Administration Console to configure it in the XML Registry. You can configure a single instance of WebLogic Server to use one parser for a particular application and use another parser for a different application.

 

XML Document Transformer

The built-in XSLT transformer included in WebLogic Server is the same one that is included in the JDK 1.4.1 that is shipped with WebLogic Server: Version 2.2.D11 of the Apache Xalan XSL transformer.

If you have not used the XML Registry to configure a different transformer for WebLogic Server, and you use JAXP in your application to obtain a transformer, this built-in transformer is the one you get. The package name of this transformer is org.apache.xalan.*.

You can use this built-in XSLT transformer or other XSLT transformers in your XML application to transform XML documents into other XML documents, HTML, and so on. For more information about transforming XML documents, see Using JAXP to Transform XML Data.

 

Difference in Built-In Transformer Between Version 8.1 and Previous Versions of WebLogic Server

The built-in transformer in Version 7.0 and earlier versions of WebLogic Server was based on Apache's Xalan XSLT transformer, and its package name started with weblogic.apache.xalan. In Version 8.1 of WebLogic Server, this transformer has been deprecated. Instead, the built-in transformer is the same one that is shipped in JDK 1.4.1: Apache's Xalan 2.2.D11.

For backward compatibility, the weblogic.apache.xalan.* transformer is still available in Version 8.1 of WebLogic Server, although BEA highly recommends that you do not use it because it will not be available in future versions. If, however, you need to continue using this transformer temporarily, use the Administration Console to configure a transformer other than the built-in one for your WebLogic Server instance, either by updating an existing XML Registry or by creating a new one. Use the following transformer factory:

weblogic.apache.xalan.processor.TransformerFactoryImpl

For detailed information on using the Administration Console to configure the XML Registry for WebLogic Server, see Configuring a Parser or Transformer Other Than the Built-In.

 

WebLogic XML Streaming API

The WebLogic XML Streaming API provides an easy and intuitive way to parse and generate XML documents. It is based upon the SAX API, but provides a more procedural, stream-based handling of XML documents rather than having to write SAX event handlers, which can get complicated when dealing with complex XML documents. In other words, the streaming API gives you more control over parsing than the SAX API.

For detailed information on using the WebLogic XML Streaming API, see Using the WebLogic XML Streaming API.

 

JAXP Pluggability Layer Implementation

Java API for XML Processing (JAXP) 1.1 is a Java-standard, parser-independent API for XML. For more information on JAXP, see What Is JAXP?.

Note: WebLogic Server uses the XML Registry, accessed through the Administration Console, to plug in parsers and transformers. This is different from the JAXP 1.1 specification which specifies the use of system properties to plug in parsers and transformers.

 

WebLogic Servlet Attributes

WebLogic Server supports the following special Servlet attributes:

  • org.xml.sax.helpers.DefaultHandler
  • org.w3c.dom.Document

Calling the setAttribute (for SAX parsing) and getAttribute (for DOM parsing) methods on a ServletRequest object with the preceding attributes will parse any given XML document.

The following code sections show an example of how to use these methods:

request.setAttribute("org.xml.sax.helpers.DefaultHandler", new DefHandler());

org.w3c.dom.Document doc = (org.w3c.dom.Document) request.getAttribute("org.w3c.dom.Document");

The setAttribute and getAttribute methods are provided for convenience only; they are not required to parse XML from a Servlet.

 

WebLogic XSLT JSP Tag Library

The JSP tag library provides a simple tag that enables access to the built-in XSLT transformer from within a JavaServer Page (JSP) running on WebLogic Server. Currently, this tag supports the built-in XSLT transformer only; you cannot use the tag to transform an XML document from within a JSP using a different transformer.

The JSP tag library is included in xmlx-tags.jar, which is installed when you install your WebLogic Server distribution.

Note: The JSP tag library is provided for convenience only; it is not required to access XSLT transformers from within a JSP.

 

XML Registry For Configuring Parsers and Transformers

The XML Registry simplifies administration and configuration tasks by separating these tasks from the XML application. Use the Administration Console (a graphical user interface, or GUI, for WebLogic Server administration) to configure the parsers and transformers for an instance of WebLogic Server.

Note: Each WebLogic Server domain can include any number of registries; each WebLogic Server instance in a domain can be assigned zero or one registry.

By using the XML Registry, you:

  • Can specify the parser or transformer at deployment time, not only at build time.
  • Do not need to include any parser- or transformer- dependent code in your applications.
  • Can support multiple parsers and transformers in a single server more conveniently.

You can use the XML Registry to perform the following tasks:

  • Configure an alternative XML parser instead of the built-in parser shipped in this version of WebLogic Server.
  • Configure an alternative XSLT transformer instead of the built-in transformer shipped in this version of WebLogic Server.
  • Configure an XML parser to process a particular application.

All the preceding capabilities are available if your application uses the standard Java API for XML Processing (JAXP), which is included in this version of WebLogic Server. These capabilities are for use on the server side only.

 

XML Registry for Configuring External Entity Resolution

WebLogic XML supports external entity resolution through the XML Registry. External entities are chunks of text that are not literally part of an XML document, but are referenced inside the XML document. The actual text might reside anywhere - in another file on the same computer or even somewhere on the Web. An example of an external entity is a DTD file that is used to validate an XML document. To use this feature, open the Administration Console and use the XML Registry to enter the Public ID or System ID associated with the external entity.

In addition to storing external entities locally, you can configure WebLogic Server to retrieve and cache external entities from external repositories that support an HTTP interface, such as a URL. You can configure WebLogic Server to cache the external entity in memory or on the disk and specify how long the entity should remain cached before it is considered out of date.
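The idea of resolving an external entity from a local copy can be illustrated at the API level. This sketch is not XML Registry configuration, which is done in the Administration Console; it uses the standard SAX EntityResolver interface to show the same principle in code. The class name LocalDtdResolver and the in-memory DTD text are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class LocalDtdResolver {

    // Stand-in for a locally stored copy of address.dtd (hypothetical content).
    static final String LOCAL_DTD =
        "<!ELEMENT address_book (person*)>\n"
      + "<!ELEMENT person (#PCDATA)>\n";

    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE address_book SYSTEM \"address.dtd\">\n"
      + "<address_book><person>Jane Doe</person></address_book>";

    /** Parses the document and reports whether the DTD was served from the local copy. */
    static boolean resolvedLocally(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        final boolean[] served = { false };
        builder.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                // Map the System ID of the external entity to the local copy.
                if (systemId != null && systemId.endsWith("address.dtd")) {
                    served[0] = true;
                    return new InputSource(new StringReader(LOCAL_DTD));
                }
                return null; // let the parser resolve anything else itself
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return served[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("DTD resolved locally: " + resolvedLocally(XML));
    }
}
```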

For more information about using the XML Registry for external entity resolution, see External Entity Configuration Tasks.

 

Code Examples for Parsing and Transforming XML Documents

WebLogic Server includes examples of parsing and transforming XML documents.

The examples are located in the WL_HOME\samples\server\examples\src\examples\xml directory, where WL_HOME refers to the top-level WebLogic Platform directory.

For detailed instructions on how to build and run the examples, invoke the Web page WL_HOME\samples\server\examples\src\examples\xml\package-summary.html in your browser.

 


Editing XML Files

To edit XML files, use the BEA XML Editor, an entirely Java-based XML stand-alone editor. It is a simple, user-friendly tool for creating and editing XML files. It displays XML file contents both as a hierarchical XML tree structure and as raw XML code. Thus you can choose how to edit the XML document:

  • The hierarchical tree view allows structured, constrained editing, providing you with a set of allowable functions at each point in the hierarchical XML tree structure. The allowable functions are dictated by XML syntax and by the XML document's DTD or schema, if one is specified.
  • The raw XML code view allows free-form editing of the data.

BEA XML Editor can validate XML code according to a specified DTD or XML schema.

For detailed information about using the BEA XML Editor, see its online help.

You can download BEA XML Editor from dev2dev Online.

 


Learning About XML

To learn about XML, see the following online courses and tutorials. XML Reference provides links to more information.
