Last week's XML TechMail illustrated how to use Sun Microsystems' Java API for XML Processing (JAXP) to create a Document Object Model (DOM) object from an XML file. This week, we'll go a step further and show how to extract data from that DOM object.
Navigating the forest
To find a piece of data within an XML document, you need to know where it's located. Because the DOM represents the document as a tree of nodes, you have to understand how that tree is built in order to navigate it. The table below illustrates our simple XML document as a series of nodes.
Root node         Child nodes
CustomerRecord    CustomerNumber
                  Address1
                  Address2
                  City
                  State
                  Zip
                  Phone
                  Fax
                  FirstName
                  MiddleInitial
                  LastName
                  E-mail
As you can see, the CustomerRecord node is the main or root node for the XML document and the other nodes are all children of the root node. Because the nodes are structured in a tree format, we'll create a special class that will represent the location of each node within the tree. Our new class, called treeNav, is shown in Listing 1.
Listing 1: The treeNav class
public class treeNav {
    public String element;
    public int index;

    public treeNav() {
        element = "";
        index = 0;
    }
}
Using an array of treeNav objects, we can create a navigation path through the XML document. Listing 2 shows an example of creating a treeNav array to access the customer's CustomerNumber.
Listing 2: An array of treeNav objects to access CustomerNumber
treeNav[] tree = new treeNav[1];
tree[0] = new treeNav();
tree[0].element = "CustomerRecord";
tree[0].index = 0;
Using this array, we can "walk" through the XML tree to find the element we are looking for. Because CustomerNumber is attached to the CustomerRecord node, we only need to navigate to the CustomerRecord node. Once we are at that node, we can look up the CustomerNumber tag beneath it to extract the data.
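Each step in the path pairs a tag name with an index, and the index matters once a tag repeats. The following is a minimal sketch of how one step resolves, assuming a hypothetical document (not this article's sample file) whose Customers root wraps two CustomerRecord elements. The class PathStepDemo, the helper step() and the customer numbers are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class PathStepDemo {
    // Hypothetical sample: a Customers root wrapping two records.
    static final String XML =
        "<Customers>"
      + "<CustomerRecord><CustomerNumber>1001</CustomerNumber></CustomerRecord>"
      + "<CustomerRecord><CustomerNumber>1002</CustomerNumber></CustomerRecord>"
      + "</Customers>";

    // Resolve one path step: the index-th descendant of 'from' named 'element',
    // or null when the index is out of range.
    static Element step(Element from, String element, int index) {
        NodeList matches = from.getElementsByTagName(element);
        if (index < 0 || index >= matches.getLength()) {
            return null;
        }
        return (Element) matches.item(index);
    }

    // Walk to the second CustomerRecord (index 1) and read its CustomerNumber.
    static String secondCustomerNumber() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        Element record = step(doc.getDocumentElement(), "CustomerRecord", 1);
        return step(record, "CustomerNumber", 0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondCustomerNumber()); // prints 1002
    }
}
```

With index 0 the same walk would land on the first record instead, which is the case the single-record document in this article relies on.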
Listing 3: The getSingleElement() method
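import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Walks the path in tree, then returns the text of the first tagname element found there.
public static String getSingleElement(Document doc, treeNav[] tree, String tagname) throws Exception {
    String retval = "";
    if (doc == null) return "";
    Element el = doc.getDocumentElement();
    el.normalize(); // merges adjacent text nodes throughout the subtree
    for (int i = 0; i < tree.length; i++) {
        // getElementsByTagName() searches descendants only, so a step that names
        // the current element itself finds nothing and leaves el unchanged.
        NodeList nl = el.getElementsByTagName(tree[i].element);
        // Guard the index: item() returns null when it is out of range.
        if (tree[i].index >= 0 && tree[i].index < nl.getLength()) {
            el = (Element) nl.item(tree[i].index);
        }
    }
    NodeList n = el.getElementsByTagName(tagname);
    if (n.getLength() > 0) {
        // An element's text is held in its child nodes, not in the element itself.
        NodeList children = n.item(0).getChildNodes();
        if (children.getLength() > 0) {
            retval = children.item(0).getNodeValue();
        }
    }
    return retval;
}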
Listing 3 shows the getSingleElement() method, which uses the tree to navigate the XML document.
This method uses the first element of the treeNav array to find the starting point of the navigation. Each subsequent array element is used in the same way until the end of the array is reached. At that point, the tagname passed to the method is used to find a matching tag beneath the current node. If a match is found, the value of its first child node is returned to the caller; otherwise, the method returns an empty string.
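The method reads the value from the matched element's first child because, in the DOM, an element's text lives in a child text node, not on the element. A minimal sketch of that distinction follows, assuming a one-field sample document with a made-up customer number. The class TextNodeDemo and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TextNodeDemo {
    // Hypothetical one-field sample document.
    static final String XML =
        "<CustomerRecord><CustomerNumber>1001</CustomerNumber></CustomerRecord>";

    // Parse the sample and return its CustomerNumber element.
    static Element customerNumber() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        return (Element) doc.getElementsByTagName("CustomerNumber").item(0);
    }

    // getNodeValue() on an element node is defined to return null.
    static String valueOnElement() throws Exception {
        return customerNumber().getNodeValue();
    }

    // The text sits in the element's first child, a text node.
    static String valueOnTextChild() throws Exception {
        return customerNumber().getFirstChild().getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(valueOnElement());   // prints null
        System.out.println(valueOnTextChild()); // prints 1001
    }
}
```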
To get the value of the CustomerNumber field, you simply call the getSingleElement() method as follows:
String CustomerNumber = getSingleElement(doc, tree, "CustomerNumber");
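To try the call end to end, the pieces can be assembled into one file. The sketch below is an illustration under several assumptions: the XML is parsed from an in-memory string with made-up values instead of last week's file, the class name CustomerLookupDemo and the helper lookup() are hypothetical, and getSingleElement() is a condensed version of Listing 3's logic with an added bounds check on the index.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class CustomerLookupDemo {
    // Hypothetical sample data using a few of the article's field names.
    static final String XML =
        "<CustomerRecord>"
      + "<CustomerNumber>1001</CustomerNumber>"
      + "<City>Louisville</City>"
      + "</CustomerRecord>";

    // Condensed form of Listing 3: follow the path, then read the tag's text.
    static String getSingleElement(Document doc, treeNav[] tree, String tagname) {
        if (doc == null) return "";
        Element el = doc.getDocumentElement();
        el.normalize();
        for (treeNav step : tree) {
            NodeList nl = el.getElementsByTagName(step.element);
            // Move down only when the requested match exists.
            if (step.index >= 0 && step.index < nl.getLength()) {
                el = (Element) nl.item(step.index);
            }
        }
        NodeList n = el.getElementsByTagName(tagname);
        if (n.getLength() > 0) {
            Node text = n.item(0).getFirstChild();
            if (text != null) return text.getNodeValue();
        }
        return "";
    }

    // Parse the sample, build the path from Listing 2, and look up one tag.
    static String lookup(String tagname) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        treeNav[] tree = new treeNav[1];
        tree[0] = new treeNav();
        tree[0].element = "CustomerRecord";
        tree[0].index = 0;
        return getSingleElement(doc, tree, tagname);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lookup("CustomerNumber")); // prints 1001
        System.out.println(lookup("City"));           // prints Louisville
    }
}

// Same shape as Listing 1; field initializers replace the constructor.
class treeNav {
    public String element = "";
    public int index = 0;
}
```

A tag that is absent from the document, such as Fax in this sample, comes back as an empty string, so callers cannot tell a missing field from an empty one.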
Summary
Sun's JAXP parser is a powerful XML parsing engine. In this two-part series, we've presented an approach to working with XML documents within Java. Last week, we illustrated the process of parsing an XML file into a DOM Document object. This week, we've shown you how to navigate the DOM Document and access XML data.