Friday, 3 August 2012

Parse & Create XML Documents in Java

There are three main types of XML parsers: DOM, SAX and StAX. StAX is an improvement on SAX and much easier to use, so we are not going to cover SAX here. DOM and StAX offer enough functionality to work with XML documents.

DOM is a parser that builds a complete tree of nodes in memory. This can be an issue when parsing large documents. However, of the parsers above, it is the only (and easiest) means of manipulating documents via CRUD (Create, Read, Update, Delete) operations.

StAX is a pull parser. It parses documents step by step and lets the user pull elements one by one. It uses much less memory than DOM, but since it never holds the whole document, it cannot be used for CRUD operations.

We will use a Maven code sample available here. Its resource directory contains a rates.xml example file.

DOM

DocumentBuilderFactory factory =
    DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream is = DOM.class.getResourceAsStream("/rates.xml");
Document doc = builder.parse(is);

// Retrieving cube XML nodes
NodeList list = doc.getElementsByTagName("Cube");

for (int i = 0; i < list.getLength(); i++) {

  Element element = (Element) list.item(i);

  // Retrieving attributes (a NamedNodeMap has no guaranteed order)
  NamedNodeMap attr = element.getAttributes();

  for (int j = 0; j < attr.getLength(); j++) {
    System.out.print(attr.item(j).getNodeValue() + " ");
  }

  System.out.println();

}

The above code loads the rates.xml file, extracts the Cube nodes, and prints their attributes. The output is:
2012-08-02
USD 1.2346
JPY 96.64 
BGN 1.9558 
CZK 25.260
...
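
Since a NamedNodeMap does not guarantee any attribute order, reading attributes by name gives more predictable output. Here is a minimal sketch of that idea. The inline sample document, the attribute names currency and rate, and the DomRatesSketch class are assumptions made for illustration; they are not taken from the code sample.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class DomRatesSketch {

    // Hypothetical inline document standing in for rates.xml
    static final String SAMPLE =
        "<Cube><Cube time=\"2012-08-02\">"
        + "<Cube currency=\"USD\" rate=\"1.2346\"/>"
        + "<Cube currency=\"JPY\" rate=\"96.64\"/>"
        + "</Cube></Cube>";

    // Returns one "currency rate" line per Cube element that has a currency
    static List<String> readRates(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            NodeList cubes = doc.getElementsByTagName("Cube");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < cubes.getLength(); i++) {
                Element cube = (Element) cubes.item(i);
                // Skip the wrapper Cube elements, which carry no currency
                if (cube.hasAttribute("currency")) {
                    lines.add(cube.getAttribute("currency")
                        + " " + cube.getAttribute("rate"));
                }
            }
            return lines;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String line : readRates(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

With getAttribute, the column order is chosen by the code rather than by the parser implementation.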

StAX

XMLInputFactory inputFactory = XMLInputFactory.newInstance();
InputStream is = StAX.class.getResourceAsStream("/rates.xml");
XMLEventReader eventReader
  = inputFactory.createXMLEventReader(is);

// Pulling XML elements
while (eventReader.hasNext()) {

  XMLEvent event = eventReader.nextEvent();

  if (event.isStartElement()) {
    StartElement se = event.asStartElement();

    // Filtering on Cube elements
    if (se.getName().getLocalPart().equals("Cube")) {

      // Attribute iteration order is not guaranteed
      Iterator<?> it = se.getAttributes();
      while (it.hasNext()) {
        Attribute a = (Attribute) it.next();
        System.out.print(a.getValue() + " ");
      }

      // No extra nextEvent() here: it could silently skip a nested Cube
      System.out.println();

    }

  }

}

The above code pulls XML events one by one, keeps only the Cube start elements, and prints their attributes. Attribute order is not significant in XML, so it may differ from one parser to another. The output is:
2012-08-02
1.2346 USD
96.64 JPY
1.9558 BGN
25.260 CZK
...
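
The same pull loop can also look attributes up by name instead of iterating over them, which avoids depending on attribute order. Below is a minimal sketch under stated assumptions: the inline sample, the attribute names currency and rate, and the StaxRatesSketch class are hypothetical and only serve the illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StaxRatesSketch {

    // Hypothetical inline document standing in for rates.xml
    static final String SAMPLE =
        "<Cube><Cube time=\"2012-08-02\">"
        + "<Cube currency=\"USD\" rate=\"1.2346\"/>"
        + "<Cube currency=\"JPY\" rate=\"96.64\"/>"
        + "</Cube></Cube>";

    // Returns currency -> rate, in document order
    static Map<String, String> readRates(String xml) {
        Map<String, String> rates = new LinkedHashMap<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (!event.isStartElement()) {
                    continue;
                }
                StartElement se = event.asStartElement();
                if (!"Cube".equals(se.getName().getLocalPart())) {
                    continue;
                }
                // getAttributeByName returns null when the attribute is absent
                Attribute currency = se.getAttributeByName(new QName("currency"));
                Attribute rate = se.getAttributeByName(new QName("rate"));
                if (currency != null && rate != null) {
                    rates.put(currency.getValue(), rate.getValue());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return rates;
    }

    public static void main(String[] args) {
        readRates(SAMPLE).forEach((currency, rate) ->
            System.out.println(currency + " " + rate));
    }
}
```

Only one event is held at a time, so memory use stays flat however many Cube elements the document contains.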

CRUD and Print

For creation:
// We need a Document
DocumentBuilderFactory dbfac
    = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();

Element root = doc.createElement("MyXML");
doc.appendChild(root);

Element sub = doc.createElement("MyNode");
sub.setAttribute("MyAttribute", "33");
root.appendChild(sub);

Text text = doc.createTextNode("Some text for my node");
sub.appendChild(text);

Element sub2 = doc.createElement("MyNode2");
sub2.setAttribute("MyAttribute", "45");
root.appendChild(sub2);

Element subnode = doc.createElement("MySubNode");
sub2.appendChild(subnode);

printXML(doc);

The above code creates a document with a root node, adds two child nodes to it, and adds a sub-node to the second child. It also sets an attribute value on each child and a text node on the first one.

For printing:
TransformerFactory transfac
    = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.INDENT, "yes");

StringWriter sw = new StringWriter();
StreamResult sr = new StreamResult(sw);
DOMSource source = new DOMSource(doc);

trans.transform(source, sr);

String result = sw.toString();
System.out.println(result);

The INDENT output property puts each node on its own line. The output is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<MyXML>
<MyNode MyAttribute="33">Some text for my node</MyNode>
<MyNode2 MyAttribute="45">
<MySubNode/>
</MyNode2>
</MyXML>
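
The Update and Delete parts of CRUD work on the same in-memory tree. The sketch below rebuilds the document from the creation example, changes the first node, and removes the second one. The DomUpdateDeleteSketch class, its method names, and the new values ("34", "Updated text") are assumptions made for illustration, not part of the code sample.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomUpdateDeleteSketch {

    // Rebuilds the document from the creation example
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("MyXML");
            doc.appendChild(root);

            Element sub = doc.createElement("MyNode");
            sub.setAttribute("MyAttribute", "33");
            sub.appendChild(doc.createTextNode("Some text for my node"));
            root.appendChild(sub);

            Element sub2 = doc.createElement("MyNode2");
            sub2.setAttribute("MyAttribute", "45");
            sub2.appendChild(doc.createElement("MySubNode"));
            root.appendChild(sub2);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }

    static void updateAndDelete(Document doc) {
        // Update: overwrite the attribute and the text of MyNode
        Element node = (Element) doc.getElementsByTagName("MyNode").item(0);
        node.setAttribute("MyAttribute", "34");
        node.setTextContent("Updated text");

        // Delete: detach MyNode2 (and its MySubNode child) from its parent
        Element node2 = (Element) doc.getElementsByTagName("MyNode2").item(0);
        node2.getParentNode().removeChild(node2);
    }

    // Serializes the document without the XML declaration
    static String toXml(Document doc) {
        try {
            Transformer trans = TransformerFactory.newInstance().newTransformer();
            trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            trans.transform(new DOMSource(doc), new StreamResult(sw));
            return sw.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = build();
        updateAndDelete(doc);
        System.out.println(toXml(doc));
    }
}
```

After the call, the printed document should contain only the updated MyNode element under the root.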
