by Sandeep Desai (http://www.thdesai.net/)
XML is a stream of structured text and a portable way to describe data. It is a standard defined by the World Wide Web Consortium (W3C).
XML defines the syntax for data but not its semantics. Vertical standards are being developed by the ebXML project.
XML Parsers
SAX (Simple API for XML): event-driven API that reads and processes XML sequentially
DOM (Document Object Model): API that reads the whole XML document into an in-memory tree that can be traversed and modified
DTD (Document Type Definition): specification for validating XML documents; a DTD is not itself an XML document. Useful for validating simple XML documents
XSL (Extensible Stylesheet Language): describes how to transform and display XML data (XML itself only identifies data, not how to display it)
XSLT + XPath: XSLT transforms XML documents, using XPath expressions to select nodes
XML Schema: XML-based standard for validating XML documents
JAXP (Java API for XML Processing): common API for working with XML regardless of the parser vendor's own API
JAXB (Java Architecture for XML Binding): standard for mapping Java objects to and from XML
JDOM (Java DOM): creates a tree of elements that is easier to work with from Java than DOM
JAXM (Java API for XML Messaging): defines a mechanism for exchanging asynchronous messages between applications
JAXR (Java API for XML Registries)
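As a rough illustration of the JAXP and DOM entries above, the following sketch parses a small document into an in-memory tree through the vendor-neutral JAXP factory. The class name DomSketch, the helper parse, and the sample string are made up for illustration; only the javax.xml.parsers and org.w3c.dom calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomSketch {
    // Hypothetical helper: parse an XML string into a DOM tree via JAXP.
    static Document parse(String xml) {
        try {
            // JAXP hands back whichever parser implementation is configured,
            // so this code does not depend on a vendor-specific API.
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<contents><message to=\"you\" from=\"me\"/></contents>");
        // The whole document is in memory, so it can be walked in any order.
        Element root = doc.getDocumentElement();
        NodeList messages = root.getElementsByTagName("message");
        Element first = (Element) messages.item(0);
        System.out.println(root.getTagName() + " has " + messages.getLength()
                + " message(s); the first is addressed to " + first.getAttribute("to"));
    }
}
```

Because DOM keeps the full tree, it suits documents that need to be revisited or modified; for very large inputs the memory cost can become a concern.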
An XML document has a header and content. The <?xml ... ?> declaration is the header. The content has a root element; in the example below it is the <contents> tag. There can be only one root element. Each tag must have a matching closing tag, or be written as an empty element such as <signature/>.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- This is a comment -->
<!-- optional DTD reference: SYSTEM keyword for a local file, PUBLIC keyword for a URL;
     the name after DOCTYPE must match the root element -->
<!DOCTYPE contents SYSTEM "DTD/JavaXML.dtd">
<contents>
  <!-- standard element with attributes -->
  <!-- element names cannot have spaces; names are case sensitive -->
  <message to="you" from="me" subject="XML">
    <!-- element with textual data -->
    <text>
      This is the list of XML escape characters:
      &lt; is less than, &gt; is greater than, &amp; is ampersand,
      &quot; is double quotation, &apos; is single quotation
      <!-- empty element -->
      <signature/>
    </text>
    <text>
      <![CDATA[FooBar:
      This text is not parsed by the XML parser, so you
      can use < and > directly; useful for large blocks of text
      ]]>
    </text>
    <!-- closing tag -->
  </message>
</contents>
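To make the escape characters and the CDATA section in the example concrete, here is a minimal sketch of how a parser hands that text back to the application. The class EscapeSketch and the helper rootText are hypothetical names; the sample strings are not from the original notes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapeSketch {
    // Hypothetical helper: parse the XML and return the root element's text.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Escaped characters are decoded by the parser.
        System.out.println(rootText(
                "<text>&lt;a&gt; &amp; &quot;b&quot; &apos;c&apos;</text>"));
        // Inside CDATA the same characters can be written literally.
        System.out.println(rootText("<text><![CDATA[1 < 2 && 3 > 2]]></text>"));
    }
}
```

Both forms reach the application as plain characters; the difference is only in how the text has to be written in the document.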
namespaces, constraints, DTDs
XML Schema
Transformations: XSL, XSLT, XPath
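A transformation can be sketched with the JAXP javax.xml.transform API. The stylesheet below is an assumption written for this illustration (it is not from the original notes): it uses an XPath expression to select each message element and emits plain text. The class XsltSketch and the helper transform are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Illustrative stylesheet: for every contents/message element,
    // output the "from" and "to" attributes as text.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"contents/message\">"
        + "<xsl:value-of select=\"concat(@from, ' to ', @to)\"/>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical helper: apply STYLESHEET to an XML string.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
                "<contents><message to=\"you\" from=\"me\"/></contents>"));
    }
}
```

Here XSLT supplies the template and output rules, while XPath (contents/message, @from, concat) picks out the data to emit.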
A local XML file is referenced by URI as, for example, file:///c:/foo/bar/blah.xml
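As a small sketch of the file URI form, the standard java.net.URI and java.nio.file classes can take such a reference apart or build one from a local path. FileUriSketch and toFileUri are hypothetical names, and the exact URI produced for a local path depends on the platform and the current directory.

```java
import java.net.URI;
import java.nio.file.Paths;

class FileUriSketch {
    // Hypothetical helper: turn a local path into a file: URI.
    static URI toFileUri(String path) {
        return Paths.get(path).toAbsolutePath().toUri();
    }

    public static void main(String[] args) {
        // Take the URI from the note above apart.
        URI uri = URI.create("file:///c:/foo/bar/blah.xml");
        System.out.println(uri.getScheme());
        System.out.println(uri.getPath());

        // Build a file URI from a relative path (result is platform-dependent).
        System.out.println(toFileUri("blah.xml"));
    }
}
```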
SAX is a sequential model for parsing XML. The developer implements callbacks through the org.xml.sax.ContentHandler interface, and the parser invokes them as it reads the document. For example, startElement(String namespaceURI, String localName, String qName, Attributes atts) is called at the start of each element. There are no methods to walk the XML document; the application sees each event once, in document order.
SAX is good for picking specific information out of an XML document, e.g. building an index for an XML document that represents a book.
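The book-index idea can be sketched as follows. This is only an illustration under assumptions: the chapter element and its title attribute are invented for the example, SaxIndexSketch and chapterTitles are hypothetical names, and the handler extends org.xml.sax.helpers.DefaultHandler (a convenience class that implements ContentHandler with empty methods) rather than implementing the interface directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxIndexSketch {
    // Hypothetical helper: collect the title of every <chapter> element.
    static List<String> chapterTitles(String xml) {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String namespaceURI, String localName,
                                     String qName, Attributes atts) {
                // The default factory is not namespace-aware, so the
                // element name is read from qName rather than localName.
                if ("chapter".equals(qName)) {
                    titles.add(atts.getValue("title"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String book = "<book><chapter title=\"Intro\"/>"
                    + "<chapter title=\"SAX\"/></book>";
        System.out.println(chapterTitles(book)); // prints [Intro, SAX]
    }
}
```

The handler keeps only the titles it cares about and never holds the whole document, which is why SAX fits this kind of single-pass extraction.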