In contrast to recent Web technologies such as HTML, Cascading Style Sheets and Dynamic HTML, XML is all about structure, not presentation.
If written properly, ordinary HTML may reflect the structure of a document, but it cannot adequately represent the structure of data. Consider this trivial HTML example:
John Q Public
[email protected]
phone: 301-286-aaaa
fax: 301-286-bbbb
Bldg. 23, Rm. 999
NASA
Goddard Space Flight Center
588.0
Greenbelt, MD 20221
As humans, we recognize that this example represents information about an employee: name, phone, address, etc. However, the elements used to mark up this snippet do not in fact reveal any such interpretation! The markup merely describes how the lines should be displayed. When the HTML is processed by the browser, no semantics can be inferred; your poor computer has no understanding of the kind of information being rendered.
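To make the point concrete, here is a minimal Python sketch that asks a parser what it can learn from presentational markup. The HTML fragment is hypothetical: the tags chosen (`<p>`, `<b>`, `<i>`, `<br>`) are an assumption made for illustration, not a record of the markup used in the example above.

```python
from html.parser import HTMLParser

# Hypothetical presentational markup for a few lines of the example.
# The tag choices (<p>, <b>, <i>, <br>) are assumptions for illustration.
HTML_SNIPPET = """
<p><b>John Q Public</b><br>
<i>phone: 301-286-aaaa</i><br>
Bldg. 23, Rm. 999<br>
Greenbelt, MD 20221</p>
"""


class TagCollector(HTMLParser):
    """Records the name of every start tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_names(html):
    """Return the distinct tag names used in an HTML fragment, sorted."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return sorted(set(collector.tags))


# Every name reported is about layout; nothing says "name" or "phone".
print(tag_names(HTML_SNIPPET))
```

All a program can recover from such a fragment is a list of display instructions. Nothing in the tag names distinguishes a person's name from a building number.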
Now consider a possible XML representation of the same information that conveys the relationships among the various data objects. In the XML version below, we have an employee, described by a name, an email address, phone and fax numbers, a location, and an address. Note that each conceptual piece of information is represented by its own XML element, such as <name>, <email>, and <phone>.
John
Q
Public
[email protected]
301-286-aaaa
301-286-bbbb
Bldg. 23
999
NASA
Goddard Space Flight Center
588.0
Greenbelt
MD
20221
The advantage of XML in this example is that it preserves the semantics and structure of the data. We can think of this information as hierarchical data. An employee object consists of name, email, phone, fax, location, and address objects. A name consists of first, middle, and last components; a location contains a building and a room object; and so forth. The parallel to database records should be obvious.
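The hierarchy just described can be sketched in a few lines of Python using the standard `xml.etree.ElementTree` module. The element names for the employee, name, email, phone, fax, location, building, and room follow the description above. The child elements of the address (`organization`, `center`, `mailcode`, `city`, `state`, `zip`) are hypothetical names chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Element names under <address> are assumptions; the text above does not
# name them. The other names follow the description of the hierarchy.
EMPLOYEE_XML = """
<employee>
  <name>
    <first>John</first>
    <middle>Q</middle>
    <last>Public</last>
  </name>
  <email>[email protected]</email>
  <phone>301-286-aaaa</phone>
  <fax>301-286-bbbb</fax>
  <location>
    <building>Bldg. 23</building>
    <room>999</room>
  </location>
  <address>
    <organization>NASA</organization>
    <center>Goddard Space Flight Center</center>
    <mailcode>588.0</mailcode>
    <city>Greenbelt</city>
    <state>MD</state>
    <zip>20221</zip>
  </address>
</employee>
"""


def outline(element, depth=0):
    """Return indented tag names, one per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


employee = ET.fromstring(EMPLOYEE_XML.strip())

# The tree mirrors the record structure: employee -> name -> first, ...
print("\n".join(outline(employee)))

# Individual fields can be addressed by their position in the hierarchy.
print(employee.findtext("name/last"))
print(employee.findtext("location/room"))
```

Because each field has its own element, a program can ask for the last name or the room number directly, much as it would read a column from a database record.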
We can easily imagine a Document Object Model (DOM) based on JavaScript or ECMAScript that accommodates this hierarchy. If this entry were the fifth employee "record" on the page (index 4, counting from zero), we could reference the value of the last name ("Public") like this:
doc.employee[4].name[0].last[0].value
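The expression above is an imagined object model, not an existing API. A rough Python equivalent, assuming a hypothetical `doc` root element that holds five employee records (the first four are invented placeholders), might look like this. The `select` helper is likewise an illustrative name, not part of any library.

```python
import xml.etree.ElementTree as ET


def build_doc():
    """Build a <doc> with five employees; only the fifth is from the example."""
    doc = ET.Element("doc")
    # The first four last names are hypothetical placeholder records.
    for last in ["Alpha", "Bravo", "Charlie", "Delta", "Public"]:
        employee = ET.SubElement(doc, "employee")
        name = ET.SubElement(employee, "name")
        ET.SubElement(name, "last").text = last
    return doc


def select(root, *steps):
    """Follow (tag, index) steps from root, using zero-based indexes."""
    node = root
    for tag, index in steps:
        # findall() returns the direct children with this tag, in order.
        node = node.findall(tag)[index]
    return node


doc = build_doc()

# Mirrors doc.employee[4].name[0].last[0] with zero-based indexes.
last = select(doc, ("employee", 4), ("name", 0), ("last", 0))
print(last.text)

# ElementTree's path syntax expresses the same lookup with one-based positions.
same = doc.find("employee[5]/name[1]/last[1]")
print(same.text)
```

Note the off-by-one difference between the two styles: the array-like form counts from zero, while the path form counts from one.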
Note that in the XML representation, there is no description of how to display the content. While this might at first appear to be undesirable, the separation of semantics from visual representation makes possible several of the benefits discussed below.
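As a small illustration of that separation, the sketch below renders one XML record in two different ways without touching the data itself. The function names (`as_text`, `as_html`) and the output layouts are assumptions chosen for this example. In practice a stylesheet would usually play this role.

```python
import xml.etree.ElementTree as ET
from html import escape

# A shortened employee record; only the fields used below are included.
EMPLOYEE_XML = """<employee>
  <name><first>John</first><middle>Q</middle><last>Public</last></name>
  <phone>301-286-aaaa</phone>
  <fax>301-286-bbbb</fax>
</employee>"""


def full_name(employee):
    """Join the first, middle, and last name components with spaces."""
    parts = [employee.findtext(f"name/{part}") for part in ("first", "middle", "last")]
    return " ".join(parts)


def as_text(employee):
    """Render the record as a single plain-text line."""
    phone = employee.findtext("phone")
    fax = employee.findtext("fax")
    return f"{full_name(employee)} (phone {phone}, fax {fax})"


def as_html(employee):
    """Render the same record as a small HTML fragment."""
    phone = employee.findtext("phone")
    return f"<p><b>{escape(full_name(employee))}</b><br>phone: {escape(phone)}</p>"


employee = ET.fromstring(EMPLOYEE_XML)
print(as_text(employee))
print(as_html(employee))
```

The data is written once, and each presentation is simply a different function applied to it.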