What is XML?

XML stands for Extensible Markup Language
XML is a markup language much like HTML
XML was designed to describe data
XML tags are not predefined. You must define your own tags
XML uses a Document Type Definition (DTD) or an XML Schema to describe the structure of the data
XML with a DTD or XML Schema is designed to be self-descriptive.
XML is based on Unicode; every XML processor must accept the UTF-8 and UTF-16 encodings. HTML 4 also uses Unicode as its document character set

The difference between XML and HTML

XML is not a replacement for HTML.
XML and HTML were designed with different goals:

XML was designed to describe data and to focus on what data is.
HTML was designed to display data and to focus on how data looks.

HTML is about displaying information, while XML is about describing information.

XML is free and extensible

XML tags are not predefined. You must "invent" your own tags.

The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.).

XML allows authors to define their own tags and their own document structure.
Because the tags are named by the author, they can be more flexible and more meaningful than a fixed set.

The following example is a note to Tove from Jani, stored as XML:

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The note has a header and a message body. It also has sender and receiver information. This XML document does not DO anything. It is just pure information wrapped in XML tags. Someone must write a piece of software to send, receive or display it.
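As an illustration of the "piece of software" mentioned above, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The function name format_note and the plain-text layout it produces are assumptions made for this example; they are not part of any XML standard.

```python
import xml.etree.ElementTree as ET

# The note document from the example above.
NOTE_XML = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""


def format_note(xml_text: str) -> str:
    """Turn the note document into a plain-text message (hypothetical layout)."""
    note = ET.fromstring(xml_text)
    # findtext() returns the text of the first child element with that tag.
    return "\n".join([
        f"To: {note.findtext('to')}",
        f"From: {note.findtext('from')}",
        f"Subject: {note.findtext('heading')}",
        "",
        note.findtext("body"),
    ])


print(format_note(NOTE_XML))
```

The XML itself stays unchanged; only the program decides how the information is presented.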

The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.

XML is a complement to HTML

It is important to understand that XML is not a replacement for HTML. In future Web development it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data.

My best description of XML is this: XML is a cross-platform, software and hardware independent tool for transmitting information.

XML in future Web development

XML will be as important to the future of the Web as HTML has been to its foundation, and it will be the most common tool for all data manipulation and data transmission.

**** http://www.shocknet.org.uk/defpage.asp?pageID=92

XML Problems

The following article comes from suggestions by several developers. The idea is to illustrate some 'common' scenarios/problems using XML documents and to show and demonstrate a workable solution. The examples are embedded in this document and the source is available to download. Note that these examples use the XML parser scripts written by Andy White, with the extensions that are available from this site. Downloading the source will ensure that you have the most up to date version of the XML parser scripts.

Problem 1

Stated simply, the problem is: "How do I find the names of all students who have taken a particular class and received a mark of 65 or more?"

It's not an easy problem, though it's not difficult to produce a solution. The hard (old) way would be to parse each node individually, trying to maintain context while cycling back and forth through the document. The easier way is to break the problem into its parts:

Get the names of students - this is the end result
Taken a particular class
Scored 65 or better

So there are, possibly, three steps that we need to complete to accomplish our goal.

The XML document has this format:
<STUDENT>
     <FNAME>Alexis</FNAME>
     <LNAME>Sinclair</LNAME>
     <COURSE>
          <MARK>23</MARK>
          <SECTION>01</SECTION>
          <COURSENAME>ENG101</COURSENAME>
     </COURSE>
</STUDENT>

As with any problem, it's a case of knowing which functions should be used and from which end of the problem we should start.

Forward Traversal or Document Order

Let's examine searching for our goal using document order. First of all we start by getting the student nodes using:
students = obj.getElementByName("STUDENT")

This will return a list of STUDENT nodes. We can then inspect each student node in turn:
repeat with studentNode in students
  -- get the course, then check whether it is the course we want
  theCourse = studentNode.getElementByName("COURSENAME")
  if theCourse.getText() = courseIn then
    -- the course matches, so now we need to check the mark
    theMark = studentNode.getElementByName("MARK")
    if value(theMark.getText()) >= markIn then
      -- the mark is in the correct range,
      -- and therefore this is a student whose name we want to show
      studentName = studentNode.getElementByName("FNAME").getText()
      studentName = studentName && studentNode.getElementByName("LNAME").getText()
    end if
  end if
end repeat

Of course the above example leaves a lot of questions unanswered. There's no error checking; assumptions have been made in regard to variables and types. But, I think you get the idea.
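For readers who want something they can run, the document-order approach can be sketched in Python with the standard library's xml.etree.ElementTree module. This is not the Lingo parser used in the article. The STUDENTS root element, the second student record and the function name students_forward are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in the article's STUDENT format,
# wrapped in an assumed <STUDENTS> root element.
STUDENTS_XML = """<STUDENTS>
<STUDENT>
  <FNAME>Alexis</FNAME><LNAME>Sinclair</LNAME>
  <COURSE><MARK>23</MARK><SECTION>01</SECTION><COURSENAME>ENG101</COURSENAME></COURSE>
</STUDENT>
<STUDENT>
  <FNAME>Jo</FNAME><LNAME>Park</LNAME>
  <COURSE><MARK>71</MARK><SECTION>01</SECTION><COURSENAME>ENG101</COURSENAME></COURSE>
  <COURSE><MARK>90</MARK><SECTION>02</SECTION><COURSENAME>CHE302</COURSENAME></COURSE>
</STUDENT>
</STUDENTS>"""


def students_forward(root, course_in, mark_in):
    """Walk STUDENT -> COURSE in document order and collect matching names."""
    names = []
    for student in root.iter("STUDENT"):
        for course in student.iter("COURSE"):
            if course.findtext("COURSENAME") != course_in:
                continue  # not the course we want
            if int(course.findtext("MARK")) >= mark_in:
                names.append(f"{student.findtext('FNAME')} {student.findtext('LNAME')}")
                break  # list each student once, even with several sections
    return names


root = ET.fromstring(STUDENTS_XML)
print(students_forward(root, "ENG101", 65))
```

Checking MARK inside the matching COURSE element, rather than anywhere under the student, keeps the mark tied to the right course when a student has taken several.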

Reverse Traversal

Start by getting the Course name:

courses = obj.getElementByName("COURSENAME")

We can now inspect the course nodes
if course.getText() = theCourse then
-- we have a valid course

Next we need to check the mark. COURSENAME and MARK share the same parent, COURSE, so it makes sense to access the parent to find the MARK.
theParent = course.parent
mark = theParent.getElementByName("MARK")
if value(mark.getText()) >= theMark then
-- we have a valid mark, so we'll get the parent of COURSE and get the student name

courseParent = theParent.parent
studentName = courseParent.getElementByName("FNAME").getText()
studentName = studentName && courseParent.getElementByName("LNAME").getText()

The method to use is entirely up to you, though the reverse traversal technique is shorter and requires fewer lookups.
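The reverse traversal can be sketched the same way in Python with xml.etree.ElementTree. One difference from the parser scripts described here: ElementTree elements have no parent property, so this sketch first builds a child-to-parent lookup. The STUDENTS root element, the sample records and the names students_reverse and parent_of are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in the article's STUDENT format.
STUDENTS_XML = """<STUDENTS>
<STUDENT>
  <FNAME>Alexis</FNAME><LNAME>Sinclair</LNAME>
  <COURSE><MARK>23</MARK><SECTION>01</SECTION><COURSENAME>ENG101</COURSENAME></COURSE>
</STUDENT>
<STUDENT>
  <FNAME>Jo</FNAME><LNAME>Park</LNAME>
  <COURSE><MARK>71</MARK><SECTION>01</SECTION><COURSENAME>ENG101</COURSENAME></COURSE>
</STUDENT>
</STUDENTS>"""


def students_reverse(root, course_in, mark_in):
    """Start at COURSENAME nodes and climb to COURSE, then to STUDENT."""
    # Map every element to its parent, since ElementTree does not store it.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    names = []
    for course_name in root.iter("COURSENAME"):
        if course_name.text != course_in:
            continue
        course = parent_of[course_name]      # the enclosing COURSE
        if int(course.findtext("MARK")) >= mark_in:
            student = parent_of[course]      # the enclosing STUDENT
            names.append(f"{student.findtext('FNAME')} {student.findtext('LNAME')}")
    return names


root = ET.fromstring(STUDENTS_XML)
print(students_reverse(root, "ENG101", 65))
```

With a parser that exposes the parent directly, as in the Lingo snippets above, the lookup table is unnecessary.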

The Function Definition

The function should return a list of student names and takes two arguments: a course name and a mark. Course names are in the format ENG101, CHE302, etc., and the mark is an integer. Because XML is text based, we either have to convert the mark to a string or convert the MARK element text to an integer. I prefer the second approach, as it also ensures that the element value is a number.
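A short Python sketch shows why converting the element text to an integer is the safer of the two options. The helper name mark_at_least is an assumption made for this example, as is the choice to treat non-numeric text as a failed match.

```python
def mark_at_least(mark_text: str, threshold: int) -> bool:
    """Compare a MARK element's text with an integer threshold."""
    try:
        return int(mark_text.strip()) >= threshold
    except ValueError:
        # Non-numeric text such as "N/A" never satisfies the threshold.
        return False


# Compared as strings, "9" sorts after "65", which is the wrong answer.
print("9" >= "65")             # True
# Compared as integers, 9 is correctly below 65.
print(mark_at_least("9", 65))  # False
```

The conversion doubles as validation: a MARK that does not contain a number is rejected instead of being compared character by character.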

Problem 2

This next problem was sent to me by email, and the developer suggested I use it as the basis of an article. Simply stated, the problem is: "How do I retrieve all restaurants that accept credit cards in neighborhoods where none of the banks are open on weekends?"

This is not an easy problem because the XML document has multiple purposes. It contains neighborhoods that hold both banks and restaurants. These elements have no direct relationship other than being in the same neighborhood. Originally the XML document had this format:

<neighborhood>
      <name>
      <banks>
            <bank>
                  <name>Fleet Bank</name>
                  <atm>Yes</atm>
                  <weekend_hours>No</weekend_hours>
            </bank>
      </banks>
      <street>
            <name>Hanover Street</name>
            <restaurant>
                  <name>Il Panino</name>
                  <accept_credit_card>No</accept_credit_card>
            </restaurant>
      </street>
      </name>
</neighborhood>

The problem with this format is that there are too many name elements. It would be difficult to find the name of a restaurant, by using getElementByName("name"), because the element "name" is also present in banks, street and neighborhood. The first thing would be to restructure the document and to use attributes wherever possible. Here's the new format:
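The ambiguity is easy to reproduce. The following Python sketch uses the standard library's xml.etree.ElementTree module as a stand-in for the parser scripts and runs a document-wide search for "name" on the original format. The path-qualified search at the end is an ElementTree feature; whether the Lingo parser offers something similar is not known here.

```python
import xml.etree.ElementTree as ET

# The original format from above, compacted.
ORIGINAL_XML = """<neighborhood>
  <name>
    <banks>
      <bank><name>Fleet Bank</name><atm>Yes</atm><weekend_hours>No</weekend_hours></bank>
    </banks>
    <street>
      <name>Hanover Street</name>
      <restaurant><name>Il Panino</name><accept_credit_card>No</accept_credit_card></restaurant>
    </street>
  </name>
</neighborhood>"""

root = ET.fromstring(ORIGINAL_XML)

# A search by tag name alone returns every kind of name, in document order.
all_names = [(node.text or "").strip() for node in root.iter("name")]
print(all_names)

# Qualifying the search with the parent element keeps only restaurant names.
restaurant_names = [node.text for node in root.findall(".//restaurant/name")]
print(restaurant_names)
```

The first list mixes the neighborhood, bank, street and restaurant names, which is the confusion the restructured format is meant to avoid.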

<neighborhood name="North End">
      <banks>
            <bank name="Fleet Bank" atm="Yes" weekend_hours="No"/>
            <bank name="Citizens Bank" atm="No" weekend_hours="No"/>
      </banks>
      <street name="Hanover Street">
            <restaurant>
                  <name>Il Panino</name>
                  <accept_credit_card>No</accept_credit_card>
            </restaurant>
      </street>
</neighborhood>

The Task At Hand

Retrieve all restaurants
restaurants = obj.getElementByName("restaurant")

Which of these accept credit cards?
-- get a restaurant node
restaurant = restaurants[1]
if restaurant.getElementByName("accept_credit_card").getText() = "Yes" then
-- we have a match
-- store it in a list

Getting the neighborhood
Restaurants are children of a street element, and streets are children of a neighborhood element. To find which neighborhood a restaurant belongs to, we access the parent of the parent of the restaurant node, e.g.
neighborhood = restaurant.parent.parent

Find which banks are in the neighborhood
banks = neighborhood.getElementByName("bank")

Find out which banks are not open on weekends
The bank element only has attributes. To access an attribute you need to inspect the node's attributeValue.some_attribute_name, e.g.
notOpen = bank.attributeValue.weekend_hours
if notOpen = "No" then
-- the bank is not open on the weekend

Does the bank have an ATM?
hasATM = bank.attributeValue.atm
if hasATM = "Yes" then
-- has an atm, store it

The Function Definition

The function takes no arguments and returns nothing. It is a fairly rigid implementation because it builds an output string and assigns it directly to the member.
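The steps under "The Task At Hand" can be combined into one runnable sketch in Python with xml.etree.ElementTree. Unlike the function described above, this version returns a list instead of writing to a member. The city root element, the second neighborhood, the extra restaurant and the function name are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The restructured format, with hypothetical extra data for illustration.
CITY_XML = """<city>
<neighborhood name="North End">
  <banks>
    <bank name="Fleet Bank" atm="Yes" weekend_hours="No"/>
    <bank name="Citizens Bank" atm="No" weekend_hours="No"/>
  </banks>
  <street name="Hanover Street">
    <restaurant><name>Il Panino</name><accept_credit_card>No</accept_credit_card></restaurant>
    <restaurant><name>Caffe Example</name><accept_credit_card>Yes</accept_credit_card></restaurant>
  </street>
</neighborhood>
<neighborhood name="Sample Square">
  <banks>
    <bank name="Sample Bank" atm="Yes" weekend_hours="Yes"/>
  </banks>
  <street name="Sample Street">
    <restaurant><name>Sample Diner</name><accept_credit_card>Yes</accept_credit_card></restaurant>
  </street>
</neighborhood>
</city>"""


def card_restaurants_without_weekend_banks(root):
    """Return (neighborhood, restaurant) pairs that satisfy Problem 2."""
    results = []
    for hood in root.iter("neighborhood"):
        # "None of the banks are open on weekends" means every bank says "No".
        # all() is also True when a neighborhood has no banks at all.
        if not all(bank.get("weekend_hours") == "No" for bank in hood.iter("bank")):
            continue
        for restaurant in hood.iter("restaurant"):
            if restaurant.findtext("accept_credit_card") == "Yes":
                results.append((hood.get("name"), restaurant.findtext("name")))
    return results


root = ET.fromstring(CITY_XML)
print(card_restaurants_without_weekend_banks(root))
```

Starting from the neighborhood rather than from each restaurant means the bank check runs once per neighborhood instead of once per restaurant.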

Conclusion

The above examples illustrate how we can mine an XML document to gather information. Though the example code is not thoroughly optimised, it is a small step for a Lingo programmer to adapt it and make the necessary changes. Remember that the above movies are evaluated in real time. There are no lookup tables or prepped lists. Each movie reads the raw XML, parses it, searches for a solution to the problem and then displays the result.

Resources

Download the latest extensions to the parser scripts version 1.0.5
Check out DOM-Lingo
Check out MTML by Mark Brownell
