Presents your XML E-NEWSLETTER for January 15, 2003
<------------------------------------------->

ADD COMPLEX DATA TO XML DOCUMENTS WITH VISUAL BASIC

One of the best and most interesting aspects of XML is that all documents are based on human-readable text. This is generally good because it means that the average person can examine an XML document and understand the data without too much effort. The downside is that certain kinds of data become more difficult to communicate. Most of the time, you will not have to worry about complex data within your XML documents; however, one day you may run into a situation where you have to store binary or other complex data structures in XML.

COMPLEX DATA?

Complex data is a rather ambiguous term. For our purposes, we'll use it to mean data that is somehow difficult to represent in "normal" XML. Normal XML is just basic XML that uses element (or tag) names, some attributes, and, of course, the data. Here's an example of the elements in a "normal" XML document:

<Name>John Doe</Name>
<Age>33</Age>

This example is both trivial and mundane. To make it more interesting, let's add another element called Photo. It's no longer trivial to just add the photo data within the tags. Now you have some idea of what we mean by complex data. There are many types of complex data, including graphic files, audio files, movie and Flash files, application files, databases, and archives (such as ZIPs and JARs).

ONE APPROACH

The usual approach to storing complex data in XML documents is to simply drop the data in an element and wrap it in the ubiquitous CDATA section. A CDATA section tells the XML parser not to interpret the enclosed text as markup. There are many problems with using CDATA in this way. First, not all parsers can successfully process binary data in CDATA sections. Second, storing raw binary data like this can wreak havoc. For example, NULL characters are not legal XML characters and may cause the file to be unreadable by some parsers.
Finally, it means that it's going to be more difficult for you to look at the document.

A BETTER SOLUTION

Rather than using brute force to get a CDATA solution, there's a really easy and somewhat elegant solution: make the binary data textual. You can do this through various processes; however, the most popular is a format called base64 encoding. Essentially, this encoding translates every 3 bytes of binary data into 4 characters of text through a well-defined algorithm, so the encoded data is about one-third larger than the original. What's even better is that the Microsoft XML parser has the capability to do this translation for you.

SETTING THE ELEMENT DATA TYPE

In LISTING 1, we illustrate how to accomplish this technique using the Microsoft XML parser and Visual Basic. The basic setup should look familiar to those who have used this parser before. In order to use the built-in base64 algorithm, you have to do two things. First, you must define the data type of the element. As you can see in the code, we have defined the data type for the Photo element to be "bin.base64". Then, you must assign a new value to the element's nodeTypedValue property. Because the data type is set to base64, when we assign a new value to the nodeTypedValue property, the XML parser automatically converts the binary data of the bitmap image to a base64-encoded representation.
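
The two steps just described (set dataType, then assign nodeTypedValue) can be tried in isolation on a few bytes. The following is only a minimal sketch, assuming a project reference to Microsoft XML, v4.0. The names ReadFileBytes, EncodeBase64, DecodeBase64, and DemoBase64 are hypothetical and do not come from the original article. ReadFileBytes shows one possible way to get a file's contents as a Byte array, which is the kind of binary value nodeTypedValue accepts.

```vb
' Sketch only. Assumes a project reference to Microsoft XML, v4.0.
' All procedure names below are hypothetical, chosen for illustration.

' Read a whole file into a Byte array.
Function ReadFileBytes(ByVal path As String) As Byte()
    Dim f As Integer
    Dim data() As Byte
    f = FreeFile
    Open path For Binary Access Read As #f
    If LOF(f) > 0 Then
        ReDim data(0 To LOF(f) - 1)
        Get #f, , data              ' fills the array from the file
    End If
    Close #f
    ReadFileBytes = data
End Function

' Binary to text: type a scratch element as bin.base64, assign the
' bytes to nodeTypedValue, and read the encoded text back.
Function EncodeBase64(ByRef data() As Byte) As String
    Dim doc As New MSXML2.DOMDocument40
    Dim elem As IXMLDOMElement
    Set elem = doc.createElement("b64")
    elem.dataType = "bin.base64"
    elem.nodeTypedValue = data
    EncodeBase64 = elem.Text
End Function

' Text to binary: the reverse direction, using the same data type.
Function DecodeBase64(ByVal encoded As String) As Byte()
    Dim doc As New MSXML2.DOMDocument40
    Dim elem As IXMLDOMElement
    Set elem = doc.createElement("b64")
    elem.dataType = "bin.base64"
    elem.Text = encoded
    DecodeBase64 = elem.nodeTypedValue
End Function

Sub DemoBase64()
    Dim raw() As Byte
    Dim encoded As String
    Dim decoded() As Byte

    ' Three ANSI bytes: 77, 97, 110 ("Man").
    raw = StrConv("Man", vbFromUnicode)

    ' The standard base64 form of these three bytes is the
    ' four characters TWFu.
    encoded = EncodeBase64(raw)
    Debug.Print encoded

    ' Decoding returns the original three bytes.
    decoded = DecodeBase64(encoded)
    Debug.Print UBound(decoded) - LBound(decoded) + 1
End Sub
```

Running DemoBase64 from the Immediate window is a quick way to see the 3-bytes-to-4-characters expansion before working with a full image file. For long inputs the parser may break the encoded text across several lines, so compare decoded bytes rather than encoded strings when checking a round trip.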
The base64 data is completely text-based, so it sits nicely within the XML document:

Listing 1: Base64 encoding an image

' Requires a project reference to Microsoft XML, v4.0.
Dim mydoc As New MSXML2.DOMDocument40
Dim docroot As IXMLDOMElement
Dim newElem As IXMLDOMElement

' Load the markup for the document's root element.
mydoc.loadXML ("")
Set docroot = mydoc.documentElement

' Ordinary text elements.
Set newElem = mydoc.createElement("Name")
newElem.Text = "John Doe"
docroot.appendChild newElem

Set newElem = mydoc.createElement("Age")
newElem.Text = "33"
docroot.appendChild newElem

' Binary element: set the data type first, then assign the binary value.
' getfile is a helper (not shown) that returns the file's binary contents.
Set newElem = mydoc.createElement("Photo")
newElem.dataType = "bin.base64"
newElem.nodeTypedValue = getfile("C:\photo.bmp")
docroot.appendChild newElem

' Display the resulting XML.
MsgBox docroot.xml

Brian Schaffner is a senior consultant for Fujitsu Consulting. He provides architecture, design, and development support for Fujitsu's Telcom360 group.
----------------------------------------