MANAGE DATA VALIDATION WITH XML SCHEMAS

The Document Type Definition (DTD) is the de facto standard mechanism for defining and validating XML documents. Unfortunately, DTDs don't provide a robust enough set of features to fulfill the needs of many XML applications. To fill this gap, the World Wide Web Consortium (W3C) developed the XML Schema specification. We'll examine what an XML Schema is and how you can easily create one using Altova's XML Spy tool.

XML SCHEMAS

XML Schemas are generally preferred over DTDs when validation must be stricter than a DTD can express. DTDs describe valid XML documents in terms of their structure and, to a limited extent, their content. Another shortcoming is that a DTD is written in its own specialized grammar rather than in XML. While the DTD grammar is not rocket science, XML Schemas improve the process by using XML itself to describe XML document types.

XML Schemas allow you to specify many things that DTDs cannot. For example, a DTD offers only a few ways to indicate how many times an element may appear: exactly once, optionally (?), zero or more times (*), or one or more times (+). A DTD has no occurrence counter, so it cannot state directly that an element should appear exactly four times. Using XML Schema, you can require an element to occur at least three times but not more than eight times. For example, you may want to support an address field that has at least one AddressLine element but not more than three (because your legacy database stores no more than three). XML Schema can express this kind of constraint.

XML SPY

XML Spy is a powerful tool used to create and modify XML documents, DTDs, XML Schemas, and even SOAP transactions. Among the most popular features of XML Spy are its ability to create DTDs and Schemas from XML documents and its ability to convert from DTD to Schema and vice versa.
Because XML Spy does all of the work, it's easy to create a new Schema from an existing XML document.

THE XML DOCUMENT

To begin, you will need an XML document. For our example, we'll use a simple document that includes a customer name, account number, and address:

123456
900 N. Michigan Ave.
Suite 2599
Chicago
IL
60612
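
To make the AddressLine constraint concrete outside of XML Spy, here is a minimal Python sketch using only the standard library. It rests on several assumptions: the article names only the Address and AddressLine elements, so Customer, AccountNumber, City, State, and Zip are hypothetical names chosen for illustration, and the customer name is left out because its value is not shown above. The schema fragment is hand-written, not XML Spy output, and the helper functions (occurrence_bounds and address_lines_ok) are illustrative names that check only occurrence counts. This is not a full XML Schema validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # W3C XML Schema namespace

# Hand-written schema fragment (an assumption, not XML Spy output).
# City, State, and Zip are hypothetical element names.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="AddressLine" type="xs:string"
                    minOccurs="1" maxOccurs="3"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="State" type="xs:string"/>
        <xs:element name="Zip" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Sample values from the article wrapped in hypothetical markup.
SAMPLE = """\
<Customer>
  <AccountNumber>123456</AccountNumber>
  <Address>
    <AddressLine>900 N. Michigan Ave.</AddressLine>
    <AddressLine>Suite 2599</AddressLine>
    <City>Chicago</City>
    <State>IL</State>
    <Zip>60612</Zip>
  </Address>
</Customer>
"""


def occurrence_bounds(schema_text, element_name):
    """Return (minOccurs, maxOccurs) for the named element declaration.

    Both attributes default to 1 in XML Schema; maxOccurs="unbounded"
    is returned as None.
    """
    root = ET.fromstring(schema_text)
    for decl in root.iter(XS + "element"):
        if decl.get("name") == element_name:
            low = int(decl.get("minOccurs", "1"))
            high = decl.get("maxOccurs", "1")
            return low, (None if high == "unbounded" else int(high))
    raise KeyError(element_name)


def address_lines_ok(document_text, schema_text=SCHEMA):
    """Check that every Address has an allowed number of AddressLine children.

    A document with no Address element at all passes this check.
    """
    low, high = occurrence_bounds(schema_text, "AddressLine")
    root = ET.fromstring(document_text)
    for address in root.iter("Address"):
        count = len(address.findall("AddressLine"))
        if count < low or (high is not None and count > high):
            return False
    return True
```

Calling address_lines_ok(SAMPLE) returns True for the two-line address; an Address with four AddressLine elements, or with none, returns False. A DTD could only express "one or more" here, which is why the upper bound of three needs XML Schema.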

We'll first open this document in XML Spy. You can easily navigate the XML data using XML Spy's Enhanced Grid View.

CREATING THE DTD

Now that we have our XML document, we can extract the information needed to create a DTD. The generated DTD may not contain all of the information for our document, but it saves us the time of creating a DTD from scratch (and by hand).

To create a DTD from the example document, select Generate DTD/Schema... from the DTD/Schema menu. The default setting is to output to a DTD file format. When you click OK, XML Spy will create the new DTD and assign it as the appropriate DTD for the example XML document. You can navigate and modify the new DTD from within the XML Spy environment.

CREATING THE SCHEMA

Creating a new Schema is similar to creating a new DTD. Now that you have both an XML document and a DTD, you have two options for how XML Spy can create a new XML Schema. First, you can generate a new schema from the XML document, much as you generated the DTD. Alternatively, you can use XML Spy to convert the DTD you just created into a new XML Schema file.

To convert the DTD into an XML Schema, select the DTD file in XML Spy. Next, choose Convert DTD/Schema... from the DTD/Schema menu. You'll notice a dialog box similar to the one used to create the new DTD. The default selection is to convert to a DTD. Since you already have a DTD, this is not what you want. Instead, select W3C Schema as the DTD/Schema file format. The last two sections of the dialog box then become enabled, allowing you to indicate how to convert complex elements and how to handle global and local definitions. Click OK; XML Spy will convert the existing DTD to a new XML Schema and display it in the XML Spy environment.

With the new XML Schema in view, click the tree icon to the left of the Address element. You should now see a graphical view of the XML tree that shows the Address element and its child elements. Click on the AddressLine element.
In the Details panel, you will see the properties of the AddressLine element. Two of them, minOcc and maxOcc, control the minimum and maximum number of occurrences of the selected element; they correspond to the minOccurs and maxOccurs attributes of XML Schema. Set the minOcc value to 1 and the maxOcc value to 3. Your schema will now accept only XML documents in which the Address element contains at least one and at most three AddressLine elements.

Brian Schaffner is a senior consultant for Fujitsu Consulting. He provides architecture, design, and development support for Fujitsu's Telcom360 group.