Builder http://builder.com.com
Presents your XML E-NEWSLETTER for August 14, 2002
<------------------------------------------->

CONTROLLING THE PHP XML PARSER WITH XML PROCESSING INSTRUCTIONS

PHP is a popular Web application tool, and XML is a popular Web data format. Together, they provide a rich set of functionality for building dynamic Internet applications. With XML, and particularly with XML parsers, you may need to adjust some of the default behavior.

CASE FOLDING IN PHP

In PHP's XML parsing, one setting you may need to adjust is case folding. With case folding enabled, the XML parser translates all element names to uppercase before passing them to the element handler functions. This may not matter to your application, or it may be a serious problem, depending on how you handle your element names (for example, if your handlers compare them against mixed-case strings). Fortunately, PHP lets you turn this feature on or off. Because the switch isn't really part of your XML data, it's better expressed as an XML processing instruction.

XML PROCESSING INSTRUCTIONS

A processing instruction (PI) is a piece of the XML document that provides information for the XML parser. In other words, it contains instructions about how to parse the data, rather than actually containing any data. You can use PIs to alter the way the parser works. For example, you might want to inform the parser that case folding should be turned off.

XML processing instructions have a special format within the XML document. Here's a familiar processing instruction:

<?xml version="1.0"?>

This PI tells the parser that this is an XML document and that it conforms to the XML 1.0 standard. (Strictly speaking, the XML specification calls this particular line the XML declaration rather than a processing instruction, but it uses the same syntax.) As you can see, the PI starts with the characters <?, ends with ?>, and contains two space-separated pieces of data. The first piece of data is called the target, the second the value.

What if we want to use a processing instruction that describes whether case folding should be turned on or off?
We could use something like this:

<?casefolding on?>

This creates a new PI with a target of "casefolding" and a value of "on".

A SIMPLE EXERCISE

Now that we understand what a PI is, let's build a scenario that uses it. We'll start with a simple HTML form that will let us type in (or cut and paste) an XML document, as shown in Listing 1.

Listing 1: xmltest.html

PHP XML Parser Test
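The form only has to send a field named xml to test.php, since that is what the script reads from $_REQUEST["xml"]. The following is a minimal sketch of such a page, written as a small PHP helper so it can be checked from the command line. It is not the original listing: the function name render_form, the use of POST, the textarea size, and the sample document are all assumptions, with the sample inferred from the element names in the output shown under RUNNING THE EXAMPLE.

```php
<?php
// Hypothetical helper (not from the article): builds a page that offers
// what test.php needs, namely a form posting a field called "xml".
function render_form(string $sampleXml): string
{
    // Escape the sample so its angle brackets appear as text in the textarea.
    $escaped = htmlspecialchars($sampleXml, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<html>
<head><title>PHP XML Parser Test</title></head>
<body>
<form action="test.php" method="post">
<textarea name="xml" rows="10" cols="60">$escaped</textarea><br>
<input type="submit" value="Parse">
</form>
</body>
</html>
HTML;
}

// Assumed sample document: a casefolding PI followed by Foo and Bar elements.
$sample = "<?xml version=\"1.0\"?>\n<?casefolding on?>\n<Foo>\n  <Bar/>\n  <Bar/>\n</Foo>";

echo render_form($sample);
```

Saving the printed markup as xmltest.html next to test.php should be enough to try the exercise; changing "on" to "off" in the textarea switches the setting being tested.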
Next, we'll need the test.php document referenced by our form. Listing 2 shows the test.php script.

Listing 2: test.php

<html>
<head><title>XML Parser Output</title></head>
<body>
<?php
// Called for each start tag; $attribs holds the tag's attributes (unused here).
function start_handler($parser, $elementName, $attribs) {
    echo "Got element $elementName<br>\n";
}

// Called for each end tag.
function end_handler($parser, $elementName) {
    echo "Done with element $elementName<br>\n";
}

// Called for each processing instruction; toggles case folding
// when the target is "casefolding".
function pi_handler($parser, $target, $data) {
    if ($target == "casefolding") {
        if ($data == "on") {
            xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 1);
        } else {
            xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        }
    }
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "start_handler", "end_handler");
xml_set_processing_instruction_handler($parser, "pi_handler");
xml_parse($parser, trim($_REQUEST["xml"]), true);
?>
</body>
</html>

In the PHP script, there are only a few things happening. Once you skip past the function definitions, there are only four lines. First, we create a new XML parser object. Then, we set the element start and end handlers (which get called when the start and end tags for each element are encountered). Next, we set the handler for our processing instructions. Finally, we parse the document, which in turn calls our handlers.

In the sample document, we have set casefolding on. When you submit the form to the test.php script, the parser will start processing the XML data. Upon parsing the processing instruction, the parser will call the pi_handler function we've defined and pass it the target and value contained in the processing instruction. Our pi_handler function examines the PI target to determine whether it concerns the casefolding feature, then turns the feature on or off depending on the value of the processing instruction.

RUNNING THE EXAMPLE

If you run the example as-is, you'll get output similar to the following:

Got element FOO
Got element BAR
Done with element BAR
Got element BAR
Done with element BAR
Done with element FOO

As you can see, with casefolding turned on (which is the default setting), all of the element names are converted to uppercase.
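To confirm the default without going through the form, the same kind of document can be parsed from the command line, once with the parser left at its default and once with case folding switched off. This is a minimal sketch rather than part of the original listings: the helper name collect_element_names and the sample document are assumptions, and the option is set before parsing instead of from a PI handler.

```php
<?php
// Hypothetical helper (not from the article): parses $xml and returns the
// element names exactly as the start handler receives them.
function collect_element_names(string $xml, ?bool $caseFolding = null): array
{
    $names = [];
    $parser = xml_parser_create();

    // null means "leave the parser's default setting untouched".
    if ($caseFolding !== null) {
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);
    }

    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attribs) use (&$names) {
            $names[] = $name; // record the name as delivered by the parser
        },
        function ($parser, $name) {
            // End tags are not needed for this check.
        }
    );

    if (xml_parse($parser, $xml, true) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }

    return $names;
}

// Assumed sample document with mixed-case element names.
$doc = '<Foo><Bar/><Bar/></Foo>';

echo implode(' ', collect_element_names($doc)), "\n";        // prints "FOO BAR BAR"
echo implode(' ', collect_element_names($doc, false)), "\n"; // prints "Foo Bar Bar"
```

The first line reflects the default (folding on), and the second shows the names as written once the option is set to 0, which is the same switch pi_handler flips in Listing 2.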
If you go back and change the processing instruction so that casefolding is turned off, you should get output like this:

Got element Foo
Got element Bar
Done with element Bar
Got element Bar
Done with element Bar
Done with element Foo

This time, the element names are passed to the handlers exactly as they appear in the XML document, without any case folding.

Brian Schaffner is a senior consultant for Fujitsu Consulting. He provides architecture, design, and development support for Fujitsu's Telcom360 group.

----------------------------------------