Using the XML Parser
IDL's XML parser object class (IDLffXMLSAX) implements a SAX 2 event-based parser. The object's methods are a set of callback routines that are called automatically when the parser encounters different constituents of an XML document. For example, when the parser encounters the beginning of an XML element, it calls the StartElement method. When the StartElement method returns, the parser continues.
The IDLffXMLSAX object's methods are completely generic. As provided, they do nothing with the items encountered in the XML file. To use the parser object to read data from an XML file, you must write a subclass of the IDLffXMLSAX class, overriding the superclass's methods to accomplish your objectives. This requirement that you subclass the object makes the IDLffXMLSAX class unlike any other object class supplied by IDL.
For a detailed discussion of IDL object classes, subclassing, and method overriding, see Creating Custom Objects in IDL (Object Programming). For a description of the parser object class and its methods, see "IDLffXMLSAX" (IDL Reference Guide).
Subclassing the IDLffXMLSAX Object Class
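Writing a subclass of the IDLffXMLSAX object class is similar to writing a subclass of any of IDL's other object classes. The basic steps are:
- Define a class structure.
- Override superclass methods.
- Write additional methods.
- Create a class definition routine.
Let's look at these steps individually.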
Define a Class Structure
Every object class has a unique class structure that defines the instance data contained in the object. (See Creating an Object Class Structure (Object Programming) for details.) When writing your own parser object (a subclass of the IDLffXMLSAX object), you must first determine what instance data you need your parser object to contain, and define a class structure accordingly.
Note
Your parser object's class structure must inherit from the IDLffXMLSAX class structure. See Inheritance (Object Programming) for details.
For example, suppose you want to use your parser to extract an array of data from an XML file. You might define your class structure so that, in addition to inheriting from IDLffXMLSAX, it includes an IDL pointer field (here named ptr) that will contain the data array.
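A minimal sketch of such a class structure, assuming the subclass is named myParser (the name used later in this section) and the pointer field is named ptr:

```idl
; Class structure for a hypothetical parser class named myParser.
; INHERITS pulls in the IDLffXMLSAX instance data; ptr starts as a
; null pointer and is meant to hold the extracted data array.
void = {myParser, INHERITS IDLffXMLSAX, ptr: PTR_NEW()}
```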
Within your subclass's methods, this data structure will always be available via the implicit self argument (see Creating Custom Object Method Routines (Object Programming) for details). Setting the value of self.ptr within a method routine sets the instance data of the object.
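As an illustration of setting instance data through self, the sketch below overrides Init to allocate the pointer and Cleanup to release it. It assumes a subclass named myParser with a pointer field named ptr; the _EXTRA pass-through is an added convenience so that superclass properties can still be set at creation time.

```idl
FUNCTION myParser::Init, _EXTRA=extra
   ; Initialize the superclass first; give up if that fails
   IF (self->IDLffXMLSAX::Init(_EXTRA=extra) EQ 0) THEN RETURN, 0
   ; Allocate an empty heap variable that later methods can fill
   self.ptr = PTR_NEW(/ALLOCATE_HEAP)
   RETURN, 1
END

PRO myParser::Cleanup
   ; Release the heap variable, then let the superclass clean up
   IF PTR_VALID(self.ptr) THEN PTR_FREE, self.ptr
   self->IDLffXMLSAX::Cleanup
END
```

With the pointer allocated in Init, any other method can store data with a statement such as *self.ptr = someArray.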
In most cases, your class structure definition will be placed in a class definition routine that IDL locates and runs automatically the first time the class is used (see Automatic Class Structure Definition (Object Programming) for details).
Override Superclass Methods
For your XML parser to do any work, you must override the generic methods of the IDLffXMLSAX object class. Overriding a method is as simple as defining a method routine with the same name as the superclass's method. When your parser encounters an item in the parsed XML file that triggers one of the IDLffXMLSAX methods, it will look first for a method of the same name in the definition of your subclass of the IDLffXMLSAX object class. See Method Overriding (Object Programming) for details.
For example, suppose you want your parser to print the name of each XML element it encounters to IDL's output. You could do this by overriding the StartElement method of the IDLffXMLSAX class in your subclass.
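A minimal sketch of such an override, assuming a subclass named myParser. The parameter names are illustrative, and the two attribute parameters are included on the assumption that the parser may supply them; check the IDLffXMLSAX::StartElement entry in the reference guide for the exact signature.

```idl
PRO myParser::StartElement, URI, Local, qName, attName, attValue
   ; qName is the element name as it appears in the XML file
   PRINT, qName
END
```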
Note
The new method must take the same parameters as the overridden method.
When your parser encounters the beginning of an XML element, it will look for a method named StartElement and call that method with the parameters specified for the IDLffXMLSAX::StartElement method. Since your subclass's StartElement method is found before the superclass's StartElement method, your method is used.
Note
You do not necessarily need to override all of the IDLffXMLSAX object methods. Depending on your application, it may be sufficient to override four or five of the superclass's methods. See the parser definitions later in this chapter for examples.
Overriding the IDLffXMLSAX methods is the heart of writing your own XML parser. To write an efficient parser, you will need detailed knowledge of the structure of the XML file you want to parse.
See Example: Reading Data Into an Array and Example: Reading Data Into Structures for examples of how to work with parsed XML data and return the data in IDL variables.
Write Additional Methods
Depending on your application, you may need to write additional object methods to work with the instance data retrieved from the parsed XML file. Like the overridden object methods, any new methods you write have access to the object's instance data via the implicit self parameter.
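As a sketch of an additional method, the hypothetical accessor below returns the array stored in the object's instance data. It assumes a subclass named myParser whose class structure has a pointer field named ptr; the method name GetArray and the -1 "no data" return value are choices made for this illustration.

```idl
FUNCTION myParser::GetArray
   ; Return -1 if the pointer was never allocated or holds no data
   IF (PTR_VALID(self.ptr) EQ 0) THEN RETURN, -1
   IF (N_ELEMENTS(*self.ptr) EQ 0) THEN RETURN, -1
   ; Otherwise return a copy of the stored array
   RETURN, *self.ptr
END
```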
Create a Class Definition Routine
If you combine your class definition routine with your class's method routines in a file, you can use IDL's Automatic Structure Definition feature to automatically compile the class routines when an instance of your class is created via the OBJ_NEW function. Keep the following in mind when creating the .pro file that will contain the definition of your class structure and method routines:
- The routine that creates your class structure should be named with the characters "__define" appended to the end of the class name. For example, if your parser object class is named "myParser" and its class structure is the one described in Define a Class Structure, the routine would be named myParser__define.
- The .pro file should be named after the class structure definition routine. In this case, the name would be myParser__define.pro.
- The class structure definition routine should be the last routine in the .pro file.
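Following these guidelines, a file named myParser__define.pro might be laid out as in the sketch below. The StartElement method is only a placeholder for whatever methods your class defines, and its parameter names are illustrative.

```idl
; myParser__define.pro

; Method routines come first
PRO myParser::StartElement, URI, Local, qName, attName, attValue
   PRINT, qName
END

; The class structure definition routine is the last routine in the file
PRO myParser__define
   void = {myParser, INHERITS IDLffXMLSAX, ptr: PTR_NEW()}
END
```

Because the definition routine comes last, IDL has already compiled the method routines above it by the time it finds myParser__define.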
Using Your Parser
Once you have written the class definition routine for your parser, you are ready to parse an XML file. The process is straightforward: create an instance of your parser object, then call its ParseFile method with the name of the XML file.
For example, if your parser object class is named myParser and the class definition file is named myParser__define.pro, you would first call OBJ_NEW('myParser') to create a new XML parser based on your class definition and place the returned object reference in a variable such as xmlFile. You would then call the ParseFile method on that object with the filename data.xml.
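Written as IDL statements, and assuming myParser__define.pro is somewhere on IDL's path and data.xml is in the current directory, the two steps look like this:

```idl
; Create the parser; IDL compiles myParser__define.pro automatically
xmlFile = OBJ_NEW('myParser')
; Parse the file; the overridden methods are called as items are encountered
xmlFile->ParseFile, 'data.xml'
```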
What happens next depends on your application. If your object definition stores values from the parsed file in the object's instance data, you will need some way to retrieve the values into IDL variables that are accessible outside the object. See Example: Reading Data Into an Array and Example: Reading Data Into Structures for examples that return data variables that are accessible to other routines.
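One possible pattern, sketched below, is to give the class an accessor method and call it after parsing. The method name GetArray is hypothetical and must be defined by your class; xmlFile is assumed to be a parser object on which ParseFile has already been called.

```idl
; Copy the parsed values out of the object into an ordinary IDL variable
data = xmlFile->GetArray()
; The parser object is no longer needed once the data has been copied
OBJ_DESTROY, xmlFile
HELP, data
```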
Validation
An XML document is said to be valid if it adheres to a set of constraints set forth in either a Document Type Definition (DTD) or an XML schema. Both DTDs and schemas define which elements can be included in an XML file and what values those elements can assume. XML schemas are a newer technology, designed to replace DTDs and to be more robust. In working with existing XML files, you are likely to encounter both types of validation mechanisms.
Ensuring that a file contains valid XML helps in writing an efficient parsing mechanism. For example, if your validation method specifies that element B can only occur inside element A, and the XML document you are parsing is known to be valid, then your parser can assume that if it encounters element B it is inside element A.
The IDLffXMLSAX parser object can check an XML document using either validation mechanism, depending on whether a DTD or a schema definition is present. By default, if either is present, the parser will attempt to validate the XML document. See SCHEMA_CHECKING and VALIDATION_MODE under "IDLffXMLSAX Properties" (IDL Reference Guide) for details.
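As a sketch of overriding the default behavior, the statements below set both properties when the parser is created. They assume that a value of 0 turns the corresponding check off, and that myParser either does not override Init or passes keywords through to the superclass Init; confirm the allowed values against the property descriptions in the reference guide.

```idl
; Create a parser that skips DTD validation and schema checking
; (assumption: 0 disables each check)
xmlFile = OBJ_NEW('myParser', VALIDATION_MODE=0, SCHEMA_CHECKING=0)
xmlFile->ParseFile, 'data.xml'
OBJ_DESTROY, xmlFile
```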