Using the XML DOM Classes
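Continuing from the weather plug-in example (see About the DOM Structure), this section describes how to use the IDL XML DOM object classes to load an XML document, read and modify its data, create new data, and destroy IDLffXMLDOM objects. It also covers how to work with whitespace and orphan nodes.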
Loading an XML Document
Although the DOM tree structure is in memory after the XML file is loaded, you cannot directly access the data from IDL until you have created IDLffXMLDOM objects to access it. The DOM loads and parses the XML data into a tree structure, but you need to create a document object to access that data through a mirroring IDL tree structure.
To prepare the interface, load the document:
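A minimal sketch of this step, assuming the XML file is named sample.xml (the file name used in the later examples in this section) and is located in the current directory:

```idl
; Create an empty document object, then parse the file into it.
oDocument = OBJ_NEW('IDLffXMLDOMDocument')
oDocument->Load, FILENAME='sample.xml'
```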
This code causes the DOM tree structure to be formed in memory. You could also perform the same action in one line:
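A sketch of the one-line form, again assuming a file named sample.xml. The FILENAME keyword is passed through OBJ_NEW to the Init method of the document object:

```idl
; Create the document object and load the file in a single call.
oDocument = OBJ_NEW('IDLffXMLDOMDocument', FILENAME='sample.xml')
```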
Be aware that either of these examples will discard an existing DOM tree referenced by oDocument. You can load and reload an XML file as often as desired, but each loading action will overwrite, not add to, the existing tree and remove its objects from memory.
Tip
You can read from and write to IDL variables rather than disk files. See "IDLffXMLDOMDocument::Init" (IDL Reference Guide) for more details.
Reading XML Data
Suppose that you want to print the name of the plug-in. The plug-in element node is the first and only child of the document node. A document node can have only one element child node, which represents the containing element for the entire document (for comparison, an HTML file has only one <HTML></HTML> pair). The name element node is the first element child of the plug-in element. There may be several ways to locate a desired piece of data using the IDL XML DOM classes. The following example illustrates one way to find the plug-in name.
First, access the first child of the document, which is the plug-in element:
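A sketch of this step, assuming that oDocument already holds the loaded sample.xml document:

```idl
; The first child of the document node is the plug-in element.
oPlugin = oDocument->GetFirstChild()
```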
The GetFirstChild method creates an IDLffXMLDOMElement node object and returns its object reference, which is stored in oPlugin.
Next, ask the plug-in for a list of all the element nodes below it. The oPlugin object creates an IDLffXMLDOMNodeList object and places the element nodes in the list in document order. Note that GetElementsByTagName returns all descendant elements, not only the direct children. You could have asked for only the name element, but by asking for them all, you will have the other elements in the list in case you need to look at them later.
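A sketch of this request, using the wildcard tag name '*' as in the Modifying Existing Data example and assuming oPlugin was obtained as described above:

```idl
; '*' matches every element name, so the list holds all elements
; found below the plug-in element.
oNodeList = oPlugin->GetElementsByTagName('*')
```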
You know from the design of the XML data, perhaps as defined in a DTD, that the name element must always be the first child of a plug-in element. You can access the name as follows:
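A sketch of this access, assuming oNodeList is the node list obtained from the plug-in element:

```idl
; Node list items are indexed from zero, so item 0 is the name element.
oName = oNodeList->Item(0)
```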
You also know that the name element can only contain a text node. Getting access to the text node lets you print the data that you want.
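A sketch of this step, assuming oName is the name element fetched from the node list:

```idl
; The only child of the name element is its text node.
oNameText = oName->GetFirstChild()
; The node value of a text node is the text itself.
PRINT, oNameText->GetNodeValue()
```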
This command prints out:
Weather.com Radar Image [DEN]
Note that the oPlugin and the oName objects are of type IDLffXMLDOMElement, and the oNameText object is of type IDLffXMLDOMText. The oName and oNameText objects are created by the GetFirstChild and Item methods, using the object class that is appropriate for the type of data in the DOM tree. You used the GetElementsByTagName method to get the child elements of the plug-in, without having to sort through the whitespace text nodes that are present.
At this point, you have four IDL objects in addition to the root document object that give you access to only the portion of the DOM tree to which these objects correspond. You can create additional objects to explore other parts of the tree and destroy objects for parts that you are no longer interested in.
Modifying Existing Data
You can also modify XML data and write the result back out to a file.
oDocument = OBJ_NEW('IDLffXMLDOMDocument')
oDocument->Load, FILENAME='sample.xml'
oPlugin = oDocument->GetFirstChild()
oNodeList = oPlugin->GetElementsByTagName('*')
oName = oNodeList->Item(0)
oNameText = oName->GetFirstChild()
oNameText->SetNodeValue, 'Weather.com Radar Image [PDX]'
oDocument->Save, FILENAME='sample2.xml'
OBJ_DESTROY, oDocument
This code modifies the name node to change the airport to Portland, Oregon, and writes the modified XML to a new file. Please note that if you save to an existing file (e.g., using sample.xml instead of sample2.xml at the end of this example), the current XML data will replace the file entirely.
Creating New Data
You can create an IDLffXMLDOMDocument object and start adding nodes to it without loading a file.
oDocument = OBJ_NEW('IDLffXMLDOMDocument')
oElement = oDocument->CreateElement('myElement')
oVoid = oDocument->AppendChild(oElement)
oDocument->Save, FILENAME='new.xml'
OBJ_DESTROY, oDocument
This code creates an XML file whose only element is the empty element <myElement/>, which is XML shorthand for <myElement></myElement>.
Example
The following lines create an XML file containing the lines shown in About the DOM Structure:
oDocument = OBJ_NEW('IDLffXMLDOMDocument')
oPluginElement = oDocument->CreateElement('plugin')
oPluginElement->SetAttribute, 'type', 'tab-iframe'
oVoid = oDocument->AppendChild(oPluginElement)
oNameElement = oDocument->CreateElement('name')
oText = oDocument->CreateTextNode('Weather.com Radar Image [DEN]')
oVoid = oNameElement->AppendChild(oText)
oVoid = oPluginElement->AppendChild(oNameElement)
oDescriptionElement = oDocument->CreateElement('description')
oText = oDocument-> $
CreateTextNode('600 mile Doppler radar image for DEN')
oVoid = oDescriptionElement->AppendChild(oText)
oVoid = oPluginElement->AppendChild(oDescriptionElement)
oVersionElement = oDocument->CreateElement('version')
oText = oDocument->CreateTextNode('1.0')
oVoid = oVersionElement->AppendChild(oText)
oVoid = oPluginElement->AppendChild(oVersionElement)
oTabElement = oDocument->CreateElement('tab')
oIconElement = oDocument->CreateElement('icon')
oText = oDocument->CreateTextNode('weather.gif')
oVoid = oIconElement->AppendChild(oText)
oVoid = oTabElement->AppendChild(oIconElement)
oTooltipElement = oDocument->CreateElement('tooltip')
oText = oDocument->CreateTextNode('DEN Doppler radar image')
oVoid = oTooltipElement->AppendChild(oText)
oVoid = oTabElement->AppendChild(oTooltipElement)
oVoid = oPluginElement->AppendChild(oTabElement)
; Write the output to a file in the TEMP directory.
; Use the PRETTY_PRINT keyword to generate nicely-formatted
; output.
tmpdir = GETENV('IDL_TMPDIR')
oDocument->Save, FILENAME=tmpdir+'sample.xml', /PRETTY_PRINT
OBJ_DESTROY, oDocument
; Optionally, you can open the sample.xml file in the
; IDL Workbench editor:
PUSHD, tmpdir
.EDIT sample.xml
POPD
Destroying IDLffXMLDOM Objects
Suppose that you are done with the name node and want to look at the description.
OBJ_DESTROY, oName
oDesc = oNodeList->Item(1)
oDescText = oDesc->GetFirstChild()
PRINT, oDescText->GetNodeValue()
This code destroys the oName object and, with it, the oNameText object, because oNameText was created by oName's GetFirstChild method. This automatic destruction cleans up all the objects that you might have created from the oName node. The code then fetches the description element from the node list and prints its text in the same manner as the name. The name node is still in the node list and can be fetched again with the Item method, if needed.
Finally, the command
OBJ_DESTROY, oDocument
destroys the top-level object that you originally created with the OBJ_NEW function and also destroys any other objects that were created directly or indirectly from the oDocument object.
You can write the first code sample above more compactly because of the ability of the IDLffXMLDOMDocument object to clean up all the objects it and its children created:
oDocument = OBJ_NEW('IDLffXMLDOMDocument')
oDocument->Load, FILENAME='sample.xml'
PRINT, ((((oDocument->GetFirstChild())-> $
GetElementsByTagName('name'))-> $
Item(0))->GetFirstChild())->GetNodeValue()
OBJ_DESTROY, oDocument
Under normal circumstances, the object references created by the calls to the GetFirstChild, GetElementsByTagName, and Item methods would be lost because they were not stored in IDL user variables. However, these objects are cleaned up by the document object when it is destroyed.
For additional information, see Orphan Nodes.
Please note:
- In general, you should not use the OBJ_NEW function to create any IDLffXMLDOM objects except for the top-level document object. Use the methods such as GetFirstChild to create the objects.
- You can destroy objects obtained from the various methods (e.g., GetFirstChild) at any time with the OBJ_DESTROY procedure.
- Destroying an object with the OBJ_DESTROY procedure also destroys the objects that it created.
- Destroying objects does not modify the DOM structure. That is, destroying any of the IDLffXMLDOM objects does not modify the data in the DOM tree. There are explicit methods for modifying DOM tree data. Destroying IDLffXMLDOM objects only removes your ability to access the DOM tree data.
Working with Whitespace
The XML parser is very particular about whitespace because all characters in an XML document define the content of that document. Whitespace consists of spaces, tabs, and newline characters, all of which are commonly used to format documents to make them easier to work with. In many cases, this whitespace is unimportant with respect to the document content. It is there only for presentation and does not affect the actual data stored in the XML document. However, in some cases, for example with CDATA or text node information, the whitespace might be important.
When whitespace is not important, IDL can treat it as ignorable. In many circumstances, you might want the parser to skip over this ignorable whitespace and not place it in the DOM tree so that you do not need to deal with it when visiting nodes in the DOM tree.
For example, the following two XML fragments produce different DOM trees when parsed with the default parser settings:
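As an illustration, the sketch below builds two hypothetical stateList fragments as IDL strings and counts the child nodes of each root element. The state child element and its text are assumptions made for this example, and the fragments are parsed from memory with the STRING keyword of the Load method rather than from files.

```idl
; Hypothetical fragments: the <state> child and its text are assumptions.
lf = STRING(10B)   ; newline character
xmlSpaced = '<stateList>' + lf + $
   '   <state>Colorado</state>' + lf + $
   '</stateList>'
xmlCompact = '<stateList><state>Colorado</state></stateList>'

oDocument = OBJ_NEW('IDLffXMLDOMDocument')

; Parse the fragment that contains formatting whitespace.
oDocument->Load, STRING=xmlSpaced
oChildren = (oDocument->GetFirstChild())->GetChildNodes()
PRINT, 'Child nodes, formatted fragment: ', oChildren->GetLength()

; Reloading replaces the previous tree and its objects.
oDocument->Load, STRING=xmlCompact
oChildren = (oDocument->GetFirstChild())->GetChildNodes()
PRINT, 'Child nodes, compact fragment:   ', oChildren->GetLength()

OBJ_DESTROY, oDocument
```

With the default parser settings, the formatted fragment should report two more child nodes than the compact one, because the whitespace before and after the state element is kept as text nodes.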
In the first fragment, the stateList element has two child nodes that the second fragment does not. They are text nodes that contain only whitespace: a newline and some tabs or spaces.
For the parser to distinguish between non-ignorable and ignorable whitespace, there must be a DTD associated with the XML document, and it must be used to validate the document during parsing. This implies that a VALIDATION_MODE of 1 or 2 must be used when loading the XML document with the IDLffXMLDOMDocument::Load method.
Once validation is established, you can either:
- Tell the parser not to include ignorable text nodes in the DOM tree by setting the EXCLUDE_IGNORABLE_WHITESPACE keyword in the IDLffXMLDOMDocument::Load method. If you select this option, the DOM trees for each of the above two fragments are the same.
- Check each text node in the DOM tree with the IDLffXMLDOMText::IsIgnorableWhitespace method.
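Both options can be sketched as follows. This example assumes that sample.xml references a DTD, since validation fails or has no effect otherwise; the procedure name list_significant_children is hypothetical.

```idl
PRO list_significant_children
   oDocument = OBJ_NEW('IDLffXMLDOMDocument')

   ; Option 1: validate and drop ignorable whitespace while parsing.
   oDocument->Load, FILENAME='sample.xml', VALIDATION_MODE=2, $
      /EXCLUDE_IGNORABLE_WHITESPACE

   ; Option 2: validate, keep every text node, and test each one
   ; while visiting the children of the root element.
   oDocument->Load, FILENAME='sample.xml', VALIDATION_MODE=2
   oChildren = (oDocument->GetFirstChild())->GetChildNodes()
   FOR i = 0, oChildren->GetLength() - 1 DO BEGIN
      oNode = oChildren->Item(i)
      ; Only text nodes provide the IsIgnorableWhitespace method.
      IF OBJ_ISA(oNode, 'IDLffXMLDOMText') THEN BEGIN
         IF oNode->IsIgnorableWhitespace() THEN CONTINUE
      ENDIF
      PRINT, oNode->GetNodeName()
   ENDFOR

   ; Destroying the document also destroys the node objects.
   OBJ_DESTROY, oDocument
END
```

The first option keeps tree-walking code simpler. The second is useful when some whitespace text nodes must be preserved, for example when the document is written back out.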
Orphan Nodes
You can remove nodes from the DOM tree by using the IDLffXMLDOMNode::RemoveChild and IDLffXMLDOMNode::ReplaceChild methods. When these nodes are removed from the tree, they are owned by the DOM document directly and have no parent (since they are not in the tree anymore). Similarly, when these methods are used, the IDLffXMLDOM objects' ownership is changed as well because the IDL tree (made by creating the document interface and adding nodes) must mirror the underlying DOM tree.
If you issue the following command:
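A sketch of such a command, assuming that oMyElement and oMyChild are existing element objects and that oMyChild is a direct child of oMyElement:

```idl
; RemoveChild detaches the node and returns a reference to it.
oMyRemovedChild = oMyElement->RemoveChild(oMyChild)
```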
oMyChild is no longer owned by oMyElement and becomes owned by the document object to which all these nodes belong. Here, oMyRemovedChild and oMyChild are actually object references to the same object. The function method syntax provides a convenient way to create a new object reference variable with a new name that reflects the new status of the removed object, and you can use either name to access the orphaned node.
After removal, the orphan node is loosely associated with the document via the ownership relationship and would not be included in the output if the DOM tree were written to a file. You can insert the node back into the DOM tree with an InsertBefore or AppendChild method.
If the document that contains orphan nodes is destroyed, the orphan nodes are lost. More specifically, DOM tree orphan nodes are not written out to a file if they are orphans at the time that the IDLffXMLDOMDocument::Save method is used to save the tree, and the IDL node objects referring to the orphans are destroyed when the document object is destroyed.
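The life cycle of an orphan node can be sketched as follows. The element names and the output file names orphan_removed.xml and orphan_restored.xml are hypothetical and chosen only for this example.

```idl
; Build a small tree: <myElement><myChild/></myElement>
oDocument = OBJ_NEW('IDLffXMLDOMDocument')
oMyElement = oDocument->CreateElement('myElement')
oVoid = oDocument->AppendChild(oMyElement)
oMyChild = oDocument->CreateElement('myChild')
oVoid = oMyElement->AppendChild(oMyChild)

; Detach the child; it becomes an orphan owned by the document.
oMyRemovedChild = oMyElement->RemoveChild(oMyChild)

; Orphans are not part of the tree, so this file omits myChild.
oDocument->Save, FILENAME='orphan_removed.xml'

; Reattach the orphan so that it is written out again.
oVoid = oMyElement->AppendChild(oMyRemovedChild)
oDocument->Save, FILENAME='orphan_restored.xml'

; Destroying the document also destroys the node objects.
OBJ_DESTROY, oDocument
```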