.NET XML Best Practices


Support Knowledge Base, Article 673

Product

General

Title

.NET XML Best Practices - Choosing an XML API

Solution

Part I: Choosing an XML API

by Aaron Skonnard

As someone who has spent the past several years learning and using the various XML APIs available throughout the industry, I can honestly say that the .NET implementations are the best I've seen in terms of productivity, efficiency, and extensibility. The .NET XML framework provides support for all of the W3C-stamped XML specifications including:

XML 1.0
Namespaces in XML
DOM Level 2
XPath 1.0
XSLT 1.0
XML Schema

All of the above have achieved the W3C Recommendation status, which means that the specifications have been approved by a majority of the consortium members. Once a specification becomes a W3C Recommendation, it cannot be changed without repeating the development process and receiving consensus once again. The fact that XPath, XSLT, and XML Schema were all finalized within the last year makes it excellent timing for Microsoft to build .NET on top of a solid XML foundation, which includes these standards.

In addition to these W3C specifications, the .NET XML framework also provides support for several other XML-related technologies that don't share the same level of acceptance but which are quickly gaining mind-share as they continue to evolve:

Streaming XML APIs (like SAX)
SOAP 1.1
WSDL
UDDI

Most of these XML-related technologies/specifications will undergo heavy churn for the years to come. Nevertheless, providing support for them today offers too many advantages to pass up.

The .NET XML framework provides API support for all of the XML-related technologies listed above, both new and old. This gives the .NET XML developer a powerful tool chest for solving a wide array of problems. It's ironic that although this tool chest was designed to simplify things, the surplus of choices often ends up confusing developers. However, as you become familiar with the .NET XML framework, you'll see that in most situations the "right" technology choice is obvious.

This article is the first part of a three-part series on the .NET XML framework. In this first part, we'll cover how to choose the right XML API for both reading and writing XML documents. Then, in the following two parts, we'll walk through practical code examples of each approach discussed in this piece.

Reading XML Documents

The most basic task in XML programming is to read and process an XML 1.0 document. When I use the term "XML 1.0 document", I'm referring to a physical XML 1.0 byte stream that contains angle brackets and namespace declarations, as shown here:

<x:name xmlns:x="http://example.org/name">
  <first>Aaron</first>
  <last>Skonnard</last>
</x:name>

I'll try to be precise with my terminology because the term "XML document" can refer to either a physical document (like the one above) or a logical document (like a DOM tree). The term "XML 1.0 document" more precisely refers to a physical byte stream of XML content.

The task of reading and processing an XML 1.0 document consists of several steps that can be divided into the layers shown in Figure 1:

Figure 1: XML Processing Layers in .NET

This image illustrates that developers can choose to work with XML at a variety of levels, with the bottom of the stack representing the lowest productivity level and the top of the stack the highest. And to further complicate things, the middle levels are full of different choices for developers to use.

Parsing the XML 1.0 Byte Stream

The bottom-most layer is the task of parsing the XML 1.0 byte stream according to the XML 1.0 and Namespace specifications. This task is typically handled by an XML parser that hides the complexities of the XML 1.0 syntax by serving up the document as a logical-tree structure through higher-level APIs.

One of the reasons that XML has become so popular is that so many of these low-level parsers are freely available. In .NET, the XmlTextReader class provides this type of XML 1.0 parsing functionality. There are really no other choices available at this level, but because of the framework's extensible design, more could easily be added in the future.

Processing the Logical Tree via XML APIs

The next layer in the stack requires developers to choose an XML API for processing the logical tree structure exposed by the parser. There are many different XML API flavors available today but the two most common are: 1) streaming and 2) traversal-oriented.

The most common streaming API used today is the Simple API for XML (SAX). Microsoft introduced support for SAX in MSXML 3.0 but then determined that the SAX-based programming model was too obscure and unnecessarily difficult for the majority of their developer community. So to provide .NET developers with a more intuitive alternative, Microsoft introduced a new streaming API through the XmlReader class hierarchy.

The main difference between XmlReader and SAX is who controls the flow of execution. With XmlReader, the client is in control, pulling nodes from the stream one at a time. With SAX, the processor is in control, pushing nodes back to the client one at a time. This significant difference makes XmlReader much easier to use for most Microsoft developers who are used to working with firehose (forward-only/read-only) cursors in ADO.
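The pull model described above can be sketched in C# roughly as follows. This is a minimal illustration, not code from the article: the class name PullReaderSketch and the method ReadElementText are hypothetical, and the sample input reuses the name document shown earlier with the whitespace removed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, used only for illustration.
public static class PullReaderSketch
{
    // Pulls nodes one at a time and records "element=text" for each text node.
    public static List<string> ReadElementText(string xml)
    {
        List<string> result = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            string current = null;
            // The client drives the loop: each Read() call advances one node.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    current = reader.LocalName;
                }
                else if (reader.NodeType == XmlNodeType.Text && current != null)
                {
                    result.Add(current + "=" + reader.Value);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return result;
    }

    public static void Main()
    {
        string xml =
            "<x:name xmlns:x=\"http://example.org/name\">" +
            "<first>Aaron</first><last>Skonnard</last></x:name>";
        foreach (string line in ReadElementText(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

Because the caller owns the while loop, it can stop reading early or skip nodes it does not care about, which is harder to arrange when a SAX processor is pushing events into callbacks.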

Since SAX has not been stamped by any of today's standards bodies, most do not consider it a standard in W3C-speak, or in the same group as the other technologies mentioned earlier. Therefore, Microsoft's decision to not directly support SAX but rather to provide an easier-to-use alternative for their developers is completely reasonable as far as standards are concerned.

XML Schema and DTD validation is also provided at the streaming level through a specialized XmlValidatingReader class, which can be used in conjunction with any XmlReader implementation including XmlTextReader.

The most common traversal-oriented API used today is the Document Object Model (DOM), which is available in .NET through the XmlNode class hierarchy. In .NET, XmlNode-based trees are built via underlying XmlReader streams. These trees, however, remain in memory until the client is finished with them. DOM trees are completely dynamic and can be traversed in a variety of ways. Most developers working with XML are familiar with the DOM API since it has been around the longest. The DOM is considered the "easiest to use" by many developers but that simplicity comes with a cost.

In addition to the DOM, Microsoft introduced another traversal-oriented API called XPathNavigator. XPathNavigator makes it possible to traverse a logical tree (as defined by the XPath specification) using a cursor model, which gives the underlying implementation more options in terms of how the tree is actually stored. XPathNavigator can sit on top of an in-memory DOM tree but it can also sit on top of other non-XML data stores if desired. And as its name suggests, XPathNavigator also supports higher-level services like XPath expressions and XSLT integration.
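As a rough sketch of the cursor model, the code below moves a single XPathNavigator over an in-memory XmlDocument instead of handing out node objects. The class name NavigatorCursorSketch and the method ListChildren are hypothetical, and the input is assumed to contain no whitespace between elements.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

// Hypothetical helper class, used only for illustration.
public static class NavigatorCursorSketch
{
    // Lists "name=value" for each child element of the document element.
    public static List<string> ListChildren(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // The navigator starts at the root node (the parent of the document element).
        XPathNavigator nav = doc.CreateNavigator();
        List<string> result = new List<string>();

        nav.MoveToFirstChild();          // cursor now on the document element
        if (nav.MoveToFirstChild())      // cursor now on its first child, if any
        {
            do
            {
                if (nav.NodeType == XPathNodeType.Element)
                {
                    result.Add(nav.LocalName + "=" + nav.Value);
                }
            } while (nav.MoveToNext());  // slide the same cursor to the next sibling
        }
        return result;
    }

    public static void Main()
    {
        string xml =
            "<x:name xmlns:x=\"http://example.org/name\">" +
            "<first>Aaron</first><last>Skonnard</last></x:name>";
        foreach (string line in ListChildren(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

The same traversal code would work against any other store that can hand back an XPathNavigator, which is the flexibility the cursor model is meant to provide.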

Simplifying Code via XPath, XSLT, XSD

XML also comes with several higher-level services that offer additional functionality to the APIs just mentioned. The most important of these are XPath, XSLT, and XML Schema, all of which are supported today in .NET. Most XML developers use at least one of these services in most of their code because they can greatly boost productivity.

XPath can be used in conjunction with the standard .NET DOM implementation as well as through XPathNavigator but not with today's XmlReader implementations (at least not out of the box). Using XPath with the DOM API or XPathNavigator is a no-brainer because it greatly simplifies the traversal code commonly written with these APIs. I for one never write DOM code without the help of XPath expressions sprinkled throughout.
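To illustrate how an XPath expression can replace hand-written DOM traversal, here is a minimal sketch that selects the first element from the name document shown earlier. The class name DomXPathSketch and the method FindFirstName are hypothetical; the prefix x is bound through an XmlNamespaceManager because the XPath expression cannot see the document's own namespace declarations.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, used only for illustration.
public static class DomXPathSketch
{
    // Returns the text of /x:name/first, or null if no such element exists.
    public static string FindFirstName(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Bind the prefix used in the XPath expression to the namespace URI.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("x", "http://example.org/name");

        XmlNode node = doc.SelectSingleNode("/x:name/first", ns);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        string xml =
            "<x:name xmlns:x=\"http://example.org/name\">" +
            "<first>Aaron</first><last>Skonnard</last></x:name>";
        Console.WriteLine(FindFirstName(xml));
    }
}
```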

Also, if you write a lot of code that transforms XML documents into some other format, XSLT may increase your productivity if the XSLT learning curve isn't too steep. XSLT can be used in .NET through the XslTransform class, which is capable of executing transformations on top of any XPathNavigator implementation, such as the one that sits on top of DOM trees.
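A transformation through XslTransform might look roughly like the sketch below. The class name XsltSketch, the method Transform, and the stylesheet are hypothetical examples, and the input is a simplified name document without a namespace. Later .NET Framework releases mark XslTransform as obsolete, so a newer compiler is likely to emit warnings for this code.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical helper class, used only for illustration.
public static class XsltSketch
{
    // Applies the stylesheet to the input document and returns the result text.
    public static string Transform(string xml, string stylesheet)
    {
        XslTransform xslt = new XslTransform();
        xslt.Load(new XmlTextReader(new StringReader(stylesheet)));

        // Any IXPathNavigable store can be the input; XPathDocument is used here.
        XPathDocument input = new XPathDocument(new StringReader(xml));

        StringWriter output = new StringWriter();
        xslt.Transform(input, null, output);   // null: no stylesheet parameters
        return output.ToString();
    }

    public static void Main()
    {
        string xml = "<name><first>Aaron</first><last>Skonnard</last></name>";
        string stylesheet =
            "<xsl:stylesheet version=\"1.0\" " +
            "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
            "<xsl:output method=\"text\"/>" +
            "<xsl:template match=\"/name\">" +
            "<xsl:value-of select=\"last\"/>, <xsl:value-of select=\"first\"/>" +
            "</xsl:template></xsl:stylesheet>";
        Console.WriteLine(Transform(xml, stylesheet));
    }
}
```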

In addition to the XPathNavigator implementation for standard DOM trees, .NET provides an optimized alternative called XPathDocument. XPathDocument does not implement the W3C DOM specification but rather a more efficient internal tree structure that can be traversed through XPathNavigator. So if you're writing a lot of XPath or XSLT intensive code, XPathDocument might offer you some additional performance benefits.
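The following sketch shows one way to run an XPath query over an XPathDocument rather than a DOM tree. The class name XPathDocumentSketch, the method SelectValues, and the people document are all hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

// Hypothetical helper class, used only for illustration.
public static class XPathDocumentSketch
{
    // Returns the string value of every node selected by the XPath expression.
    public static List<string> SelectValues(string xml, string xpath)
    {
        // XPathDocument is read-only and is reached only through XPathNavigator.
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        List<string> values = new List<string>();
        XPathNodeIterator it = nav.Select(xpath);
        while (it.MoveNext())
        {
            values.Add(it.Current.Value);
        }
        return values;
    }

    public static void Main()
    {
        string xml =
            "<people>" +
            "<person role=\"author\"><name>Aaron</name></person>" +
            "<person role=\"editor\"><name>Pat</name></person>" +
            "</people>";
        foreach (string v in SelectValues(xml, "/people/person[@role='author']/name"))
        {
            Console.WriteLine(v);
        }
    }
}
```

Because the query code depends only on XPathNavigator, swapping XPathDocument for XmlDocument (or the reverse) changes just the line that constructs the store.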

And finally, XML Schema makes it possible to annotate logical XML trees with application type information. This type information can be used for 1) validation, and 2) type reflection. Validation can greatly reduce the amount of error handling that you need to build into your XML processing code. Reflection, on the other hand, makes it possible to automatically generate the processing code (see System.Xml.Serialization for more details). In .NET, you must use XmlValidatingReader if you wish to leverage either of these powerful concepts.

Reading Choices

The previous sections describe all of the .NET XML API choices available at the different processing layers shown in Figure 1. When you start writing the application logic required to process an XML file, you must first decide which of these choices is "right" for you. As Figure 1 illustrates, you can choose to work close to the low-level XML parser, through XmlTextReader, or at more abstract levels through layered APIs and services.

The choice is not always easy since there are so many things to consider. The following table helps narrow things down by describing the most common scenarios and the pros and cons of each.

CHOICES	PROS	CONS
XmlTextReader	Fastest; most efficient (memory); extensible	Forward-only; read-only; requires manual validation
XmlValidatingReader	Automatic validation; run-time type info; relatively fast and efficient (compared to DOM)	2 to 3x slower than XmlTextReader; forward-only; read-only
XmlDocument (DOM)	Full traversal; read/write; XPath expressions	2 to 3x slower than XmlTextReader/XmlValidatingReader; more overhead than XmlTextReader/XmlValidatingReader
XPathNavigator	Full traversal; XPath expressions; XSLT integration; extensible	Read-only; not as familiar as DOM
XPathDocument	Faster than XmlDocument; optimized for XPath/XSLT	Slower than XmlTextReader

Figure 2: .NET API Choices for Reading XML: Pros and Cons

As you can see, there is a fundamental tradeoff between performance/efficiency and productivity when comparing the different choices. The following figure roughly describes the trade-off between processing time and productivity for performing some common XML processing tasks.

In general, as productivity increases, so does processing time. The same is true for memory overhead, except that the difference between XmlTextReader and XmlDocument is a bit more severe.

Choosing a configuration of .NET classes usually boils down to the following three decisions:

What kind of reader should I use?
Should I load the tree into memory?
XmlDocument or XPathDocument?

Making the Decision

The first choice is what kind of reader to use. If the data is in XML 1.0 format (the general case), you have no choice but to use XmlTextReader to process the byte stream. However, if you also need XML Schema/DTD validation or wish to access XML Schema type information at runtime, you need to layer XmlValidatingReader over an XmlTextReader instance and use it instead.
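Layering a validating reader over a text reader can be sketched as follows. The class name ValidatingReaderSketch, the method Validate, and the inline DTD are hypothetical; DTD validation is used only because it keeps the example in a single string. Later .NET Framework releases mark XmlValidatingReader as obsolete, so a newer compiler is likely to emit warnings here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class, used only for illustration.
public static class ValidatingReaderSketch
{
    // Reads the whole document and returns any validation error messages.
    public static List<string> Validate(string xml)
    {
        List<string> errors = new List<string>();

        // The text reader parses the bytes; the validating reader wraps it.
        XmlTextReader textReader = new XmlTextReader(new StringReader(xml));
        XmlValidatingReader validator = new XmlValidatingReader(textReader);
        validator.ValidationType = ValidationType.DTD;

        // With a handler attached, errors are reported here instead of thrown.
        validator.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e) { errors.Add(e.Message); };

        try
        {
            while (validator.Read()) { }
        }
        finally
        {
            validator.Close();
        }
        return errors;
    }

    public static void Main()
    {
        string dtd =
            "<!DOCTYPE name [" +
            "<!ELEMENT name (first, last)>" +
            "<!ELEMENT first (#PCDATA)>" +
            "<!ELEMENT last (#PCDATA)>]>";
        // The last element is missing, so validation should report a problem.
        string xml = dtd + "<name><first>Aaron</first></name>";
        foreach (string message in Validate(xml))
        {
            Console.WriteLine(message);
        }
    }
}
```

The application code still reads through the ordinary XmlReader programming model; only the construction of the reader changes when validation is needed.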

It's also possible to read data that's not in XML 1.0 format as if it were an XML document through custom XmlReader or XPathNavigator implementations. For more details on how to write this type of custom XML provider, see my article in the September 2001 issue of MSDN Magazine (see [2] in the References section at the bottom of this page). The following figure illustrates the decision tree for choosing a reader:

The second choice is whether to load the document into memory. This decision depends mostly on whether you plan to use any higher-level XML services like XPath or XSLT. If you do, you must load the document into memory today. But even when you don't plan to use these services, if your schema is fairly complex or if the document instances are typically small, you might choose to load the document into memory anyway. The following decision tree illustrates these considerations:

If you decide against loading the document into memory, you should stop here and write your code in terms of the reader you already chose. But if you decide to load the document into memory, you also need to choose the right in-memory structure.

Today there are only two in-memory structures available in .NET: XmlDocument (the standard DOM implementation) and XPathDocument (a tree optimized for XPath/XSLT). Deciding between these mostly depends on how much you care about performance, programming ease, and future extensibility, as illustrated by this decision tree:

In terms of performance, XPathDocument offers XPath and XSLT optimizations because it exposes XPathNavigator over a more efficient tree structure. XmlDocument, however, is typically easier to use since it supports both the standard DOM interfaces and XPathNavigator. Since most developers are already familiar with the DOM interfaces, you'll sacrifice a bit of short-term productivity if you choose XPathDocument and XPathNavigator for the performance benefits.

Also, writing code via XPathNavigator is generally a good idea if you care about taking advantage of future extensions. Since XPathNavigator is much easier to implement than the DOM API, you're more likely to see new-and-improved custom implementations down the road.

Writing XML Documents

Luckily, the choices for writing XML documents are nowhere near as complex as those for reading XML documents. For generating XML documents, you really only have two API choices: XmlTextWriter and the DOM.

Unlike XmlTextReader, XmlTextWriter is actually more intuitive for many developers than the DOM, especially for those used to working with SAX. Using XmlTextWriter to generate a document feels a lot like the SAX ContentHandler interface since a sequence of method calls represents a logical XML document. As with SAX, the key benefit to XmlTextWriter is that the resulting document doesn't need to be buffered in memory. Instead it can be written directly into the output stream as the document is generated. This makes XmlTextWriter much more efficient than the DOM and fortunately it's still quite easy to use.
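As a minimal sketch of the streaming approach, the code below generates the name document shown earlier as a sequence of method calls. The class name WriterSketch and the method WriteName are hypothetical; a StringWriter stands in for whatever output stream a real application would use.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, used only for illustration.
public static class WriterSketch
{
    // Writes an x:name element with first and last children and returns the markup.
    public static string WriteName(string first, string last)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        // Each call emits markup immediately; no tree is built in memory.
        writer.WriteStartElement("x", "name", "http://example.org/name");
        writer.WriteElementString("first", first);
        writer.WriteElementString("last", last);
        writer.WriteEndElement();
        writer.Close();

        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteName("Aaron", "Skonnard"));
    }
}
```

The writer tracks open elements and namespace declarations itself, so the caller only has to issue the calls in document order.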

With the DOM, you build a document into an in-memory object graph. Then you can serialize the object graph out as XML 1.0 through a variety of mechanisms, which are all actually built on top of XmlTextWriter. Since XmlTextWriter is more efficient and there isn't a huge productivity tradeoff, you would probably only choose to use the DOM if you need to work with the document in-memory before producing the XML 1.0 byte stream.
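For comparison, here is a rough sketch of the DOM approach, in which the tree is edited in memory before it is serialized. The class name DomWriterSketch and the method BuildName are hypothetical. The example deliberately adds the last element before the first element and then repositions it, which is the kind of out-of-order edit a forward-only writer cannot perform.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, used only for illustration.
public static class DomWriterSketch
{
    // Builds the name document as an object graph, then serializes it.
    public static string BuildName(string firstName, string lastName)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("x", "name", "http://example.org/name");
        doc.AppendChild(root);

        // Add the last element first...
        XmlElement last = doc.CreateElement("last");
        last.InnerText = lastName;
        root.AppendChild(last);

        // ...then insert the first element in front of it.
        XmlElement first = doc.CreateElement("first");
        first.InnerText = firstName;
        root.InsertBefore(first, last);

        // Serialization goes through an XmlTextWriter.
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        doc.WriteTo(writer);
        writer.Close();
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildName("Aaron", "Skonnard"));
    }
}
```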

Writing Choices

The following table helps narrow things down by describing the tradeoffs between XmlTextWriter and the DOM approach:

CHOICES	PROS	CONS
XmlTextWriter	-Fastest -Most efficient, not buffered -Familiar to SAX developers	-You can't manipulate the document (forward-only stream)
DOM	-More flexibility in-memory -Familiar to DOM developers	-Slower and less efficient since buffered in-memory

Figure 3: .NET API Choices for Writing XML: Pros and Cons

In general, if you're generating a stream of XML, use XmlTextWriter. If you need to manipulate the document before serializing it, use the DOM.

Conclusion

The .NET XML framework offers a wide array of API choices for reading and writing XML documents. The various choices often present a tradeoff between performance/efficiency and productivity. The following two parts of this series help illustrate these tradeoffs through practical code examples of each approach discussed in this piece.

References

[1] XML in .NET: .NET Framework XML Classes and C# Offer Simple, Scalable Data Manipulation, MSDN Magazine January 2001, http://msdn.microsoft.com/msdnmag/issues/01/01/xml/xml.asp, by Aaron Skonnard

[2] Writing XML Providers for Microsoft .NET, MSDN Magazine September 2001, http://msdn.microsoft.com/msdnmag/issues/01/09/xml/xml0109.asp, by Aaron Skonnard

About The Author

Aaron Skonnard is a consultant, instructor, and author specializing in Windows technologies and Web applications. Aaron teaches courses for DevelopMentor and is a columnist for Microsoft Internet Developer. He is the author of Essential WinInet, and co-author of Essential XML: Beyond MarkUp (Addison Wesley Longman). Contact him at http://staff.develop.com/aarons.