XML (.NET 1.1) Performance Guidelines - Parsing XML

From Guidance Share

Revision as of 04:44, 14 December 2007; JD (Talk | contribs)

- J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman


Use XmlTextReader to Parse Large XML Documents

Use the XmlTextReader class to process large XML documents in an efficient, forward-only manner. XmlTextReader uses little memory because it reads the document one node at a time. Avoid using the DOM for large documents: the DOM reads the entire XML document into memory, which limits the scalability of your application. Using XmlTextReader in combination with an XmlTextWriter class permits you to handle much larger documents than a DOM-based XmlDocument class.

The following code fragment shows how to use XmlTextReader to process large XML documents.


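  // Assumes reader is an open System.Xml.XmlTextReader and doc is a
  // System.Xml.XmlDocument variable declared by the caller.
  while (reader.Read())
  {
      switch (reader.NodeType)
      {
          case System.Xml.XmlNodeType.Element:
          {
              // GetAttribute returns null when the attribute is missing, so
              // compare with == instead of calling Equals on the result.
              if (reader.Name == "patient"
                  && reader.GetAttribute("number") == "25")
              {
                  // Load only the matching element into a DOM.
                  // ReadNode leaves the reader just past </patient>.
                  doc = new System.Xml.XmlDocument();
                  System.Xml.XmlNode node = doc.ReadNode(reader);
                  doc.AppendChild(node);
              }
              break;
          }
      }
  }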

You can only use XmlTextReader and XmlValidatingReader to process files that are up to 2 gigabytes (GB) in size. If you need to process larger files, divide the source file into multiple smaller files or streams.
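
The combination of XmlTextReader and XmlTextWriter mentioned above can be sketched as a streaming filter. This is a minimal sketch, not the authors' implementation: the class name StreamingFilterSketch, the method CopyPatient, the wrapper element "matches", and the sample data are all hypothetical, and the patient/number names are borrowed from the fragment above. It copies matching elements straight from the reader to the writer, so the document is never held in memory as a whole.

```csharp
using System;
using System.IO;
using System.Xml;

public class StreamingFilterSketch
{
    // Copies every <patient> element whose "number" attribute equals
    // the given value from input to output, one node at a time.
    // Closing the XML reader and writer also closes input and output.
    public static void CopyPatient(TextReader input, TextWriter output, string number)
    {
        XmlTextReader reader = new XmlTextReader(input);
        XmlTextWriter writer = new XmlTextWriter(output);
        try
        {
            writer.WriteStartElement("matches"); // hypothetical wrapper element
            reader.Read();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "patient"
                    && reader.GetAttribute("number") == number)
                {
                    // WriteNode copies the element and its children, and
                    // advances the reader past the end tag, so do not
                    // call Read() again in this branch.
                    writer.WriteNode(reader, false);
                }
                else
                {
                    reader.Read();
                }
            }
            writer.WriteEndElement();
        }
        finally
        {
            writer.Close();
            reader.Close();
        }
    }

    public static void Main()
    {
        // Hypothetical sample data; a real caller would pass file streams.
        string xml = "<patients>"
            + "<patient number=\"24\"><name>A</name></patient>"
            + "<patient number=\"25\"><name>B</name></patient>"
            + "</patients>";
        StringWriter output = new StringWriter();
        CopyPatient(new StringReader(xml), output, "25");
        Console.WriteLine(output.ToString());
    }
}
```

Because WriteNode consumes the reader, the loop calls Read() only when nothing was copied. Calling Read() unconditionally would silently skip the node that follows each copied element.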


Use XmlValidatingReader for Validation

If you need to validate an XML document, use XmlValidatingReader. The XmlValidatingReader class wraps an XmlReader and adds DTD, XDR, and XML Schema (XSD) validation support to it.
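
As an illustration, the following minimal sketch validates a document against a DTD declared in its internal subset. The class name ValidationSketch, the method CountValidationErrors, and the sample documents are hypothetical. The sketch assumes DTD validation only to stay self-contained; schema validation would additionally require adding the schema to the reader's Schemas collection.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class ValidationSketch
{
    private static int errorCount;

    // Called once for each validation error or warning the reader reports.
    private static void OnValidationError(object sender, ValidationEventArgs e)
    {
        errorCount++;
    }

    // Returns the number of validation events raised while reading xml.
    public static int CountValidationErrors(string xml)
    {
        errorCount = 0;
        XmlTextReader textReader = new XmlTextReader(new StringReader(xml));
        XmlValidatingReader validator = new XmlValidatingReader(textReader);
        validator.ValidationType = ValidationType.DTD;
        // With a handler attached, errors are reported through the event
        // instead of being thrown as exceptions.
        validator.ValidationEventHandler +=
            new ValidationEventHandler(OnValidationError);
        try
        {
            // Validation happens as a side effect of reading.
            while (validator.Read()) { }
        }
        finally
        {
            validator.Close();
        }
        return errorCount;
    }

    public static void Main()
    {
        // Hypothetical inline DTD: price may contain only text.
        string dtd = "<!DOCTYPE price [<!ELEMENT price (#PCDATA)>]>";
        Console.WriteLine(CountValidationErrors(dtd + "<price>123.4</price>"));
        Console.WriteLine(CountValidationErrors(dtd + "<price><extra/></price>"));
    }
}
```

The first document conforms to the DTD, so no validation events are raised for it. The second contains an undeclared child element and raises at least one.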


Consider Combining XmlReader and XmlDocument

In certain circumstances, the best solution may be to combine the pull model and the DOM model. For example, if you only need to manipulate part of a very large XML document, you can use XmlReader to read the document, and then you can construct a DOM that has only the data required for additional modification. This approach is shown in the following code fragment.


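  // Assumes reader is an open XmlReader and doc is an XmlDocument variable.
  while (reader.Read())
  {
    switch (reader.NodeType)
    {
      case System.Xml.XmlNodeType.Element:
      {
        // == is null-safe if the "number" attribute is missing.
        if (reader.Name == "patient"
            && reader.GetAttribute("number") == "25")
        {
          // Build a DOM that contains only the element of interest.
          doc = new System.Xml.XmlDocument();
          System.Xml.XmlNode node = doc.ReadNode(reader);
          doc.AppendChild(node);
        }
        break;
      }
    }
  }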


On the XmlReader, Use the MoveToContent and Skip Methods to Skip Unwanted Items

Use the XmlReader.MoveToContent method to skip white space, comments, and processing instructions, and to move to the next content element. MoveToContent skips to the next Text, CDATA, Element, EndElement, EntityReference, or EndEntity node. You can also skip the current element by using the XmlReader.Skip method.

For example, consider the following XML input.

  <?xml version="1.0"?>
  <!DOCTYPE price SYSTEM "abc">
  <!-- the price of the book -->
  <price>123.4</price>

The following code finds the price element and converts its text content, "123.4", to a double:


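  // readr is an open XmlReader over the input above; MoveToContent skips
  // the XML declaration, the DOCTYPE, and the comment.
  if (readr.MoveToContent() == XmlNodeType.Element && readr.Name == "price")
  {
      _price = XmlConvert.ToDouble(readr.ReadString());
  }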

For more information about how to use the MoveToContent method, see MSDN article, "Skipping Content with XmlReader" at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconSkippingContentWithXmlReader.asp.
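
The Skip method described above can be illustrated in the same way. The following is a minimal sketch under stated assumptions: the class name SkipSketch, the method ReadPrice, and the book and reviews elements are hypothetical and serve only to give Skip a subtree to jump over. The reader skips the whole reviews subtree, so the price nested inside it is never examined.

```csharp
using System;
using System.IO;
using System.Xml;

public class SkipSketch
{
    // Returns the first <price> found outside any <reviews> subtree.
    public static double ReadPrice(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            reader.MoveToContent(); // positions the reader on the root element
            reader.Read();          // moves to the first node inside the root
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "reviews")
                {
                    // Skip jumps past the matching end tag, so the
                    // children of <reviews> are never visited.
                    reader.Skip();
                }
                else if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "price")
                {
                    return XmlConvert.ToDouble(reader.ReadString());
                }
                else
                {
                    reader.Read();
                }
            }
            throw new InvalidOperationException("No price element found.");
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        // Hypothetical input: the nested price of 999 must be ignored.
        string xml = "<book>"
            + "<reviews><review><price>999</price></review></reviews>"
            + "<price>123.4</price>"
            + "</book>";
        Console.WriteLine(ReadPrice(xml));
    }
}
```

Like WriteNode, Skip advances the reader itself, so the loop calls Read() only in the branch where nothing else has moved the reader.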

References

For more information about XmlReader, see the MSDN article "Comparing XmlReader to SAX Reader" at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconcomparingxmlreadertosaxreader.asp.

For more information about how to parse XML, see the following Microsoft Knowledge Base articles:
