.NET Framework 2.0 Performance Inspection Questions - Collections

From Guidance Share

Jump to: navigation, search

- J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman


Design Considerations

For design considerations, see Web Application Performance Design Inspection Questions - Data Structures and Algorithms. It addresses the following questions:

  • Are you using the right collection type?
  • Have you analyzed the requirements?
  • Are you creating your own data structures unnecessarily?
  • Are you implementing custom collections?

For design considerations, see .NET 2.0 Performance Guidelines - Collections. It addresses the following questions:

  • Do you need to sort your collection?
  • Do you need to search your collection?
  • Do you need to access each element by index?
  • Do you need a custom collection?

Have You Considered Arrays?

Arrays avoid boxing and unboxing overhead for value types, as long as you use strongly typed arrays. You should consider using arrays for collections where possible unless you need special features such as sorting or storing the values as key/value pairs.

Do You Enumerate Through Collections?

Enumerating through a collection using foreach is costly in comparison to iterating using a simple index. You should avoid foreach for iteration in performance-critical code paths, and use for loops instead.

Do You Initialize the Collection to an Approximate Final Size?

It is more efficient to initialize collections to a final approximate size even if the collection is capable of growing dynamically. For example, you can initialize an ArrayList using the following overloaded constructor.

  ArrayList ar = new ArrayList (43);

Do You Store Value Types in a Collection?

Storing value types in a collection involves a boxing and unboxing overhead. The overhead can be significant when iterating through a large collection for inserting or retrieving the value types. Consider using arrays or developing a custom, strongly typed collection for this purpose.

Note At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") introduces generics to the C# language that avoid the boxing and unboxing overhead.

Have You Considered Strongly Typed Collections?

Does your code use an ArrayList for storing string types? You should prefer StringCollection over ArrayList when storing strings. This avoids casting overhead that occurs when you insert or retrieve items and also ensures that type checking occurs. You can develop a custom collection for your own data type. For example, you could create a Cart collection to store objects of type CartItem.

Do You Use ArrayList?

If your code uses ArrayList, review the following questions:

Do you store strongly typed data in ArrayLists?

Use ArrayList to store custom object types, particularly when the data changes frequently and you perform frequent insert and delete operations. By doing so, you avoid the boxing overhead. The following code fragment demonstrates the boxing issue.

  ArrayList al = new ArrayList();
  al.Add(42.0F); // Implicit boxing because the Add method takes an object
  float f = (float)al[0]; // Item is unboxed here

Do you use Contains to search ArrayLists?

Store presorted data and use ArrayList.BinarySearch for efficient searches. Sorting and linear searches using Contains are inefficient. This is of particular significance for large lists. If you only have a few items in the list, the overhead is insignificant. If you need several lookups, then consider Hashtable instead of ArrayList.

Do You Use Hashtable?

If your code uses a Hashtable collection of key/value pairs, consider the following review questions:

Do you store small amounts of data in a Hashtable?

If you store small amounts of data (10 or fewer items), this is likely to be slower than using a ListDictionary. If you do not know the number of items to be stored, use a HybridDictionary.

Do you store strings?

Prefer StringDictionary instead of Hashtable for storing strings, because this preserves the string type and avoids the cost of up-casting and down-casting during storing and retrieval.

Do You Use SortedList?

You should use a SortedList to store key-and-value pairs that are sorted by the keys and accessed by key and by index. New items are inserted in sorted order, so the SortedList is well suited for retrieving stored ranges.

You should use SortedList if you need frequent re-sorting of data after small inserts or updates. If you need to perform a number of additions or updates and then re-sort the whole collection, an ArrayList performs better than the SortedList.

Evaluate both collection types by conducting small tests and measuring the overall overhead in terms of time taken for sorting, and choose the one which is right for your scenario.

Personal tools