Performance Design Principles - Data Structures and Algorithms

From Guidance Share

- J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman

Choose an Appropriate Data Structure

Before choosing the collection type for your scenarios, you should spend time analyzing your specific requirements by using the following common criteria:

  • Data storage. Consider how much data will be stored. Will you store a few records or a few thousand records? Do you know the amount of data ahead of time, or only at run time? How do you need to store the data? Must it be kept in order, or can it be stored in any order?
  • Type. What type of data do you need to store? Is it strongly typed data? Do you store variant objects or value types?
  • Growth. How will your data grow? By how much, and how often?
  • Access. Do you need indexed access? Do you need to access data by key, as in a key-value pair? Do you need sorting in addition to searching?
  • Concurrency. Does access to the data need to be synchronized? If the data is updated while other threads read or write it, you need synchronized access. You may not need synchronization if the data is read-only.
  • Marshaling. Do you need to marshal your data structure across boundaries? For example, do you need to store your data in a cache or a session store? If so, you need to make sure that the data structure supports serialization in an efficient way.
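
As an illustration of the access criterion above, the following minimal Python sketch shows how the same records might be held in three different structures depending on whether you need indexed access, key-based access, or sorted searching. The record data and the `contains_sorted` helper are hypothetical, and the sketch is not specific to any one platform.

```python
from bisect import bisect_left

# Hypothetical sample data: (customer_id, name) records.
records = [(17, "Ana"), (4, "Bo"), (42, "Cy"), (8, "Di")]

# Indexed access: a list keeps insertion order and supports records[i].
first = records[0]

# Key-value access: a dict gives average constant-time lookup by key.
by_id = dict(records)

# Sorting in addition to searching: keep the keys sorted and binary-search them.
sorted_ids = sorted(by_id)


def contains_sorted(ids, target):
    """Binary search over a sorted list: O(log n) rather than a linear scan."""
    i = bisect_left(ids, target)
    return i < len(ids) and ids[i] == target
```

The point of the sketch is that the access pattern, not habit, drives the choice: the dict wins when lookups are by key, while the sorted list is preferable when ordered traversal is also required.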

Pre-Assign Size for Large Dynamic Growth Data Types

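If you know that you need to add a large amount of data to a dynamic data type, assign an approximate size up front wherever you can. A dynamically growing type typically has to allocate a larger buffer and copy its existing contents each time it outgrows its capacity, so sizing it in advance helps avoid unnecessary memory re-allocations.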
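
The idea can be sketched in Python, assuming the final element count `n` is known in advance. The function names are hypothetical. In CPython a list that grows through repeated `append` calls is resized periodically, whereas a list created at its final length is allocated once; how much this matters depends on the runtime and the data size, so measure before relying on it.

```python
def squares_presized(n):
    """Allocate all n slots up front, then fill them in place."""
    out = [0] * n              # single allocation of the final size
    for i in range(n):
        out[i] = i * i
    return out


def squares_appended(n):
    """Grow the list one element at a time."""
    out = []
    for i in range(n):
        out.append(i * i)      # may trigger a re-allocation as the list grows
    return out
```

Both functions return the same result; only the allocation pattern differs.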

Use Value and Reference Types Appropriately

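Value types are typically allocated on the stack or inline within the object that contains them, and the whole value is copied when it is passed. Reference types are allocated on the heap, and only the reference is copied when they are passed. Use the following guidelines when choosing between pass-by-value and pass-by-reference semantics: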

  • Avoid passing large value types by value to local methods. If the target method is in the same process or application domain, the data is copied onto the stack. You can improve performance by passing a reference to a large structure through a method parameter, rather than passing the structure by value.
  • Consider passing reference types by value across process boundaries. If you pass an object reference across a process boundary, a callback to the client process is required each time the objects' fields or methods are accessed. By passing the object by value, you avoid this overhead. If you pass a set of objects or a set of connected objects, make sure all of them can be passed by value.
  • Consider passing a reference type when the size of the object is very large or the state is relevant only within the current process boundaries. For example, pass by reference any object that maintains handles to local server resources, such as files.
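
The last two guidelines can be illustrated with a rough Python sketch. Python has no value types, so this only models the choice at a process boundary: serializing with `pickle` stands in for passing an object by value, and the `send_by_value` helper and sample data are hypothetical.

```python
import os
import pickle


def send_by_value(obj):
    """Simulate crossing a process boundary: serialize, then rebuild a copy."""
    return pickle.loads(pickle.dumps(obj))


# Plain data passes by value cleanly: the receiver gets an independent copy,
# so later access needs no callback to the sender.
original = {"order_id": 1, "items": ["pen", "ink"]}
received = send_by_value(original)
received["items"].append("paper")   # changes the receiver's copy only

# An object that holds a handle to a local resource cannot be passed by value;
# its state is meaningful only inside the current process.
with open(os.devnull) as handle:
    try:
        pickle.dumps(handle)
        handle_is_serializable = True
    except TypeError:
        handle_is_serializable = False
```

The sketch also shows why every object in a connected set must be serializable: a single member that wraps a local handle prevents the whole graph from being passed by value.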