Performance Design Principles - Concurrency
- J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman
Choose the Appropriate Remote Communication Mechanism
Your choice of transport mechanism is governed by various factors, including available network bandwidth, amount of data to be passed, average number of simultaneous users, and security restrictions such as firewalls.
Services are the preferred means of communication across application boundaries, including platform, deployment, and trust boundaries. Object technology, such as Enterprise Services or .NET remoting, should generally be used only within a service's implementation. Use Enterprise Services only if you need the additional feature set (such as object pooling, declarative distributed transactions, role-based security, and queued components) or where your application communicates between components on a local server and you have performance issues with Web services.
You should choose secure transport protocols such as HTTPS only where necessary and only for those parts of a site that require it.
Design Chunky Interfaces
Design chunky interfaces and avoid chatty interfaces. Chatty interfaces require multiple request/response round trips to perform a single logical operation, which consumes system and potentially network resources. Chunky interfaces enable you to pass all of the necessary input parameters and complete a logical operation in a minimum number of calls. For example, you can wrap multiple get and set calls with a single method call. The wrapper would then coordinate property access internally.
You can have a facade with chunky interfaces that wraps existing components to reduce the number of round trips. Your facade would encapsulate the functionality of the set of wrapped components and would provide a simpler interface to the client. The facade internally coordinates the interaction among the various components in the layer. In this way, the client is better insulated from changes that affect the business layer, and the facade also helps you to reduce round trips between the client and the server.
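The difference between a chatty and a chunky interface can be sketched as follows. This is a minimal illustration, not code from the guide: ChattyCustomer, CustomerFacade, UpdateAndDescribe, and the RoundTrips counters are hypothetical names, and each counted call merely stands in for a request/response round trip.

```csharp
using System;

// Hypothetical component with a chatty surface: every get or set
// stands in for one request/response round trip.
public class ChattyCustomer
{
    private string _name = "";
    private string _city = "";

    public int RoundTrips { get; private set; }

    public void SetName(string value) { RoundTrips++; _name = value; }
    public void SetCity(string value) { RoundTrips++; _city = value; }
    public string GetName() { RoundTrips++; return _name; }
    public string GetCity() { RoundTrips++; return _city; }
}

// Hypothetical facade with a chunky surface: one call carries all the
// input parameters and completes the logical operation.
public class CustomerFacade
{
    private readonly ChattyCustomer _inner = new ChattyCustomer();

    // Counts only the calls that the client makes across the boundary.
    public int RoundTrips { get; private set; }

    public string UpdateAndDescribe(string name, string city)
    {
        RoundTrips++;
        // The property access is coordinated internally, next to the
        // wrapped component, instead of by the remote client.
        _inner.SetName(name);
        _inner.SetCity(city);
        return _inner.GetName() + ", " + _inner.GetCity();
    }
}
```

A client that sets and reads both values through ChattyCustomer pays for four calls, while the same logical operation through CustomerFacade costs one.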
Consider How to Pass Data Between Layers
Passing data between layers involves processing overhead for serialization as well as network utilization. Your options include using ADO.NET DataSet objects, strongly typed DataSet objects, collections, XML, or custom objects and value types.
To make an informed design decision, consider the following questions:
- In what format is the data retrieved?
If the client retrieves data in a certain format, it may be expensive to transform it. Transformation is a common requirement, but you should avoid multiple transformations as the data flows through your application.
- In what format is the data consumed?
If the client requires data in the form of a collection of objects of a particular type, a strongly typed collection is a logical and correct choice.
- What features does the client require?
A client might expect certain features to be available from the objects it receives as output from the business layer. For example, if your client needs to be able to view the data in multiple ways, needs to update data on the server by using optimistic concurrency, and needs to handle complex relationships between various sets of data, a DataSet is well suited to this type of requirement.
However, the DataSet is expensive to create due to its internal object hierarchy, and it has a large memory footprint. Also, default DataSet serialization incurs a significant processing cost even when you use the BinaryFormatter.
Other client-side requirements can include the need for validation, data binding, sorting, and sharing assemblies between client and server.
For more information about how to improve DataSet serialization performance, see "How To: Improve Serialization Performance" at http://msdn.microsoft.com/library/en-us/dnpag/html/ScaleNetHowTo01.asp
- Can the data be logically grouped?
If the data required by the client represents a logical grouping, such as the attributes that describe an employee, consider using a custom type. For example, you could return employee details as a struct type that has employee name, address, and employee number as members.
The main performance benefit of custom classes is that they allow you to create your own optimized serialization mechanisms to reduce the communication footprint between computers.
- Do you need to consider cross-platform interoperability?
XML is an open standard and is the ideal data representation for cross-platform interoperability and communicating with external (and heterogeneous) systems.
Performance issues to consider include the considerable parsing effort required to process large XML strings. Large and verbose strings also consume large amounts of memory. For more information about XML processing, see Chapter 9, "Improving XML Performance".
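As an illustration of the logical-grouping option above, the following sketch groups employee attributes in a custom struct and gives it a hand-written, compact binary form. It is only one possible approach under stated assumptions: the EmployeeDetails type and its ToBytes and FromBytes methods are hypothetical names, and BinaryWriter and BinaryReader are used here in place of any serialization mechanism the guide itself may have in mind.

```csharp
using System;
using System.IO;

// Hypothetical custom type that groups the attributes of an employee.
public struct EmployeeDetails
{
    public int EmployeeNumber;
    public string Name;
    public string Address;

    // Writes only the three fields, with no type metadata.
    public byte[] ToBytes()
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(EmployeeNumber);
                writer.Write(Name ?? "");
                writer.Write(Address ?? "");
            }
            // ToArray is still valid after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    // Reads the fields back in the same order they were written.
    public static EmployeeDetails FromBytes(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            var details = new EmployeeDetails();
            details.EmployeeNumber = reader.ReadInt32();
            details.Name = reader.ReadString();
            details.Address = reader.ReadString();
            return details;
        }
    }
}
```

The tradeoff is that a hand-written format like this must be kept in step on both sides of the wire whenever a field is added or reordered.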
References
- For more information about passing data across layers, see "Data Access" in Chapter 4, "Architecture and Design Review of a .NET Application for Performance and Scalability" at http://msdn.microsoft.com/library/en-us/dnpag/html/scalenetchapt04.asp
Minimize the Amount of Data Sent Across the Wire
Avoid sending redundant data over the wire. You can optimize data communication by using a number of design patterns:
- Use coarse-grained wrappers. You can develop a wrapper object with a coarse-grained interface to encapsulate and coordinate the functionality of one or more objects that have not been designed for efficient remote access. The wrapper object abstracts complexity and the relationships between various business objects, provides a chunky interface optimized for remote access, and helps provide a loosely coupled system. It gives clients a single interface to the functionality of multiple business objects. It also helps define coarser units of work and encapsulate change. This approach is described by the facade design pattern.
- Wrap and return the data that you need. Instead of making a remote call to fetch individual data items, you fetch a data object by value in a single remote call. You then operate locally against the locally cached data. This might be sufficient for many scenarios.
In other scenarios, where you need to ultimately update the data on the server, the wrapper object exposes a single method that you call to send the data back to the server. This approach is demonstrated in the following code fragment.
struct Employee
{
    private int _employeeID;
    private string _projectCode;

    // The properties read the locally cached data; no remote call is made.
    public int EmployeeID { get { return _employeeID; } }
    public string ProjectCode { get { return _projectCode; } }

    public void SetData()
    {
        // Send the changes back and update the changes on the remote server
    }
}
Besides encapsulating the relevant data, the value object can expose a SetData method, or a similar method, for updating the data on the server. The public properties act locally on the cached data without making a remote method call. These individual methods can also perform data validation. This approach is sometimes referred to as the data transfer object design pattern.
- Serialize only what you need to. Analyze the way your objects implement serialization to ensure that only the necessary data is serialized. This reduces data size and memory overhead. For more information, see "How To: Improve Serialization Performance" at http://msdn.microsoft.com/library/en-us/dnpag/html/ScaleNetHowTo01.asp.
- Use data paging. Use a paging solution when you need to present large volumes of data to the client. This helps reduce processing load on the server, client, and network, and it provides a superior user experience. For more information about various implementation techniques, see "How To: Page Records in .NET Applications" in the "How To" section of this guide.
- Consider compression techniques. In situations where you absolutely must send large amounts of data, and where network bandwidth is limited, consider compression techniques such as HTTP 1.1 compression.
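To give a feel for the compression option above, the following sketch compresses and restores a payload with GZipStream, which writes the same gzip format that HTTP compression commonly uses. It assumes a .NET version that includes System.IO.Compression; the PayloadCompression class and its method names are hypothetical, and in a Web application compression would more typically be enabled at the server or protocol level rather than in application code.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper that shows the size effect of gzip on a payload.
public static class PayloadCompression
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final compressed block.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Compression trades processor time for bandwidth, so it tends to pay off for large, repetitive payloads such as XML and much less for small or already compressed data.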
Batch Work to Reduce Calls Over the Network
Batch your work to reduce the number of remote calls over the network. Some examples of batching include the following:
- Batch updates. The client sends multiple updates as a single batch to a remote application server instead of making multiple remote calls for updates for a transaction.
- Batch queries. Multiple SQL queries can be batched by separating them with a semicolon or by using stored procedures.
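The batch query idea can be sketched as below. This is an illustrative outline only: QueryBatching, BuildBatch, and CountRowsPerResultSet are hypothetical names, the table names in the usage note are invented, and whether a semicolon-separated batch is accepted depends on the database and data provider you use.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper for sending several queries in one round trip.
public static class QueryBatching
{
    // Joins individual SQL statements into a single command text.
    public static string BuildBatch(params string[] statements)
    {
        return string.Join("; ", statements);
    }

    // Walks every result set returned by one batched command and
    // records how many rows each one contained.
    public static List<int> CountRowsPerResultSet(IDataReader reader)
    {
        var counts = new List<int>();
        do
        {
            int rows = 0;
            while (reader.Read())
            {
                rows++;
            }
            counts.Add(rows);
        } while (reader.NextResult()); // advance to the next query's results
        return counts;
    }
}
```

For example, BuildBatch("SELECT * FROM Orders", "SELECT * FROM Customers") yields one command text; the reader returned by executing it is then consumed with NextResult, so that both result sets arrive from a single call to the database.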
Reduce Transitions Across Boundaries
Keep frequently interacting entities within the same boundary, such as the same application domain, process, or machine, to reduce communication overhead. When doing so, consider the tradeoff between performance and scalability. A single-process, single-application domain solution provides optimum performance, whereas a multiple server solution provides significant scalability benefits and enables you to scale out your solution.
The main boundaries you need to consider are the following:
- Managed to unmanaged code
- Process to process
- Server to server
Consider Asynchronous Communication
To avoid blocking threads, consider using asynchronous calls for any sort of I/O operation. Synchronous calls block the calling thread while they wait for a response. Asynchronous calls give you the flexibility to free up the processing thread to do useful work (for example, handling new requests in a server application). As a result, asynchronous calls are helpful for potentially long-running calls that are not CPU-bound. The .NET Framework provides an asynchronous design pattern for implementing asynchronous communication.
Note that each asynchronous call actually uses a worker thread from the process thread pool. If they are used excessively on a single-CPU system, this can lead to thread starvation and excessive thread switching, thus degrading performance. If your clients do not need results to be returned immediately, consider using client and server-side queues as an alternative approach.
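The following sketch shows the general shape of issuing long-running, non-CPU-bound calls asynchronously. It is an assumption-laden illustration: it uses the Task-based async and await keywords, which require a runtime that supports them and may differ from the asynchronous pattern the text refers to; AsyncCalls, FetchAsync, and FetchAllAsync are hypothetical names, and Task.Delay merely simulates a slow remote call.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncCalls
{
    // Hypothetical stand-in for a slow remote call that is not CPU-bound.
    public static async Task<string> FetchAsync(string name, int delayMs)
    {
        // The caller's thread is released while the delay is pending.
        await Task.Delay(delayMs);
        return name + ":done";
    }

    // Starts both calls before waiting, so the waits overlap
    // instead of running one after the other.
    public static async Task<string[]> FetchAllAsync()
    {
        Task<string> orders = FetchAsync("orders", 50);
        Task<string> stock = FetchAsync("stock", 50);
        return await Task.WhenAll(orders, stock);
    }
}
```

Task.WhenAll returns the results in the order the tasks were passed in, regardless of which call completes first.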
Consider Message Queuing
A loosely coupled, message-driven approach enables you to do the following:
- Decouple the lifetime of the client state and server state, which helps to reduce complexity and increase the resilience of distributed applications.
- Improve responsiveness and throughput because the current request is not dependent on the completion of a potentially slow downstream process.
- Offload processor-intensive work to other servers.
- Add additional consumer processes that read from a common message queue to help improve scalability.
- Defer processing to nonpeak periods.
- Reduce the need for synchronized access to resources.
The basic message queuing approach is shown in Figure 3.6. The client submits requests for processing as messages on the request queue. The processing logic (which can be implemented as multiple parallel processes for scalability) reads requests from the request queue, performs the necessary work, and places response messages on the response queue, where the client reads them.
Figure 3.6: Message queuing with response
Message queuing presents additional design challenges:
- How will your application behave if messages are not delivered or received?
- How will your application behave if duplicate messages arrive or messages arrive out of sequence? Your design should not have order and time dependencies.
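One way to approach the duplicate and ordering questions above is to give every message an identifier, ignore identifiers that have already been processed, and copy the identifier onto the response so the client can match responses to requests in any order. The sketch below illustrates this under simplifying assumptions: Message and RequestProcessor are hypothetical types, the in-memory Queue objects only stand in for a real message queuing product, and the uppercase conversion is a placeholder for the actual work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message with an identifier used for correlation.
public class Message
{
    public Guid Id;
    public string Body;
}

// Hypothetical processing logic that sits between the two queues.
public class RequestProcessor
{
    private readonly HashSet<Guid> _seen = new HashSet<Guid>();

    public readonly Queue<Message> RequestQueue = new Queue<Message>();
    public readonly Queue<Message> ResponseQueue = new Queue<Message>();

    // Drains the request queue and writes one response per distinct request.
    public void ProcessPending()
    {
        while (RequestQueue.Count > 0)
        {
            Message request = RequestQueue.Dequeue();

            // HashSet.Add returns false if the identifier was already
            // processed, so a duplicate delivery is skipped.
            if (!_seen.Add(request.Id))
            {
                continue;
            }

            // The response reuses the request identifier, so the client
            // does not depend on the order in which responses arrive.
            ResponseQueue.Enqueue(new Message
            {
                Id = request.Id,
                Body = request.Body.ToUpperInvariant() // placeholder work
            });
        }
    }
}
```

In a real deployment the set of processed identifiers would need to survive restarts and be shared by all consumer processes; the sketch keeps it in memory only for brevity.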