Performance Design Principles - Caching
- J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman
Decide Where to Cache Data
Cache state where it can save the most processing and network round trips. This might be at the client, a proxy server, your application's presentation logic, business logic, or in a database. Choose the cache location that supports the lifetime you want for your cached items. If you need to cache data for lengthy periods of time, you should use a SQL Server database. For shorter cache durations, use in-memory caches.
Consider the following scenarios:
- Data caching in the presentation layer. Consider caching data in the presentation layer when it needs to be displayed to the user and the data is not cached on a per-user basis. For example, if you need to display a list of states, you can look these up once from the database and then store them in the cache.
For more information about ASP.NET caching techniques, see Chapter 6, "Improving ASP.NET Performance" at http://msdn.microsoft.com/library/en-us/dnpag/html/scalenetchapt06.asp
- Data caching in the business layer. You can implement caching mechanisms by using hash tables or other data structures in your application's business logic. For example, you could cache taxation rules that enable tax calculation. Consider caching in the business layer when the data cannot be efficiently retrieved from the database. Data that changes frequently should not be cached.
- Data caching in the database. Cache data in a database when you have large amounts of data to cache or when you need to cache for lengthy periods of time. The data can be served in smaller chunks or as a whole, depending on your requirements. The data is cached in temporary tables, which consume more RAM and may cause memory bottlenecks. Always measure to see whether caching in a database is hurting or improving your application performance.
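The list-of-states example can be sketched as a dictionary-backed cache in the presentation or business layer. This is a minimal illustration in Python, not the ASP.NET cache API; the names LookupCache and load_from_database are hypothetical, and the loader stands in for a real database query.

```python
from typing import Callable, Dict, List


class LookupCache:
    """Caches shared (not per-user) lookup data in a plain dictionary."""

    def __init__(self, loader: Callable[[str], List[str]]) -> None:
        self._loader = loader  # hypothetical data-access function
        self._items: Dict[str, List[str]] = {}

    def get(self, key: str) -> List[str]:
        # Go to the data source only on the first request for a key.
        if key not in self._items:
            self._items[key] = self._loader(key)
        return self._items[key]


db_calls: List[str] = []  # records each simulated database round trip


def load_from_database(key: str) -> List[str]:
    # Stand-in for a real query; assumed for illustration only.
    db_calls.append(key)
    return ["Alabama", "Alaska", "Arizona"] if key == "states" else []


state_cache = LookupCache(load_from_database)
first = state_cache.get("states")
second = state_cache.get("states")  # served from memory, no second round trip
```

After both calls, db_calls holds a single entry, because the second request never reaches the simulated database.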
Decide What Data to Cache
Caching the right data is the most critical aspect of caching. If you fail to get this right, you can end up reducing performance instead of improving it. You might end up consuming more memory and at the same time suffer from cache misses, where the data is not actually getting served from cache but is refetched from the original source.
The following are some important recommendations that help you decide what to cache:
- Avoid caching per-user data. Caching data on a per-user basis can cause a memory bottleneck. Imagine a search engine that caches the results of each user's query so that it can page through the results efficiently. Do not cache per-user data unless the data is expensive to retrieve and the concurrent client load does not build up memory pressure. Even in this case, measure both approaches to see which performs better, and consider caching the data on a dedicated server. You can also consider using session state as a cache mechanism for Web applications, but only for small amounts of data. In all cases, cache only the most relevant data.
- Avoid caching volatile data. Cache frequently used, not frequently changing, data. Cache static data that is expensive to retrieve or create.
Avoid caching volatile data that the user requires to be accurate and updated in real time. If you frequently expire the cache to keep it synchronized with rapidly changing data, you might end up using more system resources such as CPU, memory, and network.
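One way to check whether you are caching the right data is to count hits and misses and look at the ratio. The following Python sketch assumes a hypothetical MeasuredCache wrapper and a trivial loader; in a real application you would compare this ratio against the memory the cache consumes.

```python
from typing import Any, Callable, Dict


class MeasuredCache:
    """Dictionary cache that counts hits and misses."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader  # hypothetical function that fetches from the source
        self._items: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        if key in self._items:
            self.hits += 1
            return self._items[key]
        # A miss costs a trip to the original source plus the memory to keep the item.
        self.misses += 1
        value = self._loader(key)
        self._items[key] = value
        return value

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


measured = MeasuredCache(lambda key: key.upper())
for requested in ["tax-rules", "tax-rules", "tax-rules", "user-17-results"]:
    measured.get(requested)
```

In this run the shared "tax-rules" item is reused twice, while the per-user item is stored but never read again, which is the pattern the recommendations above warn about.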
Decide the Expiration Policy and Scavenging Mechanism
You need to determine the appropriate time interval to refresh data, and design a notification process to indicate that the cache needs refreshing.
If you hold data too long, you run the risk of using stale data, and if you expire the data too frequently you can affect performance. Decide on the expiration algorithm that is right for your scenario. These include the following:
- Least recently used.
- Least frequently used.
- Absolute expiration after a fixed interval.
- Expiration based on a change in an external dependency, such as a file.
- Cleaning up the cache if a resource threshold (such as a memory limit) is reached.
Note: The best choice of scavenging mechanism also depends on the storage choice for the cache.
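Several of these policies can be combined in one in-memory cache. The sketch below, in Python, pairs absolute expiration with least-recently-used scavenging once an item-count threshold is reached. The class name, the max_items and ttl_seconds parameters, and the injectable clock are assumptions made for illustration; an item count stands in for a real memory limit.

```python
import time
from collections import OrderedDict
from typing import Any, Callable


class ExpiringLruCache:
    """Absolute expiration plus least-recently-used scavenging."""

    def __init__(self, max_items: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_items = max_items
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the demo does not have to sleep
        self._items = OrderedDict()  # key -> (value, expires_at), oldest use first

    def put(self, key: str, value: Any) -> None:
        self._items[key] = (value, self._clock() + self._ttl)
        self._items.move_to_end(key)
        # Scavenge least recently used entries once the threshold is exceeded.
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)

    def get(self, key: str) -> Any:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # absolute expiration after a fixed interval
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value


now = [0.0]  # fake clock, in seconds
lru = ExpiringLruCache(max_items=2, ttl_seconds=300, clock=lambda: now[0])
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")                  # "a" is now the most recently used item
lru.put("c", 3)               # over the threshold, so "b" is scavenged
evicted = lru.get("b")
still_cached = lru.get("a")
now[0] = 301.0                # move past the five-minute expiration
expired = lru.get("a")
```

Expired items are removed lazily, on the next read. A cache that must free memory promptly would need a periodic sweep as well.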
Decide How to Load the Cache Data
For large caches, consider loading the cache asynchronously with a separate thread or by using a batch process.
When a client accesses an expired cache, it needs to be repopulated. Doing so synchronously affects client-response time and blocks the request processing thread.
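A minimal Python sketch of asynchronous loading follows. Readers keep receiving the last loaded value while a worker thread reloads it, so no request blocks on the reload. The class name and the loader are hypothetical, and deciding when to call refresh_async (for example, from a timer or on expiration) is left to the caller.

```python
import threading
from typing import Any, Callable, Optional


class BackgroundRefreshingValue:
    """Holds one cached value and reloads it on a worker thread."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader  # hypothetical slow load from the data source
        self._lock = threading.Lock()
        self._value = loader()  # the initial load is synchronous
        self._refreshing = False

    def get(self) -> Any:
        # Readers never wait for a reload; they see the last loaded value.
        with self._lock:
            return self._value

    def refresh_async(self) -> Optional[threading.Thread]:
        """Start a reload unless one is already running."""
        with self._lock:
            if self._refreshing:
                return None
            self._refreshing = True
        worker = threading.Thread(target=self._reload, daemon=True)
        worker.start()
        return worker

    def _reload(self) -> None:
        try:
            fresh = self._loader()  # slow work happens outside the lock
            with self._lock:
                self._value = fresh
        finally:
            with self._lock:
                self._refreshing = False


versions = iter(range(1, 100))
cached = BackgroundRefreshingValue(lambda: "catalog v%d" % next(versions))
before = cached.get()
worker = cached.refresh_async()
worker.join()  # joined here only to make the demo deterministic
after = cached.get()
```

The trade-off is that a reader may briefly receive stale data while a reload is in progress, which is acceptable only within the staleness limit you have chosen for that data.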
Cache data that does not change very frequently or is completely static. If the data does change frequently, you should evaluate the acceptable time limit during which stale data can be served to the user. For example, consider a stock ticker that shows stock quotes. Although the stock prices are continuously updated, the stock ticker can safely be updated after a fixed time interval of five minutes.
You can then devise a suitable expiration mechanism to clear the cache and retrieve fresh data from the original medium.
- Do not cache shared expensive resources. Do not cache shared expensive resources such as network connections. Instead, pool those resources.
- Cache transformed data, keeping in mind the data use. If you need to transform data before it can be used, transform the data before caching it.
- Try to avoid caching data that needs to be synchronized across servers. This approach requires manual and complex synchronization logic and should be avoided where possible.
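The difference between caching and pooling a shared resource can be sketched as follows. A pool hands each resource to one caller at a time and takes it back afterward, rather than letting many callers share one cached instance. The ResourcePool class and the open_connection factory are hypothetical; the strings stand in for real network connections.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator, List, Optional


class ResourcePool:
    """Fixed-size pool that lends each resource to one caller at a time."""

    def __init__(self, factory: Callable[[], Any], size: int) -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # create the expensive resources up front

    @contextmanager
    def acquire(self, timeout: Optional[float] = None) -> Iterator[Any]:
        resource = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield resource
        finally:
            self._idle.put(resource)  # always return the resource to the pool


created: List[str] = []


def open_connection() -> str:
    # Hypothetical stand-in for opening a real network connection.
    conn = "connection-%d" % (len(created) + 1)
    created.append(conn)
    return conn


pool = ResourcePool(open_connection, size=2)
with pool.acquire() as first_conn:
    with pool.acquire() as second_conn:
        in_use = {first_conn, second_conn}
with pool.acquire() as reused_conn:
    pass  # no new connection is opened; an idle one is reused
```

In practice, most data access libraries already provide connection pooling, so a hand-written pool like this is rarely needed for database connections.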
Avoid Distributed Coherent Caches
In a Web farm, if you need multiple caches on different servers to be kept synchronized because of localized cache updates, you are probably dealing with transactional state. You should store this type of state in a transactional resource manager such as SQL Server. Otherwise, you need to rethink the degree of integrity and potential staleness you can trade off for increased performance and scalability.
A localized cache is acceptable even in a server farm as long as you require it only for serving the pages faster. If the request goes to other servers that do not have the same updated cache, they should still be able to serve the same pages, albeit by querying the persistent medium for the same data.
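The following Python sketch illustrates the localized-cache case. Each server keeps its own independent cache and falls back to the persistent medium on a miss, so every server can serve the same page without any synchronization between caches. The WebServer class and the dictionary standing in for the persistent store are assumptions for illustration.

```python
from typing import Dict


class WebServer:
    """One server in a farm with its own local, unsynchronized cache."""

    def __init__(self, name: str, store: Dict[str, str]) -> None:
        self.name = name
        self._store = store  # stands in for the persistent medium
        self._local_cache: Dict[str, str] = {}
        self.store_reads = 0

    def render(self, page_id: str) -> str:
        if page_id not in self._local_cache:
            # Cache miss: query the persistent medium, then keep a local copy.
            self.store_reads += 1
            self._local_cache[page_id] = self._store[page_id]
        return self._local_cache[page_id]


persistent_store = {"home": "<h1>Welcome</h1>"}
server_a = WebServer("A", persistent_store)
server_b = WebServer("B", persistent_store)

page_from_a = server_a.render("home")
page_from_a_again = server_a.render("home")  # served from server A's cache
page_from_b = server_b.render("home")        # server B queries the store itself
```

This works only because the cached data is read-only from the servers' point of view. Once servers update their local copies, you are back to the transactional state described above.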