The following are notes I gathered while studying for a storage certification, and be forewarned: I keep things very basic. One of the most important pieces of the intelligent storage system puzzle is cache. Whether it’s EMC, NetApp, 3Par, HP, or Hitachi, they all have it and they all need it. In this post I will provide an overview of the different cache operations and explain why this little piece of hardware is crucial to getting maximum performance and the lowest response times out of any storage system.
Cache enhances the performance of storage systems by masking the mechanical delays that are inherent to physical disks. Just imagine if every I/O request had to be serviced by disk 100% of the time. Queuing times would grow beyond a reasonable threshold. Think of the checkout lines at a grocery store, with the scanned items being the I/O requests and the cashier being the read/write head. The cashier can only scan so many items per minute, resulting in a long line of groceries (I/O) waiting to be processed. By contrast, accessing data from high-speed cache takes less than a millisecond.
Read Operations with Cache:
- Pre-Fetch/Read Ahead – This type of operation is used when a read request is sequential, meaning a contiguous set of associated blocks needs to be retrieved. Blocks that have not been requested yet can be read from disk and put into cache in advance.
- Fixed Pre-Fetch – Intelligent storage systems can “pre-fetch” a fixed amount of data. Obviously, this algorithm works best when I/O size is uniform.
- Variable Pre-Fetch – Once again we are pre-fetching data here, but without a static size. The pre-fetch size is a multiple of the size that is being requested. Of course, this has the potential to keep the disks very busy, which can delay real I/O requests. Maximum pre-fetch limits exist to cap the number of data blocks that can be pre-fetched.
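To make the two pre-fetch styles a little more concrete, here is a minimal Python sketch. The function names, the fixed amount of 8 blocks, the multiplier of 4, and the 32-block maximum are all numbers I made up for illustration; real arrays pick their own values.

```python
def fixed_prefetch(start_block, request_blocks, prefetch_blocks=8):
    """Blocks to read ahead after a sequential request, using a fixed amount.

    prefetch_blocks=8 is an assumed value, not a vendor default.
    """
    first = start_block + request_blocks  # first block past the request
    return list(range(first, first + prefetch_blocks))


def variable_prefetch(start_block, request_blocks, multiplier=4, max_prefetch=32):
    """Blocks to read ahead, sized as a multiple of the request.

    multiplier and max_prefetch are assumed values; the cap plays the role
    of the maximum pre-fetch limit.
    """
    first = start_block + request_blocks
    count = min(request_blocks * multiplier, max_prefetch)
    return list(range(first, first + count))


# A 4-block read starting at block 100:
fixed = fixed_prefetch(100, 4)        # always 8 blocks: 104..111
variable = variable_prefetch(100, 4)  # 4 * 4 = 16 blocks: 104..119
capped = variable_prefetch(0, 16)     # 16 * 4 = 64, capped at 32 blocks
```

Notice how the variable version grows with the request size until it runs into the cap.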
Read performance is measured by what’s called the “read hit ratio” or the “hit rate”, which is usually expressed as a percentage. The ratio is the number of “read hits” (reads served directly from cache) with respect to the total number of read requests. Of course, a higher read hit ratio improves overall read performance.
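The arithmetic is simple enough to show in a couple of lines of Python. The function name and the sample numbers are mine, just for illustration.

```python
def read_hit_ratio(read_hits, total_reads):
    """Read hit ratio as a percentage of all read requests."""
    if total_reads == 0:
        return 0.0  # no reads yet, so there is nothing to measure
    return 100.0 * read_hits / total_reads


# 850 of 1,000 reads served from cache is an 85% hit ratio.
ratio = read_hit_ratio(850, 1000)
```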
Write Operations with Cache:
You are probably starting to wonder how much of an effect cache can possibly have on write performance. Well, whether you send a write I/O to cache or directly to disk, your server has to get an acknowledgement that the block of data was successfully captured before more I/O can be processed. So, from your server’s perspective, it takes less time to write to cache than to disk. Small sequential writes to cache are ideal since they offer more optimization opportunities: they can be combined into larger transfers to disk. Cached write operations are executed in the following ways:
- Write-Back Cache – As we discussed, the host requires an acknowledgement that its data has been successfully handed off. With write-back cache, data is placed in cache and the host receives that acknowledgement right away. This saves a substantial amount of time compared to dealing with the mechanical delays of a disk. Not long after the data is placed into cache, it is offloaded and committed to disk. One thing to realize is that data sitting in cache and not yet sent to disk can be lost if a cache failure occurs.
- Write-Through Cache – Data from the host is placed in cache and sent immediately to disk, and only then does the host receive an acknowledgement. The risk of losing data is low since data is sent to disk immediately. However, higher response times are likely due to the nature of a mechanical disk.
A situation where it may be ideal to bypass cache is large write I/O. The “write aside size” is a pre-defined size threshold. When a write I/O request exceeds it, the write is sent directly to disk so the cache does not become overloaded.
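Here is a toy Python sketch of the difference between the two write modes. The class names are hypothetical, a dictionary stands in for the disk, and the only point is to show when the disk write happens relative to the acknowledgement.

```python
class Disk:
    """Stand-in for the slow mechanical disk (just a dict here)."""

    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data


class WriteThroughCache:
    def __init__(self, disk):
        self.disk = disk
        self.pages = {}

    def write(self, block, data):
        self.pages[block] = data
        self.disk.write(block, data)  # committed to disk before the ack
        return "ack"


class WriteBackCache:
    def __init__(self, disk):
        self.disk = disk
        self.pages = {}
        self.dirty = set()  # blocks in cache that are not on disk yet

    def write(self, block, data):
        self.pages[block] = data
        self.dirty.add(block)
        return "ack"  # ack immediately; the disk write comes later

    def flush(self):
        """Commit all dirty pages to disk."""
        for block in sorted(self.dirty):
            self.disk.write(block, self.pages[block])
        self.dirty.clear()


disk = Disk()
cache = WriteBackCache(disk)
cache.write(7, b"payload")
on_disk_before_flush = 7 in disk.blocks  # False: the data lives only in cache
cache.flush()
on_disk_after_flush = 7 in disk.blocks   # True
```

The window between the acknowledgement and the flush is exactly where the data is exposed to a cache failure.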
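As a rough sketch of the bypass decision in Python: the 1024 KB threshold and the function name are assumptions of mine, not a setting from any particular array.

```python
WRITE_ASIDE_SIZE_KB = 1024  # assumed threshold, for illustration only


def route_write(io_size_kb, write_aside_size_kb=WRITE_ASIDE_SIZE_KB):
    """Decide whether a write goes through cache or straight to disk."""
    if io_size_kb > write_aside_size_kb:
        return "disk"  # too large: bypass cache
    return "cache"


small = route_write(64)    # "cache"
large = route_write(2048)  # "disk"
```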
Cache Fulfillment & Management:
Cache is usually implemented in either a global or a dedicated fashion. Simply put, a global cache pool services both reads and writes from any of the available memory space. Dedicated cache sets aside a pre-determined amount of memory for reads and a separate amount for writes. The majority of storage systems use a global cache architecture. In most cases it is best to let the storage system work its magic and decide dynamically how it wants to allocate cache.
Cache is not infinite, so housekeeping is crucial. There are several management algorithms that are used to keep everything tidy when data in cache starts to fill up or when data becomes stale.
- LRU (Least Recently Used) – LRU constantly monitors the data in cache to track how long it has been since each page was last accessed. If data has not been accessed for a while, LRU will free up those pages or mark them to be overwritten.
- MRU (Most Recently Used) – MRU recycles the pages that were most recently used. The idea behind this algorithm is that recently used data may not need to be accessed again in the near future.
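The two policies differ only in which end of the "recently used" ordering gets recycled. Below is a small Python sketch built on the standard library's `OrderedDict`; the `PageCache` class is something I made up for illustration and is far simpler than what a real array does.

```python
from collections import OrderedDict


class PageCache:
    """Tiny page cache that evicts by LRU or MRU when full."""

    def __init__(self, capacity, policy="LRU"):
        if policy not in ("LRU", "MRU"):
            raise ValueError("policy must be 'LRU' or 'MRU'")
        self.capacity = capacity
        self.policy = policy
        self.pages = OrderedDict()  # oldest access first, newest last

    def access(self, page):
        """Touch a page; return the evicted page, or None."""
        if page in self.pages:
            self.pages.move_to_end(page)  # now the most recently used
            return None
        evicted = None
        if len(self.pages) >= self.capacity:
            # last=False pops the least recently used entry,
            # last=True pops the most recently used entry.
            evicted, _ = self.pages.popitem(last=(self.policy == "MRU"))
        self.pages[page] = True
        return evicted


lru = PageCache(2, "LRU")
mru = PageCache(2, "MRU")
for page in ("A", "B", "A"):
    lru.access(page)
    mru.access(page)

lru_victim = lru.access("C")  # "B": it was touched least recently
mru_victim = mru.access("C")  # "A": it was touched most recently
```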
Cache flushing techniques are used to keep enough space free for new operations. Idle flushing runs continuously, monitoring utilization and attempting to keep about 50% of the space free at all times. High watermark flushing activates when utilization gets close to 100%. Forced flushing occurs when things reach critical levels due to a burst of I/O: at about 100% capacity, dirty pages are immediately flushed to disk.
Cache is volatile memory, so if a failure occurs the data stored in cache can be lost for good. Let’s go over some of the protection schemes.
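One way to picture the three flushing modes is as thresholds on cache utilization. In this Python sketch the 50% idle target comes from the description above, while the 90% high watermark is purely my stand-in for "close to 100%", and treating anything at or below the idle target as needing no flushing is also my assumption. Real systems make these values tunable.

```python
def flush_mode(utilization_pct, idle_target_pct=50, high_watermark_pct=90):
    """Pick a flushing mode from the current cache utilization.

    high_watermark_pct=90 is an assumed value for illustration.
    """
    if utilization_pct >= 100:
        return "forced"          # cache is full: flush dirty pages now
    if utilization_pct >= high_watermark_pct:
        return "high-watermark"  # flush aggressively to get back under
    if utilization_pct > idle_target_pct:
        return "idle"            # flush gently in the background
    return "none"                # already at or below the idle target


modes = [flush_mode(pct) for pct in (30, 70, 95, 100)]
# modes is ["none", "idle", "high-watermark", "forced"]
```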
- Cache Vaulting – The most common form of this technique uses a small battery attached to the cache to keep data “alive” in the event of a power failure. Some systems, such as EMC’s, go even further and use the battery power to actually write the data to a set of drives that “vault” the data until power is restored. Once power comes back on, the data is written back to cache and then sent to its intended destination drives. I have recently been seeing HP servers equipped with flash-backed write cache (FBWC). FBWC is a flash-based cache module that does not have a battery’s limitation on how long it can retain what is written to the module.
- Cache Mirroring – Pretty self-explanatory here: for protection, cache data is stored in two different memory locations. Compare this to RAID 1 functionality. If a hardware-level failure of cache occurs, the data is still intact since it was mirrored to another location.
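Mirroring is easy to sketch in Python as well. The `MirroredCache` class and its `fail_primary` method are hypothetical, and two dictionaries stand in for the two memory locations.

```python
class MirroredCache:
    """Every write lands in two independent memory locations."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def write(self, block, data):
        self.primary[block] = data
        self.mirror[block] = data  # second copy, like RAID 1

    def fail_primary(self):
        """Simulate a hardware failure of the primary cache memory."""
        self.primary = None

    def read(self, block):
        # Fall back to the mirror if the primary copy is gone.
        source = self.primary if self.primary is not None else self.mirror
        return source.get(block)


cache = MirroredCache()
cache.write(1, b"dirty page")
cache.fail_primary()
survivor = cache.read(1)  # still b"dirty page", served from the mirror
```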
So, there you have it. Hopefully this article was informative and shed some light for people getting into storage. As I said previously, I tried to keep everything basic while pulling from all the materials I gathered. My reference material comes mostly from EMC Education, along with 3Par and NetApp.