Cache and Cache Circuitry
The function of the integrated cache (also often called a buffer) of a hard disk is to act as a buffer between a relatively fast device and a relatively slow one. For hard disks, the cache is used to hold the results of recent reads from the disk, and also to pre-fetch information that is likely to be requested in the near future, for example, the sector or sectors immediately after the one just requested.
Thus the purpose of this cache is not dissimilar to that of other caches used in the PC, even though it is not normally thought of as part of the regular PC cache hierarchy. Keep in mind that when someone speaks generically about a disk cache, they are usually not referring to this small memory area inside the hard disk, but rather to a portion of system memory set aside to buffer accesses to the disk system.
The use of cache improves the performance of any hard disk by reducing the number of physical accesses to the disk on repeated reads and by allowing data to stream from the disk uninterrupted when the bus is busy. Most modern hard disks have between 512 KB and 2 MB of internal cache memory; some high-performance SCSI drives have as much as 16 MB.
The cache of a hard disk is important because of the sheer difference in speed between the hard disk mechanism and the hard disk interface. Finding a piece of data on the hard disk involves random positioning and incurs a penalty of milliseconds as the actuator moves the heads and the platters rotate on the spindle, while the interface can transfer data in a small fraction of that time. That is why hard disks have internal buffers.
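To make the speed gap concrete, here is a back-of-the-envelope sketch in Python. The seek time, spindle speed, and interface rate are hypothetical but typical values for drives of this era, not figures taken from the text above:

```python
# Illustrative latency arithmetic; the 8.5 ms seek, 7200 RPM spindle,
# and 100 MB/s interface rate are assumed, typical values.
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return (60_000 / rpm) / 2

def transfer_time_ms(kib, interface_mb_per_s):
    """Time to move a block of kib kibibytes over the interface, in ms."""
    return (kib / 1024) / interface_mb_per_s * 1000

# A 7200 RPM drive waits ~4.17 ms on average for the right sector to
# arrive under the head; moving 64 KiB over a 100 MB/s interface takes
# well under a millisecond.
mechanical = 8.5 + avg_rotational_latency_ms(7200)  # roughly 12.7 ms
wire_time = transfer_time_ms(64, 100)               # roughly 0.63 ms
print(f"mechanical: {mechanical:.1f} ms, interface: {wire_time:.2f} ms")
```

The roughly twenty-to-one gap between the mechanical access and the interface transfer is the margin the cache exists to hide.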
The basic principle behind the operation of a simple cache is straightforward. Reading data from the hard disk is generally done in blocks of various sizes, not just one 512-byte sector at a time. The cache is broken into segments, or pieces, each of which can contain one block of data.
When a request is made for data from the hard disk, the cache circuitry is first queried to see if the data is present in any of the segments of the cache. If it is present, it is supplied to the logic board without any access to the hard disk platters being necessary. If the data is not in the cache, it is read from the hard disk, supplied to the controller, and then placed into the cache in case it is requested again.
Since the cache is limited in size, there are only so many pieces of data that can be held before the segments must be recycled. Typically the oldest piece of data is replaced with the newest one. This is called circular, first-in, first-out (FIFO) or wrap-around caching.
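The lookup-then-recycle behavior described above can be sketched in a few lines of Python. This is a minimal model, not drive firmware: the segment count, block addresses, and `read_from_platter` callback are all illustrative stand-ins.

```python
from collections import OrderedDict

class FifoSegmentCache:
    """Sketch of a wrap-around (FIFO) segment cache: on a miss, the
    oldest segment is recycled to hold the newly read block."""

    def __init__(self, segments=4):
        self.segments = segments
        self.data = OrderedDict()   # block address -> block contents

    def read(self, address, read_from_platter):
        if address in self.data:            # cache hit: no platter access
            return self.data[address]
        block = read_from_platter(address)  # cache miss: go to the disk
        if len(self.data) >= self.segments: # recycle the oldest segment
            self.data.popitem(last=False)
        self.data[address] = block
        return block

# Hypothetical usage: count how often the "platters" are actually touched.
platter_reads = []
def fake_platter(addr):
    platter_reads.append(addr)
    return f"block-{addr}"

cache = FifoSegmentCache(segments=2)
cache.read(1, fake_platter)
cache.read(1, fake_platter)   # hit: still only one platter access
cache.read(2, fake_platter)
cache.read(3, fake_platter)   # recycles block 1, the oldest segment
cache.read(1, fake_platter)   # miss again: block 1 was evicted
print(platter_reads)          # [1, 2, 3, 1]
```

Note that FIFO recycles strictly by age of entry, regardless of how recently a block was re-read; that simplicity is why it suits a small on-drive buffer.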
In an effort to improve performance, most hard disk manufacturers today have implemented enhancements to their cache management circuitry, particularly on high-end SCSI drives:
Adaptive Segmentation: Conventional caches are chopped into a number of equal-sized segments. Since requests can be made for data blocks of different sizes, this can leave part of some segments unused and hence wasted. Many newer drives dynamically resize the segments based on how much space is required for each access, to ensure greater utilization; the drive can also change the number of segments. This is more complex to handle than fixed-size segments, and it can itself result in waste if the space is not managed properly.
Pre-Fetch: The cache logic of a drive, based on analyzing access and usage patterns of the drive, attempts to load into part of the cache data that has not been requested yet but that it anticipates will be requested soon. Usually, this means loading additional data beyond that which was just read from the disk, since it is statistically more likely to be requested next. When done correctly, this will improve performance to some degree.
User Control: High-end drives implement a set of commands that allows the user detailed control of the drive cache's operation. This includes letting the user enable or disable caching, set the size of segments, turn adaptive segmentation and pre-fetch on or off, and so on.
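Of these enhancements, pre-fetch is the easiest to illustrate. The sketch below shows the idea in Python: on a miss, the next few blocks are loaded along with the requested one, on the bet that access is sequential. The pre-fetch depth and the dictionary-backed cache are illustrative assumptions, not how a real drive is built.

```python
# Sketch of sequential pre-fetch: a miss on block N also loads blocks
# N+1 .. N+PREFETCH_DEPTH into the cache. All names here are invented
# for illustration.
PREFETCH_DEPTH = 2
cache = {}
platter_reads = []

def read(address):
    if address in cache:
        return cache[address]
    # Miss: fetch the requested block plus the next PREFETCH_DEPTH blocks.
    for addr in range(address, address + 1 + PREFETCH_DEPTH):
        platter_reads.append(addr)
        cache[addr] = f"block-{addr}"
    return cache[address]

read(10)               # platter access for blocks 10, 11, 12
read(11)               # served from cache: no new platter access
read(12)               # also a hit
print(platter_reads)   # [10, 11, 12]
```

A sequential scan of three blocks costs one mechanical access instead of three; the same guess is wasted work when the access pattern is random, which is why drives that analyze usage patterns throttle pre-fetch accordingly.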