The requirement that the cache memory be associative (content-addressable) complicates the design. Addressing data by content is inherently more complicated than addressing it by location. All the tags must be compared concurrently, of course, because the whole point of the cache is to achieve low latency. The cache can be made simpler, however, by introducing a mapping of memory locations to cache cells. This mapping limits the number of possible cells in which a particular line may reside. The extreme case is known as direct mapping, in which each memory location is mapped to a single location in the cache. Direct mapping makes many aspects of the design simpler, since there is no choice of where the line might reside and no choice as to which line must be replaced. However, direct mapping can result in poor utilization of the cache when two memory locations are alternately accessed and must share a single cache cell.

A hashing algorithm is used to determine the cache address from the memory address. The conventional mapping algorithm consists of a function of the form

$$A_{cache} = A_{memory} \bmod \frac{cache\_size}{cache\_line\_size} \qquad (88.4)$$

where $A_{cache}$ is the address within the cache for main memory location $A_{memory}$, $cache\_size$ is the capacity of the cache in addressable units (usually bytes), and $cache\_line\_size$ is the size of a cache line in addressable units. Since the hashing function is simple bit selection, the tag memory need only contain the part of the address not implied by the hashing function. That is,

$$A_{tag} = A_{memory} \ \mathrm{div} \ cache\_size \qquad (88.5)$$

where $A_{tag}$ is stored in the tag memory and div is the integer divide operation. In testing for a match, the complete address of a line stored in the cache can be inferred from the tag and its storage location within the cache.
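To make the bit selection concrete, the following minimal C sketch splits a byte address into its line offset, cache line index, and tag in the manner of Eqs. (88.4) and (88.5). The geometry constants are illustrative, chosen to match the 2-Kbyte, 32-byte-line example of Fig. 88.4, and the index is taken from the address bits just above the byte-within-line offset:

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative parameters: a 2-Kbyte cache with 32-byte lines,
 * i.e., 64 lines, matching the geometry used in Fig. 88.4. */
#define CACHE_SIZE      2048u   /* capacity in bytes  */
#define CACHE_LINE_SIZE   32u   /* line size in bytes */
#define NUM_LINES  (CACHE_SIZE / CACHE_LINE_SIZE)

/* Decompose a byte address: the low-order bits select a byte within
 * the line, the next bits select the cache line (simple bit selection,
 * Eq. 88.4), and the remaining high-order bits form the tag stored in
 * the tag memory (Eq. 88.5). */
static void decompose(uint32_t a_memory)
{
    uint32_t offset = a_memory % CACHE_LINE_SIZE;
    uint32_t index  = (a_memory / CACHE_LINE_SIZE) % NUM_LINES;
    uint32_t tag    = a_memory / CACHE_SIZE;  /* A_memory div cache_size */

    printf("addr 0x%08x -> tag 0x%x, line %u, offset %u\n",
           a_memory, tag, index, offset);
}

int main(void)
{
    /* Two addresses exactly cache_size apart map to the same line with
     * different tags, so in a direct-mapped cache they alternately evict
     * each other: the poor-utilization case noted above. */
    decompose(0x00001040);
    decompose(0x00001040 + CACHE_SIZE);
    return 0;
}
```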
A two-way set-associative cache maps each memory location into either of two locations in the cache and can be constructed essentially as two identical direct-mapped caches. However, both caches must be searched at each memory access and the appropriate data selected and multiplexed on a tag match (hit). On a miss, a choice must be made between the two possible cache lines as to which is to be replaced. A single LRU bit can be saved for each such pair of lines to remember which line has been accessed more recently. This bit must be toggled to the current state each time either of the cache lines is accessed.

In the same way, an M-way set-associative cache maps each memory location into any of M locations in the cache and can be constructed from M identical direct-mapped caches. The problem of maintaining the LRU ordering of M cache lines quickly becomes hard, however, since there are M! possible orderings, and it therefore takes at least

$$\lceil \log_2(M!) \rceil \qquad (88.6)$$

bits to store the ordering; for a 4-way set, for example, that is $\lceil \log_2 24 \rceil = 5$ bits. In practice, this requirement limits true LRU replacement to 3- or 4-way set associativity.
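A short sketch of true-LRU bookkeeping for one set may make the cost concrete. The explicit recency array below is an illustrative encoding only; a real design would pack the same state into the $\lceil \log_2(M!) \rceil$ bits of Eq. (88.6):

```c
#include <stdint.h>
#include <stdio.h>

#define WAYS 4   /* illustrative associativity */

/* True LRU for one set, kept as an explicit recency ordering:
 * order[0] is the most recently used way, order[WAYS-1] the least.
 * There are WAYS! = 24 such orderings for 4 ways, so the state can be
 * encoded in ceil(log2(24)) = 5 bits per set (Eq. 88.6). */
typedef struct { uint8_t order[WAYS]; } lru_state;

static void lru_touch(lru_state *s, uint8_t way)
{
    int i = 0;
    while (s->order[i] != way) i++;      /* find the accessed way  */
    for (; i > 0; i--)                   /* shift others down...   */
        s->order[i] = s->order[i - 1];
    s->order[0] = way;                   /* ...and make it the MRU */
}

static uint8_t lru_victim(const lru_state *s)
{
    return s->order[WAYS - 1];           /* replace the LRU way    */
}

int main(void)
{
    lru_state s = { {0, 1, 2, 3} };
    lru_touch(&s, 2);
    lru_touch(&s, 0);
    printf("victim on next miss: way %u\n", lru_victim(&s)); /* way 3 */
    return 0;
}
```

For two ways the ordering degenerates to the single LRU bit described above.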
Figure 88.4 shows how a cache is organized into sets, blocks, and words. The cache shown is a 2-Kbyte, 4-way set-associative cache with 16 sets. Each set consists of four blocks. The cache block size in this example is 32 bytes, so each block contains eight 4-byte words. Also depicted at the bottom of Fig. 88.4 is a 4-way interleaved main memory system (see the next section for details). Each successive word in the cache block maps into a different main memory bank. Because of the cache's mapping restrictions, each cache block obtained from main memory will be loaded into its corresponding set, but may appear anywhere within that set.

Write operations require special handling in the cache. If the main memory copy is updated with each write operation, a technique known as write-through or store-through, the writes may force operations to stall while the write operations are completing. This can happen after a series of write operations even if the processor is allowed to proceed before the write to the memory has completed. If the main memory copy is not updated on every write, but only when the modified line is evicted from the cache, the technique is known as write-back or copy-back.
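The stall behavior of write-through can be seen in a toy model. The sketch below assumes a small write buffer between the processor and main memory (both parameters are illustrative): the processor proceeds past each store while a buffer entry is free, but a burst of stores outpaces the rate at which memory drains the buffer, forcing the stalls described above:

```c
#include <stdio.h>

/* Toy model of a write-through cache's write buffer. The processor can
 * continue past a store as long as a buffer entry is free, but a burst
 * of stores fills the buffer faster than memory retires them. */
#define BUF_ENTRIES     4    /* pending writes the buffer can hold  */
#define MEM_WRITE_TIME  5    /* cycles for memory to retire a write */

int main(void)
{
    int pending = 0, drain = 0, stalls = 0;

    for (int cycle = 0; cycle < 40; cycle++) {
        /* Memory retires one buffered write every MEM_WRITE_TIME cycles. */
        if (pending > 0 && ++drain == MEM_WRITE_TIME) {
            pending--;
            drain = 0;
        }
        /* The processor attempts to issue a store every cycle. */
        if (pending < BUF_ENTRIES)
            pending++;          /* store accepted; processor proceeds */
        else
            stalls++;           /* buffer full; processor must stall  */
    }
    printf("stall cycles during a 40-cycle store burst: %d\n", stalls);
    return 0;
}
```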