© 2000 by CRC Press LLC

Caches that appear on the CPU chip are manufactured by the CPU vendor. Off-chip caches, however, are a commodity part sold in large volume. An incomplete list of major cache manufacturers is Hitachi, IBM Micro, Micron, Motorola, NEC, Samsung, SGS-Thomson, Sony, and Toshiba.

Although most personal computers and all major workstations now contain caches, very high-end machines (such as multi-million dollar supercomputers) do not usually have caches. These ultra-expensive computers can afford to implement their main memory in a comparatively fast semiconductor technology such as static RAM (SRAM), and can afford so many banks that cacheless bandwidth out of the main memory system is sufficient. Massively parallel processors (MPPs), however, are often constructed out of workstation-like nodes to reduce cost.
MPPs therefore contain cache hierarchies similar to those found in the workstations on which the nodes of the MPPs are based.

Cache sizes have been steadily increasing on personal computers and workstations. Intel Pentium-based personal computers come with 8 Kbyte each of instruction and data caches. Two of the Pentium chip sets, manufactured by Intel and OPTi, allow level-two caches ranging from 256 to 512 Kbyte and 64 Kbyte to 2 Mbyte, respectively. The newer Pentium Pro systems also have 8 Kbyte first-level instruction and data caches, but they also have either a 256 Kbyte or a 512 Kbyte second-level cache on the same module as the processor chip. Higher-end workstations, such as DEC Alpha 21164-based systems, are configured with substantially more cache. The 21164 also has 8 Kbyte first-level instruction and data caches. Its second-level cache is entirely on-chip, and is 96 Kbyte. The third-level cache is off-chip, and can have a size ranging from 1 to 64 Mbyte. For all desktop machines, cache sizes are likely to continue to grow, although the rate of growth compared to processor speed increases and main memory size increases is unclear.

88.4 Parallel and Interleaved Memories

Main memories are composed of a series of semiconductor memory chips. A number of these chips, like caches, form a bank. Multiple memory banks can be connected together to form an interleaved (or parallel) memory system. Since each bank can service a request, an interleaved memory system with K banks can service K requests simultaneously, increasing the peak bandwidth of the memory system to K times the bandwidth of a single bank. In most interleaved memory systems, the number of banks is a power of two, that is, K = 2^k. An n-bit memory word address is broken into two parts: a k-bit bank number and an m-bit address of a word within a bank.
Though the k bits used to select a bank number could be any k bits of the n-bit word address, typical interleaved memory systems use the low-order k address bits to select the bank number; the higher-order m = n - k bits of the word address are used to access a word in the selected bank. The reason for using the low-order k bits will be discussed shortly. An interleaved memory system which uses the low-order k bits to select the bank is referred to as a low-order or a standard interleaved memory.

There are two ways of connecting multiple memory banks: simple interleaving and complex interleaving. Sometimes simple interleaving is also referred to as interleaving, and complex interleaving as banking.

Figure 88.5 shows the structure of a simple interleaved memory system. The m address bits are supplied simultaneously to every memory bank. All banks are also connected to the same read/write control line (not shown in Fig. 88.5). For a read operation, the banks start the read operation and deposit the data in their latches. Data can then be read from the latches, one by one, by setting the switch appropriately. Meanwhile, the banks could be accessed again, to carry out another read or write operation. For a write operation, the latches are loaded, one by one. When all the latches have been written, their contents can be written into the memory banks by supplying m bits of address (they will be written into the same word in each of the different banks). In a simple interleaved memory, all banks are cycled at the same time; each bank starts and completes its individual operations at the same time as every other bank; a new memory cycle can start (for all banks) once the previous cycle is complete. Timing details of the accesses can be found in The Architecture of Pipelined Computers [Kogge, 1981].

One use of a simple interleaved memory system is to back up a cache memory.
To do so, the memory must be able to read blocks of contiguous words (a cache block) and supply them to the cache. If the low-order k bits of the address are used to select the bank number, then consecutive words of the block reside in different banks, so they can all be read in parallel and supplied to the cache one by one. If some other address bits are used for bank selection, then multiple words from the block might fall in the same memory bank, requiring multiple accesses to the same bank to fetch the block.
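
The effect of the bank-selection bits on a block fetch can be illustrated with a small sketch. It is not taken from the text: the function names (`split_address`, `banks_for_block`, `cycles_to_fetch`) and the `bank_shift` parameter are hypothetical, and the model assumes that each bank delivers one word per memory cycle and that all banks cycle together, as in the simple interleaved organization described above.

```python
from collections import Counter


def split_address(addr, k):
    """Low-order interleaving: the low k bits select the bank,
    the remaining high-order bits select the word within that bank."""
    bank = addr & ((1 << k) - 1)
    word = addr >> k
    return bank, word


def banks_for_block(start, block_words, k, bank_shift=0):
    """Bank number of each word in a contiguous block.

    bank_shift (an assumption of this sketch) chooses which k address bits
    select the bank: 0 means the low-order bits, larger values move the
    bank field toward the high-order end of the address.
    """
    mask = (1 << k) - 1
    return [((start + i) >> bank_shift) & mask for i in range(block_words)]


def cycles_to_fetch(banks):
    """Memory cycles needed to fetch the block, assuming one word per bank
    per cycle: the largest number of block words that share a bank."""
    return max(Counter(banks).values())


if __name__ == "__main__":
    k = 2  # K = 2^k = 4 banks
    low = banks_for_block(start=8, block_words=4, k=k, bank_shift=0)
    other = banks_for_block(start=8, block_words=4, k=k, bank_shift=2)
    print(low, cycles_to_fetch(low))      # [0, 1, 2, 3] 1
    print(other, cycles_to_fetch(other))  # [2, 2, 2, 2] 4
```

With four banks and a four-word block starting at word address 8, low-order selection places each word in a different bank, so the block is read in a single cycle. Selecting the bank with address bits 2 and 3 instead places all four words in bank 2, and under the same assumptions the fetch takes four cycles.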