Chapter 6 The Memory Hierarch y n Lu 11210240054@fudan.edu.cn
Chapter 6 The Memory Hierarchy Jin Lu 11210240054@fudan.edu.cn
Problem 6. 1(p. 460) In a DRAM array r: the number of rows c: the number of columns b: the number of bits needed to address the rows b the number of bits needed to address the columns Target: minimizing max(br, bc) Organization r max(br, bc 16*1 16*4 128*8 16 c448 512*4 1024*4
Problem 6.1 (p.460) In a DRAM array r : the number of rows c : the number of columns br : the number of bits needed to address the rows bc : the number of bits needed to address the columns Target : minimizing max(br , bc ) Organization r c br bc max(br, bc) 16*1 16*4 128*8 512*4 1024*4 4 4 16 32 16 32 32 5 5 5 5 4 5 8 4 3 4 4 2 2 2 4 2 2 2
Problem 6.2(p. 468) What is the capacity of a disk with 2 platters, 10,000 cylinders, an average of 400 sectors per track, and 512 bytes per sector 10000400*512*(2*2)=8.192(GB
Problem 6.2 (p.468) • What is the capacity of a disk with 2 platters, 10,000 cylinders, an average of 400 sectors per track, and 512 bytes per sector. 10000 * 400 * 512 * (2 * 2) = 8.192 (GB)
Problem 6.3(p. 471) Tava rotation =1/2*T, max rotation 1/2(60secs/15000RPM)*1000ms/sec Tayatransfer=(60 secs/15000 RPM)*1/500 sectors/track 1000 ms/sec Parameter Value Rotational rate 15.000RPM avg seek 8 ms Average #sectors/track 500 access avg seel ek +T avg rotation avg transter
Problem 6.3 (p.471) • Estimate the average time (in ms) to access a sector on the following disk: Parameter Value Rotational rate 15,000 RPM Tavg seek 8 ms Average #sectors/track 500 Tavg rotation = 1/2 * Tmax rotation =1/2*(60 secs/15000 RPM) * 1000 ms/sec Tavg transfer = (60 secs/15000 RPM) * 1/500 sectors/track * 1000 ms/sec Taccess = Tavg seek + Tavg rotation + Tavg transfer
Problem 6.4(p. 481) Permute the loops the following function so that it scans the three-dimensional array a with a stride-1 reference pattern int sumarray 3d(int aNINiNDo int i,j, k, sum =0; for(i=0; i<N; i++)//k forf=0;j<N;j++)∥ for (k=0; k<N; k++)/li sum +=akon:
Problem 6.4 (p.481) • Permute the loops the following function so that it scans the three-dimensional array a with a stride-1 reference pattern. int sumarray3d(int a[N][N][N]){ int i,j,k,sum = 0; for(i = 0; i < N; i++) //k for(j = 0; j < N; j++) //i for(k = 0; k < N; k++) //j sum += a[k][i][j]; }
Problem 6.5(p. 482) The three functions in Figure 6.20( next page)perform the same operation with varying degrees of spatial locality. Rank order the functions with respect to the spatial locality enjoyed by each Explain how you arrived at your ranking
Problem 6.5 (p.482) • The three functions in Figure 6.20(next page) perform the same operation with varying degrees of spatial locality. Rankorder the functions with respect to the spatial locality enjoyed by each. Explain how you arrived at your ranking
1 define N 1000 1 void clearl(point xp, int n) 3 typedef struct i int i, ji int vel[31 int acc[31 23456 for(⊥=0;⊥<n;i++){ 6 point (=0;j<3;j++) p[i].ve1[j]=0 point p[N] for(1=0;j<3;j++) p[i].acc[]=0; (a) An array of structs (b)The clearl function 1 void clear(point *p, int n) 1 void clear (point *p, int n) int±,j int i, j 4 for (i =0;i< n; i for (1=0 6 for(=0;j<3;j++) for (i =0:i< n: i++) p[].ve1[]=0 p[i].ve1[1=0 p[i] []=0 f。Y < 9 p[i].acc[]=0; a)The clear function (b) The clear function
Problem 6.6(p. 490) Fundamental parameters Parameter Description Number of sets E Number of lines per set B=2b Block size(bytes m=log2(M) Number of physical (main memory)address bits Derived quantities Parameter Description M=2 Maximum number of unique memory addresses Number of set index bits b=log2(B) Number of block offset bits =m-(8+b) Number of tag bits C=BXEX S Cache size(bytes)not including overhead such as the valid and tag bits
Problem 6.6 (p.490) • The table gives the parameters for a number of different caches. For each cache, determine the number of cache sets(S), tag bits(t), set index bits(s), and block offset bits(b). Cache m C B E S t s b 1. 32 1024 4 1 2. 32 1024 8 4 3. 32 1024 32 32
Problem 6.7(p. 496) In the previous dotprod example what fraction of the total references to x and y will be hits once we have padded array X? Cache info: float dotprod(float x[8], float y[8] B=16bytes float sum =0.0 int i s=2 for(=0;i<8;i++) E=1 sm+=刈*yl; return sum float x[12]
Problem 6.7 (p.496) • In the previous dotprod example, what fraction of the total references to x and y will be hits once we have padded array x? float dotprod(float x[8], float y[8]){ float sum = 0.0; int i; for(i = 0; i < 8; i++) sum += x[i] * y[i]; return sum; } Cache info: B=16bytes S=2 E=1 float x[12];?
Problem 6.8(p. 496) In general, if the high-order s bits of an address are used as the set index, contiguous chunks of memory blocks are mapped to the same cache set A. How many blocks are in each of these contiguous array chunks? B. What is the maximum number of array blocks that are stored in the cache at any point in time? cache info: (s,E,B,m)=(512,1,32,32) codes. int array [4096] for(i=0; i< 4096; i++)sum+=array[];
Problem 6.8 (p.496) • In general, if the high-order s bits of an address are used as the set index, contiguous chunks of memory blocks are mapped to the same cache set. • A. How many blocks are in each of these contiguous array chunks? • B. What is the maximum number of array blocks that are stored in the cache at any point in time? cache info: (S, E, B, m) = (512, 1, 32, 32) codes: int array[4096]; for(i = 0; i < 4096; i++) sum+=array[i]; 2 t 1