Friday, November 1, 2013

Ex. 3.4 Solution: Modern Processor Design by John Paul Shen and Mikko H. Lipasti : Solution Manual

Q.3.4: Consider a cache of 256 bytes. Word size is 4 bytes and block size is 16 bytes. Show the values in the cache and tag bits after each of the following memory access operations for two cache organizations: direct-mapped and 2-way set-associative. Also indicate whether each access was a hit or a miss, and justify. The addresses are in hexadecimal representation. Use the LRU (least recently used) replacement algorithm wherever needed.
1. Read 0010
2. Read 001C
3. Read 0018
4. Write 0010
5. Read 0484
6. Read 051C
7. Read 001C
8. Read 0210
9. Read 051C



Sol: The cache is 256 bytes with 16-byte blocks and a 4-byte word size, so it holds 256/16 = 16 blocks.


  • Direct-mapped, 16 sets. Offset is 4 bits, index is 4 bits, tag is 8 bits.

     index   hit/miss   tag bits (hex)   value (block address)
1      1      miss         00             (0010)
2      1      hit          00             (0010)
3      1      hit          00             (0010)
4      1      hit          00             (0010)
5      8      miss         04             (0480)
6      1      miss         05             (0510)   swaps block 0010 out
7      1      miss         00             (0010)   swaps block 0510 out
8      1      miss         02             (0210)   swaps block 0010 out
9      1      miss         05             (0510)   swaps block 0210 out
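The direct-mapped table above can be reproduced with a minimal simulation sketch. The geometry follows the problem (16 sets, 4-bit offset, 4-bit index); the function and variable names are illustrative, not from the text.

```python
# Direct-mapped cache sketch: 256 B cache, 16 B blocks -> 16 sets.
OFFSET_BITS = 4
INDEX_BITS = 4
NUM_SETS = 1 << INDEX_BITS

def split(addr):
    """Split an address into (tag, index) for this geometry."""
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index

cache = [None] * NUM_SETS  # one stored tag (or None) per set

def access(addr):
    tag, index = split(addr)
    hit = cache[index] == tag
    cache[index] = tag  # on a miss, the new block replaces the old one
    return hit

trace = [0x0010, 0x001C, 0x0018, 0x0010, 0x0484,
         0x051C, 0x001C, 0x0210, 0x051C]
results = [access(a) for a in trace]
print(results)  # [False, True, True, True, False, False, False, False, False]
```

The output matches the hit/miss column of the table: only accesses 2–4 hit, and every later access to index 1 conflicts with the previous occupant.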






Two-way set-associative, 8 sets. Offset is 4 bits, index is 3 bits, tag is 9 bits.

     index   hit/miss   tag bits (binary)   value (block address)
1      1      miss        0000 0000 0        (0010)
2      1      hit         0000 0000 0        (0010)
3      1      hit         0000 0000 0        (0010)
4      1      hit         0000 0000 0        (0010)
5      0      miss        0000 0100 1        (0480)
6      1      miss        0000 0101 0        (0510)
7      1      hit         0000 0000 0        (0010)
8      1      miss        0000 0010 0        (0210)   swaps block 0510 out (LRU)
9      1      miss        0000 0101 0        (0510)   swaps block 0010 out (LRU)
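The 2-way case can be checked the same way with a small LRU simulation sketch. Here each set is kept as a list ordered from least to most recently used; all names are illustrative.

```python
# 2-way set-associative sketch: 256 B cache, 16 B blocks, 2 ways -> 8 sets.
OFFSET_BITS = 4
INDEX_BITS = 3
WAYS = 2

def split(addr):
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index

# each set is a list of tags ordered from LRU (front) to MRU (back)
sets = [[] for _ in range(1 << INDEX_BITS)]

def access(addr):
    tag, index = split(addr)
    ways = sets[index]
    hit = tag in ways
    if hit:
        ways.remove(tag)        # re-insert below to mark as MRU
    elif len(ways) == WAYS:
        ways.pop(0)             # evict the least recently used way
    ways.append(tag)
    return hit

trace = [0x0010, 0x001C, 0x0018, 0x0010, 0x0484,
         0x051C, 0x001C, 0x0210, 0x051C]
results = [access(a) for a in trace]
print(results)  # [False, True, True, True, False, False, True, False, False]
```

With two ways per set, access 7 (001C) now hits because block 0010 still resides alongside 0510 in set 1; the first eviction is deferred until access 8.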









Replacement algorithm: Least Recently Used (LRU)

     Whether maintained in hardware or in software, managing a cache of stored information is a critical task in computing. The procedures used to manage such caches are generally known as replacement algorithms, cache algorithms, or replacement policies. When the cache is full, the algorithm must choose which items to discard to make room for new ones.

      A cache has two primary figures of merit: latency and hit rate. The hit rate describes how often a searched item is actually found in the cache; more sophisticated replacement algorithms track more usage information in order to improve it. The latency describes how long after requesting a desired item the cache can return it; faster replacement strategies typically track less usage information. Each replacement strategy is therefore a trade-off between hit rate and latency.

     The example above uses the Least Recently Used (LRU) algorithm, which discards the least recently used item first. LRU requires keeping track of what was used when, which is expensive if one wants to guarantee that the algorithm always discards exactly the least recently used item. A common implementation keeps age bits for the cache lines and identifies the least recently used line from them: every time a cache line is used, its age is reset and the ages of all the other cache lines are updated.
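The age-bit bookkeeping described above can be sketched for a single set as follows. This is a hypothetical illustration of the mechanism, not hardware pseudocode from the book; class and method names are invented.

```python
# Age-counter LRU for one cache set: on each access, the touched line's
# age resets to 0 and all other lines' ages increment; on a miss with a
# full set, the line with the largest age (the oldest) is the victim.
class LRUSet:
    def __init__(self, ways):
        self.lines = [None] * ways   # stored tags
        self.age = [0] * ways        # age counter per line

    def touch(self, way):
        for w in range(len(self.age)):
            self.age[w] += 1
        self.age[way] = 0

    def access(self, tag):
        if tag in self.lines:               # hit: refresh this line's age
            self.touch(self.lines.index(tag))
            return True
        if None in self.lines:              # fill an empty way first
            victim = self.lines.index(None)
        else:                               # evict the oldest line
            victim = self.age.index(max(self.age))
        self.lines[victim] = tag
        self.touch(victim)
        return False

s = LRUSet(2)
results = [s.access(t) for t in [0x00, 0x05, 0x00, 0x02, 0x05]]
print(results)  # [False, False, True, False, False]
```

Note the cost the text mentions: every access updates the age of every line in the set, which is why real hardware often approximates LRU rather than tracking it exactly.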



Next Topic:
Q.3.13: Consider a processor with 32-bit virtual addresses, 4KB pages and 36-bit physical addresses. Assume memory is byte-addressable (i.e. the 32-bit VA specifies a byte in memory).
L1 instruction cache: 64 Kbytes, 128 byte blocks, 4-way set associative, indexed and tagged with virtual address.
L1 data cache: 32 Kbytes, 64 byte blocks, 2-way set associative, indexed and tagged with physical address, write-back.
4-way set associative TLB with 128 entries in all. Assume the TLB keeps a dirty bit, a reference bit, and 3 permission bits (read, write, execute) for each entry.
Specify the number of offset, index, and tag bits for each of these structures in the table below. Also, compute the total size in number of bit cells for each of the tag and data arrays.

Previous Topic:
Q.3.1: Given the following benchmark code, and assuming a virtually-addressed fully-associative cache with infinite capacity and 64 byte blocks, compute the overall miss rate (number of misses divided by number of references). Assume that all variables except array locations reside in registers, and that arrays A, B, and C are placed consecutively in memory. 
         double A[1024], B[1024], C[1024];
         for (int i = 0; i < 1000; i += 2) {
             A[i] = 35.0 * B[i] + C[i+1];
         }
Q.3.3: Given the example code in Problem 1, and assuming a virtually-addressed two-way set associative cache of capacity 8KB and 64 byte blocks, compute the overall miss rate (number of misses divided by number of references). Assume that all variables except array locations reside in registers, and that arrays A, B, and C are placed consecutively in memory.
