Saturday, November 23, 2013

Ex 5.7 through 5.13 Solution : Modern Processor Design by John Paul Shen and Mikko H. Lipasti : Solution manual

Q.5.7 through Problem 5.13: Consider the following code segment within a loop body for problems 5:
        if (x is even) then                                      (branch b1) 
                     increment a                                  (b1 taken)
        if (x is a multiple of 10) then                      (branch b2) 
                     increment b                                  (b2 taken)

Assume that the following list of 9 values of x is to be processed by 9 iterations of
this loop. 
8, 9, 10, 11, 12, 20, 29, 30, 31 

Note: assume that predictor entries are updated by each dynamic branch before 
the next dynamic branch accesses the predictor (i.e., there is no update delay).

Q.5.7: Assume that an one-bit (history bit) state machine (see above) is used as the prediction algorithm for predicting the execution of the two branches in this loop. Indicate the predicted and actual branch directions of the b1 and b2 branch instructions for each iteration of this loop. Assume initial state of 0, i.e., NT, for the predictor. 

Sol:
                         8        9       10       11       12       20       29      30       31
b1 predicted: __N____T____ N____ T____  N____ T____ T____ N____ T__
b1 actual     :  __T____N____T____ N____   T____ T____ N____ T____ N__
b2 predicted: __N____N____N____ T____   N____ N____ T____ N____ T__
b2 actual     : __N____ N____T____ N____   N____ T____ N____ T____ N__


Q.5.8: What are the prediction accuracies for b1 and b2?  

Sol: 
b1: 1/9 = 11%
b2: 3/9 = 33%


Q.5.9: What is the overall prediction accuracy?   

Sol:                                           
Overall: 4/18 = 22%



Q.5.10: Assume a two-level branch prediction scheme is used. In addition to the one-bit predictor, a one bit global register (g) is used. Register g stores the direction of the last branch executed (which may not be the same branch as the branch currently being predicted) and is used to index into two separate one-bit branch history tables (BHTs) as shown below. Depending on the value of g, one of the two BHTs is selected and used to do the normal one-bit prediction. Again, fill in the predicted and actual branch directions of b1 and b2 for nine iterations of the loop. Assume the initial value of g = 0, i.e., NT. For each prediction, depending on the current value of g, only one of the two BHTs is accessed and updated. Hence, some of the entries below should be empty. 

Note: assume that predictor entries are updated by each dynamic branch before the next dynamic branch accesses the predictor (i.e. there is no update delay). 

Sol:

                          8        9       10        11      12      20       29       30       31

For g=0
b1 predicted:  __ N____ T____N___ ___ __  T____ T___ __ __     T ______
b1 actual      :  __ T____ N____T____ N____ T____ T____ N____ T____ N__
b2 predicted:  ____  __  N______ __  N____ __ ___ _ __   N____ __ __   N__
b2 actual      :  __ N____ N____T____ N____ N____ T____ N____ T____ N__

For g=1
b1 predicted:  ____  ____ ____ __      N______ _ ___ __    N___ ___ __  N__
b1 actual      :  __ T____N____ T____ N____ T____ T____ N____ T____ N__
b2 predicted:  __ N______ __  N______ __   T____ N____ __ __  T____ __
b2 actual      :  __ N____N____ T____ N____N____ T____  N____ T____ N__ 


Q.5.11: What are the prediction accuracies for b1 and b2? 

Sol:
b1 : 6/9 = 67%
 b2 : 6/9 = 67%

Q.5.12: What is the overall prediction accuracy?  

Sol:                                             
Overall: 67%


Q.5.13: What is the prediction accuracy of b2 when g = 0? Explain why. 


Sol: 100%. Whenever b1 is not taken (i.e. g=0), the number being checked is odd (not even). It follows that the number is also not evenly divisible by ten. Hence, in these cases, b2 is always not taken and the predictor is able to predict b2 with high accuracy in this global context.







Next Topic:
Q.5.14: Below is the control flow graph of a simple program. The CFG is annotated with three different execution trace paths. For each execution trace circle which branch predictor (bimodal, local, or Gselect) will best predict the branching behavior of the given trace. More than one predictor may perform equally well on a particular trace. However, you are to use each of the three predictors exactly once in choosing the best predictors for the three traces. Circle your choice for each of the three traces and add. (Assume each trace is executed many times and every node in the CFG is a conditional branch. The branch history register for the local, global, and Gselect predictors is limited to 4 bits.)  



Previous Topic:
Q.5.1: The displayed code that follows steps through the elements of two arrays (A[] and B[]) concurrently, and for each element, it puts the larger of the two values into the corresponding element of a third array (C[]). The three arrays are of length N.
The instruction set used for Problems 5.1 through 5.6 is as follows:


Identify the basic blocks of this benchmark code by listing the static instructions belonging to each basic block in the following table. Number the basic blocks based on the lexical ordering of the code.
Note: There may be more boxes than there are basic blocks.
Q.5.2: Draw the control flow graph for this benchmark.

No comments:

Post a Comment