ASSIGNMENT – 2
Given the following code fragment:
Loop: L.D F0, 0(R1)
MUL.D F3, F0, F2
DIV.D F4, F0, F5
ADD.D F6, F3, F7
DADDUI R1, R1, #-8
BNE R1, R2, Loop
Assume that the branch is taken 10 times (i.e., R1 does not equal R2 until 10 iterations are done).
Assume that MUL.D takes 6 execute cycles, DIV.D takes 10 cycles, ADD.D takes 2 cycles, and DADDUI takes 1 cycle, and that the branch has a delay of 2 slots. Assume that there are no structural hazards, i.e., there are as many functional units as required, so an instruction starts executing as soon as its operands are available. Multiple instructions may begin execution in the same cycle, but at most one instruction can be issued per cycle.
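The timing assumptions above can be sketched as a small scheduling model. This is only an illustrative sketch, not a worked solution: the L.D latency of 2 cycles is an assumption (the statement gives none), as are the conventions that execution begins the cycle after both issue and the last operand write, and that a result is written the cycle after execution ends. The function name `schedule` and the table layout are hypothetical.

```python
# Execute latencies from the problem statement.
# ASSUMPTION: the L.D latency (2) is not given in the statement.
LATENCY = {"L.D": 2, "MUL.D": 6, "DIV.D": 10, "ADD.D": 2, "DADDUI": 1}

# (opcode, destination register, source registers) for one loop iteration.
# The branch is left out; only the data-dependent instructions are modeled.
PROGRAM = [
    ("L.D", "F0", ["R1"]),
    ("MUL.D", "F3", ["F0", "F2"]),
    ("DIV.D", "F4", ["F0", "F5"]),
    ("ADD.D", "F6", ["F3", "F7"]),
    ("DADDUI", "R1", ["R1"]),
]


def schedule(program, latency):
    """Return (op, issue, exec_start, exec_end, write) for each instruction.

    Model: one issue per cycle starting at cycle 1, no structural hazards,
    operands captured at issue (so later writers do not delay earlier readers).
    """
    ready = {}  # register -> cycle in which its pending result is written
    rows = []
    for i, (op, dest, srcs) in enumerate(program):
        issue = i + 1
        # Registers with no pending producer count as ready at cycle 0.
        operands_ready = max((ready.get(r, 0) for r in srcs), default=0)
        start = max(issue, operands_ready) + 1
        end = start + latency[op] - 1
        write = end + 1
        ready[dest] = write
        rows.append((op, issue, start, end, write))
    return rows


if __name__ == "__main__":
    for row in schedule(PROGRAM, LATENCY):
        print("%-7s issue=%2d exec=%2d..%2d write=%2d" % row)
```

Changing the assumed L.D latency or the start/write conventions shifts every dependent instruction, so the table is only as good as those assumptions.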
Q1: Show the state of the processor at the point where DIV.D has finished execution and is ready to write its result to the CDB, for both dynamic scheduling and speculation. Draw the state in the style of figures 3.10 and 3.13 (5E).
Q2: For this question, assume that the branch instruction computes the branch target after one execution cycle and evaluates the branch condition after 3 execution cycles, so the total branch delay is 4 cycles (1 ID + 3 EX). Assuming a simple pipelined processor with no branch prediction, dynamic scheduling, or speculation, what is the total number of cycles needed to execute the above loop 3 times? Show the basic pipeline timeline for one iteration. Compute the same total if the processor speculates with branch prediction that is 100% accurate. What is the total if the speculative processor also uses a branch target buffer with a 100% hit ratio? Repeat the branch target buffer case assuming that the branch PC is not found in the BTB the first time through the loop.
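The bookkeeping behind the four cases in Q2 can be sketched as a single formula in which only the branch penalty terms change. This is a hedged sketch under a simplified model, not the expected answer: the function `total_cycles` and all of its parameter names are hypothetical, and the per-iteration cycle count of 10 used below is a placeholder rather than a value derived from the loop. Only the 4-cycle branch delay comes from the question.

```python
def total_cycles(iterations, cycles_per_iteration, branch_penalty,
                 one_time_penalty=0, pipeline_fill=0):
    """Total cycles for a loop under a simple additive stall model.

    ASSUMPTION: every iteration pays the same branch_penalty, and
    one_time_penalty models a cost paid once (for example, a first-time
    BTB miss). pipeline_fill covers start-up cycles, if counted.
    """
    per_iteration = cycles_per_iteration + branch_penalty
    return pipeline_fill + one_time_penalty + iterations * per_iteration


if __name__ == "__main__":
    BASE = 10  # placeholder per-iteration count, NOT derived from the loop
    # No prediction: the full 4-cycle branch delay on every iteration.
    print(total_cycles(3, BASE, branch_penalty=4))
    # Penalty paid only once, e.g. a single BTB miss on the first pass.
    print(total_cycles(3, BASE, branch_penalty=0, one_time_penalty=4))
```

The right penalty value for each prediction scheme is part of what the question asks, so the sketch leaves those as inputs.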