ASSIGNMENT – 4

Q1: Suppose we have a deeply pipelined processor for which we implement a branch target buffer for conditional branches only. Assume that the misprediction penalty is always four cycles and the buffer miss penalty is always three cycles. Assume a 90% hit rate, 90% prediction accuracy, and a 15% branch frequency. How much faster is the processor with the branch target buffer than a processor that has a fixed two-cycle branch penalty? Assume a base CPI of 1 without branch stalls.
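One possible way to set up the Q1 arithmetic is sketched below in Python. It rests on assumptions the question does not state: a buffer hit that is predicted correctly costs no stall cycles, every buffer miss costs the full miss penalty whether or not the branch is taken, and both processors run at the same clock rate, so the speedup is the ratio of CPIs. The function names `cpi_with_btb` and `cpi_fixed_penalty` are made up for this illustration.

```python
def cpi_with_btb(base_cpi, branch_freq, hit_rate, accuracy,
                 mispredict_penalty, miss_penalty):
    """CPI with a branch target buffer, under the assumptions stated above."""
    # Average stall cycles per branch: either a buffer hit that is
    # mispredicted, or a buffer miss. Correctly predicted hits are
    # assumed to cost nothing.
    stalls_per_branch = (hit_rate * (1.0 - accuracy) * mispredict_penalty
                         + (1.0 - hit_rate) * miss_penalty)
    return base_cpi + branch_freq * stalls_per_branch


def cpi_fixed_penalty(base_cpi, branch_freq, penalty):
    """CPI when every branch pays the same fixed penalty."""
    return base_cpi + branch_freq * penalty


# Values taken from the Q1 statement.
cpi_btb = cpi_with_btb(base_cpi=1.0, branch_freq=0.15, hit_rate=0.90,
                       accuracy=0.90, mispredict_penalty=4, miss_penalty=3)
cpi_fixed = cpi_fixed_penalty(base_cpi=1.0, branch_freq=0.15, penalty=2)

# Same clock rate assumed, so speedup is simply the CPI ratio.
speedup = cpi_fixed / cpi_btb

print(f"CPI with BTB:      {cpi_btb:.3f}")
print(f"CPI fixed penalty: {cpi_fixed:.3f}")
print(f"Speedup:           {speedup:.2f}")
```

Under these assumptions the ratio comes out to roughly 1.18. A different reading of the miss penalty (for example, charging it only for taken branches) would change the result, so the interpretation used should be stated in the answer.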

Q2: For the following loop, assume that R5 is initialized to 5 before the beginning of the loop. R0 is a special register whose value is always 0. How many cycles does each loop iteration take if dynamic scheduling is not used? How many with dynamic scheduling? Assume that all operations complete in a single execute cycle except ADD, which takes 4 cycles in execute. For the first two iterations of the loop, show the status of the instructions as in Figs. 2.20 and 2.21 in 4E (or Figs. 3.19 and 3.20 in 5E), first using dynamic scheduling only and then using speculation.

Loop: ADD    R4, R2, R3
      SW     R4, 0(R1)
      DADDIU R1, R1, #4
      ADD    R2, R3, R0
      ADD    R3, R4, R0
      DADDIU R5, R5, #-1
      BNEZ   R5, Loop
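As a starting point for the Q2 cycle counts, it may help to list the read-after-write (RAW) dependences inside one iteration first. The Python sketch below is illustrative only: the `LOOP` table is a hand-written encoding of the seven instructions, and `raw_dependences` is a hypothetical helper, not part of any simulator. It considers a single iteration and ignores R0, which is always 0.

```python
# Each entry: (instruction text, destination register or None, source registers).
# Hand-encoded from the loop in Q2; SW and BNEZ write no register.
LOOP = [
    ("ADD R4, R2, R3",     "R4", ["R2", "R3"]),
    ("SW R4, 0(R1)",       None, ["R4", "R1"]),
    ("DADDIU R1, R1, #4",  "R1", ["R1"]),
    ("ADD R2, R3, R0",     "R2", ["R3", "R0"]),
    ("ADD R3, R4, R0",     "R3", ["R4", "R0"]),
    ("DADDIU R5, R5, #-1", "R5", ["R5"]),
    ("BNEZ R5, Loop",      None, ["R5"]),
]


def raw_dependences(instrs):
    """Return (producer_index, consumer_index, register) triples
    for RAW dependences within a single pass over instrs."""
    last_writer = {}  # register -> index of the most recent writer
    deps = []
    for j, (_, dest, srcs) in enumerate(instrs):
        # Sources are read before the destination is written,
        # so an instruction never depends on itself here.
        for reg in srcs:
            if reg != "R0" and reg in last_writer:
                deps.append((last_writer[reg], j, reg))
        if dest is not None:
            last_writer[dest] = j
    return deps


deps = raw_dependences(LOOP)
for i, j, reg in deps:
    print(f"{LOOP[j][0]!r} reads {reg} written by {LOOP[i][0]!r}")
```

Within one iteration this reports that SW and the second ADD into R3 both wait on R4 from the first ADD, and that BNEZ waits on R5 from the preceding DADDIU. Dependences carried across iterations (R1, R2, R3, and R5 feeding the next pass) are not captured by this single-pass sketch and need to be considered separately when filling in the status tables.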

Q3: Do the following exercises from the Thread-Level Parallelism chapter (4E): 4.1, 4.16, and 4.17.