
Lecture 07

3 Static Scheduling

IF->ID->EX->ME->WB
     |->FP1->FP2->FP3->WB
for (i = 999; i >= 0; i--) x[i] = x[i] + y;
// &x[999] in R1
// &x[0] in R2
// y in F0
//                              stalls
Loop:   L.D     F2, (R1)          0
        ADD.D   F2, F2, F0        1
        S.D     F2, (R1)          1
        ADDI    R1, R1, #-8       0
        BGE     R1, R2, Loop      1
        NOP                       0
// .D -> double-precision floating point
07-01


e.g. \(CPI_{loop1}=\frac{6\ \text{instr}\times 1\ \text{CPI}+3\ \text{stalls}}{6\ \text{instr}}=1.5\)
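
The CPI figure is issue cycles plus stall cycles, divided by the instruction count. A minimal C sketch of that arithmetic, assuming every instruction has the same ideal CPI; the function name `loop_cpi` is hypothetical and introduced only for illustration:

```c
#include <assert.h>

/* Effective CPI of a loop body:
 * (instructions * ideal CPI + stall cycles) / instructions.
 * Assumes every instruction has the same ideal CPI. */
double loop_cpi(int instructions, double ideal_cpi, int stall_cycles)
{
    assert(instructions > 0);
    return (instructions * ideal_cpi + stall_cycles) / instructions;
}
```

With the stall column of the first loop (6 instructions, 3 stalls), `loop_cpi(6, 1.0, 3)` evaluates to 1.5.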

4 Local Scheduling

Loop2:  L.D     F2, (R1)
        ADDI    R1, R1, #-8
        ADD.D   F2, F2, F0
        BGE     R1, R2, Loop2
        S.D     F2, 8(R1)
// 0 stalls
07-02


\(CPI_{loop2}=\frac{5\times 1+0}{5}=1\), \(S_{\frac{loop2}{loop1}}=\frac{t_{loop1}}{t_{loop2}}=\frac{6\times 1000\times 1.5}{5\times 1000\times 1}=1.8\)
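
The speedup compares total cycle counts, each computed as instructions per iteration times iterations times CPI. A small C sketch of that calculation, assuming both versions run at the same clock rate so the cycle time cancels; `loop_cycles` and `speedup` are hypothetical names used only here:

```c
#include <assert.h>

/* Execution time in cycles:
 * instructions per iteration * iterations * CPI. */
double loop_cycles(int instr_per_iter, int iterations, double cpi)
{
    assert(instr_per_iter > 0 && iterations > 0 && cpi > 0.0);
    return (double)instr_per_iter * iterations * cpi;
}

/* Speedup of the new version over the old one: t_old / t_new.
 * Assumes the same clock period for both, so cycles stand in for time. */
double speedup(double t_old, double t_new)
{
    assert(t_new > 0.0);
    return t_old / t_new;
}
```

For the two loops so far, `speedup(loop_cycles(6, 1000, 1.5), loop_cycles(5, 1000, 1.0))` is 9000 / 5000 = 1.8.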

5 Global Scheduling

Scheduling across multiple basic blocks

5.1 Loop Unrolling

07-03


\(CPI_{loop3}=\frac{8\times 1+0}{8}=1\), \(S_{\frac{loop3}{loop1}}=\frac{t_{loop1}}{t_{loop3}}=\frac{6\times 1000\times 1.5}{8\times 500\times 1}=2.25\)
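
The numbers here (8 instructions per iteration, 500 iterations) are consistent with unrolling the loop body twice. A minimal source-level C sketch under that assumption; the unroll factor, the function names, and the instruction schedule in the comment are illustrative guesses, not taken from the lecture:

```c
#include <assert.h>

#define N 1000  /* element count from the lecture's loop; even, so no cleanup loop */

/* Original loop: one element per iteration, 1000 iterations. */
void add_scalar(double x[N], double y)
{
    for (int i = N - 1; i >= 0; i--)
        x[i] = x[i] + y;
}

/* Unrolled by 2 (assumed factor): two elements per iteration, 500 iterations.
 * The two statements are independent, so their loads, adds, and stores can be
 * interleaved, and the index update and branch are paid once per pair.
 *
 * One possible 8-instruction schedule with no stalls under the same stall
 * rules as the earlier listings (an assumption, not the lecture's figure):
 *
 *   Loop3:  L.D     F2, (R1)
 *           L.D     F4, -8(R1)
 *           ADD.D   F2, F2, F0
 *           ADD.D   F4, F4, F0
 *           ADDI    R1, R1, #-16
 *           S.D     F2, 16(R1)
 *           BGE     R1, R2, Loop3
 *           S.D     F4, 8(R1)        // branch delay slot
 */
void add_scalar_unrolled2(double x[N], double y)
{
    assert(N % 2 == 0);
    for (int i = N - 1; i >= 1; i -= 2) {
        x[i]     = x[i]     + y;
        x[i - 1] = x[i - 1] + y;
    }
}
```

Both functions leave the array in the same state. The unrolled version executes half as many branches and index updates, which is where the drop from 5 × 1000 to 8 × 500 instructions comes from.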
