Latency: response time, execution time, elapsed time
Throughput: amount of work done per unit time
Ideally, for good performance we want latency minimized and throughput maximized.
CPU time
CPU time is the time the CPU spends executing a program. It is the product of three components: $$ \text{CPU time} = \frac{\text{instructions}}{\text{program}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{seconds}}{\text{cycle}} $$
instruction count = instructions / program
CPI = average cycles / instruction
clock period = seconds / cycle (clock frequency = 1 / clock period)
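As a quick check of the formula, here is a minimal Python sketch that multiplies the three components. The function name `cpu_time` and the workload numbers are hypothetical, chosen only for illustration.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Return CPU time in seconds.

    CPU time = instructions/program * cycles/instruction * seconds/cycle.
    """
    clock_period = 1.0 / clock_rate_hz  # seconds per cycle
    return instruction_count * cpi * clock_period


# Hypothetical workload: 2 billion instructions, average CPI of 1.5, 2 GHz clock.
seconds = cpu_time(2_000_000_000, 1.5, 2_000_000_000)
print(f"CPU time: {seconds:.3f} s")
```

Halving any one of the three factors halves the CPU time, which is why instruction count, CPI and clock period are the three knobs for performance.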
Amdahl’s law
Suppose that an enhancement (E) accelerates a fraction (F) of a task by a factor (S), and the remainder of the task is unaffected. Then:
$$ \text{overall speedup} = \frac{1}{(1-F) + \frac{F}{S}} $$
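The formula is easy to evaluate numerically. The sketch below is a minimal illustration; the function name `amdahl_speedup` and the sample values of F and S are assumptions made for this example.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the task is sped up by `factor`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical case: 80% of the task is accelerated 4x.
speedup = amdahl_speedup(0.8, 4)
# Even with an enormous factor, the unaffected 20% caps the speedup near 1 / (1 - F).
limit = amdahl_speedup(0.8, 1e12)
print(f"speedup = {speedup:.2f}, upper bound = {limit:.2f}")
```

The second call shows the main lesson of the law: the part of the task that is not enhanced limits the overall speedup.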
Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment?
Suppose register $s0 has the binary number $$ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111_{two}$$ and that register $s1 has the binary number $$ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001_{two}$$ What are the values of registers $t0 and $t1 after these two instructions?
    slt $t0, $s0, $s1
    sltu $t1, $s0, $s1
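The key point is that $s0 represents $-1_{ten}$ if it is a signed integer and $2^{32} - 1_{ten}$ (4,294,967,295) if it is an unsigned integer, while the value in $s1 represents $1_{ten}$ in either case. So slt sets $t0 to 1, because $-1_{ten} < 1_{ten}$, and sltu sets $t1 to 0, because $2^{32} - 1_{ten} > 1_{ten}$.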
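The difference between the two comparisons can be checked with a small Python model. This is only a sketch of the comparison semantics, not a full MIPS simulator; the helper names `to_signed32`, `slt` and `sltu` are chosen for this example.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def slt(rs, rt):
    """Set on less than, treating both operands as signed."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs, rt):
    """Set on less than, treating both operands as unsigned."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


s0 = 0xFFFFFFFF  # all ones: -1 signed, 2**32 - 1 unsigned
s1 = 0x00000001
t0 = slt(s0, s1)
t1 = sltu(s0, s1)
print(t0, t1)  # prints "1 0"
```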
Jump operations in MIPS
| Instruction | Syntax | Address Source | Saves Return Address | Operation | Common Use |
|---|---|---|---|---|---|
| J | j target | 26-bit immediate | No | PC = (PC+4)[31:28] \|\| target \|\| 00 | Unconditional jumps, loops |
| JAL | jal target | 26-bit immediate | Yes (in $ra) | $ra = PC + 4; PC = (PC+4)[31:28] \|\| target \|\| 00 | Function calls |
| JR | jr $rs | Register content | No | PC = $rs | Function returns, indirect jumps |
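The address formation used by j and jal, where || means bit concatenation, can be sketched in Python as follows. The function names and the example addresses are hypothetical. The return address follows the PC + 4 convention used in these notes.

```python
def jump_target(pc, target26):
    """Form the 32-bit address: (PC+4)[31:28] || target || 00."""
    upper4 = (pc + 4) & 0xF0000000               # top 4 bits of PC + 4
    return upper4 | ((target26 & 0x03FFFFFF) << 2)  # 26-bit field, word aligned


def jal(pc, target26):
    """Return (return address saved in $ra, new PC) for jal."""
    return pc + 4, jump_target(pc, target26)


# Hypothetical example: a jal at 0x00400018 whose 26-bit field is 0x0100000.
ra, new_pc = jal(0x00400018, 0x0100000)
print(hex(ra), hex(new_pc))  # prints "0x40001c 0x400000"
```

Because only 26 bits come from the instruction, j and jal can only reach addresses that share the top 4 bits of PC + 4. jr has no such limit, since the whole address comes from a register.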
The jump register instruction jumps to the address stored in register $ra which is just what we want. Thus, the calling program, or caller, puts the parameter values in $a0–$a3 and uses jal X to jump to procedure X (sometimes named the callee). The callee then performs the calculations, places the results in $v0 and $v1, and returns control to the caller using jr $ra.
Using more registers
What if a procedure needs more registers than the four argument registers and the two return-value registers? In this situation we need to spill registers to memory.
Here we use a data structure called a stack. A stack needs a pointer to the most recently allocated address in the stack, to show where the next procedure should place the registers to be spilled or where old register values are found. By historical precedent, stacks "grow" from higher addresses to lower addresses.
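A downward-growing stack can be modelled in a few lines of Python. This is a toy sketch: the class name `SpillStack`, the starting address 0x7FFFFFFC and the use of a dictionary as memory are assumptions made for illustration.

```python
class SpillStack:
    """Toy model of a stack that grows from higher to lower addresses."""

    WORD_BYTES = 4

    def __init__(self, top=0x7FFFFFFC):  # assumed initial stack pointer
        self.sp = top       # address of the most recently allocated word
        self.memory = {}    # address -> saved register value

    def push(self, value):
        """Allocate one word by moving sp down, then store the value."""
        self.sp -= self.WORD_BYTES
        self.memory[self.sp] = value

    def pop(self):
        """Load the most recent value, then release the word by moving sp up."""
        value = self.memory.pop(self.sp)
        self.sp += self.WORD_BYTES
        return value


stack = SpillStack()
start = stack.sp
for saved in (11, 22, 33):   # hypothetical register contents to spill
    stack.push(saved)
lowest = stack.sp            # 12 bytes below the starting address
restored = [stack.pop() for _ in range(3)]
print(hex(lowest), restored)
```

Values come back in the reverse of the order in which they were spilled, and the stack pointer ends up where it started.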
Compiling a C Procedure That Doesn’t Call Another Procedure:
    int leaf_example(int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }
What is the assembly language statement corresponding to this machine instruction? 00af8020hex
The first step is to convert the hexadecimal value to binary so that the fields can be read off:
    0    0    a    f    8    0    2    0
    0000 0000 1010 1111 1000 0000 0010 0000
op (bits 31:26) is 000000
rs (bits 25:21) is 00101
rt (bits 20:16) is 01111
rd (bits 15:11) is 10000
shamt (bits 10:6) is 00000
funct (bits 5:0) is 100000
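An op of 000000 with a funct of 100000 is add, and registers 5, 15 and 16 are $a1, $t7 and $s0. Since the assembly operand order is rd, rs, rt, the instruction is add $s0, $a1, $t7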
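The same field extraction can be done with shifts and masks. The Python sketch below decodes only R-format instructions and knows only a handful of funct codes; the function name `decode_r_type` and the partial `FUNCT_NAMES` table are choices made for this example.

```python
REGISTER_NAMES = [
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra",
]

# Partial funct table, enough for this example.
FUNCT_NAMES = {0x20: "add", 0x21: "addu", 0x22: "sub", 0x23: "subu",
               0x24: "and", 0x25: "or", 0x2A: "slt", 0x2B: "sltu"}


def decode_r_type(word):
    """Decode a 32-bit R-format word into 'op rd, rs, rt'."""
    op = (word >> 26) & 0x3F     # bits 31:26
    rs = (word >> 21) & 0x1F     # bits 25:21
    rt = (word >> 16) & 0x1F     # bits 20:16
    rd = (word >> 11) & 0x1F     # bits 15:11
    funct = word & 0x3F          # bits 5:0
    if op != 0:
        raise ValueError("not an R-format instruction")
    return (f"{FUNCT_NAMES[funct]} {REGISTER_NAMES[rd]}, "
            f"{REGISTER_NAMES[rs]}, {REGISTER_NAMES[rt]}")


decoded = decode_r_type(0x00AF8020)
print(decoded)  # prints "add $s0, $a1, $t7"
```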
I. What is the range of addresses for conditional branches in MIPS (K = 1024)?
Addresses between 0 and 64K - 1
Addresses between 0 and 256K - 1
Addresses up to about 32K before the branch to about 32K after
Addresses up to about 128K before the branch to about 128K after
Conditional branches use the I-type format, which has a 16-bit immediate field for the branch offset. The offset is signed, and it counts words, not bytes. A 16-bit field covers $2^{16}$ word offsets in total, so the branch can reach about $2^{15}$ words, which is $2^{17}$ bytes or 128 KB, before or after the branch. The answer is therefore the last option.
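The arithmetic can be verified with a short Python calculation. The constant names are chosen for this example. The offsets are measured relative to the address the hardware adds them to (PC + 4 in these notes).

```python
IMM_BITS = 16      # width of the I-type immediate field
WORD_BYTES = 4     # the offset counts 4-byte words
K = 1024

min_words = -(1 << (IMM_BITS - 1))       # most negative signed 16-bit value
max_words = (1 << (IMM_BITS - 1)) - 1    # most positive signed 16-bit value

min_bytes = min_words * WORD_BYTES
max_bytes = max_words * WORD_BYTES

print(f"words: {min_words} .. {max_words}")
print(f"bytes: {min_bytes} .. {max_bytes}")
print(f"about {abs(min_bytes) // K}K bytes back, {max_bytes / K:.1f}K bytes forward")
```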
Analyse instruction set to determine the datapath requirements
Select a set of hardware components for the datapath and establish clocking methodology
Assemble datapath to meet the requirements
For control
Analyse implementation of each instruction to determine control points/signals on the datapath
Assemble the control logic
step 1
For each instruction, the requirements can be identified as a set of register transfer operations.
Register transfer language (RTL) is used to describe instruction execution, e.g.
    $3 <- $2 + $1
is the RTL for ADDU $3, $2, $1.
The datapath must support every transfer operation.
All instructions start by fetching the instruction.
This is followed by instruction-specific operations. Every instruction also updates the PC, and unless a branch or jump is taken the update is
    PC <- PC + 4
e.g.
    BEQ: if (R[rs] == R[rt]) then PC <- PC + 4 + (sign_ext(Imm16) || 00)
         else PC <- PC + 4
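The BEQ transfer above can be written as a small Python function. It models only the next-PC computation, and the names `sign_extend16` and `beq_next_pc` are chosen for this example. Appending 00 to the sign-extended immediate is the same as shifting it left by 2.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit field to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def beq_next_pc(pc, rs_value, rt_value, imm16):
    """Return the next PC for BEQ, following the RTL description."""
    if rs_value == rt_value:
        # Taken: PC <- PC + 4 + (sign_ext(Imm16) || 00)
        return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    # Not taken: PC <- PC + 4
    return (pc + 4) & 0xFFFFFFFF


# Hypothetical branch at 0x00400010 with an offset of -2 words (0xFFFE).
taken = beq_next_pc(0x00400010, 7, 7, 0xFFFE)
not_taken = beq_next_pc(0x00400010, 7, 8, 0xFFFE)
print(hex(taken), hex(not_taken))  # prints "0x40000c 0x400014"
```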
step 2
For combinational logic elements, we need an adder, a MUX, and an ALU.
For storage, we need registers (each with a 32-bit data input, a 32-bit data output, and a write-enable input) and memory (with a 32-bit data input and a 32-bit data output).
For simplicity and robustness, we use edge-triggered clocking: all storage elements are clocked by the same clock edge.
Three timing issues need to be considered: clock skew, setup time, and hold time.