百川赴海

Performance

Posted on 2025-06-10 Edited on 2025-06-16 In 巴别塔(babel-tower) , 千机图(computer-architecture)

Common Performance Metrics

Latency: response time, execution time, elapsed time
Throughput: amount of work done per unit time
Ideally, for good performance we want latency minimized and throughput maximized.

CPU time

CPU time is the time the CPU spends executing a program.
CPU time is the product of three components:
$$
\text{CPU time} = \frac{\text{instructions}}{\text{program}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{seconds}}{\text{cycle}}
$$

Instruction count = instructions / program
CPI = average cycles / instruction
Clock period = seconds / cycle (clock frequency = 1 / clock period)
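
As a rough illustration of how the three factors combine, here is a minimal Python sketch. The function name `cpu_time` and the sample numbers (one billion instructions, a CPI of 2, a 2 GHz clock) are assumptions chosen for the example, not figures from any real machine.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Return CPU time in seconds.

    instructions x (cycles / instruction) x (seconds / cycle),
    where seconds / cycle is written as 1 / clock_rate_hz.
    """
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 1e9 instructions, average CPI of 2, 2 GHz clock.
t = cpu_time(1_000_000_000, 2.0, 2_000_000_000)
print(t)  # 1.0
```

Halving any one of the three factors while holding the others fixed halves the CPU time, which is why the formula is useful for comparing design trade-offs.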

Amdahl’s law

Suppose that an enhancement E accelerates a fraction F of a task by a factor S, and the remainder of the task is unaffected. Then:

$$
\text{overall speedup} = \frac{1}{(1 - F) + \frac{F}{S}}
$$
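
To get a feel for the formula, the following sketch evaluates it for a hypothetical case where 80% of a task is sped up. The function name `overall_speedup` and the sample values are assumptions for illustration only.

```python
def overall_speedup(f, s):
    """Amdahl's law: fraction f of the task is accelerated by factor s."""
    return 1.0 / ((1.0 - f) + f / s)


# Hypothetical enhancement: 80% of the task runs 4x faster.
print(overall_speedup(0.8, 4))      # roughly 2.5

# Even with an enormous factor, the untouched 20% limits the speedup to about 5.
print(overall_speedup(0.8, 1e12))
```

The second call shows the practical lesson of the law: the unaffected fraction bounds the overall speedup at 1 / (1 - F), however large S becomes.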

Intro to Java and OOP

Posted on 2025-06-10 Edited on 2025-06-13 In 巴别塔(babel-tower) , 码术苑(software-design-architecture)

Object-oriented programming (OOP)

In procedural programming languages (like C), programming tends to be action-oriented, whereas in Java programming is object-oriented.

In OOP, programmers define their own types, called classes.

Each class contains data and the set of methods that manipulate that data.

An instance of a user-defined type (i.e. of a class) is called an object.

Inheritance

Inheritance relations form a tree-like hierarchical structure
(e.g. postgraduate student is a subclass of student).

In an “is-a” relationship, an object of a subclass may also be treated as an object
of its superclass.

In a “has-a” relationship, an object of one class holds an object of another class to store its
state or do its work. It “has a” reference to that other object.
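
The two relationships can be sketched in a few lines. The example below is written in Python rather than Java, and the class names (`Student`, `PostgraduateStudent`, `Supervisor`) are hypothetical, chosen only to mirror the student example above.

```python
class Supervisor:
    def __init__(self, name):
        self._name = name  # leading underscore: private by convention

    def name(self):
        return self._name


class Student:
    def __init__(self, name):
        self._name = name

    def describe(self):
        return f"{self._name} is a student"


class PostgraduateStudent(Student):  # "is-a": a postgraduate is a student
    def __init__(self, name, supervisor):
        super().__init__(name)
        self._supervisor = supervisor  # "has-a": reference to another object

    def supervisor_name(self):
        return self._supervisor.name()


pg = PostgraduateStudent("Ada", Supervisor("Grace"))
print(isinstance(pg, Student))  # True: the subclass object can be used as a Student
print(pg.describe())            # inherited behaviour
print(pg.supervisor_name())     # work delegated to the object it holds
```

In Java the same shape would use `extends` for the is-a link and a private field for the has-a link.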

Designing a Class

Always try to keep data private
Creating an object may require different actions such as initialization
Break up classes that have too many responsibilities

A Taste of the MIPS ISA

Posted on 2025-06-08 In 巴别塔(babel-tower) , 千机图(computer-architecture)

MIPS interpretation

Here is a traditional loop in C:

while (save[i] == k)
    i += 1;

Assume that i and k correspond to registers $s3 and $s5 and that the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment?

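Loop: sll  $t1, $s3, 2      # $t1 = i * 4 (byte offset of save[i])
      add  $t1, $t1, $s6    # $t1 = address of save[i]
      lw   $t0, 0($t1)      # $t0 = save[i]
      bne  $t0, $s5, Exit   # leave the loop if save[i] != k
      addi $s3, $s3, 1      # i = i + 1
      j    Loop             # go back to the test
Exit: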
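
To check the address arithmetic, the loop can be mirrored step by step in Python. This is only a sketch: the function name `run_loop`, the base address `0x1000`, and the sample array are assumptions, and it presumes the array contains at least one element different from k so the loop terminates.

```python
def run_loop(save, i, k, base=0x1000):
    """Mirror the MIPS loop; return the final i and the byte addresses loaded."""
    addresses = []
    while True:
        offset = i << 2           # sll  $t1, $s3, 2   (i * 4)
        address = offset + base   # add  $t1, $t1, $s6
        addresses.append(address)
        value = save[i]           # lw   $t0, 0($t1)
        if value != k:            # bne  $t0, $s5, Exit
            break
        i += 1                    # addi $s3, $s3, 1 ; j Loop
    return i, addresses


final_i, addrs = run_loop([7, 7, 7, 3], i=0, k=7)
print(final_i, [hex(a) for a in addrs])
```

Each iteration loads from an address 4 bytes higher than the previous one, which is why the index is shifted left by 2 before being added to the base.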

Signed Versus Unsigned Comparison

Suppose register $s0 has the binary number
$$ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111\ 1111_{two}$$
and that register $s1 has the binary number
$$ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001_{two}$$
What are the values of registers $t0 and $t1 after these two instructions?

slt  $t0, $s0, $s1   # signed comparison
sltu $t1, $s0, $s1   # unsigned comparison

The key point is that $s0 represents $-1_{ten}$ if it is a signed integer and $2^{32} - 1_{ten}$ if it is an unsigned integer, while the value in $s1 represents $1_{ten}$ in either case. So slt sets $t0 to 1 (as signed numbers, -1 < 1), and sltu sets $t1 to 0 (as unsigned numbers, 4,294,967,295 > 1).
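
The difference between the two comparisons can be reproduced with a small Python sketch that reinterprets the same 32-bit pattern in both ways. The helper names (`to_signed32`, `slt`, `sltu`) are assumptions for illustration; they model only the comparison, not the full instructions.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """Set on less than, signed."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs, rt):
    """Set on less than, unsigned."""
    return 1 if (rs & 0xFFFFFFFF) < (rt & 0xFFFFFFFF) else 0


s0, s1 = 0xFFFFFFFF, 0x00000001
t0 = slt(s0, s1)
t1 = sltu(s0, s1)
print(t0, t1)  # 1 0
```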

Jump operations in MIPS

| Instruction | Syntax | Address Source | Saves Return Address | Operation | Common Use |
| --- | --- | --- | --- | --- | --- |
| J | `j target` | 26-bit immediate | No | PC = (PC+4)[31:28] \|\| target \|\| 00 | Unconditional jumps, loops |
| JAL | `jal target` | 26-bit immediate | Yes (in $ra) | $ra = PC + 4; PC = (PC+4)[31:28] \|\| target \|\| 00 | Function calls |
| JR | `jr $rs` | Register content | No | PC = $rs | Function returns, indirect jumps |

The jump register instruction jumps to the address stored in register $ra which is just what we want. Thus, the calling program, or caller, puts the parameter values in $a0–$a3 and uses jal X to jump to procedure X (sometimes named the callee). The callee then performs the calculations, places the results in $v0 and $v1, and returns control to the caller using jr $ra.
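
The address formation for j and jal can be sketched as bit operations. The function names and the sample PC and target values below are assumptions for illustration, and the sketch follows the simplified PC + 4 return address used in this post (it ignores branch delay slots).

```python
def jump_target(pc, target26):
    """PC = (PC+4)[31:28] || target || 00."""
    upper = (pc + 4) & 0xF0000000            # top 4 bits of PC + 4
    return upper | ((target26 & 0x03FFFFFF) << 2)


def jal(pc, target26):
    """Return (value saved in $ra, new PC)."""
    return pc + 4, jump_target(pc, target26)


ra, new_pc = jal(0x00400000, 0x0100004)
print(hex(ra), hex(new_pc))  # 0x400004 0x400010
```

A later `jr $ra` would simply set the PC to the saved value, returning to the instruction after the call.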

Using more registers

A natural question: what if a procedure needs more registers than the four argument registers and two return value registers? In this situation we need to spill registers to memory.

Here we use a data structure called a stack. A stack needs a pointer to the most recently allocated address in the stack to show where the next procedure should place the registers to be spilled or where old register values are found. By historical precedent, stacks “grow” from higher addresses to lower addresses.

Compiling a C Procedure That Doesn’t Call Another Procedure:

int leaf_example(int g, int h, int i, int j) 
{  
  int f;  
  f = (g + h) - (i + j);
  return f; 
}

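addi $sp, $sp, -12    # make room on the stack for 3 words
sw   $t1, 8($sp)      # save $t1
sw   $t0, 4($sp)      # save $t0
sw   $s0, 0($sp)      # save $s0

add  $t0, $a0, $a1    # $t0 = g + h
add  $t1, $a2, $a3    # $t1 = i + j
sub  $s0, $t0, $t1    # f = (g + h) - (i + j)

add  $v0, $s0, $zero  # return value: $v0 = f

lw   $s0, 0($sp)      # restore $s0
lw   $t0, 4($sp)      # restore $t0
lw   $t1, 8($sp)      # restore $t1
addi $sp, $sp, 12     # pop the 3 words off the stack
jr   $ra              # return to the caller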
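
The spill and restore pattern can be mimicked with a toy model in Python. Everything here is an assumption made for illustration: the dictionary-based memory, the starting stack pointer `0x7FFFFFFC`, the initial register values, and the helper names `spill` and `restore`.

```python
memory = {}  # byte address -> saved word (a toy memory)
regs = {"$sp": 0x7FFFFFFC, "$s0": 111, "$t0": 222, "$t1": 333}


def spill(names):
    """Grow the stack downward and save the named registers, lowest slot first."""
    regs["$sp"] -= 4 * len(names)
    for slot, name in enumerate(names):
        memory[regs["$sp"] + 4 * slot] = regs[name]


def restore(names):
    """Reload the named registers, then shrink the stack back."""
    for slot, name in enumerate(names):
        regs[name] = memory[regs["$sp"] + 4 * slot]
    regs["$sp"] += 4 * len(names)


def leaf_example(g, h, i, j):
    saved = ["$s0", "$t0", "$t1"]  # $s0 at 0($sp), $t0 at 4($sp), $t1 at 8($sp)
    spill(saved)
    regs["$t0"] = g + h
    regs["$t1"] = i + j
    regs["$s0"] = regs["$t0"] - regs["$t1"]
    result = regs["$s0"]           # stands in for $v0
    restore(saved)
    return result


result = leaf_example(1, 2, 3, 4)
print(result, regs)
```

After the call, the three registers and the stack pointer hold their original values, which is the whole point of spilling: the caller cannot tell that the callee used them.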

Decoding machine code

What is the assembly language statement corresponding to this machine instruction?
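00af8020 (hex)

The first step is to convert the hexadecimal value to binary so that we can find the op field:

hex:    0    0    a    f    8    0    2    0
binary: 0000 0000 1010 1111 1000 0000 0010 0000

The op field (bits 31:26) is 000000, so this is an R-format instruction. The remaining fields are rs = 00101, rt = 01111, rd = 10000, shamt = 00000, and funct = 100000.

Looking these values up in the encoding tables (funct 100000 is add; registers 5, 15, and 16 are $a1, $t7, and $s0), the instruction is
add $s0, $a1, $t7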
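
The field extraction can be double-checked with shifts and masks. The sketch below is a minimal R-format decoder that only knows the `add` funct code; the function names are assumptions, and the register name list follows the standard MIPS numbering.

```python
REG_NAMES = (
    ["$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3"]
    + [f"$t{n}" for n in range(8)]
    + [f"$s{n}" for n in range(8)]
    + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
)


def decode_r_type(word):
    """Split a 32-bit word into the six R-format fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


def disassemble_add(word):
    """Only handles add (op 0, funct 0x20); anything else raises."""
    f = decode_r_type(word)
    if f["op"] != 0 or f["funct"] != 0x20:
        raise ValueError("this sketch only decodes add")
    return f"add {REG_NAMES[f['rd']]}, {REG_NAMES[f['rs']]}, {REG_NAMES[f['rt']]}"


fields = decode_r_type(0x00AF8020)
text = disassemble_add(0x00AF8020)
print(fields)
print(text)  # add $s0, $a1, $t7
```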

I. What is the range of addresses for conditional branches in MIPS (K = 1024)?

Addresses between 0 and 64K - 1
Addresses between 0 and 256K - 1
Addresses up to about 32K before the branch to about 32K after
Addresses up to about 128K before the branch to about 128K after

Conditional branches use I-type format
16-bit immediate field for the branch offset
The offset is signed
The offset counts words, not bytes
Hence the 16-bit field covers $2^{16}$ words in total: about $2^{15}$ words, or $2^{17}$ bytes, before and after the branch,
which is about 128K bytes in each direction (the last option).
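
The arithmetic behind that answer can be written out directly. This is a small sketch with assumed variable names; the offsets are relative to the instruction after the branch (PC + 4).

```python
IMM_BITS = 16
K = 1024

# A signed 16-bit field holds word offsets from -2^15 to 2^15 - 1.
min_words = -(1 << (IMM_BITS - 1))
max_words = (1 << (IMM_BITS - 1)) - 1

# Each word is 4 bytes, so multiply by 4 to get byte offsets.
min_bytes = min_words * 4
max_bytes = max_words * 4

print(min_words, max_words)
print(min_bytes / K, max_bytes / K)  # about -128K and +128K bytes
```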

Single Cycle Processor

Posted on 2025-06-08 In 巴别塔(babel-tower) , 千机图(computer-architecture)

typical steps of processor design

For datapath

Analyse instruction set to determine the datapath requirements
Select a set of hardware components for the datapath and establish clocking methodology
Assemble datapath to meet the requirements

For control

Analyse implementation of each instruction to determine control points/signals on the datapath
Assemble the control logic

step 1

For each instruction, its requirements can be identified as a set of register transfer operations.

Register transfer language (RTL) is used to describe instruction execution, e.g.

$3 <- $2 + $1 is the RTL for ADDU $3, $2, $1

The datapath must support each transfer operation.

All instructions start by fetching the instruction.

This is then followed by operations that differ from instruction to instruction.
Every instruction also updates the program counter; unless a branch or jump is taken, this is

PC <- PC + 4

e.g. for BEQ:

if (R[rs] == R[rt]) then PC <- PC + 4 + (sign_ext(Imm16) || 00)
else PC <- PC + 4
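
The BEQ transfer can be traced with a short Python sketch. The function names and the sample PC and immediate values are assumptions for illustration; appending the two zero bits is modelled as a left shift by 2.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit immediate to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def beq_next_pc(pc, rs_val, rt_val, imm16):
    """Next PC for BEQ, following the RTL above."""
    if rs_val == rt_val:
        return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF


taken = beq_next_pc(0x00400010, 5, 5, 0xFFFD)      # offset of -3 words
not_taken = beq_next_pc(0x00400010, 5, 6, 0xFFFD)
print(hex(taken), hex(not_taken))  # 0x400008 0x400014
```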

step 2

For combinational logic elements, we obviously need an adder, a MUX, and an ALU.

For storage we need registers (with a 32-bit input, a 32-bit output, and a write-enable input) and memory (with 32-bit data in and 32-bit data out).

For simplicity and robustness, we use edge-triggered clocking: all storage elements are clocked by the same clock edge.

Three timing issues need to be considered: clock skew, setup time, and hold time.

Two requirements must hold:
$\text{cycle time} \ge \text{clk to Q} + \text{longest delay path} + \text{setup} + |\text{clock skew}|$
$\text{clk to Q} + \text{shortest delay path} - |\text{clock skew}| > \text{hold time}$
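
The two inequalities can be checked mechanically for a candidate design. The sketch below uses made-up delays in picoseconds; the function name `meets_timing` and all the numbers are assumptions for illustration.

```python
def meets_timing(cycle_time, clk_to_q, longest_path, shortest_path,
                 setup, hold, skew):
    """Return (setup_ok, hold_ok) for the two clocking requirements."""
    skew = abs(skew)
    setup_ok = cycle_time >= clk_to_q + longest_path + setup + skew
    hold_ok = clk_to_q + shortest_path - skew > hold
    return setup_ok, hold_ok


# Hypothetical delays, all in picoseconds.
ok = meets_timing(cycle_time=1000, clk_to_q=50, longest_path=800,
                  shortest_path=60, setup=40, hold=30, skew=20)
too_fast = meets_timing(cycle_time=900, clk_to_q=50, longest_path=800,
                        shortest_path=60, setup=40, hold=30, skew=20)
print(ok, too_fast)  # (True, True) (False, True)
```

Shortening the cycle only endangers the first requirement; the hold requirement does not depend on the cycle time at all, so it has to be met by the circuit itself.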

step 3

Put the selected components together

step 4

Instruction encoding defines how instructions and their arguments are represented as binary values (machine instructions).