Clemson University -- CPSC 231 -- Fall 2009

closer view of load/store CPU -- only load/store instructions access memory

  data path
    set of general registers
    ALU executes operations (operands and results must be in registers)
    internal buses

  memory bus interface / BIU (bus interface unit)
    MAR
    MDR

  control
    PC
    IR

+----------------------------------------+
|                                        |
|          +------+                      |
|          |  PC  |<-> ...               |
|          +------+                      |
|          |  IR  |<-> ...               |
|          +------+                      |
|                                        |
|          +------+          +-----+     |
| .------->|  R0  |  ...  -->| MAR |------> to and
| |        +------+          +-----+     |
| |  .-----| ...  |<->                   |  from
| |  |     +------+          +-----+     |
| |  |   .-|  R7  |  ...  <->| MDR |<-----> memory
| |  |   | +------+          +-----+     |
| |  |   |                               |
| | _v_ _v_                              |
| | \  V  /                              |
| |  \___/ ALU                           |
| |    |                                 |
| `----'                                 |
|                                        |
|     ^^...^^                            |
|     ||   ||    control signals to      |
|     ||   |---  cause datapath actions  |
|     ||   ----  are generated by logic  |
|     |...       in the control unit     |
+----------------------------------------+


.--------------------.
|                    v
| ---fetch---    MAR <- PC
|                READ
|                IR <- MDR
|                PC <- PC + incr
|                    |
|                    v
| ---decode---   decode IR
|                    |
|                    +-----------+-----------+--------------+
|                    v           v           v              v
| ---execute---  load(B,R1)  load(C,R2)  add(R1,R2,R3)  store(R3,A)
| get address    MAR <- B    MAR <- C    - - - - -      MAR <- A
| fetch operand  READ        READ        - - - - -      MDR <- R3
| execute        R1 <- MDR   R2 <- MDR   R3 <- R1 + R2  - - - - -
| store result   - - - - -   - - - - -   - - - - -      WRITE
|                    |           |           |              |
`--------------------+-----------+-----------+--------------+


example assembly for load/store machine (section 1.8, pp. 33-34)

load/store machine instruction set

opcode  operands  operation name                   machine action
------  --------  --------------                   --------------
halt    ----      halt                             stop execution
div     a,b,c     divide                           reg[c] = reg[a]/reg[b]
mul     a,b,c     multiply                         reg[c] = reg[a]*reg[b]
sub     a,b,c     subtract                         reg[c] = reg[a]-reg[b]
add     a,b,c     add                              reg[c] = reg[a]+reg[b]
load    addr,r    load                             reg[r] = memory[addr]
store   r,addr    store                            memory[addr] = reg[r]
ba      addr      branch always                    pc = addr
blt0    r,addr    branch on less than              if reg[r]<0 then pc = addr
ble0    r,addr    branch on less than or equal     if reg[r]<=0 then pc = addr
beq0    r,addr    branch on equal                  if reg[r]==0 then pc = addr
bne0    r,addr    branch on not equal              if reg[r]/=0 then pc = addr
bge0    r,addr    branch on greater than or equal  if reg[r]>=0 then pc = addr
bgt0    r,addr    branch on greater than           if reg[r]>0 then pc = addr
print   addr      print                            display contents of memory[addr]

----

comment(` first example load/store machine program ')
comment(` ')
comment(` data section for program -- word(label,value) ')

word(a,0)
word(b,3)
word(c,48)
word(d,9)

comment(` code that implements the expression a = b + c - d; ')

label(start)

  load(b,r1)
  load(c,r2)
  load(d,r3)
  add(r1,r2,r4)
  sub(r4,r3,r4)
  store(r4,a)
  halt

comment(` start execution at label start ')

end(start)

----

comment(` second example load/store machine program ')
comment(` ')
comment(` code that implements the loop ')
comment(`   sum = 0; ')
comment(`   for( i = 10; i > 0; i-- ){ ')
comment(`     sum = sum + i; ')
comment(`   } ')

define(`one_r',r1)
define(`i_r',r2)
define(`sum_r',r3)

label(start)

  load(zero_m,sum_r)
  load(one_m,one_r)
  load(ten_m,i_r)

label(loop)

  add(sum_r,i_r,sum_r)
  sub(i_r,one_r,i_r)
  bgt0(i_r,loop)

  store(sum_r,sum_m)
  halt

comment(` data section for program ')

word(sum_m,0)
word(zero_m,0)
word(one_m,1)
word(ten_m,10)

comment(` start execution at label start ')

end(start)

----

simulator and assembler for the load/store machine
(ldst.c, assembler.m, and source.m bullets on course web page)

  1) compile "ldst.c"
  2) input file is "source.m"
     (you can edit it, if you like)
  3) run "m4 assembler.m > executable"
     (which takes input from "source.m" and produces numeric output in
     "executable" - both passes of the assembler are in one m4 script)
  4) run simulator "a.out < executable"
     (it echoes the program in memory for verification, and then prints
     PC: opcode registers (result) for each instruction executed)

----

consider program on bottom p. 40 and top p. 41

symbolic code      ----->  addr: contents (in decimal)  ----->  simulated execution
                           note variable-length insts.

label(start)
load(x_m, x_r)             0: 50  1: 41  2: 1                   0: ld 41,r1 (result is 10)
load(a2_m, a2_r)           3: 50  4: 38  5: 2                   3: ld 38,r2 (result is 1)
load(a1_m, a1_r)           6: 50  7: 39  8: 3                   6: ld 39,r3 (result is 7)
load(a0_m, a0_r)           9: 50  10: 40  11: 4                 9: ld 40,r4 (result is 11)
sub(x_r, a2_r, y_r)        12: 30  13: 1  14: 2  15: 0          12: sub r1,r2,r0 (result is 9)
sub(x_r, a1_r, temp_r)     16: 30  17: 1  18: 3  19: 5          16: sub r1,r3,r5 (result is 3)
mul(y_r, temp_r, y_r)      20: 20  21: 0  22: 5  23: 0          20: mul r0,r5,r0 (result is 27)
sub(x_r, a0_r, temp_r)     24: 30  25: 1  26: 4  27: 5          24: sub r1,r4,r5 (result is -1)
div(y_r, temp_r, y_r)      28: 10  29: 0  30: 5  31: 0          28: div r0,r5,r0 (result is -27)
store(y_r, y_m)            32: 60  33: 0  34: 42                32: st r0,42
print(y_m)                 35: 90  36: 42                       35: prt 42 => -27
halt                       37: 0                                37: halt
word(a2_m, 1)              38: 1
word(a1_m, 7)              39: 7
word(a0_m, 11)             40: 11
word(x_m, 10)              41: 10
word(y_m, 0)               42: 0
end(start)                 43: 0

to do this example...

---- "source.m" input file (extended from bottom p. 40 and top p. 41) ----

define(`y_r', r0)        comment(`variable assignments to registers')
define(`x_r', r1)
define(`a2_r', r2)
define(`a1_r', r3)
define(`a0_r', r4)
define(`temp_r', r5)

label(start)             comment(`starting address')

load(x_m, x_r)           comment(`load variables into registers')
load(a2_m, a2_r)
load(a1_m, a1_r)
load(a0_m, a0_r)

sub(x_r, a2_r, y_r)      comment(`y_r = x - a2')
sub(x_r, a1_r, temp_r)   comment(`temp_r = x - a1')
mul(y_r, temp_r, y_r)    comment(`y_r = (x - a2) * (x - a1)')
sub(x_r, a0_r, temp_r)   comment(`temp_r = x - a0')
div(y_r, temp_r, y_r)    comment(`y_r = (x - a2) * (x - a1) / (x - a0)')

store(y_r, y_m)          comment(`store y_r into memory')

print(y_m)               comment(`print result')
halt

word(a2_m, 1)            comment(`the polynomial coefficients')
word(a1_m, 7)
word(a0_m, 11)
word(x_m, 10)            comment(`independent variable')
word(y_m, 0)             comment(`dependent variable')

end(start)

---- "assembler.m" macro file (extended from bottom p. 41 and top p. 42) ----

  -- top half is first pass -- takes source.m and runs location counter
  -- lower half is second pass -- takes defines that make up the symbol
     table along with source.m and then translates labels, opcodes,
     and registers

divert(-1)

define(loc,0)

define(word, `define($1,eval(loc)) define(`loc', eval(loc + 1))')
define(label,`define($1,eval(loc))')

define(halt, `define(`loc', eval(loc + 1))')
define(div,  `define(`loc', eval(loc + 4))')
define(mul,  `define(`loc', eval(loc + 4))')
define(sub,  `define(`loc', eval(loc + 4))')
define(add,  `define(`loc', eval(loc + 4))')
define(load, `define(`loc', eval(loc + 3))')
define(store,`define(`loc', eval(loc + 3))')
define(ba,   `define(`loc', eval(loc + 2))')
define(blt0, `define(`loc', eval(loc + 3))')
define(ble0, `define(`loc', eval(loc + 3))')
define(beq0, `define(`loc', eval(loc + 3))')
define(bne0, `define(`loc', eval(loc + 3))')
define(bge0, `define(`loc', eval(loc + 3))')
define(bgt0, `define(`loc', eval(loc + 3))')
define(print,`define(`loc', eval(loc + 2))')

define(end,`dnl')
define(comment,`')
include(source.m)

define(`loc',0)

define(`word', ` $2')
define(`label',`')

define(`halt', ` 0')
define(`div',  ` 10 $1 $2 $3')
define(`mul',  ` 20 $1 $2 $3')
define(`sub',  ` 30 $1 $2 $3')
define(`add',  ` 40 $1 $2 $3')
define(`load', ` 50 $1 $2')
define(`store',` 60 $1 $2')
define(`ba',   ` 70 $1')
define(`blt0', ` 71 $1 $2')
define(`ble0', ` 72 $1 $2')
define(`beq0', ` 73 $1 $2')
define(`bne0', ` 74 $1 $2')
define(`bge0', ` 75 $1 $2')
define(`bgt0', ` 76 $1 $2')
define(`print',` 90 $1')

define(`end', ` $1')
define(`comment',`')

define(r0, 0)
define(r1, 1)
define(r2, 2)
define(r3, 3)
define(r4, 4)
define(r5, 5)
define(r6, 6)
define(r7, 7)

divert

include(source.m)

---- result of first half of "assembler.m" / i.e., input to lower half ----

define(start,0)
define(a2_m,38)
define(a1_m,39)
define(a0_m,40)
define(x_m,41)
define(y_m,42)

-------- output of "m4 assembler.m" / input to simulator program --------

 50 41 1
 50 38 2
 50 39 3
 50 40 4
 30 1 2 0
 30 1 3 5
 20 0 5 0
 30 1 4 5
 10 0 5 0
 60 0 42
 90 42
 0
 1
 7
 11
 10
 0
 0

--------------------- simulator program output ----------------------
--(simulator is /home/mark/231/chapter1_simulators/sim_ldst/ldst.c)--

simulation of load/store machine begins
  0: ld 41,r1 (result is 10)
  3: ld 38,r2 (result is 1)
  6: ld 39,r3 (result is 7)
  9: ld 40,r4 (result is 11)
  12: sub r1,r2,r0 (result is 9)
  16: sub r1,r3,r5 (result is 3)
  20: mul r0,r5,r0 (result is 27)
  24: sub r1,r4,r5 (result is -1)
  28: div r0,r5,r0 (result is -27)
  32: st r0,42
  35: prt 42 => -27
  37: halt
simulation of load/store machine ends

--------------------- commands to run example ---------------------

gcc ldst.c -o ldst
vi source.m
m4 assembler.m > executable
./ldst < executable

---------------- operation of word macro in example ----------------

what m4 gets in assembler.m

  ...
  A: define(word, `define($1,eval(loc)) define(`loc', eval(loc + 1))')
  ...
  B: word(a2_m, 1)     comment(`the polynomial coefficients')
  ...
  C: define(`word', ` $2')
  ...
  D: word(a2_m, 1)     comment(`the polynomial coefficients')
  ...

just after A:

  in m4 symbol table:
    loc  == 0
    word == define($1,eval(loc)) define(`loc', eval(loc + 1))

  line of output: (none -- output is discarded while divert(-1) is in effect)

...

just before B:

  in m4 symbol table:
    loc  == 38
    word == define($1,eval(loc)) define(`loc', eval(loc + 1))

...

processing B:

  initial input line: word(a2_m, 1)
  after expansion:    define(a2_m,eval(loc)) define(`loc', eval(loc + 1))

...

just after B:

  in m4 symbol table:
    a2_m == 38
    loc  == 39
    word == define($1,eval(loc)) define(`loc', eval(loc + 1))

...

just after C:

  in m4 symbol table:
    a2_m == 38
    loc  == 0
    word ==  $2

...

processing D:

  initial input line: word(a2_m, 1)
  after expansion:     1
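
---- sketch of a simulator loop for the numeric program ----

The fetch / decode / execute cycle and the numeric encoding produced by
"assembler.m" can be tied together with a small interpreter. The sketch
below is not the course's ldst.c (its source is not shown in these notes);
it is a minimal Python illustration. Assumptions: the opcode numbers and
operand orders are those of the second pass of assembler.m, the last word
of the executable is the start address emitted by end(start), and division
truncates toward zero as in C. The names run and program are hypothetical.

```python
# Minimal sketch of a load/store machine interpreter (NOT the course's ldst.c).
# Opcode numbers and operand order follow the second pass of assembler.m.

def run(words):
    """Execute a numeric program and return the list of printed values."""
    *memory, pc = words            # assumption: last word is the start address
    reg = [0] * 8                  # general registers r0..r7
    printed = []
    while True:
        op = memory[pc]            # fetch the opcode word
        if op == 0:                # halt
            return printed
        if op in (10, 20, 30, 40):             # div, mul, sub, add: a,b,c
            a, b, c = memory[pc + 1:pc + 4]
            x, y = reg[a], reg[b]
            if op == 10:
                q = abs(x) // abs(y)           # assumption: truncate like C
                reg[c] = q if (x < 0) == (y < 0) else -q
            elif op == 20:
                reg[c] = x * y
            elif op == 30:
                reg[c] = x - y
            else:
                reg[c] = x + y
            pc += 4
        elif op == 50:                         # load addr,r
            addr, r = memory[pc + 1], memory[pc + 2]
            reg[r] = memory[addr]
            pc += 3
        elif op == 60:                         # store r,addr
            r, addr = memory[pc + 1], memory[pc + 2]
            memory[addr] = reg[r]
            pc += 3
        elif op == 70:                         # ba addr
            pc = memory[pc + 1]
        elif 71 <= op <= 76:                   # conditional branches: r,addr
            r, addr = memory[pc + 1], memory[pc + 2]
            v = reg[r]
            taken = {71: v < 0, 72: v <= 0, 73: v == 0,
                     74: v != 0, 75: v >= 0, 76: v > 0}[op]
            pc = addr if taken else pc + 3
        elif op == 90:                         # print addr
            printed.append(memory[memory[pc + 1]])
            pc += 2
        else:
            raise ValueError(f"unknown opcode {op} at address {pc}")


# The numeric output of "m4 assembler.m" for the polynomial example.
program = [50, 41, 1, 50, 38, 2, 50, 39, 3, 50, 40, 4,
           30, 1, 2, 0, 30, 1, 3, 5, 20, 0, 5, 0,
           30, 1, 4, 5, 10, 0, 5, 0,
           60, 0, 42, 90, 42, 0,
           1, 7, 11, 10, 0,
           0]

print(run(program))   # prints [-27]
```

Each branch advances pc by the instruction's length in words (1 to 4),
which mirrors the location-counter increments in the first pass of
assembler.m.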