231 Fall 2008 -- Final Exam (with answers) Name: _______________________ Matching. Indicate the letter of the best description. (2 pts. each) 1. _____b____ address a. name of register b. name of memory location 2. _____d____ opcode c. specifies addressing mode d. specifies operation to perform 3. _____e____ object code e. binary machine code with unresolved external references 4. _____g____ condition code f. binary machine code that is ready to execute 5. _____i____ synthetic instruction g. reflects outcome of last recorded ALU operation 6. _____k____ byte h. kernel mode instruction i. instruction provided in assembly 7. _____m____ word language only j. 4 bits 8. _____r____ (K) kilo- k. 8 bits l. 16 bits 9. _____s____ (M) mega- m. 32 bits n. 10**(-12) 10. ____t____ (G) giga- o. 10**(-9) p. 10**(-6) 11. ____q____ (m) milli- q. 10**(-3) r. 10**(+3) or 2**(+10), in context 12. ____p____ (Greek mu) micro- s. 10**(+6) or 2**(+20), in context t. 10**(+9) or 2**(+30), in context 13. ____o____ (n) nano- u. 10**(+12) or 2**(+40), in context v. copy address of actual parameter into 13. ____w____ call by value space allocated for formal parameter w. copy value of actual parameter into 14. ____x____ call by result space allocated for formal parameter x. copy final value of formal parameter 15. ____y____ call by value-result back into actual parameter y. copy value of actual parameter into 16. ____v____ call by reference space allocated for formal parameter and copy final value of formal parm. 17. ___ee____ macro processor back into actual parameter z. combines translation and execution 18. ___bb____ assembler aa. user-friendly debugger bb. assembly language is input and 19. ___cc____ static linker object module is output cc. object modules are input and 20. ___dd____ loader executable file is output dd. executable file is input and memory 21. ___ff____ programmed I/O image is output ee. source file with macros is input and 22. ___hh____ DMA I/O expanded source is output ff. single byte transfer using polling 23. ___jj____ blocking gg. single byte transfer using interrupt hh. single block transfer using interrupt 24. ___kk____ scatter-gather ii. one logical record per transfer jj. one buffer area (with multiple contiguous logical records) per transfer kk. multiple buffer areas (multiple non-contiguous logical records) per transfer Extra Credit Matching on floating-point. Indicate the letter of the best description. (1.5 pts. each) XC-1. ___c___ accuracy a. truncating bits that won't fit in fraction b. choosing nearest representable neighbor XC-2. ___d___ precision c. how correct the number is d. how many digits are used to represent a number XC-3. ___b___ rounding e. least significant bits cancel during floating- point arithmetic operation XC-4. ___f___ catastrophic f. most significant bits cancel during floating- cancellation point arithmetic operation g. geometrically increasing spacing between representable numbers as exponent increases 25. During execution, programs typically are divided into four major sections. Name these sections. (4 pts.) 1) text 2) data 3) stack 4) heap 26. A stack frame typically has five items. Name these items. (5 pts.) 1) parameters 2) return address 3) registers to save 4) old fp 5) local variables 27. Convert these numbers between signed decimal and 16-bit two's complement hexadecimal representation. (2 pts. each) signed decimal two's complement a. -23 __0xffed__ b. __-221__ 0xff23 c. 223 __0x00df__ d. ___547__ 0x0223 28. Show the little-endian and big-endian byte ordering of the following data values. Start the byte numbering of c[] and i at address 100. (6 pts.) char c[4] = {'a', 'b', 'c', '\0'}; int i = 0xabc; /* int size is 32-bit word */ byte address: 100 101 102 103 104 105 106 107 big-endian ordering: 'a' 'b' 'c' _0_ _00 _00 _0a _bc little-endian ordering: 'a' 'b' 'c' _0_ _bc _0a _00 _00 29. Consider the following C subroutine. void clear( int a[], int count ){ int *ptr = a; // like a.begin() for a C++ vector int *limit = a+count; // like a.end() for a C++ vector while( ptr != limit ) *ptr++ = 0; } The following generated assembly is edited from the "gcc -O2 -S" output. Fill in the blanks. (1.5 pts. each for 9 pts. total) clear: sll %o1, 2, __%o1__ add %o1, %o0, %o1 __cmp__ %o0, %o1 be .LL6 nop .LL8: __st___ %g0, [%o0] add %o0, ___4___, %o0 cmp __%o0__, %o1 bne .LL8 nop _.LL6:_ retl nop 30. Show the steps required in multiplication of two unsigned 4-bit binary numbers, 1010 times 1101. (1010 is the multiplicand and 1101 is the multiplier.) The details of mulscc are not required, but you should show the 3-register format of ACC, MQ, and MDR. (8 pts. + 1 pt. extra credit for placing the multiplicand and multiplier in the correct registers.) initially: C ACC MQ 0 0000 1101 MDR 1010 ---------------------------------------------- step 1: 0 0000 1101 + 1010 ^ add based on lsb=1 ------ 0 1010 1101 >> shift right 0 0101 0110 ---------------------------------------------- step 2: 0 0101 0110 + 0000 ^ no add based on lsb=0 ------ 0 0101 0110 >> shift right 0 0010 1011 ---------------------------------------------- step 3: 0 0010 1011 + 1010 ^ add based on lsb=1 ------ 0 1100 1011 >> shift right 0 0110 0101 ---------------------------------------------- step 4: 0 0110 0101 + 1010 ^ add based on lsb=1 ------ 1 0000 0101 >> shift right 0 1000 0010 ---------------------------------------------- check: 1010 = 10 x 1101 x 13 -------- ---- 10000010 130 10000010 = 1*(2^7) + 1*(2^1) = 128 + 2 = 130 31. Consider the following C program and its generated assembly. int a; main: save %sp,-112,%sp int main(void){ sethi %hi(a),%l0 a = 0; st %g0,[%l0+%lo(a)] a = sub( a ); ld [%l0+%lo(a)],%o0 } call sub int sub( int x ){ nop return( x + 1 ); st %o0,[%l0+%lo(a)] } ret restore sub: add %o0,1,%o0 retl nop Write the assembly code for the following C program. Give both main and sub assembly language code sequences. (8 pts.) int a; main: save %sp,-96,%sp // only 96 needed int main(void){ set a,%l0 // set is cleaner a = 0; st %g0,[%l0] a = sub( &a ); mov %l0,%o0 // send address } call sub int sub( int *x ){ nop return( *x + 1 ); st %o0,[%l0] // store result } ret restore sub: ld [%o0],%o0 // de-ref address add %o0,1,%o0 // return in %o0 retl nop 32. Name the four generic hardware actions that occur when the CPU accepts an interrupt signal. (4 pts.) 1) save PC and PSR 2) switch execution mode to kernel 3) disable or restrict further interrupts 4) load new PC from interrupt vector table XC-5. Explain how a one-pass-with-fixup assembler works. (up to 4 pts.) see detailed answer appended at bottom XC-6. Consider a 2 GHz processor with instructions that each need two clock cycles on average to execute. Consider also that an interrupt requires an overhead of 100 nanoseconds to transfer to an interrupt service routine, and the ISR has to execute 900 instructions on average to handle an interrupt. What is the maximum number of interrupts that can be handled in a second? (up to 4 pts.) 2 GHz = 2 * 10^9 cycles/sec => cycle time is 1/2 * 10^-9 sec = 0.5 nsec/cycle = 500 psec/cycle 2 cycles / inst. combining => 2 cycles/inst. * 0.5 nsec/cycle = 1 nsec/inst. interrupt handling time = overhead time + ISR execution time = 100 nsec + 900 insts. * 1 nsec/inst. = 1000 nsec = 1 microsec 1 microsec / interrupt => rate is 1 interrupt/microsec = 10^6 interrupts/sec XC-7. There were at least two phrases that I encouraged you think about during the semester as knee-jerk responses to certain questions. Identify the two phrases and briefly explain what each means. (up to 4 pts.) forward reference => when a translator encounters a use of a symbol before its definition. this leads to a two-pass structure (or at least a one-pass-with-fixup). time versus space => this is a typical tradeoff decision in programming or designing a computer system and appears in many, many forms. for example, consider the population count routine in your final programming assignment that can be optimized for time by storing a precomputed table of bit counts for bytes. as another example, consider caching in (1) processor design, in which additional fast storage holds copies of values from main memory; and, (2) web server design, in which main memory holds copies of web pages from disk. a cache is not needed for correct operation in either instance, but it speeds up access to the cached items. --------------- XC-5 answer did not have to be this detailed In one-pass-with-fixup, the symbol table needs a third field (in addition to the symbol name and address fields). The third field indicates whether the symbol is defined or undefined. When you encounter the use of a symbol, perform a lookup in the symbol table: a) if the symbol is defined, use the address value from the entry b) if the symbol is not yet present, add an entry for this symbol and mark it as undefined; use the address field in the entry as the head of a linked list of use locations of this undefined symbol, e.g., symbol type address +------+-------+--------+ | ... | +------+-------+--------+ | x | def | 10 | +------+-------+--------+ | ... | use-loc +------+-------+--------+ +-------+--------+ | y | undef | ---->| 12 | ----->... +------+-------+--------+ +-------+--------+ | ... | +------+-------+--------+ add the address of the current instruction to the first node. c) if the symbol is present but undefined, add another node to the linked list with the address of the current instruction in that node. When you encounter the definition of a symbol, perform a lookup in the symbol table: a) if the symbol is already defined, you have encountered a "multiple definition" error b) if the symbol is not yet present, add an entry for this symbol and mark it as defined; use the current value of the location counter as the address. c) if the symbol is present but undefined, "fix up" all the undefined uses by traversing the linked list and storing the current value of the location counter into the instruction words identified in the nodes of the list; then free the list and mark the symbol as defined; use the current value of the location counter as the address. symbol type address +------+-------+--------+ | ... | +------+-------+--------+ | x | def | 10 | +------+-------+--------+ | ... | +------+-------+--------+ | y | def | 20 | +------+-------+--------+ | ... | +------+-------+--------+ When you reach the end of the assembly language program, scan the symbol table for any remaining undefined entries. These are either errors or, absent a requirement to explicitly mark use of external names, symbols that must be passed to the linker to be resolved.