231 Fall 2009 - Program 1 Assignment
(original assignment dated 9/2/09; updated 9/6/09 and 9/7/09)
(this file can be found in /home/mark/public_html/231/f09.program1)

Due date:         Friday, September 11, by midnight
Grading standard: correctness - 100% of grade
Submission:       handin.231.1 1 - please submit all pass__.m files and an "assemble" shell script
Tools needed:     GNU m4 (use x86 Sun platforms)
Concepts needed:  macro processing

This is to be an individual effort programming assignment.

Your assignment is to implement an enhanced assembler for the chapter 1 simulated accumulator machine that will handle addition and subtraction of immediate values using the macros

    add_imm(i)    // ACC <- ACC + i
    sub_imm(j)    // ACC <- ACC - j

You may use one of the three approaches discussed below or invent a fourth approach of your own. All approaches should be coded in m4. You should start with the assemble shell script "assemble" and the m4 files "pass1.m" and "pass2.m" in /home/mark/231/sim/.

In any approach you take, you are allowed to create additional symbolic names for use with each unique constant. For example, the constant 5 can be assigned the symbols _imm_alloc_5 and _imm_5, where the first symbol is used to indicate that a memory word for the constant 5 has been allocated and the second symbol is used as the symbolic address of that memory word. Of course, the user should not use these symbols in his or her original program; this can be avoided if users are told not to use symbols that start with an underscore (that is, symbols starting with underscore are reserved to the assembler).


Approach 1

You can define a two-pass "pre-assembler" that rewrites the source code. The first pass can collect the memory word allocations of immediate values into an "immediates.m" file, and then the second pass can rewrite the original source file into a source file that can be processed by the regular two passes.
E.g., the assemble shell script becomes:

    m4 pass0a.m >immediates.m
    m4 pass0b.m >source_imm.m
    m4 pass1.m >symbols.m
    m4 pass2.m >executable

where pass0a.m and pass0b.m perform the above actions and where pass1.m and pass2.m are each minimally changed so that they include the "source_imm.m" file instead of the original "source.m" file.

As an example, in this approach, the source file:

    label(start)
      load(a)
      add(one)
      add_imm(3)
      add_imm(4)
      add_imm(3)
      store(a)
      halt
    label(data)
      word(a,0)
      word(one,1)
    end(start)

produces an "immediates.m" file containing (with some blank lines omitted):

    word(_imm_3,3)
    word(_imm_4,4)

and then the source file is rewritten by the pre-assembler's pass0b.m to create the "source_imm.m" file below:

    word(_imm_3,3)
    word(_imm_4,4)
    label(start)
      load(a)
      add(one)
      add(_imm_3)
      add(_imm_4)
      add(_imm_3)
      store(a)
      halt
    label(data)
      word(a,0)
      word(one,1)
    end(start)

with the resulting "symbols.m" file (with reformatting):

    define(_imm_3,0)
    define(_imm_4,1)
    define(start,2)
    define(data,15)
    define(a,15)
    define(one,16)

and "executable" file (with some blank lines omitted):

    3
    4
    50 15
    40 16
    40 0
    40 1
    40 0
    60 15
    00
    0
    1
    2

Note that the constant 3 is used twice as an immediate value but is allocated a memory word only once. All uses of the immediate value of 3 are changed to refer instead to the single memory word that holds the value 3.


Approach 2

A second approach to rewriting the source.m file is essentially the same as the first approach but uses m4's built-in divert and undivert functions within a single pass0.m script to rewrite the source file. (divert and undivert allow you to reorder the output so that the allocated memory words for the immediate values appear at the top of the output.)
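
The reordering idea can be pictured with a small model written in Python. This is only an illustration of the expected input and output, not an m4 solution, and the names rewrite and IMM_CALL are hypothetical. Two lists stand in for two output streams that are joined at the end, so the word allocations come out first even though they are discovered while scanning the program. The sketch assumes that sub_imm(j) maps onto a base-machine sub opcode, and it handles only non-negative constants (negative constants need the symbol naming described under Negative Constants).

```python
import re

# Matches add_imm(<n>) or sub_imm(<n>) for a non-negative integer n.
IMM_CALL = re.compile(r"^(add|sub)_imm\((\d+)\)$")


def rewrite(statements):
    """Return the rewritten program with word allocations moved to the top.

    `allocations` and `body` act like two separate output streams: text is
    appended to each while the source is scanned once, and the streams are
    concatenated only at the end.
    """
    allocations = []   # word(...) lines, one per unique constant
    body = []          # the rewritten program text
    seen = set()       # constants that already have a memory word

    for stmt in statements:
        match = IMM_CALL.match(stmt)
        if match is None:
            body.append(stmt)            # ordinary statement: copy through
            continue
        opcode, constant = match.group(1), match.group(2)
        symbol = "_imm_" + constant
        if constant not in seen:         # allocate each constant only once
            seen.add(constant)
            allocations.append(f"word({symbol},{constant})")
        # Assumption: sub_imm(j) becomes sub(_imm_j) on the base machine.
        body.append(f"{opcode}({symbol})")

    return allocations + body
```

Applied to the example source from Approach 1, this model returns the two word(...) lines followed by the rewritten program, in the same order as the "source_imm.m" listing.
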
The assemble shell script for this approach is:

    m4 pass0.m >source_imm.m
    m4 pass1.m >symbols.m
    m4 pass2.m >executable

where pass0.m rewrites the contents of the "source.m" file and pass1.m and pass2.m are each minimally changed so that they include the "source_imm.m" file instead of the original "source.m" file.

The "source_imm.m", "symbols.m", and "executable" files in this approach are all the same as would be produced in the first approach.


Approach 3

A third approach is similar to the first two approaches, but it uses a single "pre-pass" to generate the memory word allocations for immediate values and modifies pass1.m and pass2.m to additionally process the add_imm and sub_imm opcodes. Thus, the assemble shell script becomes:

    m4 pass0.m >immediates.m
    m4 pass1.m >symbols.m
    m4 pass2.m >executable

where pass0.m creates the same "immediates.m" file as in approach 1 above, and pass1.m and pass2.m are modified to handle add_imm and sub_imm.

The "symbols.m" file produced by pass1.m in this approach is also the same as the one produced in approach 1 above. (There is no "source_imm.m" file needed by this approach.)


Other Approaches

If you invent another approach, please send me an email in which you describe your approach in a manner similar to the descriptions given above.


Negative Constants

Use m4's built-in pattern substitution function to handle the minus sign in negative constants. That is, a source file like this:

    label(start)
      load(a)
      add_imm(-3)
      add_imm(-3)
      store(a)
      halt
    label(data)
      word(a,0)
    end(start)

should be translated in approach 2 to this "source_imm.m" file:

    word(_imm_neg3,-3)
    label(start)
      load(a)
      add(_imm_neg3)
      add(_imm_neg3)
      store(a)
      halt
    label(data)
      word(a,0)
    end(start)

Other approaches are similar in the need for symbolic addresses such as "_imm_neg3" rather than "_imm_-3".


Hand-In

Please submit all assembly passes (pass0.m, pass1.m, etc.) and an assemble shell script named "assemble" that appropriately invokes the assembly passes.
This script can be one of the ones listed above, based on the approach you choose.

Hint: There is no need to make this assignment overly complicated. One of the approaches described above requires fewer than ten lines of m4 code.
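
As a language-neutral check of the symbol naming rule from the Negative Constants section, the mapping from a constant to its reserved symbols can be sketched in Python. This is an illustration only, not m4 code, and the function names imm_symbol, alloc_symbol, and word_allocation are hypothetical. It assumes that the "allocated" marker symbol follows the same minus-sign rule as the address symbol, which the handout does not state explicitly.

```python
def imm_symbol(constant):
    """Symbolic address for a constant, e.g. 5 -> _imm_5, -3 -> _imm_neg3."""
    # A minus sign cannot appear in a symbol name, so it is spelled "neg".
    return "_imm_" + str(constant).replace("-", "neg")


def alloc_symbol(constant):
    """Marker symbol recording that a memory word has been allocated.

    Assumption: the marker uses the same "neg" spelling as imm_symbol.
    """
    return "_imm_alloc_" + str(constant).replace("-", "neg")


def word_allocation(constant):
    """The word(...) line that reserves a memory word for the constant."""
    # The symbol is sanitized, but the stored value keeps its minus sign.
    return f"word({imm_symbol(constant)},{constant})"
```

For example, word_allocation(-3) yields word(_imm_neg3,-3), the same allocation line that appears at the top of the translated file in the Negative Constants example.
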