In Phase III of your compiler front-end project you will
The lexicon, syntax and semantics of Phase I and Phase II are unchanged.
/* 90.cf - type checks paramater list - name mangling required */ int x; float num; int foo ( int num, float y ) { num = 7; /* OK */ num = y; /* line 10 type error */ y = 4.5 + 8 * 2; /* line 11 type error */ return 0; }If the binding is done correctly all references to num and y should be local. The symbol table dump will reflect this:
------------ ------ ------------ Name Type Line_Numbers ------------ ------ ------------- foo0 1 7 num0 2 5 num1 1 7 9 10 x0 1 4 y1 2 7 10 11The scope semantics that you will add to your parser for phase III checks for multiple declarations of the same identifier in the same scope and a reference to an identifier that has not been declared. The parser will spit out an error for these errors and continue parsing. The code below demonstrates these two types of errors:
int x = 5; float x = 5.5; /* identifier declared twice */ int foo ( ) { w = 5; /* reference to an identifier not declared */ return 0; }Since cflat does not support nested scope (e.g., nested blocks in C) one symbol table is sufficient for the two possible scopes which are global or local to a function. Nested scope generally requires multiple symbol tables. The scope checking requirements for phaseIII are covered in lab07. Your parser will keep track of the current scope as you parse by a global variable that is visible to the scanner. As the scanner adds the identifier to the symbol table the scanner appends the scope flag for that identifier. You cannot make a reference to an identifier that has not been declared:
/* file: foo.cf */ foo() { i = 5; // scope error }And you cannot declare an identifier twice in the same scope:
int i = 7; int i = 9; // scope error
i0 declared twice, line 5 w0 not declared, line 8 Type err:exp->exp op term,line 11 Type err: assign_stmt, line 15 ------------ ------ ------------ Name Type Line_Numbers ------------ ------ ------------- foo0 1 6 f0 2 3 11 15 i0 1 4 5 9 9 10 15 j1 1 6 14 16 w0 -1 8 x0 1 2 x1 2 6 13Note that the reference to x on line 13 is correctly bound to the formal parameter x and not to global x.
Since a compiler will generate intermediate code only after the source has passed both the syntax and semantic analysis phases you can assume you are working with error-free code during all remaining parts of phase III.
index value size type bind oth ver shndx name [68] 0x000208d0 0x00000004 OBJT GLOB D 0 .data intConstant
For 75% run your parser against 75.cf file. This file contains declarations and a function with a parameter list. The "code" that you need to generate for declarations is to insert into the symbol table a size and an address offset from 100 for each global variable. Parameters are not given an address since they are generally loaded into registers. For your globals assume each offset is computed starting at physical address 100 to make your table easier to read. The offset will be added to a memory pointer of the .data segment when the code is translated into machine code by the backend. Assume sizes: int (4 bytes) float (8 bytes) and char (1 byte). Align your data on a word (4-byte) boundary. Your symbol table hash record already has the fields you need (st_size and st_value) but you will need to add the functions to insert these values. After parsing this file
/* file: 75.cf generates code for declaration and parameter lists by storing the size and address for globals as an offset in the symbol table */ float f; int i; char c; int foo ( int arg1, float arg2 ) { return 0; }After parsing your symbol table dump should look something like this :
------------ ------ ---- ----- ------------ Name Type Size Value Line_Numbers ------------ ------ ---- ----- ------------ foo0 int 4 0 6 c0 char 1 112 5 f0 float 8 100 3 i0 int 4 108 4 arg11 int 4 0 6 arg21 float 8 0 6Notice that the parameters and the return value do not have addresses. Parameters and local variables will be loaded into registers or pushed onto the runtime stack.
float y = 5.5; char c; int ret = 0; int foo ( float f, int i) { i = i + 1; f = y * 7.0 + 3.5; return ret; }Generates this code:
y0 = 5.5 ret0 = 0 t1 = i1 + 1 i1 = t1 t2 = y0 * 7.0 t3 = t2 + 3.5 f1 = t3Temporaries such as t1, t2, and t3 are treated as objects in your program so you need to add each temporary to the symbol table. A temporary will have a type, size and offset just like data objects. To compute type you need to synthesize the type from the LHS and RHS of the operation. Sample symbol table dump:
------------ ------ ---- ----- ------------ Name Type Size Value Line_Numbers ------------ ------ ---- ----- ------------ foo0 int 4 0 9 c0 char 1 108 7 f1 float 8 0 9 12 ret0 int 4 112 8 13 t1 int 4 0 0 t2 float 8 0 0 t3 float 8 0 0 i1 int 4 0 9 11 11 y0 float 8 100 6 12