-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Compilers

- analysis of source program
    linear analysis
    hierarchical analysis
    semantic analysis

- compilers and syntax
    pipeline: source program as input
        lexical analyzer - break into tokens
        syntax analyzer - grammatical parsing
        semantic analyzer - type checking, operator annotation
        intermediate code generation - RTL
        code optimizer - see below
        code generator - machine code
        target program - Hello, world!
    common modules
        symbol table manager
        error handler
    cousins
        preprocessors
            macro processing
            file inclusion
            "rational" preprocessors - add flow control
            language extensions (Equel, cfront)
        assemblers
            two pass (but backpatching can merge this to one pass, see below)
                1: find ids for storage and assign storage locations in the symbol table
                2: translate operations into bits, using storage information from the 1st pass
        loaders and linkers
    phases
        front end - depends on source language
        back end - depends on target machine language
        several phases can be in one pass
            example: lexical analysis, syntax analysis, semantic analysis, intermediate code generation
            fewer passes, fewer intermediate files (why is that logical...)
            backpatching can help keep down the number of passes
    compiler construction tools
        scanner generators for lexical analyzers - lex/flex
        parser generators for syntax analyzers - yacc/bison
        syntax directed translation engines
        automatic code generators
        data flow engines

- overview
    syntax definition
        BNF = Backus-Naur Form
        parse trees
        ambiguity
        associativity
            + - * / associate to the left
            exponentiation and assignment (=) associate to the right
        precedence of operators: * / over + -
    syntax directed translation
        syntax directed definitions
            attributes with each grammar symbol
            semantic rules with each production
        annotated parse trees show attributes at nodes
        synthesized attributes - value determined by the attribute values of children
        traversal order
            depth-first traversal
            basic case: rules evaluated after all children
            translation scheme: order depends on the location of rules embedded in the right side of productions
    parsing
        for any CFG there is a parser that is O(n^3) for a string of n tokens
        however, in practice O(n) parsers exist for almost every language
        top down parsing
            easier for hand coding
            predictive parsing (special case of recursive descent parsing)
                always use e-productions as a last resort
            left recursion can lead to an infinite loop on things like "expr -> expr + term"
                rewrite A -> Aa | B   to   A -> BR, R -> aR | e
                reminds me of CPS or tail recursive iteration
                now it is right recursive, which can be a problem for left associative operators
                more generally, rewrite A -> Aa | AB | y   to   A -> yR, R -> aR | BR | e
                move translation scheme actions along with the productions
            optimizations for lame systems
                replace tail recursive calls with iteration
                manually inline functions
        bottom up parsing (more later)
            larger class of grammars and translation schemes
    lexical analysis
        removal of whitespace and comments
        constants
        keywords and identifiers
        interface
            like the parser does token lookahead, the lexer does char lookahead
            parser gets a multi-valued return: the type of token, plus attributes such as the id name, string, number, etc
    symbol table
        interface: insert(s,t), lookup(s)
        prepopulated with reserved keywords
    abstract stack machines (sketch below)
        pc pointer
        instructions
            arithmetic: + - * / and or
            stack manipulation: push, pop, rvalue, lvalue, :=, copy
            l-values and r-values
            control flow: label, goto, gofalse, gotrue, halt
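    my own minimal sketch of such a stack machine in C; the instruction encoding and the flat integer memory are made-up assumptions for illustration, not a real ISA:

        /* minimal abstract stack machine sketch; := pops an r-value, then the
           l-value (address) below it, per the instruction list above */
        #include <stdio.h>

        enum op { PUSH, RVALUE, LVALUE, ASSIGN, ADD, MULT, GOTO, GOFALSE, HALT };

        struct instr { enum op op; int arg; };  /* arg: constant, address, or target pc */

        int run(struct instr *code) {
            int stack[256], sp = 0, mem[64] = {0};
            for (int pc = 0; ; pc++) {
                struct instr i = code[pc];
                switch (i.op) {
                case PUSH:   stack[sp++] = i.arg; break;        /* push constant */
                case RVALUE: stack[sp++] = mem[i.arg]; break;   /* push contents of location */
                case LVALUE: stack[sp++] = i.arg; break;        /* push the address itself */
                case ASSIGN: {                                  /* := */
                    int v = stack[--sp], addr = stack[--sp];
                    mem[addr] = v; break;
                }
                case ADD:  sp--; stack[sp-1] += stack[sp]; break;
                case MULT: sp--; stack[sp-1] *= stack[sp]; break;
                case GOTO: pc = i.arg - 1; break;               /* -1: loop will ++ */
                case GOFALSE: if (stack[--sp] == 0) pc = i.arg - 1; break;
                case HALT: return mem[0];
                }
            }
        }

        int main(void) {
            /* mem[0] := 2 * 3 + 4 */
            struct instr prog[] = {
                {LVALUE, 0}, {PUSH, 2}, {PUSH, 3}, {MULT, 0},
                {PUSH, 4}, {ADD, 0}, {ASSIGN, 0}, {HALT, 0},
            };
            printf("%d\n", run(prog));   /* prints 10 */
            return 0;
        }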
- lexical analysis
    tokens vs patterns vs lexemes
        e.g. id vs its regexp vs actual lexemes like pi, count, etc
    attributes for tokens
        as noted above, an id points to its symbol table entry, a num to its value (e.g. 2), etc
    lexical errors
        delete an extra char
        insert a missing char
        replace a char
        transpose chars
    approaches
        lex
        c coding, runtime with ungetc
        assembler and explicit i/o management
    specification of tokens
        regexps
        fortran hollerith strings nHa1a2...aN aren't regexps...
    recognition of tokens
        finite state machines
    lex syntax
        declarations           // define named regexps
        %%
        transition rules       // conditional breakdown of named regexps
                               // return token type constants
                               // optionally set yylval value (token attributes as above)
        %%
        auxiliary procedures   // can be called in setting yylval, such as to manage the symbol table
        lookahead operator /
            r1/r2 matches regexp r1 only if followed by r2
            more fortran examples
    algorithms
        alg: regexp to nfa using thompson's construction, O(r) time and O(r) space
        alg: nfa to dfa using subset construction
            good for searching when x is a long string
        or use two-stack simulation of the nfa, O(N*X) (number of states * length of x)
            basically performs the subset construction at runtime
            good for searching when x is NOT a long string
        or use an lru cache to lazily compute the dfa
        alg: regexp directly to dfa
        alg: minimize the number of states in a dfa
        table compression methods

- syntax analysis
    role of the parser
        create parse tree
    error handling
        goals
            report clearly and accurately
            should recover quickly
            should not significantly slow down the processing of correct programs
        strategies
            panic mode: discard input until a designated set of synchronizing tokens is found
            phrase level: local correction
            error productions: good for common errors
            global recovery: find the globally closest acceptable input; too costly to be more than theoretical
    context free grammars
        details omitted here, see automata theory notes above for vocab etc, including terminals, productions, parse trees, derivations, ambiguity
        removing ambiguity (dangling else)
            from  s -> if e then s | if e then s else s | other
            to    s -> matched | unmatched
                  matched -> if e then matched else matched | other
                  unmatched -> if e then s | if e then matched else unmatched
        removing left recursion
            alg: to fully do this, you have to find not just immediate left recursion, but left recursion over multiple productions
        left factoring (aid to creating a predictive parser)
            alg: A -> aB | aC   to   A -> aR, R -> B | C
        sayings
            a finite automaton cannot keep count
            a grammar can keep count of two items but not three
    top down parsing (revisited)
        recursive descent can require expensive backtracking, but not generally
        predictive parsers
            nonrecursive predictive parsers maintain their own stack instead of recursing (sketch below)
            alg: nonrecursive predictive parsing
            alg: construction of predictive parsing tables
            alg: FIRST
            alg: FOLLOW
            LL(1) = left-to-right scan, leftmost derivation, one token of lookahead
                can't be left recursive
                can't be ambiguous
                further restrictions: if A -> alpha | beta
                    1.) for no terminal a do both alpha and beta derive strings beginning with a
                    2.) at most one of alpha and beta can derive the empty string
                    3.) if beta derives epsilon, then alpha does not derive any string beginning with a terminal in FOLLOW(A)
            if we construct the predictive parsing table and some cell has duplicate entries, the grammar was left recursive or ambiguous
            error recovery for predictive parsing
                can use FOLLOW (and FIRST) for the synchronizing set
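    a minimal C sketch of the table-driven (nonrecursive) predictive parser; the toy LL(1) grammar, the table (written out by hand from its FIRST/FOLLOW sets), and the character encodings are illustrative assumptions:

        /* table-driven predictive parser for the toy LL(1) grammar
               E  -> T Q        Q -> + T Q | epsilon        T -> id | ( E )
           terminals: 'i' (id) '+' '(' ')' '$'; nonterminals: 'E' 'Q' 'T' */
        #include <stdio.h>
        #include <string.h>

        static const char *rule(char nt, char tok) {     /* parsing table M[nt,tok] */
            switch (nt) {
            case 'E': return (tok=='i' || tok=='(') ? "TQ" : NULL;
            case 'Q': return tok=='+' ? "+TQ" : (tok==')' || tok=='$') ? "" : NULL;
            case 'T': return tok=='i' ? "i" : tok=='(' ? "(E)" : NULL;
            }
            return NULL;
        }

        int parse(const char *input) {
            char stack[128] = "$E";                      /* end marker, then start symbol */
            int sp = 1;                                  /* top of stack */
            const char *ip = input;
            while (1) {
                char top = stack[sp], tok = *ip ? *ip : '$';
                if (top == '$' && tok == '$') return 1;  /* accept */
                if (top == 'E' || top == 'Q' || top == 'T') {
                    const char *r = rule(top, tok);      /* nonterminal: expand */
                    if (!r) return 0;                    /* empty table cell: error */
                    sp--;                                /* pop the nonterminal */
                    for (int i = (int)strlen(r) - 1; i >= 0; i--)
                        stack[++sp] = r[i];              /* push rhs reversed */
                } else {
                    if (top != tok) return 0;            /* terminal: must match */
                    sp--; ip++;
                }
            }
        }

        int main(void) {
            printf("%d %d\n", parse("i+(i+i)"), parse("i+)"));  /* prints: 1 0 */
            return 0;
        }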
    bottom up parsing
        shift-reduce parsing types
            operator-precedence parsing
            LR parsing
        shift-reduce parsing (in general)
            handle: a handle of the right-sentential form gamma is a production A -> beta and a position of gamma where the string beta may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of gamma
                that is, if S =>* alpha A w => alpha beta w, then A -> beta in the position following alpha is a handle of alpha beta w
            only one handle if the grammar is not ambiguous
            conflicts
                shift/reduce
                reduce/reduce
                if these happen, the grammar is not LR(k)
                however, if we bias a shift/reduce conflict in favor of shift, it resolves the ambiguity of if/then/else in the way we'd like
        operator precedence parsing
            operator grammar
                this is NOT an operator grammar:
                    E -> EAE | (E) | -E | id
                    A -> + | - | * | / | ^
                but if we substitute A into E:
                    E -> E+E | E-E | E*E | E/E | E^E | (E) | -E | id
                now we have an operator grammar
            parsers for this subclass can easily be constructed by hand
            need to define precedence relations between operators
                can be determined by associativity and precedence rules
            alg: operator-precedence parsing algorithm
            alg: constructing precedence functions
            handling errors: there are only two points in the parsing process at which syntax errors can be discovered
        LR parsers
            attractive because
                - LR parsers can be constructed to recognize virtually all programming language constructs for which CFGs can be written
                - the LR parsing method is the most general nonbacktracking shift-reduce parsing method known, yet it can be implemented as efficiently as other shift-reduce methods
                - the class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers
                - an LR parser can detect a syntax error as soon as it is possible to do so on a left-to-right scan of the input
            cons
                too much work to do by hand, but that is why we have yacc
            types
                SLR = simple LR: easiest, but least powerful
                canonical LR: most powerful, most expensive
                LALR = lookahead LR: intermediate in power and cost, can be implemented efficiently
                same algorithm, different parsing tables
            LL vs LR: the difference
                LR(k): we must be able to recognize the occurrence of the right side of a production, having seen all of what is derived from the right side, with k input symbols of lookahead
                LL(k): we must be able to recognize the use of a production seeing only the first k symbols of what the right side derives
                therefore LR can describe more languages than LL (huh?)
            comparison: SLR and LALR tables have the same number of states, say 100 for pascal; canonical LR, 1000+
            ambiguity
                use precedence and associativity to resolve parsing action conflicts
            error handling
                a canonical LR parser will never make even a single reduction before announcing an error
                SLR and LALR will never shift an erroneous input symbol onto the stack

    yacc, a parser generator
        syntax (example below)
            %{ c declarations %}
            declarations
                // grammar tokens, like "%token DIGIT"
                // associativity, like "%left '+' '-'"
                // later declarations bind tighter, like "%left '*' '/'"
                // other, like "%right UMINUS", see below
            %%
            productions
                // including code for semantic actions
                // empty alternative for epsilon
                // %prec UMINUS overrides the precedence of the rightmost terminal, good for E-E vs -E
            %%
            auxiliary procedures, including the definition of the lexer via yylex, which communicates via yylval
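        to make the layout concrete, a minimal desk-calculator spec in this style (single-digit operands via %token DIGIT; an illustrative sketch, not a production grammar):

            %{
            #include <ctype.h>
            #include <stdio.h>
            int yylex(void);
            int yyparse(void);
            void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
            %}
            %token DIGIT
            %left '+' '-'
            %left '*' '/'
            %right UMINUS
            %%
            line : expr '\n'             { printf("%d\n", $1); }
                 ;
            expr : expr '+' expr         { $$ = $1 + $3; }
                 | expr '-' expr         { $$ = $1 - $3; }
                 | expr '*' expr         { $$ = $1 * $3; }
                 | expr '/' expr         { $$ = $1 / $3; }
                 | '-' expr %prec UMINUS { $$ = -$2; }
                 | '(' expr ')'          { $$ = $2; }
                 | DIGIT                 /* default action: $$ = $1 */
                 ;
            %%
            int yylex(void) {
                int c = getchar();
                if (isdigit(c)) { yylval = c - '0'; return DIGIT; }
                return c;    /* operators, parens, newline pass through as themselves */
            }
            int main(void) { return yyparse(); }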
"%right UMINUS", see below %% productions // including code for semantic actions // empty alternative for epsilon // % prec UMINUS overrides precedence of right most terminal, good for E-E vs -E %% auxillary procedures, definition of lexer via yylex, which communicates via yylval yacc ambiguity y.output contains parsing action conflicts and how they were resolved default rules reduce/reduce conflict resolved by choosing the first conflicting production shift/reduce conflict resolved in favor of shift, making dangling else happy overrides %left, %right, %nonassoc for associativity precedence in the order they appear, lowest first yacc error productions yyerror dump message yyerrok reset to good state - syntax-directed translation syntax-directed definition each productin A -> alpha has a set of semantic rules of the form) b:=f(c1,c2,...,ck) where f is a function c1...ck are attributes of the grammar symbols of the production, or either b is a synthesized attribute of A b is an inherited attribute of one of the grammar symbols on the right side "b depends on c1...ck" an attribute grammar is a syntax-directed definition where the functions cannot have side effects. if the purpose is side efect, use procedure or program fragments (what kind of restriction is this) terminals are assumed to have synthesized attributes only values for attributes of terminals supplied by lexical analyzer start symbol assumed not to have any inherited attributes synthesized attributes if only use synthesized attributes, called an S-attributed definition always can be evaled bottom up in parse tree expression evaluation example inherited attributes use for for keep track of context relative to parent for example if an identifier is on the left or right of an assignment (l-value or r-value) it is always possible to rewrite a syntax-directed definition to use only synthesized attributes, inherited make it more natural dependency graph of attributes better be a DAG evaluation order 1.) parse tree methods: at copile time determine eval order from toplogical sort of dependency graph 2.) rule based methods: at compiler-construction time, eval order is encoded in productions 3.) 
    dependency graph of attributes
        had better be a DAG
        evaluation order
            1.) parse tree methods: at compile time, determine the eval order from a topological sort of the dependency graph
            2.) rule based methods: at compiler-construction time, the eval order is encoded in the productions
            3.) oblivious methods: for example, the order is the order of parsing
            2 and 3 can be more efficient in time and space because they avoid the topological sort at compile time
    syntax trees (aka AST, abstract syntax trees)
        syntax directed translation can work with ASTs, not just parse trees
        you can use syntax-directed definitions to construct the AST if you write the definition right
        for expressions, you can eliminate common subexpressions when creating the AST
    bottom-up eval of S-attributed definitions
        whenever a reduction is made, the values of the new synthesized attributes are computed from the attributes appearing on the stack for the grammar symbols on the right side of the reducing production
    L-attributed definitions
        note that when translation takes place during parsing, it happens in bottom up order (oblivious style)
        this only works for the subclass called L-attributed (L implies the left-to-right depth first walk order)
    top down translation
        making L-attributed definitions work with top down predictive parsing
    %%% this % section is not on the reading list
    % bottom up eval of inherited attributes
    %     for L-attributed definitions
    %     works for all LL(1) grammars and many LR(1) grammars
    %     simulate inherited attributes or replace them with synthesized ones
    % recursive evaluators
    %     for when eval can't happen concurrently with parsing
    % assigning space to attribute values at compile time
    % assigning space to attribute values at compiler-construction time
    % analysis of syntax-directed definitions

- type checking (overlap with programming languages)
    examples of static checking
        type checks
        flow-of-control checks
        uniqueness checks
        name-related checks
    type systems
        lint has a more detailed type system than C, similar to C++
        type expressions
            basic types
            naming types
            type constructors
                arrays
                products (int x int)
                records
                pointers
                functions
            type variables
        static vs dynamic type checking
            a "sound" type system reduces the need for runtime checks
            array bounds checks are the classic example of a need for runtime checks
        equivalence of types (see programming languages; sketch below)
        type conversions
            coercions
        overloading of functions and operators
        polymorphic functions
            unification
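    a minimal C sketch of structural equivalence over type expressions built from constructors like those above; the constructor set and node layout are illustrative assumptions:

        /* structural type equivalence: two type expressions are equivalent if
           they apply the same constructor to equivalent subexpressions */
        #include <stdio.h>
        #include <stdbool.h>

        enum kind { BASIC, POINTER, ARRAY, PRODUCT, FUNCTION };

        struct type {
            enum kind kind;
            const char *name;            /* for BASIC: "int", "real", ... */
            int size;                    /* for ARRAY */
            struct type *a, *b;          /* element / pair / domain+range parts */
        };

        bool sequiv(struct type *s, struct type *t) {
            if (s->kind != t->kind) return false;
            switch (s->kind) {
            case BASIC:   return s->name == t->name;   /* interned names assumed */
            case POINTER: return sequiv(s->a, t->a);
            case ARRAY:   return s->size == t->size && sequiv(s->a, t->a);
            case PRODUCT:                               /* s1 x s2 */
            case FUNCTION:                              /* s1 -> s2 */
                return sequiv(s->a, t->a) && sequiv(s->b, t->b);
            }
            return false;
        }

        int main(void) {
            struct type integer = {BASIC, "int"};
            struct type p1 = {POINTER, 0, 0, &integer};
            struct type p2 = {POINTER, 0, 0, &integer};
            struct type arr = {ARRAY, 0, 10, &integer};
            printf("%d %d\n", sequiv(&p1, &p2), sequiv(&p1, &arr));  /* 1 0 */
            return 0;
        }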
- run time environments (overlap with programming languages, system software)
    source language issues
        procedures
        activation trees
        control stacks
        scope of declarations
        binding of names
            static vs dynamic
                proc definition vs proc activation
                name declaration vs name binding
                scope of a declaration vs lifetime of a binding
        questions
            is recursion allowed?
            what happens to local names when control returns?
            may a proc refer to nonlocal names?
            how are parameters passed?
            may procs be passed as parameters?
            may procs be returned as results?
            may storage be alloc'd dynamically under program control?
            must storage be dealloc'd explicitly?
    %%% this % section is not on the reading list
    % storage organization
    %     subdivision of run-time memory
    %         code
    %         static data
    %         stack growing down
    %         heap growing up
    %     activation records
    %     compile-time layout of local data
    % storage allocation strategies
    %     static allocation
    %     stack allocation
    %         dangling references
    %     heap allocation
    % access to nonlocal names
    %     blocks
    %     lexical scope without nested procedures
    %     lexical scope with nested procedures
    %         nesting depth
    %         access links
    %         procedure parameters
    %         displays
    %     dynamic scope
    %         deep access - longer search, but no call overhead
    %         shallow access - faster search, but need to save and restore values
    % parameter passing
    %     call-by-value
    %     call-by-reference (call by address, call by location)
    %     copy-restore (copy-in copy-out, value-result)
    %     call-by-name (copy rule of Algol)
    %         macro expansion, inlining
    % symbol tables
    %     entries
    %         characters in a name
    %         storage allocation information
    %     list data structures vs hashtables
    %     representing scope information
    % dynamic storage allocation
    %     issues
    %         garbage
    %         dangling references
    %     explicit alloc of fixed-sized blocks
    %     explicit alloc of variable-sized blocks
    %     implicit dealloc
    % fortran storage allocation
    %     common areas
    %     equivalence algorithms

- intermediate code generation
    advantages
        retargeting to new architectures
        machine independent code optimizers
    three address code (example below)
        assignment
            x := y op z
            x := op y (unary)
            x := y (copy)
        goto L (unconditional)
        if x relop y goto L (conditional)
        procedure call
            param x1
            ...
            param xn
            call p, n
        return y (y optional)
        indexed assignments
            x := y[i]
            x[i] := y
        address and pointer
            x := &y
            x := *y
            *x := y
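    to make the forms concrete, a hand translation of a small fragment; the temps t1/t2 and labels L1-L3 are made-up names:

        source:  while (a < b) a := a + c[i]

        L1:  if a < b goto L2
             goto L3
        L2:  t1 := c[i]
             t2 := a + t1
             a := t2
             goto L1
        L3:  ...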
    generating three address code with syntax directed definitions
    implementations
        quadruples: op, arg1, arg2, result
            arg1, arg2, result are pointers to symbol table entries
        triples: op, arg1, arg2
            avoids putting temporaries in the symtable; a result is referred to by its triple's index in the sequence of triples
        indirect triples: keep an array of statement pointers into the triples
        comparison
            quads are easy to reorder, but require symtable entries
            triples are smaller, no symtable entries
            indirect triples are reorderable and don't require symtable entries
    declarations
        declarations in a procedure
        tracking scope
        field names in records
    assignments
        names in the symbol table
        reusing temp names
        addressing array elements
            value: baseaddress + (index - lowerbound) * width
            compile time partial evaluation: rewrite as index * width + (baseaddress - lowerbound * width); the constant part c = baseaddress - lowerbound * width can be evaluated when the declaration is seen
            row major vs column major (more compile time precalc)
                row by row vs column by column
                pascal vs fortran
                a[1,1] a[1,2] a[2,1] a[2,2]   vs   a[1,1] a[2,1] a[1,2] a[2,2]
        type conversion in assignments
            inttoreal, realtoint
        accessing fields in records
    boolean expressions
        methods of translating
            numerical representation
            short circuit code
        flow of control statements
        mixed mode boolean expressions
        case statements
        backpatching
            context
                the easiest way to implement syntax-directed definitions is to first construct a syntax tree for the input, then walk it in depth first order, computing the translations from the definitions
                the problem is flow of control: we need to know the labels
                so leave the label unspecified and backpatch it later
                needed for both booleans and flow of control, since a short-circuit boolean is just sugar for flow of control
    procedure calls

- code generators
    issues
        form of input
            linear: postfix, quads, stack machine
            ASTs or DAGs
        target programs
            absolute code
            relocatable code
            assembly
        memory management
        instruction selection
        register allocation
        choice of evaluation order
            finding the best order is an NP-complete problem
    approaches to code gen
        correctness
        register allocation optimization
        tree directed code selection
    target machine
        instruction costs
    run-time storage management
        static allocation
        stack allocation
        run-time addresses for names
    basic blocks and flow graphs
        basic blocks
        transformations on basic blocks
            structure preserving transformations
                common subexpression elimination
                dead-code elimination
                renaming of temporary variables
                interchange of two independent adjacent statements
            algebraic transformations
                x=x+0 => noop
                x=x*1 => noop
                x=y^2 => x=y*y (strength reduction)
        flow graphs
            graph of basic blocks
            representation of basic blocks
            a loop is a set of nodes in the graph that is strongly connected and has a unique entry
    next use information (more in liveness analysis later; sketch below)
        backward pass marking "no next use" or the location of the next use
        remember which temps are block local in the symtable
        can consider generated temps as not live on block exit
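    a minimal C sketch of the backward next-use pass over one basic block; the quad layout and fixed-size tables are illustrative assumptions:

        /* backward next-use scan: walk the block last statement to first,
           recording for each statement where its operands are next used */
        #include <stdio.h>

        #define NONE (-1)
        #define NVARS 8

        struct quad { int x, y, z; };   /* x := y op z, variables as small ints */
        struct use  { int x_next, y_next, z_next; };

        void nextuse(struct quad *b, struct use *u, int n) {
            int next[NVARS];                      /* the symtable "next use" column */
            for (int v = 0; v < NVARS; v++)
                next[v] = NONE;                   /* assume nothing live past block end */
            for (int i = n - 1; i >= 0; i--) {
                u[i].x_next = next[b[i].x];       /* 1. attach current info to stmt i */
                u[i].y_next = next[b[i].y];
                u[i].z_next = next[b[i].z];
                next[b[i].x] = NONE;              /* 2. x is (re)defined here */
                next[b[i].y] = i;                 /* 3. y and z are next used here */
                next[b[i].z] = i;
            }
        }

        int main(void) {
            /* t0 := a + b ; t1 := t0 + c ; a := t1 + t0  (a=0 b=1 c=2 t0=3 t1=4) */
            struct quad block[] = { {3,0,1}, {4,3,2}, {0,4,3} };
            struct use u[3];
            nextuse(block, u, 3);
            for (int i = 0; i < 3; i++)
                printf("stmt %d: y next used at %d, z next used at %d\n",
                       i, u[i].y_next, u[i].z_next);
            return 0;
        }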
    simple code generator
        register and address descriptors
        simple algorithm: for each three address statement x := y op z in a basic block
            1.) invoke getreg to determine the location L where the result of the computation should be stored. L will usually be a register, but could be memory.
            2.) consult the address descriptor for y to determine y', one of the current locations of y. prefer a register for y' if the value is in both memory and a register. if the value of y is not already in L, generate the instruction MOV y', L to place a copy of y in L.
            3.) generate the instruction OP z', L where z' is the current location of z. again, prefer a register to memory for the location of z'. update the address descriptor of x to indicate that x is in location L. if L is a register, update its descriptor to indicate that it contains the value of x, and remove x from all other register descriptors.
            4.) if the current values of y and/or z have no next uses, are not live on exit from the block, and are in registers, alter the register descriptors to indicate that, after execution of x := y op z, those registers no longer contain y and/or z respectively.
            notes
                unary ops are just a special case
                x := y copy: just need to update the descriptors if y is in a register; most copies are eliminated by block improving and copy propagation anyway
                once we have processed all three-address statements, we store with MOV those values that are live on exit
        getreg for x := y op z
            1.) look for y in a register; return that register if we can reuse it (dealing with flushing register aliases for y)
            2.) otherwise return an empty register if one is found
            3.) otherwise, if x has a next use in the block, or op is an operator that needs a register, spill some register R to memory and return it
            4.) if x is not used in the block, select the memory location of x
            notes
                could also look at future usage to determine what to spill, etc
                could look at commutativity of op
        other types of statements
            indexed addresses
            conditional statements
    register allocation
        global register allocation
        usage counts
        register assignment for outer loops
        register allocation by graph coloring
    DAG representation of basic blocks
    peephole optimization (sketch below)
        examples
            redundant-instruction elimination
                redundant loads and stores
            flow-of-control optimizations
            unreachable code
            algebraic simplifications
                x=x+0 => noop
                x=x*1 => noop
            reduction in strength
                x=y^2 => x=y*y
            use of machine idioms
                i := i + 1 with an INC instruction
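    a minimal C sketch of one peephole rule, removing a load that immediately follows a store of the same value; the instruction encoding is an illustrative assumption, and a real pass would also check that the load is not a jump target:

        /* peephole sketch: delete "MOV a,R0" when it directly follows
           "MOV R0,a", i.e. a redundant load after a store */
        #include <stdio.h>
        #include <string.h>

        struct ins { char op[4]; char src[8]; char dst[8]; };

        /* slides a two-instruction window over the block, compacting in place */
        int peephole(struct ins *code, int n) {
            int out = 0;
            for (int i = 0; i < n; i++) {
                if (out > 0) {
                    struct ins *p = &code[out - 1];
                    if (strcmp(p->op, "MOV") == 0 && strcmp(code[i].op, "MOV") == 0 &&
                        strcmp(p->src, code[i].dst) == 0 &&   /* store R,a ... */
                        strcmp(p->dst, code[i].src) == 0)     /* ... load a,R */
                        continue;                             /* drop the load */
                }
                code[out++] = code[i];
            }
            return out;                                       /* new length */
        }

        int main(void) {
            struct ins block[] = {
                {"ADD", "R1", "R0"}, {"MOV", "R0", "a"}, {"MOV", "a", "R0"},
                {"SUB", "R1", "R0"},
            };
            int n = peephole(block, 4);
            for (int i = 0; i < n; i++)     /* prints 3 instructions, load removed */
                printf("%s %s,%s\n", block[i].op, block[i].src, block[i].dst);
            return 0;
        }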
    %%% this % section is not on the reading list
    % generating code from dags
    %     rearranging the order
    %     heuristic ordering for dags
    %     optimal ordering
    %         phase 1: label register usage counts bottom up
    %             left leaf needs 1
    %             right leaf needs 0
    %             if l != r, parent node is max(l,r)
    %             else (l == r), parent node is l+1
    %         phase 2: walk top down, governed by label counts, evaluating the subtree with more register needs first
    %             rstack of free registers
    %             gencode
    %                 case 0: leftmost leaf, generate load
    %                 case 1: subtree whose right operand is already in a location
    %                     eval left into R=top(rstack)
    %                     op name, R
    %                 case 2: right is harder than left
    %                     swap top of rstack from S,R,... to R,S,...
    %                     eval right into R=top(rstack)
    %                     remove R from rstack
    %                     eval left into S=top(rstack) (S is the orig top of stack)
    %                     op R, S - places output in the orig top of stack
    %                     swap rstack to restore it to its initial value
    %                 case 3: case 2 but with left harder than right, avoids the register swap
    %                 case 4: occurs when both subtrees require r or more registers; eval right into a temporary, then the left, then the root
    % multiregister operations
    %     expand labeling to understand register requirements for each operator
    % algebraic properties
    % common subexpressions
    %     with dags, we shared nodes
    %     this makes labeling/gencode different, difficult
    %     in fact, optimal code generation for dags on a one-register machine is NP-complete
    %     even if we have unlimited registers it's NP-complete
    % dynamic programming expands gencode to work for a broader class of machines with complex instruction sets
    % code-generator generators
    %     tree rewriting by pattern matching

- code optimization
    criteria
        correctness
        faster on average
        must be worth the effort
    optimization levels
        user can: profile, change algorithm, transform loops
        at intermediate code, the compiler can improve: loops, procedure calls, address calculations
        at target code, the compiler can: use registers, select instructions, do peephole optimization
    structure
        control-flow analysis
        data-flow analysis
        transformations
    function preserving transformations (example below)
        common subexpression elimination
            local to a basic block
            global across basic blocks
        copy propagation
        dead-code elimination
    loop optimization
        code motion
        induction variables and reduction of strength
    constant folding
    optimization of basic blocks
        more blather about algebraic stuff
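    a small before/after illustration (applied by hand, in C) of several of these transformations; the code and names are made up:

        /* before */
        void before(int a[], int n, int x) {
            for (int i = 0; i < n; i++) {
                int t = x * 4;               /* loop invariant: candidate for code motion */
                int u = t;                   /* copy: candidate for copy propagation */
                int dead = u + 99;           /* never used again: dead code */
                (void)dead;                  /* (cast only silences the warning) */
                a[i] = a[i] + u + x * 4;     /* x * 4 recomputed: common subexpression */
            }
        }

        /* after: code motion + copy propagation + CSE + dead-code elimination */
        void after(int a[], int n, int x) {
            int t = x * 4;                   /* hoisted out of the loop, computed once */
            for (int i = 0; i < n; i++)
                a[i] = a[i] + t + t;         /* both uses share the single computation */
        }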
    %%% this % section is not on the reading list
    % loops in flow graphs
    %     dominators
    %         d dom n - if every path from the initial block to n goes through d
    %         every node dominates itself
    %         the entry of a loop dominates all nodes in the loop
    %     natural loops
    %         1.) header - a loop must have a single entry point
    %         2.) path back to header - there must be at least one way to iterate the loop
    %         a good way to find loops in a graph is to search for edges whose heads dominate their tails, that is, a->b where b dom a
    %         given a back edge n->d, the natural loop of the edge is d plus the set of nodes that can reach n without going through d
    %             node d is the header
    %     inner loops
    %         using the natural loop definition
    %         unless two loops have the same header, they are either disjoint or one is nested within the other
    %         inner loop: the loop that contains no other loops
    %         when two loops have the same header, but neither is nested in the other, they are treated as a single loop
    %     preheader - a new basic block added before the header as a target for code motion
    %     reducible flow graphs
    %         structured flow control always leads to reducible flow graphs
    %             there are no jumps into the middle of a loop from outside
    %         definition
    %             1.) the forward edges form an acyclic graph in which every node can be reached from the initial node
    %             2.) the back edges consist only of edges whose heads dominate their tails
    % global data-flow analysis
    %     example
    %     points and paths
    %     reaching definitions
    %         unambiguous definitions
    %             x := y
    %         ambiguous definitions
    %             calling a procedure with x passed by-reference
    %             potential aliasing if x is an argument
    %             assignment through a pointer that could point to x, *q=y
    %         d reaches a point p if there is a path from the point following d to p such that d is not killed along that path
    %         gen[S]: definitions generated by S
    %         we kill a definition of a variable a if between two points along the path there is a definition of a
    %         kill[S]: definitions killed by S
    %         conservatively assume all paths can be taken
    %             true gen is a subset of computed gen
    %             true kill is a superset of computed kill
    %             safe because being conservative in this direction only prevents certain optimizations
    %             might be reversed for other analyses
    %         cases
    %             gen[S] and kill[S] are synthesized attributes, computed bottom up
    %             in[S] is inherited; out[S] is synthesized from in[S]
    %             S = d: a:=b+c
    %                 gen[S] = {d}
    %                 kill[S] = Da - {d}   (Da is the set of all definitions in the program for variable a)
    %                 out[S] = gen[S] union (in[S] - kill[S])
    %             S = S1; S2
    %                 gen[S] = gen[S2] union (gen[S1] - kill[S2])
    %                 kill[S] = kill[S2] union (kill[S1] - gen[S2])
    %                 in[S1] = in[S]
    %                 in[S2] = out[S1]
    %                 out[S] = out[S2]
    %             S = S1 or S2
    %                 gen[S] = gen[S1] union gen[S2]
    %                 kill[S] = kill[S1] intersection kill[S2]
    %                 in[S1] = in[S]
    %                 in[S2] = in[S]
    %                 out[S] = out[S1] union out[S2]
    %             S = loop around S1
    %                 gen[S] = gen[S1]
    %                 kill[S] = kill[S1]
    %                 in[S1] = in[S] union gen[S1]   (tricky to compute without rewriting)
    %                 out[S] = out[S1]
    %     set representation
    %     local reaching definitions
    %         save space by only saving at basic block boundaries, recompute with more time as needed
    %     use-definition chains
    %     general flow control
    %         dealing with break and continue
    %     iterative solution of data-flow equations
    %         alg: for reaching definitions
    %     available expressions
    %     ... realized here that this section is not on the reading list!!!

- grammars
    LL(1)
    LR(1)
    LALR (yacc/bison)
        shift-reduce conflicts
    recursive descent
    left recursion
    left factored
    nondeterministic CFL
    ambiguous

- common optimizations (some from above, just centralizing here)
    structure preserving transformations
        common subexpression elimination
        dead-code elimination
        renaming of temporary variables
        interchange of two independent adjacent statements
    algebraic transformations (beware, these often cause trouble by not preserving overflow/underflow behavior)
        x+0 = 0+x = x
        x-0 = x
        x*1 = 1*x = x
        x/1 = x
        strength reduction
            x=y^2 => x=y*y
            2.0*x => x+x
            x/2 => x*0.5
    peephole optimization examples
        redundant-instruction elimination
            redundant loads and stores
        flow-of-control optimizations
        unreachable code
        algebraic simplifications
            x=x+0 => noop
            x=x*1 => noop
        reduction in strength
            x=y^2 => x=y*y
        use of machine idioms
            i:=i+1 with INC instructions
    block improving
        copy propagation
        common subexpression elimination
        constant folding
        reduction in strength
    loop unrolling (example below)
    liveness - forward/backward analysis
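    a small before/after illustration of loop unrolling by hand, factor 4; n is assumed divisible by 4 to keep the sketch short (a real compiler adds a cleanup loop for the remainder):

        /* rolled: one test and one counter update per element */
        void zero(int a[], int n) {
            for (int i = 0; i < n; i++)
                a[i] = 0;
        }

        /* unrolled by 4: fewer branches and counter updates per element;
           assumes n % 4 == 0 */
        void zero_unrolled(int a[], int n) {
            for (int i = 0; i < n; i += 4) {
                a[i]     = 0;
                a[i + 1] = 0;
                a[i + 2] = 0;
                a[i + 3] = 0;
            }
        }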