Sequential Programs
MIPS [aside]
 a reduced instruction set computer (RISC) instruction set architecture (ISA)
 MIPS implementations are primarily used in embedded systems
 tries to avoid using the stack (memory access is slow)
 MIPS reserves four registers, $a0 ... $a3, to pass arguments
 arguments that cannot be contained in the 4 registers are passed on the stack
 however, nested functions still require the use of the stack
 this kind of architecture is advantageous for leaf procedures, where there are no nested function calls
Abstraction: labels & variables
Relocation and Linking
 def: an object file contains machine language code with additional metadata, which records how labels were used before translation to machine language (i.e. which addresses need to be offset)
 def: relocation is the process of adjusting addresses in machine lang code so it can be loaded at different memory addresses
 conceptually, relocation reverse-engineers what the assembly code would have been, reassembles it at some offset, and loads it at a different memory address
 def: linking is the process of combining object files into one (object file, program, library)
 to link, relocate conflicting object files and labels based on metadata
 Typical C compilation process: .c compiles to assembly .s, which assembles to object files .o, which then link into an executable program
Stack, $30
 $30 stores the stack pointer, pointing to the top of the stack
 to access variables, you need an offset from the stack pointer
 a symbol table maps variables to offsets
 a frame pointer, $29, is a copy of the stack pointer made at the beginning of a procedure. It stays constant for the duration of the procedure call and serves as the origin of offsets for accessing variables in the current stack frame. This lets the actual stack pointer change without invalidating our symbol table

variables are abstractions of storage locations that hold a value
 the extent or lifetime of a variable instance is the time interval in which the variable can be accessed
 local vars have the extent of execution of procedure, global have extent of entire program, fields of objects have extent of time until object is freed (explicitly or by GC)
 variables are stored at offsets relative to the frame pointer
 Symbol table holds the offset of each variable
 translation time/compile time is when you take a high-level program with variables and translate it to machine code. The symbol table is used here to replace variables with their respective addresses
 at compile time, to access a variable: LW(3, 8, 29) loads into R3 from offset 8 relative to the frame pointer ($29)
 on the stack, allocate chunk of consecutive memory locations rather than single bytes. Each chunk has 2 reserved words of memory: first is size, second is reserved by convention
 in assignments, allocate all memory in chunks
Evaluating Expressions
 Using the stack:
 for (e1 op e2), build a binary expression tree, then generate code to evaluate e1 and e2
 store intermediate results on the stack

conceptually simple, but verbose
evaluate(e1 op e2):
  evaluate(e1)            # result in $3
  push $3 onto stack
  evaluate(e2)            # result in $3
  pop from stack into $4
  $3 = $4 op $3
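The stack scheme above can be sketched as a tiny code generator in Python (the instruction tuples, register numbers, and names here are made up for illustration, not the course library):

```python
# Hypothetical sketch of stack-based expression code generation.
# "Registers" $3/$4 and the stack mirror the scheme in the notes.

def evaluate(e, code):
    """Emit stack-machine code for expression tree e into list `code`."""
    if isinstance(e, int):            # leaf: load a constant into $3
        code.append(("LIS", 3, e))
    else:                             # interior node: (e1, op, e2)
        e1, op, e2 = e
        evaluate(e1, code)
        code.append(("PUSH", 3))      # save e1's result on the stack
        evaluate(e2, code)
        code.append(("POP", 4))       # restore e1's result into $4
        code.append((op, 3, 4, 3))    # $3 = $4 op $3

code = []
evaluate(((1, "ADD", 2), "MUL", 3), code)   # (1 + 2) * 3
```

Conceptually simple, as the notes say, but every operator costs a push and a pop.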
 Using temp vars/virtual registers:
 Using temp vars with operations on registers

same as the first, but adds an abstraction which lends itself to a more efficient underlying implementation
evaluate(e1 op e2) {
withTempVar {
block(
evaluate(e1)
write(t1,$3)
evaluate(e2)
read($4, t1)
$3 = $4 op $3
)
}
}
If statements:
if (e1 op e2) T else E
evaluate(e1)
t1 = $3
evaluate(e2)
$4 = t1
comparison of $4 and $3, branch to labelElse
T
jump to labelEnd
Define(labelElse)
E
Define(labelEnd)
 a variable is live at program point p if the value that it holds may be read (even if only conditionally) sometime after p
 the live range of a variable is the set of program points where it is live
 two variables can share the same register if their live ranges do not overlap
 the start of a live range is always just after a write to the var
 the end of a live range is always just before a read of the var

the interference graph has a vertex for each variable and an edge between any two variables whose live ranges overlap
 apply graph colouring
 finding a minimal colouring (fewest colours) is NP-hard
 simple greedy algorithm: for each vertex, assign the smallest colour not used by its neighbours
 some special structured graphs allow for efficient algorithms
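The greedy algorithm above can be sketched in Python, using a hypothetical interference graph (the variable names and dict representation are made up for illustration):

```python
# Minimal sketch of register allocation by greedy graph colouring.
# The interference graph is a dict: variable -> set of neighbours
# whose live ranges overlap.

def greedy_colour(graph):
    colours = {}
    for v in graph:                       # visit vertices in some fixed order
        used = {colours[n] for n in graph[v] if n in colours}
        c = 0
        while c in used:                  # smallest colour unused by neighbours
            c += 1
        colours[v] = c
    return colours

# a and c have disjoint live ranges, so they can share a register
interference = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
}
regs = greedy_colour(interference)
```

Each colour stands for a register; interfering variables always receive different colours, though the total number of colours is not guaranteed to be minimal.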
 Procedures: an abstraction that encapsulates a reusable sequence of code
 calling code transfers control to the procedure (modifies PC), passing arguments for parameters
 the procedure transfers control back to the caller
 caller and callee must agree on conventions:
 which registers are used to pass arguments and return values
 which registers procedures may modify or preserve
 register with return address (31 or Reg.savedPC)
 who allocates/frees memory
# CALLER
Stack.allocate(parameters)
LIS(Reg.targetPC)
Use(proc)
JALR(Reg.targetPC)
# PROCEDURE/CALLEE
Define(proc)
Reg.savedParamPointer = Reg.allocated
Stack.allocate(frame)
dynamicLink = Reg.framePointer # saved frame pointer of caller
# savedPC is on the Stack, this is needed for nested procedures
savedPC = Reg.savedPC # Reg 31
Reg.framePointer = Reg.allocated
paramPointer = Reg.savedParamPointer
# body
Reg.savedPC = savedPC
Reg.framePointer = dynamicLink
Stack.pop # pops frame chunk
Stack.pop # pops parameters chunk
JR(Reg.savedPC)
On the stack:
 size
 reserved word (Assignment 11)
 variables
 savedPC
 dynamicLink
 paramPointer
argChunk (pointed to by paramPointer):
 size
 reserved word (A11)
 arguments
Some Conventions:
 Caller-save (registers the callee may overwrite): $31 savedPC, $3 result
 Callee-save (registers the callee must preserve): $29 framePointer, $30 stackPointer
 A5: don't mess up Reg.result
Static Link
A static link is a pointer to the frame of the statically enclosed procedure of the currently executing procedure
 the nesting depth of a proc is the number of outer procs that it is nested inside of
 top-level depth is 0
 to access a variable (eliminateVarAccesses, A6): n = depth(current proc) - depth(proc declaring variable)
 follow the static link n times, then find the variable in that frame
 to compute the static link at a call site:
 we want depth(static link) = depth(call target) - 1
 let n = depth(current proc) - depth(static link)
 so n = depth(current proc) - depth(call target) + 1
 if n is 0, pass the parameter pointer as the static link
 else follow the static link n times and pass the result to the callee as its static link. n ≥ 0 always
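The static-link arithmetic can be sketched in Python (frames modelled as dicts, and the names resolve, link_hops, and staticLink are illustrative assumptions, not the course API):

```python
# Sketch of static-link traversal and the call-site hop count.

def resolve(frame, var, n):
    """Follow the static link n times, then read var from that frame."""
    for _ in range(n):
        frame = frame["staticLink"]
    return frame[var]

def link_hops(depth_current, depth_target):
    """Hops to compute the static link to pass at a call site:
    n = depth(current proc) - depth(call target) + 1."""
    n = depth_current - depth_target + 1
    assert n >= 0
    return n

# outer proc's frame holds x; inner proc's frame links to it
outer = {"staticLink": None, "x": 42}
inner = {"staticLink": outer}
```

For example, a proc at depth 2 calling a sibling at depth 2 passes a link one hop up, while calling a proc nested directly inside itself (depth 3) gives n = 0.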
First class functions
Heap
A data structure that manages memory so it can be allocated and freed
 A6: allocate and never free
 A11: allocate and free when the extent ends
 every procedure with a closure nested inside will have a frame on the heap
Objects
Objects can be thought of in terms of closures
 objects have state and behaviour
 a collection of closures (indexed by names) that close over the same environment
def newCounter: (()=>int, (int)=>Unit) = {
var value = 0
def get() = value
def incrementBy(amount: Int) = {
value = value + amount
}
(get, incrementBy) // returns a pair of functions, that share an environment stored on the HEAP
}
A6 Tutorial
// Testing your closures locally
def increaseBy(increment: Int): (Int)=> Int = {
def procedure(x:Int) = { x + increment }
procedure
}
def main(a: Int, b: Int) = (increaseBy(a))(b)
// now reimplemented in Lacs
val increment = new Variable("increment")
val increaseBy = new Procedure("increaseBy",
Seq(increment), Seq())
val x = new Variable("x")
// 4th arg, Option can be None or some other Procedure
val procedure = new Procedure("procedure", Seq(x), Seq(), Some(increaseBy))
// some helper function
def v(variable: Variable): Code = read(Reg.result, variable)
// now define the body of the procedure
// make sure you implement some way that accesses outer scoped variable (should be similar to a regular read)
procedure.code = binOp(v(x), plus, v(increment)) // increment is an outer variable
increaseBy.code = Closure(procedure)
val a = new Variable("a")
val b = new Variable("b")
val parameterVar = new Variable("parameterVar")
val main = new Procedure("main", Seq(a,b), Seq())
// you can't use paramChunk because you don't know until runtime which paramChunk to access outer vars
main.code = CallClosure(call(increaseBy, v(a)), b, Seq(parameterVar))
val machineCode = compilerA6(Seq(main, increaseBy, procedure))
val endState = A1.loadAndRun(machineCode.words, Word(encodeSigned(1)), Word(encodeSigned(2)))
 For compilerA6, why do we need two phases? eliminateVarAccesses handles outer variables. To do that, you need the frames of all the outer procedures, which aren't known until every procedure has been translated.
 Phase one: translate all procedures and generate frames. Phase two: eliminateVarAccesses.
 phaseOneResults is a map from procedure to frame
 eliminateCalls → closure calls, static links?
 implementCall handles both closure calls and regular calls; it captures what the two have in common.
Finite Automata
In Theory of Computation, a Deterministic Finite Automaton (DFA) is a deterministic finite state machine that accepts or rejects finite strings of symbols, producing a unique run of the automaton for each input string. DFAs recognize exactly the set of regular languages, useful for lexical analysis and pattern matching
[ASIDE] A regular language is a formal language (a set of strings of symbols constrained by rules) that can be expressed with regular expressions.
 the collection of regular languages over an alphabet ∑ is defined recursively
 The empty language ø and the empty string language {ε} are regular languages
 forall a in ∑, the singleton language {a} is a regular language
 if A and B are regular languages, then A ∪ B, A•B (concatenation), and A* (Kleene star, zero or more repetitions) are regular languages
 no other languages over ∑ are regular
 equivalently, a regular language is one accepted by a deterministic or nondeterministic finite automaton
Formally, a DFA is a 5-tuple ⟨∑, Q, q0, A, ∂⟩ where
 ∑ is a finite alphabet
 Q is a finite set of states
 q0 ∈ Q is the starting state
 A ⊆ Q is the set of accepting states; the machine reports that the input string is a member of the language it accepts
 ∂: Q × ∑ → Q is the transition function
DFA recognition algorithm:
 define the extended transition function ∂*: Q × ∑* → Q, where ∑* is the set of strings over ∑
 ∂*(q, ε) = q
 ∂*(q, head::tail) = ∂*(∂(q, head), tail), recursively applying ∂ to each symbol to move between states
 a word w is accepted by the DFA if ∂*(q0, w) ∈ A
 the language defined/specified by a DFA is the set of words accepted by the DFA
 a language is regular iff there exists a DFA that specifies it
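The recognition algorithm transcribes directly into Python; the example DFA here (an even number of a's) is a made-up illustration:

```python
# Direct transcription of the ∂* recognition algorithm.

def dfa_accepts(delta, q0, accepting, word):
    q = q0
    for a in word:            # ∂*(q, head::tail) = ∂*(∂(q, head), tail)
        q = delta[(q, a)]
    return q in accepting     # accepted iff ∂*(q0, w) ∈ A

# toy DFA: accepts strings over {a} with an even number of a's
delta = {("even", "a"): "odd", ("odd", "a"): "even"}
```

The empty word is accepted here because ∂*(q0, ε) = q0, which is already in A.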
A nondeterministic finite automaton (NFA) allows multiple choices of transitions between states
 a word is accepted by a NFA if any path leads to accepting state
 the machine takes all choices at once, so the NFA is a set of states at any given point, rather than a single state
Formally, an NFA is a 5-tuple ⟨∑, Q, q0, A, ∂⟩, the same as a DFA except ∂: Q × ∑ → 2^Q is a transition function to a set of states, where 2^Q is the power set of Q
NFA recognition algorithm:
 define the extended transition function ∂*: Q × ∑* → 2^Q, where ∑* is the set of strings over ∑
 ∂*(q, ε) = {q}
 ∂*(q, head::tail) = the union over all q' ∈ ∂(q, head) of ∂*(q', tail)
 a word w is accepted by an NFA if ∂*(q0, w) ∩ A ≠ {}, i.e. the set of possible end states intersects A, the set of accepting states
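The set-of-states simulation can be sketched in Python; the example NFA (strings ending in "ab") is made up for illustration:

```python
# Set-of-states NFA simulation: the machine is in a set of states at
# every step, and a word is accepted if the final set intersects A.

def nfa_accepts(delta, q0, accepting, word):
    states = {q0}
    for a in word:
        # union of ∂(q, a) over all current states q (missing entries = {})
        states = set().union(*(delta.get((q, a), set()) for q in states))
    return bool(states & accepting)

# toy NFA: guesses when the final "ab" starts
delta = {
    ("s", "a"): {"s", "sawA"},
    ("s", "b"): {"s"},
    ("sawA", "b"): {"done"},
}
```

Note the dead-end branches simply drop out of the state set; no backtracking is needed.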
An NFA with ε-transitions (an ε-NFA) may change state without consuming an input symbol; this is convenient for composing machines. To eliminate ε-transitions from an NFA:
 where there is an ε-transition from state A to state B, give A copies of all of B's outgoing transitions (and make A accepting if B is accepting), then remove the ε-transition from A to B
Scanning
Breaking a string into tokens using an NFA constrained by maximal munch property
 Recognition: is some word in language?
 Scanning: split a string into tokens s.t. each token is in language
 maximal munch is the principle that when creating some construct, as much of the available input should be consumed
 input: string w, language L
 output: sequence of words w1, w2, ..., wn s.t.
 w1w2...wn = w (concatenated)
 forall i . wi in L
 forall i . wi is the longest prefix of wiwi+1...wn that is in L (maximal munch)
 when tokenizing, break the string into the largest tokens possible
Let L be the language of tokens, and L* be the language of words that can be scanned
 given an NFA for L, construct an NFA for L* by adding ε-transitions from every accepting state back to the start state
 since the NFA is nondeterministic, we constrain each choice using maximal munch to produce a unique run
 :( this might not find a solution when one exists, because always taking the longest token can overlook a possible path to a solution
 e.g. if L = {aa, aaa} and W = aaaa, then maximal munch first takes the token aaa from W, leaving W = a. a is not in L, so our method says no solution, even though it's possible to take the tokens aa and aa from W.
Method:
 1. run the DFA for L until it gets stuck
 2. if in a non-accepting state, backtrack to the last-visited accepting state; if no accepting state was visited, ERROR
 3. output the token corresponding to that accepting state
 4. reset the DFA to the start state and repeat from step 1 (this is why we added the ε-transitions for L* earlier)
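The method above can be sketched in Python, with a membership predicate standing in for running a real DFA (the names and the example language are illustrative):

```python
# Simulated maximal munch: scan forward remembering the last accepting
# position, emit that token, then restart from the start state.

def maximal_munch(w, accepts):
    tokens, i = [], 0
    while i < len(w):
        last = -1
        for j in range(i + 1, len(w) + 1):   # run forward over the input
            if accepts(w[i:j]):
                last = j                     # last-visited accepting state
        if last == -1:
            raise ValueError("stuck with no accepting state visited")
        tokens.append(w[i:last])             # emit the token
        i = last                             # reset to the start state
    return tokens

L = {"aa", "aaa"}
```

This reproduces the failure case from the notes: `maximal_munch("aaaa", lambda s: s in L)` takes `aaa` first and then errors on the leftover `a`, even though `aa`+`aa` would have worked.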
Regular Expressions
A regular expression expresses a regular language. L is a function that maps each regular expression to the set of strings it represents.
ε        L(ε) = {ε}                                         # empty string
a ∈ ∑    L(a) = {a}                                         # literal character
ø        L(ø) = {}                                          # empty language
R1 | R2  L(R1|R2) = L(R1) ∪ L(R2)                           # alternation; R1 & R2 are regexes
R1 R2    L(R1R2) = {xy | x ∈ L(R1), y ∈ L(R2)}              # concatenation of elements of the two languages
R*       L(R*) = {x1x2...xn | forall i . xi ∈ L(R), n ≥ 0}  # Kleene star
Kleene's Theorem:
 For a language L, the following statements are equivalent to each other:
 exists a DFA specifying L
 exists an NFA without ε-transitions specifying L
 exists an NFA specifying L (possibly with ε-transitions)
 exists a regex specifying L
 L is regular
ContextFree Grammars
Regular expressions can't express unlimited recursive structures. We want nesting in programming languages, so we need something more powerful than regular languages.
A context-free grammar (CFG) is a set of recursive rewriting rules (productions) used to generate patterns of strings. Regular expressions use iteration; context-free grammars use recursion
 set of terminal symbols ∑, which are the characters of the alphabet that appear in the strings
 a set of nonterminal symbols N, placeholders for the patterns of terminal symbols that the nonterminals can generate
 a set of productions P, rules for replacing nonterminal symbols on the left side of the production with other nonterminal or terminal symbols on the right
 a start symbol S, which is a special nonterminal symbol that appears in the initial string generated by the grammar
 A CFG may have a production for a nonterminal to the empty string (epsilon), effectively removing the nonterminal from the generated string
You can represent all regular languages with CFGs, but not all context-free languages can be described by regular expressions/DFAs/NFAs
To generate a string of terminal symbols from a CFG:
 start with a string consisting of the start symbol
 apply one of the productions with the start symbol
 repeat the process of selecting nonterminal symbols in the string and replacing them with the right side of corresponding productions until only terminal symbols remain
 αAβ directly derives (in one step) αγβ if A → γ ∈ P
α1 derives αn (in 0 or more steps) if α1 ⇒ α2 ⇒ ... ⇒ αn (written α1 ⇒* αn)
 Def: A formal grammar is a set of production rules for strings in a formal language
 Def: A regular grammar is a formal grammar that is right-regular or left-regular. For B, C in N, a in ∑:
 a right-regular grammar has only rules of the form B → a, B → aC, or B → ε
 a left-regular grammar has only rules of the form B → a, B → Ca, or B → ε
 a regular grammar describes a language describable by a regular expression. DFAs, Regular Grammar and Regular Expressions are just three formalizations of the same concept.
context-free languages are generated by CFGs; the set of all regular languages is a proper subset of the set of all context-free languages
canonical derivations: rightmost and leftmost, for precedence of parsing
 there exists a parse tree iff exists rightmost derivation iff exists leftmost derivation
 A CFG is ambiguous if it allows multiple parse trees for the same input (different paths of production rules for the same output)
Parsing
 Topdown parsing starts with S and finds derivation steps leading to result w. The hard problem is choosing the right rules
 Bottom-up parsing starts with w and finds derivation steps (in reverse) leading back to S.
 in a general CFG, the RHS of a production can have an arbitrary number of symbols
 in Chomsky Normal Form (CNF), each production has either exactly two nonterminals or exactly one terminal on the RHS
 formally, CFGs may have empty productions N → ε; these can always be eliminated (by altering the other production rules) without changing the language
CYK
CYK starts with S, the start nonterminal, as alpha. We want to expand nonterminals in alpha such that the input string is derived by the CFG, so we check all possible chains of production rules until we find a valid parse tree. It follows four cases, each trying to reduce the string of terminal and nonterminal characters, alpha, to the base case where alpha is empty. Since alpha is our current state, we keep checking the leftmost character of alpha to see if it can consume part of the input, with the end goal being an empty alpha and empty input (i.e. all of the input is derived by a parse tree of terminal/nonterminal characters).
The cost comes from the case where the leading character of alpha is a nonterminal: you then have to split the input string in two, such that the nonterminal derives the first partition and the rest of alpha derives the second. There are a quadratic number of substrings of the input to check, each with a linear number of partitions, giving O(n^3) time. Dynamic programming eliminates repeated calculations with a memoization table.
 Bottom up parsing, with dynamic programming to save results in table
 given w, does S ⇒* w?
 subproblem: given x, does alpha ⇒* x, where alpha is a sequence of terminals and nonterminals and x is a sequence of characters
 w = yxz for some y, z (x is a substring of w)
 A → gamma alpha ∈ P (alpha is a suffix of the RHS of some rule in P)
 so we test every suffix alpha against every substring x, in O(|w|^3) time: for each of the O(|w|^2) substrings, check O(|w|) partitions
 the memoization table has an entry for each pair of suffix alpha and substring x
 cases for filling our table:
 alpha = ε (empty string) : true if x = ε
 alpha = aß : true if x = ay and ß ⇒* y
 alpha = A : true if there exists A → gamma in P s.t. gamma ⇒* x
 alpha = Aß : true if there exist y, z where x = yz and A ⇒* y and ß ⇒* z
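The four cases above transcribe into a memoized Python sketch (the toy grammar S → aSb | ε and all names are made up for illustration):

```python
from functools import lru_cache

# grammar maps each nonterminal to its right-hand sides (as tuples)
grammar = {"S": [("a", "S", "b"), ()]}       # S -> a S b | epsilon
nonterminals = set(grammar)

@lru_cache(maxsize=None)                     # the memoization table
def derives(alpha, x):
    """Does the symbol sequence alpha derive the string x?"""
    if not alpha:                            # alpha = epsilon: true iff x = epsilon
        return x == ""
    head, rest = alpha[0], alpha[1:]
    if head not in nonterminals:             # alpha = a beta
        return x.startswith(head) and derives(rest, x[1:])
    if not rest:                             # alpha = A: try each rule A -> gamma
        return any(derives(g, x) for g in grammar[head])
    # alpha = A beta: split x = y z with A =>* y and beta =>* z
    return any(derives((head,), x[:k]) and derives(rest, x[k:])
               for k in range(len(x) + 1))
```

The last case is the expensive one: it tries every partition point k, which is where the cubic running time comes from.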
Earley's Algorithm
 top-down, with dynamic programming
 O(|w|^2) space
 O(|w|^3) time for ambiguous grammars
 O(|w|^2) time for unambiguous grammars
Context Sensitive Analysis
Reject programs that satisfy a grammar but are still invalid
 in Lacs, types prevent accidental misinterpretation
Lacs Types
 Int
 integers between -2^{31} and 2^{31}-1, with arithmetic modulo 2^{32}
 (type, ...) => type
 functions that take arguments of the specified types, return a value of the specified type
A type is a collection of values. It is an interpretation of a sequence of bits
 a type is a computable property of programs that guarantees some property of their execution
 ensures consistent use of operations
 identify how much memory is needed for a value
A type system is a set of rules that compute the type of an expression from the types of its subexpressions
 a type system is sound if whenever it computes a type γ for expression e, then e evaluates to a value v in γ
Using premise-conclusion notation from proofs, type inference rules:

____________
Γ ⊢ NUM : Int        # in the context Γ (capital gamma), the symbol table, NUM has type Int

Γ(ID) = γ            # Γ must be fed through all rules so that our symbol table is accessible when needed
____________
Γ ⊢ ID : γ

Let E ∈ {expras, expra, expr, term, factor}
Γ ⊢ E1 : Int    Γ ⊢ E2 : Int
____________________________
Γ ⊢ E1 op E2 : Int           # for op being an arithmetic operation
Building the type tree takes two passes: the first pass collects all "defdef"s and creates a new ProcedureScope for each; the second pass creates a symbol table for each.
Memory Management
Heap is a data structure to manage memory that can be allocated/freed at any time
 operations:
 new/malloc/allocate: allocate a new block of memory whose size can be determined at runtime
 free/delete
The stack and heap grow toward each other; by our convention the stack grows by subtraction, the heap by addition.
We can structure our heap as blocks of memory, each block holding metadata about its size and whether it is free or used, then keep a linked list of free blocks. A fragmented heap is split into many small blocks.
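The free-list idea can be sketched as a toy first-fit allocator in Python (the class, its layout, and the first-fit policy are illustrative assumptions, not the course heap):

```python
# Toy first-fit free-list allocator over a flat address space.
# Each free block is a (start, size) pair.

class FreeListHeap:
    def __init__(self, size):
        self.free = [(0, size)]            # one big free block initially

    def malloc(self, n):
        for i, (start, size) in enumerate(self.free):
            if size >= n:                  # first fit: take the first big-enough block
                rest = (start + n, size - n)
                self.free[i:i + 1] = [rest] if rest[1] else []
                return start
        raise MemoryError("heap fragmented or full")

    def delete(self, start, n):
        self.free.append((start, n))       # no coalescing in this sketch

h = FreeListHeap(64)
a = h.malloc(16)
b = h.malloc(16)
h.delete(a, 16)
```

Because freed blocks are never merged here, repeated malloc/delete cycles fragment the free list; a real allocator would coalesce adjacent free blocks or compact.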
Garbage Collector
Figures out the extent of objects for you, frees them implicitly.
 allocating a frame on the heap may call the garbage collector, which will call a function that could overwrite savedParamPtr
 after A6, the frame might be allocated on the heap. So always allocate on the stack first, then when you're done with Reg.savedPC, copy the chunk to the heap if needed
 when you allocate, you can only use machine language, because by this point most abstractions have been eliminated by the transformations. Write it in the style of the A6 tests
Cheney's Algorithm
A stop-and-copy algorithm: split the total heap size in half, and perform collection by copying live (reachable) objects from one semispace (the "from-space") to the other semispace (the "to-space")
 we refer to the semispaces as the "to" and "from" spaces rather than semispace 1 and 2 because their roles swap every time we perform garbage collection
The algorithm works in 2 parts:
 check object references from the stack, and do one of the following:
 if the object hasn't been copied yet, make a copy in the to-space, put a forwarding pointer in place of the object in the from-space, and update the reference to point to the newly copied object
 if it has been copied, update the reference using the forwarding pointer
 then scan the objects in the to-space for references and perform the same checks on them
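The two parts above can be modelled in Python, with objects as dicts and a "forward" key standing in for the forwarding pointer (all names here are illustrative, not the course implementation):

```python
# Toy model of Cheney's copy phase. `roots` plays the role of the
# references from the stack; each object may hold one "ref".

def cheney(roots):
    to_space, scan = [], 0

    def copy(obj):
        if obj is None:
            return None
        if "forward" in obj:               # already copied: follow forwarding ptr
            return obj["forward"]
        new = dict(obj)                    # copy the object into to-space
        to_space.append(new)
        obj.clear()
        obj["forward"] = new               # leave a forwarding pointer behind
        return new

    roots[:] = [copy(r) for r in roots]    # 1) fix references from the stack
    while scan < len(to_space):            # 2) scan copied objects' references
        o = to_space[scan]
        if "ref" in o:
            o["ref"] = copy(o["ref"])
        scan += 1
    return to_space

live = {"name": "a", "ref": {"name": "b", "ref": None}}
garbage = {"name": "dead", "ref": None}
roots = [live]
heap = cheney(roots)
```

Note that `garbage` is simply never copied: unreachable objects are reclaimed for free when the from-space is reused.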
Lambda Calculus
For expressing computation based on function abstraction. It is Turing complete and thus acts as a universal model of computation
 λ expressions and terms denote binding a variable to a function (think anonymous functions)
 square_sum(x, y) = x^2 + y^2 can be written in anonymous form as (x, y) → x^2 + y^2
 all functions in lambda calculus can be represented as single-input functions
 (x, y) → x^2 + y^2 can be written as x → (y → x^2 + y^2)
 this method is currying: chaining single-argument function calls together
 id(x) = x is the identity function, useful for having a value represented as an anonymous function
lambda calculus consists of a language of lambda terms, defined by formal syntax and a set of transformation rules

lambda expressions can be valid or invalid, a valid expression is a term
 all syntactically valid lambda terms are inductively defined by 3 rules:
 a variable x is a lambda term
 abstraction: if t is a lambda term, (λx.t) is a lambda term
 application: if t and s are lambda terms, (ts) is a lambda term
 nothing else is a lambda term
 an abstraction λx.t is a definition of an anonymous function that takes a single parameter x and substitutes it into expression t
 abstraction binds free occurences of x in t
 application applies one term to another; ts means t(s)
 β-reduction is performing the substitution of an argument into an abstraction's body during application
 α-conversion is renaming a bound variable to avoid capturing an existing variable of the same name
 has practical use for programming languages, where a variable is defined twice, once in a nested scope
Church Encoding
A means of representing data and operators in lambda calculus (since in untyped lambda calculus, we only have function types)
 church numerals are a representation of the natural numbers, using lambda calculus
 church encoding maps primitives to higher-order functions
 any computable operator and operands can be represented under church encoding (though not actually practical in use)
 it demonstrates the completeness of untyped lambda calculus
Church numerals can be used to represent natural numbers by the number of times some function f is applied to x:
 0 = λf.λx.x
 1 = λf.λx.f x
 2 = λf.λx.f (f x)
 n = λf.λx.f^{n} x
 addition = λm.λn.λf.λx.f^{m} (f^{n} x)
 the successor function succ(n) is β-equivalent to (plus 1)
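The numerals transcribe directly into Python lambdas (the helper `to_int` just counts applications for testing; it is not part of the encoding):

```python
# Church numerals: n = λf.λx. f applied n times to x.

zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))          # one more application of f
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))  # f^m after f^n

def to_int(n):
    """Decode a Church numeral by counting applications of f."""
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = plus(one)(one)
```

Everything here is a one-argument function of one-argument functions, illustrating that numbers and addition need nothing beyond abstraction and application.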
Exam
 Closures, tail calls
 formal languages
 regular languages
 DFAs, NFAs, Regex
 Scanning
 CFGs
 parse trees
 derivations
 ambiguity
 CYK parsing
 Other parsing algorithms (simple comparisons, big O)
 name resolution
 typechecking
 heap
 explicit malloc/free
 fragmentation, compaction
 Cheney's garbage collector
 lambda calculus