
INTERMEDIATE CODE GENERATOR

INTERPRETERS & COMPILERS

Compared by: code produced, level of programming, error detection, speed, memory space

Machine code: easy to write, better error messages, slower, more memory space (C, C++ etc.)
Intermediate code: easy to write, better error messages, slower, more memory space (Java, LISP, UNIX shell etc.)

Analysis-synthesis model: the front end analyses the source program and creates an intermediate representation, from which the back end generates the target code.

Details of the source language are confined to the front end, and details of the target machine to the back end.

STRUCTURE OF A COMPILER FRONT END

Front end: Parser (syntax checks) -> Static checker (type checking: compatibility of operators and operands) -> Intermediate code generator -> intermediate code
Back end: Code generator

INTERMEDIATE REPRESENTATIONS IN A COMPILER

Source program -> High-level intermediate representation -> ... -> Low-level intermediate representation -> Target code

High-level intermediate representation (close to the source language): syntax trees depict the natural hierarchical structure of the source code and are suited to static checking.

Low-level intermediate representation (close to the target machine): suitable for machine-dependent tasks such as register allocation and instruction selection.

DIRECTED ACYCLIC GRAPH

A variant of the syntax tree
Nodes represent constructs in the source program
Children represent the components of a construct
The DAG for an expression identifies its common subexpressions (subexpressions that occur more than once)
Leaves: atomic operands
Interior nodes: operators
A node that represents a common subexpression may have more than one parent
In a syntax tree the subexpression would be replicated as many times as it appears in the expression

The DAG gives the compiler important clues for generating efficient code to evaluate the expression.

Example: a + a*(b - c)+(b - c) * d

[Figure: syntax tree and DAG for the expression above]

An SDD can construct either the syntax tree (ST) or the DAG:

E -> E1 + T
E -> E1 - T
E -> T
T -> (E)
T -> id
T -> num

THE VALUE NUMBER METHOD FOR CONSTRUCTING DAGS

The nodes of the DAG are stored in an array of records.
Each row depicts a single node.
The first field of a record is an operation code that determines the label of the node.
Leaves have one additional field that holds the lexical value.
Interior nodes have two additional fields that indicate the left and right children (each is the integer index of the child's record within the array, called its value number).

Existing nodes are located using hash tables.

[Figure: DAG for i = i + 10 (nodes =, +, i, 10) and its array of records, with leaf rows id and num 10]
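The value-number method can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the class name `Dag`, the method names `leaf` and `node`, and the 0-based value numbers are assumptions made for this example. A Python dictionary plays the role of the hash table.

```python
class Dag:
    """DAG nodes stored as an array of records; a node's index is its value number."""

    def __init__(self):
        self.nodes = []   # records: (op, lexical_value) or (op, left, right)
        self.index = {}   # hash table: record signature -> value number

    def _lookup(self, record):
        # Reuse the existing node when this exact signature was seen before.
        if record not in self.index:
            self.index[record] = len(self.nodes)
            self.nodes.append(record)
        return self.index[record]

    def leaf(self, op, value):
        return self._lookup((op, value))

    def node(self, op, left, right):
        return self._lookup((op, left, right))


# Build the DAG for a + a*(b - c) + (b - c)*d
dag = Dag()
a, b, c, d = (dag.leaf("id", name) for name in "abcd")
b_minus_c = dag.node("-", b, c)
left = dag.node("+", a, dag.node("*", a, b_minus_c))
root = dag.node("+", left, dag.node("*", dag.node("-", b, c), d))

for number, record in enumerate(dag.nodes):
    print(number, record)
```

The second request for `b - c` returns the value number of the node created the first time, so the array ends up with nine records instead of the thirteen nodes of the syntax tree.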

CONSTRUCT THE DAG FOR THE FOLLOWING:

a+b+(a+b)
a+b+a+b
((x+y)-((x+y)*(x-y)))+((x+y)*(x-y))

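THREE ADDRESS CODE

At most one operator is permitted on the right-hand side of an instruction.
x+y+z is rewritten as
t1 = x+y
t2 = t1+z

The example a + a*(b - c) + (b - c)*d, following its DAG, becomes
t1 = b-c
t2 = a*t1
t3 = a+t2
t4 = t1*d
t5 = t3+t4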
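As a rough sketch of how such code can be produced, the function below walks an expression tree bottom-up and emits one instruction per operator, reusing a temporary when the same subexpression has already been computed. It borrows Python's own `ast` parser purely for convenience; the function name `three_address_code` and the `t1, t2, ...` naming scheme are assumptions for this illustration.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def three_address_code(expression):
    """Return a list of three-address instructions for an arithmetic expression."""
    code = []
    seen = {}  # (op, left, right) -> temporary that already holds the value

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left, right = visit(node.left), visit(node.right)
            key = (OPS[type(node.op)], left, right)
            if key not in seen:  # new subexpression: emit an instruction
                seen[key] = f"t{len(seen) + 1}"
                code.append(f"{seen[key]} = {left} {key[0]} {right}")
            return seen[key]
        raise ValueError("unsupported construct")

    visit(ast.parse(expression, mode="eval").body)
    return code


for line in three_address_code("a + a*(b - c) + (b - c)*d"):
    print(line)
```

For the expression shown, the five printed instructions match the sequence t1 to t5 above, with `b - c` computed only once.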

ADDRESSES AND INSTRUCTIONS

Three-address code is built from addresses and instructions.
Object-oriented: implemented using classes; the kinds of addresses and instructions are subclasses.
General: implemented using records with fields for the addresses.

ADDRESSES
Name: a source program name
Constant: constants and variables (type conversions may be done)
Compiler-generated temporary: useful for optimization; a distinct name is created each time a temporary is needed. Temporaries can be combined, if possible, when registers are allocated.

SYMBOLIC LABELS
Used to alter the flow of control
A label represents the index of a three-address instruction in the sequence of instructions.

INSTRUCTIONS
Assignment instruction: z = x op y (z, x, y are addresses; op is a binary arithmetic or logical operation)
Assignment instruction: z = op y (z, y are addresses; op is a unary operation: unary plus, minus, negation, shift, conversion)
Copy instruction: x = y
Unconditional jump: goto L
Conditional jump: if x goto L (used, for example, in loops)
Procedure calls and returns: param x; call p, n; y = call p, n // y is the return value
Indexed copy instruction: x = y[i]
Address and pointer assignment: x = &y, x = *y

QUADRUPLES
A quadruple has four fields: op, arg1, arg2, result
z = x + y: + is op; x and y are arg1 and arg2 respectively; z is the result

The exceptions are:
Unary operators have no arg2
Operators such as param have neither arg2 nor result
Conditional and unconditional jumps put the target label in result

A = B * -C + B * -C
T1 = minus C
T2 = B * T1
T3 = B * T1
T4 = T2 + T3
A = T4

OP      ARG1   ARG2   RESULT
minus   C             T1
*       B      T1     T2
*       B      T1     T3
+       T2     T3     T4
=       T4            A

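TRIPLES
A triple has only three fields: op, arg1, arg2 (no result field)
The result of an operation is referred to by its position.
Equivalent to the DAG representation
Benefit of quadruples over triples
An optimizing compiler often moves instructions around. With quadruples, when an instruction that computes a temporary t is moved, the instructions that use t need not be changed; with triples, every reference to the moved position must change.

Indirect triples consist of a list of pointers to triples, rather than a list of the triples themselves.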
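The relationship between the three forms can be sketched as follows, using the quadruples for A = B * -C + B * -C. The tuple layout, the function name `to_triples` and the use of plain integers as position references are assumptions chosen for this illustration.

```python
quads = [
    ("minus", "C", None, "T1"),
    ("*", "B", "T1", "T2"),
    ("*", "B", "T1", "T3"),
    ("+", "T2", "T3", "T4"),
    ("=", "T4", None, "A"),
]


def to_triples(quads):
    """Drop the result field; refer to earlier results by position instead."""
    position = {}  # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # A temporary becomes the position of its defining triple.
        return position.get(arg, arg)

    for op, arg1, arg2, result in quads:
        if op == "=":
            # A copy keeps its target variable as the first argument.
            triples.append((op, result, ref(arg1)))
        else:
            position[result] = len(triples)
            triples.append((op, ref(arg1), ref(arg2)))
    return triples


triples = to_triples(quads)

# Indirect triples: a separate list of pointers (here, indices) fixes the
# execution order, so reordering instructions only touches this list.
order = list(range(len(triples)))

for i in order:
    print(i, triples[i])
```

Because an integer argument such as `0` means "the result of triple 0", moving a triple would invalidate those references, which is the problem the `order` list of indirect triples avoids.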

STATIC SINGLE ASSIGNMENT
An intermediate representation that facilitates optimization.
All assignments in SSA are to variables with distinct names, hence the name static single assignment (SSA).

The same variable may be assigned in two different control-flow paths,
like if x>0 y1=1 else y2=0; b=a*y? (which name should replace y?)

SSA uses a φ-function to combine the two definitions of the same variable, here y.

if x>0 y1=1; else y2=0; y3=φ(y1,y2)

y3 assumes the value of the argument that corresponds to the control-flow path taken to reach the φ-function.
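For straight-line code without branches, the renaming step alone can be sketched as below. This is only an illustration under simplifying assumptions: each instruction is a string of the form `target = expression`, variable names do not themselves end in digits, and joins of control flow (where a φ-function would be needed) are not handled. The function name `to_ssa` is hypothetical.

```python
import re


def to_ssa(instructions):
    """Rename straight-line assignments so every target gets a fresh version."""
    version = {}  # variable -> latest version number

    def latest(match):
        # A use refers to the most recent version of the variable, if any.
        name = match.group()
        return f"{name}{version[name]}" if name in version else name

    renamed = []
    for line in instructions:
        target, expr = (part.strip() for part in line.split("=", 1))
        expr = re.sub(r"[A-Za-z_]\w*", latest, expr)  # rename uses first
        version[target] = version.get(target, 0) + 1  # then define a new name
        renamed.append(f"{target}{version[target]} = {expr}")
    return renamed


program = ["p = a + b", "q = p - c", "p = q * d", "p = e - p", "q = p + q"]
for line in to_ssa(program):
    print(line)
```

Variables that are never assigned in the fragment (a, b, c, d, e here) keep their original names, while p and q receive a new subscripted name at every assignment.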

TYPES AND DECLARATIONS

Type checking: uses logical rules to reason about the behavior of the program at run time.

It ensures that the types of the operands match the type expected by the operator.

Translation applications: from the type of a name, the compiler determines the storage that is needed at run time.

TYPE EXPRESSIONS

TYPE EQUIVALENCE
Many type-checking rules have the form: if two type expressions are equal then return a certain type, else error.
Ambiguity arises when a type expression is given a name and the name is then used in subsequent type expressions.

When type expressions are represented by graphs, two types are structurally equivalent if and only if one of the following holds:
They are the same basic type
They are formed by applying the same constructor to structurally equivalent types
One is a type name that denotes the other
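The three conditions translate almost directly into a recursive test. In this sketch, basic types are strings, constructed types are tuples whose first element is the constructor, and type names are looked up in a dictionary; all of these encodings, and the function name `equivalent`, are assumptions for illustration. Recursive (cyclic) type names are not handled.

```python
def equivalent(s, t, names=None):
    """Structural equivalence of type expressions written as nested tuples."""
    names = names or {}
    # A type name is replaced by the type expression it denotes.
    while isinstance(s, str) and s in names:
        s = names[s]
    while isinstance(t, str) and t in names:
        t = names[t]
    # Basic types (and plain components such as array sizes) must be identical.
    if not (isinstance(s, tuple) and isinstance(t, tuple)):
        return s == t
    # Constructed types: same constructor applied to equivalent components.
    return (
        s[0] == t[0]
        and len(s) == len(t)
        and all(equivalent(a, b, names) for a, b in zip(s[1:], t[1:]))
    )


names = {"link": ("pointer", "cell")}
print(equivalent(("array", 10, "int"), ("array", 10, "int")))   # True
print(equivalent(("array", 10, "int"), ("array", 10, "float")))  # False
print(equivalent("link", ("pointer", "cell"), names))            # True
```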

TYPE CHECKING
Assigns a type expression to each component of the source program.
The compiler then determines whether these type expressions conform to a collection of logical rules, called the type system of the source language.

Type checking has the potential to catch errors.

A sound type system eliminates the need for dynamic type checking, as it statically determines that these errors cannot occur when the target program runs.

A language is strongly typed if the compiler guarantees that the programs it accepts will run without type errors.

Rules for type checking:

Synthesis
Builds the type of an expression from the types of its subexpressions.
Requires names to be declared before use.

Inference
Determines the type of a language construct from the way it is used.

TYPE CONVERSIONS

Explicit conversions: casts
Implicit conversions: coercions
Usually widening is done
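Type synthesis combined with implicit widening can be sketched as below. The example assumes a two-level hierarchy in which int widens to float, supports only + and *, and again borrows Python's `ast` parser for convenience; the names `WIDTH`, `synthesize` and `check` are hypothetical.

```python
import ast

WIDTH = {"int": 0, "float": 1}  # position in an assumed widening hierarchy


def synthesize(node, env):
    """Return (type, code) for an expression, inserting widening coercions."""
    if isinstance(node, ast.Name):
        if node.id not in env:
            raise TypeError(f"{node.id} used before declaration")
        return env[node.id], node.id
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
        left_type, left_code = synthesize(node.left, env)
        right_type, right_code = synthesize(node.right, env)
        result = max(left_type, right_type, key=WIDTH.get)  # wider type wins
        # Coerce whichever operand is narrower than the result type.
        if left_type != result:
            left_code = f"({result}) {left_code}"
        if right_type != result:
            right_code = f"({result}) {right_code}"
        op = "+" if isinstance(node.op, ast.Add) else "*"
        return result, f"({left_code} {op} {right_code})"
    raise TypeError("unsupported construct")


def check(expression, env):
    return synthesize(ast.parse(expression, mode="eval").body, env)


print(check("i + x * j", {"i": "int", "x": "float", "j": "int"}))
```

The type of each expression is built from the types of its subexpressions, and an undeclared name is rejected, which mirrors the declare-before-use requirement of synthesis.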

OVERLOADING OF FUNCTIONS AND OPERATORS
An overloaded symbol has different meanings depending on its context.
Overloading is resolved when a unique meaning is determined for each occurrence of a name.

TYPE INFERENCE AND POLYMORPHIC FUNCTIONS
Polymorphic refers to any code fragment that can be executed with arguments of different types.

Parametric polymorphism: polymorphism characterized by parameters or type variables.

CONTROL FLOW
Boolean expressions are used to:

Alter the flow of control
Compute logical values

BOOLEAN EXPRESSIONS
Composed using the Boolean operators || (OR), && (AND) and ! (NOT)

SHORT CIRCUIT CODE

In short-circuit (jumping) code the Boolean operators translate into jumps. The operators themselves do not appear in the code; instead, the value of the expression is represented by a position in the instruction sequence.

E.g. if (x<100 || x>200 && x!=y) x=0;

if x<100 goto L1
ifFalse x>200 goto L2
ifFalse x!=y goto L2
L1: x=0
L2:
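One way to generate such jumping code is to pass each Boolean subexpression a true label and a false label, with a special value meaning "fall through to the next instruction". The sketch below assumes Boolean expressions encoded as nested tuples such as `("or", b1, b2)` with relational tests as strings; the names `FALL`, `jumping_code` and `translate_if` are hypothetical.

```python
import itertools

FALL = None  # special "label": fall through to the next instruction


def jumping_code(b, true, false, new_label, code):
    """Append short-circuit code for Boolean expression b to code."""
    if isinstance(b, str):  # relational test such as "x < 100"
        if true is not FALL and false is not FALL:
            code.extend([f"if {b} goto {true}", f"goto {false}"])
        elif true is not FALL:
            code.append(f"if {b} goto {true}")
        elif false is not FALL:
            code.append(f"ifFalse {b} goto {false}")
        return
    op = b[0]
    if op == "not":  # swap the two exits
        jumping_code(b[1], false, true, new_label, code)
    elif op == "or":  # if b1 is true, skip b2
        b1_true = true if true is not FALL else new_label()
        jumping_code(b[1], b1_true, FALL, new_label, code)
        jumping_code(b[2], true, false, new_label, code)
        if true is FALL:
            code.append(f"{b1_true}:")
    elif op == "and":  # if b1 is false, skip b2
        b1_false = false if false is not FALL else new_label()
        jumping_code(b[1], FALL, b1_false, new_label, code)
        jumping_code(b[2], true, false, new_label, code)
        if false is FALL:
            code.append(f"{b1_false}:")


def translate_if(condition, body):
    """Translate `if (condition) body` into three-address code."""
    counter = itertools.count(1)

    def new_label():
        return f"L{next(counter)}"

    code, end = [], new_label()
    jumping_code(condition, FALL, end, new_label, code)
    return code + body + [f"{end}:"]


condition = ("or", "x < 100", ("and", "x > 200", "x != y"))
for line in translate_if(condition, ["x = 0"]):
    print(line)
```

The printed code has the same shape as the hand-written translation above; only the label numbers differ, because this sketch allocates the exit label of the if-statement first (so it is L1 and the label before `x = 0` is L2).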

FLOW-CONTROL STATEMENTS

CONTROL-FLOW TRANSLATION FOR BOOLEAN EXPRESSIONS
A Boolean expression is translated into three-address code that evaluates it using conditional and unconditional jumps to one of two labels, true or false.

A Boolean expression may be used to

alter the flow of control
be evaluated for its value (x=true)
or be evaluated for an assignment such as x=a>b

To handle both uses, build a syntax tree and then compute the translations by one of:
Two passes: construct the syntax tree, then walk the tree in depth-first order computing the translations
One pass for statements and two passes for expressions

BREAK, CONTINUE AND GOTO STATEMENTS
Break
Continue
Goto
