
Type Checking:

A compiler must check that the source program follows both the syntactic and
the semantic conventions of the source language.

This kind of checking, called static checking, ensures that certain kinds of
programming errors will be detected and reported.
Examples of static checks:

Type checks:
A compiler should report an error if an operator is applied to an incompatible operand.
Example: Adding an array variable to a function variable.

Flow-of-control checks:
Statements that cause flow of control to leave a construct must have some place to
which to transfer the flow of control.
Example: A break statement in C causes control to leave the smallest enclosing
while, for, or switch statement; an error occurs if no such enclosing statement
exists.
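The break check can be sketched as a walk over the program that tracks how many while, for, or switch statements enclose the current statement. This is only an illustrative sketch: the tuple-based tree and the name check_breaks are assumptions made for the example, not part of any real compiler.

```python
# Hypothetical mini-AST (an assumption for this sketch): each statement is a
# tuple whose first item is its kind, e.g. ("while", [body statements]),
# ("block", [statements]), ("break",), ("expr",).
ENCLOSING = {"while", "for", "switch"}

def check_breaks(stmt, depth=0, errors=None):
    """Collect an error for every break with no enclosing while/for/switch."""
    if errors is None:
        errors = []
    kind = stmt[0]
    if kind == "break":
        if depth == 0:
            errors.append("break statement not within a loop or switch")
    elif kind in ENCLOSING:
        # Entering a construct that a break may legally leave.
        for child in stmt[1]:
            check_breaks(child, depth + 1, errors)
    elif kind == "block":
        # A plain block does not change the nesting depth.
        for child in stmt[1]:
            check_breaks(child, depth, errors)
    return errors

ok_program = ("block", [("while", [("expr",), ("break",)])])
bad_program = ("block", [("expr",), ("break",)])
```

Calling check_breaks(ok_program) returns an empty list, while check_breaks(bad_program) returns one error message.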

Uniqueness checks:
Sometimes an object must be defined exactly once.
Example: In Pascal, an identifier must be declared uniquely, the labels in a
case statement must be distinct, and the elements of a scalar type may not be
repeated.

Name-related checks:
Sometimes, the same name must appear two or more times.
Example: In Ada, a loop or block may have a name that appears at the
beginning and end of the construct. The compiler must check that the same
name is used at both places.
Some checks can be carried out while other work is being done: for example, as we
enter information about a name into a symbol table, we can check that the name is
declared uniquely.
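Checking uniqueness at the moment a name is entered into the symbol table might look like the following minimal sketch. The class SymbolTable and its declare method are hypothetical names, and a single flat scope is assumed, which is simpler than what Pascal or Ada actually require.

```python
class SymbolTable:
    """A single-scope symbol table (nested scopes are deliberately omitted)."""

    def __init__(self):
        self.entries = {}

    def declare(self, name, type_name):
        # The uniqueness check happens as a side effect of entering the name.
        if name in self.entries:
            raise NameError(f"identifier '{name}' is already declared")
        self.entries[name] = type_name

table = SymbolTable()
table.declare("count", "integer")
table.declare("total", "real")
# A second table.declare("count", "real") would raise NameError.
```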
Many Pascal compilers combine static checking and intermediate code
generation with parsing.

However, with more complex constructs, like those of Ada, it may be convenient
to have a separate type-checking pass between parsing and intermediate code
generation as shown below:

token stream -> Parser -> syntax tree -> Type checker -> syntax tree -> Intermediate code generator -> intermediate representation

A type checker verifies that the type of a construct matches that expected by its
context.

Example:
The built-in arithmetic operator MOD in Pascal requires integer operands, so
a type checker must verify that the operands of MOD have type integer.
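The MOD rule can be illustrated with a small recursive function that computes the type of an expression and rejects non-integer operands. This is a sketch under assumptions: the tuple-based expression tree and the function name type_of are invented for the example, and only literals and MOD are handled.

```python
# Hypothetical expression tree (an assumption for this sketch):
# ("int", 7), ("real", 2.5), or ("mod", left, right).
def type_of(expr):
    """Return the type of expr, raising TypeError if MOD is misused."""
    kind = expr[0]
    if kind == "int":
        return "integer"
    if kind == "real":
        return "real"
    if kind == "mod":
        left = type_of(expr[1])
        right = type_of(expr[2])
        # MOD requires both operands to have type integer.
        if left != "integer" or right != "integer":
            raise TypeError(
                f"MOD requires integer operands, got {left} and {right}")
        return "integer"
    raise ValueError(f"unknown expression kind: {kind}")

good = ("mod", ("int", 7), ("int", 3))
bad = ("mod", ("int", 7), ("real", 2.5))
```

Here type_of(good) yields "integer", while type_of(bad) raises a TypeError because the right operand has type real.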
Code Generation:
The final phase in a compiler model is the code generator.
Its input is an intermediate representation of the source program and it
produces as output an equivalent target program.
The output code must be correct and of high quality, meaning that it should
make effective use of the resources of the target machine. Moreover, the code
generator itself should run efficiently.

Issues in the design of a code generator:


While the details are dependent on the target language and the operating
system, issues such as memory management, instruction selection, register
allocation, and evaluation order are inherent in almost all code-generation
problems.
Input to the code generator:
The input to the code generator consists of the intermediate representation of the
source program, together with information in the symbol table that is used to
determine the run-time address of the data objects denoted by the names in the
intermediate representation.
There are several choices for the intermediate language, as mentioned in
previous classes:
Three-address representations such as triples and quadruples;
Linear representations such as postfix notation; and
Graphical representations such as syntax trees and DAGs.
Before code generation starts, we assume that the front end has scanned,
parsed, and translated the source program into a reasonably detailed
intermediate representation, so that the values of names appearing in the
intermediate language can be represented by quantities that the target machine
can directly manipulate (bits, integers, reals, etc.).
We also assume that all the necessary type checking has been done.
Therefore, the code generation phase can proceed on the assumption that its
input is free of errors.
Target programs:
The output of the code generator is the target program.
This output may be absolute machine language, relocatable machine language, or
assembly language.

Producing an absolute machine language program as output has the advantage
that it can be placed in a fixed location in memory and immediately executed.
Producing a relocatable machine language program as output allows subprograms
to be compiled separately. These relocatable object modules can be linked together
and loaded for execution.

We gain a great deal of flexibility in being able to compile subroutines separately
and to call other previously compiled programs from an object module.
Producing an assembly-language program as output makes the process of code
generation somewhat easier.
Instruction Selection:
The uniformity and completeness of the instruction set are important factors;

If the target machine does not support each data type in a uniform manner, then
each exception to the general rule requires special handling.

Instruction speed is also an important factor.

If we do not care about the efficiency of the target program, instruction selection
is straightforward.

For each type of three-address statement, we can design a code skeleton that
outlines the target code to be generated for that construct.
For example, every three-address statement of the form x := y + z can be
translated into the code sequence:
MOV y, R0 /* load y into register R0 */
ADD z, R0 /* add z to R0 */
MOV R0, x /* store R0 into x */
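A code skeleton of this kind can be sketched as a function that fills the operand names into a fixed instruction pattern. The function name gen_add and the default register R0 are assumptions made for illustration; a real code generator would also choose the register.

```python
def gen_add(x, y, z, reg="R0"):
    """Emit the fixed skeleton for the three-address statement x := y + z."""
    return [
        f"MOV {y}, {reg}",  # load y into the register
        f"ADD {z}, {reg}",  # add z to the register
        f"MOV {reg}, {x}",  # store the register into x
    ]

for line in gen_add("x", "y", "z"):
    print(line)
```

Because the skeleton is applied to each statement in isolation, it takes no account of what earlier statements have already left in the register.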
Unfortunately, this kind of statement-by-statement code generation often produces
poor code.

For example, the sequence of statements
a := b + c
d := a + e
would be translated into:
MOV b, R0
ADD c, R0
MOV R0, a
MOV a, R0
ADD e, R0
MOV R0, d
Here the fourth instruction is redundant, since it reloads a value that is already in R0, and so is the third if a is not subsequently used.
The quality of the generated code is determined by its speed and size.
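A load that immediately follows a store of the same register to the same name, as in MOV R0, a followed by MOV a, R0, can be removed by a simple pass over the generated code. The sketch below is one possible way to do this, not a method prescribed by these notes: the helper names are hypothetical, instructions are plain strings, and there are assumed to be no labels or jumps, so the second instruction can never be reached from elsewhere.

```python
def parse(instr):
    """Split 'OP src, dst' into its three parts."""
    op, operands = instr.split(None, 1)
    src, dst = [part.strip() for part in operands.split(",")]
    return op, src, dst

def is_reverse_move(prev, cur):
    """True if cur is 'MOV a, R' directly after prev 'MOV R, a'."""
    p_op, p_src, p_dst = parse(prev)
    c_op, c_src, c_dst = parse(cur)
    return p_op == c_op == "MOV" and p_src == c_dst and p_dst == c_src

def remove_redundant_loads(code):
    """Drop each load that only re-reads the value just stored."""
    result = []
    for instr in code:
        if result and is_reverse_move(result[-1], instr):
            continue  # the register already holds this value
        result.append(instr)
    return result

code = ["MOV b, R0", "ADD c, R0", "MOV R0, a",
        "MOV a, R0", "ADD e, R0", "MOV R0, d"]
improved = remove_redundant_loads(code)
```

This pass removes only the reload. Deleting the store MOV R0, a as well would require knowing that a is not used later, which this sketch does not attempt.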
The Target Machine:
Our target computer is a byte-addressable machine with four bytes to a word and
n general-purpose registers, R0, R1, . . ., Rn-1. It has two-address instructions of the form:
op source, destination
where op is an op-code, and source and destination are data fields. The op-codes include:
MOV (move source to destination)
ADD (add source to destination)
SUB (subtract source from destination)
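The meaning of these three op-codes can be illustrated with a toy interpreter. This is only a sketch under simplifying assumptions: registers and memory names share a single Python dictionary, every operand is a plain name, and addressing modes, word size, and byte addressing are ignored. The function name run is hypothetical.

```python
def run(program, memory):
    """Execute 'op source, destination' instructions on a copy of memory."""
    store = dict(memory)  # registers and named memory locations together
    for instr in program:
        op, operands = instr.split(None, 1)
        src, dst = [part.strip() for part in operands.split(",")]
        if op == "MOV":
            store[dst] = store[src]               # move source to destination
        elif op == "ADD":
            store[dst] = store[dst] + store[src]  # add source to destination
        elif op == "SUB":
            store[dst] = store[dst] - store[src]  # subtract source from destination
        else:
            raise ValueError(f"unknown op-code: {op}")
    return store

final = run(["MOV y, R0", "ADD z, R0", "MOV R0, x"], {"y": 2, "z": 3})
```

After this run, final["x"] holds 5, the value of y + z.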
