Compiler Design Storage Allocation Strategies

Compiler Design
Lecture-1
Arup Kr. Chattopadhyay, Department of IT, IEM, Kolkata

Run-Time Environments
• The abstractions embodied in the source language definition are
names, scopes, bindings, data types, operators, procedures,
parameters, and flow-of-control constructs.
• A compiler must accurately implement these abstractions, and it must
also cooperate with the operating system and other systems software
to support them on the target machine.
• To do so, the compiler creates and manages a run-time environment
in which it assumes its target programs are being executed.
• This environment deals with a variety of issues, such as the layout
and allocation of storage locations for the objects named in the source
program, the mechanisms used by the target program to access
variables, the linkages between procedures, the mechanisms for
passing parameters, and the interfaces to the operating system,
input/output devices, and other programs.
Source Language Issues
Language features that affect the organization of memory:
• Does the source language allow recursion?
While recursive calls are being handled, several activations of the
same procedure may be alive simultaneously.
• How are parameters passed to a procedure?
- Call by value
- Call by address
- Call by reference
• Does a procedure refer to nonlocal names? If so, how?
• Does the language support dynamic allocation and deallocation of
memory?
Storage Organization
• From the perspective of the compiler writer, the executing target
program runs in its own logical address space in which each program
value has a location.
• The management and organization of this logical address space is
shared between the compiler, operating system, and target machine.
• The operating system maps the logical addresses into physical
addresses.
We assume:
• The run-time storage comes in blocks of contiguous bytes, where a
byte is the smallest unit of addressable memory.
• An elementary data type, such as a character, integer, or float, can
be stored in an integral number of bytes.
• Storage for an aggregate type, such as an array or structure, must
be large enough to hold all its components.
The storage layout for data objects is strongly influenced by the
addressing constraints of the target machine:
• Aligning: placing a data object at an address that is a multiple of
its alignment requirement, for example a 4-byte integer at an address
divisible by 4.
• Padding: leaving bytes unused so that the next object is aligned.
• Packing: laying out objects with no padding to save space, at the
cost of extra instructions to access the packed data.
Code area
- The size of the generated target code is fixed at compile time.
- The compiler can therefore place the executable target code in a
statically determined area, usually at the low end of memory.
Static area
- The size of some program data objects, such as global constants, and
of data generated by the compiler, such as information to support
garbage collection, may be known at compile time.
- One reason for statically allocating as many data objects as possible
is that the addresses of these objects can be compiled into the target
code. In early versions of Fortran, all data objects could be allocated
statically.
To maximize the utilization of space at run time, the other two areas,
Stack and Heap, are at the opposite ends of the remainder of the
address space.
Stack
- The stack stores data structures called activation records, which are
generated during procedure calls.
- In practice, the stack grows towards lower addresses.
Heap
- In practice, the heap grows towards higher addresses.
(We shall assume that the stack grows towards higher addresses so that
we can use positive offsets for notational convenience in all our
examples.)
Many programming languages allow the programmer to allocate and
deallocate data under program control. For example, C has the functions
malloc and free, which can be used to obtain and give back arbitrary
chunks of storage.
Storage Allocation Strategies
Run-time storage is divided into four areas:
• Code Area
• Static Data Area
• Stack Area
• Heap Area
Three different storage allocation strategies are based on this division
of run-time storage:
1. Static allocation: all data objects are allocated at compile time.
2. Stack allocation: a stack is used to manage the run-time storage.
3. Heap allocation: a heap is used to manage dynamic memory allocation.
1. Static Allocation
• The size of each data object is known at compile time, and the names
of these objects are bound to storage at compile time.
• The binding of a name to its storage does not change at run time.
• The compiler can easily determine the amount of storage required by
each data object.
• The compiler can fill in the addresses at which the target code finds
the data it operates on.
• Early versions of FORTRAN use the static allocation strategy.
Limitations of static allocation
• It can be used only if the size of the data object is known at compile
time.
• Data structures cannot be created dynamically, because memory cannot
be managed at run time.
• Recursive procedures are not supported, because all activations of a
procedure would share the same storage for its locals.
2. Stack Allocation
• Storage is organized as a stack, called the control stack.
• When a procedure is activated, its activation record is pushed onto
the stack; when the activation completes, the record is popped.
• The locals are stored in each activation record, so locals are bound
to fresh storage in every activation.
• Data structures can be created dynamically under stack allocation.
3. Heap Allocation
• If the values of local variables must be retained after the activation
ends, stack allocation cannot be used. This limitation of stack
allocation follows from its LIFO nature. Heap allocation is used to
retain such values.
• Heap allocation hands out a contiguous block of memory when it is
required and deallocates it when it is no longer needed. The deallocated
memory can be reused by the heap manager.
• Efficient heap management can be done by:
- Keeping a linked list of free blocks; when a block of memory is
deallocated, it is appended to the linked list.
- Allocating the most suitable block of memory from the linked list,
i.e. using the best-fit technique for allocation of a block.
Comparison between Static, Stack and Heap allocation

Static allocation           | Stack allocation            | Heap allocation
----------------------------+-----------------------------+----------------------------
Done for all data objects   | A stack is used to manage   | A heap is used to manage
at compile time.            | run-time memory.            | dynamic memory allocation.
----------------------------+-----------------------------+----------------------------
Data structures cannot be   | Data structures and data    | Data structures and data
created dynamically.        | objects can be created      | objects can be created
                            | dynamically.                | dynamically.
----------------------------+-----------------------------+----------------------------
Memory allocation: the      | Memory allocation: using    | Memory allocation: a
names of data objects are   | LIFO order, activation      | contiguous block of memory
bound to storage at         | records and data objects    | from the heap is allocated.
compile time.               | are pushed onto the stack.  |
                            | Memory can be addressed     |
                            | using index and registers.  |
----------------------------+-----------------------------+----------------------------
Merits and limitations:     | Merits and limitations:     | Merits and limitations:
simple to implement, but    | supports dynamic memory     | memory is managed
supports static allocation  | allocation, but is slower   | efficiently using a linked
only. Recursive procedures  | than static allocation.     | list, and deallocated
are not supported.          | Supports recursive          | space can be reused. But
                            | procedures, but the values  | since blocks are allocated
                            | of locals cannot be         | using best fit, holes may
                            | retained after the          | be introduced in memory.
                            | activation ends.            |
Static Versus Dynamic Storage Allocation
The layout and allocation of data to memory locations in the run-time
environment are key issues in storage management.
• We say that a storage-allocation decision is static if it can be made
by the compiler looking only at the text of the program, not at what the
program does when it executes.
• Conversely, a decision is dynamic if it can be decided only while the
program is running.
Many compilers use some combination of the following two strategies
for dynamic storage allocation:
1. Stack storage. Names local to a procedure are allocated space on a
stack. The stack supports the normal call/return policy for procedures.
2. Heap storage. Data that may outlive the call to the procedure that
created it is usually allocated on a "heap" of reusable storage. The heap
is an area of virtual memory that allows objects or other data elements
to obtain storage when they are created and to return that storage
when they are invalidated.
To support heap management, "garbage collection" enables the run-time
system to detect useless data elements and reuse their storage,
even if the programmer does not return their space explicitly.
Automatic garbage collection is an essential feature of many modern
languages, despite it being a difficult operation to do efficiently.
Stack Allocation of Space
Each time a procedure is called, space for its local variables is
pushed onto a stack, and when the procedure terminates, that space is
popped off the stack.
This arrangement not only allows space to be shared by procedure
calls whose durations do not overlap in time, but it allows us to compile
code for a procedure in such a way that the relative addresses of its
nonlocal variables are always the same, regardless of the sequence of
procedure calls.
Activation Trees
Stack allocation would not be feasible if procedure calls, or activations
of procedures, did not nest in time.
If an activation of procedure p calls procedure q, then that activation of
q must end before the activation of p can end. There are three common
cases:
1. The activation of q terminates normally. Then in essentially any
language, control resumes just after the point of p at which the call to
q was made.
2. The activation of q, or some procedure q called, either directly or
indirectly, aborts; i.e., it becomes impossible for execution to continue.
In that case, p ends simultaneously with q.
3. The activation of q terminates because of an exception that q cannot
handle. Procedure p may handle the exception, in which case the
activation of q has terminated while the activation of p continues,
although not necessarily from the point at which the call to q was made.
If p cannot handle the exception, then this activation of p terminates at
the same time as the activation of q, and presumably the exception will
be handled by some other open activation of a procedure.
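Example: Consider a program that reads nine integers into an array a
and sorts them using the recursive quicksort algorithm. (The bodies of
readArray and partition are not given in the slides; the ones shown here
are one possible implementation.)
#include <stdio.h>

int a[11];

/* Reads 9 integers into a[1], ..., a[9]. */
void readArray(void) {
    int i;
    for (i = 1; i <= 9; i++)
        if (scanf("%d", &a[i]) != 1) a[i] = 0;  /* assumed fallback on bad input */
}

/* Picks a separator value v, and partitions a[m..n] so that
   a[m..p-1] are less than v, a[p] = v, and a[p+1..n] are equal to
   or greater than v. Returns p. */
int partition(int m, int n) {
    int v = a[n], p = m, j, t;  /* assumed choice: v is the last element */
    for (j = m; j < n; j++)
        if (a[j] < v) { t = a[j]; a[j] = a[p]; a[p] = t; p++; }
    t = a[p]; a[p] = a[n]; a[n] = t;
    return p;
}

void quicksort(int m, int n) {
    int i;
    if (n > m) {
        i = partition(m, n);
        quicksort(m, i - 1);
        quicksort(i + 1, n);
    }
}

int main(void) {
    readArray();
    a[0] = -9999;
    a[10] = 9999;
    quicksort(1, 9);
    return 0;
}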
Possible activations for the program
enter main()
    enter readArray()
    leave readArray()
    enter quicksort(1,9)
        enter partition(1,9)
        leave partition(1,9)
        enter quicksort(1,3)
            ...
        leave quicksort(1,3)
        enter quicksort(5,9)
            ...
        leave quicksort(5,9)
    leave quicksort(1,9)
leave main()
[Figure: Activation tree representing calls during an execution of quicksort]
[Figure: Downward-growing stack of activation records]
Activation Records
- Procedure calls and returns are usually managed by a run-time stack
called the control stack.
- Each live activation has an activation record (sometimes called a
frame) on the control stack, with the root of the activation tree at the
bottom, and the entire sequence of activation records on the stack
corresponding to the path in the activation tree to the activation where
control currently resides.
An activation record may contain the following kinds of data:
- Temporary values, such as those arising from the evaluation of
expressions, in cases where those temporaries cannot be held in
registers.
- Local data belonging to the procedure whose activation record this is.
- A saved machine status, with information about the state of the
machine just before the call to the procedure. This information typically
includes the return address and the contents of registers that were
used by the calling procedure and that must be restored when the
return occurs.
- An "access link" may be needed to locate data needed by the called
procedure but found elsewhere, e.g., in another activation record.
- A control link, pointing to the activation record of the caller.
- Space for the return value of the called function, if any. Not all
called procedures return a value, and if one does, we may prefer to
place that value in a register for efficiency.
- The actual parameters used by the calling procedure. Commonly, these
values are not placed in the activation record but rather in registers,
when possible, for greater efficiency. However, we show a space for them
to be completely general.
Example: Using a factorial program, explain what the activation records
look like for every recursive call in the case of factorial(3).
Solution:
int factorial(int n) {
    if (n == 1)
        return 1;
    else
        return n * factorial(n - 1);
}

int main(void) {
    int f;
    f = factorial(3);
    return 0;
}
In each step the records are listed in the order in which the
activations began, with the most recent activation last.
Step 1: main() calls factorial(3).
Act. record for main()
Act. record for factorial(3)
Step 2: factorial(3) calls factorial(2).
Act. record for main()
Act. record for factorial(3)
Act. record for factorial(2)
Step 3: factorial(2) calls factorial(1).
Act. record for main()
Act. record for factorial(3)
Act. record for factorial(2)
Act. record for factorial(1)
Parameter Passing
There are two types of parameters:
i) Formal parameter: a name used in the definition of a procedure.
ii) Actual parameter: an argument supplied at a call of the procedure.
Based on these parameters there are various parameter passing
methods. The most common methods are (all the examples in
FORTRAN):
1. Call by value:
• The actual parameters are evaluated and their r-values are passed to
the called procedure.
• Operations on the formal parameters do not change the values of the
actual parameters.
• Example: Languages such as C and C++ use this parameter passing
method. In PASCAL, non-var parameters are passed this way.
2. Call by reference: This method is also called call by address or
call by location.
• The L-value, i.e. the address of the actual parameter, is passed to
the activation of the called routine.
• The values of the actual parameters can be changed.
• The actual parameter should have an L-value.
• Example: Reference parameters in C++, PASCAL's var parameters.
3. Copy restore: This method is a hybrid between call by value and
call by reference. It is also known as copy-in copy-out or
value-result.
• The calling procedure evaluates the actual parameter, and its value is
copied into the activation record of the called procedure.
• During execution of the called procedure, the actual parameter's value
is not affected.
• If the actual parameter has an L-value, then on return the value of the
formal parameter is copied back to the actual parameter.
• Example: Ada uses this kind of parameter passing.
4. Call by name:
• The procedure is treated like a macro. The procedure body is
substituted for the call in the caller, with the actual parameters
substituted for the formals.
• The actual parameters can be surrounded by parentheses to preserve
their integrity.
• The local names of the called procedure are kept distinct from the
names of the calling procedure.
• Example: ALGOL uses the call by name method.
Symbol Tables
• A compiler uses a symbol table to keep track of scope and binding
information about names.
• The table is searched every time a name is encountered in the source
code.
• A symbol-table mechanism must allow us to add new entries and find
existing entries efficiently.
We evaluate each scheme on the basis of the time required to add n
entries and make e inquiries.
Symbol-Table Entries
• The items to be stored in the symbol table are:
- Variable names
- Constants
- Procedure names
- Function names
- Literal constants and strings
- Compiler-generated temporaries
- Labels in the source language
• The compiler uses the following kinds of information from the symbol
table:
- Data type
- Name
- Declaring procedure
- Offset in storage
- For a structure or record, a pointer to the structure table
- For parameters, whether the parameter is passed by value or by reference
- Number and types of the arguments passed to a function
- Base address
How to store the names in symbol tables
• A symbol-table entry holds the lexeme, i.e. the character string
forming the name, and the attributes of the name.
• The lexeme is needed when a symbol-table entry is set up for the first
time, and when we look up a lexeme found in the input to determine
whether it is a name that has already appeared.
• There are two types of name representation:
1. Fixed-length name
• A fixed space for each name is allocated in the symbol table.
• If a name is shorter than the fixed space, the remaining space is
wasted.
2. Variable-length name
• Rather than allocating in each symbol-table entry the maximum
possible amount of space to hold a lexeme, we can use space
more efficiently if the entry holds only a pointer.
• In the record for a name, we place a pointer into a separate array of
characters (the string table), giving the position of the first character
of the lexeme.
Symbol Table Management
Requirements for symbol table management:
i) Quick insertion of an identifier and its related information
ii) Quick searching for an identifier
1. List data structure for symbol table
• A linear list is the simplest mechanism for implementing a symbol
table.
• An array is used to store names and associated information.
• New names are added in the order in which they arrive.
• The pointer 'available' is maintained at the end of all stored records.
• To retrieve the information about a name, we start from the
beginning of the array and search up to the available pointer. If we
reach the available pointer without finding the name, we report the
error "use of undeclared name".
• While inserting a new name, we must ensure that it is not already
there. If it is, another error occurs, i.e. "multiply defined name".
• The advantage of list organization is that it takes the minimum amount
of space.
2. Self-organizing list
• The linear list is implemented as a linked list. A link field is added
to each record.
• We search the records in the order given by the link fields.
• A pointer "First" is maintained to point to the first record of the
symbol table. For example, the order of the names along the links can be
Name 3, Name 1, Name 4, Name 2.
• When a name is referenced or created, it is moved to the front of
the list.
• The most frequently referenced names will tend to be at the front of
the list. Hence the access time for the most frequently referenced names
will be the least.
3. Hash tables
• Hashing is an important technique for searching the records of a
symbol table. This method is superior to list organization.
• A hash table consists of a fixed array of m pointers to table
entries.
• The table entries are organized into m separate linked lists, called
buckets. Each record in the symbol table appears on exactly one of these
lists.
• The dynamic storage allocation facilities of the implementation
language can be used to obtain space for the records, often at some
loss of efficiency.
[Figure: A hash table of size 211]
• To determine whether there is an entry for string s in the symbol
table, we apply a hash function h to s, such that h(s) returns an
integer between 0 and m - 1.
• If s is in the symbol table, then it is on the list numbered h(s). If s
is not yet in the symbol table, it is entered by creating a record for s
that is linked at the front of the list numbered h(s).
• The hash function should distribute names uniformly over the symbol
table.
• The hash function should produce a minimum number of collisions. A
collision is a situation in which the hash function gives the same
location for storing two different names.
• Collision resolution techniques include chaining (also called open
hashing) and rehashing.
• The advantage of hashing is quick search. The disadvantages are that
hashing is complicated to implement, some extra space is required, and
obtaining the scope of variables is very difficult.