
CS 2304 SYSTEM SOFTWARE
G.PRABHAKARAN AP/CSE
S.SELVARANI AP/CSE

UNIT II ASSEMBLERS

Basic assembler functions


A simple SIC assembler
Assembler algorithm and data structures
Machine dependent assembler features
Instruction formats and addressing modes
Program relocation
Machine independent assembler features
Literals
Symbol-defining statements
Expressions
Program blocks
One pass assemblers and Multi pass assemblers
Implementation examples
MASM assembler.

Assemblers [Q1]:

(i) Translating mnemonic operation codes to their machine language equivalents and
assigning machine addresses to symbolic labels used by the programmer.
(ii) There are some features of an assembler language that have no direct relation to machine
architecture.

1. Basic Assembler Functions [Q2]:

The SIC assembler language uses the following assembler directives:
START - Specify the name and starting address for the program.
END - Indicate the end of the source program and specify the first executable instruction in
the program.
BYTE - Generate a character or hexadecimal constant, occupying as many bytes as needed to
represent the constant.
WORD - Generate a one-word integer constant.
RESB - Reserve the indicated number of bytes for a data area.
RESW - Reserve the indicated number of words for a data area.

2 MARKS

1.Define assembler.
2. What are the basic functions of an assembler? Explain.

III CSE UNIT II


http://csetube.weebly.com/

Example:
SIC assembler language program.
The program contains a main routine that reads records from an input device and copies
them to an output device.
This main routine calls subroutine RDREC to read a record into a buffer and subroutine
WRREC to write the record from the buffer to the output device.
Each subroutine must transfer the record one character at a time, because the only I/O
instructions available are RD and WD.
The buffer is necessary because the I/O rates for the two devices, such as a disk and a slow
printing terminal, may be very different.
The end of each record is marked with a null character. If a record is longer than the length
of the buffer (4096 bytes), only the first 4096 bytes are copied.
The program does not deal with error recovery.
The end of the file to be copied is indicated by a zero-length record.
When the end of file is detected, the program writes EOF on the output device and terminates
by executing an RSUB instruction.
This program was called by the operating system using a JSUB instruction; thus the RSUB
will return control to the operating system.

PROGRAM

LINE  LABEL   OPCODE  OPERAND   EXPLANATION
5     COPY    START   1000      COPY FILE FROM I/P TO O/P
10    FIRST   STL     RETADR    SAVE RETURN ADDRESS
15    CLOOP   JSUB    RDREC     READ I/P RECORD
20            LDA     LENGTH    TEST FOR EOF (LENGTH=0)
25            COMP    ZERO
30            JEQ     ENDFIL    EXIT IF EOF FOUND
35            JSUB    WRREC     WRITE O/P RECORD
40            J       CLOOP     LOOP
45    ENDFIL  LDA     EOF       INSERT END OF FILE MARKER
50            STA     BUFFER
55            LDA     THREE     SET LENGTH=3
60            STA     LENGTH
65            JSUB    WRREC     WRITE EOF
70            LDL     RETADR    GET RETURN ADDRESS
75            RSUB              RETURN TO CALLER
80    EOF     BYTE    C'EOF'
85    THREE   WORD    3
90    ZERO    WORD    0
100   RETADR  RESW    1
105   LENGTH  RESW    1
110   BUFFER  RESB    4096

SUBROUTINE TO READ RECORD INTO BUFFER

PROGRAM

LINE  LABEL   OPCODE  OPERAND   EXPLANATION
125   RDREC   LDX     ZERO      CLEAR LOOP COUNTER
130           LDA     ZERO      CLEAR A TO ZERO
135   RLOOP   TD      INPUT     TEST I/P DEVICE
140           JEQ     RLOOP     LOOP UNTIL READY
145           RD      INPUT     READ CHARACTER INTO REGISTER A
150           COMP    ZERO      TEST FOR END OF RECORD
155           JEQ     EXIT      EXIT LOOP IF EOR
160           STCH    BUFFER,X  STORE CHARACTER IN BUFFER
165           TIX     MAXLEN    LOOP UNLESS MAX LENGTH HAS BEEN REACHED
170           JLT     RLOOP
175   EXIT    STX     LENGTH    SAVE RECORD LENGTH
180           RSUB              RETURN TO CALLER
185   INPUT   BYTE    X'F1'     CODE FOR I/P DEVICE
190   MAXLEN  WORD    4096

SUBROUTINE TO WRITE RECORD FROM BUFFER

PROGRAM

LINE  LABEL   OPCODE  OPERAND   EXPLANATION
200   WRREC   LDX     ZERO      CLEAR LOOP COUNTER
210   WLOOP   TD      OUTPUT    TEST OUTPUT DEVICE
215           JEQ     WLOOP     LOOP UNTIL READY
220           LDCH    BUFFER,X  GET CHARACTER FROM BUFFER
225           WD      OUTPUT    WRITE CHARACTER
230           TIX     LENGTH    LOOP UNTIL ALL CHARACTERS HAVE BEEN WRITTEN
              JLT     WLOOP
235           RSUB              RETURN TO CALLER
240   OUTPUT  BYTE    X'05'     CODE FOR O/P DEVICE
245           END     FIRST

2. A Simple SIC Assembler:

The translation of the source program to object code requires the following functions:
Convert mnemonic operation codes to their machine language equivalents
(example: translate STL to 14).
Convert symbolic operands to their equivalent machine addresses (example: translate
RETADR to 1033).
Build the machine instructions in the proper format.
Convert the data constants specified in the source program into their internal machine
representations (example: EOF to 454F46).
Write the object program and the assembly listing.

Consider the statement,
10 1000 FIRST STL RETADR
If we try to translate the program line by line, we will be unable to process this statement because we
do not yet know the address that will be assigned to RETADR.
Because of this, most assemblers make two passes over the source program.
The first pass does little more than scan the source program for label definitions and assign
addresses.
The second pass performs most of the actual translation previously described.
In addition to translating the instructions of the source program, the assembler must process
statements called assembler directives or pseudo-instructions.
These statements are not translated into machine instructions. Instead, they provide
instructions to the assembler itself (example: BYTE, WORD).
In our example program
START - Specifies the starting memory address for the object program.
END - Specifies the end of the program.
Finally, the assembler must write the generated object code onto some output device.
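
The translation functions of this section can be tried out with a small Python sketch. It is not part of the original notes: the one-entry OPTAB and SYMTAB dictionaries and the function names are assumptions made for illustration, and only the values quoted in the notes (STL = 14, RETADR = 1033, EOF = 454F46) are used.

```python
# Hypothetical one-entry tables holding only the values quoted in the notes.
OPTAB = {"STL": 0x14}
SYMTAB = {"RETADR": 0x1033}


def assemble_sic(mnemonic, operand, indexed=False):
    """Build a 24-bit SIC instruction: 8-bit opcode, 1-bit x flag, 15-bit address."""
    opcode = OPTAB[mnemonic]      # mnemonic -> machine language equivalent
    address = SYMTAB[operand]     # symbolic operand -> machine address
    word = (opcode << 16) | (int(indexed) << 15) | (address & 0x7FFF)
    return f"{word:06X}"


def byte_constant(text):
    """Convert a character constant such as C'EOF' into hexadecimal ASCII codes."""
    return text.encode("ascii").hex().upper()


print(assemble_sic("STL", "RETADR"))
print(byte_constant("EOF"))
```

The lookup in SYMTAB is exactly the step that fails on a single line-by-line scan when RETADR has not been seen yet, which is why the address table is filled in a separate first pass.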


Object program format is divided into three types of records [Q3]:

Header
Text
End

The header record contains the program name, starting address and length.
Header record:
col:1 H
col:2-7 Program name
col:8-13 Starting address of object program
col:14-19 Length of object program in bytes

The text records contain the translated instructions and data of the program, together with an
indication of the addresses where these are to be loaded.
Text record:
col:1 T
col:2-7 Starting address for object code in this record
col:8-9 Length of object code in this record in bytes
col:10-69 Object code, represented in hexadecimal

3. Define record. Explain. (2 MARKS)

The end record marks the end of the object program and specifies the address in the program
where execution is to begin.
End record:
col:1 E
col:2-7 Address of first executable instruction in object program.
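
The three record layouts can be sketched as Python formatting helpers. This is only an illustration under stated assumptions: the function names are hypothetical, all numeric fields are assumed to be written in hexadecimal, and the program length 107A used in the example is an illustrative value that the notes do not state.

```python
def header_record(name, start, length):
    """H, program name (col 2-7), start address (col 8-13), length (col 14-19)."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """T, start address (col 2-7), length in bytes (col 8-9), object code (col 10-69)."""
    body = "".join(object_codes)
    if len(body) > 60:
        raise ValueError("a text record holds at most 60 hex digits of object code")
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_instruction):
    """E, address of the first executable instruction (col 2-7)."""
    return f"E{first_instruction:06X}"


print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, ["141033"]))
print(end_record(0x1000))
```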

The job of the assembler is to generate object code. However, when a statement is read the
assembler may not yet know the addresses of the symbols it refers to, so the work is split
into a pass 1 algorithm and a pass 2 algorithm [Q4].

Pass 1:
1. Assign addresses to all statements in the program.
2. Save the values assigned to all labels for use in pass 2.
3. Perform some processing of assembler directives.
Pass 2:
1. Assemble instructions.
2. Generate data values.
3. Perform processing of assembler directives not done during pass 1.
4. Write the object program and the assembly listing.


3. Assembler Algorithm and Data Structures:

Our simple assembler uses two major internal data structures [Q5]:
-The operation code table (OPTAB)
-The symbol table (SYMTAB)
OPTAB is used to look up mnemonic operation codes and translate them to their machine
language equivalents [Q6].
SYMTAB is used to store values assigned to labels [Q7].
LOCCTR - This is a variable that is used to help in the assignment of addresses [Q8].
LOCCTR is initialized to the beginning address specified in the START statement.
After each source statement is processed, the length of the assembled instruction or data area
to be generated is added to LOCCTR.
Whenever we reach a label in the source program, the current value of LOCCTR gives the
address to be associated with that label.
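
The way LOCCTR and SYMTAB work together can be sketched in Python as follows. This is a simplified illustration, not the algorithm figure of the notes: the tuple representation of source lines, the reduced OPTAB and the shortened six-statement program are assumptions, so the addresses it produces differ from those of the full COPY program.

```python
OPTAB = {"STL", "JSUB", "LDA", "RSUB"}  # every SIC instruction occupies 3 bytes


def statement_length(opcode, operand):
    """Number of bytes the statement adds to LOCCTR."""
    if opcode in OPTAB or opcode == "WORD":
        return 3
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        kind, value = operand[0], operand[2:-1]   # C'EOF' or X'F1'
        return len(value) if kind == "C" else (len(value) + 1) // 2
    raise ValueError(f"invalid operation code: {opcode}")


def pass1(lines):
    """Return (SYMTAB, program length) for a list of (label, opcode, operand)."""
    symtab = {}
    first = lines[0]
    locctr = int(first[2], 16) if first[1] == "START" else 0
    start = locctr
    for label, opcode, operand in lines:
        if opcode == "START":
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = locctr            # current LOCCTR is the label's address
        locctr += statement_length(opcode, operand)
    return symtab, locctr - start


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    ("", "RSUB", ""),
    ("EOF", "BYTE", "C'EOF'"),
    ("THREE", "WORD", "3"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    ("", "END", "FIRST"),
]
symtab, length = pass1(program)
print({name: f"{addr:04X}" for name, addr in symtab.items()}, f"{length:04X}")
```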

2 MARKS

4. Why you go for pass 1 & pass 2 algorithm? State the reason.
5. What are the data structures used in assembler?
6. Define OPTAB.
7. Define SYMTAB.
8. Define LOCCTR.



3.1 PASS 1 ASSEMBLER ALGORITHM [Q9]:

[Figure: pass 1 assembler algorithm]

9. Explain in detail about pass1 assembler algorithm. (8 Marks)


3.2 PASS 2 ASSEMBLER ALGORITHM [Q10]:

[Figure: pass 2 assembler algorithm]

10. Explain in detail about pass2 assembler algorithm.(8 MARKS)


4. Machine Dependent Assembler Features:

Eg: SIC/XE assembler program.
Immediate and indirect addressing can be used in programs written for the SIC/XE version.
*Immediate operands are denoted with the prefix #.
*Indirect addressing is indicated by adding the prefix @ to the operand.
Instructions that refer to memory are normally assembled using program counter relative
or base relative mode.
If the displacement required for pc relative and base relative addressing is too large, then
the 4-byte extended format instruction is used.
The main difference between SIC and SIC/XE programs is the use of register to register
instructions.
4.1 Advantages of SIC/XE Program:
Execution speed is good since register to register instructions execute faster than
register to memory instructions.
An immediate operand need not be fetched from anywhere as it is present as a part of the
instruction.
The large main memory of SIC/XE provides room to load and run several programs at the
same time.
4.2 Instruction formats and Addressing Modes:
The START statement specifies the starting address of the location where the program is to
be loaded.
Eg: START 0 statement will allow a program to be loaded at address 0.
SYMTAB would be preloaded with the register names (A, X, ... etc.) and their values (0, 1, ... etc.).
Register to memory instructions are assembled using either program counter relative or base
relative addressing.
The assembler must calculate the displacement, which is assembled as a part of the object
instruction.
The displacement is calculated so that the correct target address is obtained when the content of
the program counter (pc) or base register (B) is added to the displacement.
The displacement must be between 0 and 4095 (for base relative mode) or between -2048 and
+2047 (for program counter relative mode).
If neither program counter relative nor base relative addressing can be used, then the 4-byte
extended instruction format is used.
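
The choice between the three possibilities can be sketched in Python. The function name and its return convention are assumptions made for illustration. The first two calls use addresses that appear in these notes (RETADR at 0030 with next instruction 0003, CLOOP at 0006 with pc value 001A); the addresses in the last two calls are illustrative only.

```python
def choose_displacement(target, pc, base=None):
    """Pick an addressing mode and return (mode, 12-bit displacement field)."""
    disp = target - pc
    if -2048 <= disp <= 2047:                 # program counter relative range
        return "pc", disp & 0xFFF             # negative values in 2's complement
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base          # base relative range
    return "extended", None                   # fall back to the 4-byte format


print(choose_displacement(0x0030, 0x0003))
print(choose_displacement(0x0006, 0x001A))
print(choose_displacement(0x0036, 0x1051, base=0x0033))
print(choose_displacement(0x1036, 0x000A))
```

Trying pc relative first and base relative second is a design choice of this sketch; what matters is that the extended format is used only when neither displacement fits.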
Examples for code generation:

10 0000 FIRST STL RETADR 17202D

17 = machine equivalent (14) + value of the first two flag bits (n, i)
2 = last four flag bits (x, b, p, e)
02D = displacement value

12 0003 LDB #LENGTH 69202D
.
95 0030 RETADR RESW 1


PROCEDURE:

THERE ARE THREE OPERATIONS TO FIND OBJECT CODE.

OPERATION 1: FIND MACHINE LANGUAGE EQUIVALENT AND SUM WITH THE FIRST TWO FLAG BITS (n, i).
STEP 1: FIND MACHINE LANGUAGE EQUIVALENT.
STEP 2: FIND THE FIRST TWO BITS OF THE FLAG FIELD n i x b p e.
STEP 3: CALCULATE DECIMAL VALUE OF THE TWO BITS.
E.G: STL - 14
THE FLAG FIELD FOR PROGRAM COUNTER RELATIVE ADDRESSING IS 11 0010. ITS FIRST TWO BITS ARE 11.
THE DECIMAL EQUIVALENT OF 11 IS
1 X 2^0 = 1
1 X 2^1 = 2
TOTAL = 3
AS PER OUR PROCEDURE, SUM THE MACHINE LANGUAGE EQUIVALENT WITH THIS VALUE: 14 + 3 = 17.

OPERATION 2: FIND THE LAST FOUR FLAG BITS (x, b, p, e) AND CALCULATE THEIR DECIMAL EQUIVALENT.
E.G: THE LAST FOUR BITS ARE 0010. THE DECIMAL EQUIVALENT IS 2.

OPERATION 3: FIND DISPLACEMENT VALUE
STEP 1: FIND THE OPERAND ADDRESS.
STEP 2: FIND THE ADDRESS OF THE INSTRUCTION FOLLOWING THE CURRENT LINE.
STEP 3: CONVERT THE STEP 1 HEXADECIMAL VALUE INTO DECIMAL.
STEP 4: CONVERT THE STEP 2 HEXADECIMAL VALUE INTO DECIMAL.
STEP 5: SUBTRACT THE STEP 4 ANSWER FROM THE STEP 3 ANSWER.
STEP 6: CONVERT THE STEP 5 ANSWER INTO HEXADECIMAL.
STEP 7: IF THE STEP 5 ANSWER IS NEGATIVE, FIND ITS 2'S COMPLEMENT VALUE.

E.G: OPERAND IS RETADR. ITS ADDRESS IS 30 (HEXADECIMAL).
NEXT INSTRUCTION ADDRESS IS 3.
30: 0 X 16^0 = 0, 3 X 16^1 = 48, TOTAL = 48
3: 3 X 16^0 = 3
48 - 3 = 45.
45 / 16 = 2, REMAINDER 13 (D)
HEXADECIMAL OF 45 IS 2D.

Example 1:

15 0006 CLOOP +JSUB RDREC 4B101036
.
.
.
40 0017 J CLOOP ------
45 001A ENDFIL LDA EOF 032010

Program counter relative addressing is used, so the flag bits are
n i x b p e
1 1 0 0 1 0
The first two bits are 11 (as n = 1 and i = 1).
The hexadecimal equivalent of 11 is 3. The machine equivalent of J is 3C, and 3C + 3 = 3F.
The last four bits are 0010, whose hexadecimal equivalent is 2.
To find the displacement value:
CLOOP location is 0006 and the pc value is 1A.
Decimal equivalent of 6 is 6, decimal equivalent of 1A is 26.
6 - 26 = -20.
Hexadecimal equivalent of -20 is -14.

Because the hexadecimal value 14 carries a negative sign, it is written in binary and its
2's complement is calculated.

14 is written as
0000 0001 0100 --> 014 (the displacement field is 12 bits)
The 2's complement procedure is:
take the 1's complement, then add 1 to the result.

0000 0001 0100 --> 1111 1110 1011 (1's complement)
                 + 1
                   1111 1110 1100 (2's complement)

F = 1111, E = 1110, C = 1100

The object code is 3F2FEC.
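
The three operations of the procedure amount to packing the opcode, the six flag bits n i x b p e and the 12-bit displacement into three bytes. The Python sketch below checks the worked results; the function name is an assumption, and the opcodes 14 (STL), 68 (LDB) and 3C (J) are the standard SIC/XE values rather than something this passage states for every instruction.

```python
def format3(opcode, n, i, x, b, p, e, disp):
    """Pack a SIC/XE format 3 instruction and return it as 6 hex digits."""
    first_byte = (opcode & 0xFC) | (n << 1) | i      # operation 1: opcode + n, i
    flags = (x << 3) | (b << 2) | (p << 1) | e       # operation 2: x, b, p, e
    return f"{first_byte:02X}{flags:X}{disp & 0xFFF:03X}"  # operation 3: displacement


# STL RETADR at 0000: RETADR = 0030, next instruction = 0003
print(format3(0x14, 1, 1, 0, 0, 1, 0, 0x0030 - 0x0003))
# LDB #LENGTH: immediate operand, so n = 0 and i = 1, displacement 02D
print(format3(0x68, 0, 1, 0, 0, 1, 0, 0x02D))
# J CLOOP at 0017: CLOOP = 0006, pc = 001A, negative displacement
print(format3(0x3C, 1, 1, 0, 0, 1, 0, 0x0006 - 0x001A))
```

Masking with 0xFFF yields the same bit pattern as taking the 1's complement and adding 1 by hand.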

Difference between pc relative and base relative addressing [Q11]:

1. When pc relative addressing is used, the assembler knows what the content of the pc will be
at execution time (the address of the next instruction), so it can calculate the displacement
by itself.
2. In base relative addressing, the programmer must tell the assembler what the base
register will contain during the execution of the program, so that the assembler can calculate the
displacement.

4.3 Program Relocation:

More than one program can share the memory and other resources of the machine.
If we knew in advance which programs would execute concurrently, we could assign
addresses when the programs were assembled so that they would fit together without
overlap. But practically this may not be possible.
So it is desirable to load a program into memory wherever there is space for it.
In such cases the actual starting address of the program is not known until load time.
If the program is loaded beginning at location 1000, the value of the variable THREE will be
located at address 102D.
If the program is loaded starting at some other address, say 2000, the address 102D will not
contain the actual value of THREE.
So we have to make some changes in the address portion of the instruction in order to
retrieve the correct value.

Eg:
0006 CLOOP +JSUB RDREC 4B101036
.
.
1036 RDREC CLEAR X B410

11. Difference between program counter addressing and base relative addressing. (2 marks)


Case 1:
The statement RDREC is present at memory location 1036 if the program is loaded beginning
at address 0000.

0000
.
0006 4B101036 <--+JSUB RDREC
.
.
1036 B410 <--RDREC

Case 2:
The statement RDREC is present at memory location 6036 if the program is loaded beginning
at address 5000.

5000
.
.
5006 4B106036 <--+JSUB RDREC
.
.
6036 B410 <--RDREC

In both cases the address field of the JSUB instruction must contain the address of the label RDREC.
The assembler does not know the actual location where the program will be loaded. However, the
assembler can identify for the loader those parts of the object program that need modification.
An object program that contains the information needed to perform this kind of modification is called a
relocatable program.
Steps for solving the relocation problem:
When the assembler generates the object code for the JSUB instruction, it will insert the
address of RDREC relative to the start of the program. (This is the reason we initialized the
location counter to 0 for the assembly.)
The assembler will also produce a command for the loader, instructing it to add the
beginning address of the program to the address field in the JSUB instruction at load time.

Modification Record [Q12]:

col:1 M
col:2-7 Starting location of the address field to be modified, relative to the beginning of
the program.
col:8-9 Length of the address field to be modified, in half bytes
(i.e. 4 bits = 1 half byte).
Relocation must be performed for the extended format instructions that contain an actual
address (such as +JSUB RDREC), so a modification record must be added for each of them.
Other lines in the program do not require modification as they use pc relative or base
relative addressing.
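
What the loader does with a modification record can be sketched in Python. The function names are hypothetical and the record string M00000705 is constructed here from the column layout above (address field starting at byte 000007, length 05 half bytes); it is applied to the +JSUB RDREC instruction of Case 1 with the load address 5000 of Case 2.

```python
def parse_modification_record(record):
    """Return (start offset, length in half bytes) from an M record."""
    if record[0] != "M":
        raise ValueError("not a modification record")
    return int(record[1:7], 16), int(record[7:9], 16)


def relocate(code, modifications, load_address):
    """Add the load address to every address field named by a modification."""
    for offset, half_bytes in modifications:
        nbytes = (half_bytes + 1) // 2
        field = int.from_bytes(code[offset:offset + nbytes], "big")
        mask = (1 << (4 * half_bytes)) - 1            # low half bytes = address field
        patched = (field & ~mask) | ((field + load_address) & mask)
        code[offset:offset + nbytes] = patched.to_bytes(nbytes, "big")
    return code


# Six filler bytes stand in for the instructions before address 0006.
code = bytearray(6) + bytearray.fromhex("4B101036")
relocate(code, [parse_modification_record("M00000705")], 0x5000)
print(code[6:10].hex().upper())
```

Only the low 20 bits of the three bytes are changed, so the flag bits that share the first of those bytes are left untouched.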

12. Define Modification record.Explain.(2 marks).


5. Machine Independent Assembler Features:

These are features that are commonly found in implementations of this type of software and that are
relatively machine independent.

5.1 Literals [Q13]:
It is often convenient for the programmer to write the value of a constant operand as a part of the
instruction that uses it.
This avoids having to define the constant elsewhere in the program and make up a label for it.
Such an operand is called a literal, because the value is stated literally in the instruction.
A literal is identified with the prefix '=', which is followed by a specification of the literal
value.
Eg:
45 001A ENDFIL LDA =C'EOF' 032010
215 1062 WLOOP TD =X'05' E32011

Difference between Literal and Immediate Operand [Q14]:

In immediate addressing, the operand value is assembled as part of the machine
instruction. With a literal, the assembler generates the value as a constant at some other
memory location.
The address of the generated constant is used as the target address for the machine instruction.
5.1.1 Literal Pool [Q15]:
Literals are stored in a literal pool. Normally the pool is placed at the end of the program.
LTORG --> Assembler directive.
It creates a literal pool immediately, containing all the literal operands used since the previous LTORG.
Once a literal is stored in a literal pool, it is not repeated in a later pool.
In some programs LTORG is placed in the middle of the program, because otherwise the
literals would all be placed in the pool at the end of the program.
When there is a literal at the beginning of the program and the program has 300 lines,
the literal pool would otherwise start only at the end of the program.
The operand would then be far away from the instruction that references it, which is
undesirable (for example, it may be out of range for program counter relative addressing).
So LTORG statements may be used as often as needed in the program.
Most assemblers do not allow duplication of literals in the literal pool. They allow the
same literal to be used in more than one place in the program.
In the literal pool only one copy of the specified data value is stored.
Before allocating space for a literal in the pool, the assembler checks whether the same literal is
already in the pool by comparing the character strings of the literals in the pool with the new
literal.

13. Define literals.(2 marks)


14. Differentiate literal and immediate operand.(2 marks)
15. Define literal pool.( 2marks)


For example,
the same literal name may be used more than once while having different values at different places in
the program (for instance =*, which refers to the current value of the location counter). Because
duplicates are not allowed in the pool, such a literal appears only once in the pool and the
execution may be a problem.

The basic data structure used to handle literals is the literal table (LITTAB) [Q16].

The literal table contains,
-Literal name
-The operand value and length
-Address assigned to the operand
During pass 1 the assembler searches LITTAB for each literal name. If the literal is already present,
nothing more is done. If it is not, the literal is added to the literal table.
During pass 2 the assembler searches LITTAB for the literal address for object code
generation.
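
A literal table with the three fields listed above can be sketched in Python. The class and method names are hypothetical, a Python dict stands in for the table organization, and only the C'...' and X'...' literal forms used in these notes are handled. The pool address 002D in the example is the one implied by the object code 032010 (next instruction address 001D plus displacement 010).

```python
def literal_bytes(name):
    """Operand value of a literal such as =C'EOF' or =X'05'."""
    kind, body = name[1], name[3:-1]
    return body.encode("ascii") if kind == "C" else bytes.fromhex(body)


class LiteralTable:
    def __init__(self):
        self.entries = {}   # literal name -> value, length, address
        self.pending = []   # literals not yet placed in a pool

    def reference(self, name):
        """Pass 1: add the literal only if it is not already in the table."""
        if name not in self.entries:
            value = literal_bytes(name)
            self.entries[name] = {"value": value, "length": len(value), "address": None}
            self.pending.append(name)

    def place_pool(self, locctr):
        """LTORG or END: assign addresses to pending literals, return new LOCCTR."""
        for name in self.pending:
            self.entries[name]["address"] = locctr
            locctr += self.entries[name]["length"]
        self.pending.clear()
        return locctr


littab = LiteralTable()
for literal in ("=C'EOF'", "=X'05'", "=C'EOF'"):   # the duplicate is stored once
    littab.reference(literal)
next_locctr = littab.place_pool(0x002D)
print(littab.entries, hex(next_locctr))
```

In pass 2 the address stored in the entry is what is used as the target address of the instruction.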

5.2 Symbol-defining Statements [Q17]:

User defined symbols in assembler language programs have so far appeared as labels on instructions
or data areas.
The value of such a label is the address assigned to the statement on which it appears.
Most assemblers provide an assembler directive that allows the programmer to define
symbols and specify their values.
The assembler directive generally used is EQU.
The general form of such a statement is: symbol EQU value.
This statement defines the given symbol and assigns the specified value to it.
The value may be given as,
-A constant.
-Any expression involving constants and previously defined symbols.
One use of EQU is to establish symbolic names that can be used for improved readability in
place of numeric values.
Eg:
+LDT #4096
loads the value 4096 into register T. This value represents the maximum length record that we could
read with subroutine RDREC.
MAXLEN EQU 4096
And the statement then becomes
+LDT #MAXLEN
When the assembler encounters the EQU statement, it enters MAXLEN into SYMTAB with the value 4096.
During assembly of the LDT instruction, MAXLEN is looked up and replaced with the value 4096.
Another common use of EQU is in defining mnemonic names for registers.

16. Define LITTAB.(2 marks)


17. Define symbol defining statements.(2marks)


Eg:
A EQU 0        BASE  EQU R1
X EQU 1        COUNT EQU R2
L EQU 2        INDEX EQU R3

In the literal examples, the operand =X'05' specifies a 1-byte literal with the hexadecimal value 05.
The notation used for literals varies from assembler to assembler.
It is important to understand the difference between a literal and an immediate operand. With
immediate addressing, the operand value is assembled as part of the machine instruction.
With a literal, the assembler generates the specified value as a constant at some other memory
location.
A literal may also refer to the current value of the location counter (denoted by *):
BASE *
LDB =*
Another assembler directive is ORG. It is used to indirectly assign values to
symbols.
Its general form is ORG value, where value is a constant or an expression involving constants
and previously defined symbols.
SYMBOL RESB 6
VALUE RESW 1
FLAGS RESB 2
ORG STAB+1100
The first ORG (ORG STAB) resets the location counter to the value of STAB. The label on the following
RESB statement defines SYMBOL to have the current value in LOCCTR,
(i.e.) the same address assigned to STAB.
5.3 Expressions:
Our previous examples of assembler language statements have used single terms like
label, literal, etc., as instruction operands.
Most assemblers allow an expression wherever a single operand is permitted.
Such an expression is evaluated by the assembler and the result is used as the normal operand.
Arithmetic expressions are allowed and must follow the normal rules, using the operators
+, -, * and /.
When an ORG statement is encountered during assembly of a program, the assembler resets its location
counter (LOCCTR) to the specified value. For example, we can define a symbol table with the following
structure.
SYMBOL VALUE FLAGS


In this table, the SYMBOL field contains a 6-byte user-defined symbol; VALUE is a one-word
representation of the value assigned to the symbol; FLAGS is a 2-byte field that specifies
symbol type and other information.
STAB RESB 1100
With EQU statements,
SYMBOL EQU STAB
VALUE EQU STAB+6
FLAGS EQU STAB+9
With the help of the assembler directive ORG, we can write those statements as,
STAB RESB 1100
ORG STAB
Division is usually defined to produce an integer result. Individual terms in an expression
may be constants, user-defined symbols or special terms. A common special term is the current
value of the location counter (designated by *), i.e. the value of the next unassigned memory
location.
BUFEND EQU *

The above statement gives BUFEND a value that is the address of the next byte after the
buffer area.
Some values in the object program are relative to the beginning of the program, while others
are absolute.
Similarly, the values of terms and expressions are either relative or absolute.
A constant is an absolute term. Labels on instructions and data areas, and references to the
location counter value, are relative terms.
A symbol whose value is given by EQU may be either an absolute term or a relative term
depending upon the expression used to define its value.

Expressions are classified as [Q18],
*Absolute expression
*Relative expression
depending upon the type of value they produce.
An expression that contains only absolute terms is an absolute expression.
There are some conditions [Q19] for using relative terms in expressions,
*Every relative term except one is paired with another relative term of opposite sign.
*The remaining unpaired relative term must have a positive sign.
*A relative term may not enter into a multiplication or division operation.
Expressions that are neither absolute nor relative are flagged by the assembler as
errors.
When all relative terms are paired with opposite signs, the result is an absolute
value.
MAXLEN EQU BUFEND-BUFFER
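
The pairing rule can be checked mechanically by adding up the signs of the relative terms. The Python sketch below is an illustration under simplifying assumptions: an expression is given as a list of (sign, kind) pairs with kind 'R' for a relative term and 'A' for an absolute term, only the operators + and - are modelled, and the function name is hypothetical.

```python
def classify(terms):
    """Classify an expression given as (sign, kind) pairs, sign being +1 or -1."""
    net_relative = sum(sign for sign, kind in terms if kind == "R")
    if net_relative == 0:
        return "absolute"    # every relative term is paired with an opposite sign
    if net_relative == 1:
        return "relative"    # exactly one unpaired relative term, with positive sign
    return "error"           # neither absolute nor relative


print(classify([(+1, "R"), (-1, "R")]))   # BUFEND-BUFFER
print(classify([(+1, "R"), (+1, "A")]))   # BUFFER+3
print(classify([(+1, "R"), (+1, "R")]))   # BUFEND+BUFFER
print(classify([(-1, "R"), (+1, "A")]))   # 100-BUFFER
```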

18. Define expressions. What are the types of expression? (2 marks)


19. What are conditions to use the relative terms in expressions.(2 marks)


5.4 Program Blocks:

Normally the source program is treated as a unit which contains subroutines, data areas, etc.
The assembler processes the program and produces a single unit of object code.
Some features of assemblers allow the generated machine instructions and data to appear in the
object program in a different order from the corresponding source statements.
Other features result in the creation of several independent parts of the object program.
These parts maintain their identity and are handled separately by the loader.
We use the term program blocks to refer to segments of code that are rearranged within a single
object program unit, and control sections to refer to segments that are translated into
independent object program units.
Each program block may actually contain several separate segments of the source program.
In this case three blocks [Q20] are used. The first program block (unnamed) contains the executable
instructions of the program.
The second block (CDATA) contains all data areas that are small in length.
The third (CBLKS) contains all data areas that consist of larger blocks of memory.
The assembler directive USE indicates which portions of the source program belong to the
various blocks.
At the beginning of the program, statements are assumed to be part of the unnamed
(default) block.
If no USE statements are included, the entire program belongs to this single block.
The assembler will rearrange these segments to gather together the pieces of each block.
These blocks will then be assigned addresses in the object program, with the blocks
appearing in the same order in which they were first begun in the source program.
The assembler accomplishes this logical rearrangement of code by maintaining, during pass
1, a separate location counter for each program block.
The location counter for a block is initialized to 0 when the block is first begun.
The current value of this location counter is saved when switching to another block,
and the saved value is restored when resuming a previous block.
During pass 1 each label in the program is assigned an address that is relative to the start of
the block that contains it.
When labels are entered into the symbol table, the block name or number is stored along with
the assigned relative address.
At the end of pass 1 the latest value of the location counter for each block indicates the
length of that block.
The assembler can then assign to each block a starting address in the object program.
For code generation during pass 2, the assembler needs the address of each symbol relative
to the start of the object program. It obtains this by adding the starting address of the
symbol's block to the symbol's relative address.

20. What are blocks in a program? How are they classified? Explain. (2 marks)


Block Name   Block Number   Address   Length
Default      0              0000      0066
CDATA        1              0066      000B
CBLKS        2              0071      1000
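
The block table can be rebuilt from the block lengths alone, since each block starts where the previous one ends. The Python sketch below assumes that the second block has length 0B hexadecimal, which is what the listed starting addresses 0066 and 0071 imply. The function names are hypothetical and the relative address 0003 used in the last call is an illustrative value.

```python
def assign_block_addresses(block_lengths):
    """Give each block a number and a starting address, in order of first use."""
    table, address = {}, 0
    for number, (name, length) in enumerate(block_lengths):
        table[name] = {"number": number, "address": address, "length": length}
        address += length
    return table, address          # address is now the total program length


def absolute_address(table, block, relative):
    """Pass 2: address relative to the start of the object program."""
    return table[block]["address"] + relative


blocks, total = assign_block_addresses(
    [("Default", 0x0066), ("CDATA", 0x000B), ("CBLKS", 0x1000)]
)
for name, row in blocks.items():
    print(name, row["number"], f"{row['address']:04X}", f"{row['length']:04X}")
print(f"{absolute_address(blocks, 'CDATA', 0x0003):04X}")
```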

5.4.1 Control Section and Program linking:

A control section is a part of the program that maintains its identity after assembly.
Each such control section can be loaded and relocated independently of the others.
Different control sections are most often used for subroutines or other logical subdivisions
of a program.
The programmer can assemble, load and manipulate each of these control sections
separately. The resulting flexibility is a major benefit of using control sections.
When control sections form logically related parts of a program, it is necessary to provide
some means for linking them together.
Instructions in one control section might need to refer to instructions or data located in
another section.
Because control sections are independently loaded and relocated, the assembler is unable to
process these references in the usual way.
The assembler has no idea where any other control section will be located at execution time. Such
references between control sections are called external references.
In this case there are three control sections: one for the main program and one for each
subroutine.
[Figure: program blocks traced through the assembly and loading process]
Control sections differ from program blocks in that they are handled separately by the
assembler.
Symbols that are defined in one control section may not be used directly by another section; they
must be identified as external references for the loader to handle.
EXTDEF - EXTERNAL DEFINITION
EXTREF - EXTERNAL REFERENCE
The two new record types [Q21] are DEFINE and REFER. A Define record gives information
about external symbols that are defined in this control section. A Refer record lists symbols
that are used as external references by the control section.

DEFINE RECORD:
COL 1 :D
COL 2-7 :Name of the external symbol defined in this control section.
COL 8-13 :Relative address of the symbol within this control section.
COL 14-73 :Repeat information in col 2-13 for other external symbols.

21. Define DEFINE record and REFER record.explain.( 2 marks)


REFER RECORD:
COL 1 :R
COL 2-7 :Name of external symbol referred to in this control section.
COL 8-73 :Names of other external reference symbols.

MODIFICATION RECORD:
COL 1 :M
COL 2-7 :Starting address of the field to be modified.
COL 8-9 :Length of the field to be modified, in half bytes.
COL 10 :Modification flag (+ or -).
COL 11-16 :External symbol whose value is to be added to or subtracted from the
indicated field.

6. One pass assemblers and Multipass assemblers:

6.1 One-Pass Assemblers:

Scenario for one-pass assemblers
-Generate their object code in memory for immediate execution (load-and-go
assembler).
-External storage for the intermediate file between two passes is slow or is
inconvenient to use.
Main problem - Forward references
-Data items
-Labels on instructions
Solution
-Require that all areas be defined before they are referenced.
-It is possible, although inconvenient, to do so for data items.
-Forward jump to instruction items cannot be easily eliminated.
-Insert (label, address_to_be_modified) to SYMTAB
-Usually, address_to_be_modified is stored in a linked-list
6.1.1 Forward Reference in One-pass Assembler:
Omits the operand address if the symbol has not yet been defined.
Enters this undefined symbol into SYMTAB and indicates that it is undefined.
Adds the address of this operand address to a list of forward references associated
with the SYMTAB entry.


When the definition for the symbol is encountered, scans the reference list and
inserts the address.
At the end of the program, reports an error if there are still SYMTAB entries
marked as undefined symbols.
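
The handling of forward references in a load-and-go assembler can be sketched in Python. This is an illustration under stated assumptions: the class and method names are hypothetical, a Python list replaces the linked list, a dict keyed by address stands in for memory, and the addresses in the example are made up.

```python
class OnePassSymtab:
    def __init__(self, memory):
        self.memory = memory    # address -> operand address stored there
        self.defined = {}       # symbol -> value
        self.pending = {}       # undefined symbol -> addresses waiting for its value

    def use(self, symbol, operand_address):
        """Called when an instruction refers to a symbol."""
        if symbol in self.defined:
            self.memory[operand_address] = self.defined[symbol]
        else:
            self.memory[operand_address] = 0   # operand address omitted for now
            self.pending.setdefault(symbol, []).append(operand_address)

    def define(self, symbol, value):
        """Called when the label is met: fill in every waiting operand field."""
        self.defined[symbol] = value
        for address in self.pending.pop(symbol, []):
            self.memory[address] = value

    def undefined_symbols(self):
        """Symbols still undefined at the end of the program are errors."""
        return sorted(self.pending)


memory = {}
symtab = OnePassSymtab(memory)
symtab.use("RDREC", 0x2013)      # forward reference
symtab.use("RDREC", 0x201F)      # second forward reference to the same symbol
symtab.use("WRREC", 0x2022)      # never defined in this example
symtab.define("RDREC", 0x203D)
print({hex(a): hex(v) for a, v in memory.items()}, symtab.undefined_symbols())
```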
6.2 Multi-Pass Assemblers:
For a two pass assembler, forward references in symbol definitions are not allowed:
ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1
Symbol definition must be completed in pass 1.
Prohibiting forward references in symbol definitions is not a serious inconvenience.
Forward references tend to create difficulty for a person reading the program.

6.2.1 Implementation:
For a forward reference in a symbol definition, we store in the SYMTAB:
-The symbol name
-The defining expression
-The number of undefined symbols in the defining expression
-The undefined symbol (marked with a flag *) associated with a list of
symbols that depend on this undefined symbol.
When a symbol is defined, we can recursively evaluate the symbol
expressions depending on the newly defined symbol.
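
The recursive evaluation can be sketched in Python for the ALPHA, BETA, DELTA example. The sketch is deliberately simplified and its names are hypothetical: every defining expression is assumed to be a single symbol, so the count of undefined symbols is always 1 and is not stored, and the address 1000 given to DELTA is an illustrative value.

```python
def resolve(definitions, known):
    """definitions: symbol -> symbol it is equated to; known: symbol -> value."""
    values = dict(known)
    waiting = {}                     # undefined symbol -> symbols depending on it
    for symbol, depends_on in definitions.items():
        waiting.setdefault(depends_on, []).append(symbol)

    def propagate(symbol):
        # A newly defined symbol lets every dependent symbol be evaluated in turn.
        for dependent in waiting.pop(symbol, []):
            values[dependent] = values[symbol]
            propagate(dependent)

    for symbol in list(values):
        propagate(symbol)
    return values


values = resolve({"ALPHA": "BETA", "BETA": "DELTA"}, {"DELTA": 0x1000})
print({name: f"{value:04X}" for name, value in values.items()})
```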

7. IMPLEMENTATION EXAMPLE:

MASM assembler
SPARC assembler


MASM assembler

MASM assembler is written for Pentium and other x86 systems.
Since x86 systems view memory as a collection of segments, a MASM
assembler language program is written as a collection of segments.
Each segment is defined as belonging to a particular class.
Commonly used classes are CODE, DATA, CONST and STACK.
During program execution, segments are addressed via the x86 segment
registers.
Code segments are addressed using register CS.
Stack segments are addressed using register SS.
Data segments are addressed using DS, ES, FS or GS.
Jump instructions are assembled in two different ways:
-Near jump
-Far jump
