Unit 1: Introduction To System Programming

Contents
• System Software
• Machine Structure
• Machine language
• Assembly Language
• Assemblers
– Design of two pass assembler
System Software
• System Software vs Application
– Most system software is machine dependent.
– System programs are intended to support the
operation and use of the computer itself, rather
than any particular application.
• Examples of system software
– Text editor, assembler, compiler, loader or linker,
debugger, macro processors, operating system,
database management systems
System Software
• Text editors
• Loaders
• Pre Processors
• Assemblers
• Macro processors
• Compilers
• Debuggers
System Software Concept
Layers, from the user down to the hardware:
• Users
• Application Program
• Utility Program (Library): Debugger, Macro Processor, Text Editor
• Compiler, Assembler, Loader and Linker
• OS: Memory Management, Process Management, Device Management, Information Management
• Bare Machine (Computer Hardware)

System Software
• Text editors
– For editing plain text files
• Loaders
– Loads programs into memory and prepares them for execution
– Reads the contents of the executable file containing the program
instructions into memory
• Assemblers
– Convert assembly language code into machine language
• Macro processors
– Carries out systematic set of replacements i.e. macro expansion
– Often included in other programs, like assemblers and compilers
System Software
• Compilers
– Translate a program written in a high-level language (HLL) into executable machine language
– Report errors to the developer
– Object code is generated
• Interpreter
– Translates each instruction and executes it immediately
– No object code is generated
• Debuggers
– Used to test and debug a target program
• Linkers
– Take one or more object files generated by a compiler and combine them into a single executable program
• Pre Processor
– Processes the source code before compilation
Language Processors
Machine Structure
Machine Structure
• Memory address register (MAR): contains the address of the memory location that is to be read from or stored into.
• Memory buffer register (MBR): contains a copy of the content of the memory location whose address is stored in the MAR. The MBR is the primary interface between the memory and the CPU.
• Memory controller: a hardware device that transfers information between the MBR and the core memory location whose address is stored in the MAR.
• I/O channels: interpret special instructions for inputting and outputting information from the memory.
Machine Structure
• Instruction interpreter: a group of electronic circuits that performs the intent of the instruction fetched from memory.
• Location counter (LC), also called program counter (PC) or instruction counter (IC): a hardware memory device that denotes the location of the current instruction being executed.
• Instruction register (IR): holds a copy of the current instruction, i.e. the content of the memory location addressed by the LC.
• Working registers: memory devices that serve as a "scratch pad" for the instruction interpreter.
• General registers: used by programmers as storage locations and for special functions.
Machine Structure
• Opcode Register Address
• ADD 2,176
Machine Structure
• First, the address in the IC is copied to the MAR.
• Then the instruction is fetched into the MBR.
• The instruction is then transferred to the IR.
• Then the opcode of the instruction is decoded and the corresponding branch is taken; here the ADD branch is chosen.
• Then the memory address of the second operand is placed in the MAR.
• Then the content of that location is placed in the MBR.
• Now the first operand is placed in the WR.
• Finally, the sum of WR and MBR is calculated and stored in WR.
• The content of WR is stored in the register that contained the first operand.
• And then the IC is incremented to point to the next instruction.
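The register transfers listed above can be traced with a small simulation. This is only a sketch: the function name `execute_add`, the memory contents (7 at address 176) and the initial value of register 2 (5) are assumptions chosen for illustration, not values from the slides.

```python
def execute_add(memory, registers, ic):
    """Trace the register transfers for one ADD instruction (illustrative only)."""
    mar = ic                  # address in IC is copied to MAR
    mbr = memory[mar]         # instruction is fetched into MBR
    ir = mbr                  # instruction is transferred to IR
    opcode, reg, address = ir # opcode is decoded
    if opcode != "ADD":
        raise ValueError("this sketch only implements the ADD branch")
    mar = address             # address of the second operand goes to MAR
    mbr = memory[mar]         # content of that location goes to MBR
    wr = registers[reg]       # first operand is placed in WR
    wr = wr + mbr             # sum of WR and MBR is stored in WR
    registers[reg] = wr       # WR is stored back in the first-operand register
    return ic + 1             # IC now points to the next instruction

# Assumed machine state: "ADD 2,176" stored at address 100.
memory = {100: ("ADD", 2, 176), 176: 7}
registers = {2: 5}
next_ic = execute_add(memory, registers, 100)
print(registers[2], next_ic)
```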
Generations of Computer Languages:
Language Levels
High Level Language
Assembler Language
Machine Language
Microprogramming (Firmware)
Hardware
High-level languages: Third Generation
• Combines algebraic expressions and English symbols

• The high-level languages are so called because each
statement in these languages generates multiple
statements at the machine-language level
• It requires much faster, more efficient compilers to
translate higher-level languages into machine codes
• Advantage: CPU-independent
• Disadvantage: computers do not understand HLL
directly.
Machine Language
• Machine language is the first generation
programming language.
• Machine language: directly understood by a
computer since it is a collection of binary
numbers (0 and 1)
• Disadvantages:
– It is not standardized
– Different CPUs need different machine languages
– Programming in it is a slow and labor-intensive process
Machine code
• Machine code:
– Set of commands directly executable via CPU

– Commands in numeric code
– Lowest semantic level
Machine code language
• Structure:
OpCode OpAddress
– Operation code
• Defining executable operation
– Operand address
• Specification of operands
– Constants/register addresses/storage addresses
Assembly languages: Second Generation
• One step above machine language: the second generation of programming languages
• More readable
• Computer operations are represented by mnemonic codes rather than binary numbers
• Variables can be given names rather than binary memory addresses
• Programmers can use language-like acronyms and words such as add, sub, and load in programming statements
• A language translator is needed to convert the mnemonics into machine language
• Disadvantage: CPU-dependent
Elements of the Assembly Language
Programming
• An Assembly language is a
– machine dependent
– low level Programming language specific to a certain computer
system.
Three features when compared with machine language are
1. Mnemonic Operation Codes

2. Symbolic operands
3. Data declarations
Elements of the Assembly Language
Programming
• Mnemonic operation codes: eliminate the need to memorize numeric operation codes.
• Symbolic operands: symbolic names can be associated with data or instructions and used as operands in assembly statements (the programmer need not know details of memory bindings).
• Data declarations: data can be declared in a variety of notations, including decimal notation (this avoids manual conversion of constants into their internal representation).
Assembly language-structure
<Label> <Mnemonic> <Operand> Comments
• Label
– Symbolic name for an assembler address (the instruction address at machine level)
• Mnemonic
– Symbolic description of an operation
• Operands
– Contain variables or addresses, if necessary
• Comments
Statement format
An Assembly language statement has following
format:
[Label] <opcode> <operand spec>[,<operand spec>..]
If a label is specified in a statement, it is associated as a
symbolic name with the memory word generated for the
statement.
<operand spec> has the following syntax:
<symbolic name> [+<displacement>] [(<index register>)]
Eg. AREA, AREA+5, AREA(4), AREA+5(4)
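The operand specification syntax above can be checked with a short parser. This is a sketch under assumptions: the function name `parse_operand_spec`, the rule that a symbolic name starts with a letter, and the choice to return a displacement of 0 when none is written are all illustrative and not taken from the slides.

```python
import re

# <symbolic name> [+<displacement>] [(<index register>)]
OPERAND_SPEC = re.compile(
    r"^(?P<name>[A-Za-z]\w*)"      # symbolic name (assumed to start with a letter)
    r"(?:\+(?P<disp>\d+))?"        # optional +displacement
    r"(?:\((?P<index>\d+)\))?$"    # optional (index register)
)

def parse_operand_spec(text):
    """Split an operand spec into (name, displacement, index register)."""
    match = OPERAND_SPEC.match(text.strip())
    if match is None:
        raise ValueError(f"invalid operand spec: {text!r}")
    disp = int(match.group("disp")) if match.group("disp") else 0
    index = int(match.group("index")) if match.group("index") else None
    return match.group("name"), disp, index

for spec in ["AREA", "AREA+5", "AREA(4)", "AREA+5(4)"]:
    print(spec, parse_operand_spec(spec))
```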

Mnemonic Operation Codes
• Each statement has two operands: the first operand is always a register and the second operand refers to a memory word using a symbolic name and an optional displacement.
Operation Codes
• MOVE instructions move a value between a memory word and a register
– MOVER: first operand is the target and second operand is the source
– MOVEM: first operand is the source and second operand is the target
• All arithmetic is performed in a register (the result replaces the contents of the register) and sets a condition code.
• A comparison instruction sets a condition code analogous to an arithmetic instruction, i.e. without affecting the values of its operands.
• The condition code can be tested by a Branch on Condition (BC) instruction, whose format is:
BC <condition code spec>, <memory address>
Assembly Language Statements
• An assembly program contains three kinds of statements:
– Imperative Statements (IS)
– Declaration Statements (DL)
– Assembler Directives (AD)
• Imperative Statements: They indicate an action to be performed during the execution of the assembled program. Each imperative statement is translated into one machine instruction.
• Declaration Statements: the syntax is as follows:
[Label] DS <constant>
[Label] DC '<value>'
• The DS (declare storage) statement reserves areas of memory and associates names with them.
• Ex:
A DS 1 ; reserves a memory area of 1 word and associates the name A with it
G DS 200 ; reserves a block of 200 words; the name G is associated with the first word of the block (G+6 etc. access the other words)
• The DC (declare constant) statement constructs memory words containing constants.
• Ex:
ONE DC '1' ; associates the name ONE with a memory word containing the value 1
Use of Constants
• The DC statement does not really implement constants
• it just initializes memory words to given values.
• The values are not protected by the assembler and can
be changed by moving a new value into the memory
word.
• In the above example, the value of ONE can be changed
by executing an instruction
MOVEM BREG, ONE
Use of Constants
• An Assembly Program can use constants just like HLL, in
two ways – as immediate operands, and as literals.
• 1) Immediate operands can be used in an assembly statement only if the architecture of the target machine includes the necessary features.
– Ex: ADDI AREG,5
– This is translated into an instruction with two operands: AREG and the value '5' as an immediate operand
Use of Constants
• 2) A literal is an operand with the syntax = '<value>'.
• It differs from a constant because its location cannot be
specified in the assembly program.
• Its value does not change during the execution of the
program.
• It differs from an immediate operand because no
architectural provision is needed to support its use.
Use of literals vs. use of DC
• With a literal: ADD AREG, ='5'
• Equivalent with DC: ADD AREG, FIVE together with the declaration FIVE DC '5'
Assembler Directive
• Assembler directives instruct the assembler to perform certain
actions during the assembly of a program.
• Some assembler directives are described in the following:

1) START <constant>
• This directive indicates that the first word of the target program
generated by the assembler should be placed in the memory
word having address <constant>.
2) END [<operand spec>]
• This directive indicates the end of the source program. The optional <operand spec> indicates the address of the instruction where the execution of the program should begin.
Advantages of Assembly Language
• The primary advantages of assembly language programming over machine language programming are due to the use of symbolic operand specifications (in comparison to a machine language program).
• Assembly language programming holds an edge over HLL programming in situations where it is desirable to use architectural features of a computer (in comparison to a high level language program).
Machine Instruction Format
• The sign is not a part of the instruction
• Opcode: 2 digits, Register Operand: 1 digit, Memory Operand: 3 digits
• The condition code specified in a BC statement is encoded into the first operand using the codes 1-6 for the specifications LT, LE, EQ, GT, GE and ANY respectively
• In a machine language program, all addresses and constants are shown in decimal.
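The format above (2-digit opcode, 1-digit register operand, 3-digit memory operand) can be illustrated with a small encoder. This is a sketch: the function name `encode` and the space-separated output layout are assumptions, while the opcode, register and condition codes are the ones that appear in the tables elsewhere in this unit.

```python
OPCODES = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3,
           "MOVER": 4, "MOVEM": 5, "BC": 7, "READ": 9}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}
CONDITIONS = {"LT": 1, "LE": 2, "EQ": 3, "GT": 4, "GE": 5, "ANY": 6}

def encode(mnemonic, first_operand=None, address=0):
    """Return 'oo r mmm': opcode, register or condition code, memory operand."""
    # A BC statement carries its condition code in the register field.
    code = REGISTERS.get(first_operand, CONDITIONS.get(first_operand, 0))
    return f"{OPCODES[mnemonic]:02d} {code} {address:03d}"

print(encode("MOVER", "BREG", 103))
print(encode("BC", "GT", 101))
print(encode("STOP"))
```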
Example: ALP and its equivalent Machine
Language Program
Example
START 100
MOVER BREG, N
MULT BREG, N
STOP
N DS 1
END
Example
Statement          LC
START 100
MOVER BREG, N      100 (1 word)
MULT BREG, N       101 (1 word)
STOP               102 (1 word)
N DS 1             103
END
Advanced Assembler Directives
• ORIGIN
• EQU
• LTORG
*Assembler directives are not assigned a location counter value*
ORIGIN
ORIGIN <address specification>
• Sets the location counter to the address given by <address specification>, which is an operand specification, a constant, or a symbolic name + displacement.
• The next statement gets <address specification> as its location counter value.
• ORIGIN 321
• ORIGIN SYMBOL+5
ORIGIN
LN Code LC
1 … 201
.
5 ADD CREG, =‘1’ 205
6 ORIGIN 210
7 MULT DREG, =‘1’ 210
8 … 211
15 END
EQU
• <Symbol> EQU <address specification>
• The address specification is given by an operand, a constant, or a symbolic name + displacement
• Associates <symbol> with the address given in <address specification>.
• Does not change the location counter.
LTORG
• Allows programmers to specify where the literals are to be placed
• If the LTORG AD is used:
– Literals are placed at the location counter value where LTORG appears
• If the LTORG AD is not used:
– Literals are placed after the END AD
LTORG
…
6.ADD CREG, =‘3’ 216
7.ADD BREG,=‘4’ 217
8.MOVEM AREG, M 218
….
15.LTORG
16.MULT BREG, N 227
17.ADD AREG,=‘6’ 228
….
END
LTORG
…
6.ADD CREG, =‘3’ 216
7.ADD BREG,=‘4’ 217
8.MOVEM AREG, M 218
….
15.LTORG
=‘3’ 225
=‘4’ 226
16.MULT BREG, N 227
17.ADD AREG,=‘6’ 228
….
END
=‘6’ 236
Fundamentals of Language Processing
• Language processing = analysis of source program
+ synthesis of target program
• Analysis of the source program is based on the specification of the source language:
– Lexical rules: formation of valid lexical units(tokens) in
the source language (Operators, identifiers and
constants)
– Syntax rules : formation of valid statements in the
source language
– Semantic rules: associate meaning with valid
statements of the language
Fundamentals of Language Processing
• per = (obtained_marks * 100) / total_marks;
Analysis of source program
• Lexical rules:
– Operators: = , *, /
– Identifiers: per, obtained_marks, total_marks
– Constants: 100
• Syntax rules: (tree generation)
– On LHS per
– On RHS (obtained_marks * 100) / total_marks
• Semantic rules: Assignment of RHS to LHS
using = (type checking)
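The lexical rules for this statement can be sketched as a small tokenizer. The token class names, the regular expressions and the `tokenize` function are assumptions made for illustration; only the example statement and its three token categories come from the slide.

```python
import re

TOKEN_RULES = [
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("CONSTANT", r"\d+"),
    ("OPERATOR", r"[=*/+\-]"),
    ("PUNCTUATION", r"[();]"),
    ("SKIP", r"\s+"),
]
TOKEN_PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_RULES))

def tokenize(source):
    """Split a statement into (token class, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"illegal character at position {pos}")
        if match.lastgroup != "SKIP":          # whitespace is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("per = (obtained_marks * 100) / total_marks;")
for kind in ("OPERATOR", "IDENTIFIER", "CONSTANT"):
    print(kind, [lexeme for k, lexeme in tokens if k == kind])
```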
Fundamentals of LP
• Synthesis of target program is construction of
target language statements
– Memory allocation : generation of data structures
in the target program
– Code generation for target machine
Assembler
• Translates source code written in assembly language into object code.
A simple Assembly Scheme
Design Specification of an assembler
Designing the specification of an assembler involves four steps:
• Identify information necessary to perform a task.
• Design a suitable data structure to record info.
• Determine processing necessary to obtain and maintain
the info.
• Determine processing necessary to perform the task
A simple Assembly Scheme
• There are two phases in specifying an assembler:
1. Analysis Phase
2. Synthesis Phase(the fundamental information
requirements will arise in this phase due to code
generation)
Analysis Phase
• Primary function of the Analysis phase is to build the
symbol table.
– It must determine the addresses with which the symbolic names
used in a program are associated
– It is possible to determine some addresses directly like the
address of first instruction in the program (i.e.START)
– Other addresses must be inferred
– To determine the addresses of the symbolic names we need to
fix the addresses of all program elements preceding it through
Memory Allocation.
• To implement memory allocation a data structure called
location counter is introduced.
Analysis Phase – Implementing memory
allocation
• LC(location counter) :
– is always made to contain the address of the next memory word in
the target program.
– It is initialized to the constant specified at the START statement.
• When a LABEL is encountered,
– it enters the LABEL and the contents of LC in a new entry of the
symbol table.
LABEL – e.g. N, AGAIN, SUM etc
– It then finds the number of memory words required by the
assembly statement and updates the LC contents
• To update the contents of the LC, analysis phase needs to know
lengths of the different instructions
– This information is available in the Mnemonics table and is extended
with a field called length
• We refer to the processing involved in maintaining the LC as LC processing
Example
Statement          LC
START 100
MOVER BREG, N      100 (1 word)
MULT BREG, N       101 (1 word)
STOP               102 (1 word)
N DS 5             103
Symbol Table
Symbol  Address
N       103
• Since instructions can take different amounts of memory, the length of each instruction is also stored in the mnemonic table, in the "length" field
Mnemonic  Opcode  Length
MOVER     04      1
MULT      03      1
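LC processing for the example above can be sketched in a few lines. The statement representation as (label, opcode, operand) tuples and the function name `build_symbol_table` are assumptions for illustration; the opcodes and lengths are the ones shown in the mnemonic tables of this unit.

```python
# Mnemonic table: mnemonic -> (opcode, length in words)
MNEMONIC_TABLE = {"MOVER": ("04", 1), "MULT": ("03", 1), "STOP": ("00", 1)}

def build_symbol_table(statements):
    """Analysis phase sketch: maintain LC and record label addresses."""
    symbol_table = {}
    lc = 0
    for label, opcode, operand in statements:
        if opcode == "START":
            lc = int(operand)              # LC starts at the START constant
            continue
        if label:
            symbol_table[label] = lc       # enter (symbol, <LC>)
        if opcode == "DS":
            lc += int(operand)             # DS n reserves n words
        elif opcode in MNEMONIC_TABLE:
            lc += MNEMONIC_TABLE[opcode][1]
        else:
            raise ValueError(f"invalid mnemonic: {opcode}")
    return symbol_table, lc

program = [
    (None, "START", "100"),
    (None, "MOVER", "BREG, N"),
    (None, "MULT", "BREG, N"),
    (None, "STOP", None),
    ("N", "DS", "5"),
]
print(build_symbol_table(program))
```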
Synthesis Phase: Example
Consider the following statement:
MOVER BREG, ONE
Synthesis Phase: Example
The following information is needed to synthesize the machine instruction for this statement:
1. Address of the memory word with which the name ONE is associated [depends on the source program, hence made available by the analysis phase].
2. Machine operation code corresponding to MOVER [depends only on the assembly language, hence the synthesis phase can determine this information for itself].
Based on the above discussion, which two data structures are required during the synthesis phase?
Data structures in synthesis phase
Symbol Table
• built by the analysis phase
• The two primary fields are name and address of the symbol used to
specify a value.
Mnemonics Table
• already present
• The fields are mnemonic and opcode, along with length.
Synthesis phase uses these tables to obtain
– The machine address with which a name is associated.
– The machine op code corresponding to a mnemonic.
• The tables have to be searched with the symbol name and the mnemonic as keys
Data structures of an assembler
[Figure: Source Program → Analysis Phase → Synthesis Phase → Target Program. Both phases access the Mnemonic Table; the Analysis Phase builds the Symbol Table, which the Synthesis Phase reads. Legend: → data access, --> control access.]
Mnemonic Table
Mnemonic  Opcode  Length
ADD       01      1
SUB       02      1
Symbol Table
Symbol  Address
AGAIN   104
N       113
Data structures
• Mnemonics table is a fixed table which is
merely accessed by the analysis and synthesis
phases
• Symbol table is constructed during analysis
and used during synthesis
Tasks Performed : Analysis Phase
• Isolate the label, mnemonic opcode and operand fields of a statement.
• If a label is present, enter (symbol, <LC>) into the symbol table.
• Check the validity of the mnemonic opcode using the mnemonics table.
• Update the value of LC.

Tasks Performed : Synthesis Phase
• Obtain the machine opcode corresponding to the mnemonic from the mnemonics table.
• Obtain the address of the memory operand from the symbol table.
• Synthesize a machine instruction or the machine form of a constant, depending on the statement.
Assembler’s functions
• Convert mnemonic operation codes to their
machine language equivalents
• Convert symbolic operands to their equivalent
machine addresses
• Build the machine instructions in the proper
format
• Convert the data constants to internal
machine representations
• Write the object program and the assembly
listing
Assembler: Design
• The design of an assembler involves:
– Scanning (tokenizing)
– Parsing (validating the instructions)
– Creating the symbol table
– Resolving the forward references
– Converting into the machine language
Assembler Design
• Pass of a language processor – one complete scan of the
source program
• Assembler Design can be done in:
– Single pass
– Two pass
• Single pass assembler:
– Does everything in a single pass
– Cannot resolve forward references directly; it must leave them incomplete and fill them in later (backpatching)
• Two pass assembler:
– Does the work in two passes
– Resolves forward references in the second pass
Difficulties: Forward Reference
• Forward reference: reference to a label that is
defined later in the program.
START 100
MOVER BREG, N
MULT BREG, N
STOP
N DS 5
* N is referenced in line 2 and defined at the end of the code

Backpatching
• The problem of forward references is handled
using a process called backpatching
– Initially, the operand field of an instruction containing
a forward reference is left blank
– Ex: MOVER BREG, ONE can be only partially
synthesized since ONE is a forward reference
– The instruction opcode and address of BREG will be
assembled to reside in location 101
– To insert the second operand's address later, an entry is added to the Table of Incomplete Instructions (TII)
– A TII entry is a pair (<instruction address>, <symbol>), here (101, ONE)
Backpatching
– When END statement is processed, the symbol table would
contain the addresses of all symbols defined in the source
program
– So TII would contain information of all forward references
– Now each entry in TII is processed to complete the
instruction
– Ex: the entry (101, ONE) would be processed by obtaining
the address of ONE from symbol table and inserting it in
the operand field of the instruction with assembled
address 101.
– Alternatively, when definition of some symbol ‘X’ is
encountered, all forward references to ‘X’ can be
processed
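Backpatching as described above can be sketched with a single-pass loop that records incomplete instructions and patches them at the end. The tuple format (label, opcode, register, symbol), the function name `assemble_single_pass` and the list-based instruction layout are assumptions for illustration; the program is the forward-reference example from the earlier slide.

```python
OPCODES = {"MOVER": 4, "MULT": 3, "STOP": 0}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}

def assemble_single_pass(statements):
    symtab, tii, code = {}, [], {}
    lc = 0
    for label, opcode, reg, symbol in statements:
        if opcode == "START":
            lc = int(symbol)
            continue
        if label:
            symtab[label] = lc
        if opcode == "DS":
            lc += int(symbol)
            continue
        address = 0
        if symbol is not None:
            if symbol in symtab:
                address = symtab[symbol]
            else:
                address = None              # operand field left blank
                tii.append((lc, symbol))    # remember the incomplete instruction
        code[lc] = [OPCODES[opcode], REGISTERS.get(reg, 0), address]
        lc += 1
    # END processing: every defined symbol now has an address, so patch the TII entries.
    for instruction_address, symbol in tii:
        if symbol not in symtab:
            raise ValueError(f"undefined symbol: {symbol}")
        code[instruction_address][2] = symtab[symbol]
    return code, symtab, tii

program = [
    (None, "START", None, "100"),
    (None, "MOVER", "BREG", "N"),
    (None, "MULT", "BREG", "N"),
    (None, "STOP", None, None),
    ("N", "DS", None, "5"),
]
code, symtab, tii = assemble_single_pass(program)
print(code, tii)
```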
Assembler: Design
• Symbol Table:
– This is created during pass 1
– All the labels of the instructions are symbols
– Table has entry for symbol name, address value.
• Forward reference:
– References to symbols that are defined later in the program are called forward references.
– There will not be any address value for such symbols in the symbol table when they are first encountered in pass 1.
Assembler: Design
• Assembler directives are pseudo instructions.
– They provide instructions to the assembler itself.
– They are not translated into machine operation codes.
Assembler: Design
• First pass:
– Scan the code by separating the symbol, mnemonic
op code and operand fields
– Build the symbol table
– Perform LC processing
– Construct intermediate representation
• Second Pass:
– Solves forward references
– Converts the code to the machine code
Data Structures in Pass I
• OPTAB
• SYMTAB
• LOCCTR
• LITTAB
• POOLTAB
• OPTAB – a table of mnemonic op codes
– Contains mnemonic op code, class and mnemonic info
– Class field indicates whether the op code corresponds
to
• an imperative statement (IS),
• a declaration statement (DL) or
• an assembler Directive (AD)
– For IS, mnemonic info field contains the pair ( machine
opcode, instruction length)
– Else, it contains the id of the routine to handle the
declaration or a directive statement
– Routine processes the operand field of the statement to
determine the amount of memory required and
updates LC and the SYMTAB entry of the symbol defined
• Characteristic
– Static table. The content will never change
• Implementation:
– Contents will never change, we can optimize its
search speed. Use Array or hash table.
• In pass 1, OPTAB is used to look up and
validate mnemonics in the source program
• In pass 2, OPTAB is used to translate
mnemonics to machine instructions
SYMTAB (Symbol Table)
• Content
– Include the label name and address for each label in the source
program.
– Include type and length information (e.g., int64)
– With flag to indicate errors (e.g., a symbol defined in two places)
• Characteristic
– Dynamic table (i.e., symbols may be inserted, deleted, or
searched in the table)
• Implementation
– Hash table can be used to speed up search
– Because variable names may be very similar (e.g., LOOP1,
LOOP2), the selected hash function must perform well with such
non-random keys
• LOCCTR - Location Counter
• LITTAB – A table of literals used in the program
– Contains literal and address
– Literals are allocated addresses starting with the current value in LC, and LC is incremented appropriately; this happens when an LTORG AD is found, otherwise at the END statement
• POOLTAB – Pool Table
– Maintains the pools of literals: each entry gives the LITTAB entry number of the first literal of a pool
Two Pass Assembler
[Figure: Source program → Pass 1 → Intermediate Code → Pass 2 → Object codes. Each statement is read as (Label, opcode, operand).]
• OPTAB: mnemonic and opcode mappings are referenced
• SYMTAB: label and address mappings are created
• LITTAB: variable / constant address mappings are created
Example
START 100
MOVER BREG, N
MULT BREG, N
ADD CREG, =‘3’
MOVEM AREG, M
ADD AREG,=‘6’
STOP
N DS 1
M DC ‘4’
END
OPTAB
Mnemonic  Class  Mnemonic Info
START     AD     R#11
MOVER     IS     (04,1)
MULT      IS     (03,1)
ADD       IS     (01,1)
MOVEM     IS     (05,1)
STOP      IS     (00,1)
DS        DL     R#7
DC        DL     R#8
END       AD     R#10
SYMTAB
Symbol  Address  Length
N       106      1
M       107      1
LITTAB
Literal  Address
=‘3’     108
=‘6’     109
Example
START 100
MOVER BREG, N
MULT BREG, N
ADD CREG, =‘3’
ADD BREG,=‘4’
MOVEM AREG, M
LTORG
MULT BREG, N
ADD AREG,=‘6’
STOP
N DS 5
M DC ‘4’
END
OPTAB
Mnemonic  Class  Mnemonic Info
START     AD     R#11
MOVER     IS     (04,1)
MULT      IS     (03,1)
ADD       IS     (01,1)
MOVEM     IS     (05,1)
LTORG     AD     R#12
STOP      IS     (00,1)
DS        DL     R#7
DC        DL     R#8
END       AD     R#10
SYMTAB
Symbol  Address  Length
N       110      1
M       115      1
LITTAB
Literal  Address
=‘3’     105
=‘4’     106
=‘6’     116
POOLTAB
Literal No
#1
#3
Intermediate Code (IC) Format
• IC must contain following three fields:
– Address
– Representation of Mnemonic Opcode
– Representation of Operands
Address   Mnemonic Opcode   Operands
• Address is the location counter value
Representation of Mnemonic Opcode
• (Statement Class, Code)
Declaration Statements (DL)
DC 01
DS 02
Assembler Directives (AD)
START 01
END 02
ORIGIN 03
EQU 04
LTORG 05
Representation of Operands
• Operand 1 (Register or Condition Code), represented as (Number)
Registers
AREG 01
BREG 02
CREG 03
DREG 04
Condition Codes
LT 01
LE 02
EQ 03
GT 04
GE 05
ANY 06
Representation of Operands
• Operand 2 (S,C,L)
• (Operand Class, Code)
• S = Symbol Code = Entry in Symbol Table
• C = Constant Code = Constant Value
• L = Literal Code = Entry in Literal Table
START 200
READ A
LOOP MOVER AREG,A
.
.
SUB AREG,=‘9’
BC GT,LOOP
STOP
A DS 2
LTORG
.
• Variant I
(AD,01) (C,200)
(IS,09) (S,01)
(IS,04) (1) (S,01)
.
.
(IS,02) (1) (L,01)
(IS,07) (4) (S,02)
(IS,00)
(DL,02) (C,1)
(AD,05)
• Variant II
(AD,01) (C,200)
(IS,09) A
(IS,04) AREG,A
.
.
(IS,02) AREG (L,01)
(IS,07) GT, LOOP
(IS,00)
(DL,02) (C,1)
(AD,05)
• Variant I
– Complete processing of Operands is needed
– Reduces the work of pass II
• Variant II
– Operand processing has to be done in Pass II
– IC is less compact
– Transfers some coding load from Pass I to Pass II
Assembler Pass I
Step 1: Initialization
  loc_cntr = 0 (default value)
  pooltab_ptr = 1
  POOLTAB[1] = 1
  littab_ptr = 1
Step 2: While the next statement is not an END statement
  a) If a label is present then
       this_label = symbol in label field
       Enter (this_label, loc_cntr) in SYMTAB
  b) If an LTORG statement then
       i) Process the literals of the current pool: allocate memory and enter each address in the literal table
       ii) pooltab_ptr = pooltab_ptr + 1
       iii) POOLTAB[pooltab_ptr] = littab_ptr
  c) If a START or ORIGIN statement then
       loc_cntr = value specified in operand field
  d) If an EQU statement then
       this_addr = value of <address specification>
       Correct the SYMTAB entry for this_label to (this_label, this_addr)
  e) If a declaration statement then
       i) code = code of the declaration statement
       ii) size = size of memory area required by DC/DS
       iii) loc_cntr = loc_cntr + size
       iv) Generate IC ‘(DL, code) …’
  f) If an imperative statement then
       i) code = machine opcode from OPTAB
       ii) loc_cntr = loc_cntr + instruction length from OPTAB
       iii) If the operand is a literal then
              this_literal = literal in operand field
              LITTAB[littab_ptr] = this_literal
              littab_ptr = littab_ptr + 1
              Generate IC ‘(IS, code)(L, this_literal)’
            Else (i.e. the operand is a symbol)
              this_entry = SYMTAB entry number of operand
              Generate IC ‘(IS, code)(S, this_entry)’
Step 3: (Processing of END statement)
  i) Perform step 2(b)
  ii) Generate IC ‘(AD,02)’
  iii) Go to Pass II
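A reduced version of Pass I can be sketched as follows. It is not the full algorithm: IC generation, ORIGIN and EQU are omitted, duplicate literals are not merged, and the tuple format (label, opcode, operand) and the name `pass_one` are assumptions. Run on the LTORG example program from the earlier slide, it produces the SYMTAB, LITTAB and POOLTAB contents shown there.

```python
# OPTAB subset for imperative statements: mnemonic -> (class, opcode, length)
OPTAB = {"MOVER": ("IS", 4, 1), "MULT": ("IS", 3, 1), "ADD": ("IS", 1, 1),
         "MOVEM": ("IS", 5, 1), "STOP": ("IS", 0, 1)}

def pass_one(statements):
    symtab, littab, pooltab = {}, [], [1]   # POOLTAB holds 1-based LITTAB entry numbers
    lc = 0

    def allocate_pool():
        """Give addresses to the literals of the current pool."""
        nonlocal lc
        for entry in littab[pooltab[-1] - 1:]:
            entry[1] = lc
            lc += 1

    for label, opcode, operand in statements:
        if label:
            symtab[label] = lc
        if opcode == "START":
            lc = int(operand)
        elif opcode == "LTORG":
            allocate_pool()
            pooltab.append(len(littab) + 1)  # the next pool starts after this one
        elif opcode == "END":
            allocate_pool()
            break
        elif opcode == "DS":
            lc += int(operand)
        elif opcode == "DC":
            lc += 1
        else:
            length = OPTAB[opcode][2]
            if operand and "=" in operand:
                littab.append([operand[operand.index("="):], None])
            lc += length
    return symtab, littab, pooltab

program = [
    (None, "START", "100"), (None, "MOVER", "BREG, N"), (None, "MULT", "BREG, N"),
    (None, "ADD", "CREG, ='3'"), (None, "ADD", "BREG, ='4'"), (None, "MOVEM", "AREG, M"),
    (None, "LTORG", None), (None, "MULT", "BREG, N"), (None, "ADD", "AREG, ='6'"),
    (None, "STOP", None), ("N", "DS", "5"), ("M", "DC", "'4'"), (None, "END", None),
]
symtab, littab, pooltab = pass_one(program)
print(symtab, littab, pooltab)
```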
Assembler Pass II
Step 1: Initialization
  i. code_area_address = address of code area
  ii. pooltab_ptr = 1
  iii. loc_cntr = 0
Step 2: While the next statement is not an END statement
  a) Clear machine_code_buffer
  b) If an LTORG statement then
       i. Process literals
       ii. size = size of memory area required for the literals
       iii. pooltab_ptr = pooltab_ptr + 1
  c) If a START or ORIGIN statement then
       i. loc_cntr = value specified in operand field
       ii. size = 0
  d) If a declaration statement then
       i. If a DC statement then assemble the constant in machine_code_buffer
       ii. size = size of memory area required by DC/DS
  e) If an imperative statement then
       i. Get the operand address from SYMTAB or LITTAB
       ii. Assemble the instruction in machine_code_buffer
       iii. size = size of instruction
  f) If size is not equal to 0 then
       i. Move the contents of machine_code_buffer to the address code_area_address + loc_cntr
       ii. loc_cntr = loc_cntr + size
Step 3: (Processing of END statement)
  i. Perform steps 2(b) and 2(f)
  ii. Write the code area into the output file
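A reduced version of Pass II can be sketched as follows. It works directly on source statements rather than on intermediate code, has no LTORG handling, and writes all literals at their LITTAB addresses at the end. The name `pass_two`, the tuple format and the "00 0 nnn" layout used for constants are assumptions; the SYMTAB and LITTAB values are those of the first worked example in this unit.

```python
OPCODES = {"MOVER": 4, "MULT": 3, "ADD": 1, "MOVEM": 5, "STOP": 0}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}

def pass_two(statements, symtab, littab):
    """Return {address: machine word} for the given statements and tables."""
    code = {}
    lc = 0
    for label, opcode, operand in statements:
        if opcode == "START":
            lc = int(operand)
        elif opcode == "END":
            break
        elif opcode == "DS":
            lc += int(operand)               # storage is reserved, nothing is assembled
        elif opcode == "DC":
            code[lc] = f"00 0 {int(operand.strip(chr(39))):03d}"
            lc += 1
        else:
            reg, address = 0, 0
            if operand:
                reg_name, target = [part.strip() for part in operand.split(",")]
                reg = REGISTERS[reg_name]
                # Literal operands are looked up in LITTAB, symbols in SYMTAB.
                address = littab[target] if target.startswith("=") else symtab[target]
            code[lc] = f"{OPCODES[opcode]:02d} {reg} {address:03d}"
            lc += 1
    for literal, address in littab.items():   # assemble the literal values
        code[address] = f"00 0 {int(literal[1:].strip(chr(39))):03d}"
    return code

program = [
    (None, "START", "100"), (None, "MOVER", "BREG, N"), (None, "MULT", "BREG, N"),
    (None, "ADD", "CREG, ='3'"), (None, "MOVEM", "AREG, M"), (None, "ADD", "AREG, ='6'"),
    (None, "STOP", None), ("N", "DS", "1"), ("M", "DC", "'4'"), (None, "END", None),
]
symtab = {"N": 106, "M": 107}
littab = {"='3'": 108, "='6'": 109}
code = pass_two(program, symtab, littab)
for address in sorted(code):
    print(address, code[address])
```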
Example: ALP and its equivalent Machine
Language Program
Error Reporting in Assembler
• Invalid Opcode
• Duplicate Declaration of Symbols
• Undefined Symbols
Donovan's Instruction Set
Pseudo Opcodes
START, END, DC, DS, USING, LTORG, DROP, EQU
Machine Opcode Formats
RR (Register Register)
RX (Register Index)
SI (Storage Immediate)
RS (Register Storage)
SS (Storage Storage)
Instruction Type   Memory Requirement
RR                 2 Bytes
RX                 4 Bytes
RS                 4 Bytes
SI                 4 Bytes
SS                 6 Bytes
• RR type instruction: Ending with R
– SR,BR,BALR,BCR,etc.
• SI type instruction: Ending with I
– LI,AI,CLI,etc.
• RX type instruction: Ending with other than R/I
– A,R,ST,S,etc.
Label  Opcode  Operand
DATA   DC      F‘5’
Data Type         Size
H (Half Word)     2 Bytes
F (Full Word)     4 Bytes
D (Double Word)   8 Bytes
C (Character)     1 Byte
IBM 360/370 Instruction Set
Design Procedure of ALP
– Assign the values of the Location Counter (LC)
– Prepare symbol table
– Prepare base table
– Prepare literal table
– Generate M/C code
• Location Counter
– To assign LC values, write down each machine instruction and then increment LC by the length of its instruction type (e.g. RR, RX, …).
– If a DC or DS is encountered, first align LC according to the type of the constant and then increment it by the size of the constant.
E.g.
Suppose DC F‘4’ 16
DS F‘5’ 20
Alignment
DC / DS F: multiple of 4
DC / DS H: multiple of 2
DC / DS D: multiple of 8
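The alignment rule can be expressed as rounding LC up to the next multiple of the boundary. This is a sketch: the helper names `align` and `allocate` are hypothetical, and treating C (character) data as needing no alignment is an assumption, since the slide lists boundaries only for F, H and D.

```python
# Boundary and size in bytes for each data type.
ALIGNMENT = {"H": 2, "F": 4, "D": 8, "C": 1}

def align(lc, data_type):
    """Round lc up to the next multiple of the boundary for data_type."""
    boundary = ALIGNMENT[data_type]
    return (lc + boundary - 1) // boundary * boundary

def allocate(lc, data_type):
    """Return (address of the constant, LC after it)."""
    address = align(lc, data_type)
    return address, address + ALIGNMENT[data_type]

print(allocate(14, "F"))   # LC 14 is not a multiple of 4, so the constant goes to 16
print(allocate(20, "D"))
```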
LTORG
• If LTORG is encountered, first assign addresses to the literals.
• While making the literal table, look for the literals in the program, i.e. all operands with =F in front.
e.g. =F‘10’ 32
→ The first literal should always be placed at a double word location, i.e. a multiple of 8.
Literal   LC
=F‘25’    36
=F‘4’     40
LTORG
Literal Table
Literal   Value  R
=F‘10’    32     R
=F‘25’    36     R
Machine Code
– RR val1, val2 → RR val1, val2
– RX val1, val2 → RX val1, offset (X, B)
• X: index
• B: base (reg. no. in base table)
– SI val1, val2 → SI val2, offset (B)
• B: reg. value
Symbol Table
Symbol  Value  R/A
SIMPLE  0      R
• Relative (R): for every ordinary symbol, enter its LC value and write R.
• Absolute (A): a symbol defined by EQU is absolute.
e.g. R1 EQU 1
So for the symbol R1, enter the EQU value (not the LC value) in the table and write A.
• Base Table
Reg. No.  Content
15        0
• To build the base table, look at the USING instruction in the program.
e.g. USING *,15
So the register number is 15 and its content is the value of * (the current LC value).
How to solve
• OFFSET = value of (symbol/literal) – (B), where (B) is the content of base register B in the base table
• BNE: always changes into BC 7
• BR 14: always changes into BCR 15, 14
• DC: corresponding constant
• DS: blank
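The offset rule can be sketched as a lookup in the base table. The function name `base_displacement` is hypothetical, and two details are assumptions not stated on the slides: the offset must fit in the range 0 to 4095 (the 12-bit displacement field of the IBM 360 formats), and when several base registers qualify the one giving the smallest offset is chosen.

```python
def base_displacement(value, base_table):
    """Return (offset, base register) for a symbol or literal value."""
    usable = {reg: value - content
              for reg, content in base_table.items()
              if 0 <= value - content <= 4095}
    if not usable:
        raise ValueError("no base register covers this address")
    reg = min(usable, key=usable.get)     # smallest offset wins
    return usable[reg], reg

base_table = {15: 0}                      # from USING *,15 with LC = 0
print(base_displacement(32, base_table))  # literal =F'10' at LC 32
```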
