Directives tell the assembler how to generate machine code and allocate storage. Ex:
count db 50
The assembler produces an object file from the assembly language source. The object file contains machine language code with some external and relocatable addresses whose values are still undetermined at that stage; they will be resolved by the linker. The linker extracts object modules (compiled procedures) from a library and links them with the object file to produce the executable file. The addresses in the executable file are all resolved, but they are still logical addresses.
count db 50
main: mov eax, 5
      xor eax, ebx
      jmp main
The label main specifies the entry point of a block of instructions. The entry point here is mov eax, 5.
Names (Cont.)
The first character must be a letter or one of @, _, $, ?
Subsequent characters can also include digits.
A programmer-chosen name must be different from an assembler reserved word.
Avoid using @ as the first character, since many keywords start with it.
When called from bcc32, the TASM32 assembler is case sensitive for user-defined words but case insensitive for assembler reserved words.
Integer Constants
Integer constants are made of numerical digits with, possibly, a sign and a suffix. Ex:
-23 (a negative integer; base 10 is the default)
1011b (a binary number)
1011 or 1011d (a decimal number)
513o or 513q (an octal number)
0A7Ch (a hexadecimal number)
A7Ch (this is the name of a variable; a hexadecimal number must start with a decimal digit)
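The suffix rules above can be checked with a short Python sketch. The function name parse_int_constant is an illustrative choice, not part of any assembler, and the sketch only covers the suffixes listed on this slide.

```python
def parse_int_constant(text):
    """Return the value of an assembler-style integer constant.

    Suffixes: b = binary, o or q = octal, d or none = decimal, h = hexadecimal.
    Raises ValueError for anything that is not an integer constant.
    """
    s = text.strip().lower()
    sign = -1 if s.startswith("-") else 1
    if s.startswith(("+", "-")):
        s = s[1:]
    # A constant must begin with a decimal digit; otherwise it is a name.
    if not s or not s[0].isdigit():
        raise ValueError(f"{text!r} is not an integer constant")
    bases = {"b": 2, "o": 8, "q": 8, "d": 10, "h": 16}
    if s[-1] in bases:
        base, digits = bases[s[-1]], s[:-1]
    else:
        base, digits = 10, s
    return sign * int(digits, base)


for example in ["-23", "1011b", "1011d", "513q", "0A7Ch"]:
    print(example, "=", parse_int_constant(example))

try:
    parse_int_constant("A7Ch")
except ValueError as err:
    print(err)   # A7Ch starts with a letter, so it is treated as a name
```

Checking the first character before looking at the suffix is what separates 0A7Ch (a number) from A7Ch (a name).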
Will allocate 6 bytes that will be filled with 0 (i.e. the specified initial values are ignored).
Constants
We can use the equal-sign (=) directive or the EQU directive to give a name to a constant. Ex:
one = 1 ;this is a constant
two equ 2 ;also a constant
For naming a numeric constant in this way, the EQU and = directives are equivalent.
The assembler does not allocate storage to a constant (in contrast with data allocation directives).
It merely substitutes, at assembly time, the value of the constant at each occurrence of the assigned name.
Constants (cont.)
In place of a constant, we can use a constant expression involving the standard operators used in HLLs: +, -, *, /
Ex: the following constant expression is evaluated and given a name at assembly time:
A = (-3 * 8) + 2
A constant can be defined in terms of another constant:
B = (A+2)/2
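As a quick check of the two expressions above, here is the same arithmetic in Python. This only mirrors the calculation the assembler performs; the division happens to be exact here, so any difference in rounding rules between Python and the assembler does not matter.

```python
# Constant expressions are evaluated once, before the program runs.
A = (-3 * 8) + 2      # -24 + 2
B = (A + 2) // 2      # defined in terms of A; the division is exact

print("A =", A)       # A = -22
print("B =", B)       # B = -10
```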
Exercise 1
Suppose that the following data segment starts at address 0:
.data
A DW 1,2
B DW 6ABCh
Z EQU 232
C DB 'ABCD'
A) Find the address of variable A.
B) Find the address of variable B.
C) Find the address of variable C.
D) Find the address of the character 'C'.
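One way to reason about such questions is to walk through the declarations and add up the storage each one takes. The Python sketch below does this for a different, hypothetical segment (so the exercise is left to the reader). It assumes variables are laid out back to back with no padding; the names SIZES and layout are illustrative.

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4}   # bytes per element


def layout(declarations, start=0):
    """Map each variable name to its offset; EQU entries take no storage."""
    offsets = {}
    address = start
    for name, directive, count in declarations:
        if directive == "EQU":
            continue                  # a constant occupies no memory
        offsets[name] = address
        address += SIZES[directive] * count
    return offsets


# Hypothetical segment (not the one from the exercise):
#   X DB 1,2,3
#   N EQU 10
#   Y DW 5
#   S DB 'OK'
demo = [("X", "DB", 3), ("N", "EQU", 1), ("Y", "DW", 1), ("S", "DB", 2)]
print(layout(demo))   # {'X': 0, 'Y': 3, 'S': 5}
```

The address of an individual character inside a DB string is the offset of the variable plus the position of the character within the string.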
mov destination,source
This changes the content of destination (but not the content of source).
All two-operand instructions have the form OpCode Dst, Src.
Both operands must be of the same size.
An operand can be either direct or indirect.
Direct operands (this chapter) are either:
  Immediate (a constant), noted Imm
  Register, noted Reg
  Memory variable (with displacement), noted Mem
Indirect operands are used for indirect addressing (later chapter).
XCHG destination,source
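Only mem and reg operands are permitted (and they must be of the same size).
Both operands cannot be mem (a direct mem-to-mem exchange is forbidden).
To exchange the contents of word1 and word2, we have to go through a register: load word1 into a 16-bit register, exchange that register with word2, then store the register back into word1.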
Exercise 2
Given the following data segment:
.data
A dw 1234h,-1
B dd 55h,66778899h
Indicate whether each of the following instructions is legal. If it is, give the value, in hexadecimal, of the destination operand immediately after the instruction is executed (please verify your answers with a debugger).
MOV eax,A
MOV bx,A+1
MOV bx,A+2
MOV dx,A+4
MOV cx,B+1
MOV edx,B+2
ADD destination,source
The SUB instruction subtracts the source from the destination and stores the result in the destination (source remains unchanged)
SUB destination,source
Both operands must be of the same size, and they cannot both be mem operands.
Recall that to perform A - B the CPU in fact performs A + NEG(B).
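The identity A - B = A + NEG(B) can be checked with a small Python model of a fixed-width register. This is only a sketch of the arithmetic (the helper names neg and sub_via_add are illustrative); it does not model the flags.

```python
def neg(value, bits=8):
    """Two's complement negation within a register of the given width."""
    mask = (1 << bits) - 1
    return (~value + 1) & mask


def sub_via_add(a, b, bits=8):
    """Compute a - b as a + NEG(b), discarding any carry out."""
    mask = (1 << bits) - 1
    return (a + neg(b, bits)) & mask


print(hex(sub_via_add(0x05, 0x03)))   # 0x2
print(hex(sub_via_add(0x03, 0x05)))   # 0xfe, i.e. -2 in 8 bits
```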
ZF (zero flag) = 1 iff the result is zero
SF (sign flag) = 1 iff the msb of the result is one
OF (overflow flag) = 1 iff there is a signed overflow
CF (carry flag) = 1 iff there is an unsigned overflow
Signed overflow: when the operation generates an out-of-range (erroneous) signed value
Unsigned overflow: when the operation generates an out-of-range (erroneous) unsigned value
More on Overflows
An unsigned overflow occurs if and only if (IFF) the unsigned value of the result does not fit into the destination operand.
This occurs IFF the unsigned interpretation of the result is erroneous.
It is signaled by CF=1.
A signed overflow occurs IFF the signed value of the result does not fit into the destination operand.
This occurs IFF the signed interpretation of the result is erroneous.
It is signaled by OF=1.
mov al,0FFh
add al,1   ; AL=00h, OF=0, CF=1
mov al,7Fh
add al,1   ; AL=80h, OF=1, CF=0
mov al,80h
add al,80h ; AL=00h, OF=1, CF=1
Hence: we can have either type of overflow or both of them at the same time
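The flag definitions can be turned into a small Python model of ADD. This is a sketch for checking examples by hand, not a description of how the CPU is built; the function name add_flags and its return order are illustrative choices.

```python
def add_flags(a, b, bits=8):
    """Return (result, CF, OF, ZF, SF) for ADD on two `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    total = (a & mask) + (b & mask)
    result = total & mask
    cf = int(total > mask)                 # carry out of the msb
    # Signed overflow: both operands have a sign that the result does not have.
    of = int(((a ^ result) & (b ^ result) & sign) != 0)
    zf = int(result == 0)
    sf = int((result & sign) != 0)
    return result, cf, of, zf, sf


for a, b in [(0xFF, 0x01), (0x7F, 0x01), (0x80, 0x80)]:
    result, cf, of, zf, sf = add_flags(a, b)
    print(f"{a:02X}h + {b:02X}h: AL={result:02X}h OF={of} CF={cf}")
```

The three pairs cover the cases of unsigned overflow only, signed overflow only, and both at once. Passing bits=16 gives the same model for AX-sized operands.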
Overflow Example
mov ax,4000h
add ax,ax ;AX = 8000h
Unsigned Interpretation: the sum of the 2 magnitudes 4000h + 4000h gives 8000h. This is the result in AX (the unsigned value of the result is correct). Hence CF=0.
Signed Interpretation: we add two positive numbers, 4000h + 4000h, and obtain a negative number! The signed value of the result in AX is erroneous. Hence OF=1.
Overflow Example
mov ax,8000h
sub ax,0FFFFh ;AX = 8001h
Unsigned Interpretation: from the magnitude 8000h we subtract the larger magnitude FFFFh, so the unsigned value of the result is erroneous. Hence CF=1.
Signed Interpretation: we subtract -1 from the negative number 8000h and obtain the correct signed result 8001h. Hence OF=0.
Overflow Example
mov ah,40h
sub ah,80h ;AH = C0h
Unsigned Interpretation: we subtract from 40h the larger number 80h, so the unsigned value of the result is wrong. Hence CF=1.
Signed Interpretation: we subtract from 40h (64) a negative number 80h (-128) and obtain a negative number, so the signed value of the result is wrong. Hence OF=1.
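The two subtraction examples can be reproduced with a short Python model of SUB. As before this is only a sketch for checking results by hand; the name sub_flags is illustrative, and CF is modeled as a borrow (set when the unsigned destination is smaller than the unsigned source).

```python
def sub_flags(a, b, bits=16):
    """Return (result, CF, OF) for SUB a, b on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    cf = int(a < b)                        # a borrow was needed
    # Signed overflow: operands differ in sign and the result's sign
    # differs from the sign of the destination operand.
    of = int(((a ^ b) & (a ^ result) & sign) != 0)
    return result, cf, of


r, cf, of = sub_flags(0x8000, 0xFFFF, bits=16)
print(f"AX={r:04X}h CF={cf} OF={of}")      # AX=8001h CF=1 OF=0

r, cf, of = sub_flags(0x40, 0x80, bits=8)
print(f"AH={r:02X}h CF={cf} OF={of}")      # AH=C0h CF=1 OF=1
```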
Exercise 3
For each of these instructions, give the content (in hexadecimal) of the destination operand and the CF and OF flags immediately after the execution of the instruction (verify your answers with a debugger).
ADD AX,BX when AX contains 8000h and BX contains FFFFh.
SUB AL,BL when AL contains 00h and BL contains 80h.
ADD AH,BH when AH contains 2Fh and BH contains 52h.
SUB AX,BX when AX contains 0001h and BX contains FFFFh.
Character Output
The putch macro prints on the screen the character whose ASCII code is given by the operand. Usage: putch source
Where source must be a 32-bit operand, i.e. either imm, reg32, or mem32 (a double word variable).
.data
aword dw 41h
adword dd 61h
.code
putch aword ;error: 16-bit operand
putch adword ;a is written on screen
putch 'b' ;b is written on screen
mov eax,'c'
putch eax ;c is written on screen
putch ax ;error: 16-bit operand
String Output
To print a string, use the following macro: putstr source
Where source must be a mem operand (i.e. the name of a variable). It cannot be a reg or imm operand.
This macro calls printf from the C library with the "%s" format. Hence:
The number 10 = 0Ah moves the cursor to the beginning of the next line (the newline character in C).
The string must be null-terminated: the last character must have ASCII code 0h.
Ex:
.data
msg db "hello",0ah,"world",0h
.code
putstr msg ;prints hello on one line
           ;and world on the next line
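To see why the terminating 0h matters, the Python sketch below models memory as a sequence of bytes and reads a string the way a "%s" conversion does: from the starting address up to, but not including, the first zero byte. The helper name c_string is illustrative.

```python
def c_string(memory, start=0):
    """Return the text from `start` up to (not including) the first 0 byte."""
    end = memory.index(0, start)      # raises ValueError if no terminator
    return memory[start:end].decode("ascii")


# Same bytes as: msg db "hello",0ah,"world",0h
msg = b"hello" + bytes([0x0A]) + b"world" + bytes([0x00])
print(c_string(msg))
```

Without the final 0h, the search for the terminator would run past the end of the string into whatever bytes follow it in memory; here that case shows up as a ValueError.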
Integer Output
To print the signed value of an integer, use: putint source
Where source must be a 32-bit operand, i.e. either imm, reg32, or mem32 (a double word variable). Ex:
.data
aword dw 243
adword dd -266
.code
putint aword ;error: 16-bit operand
putint adword ;-266 is written on screen
putint -1 ;-1 is written on screen
mov eax,0FFFFFFFFh
putint eax ;-1 is written on screen
putint ax ;error: 16-bit operand
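The reason 0FFFFFFFFh prints as -1 is that the 32 bits are interpreted as a two's complement signed value. A minimal Python sketch of that interpretation (the name to_signed32 is illustrative):

```python
def to_signed32(value):
    """Interpret the low 32 bits of `value` as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(to_signed32(0xFFFFFFFF))   # -1
print(to_signed32(0xFFFFFEF6))   # -266
print(to_signed32(243))          # 243
```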
Character Input
To read one or more characters from the keyboard, we will use the getch macro. Usage: getch
This macro calls getchar() from the C library, so it uses a memory buffer that we will call the input buffer.
Upon execution of getch, the input buffer is first examined. If the input buffer is empty, then getch waits for the user to enter an input line (a sequence of characters ended by <CR>).
Each character that the user enters at the keyboard is copied into the input buffer.
When the user enters <CR>: the screen cursor moves to the next line, the value 0Ah is stored in the input buffer, and control passes to the instruction following getch.
The ASCII code of the first character entered on the keyboard is stored in AL. The remaining bits of EAX are filled with zeros. Ex:
mov eax,-1
getch ; eax=41h if the user first hits A
If the input buffer is not empty when getch is executed, then EAX is loaded with the ASCII code of the next character in the input buffer, and the pointer to the next character is increased by one.
The input buffer is empty only when the pointer to the next character points beyond the last character (i.e. 0Ah).
The user is prompted only when the input buffer is empty.
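The buffer behavior described above can be modeled in a few lines of Python. This is a toy model under the assumptions of these slides, not the actual implementation of getchar(); the class name InputBuffer and the read_line callback (which stands in for the keyboard) are illustrative.

```python
class InputBuffer:
    """Toy model of the input buffer behind getch."""

    def __init__(self, read_line):
        self.read_line = read_line   # called only when the buffer is empty
        self.data = b""
        self.next = 0                # index of the next unread character

    def getch(self):
        if self.next >= len(self.data):      # buffer empty: wait for a line
            self.data = self.read_line().encode("ascii") + b"\n"
            self.next = 0
        code = self.data[self.next]
        self.next += 1
        return code                  # value that would be placed in EAX


lines = iter(["AB", "C"])            # simulated keyboard: one entry per <CR>
buf = InputBuffer(lambda: next(lines))
print([hex(buf.getch()) for _ in range(5)])
# ['0x41', '0x42', '0xa', '0x43', '0xa']
```

Only the first and fourth calls ask for a new line; the second and third are served from the buffer, including the 0Ah stored for <CR>.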