
x86, Assembler

TASM, MASM, NASM

Available assemblers
MASM
    Microsoft: Macro Assembler
TASM
    Borland: Turbo Assembler
NASM
    Library General Public License (LGPL) [Free]: Netwide Assembler
etc.
    Flat Assembler, SpAssembler

MASM: Microsoft Macro Assembler


MASM contains a macro language with looping, arithmetic, text string processing, and so on. It supports the instruction sets of the 386, 486, and Pentium processors, giving you greater direct control over the hardware. You can also avoid extra time and memory overhead when using MASM.
http://msdn.microsoft.com/library/en-us/vcmasm/html/vcoriMicrosoftAssemblerMacroLanguage.asp

TASM: Turbo Assembler


TASM, Inprise's Borland Turbo Assembler, supports an alternative to MASM emulation. This is known as Ideal mode and provides several advantages over MASM.
The key (questionable) disadvantage, of course, is that MASM-style assemblers cannot assemble Ideal mode programs.

NASM: Netwide Assembler


NASM is designed for portability and modularity. It supports a range of object file formats including Linux, Microsoft 16-bit OBJ and Win32. Its syntax is designed to be simple and easy to understand, similar to Intel's but less complex. It supports Pentium, P6, MMX, 3DNow! and SSE opcodes, and has macro capability. It includes a disassembler as well. NASM is released under the Library General Public License (LGPL) [Free].
http://nasm.sourceforge.net

FASM: Flat Assembler


Currently it supports all 8086-80486/Pentium instructions with MMX, SSE, SSE2, SSE3 and 3DNow! extensions, and can produce output in binary, MZ, PE, COFF or ELF format. It includes powerful but easy-to-use macroinstruction support and does multiple passes to optimize the instruction codes for size. The flat assembler is self-compilable and the full source code is included.
http://flatassembler.net/

About developing assembly language


CPU's language (instructions)
    x86 instruction set
Directives
    About the compiler
        MASM, TASM, NASM

TASM

Important files
Compiler
    TASM      16 bits    real mode
    TASM32    32 bits    protected mode
Linker
    TLINK

Pseudo instructions
Segment, ends: define a segment.
Assume: specify which segment register is used for a segment defined by Segment, ends.
Data allocation

Segment Declaration
Usage
    Segment_name segment
    Segment_name ends
Ex.
    Cseg segment
    Cseg ends

Label declaration
Usage
    A label name followed by a colon (:)
Ex.
    Start: mov bx, offset start
           jmp Start

Data allocation
Define value
    DB    Define Byte
    DW    Define Word
    DD    Define Doubleword
    DQ    Define Quadword
    DT    Define Ten Bytes
Usage
    Var_name Dx data

Ex. Data allocation

    dseg segment
    Msg  db 'hello world$'
    MulH dw 0, 1, 2, 3
    MulF dd 1234h
    dseg ends

Data duplication
Usage
    type count dup (value)
Ex.
    data1 db 10 dup (0)
    data2 db 2 dup (3 dup (0))
    data3 db 3 dup (1, 2, 3 dup (4))
    data4 db 4 dup (?)

Structure
    Struc PosType
        Row dw ?
        Col dw ?
    Ends PosType

    Union PosValType
        Pos PosType ?
        Val dd ?
    Ends PosValType

    Point PosValType ?

Structure
    mov [Point.Pos.Row], bx    ; OK: Move BX to Row component of Point
    mov [Point.Pos.Row], bl    ; Error: mismatched operands

Data reference
offset directive: to retrieve the offset of a data item
    mov bx, offset msg1    ;bx = offset/addr
To retrieve / put data
    mov dx, msg1           ;dx = [msg1]
    mov [msg1], dx         ;[msg1] = dx
    mov [bx+2], dx         ;[bx+2] = dx

Memory contents
    ByteVal db ?             ;"ByteVal" is name of byte variable
    mov ax, bx               ;OK: Move value of BX to AX
    mov ax, [bx]             ;OK: Move word at address BX to AX. Size of
                             ;destination is used to generate proper object code
    mov ax, [word bx]        ;OK: Same as above with unnecessary size qualifier
    mov ax, [word ptr bx]    ;OK: Same as above with unnecessary size qualifier
                             ;and redundant pointer prefix
    mov al, [bx]             ;OK: Move byte at address BX to AL. Size of
                             ;destination is used to generate proper object code
    mov [bx], al             ;OK: Move AL to location BX

Memory contents
    mov ByteVal, al             ;Warning: "ByteVal" needs brackets
    mov [ByteVal], al           ;OK: Move AL to memory location named "ByteVal"
    mov [ByteVal], ax           ;Error: unmatched operands
    mov al, [bx+2]              ;OK: Move byte from memory location BX+2 to AL
    mov al, bx[2]               ;Error: indexes must occur with "+" as above
    mov bx, Offset ByteVal      ;OK: Offset statement does not use brackets
    mov bx, Offset [ByteVal]    ;Error: offset cannot be taken of the contents of memory

Memory contents
    lea bx, [ByteVal]       ;OK: Load effective address of "ByteVal"
    lea bx, ByteVal         ;Error: brackets required
    mov ax, 01234h          ;OK: Move constant word to AX
    mov [bx], 012h          ;Warning: size qualifier needed to determine
                            ;whether to populate byte or word
    mov [byte bx], 012h     ;OK: constant 012h is moved to byte at address BX
    mov [word bx], 012h     ;OK: constant 012h is moved to word at address BX

Echo entered string


    cseg  segment
          assume cs:cseg, ds:cseg
          org 100h
    start: jmp load
    Buf   db 11, 12 dup (' ')
    _ent  db 10,13,'$'            ;lf,cr
    load: mov ah,0ah              ;read a line into Buf
          mov dx,offset buf
          int 21h
          mov ah,09h              ;print lf,cr
          mov dx,offset _ent
          int 21h
          mov al,[buf+1]          ;number of characters entered
          mov ah,00h
          mov bx,offset buf+2
          add bx,ax
          mov byte ptr [bx],'$'   ;terminate the string for function 09h
          mov ah,09h              ;print the entered string
          mov dx,offset buf+2
          int 21h
          int 20h
    cseg  ends
          end start

Compiling a program
Syntax:
    TASM [options] source [,object] [,listing] [,xref]

    /z             Display source line with error message
    /zi,/zd,/zn    Debug info: zi=full, zd=line numbers only, zn=none

Ex. TASM /zi hello.asm

Creating an executable file


    TLINK objfiles, exefile, mapfile, libfiles, deffile, resfiles

    /v      Full symbolic debug information
    /t      Create COM file (same as /Tdc)
    /Txx    Specify output file type
            /Tdx    DOS image (default); x can be e=EXE or c=COM
            /Twx    Windows image; x can be e=EXE or d=DLL

Ex. Tlink /v /t hello;

NASM

NASM vs. MASM & TASM


NASM is case sensitive.
NASM requires square brackets for memory references.
No offset keyword is needed: a bare name is its value, whether it is an equ constant or an address.
    mov ax, data      ; MASM/TASM: mov ax, offset data
    mov ax, [data]    ; use square brackets to retrieve the content
Everything is treated as a label, rather than as a var, an equ or anything else.
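
The bracket rule can be sketched as a small program. This is an illustration only, assuming a 16-bit .COM file built with nasm -f bin; the names data and five are made up for the example.

```nasm
        org 0x100               ; .COM programs load at offset 100h

start:  mov ax, data            ; AX = address of data (MASM: mov ax, offset data)
        mov ax, [data]          ; AX = word stored at data, here 0x1234
        mov ax, five            ; AX = 5: an equ symbol is just a number
        int 0x20                ; return to DOS

data:   dw 0x1234
five    equ 5
```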

NASM vs. MASM & TASM


NASM does not support hybrid syntaxes such as
    mov ax, table[bx]    ->  mov ax, [table + bx]
Likewise
    mov ax, es:[di]      ->  mov ax, [es:di]
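
A minimal sketch of both NASM forms in context, assuming a .COM program where ES is set equal to DS; the table contents and label names are illustrative.

```nasm
        org 0x100

start:  mov bx, 2               ; byte offset of the second word
        mov ax, [table+bx]      ; MASM would also accept: mov ax, table[bx]
        push ds
        pop es                  ; make ES equal to DS for the next access
        mov di, table
        mov ax, [es:di]         ; MASM spelling: mov ax, es:[di]
        int 0x20                ; return to DOS

table:  dw 10, 20, 30
```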

NASM Doesn't Store Variable Types


NASM, by design, chooses not to remember the types of variables you declare. Whereas MASM will remember, on seeing `var dw 0', that you declared `var' as a word-size variable, and will then be able to fill in the ambiguity in the size of the instruction mov var,2, NASM will deliberately remember nothing about the symbol var except where it begins, and so you must explicitly code mov word [var],2.
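
As a rough illustration of this rule, the following sketch assumes a 16-bit .COM program built with nasm -f bin; the label names are chosen for the example only.

```nasm
        org 0x100

start:  mov word [var], 2       ; size given explicitly: stores 02h 00h
        mov byte [var], 2       ; stores a single byte at var
        mov ax, [var]           ; no qualifier needed: AX implies a word
        ; mov [var], 2          ; rejected by NASM: operation size not specified
        int 0x20                ; return to DOS

var:    dw 0                    ; NASM records only the address of var
```

A size keyword is needed only when neither operand is a register, since a register operand already fixes the size.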

NASM Doesn't Store Variable Types


For this reason, NASM doesn't support the `LODS', `MOVS', `STOS', `SCAS', `CMPS', `INS', or `OUTS' instructions, but only supports the forms such as `LODSB', `MOVSW', and `SCASD', which explicitly specify the size of the components of the strings being manipulated.
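
A short sketch of the sized forms in use, assuming a .COM program (so ES equals DS on entry); the labels src, dst and srclen are illustrative.

```nasm
        org 0x100

start:  cld                     ; string operations count upward
        mov si, src             ; DS:SI = source
        mov di, dst             ; ES:DI = destination
        mov cx, srclen          ; number of bytes to copy
        rep movsb               ; byte form: the size is part of the mnemonic
        mov si, src
        lodsb                   ; AL = first byte of src, SI advances by 1
        int 0x20                ; return to DOS

src:    db 'abcdef'
srclen  equ $-src
dst:    times srclen db 0       ; room for the copy
```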

NASM Doesn't `ASSUME'


As part of NASM's drive for simplicity, it also does not support the ASSUME directive.
NASM will not keep track of what values you choose to put in your segment registers, and will never _automatically_ generate a segment override prefix.
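
In practice this means any override is written by hand. A minimal sketch, assuming a .COM program and a colour text-mode display whose video memory sits at segment 0xB800 (an assumption about the machine, not something the assembler checks):

```nasm
        org 0x100

start:  mov ax, 0xb800          ; assumed colour text-mode video segment
        mov es, ax
        xor di, di
        mov byte [es:di], 'A'   ; override written explicitly inside the brackets
        mov al, [di]            ; no prefix: the default segment DS is used
        int 0x20                ; return to DOS
```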

NASM Doesn't Support Memory Models


NASM also does not have any directives to support different 16-bit memory models. The programmer has to keep track of which functions are supposed to be called with a far call and which with a near call, and is responsible for putting the correct form of RET instruction (`RETN' or `RETF'; NASM accepts `RET' itself as an alternate form for `RETN'); in addition, the programmer is responsible for coding CALL FAR instructions where necessary when calling _external_ functions, and must also keep track of which external variable definitions are far and which are near.
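
The bookkeeping described above might look like the following sketch. It assumes a single-segment .COM program, so the far routine lives in the same segment and the segment half of its pointer is filled in at run time; all label names are illustrative.

```nasm
        org 0x100

start:  call near_fn            ; near call: pushes IP only
        mov [fnptr+2], cs       ; fill in the segment half of the far pointer
        call far [fnptr]        ; far call: pushes CS, then IP
        int 0x20                ; return to DOS

near_fn:
        retn                    ; pops IP
far_fn:
        retf                    ; pops IP, then CS

fnptr:  dw far_fn, 0            ; offset, then segment (set at run time)
```

Pairing a far call with RETN, or a near call with RETF, leaves the stack unbalanced, which is why the programmer has to track the calling convention of every routine.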

Layout of a NASM Source Line


Like most assemblers, each NASM source line contains (unless it is a macro, a preprocessor directive or an assembler directive) some combination of the four fields
    label: instruction operands ; comment

Declaring Initialized Data


DB, DW, DD, DQ and DT are used, much as in MASM, to declare initialized data in the output file. They can be invoked in a wide range of ways:

    db 0x55                 ; just the byte 0x55
    db 0x55,0x56,0x57       ; three bytes in succession
    db 'a',0x55             ; character constants are OK
    db 'hello',13,10,'$'    ; so are string constants
    dw 0x1234               ; 0x34 0x12
    dw 'a'                  ; 0x61 0x00 (it's just a number)
    dw 'ab'                 ; 0x61 0x62 (character constant)
    dw 'abc'                ; 0x61 0x62 0x63 0x00 (string)
    dd 0x12345678           ; 0x78 0x56 0x34 0x12
    dd 1.234567e20          ; floating-point constant
    dq 1.234567e20          ; double-precision float
    dt 1.234567e20          ; extended-precision float

Declaring Uninitialized Data


RESB, RESW, RESD, RESQ and REST are designed to be used in the BSS section of a module: they declare uninitialized storage space. Each takes a single operand, which is the number of bytes, words, doublewords or whatever to reserve. NASM does not support the MASM/TASM syntax of reserving uninitialized space by writing `DW ?' or similar things.
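
A small sketch of these directives, assuming the bin output format and a .COM program; the names buffer, count and table are made up for the example. Because reserved space has no defined contents, the code clears the buffer before using it.

```nasm
        org 0x100

section .text
start:  mov di, buffer          ; ES:DI = buffer (ES = DS in a .COM file)
        mov cx, 64
        xor al, al
        cld
        rep stosb               ; zero the 64 reserved bytes
        int 0x20                ; return to DOS

section .bss
buffer: resb 64                 ; 64 bytes        (MASM: db 64 dup (?))
count:  resw 1                  ; one word        (MASM: dw ?)
table:  resd 16                 ; 16 doublewords  (MASM: dd 16 dup (?))
```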

Defining Constants
EQU defines a symbol to a given constant value: when EQU is used, the source line must contain a label. The action of EQU is to define the given label name to the value of its (only) operand. This definition is absolute, and cannot change later. So, for example,

    message db  'hello, world'
    msglen  equ $-message
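
One way such a length constant might be used is sketched below, assuming a DOS .COM program that writes the string with INT 21h function 40h (write to file handle).

```nasm
        org 0x100

start:  mov ah, 0x40            ; DOS function 40h: write to file handle
        mov bx, 1               ; handle 1 = standard output
        mov cx, msglen          ; byte count, fixed at assembly time
        mov dx, message         ; DS:DX = address of the data
        int 0x21
        int 0x20                ; return to DOS

message db  'hello, world'
msglen  equ $-message           ; 12, the length of the string
```

If the string is edited later, msglen follows automatically, so no '$' terminator or hand-counted length is needed.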

Repeating Instructions or Data


The TIMES prefix causes the instruction to be assembled multiple times. This is partly present as NASM's equivalent of the DUP syntax supported by MASM-compatible assemblers, in that you can code

    zerobuf: times 64 db 0
             times 100 movsb    ; trivial unrolled loops
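
The count given to TIMES can also be an expression, which DUP-style syntax does not offer as directly. A brief sketch, with illustrative names, that pads a string to a fixed width:

```nasm
buffer: db 'hello, world'
        times 64-($-buffer) db ' '  ; pad with spaces up to 64 bytes in total
buflen  equ $-buffer                ; 64
```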

Effective Addresses
An effective address is any operand to an instruction which references memory. Effective addresses, in NASM, have a very simple syntax: they consist of an expression evaluating to the desired address, enclosed in square brackets. For example:

    wordvar dw 123
            mov ax,[wordvar]
            mov ax,[wordvar+1]
            mov ax,[es:wordvar+bx]

Numeric Constants
A numeric constant is simply a number. NASM allows you to specify numbers in a variety of number bases, in a variety of ways: you can suffix H, Q or O, and B for hex, octal and binary, or prefix 0x or $ for hex in the style of C and Pascal.
Note that a hex number prefixed with a $ sign must have a digit after the $ rather than a letter.

Ex. Numeric Constants


    mov ax,100          ; decimal
    mov ax,0a2h         ; hex
    mov ax,$0a2         ; hex again: the 0 is required
    mov ax,0xa2         ; hex yet again
    mov ax,777q         ; octal
    mov ax,777o         ; octal again
    mov ax,10010011b    ; binary

Echo entered string


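            org 0x100
    start:  jmp load
    buf:    db 11
            resb 12             ;reserve 12 bytes
    _ent:   db 10, 13, '$'
    load:   mov ah,0ah          ;read a line into buf
            mov dx,buf
            int 21h
            mov ah,$09          ;print lf,cr
            mov dx,_ent
            int 21h
            mov al,[buf+1]      ;number of characters entered
            mov ah,0x00
            mov bx,buf+2
            add bx,ax
            mov byte [bx],'$'   ;terminate the string for function 09h
            mov ah,09h          ;print the entered string
            mov dx,buf+2
            int 21h
            int 20h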

How to NASM
    nasm -f bin program.asm -o program.com
    nasm -f bin driver.asm -odriver.sys
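
To try the first command, a source file along these lines could be saved as program.asm. The contents are a hypothetical example, not taken from the slides, and assume the result is run under DOS.

```nasm
; program.asm: print a message and exit
        org 0x100               ; .COM programs load at offset 100h

        mov ah, 0x09            ; DOS function 09h: print '$'-terminated string
        mov dx, msg             ; DS:DX = address of the string
        int 0x21
        int 0x20                ; return to DOS

msg:    db 'hello world', 13, 10, '$'
```

With -f bin, NASM writes the raw bytes with no headers or relocation, which is the layout DOS expects for a .COM file, so no linker step is involved.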

Q&A

That's it for now.