Introduction

I was fascinated by computers from the moment I first got to lay my hands on one. I
learned BASIC programming in school when I was in grade eight. Those were the days of
the 80286 with BASICA on MS-DOS. I went on to complete my degree in Medicine. After
that my interest in computers returned when I got the opportunity to develop a small
database system for the hospital where I was working. I went on to learn C++, Visual
Basic, and ultimately Assembler. I always had a fantasy of developing my own operating
system and started reading more and more about it. I ultimately set out to develop a
small operating system that would help me understand the PC better. I then realised how
difficult it had been to get the help I needed, and so decided to write the OS as a
tutorial as well. I am not sure if the OS will ultimately be completed, but I am sure
the reader will be able to complete his/her own version while working through it.
Developing the system (which is still in progress) was great fun and I am sure the
reader will enjoy developing his/her own system.

Pre-requisites
A basic working knowledge of Assembler, together with an unrelenting desire to explore,
learn more and develop something new, is needed to start the project. My own knowledge
of the PC is derived from books ('Inside the IBM PC' by Peter Norton is an excellent
text and I would suggest that readers get a copy and read it cover to cover), as a
result of which the expertise in this project is limited. A copy of the excellent and
free NASM (Netwide Assembler) is needed for developing the system. You can download
your free copy from the NASM website. All development is done on the Windows operating
system. It is also possible to do the same under a Unix/Linux system. Differences will
be pointed out whenever needed.

Design considerations
JOSH, as I named it to signify a sense of excitement/satisfaction, will be a real-mode
operating system to make learning faster and easier (and my own understanding of the
protected mode has only just started). Like MS-DOS it will be interrupt driven. JOSH
will be a single tasking operating system.

Operating system basics


This discussion of OS basics should not be considered an expert discussion; for more
help you are advised to refer to books on OS development ('Operating Systems: Design
and Implementation' by Andrew S. Tanenbaum is an excellent book). A general
understanding of how applications execute, of the registers and their functions, of the
stack and so on is assumed, and this tutorial is not the place to look for such
information. Now let us dive right in to develop JOSH.

The Boot process
The bootup process is what happens when we power up the PC. When we power up the PC,
the microprocessor is put into a reset state and its registers are set to fixed initial
values. On the original 8086 the code segment register is loaded with 0xFFFF and the
instruction pointer with 0x0000, so the first instruction is fetched from physical
address 0xFFFF0, 16 bytes below the top of the first megabyte (later processors start
from an equivalent location in the system ROM). Taking this fact into consideration, the
basic software called the BIOS (Basic Input Output System) is placed at that location,
so the BIOS executes first. The BIOS runs a check of memory and of connected devices
such as serial ports and, after completing these system checks, searches for the
operating system, loads it and executes it. By making changes in the BIOS setup (we can
enter the BIOS settings by hitting the relevant key during the bootup process) we can
make the BIOS look for the OS on the floppy disk, hard disk, CD-ROM etc. in any order
we want. We will have to make the BIOS look at the floppy first when we start
developing JOSH.

The BIOS will not load the complete operating system. It will just load a fragment of
code present in the first sector (also called the boot sector) of the floppy disk (we
will keep all discussion to the floppy disk as JOSH will not touch the hard disk for
now). This fragment of code is 512 bytes long (the sector size of a DOS formatted
floppy) and its last two bytes must hold the word 0xAA55 (also called the boot
signature), which is stored on disk as the bytes 0x55 and 0xAA in that order. If the
boot signature is absent, the floppy disk is not considered bootable and the BIOS will
search the next disk for a valid boot signature and the OS. This tiny fragment of code
that has to be present in the boot sector is called the boot-strap loader. The BIOS will
load the boot-strap loader into memory starting at physical address 0x7C00 and execute
it. Remember that a segment always starts on a paragraph (16-byte) boundary and that
the physical address is the segment value multiplied by 16 plus the offset; the lowest
hexadecimal digit of 0x7C00 is therefore dropped, and the value becomes 0x07C0 when we
move it to a segment register. It is now up to the boot-strap loader to load the
operating system and make the necessary settings for its correct functioning. This is
the boot process and the reader is advised to understand it well before continuing.
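
The segment arithmetic above can be written down in a few lines of NASM. This is only a
sketch to illustrate the calculation: the names BOOT_PHYS and BOOT_SEG are made up for
this example and are not part of the JOSH source.

```nasm
; Sketch: how a segment value maps to a physical address in real mode.
; physical address = segment * 16 + offset
bits 16
org 0x0000

BOOT_PHYS equ 0x7C00            ; where the BIOS places the boot sector
BOOT_SEG  equ BOOT_PHYS >> 4    ; 0x07C0, the value that goes into a segment register

    mov ax, BOOT_SEG
    mov ds, ax                  ; DS = 0x07C0
    mov al, [0x0000]            ; reads physical 0x07C0 * 16 + 0x0000 = 0x7C00
```

With DS set this way, offset 0x0000 refers to the first byte of the boot sector.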

Considerations before we write our boot-strap loader


At the end of the BIOS boot process and just before loading the boot-strap loader the
system memory map will look as follows (all addresses below are segment values, so
multiply them by 16 to get the physical address):

1. Memory block 0x0000 to 0x0040 (1-KB) will have a list of addresses called the
Interrupt Service Routine Vectors (more about ISR vectors later). We should not touch
this area unless we know what we are doing.

2. Memory block 0x0040 to 0x0100 (the area does not actually extend up to 0x0100, but I
have picked this limit for safety reasons and to use part of it at a later time) is
called the BIOS data area, and it is here that the BIOS saves data about the memory
available, devices connected etc. We should not touch this area unless we know what we
are doing.

3. Memory block 0x0100 to 0x07C0 is free and is available for the OS and applications.

4. Memory block 0x07C0 to 0x07E0 is where the boot-strap loader is loaded by the
BIOS. We should not touch this area from the boot-strap loader, but we can overwrite
this area once the OS is loaded and is under control.

5. Memory block 0x07E0 to 0xA000 is free and is available for the OS and applications.

6. Memory block 0xA000 to 0xC000 is set aside for the video sub-system. We should not
touch this area unless we want to write/draw to the screen directly bypassing the BIOS
routines for the same. There are reasons for that to happen and the process will be
discussed when needed.

7. Memory block 0xC000 to 0xF000 is reserved for ROM, such as the video BIOS and other
adapter ROMs, and we cannot write to that area under normal conditions.

8. Memory block 0xF000 to 0xFFFF is the base system ROM (the top of which is where the
first instruction is present on a re-boot).

This is the condition of the memory layout during loading of the boot-strap loader.
Kindly make a drawing of the memory map to understand it better. We will choose to
load JOSH at memory location 0x0100, just above the BIOS data area.

Assembling our first boot-strap loader


In computing, re-inventing what is already available is considered a waste of resources.
Hence it is strongly advised that if resources are freely available and are good, we
should not try to do them all over again. JOSH is developed as an educational utility
and hence you are advised to use existing solutions so that you can concentrate on
understanding the system.

We can write a boot-strap loader that loads the OS from raw sectors of the floppy disk
into RAM and executes it, but then we would need to keep an eye on the OS size so that
the correct number of sectors is loaded. This also has the disadvantage that we will
need perfect floppy disks without bad sectors (as code to look for bad sectors and
avoid them is too much of a task at this stage). To solve these problems we can write a
boot-strap loader that understands the FAT12 structure and loads a file from the FAT
file system into memory. Since we want to concentrate on the development of the OS
itself, we will use readily available code to load JOSH into memory. The 'ultimate boot
loader' written by Matthew Vea (thanks Matthew for this wonderful bootloader. The
original code can be reached here) is one such piece of code that I ran across on a web
search. Download boot-code. Most of the code is concerned with reading the FAT and
getting the file into memory. We will not go into the details now (we will write our
own custom boot-strap loader at a later date when we are happy with our OS).

Description of the Boot-strap Loader source
Lines 31-41: The BIOS doesn't bother to set up the DS, ES, FS and SS segment registers
for you. We have to initialize them ourselves to 0x07C0 (remember that 0x7C00 becomes
0x07C0, with the lowest digit dropped, as all segments start on a paragraph boundary).
When we change the stack segment register we have to disable interrupts: an interrupt
arriving while the stack registers are only half updated would push data onto an
invalid stack, which can result in loss of data and even in crashing of the system.

Lines 96-98: The two addresses (0x0100 and 0x0000) form the CS:IP pair where JOSH will
be loaded. We can change this to any value of our choice (except the ROM area of
course), and when we execute JOSH it is to these values that CS and IP should be set.

Lines 132-133: The two values pushed onto the stack here should be the address where
the OS was loaded into memory (see the above point). They are pushed in the order CS,
then IP. JOSH is executed using the RETF instruction, which pops IP and then CS from
the stack, in that order, and continues execution at the new CS:IP. By pushing 0x0100
and 0x0000 onto the stack just before the RETF instruction we make it jump to
0100:0000. This is one useful trick we should understand for loading applications and
executing them.
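
Taken on its own, the trick looks roughly like the sketch below. It is an illustration
of the idea rather than a copy of the downloaded loader, and it assumes a 186 or later
processor, since the original 8086 cannot push an immediate value.

```nasm
; Sketch: a far jump done with RETF.
bits 16

    push word 0x0100    ; segment: pushed first, so it is popped last, into CS
    push word 0x0000    ; offset: pushed last, so it is popped first, into IP
    retf                ; execution continues at 0100:0000
```

When the target address is a constant, as it is here, a direct far jump written as
'jmp 0x0100:0x0000' has the same effect. The push/RETF form is handy when the segment
and offset are only known at run time.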

Line 235: The constant 'ImageName' holds the name of the OS file. The filename is in
DOS 8.3 format as stored in the FAT directory: the first part is padded with spaces to
8 characters if it is shorter, the extension follows without the dot, and all
characters are in upper case. The 'kernel' is the part of the operating system that
resides in memory and provides all the system functions like loading an application
etc. Usually a separate application called the 'shell' will be running, which will
receive user commands and interpret and act on them. To start with, JOSH will have a
single file which will be both the kernel and the shell. At a later time we will
separate the 'shell' from the 'kernel'. As of now the name of the OS image will be
'KERNEL.BIN', which is entered as 'KERNEL  BIN' (two spaces, 11 characters in all) in
the ImageName constant. Change it if you choose to have a different name (and make sure
you give the OS executable that name).
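
As an illustration of the padding rule, such a constant could be declared as in the
sketch below. The second name is hypothetical and only shows how a shorter file name
would be padded; the exact declaration in the downloaded source may differ.

```nasm
; Sketch: 8.3 names as stored in a FAT directory entry (always 11 bytes, no dot).
ImageName db "KERNEL  BIN"   ; KERNEL.BIN: 6 letters + 2 spaces, then the extension
AltName   db "JOSH    BIN"   ; hypothetical JOSH.BIN: 4 letters + 4 spaces, then BIN
```

The loader compares these 11 bytes with the directory entries on the floppy, so a
missing or extra space means the file will not be found.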

Line 241: This directive tells NASM to pad the final executable with zeros to a length
of 510 bytes. We should remember that the boot-strap loader should be exactly 512 bytes
long. We pad only up to 510 bytes so that we can add the boot signature as the 511th
and 512th bytes.

Line 242: This line stores 0xAA55 as the last two bytes to make a valid boot signature.
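
In NASM this padding is commonly written with the TIMES prefix. The sketch below shows
the idiom in a file that assembles to exactly 512 bytes. It is not a quotation of the
downloaded source, and the endless loop only stands in for the real loader code.

```nasm
; Sketch: padding a boot sector to 510 bytes and appending the signature.
bits 16
org 0x0000

start:
    jmp start               ; placeholder for the real loader code

times 510-($-$$) db 0       ; $-$$ is the size so far; fill with zeros up to byte 510
dw 0xAA55                   ; boot signature, stored on disk as 0x55 then 0xAA
```

If the code grows beyond 510 bytes, the TIMES count becomes negative and NASM reports
an error, which is a useful check on the size of the loader.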

Assembling our boot-strap loader


To assemble our boot-strap loader, we have to setup NASM successfully. Download the
NASMW.EXE file to a convenient directory (I use "C:\NASM") as we have to work with
the command prompt pretty often. NASM doesn't need special setups and if you plan to
operate from a different directory from the NASM home directory, you should add the
NASM home directory to the PATH of the system. You can do this by adding
";C:\NASM" or the actual location (watch out for the ';' before the path if more than one
path is already set) of the NASM executables to the end of the "SET PATH=******" line
of the Autoexec.bat file in the root directory of the C: drive. You can get more help about
setting the PATH from the web.

Copy the 'boot.asm' file to the NASM directory (or other convenient directory if you have
setup the correct PATH variable). Open the command prompt and change to the
directory. Type the following command:

NASMW BOOT.ASM -o BOOT.BIN -f bin

'BOOT.ASM' is the name of the assembler source file. The -o flag instructs the assembler
to create the output file as 'boot.bin'; you can change the file name here if you want
the executable to have a different name. If you don't supply the -o flag and a name,
the executable is created with the name of the source file without an extension. The -f
flag selects the output file format. NASM is a very capable assembler and supports a
variety of object and executable file formats. The "-f bin" option instructs the
assembler to produce raw binary output. This is what we want while assembling our
boot-strap loader.

NASM gives error messages with line numbers so you should be able to get to the line
that caused the error and rectify it.

Now that the boot-strap loader is ready we have to load it onto the boot-sector of a floppy
disk to make it useful. You have all the tools needed with your Windows system. Fire up
your console and change to the directory where "boot.bin" resides. Now start Debug.exe
by typing 'debug' with the boot image name as a parameter. Debug is a console based
low-level tool to view memory locations, assemble small snippets etc. You may have to
supply the full path of the WINDOWS\COMMAND directory if it is not included in your
system PATH. You will get the Debug prompt of '-' as follows:

C:\NASM>Debug boot.bin

Now insert a clean DOS formatted floppy into the first floppy drive and issue the
following command at the debug prompt:

-W 100 0 0 1

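This writes the data that Debug loaded at offset 0x100 to drive 0 (the A: drive),
starting at sector 0, for a count of 1 sector. Wait till the light on the floppy drive
goes off. Now issue 'Q' at the debug prompt to exit debug:
-Q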

The whole process will look like this:

C:\NASM>Debug boot.bin

-W 100 0 0 1

-Q

C:\>

Unix/Linux users can copy the boot image to the boot sector of the floppy disk using the
following command

dd if=boot.bin bs=512 count=1 of=/dev/fd0

That's it! Your boot-strap loader is ready for action. This is the JOSH boot floppy.
What remains now is to develop a small kernel/shell to be loaded and executed. That is
the next part.

Considerations before we write our kernel/shell


The kernel is the core of the OS performing most of the house-keeping work and
providing the applications with a good set of functionality. The kernel is generally never
unloaded from memory. The kernel (or its functionality) should be available always and
applications should have a way of interacting with the kernel in a seamless manner. There
are many ways to implement this and in real mode interrupts provide a seamless and fast
way to provide the functionality. JOSH will provide all the functionality to applications
through interrupts. We will discuss interrupts when they are needed.

Writing and assembling a rudimentary kernel


We will start out to write a rudimentary kernel, which will display a welcome message,
wait for a keypress, and will reboot. We will use BIOS interrupts extensively for display
until we develop our own routines to display text/graphics using the VGA RAM.

NASM uses three segments: '.text' for code, '.data' for initialized data (such as
constants that never change), and '.bss' for un-initialized data (such as variables that
will be filled in later). Our code starts at offset zero of the file (unlike DOS .com
executables, which are loaded at offset 0x100 because the first 0x100 bytes of the
segment hold a header set up by the program loader). To tell this to the assembler you
have to use 'org 0x0000', which makes it calculate all addresses on the assumption that
the first byte of the executable is loaded at offset 0x0000. The 'bits 16' directive
tells the assembler to assemble 16 bit real mode code. The skeleton kernel without any
code will look as follows:

;*********************start of the kernel code***********
[org 0x0000]
[bits 16]

[SEGMENT .text]

[SEGMENT .data]

[SEGMENT .bss]

;*********************end of the kernel code*************

We first have to write a function to display a string so that we can call the function as
needed. Let's call the function "_disp_str". The function should be under the '.text'
segment.

The function is as follows:

[SEGMENT .text]
_disp_str:
lodsb ; load next character
or al, al ; test for NUL character
jz .DONE
mov ah, 0x0E ; BIOS teletype
mov bh, 0x00 ; display page 0
mov bl, 0x07 ; text attribute
int 0x10 ; invoke BIOS
jmp _disp_str
.DONE:
ret

'lodsb' loads a character from the DS:SI memory location into the AL register. Check if it
is the null character (we will terminate all strings with the null character 0x00). If yes
jump to the end of the function and return. Else, display the character in AL using the
BIOS teletype display service. We use the teletype service so that it will automatically
respond to 'carriage return', 'bell', 'backspace' etc. With every displayed character the loop
loads the next character to the AL register until it is a null character. Before the function
is called, the location of the string to be displayed has to be correctly initialized to the
DS:SI register pair.

Next initialize the string with the message (note the trailing 0x00 null character at the end
of the string) in the '.data' segment as follows:

[SEGMENT .data]
strWelcomeMsg db "Welcome to JOSH!", 0x00

As already discussed with the boot-strap loader code we have to initialize the segment
registers before any other code. The initialization code is as follows:

[SEGMENT .text]
mov ax, 0x0100 ;location where kernel is loaded
mov ds, ax
mov es, ax

cli
mov ss, ax ;stack segment
mov sp, 0xFFFF ;stack pointer at 64k limit
sti

_disp_str:
...

After initializing the data, stack and extra segments to the same value as the code
segment, the stack pointer is set to 0xFFFF, the top of the 64K segment. This gives room
for a big stack and should be more than enough for the time being.

Now we should make a call to the display function to display the welcome message. To
do that, load the SI register with the address of the welcome string and call the
function. NASM gives the address of a variable when its name is used without square
brackets around it, and the contents of the variable when the name is enclosed in
square brackets (this is different from what MASM users will expect). The code is as
follows:

...
sti

mov si, strWelcomeMsg ;load message
call _disp_str
...

Next terminate the kernel after waiting for a keypress as follows:

...
call _disp_str

mov ah, 0x00
int 0x16 ; await keypress using BIOS
int 0x19 ; reboot
...

That's the code of the rudimentary kernel, which should now be able to print a welcome
message to the screen, wait for a key press and reboot. The complete code for the
rudimentary kernel can be downloaded here.

Assemble the kernel with the following command:

nasmw kernel.asm -o kernel.bin -f bin


This should produce a small 'kernel.bin' file in the default directory. Copy this file to the
JOSH boot floppy you just created. Insert the floppy into the PC and reboot the system.
You should see JOSH being loaded, print a welcome message, wait for a key press and
reboot. Remove the floppy to reboot normally.

If you have come this far you can give yourself a pat on the back, as you have made
quite some progress in developing your own OS. The journey has just started and it is
going to be even more fun. Move on to the next chapter.

Considerations before we write our first interrupt service

An interrupt is an operation that stops execution of the user program so that the
system can pay attention to the event causing it. For example, suppose you are playing
some music with an application like a media player. You know that the application is
busy converting the file into sound. Now you want to stop playing and you hit the '.'
key. What happens? The player application stops for a moment (a very brief moment that
does not produce an audible pause in the music), analyses the key pressed, senses that
you wanted to stop the music, and finally stops the music from playing. If you had
pressed any other key the same thing would have happened, except for the music stopping
in the end. Interrupts are so important that PCs would not function normally if
interrupts were not working.

How do Interrupts work?


Interrupts have numbers, and there can be up to 256 different interrupts. When an
interrupt occurs (like a keypress or a mouse click), the running application is
stopped, the flags, CS and IP are pushed onto the stack in that order, and the routine
that has to handle the interrupting event is executed. The locations of all the
interrupt handling routines are kept at the beginning of memory (0000:0000) in what is
called the Interrupt Service Routine table (also known as the interrupt vector table).
Each entry is four bytes long, so the entry for an interrupt is found by multiplying the
interrupt number by four. The address 0000:(interrupt no. * 4) holds the IP (two bytes)
of the routine and the next two bytes hold its CS. These values are moved to CS:IP and
the routine is executed. The interrupt routine ends with an IRET instruction, which
pops the IP, CS and flags that were pushed when the interrupt occurred, and execution
of the interrupted application continues.
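
As a small illustration of the lookup described above, the following sketch reads the
table entry for interrupt 0x10 (the BIOS video service used throughout this tutorial).
The choice of registers is arbitrary and only for illustration.

```nasm
; Sketch: reading the handler address of interrupt 0x10 from the table.
bits 16

    xor ax, ax
    mov es, ax              ; ES = 0x0000, the segment of the table
    mov bx, 0x10 * 4        ; the entry for interrupt 0x10 starts at offset 0x40
    mov ax, [es:bx]         ; AX = IP (offset) of the handler
    mov dx, [es:bx + 2]     ; DX = CS (segment) of the handler
```

After these instructions DX:AX holds the far address that the processor would jump to
on an 'int 0x10'.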

Our first interrupt service!


We have to make a few decisions before we design our first interrupt service. Some of
them are as below:

1. What will the interrupt number be?

2. How many sub-services will we support?

3. How will we know a sub-service - what register to use?

We will decide to set up interrupt 0x21 (DOS uses the same number for a lot of
services). We will use the 'AL' register to differentiate between sub-services. This
lets us support up to 256 sub-services, so we needn't worry about the upper boundary
now!

Firstly we will convert the function displaying 'zero' terminated strings (_disp_str)
into an interrupt service and use it to display the welcome message. So service 0x01 of
interrupt 0x21 will display a string up to the 'zero' terminator, and the string should
be pointed to by the 'SI' register. Let us start with a skeleton for the interrupt
routine, which will look as follows:

_int0x21:
iret

This does nothing but return from the interrupt as soon as it enters. Now we have to
identify what service was actually wanted by checking the contents of the 'AL' register.
Let us do it as follows:

_int0x21:
_int0x21_ser0x01: ;service 0x01
cmp al, 0x01 ;see if service 0x01 wanted
jne _int0x21_end ;goto next check (now it is end)

_int0x21_ser0x01_start:
lodsb ; load next character
or al, al ; test for NUL character
jz _int0x21_ser0x01_end
mov ah, 0x0E ; BIOS teletype
mov bh, 0x00 ; display page 0
mov bl, 0x07 ; text attribute
int 0x10 ; invoke BIOS
jmp _int0x21_ser0x01_start
_int0x21_ser0x01_end:
jmp _int0x21_end

_int0x21_end:
iret

Now let us examine the code in detail. Each sub-service will start with an identifier in the
form of _int0x21_ser0x01, the first part being the interrupt number and the second part
being the service number. At the beginning of each part a comparison is done of the 'AL'
register against the sub-service number. If the numbers do not match (it means it is a
different sub-service that was requested), execution should continue to the next sub-
service (or to the end of the interrupt service if it is the last service). As of now since
service 0x01 will be the only one, if the service doesn't match, the routine should end.
Otherwise the code is the same (except for some label changes) that was used by the
_disp_str routine. Now, the interrupt routine above should completely replace the
_disp_str routine from the original kernel code.
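
To show how the chain of checks would grow, here is a sketch with a second sub-service
added. Service 0x02, which prints the single character passed in 'DL', is purely
hypothetical and is not part of JOSH; it only illustrates how a failed comparison falls
through to the next service label instead of to the end.

```nasm
; Sketch: dispatching on AL with two sub-services (0x02 is hypothetical).
bits 16

_int0x21:
_int0x21_ser0x01:               ;service 0x01: display string at DS:SI
    cmp al, 0x01
    jne _int0x21_ser0x02        ;not ours, try the next service
    ; ... body of service 0x01 as listed above ...
    jmp _int0x21_end

_int0x21_ser0x02:               ;hypothetical service 0x02: display char in DL
    cmp al, 0x02
    jne _int0x21_end            ;last service, nothing matched
    mov al, dl
    mov ah, 0x0E                ; BIOS teletype
    mov bh, 0x00                ; display page 0
    mov bl, 0x07                ; text attribute
    int 0x10
    jmp _int0x21_end

_int0x21_end:
    iret
```

Only the jump target of the first comparison changes when a service is appended, which
is why the labels carry both the interrupt and the service number.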

Now that we have written the interrupt handler, we have to write the location of the
handler to the Interrupt Table for it to work. The routine looks like this and should be
inserted after the initiation block of the kernel code that sets up the stack:

...
push dx
push es
xor ax, ax
mov es, ax
cli
mov word [es:0x21*4], _int0x21 ; setup our OS services
mov [es:0x21*4+2], cs
sti
pop es
pop dx
...

Let us examine the code in detail. We save the contents of the 'DX' and the 'ES'
registers. Then we zero the 'AX' register and move its contents to the 'ES' register
(XOR is just a fast way to do it and anything else will also work). Then we disable
interrupts as we are meddling with the interrupt table itself. The first 'MOV'
instruction writes the offset of the start of the interrupt routine to the es:0x21*4
location (remember that a label points to the next instruction in NASM and please read
how NASM calculates memory addresses). The second 'MOV' instruction writes the CS value
to the next two bytes of the table entry (remember that the current CS value is the
correct value for the interrupt). Next we enable interrupts and pop the saved values in
reverse order. That is what it takes to write a small interrupt handler and set up the
interrupt table entry!

One more thing has to be done. The call that displays the welcome message has to be
changed to the following:

mov si, strWelcomeMsg ;load message
mov al, 0x01 ;request sub-service 0x01
int 0x21

The complete kernel code can be downloaded here.

Finally you have to assemble the code and copy the resultant KERNEL.BIN file to the
bootable floppy you already created. Go ahead and try, you will see JOSH booting as it
did before, only that it is a new animal with its own interrupt handler.

Shell/Command Interpreter
The shell or the command interpreter (we will hereafter refer to this as the shell) is the
routine, which communicates with the user in a console based operating system like
JOSH. The shell displays a prompt, gets user input, analyses it and performs the
necessary action. Usually there are commands that the shell understands itself and
provides the user with the necessary result - these commands are called internal
commands. Sometimes the shell doesn't understand the command, and usually in this
situation it tries to find a file with the command name and loads it for execution, if it is
executable.

The shell is the interface between the operating system and the user and hence it is
very important. The intuitiveness and usefulness of the shell determine whether the OS
is liked or hated by the user. The design of a shell takes into consideration many
factors:

1. Should the shell be part of the kernel or should it be another application that the kernel
loads automatically? - This is an important consideration as shell being part of the kernel
will usually retain that space in memory throughout, resulting in reduced memory
availability for user applications. The other option is better as the shell can be overwritten
by the user application and re-loaded when the user application terminates. We will use
the former approach as it is easy to code and we are aiming to provide only minimal
functionality with the shell, which wouldn't take much space.

2. Length of the commands the shell can receive - We will set this limit at 255
characters. Anything typed beyond that will be ignored.

3. How many operands will a typical command have? - The operands can modify the
command and sometimes will be parameters to the command. We will decide that the
JOSH shell will accept a command with 4 operands and anything more will be discarded.
The command and operands will be on a single string of characters and will be separated
by spaces. The shell can identify leading/trailing/multiple-intermittent spaces and will
remove them while deriving the command and the operands.

4. What internal commands will the shell support? - We will not define this at this
moment. We will add internal commands as they are needed.
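
As a first taste of the space handling described in point 3, a helper that skips
leading spaces could look like the sketch below. The name '_skip_spaces' and its
register usage are assumptions made for this example and may not match the routine
JOSH eventually uses.

```nasm
; Sketch: hypothetical helper to skip leading spaces in a command string.
; in:  DS:SI -> zero-terminated string
; out: DS:SI -> first character that is not a space
bits 16

_skip_spaces:
    cmp byte [si], ' '      ; is the current character a space?
    jne .done               ; no (this includes the 0x00 terminator), stop here
    inc si                  ; yes, step over it
    jmp _skip_spaces
.done:
    ret
```

Calling it before reading the command word and again before each operand takes care of
leading and repeated spaces without copying the string.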

Having said so much let us delve into the design of the JOSH shell. Move on to the next
chapter.

Shell implementation
The implementation of the shell will be done in small steps that do meaningful work. As
an aside, I prefer to develop small applications to test routines so that development can be
done fast without the need to reboot the system often (I assume that most users will have
access to one PC to do all the development/testing). I develop small .com binaries that I
test immediately and once I am happy with the working of the routine, I move it to the
JOSH source to finally test it.
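
A test harness of the kind mentioned above could look like the following sketch. The
file name and the label '_routine_under_test' are placeholders for this example; the
stand-in routine simply prints the letter 'A' using the BIOS.

```nasm
; test.asm - hypothetical harness, assembled with:
;   nasmw test.asm -o test.com -f bin
bits 16
org 0x0100                  ; DOS loads .com programs at offset 0x0100

    call _routine_under_test
    mov ax, 0x4C00          ; DOS service: terminate with return code 0
    int 0x21

_routine_under_test:        ; stand-in routine: print 'A' with the BIOS teletype
    mov ah, 0x0E
    mov al, 'A'
    mov bh, 0x00            ; display page 0
    mov bl, 0x07            ; text attribute
    int 0x10
    ret
```

Note that under DOS interrupt 0x21 belongs to DOS, so this approach suits routines that
use only BIOS calls; routines that depend on the JOSH interrupt 0x21 services still
have to be tested from the boot floppy.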

First we will write small routines that will do some screen oriented chores like
displaying a space, moving to the beginning of the next line etc. Let us start with a
small function to display a space and advance the cursor. This could be done in the
required routine itself every time we need it, but that is time consuming. It is easier
to write small functions and call them when needed. The function looks like this:

_display_space:
mov ah, 0x0E ; BIOS teletype
mov al, 0x20 ; space character
mov bh, 0x00 ; display page 0
mov bl, 0x07 ; text attribute
int 0x10 ; invoke BIOS
ret

The code is rather self explanatory and you should have no difficulty understanding it.
These functions should go to the end of the code section after the interrupt handler (it is
only a matter of taste, you can place it anywhere in the code section). The next function
will move the cursor to the beginning of the next line. The code is:

_display_endl:
mov ah, 0x0E ; BIOS teletype acts on carriage return!
mov al, 0x0D
mov bh, 0x00
mov bl, 0x07
int 0x10
mov ah, 0x0E ; BIOS teletype acts on linefeed!
mov al, 0x0A
mov bh, 0x00
mov bl, 0x07
int 0x10
ret

The code is similar to the previous function but displays two characters in succession,
the 'carriage return' (0x0D) and the 'linefeed' (0x0A) characters. The carriage return
moves the cursor to the beginning of the same line and the linefeed moves the cursor
down to the next line. The BIOS has many interrupt sub-services to display characters;
this one displays the character and also acts on it by moving the cursor accordingly
(for example, if you try to display the backspace character, the cursor will actually
move one space back, but only up to the beginning of the line, and it will not erase
existing displayed characters as it moves!).

The next function we develop will display the shell prompt! To do this we have to decide
how our shell prompt is going to look. The JOSH shell prompt is going to look like
'JOSH>>' and if you don't like it you can always change it to your liking. Before
displaying the prompt, we have to create a string variable to hold the prompt, and
changing the contents of this variable will change the prompt. Let us do it like this:

[SEGMENT .data]
strWelcomeMsg db "Welcome to JOSH!", 0x00
strPrompt db "JOSH>>", 0x00
...

This adds a variable called 'strPrompt' made of a string of bytes having 'JOSH>>' in it as
well as a terminal 0x00. Now we will add a function to display the string.

_display_prompt:
mov si, strPrompt ;load message
mov al, 0x01 ;request sub-service 0x01
int 0x21
ret

The code again is self explanatory and you should have no difficulty understanding it.
Now we have to create a buffer to hold the command that the user will type in. As we
have already decided, a user command can be 255 chars long, so our buffer should be 256
chars long to accommodate a terminal 0x00. We can declare un-initialized data in NASM
in the .bss section at the end of the code. We also need the ability to change the
length of the commands later, so we will not hard-code the length in the concerned
routine but rather will declare a variable called 'cmdMaxLen' and initialize it to the
maximum length desired. We also need a counter to keep track of the number of
characters entered in the command. This will be called 'cmdChrCnt'. The declaration
section looks like this:

[SEGMENT .data]
strWelcomeMsg db "Welcome to JOSH!", 0x00
strPrompt db "JOSH>>", 0x00
cmdMaxLen db 255 ;maximum length of commands

[SEGMENT .bss]
strUserCmd resb 256 ;buffer for user commands
cmdChrCnt resb 1 ;count of characters

Now that we have the necessary helper functions and the data variables in place, we can
delve into the actual function that will accept user input, which is called
'_get_command'. This function will display the cursor and wait for keyboard input. When
a key is struck, it analyses the key pressed. If the key is an extended key (F1, HOME
etc.) it will do nothing and will continue to wait. If the key pressed is ENTER, it
will terminate the buffer with 0x00 and return. If the key pressed is a backspace, it
will move the buffer pointer one character back, if not already at the beginning, and
will also move one space back on the screen. If the key pressed is a character key, it
will be added to the buffer if the buffer is not yet 255 characters long. It is a
pretty long listing, so take the time to understand it fully. The code that stores each
character can be re-written using the more compact STOSB instruction, which I leave to
you. The listing is as follows:

...
_int0x21_end:
iret
_get_command:
;initialise count
mov BYTE [cmdChrCnt], 0x00
mov di, strUserCmd

_get_cmd_start:
mov ah, 0x10 ;get character
int 0x16

cmp al, 0x00 ;check if extended key
je _extended_key
cmp al, 0xE0 ;check if new extended key
je _extended_key

cmp al, 0x08 ;check if backspace pressed
je _backspace_key

cmp al, 0x0D ;check if Enter pressed
je _enter_key

mov bh, [cmdMaxLen] ;check if maxlen reached
mov bl, [cmdChrCnt]
cmp bh, bl
je _get_cmd_start

;add char to buffer, display it and start again
mov [di], al ;add char to buffer
inc di ;increment buffer pointer
inc BYTE [cmdChrCnt] ;inc count

mov ah, 0x0E ;display character
mov bh, 0x00 ;page 0 (BH still held cmdMaxLen)
mov bl, 0x07
int 0x10
jmp _get_cmd_start

_extended_key: ;extended key - do nothing now
jmp _get_cmd_start

_backspace_key:
mov bh, 0x00 ;check if count = 0
mov bl, [cmdChrCnt]
cmp bh, bl
je _get_cmd_start ;yes, do nothing

dec BYTE [cmdChrCnt] ;dec count
dec di

;check if beginning of line
mov ah, 0x03 ;read cursor position
mov bh, 0x00
int 0x10

cmp dl, 0x00
jne _move_back
dec dh ;move to the last column of the previous row
mov dl, 79
mov ah, 0x02
int 0x10

mov ah, 0x09 ; display without moving cursor
mov al, ' '
mov bh, 0x00
mov bl, 0x07
mov cx, 1 ; times to display
int 0x10
jmp _get_cmd_start

_move_back:
mov ah, 0x0E ; BIOS teletype acts on backspace!
mov al, 0x08 ; reload backspace, some BIOSes change AX in function 0x03
mov bh, 0x00
mov bl, 0x07
int 0x10
mov ah, 0x09 ; display without moving cursor
mov al, ' '
mov bh, 0x00
mov bl, 0x07
mov cx, 1 ; times to display
int 0x10
jmp _get_cmd_start

_enter_key:
mov BYTE [di], 0x00
ret

_display_space:
...
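
As a starting point for the string-instruction rewrite mentioned before the listing, the step that adds a character to the buffer could be sketched as below. This is only an illustration: '_store_char' is a hypothetical helper name that is not part of the JOSH kernel, and the sketch assumes that ES holds the same value as DS, because STOSB always writes through ES:DI.

```nasm
; _store_char: hypothetical helper, not part of the original kernel.
; In:  AL = character to store
;      DI = next free byte in strUserCmd (as set up by _get_command)
; Assumes ES = DS, since STOSB stores to ES:DI.
_store_char:
cld                     ;clear direction flag so DI moves forward
stosb                   ;store AL at ES:DI, then add 1 to DI
inc BYTE [cmdChrCnt]    ;one more character in the buffer
ret
```

The STOSB instruction replaces the 'mov [di], al' and 'inc di' pair of the listing. The saving is small here, but the same idea pays off more in routines that copy many bytes.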

The routine above will get user input into a string and terminate it with 0x00, but we
cannot use this string directly as a command. As we have already discussed, a command
can have many parts, with directives/operands added after the command. Each of them
will be separated by one or more spaces (we will not be so rigid and will let users key in
leading/trailing spaces as well as multiple spaces between directives). We will have to
split the user input into its various components before using it. We will declare five new
buffers, one for each component of the command. The complete .bss declaration section
is as follows:

[SEGMENT .bss]
strUserCmd resb 256 ;buffer for user commands
cmdChrCnt resb 1 ;count of characters
strCmd0 resb 256 ;buffers for the command components
strCmd1 resb 256
strCmd2 resb 256
strCmd3 resb 256
strCmd4 resb 256

Now we will look at the function that will split the string into five sub-strings (any
further parts will be ignored!). To get a sub-string, we first scan from the current
position and skip over spaces, then we copy characters to the sub-string until we meet a
space or the terminator char 0x00. We will do this cycle five times to get all the command
components. If fewer than five components are present, the remaining buffers will be
terminated at the first character with 0x00. This function can be added after the previous
function and the listing for this is:

_split_cmd:
;adjust si/di
mov si, strUserCmd
;mov di, strCmd0

;move blanks
_split_mb0_start:
cmp BYTE [si], 0x20
je _split_mb0_nb
jmp _split_mb0_end

_split_mb0_nb:
inc si
jmp _split_mb0_start

_split_mb0_end:
mov di, strCmd0

_split_1_start: ;get first string
cmp BYTE [si], 0x20
je _split_1_end
cmp BYTE [si], 0x00
je _split_1_end
mov al, [si]
mov [di], al
inc si
inc di
jmp _split_1_start

_split_1_end:
mov BYTE [di], 0x00

;move blanks
_split_mb1_start:
cmp BYTE [si], 0x20
je _split_mb1_nb
jmp _split_mb1_end

_split_mb1_nb:
inc si
jmp _split_mb1_start

_split_mb1_end:
mov di, strCmd1

_split_2_start: ;get second string
cmp BYTE [si], 0x20
je _split_2_end
cmp BYTE [si], 0x00
je _split_2_end
mov al, [si]
mov [di], al
inc si
inc di
jmp _split_2_start

_split_2_end:
mov BYTE [di], 0x00

;move blanks
_split_mb2_start:
cmp BYTE [si], 0x20
je _split_mb2_nb
jmp _split_mb2_end

_split_mb2_nb:
inc si
jmp _split_mb2_start

_split_mb2_end:
mov di, strCmd2

_split_3_start: ;get third string
cmp BYTE [si], 0x20
je _split_3_end
cmp BYTE [si], 0x00
je _split_3_end
mov al, [si]
mov [di], al
inc si
inc di
jmp _split_3_start

_split_3_end:
mov BYTE [di], 0x00

;move blanks
_split_mb3_start:
cmp BYTE [si], 0x20
je _split_mb3_nb
jmp _split_mb3_end

_split_mb3_nb:
inc si
jmp _split_mb3_start

_split_mb3_end:
mov di, strCmd3

_split_4_start: ;get fourth string
cmp BYTE [si], 0x20
je _split_4_end
cmp BYTE [si], 0x00
je _split_4_end
mov al, [si]
mov [di], al
inc si
inc di
jmp _split_4_start

_split_4_end:
mov BYTE [di], 0x00

;move blanks
_split_mb4_start:
cmp BYTE [si], 0x20
je _split_mb4_nb
jmp _split_mb4_end

_split_mb4_nb:
inc si
jmp _split_mb4_start

_split_mb4_end:
mov di, strCmd4

_split_5_start: ;get last string
cmp BYTE [si], 0x20
je _split_5_end
cmp BYTE [si], 0x00
je _split_5_end
mov al, [si]
mov [di], al
inc si
inc di
jmp _split_5_start

_split_5_end:
mov BYTE [di], 0x00

ret
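
The five blocks in the listing are identical apart from the destination buffer, so the same work can also be done in a loop. The following is a minimal sketch of that idea, not the routine used in the JOSH kernel: the names '_split_cmd_loop', the '_scl_' labels and the table 'tblCmdBufs' are hypothetical, and the table is assumed to be added to the existing .data section.

```nasm
; Sketch only: loop-based alternative to the unrolled _split_cmd.
; _split_cmd_loop and tblCmdBufs are hypothetical names.
_split_cmd_loop:
mov si, strUserCmd      ;source: raw user input
mov bx, tblCmdBufs      ;table of destination buffer addresses
mov cx, 5               ;number of components to extract

_scl_next:
mov di, [bx]            ;destination for this component

_scl_skip:              ;skip blanks before the component
cmp BYTE [si], 0x20
jne _scl_copy
inc si
jmp _scl_skip

_scl_copy:              ;copy until a blank or the terminator
mov al, [si]
cmp al, 0x20
je _scl_term
cmp al, 0x00
je _scl_term
mov [di], al
inc si
inc di
jmp _scl_copy

_scl_term:
mov BYTE [di], 0x00     ;terminate this component
add bx, 2               ;next table entry (one word each)
loop _scl_next          ;CX = CX - 1, repeat while CX is not 0
ret

[SEGMENT .data]
;one word per buffer, in the order the components appear
tblCmdBufs dw strCmd0, strCmd1, strCmd2, strCmd3, strCmd4
```

When the input runs out early, SI stays on the terminating 0x00, so every remaining buffer receives just a 0x00, which matches the behaviour of the unrolled listing. Supporting more components would then only need a longer table and a larger count in CX.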

Now we have all the components of a shell ready: one to get user input, and one to
split it into the needed sub-strings. We have to write a wrapper to display the
prompt, get a command, split it, analyse the first component and act on it. If the
command is an internal command, the result is displayed and the cycle continues till the
user exits from the shell using the 'exit' command. Another thing I forgot to add
previously is that we will follow the UNIX convention of case-sensitive commands and
filenames (it makes life a lot simpler to code, but maybe a bit tough on the user).

We also have to define the internal commands available and set up a couple of strings to
hold the OS name, version numbers etc. The complete listing of the declarations is:

[SEGMENT .data]
strWelcomeMsg db "Welcome to JOSH Ver 0.03", 0x00
strPrompt db "JOSH>>", 0x00
cmdMaxLen db 255 ;maximum length of commands

strOsName db "JOSH", 0x00 ;OS details
strMajorVer db "0", 0x00
strMinorVer db ".03", 0x00

cmdVer db "ver", 0x00 ; internal commands
cmdExit db "exit", 0x00

txtVersion db "version", 0x00 ;messages and other strings
msgUnknownCmd db "Unknown command or bad file name!", 0x00

The shell routine listing is:

_shell:
_shell_begin:
;move to next line
call _display_endl

;display prompt
call _display_prompt

;get user command
call _get_command

;split command into components
call _split_cmd

;check command & perform action

; empty command
_cmd_none:
mov si, strCmd0
cmp BYTE [si], 0x00
jne _cmd_ver ;next command
jmp _cmd_done

; display version
_cmd_ver:
mov si, strCmd0
mov di, cmdVer
mov cx, 4
repe cmpsb
jne _cmd_exit ;next command

call _display_endl
mov si, strOsName ;display OS name
mov al, 0x01
int 0x21
call _display_space
mov si, txtVersion ;display version
mov al, 0x01
int 0x21
call _display_space

mov si, strMajorVer ;display version numbers
mov al, 0x01
int 0x21
mov si, strMinorVer
mov al, 0x01
int 0x21
jmp _cmd_done

; exit shell
_cmd_exit:
mov si, strCmd0
mov di, cmdExit
mov cx, 5
repe cmpsb
jne _cmd_unknown ;not 'exit', so the command is unknown

jmp _shell_end ;exit from shell

_cmd_unknown:
call _display_endl
mov si, msgUnknownCmd ;unknown command
mov al, 0x01
int 0x21

_cmd_done:

;call _display_endl
jmp _shell_begin

_shell_end:
ret

The shell currently understands two commands. The 'ver' command will display the OS
name and the version, and the 'exit' command will exit from the shell.
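
Note that the comparisons in the listing pass a hard-coded count in CX (4 for 'ver', 5 for 'exit', that is, the length of the command plus its terminating 0x00). CMPSB compares the byte at DS:SI with the byte at ES:DI and steps in the direction given by the direction flag, so the listing also relies on ES being equal to DS and on the direction flag being clear. As more commands are added, a small comparison helper that stops at the terminator may be easier to maintain. The following is a minimal sketch under those observations; '_str_equal' is a hypothetical name and is not part of the JOSH kernel.

```nasm
; _str_equal: hypothetical helper, not part of the original kernel.
; In:  SI -> first string, DI -> second string (both end in 0x00)
; Out: ZF set if the strings are identical, clear otherwise
; Both strings are read through DS. Changes AL, SI and DI.
_str_equal:
mov al, [si]
cmp al, [di]
jne _str_equal_done     ;mismatch: return with ZF clear
inc si
inc di
cmp al, 0x00            ;was the matching byte the terminator?
jne _str_equal          ;no, compare the next pair
_str_equal_done:        ;ZF is set here only after a full match
ret
```

With such a helper, each command check in the shell would load SI with strCmd0 and DI with the command string, call '_str_equal', and follow it with the same 'jne' to the next command, without needing a length in CX.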

Finally we have to call the shell from the main routine after displaying the welcome
message. We can also safely remove the two lines of code that wait for a key press to
reboot. This will finish the shell integration for now. The call looks like:

...
mov si, strWelcomeMsg ; load message
mov al, 0x01 ; request sub-service 0x01
int 0x21

call _shell ; call the shell

int 0x19 ; reboot
...

That brings us to the end of a lengthy session. The complete kernel can be downloaded here.
Go ahead and assemble the new kernel. Copy it to your JOSH boot-disk and boot JOSH.
Play around with the shell. Try inputting all sorts of keys and see how the shell responds.
As of now it should display the version in response to the 'ver' command, and reboot for
the 'exit' command. Check if it responds to leading/trailing spaces.

That also brings us to the end of another part of our journey. We have now understood
how an OS boots, have successfully installed an interrupt service, and have integrated a
rudimentary shell to play with. The next step would be to add as many interrupt
sub-services as needed to display things, and to add most of the internal shell commands
that do not need access to the filesystem. These chores will be done in the next chapter.
