Computer And System Engineering Department

Presents

FS Assignment 2

ENCODE

Blagwa
Intro

By:

TA's Notes:

Index

• Intro
• Index
• Problem Statement
• Overall System
    • System Requirements
    • UML
    • How to Use
    • Take Care
• System Under the Microscope
    • File Reader
    • File Writer
    • InBuf
    • OutBuf
    • File Statistics
    • Huffman Encoding Tree Builder
    • Huffman Encode
    • Huffman Decode
• Some Ideas Along the Way
    • Bad
    • Good .. BUT

Problem Statement

You are required to write a program that can compress and
decompress a file using Huffman encoding.

Input to this program is:
1. A file name.
2. Whether to compress or decompress the file.

Output of the program is:
1. The compressed/decompressed file.
2. The code used to encode the bytes of the source file.
3. The compression ratio.
4. The execution time.

Part (A) – Compression

You must collect statistics from several text files and several
binary files, and find the Huffman code for each byte value. These
are fixed codes to be built into your program: one fixed code for
text files and one for binary files.
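
As a rough illustration of "finding the Huffman codes for each byte", the sketch below derives only the code lengths from a table of byte counts. The function name huffmanCodeLengths, the 256-entry arrays, and the decision to give a lone symbol a 1-bit code are assumptions made for this example; the assignment text does not prescribe them.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: Huffman code length in bits for every byte value.
// A length of 0 means the byte never occurs in the counted data.
std::array<int, 256> huffmanCodeLengths(const std::array<std::uint64_t, 256>& counts) {
    using Item = std::pair<std::uint64_t, int>;  // (weight, node index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    std::vector<int> parent;   // parent[i] is the parent of node i, -1 for the root
    std::vector<int> symbols;  // symbols[i] is the byte value stored in leaf i

    for (int b = 0; b < 256; ++b) {
        if (counts[b] == 0) continue;
        heap.push(Item(counts[b], static_cast<int>(parent.size())));
        parent.push_back(-1);
        symbols.push_back(b);
    }

    std::array<int, 256> lengths{};
    if (symbols.size() == 1) lengths[symbols[0]] = 1;  // assumption: a lone symbol gets one bit

    // Repeatedly merge the two lightest nodes under a new parent.
    while (heap.size() > 1) {
        Item a = heap.top(); heap.pop();
        Item b = heap.top(); heap.pop();
        int merged = static_cast<int>(parent.size());
        parent.push_back(-1);
        parent[a.second] = merged;
        parent[b.second] = merged;
        heap.push(Item(a.first + b.first, merged));
    }

    // The code length of a leaf is its distance from the root.
    for (std::size_t i = 0; i < symbols.size(); ++i)
        for (int n = parent[i]; n != -1; n = parent[n])
            ++lengths[symbols[i]];
    return lengths;
}
```

Running this once over the pooled statistics of the sample text files, and once over those of the sample binary files, would give the lengths of the two fixed codes.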

To compress a file, collect statistics for that file and determine
whether it is more economical to encode it with the fixed code, or
to build a new Huffman code from the file's own statistics and
store that code in the compressed file. Your output is the
compressed file, the compression ratio, and the code used to
compress the file. The code must be displayed in the following
format:

Byte    Original Code    New Code
65      01000001         10001
66      01000010         101101
67      01000011         01000
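
The "more economical" decision can be sketched as a comparison of total encoded sizes. Everything named here is hypothetical: the Counts and Lengths aliases, the two functions, and especially headerBits, which stands for whatever it costs to store the new code in the compressed file (the assignment does not define a header format).

```cpp
#include <array>
#include <cstdint>

using Counts  = std::array<std::uint64_t, 256>;  // occurrences of each byte value
using Lengths = std::array<std::uint8_t, 256>;   // code length in bits per byte value

// Total number of bits needed to encode the file body with the given code.
std::uint64_t encodedBits(const Counts& counts, const Lengths& lengths) {
    std::uint64_t bits = 0;
    for (int b = 0; b < 256; ++b)
        bits += counts[b] * lengths[b];
    return bits;
}

// True when the file-specific code is cheaper than the fixed code,
// once the (assumed) cost of storing that code in the header is included.
bool preferCustomCode(const Counts& counts, const Lengths& fixedCode,
                      const Lengths& customCode, std::uint64_t headerBits) {
    return encodedBits(counts, customCode) + headerBits
         < encodedBits(counts, fixedCode);
}
```

For a small file the header can outweigh the savings, which is why the fixed code stays worthwhile even though a file-specific code never has a longer body.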

Part (B) – Decompression

You must read the file header and determine whether the file was
encoded using the fixed code or a file-specific code. The file must
then be decompressed (restored to its original contents).

Overall System

Requirements

• Get a file name.
• See whether you are going to encode or decode.
• If encoding:
    • Collect the statistics.
    • Check for the most economical solution.
    • Encode the file that way.
• If decoding:
    • See which code was used during encoding.
    • Reverse it to recover the original file data.

UML Class Diagram

How To Use

Take Care
• This is a beta version with many bugs.
• The program has not been tested sufficiently.
• Big files take a long time to process.

System Under the Microscope

File Reader

Class FileReader is a character-level input buffer.

Parameters:
    inputFileStream => input file stream, used to get data from
                       the file
    characterBuffer => unsigned character buffer
    index => integer (pointer) that points to the next data to
             be retrieved
    limit => integer (pointer) that points to the end of the data
Behaviors:
    FileReader constructor =>
        parameter -> character pointer: the file name
        Actions -> open the input file stream, read some data,
                   and set index to the start
    ~FileReader destructor =>
        Actions -> close the input file stream
    readByte =>
        Actions -> if index points to the end, get new data;
                   if no data can be read, throw an exception;
                   return the next valid data
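
A minimal sketch of how a FileReader along these lines might look. The member names follow the description above; the 4096-byte buffer size, the private refill helper, and the specific exception types are assumptions, since the slides do not state them.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>

class FileReader {
public:
    explicit FileReader(const char* fileName)
        : inputFileStream(fileName, std::ios::binary) {
        if (!inputFileStream) throw std::runtime_error("cannot open input file");
        refill();  // read some data and set index to the start
    }
    // No explicit destructor: the std::ifstream member closes the file itself.

    unsigned char readByte() {
        if (index == limit) {  // buffer used up: try to get new data
            refill();
            if (limit == 0) throw std::out_of_range("no more data in file");
        }
        return characterBuffer[index++];
    }

private:
    // Hypothetical helper: loads the next chunk of the file into the buffer.
    void refill() {
        inputFileStream.read(reinterpret_cast<char*>(characterBuffer),
                             static_cast<std::streamsize>(kBufferSize));
        limit = static_cast<std::size_t>(inputFileStream.gcount());
        index = 0;
    }

    static constexpr std::size_t kBufferSize = 4096;  // assumed size
    std::ifstream inputFileStream;
    unsigned char characterBuffer[kBufferSize];
    std::size_t index = 0;  // next byte to hand out
    std::size_t limit = 0;  // number of valid bytes in the buffer
};
```

Opening the stream in binary mode matters here, because the assignment compresses binary files as well as text files.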

File Writer

Class FileWriter is a character-level output buffer.

Parameters:
    outputFileStream => file stream, used to write to the file
    characterBuffer => character array, used to store characters
    index => integer (pointer) that points to the next empty
             slot to write in
Behaviors:
    FileWriter constructor =>
        parameter -> character pointer: the output file name
        Actions -> open the file stream to the file and set
                   index to the start
    ~FileWriter destructor => flushes and closes the FileWriter
    writeChar =>
        parameter -> unsigned character a
        Actions -> if index points to the end, write out and
                   flush the data stored in characterBuffer;
                   store a in the next empty slot
    close =>
        Actions -> write and flush the data stored in
                   characterBuffer, then close the
                   outputFileStream file stream
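
The FileWriter description could be realised roughly as follows. The buffer size, the private flushBuffer helper, and making close safe to call twice are assumptions added for the sketch.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>

class FileWriter {
public:
    explicit FileWriter(const char* fileName)
        : outputFileStream(fileName, std::ios::binary) {
        if (!outputFileStream) throw std::runtime_error("cannot open output file");
    }
    ~FileWriter() { close(); }  // flush and close on destruction

    void writeChar(unsigned char a) {
        if (index == kBufferSize) flushBuffer();  // buffer full: write it out
        characterBuffer[index++] = a;             // store a in the next empty slot
    }

    void close() {
        if (!outputFileStream.is_open()) return;  // assumption: a second call is a no-op
        flushBuffer();
        outputFileStream.close();
    }

private:
    // Hypothetical helper: writes the buffered bytes and resets index.
    void flushBuffer() {
        outputFileStream.write(reinterpret_cast<const char*>(characterBuffer),
                               static_cast<std::streamsize>(index));
        outputFileStream.flush();
        index = 0;
    }

    static constexpr std::size_t kBufferSize = 4096;  // assumed size
    std::ofstream outputFileStream;
    unsigned char characterBuffer[kBufferSize];
    std::size_t index = 0;  // next empty slot
};
```

Because the destructor calls close, a FileWriter that goes out of scope still writes its last partial buffer.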

InBuf

Class InBuf is a bit-level input buffer: it accumulates data and
hands it back bit by bit.

Parameters:
    inputCharacterBuffer -> FileReader (character-level buffer)
                            used to read from the input file
    bitsDataStorage -> unsigned character (storage for the bits)
                       that holds the accumulated bits so they
                       can be retrieved one at a time
    index -> integer (pointer) that points to the next bit to
             be retrieved
Behaviors:
    InBuf constructor 1 =>
        parameters -> character pointer: the input file name
        Actions -> create a new FileReader and set index to
                   the start
    InBuf constructor 2 =>
        parameters -> FileReader pointer in
        Actions -> set inputCharacterBuffer to in and set index
                   to the start
    ~InBuf destructor =>
        Actions -> delete inputCharacterBuffer
    readBit =>
        Action -> if index points to the end, get new data from
                  the FileReader and reset index;
                  retrieve the next bit
        return -> boolean indicating whether it read 1 or 0

OutBuf

Class OutBuf is a bit-level output buffer (a buffer for bits).

Parameters:
    outputCharacterBuffer -> FileWriter (character-level buffer) used to
                             write to the output file
    bitsDataStorage -> unsigned character (storage for the bits) that
                       accumulates bits until a full character is completed
    index -> integer (pointer) that points to the next empty slot
Behaviors:
    OutBuf constructor 1 =>
        parameters -> character pointer: the output file name
        Actions -> create a new FileWriter and set index to the start
    OutBuf constructor 2 =>
        parameters -> FileWriter pointer out
        Actions -> set outputCharacterBuffer to out and set index to the
                   start
    ~OutBuf destructor =>
        Actions -> close and delete outputCharacterBuffer
    writeBit =>
        parameters -> boolean bit indicating whether to write 1 or 0
        Action -> if index points to the end, flush the result and reset
                  index;
                  write the bit in the next empty slot
    writeSome =>
        parameters -> character or 64-bit long z (storage for the bits
                      to be written)
                      short some (number of bits to be written)
        Action -> loop from the start of the valid data to the end;
                  check whether each bit is 1 or 0 and write it
    flush =>
        Action -> write out the data inside bitsDataStorage
    getFileWriter =>
        Action -> return a pointer to outputCharacterBuffer

File Statistics

FileStatistics
    A class that counts the occurrences (probabilities) of each
    character in the input file.
Constructor
    Allocates the memory for the 2 maps.
Destructor
    Deallocates the memory of the 2 maps.
Some getters
print
    Prints the statistics of each byte in the input file.
    Printing format:
        number => weight
Do the statistics
    Done by reading the file and incrementing the occurrence
    count of each byte read.

Ideas Along the Way

BAD

• Root-error storage
    • It depends on replacing each integer value with its integer
      square root plus the difference between the integer and the
      square of that root.

GOOD

• Existence-dependent code
    • It depends on considering only the bytes that actually exist
      in the file when building the new code.
• Minimum-variance Huffman tree
    • It depends on forcing the tree to be closer to balanced.
• Canonical Huffman
    • It depends on establishing a relation between the elements
      and their codes, and sorting them according to that relation.
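
To make the canonical Huffman idea more concrete, here is a sketch of the usual canonical assignment: symbols are sorted by code length and then by value, and codes are handed out as consecutive binary numbers. The struct and function names are hypothetical, code lengths are assumed to be valid Huffman lengths below 32 bits, and this is the textbook scheme rather than necessarily the variant the author had in mind.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

struct CanonicalCode {
    int symbol;          // byte value
    int length;          // code length in bits
    std::uint32_t bits;  // the code, stored in the low `length` bits
};

// Input: (symbol, code length) pairs; a length of 0 marks an unused symbol.
std::vector<CanonicalCode> assignCanonicalCodes(
        const std::vector<std::pair<int, int>>& symbolLengths) {
    std::vector<CanonicalCode> codes;
    for (const auto& sl : symbolLengths)
        if (sl.second > 0) codes.push_back({sl.first, sl.second, 0});

    // Shorter codes first; ties broken by symbol value.
    std::sort(codes.begin(), codes.end(),
              [](const CanonicalCode& a, const CanonicalCode& b) {
                  return a.length != b.length ? a.length < b.length
                                              : a.symbol < b.symbol;
              });

    std::uint32_t next = 0;
    int previousLength = 0;
    for (auto& c : codes) {
        next <<= (c.length - previousLength);  // widen when the length grows
        c.bits = next++;
        previousLength = c.length;
    }
    return codes;
}
```

With this scheme only the code lengths need to be stored in the compressed file, since the decoder can rebuild the same codes from them.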
