Assignment # 6

Huffman coding

Saad Iftikhar 039


Munhal Imran
Muhammad Hassan Zia 029
Moeez Aslam
Hamza Hashmi
Junaid Afzal Swatti
Ali Tausif 061
Armaghan Ahmed
Zohair Fakhar
Mohsin Altaf

Instructor:
Sir Qasim Umer Khan

Huffman Coding Implementation in MATLAB

Abstract: In this assignment we implemented the Huffman entropy encoding algorithm for data compression. Testing with different data sets gave acceptable results and confirmed that the more similar the elements of a data set are, the better the compression achieved by the Huffman algorithm.

I. INTRODUCTION
In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol (such as a character in a file), where the variable-length code table has been derived in a particular way based on the estimated probability of occurrence for each possible value of the source symbol. Huffman coding uses a specific method for choosing the representation for each symbol, resulting in a prefix code (sometimes called a "prefix-free code"; that is, the bit string representing some particular symbol is never a prefix of the bit string representing any other symbol) that expresses the most common source symbols using shorter strings of bits than are used for less common source symbols.

II. ASSIGNMENT
In this assignment we were required to implement the Huffman algorithm in MATLAB.

III. PERFORMANCE
The MATLAB code is listed below.

CODE:
Class of Huffman code:

%-----huffman coding-------------%
%%%----- version 1--------------%%%
%%- data structure (classes)---%%%%
%%------18-12-2013------------%%%

%%
classdef huffman % data structure: values that the node has in it
    properties
        leftNode = []
        rightNode = []
        probability
        code = [];
        symbol
        huffy % will store the Huffman code, just for checking
    end
end
%%%%%%%%%%%--------------%%%%%%%%%%%%%%

Code for probability finding of data:

%- calculating frequency of elements--%
%%%%--- Saad Iftikhar-------%%%%%%%%%%%
%%%---- 17 december 2013----%%%%%%%%%%

%% calculate how many times each number occurs
function [data_unique,data_freq]=frequency(data)
clc;
% data=[22 33 55 66 11 22 33 44 66];
data_unique=unique(data); % unique returns an array sorted in ascending order
% with only unique elements; no element is repeated

for i=1:length(data_unique)
    data_unique1(i)= sum(data == data_unique(i));
    % this array holds the frequency of the corresponding
    % element in the unique array
end
data_freq=data_unique1;
data_freq=data_freq/sum(data_freq); % normalize counts to probabilities

end
%%%%%%%%%%%------------%%%%%%%%%%%%%%

Conversion of data from binary to decimal:

%--- converting binary data to decimal symbols -------------%
%%%%%%--- 20-21 december 2013---%%%%%%

%% convert the data to decimal
function [convData]=dataConv(data,M)
% k is a whole number, not a fraction, when M is a power of 2
% M=8;
k=log2(M);
convData=[];

remainder=mod(length(data),k);
% check whether the data length is exactly divisible by k;
% if not, zero bits are prepended
if(remainder~=0)
    append=k-remainder;
else
    append=0;
end

data=[zeros(1,append) data];
for dataLength=1:k:length(data)
    string=num2str(data(dataLength:dataLength+k-1));
    decimal=bin2dec(string);
    convData=[convData decimal];
end
%%%%%%%%%%%------------%%%%%%%%%%%%%%

Main code of Huffman:

%%%%% main code of huffman coding algo using classes-------%%%%%
%%%%%-- Saad Iftikhar 18-12-2013-%%%%%
%%%---creating binary tree-----%%%%%%
%%%---huffman code using classes---%%%%
%% function codedData=sourceHuffman(information,M);
clc;

clear symbol;
clear codeHuff;
clear codeBits;
clear arrr;
%% initializing
global symbol;   % global variable that will hold the symbols
global codeHuff; % global variable that will hold the Huffman codes of the symbols
global codeBits; % global variable that will hold the length of each symbol's Huffman code
decodeData=[];
symbol=[];
codeHuff=[];
codeBits=[];

M=4;
information=[0 0 1 0 1 0 0 0 0 1];
% information=randint(1,3000);
convdata=dataConv(information,M); % this function converts our data
% from binary format to decimal, as the rest of our program is written for
% decimal
[data,prob]=frequency(convdata);

%% Empty Array of Object Huffman
array = huffman.empty(length(prob),0);
array_final = huffman.empty(length(prob),0);

%% Assign initial values: save the probabilities of the numbers in the
% probability property of the class/structure, and the symbols as well
for i=1:length(data)
    array(i).probability = prob(i);
    array(i).symbol = data(i);
    array_final(i).probability = prob(i);
    array_final(i).symbol = data(i);
end
% create a temporary array for the sorting; the algorithm we are using
% is bubble sort, in ascending order
temparray = array;
%

%% Creating the Binary Tree for k = 1:size(temparray,2)-1 % size(a,2) gives the number of columns
% a binary tree is one where a node/parent has two children; the lower
% probability one is on the left and the higher one is on the right
% to create the binary tree we have to iterate (number of nodes - 1) times
% we take the number of columns because size is always given as a 2-dim vector
for k = 1:size(temparray,2)-1
    % first sort the temp array using bubble sort
    %
    for i=1:size(temparray,2)
        for j = 1:size(temparray,2)-1
            % bubble sort algorithm
            if (temparray(j).probability > temparray(j+1).probability)
                tempnode = temparray(j);
                % this is the swapping operation
                temparray(j) = temparray(j+1);
                temparray(j+1) = tempnode;
            end
        end
    end

    % now we have to create a new node
    %
    newnode = huffman; % a node of the class huffman

    % add the probabilities: here we are building the tree, the two lowest
    % probability nodes are merged into one single node
    newnode.probability = temparray(1).probability + temparray(2).probability;
    % the new node has the sum of the previous two probabilities
    % now assign 0 to the left (lowest probability) node and 1 to the
    % higher probability node
    temparray(1).code = [0];
    temparray(2).code = [1];
    %
    % attach the children nodes to the new parent node just created
    newnode.leftNode = temparray(1);
    newnode.rightNode = temparray(2);
    %
    % remove the previous two nodes and replace them by the parent node, just
    % as in C++ we would remove the pointers of the children nodes and replace
    % them by the pointer of the father node
    %
    temparray = temparray(3:size(temparray,2)); % first two nodes are gone
    %
    % now prepend the new parent node
    %
    temparray = [newnode temparray];
    %
end % end of the loop; the binary tree is now created
%%
rootNode = temparray(1); % the root node is always the first node
le_code = []; % this will be the final Huffman code
%
% % Looping through the tree
% % See recursive function loop.m
%
final_data=[]; % variable definition, used later
check=huffman;
f=traverse(rootNode,le_code); % here we traverse the tree and generate the Huffman codes

% here is the loop for detecting each symbol and replacing the data with its Huffman code

for codeLength=1:length(convdata)
    for inner=1:length(symbol)
        if(convdata(codeLength)==symbol(inner))
            level=sum(codeBits(1:inner-1))+1;
            final_data=[final_data codeHuff(level:level+codeBits(inner)-1)];
        else
        end
    end
end
% codedData=final_data
%%%%%%%%%%%----------------%%%%%%%%%%%%%%

Traversal code for code generation:

%%%%%----------- function for traversal of the binary tree-------%%%%%
%%%%%---- 20-12-2013------------%%%%%
%%%--algorithm for traversing the whole tree to get the code-----%%%%%%%

function f = traverse(tempNode,codec)

global symbol; % these are the global variables that store our code and data arrays, because in recursive functions they would be continuously overwritten
global codeHuff;
global codeBits;

if ~isempty(tempNode) % whether we have a next node or not
    codec = [codec tempNode.code]; % append to the code of the previous node

    if ~isempty(tempNode.symbol)
        % disp(tempNode.symbol);
        tempNode.huffy=[codec];
        % disp(codec);
        symbol=[symbol tempNode.symbol];
        codeHuff=[codeHuff codec];
        codeBits=[codeBits length(codec)];

    end

    traverse(tempNode.leftNode,codec);
    traverse(tempNode.rightNode,codec);
end
f=codec;
end
%%%%%%%%%-------------%%%%%%%%%%%%%%

Code for decoding of Huffman:

%---Huffman decoding algorithm---------%
%%%%%------- Saad Iftikhar----%%%%%%%%%%
%-- 21-12-2013------------------------%

%% function decodedData=decodeHuffman(data,M,rootNode)

% let us traverse the data and create the decoded string using the structures

function [realdata,olright] = dHuffman(tempNode,data,M)
%% traverse the tree; when we reach a leaf we append the leaf node's value to the data vector
decoded=[]; % defining variables
realdata=[];
i=1;
k=1;
centerNode=tempNode; % variable of class huffman
while(k<length(data)+1) % outer loop over the data
    flag=0; % flag to terminate the inner loop
    centerNode=tempNode; % temporary node, restart from the root

    while (flag==0 && i<=length(data)) % loop condition

        if (data(i)==0) % if the bit is 0, check the left child
            if ~isempty(centerNode.leftNode)
                % decoded=[decoded tempNode.leftNode.code]; % this was only for
                % checking that we are traversing correctly
                i=i+1;
                centerNode=centerNode.leftNode;
            else
                realdata=[realdata centerNode.symbol];
                flag=1;
                k=i;
                break;
            end
        end
        if (i>length(data)) % if the previous bit was the last one, that node's symbol is decoded
            realdata=[realdata centerNode.symbol];
            flag=1;
            k=i+1;
            break
        end
        if(data(i)==1) % the bit is 1, so check the right child
            if ~isempty(centerNode.rightNode)
                % decoded=[decoded tempNode.rightNode.code]; % to check that we
                % are traversing correctly
                i=i+1;
                centerNode=centerNode.rightNode;
            else
                realdata=[realdata centerNode.symbol];
                flag=1;
                k=i;
                break;
            end
        end
        if (i>length(data)) % if the data is over, this is the leaf
            realdata=[realdata centerNode.symbol];
            flag=1;
            k=i+1;
            break
        end
    end
end
%% here we convert the data from decimal back to binary format
binaryReal=dec2bin(realdata,log2(M));
binaryReal1=binaryReal';
binaryRealFinal=binaryReal1(:);
binaryRealFinal=binaryRealFinal';
for loop=1:length(binaryRealFinal)
    olright(loop)=str2double(binaryRealFinal(loop));
end

end
%%%%%%%%%%%------%%%%%%%%%%%%%%
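
The decoder walks the tree one bit at a time and emits a symbol whenever it reaches a leaf. The same idea can be illustrated without the tree: because no code word is a prefix of another, a decoder may simply collect bits until the buffer equals a code word. The following is only a minimal sketch of that principle; the code table (symbols 0, 1, 2 with code words '0', '10', '11') and the bit stream are hypothetical and are not output of the program above.

```matlab
% Hypothetical prefix-free code table (illustration only).
symbols = [0 1 2];
codes   = {'0', '10', '11'};

bits = '0100110';        % encoded stream: 0 | 10 | 0 | 11 | 0

decoded = [];
buffer  = '';
for n = 1:numel(bits)
    buffer = [buffer bits(n)];          % take one more bit
    hit = find(strcmp(buffer, codes));  % does the buffer equal a code word?
    if ~isempty(hit)
        decoded = [decoded symbols(hit)];
        buffer  = '';                   % start collecting the next code word
    end
end
% decoded is [0 1 0 2 0]
```

A table lookup like this is easy to read but compares strings at every bit; the tree walk used in dHuffman avoids those comparisons.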
Results:

information is the original data; it is 21 bits long. final_data is the Huffman-compressed data; its size is greatly reduced, to 10 bits, with M=8 and k=3, because in this case the data elements are quite similar.
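
The two sizes reported above can be turned into a compression ratio and a relative saving. This is only the arithmetic on the reported figures (21 and 10 bits); the variable names are chosen here for illustration.

```matlab
originalBits   = 21;   % length of information, as reported above
compressedBits = 10;   % length of final_data, as reported above

ratio  = originalBits / compressedBits;        % 2.1
saving = 1 - compressedBits / originalBits;    % about 0.524
fprintf('ratio %.2f, saving %.1f %%\n', ratio, 100 * saving);
```

These figures count only the encoded bits. The decoder also needs the code tree (dHuffman receives the root node directly), so a real transmission would have to carry that information as well.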
This window shows that final_data, the Huffman-encoded data, returns the original data when sent to the decoding function. sum(ol==information) returns 21, which means that all 21 bits of the original data and the decoded data match.
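
An element-wise comparison such as sum(ol==information) presumes that both vectors have the same length. dataConv prepends zeros when the message length is not a multiple of k, so for such messages the decoded vector is longer than the original and the padding has to be dropped first. The sketch below illustrates this with a hypothetical 11-bit message; the grouping and ungrouping are written in vectorized form here and stand in for dataConv and the last step of dHuffman (no Huffman coding is performed).

```matlab
information = [0 0 1 0 1 0 0 0 0 1 1];   % hypothetical 11-bit message
M = 8;
k = log2(M);                             % 3 bits per symbol

% Zeros prepended so that the length becomes a multiple of k (as in dataConv).
pad    = mod(-numel(information), k);
padded = [zeros(1, pad) information];

% Group the bits into symbols, then expand the symbols back into bits.
symbols = bin2dec(char(reshape(padded, k, []).' + '0')).';
bitsOut = reshape((dec2bin(symbols, k) - '0').', 1, []);

% Drop the padding before comparing with the original message.
recovered = bitsOut(pad + 1:end);
allMatch  = isequal(recovered, information);
```

isequal returns a single logical value and does not raise an error when the lengths differ, which makes it a convenient alternative to summing an element-wise comparison.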

IV. CONCLUSION
Our implementation of the Huffman lossless entropy encoding algorithm has confirmed that when the data contains many similar elements, the compression reduces the length of the code and hence increases the amount of useful information sent per transmitted bit.
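
This conclusion can be checked numerically by comparing the source entropy with the average Huffman code length. The following is a rough sketch on a hypothetical symbol stream in which one value dominates. It computes only the code lengths, by repeatedly merging the two least probable groups, and is not the tree-based implementation listed above.

```matlab
% Hypothetical symbol stream: one value dominates, so the data is "similar".
data = [0 0 0 0 0 0 5 5 3 0];

% Relative frequency of each distinct symbol (same idea as frequency()).
[symbols, ~, idx] = unique(data);
prob = accumarray(idx(:), 1).' / numel(data);

% Source entropy in bits per symbol.
H = -sum(prob .* log2(prob));

% Huffman code lengths: merge the two least probable groups until one is
% left; every symbol inside a merged group gets one extra bit.
weights = prob;
groups  = num2cell(1:numel(prob));
codeLen = zeros(1, numel(prob));
while numel(weights) > 1
    [weights, order] = sort(weights);
    groups = groups(order);
    merged = [groups{1} groups{2}];
    codeLen(merged) = codeLen(merged) + 1;
    weights = [weights(1) + weights(2), weights(3:end)];
    groups  = [{merged}, groups(3:end)];
end

avgLen = sum(prob .* codeLen);   % average bits per symbol after coding
fprintf('entropy %.3f, average length %.3f bits/symbol\n', H, avgLen);
```

For this stream the average length is 1.3 bits per symbol, against 3 bits per symbol for a fixed-length code with M=8, and it stays within one bit of the entropy, as expected for a Huffman code.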