
BY

Jeevan Jyoti Rana


REGD.NO-0801307165
BRANCH-IT
CONTENTS
DATA COMPRESSION
REDUNDANCY
COMPRESSION TECHNIQUE
DATA COMPRESSION ALGORITHM
RUN LENGTH ENCODING
HUFFMAN ENCODING
CONCLUSION
DATA COMPRESSION
Virtually all forms of data (text, numerical, image, video) contain
redundant elements.
Data can be compressed by eliminating the redundant elements.
A code is substituted for each eliminated redundant element, where the code
is shorter than the eliminated element.
When compressed data is retrieved from storage or received over a
communications link, it is expanded back to its original form, based on the
code.
Compression is used:
to save storage space
to reduce communications transmission requirements
In the digital realm, compression means using fewer bits to represent
information.
Data + Compression = information – redundancy
REDUNDANCY
Most types of computer files are fairly redundant -- they
have the same information listed over and over again. File-
compression programs simply get rid of the redundancy.
“Ask not what your country can do for you -- ask what
you can do for your country.”
Ignoring the difference between capital and lower-case
letters, roughly half of the phrase is redundant. Nine words --
ask, not, what, your, country, can, do, for, you -- give us
almost everything we need for the entire quote.
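The word count above can be checked with a short sketch. This is only an illustration of the idea, not how any particular file-compression program works: the names `dictionary` and `indices` are made up for this example, and the quote is typed in by hand with capitalization ignored, as the text suggests.

```python
import re

quote = ("Ask not what your country can do for you -- "
         "ask what you can do for your country.")

# Lowercase first so that "Ask" and "ask" count as the same word.
words = re.findall(r"[a-z]+", quote.lower())

# Keep the first occurrence of each word, in order of appearance.
dictionary = list(dict.fromkeys(words))

# Replace every word by its position in the dictionary.
indices = [dictionary.index(w) for w in words]

print(len(words), "words,", len(dictionary), "distinct")
print(dictionary)
print(indices)
```

The quote has 17 words but only 9 distinct ones, so the dictionary plus the list of indices is enough to rebuild it.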
Compression Techniques
Lossless Compression
- Maintains data integrity
- Recovered data is identical to original
- Exploits redundancy in data
Lossy Compression
- Doesn't maintain data integrity
- Some information is lost forever
- Gives more compression than lossless
- Discards "insignificant" data components
Example
A good example of lossy vs. lossless
compression is the following string:
888883333333
The string is in uncompressed form. Using a
lossless compression technique, we could
save space by writing it as 8[5]3[7].
Using a lossy compression technique, we
can compress it to 83. Here we can't get the
original data back.
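The difference can be sketched in a few lines of Python. The function names `lossless`, `lossy` and `restore` are hypothetical, chosen for this illustration, and the sketch assumes the input never contains square brackets.

```python
import re
from itertools import groupby

def lossless(s):
    # Each run becomes symbol[count], e.g. "88888" -> "8[5]".
    return "".join(f"{ch}[{len(list(run))}]" for ch, run in groupby(s))

def lossy(s):
    # Keep one symbol per run; the run lengths are thrown away.
    return "".join(ch for ch, _ in groupby(s))

def restore(encoded):
    # Inverse of lossless(): expand every symbol[count] pair.
    pairs = re.findall(r"(.)\[(\d+)\]", encoded)
    return "".join(ch * int(n) for ch, n in pairs)

original = "888883333333"
print(lossless(original))           # 8[5]3[7]
print(lossy(original))              # 83
print(restore(lossless(original)))  # 888883333333
```

There is no `restore` for the lossy form: "83" could have come from any string made of a run of 8s followed by a run of 3s.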
DATA COMPRESSION ALGORITHMS

LOSSLESS COMPRESSION
Run Length Encoding
Huffman Coding
RUN-LENGTH ENCODING
Data files frequently contain the same
character repeated many times in a row.

Example of run-length encoding: each run of zeros
is replaced by two characters in the compressed file:
a zero to indicate that compression is occurring,
followed by the number of zeros in the run.
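A minimal sketch of this zero-run scheme follows. The sample values in `data` are invented for illustration, and the sketch ignores the limit a real byte-oriented file format would place on the run count (a count stored in one byte cannot exceed 255).

```python
def rle_zeros(values):
    """Replace each run of zeros by the pair (0, run length)."""
    out = []
    i = 0
    while i < len(values):
        if values[i] == 0:
            run = 0
            while i < len(values) and values[i] == 0:
                run += 1
                i += 1
            out.extend([0, run])   # flag zero, then the count
        else:
            out.append(values[i])  # non-zero values pass through
            i += 1
    return out

def unrle_zeros(encoded):
    """Expand every (0, run length) pair back into zeros."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == 0:
            out.extend([0] * encoded[i + 1])
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return out

data = [17, 8, 54, 0, 0, 0, 97, 5, 16, 0, 45, 23, 0, 0, 0, 0, 0, 3]
packed = rle_zeros(data)
print(packed)  # [17, 8, 54, 0, 3, 97, 5, 16, 0, 1, 45, 23, 0, 5, 3]
```

Note that a single isolated zero grows from one value to two, so the scheme only pays off when zeros tend to occur in runs.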
Run Length Encoding (Contd.)

WWWBWWWWWBWWWBWWWWBWWWWWBWWW
BWWWWWBWWBWWWWWWBBBWWWWWWWBWB
WWWWWWWBWWBBWWWWWBWWWWBWWWWB
WWWWB
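The input begins: WWWBWWWWWBWWWBWWWWB…
Writing each run as a count followed by its symbol gives: 3WB5WB3WB4WB…
Possible optimization: since W and B alternate, the run lengths alone are enough: 3151314…
But this optimization requires an escape character to mark the encoded data: #3151314…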
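The count-plus-symbol encoding can be sketched as below. The helper names `rle_encode` and `run_lengths` are assumptions made for this example; runs of length one are written without a count, which matches the 3WB5WB… output shown above.

```python
from itertools import groupby

def rle_encode(text):
    # A run of n > 1 symbols becomes "<n><symbol>"; single symbols stay as-is.
    parts = []
    for symbol, run in groupby(text):
        n = len(list(run))
        parts.append(symbol if n == 1 else f"{n}{symbol}")
    return "".join(parts)

def run_lengths(text):
    # Run lengths only, usable when just two symbols strictly alternate.
    return [len(list(run)) for _, run in groupby(text)]

prefix = "WWWBWWWWWBWWWBWWWWB"
print(rle_encode(prefix))   # 3WB5WB3WB4WB
print(run_lengths(prefix))  # [3, 1, 5, 1, 3, 1, 4, 1]
```

The second form drops the symbols entirely, so a decoder needs to know which symbol comes first and that the data is encoded at all, which is where an escape character comes in.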
Run Length Encoding
CTAAAAAGGGTCGTTTTTTGCCCGGGGGCCTCCCCCCC
Original: 38 symbols

CT5A3GTCG6TG3C5GCCT7C
Run-length encoded: 21 symbols
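Decoding this format is straightforward as long as the symbols themselves are never digits. The sketch below assumes exactly that; `rle_decode` is a hypothetical name for this illustration.

```python
import re

def rle_decode(encoded):
    # Each token is an optional decimal count followed by one non-digit symbol.
    out = []
    for count, symbol in re.findall(r"(\d*)(\D)", encoded):
        out.append(symbol * (int(count) if count else 1))
    return "".join(out)

encoded = "CT5A3GTCG6TG3C5GCCT7C"
decoded = rle_decode(encoded)
print(decoded)
print(len(encoded), "symbols encoded,", len(decoded), "symbols decoded")
```

In this example, short runs such as CC are left as they are, because writing 2C would not save any space.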
HUFFMAN ENCODING
This method is named after D.A. Huffman, who
developed the procedure in the 1950s.
In the example file, more than 96% of the text consists of only 31
characters out of 127, so giving the frequent characters shorter codes saves space.
HUFFMAN CODING ALGORITHM
1. Line up the source symbols according to their
probabilities, from highest to lowest.
2. Combine the two least probable symbols into a composite
symbol whose probability is equal to the sum of their
probabilities.
3. Repeat step 2 until there is a single symbol with a probability
of 1.
4. Now trace the route from the root of the tree to each leaf, writing
a 0 for a left child and a 1 for a right child.
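The procedure can be sketched with a priority queue, which keeps the least probable nodes at the front instead of re-sorting the list after every merge. This is one possible implementation, not the only one: `huffman_codes` is a hypothetical name, the weights are the percentages from the example in this deck, and the choice of which child goes on the 0 branch is an assumption, so the exact bit patterns may differ from a hand-drawn tree while the code lengths stay the same.

```python
import heapq
from itertools import count

def huffman_codes(weights):
    """Build a prefix code from a non-empty {symbol: weight} mapping.

    Symbols are assumed to be strings; tuples are used for internal nodes.
    """
    order = count()  # tie-breaker so the heap never compares tree nodes
    heap = [(w, next(order), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Pop the two least probable nodes and merge them.
        w1, _, low = heapq.heappop(heap)
        w2, _, high = heapq.heappop(heap)
        # Assumed convention: the more probable node takes the 0 branch.
        heapq.heappush(heap, (w1 + w2, next(order), (high, low)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")  # left child
            walk(node[1], prefix + "1")  # right child
        else:
            codes[node] = prefix or "0"  # lone-symbol edge case

    walk(root, "")
    return codes

weights = {"A": 20, "B": 9, "C": 15, "D": 11, "E": 40, "F": 5}
codes = huffman_codes(weights)
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

Integer percentages are used instead of fractions such as .09 to avoid floating-point comparisons when nodes are merged.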
HUFFMAN ENCODING EXAMPLE
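Character frequencies
A: 20% (.20)
B: 9% (.09)
C: 15% (.15)
D: 11% (.11)
E: 40% (.40)
F: 5% (.05)
First merge: the two least probable symbols, B (.09) and F (.05), combine into BF (.14),
with B on the 0 branch and F on the 1 branch.
Nodes after the first merge: E (.40), A (.20), C (.15), BF (.14), D (.11)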
HUFFMAN ENCODING EXAMPLE (CONTD.)
Codes
A: 010
B: 0000
C: 011
D: 001
E: 1
F: 0001
Tree (each child is labelled with its branch bit)
ABCDEF 1.0
  0: ABCDF .6
    0: BFD .25
      0: BF .14
        0: B .09
        1: F .05
      1: D .11
    1: AC .35
      0: A .20
      1: C .15
  1: E .4
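The code table from this example can be tried out directly. The sketch below hard-codes the six codes and the frequencies listed above; the message "BEAD" and the function names are made up for illustration.

```python
CODES = {"A": "010", "B": "0000", "C": "011", "D": "001", "E": "1", "F": "0001"}
FREQ = {"A": 20, "B": 9, "C": 15, "D": 11, "E": 40, "F": 5}  # percent

def encode(message):
    return "".join(CODES[ch] for ch in message)

def decode(bits):
    lookup = {code: sym for sym, code in CODES.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        # No code is a prefix of another, so the first match is the symbol.
        if current in lookup:
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)

bits = encode("BEAD")
print(bits)          # 00001010001
print(decode(bits))  # BEAD

# Expected bits per character, weighted by how often each one occurs.
average = sum(FREQ[s] * len(CODES[s]) for s in CODES) / 100
print(average)       # 2.34
```

A fixed-length code for six characters needs 3 bits per character, so an average of 2.34 bits is a saving of roughly a fifth for text with these frequencies.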
CONCLUSION
The aim of data compression is to reduce redundancy
and thus increase effective data density.
Algorithms are categorised on the basis of algorithm
complexity and amount of compression.
Data compression speeds up transmission. The total time
depends upon the number of bits sent, the time
required for the encoder to generate the coded
message, and the time required for the decoder to
recover the original ensemble.
Data compression is also widely used in backup
utilities, spreadsheet applications and database
management systems.
QUERIES?
THANK YOU
