
We're now ready to describe each one of the blocks that build an image compression algorithm, and in particular, JPEG.
And as we said, we are going to start with the symbol encoder.
After we have basically done everything, after we have done the transform and the quantization that we are going to explain in detail, we want to compress even further.
So, quantization will provide part of the compression, and then we want to compress what it produces.
But we need to save whatever we have produced here, and we need to save it with no error.
We're still going to see how we can compress.
So, how can we achieve that?
The basic idea is very simple.
And, of course, you cannot teach image processing without showing, at least once, this famous image called Lena.
It used to be used a lot in image processing.
But now, because of the abundance of images, many other images are used.
But I want to show it at least once during these nine weeks.
And we might see the same image again a
few more times.
So, the basic idea is that some pixels,
some gray values in the image appear more
than others.
And let's see that.
Here on the right, we have what's called
a histogram.
The histogram basically counts how many
times a particular gray value appears in
the image.
For example, pixels with a gray value of, let's say, around 47 appear a certain number of times.
So, we go around the image, we count how many times we see the value 47, and that count is the height of the bar.
Then I can look at, for example, 200, and see that it appears about 700 times in the image.
So, we see that there are some pixel values that appear a lot, and there are some pixel values, let's say zero or one, that don't appear at all.
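As an aside, here is a minimal sketch of how such a histogram can be computed with NumPy; the image here is just a synthetic stand-in (not Lena), since only the counting step matters.

```python
import numpy as np

# Synthetic 8-bit grayscale image standing in for a real one (any uint8 array works).
image = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)

# counts[g] = how many pixels have gray value g, for g = 0..255.
counts = np.bincount(image.ravel(), minlength=256)

print(counts[47])            # number of pixels with gray value 47
print(counts / image.size)   # normalized histogram: the probability of each gray value
```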
And the basic idea that we are going to exploit is that we're going to try to give very short codes to pixel values that appear a lot, and longer codes to pixel values that don't appear so much.
Let's see how we do that.
Let's see an example of the use of this.
So here, we have an image, and this is actually the image that we saw in the previous video, the geometric image.
This particular image has only four different gray values.
Pixels with gray value 87 make up a quarter of the image, so 25%, or a probability of 0.25.
We also have 128 appearing almost half of the time in the image, 186 appearing in about another quarter, and 255 appearing very, very few times.
Every other gray value just doesn't appear in the image at all.
So, the standard way to represent this
image is we have an array and for every
pixel, we have a binary code with eight
bits.
So, we are going to basically represent
87 with its binary representation.
We are going to represent 128 with its own binary representation, and so on.
So, at the end, we are basically spending
eight bits per pixel if we don't
compress.
But let's assume that somehow I managed to construct the code that is shown here, in a way that 87 is represented by the binary code 01, 128 is represented by the binary code 1, 186 by the binary code 000, and 255 by the binary code 001.
That means that, for example, if I want to store or send the pixels 87, 87, 255, 128, instead of writing the full eight-bit representation of each one, which would take me 4 times 8, that is, 32 bits, I'm going to write it in the following way.
I'm going to say, okay, 87 is 01.
87 again, 01.
255 is 001.
128 is 1.
So, basically it took me two plus two plus three plus one, eight bits instead of 32, to represent the same stream of gray values.
Now, the decoder has this table, so the decoder will see 01, go to that pixel position, and put an 87.
It will see 01 and put again an 87.
It will see 001 and say, oh, I need to put a 255, and so on.
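Here is a small sketch of that encoding and decoding step in Python, assuming the code table from this example:

```python
# Code table from the example: gray value -> binary code (as a string of bits).
codes = {87: "01", 128: "1", 186: "000", 255: "001"}

pixels = [87, 87, 255, 128]

# Encoder: concatenate the variable-length code of each pixel.
bitstream = "".join(codes[p] for p in pixels)
print(bitstream, len(bitstream))   # 01010011 8  -> 8 bits instead of 4 * 8 = 32

# Decoder: accumulate bits until they match a code, then emit that gray value.
inverse = {code: value for value, code in codes.items()}
decoded, buffer = [], ""
for bit in bitstream:
    buffer += bit
    if buffer in inverse:
        decoded.append(inverse[buffer])
        buffer = ""
print(decoded)                     # [87, 87, 255, 128]
```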
And so, we have achieved compression.
Let's see exactly how much compression we
can achieve.
So basically, we do a very simple computation.
We say, okay, with probability 0.25, I'm going to have the code 01, a length of two.
With probability 0.47, I'm going to have a length of one.
With probability 0.25, so for one quarter of the image, I'm going to have length three.
And with a very small probability, 0.03, I'm also going to have length three.
So the average is 0.25 times 2, plus 0.47 times 1, plus 0.25 times 3, plus 0.03 times 3, and this is equal to 1.81.
Okay.
So, instead of eight bits per pixel, I now have 1.81 bits per pixel on average over my whole image.
I have basically saved lots of bits; I have gone from eight to basically 1.81.
And that's a compression of over four times, and that's what we have done.
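The same average-length computation as a short sketch; the 0.03 for the rarest value is inferred from the probabilities having to sum to one:

```python
# (probability, code length in bits) for the four gray values in the example.
table = [(0.25, 2), (0.47, 1), (0.25, 3), (0.03, 3)]

avg_bits = sum(p * length for p, length in table)
print(avg_bits)        # 1.81 bits per pixel on average
print(8 / avg_bits)    # a compression ratio of roughly 4.4 over the 8-bit representation
```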
Basically, what we did is exploit the fact that some of the pixel values appear more often than others, so I can give those values shorter codes.
So, how do we do that?
We do that using something called Huffman coding, and I'm going to explain next how to do it.
The best way to explain Huffman coding is just to use an example, so let's look at a particular example next.
So, let's assume that we have six different symbols.
The first thing we have to do is order them according to their probability, the highest probability first and the lowest probability at the end.
And if some probabilities are equal, it
doesn't matter which one we put first.
So, for example, this symbol a2 might be
the pixel value 100.
The symbol a6 might be the pixel value
77, and so on.
So, basically I'm going to list the symbols here and the probabilities here.
And from the bottom, I start to add.
I take the two lowest probabilities and add them.
So, 0.04 plus 0.06 gives a new probability of 0.1.
And then, I do the list again.
So, none of these have changed; the only entry that changed is the last one.
Let's see what's happening here.
What's happening here is with probability
0.4, I can get a2, with probability 0.3,
I can get a6, with probability 0.1, I can
get a1,
with probability 0.1, I can get symbol
a4, and with probability 0.1, I can get
either a3 or a5.
I am going to have to do something to distinguish which one it is: is it a3 or a5?
How we distinguish them is going to come in the next slide.
Now, we do this again.
We take the two lowest and add them.
And now, because it became 0.2, remember,
we have to reorder all the time.
So, let me just describe that again.
With 0.4, we can get a2.
With 0.3, we can get a6.
Now, with 0.2, we can get one of these, which is a4, a3, or a5.
I don't know which one, but I know that with probability 0.2 it is one of them.
And then, we have 0.1 again.
We do that once more, actually two more times.
We take the two lowest, we get this, and then we once again add the two lowest, and we get this.
And every time we add the two lowest and move into the next column, we reorder everything.
When we end up with only two, we stop.
And we're going to see why we stop.
Now, it's time to give the code.
So, basically, what I have done here is redraw the graph that we had in the previous slide.
And so, this is the first column, this is
the second column, and the third column,
the fourth column, and the fifth column.
And now, I'm going to go backwards,
assigning codes.
So, basically to 0.6, I assigned zero.
And to 0.4, I assigned one.
Before we go backwards, let me tell you
what's happening.

When the decoder sees a zero, it's going to know that the symbol is from the group that led to this 0.6.
When it sees a one, it's going to know that it's from the group that led to this 0.4.
Then we're going to have to keep adding
bits to say which one in that group.
And that's why we keep going backwards.
So, we basically propagate back the zero.
The 0.6 actually came from 0.3 plus 0.3, so to that zero we add a zero and a one.
The 0.4, of course, propagates here, and its code is basically assigned to the probability it came from in this column, without adding any bits.
So, it keeps propagating without adding any bits.
So now, all the symbols that lead to this group (and we're going to see it was actually only one symbol) have this code, and all the symbols that propagated to this other probability have codes that start with 01.
So now, we keep propagating.
We propagate the 1 to the 1 and the 00 to the 00.
The 01 we have to split, because this 0.3 probability came from the addition of these two, and then we have 010 and 011.
And then we keep going: we propagate the 1 to the 1 and the 00 to the 00, and this 0.1, which has the code 011, propagates here.
And this one actually has to split.
We split because when we were computing it, we were adding, so when we go back, we split.
Every time we split, we have to add a zero and a one.
And so, we keep going and we get this result, which says that the symbol a2 is going to have the code 1, the symbol a6 is going to have the code 00, and so on.
So, this is the code,
it's a variable length code.
Not every symbol is represented by a code
of the same length.
It actually has a different length and
the length depends on the probability.
The higher the probability, the shorter
the code.
And that's very easy to see, because the low probabilities were added, added, added, and then, when we're going back, we have to split, split, split.
And remember, every time we split, we have to add a bit to the code of every symbol in that group.
So, basically those codes become longer and longer.
And we can actually go and compute, once again, the average length of this code.
So, to repeat the process, we basically
order the symbols according to the
probability,
that's what we did here.
We start adding the lowest probabilities.
So, we add these two, we get this, we reorder, and we keep going.
It's a recursive process so if you know
how to go from one column to the next,
you know how to go all the way.
And then, we come back.
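For reference, here is a minimal sketch of this construction in Python, using a priority queue for the take-the-two-lowest-and-add step; the symbols and probabilities are the ones from this example, though the exact 0/1 assignments can differ from the slide depending on how ties between equal probabilities are broken.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code from a dict {symbol: probability}.

    Repeatedly merges the two least probable entries (the 'adding' step),
    then walks back from the root, appending one bit at every split.
    """
    tie = count()  # tie-breaker so the heap never has to compare trees
    heap = [(p, next(tie), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)   # lowest probability
        p2, _, t2 = heapq.heappop(heap)   # second lowest
        heapq.heappush(heap, (p1 + p2, next(tie), (t1, t2)))
    root = heap[0][2]

    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):       # internal node: split, adding 0 and 1
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                             # leaf: the accumulated prefix is this symbol's code
            codes[node] = prefix or "0"
    assign(root, "")
    return codes

# The six symbols from the lecture's example, ordered by probability.
probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
codes = huffman_code(probs)
print(codes)  # a2 gets a 1-bit code; the rarest symbols get the longest codes
print(sum(probs[s] * len(c) for s, c in codes.items()))  # about 2.2 bits per symbol on average
```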
There are a couple of things that are very important to observe here.
The code that you get in this way is
what's called prefix-free.
And that's very, very important.
Look at this column.
When the decoder sees a one, it knows that it has to go here, to a2.
When it sees a zero, it basically says, okay, I have to wait for the next bit.
But there is no code here, which is a
full prefix of any other code.
And that's very important, otherwise, the
decoder won't be able to reconstruct.
Let's give an example of something which
is not a prefix code.
Let's say that the symbol a1 is
represented by the code one, the symbol
a2 is represented by the code zero, and
the symbol a3 is represented by the code
01,
okay?
So, when the decoder sees 01,
basically there are two possibilities.
It's either a3, let's say pixel value 100, or it's a2 followed by a1, let's say pixel 0 followed by pixel 255.
So, there is ambiguity in the reconstruction; we don't know which one it is.
And this is because this code is not what's called prefix-free: the code 0 for a2 is a prefix of the code 01 for a3.
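A tiny sketch makes the ambiguity concrete, using this hypothetical non-prefix-free table:

```python
# Non-prefix-free table: the code for a2 ('0') is a prefix of the code for a3 ('01').
codes = {"a1": "1", "a2": "0", "a3": "01"}

# The bit string '01' has two valid parses, so the decoder cannot tell them apart.
parse_1 = ["a3"]          # read '01' as the single symbol a3
parse_2 = ["a2", "a1"]    # or read '0' as a2 and then '1' as a1
print("".join(codes[s] for s in parse_1) == "".join(codes[s] for s in parse_2))  # True
```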
That doesn't happen here.
Let's just clear these annotations.
None of these codes is a full prefix of any other; for example, the 00, you don't find it at the start of any other code here.
So, when the decoder sees a one, it knows what to do.
It puts the pixel value of a2.
When it sees a zero, it says, oh, I'll have to wait, because there are many codes that start with zero.
If the next bit is another zero, it says, oh, I'm done, I basically have a6.
If, let's say, the decoder receives a zero and then a one, it says, oh, I have to wait, because there are several codes that start with 01.
But then, if it receives another one, it says, oh, I'm done, I don't have to keep waiting, because there's nothing else it can be.
There's no ambiguity here.
And that's very important; a Huffman code is prefix-free by construction.
Look at how we basically split, and split, and split: by construction, this is a prefix-free code.
Now, how do we know that this is an optimal code?
How do we know that, with this set of probabilities, the procedure I just explained achieves the shortest possible code length on average?
That's a theorem that has to be proved; we're not going to do that in this class, but there is a theorem that shows that Huffman coding basically achieves the shortest possible average length.
Now, you might wonder what length Huffman coding, or basically any encoder of this type, can achieve.
So, the basic idea is: if I give you a set of probabilities, how much can I compress?
Before I go and construct a Huffman code, I want to know, is it worth the effort of constructing the code?
Will I compress a lot?
And the way to do that is basically to compute what's called the entropy.
The entropy, which is written H, is the sum over all the symbols, with a minus sign in front, of the probability of each symbol times the log base 2 of the probability of that symbol: H = -sum_i p(a_i) log2 p(a_i).
So, it's basically the sum of 0.4 times log base 2 of 0.4, plus 0.3 times log base 2 of 0.3, and so on, with that minus sign in front.
So, this is actually, at the end, a positive number, because all the probabilities are below one; therefore the logs of the probabilities are negative numbers, and with the minus sign we get a positive number.
And Shannon's first theorem basically says that, asymptotically, we can achieve this code length on average.
So, this is basically what's achievable with these types of codes, like Huffman codes.
We can compute this number and say, oh, that's going to be my expected average code length.
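As a sketch, we can compute the entropy of the six-symbol example from before and compare it with the average length of a Huffman code for those probabilities (which came out to 2.2 bits per symbol):

```python
import math

probs = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]

# Entropy H = -sum_i p_i * log2(p_i): the lower bound on the average code length.
H = -sum(p * math.log2(p) for p in probs)
print(H)   # about 2.14 bits per symbol

# Code lengths of one valid Huffman code for these probabilities (1, 2, 3, 4, 5, 5 bits);
# the average is 2.2 bits per symbol, slightly above the entropy because each code
# must contain a whole number of bits.
lengths = [1, 2, 3, 4, 5, 5]
print(sum(p * n for p, n in zip(probs, lengths)))   # 2.2
```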
And it's actually not so hard to understand where this formula comes from.
We used a similar computation with the
previous example.
Here, the probability is basically the probability of the symbol appearing, and minus the log base 2 of that probability plays the role of the length of the code that is going to be used for that symbol.
So basically, what this is saying is that the length of the code for a symbol should be about minus the log of its probability.
And why this is only achieved in expectation, on average, is because you cannot have a code which is 3.2 bits long; you have to send whole bits, either three bits or four bits.
And that's why this is on average, the expectation, what you expect the code length to be.
But this basically gives you an idea of what encoders like Huffman coding can achieve.
So, now that we know how to encode the
bits that we are producing, we are ready
to go into the next blocks of image
compression.
