
Fractal Image Compression through

Iterated Function Systems


A tutorial for the IFS Application Framework (IFSAF):
software for experimenting with Fractal Image Compression
Last revision 0.0, September 1995
WARNING
WORK IN PROGRESS
THIS DOCUMENT IS A PRE-RELEASE, SO IT IS INCOMPLETE AND, VERY LIKELY, CONTAINS MANY
ERRORS AND IMPERFECTIONS (NOT TO MENTION THE LACK OF FORMATTING AND RATIONAL
ORGANIZATION).
THIS IS UNDERGRADUATE STUDENTS' WORK: BE WARNED OF INACCURACIES.
Index - workmap
About IFSAF
Authors
Disclaims
Features
Installation
Requirements
File formats
A coding/decoding example
+ How to read the STS file - only the c-decl by now
Approximation of shade blocks with polynomials
The algorithm for gray-scale images
- Calculating and quantizing alpha and beta - just as Prof. Fisher does.
- True color image encoding
- Using IFSAF
- The classification of blocks
- The tool for classifying these blocks, gradients etc., the two classifications compare tool
+ Simulated annealing
- Orthonormalization for beta of 2nd deg
- Different metrics
- Recursive encoding
- The INI file
- Error evaluation over blocks
+ About the source code
The color encoding function
- The porting on Linux/UNIX: surely command-line, perhaps X.
["-" : Not yet written; "+" : To be completed.]
About IFSAF
The IFSAF program was written by G. Pulcini, V. Verrando, R. Rossi and C. Meloni as the final half-year
work for the 1995 course in microelectronics at "La Sapienza" University, Rome.
The goal of this work was to explore some algorithms for fractal image compression through Iterated
Function Systems (IFS).
Our results: compression ratios up to 25:1 in 6 minutes, with good image quality.
IFSAF Theory and Applications 2
The compression algorithm is based upon a paper by A. E. Jaquin [1] and on some extensions outlined there
by, among others, Prof. Yuval Fisher [2], whom we thank for pointing us to his SIGGRAPH '92
course notes [3].
Even more information can be found in Prof. Fisher's book: Fractal Image Compression: Theory and Application
to Digital Images (Springer Verlag, New York, 1995).
Authors
Giovambattista Pulcini
e-mail: mc0878@mclink.it
www: http://www.webcom.com/~verrando/pulcini
phone: +39-775-726038
Valerio Verrando
e-mail: v.verrando@mclink.it
www: http://www.webcom.com/~verrando
phone: +39-6-3243486
Riccardo Rossi
phone: +39-6-5811847
Carlo Meloni
phone: ?
For any question, feel free to contact us by e-mail: we'll answer all the mail we get, but please allow us some
time to reply, up to 2-3 weeks in some cases, because of our student duties at the University.
Disclaims
We may have used some portions of copyrighted source code (certainly portions of Prof. Fisher's code and some
libraries of the Borland C++ 3.0 compiler). We wrote this program for educational purposes only, so any other
use may be prohibited, including any reproduction/distribution of this software other than for free.
We think this can only be distributed FREELY, and only for the EDUCATIONAL purpose of experimenting with
fractal image encoding.
That is also why we didn't make the complete source code available.
We are not lawyers, anyway, so this may not be entirely correct.
WE ARE NOT RESPONSIBLE FOR ANY KIND OF DAMAGE THIS SOFTWARE MAY CAUSE TO YOU. IF YOU DON'T AGREE WITH THIS,
SIMPLY DON'T USE IT AND DELETE YOUR COPY OF THIS SOFTWARE.
Just to put us on the safe side. :-)
Features
True color image compression by means of a new approach: we do not make 3 separate
compressions.
During the encoding, it graphically shows each step of the computation while building the "collage".
Gray-level encoding with several different selectable algorithms.
Different strategies for searching the domain pool:
Exhaustive.
Reduced by Local Sub Search.
Reduced to a neighbourhood of the range block.
Reduced by Isometry prediction.

[1] A. E. Jaquin, "Fractal Image Coding: a review", Proceedings of the IEEE, vol. 81, no. 10, October 1993, pp. 1451-1465.
[2] http://www.ucsd.edu/y/Fractals/ - e-mail: yfisher@ucsd.edu
[3] ftp://ftp.ucsd.edu/y/Fractals/fractal_paper.ps.Z
We also considered Prof. Fisher's block classification, but we haven't implemented it yet.
Simulated Annealing.
Selectable dimensions of range blocks (domain blocks are fixed to be 4 times the area of range blocks).
Selectable number of bits for encoding the massic transform parameters.
Polynomials of the 2nd degree for skipping the search on flat (shade) blocks.
Polynomials of the 2nd degree used as the shift term in the massic transform, with an orthonormalization
scheme for computing the coefficients (does not work well yet).
Recursive encoding with quadtree partition, limited to 2 levels for now.
Several selectable metrics to evaluate differences between blocks, both for encoding and for reporting
only.
A log of each processing step, in which you can find a lot of useful information. Warning: these files may get
very long.
In the early stages of this project we implemented some methods to classify blocks, as in Jaquin's article, to
reduce the search space. While we dropped that way of encoding during the implementation of the
procedure, these tools are still available:
Block classification based on the block histogram:
dynamic range evaluation (histogram);
evaluation of the 2 highest peaks (twin peaks);
projection onto a 9-dimensional vector subspace with a Frei and Chen basis.
The software provides a tool for interactively viewing these values.
Also, you can compute 2 different classifications of blocks on the same image and then show
each class for each block of the image, or the differences between the two.
In the study of block orientations, for predicting the isometry, we developed an interactive tool that shows
(for gray-scale images) some gradients for each block you click on.
Notices:
+ Several features cannot be mixed together, even if the program allows you to do that. This is because we
made the program for flexibility: it was intended to be run with the program in one hand and the compiler in
the other. :-)
So don't blame us if the program hangs or behaves strangely.
+ The program makes almost no checks on what you are doing, e.g. not even whether you give it a color image
when it needs a gray-scale one.
+ Even if it may look somewhat user-friendly, due to the Windows interface, it's fooling you! :-)
There is nothing user-friendly about it; you are on your own in assuring the consistency of what you are doing. That's
why you should have a deep knowledge of how fractal image compression works.
+ There is, anyway, an easier way to get some results from it: you can select short menus and follow the
procedure outlined in the section: An encoding/decoding example.
+ It only supports the Windows DIB RGB encoded format, both true-color (24bpp) and gray-
scale (8bpp). See the section Requirements for more info.
We know very well that the program has a lot of bugs and inconsistencies.
Some options in dialog boxes are not available even if they are not dimmed, and some others can only be
changed by editing the .INI file.
See the section Using IFSAF for more info on this subject.
Requirements
A 386/387 CPU (at least a 486-66 for speed) with 8Mb of RAM and an SVGA card capable of a 15bpp or better video
mode (this is only for best viewing of the results, since IFSAF doesn't handle dithering on 8bpp displays;
anyway, you may view them with another program such as LView or PSP).
Your machine should run Windows 3.10 or higher in Enhanced mode, with a swap file of at least 15Mb and
some available space on the HD: say 2-3 Mb at least.
No other program should be running while IFSAF is running. You should keep as much physical RAM free
as you can, to prevent Windows from thrashing. Some swapping during decoding is OK.
These are not absolute minimal requirements, but those for a reasonably working configuration for encoding
256x256 or 512x512 images, depending on option settings.
IFSAF is not a well-behaved Windows program: during encoding it monopolizes the CPU and
doesn't release it until the encoding is complete, preventing other programs from running.
There is no way to stop the encoding or decoding once started, except that you can kill the program by pressing
Ctrl-Alt-Del once and then Return.
The input image must be a .BMP Windows file in true-color RGB format, not RLE compressed, which
is not supported. The dimensions of the image must be powers of two, or at least divisible by the size of the range
blocks: they must contain an integer number of range blocks.
With parameters at extreme values, you may get some sqrt: DOMAIN errors, or the program may hang.
Sometimes it is better to close the status window before you start the encoding or the decoding, since the
program may hang in some cases.
Installation
You have to put the IFSAF program and all its related files (unzipping them with pkunzip 2.04g) under the
top-level directory \IFS on any drive, otherwise IFSAF won't be able to find some files it needs.
The images should be put in the same directory too. All output will be generated under this directory.
File formats
Input file:
The only supported file format is the Windows DIB RGB encoded format (.BMP), both for true color
images at 24bpp and for gray-scale ones at 8bpp. Be warned that any other format, or even a subformat such
as the RLE encoded .BMP, is not supported.
For color images you have to use 24bpp ones only; 8bpp color-mapped images will be treated as gray-scale
ones.
For gray-level images, we take the pixel color index as the level of gray, not the value in the
palette. This should not cause problems, because most programs, when converting to a gray-level
image, do the same, assigning an "identity" map to the palette:
index 0 is RGB 0,0,0, index 1 is RGB 1,1,1, and so on.
Output files:
IFSxxx.BMP : the collage or decoded bitmap, as you save them.
IFSxxx.IFS : the file containing the transformations, in binary form.
IFSxxx.STS : some accurate statistics in binary format that will be shown by the status window; there is
also other statistical info that isn't currently shown. You can extract it with a simple C
program: see the section How to read the STS file.
IFSxxx.RPT : a text file that is a log-report of the encoding; several useful pieces of information can be
found in the header and footer of the file. The body of the file contains lines like these:
R(0,0)<-D(96,224): 0 0.152941 3.000000 [2.473020] E:2.603206 1%
[0f] 1844.000000 1314.000000 6071.000000 8016.000000 :: 17245.000000
R(0)<-D(0,136,1): 2 1.000000 -23.000000 [4.730222] E:4.730222 2%
R(1)<-D(152,112,3): 0 0.866667 1.000000 [8.749443] E:8.755411 3%
R(2)<-D(16,104,3): 0 0.976471 -7.000000 [5.178644] E:5.203635 2%
R(3)<-D(48,96,3): 4 0.866667 65.000000 [11.118842] E:11.123037 4%
R is the range block and (x,y) are its block coordinates (actually the 0,0 block is the one in the lower-left
corner); D is the domain block and (x,y) are the pixel coordinates of its lower-left corner (relative to the lower-left
corner of the whole image). The three numbers following are: the isometry (in the range 0..7), the value of the
alpha parameter, and the value of the beta parameter (both after quantization and reconversion to real
numbers). Inside the [] there is the rms error between the mapped range block and the original block, before
the quantization of the alpha and beta parameters, while the number next to E: is the rms error after the
quantization of the parameters, followed by the same rms error expressed as a percentage of the number of
gray levels and rounded to an integer value.
The second line, present only in recursive two-level encoding, shows in [] the bitmap of which sub-blocks
will be encoded, followed by the values representing the amount of ms error on each sub-block; the last
number is the total.
Depending on how many blocks will be encoded, there may be from 0 to 4 more lines, carrying the same
information as the first one, with the exception that R(x) now indicates which sub-block of R(x,y) the values
relate to.
For color images the error is a composition of the single-channel errors.
An encoding/decoding example
To install IFSAF, see the section Installing IFSAF.
Now you need a true color image; we suggest the image "lena" 256x256, since it is used as a reference
picture. You can find lena_ori.ras in the package YUV.ARJ in ...... or get the true-color 512x512 from .....,
then with image editing software such as the shareware Paint Shop Pro (see the programs-of-support
section for more information) you need to convert the image to gray-scale, resize it to 256x256, and then
save it as lena.bmp in the Windows DIB RGB encoded format, not the RLE encoded one.
Optionally you could prepare another 256x256 true-color image, which will be used for the decoding.
Now follow these steps:
1. Start IFSAF from the \IFS directory - we assume in the following that you have short menus; if you don't,
simply click on Option/Short Menu.
2. Select File/Open, choose lena.bmp, and press OK.
3. Sometimes you had better close the status window, since because of some bug the program may
hang on completion of the encoding, when updating the Status window.
4. Click on the window showing the image.
5. Select Image/Encode from the menu.
6. Look at the pop-up windows that will be displayed while the encoding takes place. The program defaults
to a mix of high speed and good compression ratio, so the image quality at this stage may be very poor.
7. Note that the encoding is not recursive on color images: since we have had some trouble with the
procedure, we disabled this feature.
8. On completion, a dialog box will show the names of the set of output files (.RPT, .IFS, .STS); after
clicking OK, the collage will be shown.
You can optionally save the collage with File/Save. Sorry, but you cannot specify the name; a progressive
number will be assigned automatically. You can rename it afterwards, of course.
Now you can decompress the image from the .IFS file, starting from an arbitrary other file (with the same
dimensions, anyway).
9. Open such a file by selecting File/Open, or simply click on the original lena image window if you don't have
another one.
10. Select Image/Decode From File... and choose the .IFS file with the name shown in the dialog box at the
end of the encoding (most likely it will be ifs001.ifs).
11. Look at the decoding pop-up window: it shows 10 images converging to the original image - well, an
approximation of it.
The final image represents a good approximation of the fractal attractor of the iterated function system
calculated for the image.
12. Now you can optionally save the decompressed image by selecting File/Save.
13. It may be better to quit and restart the program before encoding another image.
Good luck and have fun. :-)
IFS - Iterated Function Systems
Here we will briefly describe how fractal image compression through IFS is done.
A picture is considered as a point in a complete metric space, where the metric is the Hausdorff metric.
So, we look for a set of transformations that, when iterated and applied to an arbitrary starting point (an
arbitrary picture), converges toward a point that depends only on the set of transformations, not on the
starting point. This point is called the attractor.
The question is: given an arbitrary image, is it possible to find a set of transformations whose attractor is
the given image itself? The collage theorem answers yes, stating that the better we can cover (as in a
collage) the original image with scaled, rotated and translated copies of the image itself (or of its parts), the
closer the attractor will be to the original image.
We build the collage image with the following algorithm:
given a gray-scale image, we subdivide it into blocks (the range blocks) of - say - 8x8 pixels; then, for
each of them, we search the whole image for a block (the domain block) of - say - 16x16 pixels which,
reduced (to the 8x8 size), transformed by one of 8 isometries and scaled in luminance, is the best match for
the original 8x8 range block.
So, the original image is encoded as a set of transformations (scaling, isometry application, luminance
scaling and shifting). There are no pixel values in the encoded image; there are only the values of the
parameters involved in the transformation applied to each domain block to map the range block.
The decoding process consists in the reconstruction of the attractor - well, an approximation of the true
attractor after a finite number of iterations (typically 6-8 should suffice).
The range block size defaults to 8, but it can be changed either from Option/Block Settings or with the
RangeBlockSize entry in the INI file.
Even if there is a DomainBlockSize entry in the INI file and a combobox in Option/Block Settings for changing
the domain block size, it is actually hard-coded in the program to be 2 times the range block size.
More in detail, the transformations applied are:
scaling of the domain block to match the size of the range block (typically a reduction from 16x16 pixels to
8x8, averaging the pixels in 2x2 blocks);
one of the 8 isometries that map a square onto a square; we use the one that minimizes the rms error
between the range block and the transformed domain block.
These two are often called "geometric transforms".
non-homogeneous scaling of the luminance. We apply to each pixel of the partially transformed domain
block the following:
r[i,j] = alpha*d[i,j] + beta
where r[i,j] indicates the resulting pixel of the partially transformed domain block whose pixels are d[i,j].
The alpha parameter controls the contrast of the block and beta controls the luminance.
This is what is called the massic transform.
translation of the domain block to cover the range block.
The search for the domain block.
The simplest and slowest way of finding a matching block for the selected range block is to test each
domain block. This is the exhaustive search.
We can imagine that we move a 16x16 square window all around the original image and, for each position of
the window, apply the transformations and calculate the rms error. When we have explored the whole
image, we simply select the position (which identifies a domain block) that had the least rms error.
Since the domain blocks are contracted to half of their original size, we can move the window by a step of 2
pixels, with almost no change in the domain blocks while halving the search space. From here: the larger the
step, the smaller the search space and the faster the encoding. Too large a step, however, leads to too small a
search space and to poor image quality. This is the Domain Step parameter.
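The windowed scan above can be written as a short skeleton. Here blockError is a hypothetical callback standing in for "contract the domain block, apply the massic transform, measure the rms error against the current range block":

```cpp
#include <cfloat>

// Scans every domain-window position on a step grid and keeps the position
// with the least error. W, H: image size; D: domain block size; step: the
// Domain Step parameter; blockError(x, y): hypothetical callback returning
// the rms error of mapping the domain block at (x, y) onto the range block.
template <class F>
double bestDomain(int W, int H, int D, int step, F blockError,
                  int& bx, int& by) {
    double bestE = DBL_MAX;
    for (int y = 0; y + D <= H; y += step)
        for (int x = 0; x + D <= W; x += step) {
            double e = blockError(x, y);
            if (e < bestE) { bestE = e; bx = x; by = y; }
        }
    return bestE;
}
```

With step = 2 this is the exhaustive search described above; increasing step trades image quality for speed, which is exactly the Domain Step trade-off.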
So one may think: I can move with large steps, evaluating the rms error, and when I get the minimum, I search
the neighborhood with a smaller step.
This is what the Local Sub Search (LSS) does. The Overstep parameter simply tells the program how
many Domain Steps to skip in the search for the first minimum; after that, the search is done with step Domain
Step in a square of size Overstep * Domain Step pixels,
as shown in the picture.
Another way, probably better, is to switch to a finer step each time the rms error falls below a threshold (not
implemented in IFSAF).
In order to reduce the search time, we can try another approach: we could search for matching
domain blocks only in a square neighborhood of the original range block. This is what we call Local Search. The
diameter of this region is the Diameter parameter in the INI file.
We can combine LSS with Local Search, of course.
A completely different way to reduce the search space is to use the Simulated Annealing algorithm. See the
special section on this topic.
For each domain block we should test all values of alpha, all values of beta, and all isometries. This leads to a
search space which is simply too big.
We can avoid the search over alpha and beta values entirely, since there is a simple relation between the rms
error and the values of the 2 parameters alpha and beta. Expressing the rms error as a function of alpha
and beta, we can simply calculate the values of alpha and beta for which the rms error reaches its minimum.
This calculation consists in the inversion of a 2x2 matrix.
May this matrix be singular? We don't know: the program reported this event only in a few cases, and it could
be caused by round-off errors. In any case, this never prevented the program from finding another block for which
the matrix was not singular.
For stripping the isometries out of the search space, things are not so simple. We should be able to
decide which isometry will cause the least rms error without trying it; the trial is exactly what we want to get rid of.
For the human eye this may not be a difficult task, but for a computer things are a little different. It is a
typical pattern-matching problem and, as far as we know, at this time there is no definitive solution to it.
You may try to use fuzzy logic, neural networks, and several other interesting methods to do this; we
encourage the use of these methods. For now we implemented a trivial classification scheme which led to
good results.
This is what we call Iso(metry) Prediction.
We subdivide each block into 4 sub-blocks, calculate the mean luminance of each sub-block and sort these values in
increasing order. So each block, whether range or domain, has a list and a sorted list of the luminances of
the four sub-blocks. Comparing the two lists (the sorted one and the original one), for each block, we get the
number of the permutation that transforms the unsorted list into the sorted one. This is called the
lexicographic order of the permutation. It can be computed as
perm# = inv(x1)*3! + inv(x2)*2! + inv(x3)*1! + inv(x4)*0!
where inv(xi) is the number of inversions relating to the index in the i-th position.
For example, a block whose sub-block list requires 2, 1, 1 and 0 inversions gives: 2*3! + 1*2! + 1*1! + 0*0! = 15.
Each block (range and domain) now has its own lexicographic order, so we can build a table of 24x24
isometries which tells us which isometry to use to "best map" a range block with a given permutation
number onto a domain block with another permutation number.
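This computation mirrors the GetOrientation routine shown in the source-code section; a standalone sketch (our own variable names, sorting brightest-first as that routine does):

```cpp
// Lexicographic order of the permutation that sorts four sub-block
// luminances into decreasing order (brightest first), computed as
// perm# = inv(x1)*3! + inv(x2)*2! + inv(x3)*1! + inv(x4)*0!
// where inv(xk) counts the swaps made at position k by selection sort.
int permNumber(const int lum[4]) {
    int a[4] = {0, 1, 2, 3};             // sub-block indices
    const int fact[3] = {6, 2, 1};       // 3!, 2!, 1! (0! never contributes)
    int s = 0;
    for (int k = 0; k < 3; ++k) {
        int inv = 0;                     // inversions at position k
        for (int i = k + 1; i < 4; ++i)
            if (lum[a[k]] < lum[a[i]]) { // a brighter sub-block comes later
                int t = a[k]; a[k] = a[i]; a[i] = t;
                ++inv;
            }
        s += inv * fact[k];
    }
    return s;                            // 0 (already sorted) .. 23
}
```

An already-decreasing list needs no inversions and gets order 0; a fully reversed list gets the maximum order, 23, so the 24 possible values index the 24x24 isometry table.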
So, what is wrong with that?
Well, the "best mapping" is not assured to give the least rms error. Here is why:
we say that a block best matches another with isometry i if, with that isometry, the two brightest sub-blocks
are mapped one over the other and, possibly, the other sub-blocks, in order of luminance, are mapped one over the other as well.
This does not assure that the rms error is minimal: since we are dealing with means, we may miss some
internal orientation of the blocks. A perfect overlapping of the sub-blocks is not always possible; there are
cases in which we can only impose the matching of the brightest sub-block and, in this case, several
isometries lead to this, so which one should be selected? We take the first. Anyway, this is an open
problem.
A question remains: we first apply the isometry and then calculate alpha and beta. Is this
correct, or may it be that, with an alpha and beta that don't minimize the rms error and another isometry, the rms
error could be less than the previous one? We don't know.
Another method, both to reduce the search space and to find the isometry directly, is the classification of
blocks by means of some of their statistical moments. This is what Dr. Fisher does in his enc.c (we
didn't implement this classification scheme).
Another classification is based on the "typology" of the blocks, as suggested by Jaquin. We can classify
each block as a shade block, a midrange block or an edge block. So, as Dr. Jaquin says, we could
approximate the shade blocks with polynomials (no search is needed for them), we could map the midrange
blocks (think of them as texture blocks) without checking the isometry (we assume them to be somewhat
orientation-invariant) and, finally, do the full search for edge blocks only.
As we said, there are tools in IFSAF to help find a classification algorithm for those, but the classification
does not work during encoding. We did, however, implement the polynomial approximation of shade blocks.
When Polynomials are active, each range block is approximated with a 2nd degree polynomial in x and y.
If the approximation rms error is below a selectable threshold (the ErrorThreshold parameter), we simply
encode the range block as the coefficients of the polynomial; otherwise we encode the block traditionally.
That way the classification is somewhat implicit.
While using polynomials, assuming that all range blocks classified as shade will be approximated with
polynomials, we may get rid of shade blocks in the domain pool; that is what the option Strip-off shades
should do (it actually doesn't).
The quantization of the polynomial coefficients is a big deal. We should use as few bits as we can, in order
to keep the compression ratio high, but if we assign too few bits to them the approximation with real numbers
becomes looser and the projection theorem (which we use to calculate the coefficients of the polynomial)
may lose its validity in a discrete space.
On the other hand, with few bits the polynomials will approximate fewer blocks well, increasing the
compression time. Another question arises: how to allocate n bits over 6 coefficients so that the
quantization error is generally low? We allocated bits proportionally to the degree of the monomial
associated with each coefficient, imposing an equal error for each monomial.
Approximation of shade blocks with 2nd degree polynomials
We want to approximate the function f(i,j), defined over a square NxN domain, with a polynomial
p(i,j) = x1 + x2*i + x3*j + x4*i^2 + x5*ij + x6*j^2,  with i,j in {0, 1, ..., N-1}.
We impose that, in each pixel (i,j), p(i,j) = f(i,j), and so we have m = N^2 equations in n = 6 unknowns.
The system, in general, has no solution: "too many equations for too few unknowns".
So we will use approximation methods to solve the system.
Let us number the pixels with a single index h in {0, ..., m-1}, so that pixel h has coordinates
i = h div N, j = h mod N, and let:
b_h = f(h div N, h mod N)
a_h1 = 1
a_h2 = h div N
a_h3 = h mod N
a_h4 = (h div N)^2
a_h5 = (h div N)*(h mod N)
a_h6 = (h mod N)^2
we have:
a_11 x_1 + a_12 x_2 + ... + a_16 x_6 = b_1
a_21 x_1 + a_22 x_2 + ... + a_26 x_6 = b_2
.......
a_m1 x_1 + a_m2 x_2 + ... + a_m6 x_6 = b_m
with an error:
e_1 = b_1 - (a_11 x_1 + a_12 x_2 + ... + a_16 x_6)
e_2 = b_2 - (a_21 x_1 + a_22 x_2 + ... + a_26 x_6)
.......
e_m = b_m - (a_m1 x_1 + a_m2 x_2 + ... + a_m6 x_6)
We solve the system with vectorial methods:
in the space F = R^m we define the inner product
<x, y>_w = sum_{i=1..m} w_i x_i y_i,  with weights w_i > 0.
Collecting the coefficients into column vectors, for k in {1, ..., n}:
A_k = (a_1k, ..., a_mk)^T,  b = (b_1, ..., b_m)^T,  e = (e_1, ..., e_m)^T
that is:
x_1 A_1 + x_2 A_2 + ... + x_n A_n = b
e = b - (x_1 A_1 + x_2 A_2 + ... + x_n A_n)
with a global error equal to:
E = d^2(x_1 A_1 + x_2 A_2 + ... + x_n A_n, b) = <e, e> = sum_{i=1..m} w_i e_i^2
The best approximation to a solution of the system is given by the orthogonal projection of the vector b
onto the subspace U spanned by the basis A_1, ..., A_n.
Let
x* = P(b, U) = x*_1 A_1 + x*_2 A_2 + ... + x*_n A_n
be the best approximation of b in U. It can be found by solving the system:
<A_1, A_1> x_1 + <A_2, A_1> x_2 + ... + <A_n, A_1> x_n = <b, A_1>
<A_1, A_2> x_1 + <A_2, A_2> x_2 + ... + <A_n, A_2> x_n = <b, A_2>
........
<A_1, A_n> x_1 + <A_2, A_n> x_2 + ... + <A_n, A_n> x_n = <b, A_n>
that is, letting
G = [g_ij],  g_ij = <A_i, A_j>,  det G != 0,
where G is a Gram matrix and the A_k are independent. Then we let:
B = (<b, A_1>, ..., <b, A_n>)^T,  X = (x_1, ..., x_n)^T
with:
G X = B  =>  X = G^-1 B
Notice that the inversion of G can be done once. The computational cost of approximating an NxN block
with a polynomial of the 2nd degree in x,y is the cost of the product of an nxn (i.e. 6x6) matrix
by a 6x1 vector, plus the cost of computing n = 6 inner products between vectors with m = NxN
components.
We could consider orthonormalizing the basis A_k.
We could consider different masks for the weights w_i in order to try to reduce the border effects between
blocks.
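A sketch of the whole scheme with unit weights (w_i = 1): build the Gram matrix of the six monomial basis vectors over an NxN block and solve the normal equations (here by Gauss-Jordan elimination rather than a precomputed G^-1; function and variable names are ours):

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Fits p(i,j) = x1 + x2*i + x3*j + x4*i^2 + x5*ij + x6*j^2 to an NxN block
// f (stored row by row) with unit weights: builds the Gram matrix
// G[a][b] = <A_a, A_b> and the right-hand side B[a] = <f, A_a>, then solves
// the normal equations G x = B by Gauss-Jordan elimination with pivoting.
std::vector<double> fitPoly2(const std::vector<double>& f, int N) {
    const int n = 6;
    double G[6][7] = {};                        // augmented matrix [G | B]
    auto basis = [](int k, int i, int j) -> double {
        switch (k) {                            // the six monomials
            case 0: return 1.0;
            case 1: return i;
            case 2: return j;
            case 3: return double(i) * i;
            case 4: return double(i) * j;
            default: return double(j) * j;
        }
    };
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            for (int a = 0; a < n; ++a) {
                for (int b = 0; b < n; ++b)
                    G[a][b] += basis(a, i, j) * basis(b, i, j);
                G[a][n] += basis(a, i, j) * f[i * N + j];
            }
    for (int c = 0; c < n; ++c) {
        int p = c;                              // partial pivoting
        for (int r = c + 1; r < n; ++r)
            if (std::fabs(G[r][c]) > std::fabs(G[p][c])) p = r;
        for (int k = 0; k <= n; ++k) std::swap(G[c][k], G[p][k]);
        for (int r = 0; r < n; ++r)
            if (r != c) {                       // eliminate column c
                double m = G[r][c] / G[c][c];
                for (int k = c; k <= n; ++k) G[r][k] -= m * G[c][k];
            }
    }
    std::vector<double> x(n);
    for (int c = 0; c < n; ++c) x[c] = G[c][n] / G[c][c];
    return x;
}
```

Since the Gram matrix depends only on N (and the weights), in a real encoder it would indeed be built and inverted once, as noted above.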
Quantization of coefficients
The coefficients have to be quantized in order to reduce the number of bits used to encode the block.
Since in the decoding we evaluate the polynomial
p(i,j) = x1 + x2*i + x3*j + x4*i^2 + x5*ij + x6*j^2,  with i,j in {0, 1, ..., N-1},
we impose an equal percentage error on each monomial, accepting, in the worst case, a global error 6 times
greater.
It would be interesting to evaluate how far the result is from the optimal value due to the quantization of the
coefficients: there may be another set of discrete coefficient values that leads to a smaller error.
The Simulated Annealing
It is a stochastic method for optimizing the parameters of a system, based upon the physical process of
annealing, that is, the crystallization of a solid from its melted state: we gradually decrease the
temperature of the system, so that the crystal structure changes slightly, going through states of thermal
equilibrium.
For us, the states at a low level of energy are the solutions that are closer to the optimal one, while the
temperature is our control parameter.
The main advantage of simulated annealing is the ability to reach the global minimum, with a low
probability of remaining caught in a local minimum.
We have a system in a certain configuration x, with energy E(x). From this state we generate a new state
y, with energy E(y), by moving one of the particles only a little from the position it has in x. If E(y) < E(x) we
accept the new state; otherwise we accept it with a probability equal to
e^(-(E(y)-E(x))/KT)
where K is Boltzmann's constant and T is the temperature of the system.
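The acceptance rule described above (the Metropolis criterion) can be sketched as follows; the Boltzmann constant is folded into the temperature, and the uniform random sample is supplied by the caller (names are ours):

```cpp
#include <cmath>

// Metropolis acceptance rule used by simulated annealing: always accept a
// state with lower energy; accept a worse state with probability
// exp(-(Ey - Ex)/T). u is a uniform random sample in [0,1) supplied by the
// caller; the Boltzmann constant K is folded into the temperature T here.
bool acceptState(double Ex, double Ey, double T, double u) {
    if (Ey < Ex) return true;                 // downhill: always accept
    return u < std::exp(-(Ey - Ex) / T);      // uphill: sometimes accept
}
```

At high temperature almost any uphill move is accepted (the system explores freely); as T decreases, uphill moves become rarer and the system settles, which is what lets annealing escape local minima early on.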
The format of the STS files
Here is the C declaration of the struct that is written with fwrite to the .STS file; be warned that not all of
the implemented encoding algorithms fill in all the fields of this structure, e.g. the color encoding is one that
does not fill them all in.
enum TIsometries { iso_IDENTITY, iso_FLIPV, iso_FLIPH, iso_1stDIAG, iso_2ndDIAG,
iso_ROT90, iso_ROT180, iso_ROT270, iso_LAST=iso_ROT270 };
struct TEncStats // Encoding Image...
{
DWORD dwIFSBlk, dwPolyBlk, dwSubBlk, dwCountBlk,
dwAlphagt1, dwAppAlphagt1,
dwDet0, dwAppDet0,
dwAlphalt0, dwAppAlphalt0,
dwAlphaltm1, dwAppAlphaltm1,
dwAlphaCount, dwAppAlpha,
dwSearch,
dwLSSsuccess, dwLSSsameiso;
DWORD isofrequency[iso_LAST+1];
DWORD packed, original;
DWORD ElapsedTime;
double ErrorThreshold;
double GlobalError, GlobalErrorwoLSS;
char szOutFileName[80];
char szInFileName[80];
// Kernel settings
WORD B,W,H,did,djd,nd,md,nr,mr,overstep;
DWORD RIsize, DIsize, Dsize;
WORD Metric, ClassScheme, Approx;
double EdgeThreshold, MidrangeThreshold, ShadeThreshold;
double w[3];
double alphamin,alphamax,betamin,betamax;
WORD alphabits,betabits,polybits;
WORD ENC_USEPOLY,ENC_QUANTIZEALPHA,ENC_BOUNDALPHA,ENC_QUANTIZEPOLY,
ENC_POLYONLY, ENC_USEISOMETRIES, ENC_STRIPSHADES, ENC_OVERSTEP,
ENC_OVERSTEPUSEISOMETRIES, ENC_USEORIENTATIONSCHEME;
WORD wErrRange[15];
WORD wAlpha[100];
WORD wBeta[100];
WORD wDeltax[100];
WORD wDeltay[100];
WORD wDist[100];
BYTE EncodeAlgorithm;
WORD SimAnnSize;
// Recursion support chunk
WORD wSubBlock[4];
WORD MaxLevel; // Max level of recursive subdivision
int diameter;
double ErrorThresholdR;
};
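A minimal sketch of the "simple C program" mentioned earlier, reading the leading DWORD counters from the raw bytes of an .STS file. It assumes 4-byte little-endian DWORDs with no padding between the leading fields (plausible for the homogeneous DWORD prefix of this struct as written by the original 16-bit build, but verify against your own compiler's layout):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes the idx-th little-endian DWORD of a raw .STS byte buffer.
// For the prefix of TEncStats: idx 0 -> dwIFSBlk, 1 -> dwPolyBlk,
// 2 -> dwSubBlk, 3 -> dwCountBlk, and so on through the DWORD fields.
uint32_t stsDword(const std::vector<unsigned char>& buf, std::size_t idx) {
    std::size_t o = idx * 4;
    return uint32_t(buf[o]) |
           (uint32_t(buf[o + 1]) << 8) |
           (uint32_t(buf[o + 2]) << 16) |
           (uint32_t(buf[o + 3]) << 24);
}
```

Reading the whole structure back portably would require re-declaring it with the exact alignment and type sizes of the original compiler; this helper only covers the leading run of DWORD counters.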
The Color Encoding Function: Kernel::EncodeImageCR1
All functions for the IFS encoding, and some image processing tools, are members of the Kernel class, so most
settings are simply declared as data members of that class.
// file: crencode.cpp
// color Recursive encoding, 2 levels
// Written by G.Pulcini
// Last revised 1995/4/17 Rev 0.30
#include "rcencdec.h"
static BYTE GetOrientation(WORD SSum[4])
{
// orders the luminance sub-blocks: the index array a[] induces decreasing order
// a[0] brightest sub-block, ..., a[3] darkest sub-block (or their derivative)
int i,k,sw,s=0,a[4]={0,1,2,3},FattTable[]={1,1,2,6,24};
for(k=0; k<3; k++)
{
for(i=k+1, sw=0; i<4; i++)
if (SSum[a[k]]<SSum[a[i]]) Swap(a[k],a[i]), sw++;
s+=sw*FattTable[4-k-1];
}
return s; // gets lexicographic order of perm.
}
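For experimenting outside IFSAF, here is a standalone copy of GetOrientation with the Swap macro expanded: it selection-sorts the four sums into decreasing order and weights the swap counts by factorials, yielding the lexicographic index (0..23) of the sorting permutation:

```cpp
#include <cassert>
#include <utility>

// Standalone copy of GetOrientation (Swap expanded with std::swap).
unsigned char orientation_index(const unsigned short SSum[4])
{
    int a[4] = {0, 1, 2, 3}, FattTable[] = {1, 1, 2, 6, 24};
    int s = 0;
    for (int k = 0; k < 3; k++) {
        int sw = 0;
        for (int i = k + 1; i < 4; i++)
            if (SSum[a[k]] < SSum[a[i]]) { std::swap(a[k], a[i]); sw++; }
        s += sw * FattTable[4 - k - 1];   // weight swap count by (3-k)!
    }
    return (unsigned char)s;  // 0 = already descending, 23 = ascending
}
```

Two ranges with the same orientation index can be matched block-to-block without trying all eight isometries, which is how the IsometryClass lookup later avoids the full isometry loop.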
void RGBLevelScale(RTImage img, int n, int d, int L)
{
switch (L)
{
case 0:
{
for(WORD i=0; i<img.wHeight; i++)
for (WORD j=0; j<3*img.wWidth; j++)
img.Pixel(i,j) = img.Pixel(i,j)*n/d;
} break;
case 1:
{
for(WORD i=0; i<img.wHeight; i++)
for (WORD j=0; j<3*img.wWidth; j++)
img.Pixel(i,j) = 255-((255-img.Pixel(i,j))*n/d);
} break;
}
}
void BlockSubError(HPBYTE lpY, WORD Wbits, WORD ir0, WORD jr0, PTBLOCK B, WORD N, double E[4])
{
E[0]=E[1]=E[2]=E[3]=0;
WORD I, I2, N2=N/2;
DWORD Id;
for(WORD i=0; I=i*N, I2=i/N2*2, Id=(DWORD(ir0+i)<<Wbits)+jr0, i<N; i++)
for(WORD j=0; j<N; j++)
E[I2+j/N2] += sqr(double(lpY[Id+j])-double(B[I+j]));
}
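BlockSubError accumulates the squared error of each quadrant of the range block: the expression (i/N2)*2 + j/N2 maps a pixel to quadrant 0 (top-left), 1 (top-right), 2 (bottom-left) or 3 (bottom-right). The same bookkeeping on two plain row-major blocks, without the image-addressing machinery:

```cpp
#include <cassert>

// Per-quadrant squared error between two N*N blocks stored row-major.
// Same quadrant numbering as BlockSubError: 0 1 / 2 3.
void quadrant_errors(const unsigned char *R, const unsigned char *D,
                     int N, double E[4])
{
    E[0] = E[1] = E[2] = E[3] = 0.0;
    int N2 = N / 2;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double d = double(R[i * N + j]) - double(D[i * N + j]);
            E[(i / N2) * 2 + j / N2] += d * d;   // quadrant accumulator
        }
}
```

The encoder uses these four values to decide which quadrants exceed the error threshold and therefore need recursive re-encoding.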
void CombineYIQBlock2RGB(PTBLOCK BlockY, PTBLOCK BlockI, PTBLOCK BlockQ, PTBLOCK BlockRGB, WORD N)
{
WORD I1,I2;
for(WORD i=0; I2=i*N*3, I1=i*N, i<N; i++)
for(WORD j=0; j<N; j++)
{
int Y,I,Q, R,G,B;
WORD P = I2+(j<<1)+j, P1 = I1+j;
YIQtoRGB(BlockY[P1], BlockI[P1], BlockQ[P1], R, G, B);
BlockRGB[P] = Bound(R, 0, 255);
BlockRGB[P+1] = Bound(G, 0, 255);
BlockRGB[P+2] = Bound(B, 0, 255);
}
}
void SetBlock2RGB(PTBLOCK Block, PTBLOCK BlockRGB, WORD N)
{
WORD I1,I2;
for(WORD i=0; I2=i*N*3, I1=i*N, i<N; i++)
for(WORD j=0; j<N; j++)
{
int Y;
WORD P = I2+(j<<1)+j;
Y = Block[I1+j];
BlockRGB[P] = Y;
BlockRGB[P+1] = Y;
BlockRGB[P+2] = Y;
}
}
static void CalculateSums(HPBYTE lpcY, WORD wbits, WORD id, WORD jd, WORD B, HPDPI DPI)
// if (id,jd) range over 0..nd x 0..md, you should pass id*did, jd*djd to this procedure
{
DWORD I;
WORD B2 = B/2, I2, B4 = B/4, I4;
WORD Sum = 0;
DWORD Sum2= 0;
WORD SSum[4]={0,0,0,0};
DWORD SSum2[4]={0,0,0,0};
WORD SSSum[4][4]={0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
for (WORD i=0; I=DWORD(id+i)<<wbits, I2=(i/B2)<<1, I4=((i%B2)/B4)<<1, i<B; i++)
for (WORD j=0; j<B; j++)
{
DWORD J = jd+j;
WORD SI= I2+j/B2, // sub index
SSI=I4+(j%B2)/B4; // sub-sub index
WORD xx, x = lpcY[I+J];
Sum += x;
Sum2+= (xx=x*x);
SSum [SI] += x;
SSum2[SI] += xx;
SSSum[SI][SSI] += x;
}
if (DPI)
{
DPI->Orientation = GetOrientation(SSum);
DPI->Sum = Sum;
DPI->Sum2= Sum2;
for(int k=0; k<4; k++)
{
DPI->SSum[k] = (SSum2[k]<<12)|SSum[k];
DPI->SOrientation[k] = GetOrientation(SSSum[k]);
}
}
}
#pragma argsused
void Kernel::DoPrecalculation(BYTE huge * lpDibRGB, HPBYTE &lpY, HPBYTE &lpI, HPBYTE &lpQ,
HPBYTE &lpcY,HPBYTE &lpcI,HPBYTE &lpcQ,
HPDPI &DPI)
{}
static void Calculate(HPBYTE lpY, WORD Wbits, WORD i0, WORD j0, WORD N, WORD &Sum, DWORD &Sum2, WORD
SSum[4], DWORD SSum2[4], WORD SSSum[4][4])
{
Sum = 0;
Sum2= 0;
for(WORD k=0; k<4; k++)
{ SSum[k]=0;
SSum2[k]=0;
for(WORD k1=0; k1<4; k1++) SSSum[k][k1]=0;
}
DWORD I;
WORD I2,I4, B2=N>>1, B4=N>>2;
for(WORD i=0; I=DWORD(i0+i)<<Wbits, I2=(i/B2)<<1, I4=((i%B2)/B4)<<1, i<N; i++)
for(WORD j=0; j<N; j++)
{
WORD
SI = I2+j/B2,
SSI= I4+(j%B2)/B4,
xx, x = lpY[I+j0+j];
Sum += x;
Sum2+= (xx=x*x);
SSum[SI] += x;
SSum2[SI]+= xx;
SSSum[SI][SSI] += x;
}
}
static void BlockIsoCrossSum( HPBYTE lpY, WORD Wbits, HPBYTE lpcY, WORD wbits, WORD N, int iso,
WORD ir0, WORD jr0, WORD id0, WORD jd0, DWORD &SumRD)
{
register WORD i,j;
DWORD Ir,Id;
SumRD=0;
switch(iso)
{
case iso_IDENTITY:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0,
Id=(DWORD(id0+i)<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[Id+j];
break;
case iso_FLIPV:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0,
Id=(DWORD(id0+i)<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[Id+N-1-j];
break;
case iso_FLIPH:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0,
Id=(DWORD(id0+(N-1-i))<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[Id+j];
break;
case iso_1stDIAG:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0, i<N; i++)
for(j=0; Id=(DWORD(id0+j)<<wbits)+jd0, j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[Id+i];
break;
case iso_2ndDIAG:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[(DWORD(id0+(N-1-j))<<wbits)+jd0+(N-1-i)];
break;
case iso_ROT90:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[(DWORD(id0+j)<<wbits)+jd0+(N-1-i)];
break;
case iso_ROT180:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0,
Id=(DWORD(id0+(N-1-i))<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[Id+(N-1-j)];
break;
case iso_ROT270:
for(i=0; Ir=(DWORD(ir0+i)<<Wbits)+jr0, i<N; i++)
for(j=0; j<N; j++)
SumRD += WORD(lpY[Ir+j])*lpcY[(DWORD(id0+(N-1-j))<<wbits)+jd0+i];
break;
}
}
void GetBlock( HPBYTE lpcY, WORD wbits, WORD N, int iso,
double alpha, double beta,
WORD id0, WORD jd0, BYTE * Block)
{
register WORD i,j,I;
DWORD Id;
switch(iso)
{
case iso_IDENTITY:
for(i=0; I=i*N, Id=(DWORD(id0+i)<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[Id+j]*alpha+beta), 0, 255);
break;
case iso_FLIPV:
for(i=0; I=i*N, Id=(DWORD(id0+i)<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[Id+N-1-j]*alpha+beta), 0, 255);
break;
case iso_FLIPH:
for(i=0; I=i*N, Id=(DWORD(id0+(N-1-i))<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[Id+j]*alpha+beta), 0, 255);
break;
case iso_1stDIAG:
for(i=0; I=i*N, i<N; i++)
for(j=0; Id=(DWORD(id0+j)<<wbits)+jd0, j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[Id+i]*alpha+beta), 0, 255);
break;
case iso_2ndDIAG:
for(i=0; I=i*N, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[(DWORD(id0+(N-1-j))<<wbits)+jd0+(N-1-i)]*alpha+beta), 0, 255);
break;
case iso_ROT90:
for(i=0; I=i*N, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[(DWORD(id0+j)<<wbits)+jd0+(N-1-i)]*alpha+beta), 0, 255);
break;
case iso_ROT180:
for(i=0; I=i*N, Id=(DWORD(id0+(N-1-i))<<wbits)+jd0, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[Id+(N-1-j)]*alpha+beta), 0, 255);
break;
case iso_ROT270:
for(i=0; I=i*N, i<N; i++)
for(j=0; j<N; j++)
Block[I+j] = Bound(int(0.5+lpcY[(DWORD(id0+(N-1-j))<<wbits)+jd0+i]*alpha+beta), 0, 255);
break;
}
}
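To make the isometry indexing concrete, here is the iso_ROT90 case of GetBlock isolated from the contracted-image addressing and the massic (alpha, beta) scaling: output pixel (i, j) is taken from source pixel (j, N-1-i):

```cpp
#include <cassert>

// The iso_ROT90 indexing used by GetBlock, on row-major N*N blocks:
// out(i, j) = in(j, N-1-i).
void rot90(const unsigned char *in, unsigned char *out, int N)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            out[i * N + j] = in[j * N + (N - 1 - i)];
}
```

The other seven cases follow the same pattern with the index formulas shown in the switch above.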
double quantizemin[QCLASSES],quantizemax[QCLASSES],quantizestep[QCLASSES],quantizeinvstep[QCLASSES];
WORD quantizebits[QCLASSES];
static void QuantizationSetup()
{
// quantization settings & inits
quantizemin[ALPHAY] = -1.0; quantizemax[ALPHAY] = 1.0;
quantizemin[ALPHAI] = -1.0; quantizemax[ALPHAI] = 1.0;
quantizemin[ALPHAQ] = -1.0; quantizemax[ALPHAQ] = 1.0;
quantizemin[BETAY] = -255.0; quantizemax[BETAY] = 255.0;
quantizemin[BETAI] = -355.0; quantizemax[BETAI] = 355.0;
quantizemin[BETAQ] = -355.0; quantizemax[BETAQ] = 355.0;
quantizebits[ALPHAY] = 8;
quantizebits[ALPHAI] = 8;
quantizebits[ALPHAQ] = 8;
quantizebits[BETAY ] = 8;
quantizebits[BETAI ] = 8;
quantizebits[BETAQ ] = 8;
for(int k = 0; k<QCLASSES; k++)
{
quantizestep[k] = 1.0/(quantizemax[k]-quantizemin[k])*((1ul<<quantizebits[k])-1);
quantizeinvstep[k] = 1.0/quantizestep[k];
}
}
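The Quantize and Dequantize functions themselves do not appear in this listing. Given the tables above, where quantizestep is (2^bits - 1)/(max - min), a plausible uniform pair would look like this (a hypothetical sketch, not IFSAF's actual code):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical uniform quantizer consistent with QuantizationSetup:
// step maps [vmin, vmax] onto the integer codes 0 .. (1<<bits)-1.
unsigned long quantize(double v, double vmin, double step)
{
    return (unsigned long)(0.5 + (v - vmin) * step);   // round to nearest code
}

double dequantize(unsigned long code, double vmin, double invstep)
{
    return vmin + code * invstep;                      // back to a real value
}
```

With the ALPHAY settings (8 bits over [-1, 1]) the worst-case round-trip error is half a step, about 0.004.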
static void MinimizeCRms( TEncStats &S, int ka, int kb,
WORD N, double SumR, double SumR2, double SumD, double SumD2, double
SumRD,
DWORD &ialpha, DWORD &ibeta, double &alpha, double &beta, double&E )
{
double Sum,det,oneoverSum;
Sum = N*N;
oneoverSum = 1/Sum;
det = Sum*SumD2-SumD*SumD;
if (det==0)
{
alpha = 0; S.dwDet0++;
}
else
{
alpha = (Sum*SumRD - SumR*SumD)/det;
if (alpha>quantizemax[ka]) alpha=quantizemax[ka];
if (alpha<quantizemin[ka]) alpha=quantizemin[ka];
}
ialpha = Quantize(alpha, ka);
alpha = Dequantize(ialpha, ka);
beta = (SumR-alpha*SumD)*oneoverSum;
if (beta>quantizemax[kb]) beta=quantizemax[kb];
if (beta<quantizemin[kb]) beta=quantizemin[kb];
ibeta = Quantize(beta, kb);
beta = Dequantize(ibeta, kb);
E=(alpha*(alpha*SumD2+2*beta*SumD-2*SumRD)+beta*(beta*Sum-2*SumR)+SumR2)*oneoverSum;
#ifdef DETAILED_RPT
if (E<0) {fprintf(rpt, "E<0\n"); fflush(rpt); }
#endif
}
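MinimizeCRms solves, for each candidate domain block, the least-squares problem of minimizing the sum over the n = N*N pixels of (alpha*d + beta - r)^2; the closed form is alpha = (n*SumRD - SumR*SumD)/(n*SumD2 - SumD^2), beta = (SumR - alpha*SumD)/n, and the E expression above is the mean squared residual expanded in terms of the precomputed sums. A small self-contained check of that algebra (quantization and clamping omitted):

```cpp
#include <cassert>
#include <cmath>

// Closed-form least squares fitting r ~ alpha*d + beta over n samples.
// Returns the mean squared residual via the same expanded expression
// that MinimizeCRms evaluates from its precomputed sums.
double lsq_mse(const double *r, const double *d, int n,
               double &alpha, double &beta)
{
    double SumR = 0, SumR2 = 0, SumD = 0, SumD2 = 0, SumRD = 0;
    for (int k = 0; k < n; k++) {
        SumR  += r[k];  SumR2 += r[k] * r[k];
        SumD  += d[k];  SumD2 += d[k] * d[k];
        SumRD += r[k] * d[k];
    }
    double det = n * SumD2 - SumD * SumD;       // zero for a flat D-block
    alpha = (det == 0) ? 0.0 : (n * SumRD - SumR * SumD) / det;
    beta  = (SumR - alpha * SumD) / n;
    // Expanded mean squared residual, term-for-term the E of MinimizeCRms:
    return (alpha * (alpha * SumD2 + 2 * beta * SumD - 2 * SumRD)
            + beta * (beta * n - 2 * SumR) + SumR2) / n;
}
```

Working with the five sums lets the encoder evaluate every (alpha, beta, E) candidate in O(1) once the per-block sums and the cross sum SumRD are known.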
#define PROFILING
#ifdef PROFILING
#define PROFILE(x) \
Time1 = GetTimeTicks()-Time1; \
fprintf(rpt, x ": %lu\n", Time1); fflush(rpt); \
Time1 = GetTimeTicks();
#else
#define PROFILE(x) ;
#endif
#define ShowReset() {ShowEnd(); ShowBegin();}
void Kernel::EncodeImageCR1(TEncStats &S, LPSTR szDefName, const RTImage img1, RTImage img2, TRpoolR
huge * RPI, TDpoolR huge * DPI1, PTBLOCK DPool)
{
DWORD Time = GetTimeTicks();
#ifdef PROFILING
DWORD Time1= GetTimeTicks();
#endif
char szFileName[80];
strcpy(szFileName, szDefName);
strcat(szFileName, ".RPT");
rpt = fopen(szFileName,"w");
if (Weight) { delete Weight; Weight=NULL; };
CalcScheme();
ShowBegin();
ShowMessage("Converting image...");
ShowRatio(0.0);
HPBYTE lpDibRGB = img1.lpDib; // We should here test type of the image...
HPBYTE lpY, lpI, lpQ, // YIQ color components
lpcY, lpcI, lpcQ; // YIQ contracted color components
HPDPI DPI; // color Domain Pool Info
HPBYTE lpRGBbits = GetDibBitsAddr(lpDibRGB);
if ( (1u<<bitsof(djd)) != djd ||
(1u<<bitsof(did)) != did ||
(1u<<bitsof(B)) != B ||
B < 2*djd ||
B < 2*did
) // do something appropriate...
{ MsgBox("Error", "Invalid parameters #01"); return; }
// uppercase are relative to image
// lowercase are relative to contracted image
// radius = 128; // my stub *****************
int dRadius = radius; // from kernel
int diameter= 2*dRadius+1;
WORD bB = bitsof(B),
bdid = bitsof(did),
bdjd = bitsof(djd);
WORD W = GetDibWidth(lpDibRGB),
H = GetDibHeight(lpDibRGB),
Wbits = bitsof(W),
Hbits = bitsof(H),
W2 = WORD(1) << Wbits,
H2 = WORD(1) << Hbits,
mr = (W>>bB) + ((W%B)?1:0), // W/B + ((W%B)?1:0),
nr = (H>>bB) + ((H%B)?1:0), // H/B + ((H%B)?1:0),
W1 = mr<<bB, // mr*B,
H1 = nr<<bB, // nr*B,
w = W1>>1,
h = H1>>1,
wbits = bitsof(w),
hbits = bitsof(h),
w2 = 1u << wbits,
h2 = 1u << hbits,
md = (w-B)>>bdjd, // (w-B)/djd,
nd = (h-B)>>bdid, // (h-B)/did,
jdbits = Min(bitsof(diameter), bitsof(md)),
idbits = Min(bitsof(diameter), bitsof(nd));
QuantizationSetup();
DWORD dwImageSize = DWORD(H1)<<Wbits,
dwcImageSize= DWORD(h)<<wbits,
dwLS = (W*3/4+(((W*3)%4)?1:0))*4; // DIB line-size
PROFILE("Init")
// Allocating memory, we should check if ok
lpY = (LPBYTE) GlobalAllocPtr(GMEM_MOVEABLE|GMEM_ZEROINIT, dwImageSize);
lpI = (LPBYTE) GlobalAllocPtr(GMEM_MOVEABLE|GMEM_ZEROINIT, dwImageSize);
lpQ = (LPBYTE) GlobalAllocPtr(GMEM_MOVEABLE|GMEM_ZEROINIT, dwImageSize);
lpcY= (LPBYTE) GlobalAllocPtr(GMEM_MOVEABLE|GMEM_ZEROINIT, dwcImageSize);
lpcI= (LPBYTE) GlobalAllocPtr(GMEM_MOVEABLE|GMEM_ZEROINIT, dwcImageSize);
lpcQ= (LPBYTE) GlobalAllocPtr(GMEM_MOVEABLE|GMEM_ZEROINIT, dwcImageSize);
if (!lpY || !lpI || !lpQ || !lpcY || !lpcI || !lpcQ) // do something
{ MsgBox("Error", "Can't allocate memory #02"); return; }
PROFILE("Memory allocation")
// separating color components
ShowReset();
ShowMessage("1 of 4 - Splitting image into YIQ color components...");
DWORD dwSI, dwDI, dwSP, dwDP;
for(WORD i=0; dwSI=dwLS*i, dwDI=(DWORD(i)<<Wbits), i<H; i++)
{
ShowRatio(float(i)/H);
for(WORD j=0; j<W; j++)
{
int R,G,B;
int Y,I,Q;
dwSP = dwSI+(j<<1)+j;
dwDP = dwDI|DWORD(j);
RGBtoYIQ(lpRGBbits[dwSP], lpRGBbits[dwSP+1], lpRGBbits[dwSP+2], Y, I, Q);
lpY[dwDP]=Y;
lpI[dwDP]=I;
lpQ[dwDP]=Q;
}
}
ShowRatio(1.0);
PROFILE("Color separation")
// pre-contracting color components
ShowReset();
ShowMessage("2 of 4 - Pre-contracting YIQ color components...");
DWORD dwSI1,P0,P1,P2,P3,J;
for(i=0; dwSI=DWORD(i<<1)<<Wbits, dwSI1=DWORD((i<<1)+1)<<Wbits, dwDI=DWORD(i)<<wbits, i<h; i++)
{
ShowRatio(float(i)/h);
for(WORD j=0; j<w; j++)
{
J = j<<1;
P0 = dwSI|J;
P1 = dwSI|(J+1);
P2 = dwSI1|J;
P3 = dwSI1|(J+1);
dwDP = dwDI|DWORD(j);
lpcY[dwDP] = 0.5+(int(lpY[P0])+int(lpY[P1])+int(lpY[P2])+int(lpY[P3]))*0.25;
lpcI[dwDP] = 0.5+(int(lpI[P0])+int(lpI[P1])+int(lpI[P2])+int(lpI[P3]))*0.25;
lpcQ[dwDP] = 0.5+(int(lpQ[P0])+int(lpQ[P1])+int(lpQ[P2])+int(lpQ[P3]))*0.25;
}
}
ShowRatio(1.0);
PROFILE("Contraction")
// Now we should allocate enough memory to keep the Domain pool
DWORD dwDPsize = DWORD(nd)*md*sizeof(cDPoolInfo);
DPI = (HPDPI) GlobalAllocPtr(GMEM_MOVEABLE, dwDPsize);
if (!DPI) // do something appropriate...
{ MsgBox("Error","Can't allocate memory #03"); return; }
PROFILE("Allocating D-pool info")
// Calculating Domain pool
ShowReset();
ShowMessage("3 of 4 - Pre-calculating mean and square mean of Y...");
DWORD I;
for (WORD id=0; I=DWORD(id)*md, id<nd; id++)
{
ShowRatio(float(id)/nd);
for (WORD jd=0; jd<md; jd++)
CalculateSums(lpcY, wbits, id<<bdid, jd<<bdjd, B, &DPI[I+jd]);
}
ShowRatio(1.0);
PROFILE("Calculation of D-pool info")
// Effective encoding
ShowReset();
ShowMessage("4 of 4 - Encoding...");
// we must end the progress display here, because UpdateImage itself calls Showxxxx
img2.Copy(img1);
RGBLevelScale(img2, 1,2,1);
InitImage(&img2);
UpdateImage(0);
PROFILE("Image setup and level scaling")
strcpy(szFileName, szDefName);
strcat(szFileName, ".IFS");
FILE * cod = fopen(szFileName,"wb");
double GlobalError = 0, GlobalYError=0, GlobalIError=0, GlobalQError=0;
WORD N = B, N2=N/2,
bN=bB, bN2=bN-1;
BYTE * BlockY= new BYTE[N*N];
BYTE * BlockI= new BYTE[N*N];
BYTE * BlockQ= new BYTE[N*N];
BYTE * BlockRGB= new BYTE[N*N*3];
DWORD ID = 0x0AFFF08ul; // stepping 08 : Color IFS !!
DWORD PackedHeaderSize;
// WriteColorHeader
{
Pack(0, 0, cod); // inits
Pack(32, ID, cod);
Pack(16, W, cod);
Pack(16, H, cod);
Pack( 8, B, cod);
Pack( 8, did, cod);
Pack( 8, djd, cod);
Pack(16, dRadius, cod);
for(int k=0; k<QCLASSES; k++)
{
Pack(quantizemin[k], cod);
Pack(quantizemax[k], cod);
Pack(8, quantizebits[k], cod);
}
PackedHeaderSize = Pack(-2, 0, cod);
}
InitEncStats(S, img1.szFileName, szDefName, W, H, nr, mr, nd, md, idbits, jdbits,
ID, GetRPoolInfoSize(img1), GetDPoolInfoSize(img1), GetDPoolSize(img1));
WriteReportHeader(rpt, S, idbits, jdbits, ID);
PROFILE("Inits")
double ErrorT1 = sqr(ErrorThresholdR)*B*B/4.0;
for (WORD ir=0; ir<nr; ir++)
{
ShowRatio(float(ir)/nr);
for (WORD jr=0; jr<mr; jr++)
{
S.dwCountBlk++;
// This should be the else branch of an if that selects polynomial approximation or not.
{
// Full exhaustive search in the D-pool for the best D-block
S.dwIFSBlk++;
TIsometries isoBest = iso_IDENTITY;
WORD k, idBest, jdBest, y,x;
double alpha, beta;
DWORD ialpha, ibeta, ialphaBest, ibetaBest;
double E,EY,EI,EQ, EBest = 10000000.0; // just a big number
WORD SumR, SSumR[4], SSSumR[4][4];
DWORD SumR2, SSumR2[4];
y = ir<<bB; // ix*N, N=B
x = jr<<bB;
Calculate(lpY, Wbits, y, x, N, SumR, SumR2, SSumR, SSumR2, SSSumR);
BYTE OrientationR = GetOrientation(SSumR);
idBest = y >> (bdid+1); // y/(2*did);
jdBest = x >> (bdjd+1); // x/(2*djd);
WORD idStart = Max(int(idBest)-dRadius, 0),
idEnd = Min(int(idBest)+dRadius+1, (int)nd),
jdStart = Max(int(jdBest)-dRadius, 0),
jdEnd = Min(int(jdBest)+dRadius+1, (int)md);
if (idStart==0) idEnd = Min(diameter, (int)nd);
if (idEnd==nd) idStart=Max(0, int(nd)-diameter);
if (jdStart==0) jdEnd = Min(diameter, (int)md);
if (jdEnd==md) jdStart=Max(0, int(md)-diameter);
fprintf(rpt, "i::%d %d [%d] j::%d %d [%d]", idStart, idEnd, idEnd-idStart, jdStart,
jdEnd, jdEnd-jdStart);
fflush(rpt);
DWORD I;
for (WORD id=idStart; I=DWORD(id)*md, id<idEnd; id++) // Search in D-pool
for (WORD jd=jdStart; jd<jdEnd; jd++)
{
DWORD IJ = I+jd;
WORD SumD = DPI[IJ].Sum;
DWORD SumD2= DPI[IJ].Sum2;
DWORD SumRD;
TIsometries iso;
S.dwSearch++;
iso = TIsometries(
IsometryClass[OrientationR][DPI[IJ].Orientation]);
BlockIsoCrossSum(lpY, Wbits, lpcY, wbits, N, iso, y,x, id<<bdid,jd<<bdjd, SumRD);
MinimizeCRms( S, ALPHAY, BETAY, N, SumR, SumR2, SumD, SumD2, SumRD,
ialpha, ibeta, alpha, beta, E);
if (E < EBest)
{
idBest = id, jdBest = jd;
EBest = E;
isoBest = iso;
ialphaBest = ialpha; ibetaBest = ibeta;
}
} // nested for all id,jd
PROFILE("Full D-loop")
WORD id0 = idBest<<bdid, //idBest*did,
jd0 = jdBest<<bdjd;
EY = fabs(EBest);
alpha = Dequantize(ialphaBest, ALPHAY);
beta = Dequantize(ibetaBest, BETAY);
GetBlock(lpcY, wbits, N, isoBest, alpha, beta, id0,jd0, BlockY);
// Now we are ready to evaluate the possibility of going down with recursion
double SE[4];
WORD sbb=0; //sub-block-bitmap
BlockSubError(lpY, Wbits, y, x, BlockY, N, SE);
int h = 0;
for(k=0; k<4; k++)
if (SE[k]>ErrorT1) { sbb |= (1<<k); h++; }
if (h>2) { sbb = 0x0f; h=3;}
S.wSubBlock[h]++;
fprintf(rpt, "\t[%02x] %lf %lf %lf %lf :: %lf\n", sbb, SE[0], SE[1], SE[2], SE[3], SE[0]+SE[1]+SE[2]+SE[3]);
fflush(rpt);
double alphaY=alpha, betaY=beta, alphaI,alphaQ,betaI,betaQ;
Pack(4, sbb, cod);
if (sbb!=0x0F || TRUE)
{
Pack(idbits, idBest-idStart, cod);
Pack(jdbits, jdBest-jdStart, cod);
Pack(3, isoBest, cod);
Pack(quantizebits[ALPHAY], ialphaBest, cod);
Pack(quantizebits[BETAY] , ibetaBest, cod);
cDPoolInfo dpi;
DWORD dwSumRD;
double SumD,SumD2;
BlockIsoCrossSum(lpI, Wbits, lpcI, wbits, N, isoBest, y,x, id0,jd0, dwSumRD);
Calculate(lpI, Wbits, y, x, N, SumR, SumR2, SSumR, SSumR2, SSSumR);
CalculateSums(lpcI, wbits, id0, jd0, B, &dpi);
SumD = dpi.Sum;
SumD2= dpi.Sum2;
MinimizeCRms(S, ALPHAI, BETAI, N, SumR, SumR2, SumD, SumD2, dwSumRD,
ialpha, ibeta, alpha, beta, EI);
GetBlock(lpcI, wbits, N, isoBest, alpha, beta, id0, jd0, BlockI);
Pack(quantizebits[ALPHAI], ialpha, cod);
Pack(quantizebits[BETAI], ibeta, cod);
alphaI=alpha; betaI=beta;
BlockIsoCrossSum(lpQ, Wbits, lpcQ, wbits, N, isoBest, y,x, id0,jd0, dwSumRD);
Calculate(lpQ, Wbits, y, x, N, SumR, SumR2, SSumR, SSumR2, SSSumR);
CalculateSums(lpcQ, wbits, id0, jd0, B, &dpi);
SumD = dpi.Sum;
SumD2= dpi.Sum2;
MinimizeCRms(S, ALPHAQ, BETAQ, N, SumR, SumR2, SumD, SumD2, dwSumRD,
ialpha, ibeta, alpha, beta, EQ);
GetBlock(lpcQ, wbits, N, isoBest, alpha, beta, id0, jd0, BlockQ);
Pack(quantizebits[ALPHAQ], ialpha, cod);
Pack(quantizebits[BETAQ], ibeta, cod);
alphaQ=alpha; betaQ=beta;
}
/*
for(k=0; k<4; k++)
if (sbb & (1<<k))
SubBlockEncode(S, cod, k, BlockY, ir, jr, idBest, jdBest, &(RPI[IJR]), DPI, DPool,
img2, SE );
*/
E = fabs(EY+EI+EQ)/3;
UpdateEncStats(S, isoBest, 0, 0, E, ir, jr, idBest*2, jdBest*2 );
fprintf(rpt, "R(%d,%d)<-D(%d,%d): %d [%lf %lf %lf] E:%lf %0.0lf%%\n",
ir, jr, id0*2, jd0*2, int(isoBest),
sqrt(EY),sqrt(EI),sqrt(EQ), sqrt(E),sqrt(E)*100/255.0);
if (sqrt(E)>10) fprintf(rpt, "Y:%lf %lf I:%lf %lf Q:%lf %lf\n",alphaY, betaY, alphaI,
betaI, alphaQ, betaQ);
fflush(rpt);
CombineYIQBlock2RGB(BlockY,BlockI,BlockQ, BlockRGB, N);
//SetBlock2RGB(BlockY, BlockRGB, N);
img2.SetBlock(y, x, N, BlockRGB);
//GlobalError += (SE[0]+SE[1]+SE[2]+SE[3])/(B*B);
GlobalYError+= EY;
GlobalIError+= EI;
GlobalQError+= EQ;
GlobalError += E;
UpdateImage((ir*mr+jr)/(nr*mr-1.0), ir, jr,
id0*2, jd0*2, int(isoBest), alpha, beta,
sqrt(E), sqrt(GlobalError/S.dwCountBlk));
}
}
}
delete[] BlockY;
delete[] BlockI;
delete[] BlockQ;
delete[] BlockRGB;
GlobalFreePtr(lpY);
GlobalFreePtr(lpI);
GlobalFreePtr(lpQ);
GlobalFreePtr(lpcY);
GlobalFreePtr(lpcI);
GlobalFreePtr(lpcQ);
Timing = GetTimeTicks()-Time;
Pack(-1, 0, cod); // we are @ end.
fclose(cod);
S.packed = Pack(-2, 0, cod) - PackedHeaderSize;
S.original=DWORD(img1.wWidth)*img1.wHeight*3;
FixupEncStats(S, GlobalError, 0);
WriteReportFooter(S, GlobalError, 0);
fclose(rpt);
EndImage();
Beep();
strcpy(szFileName, szDefName);
strcat(szFileName, ".STS");
FILE * r = fopen(szFileName,"wb");
fwrite(&S, sizeof(TEncStats), 1, r);
fclose(r);
}
static void GetSubBlock(int k, PTBLOCK A, WORD M, PTBLOCK B, WORD N)
/*
0 1 values of k to get sub-block from A and put it on B
2 3
*/
{
WORD I,IB,i,j;
switch(k)
{
case 0:
for(i=0; I=i*N, IB=i*M, i<N; i++)
for(j=0; j<N; j++)
B[I+j]=A[IB+j];
break;
case 1:
for(i=0; I=i*N, IB=i*M, i<N; i++)
for(j=0; j<N; j++)
B[I+j]=A[IB+j+N];
break;
case 2:
for(i=0; I=i*N, IB=(i+N)*M, i<N; i++)
for(j=0; j<N; j++)
B[I+j]=A[IB+j];
break;
case 3:
for(i=0; I=i*N, IB=(i+N)*M, i<N; i++)
for(j=0; j<N; j++)
B[I+j]=A[IB+j+N];
break;
}
}
void Kernel::SubBlockEncodeC( TEncStats &S, FILE * cod, int K, PTBLOCK RBlock,
WORD ir, WORD jr, WORD idRBest, WORD jdRBest,
TRpoolR far * RPI, TDpoolR huge * DPI, PTBLOCK DPool,
RTImage img, double SE[4])
{
double E;
WORD N = B/2, N2=N/2;
BYTE * Block = new BYTE[N*N];
BYTE * DBlock= new BYTE[N*N];
BYTE * DBlock1=new BYTE[N*N];
BYTE * appBlock=new BYTE[N*N]; // transformed best D-Block
WORD nd=S.nd,md=S.md;
WORD nr=S.nr,mr=S.mr;
WORD W = S.W,
H = S.H;
//optimal number of bits for Domain-(id,jd) coords
WORD jdbits = ceil(log(md)/log(2.0)),
idbits = ceil(log(nd)/log(2.0));
//S.dwCountBlk++;
//Now extract subblock from r-block
GetSubBlock(K, RBlock, B, Block, N);
{
// Full exhaustive search in the D-pool for the best D-block
S.dwIFSBlk++;
TIsometries isoBest = iso_IDENTITY, iso_LAST1 = iso_LAST;
WORD k, idBest, jdBest, kBest;
double alpha, beta, alphaBest, betaBest;
DWORD ialpha, ibeta, ialphaBest, ibetaBest;
double SumR = 0, SumR2 = 0,
E0, E, E0Best, EBest = 10000000.0; // just a big number
SumR = RPI->SubBlock[K].Sum;
SumR2= RPI->SubBlock[K].Sum2;
WORD idStart = Max(int(int(idRBest)-diameter+1), 0),
idEnd = Min(idRBest+diameter-1, nd-1),
jdStart = Max(int(int(jdRBest)-diameter+1), 0),
jdEnd = Min(jdRBest+diameter-1, md-1);
if ((k = idEnd-idStart) < 2*diameter-2)
{
if (idStart==0) idEnd = Min(idEnd+2*diameter-2-k, nd-1);
if (idEnd==nd-1) idStart=Max(int(int(idStart)-2*diameter+2+k), 0);
}
if ((k = jdEnd-jdStart) < 2*diameter-2)
{
if (jdStart==0) jdEnd = Min(jdEnd+2*diameter-2-k, md-1);
if (jdEnd==md-1) jdStart=Max(int(int(jdStart)-2*diameter+2+k), 0);
}
if (ENC_USEISOMETRIES && !ENC_USEORIENTATIONSCHEME) iso_LAST1 = iso_LAST;
else iso_LAST1 = iso_IDENTITY;
//No local search on sub-blocks by now
//if (!ENC_OVERSTEP) overstep=1;
//if (ENC_OVERSTEP && overstep==0) overstep=1;
WORD I, BB=B*B, NN=N*N, md1=md/overstep, nd1=nd/overstep;
//for (WORD id=0; I=id*md*overstep,id<nd1; id++) // Search in D-pool
//for (WORD jd=0; jd<md1; jd++)
//fprintf(rpt, "%d : %d..%d %d..%d\n", diameter, idStart,idEnd, jdStart, jdEnd);
//fflush(rpt);
for (WORD id=idStart; I=id*md,id<idEnd; id++) // Search in D-pool
for (WORD jd=jdStart; jd<jdEnd; jd++)
for (WORD Sk=0; Sk<4; Sk++)
{
//WORD IJ = I+jd*overstep;
WORD IJ = I+jd;
DWORD IJNN = ((DWORD)IJ)*BB;
double SumD = DPI[IJ].SubBlock[Sk].Sum,
SumD2= DPI[IJ].SubBlock[Sk].Sum2;
for (TIsometries iso=iso_IDENTITY; iso<=iso_LAST1; iso++)
{
S.dwSearch++;
if (ENC_USEORIENTATIONSCHEME && ENC_USEISOMETRIES)
{
iso = TIsometries(
IsometryClass[RPI->SubBlock[K].Orientation]
[DPI[IJ].SubBlock[Sk].Orientation] );
}
GetSubBlock(Sk, &DPool[IJNN], B, DBlock1, N);
BlockTransform(DBlock1, DBlock, N, iso);
MinimizeRms(S, Block, DBlock, N, SumR, SumR2, SumD, SumD2,
alpha, beta, ialpha, ibeta, E0, E);
S.dwAlphaCount++;
if (E < EBest)
{
idBest = id, jdBest = jd; kBest = Sk;
EBest = E; E0Best = E0;
isoBest = iso;
alphaBest = alpha; betaBest = beta;
ialphaBest = ialpha; ibetaBest = ibeta;
}
} // for each isometry
} // nested for all id,jd
/*
// Here we have the best matching D-block with xxxxBest parameters
// Restore idBest, jdBest to finer step
idBest = idBest*overstep;
jdBest = jdBest*overstep;
GlobalErrorwoLSS += fabs(EBest);
TIsometries isoOldBest = isoBest;
BOOL LSSsuccess = FALSE;
if (ENC_OVERSTEP)
{ // now we search D-pool with finer step around the best block found
// search range is -overstep+1, ..., overstep-1
TIsometries iso_FIRST1 = iso_IDENTITY;
if (ENC_OVERSTEPUSEISOMETRIES) iso_LAST1 = iso_LAST;
else iso_LAST1 = isoBest, iso_FIRST1 = isoBest;
WORD idStart = MAX(int(idBest)-overstep+1, 0),
idEnd = MIN(idBest+overstep-1, md-1),
jdStart = MAX(int(jdBest)-overstep+1, 0),
jdEnd = MIN(jdBest+overstep-1, nd-1);
for (WORD id=idStart; I=id*md,id<=idEnd; id++) // Search in D-pool
for (WORD jd=jdStart; jd<=jdEnd; jd++)
{
if (id == idBest || jd == jdBest) continue; // skip retesting best block
WORD IJ = I+jd;
DWORD IJNN = ((DWORD)IJ)*NN;
double SumD=0, SumD2=0;
for(k=0; k<4; k++)
{ SumD += DPI[IJ].SubBlock[k].Sum;
SumD2+= DPI[IJ].SubBlock[k].Sum2;
}
for (TIsometries iso=iso_FIRST1; iso<=iso_LAST1; iso++)
{
S.dwSearch++;
BlockTransform(&DPool[IJNN], DBlock, N, iso);
MinimizeRms( rpt, Block, DBlock, N, SumR, SumR2, SumD, SumD2,
alpha, beta, ialpha, ibeta, E0, E,
S.dwDet0, S.dwAlphagt1, S.dwAlphalt0, S.dwAlphaltm1);
S.dwAlphaCount++;
if (E < EBest)
{
idBest = id, jdBest = jd;
EBest = E; E0Best = E0;
isoBest = iso;
alphaBest = alpha; betaBest = beta;
ialphaBest = ialpha; ibetaBest = ibeta;
LSSsuccess=TRUE;
}
} // for each isometry
} // for each neighbor
}
if (LSSsuccess) S.dwLSSsuccess++;
if (isoOldBest==isoBest) S.dwLSSsameiso++;
*/
SE[K] = E = fabs(EBest);
E0Best = fabs(E0Best);
fprintf(rpt, "\tR(%d)<-D(%d,%d,%d): %d %lf %lf [%lf] E:%lf %0.0lf%%\n",
K,DPI[idBest*md+jdBest].id*did,DPI[idBest*md+jdBest].jd*djd,kBest,
int(isoBest), alphaBest, betaBest, sqrt(E0Best), sqrt(E),sqrt(E)*100/255.0);
fflush(rpt);
Pack(1, 1, cod);
Pack(idbits, DPI[idBest*md+jdBest].id, cod);
Pack(jdbits, DPI[idBest*md+jdBest].jd, cod);
Pack(2, kBest, cod);
Pack(3, isoBest, cod);
if (ENC_QUANTIZEALPHA)
{
Pack(alphabits, ialphaBest, cod);
Pack(betabits, ibetaBest, cod);
}
else
{
Pack(alphaBest,cod);
Pack(betaBest, cod);
}
//UpdateEncStats(S, isoBest, alphaBest, betaBest, E, ir, jr,
// DPI[idBest*md+jdBest].id, DPI[idBest*md+jdBest].jd );
GetSubBlock(kBest, &DPool[DWORD(idBest*md+jdBest)*B*B], B, DBlock1, N);
BlockTransform(DBlock1, DBlock, N, isoBest);
BlockLumScale(DBlock, appBlock, N, alphaBest, betaBest);
switch (K)
{
case 0: img.SetBlock(ir*B, jr*B, N, appBlock); break;
case 1: img.SetBlock(ir*B, jr*B+N, N, appBlock); break;
case 2: img.SetBlock(ir*B+N, jr*B, N, appBlock); break;
case 3: img.SetBlock(ir*B+N, jr*B+N, N, appBlock); break;
}
}
delete[] Block;
delete[] DBlock;
delete[] DBlock1;
delete[] appBlock;
}
About the source code of IFSAF
The source (more than 1.5 MB, compiling to as many as 130,000 lines including libraries) of this application is a very
good example of what not to do with C++: it is almost unreadable even for us, who wrote it! So we are
reviewing and "cleaning" it while adding more comments. We will not distribute the current source code
because it is too Windows/MS-DOS dependent, so it would require too big an effort to understand how it works.
Some parts of the new source are available and will be included in this document. We gave priority to the
most interesting section of the source - the IFS encoding - so that is what appears here.
You may also get the Linux command-line version, which implements only the color encoding; it should be
available soon, if it is not already, at our FTP site.
How the True-Color Encoding Works
Background
To understand the following, you should already know how fractal image compression works on gray-scale
images, specifically fractal image compression through IFS with exhaustive search in the domain pool.
Knowledge of Fisher's SIGGRAPH '92 course notes and of Jaquin's review of fractal image compression
in the IEEE Transactions should suffice.
Abstract
We use the YIQ color model to exploit the human eye's different sensitivity to the Y, I, Q channels: we assign
more bits to Y and fewer bits to I and Q. But all this is trivial.
For each range block, we search for the best-matching domain block on the Y channel, and we use the
corresponding[4] blocks in the I and Q channels as well-matching[5] blocks for the I and Q components. So we
have only one geometric transformation (the isometry and the address of the selected domain block) for all
three YIQ components, while we recompute the optimal massic transformation for each of the YIQ channels.
This approach can both save compression time and increase the compression ratio, compared to the
obvious three separate encodings of each YIQ channel.
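The reconstruction implied by this scheme - one shared domain block and isometry, three independent massic (alpha, beta) pairs - can be sketched as follows. This is a simplified, hypothetical illustration restricted to the identity isometry, not IFSAF's actual decoder:

```cpp
#include <algorithm>
#include <cassert>

// Decode one N*N range block: the SAME domain pixel feeds every channel
// (shared geometric transform), but each channel has its own alpha/beta
// (per-channel massic transform), as in the scheme described above.
void decode_block_yiq(const unsigned char *domY, const unsigned char *domI,
                      const unsigned char *domQ, int N,
                      const double alpha[3], const double beta[3],
                      unsigned char *outY, unsigned char *outI,
                      unsigned char *outQ)
{
    for (int p = 0; p < N * N; p++) {
        outY[p] = (unsigned char)std::min(255.0, std::max(0.0, 0.5 + domY[p] * alpha[0] + beta[0]));
        outI[p] = (unsigned char)std::min(255.0, std::max(0.0, 0.5 + domI[p] * alpha[1] + beta[1]));
        outQ[p] = (unsigned char)std::min(255.0, std::max(0.0, 0.5 + domQ[p] * alpha[2] + beta[2]));
    }
}
```

Sharing the geometry means only one (id, jd, isometry) triple is stored per range block, while three (alpha, beta) pairs are - which is where both the time and the bit savings come from.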
How we can encode true color images
In the first stage we have to convert the image from the RGB model to another color model that better
matches the differences in how the human eye perceives colors; such a model can be one of YIQ, YUV,
HLS, HSV. We selected the YIQ model.
Since the variance of the chrominance data (I, Q) tends to be less than the corresponding variance of the Y
channel, we can save space by reducing the dynamic range of the I and Q channels.
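For reference, a standard NTSC-style RGB-to-YIQ matrix is shown below. IFSAF's actual RGBtoYIQ routine is not listed in this document and presumably also biases and rescales I and Q to fit a byte, since on 8-bit RGB the raw I spans roughly -152..152 and Q roughly -134..134:

```cpp
#include <cassert>
#include <cmath>

// Standard NTSC RGB -> YIQ matrix (an illustration; IFSAF's own RGBtoYIQ
// routine is not shown in this document and may scale differently).
void rgb_to_yiq(int R, int G, int B, double &Y, double &I, double &Q)
{
    Y = 0.299 * R + 0.587 * G + 0.114 * B;   // luminance, 0..255
    I = 0.596 * R - 0.274 * G - 0.322 * B;   // chrominance, about -152..152
    Q = 0.211 * R - 0.523 * G + 0.312 * B;   // chrominance, about -134..134
}
```

Note that the I and Q rows each sum to zero, so any gray pixel (R = G = B) has zero chrominance - this is why the I and Q planes carry so much less energy than Y.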
Once we have selected a color model, there are three ways of doing the encoding:
1) Treat the 3 channels disjointly, doing 3 different encodings with different parameters: we could
decrease the bits for alpha and beta, shrink the 2 chrominance images and then restore their original size on
decompression, etc.
2) Treat the color image as a whole, applying all transformations directly to the color image. Obviously,
each single transformation will be split into 3 parts, but all decisions are made upon global values - e.g. a
global error - and not on separate values for each channel.
3) A mix of these two methods, that is, encoding the 3 channels jointly but asymmetrically.
How we do encode true color images
Suppose we have a true-color image in RGB format (24 bpp), which we split into Y, I, Q color planes. We could
now apply the gray-scale algorithm (such as Prof. Fisher's enc.c) three times, once to each of the Y, I and Q
channels (obviously we can use different encoding parameters for each channel, privileging Y at the expense of I
and Q). In so doing we would get an algorithm that is almost 3 times slower than the gray-scale encoding. That is
why we have developed a different approach.
[4] By corresponding blocks we mean those blocks that have the same coordinates in the image; in other words, the three blocks (one in
each of the Y, I, Q color planes) that are the color components of a block in the true-color image.
[5] Obviously they are not the best-matching blocks, but that is precisely the point: we allow a sub-optimal block mapping on the I and Q channels.
In the picture: the Y, I, Q color components of the Lena image.
Examining several different images and their related YIQ color planes, we noticed that something like the
"shape" of the gray-scale image is preserved on each channel, even if not exactly.
These pictures show the Y, I, Q components of the Lena image, filtered
in order to make more evident the shape that is almost exactly preserved.
We noticed that if there is some sort of detail in a range block of the Y component, we can distinguish its
shape in the I and Q components too, even if more or less attenuated in the gray-level luminance of that
channel. What is interesting in all this is that we do not need to change the parameters of the geometric
transform (offset of the best block and isometry) across the three channels, while we still recalculate the
massic transform parameters for each channel.
- Other ways to compress 24-bit images.
- Results: a comparison between the sub-optimal YIQ encoding and the separate optimum.