Abstract—Image compression is widely used for an obvious purpose: to minimize the amount of memory an image occupies in storage without losing any substantial visual information contained in the image. Various image compression techniques are available, such as Huffman Encoding and Run Length Encoding (RLE). Here we discuss a novel technique for reducing spatial redundancy, similar to RLE, that stores the image as several intervals of pixel values. This technique can also be combined with Huffman Encoding, which removes coding redundancy, to achieve better performance.

Keywords—image, compression, run length encoding, dynamic programming, segment tree

I. INTRODUCTION
An image is a representation of visual information, or a picture, in a two-dimensional space [1]. It can hold spatial information that is not available in other kinds of data, such as text. To represent spatial information in a two-dimensional space, an image is built from a two-dimensional array (or matrix), in which the element at position (x,y) holds a value corresponding to the brightness of the picture at the position mapped to the coordinate (x,y). Each of these positions is called a pixel. Since an image is two-dimensional, embedding our environment into an image requires a spatial mapping from the view we see onto the image. In other words, each pixel corresponds to a position in the real world.

The study of images is very broad. At one end, there is the study of how to capture images from the real world and convert them into a stream of bytes to be processed later by software or manually by a human. This part requires some low-level programming and an understanding of physics (such as light sensors). Then, after the digital device obtains the picture as bytes in memory, the captured information needs to be visualized on a screen so that other people can view it. This is the field of Computer Graphics. It is used in digital photography, film and television, video games, and on electronic devices. This field aims to display images (graphics) effectively to users on the screen, using primitives such as lines and circles. Users can then view images on the computer screen that depict the real-world environment.

At the other end, there is Pattern Recognition, or Image Recognition. Sometimes we need to obtain information from one or several images. The information might be easy for a human to obtain manually, but nowadays there are several use cases that require a machine to have these capabilities automatically. Examples include face recognition for attendance management systems and vehicle plate recognition in CCTV footage. A machine with this recognition capability is very useful, because a machine can process a large amount of data in less time, so many tasks can be automated. The field of pattern recognition in images aims to get machines to perform these recognition tasks, and the output might be a description of the image, such as what objects are in the image, how many people are in it, who is in it, what the text in it says, and so on. Nowadays these tasks are mostly done by machines using learning algorithms.

In the middle of this process, there is Image Processing. This field aims to improve the quality of an image for easier subsequent use, by humans or by machines [1]. Operators in Image Processing map an image to another image by some function, and the output image is expected to be free of the issues the initial image had. Examples of problems in Image Processing are Image Enhancement (increasing image brightness when the captured image is too dark), Image Deblurring (removing blur from the initial image), and Image Restoration (restoring an image from defects such as blur and noise caused by, for example, an aging picture).

Another problem in Image Processing is Image Compression. Several pieces of information are needed to represent an image (such as the brightness value of each pixel, as discussed previously). To have better quality, an image normally needs a higher number of pixels, so that it is more continuous mathematically and smoother visually. This means the image needs to store more values, and more memory is required to store the image inside the storage
Fig. 6. Example 3 of Huffman Encoding

Then in the next step, the node with symbol 5 and the node from the previous step have the smallest frequencies (0.06 and 0.05 respectively), so they are joined to create a new parent node with frequency 0.11. This process continues until only one connected component remains.

D. Run Length Encoding
Run Length Encoding (RLE) is used to remove spatial redundancy. For each row of the image, it compresses intervals of identical pixel values, storing only the value and the length of each interval. In the extreme case, this compresses a row whose pixels all have the same value into only two values: the pixel value and the width of the image. For example, in a row with values [3, 3, 3, 4, 4, 4, 4, 4, 3, 3, 3, 3], RLE compresses the array into [(3, 3), (4, 5), (3, 4)], because there are three 3’s, followed by five 4’s, and then four 3’s. This is a lossless compression technique: no information is removed, and the initial array can be reconstructed from the compressed array by simple iteration.

E. Dynamic Programming
Dynamic Programming (DP) is a problem-solving paradigm that decomposes the solution into several connected stages, so that the entire solution (the final stage) can be derived from the previous stages [3]. DP can be used to solve optimization problems, such as minimization or maximization. In the DP paradigm, we consider every possible next decision state, obtain its optimal value through a transition from previous states, and store these optimal values in a table for use by later states. In contrast, the greedy approach considers only one next state, the one that optimizes the current state (which does not necessarily lead to the global optimum).

There are two types of DP: the top-down approach and the bottom-up approach. In the top-down approach, we start from the final state (the entire solution we care about) and compute its optimal result from the previous states, whose values are obtained recursively. In the bottom-up approach, we fill the DP table iteratively from the base case (the initial states) up to the final state. Several classical algorithmic problems that can be solved using dynamic programming are the shortest path problem, knapsack, longest

F. Segment Tree
A segment tree is a tree-based data structure that stores information about array intervals as a tree [2]. It is used to obtain aggregate information over an interval of an array (such as the sum, product, greatest common divisor, or the result of another associative operation) in O(log n) time, without looping over the interval and applying the operation element by element (which requires O(n) time). A segment tree also supports update operations on array elements, either a single-element update or a range update (multiple consecutive elements at once), in O(log n).

A segment tree stores information for several intervals of the array. Let the length of the array be n. The root of the segment tree stores the information of the interval of length n, i.e., the entire array. The two children of the root store the information of the left half and the right half of the array, respectively, each of length n/2. This process continues recursively, and each leaf of the segment tree stores the information of an individual array element. Since the interval size is halved at each level, the number of levels in a segment tree is k, where $2^k \approx n$, so k is O(log n). And since the i-th level has $2^i$ nodes, the total number of nodes is approximately $2^0 + 2^1 + \dots + 2^k = 2^{k+1} - 1 \approx 2n$, which is still linear in n.

Fig. 7. Illustration of Segment Tree (cp-algorithms.com)
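As a minimal sketch of the segment tree described in Section F, the following Python class supports a range-sum query and a single-element update, each in O(log n). The class name `SegmentTree`, its method names, the choice of sum as the aggregate, and the sample pixel row are assumptions made for illustration; they are not taken from the technique discussed in this paper.

```python
class SegmentTree:
    """Sum segment tree over a fixed-length list (illustrative sketch)."""

    def __init__(self, values):
        self.n = len(values)
        # 4 * n slots is a common safe upper bound for this recursive layout.
        self.tree = [0] * (4 * max(self.n, 1))
        if self.n:
            self._build(values, 1, 0, self.n - 1)

    def _build(self, values, node, lo, hi):
        # A leaf stores a single element of the array.
        if lo == hi:
            self.tree[node] = values[lo]
            return
        mid = (lo + hi) // 2
        self._build(values, 2 * node, lo, mid)          # left half
        self._build(values, 2 * node + 1, mid + 1, hi)  # right half
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def update(self, index, value, node=1, lo=0, hi=None):
        """Set values[index] = value and refresh the sums on the path to the root."""
        if hi is None:
            hi = self.n - 1
        if lo == hi:
            self.tree[node] = value
            return
        mid = (lo + hi) // 2
        if index <= mid:
            self.update(index, value, 2 * node, lo, mid)
        else:
            self.update(index, value, 2 * node + 1, mid + 1, hi)
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def query(self, left, right, node=1, lo=0, hi=None):
        """Return the sum of values[left..right], both ends inclusive."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:        # no overlap with this node
            return 0
        if left <= lo and hi <= right:     # node interval fully covered
            return self.tree[node]
        mid = (lo + hi) // 2
        return (self.query(left, right, 2 * node, lo, mid)
                + self.query(left, right, 2 * node + 1, mid + 1, hi))


# Hypothetical pixel row used only for this example.
row = [3, 3, 3, 4, 4, 4, 4, 4, 3, 3, 3, 3]
st = SegmentTree(row)
print(st.query(0, 2))   # 9  (3 + 3 + 3)
print(st.query(3, 7))   # 20 (five 4's)
st.update(3, 3)         # change one pixel value
print(st.query(3, 7))   # 19
```

The recursive layout keeps the children of node i at positions 2i and 2i + 1, which mirrors the halving of intervals described above. Any other associative operation, such as minimum or greatest common divisor, could replace the sum, provided the value returned for a non-overlapping interval is changed to that operation's identity element.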
REFERENCES
[1] https://informatika.stei.itb.ac.id/~rinaldi.munir/Citra/2022-2023/01-Pengantar-Pengolahan-Citra-Bag1-2022.pdf, accessed on 18 December 2023