
Multimedia Features

Vectors... what are they???

An image can be represented as a vector of pixel values:
  an image with 1 pixel is the vector <5>, a point on a line;
  an image with 2 pixels is the vector <5,7>, a point in the plane;
  an image with 3 pixels is the vector <5,7,3>, a point in 3-D space.
In general, an image with n pixels is a point in an n-dimensional space.

Distance between two images???

Given A = <a1,a2,a3> and B = <b1,b2,b3>, how different are they?

Euclidean distance

$\delta(A,B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$
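A minimal Python sketch of this distance (assuming images are already flattened into equal-length vectors; the function name and example values are illustrative):

```python
import math

def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Example: two 3-pixel images as vectors.
A = [5, 5, 5]
B = [3, 2, 5]
print(euclidean_distance(A, B))  # sqrt(4 + 9 + 0) ~ 3.6056
```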

Which image is more similar to A?

Compare the distances $\delta(A,C)$ and $\delta(A,B)$: the image at the
smaller distance is closer to A, and closer to A means more similar to A.

Find the 2 most similar images to A

Finding the k objects closest to a query object is called nearest-neighbor
search (here, k = 2).
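A brute-force nearest-neighbor sketch in Python (the small in-memory `images` list is illustrative; `math.dist` computes the Euclidean distance):

```python
import math

def k_nearest_neighbors(query, database, k):
    """Return the k database vectors closest to the query (brute force)."""
    return sorted(database, key=lambda v: math.dist(query, v))[:k]

images = [[3, 2, 5], [2, 2, 2], [5, 5, 4], [0, 9, 1]]
print(k_nearest_neighbors([5, 5, 5], images, k=2))  # [[5, 5, 4], [3, 2, 5]]
```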

Find images at most ε different from A

Finding all objects within a given distance threshold ε of the query is
called range search.
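A matching range-search sketch under the same assumptions:

```python
import math

def range_search(query, database, epsilon):
    """Return all database vectors within distance epsilon of the query."""
    return [v for v in database if math.dist(query, v) <= epsilon]

images = [[3, 2, 5], [2, 2, 2], [5, 5, 4], [0, 9, 1]]
print(range_search([5, 5, 5], images, epsilon=4.0))  # [[3, 2, 5], [5, 5, 4]]
```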

Are there other similarity measures?

Let's try angles. Consider three images, written as vectors:

  A = <5, 5, 5>
  E = <2, 2, 2>
  F = <3, 2, 5>

A and E have a similar composition: E points in the same direction as A,
only with lower intensity. If we use angles as a similarity measure, then
cos(AE) > cos(AF), so A is more similar to E than to F.

Angle-based measures

Given $x = \langle x_1, x_2, \ldots, x_n \rangle$ and
$y = \langle y_1, y_2, \ldots, y_n \rangle$:

Dot product:
$x \cdot y = \sum_{i=1}^{n} x_i y_i$

Cosine similarity:
$\cos(x, y) = \dfrac{x \cdot y}{\|x\| \, \|y\|}$
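A minimal sketch of these two formulas in Python (A, E, F are the vectors from the previous slide):

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two vectors: (x . y) / (|x| |y|)."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_y = math.sqrt(sum(v * v for v in y))
    return dot / (norm_x * norm_y)

A, E, F = [5, 5, 5], [2, 2, 2], [3, 2, 5]
print(cosine_similarity(A, E))  # 1.0   (same direction)
print(cosine_similarity(A, F))  # ~0.94 (cos(AE) > cos(AF))
```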

What is a good measure then??

It is application dependent...

...but distances in a metric space help indexing!

Metric model: axioms

Any function d expressing a distance must satisfy the following axioms:

  self-minimality: d(s,s) = 0
  minimality: d(s1,s2) >= d(s1,s1)
  symmetry: d(s1,s2) = d(s2,s1)
  triangle inequality: d(s1,s2) + d(s2,s3) >= d(s1,s3)

Example: the Euclidean distance satisfies all four.

Metric distances (Minkowski metrics)

L1-metric: $d = dX + dY$, where dX and dY are the coordinate differences
between the two points.

Also called the Manhattan distance.

Metric distances (Minkowski metrics)

L2-metric: $d = (dX^2 + dY^2)^{1/2}$

Also called the Euclidean distance.

Metric distances (Minkowski metrics)

L3-metric: $d = (dX^3 + dY^3)^{1/3}$
.....
L∞-metric: $d = \max\{dX, dY\}$

In general, the Lp-metric is $d = \left(\sum_i |d_i|^p\right)^{1/p}$; as
$p \to \infty$ it tends to the maximum coordinate difference.
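A single Python sketch covering the whole family (the vectors are illustrative; `p=float("inf")` selects the max-metric):

```python
def minkowski_distance(a, b, p):
    """Lp distance: p=1 Manhattan, p=2 Euclidean, p=float('inf') max-metric."""
    diffs = [abs(ai - bi) for ai, bi in zip(a, b)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

a, b = [5, 5, 5], [3, 2, 5]
for p in (1, 2, 3, float("inf")):
    print(p, minkowski_distance(a, b, p))  # 5, ~3.61, ~3.27, 3
```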

The metric model

Well suited for certain kinds of similarity evaluation, such as color-based
comparisons.

Consistent with widely used approaches from the computer vision and pattern
recognition communities; results suggest that the L1 metric may better
capture human notions of image similarity.

Makes it relatively easy to index data, modeled as vectors of properties,
with classical multi-dimensional indexing techniques.

Feature

A feature is a property of interest that can help us index an object. For a
student record, student_ID can be a feature. What are the features of an
image?

Image features

There are many possible features:
  color histogram
  texture
  edges
  shapes
  objects
  object or scene semantics

Feature selection: which one should we use for indexing?

Image analysis

Image analysis gives the features, but it is usually an expensive operation.

...so...

In the design of a MIS, you want to minimize the number of image analysis
operations performed on the fly:
  pre-processing / pre-analysis
  indexing / clustering
  semantic optimization: reorder commutative operations by cost (see the
  sketch below):
    if (op1 ∘ op2) = (op2 ∘ op1) and Cost(op2 ∘ op1) < Cost(op1 ∘ op2),
    then do op2 ∘ op1
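A minimal sketch of the reordering rule above (the `cost` and `commute` callables are illustrative placeholders supplied by the system designer, not part of any real API):

```python
def choose_order(op1, op2, cost, commute):
    """Run op2 first when the two operations commute and that order is cheaper.

    cost(a, b): estimated cost of running a before b (designer-supplied).
    commute(a, b): whether the result is independent of the order.
    """
    if commute(op1, op2) and cost(op2, op1) < cost(op1, op2):
        return (op2, op1)
    return (op1, op2)

# Illustrative costs: filtering first makes the later analysis cheaper.
cost = lambda a, b: {("filter", "analyze"): 10, ("analyze", "filter"): 100}[(a, b)]
commute = lambda a, b: True
print(choose_order("analyze", "filter", cost, commute))  # ('filter', 'analyze')
```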

Problem

If each pixel is treated as a different feature, then the feature vector size
equals the number of pixels: for example, 628 x 1024.

Dimensionality curse: high dimensions make indices unusable (10-15 dimensions
at most!!!)

Solution: reduce the number of dimensions of the vector using
distance-preserving transforms, e.g. the Fourier transform, the DCT, or
wavelet transforms:

  628 x 1024  --DCT-->  4
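A minimal sketch of DCT-based reduction on a 1-D signal using SciPy (the signal and the choice k = 4 are illustrative; image pipelines typically apply a 2-D DCT to pixel blocks):

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct_reduce(signal, k):
    """Keep only the first k DCT coefficients of a 1-D signal."""
    return dct(signal, norm="ortho")[:k]

def dct_reconstruct(coeffs, n):
    """Approximate the original length-n signal from the kept coefficients."""
    padded = np.zeros(n)
    padded[: len(coeffs)] = coeffs
    return idct(padded, norm="ortho")

x = np.array([5.0, 7.0, 3.0, 4.0, 6.0, 5.0, 4.0, 5.0])
c = dct_reduce(x, k=4)                          # 8 dimensions -> 4
print(np.round(dct_reconstruct(c, len(x)), 2))  # coarse approximation of x
```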

Feature Selection

The initial step of MIS design involves feature selection or dimensionality
reduction: data are transformed and projected in such a way that the selected
features are the important ones!

"Important" with respect to:
  application semantics
  perceptual impact
  discrimination power
  object description power
  query description power and workload

Transforms

Distances and angles are preserved.

Some dimensions are more important (more differentiating) than others.

Eliminate the unimportant dimensions.

Transform + projection (compression, or feature selection)

What happens to distances???

Let δ(A,B) be the distance in the original space and δ'(A',B') the distance
after transform + projection; in general δ(A,B) ≠ δ'(A',B'). Two kinds of
error follow:

  false hit (δ > δ'): the projection makes two objects look closer than they
  really are, so non-answers are retrieved; these can be filtered out by
  postprocessing.
  miss (δ < δ'): the projection makes two objects look farther apart than
  they really are, so true answers are not retrieved.

Misses are not desirable! They cannot be eliminated with postprocessing.

A good feature...

A good feature is significant: it enables us to differentiate objects from
one another as much as possible.

A good feature corresponds to the user's perception as much as possible
(relevance feedback!!!!).

What does "significant" mean?

In the information-theoretic sense, an event is more significant if it
carries more information. An event with a high occurrence rate carries less
information: a solar eclipse is more interesting than a sunset.

  High frequency --> less information
  Low frequency  --> more information
Entropy

Entropy measures the total information content (uncertainty) of a source:

$H = -\sum_i P(x_i) \log_2 P(x_i)$

Example:
  P(a) = 0.5, P(b) = 0.5  =>  H = 1  (more uncertain, more information)
  P(a) = 1.0, P(b) = 0.0  =>  H = 0  (less uncertain, less information)
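The example values can be checked with a few lines of Python:

```python
import math

def entropy(probs):
    """Shannon entropy in bits; 0 * log(0) is taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0  (maximally uncertain)
print(entropy([1.0, 0.0]))  # 0.0  (no uncertainty)
```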

Which feature is better?

Along F1 the data are distributed with a higher variance, and it seems that
F1 has the greater discriminatory power. Is that always true?

No: F2 gives a better separation of the two groups, and the values it
separates are less frequent, so by the information-theoretic argument above
F2 can be the better feature despite its lower variance.

Principal Component Analysis

Also known as the Karhunen-Loeve (KL) transform, PCA optimally decorrelates
the input data. Given a data set described in a vector space, PCA identifies
a set of alternative bases for the space along which the spread is maximized.

PCA and Feature Selection

Let F1, F2, ..., Fn be n features, and consider the covariance between each
pair of features, which measures how much the two features vary from their
means with respect to each other. Build the matrix S with S(i,j) = COV(i,j).

PRINCIPLE: if a pair of features (i,j) is statistically independent,
COV(i,j) = 0, while COV(i,i) = $\sigma_i^2$. So, for pairwise-independent
features, S is a diagonal matrix.

Note that...

COV(X,Y) = E[(X - E[X])(Y - E[Y])]

$\sigma_{X,Y} = \frac{1}{n} \sum_i (x_i - \bar{x})(y_i - \bar{y})$

VAR(X) = COV(X,X)
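A quick NumPy check of this formula (the data values are illustrative; `bias=True` selects the 1/n normalization used above):

```python
import numpy as np

X = np.array([2.5, 0.5, 2.2, 1.9, 3.1])
Y = np.array([2.4, 0.7, 2.9, 2.2, 3.0])

# Covariance by the formula above (1/n normalization).
cov_xy = np.mean((X - X.mean()) * (Y - Y.mean()))
print(cov_xy)
print(np.cov(X, Y, bias=True)[0, 1])  # same value via NumPy
```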

PCA Goals

To identify a set of alternative dimensions for the given data space such
that the covariance matrix of the data along this new set of dimensions is
diagonal, obtained through eigen-decomposition into eigenvalues and
eigenvectors:

$(S - \lambda_i I)\,\vec{r}_i = 0$

$S = P C P^{-1}$, with

$C = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$,
$\quad P = (\vec{r}_1\ \vec{r}_2\ \cdots\ \vec{r}_n)$

(Figure: a 2-D scatter plot showing the 1st principal component, y1, along
the direction of maximum spread, and the 2nd principal component, y2,
orthogonal to it.)

PCA Eigenvalues

(Figure: the same scatter plot; the eigenvalues λ1 and λ2 measure the spread
along the two principal components.)

PCA Algorithm

1. Create a data matrix X, with one row vector xn per data point.
2. Subtract the mean row vector from each row vector xn in X.
3. Compute the covariance matrix of X.
4. Find the eigenvectors and eigenvalues of the covariance matrix.
5. The principal components (PCs) are the M eigenvectors with the largest
   eigenvalues.
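A minimal NumPy sketch of these five steps (random data for illustration; `eigh` is used because the covariance matrix is symmetric):

```python
import numpy as np

def pca(X, M):
    """Steps 1-5: return the top-M principal components and projected data."""
    Xc = X - X.mean(axis=0)              # step 2: subtract the mean
    sigma = np.cov(Xc, rowvar=False)     # step 3: covariance matrix
    vals, vecs = np.linalg.eigh(sigma)   # step 4: eigen-decomposition
    order = np.argsort(vals)[::-1][:M]   # step 5: M largest eigenvalues
    pcs = vecs[:, order].T               # one principal component per row
    return pcs, Xc @ pcs.T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # step 1: one row per data point
pcs, projected = pca(X, M=2)
print(pcs.shape, projected.shape)        # (2, 5) (100, 2)
```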

2-D Data

(Figure: a scatter plot of two-dimensional data points.)

Principal Components

(Figure: the same 2-D data with the 1st and 2nd principal vectors overlaid.)

The 1st principal vector gives the best axis to project onto (minimum RMS
error). The principal vectors are orthogonal to each other.

How many components?

Check the distribution of eigenvalues, and take enough eigenvectors to cover
80-90% of the variance.
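A small sketch of this rule (the eigenvalues are illustrative):

```python
import numpy as np

def num_components(eigenvalues, coverage=0.9):
    """Smallest M whose top-M eigenvalues cover the target variance fraction."""
    vals = np.sort(eigenvalues)[::-1]
    cum = np.cumsum(vals) / vals.sum()
    return int(np.searchsorted(cum, coverage) + 1)

print(num_components(np.array([5.0, 2.0, 1.5, 0.9, 0.5, 0.1])))  # 4
```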

Singular Value Decomposition (SVD)

$A_{[n \times m]} = U_{[n \times r]} \, L_{[r \times r]} \, (V_{[m \times r]})^T$

  A: n x m matrix
  U: n x r matrix
  L: r x r diagonal matrix (r: rank of A)
  V: m x r matrix

SVD - Properties

THEOREM [Press+92]: it is always possible to decompose a matrix A into
A = U L V^T, where
  U, L, V are unique (*)
  U, V are column-orthonormal (i.e., their columns are unit vectors,
  orthogonal to each other): U^T U = I and V^T V = I (I: identity matrix)
  L is diagonal; its singular values are positive and sorted in decreasing
  order
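These properties can be verified with NumPy (the matrix A is illustrative; `full_matrices=False` yields the n x r and m x r shapes used above):

```python
import numpy as np

A = np.array([[3.0, 2.0, 5.0],
              [5.0, 5.0, 5.0],
              [2.0, 2.0, 2.0],
              [0.0, 9.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(s)                                    # singular values, decreasing
print(np.allclose(A, U @ np.diag(s) @ Vt))  # True: A = U L V^T
print(np.allclose(U.T @ U, np.eye(3)))      # True: U is column-orthonormal
```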

Principal component analysis

The principal component is a combination of features: it is the eigenvector
of the covariance matrix with the largest eigenvalue.

Compactness of a database

$comp(D) = \sum_{i \ne j} similarity(o_i, o_j)$

The higher the pairwise similarities, the more compact the database.

A compact database is not desirable!!!
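A minimal sketch of this measure (the similarity function and the two toy databases are illustrative; summing over unordered pairs halves the i ≠ j sum, which does not affect comparisons):

```python
import math
from itertools import combinations

def compactness(objects, similarity):
    """Sum of pairwise similarities over all unordered pairs of objects."""
    return sum(similarity(a, b) for a, b in combinations(objects, 2))

# Illustrative similarity: decreases with Euclidean distance.
sim = lambda a, b: 1.0 / (1.0 + math.dist(a, b))

tight = [[5, 5, 5], [5, 5, 4], [4, 5, 5]]   # mutually similar objects
spread = [[5, 5, 5], [0, 9, 1], [9, 0, 2]]  # mutually dissimilar objects
print(compactness(tight, sim) > compactness(spread, sim))  # True
```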

Feature quality

A feature is
  good if, when we remove it, the overall compactness increases
  bad if, when we remove it, the overall compactness decreases