Approaches to Change Detection
Song Liu, JSPS DC2
Sugiyama Lab, Tokyo Institute of Technology
2014/3/10
Newton's Laws: predicting the movements of new planets (p. 5)
http://certifiedknowledge.org/blog/are-search-queries-becoming-even-more-unique-statistics-from-google
http://www.nasa.gov/content/goddard/haiyan-northwestern-pacific-ocean/#.Uq3u6vQW2So
Supervised
  Predict the future: p(y|x)
  e.g. classification, regression
Unsupervised
  Learn patterns: p(x)
Unsupervised change detection
  Learn changes of patterns: p(x)/p'(x), or p(x) and p'(x)
Vapnik's Principle
Support Vector Machines!
In Graphical Models (p. 10)
Changes in brain activation (Clry et al., 2006)
Chapter 2: Distributional Change Detection
1. Background & Existing Methods
2. Problem Formulation
3. Divergence-based Change-point Score
4. Experiments
Changes in Time-series
Health monitoring, e.g. Kawahara et al., 2012
[Figure: a health-monitoring time series over 200-2000 time points with marked change points]
Metrics: true positive rate, false positive rate, and detection delay
Existing Methods
[Figure: histograms of the data with fitted model densities]
Model-free Methods: no model assumption.
Chapter 2: Distributional Change Detection
1. Background & Existing Methods
2. Problem Formulation
3. Proposed Method
4. Experiments
Sliding windows move along the time series; an imaginary bar marks the candidate change at the current time point.
[Figure: sliding windows and the imaginary bar over time]
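The windowing setup above can be sketched as follows; function and variable names are illustrative, not from the thesis. Consecutive observations are stacked into window samples, and at each candidate time point the windows before and after the imaginary bar are compared.

```python
import numpy as np

def sliding_windows(series, k):
    """Stack k consecutive observations into one window sample."""
    return np.array([series[i:i + k].ravel() for i in range(len(series) - k + 1)])

def split_at(windows, t, n):
    """n window samples just before and just after candidate point t."""
    return windows[t - n:t], windows[t:t + n]

series = np.sin(np.linspace(0, 20, 200))[:, None]   # toy 1-D time series
W = sliding_windows(series, k=5)                    # shape (196, 5)
before, after = split_at(W, t=100, n=50)            # 50 windows each side
```

The two groups of window samples then play the roles of the two distributions whose change is scored.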
Chapter 2: Distributional Change Detection
1. Background & Existing Methods
2. Problem Formulation
3. Proposed Method
4. Experiments
Density Estimation
A naive way is a two-step approach: first estimate the two densities p(x) and p'(x) separately, then plug them into the change-point score.
E.g. a kernel density estimator with a Gaussian kernel:
  p(x) = sum_{l=1}^{n} theta_l K(x, x_l),  K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
[Figure: an estimated density p(x) over x]
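The two-step approach can be sketched as below (all names, bandwidths, and distributions are illustrative): estimate each density with a fixed-bandwidth Gaussian kernel density estimator, then take the ratio pointwise. Errors made in each density estimate compound in the ratio.

```python
import numpy as np

def kde(samples, x, sigma=0.5):
    """Gaussian kernel density estimate evaluated at points x."""
    diffs = x[:, None] - samples[None, :]
    return np.exp(-diffs**2 / (2 * sigma**2)).mean(axis=1) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 121)
p_hat = kde(rng.normal(0.0, 1.0, 1000), x)   # density before the change
q_hat = kde(rng.normal(0.5, 1.0, 1000), x)   # density after the change
ratio = p_hat / q_hat                        # two-step density-ratio estimate
```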
Knowing the densities p(x) and p'(x) gives the ratio p(x)/p'(x), but knowing the ratio does not recover the densities: density estimation solves a more general problem than needed.
Vapnik's Principle: never solve a more general problem as an intermediate step. Estimate the ratio directly!
[Figure: two densities and their ratio over x in [-1.5, 1.5]]
Kullback-Leibler Divergence as Change-point Score (p. 22)
We may have different choices for the change-point score:
Kullback-Leibler divergence (Kullback and Leibler, 1951):
  KL[p||p'] = integral of p(x) log(p(x)/p'(x)) dx
Pearson Divergence as Change-point Score (p. 24)
The log term in the Kullback-Leibler divergence can go to infinity if p'(x) = 0.
Pearson divergence:
  PE[p||p'] = (1/2) integral of p'(x) (p(x)/p'(x) - 1)^2 dx
[Figure: comparison of the log term and the quadratic term as the ratio shrinks]
More robust!
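A small numeric illustration (not from the thesis) of the robustness claim: as the density ratio r goes to 0, the KL score's log term is unbounded, while the Pearson score's quadratic term (r - 1)^2 stays bounded by 1.

```python
import numpy as np

r = np.array([1.0, 0.1, 1e-6])   # density-ratio values approaching 0
kl_term = np.log(r)              # 0, -2.30, -13.8...: diverges as r -> 0
pe_term = (r - 1.0) ** 2         # 0, 0.81, ~1.0: bounded
```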
Divergences as Change-point Scores (Summary) (pp. 22-28)
Robust change detection
Chapter 2: Distributional Change Detection
1. Background & Existing Methods
2. Problem Formulation
3. Proposed Method
4. Experiments
Experiments (Synthetic)
[Figure: original signal, change-point score, and ground truth for three scenarios: mean shift, variance scaling, covariance switching]
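Hypothetical generators for the three synthetic scenarios named above; the distribution parameters are illustrative, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_shift(n=500, t=250):
    x = rng.normal(0.0, 1.0, n)
    x[t:] += 2.0                      # the mean jumps at time t
    return x

def variance_scaling(n=500, t=250):
    x = rng.normal(0.0, 1.0, n)
    x[t:] *= 3.0                      # the variance jumps at time t
    return x

def covariance_switching(n=500, t=250, rho=0.8):
    cov_a = [[1.0, rho], [rho, 1.0]]
    cov_b = [[1.0, -rho], [-rho, 1.0]]  # the correlation sign flips at t
    return np.vstack([rng.multivariate_normal([0, 0], cov_a, t),
                      rng.multivariate_normal([0, 0], cov_b, n - t)])
```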
Experiments (Sensor)
[Figure: sensor signal and change-point score over time for the activities Stay, Walk, Jog]
Conclusion
A robust and model-free method for distributional change detection in time-series data was proposed.
The proposed method achieved leading performance in various artificial and real-world experiments.
Chapter 3: Structural Change Detection
1. Background
2. Separate Estimation Methods
3. Proposed Methods
4. Experiments
Patterns of Interactions
Genes regulate each other via a gene network.
Brain EEG signals may be synchronized in a certain pattern.
(Wikipedia)
Chapter 3: Structural Change Detection
1. Background
2. Existing Methods
3. Proposed Methods
4. Experiments
A Gaussian Markov Network over x = (x^(1), x^(2), ..., x^(d)):
  p(x; Theta) proportional to exp(-(1/2) x^T Theta x)
where Theta is the inverse covariance matrix.
We can visualize the above MN using an undirected graph: an edge connects x^(i) and x^(j) whenever Theta_{i,j} is nonzero.
[Figure: an undirected graph over nodes 1, 3, 4, 6]
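A toy illustration of the correspondence just stated: the sparsity pattern of the inverse covariance (precision) matrix defines the edges of a Gaussian Markov network. The matrix below is made up for illustration.

```python
import numpy as np

theta = np.array([
    [2.0, 0.6, 0.0, 0.5],
    [0.6, 2.0, 0.7, 0.0],
    [0.0, 0.7, 2.0, 0.0],
    [0.5, 0.0, 0.0, 2.0],
])

# An undirected edge (i, j) exists whenever theta[i, j] != 0.
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)
         if theta[i, j] != 0.0]
# edges -> [(0, 1), (0, 3), (1, 2)]
```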
[Figure: two undirected graphs over nodes 1, 3, 4, 6]
For x = (x^(1), ..., x^(d)), define a feature vector
  f(x) = (f_1(x), f_2(x), ..., f_b(x))
Choices of pairwise features f(x_i, x_j):
  Gaussian: f(x_i, x_j) = x_i x_j
  Nonparanormal: f(x_i, x_j) = psi(x_i) psi(x_j)
  Polynomial (degree k): f(x_i, x_j) = [x_i^k, x_i^{k-1} x_j, ..., x_j^k, ..., x_i, x_j, 1]
However, for a generalized Markov Network, there is no closed form for the normalization term N(theta).
Importance Sampling (IS): approximate N(theta) by a sample average over samples from the denominator distribution q:
  N(theta) = E_q[exp(theta^T f(x))] approx (1/n) sum_j exp(theta^T f(x_j)),  x_j ~ q
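The importance-sampling approximation can be sketched as follows; the feature map and parameter values are illustrative, not from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return np.stack([x, x**2], axis=1)       # toy feature map f(x)

theta = np.array([0.3, -0.5])
x_q = rng.normal(0.0, 1.0, 10000)            # samples from q

n_theta = np.exp(f(x_q) @ theta).mean()      # Monte Carlo estimate of N(theta)
ratio = np.exp(f(x_q) @ theta) / n_theta     # normalized ratio model values
```

By construction the normalized model averages to one over the q samples, which is the role the normalization term plays.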
Chapter 3: Structural Change Detection
1. Background
2. Existing Methods
3. Proposed Methods
4. Experiments
To ensure the ratio model is properly normalized, i.e. the integral of q(x) r(x; theta) dx equals 1, set
  r(x; theta) = exp(theta^T f(x)) / N(theta)
Sparse Regularization
  L2 regularizer
  Group-lasso regularizer
  Elastic net
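A minimal sketch of the group-lasso penalty used for sparsity (the grouping below is illustrative): one L2 norm per parameter group, for example one group per pairwise factor, summed over groups. Zeroing a whole group removes that factor from the estimated change.

```python
import numpy as np

def group_lasso_penalty(groups):
    """Sum of per-group L2 norms: encourages whole groups to be zero."""
    return sum(np.linalg.norm(g) for g in groups)

groups = [np.array([0.0, 0.0]),   # this factor drops out entirely
          np.array([3.0, 4.0])]
penalty = group_lasso_penalty(groups)   # -> 0.0 + 5.0 = 5.0
```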
Computational Time (p. 57)
[Figure: computation time vs. dimension (40-80) for the primal and dual formulations]
Chapter 3: Structural Change Detection
1. Background
2. Existing Methods
3. Proposed Methods
4. Experiments
Regularization path
[Figures: regularization paths of the estimated parameters]
P-R curve
[Figure: precision-recall curves]
Twitter Dataset
(source: Wikipedia)
[Figure: event timeline over 3 weeks]
From 7.26-9.14:
KLIEP vs. Flasso [Figure: recovered change graphs]
Small # of samples, severe over-fitting of Flasso!
Conclusion
We proposed a direct structural change detection method for Markov Networks.
The proposed method can handle non-Gaussian data, and can be solved efficiently using the primal or dual objective.
The proposed method showed leading performance on both artificial and real-world datasets.