A Statistical Approach for Object Motion Estimation with MPEG Motion Vectors
Xiaodong Yu, Ping Xue and Qi Tian
Nanyang Technological University, School of Electrical and Electronic Engineering, Singapore
Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore
{exdyu, epxue}@ntu.edu.sg, tian@i2r.a-star.edu.sg
was sensitive to the existence of small motion vector
clusters and resulted in accurate identification of small
objects. Babu and Ramakrishnan [6] accumulated and
interpolated motion vectors over a few frames to enrich
the motion information. Nevertheless, these approaches
are inefficient if the object is too small. For example,
wherever two or more objects smaller than a macroblock
contribute distinct motions within a macroblock, the
encoded motion vector cannot represent the motion
correctly [4]; hence motion segmentation is infeasible.
Furthermore, if an object is similar in size to one or two
macroblocks, one or two motion vectors cannot
provide sufficient information to distinguish object
motion from noisy vectors, so it is still difficult to
segment this object from the background. These problems
motivate us to seek another way to estimate object motion
with motion vectors. We argue that it is possible to
extract useful motion information at the macro level
even when the objects are too small for the motion of each
individual object to be estimated, provided that these objects
follow some common motion pattern.
In this paper we propose a statistical model to
estimate the mean object motion with MPEG motion
vectors under the stationary assumption. Two normal
distribution terms are used to model the randomness of
the object motion and the noise embedded in the motion
vector field, respectively. By applying the statistical analysis
within a time window, we alleviate the granularity of
the motion vector field at the cost of instantaneous motion
information.
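As a rough illustration of the windowed estimate described above, the following Python sketch averages the motion vectors observed within a time window. The function name `window_mean_motion`, the `(timestamp, value)` sample layout, and the example numbers are assumptions made for illustration only; they are not taken from the authors' implementation.

```python
from statistics import fmean


def window_mean_motion(samples, t_start, window):
    """Return the mean motion-vector value inside [t_start, t_start + window).

    samples: iterable of (timestamp_seconds, motion_vector_component) pairs.
             This layout is a hypothetical choice made for the sketch.
    Returns None when no motion vector falls inside the window.
    """
    values = [v for t, v in samples if t_start <= t < t_start + window]
    return fmean(values) if values else None


# Hypothetical noisy motion-vector components (pixels) with timestamps (s).
samples = [(0.0, 4.0), (0.5, 6.0), (1.0, 5.0), (1.5, 5.0), (2.5, 9.0)]

# Only the first four samples fall inside the 2-second window.
print(window_mean_motion(samples, t_start=0.0, window=2.0))
```

A longer window pools more motion vectors into the mean, which is the trade-off noted above: the estimate is smoother, but it no longer reflects the instantaneous motion.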
The rest of this paper is organized as follows. In
Section 2, we formulate our research question and
propose a statistical model. We then present the test bed
for the proposed model in Section 3. In Section 4 we
report experimental results for the model and for the
influential factors identified in Section 2, obtained with the test bed.
Finally, a conclusion and a discussion of future work
are given in Section 5.
Abstract
In this paper we propose a statistical approach to
estimate the object motion with MPEG motion vectors. A
model with two normal distribution terms is applied to
represent the simplified object motion. One term models
the noise embedded in the motion vector field produced
in the encoding stage and the other term models the
randomness of the true object motion. Experiments with
vehicle motion estimation from MPEG traffic video are
used to evaluate the proposed algorithm. The influence of
time window, frame size and reference frame distance are
investigated. The vehicle speeds can be estimated with a
high accuracy of 85% to 92%.
1. Introduction
Object motion estimation is a classic problem in
computer vision. In recent years, with the popularity
of MPEG video, much research effort has been
devoted to estimating object motion from MPEG motion
vectors. Although MPEG motion vectors were originally
designed to minimize the motion prediction error in
coding, they also embed rich motion information among
frames [1]. Since motion vectors are readily available in
MPEG streams, we need neither fully decode the
compressed video stream nor calculate the optical flow,
so considerable computation can be saved.
Motion-vector-based object motion estimation is
composed of two components: motion segmentation and
object tracking. It is assumed that objects are rigid or their
parts are rigidly connected to one another, and that objects
have continuous motion [1]. Thus an object can be
segmented from the background by clustering motion vectors
according to their similarity in direction or amplitude
[2,3,10]. In the next step, motion parameters are derived
from the motion vectors associated with this object for
tracking. Such algorithms are analogous to those used on
optical flow fields, and they all rely on the success of
moving object segmentation. However, the granularity of
the motion vector field limits the performance of
motion-vector-based object segmentation. To solve this problem,
several approaches have been proposed. Eng and Ma [5]
used unbiased fuzzy clustering to replace the well-known
fuzzy c-means clustering. They found that this algorithm
0-7803-8603-5/04/$20.00 ©2004 IEEE
2. Theoretical analysis
In this paper, we assume that the object motions are
homogeneous in both the spatial and temporal domains, and we
call this the stationary assumption. This assumption requires
that object motions are similar to one another in terms of
$$\xi = \bar{X}' - \mu \sim N\!\left(0, \frac{\sigma_x^2 + \sigma_n^2}{N}\right), \qquad (4)$$
and SNR is given by
(5)
Now let us analyze qualitatively the factors that influence
$\xi$ and SNR.
The estimation error is controlled by three factors:
the sample size $N$, the variance of the true object motion
$\sigma_x^2$ and the variance of noise $\sigma_n^2$.
The sample size $N$, i.e.,
the total number of motion vectors used for calculation of
the sample mean, is positively correlated with the time
window $T$ and the object density $d$. The variance of the
true object motion $\sigma_x^2$ is directly proportional to the time
window $T$ and inversely proportional to the density of
objects $d$. The variance of noise $\sigma_n^2$ is composed of the
variance of the motion estimation error and the variance
of the error caused by resolution limitation. The motion
estimation error results from the block-matching
algorithm in MPEG video. For a specific application, its
variance can reasonably be assumed to be constant. The
error caused by resolution limitation comes from the half-pixel
accuracy of MPEG motion vectors. It implies that the
motion estimation error has a lower bound: there is
always a minimum random error introduced by this half-pixel
accuracy. Its variance is also a constant. As a result,
the variance of noise $\sigma_n^2$ is a constant in equation (2).
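The dependence of the estimation error on the sample size can be checked numerically. The sketch below is a small Monte Carlo experiment, assuming each observed motion vector is the sum of a true motion drawn from $N(\mu, \sigma_x^2)$ and an independent noise term drawn from $N(0, \sigma_n^2)$, which is how we read the model in equation (2). The function name `simulate_error_variance` and all numeric values are hypothetical and chosen only for illustration.

```python
import random
from statistics import fmean, pvariance


def simulate_error_variance(mu, sigma_x, sigma_n, n, trials, seed=0):
    """Empirical variance of (sample mean - mu) over repeated trials.

    Assumption of this sketch: observed vector = true motion + independent
    zero-mean noise, both normally distributed.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        observed = [
            rng.gauss(mu, sigma_x) + rng.gauss(0.0, sigma_n) for _ in range(n)
        ]
        errors.append(fmean(observed) - mu)
    return pvariance(errors)


# Hypothetical parameters (pixels); not taken from the paper's experiments.
mu, sigma_x, sigma_n = 5.0, 2.0, 1.0
for n in (10, 100):
    predicted = (sigma_x**2 + sigma_n**2) / n  # variance implied by the model
    empirical = simulate_error_variance(mu, sigma_x, sigma_n, n, trials=2000)
    print(f"N={n}: predicted={predicted:.4f}, empirical={empirical:.4f}")
```

Under these assumptions the empirical variance should track $(\sigma_x^2 + \sigma_n^2)/N$, so collecting ten times as many motion vectors reduces the error variance roughly tenfold.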
SNR is also controlled by three factors: the variance of
the true object motion $\sigma_x^2$,
the variance of noise $\sigma_n^2$, and
the sample mean of the true object motion $\bar{X}$. $\sigma_x^2$ and $\sigma_n^2$
have been discussed above. To improve the estimation
accuracy of $\mu$, a larger SNR
is preferred. The amplitude of
the true object motion $\bar{X}$ is influenced by the speed of
objects, the frame size $F$, and the distance between the
current frame and its reference frame, $D_f$. For a given
application, we cannot control the speed of objects, but
we can improve the signal-to-noise ratio by increasing the
amplitude of motion vectors. By observation, the
amplitude of a motion vector is roughly in direct
$$X_i \sim N(\mu, \sigma_x^2), \qquad n_i \sim N(0, \sigma_n^2), \qquad X_i' \sim N(\mu, \sigma_x^2 + \sigma_n^2), \qquad (2)$$
where $\mu$ and $\sigma_x^2$ are the mean and variance of the true
object motion, and $\sigma_n^2$ is the variance of noise. We approximate
The mean of the true object motion is a parameter of
interest to users in applications because it represents
the dominant motion characteristics. It is therefore desirable to
improve its estimation accuracy. This can be achieved by
either reducing the variance of the estimation error or
fits.
4. Experimental results
We test the proposed model and the impact of the
influential factors with the test bed described in Section 3.
The test videos are two MPEG videos collected from
two Skycams. Each of them is 5 minutes
long and includes 6 lanes, representing various traffic
conditions at a given location. They were digitized by an
MPEG card in MPEG-1 format at a resolution of 352x288,
a frame rate of 25 fps, a reference frame distance $D_f = 3$ and
a constant bitrate of 1150 kbps. The object motion variable
is the vehicle speed in this case study. The mean speed
of each lane is calculated and compared with the ground truth
independently. Ground truth is obtained manually at 2-second
intervals.
First of all, we test the normal approximation of object
motion. Figure 2(a) shows the distribution of the estimated
speed within a lane for a 30-second test sequence. It is
bell-shaped and symmetric about its mean. The normal
Figure 4. The mean motion vectors (a) and the mean accuracies
of speed estimation (b) for test videos in different frame size and
reference frame distance. T=60s.
5. Conclusion and future work
In this paper, an algorithm that estimates object
motion from MPEG compressed video with a statistical
model was presented. This algorithm complements the
existing clustering based approaches in small object
scenarios where the latter are inefficient. Theoretical
analysis and experimental evaluation were conducted to
investigate the influential factors of the proposed model.
References
[1] Nevenka Dimitrova and Forouzan Golshani, Motion
Recovery for Video Content Classification, ACM
Transactions on Information Systems, Vol. 13, No. 4,
October 1995, pp. 408-439
[2] F. Bartolini, V. Cappellini, and C. Giani, Motion
Estimation and Tracking for Urban Traffic Monitoring,
Proceedings of IEEE International Conference on Image
Processing, 1996, pp. 87-90
[3] Heitou Zen, Tameharu Hasegawa, Shinji Ozawa, Moving
Object Detection from MPEG Coded Picture, Proceedings
of IEEE International Conference on Image Processing,
Vol. IV, pp. 25-29, Oct. 1999
[4] Kyongil Yoon, Daniel DeMenthon, David Doermann,
Event Detection from MPEG Video in the Compressed
Domain, International Conference on Pattern Recognition,
Vol. 1, pp. 1819-1825, Barcelona, Spain, September 3-8, 2000
[5] How-Lung Eng, Kai-Kuang Ma, Motion Trajectory
Extraction Based on Macroblock Motion Vectors for Video
Indexing, International Conference on Image Processing,
pp. 284-288, 1999
[6] R.V. Babu, K.R. Ramakrishnan, Compressed Domain
Motion Segmentation for Video Object Extraction,
IEEE International Conference on Acoustics,
Speech, and Signal Processing, Vol. 4, pp. 3788-3791, 2002
[7] Christophe Garcia, Georgios Tziritas, Optimal Projection of
2-D Displacements for 3-D Translational Motion
Estimation, Image and Vision Computing, Vol. 20, pp. 793-804, 2002
[8] Xiaodong Yu, Lingyu Duan, Qi Tian, Highway Traffic
Information Extraction from Skycam MPEG Video,
Proceedings of IEEE 5th Intelligent Transportation System
Conference, pp. 37-42, Sep. 3-6, 2002
[9] C.A. Gonzales, H. Yeo and C.J. Kuo, Requirements for
Motion Estimation Search Range in MPEG-2 Coded Video,
IBM Journal of Research and Development, Vol. 43, No. 4, July
1999
[10] Jim Wang and Ze-Nian Li, Kernel-based Multiple Cue
Algorithm for Object Segmentation, IS&T/SPIE Symp. on
Electronic Image and Video Communications and
Processing, 2000