Abstract
We give a new proof of a theorem of Shub & Smale [9] on the expectation of the number of roots of a system of m random polynomial equations in m real variables having a special isotropic Gaussian distribution. Further, we present a number of extensions, including the behaviour, as m → +∞, of the variance of the number of roots when the system of equations is also stationary.
Introduction
Centro de Matemática, Facultad de Ciencias, Universidad de la República. Calle Iguá 4225, 11400 Montevideo, Uruguay.
X_i(t) := Σ_{‖j‖ ≤ d_i} a_j t^j,    (1)
Rice formulae
In this section we give a brief account, without proofs, of Rice formulae, contained in the statements of the following two theorems (Azaïs and Wschebor [1]).
Theorem 1 Let V be a compact subset of R^m, Z : V → R^m a random field and u ∈ R^m a fixed point.
Assume that:
1) Z is Gaussian,
2) x ↦ Z(x) is a.s. of class C^1,
3) for each x ∈ V, Z(x) has a non-degenerate distribution; denote by p_{Z(x)} its density,
4) P{∃x ∈ V°, Z(x) = u, det Z′(x) = 0} = 0, where V° is the interior of V and Z′ denotes the derivative of the field Z(·),
5) λ_m(∂V) = 0, where ∂V is the boundary of V and λ_m is the Lebesgue measure on R^m (we will also use dx instead of λ_m(dx)).
Then, denoting N_u^Z(V) := #{x ∈ V : Z(x) = u}, one has

E(N_u^Z(V)) = ∫_V E(|det Z′(x)| / Z(x) = u) p_{Z(x)}(u) dx.    (3)
To sketch the argument, suppose that N_u^Z(V) = n and let x_1, …, x_n be the roots of Z(x) = u in V. By the inverse function theorem one can choose pairwise disjoint neighbourhoods U_1, …, U_n of these roots on which Z is a diffeomorphism; if x ∈ V \ ∪_{i=1}^n U_i, then Z(x) ∉ B_m(u, δ) for δ > 0 small enough. By the change-of-variables formula,

∫_V |det Z′(x)| 1_{{‖Z(x) − u‖ < δ}} dx = Σ_{i=1}^n λ_m(Z(U_i) ∩ B_m(u, δ)) = n λ_m(B_m(u, δ))

for such δ. Hence,

N_u^Z(V) = n = lim_{δ→0} (1/λ_m(B_m(u, δ))) ∫_V |det Z′(x)| 1_{{‖Z(x) − u‖ < δ}} dx.    (5)

Taking expectations and interchanging limit and integral leads to (3).
Main results
Consider the system

X_i(t) = Σ_{‖j‖ ≤ d_i} a_j^{(i)} t^j,  i = 1, …, m,

where the coefficients a_j^{(i)} are independent centered Gaussian random variables with

Var(a_j^{(i)}) = d_i! / (j_1! ⋯ j_m! (d_i − Σ_{h=1}^m j_h)!).

Then

E(N^X) = (d_1 ⋯ d_m)^{1/2}.    (7)
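For m = 1 this statement can be checked directly by simulation: a univariate polynomial of degree d whose independent centered Gaussian coefficients have variances C(d, j) has √d real roots on average. A minimal Monte Carlo sketch (NumPy; the sample sizes are illustrative choices, not from the paper):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
d, n_samples = 9, 2000
# Var(a_j) = C(d, j): the Shub & Smale weights for m = 1.
sigmas = np.array([comb(d, j) ** 0.5 for j in range(d + 1)])

def n_real_roots():
    # Coefficients a_0, ..., a_d of X(t) = sum_j a_j t^j.
    a = rng.standard_normal(d + 1) * sigmas
    roots = np.roots(a[::-1])          # np.roots expects highest degree first
    return int(np.sum(np.abs(roots.imag) < 1e-7))

mean_roots = np.mean([n_real_roots() for _ in range(n_samples)])
print(mean_roots)                      # close to sqrt(9) = 3
```

With 2000 samples the empirical mean settles near √d = 3, in agreement with (7).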
Consider now the more general situation in which the coefficients a_j^{(i)} remain independent for different i's but can now be correlated from one j to another for the same value of i. It is easy to check that this implies that for each i = 1, …, m, the covariance r^{X_i}(s, t) is a function of the triple (⟨s, t⟩, ‖s‖², ‖t‖²) (‖·‖ is the Euclidean norm in R^m). It can also be proved (Spivak [10]) that this function is in fact a polynomial with real coefficients, say Q^{(i)}:

r^{X_i}(s, t) = Q^{(i)}(⟨s, t⟩, ‖s‖², ‖t‖²).    (8)
In the particular case in which Q^{(i)} depends only on the first variable, we write

Q^{(i)}(u, v, w) = Q^{(i)}(u) = Σ_{k=0}^{d_i} c_k u^k.

In this case, it is known that the necessary and sufficient condition for it to be a covariance is that c_k ≥ 0, k = 0, 1, …, d_i. [Shub & Smale corresponds to the choice c_k = C(d_i, k).] Here is a simple proof of this fact using the method of Box & Hunter [4]. The covariance of the random field

X(t) = Σ_{‖j‖ ≤ d} a_j t^j,  t ∈ R^m,

having the form (1), where the random variables a_j are centered and in L², is given by

E(X(s)X(t)) = Σ_{‖j‖≤d, ‖j′‖≤d} γ_{j,j′} s^j t^{j′}.    (11)

On the other hand,

Σ_{k=0}^{d} c_k ⟨s, t⟩^k = Σ_{k=0}^{d} c_k Σ_{‖j‖=k} (k!/j!) (s_1 t_1)^{j_1} ⋯ (s_m t_m)^{j_m} = Σ_{‖j‖≤d} c_{‖j‖} (‖j‖!/j!) s^j t^j,

where j! := j_1! ⋯ j_m!.
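The multinomial expansion just used can be verified by brute force in small cases; the check below (NumPy/itertools, with arbitrary illustrative values of m, k, s, t) confirms that ⟨s, t⟩^k = Σ_{‖j‖=k} (k!/j!) Π_i (s_i t_i)^{j_i}:

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(1)
m, k = 3, 4
s, t = rng.standard_normal(m), rng.standard_normal(m)

lhs = float(np.dot(s, t)) ** k

# Sum over all multi-indices j = (j_1, ..., j_m) with |j| = k.
rhs = 0.0
for j in itertools.product(range(k + 1), repeat=m):
    if sum(j) != k:
        continue
    coef = math.factorial(k) / math.prod(math.factorial(ji) for ji in j)
    rhs += coef * math.prod((s[i] * t[i]) ** j[i] for i in range(m))

print(lhs, rhs)
```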
Associated with each Q^{(i)}, define

q_i(x) := Q_u^{(i)} / Q^{(i)},

and let r_i(x) denote the analogous expression built from the second-order partial derivatives of Q^{(i)} (for Q^{(i)} depending only on u it reduces to r_i = Q_uu^{(i)}/Q^{(i)} − (Q_u^{(i)}/Q^{(i)})²), where the functions in the right-hand sides are always computed at the triplet (x, x, x).
Put:

h_i(x) := 1 + x r_i(x) / q_i(x).
Then, for all Borel sets V with boundary having zero Lebesgue measure, we have

E(N^X(V)) = (2π)^{−m/2} L_{m−1} ∫_V (∏_{i=1}^m q_i(‖t‖²))^{1/2} E_h(‖t‖²) dt.    (13)

Here

E_h(x) := E((Σ_{i=1}^m h_i(x) ξ_i²)^{1/2}),

where ξ_1, …, ξ_m are i.i.d. standard normal random variables.
L_n := ∏_{j=1}^n K_j, where K_j := E(‖ξ‖) for ξ standard normal in R^j, that is,

K_j = √2 Γ((j+1)/2) / Γ(j/2).

In particular,

L_m = 2^{m/2} π^{−1/2} Γ((m+1)/2).
We define the integral

J_m := ∫_0^{+∞} ρ^{m−1} / (1 + ρ²)^{(m+1)/2} dρ = (π/2)^{1/2} · (1/K_m),

that will appear later on. We also need the surface area σ_{m−1} of the unit sphere S^{m−1} in R^m, σ_{m−1} = 2π^{m/2}/Γ(m/2).
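These constants are easy to cross-check numerically. The sketch below compares a Monte Carlo estimate of K_m = E‖ξ‖ with its Gamma-function expression, and evaluates J_m by quadrature via the substitution ρ = tan θ, which turns it into ∫_0^{π/2} sin^{m−1}θ dθ; the closed form J_m = √(π/2)/K_m used above is what the code verifies:

```python
import math
import numpy as np

rng = np.random.default_rng(2)
m = 5

# K_m = E||xi||, xi standard normal in R^m: sqrt(2) Gamma((m+1)/2)/Gamma(m/2).
K_m = math.sqrt(2) * math.gamma((m + 1) / 2) / math.gamma(m / 2)
K_m_mc = np.linalg.norm(rng.standard_normal((200_000, m)), axis=1).mean()

# J_m = int_0^inf rho^{m-1} (1+rho^2)^{-(m+1)/2} drho
#     = int_0^{pi/2} sin^{m-1}(theta) dtheta   (substitution rho = tan(theta)).
theta = np.linspace(0.0, math.pi / 2, 100_001)
f = np.sin(theta) ** (m - 1)
J_m_num = (theta[1] - theta[0]) * (f.sum() - 0.5 * (f[0] + f[-1]))
J_m = math.sqrt(math.pi / 2) / K_m

print(K_m, K_m_mc, J_m, J_m_num)
```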
Remark on formula (13). Note that formula (13) takes simpler forms in some special cases. For example, when the functions h_i(x) do not depend on i, denoting by h(x) their common value, we have

E_h(x) = (h(x))^{1/2} K_m.

If moreover the q_i are of the form q_i(x) = d_i q(x), as in the Shub & Smale case, passing to polar coordinates in (13) gives

E(N^X) = (2π)^{−m/2} (d_1 ⋯ d_m)^{1/2} L_m σ_{m−1} ∫_0^{+∞} ρ^{m−1} q(ρ²)^{m/2} (h(ρ²))^{1/2} dρ.
Proof of Theorem 4
Consider the normalized fields

Z_i(t) := X_i(t) / Q^{(i)}(‖t‖², ‖t‖², ‖t‖²)^{1/2},

which have variance 1. Denote Z(t) = (Z_1(t), …, Z_m(t))^T. Applying the Rice formula for the expectation of the number of zeros of Z (Theorem 1):

E(N^X(V)) = E(N^Z(V)) = ∫_V E(|det Z′(t)| / Z(t) = 0) p_{Z(t)}(0) dt,

where Z′(t) := [Z_1′(t) ⋮ … ⋮ Z_m′(t)] is the matrix obtained by concatenation of the vectors Z_1′(t), …, Z_m′(t). Note that since E(Z_i²(t)) is constant, it follows that E(Z_i(t) ∂Z_i(t)/∂t_j) = 0 for all i, j = 1, …, m. Since the field is Gaussian, this implies that Z_i(t) and Z_i′(t) are independent, and given that the coordinate fields Z_1, …, Z_m are independent, one can conclude that for each t, Z(t) and Z′(t) are independent. So

E(N^X(V)) = E(N^Z(V)) = (2π)^{−m/2} ∫_V E(|det Z′(t)|) dt.    (15)
The covariance of the derivative is obtained from

∂²r^{Z_i}(s,t)/∂s∂t |_{s=t} = r_i(‖t‖²) t t^T + q_i(‖t‖²) I_m,

so that, in an orthonormal basis of R^m whose first element is t/‖t‖, the variance matrix U^{Z_i}(t) of Z_i′(t) satisfies

U^{Z_i}(t) / q_i = Diag(h_i, 1, …, 1)

(its eigenvalues are ‖t‖² r_i + q_i = q_i h_i, q_i, …, q_i). Put now

T_i := Z_i′(t) / q_i^{1/2},

and set T := [T_1 ⋮ … ⋮ T_m]. We have

E(|det Z′(t)|) = (∏_{i=1}^m q_i)^{1/2} E(|det T|).    (16)
Now, we write

T = (W_1; …; W_m),

where the W_i are the random row vectors of T. Because of the independence of all the entries of T, we know that W_2, …, W_m are independent standard Gaussian vectors in R^m, while W_1 is independent of the others, centered Gaussian with covariance Diag(h_1, …, h_m). Moreover,

|det(T)| = ‖W_1‖ ∏_{j=2}^m d(W_j, S_{j−1}),

where S_{j−1} denotes the subspace of R^m generated by W_1, …, W_{j−1} and d denotes the Euclidean distance. Using the invariance under isometries of the standard normal distribution of R^m we know that, conditioning on W_1, …, W_{j−1}, the distance d(W_j, S_{j−1}) is distributed as the norm of a standard Gaussian vector in R^{m−j+1}. Hence,

E(|det(T)|) = E((Σ_{i=1}^m h_i(x) ξ_i²)^{1/2}) ∏_{j=1}^{m−1} K_j,

where ξ_1, …, ξ_m are i.i.d. standard normal in R. Using (16) and (15) we obtain (13).
5
Examples
5.1 Shub & Smale
Here Q(u) = (1+u)^d, so that

q(x) = d/(1+x);  r(x) = −d/(1+x)²;  h(x) = h_i(x) = 1/(1+x),

and the Remark on formula (13) gives back E(N^X) = d^{m/2}. Consider instead Q(u) = 1 + u^d, for which

q(x) = d x^{d−1}/(1+x^d);  h(x) = d/(1+x^d).

Then

E(N^X) = (2/π)^{1/2} K_m d^{(m+1)/2} ∫_0^{+∞} ρ^{md−1}/(1+ρ^{2d})^{(m+1)/2} dρ = d^{(m−1)/2},

which differs by a constant factor from the analogous Shub & Smale result for (1+u)^d, which is d^{m/2}.
5.2
Take now Q(u, v, w) = 1 + u + vw, that is, r^{X_i}(s,t) = 1 + ⟨s,t⟩ + ‖s‖²‖t‖². One finds

q(x) = 1/(1+x+x²);  r(x) = 3/(1+x+x²)²;  h(x) = (1+4x+x²)/(1+x+x²),

and

E(N^X) = H_m / J_m,  with  H_m = ∫_0^{+∞} ρ^{m−1} (1+4ρ²+ρ⁴)^{1/2} / (1+ρ²+ρ⁴)^{(m+1)/2} dρ.
5.3
Consider now a perturbation of the system in 5.1. Note that the factor 2 in Q has only been added for computational convenience and does not modify the random variable N^X of the unperturbed system. For the perturbed system, we get

q(x) = 2d x^{d−1}/(1+x^d)²;  r(x) = 2d(d−1) x^{d−2}/(1+x^d)²;  h(x) = d.

Therefore,

E(N^X) = (2/π)^{1/2} K_m 2^{m/2} d^{(m+1)/2} ∫_0^{+∞} ρ^{md−1}/(1+ρ^{2d})^m dρ = 2^{−(m−2)/2} d^{(m−1)/2},    (17)

which shows that the mean number of zeros is reduced by the perturbation at a geometrical rate as m grows.
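The closed-form evaluation in (17) can be cross-checked numerically; the constants below follow the reconstruction of (17) given above and should be read as a sketch of that computation:

```python
import math
import numpy as np

def check(m, d):
    # Left side of (17): sqrt(2/pi) K_m 2^{m/2} d^{(m+1)/2} * integral,
    # against the right side 2^{-(m-2)/2} d^{(m-1)/2}.
    K_m = math.sqrt(2) * math.gamma((m + 1) / 2) / math.gamma(m / 2)
    rho = np.linspace(0.0, 50.0, 400_001)
    f = rho ** (m * d - 1) / (1.0 + rho ** (2 * d)) ** m
    integral = (rho[1] - rho[0]) * (f.sum() - 0.5 * (f[0] + f[-1]))
    lhs = math.sqrt(2 / math.pi) * K_m * 2 ** (m / 2) * d ** ((m + 1) / 2) * integral
    rhs = 2 ** (-(m - 2) / 2) * d ** ((m - 1) / 2)
    return lhs, rhs

pairs = [check(m, d) for m in (2, 3, 4) for d in (2, 3)]
print(pairs)
```

For each (m, d) tried the two sides agree to the accuracy of the quadrature.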
5.4
Consider again the case in which the polynomials Q^{(i)} are all equal and the covariances depend only on the scalar product, i.e. Q^{(i)}(u,v,w) = Q(u). We assume further that the roots of Q, which we denote −λ_1, …, −λ_d, are real (0 < λ_1 ≤ ⋯ ≤ λ_d). We get

q(x) = Σ_{h=1}^d 1/(x+λ_h);  r(x) = −Σ_{h=1}^d 1/(x+λ_h)²;  h(x) = (1/q(x)) Σ_{h=1}^d λ_h/(x+λ_h)².

It is easy now to write an upper bound for the integrand in (13) and compute the remaining integral, thus obtaining the inequality

E(N^X) ≤ d^{m/2} (λ_d/λ_1)^{1/2},

which is sharp if λ_1 = ⋯ = λ_d.

In the particular case d = 2, λ_1 = μ, λ_2 = 1 (0 < μ ≤ 1), the formula in the Remark on (13) reads

E(N^X) = (2/π)^{1/2} K_m ∫_0^{+∞} ρ^{m−1} [1/(1+ρ²) + 1/(μ+ρ²)]^{(m−1)/2} [1/(1+ρ²)² + μ/(μ+ρ²)²]^{1/2} dρ.    (18)
One can compute the limit of the right-hand side as μ → 0. For this purpose, notice that the function μ/(μ+ρ²)² attains its maximum at ρ² = μ and is dominated by 1/(4ρ²). We divide the integral in the right-hand member of (18) into two parts, setting for some δ > 0:

I_{μ,δ} := ∫_0^δ ρ^{m−1} [1/(1+ρ²) + 1/(μ+ρ²)]^{(m−1)/2} [1/(1+ρ²)² + μ/(μ+ρ²)²]^{1/2} dρ,

and

J_{μ,δ} := ∫_δ^{+∞} ρ^{m−1} [1/(1+ρ²) + 1/(μ+ρ²)]^{(m−1)/2} [1/(1+ρ²)² + μ/(μ+ρ²)²]^{1/2} dρ.

By dominated convergence, as μ → 0,

J_{μ,δ} → ∫_δ^{+∞} [(2ρ²+1)/(ρ²+1)]^{(m−1)/2} dρ/(1+ρ²).

For the first part, bounding the integrand from below and from above and performing the change of variables ρ = √μ z, one checks that, as μ → 0 and then δ → 0,

I_{μ,δ} → ∫_0^{+∞} [z²/(1+z²)]^{(m−1)/2} dz/(z²+1) = J_m,    (19)

so that

lim_{μ→0} E(N^X) = (2/π)^{1/2} K_m [ J_m + ∫_0^{+∞} ((2ρ²+1)/(ρ²+1))^{(m−1)/2} dρ/(1+ρ²) ].    (20)

Finally, since 2ρ²/(ρ²+1) < (2ρ²+1)/(ρ²+1) < 2, the last integral lies between 2^{(m−1)/2} J_m and 2^{(m−1)/2} π/2, so that the limit grows geometrically, at rate 2^{m/2} up to a factor which is at most polynomial in m.
5.5
An analytic example
Take Q^{(i)}(u) = e^{d_i u} with d_i > 0, so that all the coefficients c_k^{(i)} = d_i^k/k! are non-negative.    (21)
Then q_i(x) = d_i, r_i(x) = 0 and h_i(x) = 1, and (13) reduces to

E(N^X(V)) = ∫_V g_m(t) dt,  with  g_m(t) = (2π)^{−m/2} (d_1 ⋯ d_m)^{1/2} L_m,

a constant, so that E(N^X(V)) = (2π)^{−m/2} (d_1 ⋯ d_m)^{1/2} L_m λ_m(V).
Consider now the stationary case:

r^{X_i}(s, t) = φ_i(‖t − s‖²),  (i = 1, …, m).    (22)

The derivatives ∂X_i/∂t_j(0) are centered, and

∂²r^{X_i}(s,t)/∂s∂t |_{t=s} = −2φ_i′(0) I_m,

which implies, again using the same method as in the proof of Theorem 4:

E(|det(X′(0))|) = 2^{m/2} L_m ∏_{i=1}^m |φ_i′(0)|^{1/2},

so that

E(N^X(V)) = π^{−m/2} ∏_{i=1}^m |φ_i′(0)|^{1/2} L_m λ_m(V).    (24)
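For m = 1, (24) reduces to the classical Rice formula E(N^X([0,T])) = (T/π)(2|φ′(0)|)^{1/2}. This is easy to check by simulation; the sketch below uses spectral synthesis (random cosines, an approximation not taken from the paper) for φ(x) = e^{−x}, i.e. covariance e^{−(t−s)²}, for which 2|φ′(0)| = 2 and the predicted mean is T√2/π:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n_freq, n_rep = 10.0, 256, 300
tgrid = np.linspace(0.0, T, 1001)

def sample_zero_count():
    # Spectral synthesis: covariance e^{-(t-s)^2} has spectral law N(0, 2).
    w = rng.normal(0.0, np.sqrt(2.0), n_freq)
    phase = rng.uniform(0.0, 2 * np.pi, n_freq)
    x = np.sqrt(2.0 / n_freq) * np.cos(np.outer(tgrid, w) + phase).sum(axis=1)
    return int(np.count_nonzero(np.diff(np.sign(x))))

mean_zeros = np.mean([sample_zero_count() for _ in range(n_rep)])
target = T * np.sqrt(2.0) / np.pi
print(mean_zeros, target)
```

The empirical mean number of zero crossings on [0, 10] is close to 10√2/π ≈ 4.5.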
Our next task is to give a formula for the variance of N^X(V) and use it to prove that, under certain additional conditions, the variance of

n^X(V) := N^X(V) / E(N^X(V)),

which obviously has mean value equal to 1, grows exponentially when the dimension m tends to infinity. In other words, one should expect large fluctuations of n^X(V) around its mean for systems having large m.

From now on we take φ_i = φ for all i. It is well known [8] that φ generates a covariance as in (22) in every dimension m, with φ(0) = 1, if and only if there exists a probability measure G on [0, +∞) such that

φ(x) = ∫_0^{+∞} e^{−xw} G(dw).    (25)

We will also assume that

∫_0^{+∞} x² G(dx) < ∞,    (26)

and prove that, under these hypotheses, Var(n^X(V_m)) grows exponentially fast as m → +∞.
Proof: To compute the variance of N^X(V) note first that

Var(N^X(V)) = E(N^X(V)(N^X(V) − 1)) + E(N^X(V)) − (E(N^X(V)))²,    (27)

and that, by the Rice formula for the second factorial moment,

E(N^X(V)(N^X(V) − 1)) = ∫_{V×V} E(|det X′(s)| |det X′(t)| / X(s) = X(t) = 0) p_{X(s),X(t)}(0,0) ds dt,    (29)

where p_{X(s),X(t)}(·,·) denotes the joint density of the random vectors X(s), X(t).
Next we compute the ingredients of the integrand in (29). Because of invariance under translations, the integrand is a function of τ = t − s. We denote by τ_1, …, τ_m the coordinates of τ.
The Gaussian density is immediate:

p_{X(s),X(t)}(0,0) = (2π)^{−m} [1 − φ²(‖τ‖²)]^{−m/2}.    (30)
(30)
Xi Xi
(s)
(s)/C
s
s
Xi Xi
(s)
(t)/C
s
t
2r
s t
t=s
1
r
r
(s, t)
(s, t)
2
1 (r(s, t)) s
s
r
2r
1
r
(s, t) +
(s, t)
(s, t)r(s, t).
2
s t
1 (r(s, t)) s
t
18
Asi Asi
Ati Ati
=E
= 2 (0)
E Asi Ati = 4 2 4
and for every i = j:
2
,
4
1 2
2
,
1 2
(31)
(32)
Choose an orthonormal basis of R^m whose first element is τ/‖τ‖. In this basis, the covariance matrix of the pair of row vectors (A_i^s, A_i^t) takes the block form

T = ( Diag(U_0, V_0, …, V_0)   Diag(U_1, V_1, …, V_1) )
    ( Diag(U_1, V_1, …, V_1)   Diag(U_0, V_0, …, V_0) ),

where

U_0 = U_0(‖τ‖²) = −2φ′(0) − 4‖τ‖² φ′²/(1 − φ²);   V_0 = −2φ′(0);
U_1 = U_1(‖τ‖²) = −2φ′ − 4‖τ‖² [φ″ + φφ′²/(1 − φ²)];   V_1 = V_1(‖τ‖²) = −2φ′,

and there are zeros outside the diagonals of each one of the four blocks. Let us perform a second regression of A_{iα}^t on A_{iα}^s, that is, write the orthogonal decompositions

A_{iα}^t = B_{iα}^{t,s} + C_α A_{iα}^s  (i, α = 1, …, m),

where B_{iα}^{t,s} is centered Gaussian, independent of the matrix A^s, and

for α = 1:  C_1 = U_1/U_0,  Var(B_{i1}^{t,s}) = U_0 (1 − U_1²/U_0²);
for α > 1:  C_α = V_1/V_0,  Var(B_{iα}^{t,s}) = V_0 (1 − V_1²/V_0²).

Conditioning we have:

E(|det(A^s)| |det(A^t)|) = E(|det(A^s)| E(|det((B_{iα}^{t,s} + C_α A_{iα}^s)_{i,α=1,…,m})| / A^s)),

with obvious notations. For the inner conditional expectation, we can proceed in the same way as we did in the proof of Theorem 4 to compute the determinant, obtaining a product of expectations of Euclidean norms of non-centered Gaussian vectors in R^k for k = 1, …, m. Now we use the well-known inequality

E(‖ξ + v‖) ≥ E(‖ξ‖),

valid for ξ centered Gaussian and v non-random, to obtain the lower bound

E(|det(A^s)| |det(A^t)|) ≥ U_0 V_0^{m−1} (1 − U_1²/U_0²)^{1/2} (1 − V_1²/V_0²)^{(m−1)/2} L_m².
Substituting into (29) and using (24) and (30), one obtains a lower bound for the variance of n^X(V) of the form

Var(n^X(V)) ≥ −1/E(N^X(V)) + (1/λ_m(V)²) ∫_{V×V} H(‖t − s‖²) (1 − V_1²/V_0²)^{(m−1)/2} (1 − φ²(‖t − s‖²))^{−m/2} ds dt.    (33)

Let us put V = V_m in (33) and study the integrand in the right-hand member. The function

H(x) = [ (U_0²(x) − U_1²(x)) / (V_0² − V_1²(x)) ]^{1/2}

is continuous for x > 0. Let us show that it does not vanish if x > 0. It is clear that U_1² ≤ U_0², on applying the Cauchy-Schwarz inequality to the pair of variables A_{i1}^s, A_{i1}^t. The equality holds if and only if the variables
A_{i1}^s, A_{i1}^t are linearly dependent. This would imply that the distribution, in R⁴, of the random vector

ζ := (X(s), X(t), ∂_1 X(s), ∂_1 X(t))

would degenerate for s ≠ t (we have denoted by ∂_1 differentiation with respect to the first coordinate). We will show that this is not possible. Notice first that for each w > 0, the function

(s, t) ↦ e^{−‖t−s‖² w}

is the covariance of a stationary Gaussian field with C^∞ paths and spectral density proportional to e^{−‖x‖²/(4w)} (x ∈ R^m). Using (25),

Var(ζ) = ∫_0^{+∞} Var(ζ_w) G(dw),    (34)

where ζ_w denotes the analogous vector for the field with covariance e^{−‖t−s‖²w}; since Var(ζ_w) is non-degenerate for every s ≠ t, w > 0, the same holds for Var(ζ), which proves the claim.
Acknowledgments
This work was supported by ECOS action U03E01. The authors thank two anonymous referees for their remarks, which have contributed to improve the final version, and for drawing our attention to the paper by Kostlan [7].
References
[1] J-M. Azaïs and M. Wschebor. On the distribution of the maximum of a Gaussian field with d parameters. Ann. Appl. Probab., to appear, 2004. See also the preprint http://www.lsp.ups-tlse.fr/Azais/publi/ds1.pdf.
[2] A. T. Bharucha-Reid and M. Sambandham. Random polynomials. Probability and Mathematical Statistics. Academic Press Inc., Orlando, FL, 1986.
[3] L. Blum, F. Cucker, M. Shub, and S. Smale. Complexity and real computation. Springer-Verlag, New York, 1998. With a foreword by Richard M. Karp.
[4] G. Box and J. Hunter. Multi-factor experimental designs for exploring response surfaces. Ann. Math. Stat., 28:195-241, 1957.
[5] A. Edelman and E. Kostlan. How many zeros of a random polynomial are real? Bull. Amer. Math. Soc. (N.S.), 32(1):1-37, 1995.
[6] M. Kac. On the average number of real roots of a random algebraic equation. Bull. Amer. Math. Soc., 49:314-320, 1943.
[7] E. Kostlan. On the expected number of real roots of a system of random polynomial equations. In Foundations of computational mathematics (Hong Kong, 2000), pages 149-188. World Sci. Publishing, River Edge, NJ, 2002.
[8] I. J. Schoenberg. Metric spaces and completely monotone functions. Ann. of Math. (2), 39(4):811-841, 1938.
[9] M. Shub and S. Smale. Complexity of Bezout's theorem. II. Volumes and probabilities. In Computational algebraic geometry (Nice, 1992), volume 109 of Progr. Math., pages 267-285. Birkhauser Boston, Boston, MA, 1993.
[10] M. Spivak. A comprehensive introduction to differential geometry. Vol. V. Publish or Perish Inc., Wilmington, Del., second edition, 1979.
Given the lattice points t_1, …, t_M and computing

I_hat = (1/M) Σ_{i=1}^M h(t_i),

the randomized estimator is

I_hat(U) = (1/M) Σ_{i=1}^M h(t_i + U mod 1),

where U is uniform on [0,1]^n. It is clear that E(I_hat(U)) = I, and general considerations on QMC integration imply that I_hat(U) has small variance. So we can make N independent replications of this calculation, computing

I_hat = (1/N)(I_hat(U_1) + ⋯ + I_hat(U_N)),

and construct Student-type confidence intervals. This interval is correct whatever the properties of the function h are. In practice N is chosen rather small (≈ 12), so that the MC step implies roughly a loss of speed of 12 with respect to a pure QMC method. But on the other hand we have a reliable estimation of the error.
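The procedure just described can be sketched in a few lines; the lattice (a Fibonacci rule with M = 233 and generator (1, 144)), the smooth test integrand, and the sample sizes below are illustrative choices, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(4)
M, gen = 233, np.array([1, 144])          # rank-1 Fibonacci lattice, dimension 2
lattice = (np.outer(np.arange(M), gen) / M) % 1.0

def h(t):                                  # test integrand; exact integral (e-1)^2
    return np.exp(t[:, 0] + t[:, 1])

N = 12                                     # number of random shifts
estimates = np.array([h((lattice + rng.uniform(size=2)) % 1.0).mean()
                      for _ in range(N)])
I_hat = estimates.mean()
half_width = 2.201 * estimates.std(ddof=1) / np.sqrt(N)   # t quantile, 11 d.f.
exact = (np.e - 1.0) ** 2
print(I_hat, "+/-", half_width, "exact:", exact)
```

The Student half-width gives a usable error estimate at the cost of the factor N = 12 in speed.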
We will present some numerical applications and also applications to design of experiments in large dimension, making a comparison with LHS and Orthogonal Arrays.
References
Azaïs, J-M. and Genz, A. (2009). Computation of the distribution of the maximum of stationary Gaussian sequences and processes. In preparation.
Alan Genz's web site: http://www.math.wsu.edu/faculty/genz/homepage
Genz, A. (1992). Numerical Computation of Multivariate Normal Probabilities. J. Comp. Graph. Stat. 1, pp. 141-150.
Nuyens, D. and Cools, R. (2006). Fast algorithms for component-by-component construction of rank-1 lattice rules in shift-invariant reproducing kernel Hilbert spaces. Math. Comp. 75, pp. 903-920.
Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes
Jean-Marc AZAÏS (Université de Toulouse). Joint work with: Alan Genz (Washington State University), Cécile Mercadier (Lyon, France) and Mario Wschebor.
Introduction

[Two introductory figures: plots of a simulated series over the index range 0-120.]
Testing
The maximum of the absolute value of the series is 3.0224. An estimation of the covariance with WAFO gives:

[Figure: estimated covariance function over lags 0-120.]
I := ∫_{l_1}^{u_1} ⋯ ∫_{l_n}^{u_n} φ_Σ(x) dx,    (1)

where φ_Σ is the N(0, Σ) density. Writing Σ = T T′ with T lower triangular (Cholesky) and substituting x = Tz:

I = ∫_{l_1/T_{11}}^{u_1/T_{11}} φ(z_1) dz_1 ∫_{(l_2 − T_{21}z_1)/T_{22}}^{(u_2 − T_{21}z_1)/T_{22}} φ(z_2) dz_2 ⋯    (2)
Substituting further t_i = Φ(z_i):

I = ∫_{Φ(l_1/T_{11})}^{Φ(u_1/T_{11})} dt_1 ∫_{Φ((l_2 − T_{21}Φ^{−1}(t_1))/T_{22})}^{Φ((u_2 − T_{21}Φ^{−1}(t_1))/T_{22})} dt_2 ⋯    (3)

and, after rescaling each variable to [0,1], an integral of the form

I = ∫_{[0,1]^n} h(t) dt.    (4)
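For n = 2 the chain (1)→(3)→(4) fits in a few lines. The sketch below (using Python's statistics.NormalDist; the bounds and correlation are illustrative) evaluates the transformed one-dimensional integral by midpoint quadrature and compares it with a crude Monte Carlo estimate of the same probability:

```python
import numpy as np
from statistics import NormalDist

nd = NormalDist()
Phi, Phinv = nd.cdf, nd.inv_cdf

# P(l <= Z <= u) for Z ~ N(0, Sigma), n = 2 (illustrative values).
l = np.array([-1.0, -np.inf])
u = np.array([1.0, 0.5])
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
T = np.linalg.cholesky(Sigma)

d1, e1 = Phi(l[0] / T[0, 0]), Phi(u[0] / T[0, 0])

def h(t1):
    # Inner factor of (4) after the substitution z1 = Phinv(d1 + t1*(e1 - d1)).
    z1 = Phinv(d1 + t1 * (e1 - d1))
    return (Phi((u[1] - T[1, 0] * z1) / T[1, 1])
            - Phi((l[1] - T[1, 0] * z1) / T[1, 1]))

ts = (np.arange(10_000) + 0.5) / 10_000
I_genz = (e1 - d1) * np.mean([h(t) for t in ts])

rng = np.random.default_rng(5)
Z = rng.multivariate_normal([0.0, 0.0], Sigma, size=400_000)
I_mc = np.mean((Z[:, 0] > -1) & (Z[:, 0] < 1) & (Z[:, 1] < 0.5))
print(I_genz, I_mc)
```

The transformed integrand h is smooth, which is what makes the QMC treatment of (4) effective.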
QMC
In the form (4) the MC evaluation is based on

I_hat = (1/M) Σ_{i=1}^M h(t_i),

with t_1, …, t_M i.i.d. uniform on [0,1]^n.
Theorem
(Nuyens and Cools, 2006) Assume that h is the tensor product of periodic functions that belong to a Korobov space (RKHS). Then the minimax sequence and the worst-case error can be calculated by a polynomial algorithm. Numerical results show that the convergence is roughly O(M^{−1}).
This result concerns the worst case, so it is not so relevant in practice.
A meta theorem
MCQMC
Let (t_i) be the lattice sequence. The estimation of the integral can be turned into a random but exactly unbiased one by setting

I_hat(U) = (1/M) Σ_{i=1}^M h(t_i + U mod 1),

with U uniform on [0,1]^n.
Do processes exist?
In this part X(t) is a Gaussian process defined on a compact interval [0, T].
Since such a process is always observed in a finite set of times, and since the previous methods work with, say, n = 1000, is it relevant to consider the continuous case?
The answer is yes: random processes occur as limit statistics. Consider for example the simple mixture model

H_0: Y ~ N(0, 1)
H_1: Y ~ p N(0, 1) + (1 − p) N(μ, 1),  p ∈ [0, 1], μ ∈ R.    (5)
Under H_0, the likelihood ratio test statistic converges in distribution to the maximum of (the square of) a Gaussian process with covariance

r(s, t) = (e^{st} − 1) / √((e^{s²} − 1)(e^{t²} − 1)).    (6)
An example

Extensions
Treat all the cases: maximum of the absolute value, non-centered, non-stationary. In each case some tricks have to be used.
A great challenge is to use such formulas for fields.
References

THANK-YOU
MERCI
GRACIAS
Abstract
This paper deals with the asymptotic behavior, as the level tends to +∞, of the tail of the distribution of the maximum of a stationary Gaussian process on a fixed interval of the line. For processes satisfying certain regularity conditions, we give a second-order term for this asymptotics.
Introduction
X = {X(t); t ∈ [0,T]}, T > 0, is a real-valued centered stationary Gaussian process with covariance r, r(0) = 1, and M_T := max_{t∈[0,T]} X(t). Under mild regularity conditions it is known that the error in the approximation of P(M_T > u) by

1 − Φ(u) + (T √λ_2 /(2π)) e^{−u²/2}

is bounded by

B exp(−(1 + δ) u²/2)    (1)

for some constants B > 0 and δ > 0, where λ_2 := −r″(0); Φ (respectively φ) denotes the standard normal distribution (respectively density).
The aim of this paper is to improve the description of the asymptotic behavior of P(M_T > u) as u → +∞ that follows from (1), replacing the bound for the error by an equivalent as u → +∞. More precisely, under the regularity conditions required in Theorem 1.1, we will prove that:

P(M_T > u) = 1 − Φ(u) + (T √λ_2 /(2π)) e^{−u²/2} − (1 + o(1)) c T exp(−(u²/2) · λ_4/(λ_4 − λ_2²)),    (2)

where λ_4 is the fourth spectral moment and c is an explicit positive constant depending only on the spectral moments of the process.
This contradicts Theorem 3.1 in Piterbarg's paper, in which a different equivalent is given in case T is small enough (see also Azaïs et al. (1999)).
We will assume further that X has C^∞ sample paths and that for every n ≥ 1 and pairwise different values t_1, …, t_n in [0,T], the distribution of the set of 5n random variables (X^{(j)}(t_1), …, X^{(j)}(t_n); j = 0, 1, 2, 3, 4) is non-degenerate. A sufficient condition for this to hold is that the spectral measure of the process not be purely atomic or, if it is purely atomic, that the set of atoms have an accumulation point in the real line. (A proof of these facts can be done in the same way as in Chap. 10 of Cramér and Leadbetter, 1967.)
If ξ is a random vector with values in R^n whose distribution has a density with respect to Lebesgue measure, we denote by p_ξ(x) the density of ξ at the point x ∈ R^n. 1_C denotes the indicator function of the set C.
If Y = {Y(t) : t ∈ R} is a process in L², we put Γ^Y(s,t) for its covariance function and Γ^Y_{ij}(s,t) = ∂^{i+j} Γ^Y(s,t)/∂s^i ∂t^j for the partial derivatives, whenever they exist.
The proof of (2) will consist in computing the density of the distribution of the random variable M_T and studying its asymptotic behavior as u → +∞. Our main tool is the following proposition, which is a special case of the differentiation Lemma 3.3 in Azaïs and Wschebor (1999):

Proposition 1.1 Let Y be a Gaussian process with C^∞ paths and such that for every n ≥ 1 and pairwise different values t_1, …, t_n in [0,T] the distribution of the set of 3n random variables (Y^{(j)}(t_1), …, Y^{(j)}(t_n); j = 0, 1, 2) is non-degenerate. Assume also that E(Y(t)) = 0, Γ^Y(t,t) = E(Y²(t)) = 1.
Then, if β is a C^∞ function on [0,T], the function u ↦ P(Y(s) ≤ uβ(s), ∀s ∈ [0,T]) is differentiable, and its derivative is the sum of two boundary terms, corresponding to t = 0 and t = T and expressed by means of the regressions Y^ℓ, Y^a defined below, and of the integral term

−∫_0^T E((Y^t(t) − β^t(t)u) 1_{{Y^t(s) ≤ uβ^t(s), ∀s ∈ [0,T]}}) p_{Y(t),Y′(t)}(uβ(t), uβ′(t)) dt.    (5)
Here the functions β^ℓ, β^a, β^t and the (random) functions Y^ℓ, Y^a, Y^t are the continuous extensions to [0,T] of:

β^ℓ(s) = (1/s)(β(s) − Γ^Y(s,0)β(0)),  Y^ℓ(s) = (1/s)(Y(s) − Γ^Y(s,0)Y(0)),  for 0 < s ≤ T;    (6)

β^a, Y^a are defined in the same way, replacing the regression on the value at t = 0 by the regression on the value at t = T;    (7)

β^t(s) = (2/(s−t)²)(β(s) − Γ^Y(s,t)β(t) − (Γ^Y_{01}(t,s)/Γ^Y_{11}(t,t)) β′(t)),  0 ≤ s ≤ T, s ≠ t;    (8)

Y^t(s) = (2/(s−t)²)(Y(s) − Γ^Y(s,t)Y(t) − (Γ^Y_{01}(t,s)/Γ^Y_{11}(t,t)) Y′(t)),  0 ≤ s ≤ T, s ≠ t.    (9)
We will repeatedly use the following Lemma. Its proof is elementary and we omit it.

Lemma 1.1 Let f and g be real-valued functions of class C^∞ defined on the interval [0,T] of the real line, verifying the conditions:
1) f has a unique minimum on [0,T] at the point t = t_0, and f′(t_0) = 0, f″(t_0) > 0;
2) let k = inf{j : g^{(j)}(t_0) ≠ 0}, and suppose k = 0, 1 or 2.
Define

h(u) = ∫_0^T g(t) exp(−(1/2) u² f(t)) dt.

Then, as u → +∞:

h(u) ≈ (g^{(k)}(t_0)/k!) u^{−(k+1)} exp(−(1/2) u² f(t_0)) ∫_J x^k exp(−(1/4) f″(t_0) x²) dx,

where J = [0, +∞), J = (−∞, 0] or J = (−∞, +∞) according as t_0 = 0, t_0 = T or 0 < t_0 < T, respectively.
From now on, X is a centered stationary Gaussian process with C^∞ paths, covariance r(·) normalized by λ_2 = −r″(0) = 1, and such that for every n ≥ 1 and pairwise different t_1, …, t_n in [0,T], the distribution of the set of 5n random variables (X^{(j)}(t_1), …, X^{(j)}(t_n); j = 0, 1, 2, 3, 4) is non-degenerate. We shall also assume the additional hypothesis that r′ < 0 on a set dense in [0,T].
Step 1. Proposition 1.1 applied to the process Y = X and the function β(t) = 1 for all t ∈ [0,T] enables us to write the density p_{M_T} of the distribution of the maximum M_T as:

p_{M_T}(u) = [A_1(u) + A_2(u) + A_3(u)] φ(u),  with    (10)

A_3(u) = −(1/√(2π)) ∫_0^T E((X^t(t) − β^t(t)u) 1_{{X^t(s) ≤ uβ^t(s), ∀s ∈ [0,T]}}) dt,

A_1 and A_2 denoting the boundary terms at t = 0 and t = T. Since X is a stationary process and β(t) ≡ 1, the processes X and X̃, defined as X̃(t) = X(T − t), have the same law, so that P(X(s) ≤ u for all s ∈ [0,T] | X(0) = u) = P(X(s) ≤ u for all s ∈ [0,T] | X(T) = u). Hence, A_1(u) = A_2(u).
Step 2.
Put

b(s) := β^ℓ(s)/(E((X^ℓ(s))²))^{1/2} = ((1 − r(s))/(1 + r(s)))^{1/2},  s ∈ [0,T].    (11)

Then b(0) = 0 and b(s) > 0 for 0 < s ≤ T. To study A_2(u) = P(Y^a(s) ≤ u b^a(s), ∀s ∈ [0,T]), note first that this probability is bounded by P(Y^a(T) ≤ u b^a(T)), and check that

E((Y^a(T))²) = E((Y^0(T))²),

so that, since the non-degeneracy hypothesis implies that for each T > 0, E((Y^0(T))²) is non-zero, the relevant conditional variance is strictly positive for T > 0. Hence,

P(Y^a(s) ≤ u b^a(s), ∀s ∈ [0,T]) ≤ Φ(b(T)u) + C(T) exp(−(u²/2) F(T)),    (12)

with C(T) > 0, where F(t) is the function obtained by the same normalization as in (11); it is well defined since the denominator does not vanish because of the previous remark.
The following properties of the function F are elementary and will be useful in our calculations.
(a) F has a continuous extension at t = 0.
(b) F(t) > F(0) for t ≠ 0, because:
  * the sign of F′(t) is governed by the factor (r′(t))² − (1 − r″(t))(1 − r(t)),
  * r′(t) < 0 for t ∈ A ⊂ [0,T] with A dense in [0,T], and
  * for t ≠ 0,
    (r′(t))² − (1 − r″(t))(1 − r(t)) < 0,
    which follows from the Cauchy-Schwarz inequality applied to the pair of variables X′(t) + X′(0), X(t) − X(0), together with the non-degeneracy hypothesis.
(c) F′(0) = 0.
(d) F″(0) > 0.
Next, write L^Y(u, β) for the integral term in Proposition 1.1:

L^Y(u, β) = −∫_0^T E((Y^t(t) − β^t(t)u) 1_{{Y^t(s) ≤ uβ^t(s), ∀s ∈ [0,T]}}) p_{Y(t),Y′(t)}(uβ(t), uβ′(t)) dt,    (13)

where the Gaussian density factors as

p_{Y(t),Y′(t)}(uβ(t), uβ′(t)) = (1/(2π)) (E((Y′(t))²))^{−1/2} exp(−(u²/2)[β²(t) + (β′(t))²/E((Y′(t))²)]) = (1/(2π)) (E((Y′(t))²))^{−1/2} exp(−(u²/2) F(t)).

Applying Lemma 1.1 and the properties of the functions F and b, one gets successively

∫_0^T b(t) E(Y^t(t) 1_{{…}}) p_{Y(t),Y′(t)}(u b(t), u b′(t)) dt = O((1/u) exp(−(u²/2) F(0))),    (14)

u ∫_0^T b(t) b^t(t) E(1_{{…}}) p_{Y(t),Y′(t)}(u b(t), u b′(t)) dt = O((1/u) exp(−(u²/2) F(0)))    (15)

(for (15), note that the function g(t) = b(t) b^t(t) verifies g(0) = g′(0) = 0 and g″(0) ≠ 0, so that Lemma 1.1 applies with k = 2). Hence

L^Y(u, β) = O((1/u) exp(−(u²/2) F(0))),    (16)

and in particular

(d/du) A_2(u) = O((1/u) exp(−(u²/2)(F(0) − 1))).    (17)

Further, observe that since Y is continuous and b(s) > 0 for s ∈ ]0,T], b(0) = 0, if Y(0) > 0 the event {Y(s) ≤ u b(s), ∀s ∈ [0,T]} does not occur for positive u, and if Y(0) < 0, the same event occurs if u is large enough. This implies that A_2(u) → 1/2 as u → +∞, and so,

A_2(u) − 1/2 = −∫_u^{+∞} (d/dv) A_2(v) dv = O((1/u) exp(−(u²/2)(F(0) − 1)))    (18)

on applying (17).
Step 3.
We will now give an equivalent for A_3(u). Introduce the following notations: for t ∈ [0,T],

b^t(s) := β^t(s)/(E((X^t(s))²))^{1/2},

and one verifies that

(b^t(s))² = F(t − s) for s ∈ [0,T], s ≠ t,

while, by continuity, b^t(t) = (F(0))^{1/2}. Since F′(0) = 0, the function s ↦ F(t − s) attains its minimum value F(0) at s = t.
With these notations, write

A_3(u) = (u/√(2π)) ∫_0^T B(u,t) dt − (1/√(2π)) ∫_0^T B̃(u,t) dt =: S(u) − T(u),    (19)

where B(u,t) and B̃(u,t) collect, respectively, the contributions of the terms β^t(t)u and X^t(t) in A_3(u). We will consider in detail the behavior of the first term as u → +∞.
We apply again Proposition 1.1 to compute the derivative of B(u,t) with respect to u. For t ∈ ]0,T], denote by Z_t the process X^t normalized by its standard deviation. The boundary term satisfies

L^{Z_t}(u,t) = b^t(T) P(Z_t^a(s) ≤ u b_t^a(s), ∀s ∈ [0,T]) φ(b^t(T)u) ≤ (1/√(2π)) (F(T−t))^{1/2} exp(−(u²/2) F(T−t)),    (20)

so that, applying Lemma 1.1 once more,

∫_0^T L^{Z_t}(u,t) dt ≤ C (1/u) exp(−(u²/2) F(0))    (21)

for some constant C. For the integral term, the Gaussian density that appears is equal to

(1/(2π)) (E((Z_t′(x))²))^{−1/2} exp(−(u²/2) [ F(x−t) + (F′(x−t))² / (4 F(x−t) E((Z_t′(x))²)) ]).    (22)

Define

G_t(x) := F(x−t) + (F′(x−t))² / (4 F(x−t) E((Z_t′(x))²)).

Check that

min_{x∈[0,T]} G_t(x) = G_t(t) = F(0),  G_t′(t) = F′(0) = 0.

On the other hand, a Taylor expansion around x = t gives

Z_t(x) = c_0 (X″(t) + X(t)) + c_0 ((x−t)/3)(X‴(t) + λ_4 X′(t)) + O((x−t)²),

with c_0 the normalizing constant, from which it follows that

E((Z_t′(t))²) = (λ_6 − λ_4²)/(9 (λ_4 − 1)),

which is strictly positive as a consequence of the non-degeneracy condition. We also have, for the regressed and normalized functions,

b_t^t(s) = (2/(t−s)²) ((F(t−s))^{1/2} − (F(0))^{1/2} E(Z_t(t) Z_t(s))) for s ≠ t,

and, since E(Z_t(t) Z_t(s)) ≤ 1,

b_t^t(s) ≥ (2/(t−s)²) ((F(t−s))^{1/2} − (F(0))^{1/2}) for s ≠ t,

so that

inf_{s,t∈[0,T]} b_t^t(s) > 0,

the strict positivity of the continuous extension at s = t being again a consequence of the non-degeneracy condition.
2
On the other hand, it is easy to see that tx(s) is a continuous function of the triplet
(x; t; s) and a uniform continuity argument shows that one can nd > 0 in such a
way that if jx ? tj
then
x
c > 0 for all s 2 0; T ]:
t (s)
Thus, for jx ? tj , using the Landau-Shepp-Fernique inequality (see Fernique,
1974):
x
? x
t (x) t (x)u
t (x)
x
x s u x s ;8s2 ;T g = ? p
p
(1 + R)
E
(
Z
(
x
)
?
(
x
)
u
)1
f
Z
t
t
t
t
E f(Zt0(x)) g
E f(Zt0(x)) g
( )
where R
( )
10
So,

S(u) = (u/√(2π)) ∫_0^T B(+∞, t) dt − (u/√(2π)) ∫_0^T dt ∫_u^{+∞} (∂/∂v) B(v,t) dv,    (23)

and the estimates above, together with Lemma 1.1 applied with k = 2, give, as u → +∞,

S(u) = (T/√(2π)) u − (1 + o(1)) c T exp(−(u²/2)(F(0) − 1)),    (24)

where c is an explicit positive constant depending only on the spectral moments of r. The second term in (19) can be treated in a similar way, only one should use the full statement of Lemma 3.3 in Azaïs and Wschebor (1999) instead of Proposition 1.1, thus obtaining:

T(u) = O(1/u) exp(−(u²/2)(F(0) − 1)).    (25)

Then, (24) together with (25) imply that, as u → +∞:

A_3(u) = (T/√(2π)) u − (1 + o(1)) c T exp(−(u²/2)(F(0) − 1)).    (26)

Replacing (18), (26) into (10) and integrating, one obtains (2).
Acknowledgment. The authors thank Professors J-M. Azaïs, P. Carmona and C. Del-
References
Azaïs, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM Probab. Statist., 3, 107-129.
Azaïs, J-M. and Wschebor, M. (1999). On the Regularity of the Distribution of the Maximum of One-parameter Gaussian Processes. Submitted.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Fernique, X. (1974). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de St. Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Prob. Appl., 26, 687-705.
Introduction.
Let A be an m × m real matrix and κ(A) := ‖A‖ ‖A^{−1}‖ its condition number.
The role of κ(A) in Numerical Linear Algebra has been recognized for a long time [11], [12], [13], as well as its importance in the evaluation of algorithm complexity [5], [7]. κ(A) measures, to the first order of approximation, the largest expansion in the relative error of the solution of the m × m linear system of equations

Ax = b    (1)
(where P is the probability dened on the probability space in which A is dened) or the
moments of the random variable (A). Of course, a priori this will depend on the meaning
of choosing A at random, that is, which is the probability distribution of A. A typical
result is the following:
Theorem 1 (Edelman, 1988) Let A = (a_{i,j})_{i,j=1,...,m} and assume that the a_{i,j}'s are i.i.d. standard Gaussian random variables. Then:
E{log κ(A)} = log m + C₀ + ε_m,  (2)
where C₀ is a known constant and ε_m → 0 as m → +∞.
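As a quick numerical illustration of (2) (not part of the original paper; the matrix size, the sample count and the seed are arbitrary choices), one can estimate E{log κ(A)} − log m by Monte Carlo with numpy:

```python
import numpy as np

# Monte Carlo sketch of Edelman's asymptotic result (2):
# E{log kappa(A)} ~ log m + C_0 for large m, for i.i.d. standard
# Gaussian entries.  Sample sizes are arbitrary; the estimate should
# hover around the known constant C_0 ~ 1.5.
rng = np.random.default_rng(0)
m, reps = 100, 300
logs = []
for _ in range(reps):
    A = rng.standard_normal((m, m))
    logs.append(np.log(np.linalg.cond(A)))  # kappa(A) = ||A|| ||A^{-1}||
estimate = np.mean(logs) - np.log(m)
print(round(estimate, 2))
```

The estimate stabilizes near C₀ as m grows; the fluctuations of log κ(A) are of order one, so a few hundred replications suffice for a rough check.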
verifies ||M|| ≤ m^{1/2}. Then, there exists x₀ such that, if x > x₀, then
P[κ(A) > x] ≤ 4.734 (m/x) (1 + 4 (log x)^{1/2}).  (3)
Remark 3 There are a few differences between this statement and the actual statement in [9]. The first one is that instead of 4.734 their constant is 3.646, apparently due to a mistake in the numerical evaluation. The second one is that their hypothesis is sup_{i,j} |m_{i,j}| ≤ 1 instead of ||M|| ≤ m^{1/2}, which is what they actually use in their proof and which is not implied by the previous one. Finally, the inequality from [10], which is applied in their proof, does not hold for every x > 0.
If one denotes by λ₁,...,λ_m, 0 ≤ λ₁ ≤ ... ≤ λ_m, the eigenvalues of the matrix AᵗA (Aᵗ stands for the transpose of A), then
κ(A) = (λ_m / λ₁)^{1/2} = M_A / m_A,
where
M_A = max_{||x||=1} f(x)^{1/2},  m_A = min_{||x||=1} f(x)^{1/2},  f(x) = xᵗ Aᵗ A x  (x ∈ R^m).
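The identity κ(A) = (λ_m/λ₁)^{1/2} can be checked numerically; the sketch below (assuming numpy; the test matrix is an arbitrary Gaussian sample) compares the eigenvalue form with the singular-value-based condition number:

```python
import numpy as np

# Check kappa(A) = (lambda_max/lambda_min)^{1/2} for the eigenvalues of
# A^t A, against numpy's spectral-norm condition number
# ||A|| * ||A^{-1}|| = s_max / s_min.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
eig = np.linalg.eigvalsh(A.T @ A)       # ascending: lambda_1 <= ... <= lambda_m
kappa_eig = np.sqrt(eig[-1] / eig[0])
kappa_svd = np.linalg.cond(A)           # 2-norm condition number
print(np.isclose(kappa_eig, kappa_svd))  # prints True
```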
It is possible to study the random variable κ(A) using techniques related to extrema of random fields. More precisely, if a > 0:
P[M_A > a] = P[M⁺(X, a) ≥ 2] ≤ (1/2) E{M⁺(X, a)},  (4)
where, if S^{m−1} is the unit sphere in the m-dimensional Euclidean space, X is the real-valued random field
X = f |_{S^{m−1}}
and
M⁺(X, a) = #{x : x ∈ S^{m−1}, X has a local maximum at the point x and X(x) > a}
(note that, since f is an even function, {M_A > a} occurs if and only if {M⁺(X, a) ≥ 2} does).
The main point in making inequality (4) a useful tool is that the expectation on the right-hand side can be computed - or at least estimated - using the Rice formula for the expectation of the number of critical points of the random field X (that is, of the zeros of the derivative of X).
In fact, we will only use an upper bound for E{M⁺(X, a)}, as will be explained below. The upper bound thus obtained for P[M_A > a] will be one of the tools to prove Theorem 11, which contains a variant of (3) that implies an improvement if x is large enough. However, Conjecture 1 in [9], which states that P[κ(A) > x] = O(m/x), remains an open problem.
The inequality in Proposition 6 is a variant of results that have been known for some time (see for example Lemma 2.8 in [10]). Our main point here is the connection between the spectrum of random matrices and the zeros of random fields, which makes the Rice formulae for the moments of the number of zeros useful. In our context, inequality (13) is interesting for large values of a, for which the classical inequalities are of the same order. Note also that in this case the constant 1/4 in the exponent can be replaced by any constant strictly smaller than 1/2, if a is large enough.
On the other hand, for the time being, this method does not provide the precise bounds on the distribution of the largest eigenvalue of a Wishart matrix for values of a close to a = 2 (cf. [4] or [8]).
These inequalities permit us to deduce inequalities for the moments of log κ(A), as in Corollary 13, which gives a bound for E{log κ(A)} for non-centered random matrices. This also leads to an alternative proof of a weak version of Edelman's Theorem, which instead of (2) states that
E{log κ(A)} ≤ log m + C  (5)
for some constant C.
Rice formulae for the moments of the number of zeros of a random field can be applied to some other related problems, which are in fact more complicated than the one we are addressing here. In [2] this is the case for condition numbers in linear programming. We briefly sketch one of the results in that paper.
Ax < 0,  (6)
where A is an n × m real matrix, n > m, and y < 0 denotes that all the coordinates of the vector y are negative. In [1] the following condition number was defined for the (feasibility) problem of determining whether the set of solutions of (6) is empty or not. Denote by a₁ᵗ,...,aₙᵗ the rows of A and put
f_k(x) = aₖᵗ x / ||aₖ||  (k = 1,...,n),
D(A) = min_{x ∈ S^{m−1}} max_{1≤k≤n} f_k(x),
and C(A) = 1/|D(A)|. Assume
m(1 + log n)/n ≤ 1.
If the a_{i,j}, i = 1,...,n, j = 1,...,m, are i.i.d. standard Gaussian random variables, then
E{log C(A)} ≤ max(log m, log log n) + K,  (7)
where K is a constant.
To prove (7) one can also use a method based upon the formulae on extrema of random fields, since the problem consists in giving fine bounds for probabilities of the form
P{ sup_{x ∈ S^{m−1}} (·) > u }.
Technical preliminaries.
In this section, B_{m−1}(0, δ) is the Euclidean ball centered at the origin with radius δ in R^{m−1}, |B_{m−1}(0, δ)| is its Lebesgue measure, σ_{m−1} is the (m−1)-dimensional geometric measure on S^{m−1}, and T ≺ 0 denotes that the bilinear form T is negative definite.
Proposition 5 Let F : S^{m−1} → R be of class C² and let a be a real number. Assume that, almost surely, F has no critical point x with det(F″(x)) = 0.  (8)
Then, almost surely,
M⁺(F, a) = lim_{δ→0} (1/|B_{m−1}(0, δ)|) ∫_{S^{m−1}} |det(F″(x))| 1_{{||F′(x)|| < δ, F″(x) ≺ 0, F(x) > a}} σ_{m−1}(dx).  (9)
Proof. The hypothesis implies that the points of M⁺(F, a) are isolated, hence that M⁺(F, a) is finite. Put
M⁺(F, a) = {x₁,...,x_N}.
Then, for j = 1,...,N:
F′(x_j) = 0;  F″(x_j) ≺ 0.
If δ₀ is small enough, using the inverse function theorem, there exist pairwise disjoint open neighbourhoods U₁,...,U_N in S^{m−1} of the points x₁,...,x_N respectively, such that for each j = 1,...,N the map x ↦ F′(x) is a diffeomorphism between U_j and B_{m−1}(0, δ₀), and ||F′(x)|| ≥ δ₀ whenever x ∉ ∪_{j=1}^N U_j. By the change-of-variables formula it follows that
N |B_{m−1}(0, δ₀)| = Σ_{j=1}^N ∫_{U_j} |det(F″(x))| σ_{m−1}(dx) = ∫_{S^{m−1}} |det(F″(x))| 1_{{∪_{j=1}^N U_j}} σ_{m−1}(dx).  (10)
Writing B^x = (b^x_{i,j}) for the matrix GᵗG expressed in an orthonormal basis whose first element is x, the Hessian of X at x is
X″(x) = 2 ( b^x_{i,j} − b^x_{1,1} δ_{i,j} )_{i,j=2,...,m}  (11)
= 2 (B^x_{2,2} − b^x_{1,1} I_{m−1}),  (12)
where B^x_{2,2} is the (m−1) × (m−1) submatrix obtained from B^x by deleting its first row and first column.
Proposition 6 Let G = (g_{i,j})_{i,j=1,...,m} be a matrix with i.i.d. standard Gaussian entries. Then, for every a with a²/4 ≥ 1 + log(a²) (in particular, for every a ≥ 4):
P{ ||G|| > a m^{1/2} } ≤ C₁(a) (1/m^{1/2}) exp(−a²m/4),  (13)
where C₁(a) = 36 (2e)^{1/2} / (7a³).
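The exponential tail in (13) is consistent with the well-known fact that ||G|| concentrates near 2√m, so excursions above, say, 3√m are already extremely rare. A small simulation (illustrative only; dimensions, replications and seed are arbitrary):

```python
import numpy as np

# The spectral norm of an m x m standard Gaussian matrix is close to
# 2*sqrt(m); the event {||G|| > a*sqrt(m)} for a = 3 essentially never
# occurs at these sizes, in line with the exponential bound (13).
rng = np.random.default_rng(2)
m, reps = 100, 200
norms = np.array([np.linalg.norm(rng.standard_normal((m, m)), 2)
                  for _ in range(reps)])
print(norms.max() / np.sqrt(m) < 3.0)
print(abs(norms.mean() / np.sqrt(m) - 2.0) < 0.2)
```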
Proof. Step 1.
We consider the quadratic form defined on R^m:
f_G(x) = xᵗ Gᵗ G x.
We have, for t > 0:
P{ ||G||² > t } ≤ (1/2) E{M⁺(f_G, t)}.  (14)
To be able to apply Proposition 5 to M⁺(f_G, t) we need to check condition (8). One way to do this is to use Proposition 4 in [2], applying it to the random vector field V = f′_G, since the random variable f′_G(x) has a bounded density in R^{m−1}. One can conclude that almost surely formula (9) holds true for F = f_G.
Step 2.
For each x ∈ S^{m−1} let us compute the joint distribution of f_G(x) and f′_G(x) in R × R^{m−1}.
Note first that, due to the invariance under linear isometries, this joint distribution is the same for all x ∈ S^{m−1}. We compute it for x = w = (1, 0, ..., 0)ᵗ. Notice that, in this case, A^w = A, B^w = B, ...
Conditionally on g_{1,1},...,g_{m,1}, the random variables
b_{i,1} = Σ_{h=1}^m g_{h,i} g_{h,1}  (i = 2,...,m)
are independent, each one being Gaussian centered with variance b_{1,1} = Σ_{h=1}^m g²_{h,1}.
So, since the distribution of b_{1,1} is χ² with m degrees of freedom, and on account of (10) and (11), the joint density of f_G(w) and f′_G(w) is equal to:
p_{f_G(w), f′_G(w)}(y, z) = χ²_m(y) (2π)^{−(m−1)/2} (2^{m−1} y^{(m−1)/2})^{−1} exp(−(1/2) ||z||²/(4y))
= [ 2^{(3m−2)/2} Γ(m/2) (2π)^{(m−1)/2} ]^{−1} y^{−1/2} exp(−(1/2) (y + ||z||²/(4y))).  (15)
Using (14), Proposition 5 and the conditional form of the Rice formula, we get:
2 P( ||G||² > t ) ≤ E{M⁺(f_G, t)} = E{ lim_{δ→0} (1/|B_{m−1}(0, δ)|) ∫_{S^{m−1}} |det(f″_G(x))| 1_{{||f′_G(x)||<δ, f″_G(x)≺0, f_G(x)>t}} σ_{m−1}(dx) }
= ∫_{S^{m−1}} σ_{m−1}(dx) ∫_t^{+∞} E{ |det(f″_G(x))| 1_{{f″_G(x)≺0}} | f_G(x) = y, f′_G(x) = 0 } p_{f_G(x), f′_G(x)}(y, 0) dy
= σ_{m−1}(S^{m−1}) ∫_t^{+∞} E{ |det(f″_G(w))| 1_{{f″_G(w)≺0}} | f_G(w) = y, f′_G(w) = 0 } p_{f_G(w), f′_G(w)}(y, 0) dy.
In the last equality we have used again the fact that the law of the random field {f_G(x) : x ∈ S^{m−1}} is invariant under linear isometries of R^m.
Substituting the density from (15) and taking into account that
σ_{m−1}(S^{m−1}) = 2 π^{m/2} / Γ(m/2),
we obtain:
P( ||G||² > t ) ≤ (2 (2π)^{1/2} / (4^m Γ(m/2)²)) ∫_t^{+∞} e^{−y/2} E{ |det(f″_G(w))| 1_{{f″_G(w)≺0}} | (b_{1,1},...,b_{m,1}) = (y, 0,...,0) } (dy/√y).  (16)
Step 3.
From the expression (12) for f″_G(w), since B_{2,2} is positive definite, we have that:
|det(f″_G(w))| 1_{{f″_G(w)≺0}} ≤ (2 b_{1,1})^{m−1} 1_{{f″_G(w)≺0}} ≤ (2 b_{1,1})^{m−1}.
Hence,
P( ||G||² > t ) ≤ (2 (2π)^{1/2} / (4^m Γ(m/2)²)) 2^{m−1} ∫_t^{+∞} y^{m−3/2} e^{−y/2} dy = ((2π)^{1/2} / (2^m Γ(m/2)²)) J_m(t),
where J_m(t) := ∫_t^{+∞} y^{m−3/2} e^{−y/2} dy.
Integrating by parts, J_m(t) = 2 t^{m−3/2} e^{−t/2} + (2m−3) J_{m−1}(t), so that, for t ≥ 16m,
J_m(t) ≤ 2 t^{m−3/2} (1 + 1/8 + ... + (1/8)^{m−1}) e^{−t/2} ≤ (16/7) t^{m−3/2} e^{−t/2}.
Putting t = a²m with a²/4 ≥ 1 + log(a²), we get:
P{ ||G|| > a m^{1/2} } = P{ ||G||² > a²m } ≤ ((2π)^{1/2} / (2^m Γ(m/2)²)) (16/7) (a²m)^{m−3/2} exp(−a²m/2).
Bounding Γ(m/2)² from below by means of Stirling's formula and simplifying, the right-hand side is at most
(36 (2e)^{1/2} / (7a³)) (1/m^{1/2}) (e a² e^{−a²/2})^m ≤ (36 (2e)^{1/2} / (7a³)) (1/m^{1/2}) exp(−a²m/4),
the last inequality holding because a²/4 − 1 − log(a²) ≥ 0. This proves (13).
We shall also use the classical bound: if v is a fixed unit vector, then
P{ ||A⁻¹ v|| > x } ≤ (2/π)^{1/2} (1/x).
Lemma 8 Let U be uniformly distributed on S^{m−1} and let 0 < c < m. Then
P{ U₁² > c/m } = P{ t²_{m−1} > (m−1)c/(m−c) },
where t_{m−1} has Student's distribution with m−1 degrees of freedom.
Proof. Let V = (V₁,...,V_m) be an m-dimensional random vector with standard Gaussian distribution. We can assume that
U = V / ||V||.
Let us denote, to simplify the notation, K = V₂² + ... + V_m². Then the statement
V₁² / (V₁² + K) > c/m
is equivalent to
V₁² / K > c/(m−c),
so that
P{ U₁² > c/m } = P{ (m−1) V₁² / K > (m−1)c/(m−c) } = P{ t²_{m−1} > (m−1)c/(m−c) },
where t_{m−1} is a real-valued r.v. having Student's distribution with m−1 degrees of freedom.
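Lemma 8 is easy to check by simulation; the sketch below (numpy assumed; m, c and the sample sizes are arbitrary) estimates the two sides from independent samples:

```python
import numpy as np

# Monte Carlo check of Lemma 8: for U uniform on S^{m-1},
#   P(U_1^2 > c/m) = P(t_{m-1}^2 > (m-1)c/(m-c)),
# with t_{m-1} Student with m-1 degrees of freedom.
rng = np.random.default_rng(3)
m, c, n = 10, 1.0, 200_000
V = rng.standard_normal((n, m))
U1sq = V[:, 0] ** 2 / (V ** 2).sum(axis=1)       # U_1^2 for U = V/||V||
lhs = (U1sq > c / m).mean()
W = rng.standard_normal((n, m))                   # independent samples
tsq = (m - 1) * W[:, 0] ** 2 / (W[:, 1:] ** 2).sum(axis=1)  # t_{m-1}^2
rhs = (tsq > (m - 1) * c / (m - c)).mean()
print(abs(lhs - rhs) < 0.01)
```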
Proposition 9 Assume that A = (a_{i,j})_{i,j=1,...,m}, a_{i,j} = m_{i,j} + g_{i,j} (i,j = 1,...,m), where the g_{i,j}'s are i.i.d. standard Gaussian r.v.'s and M = (m_{i,j})_{i,j=1,...,m} is non-random. Then, for each x > 0:
P[ ||A⁻¹|| ≥ x ] ≤ C₂(m) m^{1/2} / x,  (17)
where
C₂(m) = (2/π)^{1/2} [ sup_{c∈(0,m)} c^{1/2} P{ t²_{m−1} > (m−1)c/(m−c) } ]^{−1}
and
C₂ := lim_{m→∞} C₂(m) = 2.34737...
Proof. Let U be uniformly distributed on S^{m−1} and independent of A. Conditioning on U and using the preceding bound,
P{ ||A⁻¹ U|| > x } = E{ P{ ||A⁻¹ U|| > x | U } } ≤ (2/π)^{1/2} (1/x).  (18)
On the other hand, let w_A be a (random) unit vector such that ||A⁻¹ w_A|| = ||A⁻¹||. If
||A⁻¹|| ≥ x  and  |⟨w_A, U⟩| ≥ (c/m)^{1/2},
then ||A⁻¹ U|| ≥ ||A⁻¹|| |⟨w_A, U⟩| ≥ (c/m)^{1/2} x. Hence,
P{ ||A⁻¹ U|| ≥ (c/m)^{1/2} x } ≥ E{ 1_{{||A⁻¹|| ≥ x}} P{ |⟨w_A, U⟩| ≥ (c/m)^{1/2} | A } } = P{ t²_{m−1} > (m−1)c/(m−c) } P[ ||A⁻¹|| ≥ x ],
where we have applied Lemma 8 (conditionally on A, ⟨w_A, U⟩ is distributed as U₁). From here and (18) we have that
P[ ||A⁻¹|| ≥ x ] ≤ [ P{ t²_{m−1} > (m−1)c/(m−c) } ]^{−1} (2/π)^{1/2} (m/c)^{1/2} (1/x),
and taking the infimum over c ∈ (0,m) gives (17).
To end the proof notice that, if g is a standard Gaussian random variable, then
sup_{c∈(0,m)} c^{1/2} P{ t²_{m−1} > (m−1)c/(m−c) } ≥ sup_{c∈(0,1)} c^{1/2} P{ t²_{m−1} > (m−1)c/(m−c) } → sup_{c∈(0,1)} c^{1/2} P{ g² > c }  (19)
as m → ∞, since t_{m−1} converges in distribution to g. This yields the value of C₂.
c(0,1)
Remark 10 Explicit expressions for C2 (m) dont seem to be easy to obtain. Therefore, we have carried out some numerical computations with MatLab in order to have
approximations to this value.
In the following table we include the results.
Table 1. Optimal values for C2 (m) and values of c in which they are reached.
m
3
C2 (m) 1.879
c
1.146
4
2.038
0.923
5
2.086
0.823
10
2.244
0.672
25
2.309
0.604
50
100
Notice from the table that restriction in (19) to that c (0, 1) is not important as long
as m 4.
Main results.
Theorem 11 Assume that A = (a_{i,j})_{i,j=1,...,m}, a_{i,j} = m_{i,j} + g_{i,j} (i,j = 1,...,m), where the g_{i,j}'s are i.i.d. centered Gaussian with common variance σ², and M = (m_{i,j})_{i,j=1,...,m} is non-random. Then, for x large enough:
P[κ(A) > x] ≤ (1/x) [ C₁ m (4 log x)^{−3/2} + C₂(m) m^{1/2} ||M||/σ + C₂(m) (4m)^{1/2} (log x)^{1/2} ].  (20)
Proof. For a > 0:
P[κ(A) > x] ≤ P[ ||A|| > ||M|| + a σ m^{1/2} ] + P[ ||A⁻¹|| > x / (||M|| + a σ m^{1/2}) ].
Since ||A|| ≤ ||M|| + σ ||G||, Proposition 6 gives
P[ ||A|| > ||M|| + a σ m^{1/2} ] ≤ P[ ||G|| > a m^{1/2} ] ≤ C₁(a) (1/m^{1/2}) exp(−a²m/4),
and Proposition 9 gives
P[ ||A⁻¹|| > x / (||M|| + a σ m^{1/2}) ] ≤ C₂(m) m^{1/2} (||M|| + a σ m^{1/2}) / (σ x).
Putting
a = (4 log x / m)^{1/2},
so that exp(−a²m/4) = 1/x, one obtains (20).
Corollary 12 With the notations and hypotheses of Theorem 11 and m ≥ 3, for any x large enough,
P(κ(A) > x) ≤ H (m^{1/2}/x) [ 1 + ||M||/(σ m^{1/2}) + (log x)^{1/2} ],
where H is a constant.
Proof. Apply Theorem 11.
One can also use Propositions 6 and 9 to get bounds for the moments of log κ(A). For example, we can obtain the following corollary:
Corollary 13 With the notations and hypotheses of Theorem 11, if m ≥ 3, then
E{log κ(A)} ≤ log m + 1 + log C₂ + log( ||M||/(σ m^{1/2}) + 4 ) + (C₁/(2m)) exp(−4m).
Proof. Since κ(A) does not change if A is replaced by A/σ, we may assume σ = 1.
First, by (17), P[ ||A⁻¹|| > eˣ ] ≤ C₂ m^{1/2} e^{−x}, so that
E{log ||A⁻¹||} ≤ ∫₀^{+∞} P[ ||A⁻¹|| > eˣ ] dx ≤ log(C₂ m^{1/2}) + ∫_{log(C₂ m^{1/2})}^{+∞} C₂ m^{1/2} e^{−x} dx = log(C₂ m^{1/2}) + 1.  (21)
Let β = ||M|| + 4 m^{1/2}. In the same way,
E{log ||A||} ≤ log β + ∫_{log β}^{+∞} P[ ||A|| > eˣ ] dx ≤ log β + ∫_{log β}^{+∞} P[ ||G|| > eˣ − ||M|| ] dx,
and, bounding the last integrand by means of Proposition 6 and changing variables, we obtain that
E{log ||A||} ≤ log β + (C₁/(2m)) exp(−4m).
The corollary follows from here and (21), since log β + log(C₂ m^{1/2}) + 1 = log m + 1 + log C₂ + log( ||M||/m^{1/2} + 4 ).
Putting M = 0, σ = 1, the last corollary provides a weak version of Edelman's Theorem, of the form (5).
Acknowledgment. The authors want to thank an anonymous referee, whose comments and suggestions have improved the paper.
References
[1] Cheung, D.; Cucker, F. (2001). A new Condition Number for Linear Programming. Math. Programming, 91, 163-174.
[2] Cucker, F.; Wschebor, M. (2002). On the Expected Condition Number of Linear Programming Problems. Numer. Mathem. To appear.
[3] Cuesta-Albertos, J.; Wschebor, M. (2002). Some Remarks on the Condition Number of a Real Random Square Matrix. Submitted.
[4] Davidson, K.R.; Szarek, S.J. (2001). Local Operator Theory, Random Matrices and Banach Spaces. In Handbook of the Geometry of Banach Spaces, Vol. 1, Ch. 8, Eds. W.B. Johnson and J. Lindenstrauss, Elsevier, pp. 317-366.
[11] Turing, A. (1948). Rounding-off errors in matrix processes. Quart. J. Mech. Appl. Math. 1, 287-308.
[12] von Neumann, J.; Goldstine, H. (1947). Numerical inverting of matrices of high order. Bull. Amer. Math. Soc. 53, 1021-1099.
[13] Wilkinson, J.H. (1963). Rounding Errors in Algebraic Processes. Prentice-Hall.
Z. Wahrscheinlichkeitstheorie verw. Gebiete 60, 393-401 (1982)
© Springer-Verlag 1982
Caracas
(1)
Lemma 1. (i) If {g_n} is a uniformly bounded sequence of continuously differentiable functions which converges almost surely to χ_B (the indicator function of B), then:
Q(B) ≤ lim inf_{n→∞} ∫ ||grad g_n(t)|| dt.  (2)
394
M. Wschebor
~ I[grad()~,)~(t)ll dr,
0_~
(3)
off
0 a={t: Ht-t'H>bVt'r
II. We now consider a real stochastic process {X(t): t ∈ R^d} parametrized by R^d. We assume that its sample paths are continuously differentiable, and denote by
p_{t₁,...,t_k; t₁,...,t_h}(x₁,...,x_k; ẋ₁,...,ẋ_h) dx₁ ... dx_k dẋ₁ ... dẋ_h  (4)
the joint densities of the process and of its derivatives, for |t_i − t_j| ≠ 0, i ≠ j. Put
C_u = {t: X(t) = u},  A_u = {t: X(t) < u},  B_u = {t: X(t) > u}.  (5)
Ra
where T is a bounded open set of R^d.
Proof. We prove the equality for the expectation of the random variable Q_T(A_u); for Q_T(B_u) the argument is the same.
For m = 1, 2, ... let f_m ∈ C^∞(R) be non-increasing, with f_m(x) = 0 for x ≥ u and f_m(x) = 1 for x ≤ u − 1/m. It is clear that
f_m(X(t)) → χ_{(−∞,u)}(X(t))  for all t ∈ R^d.
Formule de Rice
395
oo
[rgrad X(t)[I } dt
= l i m i n f ~ If/.(x)ldx ~ dt ~ []2IPpt;t(x;2)d2
m~
oo
R 1
Ra
=~ dt ~ []2clIpt;t(u;2:)d2.
T
Ra
E(QT(Au))~E { ~ Hgrad(zAu)~(t)[[dt}
w~
= lira ~ E { ]j ~ tp~( t - s)f" (X (s)) grad X (s) ds H} &.
rn~ao
T-6
(6)
Ra
E{ y 6~(t-s)Is
~
][gradX(t)l] ds}
(7)
Ra
E(QT(Au))>= ~ dt ~ I]2Hpt;t(u;2)d2.
T - ,~
Rd
(5')
Ra
E {Qr~t: graax(t)~w~(A,)},
S dt~ 11211pt;,(u;2)d2
1"
(8)
(9)
~A,~C.,
(10)
almost surely,  (11)
that is to say, for the process to have, almost surely, no local extremum at the level u. In the case d = 1 the result is contained in a theorem of Bulinskaya [3]. Nevertheless, we have included the proof because it gives an idea of the methods employed for d > 1.
(11) was proved by Ylvisaker [7] for centred Gaussian processes with constant variance and continuous paths. For general processes, Belyaev [1] proved that if the paths are, almost surely, twice continuously differentiable and p_{t;t}(x; ẋ) is a bounded function, then
T_u = {t: X(t) = u, grad X(t) = 0} = ∅  almost surely,  (12)
(Xi(t) indique la derivde par rapport dr la j-~me coordonde, calcul~e au point t), a
densit~
p,(x, y~,, ..., y J
localement bornde pour chaque choix de il, ..., ih, 1 <=i~< i 2 <
<ibid.
Formule de Rice
397
T_u = ∅ almost surely.
(ii) If d > 1, put a_d = (d−1)/(d+1) and, for K compact in R^d,
w(δ, K) = sup_{s,t∈K, ||t−s||≤δ} ||grad X(t) − grad X(s)||.
P (~>y)<e
(13)
Lemma 2. Put, for ε > 0, D_ε = {t: ||grad X(t)|| ≤ ε}. Then, under the hypotheses of Theorem 2, if T is a bounded open set in R^d, one has:
E(Q_{T∩D_ε}(A_u)) ≤ L ε^{d+1},  (14)
and the same bound holds for B_u.  (14′)
Indeed,
E(Q_{T∩D_ε}(A_u)) ≤ lim inf_{m→∞} ∫ dt ∫_{R¹} |f′_m(x)| dx ∫_{{ẋ: ||ẋ|| ≤ ε}} ||ẋ|| p_{t;t}(x, ẋ) dẋ ≤ L ε^{d+1}.
S_A(r) = σ_{d−1}(A ∩ ∂B(0; r)),  K_d = d c_d^{1/d},  c_d = μ_d(B(0; 1)),  (15)
V_A(r) = ∫₀^r S_A(y) dy.  (16)
(16)
Hence, if the condition
(17)
is verified for almost every 0 < y < R, one concludes from (16) that
V_A(r₂) − V_A(r₁) ≥ a ∫_{r₁}^{r₂} (V_A(y))^{(d−1)/d} dy,  (18)
and therefore
V_A(r) ≥ (a/d)^d r^d  for 0 < r ≤ R.  (19)
(21)
(in fact, the equality holds almost everywhere for ρ > 0, [5], p. 35), and one can find a radius ρ₁ ∈ C_A, r_A < ρ₁ < 2r_A, such that (21) is satisfied.
Using the isoperimetric inequality in R^d one has:
≥ (K_d − α)(V_A(r_A))^{(d−1)/d} ≥ (K_d − α)(a/d)^{d−1} ...  (22)
(22)
(23)
(24)
V v
(25)
(26)
sup
dist{t,C uc~T}
t~Aunc~T
QBo; r
off
where ε_n = w(δ_n, T̄), δ_n = min{γ_n, 2s_n}, the last inequality being a consequence of the fact that grad X vanishes on C_u.
Summing up, we have:  (27)
0,
$3
b) w(6,
6~T ) < y
V6>0
Posons finalement
)@3"
sur S 3.
We have:
Σ_{k=0}^{∞} ... < ∞,
using Lemma 2. Setting η = α(d+1) − (d−1) > 0, one has:  (29)
and the sum over k = 0, 1, ... tends to 0.
References
1. Belyaiev, Y.: Point Processes and First Passage Problems. Proc. Sixth Berkeley Sympos. Math. Statist. Probab. 3, 1-17 (1972)
2. Benzaquen, S., Cabaña, E.M.: The Expected Measure of the Level Set of a Regular Stationary Gaussian Process. [To appear in Pacific J. Math.]
3. Bulinskaya, E.V.: On the mean number of crossings of a level by a stationary Gaussian process. Theor. Probab. Appl. 6, 435-438 (1961)
4. Cramér, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes. New York: J. Wiley 1967
5. Miranda, M.: Frontiere minime. Mon. Mat. No. 27. IMPA (1976)
6. Wschebor, M.: On Crossings of Gaussian Fields. [To appear in Stochastic Processes Appl.]
7. Ylvisaker, D.: The Expected Number of Zeros of a Stationary Gaussian Process. Ann. Math. Statist. 1043-1046 (1965)
Received February 2, 1981; in revised form February 2, 1982
Abstract
This paper deals with the asymptotic behavior, as the level tends to +∞, of the tail of the distribution of the maximum of a stationary Gaussian process on a fixed interval of the line. For processes satisfying certain regularity conditions, we give a second-order term for this asymptotics.
Introduction
X = fX (t); t 2 0; T ]g, T > 0 is a real-valued centered stationary Gaussian process with
B exp ? 1 u+
(1)
for some constants B > 0 and σ < 1. Φ (respectively φ) denotes the standard normal distribution function (respectively density).
The aim of this paper is to improve the description of the asymptotic behavior of P(M_T > u) as u → +∞ that follows from (1), replacing the bound for the error by an equivalent as u → +∞. More precisely, under the regularity conditions required in Theorem 1.1, we will prove that:
r
2
2
"
3 1?
!#
? u
4
2
2
1 + o(1)]
(2)
This contradicts Theorem 3.1 in Piterbarg's paper, in which a different equivalent is given in case T is small enough (see also Azaïs et al. (1999)).
We will assume further that X has C^∞ sample paths and that for every n ≥ 1 and pairwise different values t₁,...,tₙ in [0,T], the distribution of the set of 5n random variables (X^{(j)}(t₁),...,X^{(j)}(tₙ); j = 0,1,2,3,4) is non-degenerate. A sufficient condition for this to hold is that the spectral measure of the process is not purely atomic or, if it is purely atomic, that its set of atoms has an accumulation point in the real line. (A proof of these facts can be given in the same way as in Chap. 10 of Cramér and Leadbetter, 1967.)
If ξ is a random vector with values in R^n whose distribution has a density with respect to Lebesgue measure, we denote by p_ξ(x) the density of ξ at the point x ∈ R^n. 1_C denotes the indicator function of the set C.
If Y = {Y(t) : t ∈ R} is a process in L², we write Γ^Y(s,t) for its covariance function and Γ^Y_{ij}(s,t) = ∂^{i+j} Γ^Y(s,t) / ∂s^i ∂t^j for its partial derivatives, whenever they exist.
The proof of (2) will consist in computing the density of the distribution of the random variable M_T and studying its asymptotic behavior as u → +∞. Our main tool is the following proposition, which is a special case of the differentiation Lemma 3.3 in Azaïs and Wschebor (1999):
Proposition 1.1 Let Y be a Gaussian process with C^∞ paths and such that for every n ≥ 1 and pairwise different values t₁,...,tₙ in [0,T] the distribution of the set of 3n random variables (Y^{(j)}(t₁),...,Y^{(j)}(tₙ); j = 0,1,2) is non-degenerate. Assume also that E{Y(t)} = 0 and Γ^Y(t,t) = E{Y²(t)} = 1.
Then, if ψ is a C^∞ function on [0,T],
2
( )
( )
(3)
(4)
(0)
( )
Z T
( )
u t (s);8s2 0;T ]g
p Y t ;Y
( ( )
(t))
Here the functions ψ^ℓ, ψ^a, ψ^t and the (random) functions Y^ℓ, Y^a, Y^t are the continuous extensions to [0,T] of:
ψ^ℓ(s) = [ψ(s) − Γ^Y(s,0)ψ(0)] / (1 + Γ^Y(s,0)),  Y^ℓ(s) = [Y(s) − Γ^Y(s,0)Y(0)] / (1 + Γ^Y(s,0))  for 0 < s ≤ T,  (6)
ψ^t(s) = (2/(s−t)²) [ ψ(s) − Γ^Y(s,t)ψ(t) − (Γ^Y₁₀(t,s)/Γ^Y₁₁(t,t)) ψ′(t) ],  0 ≤ s ≤ T, s ≠ t,  (8)
Y^t(s) = (2/(s−t)²) [ Y(s) − Γ^Y(s,t)Y(t) − (Γ^Y₁₀(t,s)/Γ^Y₁₁(t,t)) Y′(t) ],  0 ≤ s ≤ T, s ≠ t.  (9)
We will repeatedly use the following lemma. Its proof is elementary and we omit it.
Lemma 1.1 Let f and g be real-valued functions of class C^∞ defined on the interval [0,T] of the real line, verifying the conditions:
1) f has a unique minimum on [0,T] at the point t = t₀, and f′(t₀) = 0, f″(t₀) > 0.
2) Let k = inf{ j : g^{(j)}(t₀) ≠ 0 } and suppose k = 0, 1 or 2.
Define
h(u) = ∫₀ᵀ g(t) exp(−(1/2) u² f(t)) dt.
Then, as u → ∞:
h(u) ≈ (g^{(k)}(t₀)/k!) u^{−(k+1)} exp(−(1/2) u² f(t₀)) ∫_J x^k exp(−(1/4) f″(t₀) x²) dx,
where J = [0, +∞), J = (−∞, 0] or J = (−∞, +∞) according as t₀ = 0, t₀ = T or 0 < t₀ < T respectively.
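Lemma 1.1 can be checked numerically in the simplest case k = 0 with an interior minimum; the example below (illustrative only, with f(t) = (t − 1/2)² and g ≡ 1 chosen for convenience) compares h(u) with the predicted equivalent √(2π)/u, since f″(t₀) = 2 and ∫ exp(−x²/2) dx = √(2π) over J = (−∞, +∞):

```python
import numpy as np

# Laplace-method check of Lemma 1.1: h(u) = int_0^1 exp(-u^2 (t-1/2)^2 / 2) dt
# should behave like sqrt(2*pi)/u for large u (k = 0, interior minimum).
u = 50.0
t = np.linspace(0.0, 1.0, 200_001)
dt = t[1] - t[0]
h = np.exp(-0.5 * u ** 2 * (t - 0.5) ** 2).sum() * dt   # Riemann sum
predicted = np.sqrt(2 * np.pi) / u
print(abs(h / predicted - 1.0) < 0.01)
```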
C^∞ paths, covariance r(·), r(0) = 1, and such that for every n ≥ 1 and pairwise different t₁,...,tₙ in [0,T], the distribution of the set of 5n random variables (X^{(j)}(t₁),...,X^{(j)}(tₙ); j = 0,1,2,3,4) is non-degenerate. We shall also assume the additional hypothesis that r′ < 0 on a set dense in [0,T].
Step 1. Proposition 1.1, applied to the process Y = X and the function ψ(t) = 1 for all t ∈ [0,T], enables us to write the density p_{M_T} of the distribution of the maximum M_T as:
p_{M_T}(u) = [A₁(u) + A₂(u) + A₃(u)] φ(u), with  (10)
A₃(u) = −(1/√(2π)) ∫₀ᵀ E{ (X^t(t) − ψ^t(t) u) 1_{{X^t(s) ≤ u ψ^t(s), ∀s∈[0,T]}} } dt.
Since X is a stationary process and ψ(t) ≡ 1, the processes X and X̃, defined by X̃(t) = X(T − t), have the same law, so that P(X(s) ≤ u for all s ∈ [0,T] | X(0) = u) = P(X(s) ≤ u for all s ∈ [0,T] | X(T) = u). Hence, A₁(u) = A₂(u).
Step 2.
θ(s) := (E{(X^ℓ(s))²})^{1/2} = ((1 − r(s)) / (1 + r(s)))^{1/2},  s ∈ (0,T].  (11)
0 (T )
r
p
(0) =
T (1 + r(T ))
0 and
(T ) =
(1 + r(T )) 1 ? r (T )
2
0:
P Y a(s) u a(s); 8s 2 0; T ]
check that
r0(pT )
P Y a(T ) u a(T ) ;
0
E (Y a(T )) ) = E (Y 0(T )) = (1 ?(1r ?(Tr))(?T ))(r (T )) ) ;
so that, since the non-degeneracy hypothesis implies that for each T > 0, E{(Y^a(T))²} is non-zero, it follows that the numerator in the right-hand member is strictly positive for T > 0.
Hence,
?
P Y a(s) u a(s); 8s 2 0; T ] ( (T )u) C (T ) exp ? u2 F (T ) ;
with C (T ) > 0, where F (t) is the function
2
(12)
which is well defined, since the denominator does not vanish because of the previous remark.
The following properties of the function F are elementary and will be useful in our calculations.
(a) F has a continuous extension at t = 0.
(b) F(t) > F(0) for t ≠ 0, because:
r00(t))(1 ? r(t)) and
* F 0(t) = 2 (1 ? r(t))( r ((1t)((?rr(t())t))??((r0(?
t)) )
* r0(t) < 0 for t 2 A 0; T ] with A dense in 0; T ], and
2
2
2
2
2 2
* For t 6= 0,
(r0(t)) ? ( ? r00(t))(1 ? r(t)) =
?
(E (X 0(t) ? X 0(0))(X (t) ? X (0))) ? E (X 0(t) ? X (0))
2
(c) F 0(0) = 0:
(d) F 00(0) = 9(( ? ? ) ) .
2
2
4
2 2
2
L (u; ) = ?
Y
Z T
with
p Y t ;Y
( ( )
( )
2
2
(T ) +
pY
u t (s);8s2 0;T ]g
(t))
(13)
(t);Y 0 (t))
(u (t); u 0(t))dt;
0
(t) + E f((Y(0t())t)) g
(u (t); u 0(t)) = 2 (E f(Y10(t)) g) = exp ? u2
= 2 (E f(Y10(t)) g) = exp ? u2 F (t) :
2
2
2
1 2
1 2
2
2
2
2
? >0
?
2
4
6
4
2
2
Z T
(t)E Y t(t)1fY t s
( )
p Y t ;Y
u t (s);8s2 0;T ]g
( ( )
(t))
(u (t); u 0(t))dt
Z T
t (t)) g) =
u
(
E
f
(
Y
(t) 2 (E f(Y 0(t)) g) = exp ? 2 F (t) dt C (t) exp ? u2 F (t) dt;
where C is a positive constant. Also, from Lemma 1.1 and the properties of the functions
F and , one gets:
Z T
(t) exp ? u2 F (t) dt C u1 exp ? u2 ?
:
C a positive constant. Hence,
Z T
1 2
1 2
2
2
2
2
Z T
(t)E Y t(t)1fY t s
( )
u t (s);8s2 0;T ]g
p Y t ;Y
( ( )
(t))
(u (t); u 0(t))dt =
= O u1 exp ? u2
2
2
2
2
(14)
(t)E
(t)1fY t s
( )
Z T
2
0
p Y t ;Y
u t (s);8s2 0;T ]g
( ( )
(t))
(u (t); u 0(t))dt
where A a positive constant. Since the function g(t) = (t) t(t) veri es g(0) = g0(0) = 0,
and g00(0) 6= 0, Lemma 1.1 implies:
2
Z T
0
(t)E
(t)1fY t s
( )
pY
u t (s);8s2 0;T ]g
= O u1 exp ? u2
(t);Y 0 (t))
(u (t); u 0(t))dt =
2
2
2
2
(15)
LY (u; ) = O u1 exp ? u2
2
2
2
2
(16)
d A (u) = O 1 exp ? u
:
(17)
du
u
2 ?
Further, observe that since Y is continuous and (s) > 0 for s 2 ]0; T ], (0) = 0,
if Y (0) > 0 the event fY (s) u (s); 8s 2 0; T ]g does not occur for positive u, and if
Y (0) < 0, the same event occurs if u is large enough. This implies that
2
2
2
2
2
and so,
A (u) ? 12 = ?
1
Z
u
d A (v)dv = O 1 exp ? u
dv
u
2
2
2
2
2
(18)
on applying (17).
Step 3.
We will now give an equivalent for A₃(u). Introduce the following notations: for t ∈ [0,T],
1 2
(s)
(1 ? r(t ? s))
p
=
t(s)=
t
=
(E f(X (s)) g)
(1 ? r (t ? s)) ? (r0(t ? s))
t
1 2
= F (t ? s) for s 2 0; T ]; s 6= t
and
t
(t) = p
2
4
2
2
Hence,
0
F 0(0) = 0:
(t) = p
2 F (0)
so that
( )
( )
A₃(u) = u (√λ₂/(2π)) ∫₀ᵀ B(u,t) dt − (1/√(2π)) ∫₀ᵀ B̃(u,t) dt = S(u) − T(u).  (19)
We will consider in detail the behavior of the first term as u → +∞.
We apply again Proposition 1.1 to compute the derivative of B(u,t) with respect to u. For t ∈ ]0,T[:
so that as u ! +1 :
Z T
0
(20)
L^{Z_t}(u,t) = β_t(T) P(Z_t^a(s) ≤ u β_t^a(s), ∀ s ∈ [0,T]) φ(β_t(T) u). In the same way:
L^{Z_t}(u,t) ≤ (1/√(2π)) √(F(T−t)) exp(−(u²/2) F(T−t)),
and:
∫₀ᵀ L^{Z_t}(u,t) dt ≤ C (1/u) exp(−(u²/2) F(0)),  (21)
for some constant C.
Z T
LZt (u; t) =?
3
( )
u tx (s);8s2 0;T ]g
( )
( ))
=
De ne
( )
( ))
u F (x ? t) +
(F 0(x ? t))
1
exp
?
2
4F (x ? t)E f(Zt0(x)) g
E f(Zt0(x)) g
2
(22)
0
Gt (x) = F (x ? t) + 4F (x(?F t()xE?f(tZ))0(x)) g :
2
Check that
min Gt(x) = Gt(t) = F (0);
G0t(t) = F 0(0) = 0:
x2 0;T ]
and
Zt(x) =
1
(X 00(t) + X (t)) +
? + O((x ? t) )
t) ( X 000(t) + X 0(t)) + O((x ? t) ):
+ (x ?
3
p
2
2
2
It follows that:
00(0)
E (Zt0(t)) = 9 ( ?? ) = FF (0)
;
2
and
2
4
2
2
We also have:
2
4
2 2
2
? (x; s) 0(x) :
Zt
t(s) ? ? (x; s): t(x) ? Zt
(s ? x)
? (x; x) t
Zt
2
(t; s) 0(t) =
?
t(s) =
Z
t
t(s) ? ? (t; s): t(t) ? Zt
t
(t ? s)
? (t; t) t
p
p
= (t ?2 s) F (t ? s) ? F (0):E fZt(t)Zt(s)g for s 6= t;
F 00(0) + pF (0)E n(Z 0(t)) o = 3 pF 00(0) = ( ? ) > 0;
t
(t t) = 21 p
t
2 F (0) 6( ? ) =
F (0)
where the last inequality is a consequence of the non-degeneracy condition.
Note that since E fZt(t)Zt(s)g 1,
2 pF (t ? s) ? pF (0) for s 6= t;
t
t (s)
(t ? s)
so that
inf tt(s) > 0:
s;t2 ;T
x
t
(s) =
Zt
10
11
10
11
2
4
2 3 2
2
On the other hand, it is easy to see that β_t^x(s) is a continuous function of the triplet (x,t,s), and a uniform continuity argument shows that one can find δ > 0 in such a way that, if |x − t| ≤ δ, then
β_t^x(s) ≥ c > 0  for all s ∈ [0,T].
Thus, for |x − t| ≤ δ, using the Landau-Shepp-Fernique inequality (see Fernique, 1974):
x
? x
t (x) t (x)u
t (x)
x
x s u x s ;8s2 ;T g = ? p
p
(1 + R)
E
(
Z
(
x
)
?
(
x
)
u
)1
f
Z
t
t
t
t
E f(Zt0(x)) g
E f(Zt0(x)) g
( )
where R
( )
10
So,
Z T
t(
p
(23)
S (u) = u 2 T ? u 2
2
Z T
0
dt
+1
d B (v; t) dv:
dv
3
S (u) = u 2 T ? (1 + o(1)) 2T
3 ( ? ) exp ? u
2
2
2
2
2
2
2
(24)
The second term in (19) can be treated in a similar way; one should only use the full statement of Lemma 3.3 in Azaïs and Wschebor (1999) instead of Proposition 1.1, thus obtaining:
T (u) = O( u1 ): exp ? u2 ? :
(25)
Then, (24) together with (25) imply that, as u → +∞:
2
2
2
A (u) = u 2 T ? (1 + o(1)) 2T
3
2
2
3 ( ? ) exp ? u
2
2
2
2
2
2
4
2
2
(26)
Replacing (18) and (26) into (10) and integrating, one obtains (2).
Acknowledgment. The authors thank Professors J-M. Azaïs, P. Carmona and C. Del-
References
Azaïs, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM Probab. Statist., 3, 107-129.
Azaïs, J-M. and Wschebor, M. (1999). On the Regularity of the Distribution of the Maximum of One-parameter Gaussian Processes. Submitted.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Fernique, X. (1974). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de St. Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Prob. Appl., 26, 687-705.
URL: http://www.emath.fr/ps/
Résumé. In this paper we use Rice's method (Rice, 1944-1945) to bound the distribution function of the maximum of a regular stationary Gaussian process. We derive simplified expressions for the first two terms of the Rice series (Miroshin, 1974; Azaïs and Wschebor, 1997), which suffice for the desired bounds. Our main contribution is a simpler form of the second factorial moment of the number of upcrossings, which is in some sense a generalization of the formula of Steinberg et al. (Cramér and Leadbetter, 1967, p. 212). We then present a numerical application and asymptotic expansions that give a new interpretation of a result of Piterbarg (1981).
AMS Subject Classification. 60Exx, 60Gxx, 60G10, 60G15, 60G70, 62E17, 65U05.
Received June 4, 1998. Revised June 8, 1999.
1. Introduction
1.1. Framework
Many statistical models involve nuisance parameters. This is the case, for example, for mixture models [10], gene detection models [5,6], and projection pursuit [20]. In such models, the distributions of test statistics are those of the maximum of Gaussian stochastic processes (or of their squares). Dacunha-Castelle and Gassiat [8] give, for example, a theory for the so-called locally conic models.
Thus, the calculation of thresholds or powers of such tests leads to the calculation of the distribution of the maximum of Gaussian processes. This problem is largely unsolved [2].
Keywords and phrases: Asymptotic expansions, extreme values, stationary Gaussian process, Rice series, upcrossings.
This paper is dedicated to Mario Wschebor on the occasion of his 60th birthday.
Miroshin [13] expressed the distribution function of this maximum as the sum of a series, the so-called Rice series. Recently, Azaïs and Wschebor [3,4] proved the convergence of this series under certain conditions and proposed a method giving the exact distribution of the maximum for a class of processes including smooth stationary Gaussian processes with real parameter.
The formula given by the Rice series is rather complicated, involving multiple integrals with complex expressions. Fortunately, for some processes the convergence is very fast, so the present paper studies the bounds given by the first two terms, which are in some cases sufficient for applications.
We give identities that yield simpler expressions of these terms in the case of stationary processes. Generalization to other processes is possible using our techniques, but will not be detailed, for shortness and simplicity. For other processes, the calculation of more than two terms of the Rice series is necessary. In such a case, the identities contained in this paper (and other similar ones) give a list of numerical tricks used by a program under construction by Croquette.
We then use Maple to derive asymptotic expansions of some terms involved in these bounds. Our bounds are shown to be sharp, and our expansions are made for a fixed time interval and a level tending to infinity. Other approaches can be found in the literature [12]. For example, Kratz and Rootzén [11] propose asymptotic expansions for a time interval and a level that tend jointly to infinity.
We consider a real-valued centred stationary Gaussian process with continuous paths X = {X_t; t ∈ [0,T] ⊂ R}. We are interested in the random variables
X* = sup_{t∈[0,T]} X_t   and   X** = sup_{t∈[0,T]} |X_t|.
For shortness and simplicity, we will focus attention on the variable X*; the necessary modifications for adapting our method to X** are easy to establish [5].
We denote by dF(λ) the spectral measure of the process X and by λ_p the spectral moment of order p, when it exists. The spectral measure is supposed to have a finite second moment and a continuous component. This implies ([7], p. 203) that the process is differentiable in quadratic mean and that for all pairwise different time points t₁,...,tₙ in [0,T], the joint distribution of X_{t₁},...,X_{tₙ}, X′_{t₁},...,X′_{tₙ} is non-degenerate.
For simplicity, we will moreover assume that the process admits C¹ sample paths. We will denote by r(·) the covariance function of X and, without loss of generality, we will suppose that λ₀ = r(0) = 1.
Let u be a real number; the number of upcrossings of the level u by X, denoted by U_u, is defined as follows:
U_u = #{t ∈ [0,T] : X_t = u, X′_t > 0}.
For k N , we denote by k (u, T ) the factorial moment of order k of Uu and by k (u, T ) the factorial moment of
order k of Uu 11{X0 u} . We also define k (u, T ) = k (u, T ) k (u, T ). These factorial moments can be calculated
by Rice formulae. For example:
T 2 u2 /2
1 (u, T ) = E (Uu ) =
e
2
T
Ast (u) ds dt
0
with Ast (u) = E (Xs )+ (Xt )+ |Xs = Xt = u ps,t (u, u), where (X )+ is the positive part of X and ps,t the
joint density of (Xs , Xt ).
These two formulae are proved to hold under our hypotheses ( [7], p. 204). See also Wschebor [21],
Chapter 3, for the case of more general processes.
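The first Rice formula can be checked by simulation. The sketch below (our own illustration, not from the text) uses the exactly Gaussian stationary process X_t = ξ cos t + η sin t, whose covariance is r(t) = cos t, so that λ₂ = 1 and E(U_u) = (T√λ₂/(2π)) e^{−u²/2} on [0, 2π]:

```python
import numpy as np

rng = np.random.default_rng(0)
T, u = 2 * np.pi, 1.0
t = np.linspace(0.0, T, 2001)
ct, st = np.cos(t), np.sin(t)

def upcrossings(x, u):
    # grid version of U_u: points where x passes from below u to above u
    return int(np.sum((x[:-1] < u) & (x[1:] >= u)))

counts = []
for _ in range(5000):
    xi, eta = rng.standard_normal(2)
    counts.append(upcrossings(xi * ct + eta * st, u))  # covariance cos(s - t)

mc = float(np.mean(counts))                                 # Monte Carlo estimate
rice = T * np.sqrt(1.0) / (2 * np.pi) * np.exp(-u**2 / 2)   # Rice formula, lambda2 = 1
```

For this particular process the Rice value is exact: U_u = 1_{ξ² + η² > u²}, so E(U_u) = e^{−u²/2}.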
We will denote by φ the density of the standard Gaussian distribution. In order to have simpler expressions of rather complicated formulae, we will use the following three functions:

Φ(x) = ∫_{−∞}^x φ(y) dy,   Φ̄(x) = 1 − Φ(x),   Φ₀(x) = ∫₀^x φ(y) dy = Φ(x) − 1/2.
For an integer-valued random variable η ≥ 0, one has

E(η) − (1/2) E(η(η − 1)) ≤ P(η > 0) ≤ E(η).

Noting that, P-almost surely, {X* > u} = {X₀ > u} ∪ {X₀ ≤ u, U_u > 0}, and that E( U_u(U_u − 1) 1_{X₀ ≤ u} ) ≤ ν₂(u, T), we get the Main inequality:

P(X₀ > u) + ν̄₁(u, T) − ν₂(u, T)/2 ≤ P(X* > u) ≤ P(X₀ > u) + ν̄₁(u, T).   (1.1)

By the Rice formula, ν̄₁(u, T) can be written as

ν̄₁(u, T) = ∫₀^T dt ∫_{−∞}^u E( (X'_t)⁺ | X₀ = x, X_t = u ) p_{X₀,X_t}(x, u) dx.   (1.2)

More generally, one has the expansion

P(X* > u) = P(X₀ > u) + Σ_{m=1}^{+∞} (−1)^{m+1} ν̄_m(u, T)/m!,   (1.3)

whose truncations provide upper and lower bounds; in particular

P(X₀ > u) + ν̄₁(u, T) − ν̄₂(u, T)/2 ≤ P(X* > u) ≤ P(X₀ > u) + ν̄₁(u, T).

Since ν₂(u, T) ≥ ν̄₂(u, T), we see that, except for this last modification, which gives a simpler expression, the Main inequality (1.1) is relation (1.3) with n = 1.
Remark 1.1. In order to calculate these bounds, we are interested in the quantity ν̄₁(u, T). For asymptotic calculations, and to compare our results with Piterbarg's ones, we will also consider the quantity ν̃_k(u, T). From a numerical point of view, ν̄_k(u, T) and ν̃_k(u, T) are worth being distinguished because they are not of the same order of magnitude as u → +∞. In the following sections, we will work with ν̄₁(u, T).
2. Some identities
First, let us introduce some notations that will be used in the rest of the paper. We set:

μ(t) = E(X'₀ | X₀ = X_t = u) = − ( r'(t) / (1 + r(t)) ) u,

σ²(t) = Var(X'₀ | X₀ = X_t = u) = λ₂ − r'²(t) / (1 − r²(t)),

ρ(t) = Cor(X'₀, X'_t | X₀ = X_t = u) = ( −r''(t)(1 − r²(t)) − r(t) r'²(t) ) / ( λ₂(1 − r²(t)) − r'²(t) ),

k(t) = √( (1 + ρ(t)) / (1 − ρ(t)) ),   b(t) = μ(t)/σ(t).

Note that, since the spectrum of the process X admits a continuous component, |ρ(t)| ≠ 1.
In the sequel, the variable t will be omitted when it is not confusing, and we will write r, r', μ, σ, ρ, k, b instead of r(t), r'(t), μ(t), σ(t), ρ(t), k(t), b(t).
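For a concrete covariance these conditional quantities are straightforward to evaluate. A Python sketch for r(t) = exp(−t²/2) (our own example, following the definitions above and the sign conventions of the Maple appendix):

```python
import numpy as np

# covariance r(t) = exp(-t^2/2): lambda2 = -r''(0) = 1, lambda4 = r''''(0) = 3
r   = lambda t: np.exp(-t**2 / 2)
rp  = lambda t: -t * np.exp(-t**2 / 2)
rpp = lambda t: (t**2 - 1) * np.exp(-t**2 / 2)
lam2 = 1.0

def mu(t, u):
    # E(X'_0 | X_0 = X_t = u), with the minus sign as in the Maple appendix
    return -u * rp(t) / (1 + r(t))

def sig2(t):
    # Var(X'_0 | X_0 = X_t = u) = lambda2 - r'(t)^2/(1 - r(t)^2)
    return lam2 - rp(t)**2 / (1 - r(t)**2)

def rho(t):
    # Cor(X'_0, X'_t | X_0 = X_t = u); tends to -1 as t -> 0
    return (-rpp(t) - r(t) * rp(t)**2 / (1 - r(t)**2)) / sig2(t)

def k(t):
    return np.sqrt((1 + rho(t)) / (1 - rho(t)))

def b(t, u):
    return mu(t, u) / np.sqrt(sig2(t))
```

For small t one can check numerically that σ²(t) ≈ (λ₄ − λ₂²)t²/4 and ρ(t) ≈ −1, the behaviours used in the asymptotic analysis below.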
Proposition 2.1. (i) If (X, Y) has a centred normal bivariate distribution with covariance matrix
( 1  ρ ; ρ  1 ),
then for a ∈ ℝ⁺

P(X > a, Y > a) = 2 ∫_a^{+∞} Φ̄( √((1 − ρ)/(1 + ρ)) x ) φ(x) dx.   (2.1)

(ii) For every k ≥ 0,

∫₀^{+∞} Φ(k x) φ(x) dx = 1/4 + arctan(k)/(2π).   (2.2)

(iii) ν₂(u, T) = 2 ∫₀^T (T − t) [ φ( u/√(1 + r(t)) ) ]² / √(1 − r²(t)) · ( T1(t) + T2(t) + T3(t) ) dt,   (2.3)

with

T1(t) = σ²(t) √(1 − ρ²(t)) φ(b(t)) φ(k(t) b(t)),

T2(t) = 2 σ²(t) ( ρ(t) − b²(t) ) [ arctan(k(t))/(2π) − ∫₀^{b(t)} Φ₀(k(t) x) φ(x) dx ],   (2.4)

and T3(t) a third explicit term of the same kind (its Taylor form is written out in the appendix).
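Formula (i) can be verified numerically against two classical closed forms: P(X > 0, Y > 0) = 1/4 + arcsin(ρ)/(2π), and, for ρ = 0, P(X > a, Y > a) = Φ̄(a)². A self-contained sketch (trapezoidal integration; the grid parameters are our own choices):

```python
import math

def Phibar(x):
    # standard normal tail probability
    return 0.5 * math.erfc(x / math.sqrt(2))

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def joint_tail(a, rho, n=20000, upper=12.0):
    # P(X > a, Y > a) = 2 * int_a^inf Phibar(sqrt((1-rho)/(1+rho)) x) phi(x) dx
    c = math.sqrt((1 - rho) / (1 + rho))
    h = (upper - a) / n
    s = 0.0
    for i in range(n + 1):
        x = a + i * h
        w = 0.5 if i in (0, n) else 1.0   # trapezoidal weights
        s += w * Phibar(c * x) * phi(x)
    return 2 * h * s

p0 = joint_tail(0.0, 0.5)   # closed form: 1/4 + arcsin(1/2)/(2*pi) = 1/3
p1 = joint_tail(1.0, 0.0)   # closed form (independence): Phibar(1)^2
```

This confirms, for these parameter values, that (2.1) is a one-dimensional integral with rapidly decaying integrand, well suited to quadrature.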
Remark 2.2.
1. Formula (i) is analogous to formula (2.10.4) given in Cramér and Leadbetter [7], p. 27:

P(X > a, Y > a) = Φ̄(a)² + ∫₀^ρ 1/(2π √(1 − z²)) exp( −a²/(1 + z) ) dz.

Our formula is easier to prove and better adapted to numerical application because, when t → 0, ρ(t) → −1 and the integrand in Cramér and Leadbetter's formula tends to infinity.
2. Utility of these formulae: they permit a computation of the Main inequality (1.1) at the cost of a double integral with finite bounds. This is a notable reduction of complexity with respect to the original form. The form (2.4) is better adapted to effective computation because it involves an integral on a bounded interval; this method has been implemented in an S+ program that needs about one second of CPU time to run an example. It has been applied to a genetic problem in Cierco and Azaïs [6].
The form (iii) has some consequences for both numerical and theoretical purposes. The calculation of ν₂(u, T) presents some numerical difficulties around t = 0: the sum of the three terms is infinitely small with respect to each term. To discard the diagonal from the computation, we use formula (iii) and Maple to calculate an equivalent of the integrand in the neighbourhood of t = 0 at fixed u. By stationarity, A_{st}(u) depends only on t − s; we write A_t(u), so that ν₂(u, T) = 2 ∫₀^T (T − t) A_t(u) dt. The following proposition gives the Taylor expansion of A_t at zero.

Proposition 2.3. As t → 0,

A_t(u) = ( (λ₂λ₆ − λ₄²) / ( 1296 · 2π² √(λ₄ − λ₂²) ) ) exp( − λ₄ u² / (2(λ₄ − λ₂²)) ) t⁴ + O(t⁵).
Piterbarg [17] and Wschebor [21] proved that A_t(u) = O( Φ̄(u(1 + δ)) ) for some δ > 0. Our result is more precise. Our formulae also give asymptotic expansions of ν̃₁(u, T) and ν₂(u, T) as u → +∞, for small T.
Proposition 2.4. Assume that 8 is finite. Then, there exists a value T0 such that, for every T < T0
11/2
4 22
27
1 (u, T ) =
4 5 (2 6 2 )3/2
2
4
4
u
4 22
u6
1+O
1
u
9/2
4 22
3 3T
2 (u, T ) =
9/2 (2 6 2 )
2
4
4
u
4 22
u5
1+O
1
u
as u +.
3. A numerical example
In the following example, we show how the upper and lower bounds (1.1) permit to evaluate the distribution
of X with an error less than 104 .
We consider the centered stationary Gaussian process with covariance (t) := exp(t2 /2) on the interval
I = [0, 1], and the levels u = 3, 2.5, . . . , 3. The term P (X0 u) is evaluated by the S -plus function P norm,
1 and 2 using Proposition 2.1 and the Simpson method. Though it is rather difficult to assess the exact
precision of these evaluations, it is clear that it is considerably smaller than 104 . So, the main source of error
112
is due to the difference between the upper and lower bounds in (1.1).
   u     P(X₀ ≤ u)   ν̄₁        ν̄₂        lower bound   upper bound
 −3.0    0.00135     0.00121    0          0.00014       0.00014
 −2.5    0.00621     0.00518    0          0.00103       0.00103
 −2.0    0.02275     0.01719    0          0.00556       0.00556
 −1.5    0.06681     0.04396    0.00001    0.02285       0.02285
 −1.0    0.15866     0.08652    0.00002    0.07213       0.07214
 −0.5    0.30854     0.13101    0.00004    0.17753       0.17755
  0.0    0.50000     0.15272    0.00005    0.34728       0.34731
  0.5    0.69146     0.13731    0.00004    0.55415       0.55417
  1.0    0.84134     0.09544    0.00002    0.74591       0.74592
  1.5    0.93319     0.05140    0.00001    0.88179       0.88180
  2.0    0.97725     0.02149    0          0.95576       0.95576
  2.5    0.99379     0.00699    0          0.98680       0.98680
  3.0    0.99865     0.00177    0          0.99688       0.99688

(lower bound = P(X₀ ≤ u) − ν̄₁; upper bound = lower bound + ν̄₂/2; the two columns bracket P(X* ≤ u).)
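The bounds of the table can be sanity-checked by brute-force simulation of the process on a fine grid (a grid maximum slightly underestimates the true maximum, so the estimated probability is biased slightly upwards). A sketch, with sample size and grid of our own choosing, for the level u = 0:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, u = 101, 20000, 0.0
t = np.linspace(0.0, 1.0, n)
C = np.exp(-0.5 * (t[:, None] - t[None, :])**2)  # covariance exp(-(s-t)^2/2)

# C is numerically near-singular, so sample through a clipped
# eigendecomposition rather than a Cholesky factor
w, V = np.linalg.eigh(C)
A = V * np.sqrt(np.clip(w, 0.0, None))

X = rng.standard_normal((reps, n)) @ A.T          # rows ~ N(0, C)
est = float(np.mean(X.max(axis=1) <= u))          # estimate of P(max <= 0)
```

For u = 0 the table gives P(X* ≤ 0) ∈ [0.34728, 0.34731]; a Monte Carlo estimate with 20000 replications agrees to within its sampling error of a few 10⁻³.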
4. Proofs
Proof of Proposition 2.1.
Proof of point (i). We first compute P(X > a, Y > a). Put ρ = cos θ, θ ∈ [0, π[, and use the orthogonal decomposition Y = ρX + √(1 − ρ²) Z, with Z standard normal and independent of X. Then {Y > a} = { Z > (a − ρX)/√(1 − ρ²) }. Thus

P(X > a, Y > a) = ∫∫_D φ(x) φ(z) dx dz,

where D is the domain located between the two half straight lines starting from the point ( a, a√((1 − ρ)/(1 + ρ)) ). Using a symmetry with respect to the straight line with angle θ/2 passing through the origin, we get:

P(X > a, Y > a) = 2 ∫_a^{+∞} Φ̄( √((1 − ρ)/(1 + ρ)) x ) φ(x) dx.   (4.1)

Now,

P(X > a, Y > a) = Φ̄(a) − P(X > a, Y ≤ a) = Φ̄(a) − P(X > a, (−Y) ≥ −a).

Applying relation (4.1) to the pair (X, −Y), whose correlation is −ρ, gives a second expression of the same probability in terms of Φ̄( √((1 + ρ)/(1 − ρ)) x ), and combining the two expressions at a = 0 yields point (ii):

∫₀^{+∞} Φ(k x) φ(x) dx = 1/4 + arctan(k)/(2π),   with k = √((1 − ρ)/(1 + ρ)).
Computation of ν̄₁. Conditionally on (X₀, X_t) = (x, u), the derivative X'_t is Gaussian with mean m = r'(t)(x − r(t)u)/(1 − r²(t)) and variance σ²(t); using E(Z⁺) = m Φ(m/σ) + σ φ(m/σ), formula (1.2) splits as ν̄₁(u, T) = I₁ + I₂ with

I₁ = ∫₀^T dt ∫_{−∞}^u σ φ(m/σ) p_{0,t}(x, u) dx,

I₂ = ∫₀^T dt ∫_{−∞}^u ( r'(x − r u)/(1 − r²) ) Φ( r'(x − r u)/((1 − r²)σ) ) p_{0,t}(x, u) dx.

Integrating I₂ by parts and using r'²/(1 − r²) = λ₂ − σ², we obtain

ν̄₁(u, T) = φ(u) ∫₀^T [ √(λ₂/(2π)) Φ( (√λ₂/σ(t)) √((1 − r)/(1 + r)) u ) − ( r'(t)/√(1 − r²(t)) ) Φ̄(b(t)) φ( √((1 − r)/(1 + r)) u ) ] dt.
For the proof of point (iii) we also use the Gaussian moments

J_{ij} = ∫∫ x^i y^j φ₂(x, y; ρ) dx dy,

where φ₂(·, ·; ρ) denotes the centred bivariate normal density with unit variances and correlation ρ, the integrals being taken over the quadrants determined by the levels b and k b. Successive integrations by parts express the J_{ij} that we need in closed form through Φ̄(k b), φ(k b) and φ(b), with factors √(1 − ρ²) and (1 − ρ²)^{3/2} (formulae (4.2)–(4.6)).

Note 4.2. Many results of this section are based on tedious Taylor expansions. These expansions have been made or checked by a computer algebra system (Maple). They are not detailed in the proofs.
Proof of Proposition 2.3. Use form (iii) and remark that, when t is small, k(t) = √((1 + ρ(t))/(1 − ρ(t))) = O(t) is small, and, since Φ₀(ε) = ε/√(2π) − ε³/(6√(2π)) + O(ε⁵) as ε → 0, we get

∫₀^{b(t)} Φ₀(k(t)x) φ(x) dx = (k(t)/√(2π)) ∫₀^{b(t)} x φ(x) dx − (k³(t)/(6√(2π))) ∫₀^{b(t)} x³ φ(x) dx + O(t⁵),

so that

T2(t) = 2σ²(t)(ρ(t) − b²(t)) [ arctan(k(t))/(2π) − (k(t)/√(2π)) ( φ(0) − φ(b(t)) − (k²(t)/6)( 2φ(0) − (b²(t) + 2) φ(b(t)) ) ) ] + O(t⁵).

In the same way,

T3(t) = ( 2σ²(t) k(t) b²(t)/√(2π) ) ( 1 − k²(t) b²(t)/6 ) φ(b(t)) + O(t⁵).

Then, assuming λ₈ finite, use Maple to get the result.
Lemma 4.3. Let l be a C¹ function with l(0) = 0 and l'(0) = c > 0, and let p ∈ ℕ. Set

I_p = ∫₀^T t^p Φ₀( k(t) b(t) ) φ( l(t) u ) dt,   J_p = ∫₀^T t^p Φ̄( l(t) u ) dt.

Then, as u → +∞,

(i) I_p = (c u)^{−(p+1)} ( M_{p+1}/(2√(2π)) ) ∫₀^{arctan(d/c)} (cos θ)^p dθ · [ 1 + O(1/u) ],

(ii) J_p = (c u)^{−(p+1)} ( M_{p+1}/(2(p + 1)) ) [ 1 + O(1/u) ],

where M_{p+1} = E|Z|^{p+1}, Z a standard Gaussian random variable, and d = √( λ₂(λ₂λ₆ − λ₄²) ) / ( 6(λ₄ − λ₂²) ), so that d/c = 1/√2.

Proof of Lemma 4.3. Since the derivative of l at zero is non-zero, l is invertible in some neighbourhood of zero, and its inverse satisfies l^{−1}(t) = t/c + O(t²), (l^{−1})'(t) = 1/c + O(t). We first consider I_p and use the change of variable y = l(t)u; then

I_p = (c u)^{−(p+1)} ∫₀^{l(T)u} y^p Φ₀( (d/c) y + u O_U(y²/u²) ) φ(y) ( 1 + O_U(y/u) ) dy,

where O_U denotes a Landau symbol uniform over the range of integration.
For y ≤ l(T)u the error terms are uniformly dominated by quantities of order exp( −(const) u t² )   (4.7), so that

I_p = (c u)^{−(p+1)} ∫₀^{l(T)u} y^p Φ₀( (d/c) y ) φ(y) ( 1 + O_U(y/u) ) dy.   (4.8)

Put K_p(u) = ∫₀^{l(T)u} y^p Φ₀((d/c) y) φ(y) dy. Moreover,

K_p(∞) = ∫₀^{+∞} y^p Φ₀((d/c) y) φ(y) dy = (1/2π) ∫₀^{+∞} ∫₀^{(d/c) y} y^p exp( −(y² + z²)/2 ) dz dy.

Then, using polar coordinates, we derive that

K_p(∞) = ( M_{p+1}/(2√(2π)) ) ∫₀^{arctan(d/c)} (cos θ)^p dθ.

So we can see that the contribution of the term O_U(y/u) in formula (4.8) is O( u^{−(p+2)} ), which gives the desired result for I_p; the proof of (ii) is similar.

Proof of Proposition 2.4: expansion of ν̃₁. From the computation of ν̄₁ above,

ν̃₁(u, T) = ν₁(u, T) − ν̄₁(u, T) = ∫₀^T A1(t) dt,

with

A1(t) = φ(u) [ √(λ₂/(2π)) Φ̄( u √( λ₂(1 − r(t)) / (σ²(t)(1 + r(t))) ) ) + ( r'(t)/√(1 − r²(t)) ) Φ̄(b(t)) φ( u √((1 − r(t))/(1 + r(t))) ) ].
0
1
1
3
3 + 5 + O(z 7 ) .
z
z
z
(4.9)
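Expansion (4.9) is the classical Mills-ratio expansion; a quick numerical check (the evaluation point z = 6 is our own choice; the relative error of the three-term expansion is of order 15/z⁶ ≈ 3·10⁻⁴):

```python
import math

def Phibar(z):
    # standard normal tail probability
    return 0.5 * math.erfc(z / math.sqrt(2))

def phi(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def mills3(z):
    # three-term expansion (4.9): Phibar(z) ~ phi(z) (1/z - 1/z^3 + 3/z^5)
    return phi(z) * (1 / z - 1 / z**3 + 3 / z**5)

z = 6.0
rel_err = abs(mills3(z) / Phibar(z) - 1)   # should be of order 15/z**6
```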
Applying (4.9) with z = √( λ₂(1 − r(t))/(σ²(t)(1 + r(t))) ) u for the first term of A1 and z = b(t) for the second, the leading parts of the two terms cancel, and Maple gives, at fixed u, an equivalent of A1(t) of order t as t → 0, carrying the exponential factor exp( −λ₄u²/(2(λ₄ − λ₂²)) ).
To use Lemma 4.3, point (ii), to calculate ν̃₁(u, T), it is necessary to have a Taylor expansion of the coefficient of u in this argument. We have

lim_{t→0} λ₂(1 − r(t)) / ( σ²(t)(1 + r(t)) ) = λ₂²/(λ₄ − λ₂²),

therefore we set

l(t) = √( λ₂(1 − r(t))/(σ²(t)(1 + r(t))) − λ₂²/(λ₄ − λ₂²) ) = ( √(2λ₂(λ₂λ₆ − λ₄²)) / (6(λ₄ − λ₂²)) ) t + O(t²).

Applying Lemma 4.3, point (ii), with p = 1,

∫₀^T t Φ̄( l(t) u ) dt = ( 9(λ₄ − λ₂²)² / (2λ₂(λ₂λ₆ − λ₄²)) ) u^{−2} [ 1 + O(1/u) ],

and, gathering the pieces, we get the equivalent of ν̃₁(u, T) stated in Proposition 2.4.
Expansion of ν₂(u, T). Starting from form (iii),

ν₂(u, T) = 2 ∫₀^T (T − t) [ φ( u/√(1 + r(t)) ) ]² / √(1 − r²(t)) · ( T1(t) + T2(t) + T3(t) ) dt.   (4.10)

The function x ↦ (x² − 1)φ(x) being bounded, one has

Φ₀(k x) = Φ₀(k b) + k φ(k b)(x − b) − (1/2) k³ b φ(k b)(x − b)² + O_U( k³ (x − b)³ ),   (4.11)

where the Landau symbol has here the same meaning as in Lemma 4.3. Moreover, using the expansion of Φ̄ given in formula (4.9), it is easy to check that, as z → +∞,

∫_z^{+∞} (x − z) φ(x) dx = φ(z)/z² − 3 φ(z)/z⁴ + O( φ(z)/z⁶ ),

∫_z^{+∞} (x − z)² φ(x) dx = 2 φ(z)/z³ + O( φ(z)/z⁵ ),

∫_z^{+∞} (x − z)³ φ(x) dx = O( φ(z)/z⁴ ).

Therefore, multiplying formula (4.11) by φ(x), integrating on [b, +∞[ and applying formula (4.9) once again yields an expansion of T2 as a finite sum of explicit terms; the penultimate term can be forgotten. Then, remarking that, as u → +∞, b ≍ u uniformly in t and k ≍ t, we write

T2 = U₁ + U₂ + ... + U₉ + Remainder.

Remark 4.5. As will be seen later on, Lemma 4.3 shows that the contribution of the Remainder to the integral (4.10) can be neglected, since the joint degrees in t and 1/u of its terms are greater than 5. In the sequel, further terms of the same kind are also absorbed into the Remainder.
Now, we have U₁ + T3 = 0 and, since √(1 − ρ²) k = 1 + ρ,

U₇ + T1 = (1 + ρ) σ² k φ(k b) φ(b),
U₂ + U₃ = (2σ²/b)(1 + ρ) φ(k b) φ(b),
U₄ + U₅ = −(4σ²/b³) φ(k b) φ(b) ( 1 + O(t²) ),
U₈ + U₉ = (4σ² k/b²) φ(k b) φ(b) ( 1 + O(t²) ),

since ρ = −1 + O(t²). By the same remark as Remark 4.5 above, the terms O(t²) can be neglected. Consequently, T1 + T2 + T3 reduces, up to the Remainder, to a combination of the right-hand sides above and of a term in σ² k³ φ(k b) φ(b). Therefore we are led to use Lemma 4.3 in order to calculate the following integrals:

∫₀^T (T − t) m₁(t) Φ₀(k b) exp( −(1/2)( b² + 2u²/(1 + r) ) ) dt,

∫₀^T (T − t) m₂(t) Φ₀(k b) exp( −(1/2)( b² + 2u²/(1 + r) ) ) dt,

∫₀^T (T − t) m_i(t) exp( −(1/2)( b²(1 + k²) + 2u²/(1 + r) ) ) dt,   i = 3, 4, 5,
with

m₁(t) = 2(1 + ρ(t)) σ²(t) / ( π b(t) √(1 − r²(t)) ) = (1/36) ( (λ₂λ₆ − λ₄²) √(λ₄ − λ₂²) / λ₂^{5/2} ) (t³/u) + O(t⁵),

m₂(t) = −(4/π) σ²(t) / ( b³(t) √(1 − r²(t)) ) = −( (λ₄ − λ₂²)^{5/2} / λ₂^{7/2} ) (t/u³) + O(t³),

m₃(t) = −(1 + ρ(t)) σ²(t) k(t) / ( π √(2π(1 − r²(t))) ) = −( √2 (λ₂λ₆ − λ₄²)^{3/2} / ( 864 λ₂² (λ₄ − λ₂²) π^{3/2} ) ) t⁴ + O(t⁶),

m₄(t) = (2/π) σ²(t) k³(t) / √(2π(1 − r²(t))) = ( √2 (λ₂λ₆ − λ₄²)^{3/2} / ( 864 λ₂² (λ₄ − λ₂²) π^{3/2} ) ) t⁴ + O(t⁶),

m₅(t) = (4/π) σ²(t) k(t) / ( b²(t) √(2π(1 − r²(t))) ) = ( √2 (λ₂λ₆ − λ₄²)(λ₄ − λ₂²)^{3/2} / ( 12 λ₂³ π^{3/2} u² ) ) t² + O(t⁴).

Lemma 4.3 shows that we can neglect the terms issued from the t part of the factor T − t in formula (4.10). Since

lim_{t→0} [ b²/u² + 2/(1 + r) ] = lim_{t→0} [ b²(1 + k²)/u² + 2/(1 + r) ] = λ₄/(λ₄ − λ₂²),

we set

l₁(t) = √( b²(t)/u² + 2/(1 + r(t)) − λ₄/(λ₄ − λ₂²) ),   with l₁²(t) = (1/18)( λ₂(λ₂λ₆ − λ₄²)/(λ₄ − λ₂²)² ) t² + O(t⁴),

l₂(t) = √( b²(t)(1 + k²(t))/u² + 2/(1 + r(t)) − λ₄/(λ₄ − λ₂²) ),   with l₂²(t) = (1/12)( λ₂(λ₂λ₆ − λ₄²)/(λ₄ − λ₂²)² ) t² + O(t⁴).
Collecting these estimates,

ν₂(u, T) = T exp( −λ₄u²/(2(λ₄ − λ₂²)) ) [ m̂₁ I₃ + m̂₂ I₁ + m̂₅ J₂ ] ( 1 + O(1/u) ),

where m̂₁, m̂₂, m̂₅ denote the leading coefficients of m₁, m₂, m₅ above, and I₁, I₃, J₂ are the integrals of Lemma 4.3 associated with l₁ and l₂.
Noting that

∫₀^{arctan(1/√2)} (cos θ)³ dθ = 8√3/27   and   ∫₀^{arctan(1/√2)} cos θ dθ = √3/3,

Lemma 4.3 provides explicit equivalents for I₃, I₁ and J₂, each of the form (const) · u^{−(p+1)} ( 1 + O(1/u²) ), the constants depending only on λ₂, λ₄ and λ₆. Finally, gathering the pieces, we obtain the desired expression of ν₂.
5. Discussion
Using the general relation (1.3) with n = 1, we get

| P(X* > u) − P(X₀ > u) − ν̄₁(u, T) + ν̄₂(u, T)/2 | ≤ ν̃₂(u, T)/2 + ν̄₃(u, T)/6.

A conjecture is that the orders of magnitude of ν̃₂(u, T) and ν̄₃(u, T) are considerably smaller than those of ν̃₁(u, T) and ν₂(u, T). Admitting this conjecture, Proposition 2.4 implies that, for T small enough,

P(X* > u) = Φ̄(u) + ( T√λ₂/(2π) ) e^{−u²/2} − ( 3√3 T (λ₄ − λ₂²)^{9/2} ) / ( 2π^{9/2} λ₂^{9/2} (λ₂λ₆ − λ₄²) ) u^{−5} exp( −λ₄u²/(2(λ₄ − λ₂²)) ) [ 1 + O(1/u) ],

which is Piterbarg's theorem with a better remainder ([15], Th. 3.1, p. 703). Piterbarg's theorem is, as far as we know, the most precise expansion of the distribution of the maximum of smooth Gaussian processes. Moreover, very tedious calculations would give extra terms of the Taylor expansion.
References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA (1990).
[3] J.-M. Azaïs and M. Wschebor, Une formule pour calculer la distribution du maximum d'un processus stochastique. C. R. Acad. Sci. Paris Sér. I Math. 324 (1997) 225-230.
[4] J.-M. Azaïs and M. Wschebor, The Distribution of the Maximum of a Stochastic Process and the Rice Method, submitted.
[5] C. Cierco, Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif. PhD dissertation, University of Toulouse, France (1996).
[6] C. Cierco and J.-M. Azaïs, Testing for Quantitative Gene Detection in Dense Map, submitted.
[7] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes. J. Wiley & Sons, New York (1967).
[8] D. Dacunha-Castelle and E. Gassiat, Testing in locally conic models, and application to mixture models. ESAIM: Probab. Statist. 1 (1997) 285-317.
[9] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247-254.
[10] J. Ghosh and P. Sen, On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related results, in Proc. of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, L.M. Le Cam and R.A. Olshen, Eds. (1985).
GENERAL FORMULAE
We introduce mu4 = lambda4 - lambda2^2 and mu6 = lambda2*lambda6 - lambda4^2 to make the outputs clearer.

> phi := t -> exp(-t*t/2)/sqrt(2*Pi);
> assume(t > 0); assume(lambda2 > 0); assume(mu4 > 0); assume(mu6 > 0);
> interface(showassumed = 2);
> Order := 12;
> r := t -> 1 - lambda2*t^2/2! + lambda4*t^4/4! - lambda6*t^6/6! + lambda8*t^8/8!;
> siderels := {lambda4 = mu4 + lambda2^2, lambda2*lambda6 - lambda4^2 = mu6}:
> I_r2 := t -> 1 - r(t)*r(t);
> simplify(simplify(series(I_r2(t), t = 0, 8), siderels));
#   lambda2*t^2 - (lambda2^2/3 + mu4/12)*t^4 + O(t^6)
> rp := t -> diff(r(t), t);
> rs := t -> diff(r(t), t$2);
> mu := t -> -u*rp(t)/(1 + r(t));
> sig2 := t -> lambda2 - rp(t)*rp(t)/I_r2(t);
> simplify(taylor(sig2(t), t = 0, 8), siderels);
#   (1/4)*mu4*t^2 + O(t^4)
> sigma := t -> sqrt(sig2(t));
> b := t -> mu(t)/sigma(t);
#   series: u*lambda2/sqrt(mu4) + O(t^2)
> sig2rho := t -> -rs(t) - r(t)*rp(t)*rp(t)/I_r2(t);
#   series: -(1/4)*mu4*t^2 + O(t^4)
> rho := t -> sig2rho(t)/sig2(t);
#   series: -1 + (1/18)*(mu6/(lambda2*mu4))*t^2 + O(t^4)
> k2 := t -> (1 + rho(t))/(1 - rho(t));
> sk2 := simplify(taylor(k2(t), t = 0), siderels):
> k := t -> taylor(sqrt(sk2), t = 0);
#   series: (1/6)*sqrt(mu6/(lambda2*mu4))*t + O(t^3)
> sqrtI_rho2 := t -> k(t)*(1 - rho(t));
> T1 := t -> sig2(t)*sqrtI_rho2(t)*phi(b(t))*phi(k(t)*b(t));
> T2 := t -> 2*sig2(t)*(rho(t) - b(t)^2)*(arctan(k(t))/(2*Pi)
>       - k(t)/sqrt(2*Pi)*(phi(0) - phi(b(t)) - k(t)^2/6*(2*phi(0) - (b(t)^2 + 2)*phi(b(t)))));
> T3 := t -> (2*sig2(t)*k(t)*b(t)^2)/sqrt(2*Pi)*(1 - k(t)^2*b(t)^2/6)*phi(b(t));
> A := t -> ((phi(u/sqrt(1 + r(t))))^2/sqrt(I_r2(t)))*(T1(t) + T2(t) + T3(t));
> simplify(simplify(series(A(t), t = 0, 6), siderels), power);
#   the terms up to t^3 cancel: A(t) = O(t^4)

PROOF OF THE EQUIVALENT OF NU1
> Cphib := t -> phi(t)/t - phi(t)/t^3;
> sq := t -> sqrt((1 - r(t))/(1 + r(t)));
> nsigma := t -> sigma(t)/sqrt(lambda2);
> A1 := t -> (1/sqrt(2*Pi))*phi(u)*phi(sq(t)*u/nsigma(t))*((nsigma(t)/(sq(t)*u)
>       - (nsigma(t)/(sq(t)*u))^3)*sqrt(lambda2) + (1/b(t) - 1/b(t)^3)*rp(t)/sqrt(I_r2(t)));
> SA1 := simplify(simplify(series(A1(t), t = 0, 6), siderels), power);
#   SA1 is proportional to exp(-(1/2)*u^2*(mu4 + lambda2^2)/mu4)*t + O(t^4)
> L2 := t -> (1 - r(t))/((1 + r(t))*nsigma(t)^2) - (lambda4 - mu4)/mu4;
> SL2 := simplify(simplify(series(L2(t), t = 0, 6), siderels), power);
#   (1/18)*(lambda2*mu6/mu4^2)*t^2 + O(t^4)
# We define c as the square root of the coefficient of t^2:
> c := sqrt(op(1, SL2));
#   c = (1/6)*sqrt(2*lambda2*mu6)/mu4
> nu1b := sqrt(2*Pi)*op(1, SA1)*(c^(-3)*u^(-3)/2);
#   nu1b: explicit equivalent of nu1~ (cf. Proposition 2.4)

PROOF OF THE EQUIVALENT OF NU2
> m1 := t -> (1 + rho(t))*2*sigma(t)^2/(Pi*b(t)*sqrt(I_r2(t)));
> sm1 := simplify(simplify(series(m1(t), t = 0, 8), siderels), power);
> m2 := t -> (-4/Pi)*sigma(t)^2*b(t)^(-3)/sqrt(I_r2(t));
> sm2 := simplify(simplify(series(m2(t), t = 0, 6), siderels), power);
> m3 := t -> -(1 + rho(t))*sigma(t)^2*k(t)/(Pi*sqrt(2*Pi*I_r2(t)));
> sm3 := simplify(simplify(series(m3(t), t = 0, 6), siderels), power);
> m4 := t -> (2/Pi)*sigma(t)^2*k(t)^3/sqrt(2*Pi*I_r2(t));
> sm4 := simplify(simplify(series(m4(t), t = 0, 6), siderels), power);
> m5 := t -> (4/Pi)*sigma(t)^2*k(t)*b(t)^(-2)/sqrt(2*Pi*I_r2(t));
> sm5 := simplify(simplify(series(m5(t), t = 0, 6), siderels), power);
> l12 := t -> (b(t)/u)^2 + 2/(1 + r(t)) - lambda4/mu4;
> simplify(simplify(series(l12(t), t = 0, 8), siderels), power);
#   (1/18)*(lambda2*mu6/mu4^2)*t^2 + O(t^4)
> l22 := t -> (b(t)/u)^2*(1 + k(t)^2) + 2/(1 + r(t)) - lambda4/mu4;
> simplify(simplify(series(l22(t), t = 0, 8), siderels), power);
#   (1/12)*(lambda2*mu6/mu4^2)*t^2 + O(t^4)
> opm1 := op(1, sm1);  opm2 := op(1, sm2);  opm5 := op(1, sm5);
> c1 := 144*sqrt(3)*mu4^4*u^(-4)/(sqrt(2*Pi)*lambda2^2*mu6^2);
> c2 := 3*sqrt(3)*mu4^2*u^(-2)/(sqrt(2*Pi)*lambda2*mu6);
> c5 := 12*sqrt(3)*mu4^3*u^(-3)/(lambda2^(3/2)*mu6^(3/2));
> B := opm1*c1 + opm2*c2 + opm5*c5;
> simplify(B);
#   B: leading coefficient of the equivalent of nu2 (cf. Proposition 2.4)
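The leading term (μ₄/4)t² of the expansion of sig2 above can be cross-checked with an open-source CAS. A sympy sketch mirroring the Maple definitions (our own translation; it avoids a direct series of the rational function by comparing polynomial coefficients of numerator and denominator):

```python
import sympy as sp

t, l2, l4, l6, l8 = sp.symbols('t lambda2 lambda4 lambda6 lambda8', positive=True)

# covariance Taylor polynomial, as in the Maple session above
r = 1 - l2*t**2/2 + l4*t**4/24 - l6*t**6/720 + l8*t**8/40320
rp = sp.diff(r, t)

# sig2(t) = lambda2 - r'(t)^2/(1 - r(t)^2), written over a common denominator
P = sp.expand(l2*(1 - r**2) - rp**2)   # numerator
Q = sp.expand(1 - r**2)                # denominator

# the numerator starts at order t^4, the denominator at order t^2,
# so sig2(t) ~ (c4/c2) * t^2 with:
c4 = P.coeff(t, 4)
c2 = Q.coeff(t, 2)
lead = sp.simplify(c4 / c2)            # expected: (lambda4 - lambda2^2)/4
```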
Abstract
This paper deals with the asymptotic behaviour, as the level tends to +∞, of the tail of the distribution of the maximum of a stationary Gaussian process on a fixed interval of the line. For processes satisfying certain regularity conditions, we give a second-order term for this asymptotics.
Introduction
X = {X(t); t ∈ [0, T]}, T > 0, is a real-valued centred stationary Gaussian process with covariance function r, normalized by r(0) = 1. A classical result states that the error in the first-order approximation of P(M_T > u) is bounded by

B exp( −(1/2) u²(1 + δ) )   (1)

for some constants B > 0 and δ > 0; Φ (respectively φ) denotes the standard normal distribution (respectively density).
The aim of this paper is to improve the description of the asymptotic behavior of P(M_T > u) as u → +∞ that follows from (1), replacing the bound for the error by an equivalent as u → +∞. More precisely, under the regularity conditions required in Theorem 1.1, we will prove that

P(M_T > u) = 1 − Φ(u) + T ( √λ₂/(2π) ) e^{−u²/2} [ 1 − (…) [1 + o(1)] ],   (2)

where (…) is an explicit second-order term.
This contradicts Theorem 3.1 in Piterbarg's paper, in which a different equivalent is given in case T is small enough (see also Azaïs et al. (1999)).
We will assume further that X has C⁴ sample paths (this implies (…) < 1) and that, for every n ≥ 1 and pairwise different values t₁, ..., t_n in [0, T], the distribution of the set of 5n random variables (X^{(j)}(t₁), ..., X^{(j)}(t_n); j = 0, 1, 2, 3, 4) is non-degenerate. A sufficient condition for this to hold is that the spectral measure of the process is not purely atomic or, if it is purely atomic, that the set of atoms has an accumulation point in the real line (a proof of these facts can be done in the same way as in Chap. 10 of Cramér and Leadbetter, 1967).
If ξ is a random vector with values in ℝⁿ whose distribution has a density with respect to Lebesgue measure, we denote by p_ξ(x) the density of ξ at the point x ∈ ℝⁿ. 1_C denotes the indicator function of the set C.
If Y = {Y(t) : t ∈ ℝ} is a process in L², we put Γ^Y(s, t) for its covariance function and Γ^Y_{ij}(s, t) = ∂^{i+j}Γ^Y(s, t)/∂s^i∂t^j for the partial derivatives, whenever they exist.
The proof of (2) consists in computing the density of the distribution of the random variable M_T and studying its asymptotic behaviour as u → +∞. Our main tool is the following proposition, which is a special case of the differentiation Lemma 3.3 in Azaïs and Wschebor (1999).

Proposition 1.1. Let Y be a Gaussian process with C² paths such that, for every n ≥ 1 and pairwise different values t₁, ..., t_n in [0, T], the distribution of the set of 3n random variables (Y^{(j)}(t₁), ..., Y^{(j)}(t_n); j = 0, 1, 2) is non-degenerate. Assume also that E{Y(t)} = 0 and Γ^Y(t, t) = E{Y²(t)} = 1. Then, if ψ is a C²-function on [0, T], the density of sup_{t∈[0,T]} Y(t)/ψ(t) can be expressed as an integral over [0, T] of conditional expectations of the form

E( ( Y^t(t) − a_t(t) u ) 1_{ { Y^t(s) ≤ u a_t(s), ∀s ∈ [0, T] } } ) p_{(Y(t), Y'(t))}( u ψ(t), u ψ'(t) ),

together with boundary terms ((3)–(5)). Here the functions ℓ, a_t and the (random) functions Y^ℓ, Y^a, Y^t are the continuous extensions to [0, T] of:

ℓ(s) = (1/s)( ψ(s) − Γ^Y(s, 0) ψ(0) ),   Y^ℓ(s) = (1/s)( Y(s) − Γ^Y(s, 0) Y(0) )   for 0 < s ≤ T;   (6)

a_t(s) = ( 2/(s − t)² ) ( ψ(s) − Γ^Y(s, t) ψ(t) − ( Γ^Y_{01}(t, s)/Γ^Y_{11}(t, t) ) ψ'(t) ),   0 ≤ s ≤ T, s ≠ t;   (8)

Y^t(s) = ( 2/(s − t)² ) ( Y(s) − Γ^Y(s, t) Y(t) − ( Γ^Y_{01}(t, s)/Γ^Y_{11}(t, t) ) Y'(t) ),   0 ≤ s ≤ T, s ≠ t.   (9)
We will repeatedly use the following lemma. Its proof is elementary and we omit it.

Lemma 1.1. Let f and g be real-valued functions of class C² defined on the interval [0, T] of the real line verifying the conditions:
1) f has a unique minimum on [0, T] at the point t = t₀, and f'(t₀) = 0, f''(t₀) > 0;
2) let k = inf{ j : g^{(j)}(t₀) ≠ 0 } and suppose k = 0, 1 or 2.
Define

h(u) = ∫₀^T g(t) exp( −(1/2) u² f(t) ) dt.

Then, as u → ∞,

h(u) ≈ ( g^{(k)}(t₀)/k! ) (1/u^{k+1}) exp( −(1/2) u² f(t₀) ) ∫_J x^k exp( −(1/4) f''(t₀) x² ) dx,

where J = [0, +∞), J = (−∞, 0] or J = (−∞, +∞) according as t₀ = 0, t₀ = T or 0 < t₀ < T, respectively.
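Lemma 1.1 is the standard Laplace method. For k = 0 and an interior minimum it gives h(u) ≈ g(t₀) u⁻¹ exp(−u²f(t₀)/2) √(4π/f''(t₀)). A numerical check on the toy case f(t) = (t − 1/2)², g ≡ 1 (our own example, not from the text):

```python
import math

def h(u, n=200000):
    # h(u) = int_0^1 exp(-u^2 f(t)/2) dt with f(t) = (t - 1/2)^2, g = 1
    dt, s = 1.0 / n, 0.0
    for i in range(n):
        t = (i + 0.5) * dt          # midpoint rule
        s += math.exp(-0.5 * u**2 * (t - 0.5)**2) * dt
    return s

u = 30.0
# k = 0, interior minimum t0 = 1/2, f(t0) = 0, f''(t0) = 2:
# h(u) ~ (1/u) * sqrt(4*pi/2) = sqrt(2*pi)/u
laplace = math.sqrt(2 * math.pi) / u
val = h(u)
```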
Let now X be a centred stationary Gaussian process with C⁴ paths, covariance r(·) with r(0) = 1, and such that for every n ≥ 1 and pairwise different t₁, ..., t_n in [0, T] the distribution of the set of 5n random variables (X^{(j)}(t₁), ..., X^{(j)}(t_n); j = 0, 1, 2, 3, 4) is non-degenerate. We shall also assume the additional hypothesis that r' < 0 on a set dense in [0, T].

Step 1. Proposition 1.1, applied to the process Y = X and the function ψ(t) = 1 for all t ∈ [0, T], enables us to write the density p_{M_T} of the distribution of the maximum M_T as

p_{M_T}(u) = [ A₁(u) + A₂(u) + A₃(u) ] φ(u),   (10)

where, in particular,

A₃(u) = −(1/√(2π)) ∫₀^T E( ( X^t(t) − a_t(t) u ) 1_{ { X^t(s) ≤ u a_t(s), ∀s ∈ [0, T] } } ) dt.

Since X is a stationary process and ψ(t) ≡ 1, the processes X and X̃, defined by X̃(t) = X(T − t), have the same law, so that P( X(s) ≤ u for all s ∈ [0, T] | X(0) = u ) = P( X(s) ≤ u for all s ∈ [0, T] | X(T) = u ). Hence A₁(u) = A₂(u).
Step 2. For ψ ≡ 1 one has ℓ(s) = (1 − r(s))/s and X^ℓ(s) = (X(s) − r(s)X(0))/s, whence

ℓ(s) / ( E{(X^ℓ(s))²} )^{1/2} = √( (1 − r(s))/(1 + r(s)) ),   s ∈ ]0, T].

To bound P( Y^a(s) ≤ u a(s), ∀s ∈ [0, T] ), it suffices to check that it is at most P( Y^a(T) ≤ u a(T) ); since the non-degeneracy hypothesis implies that, for each T > 0, E{(Y^a(T))²} is non-zero, the relevant normalized level is strictly positive for T > 0. Hence

P( Y^a(s) ≤ u a(s), ∀s ∈ [0, T] ) ≤ C(T) exp( −(u²/2) F(T) ),   (12)

with C(T) > 0, where F is the function

F(t) = λ₂ (1 − r(t))² / ( λ₂ (1 − r²(t)) − (r'(t))² ),

which is well defined since the denominator does not vanish because of the previous remark.
The following properties of the function F are elementary and will be useful in our calculations.
(a) F has a continuous extension at t = 0.
(b) F(t) > F(0) = λ₂²/(λ₄ − λ₂²) for t ≠ 0, because:
* F'(t) = 2λ₂ (1 − r(t)) r'(t) [ (r'(t))² − (λ₂ − r''(t)·(−1))·(1 − r(t)) ] / ( λ₂(1 − r²(t)) − (r'(t))² )², where the bracket equals (r'(t))² + (1 − r(t))(r''(t) − λ₂);
* r'(t) < 0 for t ∈ A ⊂ [0, T] with A dense in [0, T]; and
* for t ≠ 0 the bracket is strictly negative, which follows from the Cauchy–Schwarz inequality applied to the increments of X and X', together with the non-degeneracy condition.
(c) F'(0) = 0.
(d) F''(0) = λ₂ (λ₂λ₆ − λ₄²) / ( 9 (λ₄ − λ₂²)² ).
For a process Y satisfying the hypotheses of Proposition 1.1 and a function ψ as above, set

L^Y(u, ψ) = ψ(T)·(…) + ∫₀^T ψ(t) E( Y^t(t) 1_{ { Y^t(s) ≤ u a_t(s), ∀s ∈ [0, T] } } ) p_{(Y(t), Y'(t))}( u ψ(t), u ψ'(t) ) dt.   (13)

The Gaussian density factor can be written

p_{(Y(t), Y'(t))}( u ψ(t), u ψ'(t) ) = (1/(2π)) ( E{(Y'(t))²} )^{−1/2} exp( −(u²/2) [ ψ²(t) + (ψ'(t))²/E{(Y'(t))²} ] ),

and the bracket plays the role of F. Each term of (13) is therefore bounded by C(t) exp( −(u²/2) F(t) ); from Lemma 1.1 and the properties of the functions F and ψ, one gets

∫₀^T ψ(t) exp( −(u²/2) F(t) ) dt ≤ C (1/u) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ),

C a positive constant. Hence

L^Y(u, ψ) = O( (1/u) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ) ),   (16)

and in particular

(d/du) A₁(u) = O( (1/u) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ) ).   (17)

Further, observe that, since Y^ℓ is continuous, ℓ(s) > 0 for s ∈ ]0, T] and ℓ(0) = 0, if Y^ℓ(0) > 0 the event { Y^ℓ(s) ≤ u ℓ(s), ∀s ∈ [0, T] } does not occur for positive u, and if Y^ℓ(0) < 0 the same event occurs if u is large enough. This implies that A₁(u) → 1/2, and so

A₁(u) − 1/2 = − ∫_u^{+∞} (d/dv) A₁(v) dv = O( (1/u) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ) )   (18)

on applying (17).
Step 3. We will now give an equivalent for A₃(u). Introduce the following notations: for t ∈ [0, T],

κ_t(s) = a_t(s) / ( E{(X^t(s))²} )^{1/2} = √( F(t − s) )   for s ∈ [0, T], s ≠ t,

where F is the function of Step 2, built from 1 − r(t − s) and λ₂(1 − r²(t − s)) − (r'(t − s))²; κ_t(t) is defined by continuous extension, recalling that F'(0) = 0. Hence

A₃(u) = (u/(2π)) ∫₀^T B(u, t) dt − (1/√(2π)) ∫₀^T B̃(u, t) dt = S(u) − T(u).   (19)

We will consider in detail the behaviour of the first term as u → +∞.
We apply again Proposition 1.1 to compute the derivative of B(u, t) with respect to u. For t ∈ ]0, T]:

* L^{Z_t}(u, t) is controlled through κ_t(T) P( Z_t^a(s) ≤ u a_t(s), ∀s ∈ [0, T] ) Φ̄( κ_t(T) u ); in the same way,

L^{Z_t}(u, t) ≤ (1/√(2π)) √(F(T − t)) exp( −(u²/2) F(T − t) ),   (20)

and

∫₀^T L^{Z_t}(u, t) dt ≤ C (1/u) exp( −(u²/2) F(0) )   (21)

for some constant C. Define

G_t(x) = F(x − t) + ( F'(x − t) )² / ( 4 F(x − t) E{(Z_t'(x))²} ).   (22)

Check that

min_{x∈[0,T]} G_t(x) = G_t(t) = F(0)   and   G_t'(t) = F'(0) = 0.

Moreover, up to a common normalization,

Z_t(x) = (1/√2)( X''(t) + λ₂ X(t) ) + ((x − t)/3)( X'''(t) + λ₂ X'(t) ) + O((x − t)²).

It follows that

E{ (Z_t'(t))² } = (λ₂λ₆ − λ₄²) / ( 9 λ₂ (λ₄ − λ₂²) ) = F''(0)/F(0).
We also have, for s ≠ t,

κ_t(s) = ( 2/(t − s)² ) [ √(F(t − s)) − √(F(0)) E{ Z_t(t) Z_t(s) } ],

and

κ_t(t) = (1/2) F''(0)/√(F(0)) + √(F(0)) E{ (Z_t'(t))² } = (3/2) F''(0)/√(F(0)) = (λ₂λ₆ − λ₄²) / ( 6 (λ₄ − λ₂²)^{3/2} ) > 0,

where the last inequality is a consequence of the non-degeneracy condition.
Note that, since E{ Z_t(t) Z_t(s) } ≤ 1,

κ_t(s) ≥ ( 2/(t − s)² ) [ √(F(t − s)) − √(F(0)) ]   for s ≠ t,

so that

inf_{s,t∈[0,T]} κ_t(s) > 0.
On the other hand, it is easy to see that κ_t^x(s) is a continuous function of the triplet (x, t, s), and a uniform continuity argument shows that one can find δ > 0 in such a way that if |x − t| ≤ δ then

κ_t^x(s) ≥ c > 0   for all s ∈ [0, T].

Thus, for |x − t| ≤ δ, using the Landau–Shepp–Fernique inequality (see Fernique, 1974),

E( ( Z_t^x(x) − κ_t^x(x) u ) 1_{ { Z_t^x(s) ≤ u κ_t^x(s), ∀s ∈ [0, T] } } ) = − ( κ_t^x(x) u ) (1 + R),   (23)

where the remainder R is uniformly exponentially small. So

S(u) = u √(λ₂/(2π)) T − ∫₀^T dt ∫_u^{+∞} (d/dv) B(v, t) dv,

and, using Lemma 1.1,

S(u) = u √(λ₂/(2π)) T − (1 + o(1)) (2T/3) (…) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ),   (24)

the constant (…) being explicit in terms of λ₂, λ₄ and λ₆.
The second term in (19) can be treated in a similar way, only one should use the full statement of Lemma 3.3 in Azaïs and Wschebor (1999) instead of Proposition 1.1, thus obtaining

T(u) = O(1/u) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ).   (25)

Then (24) together with (25) imply that, as u → +∞,

A₃(u) = u √(λ₂/(2π)) T − (1 + o(1)) (2T/3) (…) exp( −(u²/2) λ₂²/(λ₄ − λ₂²) ).   (26)

Replacing (18) and (26) into (10) and integrating, one obtains (2).
Acknowledgment. The authors thank Professors J-M. Azaïs, P. Carmona and C. Del-
References

Azaïs, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM Probab. Statist., 3, 107-129.

Azaïs, J-M. and Wschebor, M. (1999). On the Regularity of the Distribution of the Maximum of One-parameter Gaussian Processes. Submitted.

Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.

Fernique, X. (1974). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de St. Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.

Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Prob. Appl., 26, 687-705.
Introduction
This course presents the application of the Malevich [15], Cuzick [6] and Berman [4] method for establishing a central limit theorem for non-linear functionals of Gaussian processes (see Section 3). These methods were introduced in the 70s for studying zero crossings of stationary processes or the sojourn time of a stochastic process. We present here mainly their application to the number of roots of random processes. The basic argument is the approximation of the original process by an m-dependent process (see Section 3). Section 2 presents a short reminder on crossings of processes and the calculation of their moments. Our main tools and results are presented in Section 3. Section 4 presents generalizations and applications to some particular processes, in particular random trigonometric polynomials and specular points in sea-wave modeling.
This section contains preliminary results almost without proofs. They can be found for example
in Azaïs and Wschebor [3].
For simplicity all the functions f(t) considered are real and of class C¹. If I is a real interval we will define:

    N_u(f, I) := #{ t ∈ I : f(t) = u }.
Nu (f, I), (Nu for short in case of no ambiguity) is the number of crossings of the level u or the
number of roots of the equation f (t) = u in the interval I. In a similar way, we define the number
of up-crossings or down crossings:
    U_u(f, I) := #{ t ∈ I : f(t) = u, f'(t) > 0 },
    D_u(f, I) := #{ t ∈ I : f(t) = u, f'(t) < 0 }.
Down-crossings will not be considered in the sequel since the results are strictly equivalent to
those for the up-crossings.
We will say that the real-valued function f defined on the interval I = [t1, t2] satisfies hypothesis H1.u if:
f is a function of class C¹;
f(t1) ≠ u, f(t2) ≠ u;
{ t ∈ I : f(t) = u, f'(t) = 0 } = ∅.
Proposition 1 (Kac's counting formula) If f satisfies H1.u, then

    N_u(f, I) = lim_{δ→0} ( 1/(2δ) ) ∫_I 1_{|f(t)−u|<δ} |f'(t)| dt.      (1)
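Kac's counting formula can be illustrated numerically. The sketch below (ours, not from the notes; the helper name kac_count is hypothetical) approximates the right-hand side of (1) by a Riemann sum for f = cos, which satisfies H1.0 on [0, 3π] and has exactly three roots of cos t = 0 there:

```python
import numpy as np

def kac_count(f, fprime, a, b, u, delta, n=200_000):
    # (1/(2*delta)) * integral over [a,b] of 1{|f(t)-u|<delta} |f'(t)| dt,
    # evaluated by a Riemann sum on a uniform grid
    t = np.linspace(a, b, n)
    integrand = (np.abs(f(t) - u) < delta) * np.abs(fprime(t))
    return integrand.mean() * (b - a) / (2 * delta)

# cos crosses the level 0 exactly 3 times on [0, 3*pi],
# cos(0) != 0, cos(3*pi) != 0 and there are no tangencies, so H1.0 holds
approx = kac_count(np.cos, lambda t: -np.sin(t), 0.0, 3 * np.pi, u=0.0, delta=1e-3)
print(approx)  # close to 3
```

Each crossing contributes mass about 2δ to the integral, so the normalization by 1/(2δ) recovers the count.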
The Kac counting formula has a weak version that will be useful.

Proposition 2 (Banach formula) Assume that f is only absolutely continuous. Then for any bounded Borel-measurable function g : R → R, one has:

    ∫_{−∞}^{+∞} N_u(f, I) g(u) du = ∫_I g(f(t)) |f'(t)| dt.      (2)
This formula is a version of the change of variable formula for non one-to-one functions. From these formulae we deduce, by passage to the limit, the Rice formula that gives the factorial moments of the number of (up-)crossings. For simplicity we limit ourselves to the Gaussian case and to the first two moments.
Theorem 3 (Gaussian Rice formula) Let X = {X(t) : t ∈ I}, I a compact interval of the real line, be a Gaussian process having C¹ paths. Suppose that for every point t ∈ I the variance of X(t) does not vanish. Then

    E(N_u) = ∫_I E( |X'(t)| | X(t) = u ) p_{X(t)}(u) dt.      (3)

In particular, if X is stationary with unit variance and λ2 := Var(X'(t)),

    E(N_u) = 2 E(U_u) = ( |I| / π ) √λ2 e^{−u²/2}.      (4)

Then

    E( N_u (N_u − 1) ) = ∫_{I²} E( |X'(s) X'(t)| | X(s) = X(t) = u ) p_{X(s),X(t)}(u, u) ds dt.      (5)
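The stationary Rice formula (4) can be checked by simulation. The sketch below (ours; the choice of process is an assumption made for the test) uses X(t) = a cos t + b sin t with a, b independent standard normals, which is stationary with covariance cos(τ), unit variance and λ2 = 1, and counts level crossings as sign changes on a fine grid:

```python
import numpy as np

rng = np.random.default_rng(0)
T, u = 2 * np.pi, 1.0
t = np.linspace(0.0, T, 1000)

# X(t) = a cos t + b sin t with a, b iid N(0,1): stationary Gaussian,
# unit variance, covariance cos(tau), hence lambda_2 = Var(X') = 1
a = rng.standard_normal(4000)
b = rng.standard_normal(4000)
X = a[:, None] * np.cos(t) + b[:, None] * np.sin(t)

# crossings of the level u counted as sign changes along each path
counts = np.sum(np.diff(np.sign(X - u), axis=1) != 0, axis=1)

rice = (T / np.pi) * 1.0 * np.exp(-u**2 / 2)   # (|I|/pi) sqrt(lambda_2) e^{-u^2/2}
print(counts.mean(), rice)
```

The empirical mean of the number of crossings should agree with the Rice value 2 e^{−1/2} ≈ 1.21 up to Monte Carlo error.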
A very important issue is the finiteness of the second (factorial) moment. For stationary processes a necessary and sufficient condition (in addition to (4)) is given by the Geman condition: let ρ(.) be the covariance of the process and define the function θ(.) by means of

    ρ(τ) := E( X(t) X(t + τ) ) = 1 − λ2 τ²/2 + θ(τ).      (6)

Then E( U_u (U_u − 1) ) < ∞ if and only if ∫_{0+} θ(τ)/τ² dτ < ∞.
Remark that, because of Rolle's theorem, N_u ≤ 2 U_u + 1; thus the proposition above also gives a bound for the variance of the number of crossings.
Our next main tools are chaos expansions and Hermite polynomials. These polynomials are the orthogonal polynomials for the Gaussian measure φ(x)dx, where φ is the standard normal density. The nth Hermite polynomial H_n can be defined by means of the identity:

    exp( t x − t²/2 ) = Σ_{n=0}^∞ H_n(x) tⁿ / n!.
Every F ∈ L²(φ(x)dx) admits the expansion

    F(x) = Σ_{n=0}^∞ a_n H_n(x),   with   a_n = (1/n!) ∫ F(x) H_n(x) φ(x) dx,   ||F||²_2 = Σ_{n=0}^∞ a_n² n!.

The Hermite rank of F is defined as the smallest n such that a_n ≠ 0. For our purpose, we can assume that this rank is greater than or equal to 1.
A useful standard tool to perform computations with Hermite polynomials and Gaussian variables is Mehler's formula, which we state with an extension (see León and Ortega [13]).
Lemma 5 (Generalized Mehler's formula) (a) Let (X, Y) be a centered Gaussian vector with E(X²) = E(Y²) = 1 and ρ = E(XY). Then,

    E( H_j(X) H_k(Y) ) = δ_{j,k} j! ρ^j.

(b) Let (X1, X2, X3, X4) be a centered Gaussian vector with variance matrix

    Σ = ( 1     0     ρ13   ρ14
          0     1     ρ23   ρ24
          ρ13   ρ23   1     0
          ρ14   ρ24   0     1  ).

Then, if r1 + r2 = r3 + r4,

    E( H_{r1}(X1) H_{r2}(X2) H_{r3}(X3) H_{r4}(X4) )
        = Σ_{(d1,d2,d3,d4) ∈ Z} ( r1! r2! r3! r4! / (d1! d2! d3! d4!) ) ρ13^{d1} ρ14^{d2} ρ23^{d3} ρ24^{d4},      (7)

where Z is the set of non-negative integers (d1, d2, d3, d4) with d1 + d2 = r1, d3 + d4 = r2, d1 + d3 = r3, d2 + d4 = r4.
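Part (a) of Mehler's formula is easy to check by Monte Carlo. A quick sketch (ours): with ρ = 0.6, E[H2(X)H2(Y)] should equal 2! ρ² = 0.72 and the cross term E[H1(X)H2(Y)] should vanish.

```python
import numpy as np

def H(n, x):
    # probabilists' Hermite polynomials, vectorized
    h0, h1 = np.ones_like(x), x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

rng = np.random.default_rng(1)
rho, n = 0.6, 1_000_000
X = rng.standard_normal(n)
Y = rho * X + np.sqrt(1 - rho**2) * rng.standard_normal(n)

m22 = np.mean(H(2, X) * H(2, Y))   # Mehler: 2! * rho^2 = 0.72
m12 = np.mean(H(1, X) * H(2, Y))   # orthogonality: 0
print(m22, m12)
```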
Wiener chaos
Let L²(Ω, A, P) be the space of square integrable variables generated by the process X(t), t ∈ R. This Hilbert space is the orthogonal sum of the Wiener chaoses H_p of order p, p = 0, 1, ...; H_p is defined as the closed linear subspace of L²(Ω, A, P) generated by the variables H_p(X(t)), t ∈ R. In particular the space H1 is simply the Gaussian space associated to X(t). A good reference on this subject is the book of Nualart [17].
3.1
Consider functionals of the form

    T_t := (1/t) ∫_0^t F(X(s)) ds.      (8)
An example of such a 1-dependent process is the Slepian process, which is stationary with covariance ρ(t) = (1 − |t|)⁺.
Theorem 7 (Hoeffding and Robbins [7]) With the notations and hypotheses above, if the process X(t) is m-dependent, then

    √t ( (1/t) ∫_0^t F(X(s)) ds − E F(X(0)) ) → N(0, σ²)   in distribution as t → +∞,

where

    σ² = ∫_{−m}^{m} Cov( F(X(0)), F(X(s)) ) ds.
The proof is easy by a blocking argument: we cut [0, T] into long blocks separated by gaps of length m, which makes the blocks independent.
Our aim is to extend this result to processes which are not m-dependent. The proof we present follows Berman [4] with a generalization, due to Kratz and León [10], to functions F in (8) having a Hermite rank not necessarily equal to 1. For ε > 0, we will approximate the given process X(t) by a new one X^ε(t) which is 1/ε-dependent and estimate the error. As an additional hypothesis, we will assume that the process X(t) has a spectral density f(λ).
It has the following spectral representation:
    X(t) = ∫_0^{+∞} cos(λt) √(2 f(λ)) dW1(λ) + ∫_0^{+∞} sin(λt) √(2 f(λ)) dW2(λ),      (9)

where W1 and W2 are two independent Wiener processes (Brownian motions). Indeed, using isometry properties of the stochastic integral, it is easy to see that the process given by (9) is centered, Gaussian and with the right covariance:

    ρ(t) = 2 ∫_0^{+∞} cos(λt) f(λ) dλ.
Define now the function φ(.) as the convolution 1_{[−1/2,1/2]} * 1_{[−1/2,1/2]}. This function is even, non-negative, φ(0) = 1, has support included in [−1, 1] and a non-negative Fourier transform. Set φ_ε(.) := φ(ε .) and let ψ_ε be its Fourier transform. Define

    X^ε(t) := ∫_0^{+∞} cos(λt) √( 2 (f * ψ_ε)(λ) ) dW1(λ) + ∫_0^{+∞} sin(λt) √( 2 (f * ψ_ε)(λ) ) dW2(λ),      (10)

where the convolution must be understood after prolonging f as an even function on R. The covariance function of X^ε(t) satisfies ρ^ε(t) = ρ(t) φ_ε(t). This implies that the process X^ε(t) is 1/ε-dependent. We have the following proposition:
Proposition 8 Let X be a centered stationary Gaussian process with spectral density f(λ) and covariance function ρ with ρ^ℓ ∈ L¹(R), ℓ a positive integer. Let X^ε(t) be defined by (10). Then

    lim_{ε→0} lim_{t→∞} E[ (1/√t) ∫_0^t ( H_ℓ(X(s)) − H_ℓ(X^ε(s)) ) ds ]² = 0.      (11)
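The key point in the construction above is that the convolution of two unit indicators is the triangle function, so that multiplying the covariance by the scaled window kills all dependence beyond lag 1/ε. A minimal numerical sketch (ours; the helper name tri is hypothetical) checking that 1_{[−1/2,1/2]} * 1_{[−1/2,1/2]} equals (1 − |t|)⁺:

```python
import numpy as np

# phi = 1_[-1/2,1/2] * 1_[-1/2,1/2], evaluated by a Riemann sum;
# the result is the triangle function (1 - |t|)^+, supported on [-1, 1]
ds = 1e-3
s = np.arange(-0.5, 0.5, ds)

def tri(t):
    return np.sum(np.abs(t - s) <= 0.5) * ds

for t in (0.0, 0.3, -0.7, 1.2):
    print(t, tri(t), max(1.0 - abs(t), 0.0))
```

Since the triangle vanishes outside [−1, 1], the product ρ(t) φ(εt) vanishes for |t| ≥ 1/ε, which is exactly the 1/ε-dependence used in Proposition 8.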
Theorem 9 Under the hypotheses above, with F of Hermite rank ℓ,

    √t T_t → N(0, σ²(F))   in distribution as t → +∞,

where

    σ²(F) := 2 Σ_{k=ℓ}^∞ a_k² k! ∫_0^{+∞} ρ^k(s) ds.
Proof: Define F_M := Σ_{n=ℓ}^M a_n H_n and, given η > 0, choose M such that

    Σ_{k=M+1}^∞ a_k² k! < η.

Denoting by T_t^M the functional (8) associated with F_M, we have

    t Var( T_t − T_t^M ) = 2 Σ_{k=M+1}^∞ a_k² k! ∫_0^t (1 − s/t) ρ^k(s) ds ≤ 2 Σ_{k=M+1}^∞ a_k² k! ∫_0^{+∞} |ρ^ℓ(s)| ds.

Since η is arbitrary, we only need to prove the asymptotic normality for T_t^M. Let us introduce

    T_t^{M,ε} := (1/t) ∫_0^t F_M(X^ε(s)) ds,

where X^ε(t) has been defined in (10). By Proposition 8, recalling that for k ≥ ℓ, ρ^k is in L¹(R) since ρ^ℓ is, we obtain:

    lim_{ε→0} lim_{t→∞} t Var( T_t^M − T_t^{M,ε} ) = 0.

Now Theorem 7 for m-dependent processes implies that √t T_t^{M,ε} is asymptotically normal. Notice that

    σ²_{M,ε} := lim_{t→∞} t Var( T_t^{M,ε} ) = 2 Σ_{k=ℓ}^M a_k² k! ∫_0^{1/ε} (ρ^ε)^k(s) ds.

Letting ε → 0 and then M → ∞, these variances converge to σ²(F), which completes the proof.
3.2
Our aim is to extend the result above to crossings. Let X(t) be a centered stationary Gaussian process. With no loss of generality for our purposes, we assume that ρ(0) = −ρ''(0) = 1 and |ρ(t)| < 1 for t ≠ 0. We also assume Geman's Condition (6):

    ρ(t) = 1 − t²/2 + θ(t)   with   ∫ θ(t)/t² dt convergent at 0+.
Consider the Hermite expansions

    x⁺ = Σ_{k=0}^∞ a_k H_k(x),   x⁻ = Σ_{k=0}^∞ b_k H_k(x),   |x| = Σ_{k=0}^∞ c_k H_k(x).      (12)

For k ≥ 2,

    a_k = (1/k!) ∫_0^{+∞} x H_k(x) φ(x) dx = ( φ(0)/k! ) H_{k−2}(0).

The classical properties of Hermite polynomials easily imply that for positive k:

    a_{2k+1} = b_{2k+1} = c_{2k+1} = 0,
    a_{2k} = b_{2k} = (−1)^{k+1} / ( √(2π) 2^k k! (2k − 1) ),
    c_{2k} = 2 a_{2k}.
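The closed form for a_{2k} can be checked against the defining integral by simple numerical quadrature. A sketch (ours) comparing (1/n!) ∫_0^∞ x H_n(x) φ(x) dx with the stated formula for n = 2, 4, 6:

```python
import numpy as np
from math import factorial, sqrt, pi

def H(n, x):
    # probabilists' Hermite polynomials, H_{n+1} = x H_n - n H_{n-1}
    h0, h1 = np.ones_like(x), x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

x = np.linspace(0.0, 12.0, 400_001)
dx = x[1] - x[0]
phi = np.exp(-x**2 / 2) / sqrt(2 * pi)

nums, closeds = [], []
for k in (1, 2, 3):
    n = 2 * k
    # a_n = (1/n!) \int_0^inf x H_n(x) phi(x) dx, coefficient of x^+
    a_num = np.sum(x * H(n, x) * phi) * dx / factorial(n)
    a_closed = (-1) ** (k + 1) / (sqrt(2 * pi) * 2**k * factorial(k) * (2 * k - 1))
    nums.append(a_num)
    closeds.append(a_closed)
    print(n, a_num, a_closed)
```

For instance a_2 = 1/(2√(2π)) ≈ 0.1995, and the quadrature reproduces the alternating signs for higher k.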
We have the following Hermite expansion for the number of up-crossings:

Theorem 10 Under the conditions above,

    U_u := U_u(X, [0, T]) = Σ_{j=0}^∞ Σ_{k=0}^∞ d_j(u) a_k ∫_0^T H_j(X(s)) H_k(X'(s)) ds,

where d_j(u) = (1/j!) φ(u) H_j(u) and a_k is defined by (12). We have similar results, replacing a_k by b_k or c_k, for the number D_u([0, T]) of down-crossings and for the total number of crossings N_u([0, T]).
Proof: Let g(.) ∈ L²(φ(x)dx) and define the functional

    T_g⁺(t) := ∫_0^t g(X(s)) X'(s)⁺ ds.

Expanding both factors,

    T_g⁺(t) = Σ_{j=0}^∞ Σ_{k=0}^∞ g_j a_k ∫_0^t H_j(X(s)) H_k(X'(s)) ds,      (13)

where the g_j's are the coefficients of the Hermite expansion of g. Using that for each s, X(s) and X'(s) are independent, we get:

    E[ ∫_0^t ( g(X(s)) X'(s)⁺ − Σ_{j,k ≥ 0: j+k ≤ Q} g_j a_k H_j(X(s)) H_k(X'(s)) ) ds ]² ≤ (const) t² ε(Q),

where ε(Q) → 0 as Q → ∞.
By Fatou's Lemma,

    E( (U_u)² ) ≤ lim inf_{δ→0} E( (U_u^δ)² ).

To obtain an inequality in the opposite sense, we use the Banach formula (Proposition 2). Notice that this formula remains valid if one replaces, in the left-hand side, the total number of crossings by the up-crossings and, in the right-hand side, |f'(t)| by f'(t)⁺. So, applying it to the random path X(.), we see that:

    U_u^δ := ( 1/(2δ) ) ∫_{u−δ}^{u+δ} U_x dx = ( 1/(2δ) ) ∫_0^T 1_{|X(s)−u|<δ} X'(s)⁺ ds.

By the Cauchy-Schwarz inequality,

    E( (U_u^δ)² ) ≤ ( 1/(2δ) ) ∫_{u−δ}^{u+δ} E( (U_x)² ) dx → E( (U_u)² )   as δ → 0.

So, lim sup_{δ→0} E( (U_u^δ)² ) ≤ E( (U_u)² ) and, since the random variables involved are non-negative, a standard argument of passage to the limit based upon Fatou's Lemma shows that U_u^δ → U_u in L².
We now apply (13) to U_u^δ:

    U_u^δ = Σ_{j,k=0}^∞ d_j^δ(u) a_k J_{jk},      (15)

where

    d_j^δ(u) := (1/j!) ( 1/(2δ) ) ∫_{u−δ}^{u+δ} φ(x) H_j(x) dx   and   J_{jk} := ∫_0^T H_j(X(s)) H_k(X'(s)) ds.

Notice that

    d_j^δ(u) → (1/j!) φ(u) H_j(u) = d_j(u)   as δ → 0.      (16)

This implies that:

    U_u = Σ_{q=0}^∞ Σ_{j+k=q} d_j(u) a_k J_{jk}.      (17)
Theorem 11 Let {X(t) : t ∈ R} be a centered stationary Gaussian process verifying the conditions at the beginning of this subsection. Furthermore, let us assume that:

    ∫_0^{+∞} |ρ(t)| dt,  ∫_0^{+∞} |ρ'(t)| dt,  ∫_0^{+∞} |ρ''(t)| dt < ∞,      (18)

and let g ∈ L²(φ(x)dx) with Σ g_k² k! < ∞. Put:

    F_t := (1/√t) Σ_{j,k=0}^∞ g_j a_k ∫_0^t H_j(X(s)) H_k(X'(s)) ds.

Then F_t → N(0, σ²) in distribution as t → +∞, where

    0 < σ² = Σ_{q=1}^∞ σ²(q) < ∞,

and

    σ²(q) := 2 Σ_{k=0}^q Σ_{k'=0}^q a_k a_{k'} g_{q−k} g_{q−k'} ∫_0^{+∞} E[ H_{q−k}(X(0)) H_k(X'(0)) H_{q−k'}(X(s)) H_{k'}(X'(s)) ] ds.

The integrand in the right-hand side of this formula can be computed using Lemma 5. Similar results exist, mutatis mutandis, for the sequences {b_k} and {c_k}.
A consequence is

Corollary 12 If the process X(t) satisfies the conditions of Theorem 11 then, as T → +∞,

    (1/√T) ( U_u[0, T] − T e^{−u²/2}/(2π) ) → N(0, σ1²)   in distribution,

    (1/√T) ( N_u[0, T] − T e^{−u²/2}/π ) → N(0, σ2²)   in distribution.      (19)
Step 1. In this step we prove that one can choose Q large enough (not depending on t) so that F_t can be replaced, with an arbitrarily small error in the L² sense, by its components in the first Q chaoses:

    F_t^Q := (1/√t) Σ_{q=0}^Q G_t^q,   with   G_t^q := Σ_{k=0}^q g_{q−k} a_k ∫_0^t H_{q−k}(X(s)) H_k(X'(s)) ds.

Let us consider

    (1/t) E( (G_t^q)² ) = (1/t) Σ_{k,k'=0}^q g_{q−k} a_k g_{q−k'} a_{k'} ∫_0^t ∫_0^t E[ H_{q−k}(X(t1)) H_k(X'(t1)) H_{q−k'}(X(t2)) H_{k'}(X'(t2)) ] dt1 dt2.      (20)
Applying Lemma 5, the expectation in (20) is a sum, over (d1, d2, d3, d4) ∈ Z, of products of ρ, ρ' and ρ'' evaluated at t2 − t1, with coefficients

    (q−k)! k! (q−k')! k'! / ( d1! d2! d3! d4! ).      (21)

Remarking that sup_d k!/(d!(k−d)!) ≤ 2^k, the sum of the coefficients in (21) is bounded above by 2^q (k')! (q−k')! or by 2^q k! (q−k)!, depending on the way we group terms. As a consequence it is also bounded above by 2^q √( k! (q−k)! (k')! (q−k')! ), so that, using (18),

    (1/t) E( (G_t^q)² ) ≤ (const) Σ_{k,k'=0}^q √( k! (q−k)! (k')! (q−k')! ) |g_{q−k} a_k| |g_{q−k'} a_{k'}|.      (22)

In particular,

    E( (G_{2T0}^q)² ) ≤ (const) T0² Σ_{k=0}^q (q−k)! k! g_{q−k}² a_k².      (23)
Finally,

    (1/t) E( (G_t^q)² ) ≤ (const) Σ_{k=0}^q (q−k)! k! g_{q−k}² a_k²,

which is the general term of a convergent series. This also proves that σ² is finite.
Step 2. Let us prove that σ² > 0. It is sufficient to prove that σ²(2) > 0. Recall that a1 = 0, so that

    σ²(2) = 2 [ g2² a0² ∫_0^{+∞} E( H2(X(0)) H2(X(s)) ) ds + a2² g0² ∫_0^{+∞} E( H2(X'(0)) H2(X'(s)) ) ds
              + 2 a0 g2 a2 g0 ∫_0^{+∞} E( H2(X(0)) H2(X'(s)) ) ds ]
          = 4 ∫_0^{+∞} [ g2² a0² ρ(s)² + a2² g0² ρ''(s)² + 2 a0 g2 a2 g0 ρ'(s)² ] ds
          = 4 ∫_0^{+∞} ( a0 g2 ρ(s) − a2 g0 ρ''(s) )² ds > 0,

where we used Lemma 5 and an integration by parts ( ∫ ρ'² = −∫ ρ ρ'' ).
Step 3. We now define φ(.) = K ( 1_{[−1/4,1/4]} * 1_{[−1/4,1/4]} * 1_{[−1/4,1/4]} * 1_{[−1/4,1/4]} ), where the constant K is chosen such that φ(0) = 1. Then we define X^ε(t) using (10). The new definition of φ(.) ensures now that X^ε(t) is differentiable. Define

    F_t^{Q,ε} := (1/√t) Σ_{q=0}^Q G_t^{q,ε},   with   G_t^{q,ε} := Σ_{k=0}^q g_{q−k} a_k ∫_0^t H_{q−k}(X^ε(s)) H_k((X^ε)'(s)) ds.
To compare F_t^Q with F_t^{Q,ε} we must study, for each q and each pair (k, k'), quantities of the form

    (2/t) ∫_0^t ( (t − s)/t ) E[ H_{q−k}(Y1(0)) H_k(Y1'(0)) H_{q−k}(Y2(s)) H_k(Y2'(s)) ] ds,      (27)

where the processes Y1(t) and Y2(t) are chosen among {X(t), X^ε(t)}. It suffices to prove that all these terms have the same limit, as t → +∞ and then ε → 0, whatever the choice is.
Applying Lemma 5, the expectation in (27) is equal to

    ∫_0^t ( (t − s)/t ) Σ_{(d1,...,d4) ∈ Z} ( (q−k)!² k!² / (d1! d2! d3! d4!) ) (γ(s))^{d1} (γ'(s))^{d2} (γ'(s))^{d3} (γ''(s))^{d4} ds,

where γ(.) is the covariance function between the processes Y1 and Y2 and Z is defined as in Lemma 5. Again, since the number of terms in Z is finite, it suffices to prove that

    lim_{ε→0} lim_{t→∞} ∫_0^t ( (t − s)/t ) (γ(s))^{d1} (γ'(s))^{d2+d3} (γ''(s))^{d4} ds,

where (d1, ..., d4) is chosen in Z, does not depend on the way to choose Y1 and Y2. γ is the Fourier transform of (say) g(λ), which is taken among f(λ), (f * ψ_ε)(λ) or √( f(λ) (f * ψ_ε)(λ) ). Define ĝ'(λ) := iλ g(λ) and ĝ''(λ) := −λ² g(λ). Then (γ(s))^{d1} (γ'(s))^{d2+d3} (γ''(s))^{d4} is the Fourier transform of the function

    h(λ) = g^{*d1} * (ĝ')^{*(d2+d3)} * (ĝ'')^{*d4} (λ).

The continuity and boundedness of f imply that all the functions above are bounded and continuous. The Fubini theorem shows that

    ∫_0^t ( (t − s)/t ) (γ(s))^{d1} (γ'(s))^{d2+d3} (γ''(s))^{d4} ds = (const) ∫ ( (1 − cos τ)/τ² ) h(τ/t) dτ → (const) h(0) ∫ ( (1 − cos τ)/τ² ) dτ,

and the limit of h(0), as ε → 0, does not depend on the choice of Y1 and Y2.
Recall from Theorem 10 that the coefficients of the expansion are

    d_j(u) = (1/j!) φ(u) H_j(u).      (28)

First, considering the bound given by the right-hand side of (22), we can improve it by reintroducing the factor q² 2^{−q} that had been bounded by 1. We get that, in its new expression, this right-hand side is bounded by

    (const) q² 2^{−q} Σ_{k=0}^q (q−k)! k! g_{q−k}² a_k² ≤ (const) q² 2^{−q} Σ_{k=0}^q a_k² k! ≤ (const) q² 2^{−q}.

Second, we have to replace the bound (23). Since the series in (17) is convergent, E( (G_{2T0}^q)² ) is the term of a convergent series and this is enough to conclude.
Mourareau [?] has extended the result of Corollary 12 to the case of a moving level u_T.
Theorem 13 Let u_T be a moving level that tends to infinity with T. Suppose that, instead of (18), we assume only

    lim sup_{T→∞} u_T² ∫_0^{+∞} |ρ^{(i)}(t)| dt / E(U_{u_T}) < ∞,   i = 0, 1, 2.

Then

    ( U_{u_T}(T) − T μ(u_T) ) / √( T μ(u_T) ) → N(0, 1)   in distribution,

where μ(u) := e^{−u²/2}/(2π) is the expected number of up-crossings per unit time.
The main point is that the conditions on the process are weaker. In particular, processes with long range dependence satisfy the conditions above as soon as the level increases sufficiently rapidly. The second point is that the variance is now simple and explicit, and it corresponds to the Poissonian limit (the variance is equal to the expectation) known from the Volkonskii-Rozanov theorem.
Theorem 14 Assume the conditions of Theorem 11 except (18), which is now replaced by the very weak Berman condition

    ρ(τ) log(τ) → 0   as τ → ∞.

Let u_T be a moving level such that E( U_{u_T} ) → c, where c is some constant. Then U_{u_T} converges to a Poisson distribution with parameter c.
This is a simplified version, the full one establishes a functional convergence of the point process
itself.
4.1
Consider first the process with the cardinal sine covariance

    ρ(t) = sin(t)/t.

Since this covariance is not summable in the Lebesgue sense, it does not satisfy strictly the conditions of Corollary 12. But in fact the integral

    ∫_R ρ(t) dt

can be defined by passage to the limit, and it can be checked that the result holds true.
Consider now the random trigonometric polynomials

    Y_N(t) := (1/√N) Σ_{n=1}^N ( a_n cos(nt) + b_n sin(nt) ),

with a_n, b_n independent standard Gaussian random variables. Their covariance is

    (1/N) Σ_{n=1}^N cos(nτ) = sin(Nτ/2) cos((N+1)τ/2) / ( N sin(τ/2) ).      (29)
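The closed form in (29) is the classical Dirichlet-kernel computation. A quick numerical check (ours) of the underlying identity Σ_{n=1}^N cos(nθ) = sin(Nθ/2) cos((N+1)θ/2) / sin(θ/2):

```python
import numpy as np

theta = 0.737   # arbitrary point with sin(theta/2) != 0
N = 25
lhs = sum(np.cos(n * theta) for n in range(1, N + 1))
rhs = np.sin(N * theta / 2) * np.cos((N + 1) * theta / 2) / np.sin(theta / 2)
print(lhs, rhs)
```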
The following central limit theorems hold:

1.  (1/√N) ( N^{Y_N}_{[0,Nπ]}(u) − E( N^{Y_N}_{[0,Nπ]}(u) ) ) ⇒ N( 0, (1/3) u² φ²(u) + Σ_{q=2}^∞ σ_q²(u) ),

2.  (1/√(2N)) ( N^{Y_N}_{[0,2Nπ]}(u) − E( N^{Y_N}_{[0,2Nπ]}(u) ) ) ⇒ N( 0, (1/3) u² φ²(u) + Σ_{q=2}^∞ σ_q²(u) ),

where ⇒ is the convergence in distribution as N → ∞ and σ_q²(u) is the variance of the part in the qth chaos.
4.2
Specular points
A different case of central limit theorem is given by the number of specular points. These are points of the surface of the sea that appear bright on a photo. We use a cylinder model: time is fixed; the variation of the elevation of the sea W(x) as a function of the space variable x is modeled by a smooth stationary Gaussian process; as a function of the second space variable y, the elevation of the sea is supposed to be constant. Suppose that a source of light is located at (0, h1) and that an observer is located at (0, h2), where h1 and h2 are big with respect to W(x) and x. Only the variable x has to be taken into account, and the following approximation was introduced long ago by Longuet-Higgins [14]: the point x is a specular point if

    W'(x) = k x,   with   k := (1/2) ( 1/h1 + 1/h2 ).
This is a non-stationary case: there are more specular points underneath the observer. In particular, if SP(I) is the number of specular points contained in the interval I,

    E( SP(I) ) = ∫_I G( k, √λ4 ) (1/√λ2) φ( kx/√λ2 ) dx,      (30)

where λ2, λ4 are the spectral moments of order 2 and 4 respectively, which are assumed to be finite, and G(μ, σ) := E(|Z|), Z with distribution N(μ, σ²).
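The function G in (30) is the mean of a folded normal distribution, for which a classical closed form exists: G(μ, σ) = σ √(2/π) e^{−μ²/(2σ²)} + μ (2Φ(μ/σ) − 1). A Monte Carlo check (ours; the parameter values are arbitrary):

```python
import numpy as np
from math import erf, exp, sqrt, pi

def G(mu, sigma):
    # E|Z| for Z ~ N(mu, sigma^2): folded-normal mean
    Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))
    return sigma * sqrt(2 / pi) * exp(-mu**2 / (2 * sigma**2)) \
           + mu * (2 * Phi(mu / sigma) - 1)

rng = np.random.default_rng(2)
mu, sigma = 0.8, 1.7
Z = mu + sigma * rng.standard_normal(2_000_000)
mc = np.abs(Z).mean()
print(mc, G(mu, sigma))
```

Note that G(0, σ) = σ √(2/π), which is the value entering the asymptotics of E(SP) below.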
An easy consequence of that formula is that

    E(SP) := E( SP(R) ) = G(k, √λ4) / k ≃ √( 2 λ4 / π ) · (1/k),   as k tends to 0.

As a consequence the number of specular points is almost surely finite, and the Central Limit Theorem may only happen in the case where k → 0, i.e. when the locations of the observer and the source of light are infinitely far from the surface of the sea. The central limit theorem is now established using Lyapounov-type conditions for a Lindeberg-type Central Limit Theorem for triangular arrays.
Theorem 16 Under some conditions (see Azaïs, León and Wschebor [2] for details), as k → 0,

    ( SP − √(2 λ4/π) / k ) / √( θ/k ) → N(0, 1)   in distribution,

for some positive constant θ.
References
[1] Azaïs, J-M. and León, J. CLT for crossings of random trigonometric polynomials. Electronic Journal of Probability, 18 (2013), paper 68.
[2] Azaïs, J-M., León, J. and Wschebor, M. Rice formulae and Gaussian waves. Bernoulli, 17(1) (2011), 170-193.
[3] Azaïs, J-M. and Wschebor, M. Level Sets and Extrema of Random Processes and Fields. Wiley (2009).
[4] Berman, S.M. Occupation times for stationary Gaussian processes. J. Applied Probability, 7 (1970), 721-733.
[5] Geman, D. (1972). On the variance of the number of zeros of a stationary Gaussian process. Ann. Math. Statist., 43(3), 977-982.
[6] Cuzick, J. A central limit theorem for the number of zeros of a stationary Gaussian process. The Annals of Probability, 4(4) (1976), 547-556.
[7] Hoeffding, W. and Robbins, H. (1948). The central limit theorem for dependent random variables. Duke Math. J., 15, 773-780.
[8] Granville, A. and Wigman, I. The distribution of the zeros of random trigonometric polynomials. American Journal of Mathematics, 133 (2011), 295-357.
[9] Kratz, M.F. (2006). Level crossings and other level functionals of stationary Gaussian processes. Probability Surveys, 3, 230-288.
[10] Kratz, M. and León, J.R. Hermite polynomial expansion for non-smooth functionals of stationary Gaussian processes: crossings and extremes. Stoch. Proc. Applic., 66 (1997), 237-252.
[11] Kratz, M. and León, J.R. Central limit theorems for level functionals of stationary Gaussian processes and fields. Journal of Theoretical Probability, 14(3) (2001).
[12] León, J. (2006). A note on the Breuer-Major CLT for non-linear functionals of continuous time stationary Gaussian processes. Preprint.
[13] León, J. and Ortega, J. (1989). Weak convergence of different types of variation for biparametric Gaussian processes. Colloquia Math. Soc. J. Bolyai, 57, Limit Theorems in Probability and Statistics, Pécs.
[14] Longuet-Higgins, M.S. Reflection and refraction at a random surface. I, II, III. Journal of the Optical Society of America, 50(9) (1960), 838-856.
[15] Malevich, T.L. Asymptotic normality of the number of crossings of the level zero by a Gaussian process. Theor. Probability Appl., 14 (1969), 287-295.
[16] Central limit theorem for the number of crossings of an increasing level. Unpublished manuscript.
[17] Nualart, D. The Malliavin Calculus and Related Topics. Springer-Verlag (2006).
    X_{t1}, ..., X_{tn}, X'_{t1}, ..., X'_{tn}, ..., X^{(k)}_{t1}, ..., X^{(k)}_{tn}

is non degenerate. We denote by m(t) and r(s, t) the mean and covariance functions of X, that is m(t) := E(X_t), r(s, t) := E( (X_s − m(s)) (X_t − m(t)) ), and by r_{ij} := ∂^{i+j} r / ∂s^i ∂t^j (i, j = 0, 1, ...) the partial derivatives of r, whenever they exist.
Our main results are the following:
Theorem 1.1. Let X = {X_t : t ∈ [0, 1]} be a stochastic process satisfying H_{2k}. Denote by F(u) = P(M ≤ u) the distribution function of M. Then, F is of class C^k and its successive derivatives can be computed by repeated application of Lemma 3.3.

Corollary 1.1. Let X be a stochastic process verifying H_{2k} and assume also that E(X_t) = 0 and Var(X_t) = 1. Then, as u → +∞, F^{(k)}(u) is equivalent to

    (−1)^{k−1} (2π)^{−1} L u^k e^{−u²/2}.      (1)
2. Crossings
Our methods are based on well-known formulae for the moments of crossings of the paths of stochastic processes with fixed levels, that have been obtained by a variety of authors, starting from the fundamental work of S.O. Rice (1944-1945). In this section we review without proofs some of these and related results. Let f : I → IR be a function defined on the interval I of the real numbers,

    C_u(f; I) := { t ∈ I : f(t) = u },   N_u(f; I) := #( C_u(f; I) )

denote respectively the set of roots of the equation f(t) = u on the interval I and the number of these roots, with the convention N_u(f; I) = +∞ if the set C_u is infinite. N_u(f; I) is called the number of crossings of f with the level u on the interval I.
In the same way, if f is a differentiable function, the numbers of upcrossings and downcrossings of f are defined by means of

    U_u(f; I) := #( { t ∈ I : f(t) = u, f'(t) > 0 } ),
    D_u(f; I) := #( { t ∈ I : f(t) = u, f'(t) < 0 } ).
For a more general definition of these quantities see Cramer and Leadbetter (1967).
In what follows, ||f||_p is the norm of f in L^p(I, λ), 1 ≤ p ≤ +∞, λ denoting the Lebesgue measure. The joint density of the finite set of real-valued random variables X1, ..., Xn at the point (x1, ..., xn) will be denoted p_{X1,...,Xn}(x1, ..., xn) whenever it exists. φ(t) := (2π)^{−1/2} exp(−t²/2) is the density of the standard normal distribution, Φ(t) := ∫_{−∞}^t φ(u) du its distribution function.
The following proposition (sometimes called Kac's formula) is a common tool to count crossings.
Proposition 2.1. Let f : I = [a, b] → IR be of class C¹, f(a), f(b) ≠ u. If f does not have local extrema with value u on the interval I, then

    N_u(f; I) = lim_{δ→0} ( 1/(2δ) ) ∫_I 1_{|f(t)−u|<δ} |f'(t)| dt.

For k an integer, k ≥ 1, define

    A_{t1,...,tk}(x1, ..., xk) := ∫_{IR^k} Π_{j=1}^k |x'_j| p_{X_{t1},...,X_{tk},X'_{t1},...,X'_{tk}}(x1, ..., xk, x'_1, ..., x'_k) dx'_1 ... dx'_k,

where it is understood that the density in the integrand of the definition of A_{t1,...,tk}(x1, ..., xk) exists almost everywhere and that the integrals above can take the value +∞.
Proposition 2.2. With the notations above, suppose that:
1. the density p_{X_{t1},...,X_{tk},X'_{s1},...,X'_{sk}} exists for (t1, ..., tk), (s1, ..., sk) ∈ I^k \ D_k(I) and is a continuous function of (t1, ..., tk) and of x1, ..., xk at the point (u, ..., u);
2. the function (t1, ..., tk, x1, ..., xk) → A_{t1,...,tk}(x1, ..., xk) is continuous for (t1, ..., tk) ∈ I^k \ D_k(I) and x1, ..., xk belonging to a neighbourhood of u;
3. (additional technical condition).
Then

    E[ N_u (N_u − 1) ... (N_u − k + 1) ] = ∫_{I^k} A_{t1,...,tk}(u, ..., u) dt1 ... dtk.      (2)

(a) For k = 1 this is the Rice formula:

    E[ N_u(X; I) ] = ∫_I E( |X'_t| | X_t = u ) p_{X_t}(u) dt.      (3)
(b) Simple variations of (3), valid under the same hypotheses, are:

    E[ U_u(X; I) ] = ∫_I E( (X'_t)⁺ | X_t = u ) p_{X_t}(u) dt,      (4)

    E[ D_u(X; I) ] = ∫_I E( (X'_t)⁻ | X_t = u ) p_{X_t}(u) dt.      (5)
In the same way one can obtain formulae for the factorial moments of marked crossings, that is, crossings such that some additional condition holds true. For example, if Y = {Y_t : t ∈ IR} is some other stochastic process with real values such that for every t, (Y_t, X_t, X'_t) admit a joint density, −∞ ≤ a < b ≤ +∞ and

    N_u^{a,b}(X, I) := #{ t : t ∈ I, X_t = u, a < Y_t < b },

then

    E[ N_u^{a,b}(X; I) ] = ∫_a^b dy ∫_I E( |X'_t| | X_t = u, Y_t = y ) p_{X_t,Y_t}(u, y) dt.      (6)

In particular, if M⁺_{a,b} is the number of strict local maxima of X(.) on the interval I such that the value of X(.) lies in the interval (a, b), then M⁺_{a,b} = D_0^{a,b}(X', I) and:

    E[ M⁺_{a,b} ] = ∫_a^b dy ∫_I E( (X''_t)⁻ | X'_t = 0, X_t = y ) p_{X'_t,X_t}(0, y) dt.      (7)
Sufficient conditions for the validity of (6) and (7) are similar to condition 3 of Proposition 2.2.
(c) Proofs of (2) for Gaussian processes satisfying certain conditions can be found in Belayev (1966) and Cramer-Leadbetter (1967). Marcus (1977) contains various extensions. The present statement of Proposition 2.2 is from Wschebor (1985).

(d) It may be non trivial to verify the hypotheses of Proposition 2.2. However, some general criteria are available. For example, if X is a Gaussian process with C¹ paths and the densities

    p_{X_{t1},...,X_{tk},X'_{s1},...,X'_{sk}}

are non-degenerate for (t1, ..., tk), (s1, ..., sk) ∈ I^k \ D_k, then conditions 1, 2, 3 of Proposition 2.2 hold true (cf. Wschebor, 1985, p. 37 for a proof and also for some manageable sufficient conditions in non-Gaussian cases).
(e) Another point related to Rice formulae is the non-existence of local extrema at a given level. We mention here two well-known results:

Proposition 2.3 (Bulinskaya, 1961). Suppose that X has C¹ paths and that for every t ∈ I, X_t has a density p_{X_t}(x) bounded for x in a neighbourhood of u. Then, almost surely, X has no tangencies at the level u, in the sense that if

    T_u^X := { t ∈ I : X_t = u, X'_t = 0 },

then P( T_u^X = ∅ ) = 1.
Proposition 2.4 (Ylvisaker's Theorem, 1968). Suppose that {X_t : t ∈ T} is a real-valued Gaussian process with continuous paths, defined on a compact separable topological space T, and that Var(X_t) > 0 for every t ∈ T. Then, for each u ∈ IR, with probability 1, the function t → X_t does not have any local extrema with value u.
3. Proofs and related results
Let ξ be a random variable with values in IR^k with a distribution that admits a density with respect to the Lebesgue measure λ. The density will be denoted by p_ξ(.). Further, suppose E is an event. It is clear that the measure

    μ(B; E) := P( {ξ ∈ B} ∩ E ),

defined on the Borel sets B of IR^k, is also absolutely continuous with respect to λ. We will call the density of ξ related to E the Radon-Nikodym derivative:

    p_ξ(x; E) := ( dμ(.; E)/dλ )(x).
Theorem 3.1. Suppose that X has C² paths, that X_t, X'_t, X''_t admit a joint density for every t, that for every t, X_t has a bounded density p_{X_t}(.), and that the function

    I(x, z) := ∫_0^1 E( (X''_t)⁻ | X_t = x, X'_t = z ) p_{X_t,X'_t}(x, z) dt

is uniformly continuous in z for (x, z) in some neighbourhood of (u, 0). Then the distribution of M admits a density p_M(.) satisfying a.e.

    p_M(u) ≤ p_{X_0}(u; X'_0 < 0) + p_{X_1}(u; X'_1 > 0) + ∫_0^1 E( (X''_t)⁻ | X_t = u, X'_t = 0 ) p_{X_t,X'_t}(u, 0) dt.      (8)
Using Proposition 2.3, with probability 1, X'(.) has no tangencies at the level 0, thus an upper bound for the expectation of the number of local maxima with value in (u − h, u] follows from Kac's formula:

    M⁺_{u−h,u} = lim_{δ→0} ( 1/(2δ) ) ∫_0^1 1_{u−h<X_t≤u} 1_{|X'_t|<δ} 1_{X''_t<0} |X''_t| dt   a.s.

Taking expectations and using the uniform continuity of I(x, z) in z:

    E( M⁺_{u−h,u} ) = lim_{δ→0} ( 1/(2δ) ) ∫_{−δ}^{δ} dz ∫_{u−h}^u I(x, z) dx = ∫_{u−h}^u I(x, 0) dx.
uh
(s, t)
,
(s) (t)
with
(t, t) = 1, 11 (t, t) = 1, 10 (t, t) = 0, 12 (t, t) = 0, 02 (t, t) = 1,
after some calculations, we get exactly their bound M(u) ( their formula (9)) for
the density of the maximum.
Let us illustrate formula (8) explicitly when the process is Gaussian, centered
with unit variance. By means of a deterministic time change, one can also assume
that the process has unit speed (V ar(Xt ) 1). Let L the length of the new
time interval. Clearly t, m(t) = 0, r(t, t) = 1, r11 (t, t) = 1, r10 (t, t) = 0,
r12 (t, t) = 0, r02 (t, t) = 1. Note that
Note that

    Z ~ N(μ, σ²)  ⇒  E(Z⁻) = σ φ(μ/σ) − μ Φ(−μ/σ).

The formulae for regression imply that, conditionally on X_t = u, X'_t = 0, X''_t has expectation −u and variance r_{22}(t, t) − 1. Formula (8) reduces to

    p_M(u) ≤ p⁺(u) := φ(u) [ 1 + (2π)^{−1/2} ∫_0^L ( C_g(t) φ( u/C_g(t) ) + u Φ( u/C_g(t) ) ) dt ],

with C_g(t) := √( r_{22}(t, t) − 1 ).
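The identity for E(Z⁻) used above is the negative-part mean of a Gaussian variable. A quick Monte Carlo check (ours; the numerical values are arbitrary, chosen to mimic the conditional law of X''_t with mean −u):

```python
import numpy as np
from math import erf, exp, sqrt, pi

phi = lambda x: exp(-x * x / 2) / sqrt(2 * pi)
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))

def neg_part_mean(mu, sigma):
    # E(Z^-) for Z ~ N(mu, sigma^2), the quantity entering the bound (8)
    return sigma * phi(mu / sigma) - mu * Phi(-mu / sigma)

rng = np.random.default_rng(3)
mu, sigma = -1.2, 0.9
Z = mu + sigma * rng.standard_normal(2_000_000)
mc = np.maximum(-Z, 0.0).mean()
print(mc, neg_part_mean(mu, sigma))
```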
As x → +∞,

    Φ(x) = 1 − φ(x)/x + φ(x)/x³ + O( φ(x)/x⁵ ),

so that

    p⁺(u) = φ(u) ( 1 + (2π)^{−1/2} u L ) + O( u⁴ φ(u) φ(u/C⁺) ),

with C⁺ := sup_{t∈[0,L]} C_g(t). Furthermore, the exact equivalent of p_M(u) when u → +∞ is

    (2π)^{−1} u L exp(−u²/2),

as we will see in Corollary 1.1.
The following theorem is a special case of Lemma 3.3. We state it separately
since we use it below to compare the results that follow from it with known results.
Theorem 3.2. Suppose that X is a Gaussian process satisfying H_2. Then M has a continuous density p_M given for every u by

    p_M(u) = p_{X_0}(u; M ≤ u) + p_{X_1}(u; M ≤ u) + ∫_0^1 E( (X''_t)⁻ 1_{M≤u} | X_t = u, X'_t = 0 ) p_{X_t,X'_t}(u, 0) dt.      (9)
If x'' < 0:

    P( M ≤ u | X_t = u, X'_t = 0, X''_t = x'' ) ≥ 1 − E( [ D_u([0, t]) + U_u([t, 1]) ] | X_t = u, X'_t = 0, X''_t = x'' ).

If we plug these lower bounds into formula (9) and replace the expectations of upcrossings and downcrossings by means of integral formulae of (4), (5) type, we obtain the lower bound:

    p_M(u) ≥ p_{X_0}(u; X'_0 < 0) + p_{X_1}(u; X'_1 > 0) + ∫_0^1 E( (X''_t)⁻ | X_t = u, X'_t = 0 ) p_{X_t,X'_t}(u, 0) dt − Q1(u) − Q2(u),      (10)

where Q1(u) and Q2(u) are double integrals of (4)-(5) type accounting, respectively, for the expected number of down-crossings on [0, t] and of up-crossings on [t, 1], conditional on X_t = u, X'_t = 0, X''_t = x''.
Simpler expressions for (10), also adapted to numerical computations, can be found in Cierco (1996). Finally, some sharper upper bounds for p_M(u) are obtained when replacing the event {M > u} by {X_0 + X_1 > 2u}, the probability of which can be expressed using the conditional expectation and variance of X_0 + X_1; we are only able to express these bounds in integral form. We now turn to the proofs of our main results.
Lemma 3.1. With the notations above, define

    Z'(s) := s^{−1} ( Z(s) − a'(s) Z(0) )   for s ∈ (0, 1],      (11)
    Z''(s) := (1 − s)^{−1} ( Z(s) − a''(s) Z(1) )   for s ∈ [0, 1),      (12)
    Z̄_t(s) := 2 (s − t)^{−2} ( Z(s) − b_t(s) Z(t) − c_t(s) Z'(t) )   for s ∈ [0, 1], s ≠ t.      (13)

Then: (a) each of these processes extends to the closed parameter set with paths of class C^{p−1} (respectively C^{p−1}, C^{p−2}); (b) the same holds with Z replaced by Z + f, f deterministic of class C^p; (c) the map (t, s) → Z̄_t(s) is continuous.
Proof. (a) and (b) follow in a direct way, computing the regression coefficients a'(s), a''(s), b_t(s), c_t(s) and substituting into formulae (11), (12), (13). Note that (b) also follows from (a) by applying it to Z + f and to Z. We now prove (c), which is a consequence of the following:

Suppose Z(t1, ..., tk) is a Gaussian field with C^p sample paths (p ≥ 2) defined on [0, 1]^k, with no degeneracy in the same sense as in the definition of hypothesis H_k for one-parameter processes. Then the Gaussian fields defined by means of:

    Z'(t1, ..., tk) = t_k^{−1} ( Z(t1, ..., t_{k−1}, t_k) − a'(t1, ..., tk) Z(t1, ..., t_{k−1}, 0) )   for t_k ≠ 0,

    Z''(t1, ..., tk) = (1 − t_k)^{−1} ( Z(t1, ..., t_{k−1}, t_k) − a''(t1, ..., tk) Z(t1, ..., t_{k−1}, 1) )   for t_k ≠ 1,

    Z̄(t1, ..., tk, t_{k+1}) = 2 (t_{k+1} − t_k)^{−2} ( Z(t1, ..., t_{k−1}, t_{k+1}) − b(t1, ..., tk, t_{k+1}) Z(t1, ..., tk)
                               − c(t1, ..., tk, t_{k+1}) (∂Z/∂t_k)(t1, ..., tk) )   for t_{k+1} ≠ t_k,

can be extended to [0, 1]^k (respectively [0, 1]^k, [0, 1]^{k+1}) into fields with paths in C^{p−1} (respectively C^{p−1}, C^{p−2}). In the above formulae,
- a'(t1, ..., tk) is the regression coefficient of Z(t1, ..., tk) on Z(t1, ..., t_{k−1}, 0);
- a''(t1, ..., tk) is the regression coefficient of Z(t1, ..., tk) on Z(t1, ..., t_{k−1}, 1);
- b(t1, ..., tk, t_{k+1}) and c(t1, ..., tk, t_{k+1}) are the regression coefficients of Z(t1, ..., t_{k−1}, t_{k+1}) on the pair ( Z(t1, ..., tk), (∂Z/∂t_k)(t1, ..., tk) ).
Let us prove the statement on Z̄; the other two are simpler. Denote by V the subspace of L²(Ω, F, P) generated by the pair ( Z(t1, ..., tk), (∂Z/∂t_k)(t1, ..., tk) ), and by π_V the version of the orthogonal projection on V given by

    π_V(Y) := b Z(t1, ..., tk) + c (∂Z/∂t_k)(t1, ..., tk),

b and c being the regression coefficients of Y on the pair ( Z(t1, ..., tk), (∂Z/∂t_k)(t1, ..., tk) ). Note that if {Y_α} is a random field with continuous paths and such that α → Y_α is continuous in L²(Ω, F, P), then a.s. the map

    (α, t1, ..., tk) → π_V(Y_α)

is continuous.
From the definition and Taylor's formula:

    Z̄(t1, ..., tk, t_{k+1}) = 2 (t_{k+1} − t_k)^{−2} ∫_{t_k}^{t_{k+1}} (t_{k+1} − τ) (∂²Z/∂t_k²)(t1, ..., t_{k−1}, τ) dτ + R2(t1, ..., tk, t_{k+1}).      (14)

It is clear that the paths of the random field Z̄ are p − 2 times continuously differentiable for t_{k+1} ≠ t_k. Relation (14) shows that they have a continuous extension to [0, 1]^{k+1} with

    Z̄(t1, ..., tk, tk) = (∂²Z/∂t_k²)(t1, ..., tk) − π_V( (∂²Z/∂t_k²)(t1, ..., tk) ).

In fact,

    Z̄(s1, ..., sk, s_{k+1}) = 2 (s_{k+1} − s_k)^{−2} ∫_{s_k}^{s_{k+1}} (s_{k+1} − τ) [ (∂²Z/∂t_k²) − π_V(∂²Z/∂t_k²) ](s1, ..., s_{k−1}, τ) dτ.
According to our choice of the version of the orthogonal projection π_V, a.s. the integrand is a continuous function of the parameters therein, so that, a.s.:

    Z̄(s1, ..., sk, s_{k+1}) → (∂²Z/∂t_k²)(t1, ..., tk) − π_V( (∂²Z/∂t_k²)(t1, ..., tk) )   when (s1, ..., sk, s_{k+1}) → (t1, ..., tk, tk).

This proves (c). In the same way, when p ≥ 3, we obtain the continuity of the partial derivatives of Z̄ up to the order p − 2.
The following lemma has its own interest besides being required in our proof
of Lemma 3.3. It is a slight improvement of Lemma 4.3, p. 76 in Piterbarg (1996)
in the case of one-parameter processes.
Lemma 3.2. Suppose that X is a Gaussian process with C³ paths and that, for all s ≠ t, the distributions of (X_s, X'_s, X_t, X'_t) and of (X_t, X'_t, X_t^{(2)}, X_t^{(3)}) do not degenerate. Then, there exists a constant K (depending on the process) such that

    p_{X_s,X_t,X'_s,X'_t}(x1, x2, x'_1, x'_2) ≤ K (t − s)^{−4}

for all x1, x2, x'_1, x'_2 ∈ IR and all s, t ∈ [0, 1], s ≠ t.

Proof.

    p_{X_s,X_t,X'_s,X'_t}(x1, x2, x'_1, x'_2) ≤ (2π)^{−2} [ DetVar(X_s, X_t, X'_s, X'_t) ]^{−1/2},

where DetVar stands for the determinant of the variance matrix. Since by hypothesis the distribution does not degenerate outside the diagonal s = t, the conclusion of the lemma is trivially true on a set of the form {|s − t| > δ}, δ > 0. By a compactness argument it is sufficient to prove it for s, t in a neighbourhood of (t0, t0) for each t0 ∈ [0, 1]. For this last purpose we use a generalization of a technique employed by Belyaev (1966). Since the determinant is invariant by adding a linear combination of rows (resp. columns) to another row (resp. column),
DetV ar(Xs , Xt , Xs , Xt ) = DetV ar(Xs , Xs , X s(2) , X s(3) ),
with

    X̄s⁽²⁾ = Xt − Xs − (t − s) X′s ≃ ((t − s)²/2) X⁽²⁾_{t0},
    X̄s⁽³⁾ = X′t − X′s − (2/(t − s)) X̄s⁽²⁾ ≃ ((t − s)²/6) X⁽³⁾_{t0}.

The equivalence refers to (s, t) → (t0, t0). Since the paths of X are of class C³, (Xs, X′s, 2(t − s)⁻² X̄s⁽²⁾, 6(t − s)⁻² X̄s⁽³⁾) tends almost surely to (X_{t0}, X′_{t0}, X⁽²⁾_{t0}, X⁽³⁾_{t0}) as (s, t) → (t0, t0). This implies the convergence of the variance matrices. Hence

    DetVar(Xs, Xt, X′s, X′t) ≃ ((t − s)⁸ / 144) DetVar(X_{t0}, X′_{t0}, X⁽²⁾_{t0}, X⁽³⁾_{t0}),
    . . . + ∫_0^1 θ(t) E( ζ^t_{v,u} ( Z^{tt}_t − θ″(t) u ) 1I_{Au}(Z^t, θ^t) ) p_{Zt, Z′t}( θ(t)u, θ′(t)u ) dt,    (15)

where

    ζ^t_{v,u} = G( Z^t_{t1} − θ^t(t1) u + ((t1 − t)²/2)(u − v), . . . , Z^t_{tm} − θ^t(tm) u + ((tm − t)²/2)(u − v) ).
Proof. We start by showing that the arguments of Theorem 3.1 can be extended to our present case to establish that Fv is absolutely continuous. This proof already contains a first approximation to the main ideas leading to the proof of the lemma.

Step 1. Assume, with no loss of generality, that u ≥ 0 and write for h > 0:

    Fv(u) − Fv(u − h) = E( ζv 1I_{Au \ A_{u−h}} ) − E( ζv 1I_{A_{u−h} \ Au} ).    (16), (17)
Note that:
where:
(1)
(1)
P (Muh,u 1) E Muh,u ,
and the formula for the expectation of the number of local maxima applied to the
process t Zt (t)(u h) imply
|E v .1IAu \Auh |
1I{(0)>0}
+1I{(1)>0}
1
+
0
(0)u
(0)(uh)
(1)u
(1)(uh)
(t)h
1I{(t)>0} dt
(18)
(19)
(20)
where
|R1 (h)| E |v |1I{(0)(uh)<Z0 (0)u,(1)(uh)<Z1 (1)u} 1I{(0)>0,(1)>0}
+ E |v |1I
+ E |v |1I
(1)
1I{(0)>0}
(1)
1I{(1)>0}
(0)(uh)<Z0 (0)u,Muh,u 1
(1)(uh)<Z1 (1)u,Muh,u 1
(1)
uh,u 1
Let us consider T2 (h). Using the integral formula for the expectation of the
number of local maxima:
T2 (h) 1I{(0)>0}
1
0
1I{(t)0} dt
(0)h
(t)h
dz0
dz.
1I{(t)0} dt
(t)h
0
where the random vector V2 is the same as in (18). Since the conditional expectation as well as the density are bounded for u in a bounded set and 0 < h < 1, this
expression is bounded by (const)h.
As for the second integral, when t is between δ and 1, the Gaussian vector

    ( Z0 − θ(0)(u − h), Zt − θ(t)(u − h), Z′t − θ′(t)(u − h) )

has a bounded density, so that the integral is bounded by C_δ h², where C_δ is a constant depending on δ.
Since δ > 0 is arbitrarily small, this proves that T2(h) = o(h). T3(h) is similar
to T2(h).
We now consider T4 (h). Put:
(4)
Eh =
where
stands
h1/4 |v | h1/4
(1)
(1)
(21)
E |v |1IE C Muh,u E |v |4 E
h
(1)
Muh,u
1/4
P (EhC )
1/2
The polynomial bound on G, plus the fact that Z has finite moments of all orders, implies that E(|ζv|⁴) is uniformly bounded.
Also, M⁽¹⁾_{u−h,u} ≤ D0( Z(·) − θ(·)(u − h); [0, 1] ) = D (recall that D0(g; I) denotes the number of downcrossings of level 0 by the function g). A bound for E(D⁴)
E |v |1IE C Muh,u
h
(4)
(const) C1 eC2 h
+ hq/4 E |v |q
1/4
) + P (|v |
> h
1/2
> h1/4 )
1/2
where C1, C2 are positive constants and q is any positive number. The bound on the first term follows from the Landau–Shepp (1971) inequality (see also Fernique, 1974) since, even though the process depends on h, it is easy to see that the bound is uniform in h, 0 < h < 1. The bound on the second term is simply the Markov inequality. Choosing q > 8, we see that the second term in (21) is o(h).
For the first term in (21) one can use the formula for the second factorial moment of M⁽¹⁾_{u−h,u} to write it in the form:
1
0
1I{(s)0,(t)0} dsdt
0
E(|v |1IEh (Zs
(s)h
0
(t)h
dz1
dz2
y
t1
( )d |=|
y
t1
t2
(4) ( )d |
(t s)2
2
(4)
2
) /V4
= v4 )
(23)
u
uh
u
+1I{(1)>0} (1)
+
uh
(t)h
1I{(t)0} dt
0
0
pV2 (z, 0) + o(h)
u
uh
(24)
where:
H1 (x, h) = 1I{(0)>0} (0)E v .1IAu /Z0 = (0)x .pZ0 ((0)x)
+ 1I{(1)>0} (1)E v .1IAu /Z1 = (1)x .pZ1 ((1)x)
+
1I{(t)0}
0
E(v .1IAu (Zt
    lim_{h↓0} [ Fv(u) − Fv(u − h) ] / h

exists and admits the representation (15) in the statement of the Lemma. For that purpose, we will prove the existence of the limit

    lim_{h↓0} (1/h) E( ζv 1I_{Au \ A_{u−h}} ),    (26)

which equals lim_{h↓0, u−h<x<u} H1(x, h).
Consider the first term in expression (25). We apply Lemma 3.1(a) and with the
same notations therein:
Zt = a (t) Z0 + tZt ,
t = a (t) (0) + tt
t [0, 1] .
(27)
a (t) (0)(u x)
for all t [, 1] .
t
|a (t) (0)|
: t [, 1] .
t
We prove that, as x ↑ u,

    E( ζ_{v,x} 1I_{B(u,x)} ) → E( ζ_{v,u} 1I_{B(u,u)} ).    (28)

We have

    | E( ζ_{v,x} 1I_{B(u,x)} ) − E( ζ_{v,u} 1I_{B(u,u)} ) | ≤ E| ζ_{v,x} − ζ_{v,u} | + | E( ζ_{v,u} ( 1I_{B(u,x)} − 1I_{B(u,u)} ) ) |.    (29)

From the definition of ζ_{v,x} it is immediate that the first term tends to 0 as x ↑ u.
For the second term it suffices to prove that

    P( B(u, x) Δ B(u, u) ) → 0  as x ↑ u.    (30)
The first term is equal to zero because of Proposition 2.4. The second term decreases to zero as δ ↓ 0, since {M_{[δ,1]} ≥ 0, M_{[0,δ]} > 0} decreases to the empty set.
It is easy to prove that the function

    (u, v) ↦ E( ζ_{v,u} 1I_{Au}(Z̄, θ̄) )

is continuous. The only difficulty comes from the indicator function 1I_{Au}(Z̄, θ̄), although again the fact that the distribution function of the maximum of the process Z(·) − θ(·)u has no atoms implies the continuity in u in much the same way as above.
So, the first term in the right-hand member of (25) has the continuous limit:

    1I_{θ(0)>0} θ(0) E( ζ_{v,u} 1I_{Au}(Z̄, θ̄) ) p_{Z0}( θ(0) u ).

With minor changes, we obtain for the second term the limit:

    1I_{θ(1)>0} θ(1) E( ζ_{v,u} 1I_{Au}(Z̄, θ̄) ) p_{Z1}( θ(1) u ),

where Z̄, θ̄ are as in Lemma 3.1 and ζ_{v,u} as in the statement of Lemma 3.3.
The third term can be treated in a similar way. The only difference is that the regression must be performed on the pair (Zt, Z′t) for each t ∈ [0, 1], applying again Lemma 3.1 (a), (b), (c). The passage to the limit presents no further difficulties, even if the integrand depends on h.
Finally, note that conditionally on Zt = (t)u, Zt = (t)u one has
Zt (t)u = Ztt t (t)u
and
) .pZ0 ((0).u)
,
) .pZ1 ((1).u)
t
(t)1I{(t)>0} dtE v,u
(Ztt t (t).u)1IAu (Z t , t )
(t) = ( )(t)
and
, )
where ζv = G( Z_{t1} − θ(t1)v, . . . , Z_{tm} − θ(tm)v ) is continuously differentiable and its derivative verifies (15) with the obvious changes, that is:
    F′_{εv}(u) = θ(0) E( ζ^ε_{v,u} 1I_{Au}( (Z^ε)̄, (θ^ε)̄ ) ) p_{Z^ε_0}( θ(0) u )
      + θ(1) E( ζ^ε_{v,u} 1I_{Au}( (Z^ε)̄, (θ^ε)̄ ) ) p_{Z^ε_1}( θ(1) u )
      + ∫_0^1 θ(t) E( ζ^{ε,t}_{v,u} ( (Z^ε)^{tt}_t − θ″(t) u ) 1I_{Au}( (Z^ε)^t, (θ^ε)^t ) ) p_{Z^ε_t, (Z^ε)′_t}( θ(t) u, θ′(t) u ) dt.    (31)
Let ε → 0. We prove next that (F_{εv})′(u) converges for fixed (u, v) to a limit function F̃v(u) that is continuous in (u, v). On the other hand, it is easy to see that, for fixed (u, v), F_{εv}(u) → Fv(u). Also, from (31) it is clear that for each v there exists ε0 > 0 such that if ε ∈ (0, ε0), then (F_{εv})′(u) is bounded by a fixed constant when u varies in a bounded set, because of the hypothesis on the functions G and θ and the non-degeneracy of the one- and two-dimensional distributions of the process Z.
So, it follows that F′v(u) = F̃v(u), and the same computation implies that F′v(u) satisfies (15).
Let us show how to proceed with the first term in the right-hand member of (31). The remaining terms are similar.
Clearly, almost surely, as ε → 0 one has Z^ε_t → Zt, (Z^ε)′_t → Z′t, (Z^ε)″_t → Z″t uniformly for t ∈ [0, 1], so that the definition of Z̄ in (11) implies that (Z^ε)̄_t → Z̄t uniformly for t ∈ [0, 1], since the regression coefficient (a^ε)(t) converges to a(t) uniformly for t ∈ [0, 1] (with the obvious notation).
Similarly, for fixed (u, v):

    (θ^ε)̄_t → θ̄t,  ζ^ε_{v,u} → ζ_{v,u}

uniformly for t ∈ [0, 1].
Let us prove that

    E( ζ^ε_{v,u} 1I_{Au}( (Z^ε)̄, (θ^ε)̄ ) ) → E( ζ_{v,u} 1I_{Au}( Z̄, θ̄ ) ).

This is implied by

    P( Au( (Z^ε)̄, (θ^ε)̄ ) Δ Au( Z̄, θ̄ ) ) → 0    (32)

as ε → 0.
sup
uK,t[0,1]
and
Fu, = sup
t[0,1]
Zt (t)u .
< .
, ,
Eu, \ Cu,
D c, Fu, ,
sup
t[0,1]
Zt (t)u = 0 = 0.
) = P Au Z ,
| sup
t[0,1]
P Au Z ,
Zt (t).u || h |
E Ytt11 1IA
Y t1
pYt
(33)
This expression is exactly the expression in (9) with the indicated notational changes, after taking advantage of the fact that the process is Gaussian, via regression on the conditioning in each term. Note that, according to the definition of the Y-process:

    E( 1I_{A(Y)} ) = E( 1I_{Au}(X̄, θ̄) ),
E 1IA(Y ) = E 1IAu (X
E Ytt11 1IA
Y t1
(1)
+
0
E Ytt11 1IA
Y t1
(1,0)
(0, 0)dt1
t1 ,Yt1
pY
A Y
,t2
pY
t2 ,(Y
,t2
pY
t2 ,(Y
) pY1 (0)
t ,
,Y (0, 0) + t1 (1)E Yt11 1I
1 t1
A
pYt
) pY1 (0)
1
0
    (34)

In this formula, p′_{Y_{t0}}, p′_{Y_{t1}} and p^{(1,0)}_{Y_{t1}, Y′_{t1}}(0, 0) stand respectively for the derivative of p_{Y_{t0}}(·), the derivative of p_{Y_{t1}}(·), and the derivative with respect to the first variable of p_{Y_{t1}, Y′_{t1}}(·, ·).
To validate the above formula, note that:
The first two lines are obtained by differentiating with respect to u the densities

    p_{Y_0}(0) = p_{X_0}(u),  p_{Y_1}(0) = p_{X_1}(u),  p_{Y_{t1}, Y′_{t1}}(0, 0) = p_{X_{t1}, X′_{t1}}(u, 0).
Lines 3 and 4 come from the application of Lemma 3.3 to differentiate E(1IA(Y ) ).
The lemma is applied with Z = X , = , = 1.
Similarly, lines 5 and 6 contain the derivative of E(1IA(Y ) ).
The remaining corresponds to differentiate the function
E Ytt11 1IA(Y t1 ) = E Xtt11 t1 (t1 )u 1IAu (Xt1 , t1 )
in the integrand of the third term in (33). The first term in line 7 comes from the
simple derivative
..
Y t1 ,..,tm
where:
1 m k.
t1 , ..., tm [0, 1] { , } , m 1.
s1 , .., sp , 0 p m, are the elements in {t1 , ..., tm } that belong to [0, 1] (that
is, which are neither nor ). When p = 0 no integral sign is present.
Q(s1 , .., sp ) is a polynomial in the variables s1 , .., sp .
is a product of values of Y t1 ,...,tm at some locations belonging to s1 , .., sp .
K1 (s1 , .., sp ) is a product of values of some ancestors of t1 ,...,tm at some
locations belonging to the set s1 , .., sp {0, 1} .
K2 (s1 , .., sp ) is a sum of products of densities and derivatives of densities of
the random variables Z at the point 0, or the pairs ( Z , Z ) at the point (0, 0)
where s1 , .., sp {0, 1} and the process Z is some ancestor of Y t1 ,...,tm .
ct (s) =
r01 (s, t)
r11 (t, t)
1 r(s, t)
(t s)2
where
L(u) = L1 (u) + L2 (u) + L3 (u),
L1 (u) = P (Au (X , ),
L2 (u) = P (Au (X , ),
1
L3 (u) =
0
dt
.
(2 r11 (t, t))1/2
u
(2)1/2
1
0
u
r02 (t, t)
dt =
1/2
(r11 (t, t))
(2 )1/2
h=k
h=2
k 1 (kh)
(u)L(h1) (u).
k1
(37)
u
), j = 1, ..., k 1
aj
(38)
1 r(s, 0)
f or 0 < s 1, (0) = 0,
s
(t)E (Xt
,t
,
,t
)]
pX ( (1)u)
1
(t)u)1IAu (X
,t , ,t )
pX
,(X )t (
Notice that θ̄(1) is non-zero, so that the first term is bounded by a constant times a non-degenerate Gaussian density. Even though θ̄(0) = 0, the second term is also bounded by a constant times a non-degenerate Gaussian density, because the joint distribution of the pair (X̄t, (X̄)′t) is non-degenerate and the pair (θ̄(t), (θ̄)′(t)) ≠ (0, 0) for every t ∈ [0, 1].
Applying a similar argument to the successive derivatives, we obtain (38) with L1 instead of L.
The same follows with no changes for
L2 (u) = P (Au (X , ).
For the third term
1
L3 (u) =
0
dt
(2 r11 (t, t))1/2
we proceed similarly, taking into account that θ^t(s) ≠ 0 for every s ∈ [0, 1]. So (38) follows and we are done.
Remark. Suppose that X satisfies the hypotheses of the Corollary with k ≥ 2. Then it is possible to refine the result as follows.
For j = 1, . . . , k:
F (j ) (u) = (1)j 1 (j 1)!hj 1 (u)
1
1 + (2)1/2 .u.
0
1 (j )
where hj (u) = (1)
j ! ((u)) (u), is the standard j-th Hermite polynomial
(j = 0, 1, 2, ...) and
| j (u) | Cj exp(u2 )
where C1 , C2 , ... are positive constants and > 0 does not depend on j .
The proof of (39) consists of a slight modification of the proof of the Corollary.
Note first that from the above computation of θ̄(s) it follows that: 1) if X0 < 0 and u is large enough, then X̄s − θ̄(s) u ≤ 0 for all s ∈ [0, 1]; and 2) if X0 > 0, then X̄0 − θ̄(0) u > 0. So:

    L1(u) = P( X̄s − θ̄(s) u ≤ 0 for all s ∈ [0, 1] ) → 1/2  as u → +∞,

and

    | L1(u) − 1/2 | ≤ ∫_u^{+∞} | L′1(v) | dv ≤ D1 exp( −δ1 u² ).
    L3(u) = ∫_0^1 E( X^t_t − θ^t(t) u ) dt / (2π r11(t, t))^{1/2} − ∫_0^1 E( ( X^t_t − θ^t(t) u ) 1I_{( Au(X^t, θ^t) )^C} ) dt / (2π r11(t, t))^{1/2}.    (40)
(2)1/2 .u.
inf
s,t[0,1]
Then:
P
Au (Xt , t )
where D3, δ3 are positive constants, the last inequality being a consequence of the Landau–Shepp–Fernique inequality.
The remainder follows in the same way as in the proof of the Corollary.
Acknowledgements. This work has received support from CONICYT-BID-Uruguay, grant 91/94, and from the ECOS program U97E02.
References
1. Adler, R.J.: An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes, IMS, Hayward, CA (1990)
2. Azaïs, J-M., Wschebor, M.: Régularité de la loi du maximum de processus gaussiens réguliers, C. R. Acad. Sci. Paris, t. 328, série I, 333–336 (1999)
3. Belyaev, Yu.: On the number of intersections of a level by a Gaussian stochastic process, Theory Probab. Appl., 11, 106–113 (1966)
4. Berman, S.M.: Sojourns and Extremes of Stochastic Processes, Wadsworth and Brooks, Probability Series (1992)
5. Bulinskaya, E.V.: On the mean number of crossings of a level by a stationary Gaussian stochastic process, Theory Probab. Appl., 6, 435–438 (1961)
6. Cierco, C.: Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif. PhD dissertation, University of Toulouse, France (1996)
7. Cramér, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes, J. Wiley & Sons, New York (1967)
8. Diebolt, J., Posse, C.: On the Density of the Maximum of Smooth Gaussian Processes, Ann. Probab., 24, 1104–1129 (1996)
9. Fernique, X.: Régularité des trajectoires des fonctions aléatoires gaussiennes, École d'Été de Probabilités de Saint-Flour, Lecture Notes in Mathematics, 480, Springer-Verlag, New York (1974)
10. Landau, H.J., Shepp, L.A.: On the supremum of a Gaussian process, Sankhyā Ser. A, 32, 369–378 (1971)
11. Leadbetter, M.R., Lindgren, G., Rootzén, H.: Extremes and Related Properties of Random Sequences and Processes, Springer-Verlag, New York (1983)
12. Lifshits, M.A.: Gaussian Random Functions, Kluwer, The Netherlands (1995)
13. Marcus, M.B.: Level Crossings of a Stochastic Process with Absolutely Continuous Sample Paths, Ann. Probab., 5, 52–71 (1977)
14. Nualart, D., Vives, J.: Continuité absolue de la loi du maximum d'un processus continu, C. R. Acad. Sci. Paris, 307, 349–354 (1988)
15. Nualart, D., Wschebor, M.: Intégration par parties dans l'espace de Wiener et approximation du temps local, Probab. Th. Rel. Fields, 90, 83–109 (1991)
16. Piterbarg, V.I.: Asymptotic Methods in the Theory of Gaussian Processes and Fields, American Mathematical Society, Providence, Rhode Island (1996)
17. Rice, S.O.: Mathematical Analysis of Random Noise, Bell System Technical J., 23, 282–332 (1944); 24, 46–156 (1945)
18. Tsirelson, V.S.: The Density of the Maximum of a Gaussian Process, Theory Probab. Appl., 20, 817–856 (1975)
19. Weber, M.: Sur la densité du maximum d'un processus gaussien, J. Math. Kyoto Univ., 25, 515–521 (1985)
20. Wschebor, M.: Surfaces aléatoires. Mesure géométrique des ensembles de niveau, Lecture Notes in Mathematics, 1147, Springer-Verlag (1985)
21. Ylvisaker, D.: A Note on the Absence of Tangencies in Gaussian Sample Paths, Ann. Math. Statist., 39, 261–262 (1968)
February 2, 2008
Let X = {X(t) : t S} be a real-valued random field defined on some parameter set S and
M := suptS X(t) its supremum.
The study of the probability distribution of the random variable M, i.e. the function FM(u) := P{M ≤ u}, is a classical problem in probability theory. When the process is Gaussian, general inequalities allow one to give bounds on 1 − FM(u) = P{M > u}, as well as asymptotic results for u → +∞. A partial account of this well-established theory, since the founding paper by Landau and Shepp [20], should include, among a long list of contributors, the works of Marcus and Shepp [24], Sudakov and Tsirelson [30], Borell [13] [14], Fernique [17], Ledoux and Talagrand [22], Berman [11] [12], Adler [2], Talagrand [32] and Ledoux [21].
During the last fifteen years, several methods have been introduced with the aim of obtaining results more precise than those arising from the classical theory, at least under certain restrictions on the process X, which are interesting from the point of view of the mathematical theory as well as in many significant applications. These restrictions include the requirement that the domain S have a certain finite-dimensional geometric structure and that the paths of the random field have a certain regularity.
Some examples of these contributions are the double sum method by Piterbarg [28]; the Euler–Poincaré Characteristic (EPC) approximation, Taylor, Takemura and Adler [34], Adler and Taylor [3]; the tube method, Sun [31]; and the well-known Rice method, revisited by Azaïs and Delmas [5], Azaïs and Wschebor [6]. See also Rychlik [29] for numerical computations.
The results in the present paper are based upon Theorem 3, an extension of Theorem 3.1 in Azaïs and Wschebor [8], which expresses the density pM of FM by means of a general formula. Even though this is an exact formula, it is only implicit as an expression for the density, since the relevant random variable M appears in the right-hand side. However, it can be usefully employed for various purposes.
First, one can use Theorem 3 to obtain bounds for pM(u), and thus for P{M > u}, for every u, by means of replacing some indicator function in (4) by the condition that the normal derivative is extended outward (see below for the precise meaning). This will be called the direct method. Of course, this is interesting whenever the expression one obtains can be handled, which is the case when the law of the random field is stationary and isotropic. Our method relies on the application of some known results on the spectrum of random matrices.
Second, one can use Theorem 3 to study the asymptotics of P{M > u} as u → +∞. More precisely, one wants to write, whenever it is possible,
    P{M > u} = A(u) exp( −u² / (2σ²) ) + B(u)    (1)
method.
In all cases, the second order approximation for the direct method provides an upper bound
for the one arising from the EPC method.
Our proofs use almost no differential geometry, except for some elementary notions in Euclidean space. Let us remark also that we have separated the conditions on the law of the
process from the conditions on the geometry of the parameter set.
Third, Theorem 3 and related results in this paper in fact refer to the density pM of the maximum. On integration, they immediately imply a certain number of properties of the probability distribution FM, such as the behaviour of the tail as u → +∞.
Theorem 3 implies that FM has a density, and we have an implicit expression for it. The proof of this fact here appears to be simpler than previous ones (see Azaïs and Wschebor [8]), even in the case in which the process has a one-dimensional parameter (Azaïs and Wschebor [7]). Let us remark that Theorem 3 holds true for non-Gaussian processes under appropriate conditions allowing one to apply the Rice formula.
Our method can be exploited to study higher order differentiability of FM (as it has been
done in [7] for one-parameter processes) but we will not pursue this subject here.
This paper is organized as follows:
Section 2 includes an extension of the Rice formula which gives an integral expression for the expectation of the weighted number of roots of a random system of d equations with d real unknowns. A complete proof of this formula, in a form which is adapted to our needs in this paper, can be found in [9]. There is an extensive literature on the Rice formula in various contexts (see for example Belyaev [10], Cramér–Leadbetter [15], Marcus [23], Adler [1], Wschebor [35]).
In Section 3, we obtain the exact expression for the distribution of the maximum as a consequence of the Rice-like formula of the previous section. This immediately implies the existence
of the density and gives the implicit formula for it. The proof avoids unnecessary technicalities
that we have used in previous work, even in cases that are much simpler than the ones considered here.
In Section 4, we compute (Theorem 4) the first order approximation in the direct method
for stationary isotropic processes defined on a polyhedron, from which a new upper bound for
P{M > u} for all real u follows.
In Section 5, we consider second order approximation, both for the direct method and the
EPC approximation method. This is the content of Theorems 5, 6 and 7.
Section 6 contains some examples.
    Φ(x) := ∫_{−∞}^{x} φ(y) dy,

where φ denotes the standard Gaussian density.
Assume that the random vectors ξ, ζ have a joint Gaussian distribution, where ζ takes values in some finite-dimensional Euclidean space. When it is well defined,

    E( f(ξ) / ζ = x )

is the version of the conditional expectation obtained using Gaussian regression.
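For a jointly Gaussian pair, this version of the conditional expectation is the affine function E(ξ) + Cov(ξ, ζ)/Var(ζ) · (x − E(ζ)). A small simulation sketch, with a hypothetical pair (ξ, ζ) of our own choosing, compares the regression formula with a brute-force local average:

```python
import random

# Gaussian regression: for jointly Gaussian (xi, zeta) with mean zero,
#   E(xi / zeta = x) = Cov(xi, zeta) / Var(zeta) * x.
# Here xi = a*zeta + sigma*noise, so the regression line is simply a*x.
random.seed(2)
a, sigma = 0.7, 0.5
n, x, band = 200000, 1.0, 0.05

num = den = 0.0
for _ in range(n):
    zeta = random.gauss(0.0, 1.0)
    xi = a * zeta + sigma * random.gauss(0.0, 1.0)
    if abs(zeta - x) < band:      # crude conditioning: zeta close to x
        num += xi
        den += 1

empirical = num / den             # local average of xi given zeta ~ x
regression = a * x                # Gaussian regression prediction
```

The local average converges to the regression value as the band shrinks and the sample grows, which is exactly the sense in which the Gaussian-regression version of E(f(ξ)/ζ = x) is used throughout.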
Eu := {t ∈ S : X(t) > u} is the excursion set above u of the function X(·), and Au := {M ≤ u} is the event that the maximum is not larger than u.
⟨·, ·⟩ and ∥·∥ denote respectively the inner product and norm in a finite-dimensional real Euclidean space; λd is the Lebesgue measure on Rd; S^{d−1} is the unit sphere; A^c is the complement of the set A. If M is a real square matrix, M ≻ 0 denotes that it is positive definite.
This is well known and follows easily from the next lemma (called Bulinskaya's lemma), which we state without proof, for completeness.
Lemma 1 Let Z(t) be a stochastic process defined on some neighborhood of a set T embedded in some Euclidean space. Assume that the Hausdorff dimension of T is smaller than or equal to the integer m and that the values of Z lie in R^{m+k} for some positive integer k. Suppose, in addition, that Z has C¹ paths and that the density p_{Z(t)}(v) is bounded for t ∈ T and v in some neighborhood of u ∈ R^{m+k}. Then, a.s. there is no point t ∈ T such that Z(t) = u.
With respect to A5, one has the following sufficient conditions: Assume A1, A2, A3 and as
additional hypotheses one of the following two:
t
X(t) is of class C 3
sup
tS,x V (0)
as 0,
In this section we review Rice formula for the expectation of the number of roots of a random
system of equations. For proofs, see for example [8], or [9], where a simpler one is given.
Theorem 1 (Rice formula) Let Z : U → Rd be a random field, U an open subset of Rd, and u ∈ Rd a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t ↦ Z(t) is of class C¹,
(iii) for each t ∈ U, Z(t) has a non-degenerate distribution (i.e. Var(Z(t)) ≻ 0),
(iv) P{∃ t ∈ U, Z(t) = u, det Z′(t) = 0} = 0.
Then, for every Borel set B contained in U, one has

    E( N_u^Z(B) ) = ∫_B E( |det Z′(t)| / Z(t) = u ) p_{Z(t)}(u) dt.    (2)
Theorem 2 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that for each t ∈ U one has another random field Y^t : W → Rd, where W is some topological space, verifying the following conditions:
a) Y^t(w) is a measurable function of (ω, t, w) and, almost surely, (t, w) ↦ Y^t(w) is continuous.
b) g is a bounded function on U × C(W, Rd), continuous when one puts on C(W, Rd) the topology of uniform convergence on compact sets.
Then, for each compact subset I of U, one has

    E( Σ_{t∈I, Z(t)=u} g(t, Y^t) ) = ∫_I E( |det Z′(t)| g(t, Y^t) / Z(t) = u ) p_{Z(t)}(u) dt.    (3)
Remarks:
1. We have already mentioned in the previous section sufficient conditions implying hypothesis (iv) in Theorem 1.
2. With the hypotheses of Theorem 1 it follows easily that if J is a subset of U with λd(J) = 0, then P{N_u^Z(J) = 0} = 1 for each u ∈ Rd.
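In dimension one, the Rice formula reduces to the familiar expression E(N_u) = (T/π)·sqrt(λ2/λ0)·exp(−u²/(2λ0)) for a stationary centered Gaussian process with spectral moments λ0, λ2. The following Monte Carlo sketch checks the u = 0 case on a toy trigonometric process (the process and all parameters are our own illustrative assumptions):

```python
import math
import random

# Toy stationary Gaussian process on [0, 2*pi]:
#   X(t) = xi1 cos t + eta1 sin t + xi2 cos 3t + eta2 sin 3t,
# with spectral moments lambda_0 = 2 and lambda_2 = 1 + 9 = 10.
# Rice:  E N_0([0,T]) = (T/pi) sqrt(lambda_2 / lambda_0) = 2 sqrt(5).
random.seed(0)
T, n_grid, n_trials = 2.0 * math.pi, 1000, 1000
ts = [T * i / n_grid for i in range(n_grid + 1)]
basis = [(math.cos(t), math.sin(t), math.cos(3*t), math.sin(3*t)) for t in ts]

total = 0
for _ in range(n_trials):
    g = [random.gauss(0.0, 1.0) for _ in range(4)]
    prev = None
    for b in basis:
        x = g[0]*b[0] + g[1]*b[1] + g[2]*b[2] + g[3]*b[3]
        if prev is not None and prev * x < 0:   # sign change = zero crossing
            total += 1
        prev = x

estimate = total / n_trials
theory = (T / math.pi) * math.sqrt(10.0 / 2.0)  # = 2*sqrt(5)
```

Counting sign changes on a fine grid suffices here because the highest frequency in the toy process is small compared with the grid resolution.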
    pM(x) = Σ_{t∈S0} E( 1I_{Ax} / X(t) = x ) p_{X(t)}(x)
      + Σ_{j=1}^{d} ∫_{Sj} E( |det X″j(t)| 1I_{Ax} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt),    (4)
Remark: One can replace |det X″j(t)| in the conditional expectation by (−1)^j det X″j(t), since under the conditioning, and whenever M ≤ x holds true, X″j(t) is negative semi-definite.
Proof of Theorem 3
Let Nj(u), j = 0, . . . , d, be the number of global maxima of X(·) on S that belong to Sj and are larger than u. From the hypotheses it follows that a.s. Σ_{j=0,...,d} Nj(u) is equal to 0 or 1, so that

    P{M > u} = Σ_{j=0,...,d} P{Nj(u) = 1} = Σ_{j=0,...,d} E( Nj(u) ).    (5)
The proof will be finished as soon as we show that each term in (5) is the integral over (u, +∞) of the corresponding term in (4).
This is self-evident for j = 0. Let us consider the term j = d. We apply the weighted Rice formula of Section 2 as follows:
Z is the random field X′ defined on Sd.
For each t ∈ Sd, put W = S and define Y^t : S → R² as:

    Y^t(w) := ( X(w) − X(t), X(t) ).

Notice that the second coordinate in the definition of Y^t does not depend on w.
In the place of the function g, we take for each n = 1, 2, . . . the function gn defined as
follows:
    gn(t, f1, f2) = gn(f1, f2) = [ 1 − Fn( sup_{w∈S} f1(w) ) ] · [ 1 − Fn( u − f2 ) ],    (6)
    E( Σ_{t∈Sd, X′(t)=0} gn(Y^t) ) = ∫_{Sd} E( |det X″(t)| gn(Y^t) / X′(t) = 0 ) p_{X′(t)}(0) σd(dt).    (7)
Notice that the formula holds true for each compact subset of Sd in the place of Sd , hence for
Sd itself by monotone convergence.
Let now n → ∞ in (7). Clearly gn(Y^t) ↓ 1I_{X(s)−X(t)≤0, ∀s∈S} · 1I_{X(t)≥u}. The passage to the limit does not present any difficulty, since 0 ≤ gn(Y^t) ≤ 1 and the sum in the left-hand side is bounded by the random variable N_0^{X′}(Sd), which is in L¹ because of the Rice formula. We get
E(Nd (u)) =
Sd
Ct,j := { λ ∈ Rd : ∥λ∥ = 1; ∃ sn ∈ S (n = 1, 2, . . .) such that sn → t and (t − sn)/∥t − sn∥ → λ as n → +∞ },

whenever this set is non-empty, and Ct,j := {0} if it is empty. We will denote by Ĉt,j the dual cone of Ct,j, that is:

    Ĉt,j := { z ∈ Rd : ⟨z, λ⟩ ≥ 0 for all λ ∈ Ct,j }.

Notice that these definitions easily imply that Tt,j ⊂ Ct,j and Ĉt,j ⊂ Nt,j. Remark also that, for j = d0, Ĉt,j = Nt,j.
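For a polyhedral cone, membership in the dual cone reduces to finitely many inequalities: if C is generated by directions g1, . . . , gr, then z lies in the dual cone iff ⟨z, gi⟩ ≥ 0 for every generator, since every λ ∈ C is a nonnegative combination of them. A toy sketch (the sign convention ⟨z, λ⟩ ≥ 0 and the generators below are our own illustrative choices):

```python
# Dual-cone membership test for a finitely generated cone in R^d.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_dual_cone(z, generators, tol=1e-12):
    # z is in the dual cone iff <z, g> >= 0 for every generator g
    return all(dot(z, g) >= -tol for g in generators)

# cone generated by e1 and e2 in R^2: its dual cone is the first quadrant
gens = [(1.0, 0.0), (0.0, 1.0)]
assert in_dual_cone((1.0, 2.0), gens)
assert in_dual_cone((0.0, 0.0), gens)
assert not in_dual_cone((-1.0, 1.0), gens)
```

The same reduction is what makes the indicator 1I_{X′(t) ∈ Ĉt,j} computable in practice for polyhedral parameter sets.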
We will say that the function X(·) has an extended outward derivative at the point t in Sj, j ≤ d0, if X′_{j,N}(t) ∈ Ĉt,j.
Corollary 1 Under assumptions A1 to A5, one has:
(a) pM(x) ≤ p(x), where

    p(x) := Σ_{t∈S0} E( 1I_{X′(t)∈Ĉt,0} / X(t) = x ) p_{X(t)}(x)
      + Σ_{j=1}^{d0} ∫_{Sj} E( |det X″j(t)| 1I_{X′_{j,N}(t)∈Ĉt,j} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt).

(b) P{M > u} ≤ ∫_u^{+∞} p(x) dx.
Proof
(a) follows from Theorem 3 and the observation that if t ∈ Sj, one has {M ≤ X(t)} ⊂ {X′_{j,N}(t) ∈ Ĉt,j}. (b) is an obvious consequence of (a).
The actual interest of this Corollary depends on the feasibility of computing p(x). It turns out that this can be done in some relevant cases, as we will see in the remainder of this section.
Our result can be compared with the approximation of P{M > u} by means of ∫_u^{+∞} pE(x) dx given by [3], [34], where

    pE(x) := Σ_{t∈S0} E( 1I_{X′(t)∈Ĉt,0} / X(t) = x ) p_{X(t)}(x)
      + Σ_{j=1}^{d0} (−1)^j ∫_{Sj} E( det X″j(t) 1I_{X′_{j,N}(t)∈Ĉt,j} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt).
Under certain conditions, ∫_u^{+∞} pE(x) dx is the expected value of the EPC of the excursion set Eu (see [3]). The advantage of pE(x) over p(x) is that one can obtain nice expressions for it in quite general situations. Conversely, p(x) has the obvious advantage of being an upper bound for the true density pM(x), and hence provides, upon integrating once, an upper bound for the tail probability, for every value of u. It is not known whether a similar inequality holds true for pE(x).
On the other hand, under additional conditions, both provide good first-order approximations for pM(x) as x → ∞, as we will see in the next section. In the special case in which the process X is centered and has a law that is invariant under isometries and translations, we describe below a procedure to compute p(x).
For one-parameter centered Gaussian processes having constant variance and satisfying certain regularity conditions, a general bound for pM(x) has been computed in [8], pp. 75–77. In the two-parameter case, Mercadier [26] has shown a bound for P{M > u}, obtained by means of a method especially suited to dimension 2. When the parameter is one- or two-dimensional, these bounds are sharper than the ones below, which, on the other hand, apply to any dimension but to a more restricted context. We will assume now that the process X is centered Gaussian, with a covariance function that can be written as

    E( X(s) · X(t) ) = ρ( ∥s − t∥² ),    (10)
1. E( ∂X/∂ti (t) · X(t) ) = 0,
2. E( ∂X/∂ti (t) · ∂X/∂tk (t) ) = −2ρ′ δik, and ρ′ < 0,
3. E( ∂²X/∂ti∂tk (t) · X(t) ) = 2ρ′ δik,  E( ∂²X/∂ti∂tk (t) · ∂X/∂tj (t) ) = 0,
4. E( ∂²X/∂ti∂tk (t) · ∂²X/∂ti′∂tk′ (t) ) = 4ρ″ ( δii′ δkk′ + δik′ δki′ + δik δi′k′ ),
5. ρ″ − ρ′² ≥ 0.
6. If t ∈ Sj, the conditional distribution of X″j(t) given X(t) = x, X′j(t) = 0 is the same as the unconditional distribution of the random matrix

    Z + 2ρ′ x Ij,

where Z = (Zik : i, k = 1, . . . , j) is a symmetric j × j matrix with centered Gaussian entries, independent of the pair (X(t), X′(t)), such that, for i ≤ k, i′ ≤ k′, one has:

    E( Zik Zi′k′ ) = 4 [ 2ρ″ δii′ + (ρ″ − ρ′²) ] δik δi′k′ + 4ρ″ δii′ δkk′ (1 − δik).
Let us introduce some additional notation:
Hn(x), n = 0, 1, . . . are the standard Hermite polynomials, i.e.

    Hn(x) := (−1)^n e^{x²} (dⁿ/dxⁿ) e^{−x²},

and H̄n(x), n = 0, 1, . . . are the modified Hermite polynomials,

    H̄n(x) := (−1)^n e^{x²/2} (dⁿ/dxⁿ) e^{−x²/2}.

We set

    Jn(x) := ∫_{−∞}^{+∞} e^{−y²/2} Hn(ℓ) dy,  n = 0, 1, 2, . . .    (11)

where ℓ stands for the linear form ℓ = ay + bx, a, b being real parameters that satisfy a² + b² = 1/2.
Then

    J′n(x) = 2nb Jn−1(x).    (12)

Also:

    Jn(0) = ∫_{−∞}^{+∞} e^{−y²/2} Hn(ay) dy = 0  when n is odd,

and

    J2p(0) = (−1)^p (2b)^{2p} (2p − 1)!! √(2π) = (−2b²)^p ((2p)! / p!) √(2π).    (13), (14)
Now we can go back to (12) and integrate successively for n = 1, 2, . . . on the interval [0, x], using the initial value given by (14) when n = 2p and Jn(0) = 0 when n is odd, obtaining:

    Jn(x) = √(2π) Qn(x),    (15)

where the polynomials Qn satisfy

    Q′n(x) = n Qn−1(x),    (16)
    Qn(0) = 0 if n is odd,    (17)
    Q2p(0) = (−2b²)^p (2p)! / p! if n = 2p is even.    (18)

It is now easy to show that in fact Qn(x) = H̄n(x), n = 0, 1, 2, . . ., using for example that:

    H̄n(x) = 2^{−n/2} Hn( x/√2 ).
The integrals

    In(v) := ∫_v^{+∞} e^{−t²/2} Hn(t) dt

will appear in our computations. They are computed in the next lemma, which can be proved easily using the standard properties of Hermite polynomials.
Lemma 4 (a) For n ≥ 1:

    In(v) = 2 e^{−v²/2} Σ_{k=0}^{[(n−1)/2]} 2^k ( (n − 1)!! / (n − 1 − 2k)!! ) H_{n−1−2k}(v) + 1I_{n even} 2^{n/2} (n − 1)!! √(2π) ( 1 − Φ(v) ).    (19)

(b)
(20)
(21)
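Formula (19) follows from the recursion In(v) = 2 e^{−v²/2} Hn−1(v) + 2(n − 1) In−2(v), obtained by integration by parts, with I1(v) = 2e^{−v²/2} and I0(v) = √(2π)(1 − Φ(v)). A direct numerical check (of our own making) against quadrature:

```python
import math

def phys_H(n, x):
    # physicists' Hermite via H_{n+1} = 2x H_n - 2n H_{n-1}
    h0, h1 = 1.0, 2.0 * x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, 2.0 * x * h1 - 2.0 * k * h0
    return h1

def I_numeric(n, v, upper=12.0, steps=20000):
    # Simpson's rule for  integral_v^{v+upper}  exp(-t^2/2) H_n(t) dt
    a, b = v, v + upper
    h = (b - a) / steps
    f = lambda t: math.exp(-t * t / 2.0) * phys_H(n, t)
    s = f(a) + f(b)
    for i in range(1, steps):
        s += f(a + i * h) * (4 if i % 2 else 2)
    return s * h / 3.0

def double_fact(m):
    return 1 if m <= 0 else m * double_fact(m - 2)

def I_formula(n, v):
    # right-hand side of (19); 1 - Phi(v) = erfc(v/sqrt(2))/2
    s = sum(2 ** k * double_fact(n - 1) / double_fact(n - 1 - 2 * k)
            * phys_H(n - 1 - 2 * k, v) for k in range((n - 1) // 2 + 1))
    out = 2.0 * math.exp(-v * v / 2.0) * s
    if n % 2 == 0:
        out += (2.0 ** (n / 2.0) * double_fact(n - 1)
                * math.sqrt(2.0 * math.pi) * 0.5 * math.erfc(v / math.sqrt(2.0)))
    return out

for n in range(1, 6):
    for v in (0.0, 0.8):
        num, form = I_numeric(n, v), I_formula(n, v)
        assert abs(num - form) < 1e-5 * max(1.0, abs(form))
```

Truncating the integral at v + 12 is harmless because of the Gaussian factor.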
Theorem 4 Assume that the process X is centered Gaussian and satisfies conditions A1–A5, with a covariance having the form (10) and verifying the regularity conditions of the beginning of this section. Moreover, let S be a polyhedron. Then p(x) can be expressed by means of the following formula:

    p(x) = φ(x) { Σ_{t∈S0} σ̂0(t) + Σ_{j=1}^{d0} [ |ρ′|^{j/2} H̄j(x) + Rj(x) ] gj },    (22)

where

    gj = ∫_{Sj} σ̂j(t) σj(dt),    (23)
where σ̂j(t) is the normalized solid angle of the cone Ĉt,j in Nt,j, normalized so that σ̂_{d0}(t) = 1.    (24), (25)

Notice that for convex or other usual polyhedra, σ̂j(t) is constant for t ∈ Sj, so that gj is equal to this constant multiplied by the j-dimensional geometric measure of Sj.
For j = 1, . . . , d0,

    Rj(x) = ( 2|ρ′| / π )^{j/2} Γ( (j + 1)/2 )^{−1} ∫_{−∞}^{+∞} Tj(v) exp( −y²/2 ) dy,    (26)

with

    γ := |ρ′| (ρ″)^{−1/2},    (27)
    v := (2)^{−1/2} (1 − γ²)^{−1/2} ( γ y − x ),

and

    Tj(v) := Σ_{k=0}^{j−1} ( Hk²(v) / (2^k k!) ) e^{−v²/2} − ( Hj(v) / (2^j (j − 1)!) ) I_{j−1}(v).    (28)
2 /2
qn () = e
2 /2
c2k Hk2 ()
k=0
+
ey
+ 1I{n odd
2 /2
Hn (y)dy 2
ey
2 /2
Hn (y)dy
Hn1 ()
,
+ y 2 /2
e
H
(y)dy
n1
(29)
qn+1 ()
,
n+1
(30)
Proof:
Denote by λ1, . . . , λn the eigenvalues of Gn. It is well known (Mehta [25], Kendall et al. [19]) that the joint density fn of the n-tuple of random variables (λ1, . . . , λn) is given by the formula

    fn(λ1, . . . , λn) = cn exp( −Σ_{i=1}^n λi²/2 ) Π_{1≤i<k≤n} |λk − λi|,

where cn is a known normalizing constant (a product of Gamma factors Γ(1 + i/2)). Then,

    E| det(Gn − νIn) | = E( Π_{i=1}^n |λi − ν| )
      = ∫_{Rn} Π_{i=1}^n |λi − ν| · cn exp( −Σ_{i=1}^n λi²/2 ) Π_{1≤i<k≤n} |λk − λi| dλ1 · · · dλn
      = e^{ν²/2} (cn / cn+1) ∫_{Rn} fn+1(λ1, . . . , λn, ν) dλ1 · · · dλn = e^{ν²/2} (cn / cn+1) ( qn+1(ν) / (n + 1) ).    (31)
j/2
j/2
(2 )
Xj,N
(t)
is independent of
(33)
Since the distribution of X′(t) is centered Gaussian with variance −2ρ′ Id, it follows that, if t ∈ S0:

    E( 1I_{X′(t)∈Ĉt,0} / X(t) = x ) = σ̂0(t),    (34)
and if t Sj , j 1:
E(| det(Xj (t))| 1IX
j,N (t)Ct,j
/X(t) = x, Xj (t) = 0)
8 Gj + 2
E | det(Gj Ij )| (y)dy,
This leads to the term

    φ(x) [ Σ_{t∈S0} σ̂0(t) + Σ_{j=1}^{d0} |ρ′|^{j/2} H̄j(x) gj ],    (36)

which is the product of a standard Gaussian density times a polynomial of degree d0.
Integrating once, we get, in our special case, the formula for the expectation of the EPC of the excursion set as given by [3].
The complementary term, given by

    φ(x) Σ_{j=1}^{d0} Rj(x) gj,    (37)

can be computed by means of a formula, as follows from the statement of the theorem above. These formulae will in general be quite unpleasant due to the complicated form of Tj(v). However, for low dimensions they are simple. For example:
T2 (v) = 2 2(v),
T1 (v) =
T3 (v) =
(38)
(39)
(40)
Second-order asymptotics for pM(x) as x → +∞ will be mainly considered in the next section. However, we state already that the complementary term (37) is equivalent, as x → +∞, to
12
2
3 2
x2
(41)
j+1
2
j/4
j/2
3 2
(2) (j 1)!
2j4
(42)
We are not going to go through this calculation, which is elementary but requires some work. An outline of it is the following. Replace the Hermite polynomials in the expression for Tj(v) given by (28) by the well-known expansion:

    Hj(v) = j! Σ_{i=0}^{[j/2]} (−1)^i (2v)^{j−2i} / ( i! (j − 2i)! ).    (43)
After some computation one finds that, as v → ∞,

    Tj(v) ≃ ( 2^{j−1} / (j − 1)! ) v^{2j−4} e^{−v²/2}.    (44)
Using now the definition of Rj (x) and changing variables in the integral in (26), one gets
for Rj (x) the equivalent:
12
Kj x2j4 e
2
3 2
x2
(45)
In particular, the equivalent of (37) is given by the highest order non-vanishing term in
the sum.
Consider now the case in which S is the sphere S^{d−1} and the process satisfies the same conditions as in the theorem. Even though the theorem cannot be applied directly, it is possible to deal with this example to compute p(x), performing only some minor changes. In this case, only the term that corresponds to j = d − 1 in (8) does not vanish, and Ct,d−1 = Nt,d−1, so that 1I_{X′_{d−1,N}(t)∈Ĉ_{t,d−1}} = 1 for each t ∈ S^{d−1}, and one can use invariance
d1 S d1
E | det(Z + 2 xId1 ) + (2| |)1/2 Id1 |
(2)(d1)/2
(46)
variance 2| | and independent of the tangential derivative. So, we apply the previous
computation, replacing x by x + (2| |)1/2 and obtain the expression:
p(x) = (x)
+
2 d/2
(d/2)
| | (d1)/2
H d1 (x + (2| |)1/2 y) + Rd1 (x + (2| |)1/2 y) (y)dy.
(47)
Asymptotics as x +
In this section we consider the errors in the direct and the EPC methods for large values
of the argument x. These errors are:
p(x) pM (x) =
d0
+
j=1
Sj
j,N (t)Ct,j
pE (x) pM (x) =
. 1IM >x /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt). (48)
d0
(1)j
+
j=1
Sj
j,N (t)Ct,j
(50)
Proof :
Let W be an open neighborhood of the compact subset S_v of S such that dist(W, (S∖S_{d_0})) > 0,
where dist denotes the Euclidean distance in R^d. For t ∈ S_j ∩ W^c, the density
pX(t),Xj (t) (x, 0)
can be written as the product of the density of Xj (t) at the point 0, times the conditional density
of X(t) at the point x given that Xj (t) = 0, which is Gaussian with some bounded expectation
and a conditional variance which is smaller than the unconditional variance, hence, bounded by
some constant smaller than 1. Since the conditional expectations in (48) are uniformly bounded
by some constant, due to standard bounds on the moments of the Gaussian law, one can deduce
that:
p(x) pM (x) =
W Sd0
d0 ,N (t)Ct,d0
.pX(t),Xd
as x → +∞, for some positive constant. Our next task is to choose W such that one can assure
that the first term in the right-hand side of (51) has the same form as the second, with a
possibly different constant.
To do this, for s ∈ S and t ∈ S_{d_0}, let us write the Gaussian regression formula of X(s) on the
pair (X(t), X′_{d_0}(t)):
X(s) = at (s)X(t) + bt (s), Xd 0 (t) +
ts
2
X t (s).
(52)
where the regression coefficients at (s), bt (s) are respectively real-valued and Rd0 -valued.
From now onwards, we will only be interested in those t W . In this case, since W does not
contain boundary points of S\Sd0 , it follows that
Ct,d0 = Nt,d0 and 1IX
d0 ,N (t)Ct,d0
= 1.
Moreover, whenever s ∈ S is close enough to t, necessarily s ∈ S_{d_0}, and one can show that
the Gaussian process {X^t(s) : t ∈ W ∩ S_{d_0}, s ∈ S} is bounded, in spite of the fact that its
trajectories are not continuous at s = t. For each t, {X^t(s) : s ∈ S} is a helix process; see [8]
for a proof of boundedness.
On the other hand, conditionally on X(t) = x, Xd 0 (t) = 0 the event {M > x} can be written as
{X t (s) > t (s) x, for some s S}
where
$$\beta_t(s) = \frac{2\,\big(1-a_t(s)\big)}{\|t-s\|^2}\,. \qquad (53)$$
Our next goal is to prove that if one can choose W in such a way that
$$\inf\{\beta_t(s) : t \in W \cap S_{d_0},\; s \in S,\; s \neq t\} > 0, \qquad (54)$$
then we are done. In fact, apply the Cauchy-Schwarz inequality to the conditional expectation
in (51). Under the conditioning, the elements of Xd0 (t) are the sum of affine functions of x
with bounded coefficients plus centered Gaussian variables with bounded variances, hence, the
absolute value of the conditional expectation is bounded by an expression of the form
$$Q(t,x)\;\Big(P\Big\{\sup_{s\in S\setminus\{t\}}\frac{X^t(s)}{\beta_t(s)} > x\Big\}\Big)^{1/2} \qquad (55)$$
where Q(t, x) is a polynomial in x of degree 2d0 with bounded coefficients. For each t W Sd0 ,
the second factor in (55) is bounded by
$$\Big(P\Big\{\sup\Big\{\frac{X^t(s)}{\beta_t(s)} : t \in W\cap S_{d_0},\; s\in S,\; s\neq t\Big\} > x\Big\}\Big)^{1/2} \le C_2\,\exp(-\delta_2 x^2),$$
for some positive constants C_2, δ_2 and any x > 0. Also, the same argument above for the density
pX(t),Xd (t) (x, 0) shows that it is bounded by a constant times the standard Gaussian density.
0
To finish, it suffices to replace these bounds in the first term at the right-hand side of (51).
It remains to choose W for (54) to hold true. Consider the auxiliary process
Y (s) :=
X(s)
r(s, s)
, s S.
(56)
Since Y has constant variance, X(t) and X′_{d_0}(t) are independent (differentiate under the expectation
sign). This implies that in the regression formula (52) the coefficients are easily computed, and
a_t(s) = r(s, t), which is strictly smaller than 1 if s ≠ t, because of the non-degeneracy condition.
Then
$$\beta_t(s) = \frac{2\,\big(1-r(s,t)\big)}{\|t-s\|^2} \;\ge\; \frac{2\,\big(1-r^Y(s,t)\big)}{\|t-s\|^2}\,. \qquad (57)$$
Since r^Y(s,s) = 1 for every s ∈ S, the Taylor expansion of r^Y(s,t), as a function of s around
s = t, takes the form:
$$r^Y(s,t) = 1 + \big\langle\, s-t,\; r^Y_{20,d_0}(t,t)\,(s-t)\,\big\rangle + o\big(\|s-t\|^2\big), \qquad (58)$$
(59)
where the last equality follows by differentiation in (56), putting s = t. (59) implies that
−r^Y_{20,d_0}(t,t) is uniformly positive definite for t ∈ S_v, meaning that its minimum eigenvalue has
a strictly positive lower bound. This, on account of (57) and (58), already shows that
$$\inf\{\beta_t(s) : t \in S_v,\; s \in S,\; s \neq t\} > 0, \qquad (60)$$
(61)
(1 at0 (s0 ))
= t0 (s0 ),
t0 s 0 2
d0 t s
s
2
X t (s),
where (a_t)′_{d_0}(s) is a column vector of size d_0 and (b_t)′_{d_0}(s) is a d_0 × d_0 matrix. Then one must
have a_t(t) = 1, (a_t)′_{d_0}(t) = 0. Thus
$$\beta_{t_n}(s_n) = -\,u_n^{T}\,(a_{t_0})''_{d_0}(t_0)\,u_n + o(1),$$
where u_n := (s_n − t_n)/‖s_n − t_n‖. Since t_0 ∈ S_v we may apply (61), and the limit of β_{t_n}(s_n)
cannot be non-positive.
A straightforward application of Theorem 5 is the following.
Corollary 2 Under the hypotheses of Theorem 5, there exist positive constants C, δ such that,
for every u > 0:
+
+
u
and
$$\frac{1}{\sigma_d^2} := -\lim_{x\to+\infty} \frac{2}{x^2}\,\log\big(p(x)-p_M(x)\big) \qquad (62)$$
and
$$\frac{1}{\sigma_E^2} := -\lim_{x\to+\infty} \frac{2}{x^2}\,\log\big|p_E(x)-p_M(x)\big| \qquad (63)$$
whenever these limits exist. In general, we are unable to compute the limits (62) or (63), or
even to prove that they actually exist or differ. Our more general results (as well as those in [3], [34])
only contain lower bounds for the liminf as x → +∞. This is already interesting, since it gives
upper bounds for the speed of approximation of p_M(x) by either p(x) or p_E(x). On the
other hand, in Theorem 7 below, we are able to prove the existence of the limit and compute
σ_d² for a relevant class of Gaussian processes.
For the next theorem we need an additional condition on the parameter set S. For S
verifying A1 we define
(S) = sup
sup
0jd0 tSj
sup
sS,s=t
(64)
t2 := sup
and
$$\kappa_t := \sup_{s\in S\setminus\{t\}} \frac{\operatorname{dist}\big(\Lambda_t^{-1}\, r_{01}(s,t),\; C_{t,j}\big)}{1-r(s,t)}\,, \qquad (66)$$
where Λ_t := Var(X′(t)), λ(t) is the maximum eigenvalue of Λ_t, and, in (66), j is such that
t ∈ S_j (j = 0, 1, …, d_0).
The quantity on the right-hand side of (65) is strictly bigger than 1.
Remark. In formula (65) it may happen that the denominator on the right-hand side is
identically zero, in which case we put +∞ for the infimum. This is the case for the one-parameter
process X(t) = ξ cos t + η sin t, where ξ, η are standard independent Gaussian random variables,
and S is an interval of length strictly smaller than π.
Proof of Theorem 6
Let us first prove that sup_{t∈S} κ_t < ∞.
For each t ∈ S, let us write the Taylor expansion
$$r_{01}(s,t) = r_{01}(t,t) + r_{11}(t,t)\,(s-t) + O\big(\|s-t\|^2\big) = \Lambda_t\,(s-t) + O\big(\|s-t\|^2\big),$$
L3
1 r(s, t)
+ L4 ,
(67)
L3 (S) + L4 .
1 r(s, t)
With the same notations as in the proof of Theorem 5, using (4) and (8), one has:
d0
+
j=1
Sj
j,N (t)Ct,j .
t2
1
,
+ (t)2t
T
{M > x} and {Rt (s) > (1 r(s, t))x r01
(s, t)1
t Xj,N (t) for some s S}
(69)
coincide.
Denote by (X′_{j,N}(t) | X′_j(t) = 0) the regression of X′_{j,N}(t) on the condition X′_j(t) = 0.
So, the probability in (69) can be written as
Cbt,j
P{ t (s) > x
T (s, t)1 x
r01
(70)
where
t (s) :=
Rt (s)
1 r(s, t)
If 1
t r01 (s, t) Ct,j one has
T
r01
(s, t)1
t x 0
1
t r01 (s, t) = z + z
So, if x Ct,j :
T (s, t)1 x
r01
z T x + z T x
t
=
t x
1 r(s, t)
1 r(s, t)
using that z T x 0 and the Cauchy-Schwarz inequality. It follows that in any case, if x Ct,j
the expression in (70) is bounded by
Cbt,j
(71)
To obtain a bound for the probability in the integrand of (71), we will use the classical
inequality for the tail of the distribution of the supremum of a Gaussian process with bounded
paths.
The Gaussian process (s,t) ↦ β_t(s), defined on (S×S)∖{s=t}, has continuous paths. As
the pair (s,t) approaches the diagonal of S×S, β_t(s) may not have a limit but, almost surely,
it is bounded (see [8] for a proof). (For fixed t, β_t(·) is a helix process with a singularity at
s = t, a class of processes that we have already met above.)
We set
m_t(s) := E(β_t(s))  (s ≠ t),
m* := sup_{s,t∈S, s≠t} |m_t(s)|,
μ := E( sup_{s,t∈S, s≠t} |β_t(s) − m_t(s)| ).
The almost sure boundedness of the paths of β_t(s) implies that m* < ∞ and μ < ∞. Applying the
Borell–Sudakov–Tsirelson type inequality (see for example Adler [2] and references therein) to
the centered process s ↦ β_t(s) − m_t(s) defined on S∖{t}, we get, whenever x − κ_t x − m* > 0:
$$P\{\beta_t(s) > x - \kappa_t x \ \text{for some } s\in S\} \le \exp\Big(-\frac{(x-\kappa_t x - m^*)^2}{2\,\sigma_t^2}\Big).$$
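As an aside (not in the original), the Borell–Sudakov–Tsirelson concentration bound P{sup X − E sup X > x} ≤ exp(−x²/(2σ²)), with σ² the maximal pointwise variance, can be illustrated by simulation on the elementary stationary process X(t) = ξ cos t + η sin t, for which σ² = 1; all names below are ad hoc:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2 * np.pi, 400)

# 50000 sample paths of X(t) = xi*cos t + eta*sin t; Var X(t) = 1 for all t
sups = np.array([np.max(a * np.cos(t) + b * np.sin(t))
                 for a, b in rng.standard_normal((50000, 2))])
m = sups.mean()  # Monte Carlo proxy for E sup X

for x in (0.5, 1.0, 2.0):
    emp = float(np.mean(sups > m + x))
    assert emp <= np.exp(-x * x / 2)  # Borell-TIS bound with sigma^2 = 1
```

Here sup_t X(t) = (ξ² + η²)^{1/2}, so the empirical tail could also be checked against the Rayleigh distribution.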
jd
2
x mj,N (t)
exp
2j (t)
where λ_j(t) and λ̄_j(t) are respectively the minimum and maximum eigenvalues of
Var(X′_{j,N}(t) | X′_j(t)), and m_{j,N}(t) is the conditional expectation E(X′_{j,N}(t) | X′_j(t) = 0).
Notice that λ_j(t), λ̄_j(t), m_{j,N}(t) are bounded, λ_j(t) is bounded below by a positive constant,
and λ̄_j(t) ≤ λ(t).
P {Xj,N
Ct,j } {M > x}/X(t) = x, Xj (t) = 0
(2j (t))
jd
2
exp
x mj,N (t) 2
(x t x m )2
dx
+
2t2
2(t)
xm
+ P Xj,N
(t)|Xj (t) = 0
, (72)
t
where it is understood that the second term in the right-hand side vanishes if t = 0.
Let us consider the first term in the right-hand side of (72). We have:
x mj,N (t)
(x t x m )2
+
2t2
2(t)
2
(x t x m )2 ( x mj,N (t) )
+
2t2
2(t)
(x m t mj,N (t) )2
2
,
= A(t) x + B(t)(x m ) + C(t) +
2t2 + 2(t)2t
where the last inequality is obtained after some algebra, A(t), B(t), C(t) are bounded functions
and A(t) is bounded below by some positive constant.
So the first term in the right-hand side of (72) is bounded by :
2.(2j )
jd
2
exp
(x m t mj,N (t))2
2t2 + 2(t)2t
Rdj
dx
(x m t mj,N (t) )2
2t2 + 2(t)2t
(73)
where L is some constant. The last inequality follows easily using polar coordinates.
Consider now the second term in the right-hand side of (72). Using the form of the conditional
density p_{X′_{j,N}(t) | X′_j(t)=0}(x′), it follows that it is bounded by
P
(Xj,N
(t)/Xj (t)
= 0)
mj,N (t)
x m t mj,N (t)
t
L1 |x|dj2 exp
(x m t mj,N (t) )2
2(t)2t
where L1 is some constant. Putting together (73) and (74) with (72), we obtain (69).
The following two corollaries are straightforward consequences of Theorem 6:
(74)
tS 2
t
x+
1
.
+ (t)2t
d0
p (x) =
j=0
Examples
1 v (t0 )/2
1/k
kCk
E || 2k 1 x11/k (x),
(75)
where ξ is a standard normal random variable and
$$C_k = \frac{1}{(2k)!}\Big(v^{(2k)}(t_0) + \tfrac14\,[v''(t_0)]^2\,\mathbf 1_{k=2}\Big).$$
The proof is a direct application of the Laplace method. The result is new for the density of the
maximum, but if we integrate the density from u to +∞, the corresponding bound for P{M > u}
is known under weaker hypotheses (Piterbarg [28]).
2) Let the process X be centered and satisfy A1–A5. Assume that the law of the process
is isotropic and stationary, so that the covariance has the form (10) and verifies the regularity
condition of Section 4. We add the simple normalization −ρ′(0) = 1/2. One can easily
check that
$$\sigma_t^2 = \sup_{s\in S\setminus\{t\}} \frac{1-\rho^2\big(\|s-t\|^2\big) - 4\,\rho'^2\big(\|s-t\|^2\big)\,\|s-t\|^2}{\big[1-\rho\big(\|s-t\|^2\big)\big]^2}\,. \qquad (76)$$
Furthermore, if
$$\rho'(x) \le 0 \quad\text{for } x \ge 0, \qquad (77)$$
one can show that the sup in (76) is attained as ‖s−t‖ → 0 and is independent of t. Its value
is
$$\sigma_t^2 = 12\,\rho''(0) - 1.$$
The proof is elementary (see [4] or [34]).
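As an illustration (not part of the original argument), take ρ(x) = e^{−x/2}: it satisfies −ρ′(0) = 1/2 and ρ′ ≤ 0, and 12ρ″(0) − 1 = 2. Assuming the expression under the sup in (76) is (1 − ρ²(h) − 4ρ′²(h)h)/[1 − ρ(h)]² with h = ‖s−t‖², one can check numerically that it stays below 2 and attains 2 only in the limit h → 0:

```python
import numpy as np

rho = lambda x: np.exp(-x / 2)        # -rho'(0) = 1/2, rho''(0) = 1/4
drho = lambda x: -np.exp(-x / 2) / 2  # rho'(x) <= 0: condition (77) holds

def ratio(h):
    """Quantity inside the sup in (76), with h = ||s - t||^2 (illustrative)."""
    return (1 - rho(h) ** 2 - 4 * drho(h) ** 2 * h) / (1 - rho(h)) ** 2

h = np.logspace(-4, 1, 200)
assert np.all(ratio(h) <= 2.0 + 1e-9)   # sup equals 12*rho''(0) - 1 = 2
assert abs(ratio(1e-4) - 2.0) < 1e-3    # attained in the limit h -> 0
```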
Let S be a convex set. For t ∈ S_j, s ∈ S:
$$\operatorname{dist}\big(r_{01}(s,t),\, C_{t,j}\big) = \operatorname{dist}\big(-2\,\rho'\big(\|s-t\|^2\big)\,(t-s),\; C_{t,j}\big)\,. \qquad (78)$$
The convexity of S implies that (t−s) ∈ C_{t,j}. Since C_{t,j} is a convex cone and −2ρ′(‖s−t‖²) ≥ 0,
one can conclude that r_{01}(s,t) ∈ C_{t,j}, so that the distance in (78) is equal to zero. Hence
κ_t = 0 for every t ∈ S,
and an application of Theorem 6 gives the inequality
$$\liminf_{x\to+\infty} -\frac{2}{x^2}\,\log\big(p(x)-p_M(x)\big) \;\ge\; 1 + \frac{1}{12\,\rho''(0)-1}\,. \qquad (79)$$
A direct consequence is that the same inequality holds true when p(x) − p_M(x) is replaced by
|p_E(x) − p_M(x)| in (79), thus obtaining the main explicit example in Adler and Taylor [3], or in
Taylor et al. [34].
Next, we improve (79). In fact, under the same hypotheses, we prove that the liminf is an
ordinary limit and the inequality is an equality. We state this as
Theorem 7 Assume that X is centered, satisfies hypotheses A1–A5, and that the covariance has
the form (10) with −ρ′(0) = 1/2 and ρ′(x) ≤ 0 for x ≥ 0. Let S be a convex set, and d_0 = d.
Then
$$\lim_{x\to+\infty} -\frac{2}{x^2}\,\log\big(p(x)-p_M(x)\big) \;=\; 1 + \frac{1}{12\,\rho''(0)-1}\,. \qquad (80)$$
Remark Notice that since S is convex, the added hypothesis that the maximum dimension d0
such that Sj is not empty is equal to d is not an actual restriction.
Proof of Theorem 7
In view of (79), it suffices to prove that
$$\limsup_{x\to+\infty} -\frac{2}{x^2}\,\log\big(p(x)-p_M(x)\big) \;\le\; 1 + \frac{1}{12\,\rho''(0)-1}\,. \qquad (81)$$
Using (4) and the definition of p(x) given by (8), one has the inequality
p(x) pM (x) (2)d/2 (x)
Sd
(82)
where our lower bound only contains the term corresponding to the largest dimension, and we
have already replaced the density p_{X(t),X′(t)}(x,0) by its explicit expression using the law of the
process. Under the condition {X(t) = x, X′(t) = 0}, if v_0ᵀ X″(t) v_0 > 0 for some v_0 ∈ S^{d−1}, a
Taylor expansion implies that M > x. It follows that
E | det(X (t))| 1IM >x /X(t) = x, X (t) = 0
E | det(X (t))| 1I
vS d1
We now apply Lemma 2, which describes the conditional distribution of X″(t) given X(t) =
x, X′(t) = 0. Using the notations of this lemma, we may write the right-hand side of (83) as:
E | det(Z xId)| 1I
sup v T Zv > x
vS d1
=
x
y2
dy, (84)
2 2
y
Z12
... ...
Z1d
2 + y Z23 . . .
Z2d
Z :=
,
..
.
d + y
where the random variables {2 , . . . , d , Zik , 1 i < k d} are independent centered Gaussian
with
Var(Zik ) = 4 (1 i < k d) ; Var(i ) =
4 1
16 (8 1)
(i = 2, . . . , d) ; =
12 1
12 1
1
2
exp(
x
L
y2
)E | det(ZxId)| dy
2
2
2
x(1+0 )+1
exp(
x(1+0 )
y2
)0 (1(1+0 ))d1 xd dy
2 2
for x large enough. On account of (82), (83), (84), we conclude that, for x large enough,
$$p(x) - p_M(x) \ge L_1\, x^d\, \exp\Big(-\frac{x^2}{2} - \frac{\big(x(1+\delta_0)+1\big)^2}{2\,\sigma^2}\Big)$$
for some new positive constant L_1. Since δ_0 can be chosen arbitrarily small, this implies (81).
3) Consider the same processes as in Example 2, but now defined on the non-convex set
{a ≤ ‖t‖ ≤ b}, 0 < a < b. The same calculations as above show that κ_t = 0 if a < ‖t‖ ≤ b,
and that κ_t is given by a certain maximum for ‖t‖ = a.
4) Let us keep the same hypotheses as in Example 2, but without assuming that the covariance
is decreasing as in (77). The variance σ_t² is still given by (76), but κ_t is not necessarily equal
to zero. More precisely, relation (78) shows that
$$\kappa_t \le \sup_{s\in S\setminus\{t\}} \frac{2\,\big(\rho'(\|s-t\|^2)\big)^{+}\,\|s-t\|}{1-\rho\big(\|s-t\|^2\big)}\,.$$
The normalization −ρ′(0) = 1/2 implies that the process X is "identity speed", that is,
Var(X′(t)) = I_d, so that λ(t) = 1. An application of Theorem 6 gives
$$\liminf_{x\to+\infty} -\frac{2}{x^2}\,\log\big(p(x)-p_M(x)\big) \;\ge\; 1 + 1/\mathcal Z\,, \qquad (85)$$
where
$$\mathcal Z := \sup_{z\in(0,\delta]} \frac{1-\rho^2(z^2) - 4\,\rho'^2(z^2)\,z^2}{\big[1-\rho(z^2)\big]^2} \;+\; \max_{z\in(0,\delta]} \frac{4\,\big[\big(\rho'(z^2)\big)^{+}\big]^2\, z^2}{\big[1-\rho(z^2)\big]^2}\,,$$
and δ is the diameter of S.
5) Suppose that
– the process X is stationary, with covariance Γ(t) := Cov(X(s), X(s+t)) satisfying
Γ(s_1, …, s_d) = ∏_{i=1,…,d} Γ_i(s_i), where Γ_1, …, Γ_d are d covariance functions on R which are
monotone, positive on [0, +∞) and of class C⁴;
– S is a rectangle
$$S = \prod_{i=1,\dots,d} [a_i, b_i]\,, \qquad a_i < b_i\,.$$
Then, adding an appropriate non-degeneracy condition, conditions A2–A5 are fulfilled and
Theorem 6 applies.
It is easy to see that
$$r_{01}(s,t) = \begin{pmatrix} \Gamma_1'(s_1-t_1)\,\Gamma_2(s_2-t_2)\cdots\Gamma_d(s_d-t_d)\\ \vdots\\ \Gamma_1(s_1-t_1)\cdots\Gamma_{d-1}(s_{d-1}-t_{d-1})\,\Gamma_d'(s_d-t_d) \end{pmatrix}$$
belongs to C_{t,j} for every s ∈ S. As a consequence, κ_t = 0 for all t ∈ S. On the other hand,
standard regression formulae show that
$$\frac{\operatorname{Var}\big(X(s)\mid X(t),\,X'(t)\big)}{\big(1-r(s,t)\big)^2} = \frac{1-\Gamma_1^2\cdots\Gamma_d^2-\Gamma_1'^2\,\Gamma_2^2\cdots\Gamma_d^2-\cdots-\Gamma_1^2\cdots\Gamma_{d-1}^2\,\Gamma_d'^2}{\big(1-\Gamma_1\cdots\Gamma_d\big)^2}\,,$$
where the Γ_i and Γ_i′ are evaluated at s_i − t_i.
References
[1] Adler, R.J. (1981). The Geometry of Random Fields. Wiley, New York.
[2] Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA.
[3] Adler, R.J. and Taylor, J.E. (2005). Random Fields and Geometry. Book to appear.
[4] Azaïs, J-M., Bardet, J-M. and Wschebor, M. (2002). On the tails of the distribution of the maximum of a smooth stationary Gaussian process. ESAIM: P. and S., 6, 177-184.
[5] Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the maximum of Gaussian random fields. Extremes, 5(2), 181-212.
[6] Azaïs, J-M. and Wschebor, M. (2002). The distribution of the maximum of a Gaussian process: Rice method revisited. In and Out of Equilibrium: Probability with a Physical Flavour, Progress in Probability, 321-348, Birkhäuser.
[7] Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
[8] Azaïs, J-M. and Wschebor, M. (2005). On the distribution of the maximum of a Gaussian field with d parameters. Annals of Applied Probability, 15 (1A), 254-278.
[9] Azaïs, J-M. and Wschebor, M. (2006). A self-contained proof of the Rice formula for random fields. Preprint available at http://www.lsp.ups-tlse.fr/Azais/publi/completeproof.pdf.
[10] Belyaev, Y. (1966). On the number of intersections of a level by a Gaussian stochastic process. Theory Prob. Appl., 11, 106-113.
[11] Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a Gaussian process with stationary increments. J. Appl. Prob., 22, 454-460.
[12] Berman, S.M. (1992). Sojourns and Extremes of Stochastic Processes. Wadsworth and Brooks, Probability Series.
[13] Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30, 207-216.
[14] Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci. Paris, Ser. I, 337, 663-666.
[15] Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
[16] Cucker, F. and Wschebor, M. (2003). On the expected condition number of linear programming problems. Numer. Math., 94, 419-478.
[17] Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour (1974). Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
[18] Fyodorov, Y. (2004). Complexity of random energy landscapes, glass transition, and absolute value of the spectral determinant of random matrices. Physical Review Letters, 92, 240601 (4 pages); Erratum: ibid., 93 (2004), 149901 (1 page).
[19] Kendall, M.G., Stuart, A. and Ord, J.K. (1987). The Advanced Theory of Statistics, Vol. 3.
[20] Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369-378.
[21] Ledoux, M. (2001). The Concentration of Measure Phenomenon. American Math. Soc., Providence, RI.
[22] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer-Verlag, New York.
[23] Marcus, M.B. (1977). Level crossings of a stochastic process with absolutely continuous sample paths. Ann. Probab., 5, 52-71.
[24] Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc. Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
[25] Mehta, M.L. (2004). Random Matrices, 3rd ed. Academic Press.
[26] Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of one- and two-dimensional processes. To appear in Advances in Applied Probability, 38 (1).
[27] Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Prob. Appl., 26, 687-705.
[28] Piterbarg, V.I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island.
[29] Rychlik, I. (1990). New bounds for the first passage, wave-length and amplitude densities. Stochastic Processes and their Applications, 34, 313-339.
[30] Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties of half-spaces for spherically invariant measures (in Russian). Zap. Nauchn. Sem. LOMI, 45, 75-82.
[31] Sun, J. (1993). Tail probabilities of the maxima of Gaussian random fields. Ann. Probab., 21, 34-71.
[32] Talagrand, M. (1996). Majorizing measures: the general chaining. Ann. Probab., 24, 1049-1103.
[33] Taylor, J.E. and Adler, R.J. (2003). Euler characteristics for Gaussian fields on manifolds. Ann. Probab., 31, 533-563.
[34] Taylor, J.E., Takemura, A. and Adler, R.J. (2005). Validity of the expected Euler characteristic heuristic. Ann. Probab., 33(4), 1362-1396.
[35] Wschebor, M. (1985). Surfaces aléatoires. Mesure géométrique des ensembles de niveau. Lecture Notes in Mathematics, 1147, Springer-Verlag.
bardet@cict.fr,
wscheb@fcien.edu.uy
February 7, 2002
Abstract
We study the tail of the distribution of the maximum of a stationary Gaussian
process on a bounded interval of the real line. Under regularity conditions, we give
an additional term for this asymptotics.
(t)dt,
Piterbarg (1981, Theorem 2.2) proved (under the weaker condition λ₄ < ∞ instead of
λ₈ < ∞) that, for each T > 0 and any u ∈ R:
$$\Big|\, 1 - \Phi(u) + \frac{\sqrt{\lambda_2}\,T}{2\pi}\, e^{-u^2/2} - P(M > u) \,\Big| \le B\,\exp\Big(-\frac{u^2(1+\delta)}{2}\Big) \qquad (1)$$
for some positive constants B and δ. It is easy to see (see for example Miroshin, 1974)
that the expression inside the modulus is non-negative, so that in fact:
$$0 \le 1 - \Phi(u) + \frac{\sqrt{\lambda_2}\,T}{2\pi}\, e^{-u^2/2} - P(M > u) \le B\,\exp\Big(-\frac{u^2(1+\delta)}{2}\Big). \qquad (2)$$
The problem of improving relation (2) does not seem to have been solved in a satisfactory manner until now. A crucial step has been done by Piterbarg in the same paper
(Theorem 3.1) in which he proved that if T is small enough, then as u +:
T
2
4
3 3(4 22 )9/2
P (M > u) = 1(u)+
u [1 + o(1)] .
T (u)
9/2
5
2
4 22
22 (2 6 24 ) u
(3)
The same result has been obtained by other methods (Azaïs and Bardet, 2000; see also
Azaïs et al., 1999).
However, Piterbarg's equivalent (3) is of limited interest for applications, since it contains
no information on the meaning of the expression "T small enough".
The aim of this paper is to show that formula (3) is in fact valid for any length T
under appropriate conditions that will be described below.
Consider the function F(t) defined by
$$F(t) := \frac{\lambda_2\,\big(1-r(t)\big)^2}{\lambda_2\big(1-r^2(t)\big) - r'^2(t)}\,.$$
Lemma 1 The even function F is well defined, has a continuous extension at zero, and
$$F(0) = \frac{\lambda_2^2}{\lambda_4-\lambda_2^2}\,;\qquad F'(0) = 0\,;\qquad 0 < F''(0) = \frac{\lambda_2\,(\lambda_2\lambda_6-\lambda_4^2)}{9\,(\lambda_4-\lambda_2^2)^2} < \infty\,.$$
Proof:
The denominator of F(t) is equal to (1 − r²(t)) · Var(X′(0) | X(0), X(t)), and is thus non-zero
due to the non-degeneracy hypothesis.
A direct Taylor expansion gives the value of F(0).
The expression of F′(t) below shows that F′(0) = 0 and gives the value of F″(0):
F (t) =
(4)
Note that λ₄ − λ₂² can vanish only if there exists some real ω such that μ({−ω}) =
μ({ω}) = 1/2. Similarly, λ₂λ₆ − λ₄² can vanish only if there exist some real ω and
p ≥ 0 such that μ({−ω}) = μ({ω}) = p, μ({0}) = 1 − 2p. These cases are excluded
by the non-degeneracy hypothesis.
We will say that the function F satisfies hypothesis (H) if it has a unique minimum at
t = 0. The next proposition contains some sufficient conditions for this to take place.
Proposition 1 (a) If r′(t) < 0 for 0 < t ≤ T then (H) is satisfied.
(b) Suppose that X is defined on the whole line and that
– λ₄ > 2λ₂²;
– r(t), r′(t) → 0 as t → ∞;
– there exists no local maximum of r(t) (other than at t = 0) with value greater than or
equal to (λ₄ − 2λ₂²)/λ₄.
Then (H) is satisfied for every T > 0.
An example of a process satisfying condition (b) but not condition (a) is given by the
covariance
$$r(t) := \frac{1+\cos(\varepsilon t)}{2}\; e^{-t^2/2}\,,$$
if we choose ε sufficiently small. In fact, a direct computation gives λ₂ = 1 + ε²/2 and
λ₄ = 3 + 3ε² + ε⁴/2, so that
$$\frac{\lambda_4-2\lambda_2^2}{\lambda_4} = \frac{1+\varepsilon^2}{3+3\varepsilon^2+\varepsilon^4/2}\,.$$
On [0, ∞), the covariance attains its second largest local maximum near t = 2π/ε,
so that its value is smaller than exp(−2π²/ε²). Hence, choosing ε sufficiently small, the
conditions in (b) are satisfied.
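The spectral moments quoted above can be confirmed by finite differences (λ₂ = −r″(0), λ₄ = r⁗(0)); this check is illustrative and not part of the original text:

```python
import numpy as np

eps = 0.3
r = lambda t: (1 + np.cos(eps * t)) / 2 * np.exp(-t * t / 2)

h = 0.01
# central differences: lambda_2 = -r''(0), lambda_4 = r''''(0)
d2 = (r(h) - 2 * r(0.0) + r(-h)) / h ** 2
d4 = (r(2 * h) - 4 * r(h) + 6 * r(0.0) - 4 * r(-h) + r(-2 * h)) / h ** 4

assert abs(-d2 - (1 + eps ** 2 / 2)) < 1e-3                # lambda_2
assert abs(d4 - (3 + 3 * eps ** 2 + eps ** 4 / 2)) < 1e-3  # lambda_4
```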
Proofs
Notations:
p_ξ(x) is the density (when it exists) of the random variable ξ at the point x ∈ R^n.
1I_C denotes the indicator function of the event C.
U_u([a,b]), u ∈ R, is the number of upcrossings on the interval [a,b] of the level u
by the process X, defined as follows:
$$U_u([a,b]) = \#\{t \in [a,b] : X(t) = u,\ X'(t) > 0\}.$$
For k a positive integer, ν_k(u,[a,b]) is the k-th order factorial moment of U_u([a,b]):
$$\nu_k(u,[a,b]) = E\big(U_u([a,b])\,\big(U_u([a,b])-1\big)\cdots\big(U_u([a,b])-k+1\big)\big).$$
We define also
$$\tilde\nu_k(u,[a,b]) = E\big(U_u([a,b])\cdots\big(U_u([a,b])-k+1\big)\;1\mathrm I_{\{X(a)\le u\}}\big).$$
$$h(u) = \int_0^{T} g(t)\,\exp\Big(-\frac12\,u^2 f(t)\Big)\,dt.$$
Then, as u → ∞:
$$h(u) \sim \frac{g^{(k)}(t^\ast)}{k!}\;\Big[\int_J x^k \exp\Big(-\frac14\,f''(t^\ast)\,x^2\Big)\,dx\Big]\;\frac{1}{u^{k+1}}\,\exp\Big(-\frac12\,u^2 f(t^\ast)\Big). \qquad (5)$$
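To illustrate the kind of Laplace-method estimate used in this lemma (a sketch, not the paper's proof), take f(t) = t² and g ≡ 1, so that t* = 0, k = 0, and the boundary approximation reads h(u) ≈ √(π/2)/u:

```python
import numpy as np

def h(u, T=1.0, n=200001):
    """h(u) = int_0^T exp(-u^2 t^2 / 2) dt, by the trapezoid rule."""
    t = np.linspace(0.0, T, n)
    y = np.exp(-u * u * t * t / 2.0)
    return float(np.sum((y[1:] + y[:-1]) / 2.0) * (T / (n - 1)))

for u in (10.0, 30.0, 100.0):
    laplace = np.sqrt(np.pi / 2.0) / u  # boundary-minimum Laplace approximation
    assert abs(h(u) / laplace - 1.0) < 1e-3
```

The agreement improves rapidly as u grows, which is exactly the regime in which the lemma is applied.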
Write:
$$h(u) = \int_{t^\ast-\delta}^{t^\ast+\delta} g(t)\,\exp\Big(-\frac12 u^2 f(t)\Big)\,dt \;+\; \int_{[0,T]\cap\{|t-t^\ast|\ge\delta\}} g(t)\,\exp\Big(-\frac12 u^2 f(t)\Big)\,dt.$$
For a > 0 one has, as u → +∞,
$$\int_u^{+\infty} \exp\Big(-\frac12\,a y^2\Big)\,dy = \Big[\frac{1}{au} - \frac{1}{a^2 u^3} + \frac{3}{a^3 u^5} + O\Big(\frac{1}{u^7}\Big)\Big]\,\exp\Big(-\frac12\,a u^2\Big), \qquad (6)$$
where the O(1/u⁷) term is bounded by K/u⁷, K a constant depending only on a₀.
Proof of Theorem 1:
Step 1: The proof is based on an extension of Piterbarg's result to intervals of any
length. Let δ > 0; the following relation is clear:
$$P(M_{[0,\delta]} > u) = P(X(0) > u) + P\big(U_u([0,\delta])\cdot \mathbf 1_{\{X(0)\le u\}} \ge 1\big) = 1 - \Phi(u) + P\big(U_u([0,\delta]) \ge 1\big) - P\big(U_u([0,\delta])\cdot \mathbf 1_{\{X(0)>u\}} \ge 1\big).$$
In the sequel, a term will be called negligible if it is
$$O\Big(u^{-6}\exp\Big(-\frac12\,\frac{\lambda_4\,u^2}{\lambda_4-\lambda_2^2}\Big)\Big)$$
as u → +∞. We use the following relations, to be proved later:
1 (u, [0, T ]) =
dx
u
(8)
m
m
+ m
,
1
(u) T
2
dt
2 F
e 2 F y dy
2 0
u
2
r 2 F y 2
r F
(1 r)u2
exp
exp
2(1 + r)
22 (1 r 2 )
2 (1 r 2 )
u
1 (u, [0, T ]) =
B(t, u)dt,
0
where r, r′ and F stand for r(t), r′(t) and F(t) respectively. Clearly, since r″(0) = −λ₂ <
0, there exists T₀ such that r′ < 0 on (0, T₀]. Divide the integral into two parts: [0, T₀]
and [T₀, T]. Using formula (6) on [0, T₀] we get
(u)
2
B(t, u) =
2 F 5/2
2 (1 r)2 3
u + O u5 (u) ,
r 2
4 u 2
2(4 22 )
On the other hand, since inf_{t∈[T₀,T]} F(t) is strictly larger than F(0), it follows easily from
$$\int_u^{+\infty} \exp\Big(-\frac{a y^2}{2}\Big)\,dy \le (\text{const})\,\frac{1}{\sqrt a}\,\exp\Big(-\frac{a u^2}{2}\Big), \qquad a > 0,\ u \ge 0,$$
that
$$\int_{T_0}^{T} B(t,u)\,dt$$
is negligible.
Step 3: Proof of (ii). Use once more Markov's inequality:
$$P\big(U_u([0,\delta])\,U_u([\delta,2\delta]) \ge 1\big) \le E\big(U_u([0,\delta])\,U_u([\delta,2\delta])\big).$$
Because of the Rice formula (Cramér and Leadbetter, 1967), this expectation equals
$$\int_0^{2\delta} \big(t \wedge (2\delta - t)\big)\,A_t(u)\,dt, \qquad (9)$$
with
$$A_t(u) = E\big(X'^{+}(0)\,X'^{+}(t) \mid X(0) = X(t) = u\big)\;p_{X(0),X(t)}(u,u).$$
It is proved in Azas et al. (1999) that
At (u) =
1
2
1 r2
u
1+r
with
T1 (t, u) = 1 + (b)(kb),
+
2
T2 (t, u) = 2( )
(kx)(x)dx
b
T3 (t, u) = 2(kb)(b)
= (t, u) = IE X (0)|X(0) = X(t) = u) =
r
u
1+r
(10)
r (1 r 2 ) rr 2
2 (1 r 2 ) r 2
1+
; b = b(t, u) = /,
1
k = k(t) =
z
(z) =
r 2
1 r2
(v)dv,
0
Since T₁(u) + T₂(u) + T₃(u) is non-negative, majorizing Φ̄(kb) and φ(kb) by 1 we get
2
1 r 2 (t)
At (u) (const)
(1 + )
k
1
k
k3
+ k3 + 2 + 7 + 6 + 4
b
b
b
b
b
exp
1
1 + F (t) u2 .
2
so that
2 (1 + )
t3
(const) ;
u
b 1 r 2 (t)
2 3
k
(const)t4 ;
2
1 r (t)
2 k
(const)t2 u2 ;
2
2
b 1 r (t)
and also that the other terms are negligible. Then, applying Lemma 2, the integral over
[0, T₀] is O(u^{-6} exp(−λ₄u²/(2(λ₄ − λ₂²)))), thus negligible.
For t ≥ T₀, remark that T₁(u) + T₂(u) + T₃(u) does not change when μ (and consequently
b) changes sign. Thus μ and b can be supposed to be non-negative. Forgetting the negative
terms in formula (10), and majorizing Φ̄ by 1, 1 − Φ(b) by (const)·φ(b), and μ by (const)·u,
we get:
$$A_t(u) \le (\text{const})\,\sigma^2\,\frac{u}{1+r}\;\exp\Big(-\frac12\,\big(1+F(t)\big)\,u^2\Big).$$
We conclude as in Step 2.
< 0.
(11)
22
1 r(t )
>
1 + r(t )
4 22
Remark: The proofs above show that, even if hypothesis (H) is not satisfied, it is
still possible to improve inequality (2). In fact, it remains true for every δ such that
δ < min_{t∈[0,T]} F(t).
Acknowledgment. The authors thank Professors P. Carmona and C. Delmas for useful
talks on the subject of this paper.
References
Azaïs, J-M., Cierco-Ayrolles, C. and Croquette, A. (1999). Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM Probab. Statist., 3, 107-129.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Miroshin, R.N. (1974). Rice series in the theory of random functions. Vestnik Leningrad Univ. Math., 1, 143-155.
Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Th. Prob. Appl., 26, 687-705.
URL: http://www.emath.fr/ps/
Résumé (translated from the French): In this article we use Rice's method (Rice, 1944-1945) to obtain bounds for the distribution function of the maximum of a regular stationary Gaussian process. We derive simplified expressions for the first two terms of the Rice series (Miroshin, 1974; Azaïs and Wschebor, 1997), which suffice for the desired bounds. Our main contribution is a simpler form of the second factorial moment of the number of upcrossings, which is, in a sense, a generalization of the formula of Steinberg et al. (Cramér and Leadbetter, 1967, p. 212). We then present a numerical application and asymptotic expansions that give a new interpretation of a result of Piterbarg (1981).
AMS Subject Classification. 60Exx, 60Gxx, 60G10, 60G15, 60G70, 62E17, 65U05.
Received June 4, 1998. Revised June 8, 1999.
1. Introduction
1.1. Framework
Many statistical models involve nuisance parameters. This is the case for example for mixture models [10],
gene detection models [5,6], projection pursuit [20]. In such models, the distributions of test statistics are those
of the maximum of stochastic Gaussian processes (or their squares). Dacunha-Castelle and Gassiat [8] give for
example a theory for the so-called locally conic models.
Thus, the calculation of threshold or power of such tests leads to the calculation of the distribution of the
maximum of Gaussian processes. This problem is largely unsolved [2].
Keywords and phrases: Asymptotic expansions, extreme values, stationary Gaussian process, Rice series, upcrossings.
This paper is dedicated to Mario Wschebor in the occasion of his 60th birthday.
Miroshin [13] expressed the distribution function of this maximum as the sum of a series, the
so-called Rice series. Recently, Azaïs and Wschebor [3, 4] proved the convergence of this series
under certain conditions and proposed a method giving the exact distribution of the maximum
for a class of processes including smooth stationary Gaussian processes with real parameter.
The formula given by the Rice series is rather complicated, involving multiple integrals with
complex expressions. Fortunately, for some processes the convergence is very fast, so the present
paper studies the bounds given by the first two terms, which are in some cases sufficient for
applications.
We give identities that yield simpler expressions of these terms in the case of stationary
processes. Generalization to other processes is possible using our techniques, but will not be
detailed, for the sake of brevity and simplicity.
For other processes, the calculation of more than two terms of the Rice series is necessary. In such a case,
the identities contained in this paper (and other similar) give a list of numerical tricks used by a program under
construction by Croquette.
We then use Maple to derive asymptotic expansions of some terms involved in these bounds. Our bounds
are shown to be sharp and our expansions are made for a fixed time interval and a level tending to infinity.
Other approaches can be found in the literature [12]. For example, Kratz and Rootzen [11] propose asymptotic
expansions for a size of time interval and a level tending jointly to infinity.
We consider a real valued centred stationary Gaussian process with continuous paths X = {Xt ; t [0, T ] R}.
We are interested in the random variables
X = sup Xt or X
= sup |Xt | .
t[0,T ]
t[0,T ]
For shortness and simplicity, we will focus attention on the variable X ; the necessary modifications for adapting
our method to X are easy to establish [5].
We denote by dF(λ) the spectral measure of the process X and by λ_p the spectral moment of
order p when it exists. The spectral measure is supposed to have a finite second moment and a
continuous component. This implies ([7], p. 203) that the process is differentiable in quadratic
mean and that, for all pairwise different time points t₁, …, t_n in [0, T], the joint distribution of
(X_{t₁}, …, X_{t_n}, X′_{t₁}, …, X′_{t_n}) is non-degenerate.
For simplicity, we will assume that moreover the process admits C 1 sample paths. We will denote by r(.) the
covariance function of X and, without loss of generality, we will suppose that 0 = r(0) = 1.
Let u be a real number; the number of upcrossings of the level u by X, denoted by U_u, is
defined as follows:
$$U_u = \#\{t \in [0,T] : X_t = u,\ X'_t > 0\}.$$
For k ∈ N*, we denote by ν_k(u,T) the factorial moment of order k of U_u, and by ν̃_k(u,T) the
factorial moment of order k of U_u · 1I_{X₀≤u}. We also define ν̄_k(u,T) = ν_k(u,T) − ν̃_k(u,T).
These factorial moments can be calculated by Rice formulae. For example:
$$\nu_1(u,T) = E(U_u) = \frac{T\sqrt{\lambda_2}}{2\pi}\,e^{-u^2/2}, \qquad \nu_2(u,T) = \int_0^T\!\!\int_0^T A_{st}(u)\,ds\,dt,$$
with A_{st}(u) = E((X′_s)⁺(X′_t)⁺ | X_s = X_t = u) · p_{s,t}(u,u), where (X′)⁺ is the positive part
of X′ and p_{s,t} is the joint density of (X_s, X_t).
These two formulae are proved to hold under our hypotheses ( [7], p. 204). See also Wschebor [21],
Chapter 3, for the case of more general processes.
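As an illustrative aside (not in the original text), the Rice formula for ν₁ can be checked by simulation on the cosine process X_t = ξ cos t + η sin t, which is stationary with r(t) = cos t and hence λ₂ = 1; all names below are ad hoc:

```python
import numpy as np

rng = np.random.default_rng(0)
T, u, n_rep = 2 * np.pi, 1.0, 5000
t = np.linspace(0.0, T, 2001)
cos_t, sin_t = np.cos(t), np.sin(t)

total = 0
for _ in range(n_rep):
    xi, eta = rng.standard_normal(2)
    x = xi * cos_t + eta * sin_t              # one path; r(t) = cos t
    total += int(np.count_nonzero((x[:-1] < u) & (x[1:] >= u)))

rice = T * np.sqrt(1.0) / (2 * np.pi) * np.exp(-u * u / 2)  # = e^{-1/2}
assert abs(total / n_rep - rice) < 0.05
```

For this process a path upcrosses the level u on a full period exactly when its amplitude (ξ² + η²)^{1/2} exceeds u, which is why the Monte Carlo mean settles near e^{−u²/2}.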
We will denote by φ the density of the standard Gaussian distribution. In order to have simpler expressions for rather complicated formulae, we will use the following three functions:

    Φ(x) = ∫_{−∞}^x φ(y) dy,   Φ̄(x) = 1 − Φ(x),   Φ̃(x) = ∫_0^x φ(y) dy = Φ(x) − 1/2.
For any nonnegative integer-valued random variable ξ, one has

    E(ξ) − (1/2) E(ξ(ξ − 1)) ≤ P(ξ > 0) ≤ E(ξ).

Noting that, P-almost surely, {X* > u} = {X_0 > u} ∪ {X_0 ≤ u, U_u > 0} and that E( U_u (U_u − 1) 1I_{X_0 ≤ u} ) ≤ ν_2(u, T), we get the main inequality:

    P(X_0 > u) + ν̄_1(u, T) − ν_2(u, T)/2 ≤ P(X* > u) ≤ P(X_0 > u) + ν̄_1(u, T),    (1.1)

where ν̄_1(u, T) can be written as

    ν̄_1(u, T) = ∫_0^T dt ∫_{−∞}^u E( (X'_t)^+ | X_0 = x, X_t = u ) p_{X_0, X_t}(x, u) dx.

More generally,

    P(X* > u) = P(X_0 > u) + Σ_{m=1}^{+∞} (−1)^{m+1} ν̄_m(u, T)/m!,    (1.2)

and the partial sums of order n in (1.2) are alternately upper and lower bounds for the left-hand side; in particular

    P(X_0 > u) + ν̄_1(u, T) − ν̄_2(u, T)/2 ≤ P(X* > u) ≤ P(X_0 > u) + ν̄_1(u, T).    (1.3)

Since ν̄_2(u, T) ≤ ν_2(u, T), we see that, except for this last modification which gives a simpler expression, the main inequality (1.1) is relation (1.3) with n = 1.
Remark 1.1. In order to calculate these bounds, we are interested in the quantity 1 (u, T ). For asymptotic
calculations and to compare our results with Piterbargs ones, we will also consider the quantity k (u, T ). From
a numerical point of view, k (u, T ) and k (u, T ) are worth being distinguished because they are not of same
order of magnitude as u +. In the following sections, we will work with 1 (u, T ).
2. Some identities
First, let us introduce some notations that will be used in the rest of the paper. We set:
r (t)
u,
(t) = E (X0 |X0 = Xt = u) =
1 + r(t)
r 2 (t)
2 (t) = V ar (X0 |X0 = Xt = u) = 2
,
1 r2 (t)
r (t) 1 r2 (t) r(t)r 2 (t)
(t) = Cor (X0 , Xt |X0 = Xt = u) =
.
2 (1 r2 (t)) r 2 (t)
1 + (t)
Note that, since the spectrum of the process X admits a continuous component, |(t)| = 1.
In the sequel, the variable t will be omitted when it is not confusing and we will write r, r , , , , k, b instead
of r(t), r (t), (t), (t), (t), k(t), b(t).
Proposition 2.1. (i) If (X, Y ) has a centred normal bivariate distribution with covariance matrix
1
1
then a R+
a
1
P (X > a, Y > a) = arctan
1+
(x)
2
1
0
1+
x (x) dx
1
=2
2 (T t)
(iii) 2 (u, T ) =
0
1 r 2
u
1+r
1
2
1 r2 (t)
1+
x
1
1r
r
u (b)
1+r
1 r2
u
1 + r(t)
dx
dt
with:
T1 (t) = 2 (t)
(2.1)
(2.2)
b(t)
(2.3)
1
arctan (k(t)) 2
b(t)
(k(t) x) (x) dx .
0
(2.4)
111
Remark 2.2.
p. 27:
1. Formula (i) is analogous to the formula (2.10.4) given in Cramer and Leadbetters [7],
1
a2
exp
1z
2 1 z 2
dz.
Our formula is easier to prove and is more adapted to numerical application because, when t 0,
(t) 1 and the integrand in Cramer and Leadbetters formula tends to infinity.
2. Utility of these formulae:
these formulae permit a computation of Main inequality (1.1), at the cost of a double integral with
finite bounds. This is a notable reduction of complexity with respect to the original form. The form
(2.4) is more adapted to effective computation, because it involves an integral on a bounded interval;
this method has been implemented in a S+ program that needs about one second of Cpu to run an
example. It has been applied to a genetical problem in Cierco and Azas [6].
The form (iii) has some consequences both for numerical and theoretical purposes. The calculation of 2 (u, T )
yields some numerical difficulties around t = 0. The sum of the three terms is infinitly small with respect to
each term. To discard the diagonal from the computation, we use formula (iii) and Maple to calculate the
equivalent of the integrand in the neighbourhood of t = 0 at fixed u.
T
Ast (u) ds dt. The following proposition gives the Taylor expansion
0
of A at zero.
At (u) =
1
(2 6 4 )
1 4
exp
u2
1296 (4 2 )1/2 2 2
2 4 22
2
2
t4 + O(t5 ).
Piterbarg [17] or Wschebor [21] proved that At (u) = O ( (u(1 + ))) for some 0. Our result is more precise.
Our formulae give some asymptotic expansions as u + for 1 (u, T ) and 2 (u, T ) for small T .
Proposition 2.4. Assume that 8 is finite. Then, there exists a value T0 such that, for every T < T0
11/2
4 22
27
1 (u, T ) =
4 5 (2 6 2 )3/2
2
4
4
u
4 22
u6
1+O
1
u
9/2
4 22
3 3T
2 (u, T ) =
9/2 (2 6 2 )
2
4
4
u
4 22
u5
1+O
1
u
as u +.
3. A numerical example
In the following example, we show how the upper and lower bounds (1.1) permit to evaluate the distribution
of X with an error less than 104 .
We consider the centered stationary Gaussian process with covariance (t) := exp(t2 /2) on the interval
I = [0, 1], and the levels u = 3, 2.5, . . . , 3. The term P (X0 u) is evaluated by the S -plus function P norm,
1 and 2 using Proposition 2.1 and the Simpson method. Though it is rather difficult to assess the exact
precision of these evaluations, it is clear that it is considerably smaller than 104 . So, the main source of error
112
is due to the difference between the upper and lower bounds in (1.1).
u
P (X0 u)
3
0.00135
2.5
0.00621
2
0.02275
1.5
0.06681
1
0.15866
0.5
0.30854
0
0.50000
0.5
0.69146
1
0.84134
1.5
0.93319
2
0.97725
2.5
0.99379
3
0.99865
1
0.00121
0.00518
0.01719
0.04396
0.08652
0.13101
0.15272
0.13731
0.09544
0.05140
0.02149
0.00699
0.00177
2
lower bound upper bound
0
0.00014
0.00014
0
0.00103
0.00103
0
0.00556
0.00556
0.00001
0.02285
0.02285
0.00002
0.07213
0.07214
0.00004
0.17753
0.17755
0.00005
0.34728
0.34731
0.00004
0.55415
0.55417
0.00002
0.74591
0.74592
0.00001
0.88179
0.88180
0
0.95576
0.95576
0
0.98680
0.98680
0
0.99688
0.99688
4. Proofs
Proof of Proposition 2.1
Proof of point (i). We first search P (X > a, Y > a).
Put = cos(), [0, [, and use the orthogonal decomposition Y = X +
a X
Then {Y > a} = Z >
. Thus:
1 2
+
P (X > a, Y > a) =
a x
(x)
(x)(z) dx dz,
dx =
1 2
1 2 Z.
1
where D is the domain located between the two half straight lines starting from the point a, a
1+
Using a symmetry with respect to the straight line with angle passing through the origin, we get:
2
+
P (X > a, Y > a) = 2
(x)
a
1
x
1+
dx.
(4.1)
Now,
P (X > a, Y > a) = (a) P (X > a, Y < a) = (a) P (X > a, (Y ) > a) .
Applying relation (4.1) to (X, Y ) yields
+
(x)
a
1+
x
1
dx = 2
and
1+
x
1
(x) dx.
113
(k x) (x) dx =
0
1
arctan(k)
2
E Z + =
p0,t (x, u) dx
(1 r2 )
0
u
T
+
r (x r u)
r (x r u)
I2 =
dt
r
(1 r2 )
0
u
T
parts leads to
I2 = (u)
0
1 (u, T ) =
22
2
(u)
2
r
1r
u (b)
2
1r
1+r
2 1 r
r2
u
+
1+r
22 (1 r2 )
1r
u
1+r
dt. Integrating I2 by
dt.
r2
= 2 , we obtain:
1 r2
T
1r
u
1+r
dt + (u)
0
1 r2
1r
u
1+r
(b) dt.
Jij =
0
xi y j
2
1 2
1 2
0
1 2 (k b) (b).
dy
(4.2)
114
(4.3)
3/2
(k b) k b (k b) (b).
(4.4)
[ (k b) + k b (k b)] (b).
(4.5)
x
0
parts
3/2
v(x, y)
dx dy. Then, integrating by
x
2 1 2
(4.6)
1 2 [ (k b) + k b (k b)] (b).
1 2 2
b
1
(b) + 2 b2
(k x) (x) dx + 2 b (k b) (b).
b
as a(t, u)
b(t, u)
Note 4.2. Many results of this section are based on tedious Taylor expansions. These expansions have been
made or checked by a computer algebra system (Maple). They are not detailed in the proofs.
115
1 + (t)
= O(t) is small,
1 (t)
Proof of Proposition 2.3. Use form (iii) and remark that, when t is small, k(t) =
1
and, since () =
2
3
6
+ O 5 as 0, we get:
b(t)
b(t)
k(t)
arctan(k(t))
k 3 (t)
x(x)dx +
x3 (x)dx + O(t5 )
2
2 0
6 2 0
1
k(t)
2 arctan(k(t)) 2 ((0) (b(t)))
+ O(t5 ).
= 2 2 (t)(t) 2 (t)
k 3 (t)
2
+
2(0) b (t) + 2 (b(t))
6 2
In the same way:
2(t)(t)
k 3 (t) 3
b (t) + O(t5 ).
T3 (t) =
(b(t)) k(t)b(t)
6
2
And then, assuming 8 finite, use Maple to get the result.
T2 (t) = 2 2 (t)(t) 2 (t)
(p+1)
(i) Ip =
1 Mp+1
2 2
( dc )
p
(cos ) d 1 + O
0
1
u
22 6 2 24
p+1
and Mp+1 = E |Z|
where Z is a standard Gaussian random variable.
4 22
T
Mp
1
(ii) Jp =
tp (l(t) u) dt = (c u)(p+1)
1+O
2
u
0
with d =
1
6
Proof of Lemma 4.3. Since the derivative of l at zero is non zero, l is invertible in some neighbourghood of zero
1
1
and its inverse l1 satisfies l1 (t) = t + O(t2 ), l1 (t) = + O(t).
c
c
We first consider Ip and use the change of variable y = l(t)u, then
l(T )u
Ip =
y
u
l1
(kb) l1
y
u
(y) l1
y
u
dy
y
d
= y + u OU
u
c
1
6
y2
u2
and
l(T )u
(p+1)
yp
Ip = (c u)
22 6 2 24
t u + u O(t3 ) = d u t + u O(t3 ).
4 22
d
y + u OU
c
y2
u2
(y) 1 + OU
y
u
dy.
116
tu
2
t
. Then
2
(const) u t2
tu
2
(4.7)
(p+1)
yp
Ip = (c u)
0
l(T )u
yp
Put Kp (u) =
0
d
y
c
d
y
c
(y) 1 + OU
y
u
dy.
(4.8)
yp
Kp (u) =
0
d
y
c
+
c y yp
y2 + z 2
d
y (y) dy =
exp
dz dy. Then, using polar coorc
2
2
0
0
0
d
1 Mp+1 arctan( c )
p
dinates, we derive that Kp () =
(cos ) d. So we can see that the contribution of the
2 2
0
y
term OU
in formula (4.8) is O u(p+2) which gives the desired result for Ip .
u
Moreover, Kp () =
yp
2 (1 r)
u
2 (1 + r)
1r
u
1+r
r
(b)
1 r2
Then, 1 (u, T ) =
A1 (t) dt.
0
1
1
3
3 + 5 + O(z 7 ) .
z
z
z
(4.9)
117
2 (1 r(t))
u for the first term and z = b(t)
2 (t)(1 + r(t))
2 (1 r)
u
2 (1 + r)
1r
u
1+r
(b) ,
we get:
2 (1 + r) 1
2 (1 r) u
2 (1 r)
2 (1 + r)
u
+ OU
2
(1 + r)
2 (1 r)
1
r
1
+
3 + OU
2
b
1r b
(u)
A1 (t) =
2
3/2
2 (1 + r)
2 (1 r)
5/2
1
u5
1
b5
u3
4
2
exp 2(u4
2)
1 4 2
2
t2 + O(t4 ).
A1 (t) =
7/2
8
u3 2
2
To use Lemma 4.3 point (ii) to calculate 1 (u, T ), it is necessary to have a Taylor expansion of the coefficient
22
2 (1 r)
2 (1 r(t))
of u in
u
.
We
have
lim
=
, therefore, we set:
t0 2 (t)(1 + r(t))
2 (1 + r)
4 22
2 (1 r)
22
.
2 (1 + r)
4 22
l(t) =
2 (2 6 24 )
t + O(t2 ).
4 22
1
t (l(t) u) dt =
2
1
6
2
4
22
2 (2 6 24 )
u
4 22
1
=
2
11/2
4 22
27
1 (u, T ) =
4 5 (2 6 2 )3/2
2
4
1+O
1
u
4
u , we get the equivalent for 1 (u, T ).
4 22
4
u
4 22
u6
1+O
1
u
118
2 (T t)
2 (u, T ) =
0
1
2
1 r2 (t)
u
1 + r(t)
(4.10)
(x) (k x) dx.
b
The function x x2 1 (x) being bounded, we have
(kx) = (k b) + k (k b) (x b)
1 3
2
3
k b (k b) (x b) + OU k 3 (x b) ,
2
(4.11)
where the Landaus symbol has here the same meaning as in Lemma 4.3.
Moreover, using the expansion of given in formula (4.9), it is easy to check that as z +,
+
(z)
(z)
(z)
3 4 +O
2
z
z
z6
z
+
(z)
(z)
2
(x z) (x) dx = 2 3 + O
z
z5
z
+
(z)
3
(x z) (x) dx = O
.
z4
z
(x z) (x) dx =
Therefore, multiplying formula (4.11) by (x), integrating on [b; +[ and applying formula (4.9) once again
yield:
3
1 k2
3
1
1
+
+ k (k b) (b)
4
(k b) (b)
b b3 b5
b2
b
(k b) (b)
k
2
2
T2 = 2 b
+O
(k
b)
(b)
+
O
b7
b6
3
3
k
k
(k b) (b) + O
(b)
+O
b4
b4
Note that the penultimate term can be forgotten. Then, remarking that, as u +, b =
u, t and
k t, we obtain:
T2
2
2
= 2 2 b (k b) (b) + 2
(k b) (b) + 2
(k b) (b)
b2
b
2
Remark 4.5. As it will be seen later on, Lemma 4.3 shows that the contribution of the remainder to the
1
integral (4.10) can be neglected since the degrees in t and of each term are greater than 5. So, in the sequel,
u
we will denote the sum of these terms (and other terms that will appear later) by Remainder and we set:
T2 = U1 + U2 + U3 + U4 + U5 + U6 + U7 + U8 + U9 + Remainder.
119
Now, we have
U1 + T 3 = 0
1 2 2 k = (1 + ) k so that U7 + T1 = (1 + ) 2 k (k b) (b)
2
U2 + U3 = 2
(1 + ) (k b) (b)
b
2
U4 + U5 = 4 3 (k b) (b) 1 + O t2
b
2
U8 + U9 = 4 2 k (k b) (b) 1 + O t2
b
since = 1 + O t2 .
By the same remark as Remark 4.5 above, the term O t2 can be neglected. Consequently,
T1 + T2 + T3
= 2
2
2
(1 + ) (k b) (b) 4 3 (k b) (b)
b
b
(1 + ) 2 k (k b) (b) + 2 2 k 3 (k b) (b) + 4
2
k (k b) (b)
b2
+ Remainder.
Therefore, we are leaded to use Lemma 4.3 in order to calculate the following integrals:
(T t)
0
T
(T t)
0
T
(T t)
0
T
(T t)
0
u
2u
(kb) (b) dt = (T t) m1 (t) (kb) b2 +
dt
1+r
1
+r
0
2
2u
m2 (t) (k b) b2 +
dt
1+r
2
2u
dt
m3 (t) b2 (1 + k 2 ) +
1+r
2
2u
dt
m4 (t) b2 (1 + k 2 ) +
1+r
2
2u
dt
m5 (t) b2 (1 + k 2 ) +
1+r
(T t) m1 (t) exp
0
120
with:
m1 (t)
=
=
2
2
1
(t) (1 + (t))
1 r2 (t) b
4 22 3
1 2 6 24
t + O t5
5/2
36
u
2
m2 (t)
m3 (t)
=
=
m4 (t)
=
=
m5 (t)
=
=
5/2
4 22
2
1
(t)
=
t + O t3
7/2
1 r2 (t) b3
u3 2
1
1
(1 + (t)) 2 (t) k(t)
2 1 r2 (t)
3/2
2 2 6 24
t4 + O t6
864 22 4 22 3/2
2
1
2 (t) k 3 (t)
2 1 r2 (t)
3/2
2 4
1 2 6 24
t + O t6
864 22 4 22 3/2
4
1
( 1998).2
(t) k(t)
b2
2 1 r2 (t)
3/2
2 6 24 4 22
2 2
1
t + O t4 .
12
32 3/2 u2
4
=
Lemma 4.3 shows that we can neglect the terms issued from the t part of the factor T t in formula (4.10).
lim 2 +
=
t0 u
1+r
4 22
b2
2
4
lim 2 1 + k 2 +
=
t0 u
1+r
4 22
Therefore, we set:
2 2 6 24
l1 (t)
b2 (t)
2
4
+
=
2
u
1 + r(t)
4 22
l2 (t)
b2 (t)
2
4
1 + k 2 (t) +
u2
1+r
4 22
2 2 6 24
2
12 (4 22 )
t + O t3
5/2
18 (4 22 )
t + O t3 .
2 = T exp
4 u2
2 (4 22 )
4 22
4 22
1 2 6 24
3
5/2
7/2
36
2 u
u3 2
3/2
2 6 24 4 22
2
1
+
J2
12
32 3/2 u2
I1
1+O
1
u
121
2
2
2
2
8 3
3
3
(cos ) d =
and that
cos d =
, we find
Noting that
27
3
0
0
4
144 3 4 22
1
I3 =
u4 1 + O
2
u
2 22 (2 6 24 )
2 2
3 3 4 2
1
I1 =
u2 1 + O
2
u
2 2 (2 6 4 )
3
12 3 4 22
1
J2 =
u3 1 + O
2
2
u
2 (2 6 4 ) 2 (2 6 4 )
Finally, gathering the pieces, we obtain the desired expression of 2 .
5. Discussion
Using the general relation (1.3) with n = 1, we get
P X u P (X0 > u) 1 (u, T ) +
2 (u, T ) 3 (u, T )
2 (u, T )
2
2
6
A conjecture is that the orders of magnitude of 2 (u, T ) and 3 (u, T ) are considerably smaller than those of
1 (u, T ) and 2 (u, T ). Admitting this conjecture, Proposition 2.4 implies that for T small enough
9/2
4 22
T 2
3 3T
P X u = (u) +
(u)
2 9/2 (2 6 2 )
2
4
2
4
u
4 22
u5
1+O
1
u
which is Piterbargs theorem with a better remainder ([15], Th. 3.1, p. 703). Piterbargs theorem is, as far as we
know, the most precise expansion of the distribution of the maximum of smooth Gaussian processes. Moreover,
very tedious calculations would give extra terms of the Taylor expansion.
References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables.
Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes, IMS, Hayward, Ca
(1990).
[3] J.-M. Azas and M. Wschebor, Une formule pour calculer la distribution du maximum dun processus stochastique. C.R. Acad.
Sci. Paris Ser. I Math. 324 (1997) 225-230.
[4] J-M. Azas and M. Wschebor, The Distribution of the Maximum of a Stochastic Process and the Rice Method, submitted.
[5] C. Cierco, Probl`
emes statistiques li
es a
` la d
etection et a
` la localisation dun g`
ene `
a effet quantitatif. PHD dissertation.
University of Toulouse, France (1996).
[6] C. Cierco and J.-M. Azas, Testing for Quantitative Gene Detection in Dense Map, submitted.
[7] H. Cram
er and M.R. Leadbetter, Stationary and Related Stochastic Processes, J. Wiley & Sons, New-York (1967).
[8] D. Dacunha-Castelle and E. Gassiat, Testing in locally conic models, and application to mixture models. ESAIM: Probab.
Statist. 1 (1997) 285-317.
[9] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247-254.
[10] J. Ghosh and P. Sen, On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related
results, in Proc. of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Le Cam L.M. and Olshen R.A., Eds.
(1985).
122
GENERAL FORMULAE
>
phi:=t->exp(-t*t/2)/sqrt(2*pi);
2
e(1/2 t )
2
We introduce mu4=lambda4-lambda22 and mu6= lambda2*lambda6-lambda4^2
to make the outputs clearer.
>
assume(t>0);
>
assume(lambda2 > 0);
>
assume(mu4 > 0);
>
assume(mu6>0);
>
interface(showassumed=2);
>
Order:=12;
:= t
>
Order := 12
r:=t->1-lambda2*t^2/2!+lambda4*t^4/4!-lambda6*t^6/6!+lambda8*t^8/8!;
1
1
1
1
2 t2 +
4 t4
6 t6 +
8 t8
2
24
720
40320
siderels:= {lambda4=mu4+lambda2^2,lambda2*lambda6-lambda4^2=mu6}:
I_r2:=t->1-r(t)*r(t);
r := t 1
>
>
I r2 := t 1 r(t)2
>
simplify(simplify(series(I_r2(t),t=0,8),siderels));
>
1
1
1
1
1
2 t2 + ( 22
4) t4 + (
6 +
2 4 +
23 ) t6 + O(t8 )
3
12
360
24
24
with assumptions on t, 2 and 4
rp:=t->diff(r(t),t);
rp := t diff(r(t), t)
>
eval(rp(t));
1
1
1
4 t3
6 t5 +
8 t7
6
120
5040
with assumptions on 2 and t
2 t +
>
rs:=t->diff(r(t),t$2);
rs := t
>
2
r(t)
t2
eval(rs(t));
1
1
1
4 t2
6 t4 +
8 t6
2
24
720
with assumptions on 2 and t
2 +
123
124
mu:=t->-u*rp(t)/(1+r(t));
:= t
>
u rp(t)
1 + r(t)
sig2:=t->lambda2-rp(t)*rp(t)/I_r2(t);
sig2 := t 2
>
rp(t)2
I r2(t)
simplify(taylor(sig2(t),t=0,8),siderels);
1
1 6 22 4 3 42 2 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6
>
sigma:=t->sqrt(sig2(t));
:= t
>
simplify(taylor(sigma(t),t=0,6),siderels);
1
2
>
sig2(t)
1 6 22 4 3 42 2 6 3
t + O(t5 )
144
4 2
with assumptions on t, 4, 2 and 6
4 t +
b:=t->mu(t)/sigma(t);
b := t
>
(t)
(t)
simplify(taylor(b(t),t=0,6),siderels);
u 2
1
1 u 6
+ ( u 4 +
) t2 + O(t4 )
8
36 4(3/2)
4
with assumptions on 2, 4, t and 6
>
sig2rho:=t->-rs(t)-r(t)*rp(t)*rp(t)/I_r2(t);
sig2rho := t rs(t)
>
r(t) rp(t)2
I r2(t)
simplify(taylor(sig2rho(t),t=0,8),siderels);
1
1 6 22 4 + 3 42 + 4 6 4
4 t2 +
t + O(t6 )
4
144
2
with assumptions on t, 4, 2 and 6
>
rho:=t->sig2rho(t)/sig2(t);
:= t
>
sig2rho(t)
sig2(t)
simplify(taylor(rho(t),t=0,8),siderels);
1 6 2
t + O(t4 )
18 2 4
with assumptions on t, 6, 2 and 4
1 +
k2:=t->(1+rho(t))/(1-rho(t));
k2 := t
>
1 + (t)
1 (t)
sk2:=simplify(taylor(k2(t),t=0),siderels);
1
1 6 2
t +
(3 26 4 + 9 24 42 + 9 22 43 2 6 22 4 3 8 22 4
36 2 4
2160
1
+ 3 44 + 13 6 42 + 5 62 ) (22 42 )t4 +
(147 28 42
907200
+ 175 6 26 4 273 26 43 + 63 24 44 + 196 6 24 42 + 120 8 24 42
+ 357 22 45 + 707 6 22 43 195 8 22 43 175 8 22 6 4 + 168 46
sk2 :=
k:=t->taylor(sqrt(sk2),t=0);
k := t taylor( sk2 , t = 0)
>
simplify(taylor(k(t),t=0,3),siderels);
1
6
>
6
t + O(t3 )
2 4
with assumptions on t, 6, 2 and 4
sqrtI_rho2:=t->k(t)*(1-rho(t));
sqrtI rho2 := t k(t) (1 (t))
>
T1:=t->sig2(t)*sqrtI_rho2(t)*phi(b(t))*phi(k(t)*b(t));
T1 := t sig2(t) sqrtI rho2(t) (b(t)) (k(t) b(t))
>
simplify(simplify(series(T1(t),t=0,6),siderels),power);
1
24
u2 22
6 4 e(1/2 4 ) 3
1
t
((5 62 22 u2 + 3 22 42 8 3 26 42 9 24 43
2880
2
9 22 44 15 6 22 42 u2 18 6 22 42 3 45 + 5 62 4 3 6 43 )
e(1/2
u2 22
4
T2 := t->2*sig2(t)*(rho(t)-(b(t))^2)*(arctan(k(t))/(2*pi)
-k(t)/sqrt(2*pi)*(phi(0)-phi(b(t))-k(t)^2/6*(2*phi(0)-((b(t))^2+2)*phi(b(t)))));
T2 := t 2sig2(t) ((t) b(t)2 )
1
2
2
k(t)
((0)
(b(t))
k(t)
(2
(0)
(b(t)
+
2)
(b(t))))
1 arctan(k(t))
6
125
126
simplify(simplify(series(T2(t),t=0,6),siderels),power);
1
24
u2 22
6 (u2 22 + 4) e(1/2 4 ) 3
t + O(t5 )
4 2
with assumptions on t, 6, 2 and 4
>
T3:=t->(2*sig2(t)*(k(t)*b(t)^2))/sqrt(2*pi)*(1-(k(t)*b(t))^2/6)*phi(b(t));
>
1
sig2(t) k(t) b(t)2 (1 k(t)2 b(t)2 ) (b(t))
6
T3 := t 2
2
simplify(simplify(series(T3(t),t=0,6),siderels),power);
u2 22
1
1 e(1/2 4 ) 6 2(3/2) u2 3
t
2 u2 (27 8 22 42 + 35 62 22 u2
24
25920
4
27 26 42 81 24 43 81 22 44 162 6 22 42 135 6 22 42 u2
27 45 45 62 4 + 243 6 43)e(1/2
u2 22
4
A:=t->((phi(u/sqrt((1+r(t)))))^2/sqrt(I_r2(t)))*(T1(t)+T2(t)+T3(t));
(
A := t
>
u
)2 (T1(t) + T2(t) + T3(t))
1 + r(t)
I r2(t)
simplify(simplify(series(A(t),t=0,6),siderels),power);
O(t4 )
with assumptions on t
Cphib:=t->phi(t)/t-phi(t)/t^3;
Cphib := t
>
sq:=t->sqrt((1-r(t))/(1+r(t)));
sq := t
>
(t) (t)
3
t
t
1 r(t)
1 + r(t)
simplify(simplify(series(sq(t),t=0,4),siderels),power);
1 2 22 + 4 3
1
2 t
t + O(t5 )
2
48
2
with assumptions on t, 2 and 4
>
nsigma:=t->sigma(t)/sqrt(lambda2);
(t)
nsigma := t
2
A1:=t->(1/sqrt(2*pi))*phi(u)*phi(sq(t)*u/nsigma(t))*((nsigma(t)/(sq(t)*u)
-(nsigma(t)/(sq(t)*u))^3)*sqrt(lambda2)+(1/b(t)-1/b(t)^3)*rp(t)/sqrt(I_r2(t)));
1
1
(
)
rp(t)
3
nsigma(t) nsigma(t)
sq(t) u
b(t) b(t)3
(u) (
) 2 +
)
(
3
3
nsigma(t)
sq(t) u
sq(t) u
I r2(t)
2
SA1:=simplify(simplify(series(A1(t),t=0,6),siderels),power);
A1 := t
>
u2 (4+22 )
)
4
1 2 e(1/2
4(5/2) 2
SA1 :=
t + O(t4 )
16
2(7/2) (3/2) u3
with assumptions on t, 4 and 2
L2:= t->(1-r(t))/((1+r(t))*nsigma(t)^2)-(lambda4-mu4)/mu4;
>
4 4
1 r(t)
(1 + r(t)) nsigma(t)2
4
SL2:=simplify(simplify(series(L2(t),t=0,6),siderels),power);
L2 := t
1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
We define c as the square root of the coefficient of t2
c:=sqrt(op(1,SL2))
1 2 2 6
c :=
6
4
with assumptions on 2, 6 and 4
>
nu1b:=(sqrt(2*pi))*op(1,SA1)*(c^(-3)*u^(-3)/2);
SL2 :=
u2 (4+22 )
)
4
27 2 e(1/2
4(11/2)
nu1b :=
8
2(7/2) u6 (2 6)(3/2)
with assumptions on 4, 2 and 6
PROOF OF THE EQUIVALENT OF NU2
>
m1:=t->(1+rho(t))*2*sigma(t)^2/(pi*b(t)*sqrt(I_r2(t)));
m1 := t 2
>
(1 + (t)) (t)2
b(t) I r2(t)
sm1:=simplify(simplify(series(m1(t),t=0,8),siderels),power);
1 6 4 3
sm1 :=
t + O(t5 )
36 2(5/2) u
with assumptions on t, 6, 4 and 2
127
128
m2:=t->(-4/pi)*sigma(t)^2*b(t)^(-3)/sqrt(I_r2(t));
>
(t)2
b(t)3 I r2(t)
sm2:=simplify(simplify(series(m2(t),t=0,6),siderels),power);
>
4(5/2)
t + O(t3 )
u3 2(7/2)
with assumptions on t, 4 and 2
m3:=t->-(1+rho(t))*sigma(t)^2*k(t)/(pi*sqrt((2*pi)*I_r2(t)));
m2 := t 4
sm2 :=
1
6(3/2) 2
sm3 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m4:=t->(2/pi)*sigma(t)^2*k(t)^3/sqrt(2*pi*I_r2(t));
m3 := t
>
>
m4 := t 2
>
>
(t)2 k(t)3
2 I r2(t)
sm4:=simplify(simplify(series(m4(t),t=0,6),siderels),power);
1
6(3/2) 2
sm4 :=
t4 + O(t6 )
864 22 4 (3/2)
with assumptions on t, 6, 2 and 4
m5:=t->(4/pi)*sigma(t)^2*k(t)*b(t)^(-2)/sqrt(2*pi*I_r2(t));
>
(t)2 k(t)
b(t)2 2 I r2(t)
sm5:=simplify(simplify(series(m5(t),t=0,6),siderels),power);
1 6 4(3/2) 2 2
sm5 :=
t + O(t4 )
12 23 (3/2) u2
with assumptions on t, 6, 4 and 2
l12:=t-> (b(t)/u)^2 + 2/(1+r(t))-lambda4/mu4;
>
b(t)2
1
4
+2
u2
1 + r(t) 4
simplify(simplify(series(l12(t),t=0,8),siderels),power);
m5 := t 4
>
l12 := t
1 2 6 2
t + O(t4 )
18 42
with assumptions on t, 2, 6 and 4
>
b(t)2 (1 + k(t)2 )
1
4
+2
u2
1 + r(t) 4
simplify(simplify(series(l22(t),t=0,8),siderels),power);
1 2 6 2
t + O(t4 )
12 42
with assumptions on t, 2, 6 and 4
>
>
opm1:=op(1,sm1);
1 6 4
36 2(5/2) u
with assumptions on 6, 4 and 2
opm1 :=
>
opm2:=op(1,sm2);
4(5/2)
u3 2(7/2)
with assumptions on 4 and 2
opm2 :=
>
>
>
>
>
>
opm5:=op(1,sm5);
1 6 4(3/2) 2
opm5 :=
12 23 (3/2) u2
with assumptions on 6, 4 and 2
c1:=144*sqrt(3)*mu4^4*u^(-4)/(sqrt(2*pi)*lambda2^2*mu6^2);
3 44 2
c1 := 72 4
u 22 62
with assumptions on 4, 2 and 6
c2:=3*sqrt(3)*mu4^2*u^(-2)/(sqrt(2*pi)*lambda2*mu6);
3
3 42 2
c2 :=
2 u2 2 6
with assumptions on 4, 2 and 6
c5:=12*sqrt(3)*mu4^3*u^(-3)/(lambda2^(3/2)*mu6^(3/2));
3 43
c5 := 12 3 (3/2) (3/2)
u 2
6
with assumptions on 4, 2 and 6
B:=opm1*c1+opm2*c2+opm5*c5;
3 4(9/2) 3 2
B :=
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6
simplify(B);
3 4(9/2) 3 2
2 (3/2) u5 2(9/2) 6
with assumptions on 4, 2 and 6
129
February 2, 2008
Let X = {X(t) : t S} be a real-valued random field defined on some parameter set S and
M := suptS X(t) its supremum.
The study of the probability distribution of the random variable M , i.e. the function
FM (u) := P{M u} is a classical problem in probability theory. When the process is Gaussian,
general inequalities allow to give bounds on 1 FM (u) = P{M > u} as well as asymptotic
results for u +. A partial account of this well established theory, since the founding paper
by Landau and Shepp [20] should contain - among a long list of contributors - the works of
Marcus and Shepp [24], Sudakov and Tsirelson [30], Borell [13] [14], Fernique [17], Ledoux and
Talagrand [22], Berman [11] [12], Adler[2], Talagrand [32] and Ledoux[21].
During the last fifteen years, several methods have been introduced with the aim of obtaining more precise results than those arising from the classical theory, at least under certain
restrictions on the process X , which are interesting from the point of view of the mathematical
theory as well as in many significant applications. These restrictions include the requirement
Centro de Matem
atica. Facultad de Ciencias. Universidad de la Rep
ublica. Calle Igua 4225. 11400 Montevideo. Uruguay.
the domain S to have certain finite-dimensional geometrical structure and the paths of the
random field to have a certain regularity.
Some examples of these contributions are the double sum method by Piterbarg [28]; the
Euler-Poincare Characteristic (EPC) approximation, Taylor, Takemura and Adler [34], Adler
and Taylor [3]; the tube method, Sun [31] and the well- known Rice method, revisited by Azas
and Delmas [5], Azas and Wschebor [6]. See also Rychlik [29] for numerical computations.
The results in the present paper are based upon Theorem 3 which is an extension of Theorem
3.1 in Azas and Wschebor [8] allowing to express the density pM of FM by means of a general
formula. Even though this is an exact formula, it is only implicit as an expression for the
density, since the relevant random variable M appears in the right-hand side. However, it can
be usefully employed for various purposes.
First, one can use Theorem 3 to obtain bounds for pM (u) and thus for P{M > u} for
every u by means of replacing some indicator function in (4) by the condition that the normal
derivative is extended outward (see below for the precise meaning). This will be called the
direct method. Of course, this may be interesting whenever the expression one obtains can
be handled, which is the actual situation when the random field has a law which is stationary
and isotropic. Our method relies on the application of some known results on the spectrum of
random matrices.
Second, one can use Theorem 3 to study the asymptotics of P{M > u} as u +. More
precisely, one wants to write, whenever it is possible
P{M > u} = A(u) exp
1 u2
2 2
+ B(u)
(1)
method.
In all cases, the second order approximation for the direct method provides an upper bound
for the one arising from the EPC method.
Our proofs use almost no differential geometry, except for some elementary notions in Euclidean space. Let us remark also that we have separated the conditions on the law of the
process from the conditions on the geometry of the parameter set.
Third, Theorem 3 and related results in this paper, in fact refer to the density pM of
the maximum. On integration, they imply immediately a certain number of properties of the
probability distribution FM , such as the behaviour of the tail as u +.
Theorem 3 implies that FM has a density and we have an implicit expression for it. The
proof of this fact here appears to be simpler than previous ones (see Azas and Wschebor [8])
even in the case the process has 1-dimensional parameter (Azas and Wschebor [7]). Let us
remark that Theorem 3 holds true for non-Gaussian processes under appropriate conditions
allowing to apply Rice formula.
Our method can be exploited to study higher order differentiability of FM (as it has been
done in [7] for one-parameter processes) but we will not pursue this subject here.
This paper is organized as follows:
Section 2 includes an extension of Rice Formula which gives an integral expression for the
expectation of the weighted number of roots of a random system of d equations with d real
unknowns. A complete proof of this formula in a form which is adapted to our needs in this
paper, can be found in [9]. There is an extensive literature on Rice formula in various contexts
(see for example Belayiev [10] , Cramer-Leadbetter [15], Marcus [23], Adler [1], Wschebor [35].
In Section 3, we obtain the exact expression for the distribution of the maximum as a consequence of the Rice-like formula of the previous section. This immediately implies the existence
of the density and gives the implicit formula for it. The proof avoids unnecessary technicalities
that we have used in previous work, even in cases that are much simpler than the ones considered here.
In Section 4, we compute (Theorem 4) the first order approximation in the direct method
for stationary isotropic processes defined on a polyhedron, from which a new upper bound for
P{M > u} for all real u follows.
In Section 5, we consider second order approximation, both for the direct method and the
EPC approximation method. This is the content of Theorems 5, 6 and 7.
Section 6 contains some examples.
x
(y)dy.
Assume that the random vectors , have a joint Gaussian distribution, where has
values in some finite dimensional Euclidean space. When it is well defined,
E(f ()/ = x)
is the version of the conditional expectation obtained using Gaussian regression.
Eu := {t S : X(t) > u} is the excursion set above u of the function X(.) and Au :=
{M u} is the event that the maximum is not larger than u.
, , , denote respectively inner product and norm in a finite-dimensional real Euclidean
space; d is the Lebesgue measure on Rd ; S d1 is the unit sphere ; Ac is the complement
of the set A. If M is a real square matrix, M 0 denotes that it is positive definite.
4
This is well-known and follows easily from the next lemma (called Bulinskaya s lemma)
that we state without proof, for completeness.
Lemma 1 Let Z(t) be a stochastic process defined on some neighborhood of a set T embedded
in some Euclidean space. Assume that the Hausdorff dimension of T is smaller or equal than
the integer m and that the values of Z lie in Rm+k for some positive integer k . Suppose, in
addition, that Z has C 1 paths and that the density pZ(t) (v) is bounded for t T and v in some
neighborhood of u Rm+k . Then, a. s. there is no point t T such that Z(t) = u.
With respect to A5, one has the following sufficient conditions: Assume A1, A2, A3 and as
additional hypotheses one of the following two:
t
X(t) is of class C 3
sup
tS,x V (0)
as 0,
In this section we review Rice formula for the expectation of the number of roots of a random
system of equations. For proofs, see for example [8], or [9], where a simpler one is given.
Theorem 1 (Rice formula) Let Z : U Rd be a random field, U an open subset of Rd and
u Rd a fixed point in the codomain. Assume that:
(i) Z is Gaussian,
(ii) almost surely the function t
Z(t) is of class C 1 ,
(iii) for each t U , Z(t) has a non degenerate distribution (i.e. Var Z(t) 0),
(iv) P{t U, Z(t) = u, det Z (t) = 0} = 0
Then, for every Borel set B contained in U , one has
E NuZ (B) =
(2)
Theorem 2 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that
for each t U one has another random field Y t : W Rd , where W is some topological space,
verifying the following conditions:
a) Y t (w) is a measurable function of (, t, w) and almost surely, (t, w)
ous.
Y t (w) is continu-
one puts on C(W, Rd ) the topology of uniform convergence on compact sets. Then, for each
compact subset I of U , one has
g(t, Y t ) =
E
tI,Z(t)=u
(3)
Remarks:
1. We have already mentioned in the previous section sufficient conditions implying hypothesis (iv) in Theorem 1.
2. With the hypotheses of Theorem 1 it follows easily that if J is a subset of U , d (J) = 0,
then P{NuZ (J) = 0} = 1 for each u Rd .
pM (x) =
tS0
d
+
j=1
Sj
E | det(Xj (t))| 1IAx /X(t) = x, Xj (t) = 0 pX(t),Xj (t) (x, 0)j (dt),
(4)
Remark: One can replace | det(Xj (t))| in the conditional expectation by (1)j det(Xj (t)),
since under the conditioning and whenever M x holds true, Xj (t) is negative semi-definite.
Proof of Theorem 3
Let Nj (u), j = 0, . . . , d be the number of global maxima of X(.) on S that belong to Sj and are
larger than u. From the hypotheses it follows that a.s.
j=0,...,d Nj (u) is equal to 0 or 1, so
that
P{M > u} =
P{Nj (u) = 1} =
E(Nj (u)).
(5)
j=0,...,d
j=0,...,d
The proof will be finished as soon as we show that each term in (5) is the integral over (u, +)
of the corresponding term in (4).
This is self-evident for j = 0. Let us consider the term j = d. We apply the weighted Rice
formula of Section 2 as follows :
Z is the random field X defined on Sd .
For each t Sd , put W = S and Y t : S R2 defined as:
Y t (w) := X(w) X(t), X(t) .
Notice that the second coordinate in the definition of Y t does not depend on w.
6
In the place of the function g, we take for each n = 1, 2, . . . the function gn defined as
follows:
gn (t, f1 , f2 ) = gn (f1 , f2 ) = 1 Fn (sup f1 (w)) . 1 Fn (u f2 (w)) ,
wS
(6)
E
tSd ,X (t)=0
Sd
(7)
Notice that the formula holds true for each compact subset of Sd in the place of Sd , hence for
Sd itself by monotone convergence.
Let now n in (7). Clearly gn (Y t ) 1IX(s)X(t)0,sS . 1IX(t)u . The passage to the limit
does not present any difficulty since 0 gn (Y t ) 1 and the sum in the left-hand side is bounded
by the random variable N0X (Sd ), which is in L1 because of Rice Formula. We get
E(Nd (u)) =
Sd
= 1 ; sn S, (n = 1, 2, . . .) such that sn t,
7
t sn
as n +},
t sn
whenever this set is non-empty and Ct,j = {0} if it is empty. We will denote by Ct,j the dual
cone of Ct,j , that is:
Ct,j := {z Rd : z, 0 for all Ct,j }.
Notice that these definitions easily imply that Tt,j Ct,j and Ct,j Nt,j . Remark also that for
j = d0 , Ct,j = Nt,j .
We will say that the function X(.) has an extended outward derivative at the point t in
(t) C .
Sj , j d0 if Xj,N
t,j
Corollary 1 Under assumptions A1 to A5, one has :
(a) pM (x) p(x) where
E 1IX (t)Cbt,0 /X(t) = x pX(t) (x)+
p(x) :=
tS0
d0
Sj
j=1
j,N (t)Ct,j
p(x)dx.
u
Proof
(a) follows from Theorem 3 and the observation that if t Sj , one has
(t) C }. (b) is an obvious consequence of (a).
{M X(t)} {Xj,N
t,j
The actual interest of this Corollary depends on the feasibility of computing p(x). It turns
out that it can be done in some relevant cases, as we will see in the remainder of this section.
Our result can be compared with the approximation of P{M > u} by means of ∫_u^{+∞} pE(x) dx given by [3], [34], where
pE(x) := Σ_{t∈S0} E( 1I_{X'(t)∈Ĉt,0} / X(t) = x ) pX(t)(x) + Σ_{j=1}^{d0} (−1)^j ∫_{Sj} E( det(X''_j(t)) 1I_{X'_{j,N}(t)∈Ĉt,j} / X(t) = x, X'_j(t) = 0 ) pX(t),X'_j(t)(x, 0) σj(dt).
Under certain conditions, ∫_u^{+∞} pE(x) dx is the expected value of the EPC of the excursion set
Eu (see [3]). The advantage of pE (x) over p(x) is that one can have nice expressions for it in
quite general situations. Conversely, p(x) has the obvious advantage that it is an upper-bound
of the true density pM(x) and hence provides, upon integrating once, an upper-bound for the
tail probability, for every value of u. It is not known whether a similar inequality holds true for
pE (x).
On the other hand, under additional conditions, both provide good first order approximations
for pM(x) as x → +∞, as we will see in the next section. In the special case in which the process
X is centered and has a law that is invariant under isometries and translations, we describe
below a procedure to compute p(x).
For one-parameter centered Gaussian process having constant variance and satisfying certain
regularity conditions, a general bound for pM (x) has been computed in [8], pp.75-77. In the
two parameter case, Mercadier [26] has shown a bound for P{M > u}, obtained by means of a
method especially suited to dimension 2. When the parameter is one or two-dimensional, these
bounds are sharper than the ones below which, on the other hand, apply to any dimension but
to a more restricted context. We will assume now that the process X is centered Gaussian,
with a covariance function that can be written as
E( X(s) · X(t) ) = ρ( ‖s − t‖² ).   (10)
In what follows, ρ' and ρ'' denote the derivatives of ρ at zero. Elementary computations give:
1. E( ∂X/∂ti(t) · X(t) ) = 0,
2. E( ∂X/∂ti(t) · ∂X/∂tk(t) ) = −2ρ' δik, and ρ' < 0,
3. E( ∂²X/∂ti∂tk(t) · X(t) ) = 2ρ' δik,  E( ∂²X/∂ti∂tk(t) · ∂X/∂tj(t) ) = 0,
4. E( ∂²X/∂ti∂tk(t) · ∂²X/∂ti'∂tk'(t) ) = 4ρ''( δii' δkk' + δik' δki' + δik δi'k' ),
5. ρ'' − ρ'² ≥ 0,
6. If t ∈ Sj, the conditional distribution of X''_j(t) given X(t) = x, X'_j(t) = 0 is the same as
the unconditional distribution of the random matrix
Z + 2ρ' x Ij,
where Z = (Zik : i, k = 1, . . . , j) is a symmetric j × j matrix with centered Gaussian
entries, independent of the pair (X(t), X'(t)), such that, for i ≤ k, i' ≤ k', one has:
E( Zik Zi'k' ) = 4ρ''( δii' δkk' + δik' δki' ) + 4( ρ'' − ρ'² ) δik δi'k'.
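The moment identities above come from repeated differentiation of r(s, t) = ρ(‖s − t‖²) at s = t. The sketch below checks the first few of them symbolically for the illustrative choice ρ(x) = e^{−x/2} (so that −ρ'(0) = 1/2, the normalization used later in the text) in dimension 2; both the covariance and the dimension are assumptions made only for this check.

```python
import sympy as sp

s1, s2, t1, t2 = sp.symbols('s1 s2 t1 t2', real=True)
# Hypothetical isotropic covariance r(s,t) = rho(||s-t||^2) with rho(x) = exp(-x/2)
z = (s1 - t1)**2 + (s2 - t2)**2
r = sp.exp(-z/2)

at_t = {s1: t1, s2: t2}  # evaluate at s = t

# 1. E(dX/ds_i . X) = dr/ds_i at s = t  ->  0
assert sp.simplify(sp.diff(r, s1).subs(at_t)) == 0
# 2. E(dX/ds_i . dX/dt_k) = d^2 r/ds_i dt_k at s = t  ->  -2 rho'(0) delta_ik = delta_ik
assert sp.simplify(sp.diff(r, s1, t1).subs(at_t)) == 1
assert sp.simplify(sp.diff(r, s1, t2).subs(at_t)) == 0
# 3. E(d^2X/ds_i ds_k . X) = d^2 r/ds_i ds_k at s = t  ->  2 rho'(0) delta_ik = -delta_ik
assert sp.simplify(sp.diff(r, s1, s1).subs(at_t)) == -1
assert sp.simplify(sp.diff(r, s1, s2).subs(at_t)) == 0
print("moment identities verified")
```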
Let us introduce some additional notations:
Hn(x), n = 0, 1, . . . are the standard Hermite polynomials, i.e.
Hn(x) := e^{x²} (−1)^n (d^n/dx^n) e^{−x²}.
For x ∈ R put
Jn(x) := ∫_{−∞}^{+∞} e^{−y²/2} Hn(ᾱ) dy,  n = 0, 1, 2, . . .   (11)
where ᾱ stands for the linear form ᾱ = ay + bx, and a, b are real parameters that satisfy a² + b² = 1/2. Then
J'n(x) = b ∫_{−∞}^{+∞} e^{−y²/2} H'n(ᾱ) dy = 2nb ∫_{−∞}^{+∞} e^{−y²/2} H_{n−1}(ᾱ) dy = 2nb J_{n−1}(x).   (12)
Also:
Jn(0) = ∫_{−∞}^{+∞} e^{−y²/2} Hn(ay) dy = 0 if n is odd,   (13)
J_{2p}(0) = (−1)^p (2b)^{2p} (2p − 1)!! √(2π) = (−2b²)^p √(2π) (2p)!/p!.   (14)
Now we can go back to (12) and integrate successively for n = 1, 2, . . . on the interval [0, x],
using the initial value given by (14) when n = 2p and Jn(0) = 0 when n is odd, obtaining:
Qn(x) := (2b)^{−n} (2π)^{−1/2} Jn(x),   (15)
Q'n(x) = n Q_{n−1}(x),   (16)
Qn(0) = 0 if n is odd,   (17)
Q_{2p}(0) = (−1)^p (2p − 1)!! if n = 2p is even.   (18)
It is now easy to show that in fact Qn(x) = H̄n(x), n = 0, 1, 2, . . ., using for example that:
H̄n(x) = 2^{−n/2} Hn(x/√2).
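The relation invoked here is, for the standard probabilists'/physicists' normalizations, He_n(x) = 2^{−n/2} H_n(x/√2). Assuming the modified polynomials H̄n coincide with the probabilists' family (an assumption about the text's notation), it can be checked numerically:

```python
import numpy as np
from numpy.polynomial import hermite as H, hermite_e as He

# Verify He_n(x) = 2**(-n/2) * H_n(x / sqrt(2)) on a grid of points.
x = np.linspace(-3.0, 3.0, 61)
for n in range(8):
    c = np.zeros(n + 1); c[n] = 1.0
    lhs = He.hermeval(x, c)                            # probabilists' He_n(x)
    rhs = 2.0**(-n/2) * H.hermval(x / np.sqrt(2), c)   # physicists' H_n
    assert np.allclose(lhs, rhs)
print("Hermite relation verified for n = 0..7")
```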
The integrals
In(v) := ∫_v^{+∞} e^{−t²/2} Hn(t) dt, v ∈ R,
will appear in our computations. They are computed in the next lemma, which can be proved
easily, using the standard properties of Hermite polynomials.
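Assuming the Hn in the definition of In(v) are the physicists' Hermite polynomials, a single integration by parts gives the reduction In(v) = 2 e^{−v²/2} H_{n−1}(v) + 2(n − 1) I_{n−2}(v), which is the step behind the closed form in the next lemma. A numerical check of this reduction:

```python
import numpy as np
from numpy.polynomial import hermite as H
from scipy.integrate import quad

def hermval_n(x, n):
    # value of the physicists' Hermite polynomial H_n at x
    c = np.zeros(n + 1); c[n] = 1.0
    return H.hermval(x, c)

def I(n, v):
    # I_n(v) = int_v^inf exp(-t^2/2) H_n(t) dt, by quadrature
    val, _ = quad(lambda t: np.exp(-t**2/2) * hermval_n(t, n), v, np.inf)
    return val

for n in range(2, 7):
    for v in (-1.0, 0.0, 1.3):
        rec = 2*np.exp(-v**2/2)*hermval_n(v, n-1) + 2*(n-1)*I(n-2, v)
        assert abs(I(n, v) - rec) < 1e-4
print("recurrence for I_n(v) verified")
```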
Lemma 4
(a) For n ≥ 1:
In(v) = 2 e^{−v²/2} Σ_{k=0}^{[(n−1)/2]} 2^k ( (n − 1)!! / (n − 1 − 2k)!! ) H_{n−1−2k}(v) + 1I_{n even} 2^{n/2} (n − 1)!! √(2π) (1 − Φ(v)).   (19)
(b) In(−∞) = 1I_{n even} 2^{n/2} (n − 1)!! √(2π).   (20)
Theorem 4 Assume that the process X is centered Gaussian, satisfies conditions A1-A5 with
a covariance having the form (10) and verifying the regularity conditions of the beginning of this
section. Moreover, let S be a polyhedron. Then, p(x) can be expressed by means of the following
formula:
p(x) = φ(x) [ Σ_{t∈S0} σ̂0(t) + Σ_{j=1}^{d0} ( |ρ'|/π )^{j/2} ( H̄j(x) + Rj(x) ) gj ],   (22)
where
gj := ∫_{Sj} σ̂j(t) σj(dt),   (23)
and σ̂j(t) is the normalized solid angle of the cone Ĉt,j in Nt,j, that is:
σ̂j(t) := σ_{d−j−1}( Ĉt,j ∩ S^{d−j−1} ) / σ_{d−j−1}( S^{d−j−1} ) for j = 0, . . . , d0 − 1,   (24)
σ̂_{d0}(t) = 1.   (25)
Notice that for convex or other usual polyhedra, σ̂j(t) is constant for t ∈ Sj, so that gj is
equal to this constant multiplied by the j-dimensional geometric measure of Sj.
For j = 1, . . . d,
Rj (x) =
2
| |
j
2
((j + 1)/2
y2
dy
2
(26)
with := | |( )1/2
(27)
Tj (v) exp
where
v := (2)1/2 (1 2 )1/2 y x
and
where
Tj(v) := Σ_{k=0}^{j−1} ( Hk²(v) / (2^k k!) ) e^{−v²/2} − ( Hj(v) / (2^j (j − 1)!) ) I_{j−1}(v).   (28)
2 /2
qn () = e
2 /2
c2k Hk2 ()
k=0
+
ey
+ 1I{n odd
2 /2
Hn (y)dy 2
ey
2 /2
Hn (y)dy
Hn1 ()
,
+ y 2 /2
e
H
(y)dy
n1
(29)
qn+1 ()
,
n+1
(30)
Proof:
Denote by λ1, . . . , λn the eigenvalues of Gn. It is well known (Mehta [25], Kendall et al. [19])
that the joint density fn of the n-tuple of random variables (λ1, . . . , λn) is given by the formula
fn(λ1, . . . , λn) = cn exp( −(1/2) Σ_{i=1}^n λi² ) Π_{1≤i<k≤n} |λk − λi|,  with  cn := (2π)^{−n/2} Π_{i=1}^n Γ(3/2)/Γ(1 + i/2).
Then,
E( |det(Gn − νIn)| ) = E( Π_{i=1}^n |λi − ν| ) = ∫_{Rn} Π_{i=1}^n |λi − ν| cn exp( −(1/2) Σ_{i=1}^n λi² ) Π_{1≤i<k≤n} |λk − λi| dλ1 . . . dλn
= e^{ν²/2} (cn/cn+1) ∫_{Rn} fn+1(λ1, . . . , λn, ν) dλ1 . . . dλn = e^{ν²/2} cn qn+1(ν) / ( cn+1 (n + 1) ).   (31)
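Quantities of the type E(|det(Gn − νIn)|) appearing in (31) can also be estimated by simulation, which gives a sanity check on such closed forms. The sketch below assumes the GOE normalization matching the displayed eigenvalue density (diagonal entries N(0,1), off-diagonal N(0,1/2)); the value ν = 0.8 and the repetition counts are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def goe(n):
    # symmetric matrix whose eigenvalue density is c_n exp(-sum l_i^2/2) prod |l_k - l_i|:
    # diagonal ~ N(0,1), off-diagonal ~ N(0,1/2)
    a = rng.normal(size=(n, n))
    return (a + a.T) / 2.0

def mc_absdet(n, nu, reps):
    # Monte Carlo estimate of E |det(G_n - nu I_n)|
    return float(np.mean([abs(np.linalg.det(goe(n) - nu * np.eye(n)))
                          for _ in range(reps)]))

nu = 0.8
# exact value for n = 1: E|g - nu| with g ~ N(0,1)
closed = 2 * norm.pdf(nu) + nu * (2 * norm.cdf(nu) - 1)
est = mc_absdet(1, nu, reps=60_000)
assert abs(est - closed) < 0.03
print(f"n=1: MC {est:.3f} vs exact {closed:.3f}")
print(f"n=3: MC estimate {mc_absdet(3, nu, reps=20_000):.3f}")
```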
j/2
j/2
(2 )
Xj,N
(t)
is independent of
(33)
Since the distribution of X'(t) is centered Gaussian with variance −2ρ' Id, it follows that, if t ∈ S0:
E( 1I_{X'(t)∈Ĉt,0} / X(t) = x ) = σ̂0(t),   (34)
and if t Sj , j 1:
E(| det(Xj (t))| 1IX
j,N (t)Ct,j
/X(t) = x, Xj (t) = 0)
8 Gj + 2
E | det(Gj Ij )| (y)dy,
Performing these computations, one obtains
pE(x) = φ(x) [ Σ_{t∈S0} σ̂0(t) + Σ_{j=1}^{d0} ( |ρ'|/π )^{j/2} H̄j(x) gj ],   (36)
which is the product of a standard Gaussian density times a polynomial with degree d0.
Integrating once, we get, in our special case, the formula for the expectation of the EPC
of the excursion set as given by [3].
The complementary term given by
d0
Rj (x)gj ,
(x)
(37)
j=1
can be computed by means of a formula, as it follows from the statement of the theorem
above. These formulae will be in general quite unpleasant due to the complicated form of
Tj (v). However, for low dimensions they are simple. For example:
T1(v) = e^{−v²/2} − √(2π) v (1 − Φ(v)),   (38)
T2(v) = 2 e^{−v²/2} = 2√(2π) φ(v),   (39)
T3(v) = (3/2)(1 + 2v²) e^{−v²/2} + √(2π) ( (3/2) v − v³ ) (1 − Φ(v)).   (40)
Second order asymptotics for pM (x) as x + will be mainly considered in the next
section. However, we state already that the complementary term (37) is equivalent, as
x +, to
12
2
3 2
x2
(41)
j+1
2
j/4
j/2
3 2
(2) (j 1)!
2j4
(42)
We are not going to go through this calculation, which is elementary but requires some
work. An outline of it is the following. Replace the Hermite polynomials in the expression
for Tj (v) given by (28) by the well-known expansion:
Hj(v) = j! Σ_{i=0}^{[j/2]} (−1)^i (2v)^{j−2i} / ( i! (j − 2i)! ).   (43)
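Expansion (43) can be verified directly against a library implementation of the physicists' Hermite polynomials:

```python
import math
import numpy as np
from numpy.polynomial import hermite as H

def hermite_explicit(j, v):
    # H_j(v) = j! * sum_i (-1)^i (2v)^(j-2i) / (i! (j-2i)!), as in (43)
    return math.factorial(j) * sum(
        (-1)**i * (2*v)**(j - 2*i) / (math.factorial(i) * math.factorial(j - 2*i))
        for i in range(j // 2 + 1))

for j in range(8):
    c = np.zeros(j + 1); c[j] = 1.0
    for v in (-1.5, 0.3, 2.0):
        assert abs(hermite_explicit(j, v) - H.hermval(v, c)) < 1e-6
print("expansion (43) matches numpy's Hermite polynomials for j = 0..7")
```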
One finds that, as v → +∞, Tj(v) is equivalent to
( 2^{j−1} / (j − 1)! ) v^{2j−4} e^{−v²/2}.   (44)
Using now the definition of Rj (x) and changing variables in the integral in (26), one gets
for Rj (x) the equivalent:
12
Kj x2j4 e
2
3 2
x2
(45)
In particular, the equivalent of (37) is given by the highest order non-vanishing term in
the sum.
Consider now the case in which S is the sphere S d1 and the process satisfies the same
conditions as in the theorem. Even though the theorem cannot be applied directly,
it is possible to deal with this example to compute p(x), only performing some minor
changes. In this case, only the term that corresponds to j = d − 1 in (8) does not vanish,
Ct,d−1 = Nt,d−1, so that 1I_{X'_{d−1,N}(t)∈Ĉt,d−1} = 1 for each t ∈ S^{d−1}, and one can use invariance
p(x) = φ(x) ( σ_{d−1}(S^{d−1}) / (2π)^{(d−1)/2} ) E( |det( Z + 2ρ' x I_{d−1} + (2|ρ'|)^{1/2} η I_{d−1} )| ),   (46)
where η is a standard normal random variable, since the derivative of the field in the normal direction has
variance 2|ρ'| and is independent of the tangential derivative. So, we apply the previous
computation, replacing x by x + (2|ρ'|)^{1/2} y, and obtain the expression:
p(x) = φ(x) ( 2π^{d/2} / Γ(d/2) ) ( |ρ'|/π )^{(d−1)/2} ∫_{−∞}^{+∞} [ H̄_{d−1}( x + (2|ρ'|)^{1/2} y ) + R_{d−1}( x + (2|ρ'|)^{1/2} y ) ] φ(y) dy.   (47)
Asymptotics as x → +∞
In this section we will consider the errors in the direct and the EPC methods for large values
of the argument x. These errors are:
p(x) − pM(x) = Σ_{t∈S0} E( 1I_{X'(t)∈Ĉt,0} 1I_{M>x} / X(t) = x ) pX(t)(x)
+ Σ_{j=1}^{d0} ∫_{Sj} E( |det(X''_j(t))| 1I_{X'_{j,N}(t)∈Ĉt,j} . 1I_{M>x} / X(t) = x, X'_j(t) = 0 ) pX(t),X'_j(t)(x, 0) σj(dt).   (48)
d0
(1)j
+
j=1
Sj
j,N (t)Ct,j
(50)
Proof :
Let W be an open neighborhood of the compact subset Sv of S such that dist(W, (S\Sd0 )) > 0
where dist denotes the Euclidean distance in Rd. For t ∈ Sj ∩ W^c, the density
pX(t),Xj (t) (x, 0)
can be written as the product of the density of Xj (t) at the point 0, times the conditional density
of X(t) at the point x given that Xj (t) = 0, which is Gaussian with some bounded expectation
and a conditional variance which is smaller than the unconditional variance, hence, bounded by
some constant smaller than 1. Since the conditional expectations in (48) are uniformly bounded
by some constant, due to standard bounds on the moments of the Gaussian law, one can deduce
that:
p(x) pM (x) =
W Sd0
d0 ,N (t)Ct,d0
.pX(t),Xd
as x → +∞, for some δ1 > 0. Our following task is to choose W such that one can ensure
that the first term in the right-hand side of (51) has the same form as the second, with a
possibly different constant δ1.
To do this, for s ∈ S and t ∈ Sd0, let us write the Gaussian regression formula of X(s) on the
pair (X(t), X'_{d0}(t)):
X(s) = at(s) X(t) + ⟨ bt(s), X'_{d0}(t) ⟩ + ( ‖t − s‖² / 2 ) X^t(s),   (52)
where the regression coefficients at(s), bt(s) are respectively real-valued and Rd0-valued.
From now onwards, we will only be interested in those t ∈ W. In this case, since W does not
contain boundary points of S\Sd0, it follows that
Ĉt,d0 = Nt,d0 and 1I_{X'_{d0,N}(t)∈Ĉt,d0} = 1.
Moreover, whenever s ∈ S is close enough to t, necessarily s ∈ Sd0, and one can show that
the Gaussian process {X^t(s) : t ∈ W ∩ Sd0, s ∈ S} is bounded, in spite of the fact that its
trajectories are not continuous at s = t. For each t, {X^t(s) : s ∈ S} is a helix process; see [8]
for a proof of boundedness.
On the other hand, conditionally on X(t) = x, X'_{d0}(t) = 0, the event {M > x} can be written as
{ X^t(s) > β̄t(s) x, for some s ∈ S },
where
β̄t(s) := 2( 1 − at(s) ) / ‖t − s‖².   (53)
Our next goal is to prove that if one can choose W in such a way that
inf{ β̄t(s) : t ∈ W ∩ Sd0, s ∈ S, s ≠ t } > 0,   (54)
then we are done. In fact, apply the Cauchy-Schwarz inequality to the conditional expectation
in (51). Under the conditioning, the elements of Xd0 (t) are the sum of affine functions of x
with bounded coefficients plus centered Gaussian variables with bounded variances, hence, the
absolute value of the conditional expectation is bounded by an expression of the form
Q(t, x) [ P{ sup_{s∈S\{t}} X^t(s)/β̄t(s) > x } ]^{1/2},   (55)
where Q(t, x) is a polynomial in x of degree 2d0 with bounded coefficients. For each t W Sd0 ,
the second factor in (55) is bounded by
[ P{ sup( X^t(s)/β̄t(s) : t ∈ W ∩ Sd0, s ∈ S, s ≠ t ) > x } ]^{1/2} ≤ C2 exp(−δ2 x²),
for some positive constants C2, δ2 and any x > 0. Also, the same argument above for the density
pX(t),X'_{d0}(t)(x, 0) shows that it is bounded by a constant times the standard Gaussian density.
To finish, it suffices to replace these bounds in the first term at the right-hand side of (51).
It remains to choose W for (54) to hold true. Consider the auxiliary process
Y(s) := X(s) / ( r(s, s) )^{1/2}, s ∈ S.   (56)
s = t, it follows that X(t), Xd0 (t) are independent, on differentiation under the expectation
sign. This implies that in the regression formula (52) the coefficients are easily computed and
at(s) = r(s, t), which is strictly smaller than 1 if s ≠ t, because of the non-degeneracy condition.
Then
2(1 r(s, t))
2(1 r Y (s, t))
t (s) =
.
(57)
ts 2
ts 2
Since r^Y(s, s) = 1 for every s ∈ S, the Taylor expansion of r^Y(s, t) as a function of s, around
s = t, takes the form:
r^Y(s, t) = 1 + ⟨ s − t, r^Y_{20,d0}(t, t)(s − t) ⟩ + o( ‖s − t‖² ),   (58)
(59)
where the last equality follows by differentiation in (56) and putting s = t. (59) implies that
−r^Y_{20,d0}(t, t) is uniformly positive definite on t ∈ Sv, meaning that its minimum eigenvalue has
a strictly positive lower bound. This, on account of (57) and (58), already shows that
inf{ β̄t(s) : t ∈ Sv, s ∈ S, s ≠ t } > 0,   (60)
(61)
(1 at0 (s0 ))
= t0 (s0 ),
t0 s 0 2
d0 t s
s
2
X t (s),
where (at )d0 (s) is a column vector of size d0 and (bt )d0 (s) is a d0 d0 matrix. Then, one must
have at (t) = 1, (at )d0 (t) = 0 . Thus
tn (sn ) = uTn (at0 )d0 (t0 )un + o(1),
where un := (sn − tn)/‖sn − tn‖. Since t0 ∈ Sv, we may apply (61), and the limit of β̄tn(sn)
cannot be non-positive.
A straightforward application of Theorem 5 is the following:
Corollary 2 Under the hypotheses of Theorem 5, there exist positive constants C, δ such that,
for every u > 0:
+
+
u
and
σd² := lim_{x→+∞} −(2/x²) log( p(x) − pM(x) )   (62)
and
σE² := lim_{x→+∞} −(2/x²) log( |pE(x) − pM(x)| ),   (63)
whenever these limits exist. In general, we are unable to compute the limits (62) or (63), or
even to prove that they actually exist or differ. Our more general results (as well as those in [3], [34])
only contain lower bounds for the liminf as x → +∞. This is already interesting since it gives
upper bounds for the speed of approximation of pM(x) either by p(x) or pE(x). On the
other hand, in Theorem 7 below, we are able to prove the existence of the limit and compute
σd² for a relevant class of Gaussian processes.
For the next theorem we need an additional condition on the parameter set S. For S
verifying A1 we define
(S) = sup
sup
0jd0 tSj
sup
sS,s=t
(64)
t2 := sup
and
μt := sup_{s∈S\{t}} dist( Λt^{−1} r01(s, t), Ĉt,j ) / ( 1 − r(s, t) ),   (66)
where
Λt := Var(X'(t)), λ(t) is the maximum eigenvalue of Λt,
and in (66), j is such that t ∈ Sj (j = 0, 1, . . . , d0).
The quantity in the right hand side of (65) is strictly bigger than 1.
Remark. In formula (65) it may happen that the denominator in the right-hand side is
identically zero, in which case we put +∞ for the supremum. This is the case of the one-parameter
process X(t) = ξ cos t + η sin t, where ξ, η are standard independent Gaussian random variables,
and S is an interval having length strictly smaller than π.
Proof of Theorem 6
Let us first prove that sup_{t∈S} μt < +∞.
For each t ∈ S, let us write the Taylor expansion
r01(s, t) = r01(t, t) + r11(t, t)(s − t) + O( ‖s − t‖² )
= Λt (s − t) + O( ‖s − t‖² ).
Hence,
dist( Λt^{−1} r01(s, t), Ĉt,j ) / (1 − r(s, t)) ≤ L3 ‖s − t‖² / (1 − r(s, t)) + L4 ≤ L3 κ(S) + L4.   (67)
With the same notations as in the proof of Theorem 5, using (4) and (8), one has:
d0
+
j=1
Sj
j,N (t)Ct,j .
t2
1
,
+ (t)2t
the events
{M > x} and {R^t(s) > (1 − r(s, t)) x − r01^T(s, t) Λt^{−1} X'_{j,N}(t) for some s ∈ S}   (69)
coincide.
Denote by (X'_{j,N}(t) | X'_j(t) = 0) the regression of X'_{j,N}(t) on X'_j(t) = 0. So, the probability in
(69) can be written as
Cbt,j
P{ t (s) > x
T (s, t)1 x
r01
(70)
where
ζ^t(s) := R^t(s) / ( 1 − r(s, t) ).
If 1
t r01 (s, t) Ct,j one has
T
r01
(s, t)1
t x 0
1
t r01 (s, t) = z + z
So, if x Ct,j :
T (s, t)1 x
r01
z T x + z T x
t
=
t x
1 r(s, t)
1 r(s, t)
using that z^T x' ≥ 0 and the Cauchy-Schwarz inequality. It follows that in any case, if x' ∈ Ĉt,j,
the expression in (70) is bounded by
Cbt,j
(71)
To obtain a bound for the probability in the integrand of (71) we will use the classical
inequality for the tail of the distribution of the supremum of a Gaussian process with bounded
paths.
The Gaussian process (s, t) → ζ^t(s), defined on (S × S)\{s = t}, has continuous paths. As
the pair (s, t) approaches the diagonal of S × S, ζ^t(s) may not have a limit but, almost surely,
it is bounded (see [8] for a proof). (For fixed t, ζ^t(.) is a helix process with a singularity at
s = t, a class of processes that we have already met above.)
We set
m^t(s) := E( ζ^t(s) ) (s ≠ t),
m' := sup_{s,t∈S, s≠t} |m^t(s)|,
μ := E( | sup_{s,t∈S, s≠t} ( ζ^t(s) − m^t(s) ) | ).
The almost sure boundedness of the paths of ζ^t(s) implies that m' < +∞ and μ < +∞. Applying the
Borell-Sudakov-Tsirelson type inequality (see for example Adler [2] and references therein) to
the centered process s → ζ^t(s) − m^t(s) defined on S\{t}, we get, whenever x − μt x − m' > 0:
P{ ζ^t(s) > x − μt x for some s ∈ S } ≤ exp( −( x − μt x − m' )² / (2σt²) ).
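The Borell-Sudakov-Tsirelson inequality used here states that, for a centered Gaussian process with bounded paths, P{sup X > x} ≤ exp(−(x − E sup X)²/(2σ²)) for x > E sup X, where σ² is the maximal pointwise variance. A quick numerical illustration on Brownian motion on [0, 1], chosen only because its supremum distribution is known exactly (this example is an assumption for illustration, not part of the text's argument):

```python
import numpy as np
from scipy.stats import norm

# For Brownian motion W on [0,1]: P{sup W > x} = 2(1 - Phi(x)) by the reflection
# principle, E sup W = sqrt(2/pi), and sigma^2 = sup_t Var W(t) = 1.
e_sup = np.sqrt(2/np.pi)
for x in (1.5, 2.0, 3.0):
    exact = 2*(1 - norm.cdf(x))
    borell = np.exp(-(x - e_sup)**2 / 2)
    assert exact <= borell
    print(f"x={x}: exact tail {exact:.2e} <= Borell-TIS bound {borell:.2e}")
```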
jd
2
x mj,N (t)
exp
2j (t)
where λj(t) and λ̄j(t) are respectively the minimum and maximum eigenvalues of Var(X'_{j,N}(t) | X'_j(t) = 0),
and m_{j,N}(t) is the conditional expectation E(X'_{j,N}(t) | X'_j(t) = 0). Notice that λj(t), λ̄j(t), m_{j,N}(t)
are bounded, λj(t) is bounded below by a positive constant, and λ̄j(t) ≤ λ(t).
P {Xj,N
Ct,j } {M > x}/X(t) = x, Xj (t) = 0
(2j (t))
jd
2
exp
x mj,N (t) 2
(x t x m )2
dx
+
2t2
2(t)
xm
+ P Xj,N
(t)|Xj (t) = 0
, (72)
t
where it is understood that the second term in the right-hand side vanishes if t = 0.
Let us consider the first term in the right-hand side of (72). We have:
x mj,N (t)
(x t x m )2
+
2t2
2(t)
2
(x t x m )2 ( x mj,N (t) )
+
2t2
2(t)
(x m t mj,N (t) )2
2
,
= A(t) x + B(t)(x m ) + C(t) +
2t2 + 2(t)2t
where the last inequality is obtained after some algebra; A(t), B(t), C(t) are bounded functions,
and A(t) is bounded below by some positive constant.
So the first term in the right-hand side of (72) is bounded by :
2.(2j )
jd
2
exp
(x m t mj,N (t))2
2t2 + 2(t)2t
Rdj
dx
(x m t mj,N (t) )2
2t2 + 2(t)2t
(73)
where L is some constant. The last inequality follows easily using polar coordinates.
Consider now the second term in the right-hand side of (72). Using the form of the conditional
density pXj,N
(t)/Xj (t)=0 (x ), it follows that it is bounded by
P
(Xj,N
(t)/Xj (t)
= 0)
mj,N (t)
x m t mj,N (t)
t
L1 |x|dj2 exp
(x m t mj,N (t) )2
2(t)2t
where L1 is some constant. Putting together (73) and (74) with (72), we obtain (69).
The following two corollaries are straightforward consequences of Theorem 6:
(74)
tS 2
t
x+
1
.
+ (t)2t
d0
p (x) =
j=0
Examples
1 v (t0 )/2
1/k
kCk
E || 2k 1 x11/k (x),
(75)
where ξ is a standard normal random variable and Ck := (1/(2k)!) v^{(2k)}(t0) + (1/4) [v''(t0)]² 1I_{k=2}. The
proof is a direct application of the Laplace method. The result is new for the density of the
maximum, but if we integrate the density from u to +∞, the corresponding bound for P{M > u}
is known under weaker hypotheses (Piterbarg [28]).
2) Let the process X be centered and satisfy A1-A5. Assume that the law of the process
is isotropic and stationary, so that the covariance has the form (10) and verifies the regularity
condition of Section 4. We add the simple normalization λ := −ρ'(0) = 1/2. One can easily
check that
σt² = sup_{s∈S\{t}} [ 1 − ρ²(‖s − t‖²) − 4 ρ'²(‖s − t‖²) ‖s − t‖² ] / [ 1 − ρ(‖s − t‖²) ]².   (76)
Furthermore, if
ρ'(x) ≤ 0 for x ≥ 0,   (77)
one can show that the sup in (76) is attained as ‖s − t‖ → 0 and is independent of t. Its value
is
σt² = 12ρ'' − 1.
The proof is elementary (see [4] or [34]).
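The value σt² = 12ρ'' − 1 can be checked symbolically on a concrete covariance: taking ρ(z) = e^{−z/2}, which satisfies the normalization −ρ'(0) = 1/2 and condition (77), the ratio in (76) tends, as s → t, to 12ρ''(0) − 1 = 2. The choice of ρ is an illustrative assumption:

```python
import sympy as sp

z = sp.symbols('z', positive=True)
rho = sp.exp(-z/2)                            # example covariance: rho(0)=1, rho'(0)=-1/2
num = 1 - rho**2 - 4*sp.diff(rho, z)**2 * z   # numerator of (76), with z = ||s-t||^2
den = (1 - rho)**2
lim = sp.limit(num/den, z, 0)
rpp0 = sp.diff(rho, z, 2).subs(z, 0)          # rho''(0) = 1/4
assert lim == 12*rpp0 - 1 == 2
print("limit of the ratio in (76) as s -> t:", lim)
```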
Let S be a convex set. For t ∈ Sj, s ∈ S:
dist( r01(s, t), Ĉt,j ) = dist( −2ρ'(‖s − t‖²)(t − s), Ĉt,j ).   (78)
The convexity of S implies that (t − s) ∈ Ĉt,j. Since Ĉt,j is a convex cone and −2ρ'(‖s − t‖²) ≥ 0,
one can conclude that r01(s, t) ∈ Ĉt,j, so that the distance in (78) is equal to zero. Hence,
μt = 0 for every t ∈ S,
and an application of Theorem 6 gives the inequality
lim inf_{x→+∞} −(2/x²) log( p(x) − pM(x) ) ≥ 1 + 1/(12ρ'' − 1).   (79)
A direct consequence is that the same inequality holds true when replacing p(x) − pM(x) by
|pE(x) − pM(x)| in (79), thus obtaining the main explicit example in Adler and Taylor [3], or in
Taylor et al. [34].
Next, we improve (79). In fact, under the same hypotheses, we prove that the liminf is an
ordinary limit and the inequality is an equality. We state this as:
Theorem 7 Assume that X is centered, satisfies hypotheses A1-A5, the covariance has the
form (10) with −ρ'(0) = 1/2, ρ'(x) ≤ 0 for x ≥ 0. Let S be a convex set, and d0 = d ≥ 1.
Then
lim_{x→+∞} −(2/x²) log( p(x) − pM(x) ) = 1 + 1/(12ρ'' − 1).   (80)
Remark Notice that since S is convex, the added hypothesis that the maximum dimension d0
such that Sj is not empty is equal to d is not an actual restriction.
Proof of Theorem 7
In view of (79), it suffices to prove that
lim sup_{x→+∞} −(2/x²) log( p(x) − pM(x) ) ≤ 1 + 1/(12ρ'' − 1).   (81)
Using (4) and the definition of p(x) given by (8), one has the inequality
p(x) − pM(x) ≥ (2π)^{−d/2} φ(x) ∫_{Sd} E( |det(X''(t))| 1I_{M>x} / X(t) = x, X'(t) = 0 ) σd(dt),   (82)
where our lower bound only contains the term corresponding to the largest dimension, and we
have already replaced the density pX(t),X'(t)(x, 0) by its explicit expression using the law of the
process. Under the condition {X(t) = x, X'(t) = 0}, if v0^T X''(t) v0 > 0 for some v0 ∈ S^{d−1}, a
Taylor expansion implies that M > x. It follows that
E( |det(X''(t))| 1I_{M>x} / X(t) = x, X'(t) = 0 ) ≥ E( |det(X''(t))| 1I_{sup_{v∈S^{d−1}} v^T X''(t) v > 0} / X(t) = x, X'(t) = 0 ).   (83)
We now apply Lemma 2, which describes the conditional distribution of X''(t) given X(t) =
x, X'(t) = 0. Using the notations of this lemma, we may write the right-hand side of (83) as:
E | det(Z xId)| 1I
sup v T Zv > x
vS d1
=
x
y2
dy, (84)
2 2
y
Z12
... ...
Z1d
2 + y Z23 . . .
Z2d
Z :=
,
..
.
d + y
where the random variables {ξ2, . . . , ξd, Zik, 1 ≤ i < k ≤ d} are independent centered Gaussian
with
Var(Zik) = 4ρ'' (1 ≤ i < k ≤ d); Var(ξi) = 16ρ''(8ρ'' − 1)/(12ρ'' − 1) (i = 2, . . . , d); σ² = 12ρ'' − 1.
2
exp(
x
L
y2
)E | det(ZxId)| dy
2
2
2
x(1+0 )+1
exp(
x(1+0 )
y2
)0 (1(1+0 ))d1 xd dy
2 2
for x large enough. On account of (82), (83), (84), we conclude that, for x large enough,
p(x) − pM(x) ≥ L1 x^d exp( −x²/2 − ( x(1 + δ0) + 1 )² / (2σ²) )
for some new positive constant L1. Since δ0 can be chosen arbitrarily small, this implies (81).
3) Consider the same processes of Example 2, but now defined on the non-convex set {a ≤
‖t‖ ≤ b}, 0 < a < b. The same calculations as above show that μt = 0 if a < ‖t‖ ≤ b and
t = max
for t = a.
4) Let us keep the same hypotheses as in Example 2, but without assuming that the covariance is decreasing as in (77). The variance is still given by (76), but μt is not necessarily equal
to zero. More precisely, relation (78) shows that
μt ≤ sup_{s∈S\{t}} 2 ( ρ'(‖s − t‖²) )⁺ ‖s − t‖ / ( 1 − ρ(‖s − t‖²) ).
The normalization −ρ'(0) = 1/2 implies that the process X is identity speed, that is,
Var(X'(t)) = Id, so that λ(t) = 1. An application of Theorem 6 gives
lim inf_{x→+∞} −(2/x²) log( p(x) − pM(x) ) ≥ 1 + 1/Z,   (85)
where
Z := sup_{z∈(0,Δ]} [ 1 − ρ²(z²) − 4ρ'²(z²) z² ] / [ 1 − ρ(z²) ]² + max_{z∈(0,Δ]} ( 2( ρ'(z²) )⁺ z / [ 1 − ρ(z²) ] )²
and Δ is the diameter of S.
5) Suppose that:
the process X is stationary with covariance ρ(t) := Cov(X(s), X(s + t)) that satisfies
ρ(s1, . . . , sd) = Π_{i=1,...,d} ρi(si), where ρ1, ..., ρd are d covariance functions on R which are
monotone, positive on [0, +∞) and of class C⁴;
S is a rectangle
S = Π_{i=1,...,d} [ai, bi], ai < bi.
Then, adding an appropriate non-degeneracy condition, conditions A2-A5 are fulfilled and Theorem 6 applies.
It is easy to see that
r01(s, t) = ( ρ'1(s1 − t1) ρ2(s2 − t2) · · · ρd(sd − td), . . . , ρ1(s1 − t1) · · · ρ_{d−1}(s_{d−1} − t_{d−1}) ρ'd(sd − td) )^T
belongs to Ĉt,j for every s ∈ S. As a consequence, μt = 0 for all t ∈ S. On the other hand,
standard regression formulae show that
2
2
2
2
2
1 21 . . . 2d 2
Var X(s)/X(t), X (t)
1 2 . . . d 1 . . . d1 d
=
,
(1 r(s, t))2
(1 1 . . . d )2
References
[1] Adler, R.J. (1981). The Geometry of Random Fields. Wiley, New York.
[2] Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes. IMS, Hayward, Ca.
[3] Adler, R.J. and Taylor J. E.(2005). Random fields and geometry. Book to appear.
[4] Azaïs, J-M., Bardet, J-M. and Wschebor, M. (2002). On the tails of the distribution of the
maximum of a smooth stationary Gaussian process. ESAIM: P. and S., 6, 177-184.
[5] Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the
maximum of Gaussian random fields. Extremes, 5(2), 181-212.
[6] Azaïs, J-M. and Wschebor, M. (2002). The Distribution of the Maximum of a Gaussian
Process: Rice Method Revisited. In and out of equilibrium: probability with a physical
flavour, Progress in Probability, 321-348, Birkhäuser.
[7] Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum
of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
[8] Azaïs, J-M. and Wschebor, M. (2005). On the Distribution of the Maximum of a Gaussian
Field with d Parameters. Annals of Applied Probability, 15 (1A), 254-278.
[9] Azaïs, J-M. and Wschebor, M. (2006). A self contained proof of the Rice formula for random
fields. Preprint available at http://www.lsp.ups-tlse.fr/Azais/publi/completeproof.pdf.
[10] Belyaev, Y. (1966). On the number of intersections of a level by a Gaussian Stochastic
process. Theory Prob. Appl., 11, 106-113.
[11] Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a
Gaussian process with stationary increments. J. Appl. Prob., 22,454-460.
[12] Berman, S.M. (1992). Sojourns and extremes of stochastic processes, The Wadworth and
Brooks, Probability Series.
[13] Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30,
207-216.
[14] Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci. Paris, Ser. I, 337, 663-666.
[15] Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J.
Wiley & Sons, New York.
[16] Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. Numer. Math., 94, 419-478.
[17] Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École
d'Été de Probabilités de Saint-Flour (1974). Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
[18] Fyodorov, Y. (2004). Complexity of Random Energy Landscapes, Glass Transition and
Absolute Value of the Spectral Determinant of Random Matrices. Physical Review Letters, 92,
240601 (4 pages); Erratum: ibid., 93 (2004), 149901 (1 page).
[19] Kendall, M.G., Stuart,A. and Ord, J.K. (1987). The Advanced Theory of Statistics, Vol. 3.
[20] Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā
Ser. A, 32, 369-378.
[21] Ledoux, M. (2001). The Concentration of Measure Phenomenon. American Math. Soc.,
Providence, RI.
[22] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer-Verlag,
New York.
[23] Marcus, M.B. (1977). Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths. Ann. Probab., 5, 52-71.
[24] Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc.
Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
[25] Mehta, M.L. (2004). Random Matrices, 3rd ed. Academic Press.
[26] Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of one- and
two-dimensional processes. To appear in Advances in Applied Probability, 38 (1).
[27] Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian
processes. Th. Probab. Appl., 26, 687-705.
[28] Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and
Fields. American Mathematical Society. Providence. Rhode Island.
[29] Rychlik, I. (1990). New bounds for the first passage, wave-length and amplitude densities.
Stochastic Processes and their Applications, 34, 313-339.
[30] Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties of half-spaces for spherically
invariant measures (in Russian). Zap. Nauchn. Sem. LOMI, 45, 75-82.
[31] Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields, Ann. Probab.,
21, 34-71.
[32] Talagrand, M. (1996). Majorizing measures: the generic chaining. Ann. Probab., 24,
1049-1103.
[33] Taylor, J.E. and Adler, R. J. (2003). Euler characteristics for Gaussian fields on manifolds.
Ann. Probab., 31, 533-563.
[34] Taylor J.E., Takemura A. and Adler R.J. (2005). Validity of the expected Euler Characteristic heuristic. Ann. Probab., 33, 4, 1362-1396.
[35] Wschebor, M. (1985). Surfaces aléatoires. Mesure géométrique des ensembles de niveau.
Lecture Notes in Mathematics, 1147, Springer-Verlag.
We begin with a proof of the so-called Area formula, under conditions that will be sufficient for
our main purpose. One can find this formula in its full generality in Federer [3], Th. 3.2.5.
For any function f, we denote by Nuf(T) the number of roots of the equation f(t) = u that
belong to the subset T of the domain of f .
Proposition 1 (Area formula) Let f be a C¹ function defined on an open subset U of Rd,
taking values in Rd. Assume that the set of critical values of f has zero Lebesgue measure.
Let g : Rd → R be continuous and bounded. Then
∫_{Rd} g(u) Nuf(T) du = ∫_T |det(f'(t))| g(f(t)) dt   (1)
for any Borel subset T of U, whenever the integral in the right-hand side is well defined.
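A one-dimensional numerical sanity check of (1), with the illustrative choices f = sin, g(u) = e^{−u²} and T = (0, 10); both sides are approximated numerically, and roots are counted via sign changes on a grid:

```python
import numpy as np
from scipy.integrate import quad

f, fp = np.sin, np.cos
g = lambda u: np.exp(-u**2)

# right-hand side of (1): integral of |f'(t)| g(f(t)) over T = (0, 10)
rhs, _ = quad(lambda t: abs(fp(t)) * g(f(t)), 0.0, 10.0, limit=200)

# left-hand side: integral of g(u) N_u^f(T); N_u^f(T) counted by sign changes
t = np.linspace(0.0, 10.0, 20001)
ft = f(t)

def n_roots(u):
    s = np.sign(ft - u)
    return int(np.count_nonzero(s[1:] * s[:-1] < 0))

us = np.linspace(-0.999, 0.999, 2001)   # N_u = 0 outside [-1, 1]
du = us[1] - us[0]
lhs = float(sum(n_roots(u) * g(u) for u in us) * du)

assert abs(lhs - rhs) < 0.02
print(f"lhs ~ {lhs:.4f}, rhs ~ {rhs:.4f}")
```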
Proof. Notice first that, due to standard extension arguments, it suffices to prove (1) for
non-negative g and for T a compact parallelotope contained in U. Second, if T is a compact
parallelotope, since f is C¹, the set of boundary values of f, that is, f(∂T), has Lebesgue measure
zero.
We next define an auxiliary function δ(u) for u ∈ Rd in the following way:
If u is neither a critical value nor a boundary value and n := Nuf(T) is non-zero, we denote
by x(1), . . . , x(n) the roots of f(x) = u belonging to T. Using the local inversion theorem, we
know that there exist some δ > 0 and n neighborhoods U1, ..., Un of x(1), . . . , x(n) such that:
if t ∈ T and t ∉ ∪_{i=1}^n U_i, then f(t) ∉ B(u; δ).  (2)
Let δ(u) > 0 and 0 < ε < δ(u). Using the change of variables formula we have

∫_T |det(f'(t))| 1I_{‖f(t)−u‖<ε} dt = Σ_{i=1}^n ∫_{U_i} |det(f'(t))| 1I_{‖f(t)−u‖<ε} dt = n V(ε),

where V(ε) is the volume of the ball with radius ε in R^d. Thus, dividing by V(ε), we have an exact counter for N_u^f(T) when it is non-zero, which obviously holds true also when N_u^f(T) = 0, for ε < δ(u).
Let g : R^d → R be continuous, bounded and non-negative, and δ_0 > 0. For every ε < δ_0/2 we have:

∫_{R^d} g(u) N_u^f(T) F(δ(u)/δ_0) du = ∫_{R^d} g(u) F(δ(u)/δ_0) [ (1/V(ε)) ∫_T |det(f'(t))| 1I_{‖f(t)−u‖<ε} dt ] du.
Applying Fubini's Theorem we see that the expression above is equal to:

A_{δ_0,ε} := ∫_T |det(f'(t))| [ (1/V(ε)) ∫_{B(f(t);ε)} F(δ(u)/δ_0) g(u) du ] dt.

A_{δ_0,ε} in fact does not depend on ε, so it is equal to its limit as ε → 0, which, because of the continuity of the function u ↦ F(δ(u)/δ_0) g(u), is equal to

∫_T |det(f'(t))| F(δ(f(t))/δ_0) g(f(t)) dt.
Let now δ_0 tend to zero and use monotone convergence. For the left-hand side, we take into account that the set of critical values and the set of boundary values have measure zero. For the right-hand side, we use the definition of F, the fact that the boundary of T has Lebesgue measure zero, and that the integrand is zero if t is a critical point of f.
Remarks:
1: By standard extension arguments, the continuous function g can be replaced by the indicator function of a Borel set, say B. Formula (1) can be rewritten as

∫_{R^d} Σ_{t ∈ Z^{-1}(u)} h(t, u) du = ∫_{R^d} |det(Z'(t))| h(t, Z(t)) dt.  (3)
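As a quick numerical illustration of the area formula in dimension d = 1, the sketch below compares the two sides for the hypothetical choices f(t) = t² on T = [−1, 1] and g(u) = e^{−u} (neither is taken from the paper): here N_u^f(T) = 2 for 0 < u < 1 and 0 outside [0, 1].

```python
import numpy as np

# Trapezoid helper (avoids relying on a particular NumPy version's name).
def trapezoid(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Left-hand side of (1): integral of g(u) N_u^f(T) over u,
# with N_u^f(T) = 2 on (0, 1) for f(t) = t^2, T = [-1, 1].
u = np.linspace(0.0, 1.0, 200001)
lhs = trapezoid(2.0 * np.exp(-u), u)

# Right-hand side of (1): integral over T of |f'(t)| g(f(t)).
t = np.linspace(-1.0, 1.0, 200001)
rhs = trapezoid(np.abs(2.0 * t) * np.exp(-t**2), t)

print(abs(lhs - rhs) < 1e-6)
```

Both integrals reduce analytically to 2(1 − e^{−1}), so the agreement is exact up to quadrature error.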
Rice formulae

Theorem 1 (first moment):

E(N_u^Z(B)) = ∫_B E(|det Z'(t)| | Z(t) = u) p_{Z(t)}(u) dt.  (4)

Theorem 2 (factorial moments): for k ≥ 2,

E(N_u^Z(B)(N_u^Z(B) − 1)⋯(N_u^Z(B) − k + 1)) = ∫_{B^k} E(Π_{j=1}^k |det Z'(t_j)| | Z(t_1) = ... = Z(t_k) = u) p_{Z(t_1),...,Z(t_k)}(u, ..., u) dt_1...dt_k.  (5)

Theorem 3 (weighted version): if, in addition, a.s. t ↦ Y^t(ω) is continuous, then

E( Σ_{t∈I, Z(t)=u} g(t, Y^t) ) = ∫_I E(|det Z'(t)| g(t, Y^t) | Z(t) = u) p_{Z(t)}(u) dt,  (6)

under either of the hypotheses: a) a.s. t ↦ Z(t) is of class C^2, or b) ω(η) = sup_{t∈U, x∈V(u)} …
Denote by ω_det the modulus of continuity of |det(X'(·))|, and choose m large enough so that

P(F_{m,δ}) = P( ω_det(√d/m) ≥ δ ) ≤ γ.

Consider the partition of I into m^d small cubes with sides of length 1/m. Let C_{i_1...i_d} be such a cube and t_{i_1...i_d} its centre (1 ≤ i_1, ..., i_d ≤ m). Then P(G_I) is bounded by P(E_M) + P(F_{m,δ}) plus the sum over the cubes of P( G_{C_{i_1...i_d}} ∩ E_M^c ∩ F_{m,δ}^c ). When the event in the term corresponding to i_1...i_d of the last sum occurs, we have:

|Z_j(t_{i_1...i_d})| ≤ M√d/m, j = 1, ..., d, and ω_det(√d/m) < δ.  (7)

So, if m is chosen sufficiently large so that V(0) contains the ball centred at 0 with radius M√d/m, one has:

P(G_I) ≤ 2γ + m^d (2M√d/m)^d C(δ).

Since γ and δ are arbitrarily small, the result follows.
Remark:
With the hypotheses of Theorem 1, it follows easily that if J is a subset of U with λ_d(J) = 0, then P{N_u^Z(J) = 0} = 1 for each u ∈ R^d.
Proof of Theorem 1
Let F : R^+ → [0, 1] be the function defined in (2). For m, n positive integers and x ≥ 0, define:

F_m(x) := F(mx) ;  G_n(x) := 1 − F(x/n).  (8)
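For illustration only, the truncations in (8) can be realized with a concrete piecewise-linear choice of F (the argument only requires F continuous, non-decreasing, 0 on [0, 1/2] and 1 on [1, +∞)):

```python
# A hypothetical concrete choice of F; any continuous non-decreasing
# function with F(x) = 0 for x <= 1/2 and F(x) = 1 for x >= 1 would do.
def F(x):
    return min(1.0, max(0.0, 2.0 * (x - 0.5)))

def Fm(x, m):   # F_m(x) := F(m x) -> 1 as m -> infinity, for every x > 0
    return F(m * x)

def Gn(x, n):   # G_n(x) := 1 - F(x/n) -> 1 as n -> infinity, for fixed x
    return 1.0 - F(x / n)

# F_m keeps only roots with non-degenerate derivative (alpha(s) > 0);
# G_n vanishes for arguments >= n, so C G_n(C) stays bounded by n.
print(Fm(0.01, 1), Fm(0.01, 1000), Gn(5.0, 1), Gn(5.0, 100))
# -> 0.0 1.0 0.0 1.0
```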
A standard extension argument shows that it is enough to prove the theorem when B is a compact rectangle included in U, so we assume that this is the case. Let us introduce some more notation:

α(t) := |det(Z'(t))|  (t ∈ U).

For n, m positive integers and u ∈ R^d:

C_u^m(B) := Σ_{s∈B : Z(s)=u} F_m(α(s)),  (9)

Q_u^{n,m}(B) := C_u^m(B) G_n(C_u^m(B)).  (10)
In (9), when the summation index set is empty we put C_u^m(B) = 0. Let g : R^d → R be continuous with compact support. We apply the area formula (3) to the function

h(t, u) = F_m(α(t)) G_n(C_u^m(B)) g(u) 1I_{t∈B}

to get:

∫_{R^d} g(u) Q_u^{n,m}(B) du = ∫_B α(t) F_m(α(t)) G_n(C^m_{Z(t)}(B)) g(Z(t)) dt.

Taking expectations on both sides:

∫_{R^d} g(u) E(Q_u^{n,m}(B)) du = ∫_{R^d} g(u) [ ∫_B E( α(t) F_m(α(t)) G_n(C_u^m(B)) | Z(t) = u ) p_{Z(t)}(u) dt ] du.
Since this equality holds for any continuous g with bounded support, it follows that for almost every u ∈ R^d,

E(Q_u^{n,m}(B)) = ∫_B E( α(t) F_m(α(t)) G_n(C_u^m(B)) | Z(t) = u ) p_{Z(t)}(u) dt.  (11)

... the one corresponding to v = u. Outside the union of V_1, ..., V_k, ‖Z(t) − u‖ is bounded away from zero in B, so that the contribution to C_v^m(B) vanishes if v is sufficiently close to u.
This shows that, a.s., the function v ↦ Q_v^{n,m}(B) is continuous at v = u. On the other hand, it is obvious from its definition that Q_v^{n,m}(B) ≤ n, and an application of the Lebesgue dominated convergence theorem implies the continuity of E(Q_u^{n,m}(B)) as a function of u.
Let us now write the regression formulae for fixed t ∈ B:

Z(s) = a^t(s) Z(t) + Z^t(s),
Z'(s) = (a^t)'(s) Z(t) + (Z^t)'(s),  (12)

where ′ denotes the derivative with respect to s, and the pair (Z^t(s), (Z^t)'(s)) is independent of Z(t) for all s ∈ U.
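A minimal numerical sketch of the regression step, for a single pair (Z(t), Z(s)) of jointly Gaussian variables (the covariance values below are hypothetical): the coefficient is a^t(s) = Cov(Z(s), Z(t)) Var(Z(t))^{−1}, and the residual is uncorrelated with Z(t), hence independent of it by Gaussianity.

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])       # Var(Z(t)) = Var(Z(s)) = 1, Cov = 0.6
samples = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
Zt, Zs = samples[:, 0], samples[:, 1]

a = cov[1, 0] / cov[0, 0]          # regression coefficient a^t(s)
resid = Zs - a * Zt                # the residual Z^t(s) in the regression

# Empirical covariance between the residual and Z(t) is ~ 0.
print(abs(np.mean(resid * Zt)) < 0.01)
```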
Then we write the conditional expectation on the right-hand side of (11) as the unconditional expectation:
E( α_u^t(t) F_m(α_u^t(t)) G_n(C_u^m(B)) ),  (13)

where we use the notations

α_u^t(s) := |det((Z_u^t)'(s))|,  Z_u^t(s) := a^t(s) u + Z^t(s),
C_u^m(B) := Σ_{s∈B, Z_u^t(s)=u} F_m(α_u^t(s)).
Now, observe that (11) implies that for almost every u ∈ R^d one has the inequality

E(Q_u^{n,m}(B)) ≤ ∫_B E( α_u^t(t) F_m(α_u^t(t)) G_n(C_u^m(B)) ) p_{Z(t)}(u) dt,  (14)

which is in fact true for all u ∈ R^d, since both sides are continuous functions of u.
The remainder of the proof consists in proving the converse inequality. Let us fix n, m, u and t. Let K be the compact set

K := {s ∈ B : α_u^t(s) ≥ 1/(4m)}.

If v varies in a sufficiently small (random) neighbourhood of u, the points outside K do not contribute to the sum defining C_v^m(B).
Let k be the almost surely finite number of roots of Z_u^t(s) = u lying in the set K. Assume that k does not vanish and denote these roots by s_1, ..., s_k. Consider the equation

Z_v^t(s) − v = 0.  (15)
where the inequality arises from the fact that some of the points s_i(v) may not belong to B and hence do not contribute to the sum defining C_v^m(B). Now, since (11) holds for a.e. u, one can find a sequence {u_N, N = 1, 2, ...} converging to u such that (11) holds true for u = u_N, and

lim_{N→+∞} E( Q^{n,m}_{u_N}(B) ) ≥ lim_{N→+∞} ∫_B …

Since C_u^m(B) is a.s. finite, we can now pass to the limit as n → +∞, m → +∞, in that order, and, applying Beppo Levi's Theorem, conclude the proof.
Proof of Theorem 2:
For each δ > 0, define the domain

D_{k,δ}(B) = {(t_1, ..., t_k) ∈ B^k : ‖t_i − t_j‖ ≥ δ if i ≠ j, i, j = 1, ..., k}

and the process Z̃(t_1, ..., t_k) := (Z(t_1), ..., Z(t_k)), (t_1, ..., t_k) ∈ D_{k,δ}(B).

It is clear that Z̃ satisfies the hypotheses of Theorem 1 for every value (u, ..., u) ∈ (R^d)^k. So,

E( N^{Z̃}_{(u,...,u)}(D_{k,δ}(B)) ) = ∫_{D_{k,δ}(B)} E( Π_{j=1}^k |det Z'(t_j)| | Z(t_1) = ... = Z(t_k) = u ) p_{Z(t_1),...,Z(t_k)}(u, ..., u) dt_1...dt_k.  (16)

To finish, let δ → 0, note that N_u^Z(B)(N_u^Z(B) − 1)⋯(N_u^Z(B) − k + 1) is the monotone limit of N^{Z̃}_{(u,...,u)}(D_{k,δ}(B)), and that the diagonal D_k(B) = {(t_1, ..., t_k) ∈ B^k : t_i = t_j for some pair i, j, i ≠ j} has zero Lebesgue measure in (R^d)^k.
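The counting identity used in the limit δ → 0 is elementary: when there are N roots in all, the ordered k-tuples of pairwise distinct roots number N(N − 1)⋯(N − k + 1). A toy check (the root set below is hypothetical):

```python
from itertools import permutations

def ordered_tuples(roots, k):
    # number of ordered k-tuples of pairwise distinct elements
    return sum(1 for _ in permutations(roots, k))

roots = [0.1, 0.4, 0.7, 0.9]   # pretend these are the roots, N = 4
N = len(roots)
print(ordered_tuples(roots, 2), N * (N - 1))            # -> 12 12
print(ordered_tuples(roots, 3), N * (N - 1) * (N - 2))  # -> 24 24
```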
Proof of Theorem 3:
The proof is essentially the same. It suffices to consider, instead of C_u^m(B), the quantity

C_u^m(I) := Σ_{s∈I : Z(s)=u} F_m(α(s)) g(s, Y^s).  (17)
References
[1] Azaïs, J-M. and Wschebor, M. (2005). On the Distribution of the Maximum of a Gaussian Field with d Parameters. Annals of Applied Probability, 15 (1A), 254-278.
[2] Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. Numer. Math., 94, 419-478.
[3] Federer, H. (1969). Geometric Measure Theory. Springer-Verlag, New York.
and F_I(u) = P{M_I ≤ u}, u ∈ R, the probability distribution function of the random variable M_I. Our aim is to study the regularity of the function F_I when d > 1.
There exists a certain number of general results on this subject, starting from the papers by Ylvisaker (1968) and Tsirelson (1975) (see also Weber (1985), Lifshits (1995), Diebolt and Posse (1996) and references therein). The main purpose of this paper is to extend to d > 1 some of the results about the regularity of the function u ↦ F_I(u) in Azaïs & Wschebor (2001), which concern the case d = 1.
Our main tool here is Rice Formula for the moments of the number of roots N_u^Z(I) of the equation Z(t) = u on the set I, where {Z(t) : t ∈ I} is an R^d-valued Gaussian field, I is a subset of R^d and u a given point in R^d. For d > 1, even though it has been used in various contexts, as far as the authors know, a full proof of Rice Formula for the moments of N_u^Z(I) seems to have only been published by R. Adler (1981) for the first moment of the number of critical points of a real-valued stationary Gaussian process with a d-dimensional parameter, and extended by Azaïs and Delmas (2002) to the case of processes with constant variance. Cabaña (1985) contains related formulae for random fields; see also the PhD thesis of Konakov cited by Piterbarg (1996b). In the next section we give a more general result whose interest goes beyond the application of the present paper. At the same time, the proof appears to be simpler than previous ones. We have also included the proof of the formula for higher moments, which in fact follows easily from the first moment. Both extend with no difficulties to certain classes of non-Gaussian processes.
It should be pointed out that the validity of Rice Formula for Lebesgue-almost every u ∈ R^d is easy to prove (Brillinger, 1972), but this is insufficient for a certain number of standard applications. For example, assume X : I → R is a real-valued random process and one is willing to compute the moments of the number of critical points of X. Then we must take for Z the random field Z(t) = X'(t), and the formula one needs is for the precise value u = 0, so that a formula for almost every u does not solve the problem.
We have added Rice Formula for processes defined on smooth manifolds. Even
though Rice Formula is local, this is convenient for various applications. We will
need a formula of this sort to state and prove the implicit formulae for the derivatives
of the distribution of the maximum (see Section 3).
The results on the differentiation of F_I are partial extensions of Azaïs & Wschebor (2001). They concern only the first two derivatives and remain quite far away from what is known for d = 1. The main result in that paper states that if X is a real-valued Gaussian process defined on a certain compact interval I of the real line, has C^{2k} paths (k an integer, k ≥ 1) and satisfies a non-degeneracy condition, then the distribution of M_I is of class C^k.
For Gaussian fields defined on a d-dimensional regular manifold (d > 1) and possessing regular paths, we obtain some improvements with respect to the classical and general results due to Tsirelson (1975) for Gaussian sequences. An example is Corollary 6.1, which provides an asymptotic formula for F_I(u) as u → +∞ that is explicit in terms of the covariance of the process, and can be compared with Theorem 4 in Tsirelson (1975), where an implicit expression depending on the function F itself is given.
We use the following notation:
If Z is a smooth function U → R^d, U a subset of R^d, its successive derivatives are denoted Z', Z'', ..., Z^(k), and considered respectively as linear, bilinear, ..., k-linear forms on R^d. For example, X^(3)(t){v_1, v_2, v_3} is the value of the third derivative at the point t applied to the triplet (v_1, v_2, v_3). The same notation is used for a derivative on a C^∞ manifold.
I̊, ∂I and Ī are respectively the interior, the boundary and the closure of the set I. If ξ is a random vector with values in R^d, whenever they exist, we denote by p_ξ(x) the value of the density of ξ at the point x, by E(ξ) its expectation and by Var(ξ) its variance-covariance matrix. λ is Lebesgue measure.
If u, v are points in R^d, ⟨u, v⟩ denotes their usual scalar product and ‖u‖ the Euclidean norm of u.
For M a d × d real matrix, we denote

‖M‖ = sup_{‖x‖=1} ‖Mx‖.
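This operator norm is the largest singular value of M; the sketch below checks that against a direct search over the unit circle, for an arbitrary 2 × 2 example matrix.

```python
import numpy as np

M = np.array([[3.0, 0.0],
              [4.0, 5.0]])
op_norm = np.linalg.svd(M, compute_uv=False)[0]   # largest singular value

# Direct evaluation of sup_{||x|| = 1} ||M x|| on a fine grid of the circle.
theta = np.linspace(0.0, 2.0 * np.pi, 100_000)
X = np.stack([np.cos(theta), np.sin(theta)])      # columns are unit vectors
search = np.max(np.linalg.norm(M @ X, axis=0))

print(abs(op_norm - search) < 1e-6)   # here ||M|| = sqrt(45)
```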
Rice formulae

Theorem 2.1 gives, under the hypotheses A0–A3,

E(N_u^Z(I)) = ∫_I E(|det Z'(t)| | Z(t) = u) p_{Z(t)}(u) dt,  (1)

and Theorem 2.2 the corresponding formula for the k-th factorial moment, the integral being taken over I^k.
A3 are verified. This follows immediately from the above statements. A standard extension argument shows that (1) holds true if one replaces I by any Borel subset of I.
Sufficient conditions for hypotheses A3 to hold are given by the next proposition.
Proposition 2.1 Let Z : I → R^d, I a compact subset of R^d, be a random field with paths of class C^1, and u ∈ R^d. Assume that
p_{Z(t)}(x) ≤ C for all t ∈ I and x in some neighbourhood of u, and that
at least one of the two following hypotheses is satisfied:
a) a.s. t ↦ Z(t) is of class C^2;
b) ω(η) = sup_{t∈I, x∈V(u)} …
Denote by ω_det the modulus of continuity of |det(X'(·))|, and choose m large enough so that

P(F_{m,δ}) = P( ω_det(√d/m) ≥ δ ) ≤ γ.

Consider the partition of I into m^d small cubes with sides of length 1/m. Let C_{i_1...i_d} be such a cube and t_{i_1...i_d} its centre (1 ≤ i_1, ..., i_d ≤ m). Then P(G_I) is bounded by P(E_M) + P(F_{m,δ}) plus the sum over the cubes of

P( G_{C_{i_1...i_d}} ∩ E_M^c ∩ F_{m,δ}^c ).  (3)

When the event in the term corresponding to i_1...i_d of the last sum occurs, we have:

|Z_j(t_{i_1...i_d})| ≤ M√d/m, j = 1, ..., d, and ω_det(√d/m) < δ.

So, if m is chosen sufficiently large so that V(0) contains the ball centred at 0 with radius M√d/m, one has:

P(G_I) ≤ 2γ + m^d (2M√d/m)^d C(δ).
Σ_{i=1}^{h(m)} (s_i)^d → 0 as m → +∞. So,

P( N_u^Z(I) ≠ 0 ) ≤ P(E_M) + Σ_{i=1}^{h(m)} P( {N_u^Z(C_i) ≠ 0} ∩ E_M^c )
≤ P(E_M) + Σ_{i=1}^{h(m)} P( |Z_j(t_i) − u_j| ≤ M√d s_i, j = 1, ..., d ) ≤ P(E_M) + C Σ_{i=1}^{h(m)} (√d M s_i)^d → 0.
Thus

| Z'(t_1)v_i | = | Z'(t_1)v_i − Z'(ξ_i)v_i | ≤ Σ_{k=1}^d |v_k| ‖Z''(…)‖ … √d.

In conclusion,

min … ‖Z'(t_1)v‖ ≤ ‖Z''(…)‖ δ d,

which implies … > ….
Proof of Theorem 2.1: Consider a continuous non-decreasing function F such that

F(x) = 0 for x ≤ 1/2,  F(x) = 1 for x ≥ 1,

and the quantity

1 − F( (1/2) inf_{s∈I} ( min … ‖Z'(s)‖ … + ‖Z(s) − u‖ ) … √d ‖Z''(…)‖ ).  (4)

… in every ball with diameter 2δ centred at a point in I there is at most one root of the equation Z(t) = u, and a compactness argument shows that N_u^Z(I_δ) is bounded by a constant C(δ, I), depending only on δ and on the set I.
Take now any real-valued non-random continuous function f : R^d → R with compact support. Because of the coarea formula (Federer, 1969, Th. 3.2.3), since a.s. Z is Lipschitz and …(u) f(u) is integrable:

∫_{R^d} f(u) N_u^Z(I_δ) …(u) du = ∫_{R^d} …
variable |det(Z'(t))|; …(u) is a functional defined on {(Z(s), Z'(s)) : s ∈ I}. Perform a Gaussian regression of (Z(s), Z'(s)) : s ∈ I with respect to the random variable Z(t), that is, write

Z(s) = Y^t(s) + α^t(s) Z(t),
Z'_j(s) = Y_j^t(s) + α_j^t(s) Z(t),  j = 1, ..., d,

where Z'_j(s) (j = 1, ..., d) denote the columns of Z'(s), Y^t(s) and Y_j^t(s) are Gaussian vectors, independent of Z(t) for each s ∈ I, and the regression matrices α^t(s), α_j^t(s) (j = 1, ..., d) are continuous functions of s, t (take into account A2). Replacing in the conditional expectation, we are now able to get rid of the conditioning, and using the fact that the moments of the supremum of an a.s. bounded Gaussian process are finite, the continuity in u follows by dominated convergence.
So now we fix u ∈ R^d and make ε → 0, δ → 0, in that order, both in (i) and (ii). For (i) one can use Beppo Levi's Theorem. Note that almost surely

N_u^Z(I_δ) ↑ N_u^Z(I̊) = N_u^Z(I),

where the last equality follows from Lemma 2.1. On the other hand, the same Lemma 2.1 plus A3 imply together that, almost surely,

inf_{s∈I} ( min … ‖Z'(s)‖ … + ‖Z(s) − u‖ ) > 0,

so that the first factor in the right-hand member of (4) increases to 1 as δ decreases to zero. Hence, by Beppo Levi's Theorem:

lim_{δ→0} lim_{ε→0} E( N_u^Z(I_δ) …(u) ) = E( N_u^Z(I) ).

For (ii), one can proceed in a similar way after de-conditioning, obtaining (1). To finish the proof, remark that standard Gaussian calculations show the finiteness of the right-hand member of (1).
Proof of Theorem 2.2: For each δ > 0, define the domain

D_{k,δ}(I) = {(t_1, ..., t_k) ∈ I^k : ‖t_i − t_j‖ ≥ δ if i ≠ j, i, j = 1, ..., k}

and the process Z̃(t_1, ..., t_k) := (Z(t_1), ..., Z(t_k)), (t_1, ..., t_k) ∈ D_{k,δ}(I).

It is clear that Z̃ satisfies the hypotheses of Theorem 2.1 for every value (u, ..., u) ∈ (R^d)^k. So,

E( N^{Z̃}_{(u,...,u)}(D_{k,δ}(I)) ) = ∫_{D_{k,δ}(I)} E( Π_{j=1}^k |det Z'(t_j)| | Z(t_1) = ... = Z(t_k) = u ) p_{Z(t_1),...,Z(t_k)}(u, ..., u) dt_1 ... dt_k.  (5)

To finish, let δ → 0, note that N_u^Z(I)(N_u^Z(I) − 1)⋯(N_u^Z(I) − k + 1) is the monotone limit of N^{Z̃}_{(u,...,u)}(D_{k,δ}(I)), and that the diagonal D_k(I) = {(t_1, ..., t_k) ∈ I^k : t_i = t_j for some pair i, j, i ≠ j} has zero Lebesgue measure in (R^d)^k.
Remark: Even though we will not use this in the present paper, we point out that it is easy to adapt the proofs of Theorems 2.1 and 2.2 to certain classes of non-Gaussian processes.
For example, the statement of Theorem 2.1 remains valid if one replaces hypotheses A0 and A2 respectively by the following B0 and B2:
B0: Z(t) = H(Y(t)) for t ∈ I, where Y : I → R^n is a Gaussian process with C^1 paths such that for each t ∈ I, Y(t) has a non-degenerate distribution, and H : R^n → R^d is a C^1 function.
B2: for each t ∈ I, Z(t) has a density p_{Z(t)} which is continuous as a function of (t, u).
Note that B0 and B2 together imply that n ≥ d. The only change to be introduced in the proof of the theorem is in the continuity of (ii), where the regression is performed on Y(t) instead of Z(t).
Similarly, the statement of Theorem 2.2 remains valid if we replace A0 by B0 and add the requirement that the joint density of Z(t_1), ..., Z(t_k) be a continuous function of t_1, ..., t_k, u for pairwise different t_1, ..., t_k.
Now consider a process X from I to R and define

M^X_{u,1}(I) = #{t ∈ I : X(·) has a local maximum at the point t, X(t) > u},
M^X_{u,2}(I) = #{t ∈ I : X'(t) = 0, X(t) > u}.

The problem of writing Rice Formulae for the factorial moments of these random variables can be considered as a particular case of the previous one, and the proofs are the same, mutatis mutandis. For further use, we state as a theorem Rice Formula for the expectation. For short, we do not state the equivalent of Theorem 2.2, which holds true similarly.
Theorem 2.3 Let X : I → R, I a compact subset of R^d, be a random field. Let u ∈ R and define M^X_{u,i}(I), i = 1, 2, as above. For each d × d real symmetric matrix M, we put δ_1(M) := |det(M)| 1I_{M≺0}, δ_2(M) := |det(M)|.
Assume:
A0: X is Gaussian,
A1: a.s. t ↦ X(t) is of class C^2,
A2: for each t ∈ I, (X(t), X'(t)) has a non-degenerate distribution in R^1 × R^d,
A3: either a.s. t ↦ X(t) is of class C^3, or
ω(η) = sup_{t∈I, x'∈V(0)} …
Then, for k = 1, 2:

E( M^X_{u,k}(I) ) = ∫_I dt ∫_u^{+∞} E( δ_k(X''(t)) | X(t) = x, X'(t) = 0 ) p_{X(t),X'(t)}(x, 0) dx.
2.1 Abstract manifold
Proposition 2.2 For k = 1, 2, the quantity which is expressed in every chart with coordinates s_1, ..., s_d as

∫ Π_{i=1}^d ds_i ∫_u^{+∞} E( δ_k(Y''(s)) | Y(s) = x, Y'(s) = 0 ) p_{Y(s),Y'(s)}(x, 0) dx  (6)

does not depend on the chart, and equals E( M^X_{u,k}(S) ).
Denote by s1_i and s2_i, i = 1, ..., d, the coordinates in each chart. We have

∂Y_1/∂s1_i = Σ_{i'} (∂Y_2/∂s2_{i'}) (∂H_{i'}/∂s1_i),

∂²Y_1/∂s1_i ∂s1_j = Σ_{i',j'} (∂²Y_2/∂s2_{i'} ∂s2_{j'}) (∂H_{i'}/∂s1_i)(∂H_{j'}/∂s1_j) + Σ_{i'} (∂Y_2/∂s2_{i'}) (∂²H_{i'}/∂s1_i ∂s1_j).

Hence

p_{Y_1(s1), Y_1'(s1)}(x, 0) = p_{Y_2(s2), Y_2'(s2)}(x, 0) |det(H'(s1))|^{−1},

and at a singular point

Y_1''(s1) = H'(s1)^T Y_2''(s2) H'(s1).

… d(s) … (S) …  (7)

Since M^X_{u,k}(S) is equal to M^Y_{u,k}(φ(S)), we see that the result is a direct consequence of Theorem 2.3.
2.1.2 Riemannian manifold
The form in (6) is intrinsic (in the sense that it does not depend on the parametrization), but the terms inside the integrand are not. It is possible to give a completely intrinsic expression in the case when U is equipped with a Riemannian metric. When such a Riemannian metric is not given, it is always possible to use the metric g induced by the process itself (see Taylor and Adler, 2002) by setting

g_s(Y, Z) = E( Y(X) Z(X) )

for Y, Z belonging to the tangent space T(s) at s ∈ U. Y(X) (resp. Z(X)) denotes the action of the tangent vector Y (resp. Z) on the function X. This metric leads to very simple expressions for centred variance-1 Gaussian processes.
The main point is that at a singular point of X the second order derivative D²X is intrinsic, since it defines locally the Taylor expansion. Given the Riemannian metric g_s, the second differential can be represented by an endomorphism that will be denoted ∇²X(s):
D²X(s){Y, Z} = Y(Z(X)) = Z(Y(X)) = g_s(∇²X(s) Y, Z).  (8)

In fact, at a singular point, the definition given by formula (8) coincides with the definition of the Hessian read in an orthonormal basis. This endomorphism is intrinsic, and so, of course, is its determinant. So, in a chart,

det(∇²X(s)) = det(D²X(s)) det(g_s)^{−1},  (9)

∇²X(s) = g_s^{−1/2} D²X(s) g_s^{−1/2},  (10)

where we have omitted the tilde above X(s) for simplicity. This is the Riemannian intrinsic expression.
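The determinant identity (9) is pure linear algebra: representing the second differential by an endomorphism through the metric amounts to multiplying by g_s^{−1} (or conjugating by g_s^{−1/2}, which has the same determinant). A sketch with arbitrary sample matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
g = A @ A.T + 3.0 * np.eye(3)     # a symmetric positive definite metric
D2 = rng.standard_normal((3, 3))
D2 = 0.5 * (D2 + D2.T)            # a symmetric second differential

endo = np.linalg.solve(g, D2)     # the endomorphism representing D^2 X
lhs = np.linalg.det(endo)
rhs = np.linalg.det(D2) / np.linalg.det(g)
print(abs(lhs - rhs) < 1e-12)     # the determinant identity (9)
```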
2.1.3 Embedded manifold
… the natural derivative on R^m. The manifold is equipped with the metric induced by the Euclidean metric in R^m. Considering the form (10), clearly the Riemannian volume is just the geometric measure on U.
Following Milnor (1965), we assume that the process X_t is defined on an open neighbourhood of U, so that the ordinary derivatives X'(s) and X''(s) are well defined for s ∈ U. Denoting the projectors onto the tangent and normal spaces by P_{T(s)} and P_{N(s)}, we have

∇X(s) = P_{T(s)}(X'(s)).  (11)

We now define the second fundamental form II of U embedded in R^m, which can be defined in our simple case as the bilinear application (see Kobayashi and Nomizu, 199?, T. 2, chap. 7 for details)

Y, Z ∈ T(s) ↦ P_{N(s)}(…).  (12)
… ∫_u^{+∞} dx … ∫_0^{…} dt …

∫∫ dt_1 dt_2 E( |det(Z_1'(t_1))| |det(Z_2'(t_2))| | Z_1(t_1) = u_1, Z_2(t_2) = u_2 ) p_{Z_1(t_1), Z_2(t_2)}(u_1, u_2),  (14)

… g( Y^t(·) + α^t(·)u ) is continuous at u = u_0. Then the formula

E( N_u^Z(I) ) = …

holds true.
We will be particularly interested in the function ξ = 1I_{M_I < v} for some v ∈ R. We will see later that it satisfies the above conditions under certain hypotheses on the process Z.
Our main goals in this and the next section are to prove existence and regularity of the derivatives of the function u ↦ F_I(u) and, at the same time, that they satisfy some implicit formulae that can be used to provide bounds on them. In what follows we assume that I is a d-dimensional C^∞ manifold embedded in R^N, N ≥ d, and … where the function δ_1 has been defined in the statement of Theorem 2.3 and X̄ denotes the restriction of X to the boundary ∂I.
Even for d = 1 (one-parameter processes) and X Gaussian and stationary, inequality (15) provides reasonably good upper bounds for F_I(u) (see Diebolt and Posse (1996), Azaïs and Wschebor (2001)). We will see an example for d = 2 at the end of this section.
In the next section, we are able to prove that F_I(u) is a C^1 function and that formula (17) can be essentially simplified by getting rid of the conditional expectation, thus obtaining the second form for the derivative. This is done under weaker regularity conditions, but the assumption that X is Gaussian becomes essential.
In case the dimension d of the parameter is equal to 1, this is the starting point to continue the differentiation procedure, and under hypotheses H_{2k} one is able to prove that F_I is a C^k function and to obtain implicit formulae for F_I^{(k)} (see Azaïs & Wschebor, 2001).
When d > 1, a certain number of difficulties arise and it is not clear that the process can continue beyond k = 2. With the purpose of establishing such formulae for F_I, we introduce in Section 4 the helix-processes, which appear in a natural way in these formulae and have paths possessing singularities of a certain form that will be described precisely in that section.
Definition 3.1 Let X : I → R be a real-valued stochastic process defined on a subset I of R^d. We will say that X satisfies condition (H_k), k a positive integer, if the following three conditions hold true:
X is Gaussian;
a.s. the paths of X are of class C^k;
for any choice of pairwise different values of the parameter t_1, ..., t_n, the joint distribution of the random variables

X(t_1), ..., X(t_n), X'(t_1), ..., X'(t_n), ....., X^(k)(t_1), ..., X^(k)(t_n)  (16)

has maximum rank. Note that the number of distinct real-valued Gaussian variables belonging to this set (16), on account of the exchangeability of the order of differentiation, is equal to

n [ 1 + C(d, d−1) + C(d+1, d−1) + .... + C(k+d−1, d−1) ],

where C(a, b) denotes the binomial coefficient.
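The count above can be evaluated directly: order-j derivatives of a function of d variables contribute C(j+d−1, d−1) distinct partials. A small helper (illustrative, not from the paper):

```python
from math import comb

def count_variables(n, d, k):
    # n points, parameter dimension d, derivatives up to order k
    return n * sum(comb(j + d - 1, d - 1) for j in range(k + 1))

# d = 1 recovers n(k + 1): X, X', ..., X^(k) at each of the n points.
print(count_variables(5, 1, 2))   # -> 15
# d = 2, k = 1: the value and the two first partials at each point.
print(count_variables(4, 2, 1))   # -> 12
```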
The next proposition shows that there exist processes that satisfy (H_k).
Proposition 3.1 Let X = {X(t) : t ∈ R^d} be a centred stationary Gaussian process having continuous spectral density f^X. Assume that f^X(x) > 0 for every x ∈ R^d, and that for any β > 0, f^X(x) ≤ C ‖x‖^{−β} holds true for some constant C and all x ∈ R^d.
Then X satisfies (H_k) for every k = 1, 2, ...
Z = Σ_{h=1}^{n} Σ_k α_{k_1,...,k_d,h} ∂^{k_1+...+k_d} X / (∂t_1^{k_1} ⋯ ∂t_d^{k_d}) (t_h),

where Σ_k denotes summation over all the d-tuples of non-negative integers k_1, k_2, ..., k_d such that k_1 + k_2 + ... + k_d ≤ k, and the α_{k_1,...,k_d,h} are complex numbers; then E(|Z|²) = 0 implies α_{k_1,...,k_d,h} = 0 for any choice of the indices k_1, k_2, ..., k_d, h in the sum. Using the spectral representation, and denoting x = (x_1, ..., x_d),

E(|Z|²) = ∫_{R^d} Σ_{h,h'} Σ α_{k_1,...,k_d,h} ᾱ_{k'_1,...,k'_d,h'} x_1^{k_1+k'_1} ⋯ x_d^{k_d+k'_d} exp[ i⟨x, t_h − t_{h'}⟩ ] f^X(x) dx,

where the inner sum is over all 2d-tuples of non-negative integers k_1, ..., k_d, k'_1, ..., k'_d such that k_1 + ... + k_d ≤ k and k'_1 + ... + k'_d ≤ k. Hence,

E(|Z|²) = ∫_{R^d} | Σ_{h=1}^{n} Σ_k α_{k_1,...,k_d,h} x_1^{k_1} ⋯ x_d^{k_d} exp[ i⟨x, t_h⟩ ] |² f^X(x) dx.

The result follows from the fact that the set of functions x_1^{k_1} ⋯ x_d^{k_d} exp[ i⟨x, t_h⟩ ], where k_1, k_2, ..., k_d, h vary as above, is linearly independent.
Proof: For u < v and S (respectively S') a subset of I (resp. ∂I), let us denote

M_{u,v}(S) = …,  M̄_{u,v}(S') = …

… 1I_{M_{u−h,u}(I) ≥ 1} … M̄_{u−h,u}(∂I) …  (18)

… ∫_I σ(dt) ∫_{∂I} σ̄(dt') ∫∫_{u−h} dx dx' … ∫_{I\∂I} σ(dt) ∫_{u−h} …

‖g‖_∞ = sup_{t∈I} |g(t)|,  ‖g‖_{∞,k} = sup_{k_1+k_2+...+k_d ≤ k} ‖ ∂^{k_1,...,k_d} g ‖_∞.
For fixed β > 0 (to be chosen later on) and h > 0, we denote by E_h the event

E_h = { ‖X‖_{∞,4} ≤ h^{−β} }.

Because of the Landau-Shepp-Fernique inequality (see Landau and Shepp, 1970, or Fernique, 1975) there exist positive constants C_1, C_2 such that

P(E_h^c) ≤ C_1 exp(−C_2 h^{−2β}) = o(h) as h → 0,
so that to have (17) it suffices to show that, as h → 0:

E( 1I_{M_{u−h,u}(I) ≥ 1} 1I_{M_I ≤ u} 1I_{E_h} ) = o(h),  (20)
E( M_{u−h,u}(I)(M_{u−h,u}(I) − 1) 1I_{E_h} ) = o(h).  (21)

… σ(ds) σ(dt) ∫∫_{u−h} dx_1 dx_2 …
As,t =
uh
(s)0,X (t)0
for s, t D .
So it is enough to prove that As,t = o(h) for t s small, and we may assume
that s and t are in the same chart (U, ). Writing the process in this chart we may
assume that I is a ball or a half ball in Rd . Let s, t two such points, define the
process Y = Y s,t by
Y ( ) = X s + (t s)
; [0, 1].
Condition on Y(0) = x1, Y(1) = x2, Y′(0) = Y′(1) = 0, and let Q be the (unique cubic) polynomial satisfying
Q(0) = x1,  Q(1) = x2,  Q′(0) = Q′(1) = 0.
Check that
Q(y) = x1 + (x2 − x1) y²(3 − 2y),  Q″(0) = −Q″(1) = 6(x2 − x1).
Denote
Z(τ) = Y(τ) − Q(τ),  0 ≤ τ ≤ 1.
Under the conditioning, one has:
Z(0) = Z(1) = Z′(0) = Z′(1) = 0
and if also the event E_h occurs, an elementary calculation shows that for 0 ≤ τ ≤ 1:
|Z″(τ)| ≤ sup_{τ∈[0,1]} |Z^{(4)}(τ)| / 2! = sup_{τ∈[0,1]} |Y^{(4)}(τ)| / 2! ≤ (const) ‖t − s‖⁴ h^{−δ}.   (24)
det(B)
(25)
v1 =
Note that in that case, the elements of matrix B are of the form X(s)vj , vk
hence bounded by (const)h . So,
ts
2 (d1)
ts
2 (d1)
Cd2
Cd2
2(d1)
2(d1)
(s)0,X (t)0
1IEh /C
E [Y (0)] [Y (1)]
ts
Y (0) + Y (1)
2
ts
Z (0) + Z (1)
2
1IEh /C
1IEh /C
1IEh /C
We now turn to the density in (22), using the following lemma, which is similar to Lemma 4.3, p. 76, in Piterbarg (1996).
Lemma 3.1 For all s, t ∈ I:
p_{X(s),X(t),X′(s),X′(t)}(x1, x2, 0, 0) ≤ D ‖t − s‖^{−(d+3)},   (26)
where D is a constant.
Proof. Assume that (26) does not hold, i.e., that there exist two convergent sequences {s_n}, {t_n} in I, s_n → s*, t_n → t*, such that
‖t_n − s_n‖^{d+3} p_{X(s_n),X(t_n),X′(s_n),X′(t_n)}(x1, x2, 0, 0) → +∞.   (27)
If s* ≠ t*, (27) can not hold, since the non-degeneracy condition assures that this sequence has the finite limit ‖t* − s*‖^{d+3} p_{X(s*),X(t*),X′(s*),X′(t*)}(0, 0, 0, 0). So, s* = t*.
Since one can assume with no loss of generality that I is a ball or a half ball, the segment [s_n, t_n] is contained in I. Denote the unit vector e_{1,n} = (t_n − s_n)/‖t_n − s_n‖, complete it to an orthonormal basis {e_{1,n}, e_{2,n}, ..., e_{d,n}} of R^d, and take a subsequence of the integers {n_k} so that e_{j,n_k} → e_j as k → +∞ for j = 1, ..., d. In what follows, without loss of generality, we assume that {n_k} is the sequence of all positive integers. For each τ ∈ R^d we denote by τ_{1,n}, ..., τ_{d,n} the coordinates of τ in the basis {e_{1,n}, ..., e_{d,n}}. Note that t_n − s_n has coordinates (t_{1,n} − s_{1,n}, 0, ..., 0) = (‖t_n − s_n‖, 0, ..., 0). Also, we denote by τ_1, ..., τ_d the coordinates of τ in the basis {e_1, ..., e_d}.
The following computation is similar to the proof of Lemma 3.2 in Azaïs & Wschebor (2001). We have:
Δ_n = det Var(X(s_n), X(t_n), X′(s_n), X′(t_n))
 = det Var(X(s_n), X(t_n), ∂X/∂τ_{1,n}(s_n), ∂X/∂τ_{1,n}(t_n), ..., ∂X/∂τ_{d,n}(s_n), ∂X/∂τ_{d,n}(t_n))
 = det Var(X(s_n), ∂X/∂τ_{1,n}(s_n), Y_{1,n}, Z_{1,n}, ∂X/∂τ_{2,n}(s_n), Z_{2,n}, ..., ∂X/∂τ_{d,n}(s_n), Z_{d,n}),
where
Y_{1,n} = X(t_n) − X(s_n) − ∂X/∂τ_{1,n}(s_n) (t_{1,n} − s_{1,n}),
Z_{1,n} = ∂X/∂τ_{1,n}(t_n) − ∂X/∂τ_{1,n}(s_n) − 2 Y_{1,n}/(t_{1,n} − s_{1,n}),
Z_{2,n} = ∂X/∂τ_{2,n}(t_n) − ∂X/∂τ_{2,n}(s_n), ....., Z_{d,n} = ∂X/∂τ_{d,n}(t_n) − ∂X/∂τ_{d,n}(s_n).
Using now Taylor expansions and taking into account the integrability of the supremum of bounded Gaussian processes, we have:
Y_{1,n} = ((t_{1,n} − s_{1,n})²/2) ∂²X/∂τ_{1,n}²(s_n) + η_{1,n}(t_{1,n} − s_{1,n})³,
Z_{1,n} = ((t_{1,n} − s_{1,n})²/6) ∂³X/∂τ_{1,n}³(s_n) + η_n(t_{1,n} − s_{1,n})³,
Z_{2,n} = (t_{1,n} − s_{1,n}) ∂²X/∂τ_{2,n}∂τ_{1,n}(s_n) + η_{2,n}(t_{1,n} − s_{1,n})², ......,
Z_{d,n} = (t_{1,n} − s_{1,n}) ∂²X/∂τ_{d,n}∂τ_{1,n}(s_n) + η_{d,n}(t_{1,n} − s_{1,n})²,
where the random variables η_{1,n}, η_{2,n}, ..., η_{d,n}, η_n are uniformly bounded in L² of the underlying probability space.
Substituting into Δ_n it follows that:
Δ_n ≃ (1/144) (t_{1,n} − s_{1,n})^{8+2(d−1)} det Var(X(s_n), ∂X/∂τ_{1,n}(s_n), ∂²X/∂τ_{1,n}²(s_n), ∂³X/∂τ_{1,n}³(s_n), ∂X/∂τ_{2,n}(s_n), ∂²X/∂τ_{2,n}∂τ_{1,n}(s_n), ..., ∂X/∂τ_{d,n}(s_n), ∂²X/∂τ_{d,n}∂τ_{1,n}(s_n)).
ts
d+1
ds dt
II
dx1 dx2
uh
≤ (const) h^{2−2dδ},
since the function (s, t) → ‖t − s‖^{d+1} is Lebesgue-integrable in I × I. The last constant depends only on the dimension d and the set I. Taking δ small enough, (20) follows.
An example: Let {X(s, t)} be a real-valued two-parameter Gaussian, centred, stationary, isotropic process with covariance Γ. Assume that its spectral measure μ is absolutely continuous with density
μ(ds, dt) = f(ρ) ds dt,  ρ = (s² + t²)^{1/2},
so that ∫₀^{+∞} f(ρ) dρ = 1.
Using (15) which is a consequence of Theorem 3.1 and the invariance of the law of
the process, we have
FI (u) E 1 (X (0, 0))/X(0, 0) = u, X (0, 0) = (0, 0) pX(0,0),X (0,0) (u, (0, 0))
(1, 0))/X(1, 0) = u, X
(1, 0) = 0 p
+ 2E 1 (X
(1,0) (u, 0) = I1 + I2 . (28)
X(1,0),X
We denote by X, X′, X″ the values of the different processes at some point (s, t); by X_ss, X_st, X_tt the entries of the matrix X″; and by φ and Φ the standard normal density and distribution functions.
One can easily check that:
X′ is independent of X and X″, and has variance J₃ Id;
X_st is independent of X, X′, X_ss and X_tt, and has variance (1/4) J₅;
conditionally on X = u, the random variables X_ss and X_tt have
expectation: −J₃ u,
variance: (3/4) J₅ − (J₃)²,
covariance: (1/4) J₅ − (J₃)².
Using an elementary computation we get that the expectation of the negative part of a Gaussian variable with expectation μ and variance σ² is equal to
σ φ(μ/σ) − μ Φ(−μ/σ).
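This identity follows from the standard formula for the positive part of a Gaussian variable; a short derivation (using φ, Φ for the standard normal density and distribution function):

```latex
% For W ~ N(m, \sigma^2), a direct integration gives
%   E(W^+) = m\,\Phi(m/\sigma) + \sigma\,\varphi(m/\sigma).
% Applying this to W = -\xi, where \xi ~ N(\mu, \sigma^2), so that m = -\mu:
E(\xi^-) = E\big((-\xi)^+\big)
         = -\mu\,\Phi(-\mu/\sigma) + \sigma\,\varphi(-\mu/\sigma)
         = \sigma\,\varphi(\mu/\sigma) - \mu\,\Phi(-\mu/\sigma),
% using in the last step that \varphi is an even function.
```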
We obtain
I2 =
2
(u)
J3
3
J5 (J3 )2
4
with
1
2
(bu) + J3 u(bu) ,
J3
b=
3
J
4 5
(J3 )2
1
2
As for I₁, we remark that, conditionally on X = u, X_ss + X_tt and X_ss − X_tt are independent, so that a direct computation gives:
I1 =
1
(u)E 1 2J3 u
8J3
J5 2
(2 + 32 )
4
1I{ < 2J u} 1I
1
3
{ 1 2J3 u
, (29)
J5 2
(2 + 32 ) > 0}
4
2
I1 =
(u)
(2 +a2 c2 x2 )(acx)+[2a2 (acx)](acx) x(x)dx,
8J3
0
with a = 2J3 u, c =
J5
.
4
We choose, once and for all in this section, a finite atlas A for I. Then, to every t ∈ I it is possible to associate a fixed chart that will be denoted (U_t, φ_t). When t ∈ ∂I, φ_t(U_t) can be chosen to be a half ball with φ_t(t) belonging to the hyperplane limiting this half ball. For t ∈ I, let V_t be an open neighbourhood of t whose closure is included in U_t, and ψ_t a C^∞ function such that
ψ_t ≡ 1 on V_t,   (30)
ψ_t ≡ 0 on U_t^c.   (31)
n(t, s) = ½ ‖s − t‖².   (32)
When t ∈ ∂I, one sets instead
n(t, s) = ‖(s − t)_N‖ + ½ ‖s − t‖²,
where (s − t)_N is the normal component of (s − t) with respect to the hyperplane delimiting the half ball φ_t(U_t). The rest of the definition is the same.
f(s) = a_{ts} f(t) + ⟨b_{ts}, f′(t)⟩ + n(t, s) f^t(s).
Then s → X^t(s) and s → β_t(s) are helix functions with pole t satisfying H_{t,k}.
Taylor's formula gives
X(s) = X(t) + ⟨X′(t), s − t⟩ + ‖s − t‖² ∫₀¹ X″((1 − α)t + αs)(v, v)(1 − α) dα,
with v = (s − t)/‖s − t‖, so that
X^t(s) = 2 ∫₀¹ X″((1 − α)t + αs)(v, v)(1 − α) dα.   (33)
p_M(u) ≤ exp(−u²/2) / ∫_u^{+∞} exp(−v²/2) dv   for every u ∈ R.   (34)
Let now Z satisfy the hypotheses of the theorem. For given a, b ∈ R, a < b, choose A ∈ R₊ so that |a| < A and consider the process:
X(t) =
Z(t) a |m | + A
.
+
(t)
0
m(t) a |m | + A
|m | + |a| |m | + A
+
+
0,
(t)
0
0
0
and
Var X(t) = 1.
So that (34) holds for the process X.
On the other hand:
|m | + A
|m | + A b a
{a < M Z b} {
< MX
+
}.
0
0
0
And it follows that
P a < MZ b
|m |+A ba
+
0
0
|m |+A
0
(u)du =
a
v a + |m | + A
1
dv.
0
0
FI (u) = 1
+ 1
d1
I
(36)
(37)
1/2
Now, observe that our improved version of Ylvisaker's theorem (Theorem 4.1) applies to the process s → X^t(s) − β_t(s)u defined on I \ {t}. This implies that the first term in (37) tends to zero as h → 0. An analogous argument applies to the second term. Finally, the continuity of F′_I(u) follows from the fact that one can pass to the limit under the integral sign in (35).
To finish the proof we still have to show that the added hypotheses are in fact
unnecessary for the validity of the conclusion. Suppose now that the process X
satisfies only the hypotheses of the theorem and define
X_ε(t) = Z_ε(t) + ε Y(t),   (38)
where for each ε > 0, Z_ε is a real-valued Gaussian process defined on I, measurable with respect to the σ-algebra generated by {X(t) : t ∈ I}, possessing C^∞ paths and such that almost surely Z_ε(t), Z′_ε(t), Z″_ε(t) converge uniformly on I to X(t), X′(t), X″(t) respectively as ε → 0. One standard way to construct such an approximating process Z_ε is to use a C^∞ partition of unity on I and to approximate locally the composition of a chart with the function X by means of convolution with a C^∞ kernel.
In (38), Y denotes the restriction to I of a Gaussian centred stationary process satisfying the hypotheses of Proposition 3.1, defined on R^N, and independent of X. Clearly X_ε satisfies condition (H_k) for every k, since it has C^∞ paths, and the independence of both terms in (38) ensures that X_ε inherits from Y the non-degeneracy condition in Definition 3.1. So, if
M_I^ε = max_{t∈I} X_ε(t) and F_I^ε(u) = P{M_I^ε ≤ u},
one has
FI (u) = 1
+ 1
d1
E det X
(t)
E det X
(t)
(t)u 1IAu (X
(t)u 1IAu (X
t , ,t )
t , t )
pX
pX
(dt),
(t) (u, 0)
(t),X
(39)
We want to pass to the limit as ε → 0 in (39). We prove that the right-hand member is bounded if ε is small enough and converges to a continuous function of u as ε → 0. Since M_I^ε → M_I, this implies that the limit is continuous and coincides with F′_I(u) by a standard argument on convergence of densities. We consider only the first term in (39); the second is similar.
The convergence of X_ε and of its first and second derivatives, together with the non-degeneracy hypothesis, implies that, uniformly on t ∈ I as ε → 0,
p_{X_ε(t),X′_ε(t)}(u, 0) → p_{X(t),X′(t)}(u, 0),
on account of the form of the regression coefficients and the definitions of X^t and β_t. The only difficulty is to prove that, for fixed u:
P{C_ε Δ C} → 0 as ε → 0,   (40)
where
C_ε = A_u(X_ε^t, β_ε^t),  C = A_u(X^t, β_t).
We prove that
a.s. 1_{C_ε} → 1_C as ε → 0.   (41)
sup
X t (s) t (s)u = 0
sI\{t}
sI\{t}
> 0 is small
sI\{t}
P{Au (X t , t )}.
sup
X t (s) t (s)u
sI\{t}
sI\{t}
Second derivative
FI (u) = 1
(1,0)
t (t)
E
I
i,j
i,j=1
ds t (s)
dt
I
I
t
E det X (s) (s)u det X (t) (t)u 1IAu /X t (s) = t (s)u, X t (s) = t (s)u
pX t (s),X t (s) t (s)u, t (s)u pX(t),X (t) u, 0 +
S d1
(42)
(1,0)
d
I
(43)
The derivative of the integrand in (43) is the sum of the three derivatives corresponding to the three locations where the variable u appears, namely:
in the density p_{X(t),X′(t)}(u, 0), which is clearly differentiable with bounded derivative p^{(1,0)}_{X(t),X′(t)}(u, 0). This gives the first term in (42).
In the derivative with respect to the first occurrence of u in
E det X t (t) t (t)u 1IAu (X t , t ) .
The derivative of which is
d
t (t)
i,j
i,j=1
where C_{i,j}(u) is the cofactor of location (i, j) in the matrix X″^t(t) − β_t(t)u. This quantity is uniformly bounded when u varies in a compact interval, which follows easily from an expression of the type (33). This gives the second term in (42).
in the derivative with respect to the second occurrence of u in
E det X t (t) t (t)u 1IAu (X t , t ) .
To evaluate this derivative, define v as in (36) and set, for ρ > 0 sufficiently small:
I_ρ := I \ B(t, ρ);  A_u^ρ = A_u^ρ(X^t, β_t) := {X^t(s) ≤ β_t(s)u, s ∈ I_ρ},
B(t, ρ) being the ball with centre t and radius ρ in the chart (φ_t, U_t). By dominated convergence
E v 1IAu+h (X t , t ) 1IAu (X t , t )
dx
I
u+h
dx
u
S(t,)
t (s) = t (s)x
where S(t, ) is the sphere with centre t and radius , Y t (s) = X t (s) t (s)x ,
t (s) t (s)x.
Y t (s) = X
Let us prove that the first integral converges as ρ → 0. The only problem is the behaviour around t, so it is sufficient to prove the convergence locally around t in the chart (φ_t, U_t), with s in V_t, which implies that n(s, t) = ½‖t − s‖². Without loss of generality we may assume that the representation of t in this chart is the point 0 in R^d. To study the behaviour of the integrand as s → 0, we choose an orthonormal basis with s/‖s‖ as first vector and set s = (ρ, 0, ..., 0)^T. At s = 0 the process X^t and its derivative have the following expansions (for short, derivatives are indicated by sub-indices).
X̄(s) = ½ ρ² X̄_{11} + (1/6) ρ³ X̄_{111} + (1/24) ρ⁴ X̄_{1111} + o(ρ⁴),   (45)
where X̄_{111} = ∂³X̄/∂s₁³(0) and X̄_{1111} = ∂⁴X̄/∂s₁⁴(0). Since
X^t(s) = m(s) X̄(s),  with m(s) := 1/n(s, 0) = 2/(s₁² + ··· + s_d²),   (46)
we have
∂m/∂s₁(ρ, 0, ..., 0) = −4/ρ³ ;  ∂m/∂s_i(ρ, 0, ..., 0) = 0 (i ≠ 1),   (47)
∂²m/∂s₁²(ρ, 0, ..., 0) = 12/ρ⁴ ;  ∂²m/∂s_i²(ρ, 0, ..., 0) = −4/ρ⁴ (i ≠ 1) ;  ∂²m/∂s_i∂s_j(ρ, 0, ..., 0) = 0 (i ≠ j).   (48)
Using derivation rules, we get
(49)
(50)
(51)
(52)
(53)
i=1
where the notation X̄^t_{ii}(s) has an obvious meaning. The condition
C(s) = {X^t(s) = β_t(s)x ; (X^t)′(s) = β_t′(s)x}
converges as s → 0 to the condition
{X̄_{11} = β̄_{11} x ; X̄_{111} = β̄_{111} x ; X^t_i(0) = β^t_i(0)x (i = 2, ..., d)},
which is non-singular (again, the notations β̄_{11}, β̄_{111}, β^t_i are obvious). Consider a Gaussian variable which is measurable with respect to the process and which is
1
2
2
1
X
+
X
x = Op (2 ),
ii
11
2
2
2 ii
2 11
Since β_t(s) is bounded, we see that the integrand is O(ρ^{1−d}), which ensures the convergence of I₁^ρ as ρ → 0. One easily checks that the bound for the integrand is uniform in t.
We consider now the limit of I₂^ρ as ρ → 0. It is enough to prove that for each x ∈ R the expression
u+h
d1
(1)
(ds) t (s)
dx
u
S(t,)
t
t
t (s) = t (s)x p t t
E det X (s) v 1IAx /X t (s) = t (s)x, X
X (s),X (s) (s)x, (s)x .
(54)
(1)d1
u
2(d1)
(dw) t (t + w)
dx
S d1
X tN T (w)
X tT (w)
Corollary 6.1 Suppose that the process X satisfies the conditions of Theorem 4.2 and that in addition E(X_t) = 0 and Var(X_t) = 1.
Then, as u → +∞, F′(u) is equivalent to
(u^d e^{−u²/2} / (2π)^{(d+1)/2}) ∫_I (det Λ(t))^{1/2} dt,   (57)
r_{i;}(s, t) := ∂r/∂s_i (s, t);  r_{ij;}(s, t) := ∂²r/∂s_i ∂s_j (s, t);  r_{i;j}(s, t) := ∂²r/∂s_i ∂t_j (s, t).
Thus X(t) and X′(t) are independent. Regression formulae imply that
a_{ts} = r(s, t),  β_t(s) = (1 − r(t, s)) / n(s, t).
This implies that β_t(t) = Λ(t) and that the possible limit values of β_t(s) as s → t are in the set {vᵀ Λ(t) v : v ∈ S^{d−1}}. Due to the non-degeneracy condition these quantities are minorized by a positive constant. On the other hand, for s ≠ t, β_t(s) > 0. This shows that for every t ∈ I one has inf_{s∈I} β_t(s) > 0. Since for every t ∈ I the process X^t is bounded, it follows that
a.s. 1_{A_u(X^t, β_t)} → 1 as u → +∞.
Also
det X t (t) t (t)u
A dominated convergence argument shows that the first term in (35) is equivalent to
∫_I u^d det(Λ(t)) (2π)^{−1/2} e^{−u²/2} [(2π)^{d/2} (det Λ(t))^{1/2}]^{−1} dt = (u^d e^{−u²/2} / (2π)^{(d+1)/2}) ∫_I (det Λ(t))^{1/2} dt.
The same kind of argument shows that the second term is O(u^{d−1} e^{−u²/2}), which completes the proof.
References
Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the maximum of a Gaussian random field. To appear in Extremes.
Azaïs, J-M. and Wschebor, M. (1999). Régularité de la loi du maximum de processus gaussiens réguliers. C.R. Acad. Sci. Paris, t. 328, série I, 333-336.
Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
Brillinger, D. R. (1972). On the number of solutions of systems of random equations. The Annals of Math. Statistics, 43, 534-540.
Cabaña, E. M. (1985). Esperanzas de integrales sobre conjuntos de nivel aleatorios (Spanish). Actas del segundo Congreso latinoamericano de probabilidades y estadística matemática, Caracas, 65-81.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. To appear in Numerische Mathematik.
Diebolt, J. and Posse, C. (1996). On the Density of the Maximum of Smooth Gaussian Processes. Ann. Probab., 24, 1104-1129.
Federer, H. (1969). Geometric measure theory. Springer-Verlag, New York.
Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Kobayashi, S. and Nomizu, K. (199?). Foundations of differential geometry. J. Wiley & Sons, New York.
Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369-378.
Lifshits, M.A. (1995). Gaussian random functions. Kluwer, The Netherlands.
Milnor, J. W. (1965). Topology from the differentiable viewpoint. The University Press of Virginia, Charlottesville.
Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island.
Piterbarg, V. I. (1996b). Rice's Method for Large Excursions of Gaussian Random Fields. Technical Report No. 478, University of North Carolina. Translation of Rice's method for Gaussian random fields.
Taylor, J.E. and Adler, R. (2002). Euler characteristics for Gaussian fields on manifolds. Preprint.
Tsirelson, V.S. (1975). The Density of the Maximum of a Gaussian Process. Th. Probab. Appl., 20, 847-856.
Σ_{1≤i<j≤n} (arcsin r^X_{ij} − arcsin r^Y_{ij})⁺ exp(−(a_i² + a_j²)/(2(1 + ρ_ij))),
where r^X_{ij} and r^Y_{ij} are covariances between X_i and X_j and between Y_i and Y_j, respectively, and ρ_ij = max(|r^X_{ij}|, |r^Y_{ij}|). Two more related comparison lemmas are stated. One of these is the
well-known Sudakov-Fernique inequality showing that if variances of arbitrary increments of a
Gaussian process X are less than or equal to variances of similar increments of a Gaussian process
Y then the mean of the supremum of X is less than or equal to the mean of the supremum of
Y , provided that the two Gaussian processes are separable centered with almost surely bounded
paths. Next the authors present the proof due to C. Borell of Ehrhard's inequality [C. R. Math. Acad. Sci. Paris 337 (2003), no. 10, 663-666; MR2030108 (2004k:60102)], valid for general Borel subsets of Rⁿ (with no restrictions on the convexity of those sets). Namely, let γ_n be the standard Gaussian probability measure on Rⁿ. Then for any pair A and B of Borel sets in Rⁿ and all
the cumulative distribution function F_{M_T} of the maximum of X in terms of factorial moments ν_m of the number of up-crossings of X of a given level u, starting below u at time 0, holds:
(∗)  1 − F_{M_T}(u) = Σ_{m=1}^{+∞} (−1)^{m+1} ν_m / m!.
Moreover, when the infinite series is truncated, the error bound for the resulting approximation is
also given. The second key result shows that for a Gaussian centered and stationary process on R
with covariance such that (0) = 1 and has a Taylor expansion at zero which is absolutely
convergent at t = 2T , the conditions of the above general Rice series theorem are satisfied and thus
representation () is valid. Much of the remainder of the chapter is devoted to efficient numerical
computation of the factorial moments of up-crossings, which is important for applications of the
Rice series. In particular it is shown that the Rice series approach is a priori better than the Monte
Carlo method (in terms of comparison of the complexities of the computation of the distribution
of the maximum) and, for standard error bounds, allows one to compute the desired distribution
with just a few terms of the Rice series. Chapter 5 concludes with a modification of the general Rice series theorem discussed earlier to include continuous processes that do not have sufficiently differentiable paths, which is achieved by employing in the series the factorial moments of up-crossings of an ε-mollified version (with ε > 0) of the underlying process and then letting ε tend to 0.
Chapter 6 revisits the subject of Rice formulas but in a much richer multiparameter setting.
The authors start by proving the area formula, then establish Rice formulas for the moments
of multiparameter Gaussian random fields (from a domain in Rd to Rd ) having continuously
differentiable trajectories, and also prove a closely related result on the expected number of
weighted roots corresponding to a given level set. Next, Rice formulas for the expected number of
local maxima and the expected number of critical points of a Gaussian random field with domain
D are established, where D is a C 2 -manifold (at first, the manifold has no additional structure,
then the results are further specialized to the cases when D has a Riemannian metric and when D
is embedded in a Euclidean space). Analogous results are subsequently also proved for the case of
Gaussian random fields from R^d to R^{d′}, but now d > d′.
Chapter 7 is devoted to the analysis of regularity of the distribution of the maximum of Gaussian
random fields. The key result here is the representation formula for the density of the maximum
of a Gaussian real-valued field with C 2 -paths defined on an open set containing S, where S is a
compact subset of Rd which can be written as the disjoint union of a finite number of orientable
C 3 manifolds Sj of dimension j without boundary (where j = 0, . . . , d). Moreover, under certain
nondegeneracy conditions, this density of the maximum is shown to be continuous. On the other
hand, restricting attention to the one-parameter case allows the authors to derive subtler results on
the degree of smoothness of the distribution of the maximum. Namely, if a Gaussian process on
[0, 1] has paths in C 2k then the cumulative distribution function of the maximum is shown to be of
class C k .
Chapter 8 generally studies tails of the distribution of the maximum of a random field and is
divided into two parts. In the first part the authors focus solely on the case of one-parameter
Gaussian processes and analyze the asymptotic behavior of the successive derivatives of the
distribution of the maximum as well as the tails of the distribution of the maximum of certain
unbounded Gaussian processes. In the latter case the probability q that the supremum is finite is
strictly less than one, and the aim is to understand the speed at which P (MT u) converges to q
as u grows to +. In the second part the authors establish bounds for the density of the maximum
of a multiparameter Gaussian random field and subsequently analyze the asymptotic behavior of
the maximum given by
P (M > u) = A(u) exp(u2 /(2 2 )) + B(u),
where A(u) is a known function with polynomially bounded growth as u +, 2 =
supt Var(X(t)), and B(u) is an error bounded by a centered Gaussian density with variance
smaller than 2 .
Chapter 9 develops an efficient method, based on record times, for the numerical computation of
the distribution of the maximum of one- and two-parameter Gaussian random fields. The authors
first consider the parameter space [0, 1] and prove that if X is a Gaussian process with C 1 -paths,
then the maximum M = max{X(t), t [0, 1]} has a distribution with tails of the form
(∗∗)  P(M > u) = P(X(0) > u) + ∫₀¹ E(X′(t)⁺ 1_{t∈R} | X(t) = u) p_{X(t)}(u) dt,
where p_{X(t)}(·) is the probability density of X(t) and R is the set of record times, i.e. R = {t ∈ [0, 1] : X(s) < X(t), ∀s ∈ [0, t)}. The latter result is derived from Rychlik's formula, which in turn is based on the idea that
P(M > u) = P(X(0) > u) + P(∃ t ∈ R: X(t) = u) = P(X(0) > u) + E[#{t ∈ R: X(t) = u}],
since the number of record times t such that X(t) = u is either 0 or 1. Then, upon using a
discretization of the condition {X(s) < X(t), s [0, t)}, one can use formula () to obtain
explicit upper bounds on P (M > u):
P (X(0) > u)
1
+
0
On the other hand, a similar time discretization provides the trivial lower bound
P(M > u) ≥ 1 − P(X(0) ≤ u, . . . , X((n − 1)/n) ≤ u),
where (at least for n up to 100) the integrals in the above upper and lower bounds can be easily
computed using the Matlab toolbox MAGP developed by Mercadier (2005). Subsequently this
record method is adapted by the authors to deal with the case of a two-parameter Gaussian random
field.
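The trivial lower bound above is easy to try numerically. A sketch of ours (the sine-cosine process below is our choice, made because its maximum over a full period is exactly R = (ξ₁² + ξ₂²)^{1/2}, so P(M > u) = e^{−u²/2} is available in closed form):

```python
import numpy as np

rng = np.random.default_rng(3)
u = 1.0
t = np.linspace(0.0, 2 * np.pi, 128, endpoint=False)

trials = 20_000
xi = rng.standard_normal((trials, 2))
# one sample path per row: X(t) = xi1 sin t + xi2 cos t = R cos(t - theta)
paths = xi[:, :1] * np.sin(t) + xi[:, 1:] * np.cos(t)

# discretized lower bound: P(max over the grid > u) <= P(M > u)
lower = np.mean(paths.max(axis=1) > u)
# for this process M = sqrt(xi1^2 + xi2^2), so P(M > u) = exp(-u^2/2)
exact = np.exp(-u**2 / 2)
print(lower, exact)
```

With 128 grid points per period the discretization bias is far below the Monte Carlo noise, so the lower bound already sits very close to the exact tail.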
Chapter 10 presents asymptotic results for one-parameter stationary Gaussian processes on time
intervals whose size tends to infinity. First, provided that the level u tends to infinity jointly with
the size of the time interval so that the expectation of the number of up-crossings remains constant
and under the assumption of some local regularity (given by Geman's condition) and some mixing (given by Berman's condition) of the underlying process, the Volkonski-Rozanov theorem [V. A.
Volkonski and Yu. A. Rozanov, Teor. Veroyatnost. i Primenen. 6 (1961), 202215; MR0137141
(25 #597)] is proved, showing that the asymptotic distribution of the number of up-crossings
is Poisson. The latter in turn implies that the suitably renormalized maximum of the process
converges to a Gumbel distribution. On the other hand, when the level u is fixed, under certain
conditions, the number of (up-)crossings is shown to satisfy a central limit theorem. In terms of
extensions of these results to a multiparameter setting, the authors quote Piterbarg's theorem [V. I.
Piterbarg, Asymptotic methods in the theory of Gaussian processes and fields, Translated from the
Russian by V. V. Piterbarg, Amer. Math. Soc., Providence, RI, 1996; MR1361884 (97d:60044)] for
a multiparameter analogue of the Volkonski-Rozanov theorem. The multiparameter extensions of
the central limit type results for up-crossings are not directly developed in the book, but several
useful references are provided.
Chapter 11 deals with applications of Rice formulas to the study of some geometric characteristics of random sea surfaces. The random sea surface is modeled as a Gaussian stationary
3-parameter field which is the limit of the superposition of infinitely many elementary sea waves.
Namely, if one considers a moving incompressible fluid in a domain of infinite depth, then the
classical Euler equations, after some approximations, imply that the sea level X(t, x, y), where t is time and (x, y) are spatial variables, satisfies
X(t, x, y) = f cos(ω_t t + ω_x x + ω_y y + φ),
where f and φ are the amplitude and phase, and the pulsations ω_t, ω_x and ω_y are some parameters satisfying the Airy relation ω_x² + ω_y² = ω_t⁴/g², where g is the acceleration of gravity. If units are chosen so that g = 1 and if f and φ are independent random variables with f having a Rayleigh distribution and φ being uniform on [0, 2π], then X(t, x, y) is the Gaussian sine-cosine process of the form
X(t, x, y) = ξ₁ sin(ω_t t + ω_x x + ω_y y) + ξ₂ cos(ω_t t + ω_x x + ω_y y),
where ξ₁ and ξ₂ are independent standard normal random variables. The Rice formula is used
to derive from the directional spectrum of the sea various properties of the distribution of such geometric characteristics as the length of crests and the velocities of contours. In addition, two non-Gaussian generalizations of the above Gaussian sea surface model are also briefly discussed.
Chapter 12 is devoted to the application of the Rice formula to the study of the number of real
roots of a system of random equations, with a particular emphasis placed on large polynomial
systems with random coefficients. The authors start by proving the Shub-Smale theorem [M. Shub
and S. J. Smale, in Computational algebraic geometry (Nice, 1992), 267285, Progr. Math., 109,
Birkhauser Boston, Boston, MA, 1993; MR1230872 (94m:68086)] showing that if N X equals the
number of roots of the system of equations X_i(t) = 0 for all i = 1, . . . , m, where
X_i(t) := Σ_{j₁+···+j_m ≤ d_i} a^{(i)}_{j₁,...,j_m} t₁^{j₁} ··· t_m^{j_m},
and the a^{(i)}_{j₁,...,j_m} are independent centred Gaussian random variables with variances Var(a^{(i)}_{j₁,...,j_m}) = d_i! / (j₁! ··· j_m! (d_i − (j₁ + ··· + j_m))!), then E(N^X) = √(d₁ ··· d_m).
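In the one-equation case (m = 1) the theorem is easy to check numerically: with Var(a_j) = C(d, j) the expected number of real roots is √d. A Monte Carlo sketch of ours (not from the review; the sample size and tolerance are arbitrary choices):

```python
import numpy as np
from math import comb, sqrt

rng = np.random.default_rng(0)

def expected_real_roots(d, n_samples=2000):
    """Monte Carlo estimate of E(#real roots) of a Shub-Smale polynomial
    of degree d: independent coefficients a_j ~ N(0, C(d, j))."""
    sd = np.array([sqrt(comb(d, j)) for j in range(d + 1)])
    count = 0
    for _ in range(n_samples):
        a = rng.standard_normal(d + 1) * sd      # a_0, ..., a_d
        roots = np.roots(a[::-1])                # np.roots wants highest degree first
        count += int(np.sum(np.abs(roots.imag) < 1e-7))   # real roots
    return count / n_samples

est = expected_real_roots(4)
print(est)   # should be close to sqrt(4) = 2
```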
Université de Toulouse, IMT, LSP, F-31062 Toulouse Cedex 9, France. E-mail: azais@cict.fr
Escuela de Matemática, Facultad de Ciencias, Universidad Central de Venezuela, A.P. 47197, Los Chaguaramos, Caracas 1041-A, Venezuela. E-mail: jose.leon@ciens.ucv.ve
Centro de Matemática, Facultad de Ciencias, Universidad de la República, Calle Iguá 4225, 11400 Montevideo, Uruguay. E-mail: wschebor@cmat.edu.u
We use Rice formulae in order to compute the moments of some level functionals which are
linked to problems in oceanography and optics: the number of specular points in one and two
dimensions, the distribution of the normal angle of level curves and the number of dislocations
in random wavefronts. We compute expectations and, in some cases, also second moments of
such functionals. Moments of order greater than one are more involved, but one needs them
whenever one wants to perform statistical inference on some parameters in the model or to test
the model itself. In some cases, we are able to use these computations to obtain a central limit
theorem.
Keywords: dislocations of wavefronts; random seas; Rice formulae; specular points
1. Introduction
Many problems in applied mathematics require estimations of the number of points, the
length, the volume and so on, of the level sets of a random function {W (x) : x Rd }, or
of some functionals defined on them. Let us mention some examples which illustrate this
general situation:
1. A first example in dimension one is the number of times that a random process {X(t) : t ∈ R} crosses the level u:
N_A^X(u) = #{s ∈ A : X(s) = u}.
Generally speaking, the probability distribution of the random variable N_A^X(u) is unknown, even for simple models of the underlying process. However, there exist some formulae to compute E(N_A^X(u)) and also higher order moments; see, for example, [6].
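As a quick numerical illustration (our own sketch, not part of the paper): for a stationary, centred Gaussian process with unit variance and second spectral moment λ₂, the Rice formula gives E(N_{[0,T]}^X(u)) = (T/π) √λ₂ e^{−u²/2}. For the sine-cosine process X(t) = ξ₁ sin t + ξ₂ cos t one has λ₂ = 1, and crossings can be counted directly:

```python
import numpy as np

rng = np.random.default_rng(1)
u, T = 1.0, 2 * np.pi
t = np.linspace(0.0, T, 2001)
st, ct = np.sin(t), np.cos(t)

trials = 4000
total = 0
for _ in range(trials):
    xi1, xi2 = rng.standard_normal(2)
    x = xi1 * st + xi2 * ct
    s = np.sign(x - u)
    total += int(np.sum(s[:-1] * s[1:] < 0))   # sign changes = u-crossings

est = total / trials
rice = (T / np.pi) * np.exp(-u**2 / 2)   # Rice formula with lambda_2 = 1
print(est, rice)
```

The Monte Carlo average agrees with the Rice prediction 2e^{−1/2} ≈ 1.213 up to sampling noise.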
2. A particular case is the number of specular points of a random curve or a random
surface. Consider first the case of a random curve. A light source placed at (0, h1 ) emits a
ray that is reflected at the point (x, W (x)) of the curve and the reflected ray is registered
This is an electronic reprint of the original article published by the ISI/BS in Bernoulli,
2011, Vol. 17, No. 1, 170193. This reprint differs from the original in pagination and
typographic detail.
1350-7265
2011 ISI/BS
by an observer placed at (0, h2 ). Using the equality between the angles of incidence and
reflection with respect to the normal vector to the curve (i.e., N (x) = (W (x), 1)), an
elementary computation gives
W (x) =
2 r1 1 r2
,
x(r2 r1 )
(1)
When h₁ and h₂ are large, the specular point condition becomes approximately
W′(x) ≈ (x/2)(1/h₁ + 1/h₂) = kx,  where k := (1/2)(1/h₁ + 1/h₂).   (2)
Set Y(x) := W′(x) − kx and let SP₂(A) denote the number of roots of Y(x) belonging to the set A, an approximation of SP₁(A) under this asymptotic. The first part of Section 2 below will be devoted to obtaining some results on the distribution of the random variable SP₂(R).
4. Let W : Q ⊂ R^d → R^{d′}, with d > d′, be a random field and define the level set
C_Q^W(u) = {x ∈ Q : W(x) = u}.
Under certain general conditions, this set is a (d − d′)-dimensional manifold but, in any case, its (d − d′)-dimensional Hausdorff measure is well defined. We denote this measure by σ_{d−d′}. Our interest will be in computing the mean of the σ_{d−d′}-measure of this level set, that is, E[σ_{d−d′}(C_Q^W(u))], as well as its higher moments. It will also be of interest to compute
E[ ∫_{C_Q^W(u)} Y(s) dσ_{d−d′}(s) ],
where Y(s) is some random field defined on the level set. One can find formulae of this type, as well as a certain number of applications, in [5, 14] (d′ = 1), [3], Chapter 6, and [1].
5. Another set of interesting problems is related to the phase singularities of random wavefronts. These correspond to lines of darkness in light propagation, or threads of silence in sound propagation [4]. In a mathematical framework, they can be defined as the locations of points where the amplitudes of waves vanish. If we represent a wave as
W(x, t) = ξ(x, t) + iη(x, t),  x ∈ R^d,
where ξ, η are independent homogeneous Gaussian random fields, then the dislocations are the intersections of the two random surfaces ξ(x, t) = 0, η(x, t) = 0. Here, we only consider the case d = 2. At fixed time, say t = 0, we will compute the expectation of the random variable #{x ∈ S : ξ(x, 0) = η(x, 0) = 0}.
The aim of this paper is threefold: (a) to re-formulate some known results in a modern
language; (b) to prove a certain number of new results, both for the exact and approximate models, especially variance computations in cases in which only first moments have
been known until now, thus contributing to improve the statistical methods derived from
the probabilistic results; (c) in some cases, to prove a central limit theorem.
Rice formulae are our basic tools. For statements and proofs, we refer to the recent book
[3]. On the other hand, we are not giving full proofs since the required computations are
quite long and involved; one can find details and some other examples that we do not treat
here in [2]. For numerical computations, we use MATLAB programs which are available
at the site http://www.math.univ-toulouse.fr/~azais/prog/programs.html.
In what follows, λ_d denotes the Lebesgue measure in R^d, σ_{d′}(B) the d′-dimensional Hausdorff measure of a Borel set B and M^T the transpose of a matrix M. (const) is a positive constant whose value may change from one occurrence to another. p_ξ(x) is the density of the random variable or vector ξ at the point x, whenever it exists. If not otherwise stated, all random fields are assumed to be Gaussian and centred.
E(SP₂(I)) = ∫_I E(|Y′(x)|) p_{Y(x)}(0) dx = ∫_I G(k, √λ₄) (1/√λ₂) φ(kx/√λ₂) dx,   (3)
where G(μ, σ) := E|Z|, with
Z ~ N(μ, σ²):  E|Z| = μ[2Φ(μ/σ) − 1] + 2σφ(μ/σ),   (4)
where φ(·) and Φ(·) are respectively the density and cumulative distribution functions of the standard Gaussian distribution.
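Formula (4) is easy to check numerically; a sketch of ours comparing the closed form with direct quadrature of E|Z| (the values of mu and sigma are arbitrary test values):

```python
import numpy as np
from math import erf, exp, pi, sqrt

def phi(x):   # standard normal density
    return exp(-x * x / 2) / sqrt(2 * pi)

def Phi(x):   # standard normal cdf
    return 0.5 * (1 + erf(x / sqrt(2)))

def G(mu, sigma):
    """Closed form (4) for E|Z|, Z ~ N(mu, sigma^2)."""
    r = mu / sigma
    return mu * (2 * Phi(r) - 1) + 2 * sigma * phi(r)

mu, sigma = 0.7, 1.3
z = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 200001)
dens = np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))
f = np.abs(z) * dens
quad = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(z)))   # trapezoid rule
print(G(mu, sigma), quad)
```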
If we look at the total number of specular points over the whole line, we get
E(SP₂(R)) = G(k, √λ₄)/k = √(2λ₄/π) (1/k) [1 + k²/(2λ₄) − k⁴/(24λ₄²) + ···],   (5)
which is an increasing function of √λ₄/k.
We now turn to the computation of the expectation of the number of specular points SP₁(I) defined by (1). It is equal to the number of zeros of the process {Z(x) := W′(x) − m₁(x, W(x)) : x ∈ R}, where
m₁(x, w) =
Assume that the process {W(x) : x ∈ R} is Gaussian, centred and stationary, with λ₀ = 1. The process Z is not Gaussian, so we use [3], Theorem 3.4, to get
b
dx
a
(6)
2
2
1
1
em1 (x,w)/(22 ) dw.
ew /2
22
2
m1
m1
(x, W (x))
(x, W (x))W (x)
x
w
where K(x, w) =
m1
m1
(x, w) +
(x, w)m1 (x, w).
x
w
Once again, using Gaussian regression, we can write (6) in the form
E(SP 1 ([a, b])) =
1
2
4 22
2
dx
a
G(m, 1) exp
m2 (x, w)
1
w2 + 1
2
2
dw, (7)
Figure 1. Intensity of specular points in the case h1 = 100, h2 = 300, λ4 = 3. Solid line corresponds to the exact formula, dashed line corresponds to the approximation (3).
the result obtained with the exact formula is around 2·10⁻² larger (this is of the same order as the error
in the computation of the integral). For h1 = 90, h2 = 110, λ4 = 3, the results are 136.81
and 137.7, respectively. If h1 = 100, h2 = 300, λ4 = 3, the results differ significantly and
Figure 1 displays the densities in the integrands of (6) and (3) as functions of x.
Let us now consider the variance of the total number of specular points S := SP2(R). We use the decomposition

    Var(S) = E(S(S − 1)) + E(S) − [E(S)]²        (8)

and, for the second factorial moment, the Rice formula

    E(S(S − 1)) = ∫∫_{R²} E(|W″(x) − k||W″(y) − k| | W′(x) = kx, W′(y) = ky)
                  × p_{W′(x),W′(y)}(kx, ky) dx dy.        (9)
The density in the integrand of (9) is given by

    p_{W′(x),W′(y)}(kx, ky) = (1/(2π√(λ2² − ρ″²(x − y))))
        × exp{ −(k²/2) (λ2x² + 2ρ″(x − y)xy + λ2y²)/(λ2² − ρ″²(x − y)) },        (10)

under the condition that the density (10) does not degenerate for x ≠ y; here ρ(z) := E(W(x)W(x + z)) denotes the covariance of W (so that λ2 = −ρ″(0)).
For the conditional expectation in (9), we perform a Gaussian regression of W″(x)
(resp., W″(y)) on the pair (W′(x), W′(y)). Putting z = x − y, we obtain

    W″(x) = θ_y(x) + a_y(x)W′(x) + b_y(x)W′(y),
    a_y(x) = −ρ″(z)ρ‴(z)/(λ2² − ρ″²(z)),   b_y(x) = −λ2ρ‴(z)/(λ2² − ρ″²(z)),

where θ_y(x) is Gaussian, centered and independent of the pair (W′(x), W′(y)); the regression of W″(y) is obtained by permuting x and y. The conditional expectation in (9) can then be rewritten as the unconditional expectation

    E[ |θ_y(x) − k(1 + ρ‴(z)(ρ″(z)x + λ2y)/(λ2² − ρ″²(z)))|
       × |θ_x(y) − k(1 − ρ‴(z)(ρ″(z)y + λ2x)/(λ2² − ρ″²(z)))| ].        (11)
Note that the singularity on the diagonal x = y is removable, since a Taylor expansion
shows that, for z → 0,

    1 + ρ‴(z)(ρ″(z)x + λ2y)/(λ2² − ρ″²(z)) = (λ4/(2λ2)) xz + O(z²).        (12)
Moreover, denoting σ²(z) := Var(θ_y(x)) = Var(θ_x(y)), one has

    σ²(z) = λ4 − λ2ρ‴²(z)/(λ2² − ρ″²(z)),        (13)

    E(θ_y(x)θ_x(y)) = ρ⁽⁴⁾(z) + ρ″(z)ρ‴²(z)/(λ2² − ρ″²(z))        (14)

and, as z → 0,

    σ²(z) ≃ ((λ2λ6 − λ4²)/(4λ2)) z²,        (15)

and it follows that the singularity on the diagonal of the integrand in the right-hand side
of (9) is also removable.
We will make use of the following auxiliary statement that we state as a lemma for
further reference. The proof requires some calculations, but is elementary, so we omit it.
The value of H(ρ; 0, 0) can be found in, for example, [6], pages 211–212.

Lemma 1. Let

    H(ρ; μ, ν) = E(|ξ + μ||η + ν|),

where the pair (ξ, η) is centered Gaussian with Var(ξ) = Var(η) = 1 and E(ξη) = ρ. Then

    H(ρ; μ, ν) = H(ρ; 0, 0) + R2(ρ; μ, ν),

where

    H(ρ; 0, 0) = (2/π)[ √(1 − ρ²) + ρ arctan(ρ/√(1 − ρ²)) ]

and

    |R2(ρ; μ, ν)| ≤ 3(μ² + ν²),

valid whenever μ² + ν² ≤ 1 and 0 ≤ ρ ≤ 1.
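The closed form for H(ρ; 0, 0) in Lemma 1 (note that arctan(ρ/√(1 − ρ²)) = arcsin ρ) can be checked against a direct two-dimensional numerical integration; a sketch:

```python
import numpy as np

def H00(rho):
    # (2/pi) * (sqrt(1 - rho^2) + rho * arcsin(rho)) = E|xi| |eta|
    return (2.0 / np.pi) * (np.sqrt(1.0 - rho**2) + rho * np.arcsin(rho))

t = np.linspace(-8.0, 8.0, 1601)
phi = np.exp(-t**2 / 2.0) / np.sqrt(2.0 * np.pi)
XI, ZE = np.meshgrid(t, t)
WGT = np.outer(phi, phi)
for rho in (0.0, 0.3, 0.7):
    # represent eta = rho*xi + sqrt(1-rho^2)*zeta with (xi, zeta) independent
    ETA = rho * XI + np.sqrt(1.0 - rho**2) * ZE
    val = np.trapz(np.trapz(np.abs(XI) * np.abs(ETA) * WGT, t, axis=1), t)
    assert abs(val - H00(rho)) < 1e-3
```

At ρ = 0 the formula reduces to (E|ξ|)² = 2/π and at ρ = 1 to Eξ² = 1, as it should.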
In the next theorem, we compute an equivalent for the variance of the number of specular points, under certain hypotheses on the random process W and in the Longuet-Higgins asymptotic. This result is new and useful for estimation purposes, since it implies
that, as k → 0, the coefficient of variation of the random variable S tends to zero at
a known speed. Moreover, it will also appear in a natural way when normalizing S to
obtain a central limit theorem.
Theorem 1. Assume that the centered Gaussian stationary process W = {W(x) : x ∈ R}
is δ-dependent, that is, ρ(z) = 0 if |z| > δ, and that it has C⁴-paths. Then, as k → 0, we
have

    Var(S) = θ/k + O(1),        (16)

where

    θ = J/√(2π) + √(2λ4/π) − 2δλ4/(π^{3/2}√λ2),
    J = ∫_{−δ}^{+δ} σ²(z)H(μ(z); 0, 0)/√(2(λ2 + ρ″(z))) dz,
    μ(z) := (1/σ²(z)) [ρ⁽⁴⁾(z) + ρ″(z)ρ‴²(z)/(λ2² − ρ″²(z))],

and σ²(z), H are defined in (13) and Lemma 1, respectively. Equivalently,

    θ = √(2λ4/π) + ∫_{−δ}^{+δ} [ σ²(z)H(μ(z); 0, 0)/(2√(π(λ2 + ρ″(z)))) − λ4/(π^{3/2}√λ2) ] dz.

Remarks. (1) The δ-dependence hypothesis can be replaced by a sufficiently fast decay
of the covariance and its derivatives ρ⁽ⁱ⁾(z) (0 ≤ i ≤ 4) as z → +∞. The proof of this
extension can be constructed along the same lines as the one we
give below, with some additional computations.
(2) The above computations complete the study done in [10] (Theorem 4). In [9], the
random variable SP2(I) is expanded in the Wiener–Hermite chaos. The aforementioned expansion yields the same formula for the expectation and also allows a
formula to be obtained for the variance. However, this expansion is difficult to
manipulate in order to get the result of Theorem 1.
Proof of Theorem 1. We use the notation and the computations preceding the statement of the theorem.
Divide the integral on the right-hand side of (9) into two parts, corresponding to
|x − y| > δ and |x − y| ≤ δ, that is,

    E(S(S − 1)) = ∫∫_{|x−y|>δ} ⋯ + ∫∫_{|x−y|≤δ} ⋯ = I1 + I2.        (17)
In the first term, the δ-dependence of the process implies that one can factorize the
conditional expectation and the density in the integrand. Taking into account that for
each x ∈ R, the random variables W′(x) and W″(x) are independent, we obtain for I1

    I1 = ∫∫_{|x−y|>δ} E(|W″(x) − k|)E(|W″(y) − k|) p_{W′(x)}(kx) p_{W′(y)}(ky) dx dy.
On the other hand, we know that W′(x) (resp., W″(x)) is centered normal with variance
λ2 (resp., λ4). Hence,

    I1 = [G(k, √λ4)]² ∫∫_{|x−y|>δ} (1/(2πλ2)) exp( −k²(x² + y²)/(2λ2) ) dx dy.
To compute the integral on the right-hand side, note that the integral over the whole x, y
plane is equal to 1/k², so that it suffices to compute the integral over the set |x − y| ≤ δ.
Changing variables, this last integral is equal to

    ∫∫_{|x−y|≤δ} (1/(2πλ2)) exp( −k²(x² + y²)/(2λ2) ) dx dy = δ/(√(πλ2) k) + O(1),

where the last term is bounded if kδ is bounded (remember that we are considering an
approximation in which k → 0). Therefore, we can conclude that

    ∫∫_{|x−y|>δ} (1/(2πλ2)) exp( −k²(x² + y²)/(2λ2) ) dx dy = 1/k² − δ/(√(πλ2) k) + O(1),

so that

    I1 = (2λ4/π) [ 1/k² − δ/(√(πλ2) k) ] + O(1).        (18)
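The asymptotics just used can be verified directly: with u = kx/√λ2, v = ky/√λ2, the integral over {|x − y| ≤ δ} reduces to P(|U − V| ≤ kδ/√λ2)/k² for U, V independent standard normals, which has the closed form used in the sketch below (illustrative parameter values):

```python
import numpy as np
from scipy.stats import norm

lam2, delta = 1.3, 0.8  # illustrative values
for k in (0.05, 0.01):
    # exact value of the integral over {|x-y| <= delta}: U - V ~ N(0, 2)
    exact = (2.0 * norm.cdf(k * delta / np.sqrt(2.0 * lam2)) - 1.0) / k**2
    approx = delta / (np.sqrt(np.pi * lam2) * k)
    # the difference stays bounded (it is in fact O(k)) as k -> 0
    assert abs(exact - approx) < 5e-3
```

This confirms both the constant δ/√(πλ2) and the boundedness of the remainder.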
Let us now turn to I2. Using Lemma 1 and the equivalences (12) and (15), whenever
|z| = |x − y| ≤ δ, the integrand on the right-hand side of (9) is bounded by

    (const)[H(μ(z); 0, 0) + k²(x² + y²)].

We divide the integral I2 into two parts.
First, on the set {(x, y) : |x| ≤ 2δ, |x − y| ≤ δ}, the integral is clearly bounded by some
constant.
Second, we consider the integral on the set {(x, y) : x > 2δ, |x − y| ≤ δ}. (The symmetric
case, replacing x > 2δ by x < −2δ, is similar; that is the reason for the factor 2 in what
follows.) We have (recall that z = x − y)

    I2 = O(1) + 2 ∫∫_{|x−y|≤δ, x>2δ} σ²(z)[H(μ(z); 0, 0) + R2(μ(z); μ̄, ν̄)]
         × (1/(2π√(λ2² − ρ″²(z)))) exp{ −(k²/2)(λ2x² + 2ρ″(x − y)xy + λ2y²)/(λ2² − ρ″²(x − y)) } dx dy,        (19)

where μ̄, ν̄ denote the standardized means appearing in (11), which are O(k(|x| + |y|)).
Rewriting the exponent in the variables z = x − y and x, this becomes

    I2 = O(1) + 2 ∫_0^δ σ²(z)[H(μ(z); 0, 0) + R2] (1/(2π√((λ2 + ρ″(z))(λ2 − ρ″(z)))))
         × exp{ −(k²z²/2) · 1/(2(λ2 + ρ″(z))) } ( ∫_{2δ}^{+∞} exp{ −k²(x − z/2)²/(λ2 − ρ″(z)) } dx ) dz.

The inner integral is equal to √(π(λ2 − ρ″(z)))/k + O(1), and the contribution of the
R2-terms is O(1) because of the bound in Lemma 1 and a standard Gaussian computation
(for instance, ∫_0^{+∞} ξ² (1/√(2π)) e^{−ξ²/2} dξ < ∞ after the change of variable
ξ = kx√(2/λ2)). Hence,

    I2 = J/(√(2π) k) + O(1),        (20)

with J given in the statement of Theorem 1. To finish, put together (18) and (20) with
(17), (8) and (5). □
For the central limit theorem, we need an additional hypothesis.

Theorem 2. Assume that W satisfies the hypotheses of Theorem 1 and, moreover, that

    E{[SP2(I)]⁴} ≤ (const)        (21)

for every interval I of length δ, uniformly in k. Then, as k → 0,

    ( S − √(2λ4/π) (1/k) ) / √(θ/k) → N(0, 1)

in distribution.

Remarks. One can give conditions for the additional hypothesis (21) to hold true. Even
though they are not nice, they are not costly from the point of view of physical models.
For example, either one of the following conditions implies (21):
(i) the paths x ↦ W(x) are of class C¹¹ (use [3], Theorem 3.6, with m = 4, applied to
the random process {W′(x) : x ∈ R});
(ii) the paths x ↦ W(x) are of class C⁹ and the support of the spectral measure has
an accumulation point (apply [3], Example 3.4, Proposition 5.10 and Theorem 3.4, to
show that the fourth moment of the number of zeros of Y(x) = W′(x) − kx is bounded).
Note that the asymptotic here differs from other ones existing in the literature on related
subjects (compare with, e.g., [7] and [12]).
Proof of Theorem 2. Let α and β be real numbers satisfying the conditions 1/2 <
α < 1, α + β > 1, 2α + β < 2. It suffices to prove the convergence as k takes values on a
sequence of positive numbers tending to 0. To keep in mind that the parameter is k, we
use the notation S(k) := S = SP2(R).
Choose k small enough so that k^{−α} > 2 and define the sets of disjoint intervals, for
j = 0, ±1, . . . , ±[k^{−β}] ([·] denotes integer part),

    U_{jk} = ((j − 1)[k^{−α}]δ + δ/2, j[k^{−α}]δ − δ/2),
    I_{jk} = [j[k^{−α}]δ − δ/2, j[k^{−α}]δ + δ/2].

Each interval U_{jk} has length [k^{−α}]δ − δ and two neighboring intervals U_{jk} are separated
by an interval of length δ. So, the δ-dependence of the process implies that the random
variables SP2(U_{jk}), j = 0, ±1, . . . , ±[k^{−β}], are independent. A similar argument applies to
SP2(I_{jk}), j = 0, ±1, . . . , ±[k^{−β}].
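The block construction above can be sketched explicitly. The code below builds the intervals U_{jk} and I_{jk} for illustrative δ, α, β satisfying the three constraints, and checks the lengths and the δ-separations that yield independence:

```python
import numpy as np

delta, alpha, beta = 1.0, 0.6, 0.5  # 1/2 < alpha < 1, alpha + beta > 1, 2*alpha + beta < 2
k = 1e-3
m = int(np.floor(k ** (-alpha)))    # [k^{-alpha}]
U = {j: ((j - 1) * m * delta + delta / 2, j * m * delta - delta / 2) for j in range(-3, 4)}
I = {j: (j * m * delta - delta / 2, j * m * delta + delta / 2) for j in range(-3, 4)}
for j in range(-3, 4):
    assert abs((U[j][1] - U[j][0]) - (m * delta - delta)) < 1e-9  # length [k^-alpha]*delta - delta
    assert abs((I[j][1] - I[j][0]) - delta) < 1e-9                # length delta
    if j < 3:
        gap = (U[j][1], U[j + 1][0])
        # neighboring U-intervals are separated by exactly the interval I_{jk}, of length delta,
        # which gives independence under delta-dependence of W
        assert abs(gap[1] - gap[0] - delta) < 1e-9
        assert gap == I[j]
```

With k = 10⁻³ and α = 0.6 each block contains [k⁻⁰·⁶] = 63 unit intervals, so the "buffer" intervals I_{jk} carry a vanishing fraction of the specular points.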
We write

    T(k) = Σ_{|j|≤[k^{−β}]} SP2(U_{jk}),   V_k = (Var(S(k)))^{−1/2} ≈ √(k/θ).
The proof is performed in two steps, which easily imply the statement. In the first, it
is proved that V_k[S(k) − T(k)] tends to 0 in the L² of the underlying probability space.
In the second step, we prove that V_k T(k) is asymptotically standard normal.
Step 1. We first prove that V_k[S(k) − T(k)] tends to 0 in L¹. Since it is non-negative,
it suffices to show that V_k times its expectation tends to zero. We have

    S(k) − T(k) = Σ_{|j|<[k^{−β}]} SP2(I_{jk}) + Z1 + Z2,

where Z1 = SP2((−∞, −[k^{−β}][k^{−α}]δ + δ/2)) and Z2 = SP2(([k^{−β}][k^{−α}]δ − δ/2, +∞)).
Using the fact that E(SP2(I)) ≤ (const) ∫_I φ(kx/√λ2) dx, we can show that

    V_k E(S(k) − T(k)) ≤ (const) √k { Σ_{|j|<[k^{−β}]} ∫_{I_{jk}} φ(kx/√λ2) dx
                          + 2 ∫_{[k^{−β}][k^{−α}]δ}^{+∞} φ(kx/√λ2) dx },
which tends to zero as a consequence of the choice of α and β. It now suffices to prove that
V_k² Var(S(k) − T(k)) → 0 as k → 0. Using independence, we have

    Var( Σ_{|j|<[k^{−β}]} SP2(I_{jk}) ) = Σ_{|j|<[k^{−β}]} Var(SP2(I_{jk})) ≤ (const)[k^{−β}],

so that V_k² times this quantity is O(k^{1−β}), which tends to zero because of the choice of β. The remaining two terms can be bounded
in a similar form as in the proof of Theorem 1.
Step 2. T(k) is a sum of independent, but not equidistributed, random variables. To
prove that it satisfies a central limit theorem, we will use a Lyapunov condition based on
fourth moments. Set

    M_{jm} := E{[SP2(U_{jk}) − E(SP2(U_{jk}))]^m}.

For the Lyapunov condition, it suffices to verify that

    (1/σ_k⁴) Σ_{|j|≤[k^{−β}]} M_{j4} → 0   as k → 0,   where σ_k² := Σ_{|j|≤[k^{−β}]} M_{j2}.        (22)
To bound M_{j4}, divide the interval U_{jk} into p ≈ [k^{−α}] consecutive intervals Ĩ_1, . . . , Ĩ_p of equal length δ, so that

    M_{j4} = Σ_{i1,i2,i3,i4=1}^{p} E(SP̃_{i1} SP̃_{i2} SP̃_{i3} SP̃_{i4}),        (23)

where SP̃_i stands for SP2(Ĩ_i) − E(SP2(Ĩ_i)). Since the size of all the intervals is equal
to δ, given the finiteness of fourth moments in the hypothesis, it follows that
E(SP̃_{i1} SP̃_{i2} SP̃_{i3} SP̃_{i4}) is bounded.
On the other hand, the number of terms which do not vanish in the sum on the right-hand side of (23) is O(p²). In fact, if one of the indices in (i1, i2, i3, i4) differs by more
than 1 from all the others, then E(SP̃_{i1} SP̃_{i2} SP̃_{i3} SP̃_{i4}) = 0, by δ-dependence. Hence,

    E{[SP2(U_{jk}) − E(SP2(U_{jk}))]⁴} ≤ (const) k^{−2α},

so that Σ_{|j|≤[k^{−β}]} M_{j4} = O(k^{−2α} k^{−β}). Since σ_k² is of the same order as Var(S(k)) ≈ θ/k, the inequality 2α + β < 2 implies the Lyapunov
condition.
We now consider the two-dimensional case, in which {W(x, y) : (x, y) ∈ R²} is the random
surface and, in the Longuet-Higgins approximation, the specular points are the solutions of

    W_x = kx,        (24)
    W_y = ky.        (25)

Let us define

    Y(x, y) := (W_x(x, y) − kx, W_y(x, y) − ky)ᵀ        (26)

and denote by SP2(Q) the number of zeros of Y in the set Q.
Under very general conditions, for example, on the spectral measure of {W(x, y) : x, y ∈
R}, the random field {Y(x, y) : x, y ∈ R} satisfies the conditions of [3], Theorem 6.2, and
we can write

    E(SP2(Q)) = ∫_Q E(|det Y′(x, y)|) p_{Y(x,y)}(0) dx dy        (27)

since, for fixed (x, y), the random matrix Y′(x, y) and the random vector Y(x, y) are
independent, so that the condition in the conditional expectation can be eliminated. The
density in the right-hand side of (27) has the expression

    p_{Y(x,y)}(0) = p_{(W_x,W_y)}(kx, ky)
                  = (1/(2π√(λ20λ02 − λ11²)))
                    × exp{ −(k²/(2(λ20λ02 − λ11²))) (λ02x² − 2λ11xy + λ20y²) }.        (28)
To compute the expectation of the absolute value of the determinant in the right-hand side of (27), which does not depend on x, y, we use the method of [4]. Set

    Δ := det Y′(x, y) = (W_{xx} − k)(W_{yy} − k) − W_{xy}².

We have

    E(|Δ|) = (1/π) E( ∫_{−∞}^{+∞} (1 − cos(Δt))/t² dt ).        (29)

Define

    h(t) := E[ exp( it[(W_{xx} − k)(W_{yy} − k) − W_{xy}²] ) ].

Then

    E(|Δ|) = (2/π) ∫_0^{+∞} (1 − Re[h(t)])/t² dt.        (30)
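The identity behind (29)–(30) — |a| = (1/π)∫(1 − cos at)/t² dt, applied under the expectation sign — can be sanity-checked in the simplest quadratic case Δ = ξ², with ξ standard normal, where h(t) = (1 − 2it)^{−1/2} and E|Δ| = Eξ² = 1. A sketch (the cutoffs are ad hoc):

```python
import numpy as np
from scipy.integrate import quad

def integrand(t):
    re_h = ((1.0 - 2.0j * t) ** -0.5).real  # Re E[exp(it * xi^2)]
    return (1.0 - re_h) / t**2

val, _ = quad(integrand, 1e-4, 200.0, limit=500)
val += 1.0 / 200.0  # tail: Re h(t) is negligible there, so the integrand is ~ 1/t^2
assert abs((2.0 / np.pi) * val - 1.0) < 1e-2
```

The same scheme, with h(t) a product of three such factors, is what is used for the determinant below.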
Let us introduce the matrix

    A = ( 0    1/2  0
          1/2  0    0
          0    0   −1 ),

so that Δ = ξᵀAξ − k(W_{xx} + W_{yy}) + k², with ξ := (W_{xx}, W_{yy}, W_{xy})ᵀ, and denote by Σ the variance matrix of (W_{xx}, W_{yy}, W_{xy}):

    Σ = ( λ40  λ22  λ31
          λ22  λ04  λ13
          λ31  λ13  λ22 ).
Let Δ1, Δ2, Δ3 be the eigenvalues of Σ^{1/2}AΣ^{1/2} and P an orthogonal matrix such that
Pᵀ diag(Δ1, Δ2, Δ3) P = Σ^{1/2}AΣ^{1/2}. Then

    h(t) = e^{itk²} E( exp[ it( (Δ1Z1² − k(s11 + s21)Z1) + (Δ2Z2² − k(s12 + s22)Z2)
           + (Δ3Z3² − k(s13 + s23)Z3) ) ] ),        (31)

where (Z1, Z2, Z3) is standard normal and the s_{ij} are the entries of Σ^{1/2}Pᵀ.
One can check that if ξ is a standard normal variable and Δ, μ are real constants, Δ > 0,
then

    E[ exp( it(Δξ² + μξ) ) ] = (1/(1 + 4Δ²t²)^{1/4}) exp( −t²μ²/(2(1 + 4Δ²t²)) )
                               × exp( i[ φ − t³μ²Δ/(1 + 4Δ²t²) ] ),

where φ = ½ arctan(2Δt), 0 < φ < π/4. Substituting this into (31), we obtain
    Re[h(t)] = [ Π_{j=1}^{3} d_j(t, k)/(1 + 4Δj²t²)^{1/4} ] cos( Σ_{j=1}^{3} (φ_j(t) + k²tψ_j(t)) ),        (32)

where, for j = 1, 2, 3:

    d_j(t, k) = exp( −(k²t²/2) (s_{1j} + s_{2j})²/(1 + 4Δj²t²) );
    φ_j(t) = ½ arctan(2Δj t);
    ψ_j(t) = 1/3 − (s_{1j} + s_{2j})² Δj t²/(1 + 4Δj²t²).
Introducing these expressions into (30) and using (28), we obtain a new formula which
has the form of a rather complicated integral. However, it is well adapted to numerical
evaluation. On the other hand, this formula allows us to compute the equivalent, as
k → 0, of the expectation of the total number of specular points under the Longuet-Higgins approximation. In fact, a first-order expansion of the terms in the integrand
gives a somewhat more accurate result, which we now state as a theorem.
Theorem 3. As k → 0,

    E(SP2(R²)) = m2/k² + O(1),        (33)

where

    m2 = (2/π) ∫_0^{+∞} [ 1 − Π_{j=1}^{3} (1 + 4Δj²t²)^{−1/4} cos( Σ_{j=1}^{3} φ_j(t) ) ] t^{−2} dt
       = (2/π) ∫_0^{+∞} [ 1 − 2^{−3/2} ( Π_{j=1}^{3} A_j(1 + A_j) )^{1/2} (1 − B1B2 − B2B3 − B3B1) ] t^{−2} dt,        (34)
Figure 2. Intensity function of the specular points for the Jonswap spectrum.
with A_j = A_j(t) = (1 + 4Δj²t²)^{−1/2} and

    B_j = B_j(t) = √((1 − A_j)/(1 + A_j)).

For the Jonswap spectrum, the numerical values of the matrices of spectral moments are

    ( λ20  λ11 ) = 10⁻⁴ ( 114   0 ),        Σ = 10⁻⁴ ( 9   3   0
      λ11  λ02          (   0  81 )                    3  11   0
                                                       0   0   3 ).
The integrand in (27) is displayed in Figure 2 as a function of the two space variables
x, y. The value of the asymptotic parameter m2 is 2.527 × 10⁻³.
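As an illustration of the pipeline (29)–(32), the following sketch evaluates E|det Y′| at k = 0 through the characteristic function h(t) and compares it with a Monte Carlo estimate, using the fourth-order moment matrix quoted above (rescaled by 10⁴; the integration parameters are ad hoc):

```python
import numpy as np

Sigma = np.array([[9.0, 3.0, 0.0], [3.0, 11.0, 0.0], [0.0, 0.0, 3.0]])  # 10^4 * Sigma
A = np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, -1.0]])

# Delta_j: eigenvalues of Sigma^{1/2} A Sigma^{1/2} (same as those of A @ Sigma)
d = np.linalg.eigvals(A @ Sigma).real

t = np.linspace(1e-4, 500.0, 1_000_001)
h = np.ones_like(t, dtype=complex)
for dj in d:
    h *= (1.0 - 2.0j * dj * t) ** -0.5           # characteristic function at k = 0
integral = np.trapz((1.0 - h.real) / t**2, t) + 1.0 / 500.0  # add the 1/t^2 tail
e_det = (2.0 / np.pi) * integral                  # E|Wxx*Wyy - Wxy^2| for the rescaled field

rng = np.random.default_rng(0)
v = rng.multivariate_normal(np.zeros(3), Sigma, size=400_000)
mc = np.mean(np.abs(v[:, 0] * v[:, 1] - v[:, 2] ** 2))
assert abs(e_det - mc) / mc < 0.05
```

Note that tr(AΣ) = 0 here, so the three Δ_j sum to zero; the two estimates agree to within Monte Carlo error.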
We now consider the variance of the total number of specular points in two dimensions,
looking for results analogous to those of the one-dimensional case (i.e., Theorem 1), in view of
their interest for statistical applications. It turns out that the computations become
much more complicated. The statements on the variance and on the speed of convergence to zero of
the coefficient of variation that we give below include only the order of the asymptotic
behavior in the Longuet-Higgins approximation, but not the constant. However, we still
consider them to be useful. If one refines the computations, rough bounds can be given on
the generic constants in Theorem 4 on the basis of additional hypotheses on the random
field.
We assume that the real-valued, centered, Gaussian stationary random field {W(x) : x ∈
R²} has paths of class C³ and that the distribution of W′(0) does not degenerate (i.e., Var(W′(0))
is invertible). Moreover, let us consider W″(0), expressed in the reference system xOy
of R² as the 2 × 2 symmetric centered Gaussian random matrix

    W″(0) = ( W_{xx}(0)  W_{xy}(0)
              W_{xy}(0)  W_{yy}(0) ).        (35)
Theorem 4. Let us assume that {W(x) : x ∈ R²} satisfies the above conditions and that
it is also δ-dependent, δ > 0, that is, E(W(x)W(y)) = 0 whenever ‖x − y‖ > δ. Then, for
k small enough,

    Var(SP2(R²)) ≤ L/k²,        (36)

where L is a positive constant depending on the law of the random field.

Proof. Write T := SP2(R²) and use the decomposition

    Var(T) = E(T(T − 1)) + E(T) − [E(T)]².        (37)
We have already computed the equivalents, as k → 0, of the second and third terms on
the right-hand side of (37). Our task in what follows is to consider the first term.
The proof is performed along the same lines as the one of Theorem 1 but, instead of
applying a Rice formula for the second factorial moment of the number of crossings of a
one-parameter random process, we need [3], Theorem 6.3, for the factorial moments of a
2-parameter random field. We have

    E(T(T − 1)) = ∫∫_{R²×R²} E(|det Y′(x)||det Y′(y)| | Y(x) = Y(y) = 0) p_{Y(x),Y(y)}(0, 0) dx dy
                = ∫∫_{‖x−y‖>δ} ⋯ dx dy + ∫∫_{‖x−y‖≤δ} ⋯ dx dy = J1 + J2.

For J1, we proceed as in the proof of Theorem 1, using the δ-dependence and the
evaluations leading to the statement of Theorem 3. We obtain

    J1 = m2²/k⁴ + O(1)/k².        (38)
One can show that, under the hypotheses of the theorem, for small k, one has

    J2 = O(1)/k².        (39)

We refer the reader to [2] for the lengthy computations leading to this inequality. In view
of (37), (33) and (38), this suffices to prove the theorem. □
A possible statistical application is the following. Suppose that we observe, for fixed y and large T, the time average

    (1/T) ∫_0^T N^{W(·,y,t)}_{[0,M1]}(u) dt,        (40)

where N^{W(·,y,t)}_{[0,M1]}(u) denotes the number of crossings of the level u by the function
x ↦ W(x, y, t), x ∈ [0, M1]. If the ergodicity assumption in time holds true, then we can conclude that, a.s.,

    (1/T) ∫_0^T N^{W(·,y,t)}_{[0,M1]}(u) dt → E( N^{W(·,0,0)}_{[0,M1]}(u) ) = (M1/π) √(λ200/λ000) e^{−u²/(2λ000)},

where

    λ_{abc} = ∫_{R³} λ_x^a λ_y^b λ_t^c dμ(λ_x, λ_y, λ_t)

are the spectral moments of W. Hence, on the basis of the quantity (40), for large T,
one can make inference about the value of certain parameters of the law of the random
field. In this example, these are the spectral moments λ200 and λ000.
If two-dimensional level information is available, one can work differently because there
exists an interesting relationship with Rice formulae for level curves that we explain in
what follows. We can write (x = (x, y))

    ∇W(x, t) = ‖∇W(x, t)‖ (cos Θ(x, t), sin Θ(x, t))ᵀ.

Using a Rice formula, more precisely, under the conditions of [3], Theorem 6.10,

    E( ∫_0^{M2} N^{W(·,y,0)}_{[0,M1]}(u) dy ) = E( ∫_{C_Q(0,u)} (|∂_x W|/‖∇W‖) dσ1 )
                                              = (M1M2/π) √(λ200/λ000) e^{−u²/(2λ000)},        (41)

where Q = [0, M1] × [0, M2] and C_Q(0, u) := {x ∈ Q : W(x, 0) = u} is the level curve. We have a similar formula when we consider sections of the
set [0, M1] × [0, M2] in the other direction. In fact, (41) can be generalized to obtain the
Palm distribution of the angle Θ.
Set h_{θ1,θ2} := 1_{[θ1,θ2]} and, for −π ≤ θ1 < θ2 ≤ π, define

    F(θ2) − F(θ1) := E( ∫_{C_Q(u,0)} h_{θ1,θ2}(Θ(x, 0)) dσ1(x) )        (42)
                   = σ2(Q) E[ h_{θ1,θ2}(Θ) ((∂_x W)² + (∂_y W)²)^{1/2} ] e^{−u²/(2λ000)}/√(2πλ000),

where the derivatives are taken at (0, 0, 0) and Θ = arctan(∂_y W/∂_x W) is the angle of the gradient.
Defining Δ := λ200λ020 − λ110² and assuming σ2(Q) = 1 for ease of notation, we readily
obtain

    F(θ2) − F(θ1) = (const) ∫∫_{R²} h_{θ1,θ2}(θ(x̄, ȳ)) √(x̄² + ȳ²)
                    × e^{−(1/(2Δ))(λ020x̄² − 2λ110x̄ȳ + λ200ȳ²)} dx̄ dȳ
                  = (const) ∫_{θ1}^{θ2} dθ ∫_0^{+∞} ρ² exp( −(ρ²/(2Δ)) (Λ+ cos²(θ − ω) + Λ− sin²(θ − ω)) ) dρ,

where Λ± are the eigenvalues of the covariance matrix of the random vector
(∂_x W(0, 0, 0), ∂_y W(0, 0, 0)) and ω is the angle of the eigenvector associated with Λ−.
Noting that the exponent in the integrand can be written as −(ρ²Λ+/(2Δ))(1 − γ² sin²(θ − ω))
with γ² := 1 − Λ−/Λ+, and that

    ∫_0^{+∞} ρ² exp( −H²ρ²/2 ) dρ = √(2π)/(2H³),

we get

    F(θ2) − F(θ1) = (const) ∫_{θ1}^{θ2} (1 − γ² sin²(θ − ω))^{−3/2} dθ.
From this relation, we get the density g(θ) of the Palm distribution simply by dividing
by the total mass:

    g(θ) = (1 − γ² sin²(θ − ω))^{−3/2} / ∫_{−π}^{π} (1 − γ² sin²(φ − ω))^{−3/2} dφ.        (43)

The normalizing integral can be expressed by means of complete elliptic integrals. This density characterizes
the distribution of the angle of the normal at a point chosen at random on the level
curve. In the case of a random field which is isotropic in (x, y), we have λ200 = λ020
and, moreover, λ110 = 0, so that g turns out to be the uniform density over the circle
(Longuet-Higgins states that over the contour the distribution of the angle is uniform
(cf. [11], page 348)). We have performed the numerical computation of the density (43)
for an anisotropic process with γ = 0.5, ω = π/4. Figure 3 displays the density of the
Palm distribution of the angle, showing a large departure from the uniform distribution.
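The density (43) can be checked numerically: it integrates to one and reduces to the uniform density in the isotropic case γ = 0. A sketch (the exponent −3/2 is the one produced by the ρ² factor in the radial integral above):

```python
import numpy as np

def palm_density(theta, gamma2, omega):
    """Palm density of the normal angle, formula (43), normalized numerically."""
    th = np.linspace(-np.pi, np.pi, 200001)
    f = lambda s: (1.0 - gamma2 * np.sin(s - omega) ** 2) ** -1.5
    return f(theta) / np.trapz(f(th), th)

th = np.linspace(-np.pi, np.pi, 200001)
g = palm_density(th, gamma2=0.25, omega=np.pi / 4)  # gamma = 0.5, omega = pi/4 as in Figure 3
assert abs(np.trapz(g, th) - 1.0) < 1e-6
g0 = palm_density(th, gamma2=0.0, omega=0.0)        # isotropic case: uniform over the circle
assert np.allclose(g0, 1.0 / (2.0 * np.pi))
```

For γ = 0.5 the density peaks in the directions θ − ω = ±π/2, reproducing the departure from uniformity seen in Figure 3.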
Let us turn to ergodicity. For a given subset Q of R² and each t, let us define A_t :=
σ{W(x, y, s) : s ≥ t, (x, y) ∈ Q} and consider the σ-algebra of t-invariant events A := ∩_t A_t.
We assume that, for each pair (x, y), the covariance Γ(x, y, t) → 0 as t → +∞. It is well known that under
this condition, the σ-algebra A is trivial, that is, it only contains events having probability
zero or one (see, e.g., [6], Chapter 7). This has the following important consequence in
our context. Assume that the set Q has a smooth boundary and, for simplicity, unit
Lebesgue measure. Let us consider

    Z(t) = ∫_{C_Q(u,t)} H(x, t) dσ1(x)        (44)

with H(x, t) = H(W(x, t), ∇W(x, t)), where ∇W = (W_x, W_y) denotes the gradient in the
space variables and H is some measurable function such that the integral is well defined.
This is exactly our case in (42). The process {Z(t) : t ∈ R} is strictly stationary and, a.s.,
Figure 3. Density of the Palm distribution of the angle of the normal to the level curve in the
case γ = 0.5 and ω = π/4.
    (1/T) ∫_0^T Z(s) ds → E^B[Z(0)],

where B is the σ-algebra of t-invariant events associated with the process Z(t). Since, for
each t, Z(t) is A_t-measurable, it follows that B ⊂ A, so that E^B[Z(0)] = E[Z(0)]. On the
other hand, the Rice formula yields (taking into account the fact that stationarity of W
implies that W(0, 0) and ∇W(0, 0) are independent)

    E[Z(0)] = E[ H(u, ∇W(0, 0)) ‖∇W(0, 0)‖ ] p_{W(0,0)}(u).
We consider now the central limit theorem. Let us define

    Z̄(t) := (1/t) ∫_0^t Z(s) ds.        (45)
To compute the variance of Z̄(t), one can again use the Rice formula for the first moment
of integrals over level sets, this time applied to the R²-valued random field, with parameter
in R⁴, {(W(x1, s1), W(x2, s2))ᵀ : (x1, x2) ∈ Q × Q, s1, s2 ∈ [0, t]}, at the level (u, u). We get

    Var(Z̄(t)) = (2/t) ∫_0^t (1 − s/t) I(u, s) ds,

where

    I(u, s) = ∫∫_{Q²} E[ H(u, ∇W(x1, 0))‖∇W(x1, 0)‖ H(u, ∇W(x2, s))‖∇W(x2, s)‖
              | W(x1, 0) = W(x2, s) = u ] p_{W(x1,0),W(x2,s)}(u, u) dx1 dx2
              − ( E[ H(u, ∇W(0, 0))‖∇W(0, 0)‖ ] p_{W(0,0)}(u) )².
Assuming that the given random field is time-τ-dependent, that is, Γ(x, y, t) = 0
whenever t > τ, we readily obtain

    t Var(Z̄(t)) → 2 ∫_0^τ I(u, s) ds =: σ²(u)   as t → ∞.        (46)
Now, using a variant of the Hoeffding–Robbins theorem [8] for sums of τ-dependent
random variables, we can establish the following theorem.

Theorem 5. Assume that the random field W and the function H satisfy the conditions
of [3], Theorem 6.10, and, for simplicity, that Q has Lebesgue measure equal to 1. Then:
(i) if the covariance Γ(x, y, t) tends to zero as t → +∞ for every value of (x, y) ∈ Q,
we have, a.s.,

    (1/T) ∫_0^T Z(s) ds → E[Z(0)];

(ii) if, moreover, the field is time-τ-dependent, then √T (Z̄(T) − E[Z(0)]) converges in
distribution, as T → +∞, to a centered Gaussian law with variance σ²(u) given by (46).
In what follows, we assume an isotropic Gaussian model. This means that we will
consider the wavefront as an isotropic Gaussian field, represented spectrally at t = 0 as

    ξ(x) + iη(x) = ∫_{R²} (Π(|k|)/|k|)^{1/2} e^{i⟨k,x⟩} dW(k),

where k = (k1, k2), |k| = √(k1² + k2²), Π(k) is the isotropic spectral density and W = W1 +
iW2 is a standard complex orthogonal Gaussian measure on R² with unit variance. We
are only interested in t = 0 and we put ξ(x) := ξ(x, 0) and η(x) := η(x, 0). We have,
setting k = |k|,

    ξ(x) = ∫_{R²} (Π(k)/k)^{1/2} cos(⟨k, x⟩) dW1(k) − ∫_{R²} (Π(k)/k)^{1/2} sin(⟨k, x⟩) dW2(k),
    η(x) = ∫_{R²} (Π(k)/k)^{1/2} cos(⟨k, x⟩) dW2(k) + ∫_{R²} (Π(k)/k)^{1/2} sin(⟨k, x⟩) dW1(k).

It follows that, for r = ‖x1 − x2‖,

    E[ξ(x1)ξ(x2)] = E[η(x1)η(x2)] = ∫_0^{+∞} J0(kr) Π(k) dk,        (49)

where J_ν(x) is the Bessel function of the first kind of order ν. Moreover, E[ξ(x1)η(x2)] = 0.
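The appearance of J0 in (49) rests on the classical identity J0(z) = (1/2π)∫_0^{2π} cos(z cos φ) dφ, which averages the plane wave over the directions of k. A quick numerical check:

```python
import numpy as np
from scipy.special import j0
from scipy.integrate import quad

for z in (0.5, 2.0, 7.3):
    avg, _ = quad(lambda phi: np.cos(z * np.cos(phi)), 0.0, 2.0 * np.pi)
    assert abs(avg / (2.0 * np.pi) - j0(z)) < 1e-8
```

Integrating this identity against the radial spectral density Π(k) gives exactly the covariance in (49).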
5.2. Variance
Again, let S be a measurable subset of R² having Lebesgue measure equal to 1, and let
N_S^Z(0) be the number of dislocations in S, that is, the number of zeros in S of Z := (ξ, η). We have

    Var(N_S^Z(0)) = E( N_S^Z(0)(N_S^Z(0) − 1) ) + d2 − d2²,

where d2 := E(N_S^Z(0)), and for the first term, we use the Rice formula for the second factorial moment ([3],
Theorem 6.3), that is,

    E( N_S^Z(0)(N_S^Z(0) − 1) ) = ∫∫_{S×S} A(s1, s2) ds1 ds2,

where

    A(s1, s2) = E[ |det Z′(s1) det Z′(s2)| | Z(s1) = Z(s2) = 0₂ ] p_{Z(s1),Z(s2)}(0₄).

Here, 0_p denotes the null vector in dimension p.
Taking into account the fact that the law of the random field Z is invariant under
translations and orthogonal transformations of R², we have

    A(s1, s2) = A((0, 0), (r, 0)) =: A(r),   with r = ‖s1 − s2‖.
The Rice function A(r) has two intuitive interpretations. First, it can be viewed as

    A(r) = lim_{ε→0} (1/(π²ε⁴)) E[ N(B((0, 0), ε)) · N(B((r, 0), ε)) ].        (50)

Second, it is the conditional expectation of |det Z′(0, 0) det Z′(r, 0)| given dislocations at
both points, weighted by the density

    p_{Z(0,0),Z(r,0)}(0₄).

The density is easy to compute:

    p_{Z(0,0),Z(r,0)}(0₄) = 1/((2π)²(1 − ρ²(r))),

where ρ(r) := ∫_0^{+∞} J0(kr)Π(k) dk.
The conditional expectation turns out to be more difficult to calculate, requiring a long
computation (we again refer to [2] for the details). We obtain the following formula (which
can be easily compared with the formula in [4], since we are using the same notation):
    A(r) = (A1/(4π³(1 − C²))) ∫ [ 1 − (Z2 − 2Z1²t²)/((1 + t²)√(Z2(Z2 − Z1²t²))) ] (dt/t²),

where

    C := ρ(r),   E := ρ′(r),   F := ρ″(r),   F0 := ρ″(0),   H := E/r,   Z2 = (1 + t²)/(1 + Zt²),

and the remaining auxiliary quantities A1, A2, Z and Z1 are explicit algebraic expressions
in C, E, F, F0 and H; their complete definitions, too long to be reproduced here, are given
in [2].
Acknowledgement
This work has received financial support from the European Marie Curie Network
SEAMOCS.
References
[1] Azaïs, J.-M., León, J. and Ortega, J. (2005). Geometrical characteristics of Gaussian sea
waves. J. Appl. Probab. 42 119. MR2145485
[2] Azaïs, J.-M., León, J. and Wschebor, M. (2009). Some applications of Rice formulas to
waves. Available at arXiv:0910.0763v1 [math.PR].
[3] Azaïs, J.-M. and Wschebor, M. (2009). Level Sets and Extrema of Random Processes and
Fields. Hoboken, NJ: Wiley. MR2478201
[4] Berry, M.V. and Dennis, M.R. (2000). Phase singularities in isotropic random waves. Proc.
R. Soc. Lond. Ser. A 456 2059–2079. MR1794716
[5] Cabaña, E. (1985). Esperanzas de integrales sobre conjuntos de nivel aleatorios. In Actas
del 2º Congreso Latinoamericano de Probabilidad y Estadística Matemática (Spanish)
65–82. Caracas, Venezuela: Regional Latinoamericana de la Soc. Bernoulli.
[6] Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. New
York: Wiley. MR0217860
[7] Cuzick, J.A. (1976). A central limit theorem for the number of zeros of a stationary Gaussian
process. Ann. Probab. 4 547–556. MR0420809
[8] Hoeffding, W. and Robbins, H. (1948). The central limit theorem for dependent random
variables. Duke Math. J. 15 773–780. MR0026771
[9] Kratz, M. and León, J.R. (2009). Level curves crossings and applications for Gaussian
models. Extremes. DOI: 10.1007/s10687-009-0090-x.
[10] Longuet-Higgins, M.S. (1960). Reflection and refraction at a random surface, I, II, III. J.
Optical Soc. Amer. 50 838–856. MR0113489
[11] Longuet-Higgins, M.S. (1962). The statistical geometry of random surfaces. In Proc. Symp.
Appl. Math. Vol. XIII 105–143. Providence, RI: Amer. Math. Soc. MR0140175
[12] Piterbarg, V. and Rychlik, I. (1999). Central limit theorem for wave functionals of Gaussian
processes. Adv. in Appl. Probab. 31 158–177. MR1699666
José R. León
Mario Wschebor
October 5, 2009
Abstract
We use Rice's formulas in order to compute the moments of some
level functionals which are linked to problems in oceanography and optics. For instance, we consider the number of specular points in one or
two dimensions, the number of twinkles, the distribution of the normal angle
of level curves and the number or the length of dislocations in random
wavefronts. We compute expectations and, in some cases, also second moments of such functionals. Moments of order greater than one are more
involved, but one needs them whenever one wants to perform statistical
inference on some parameters in the model or to test the model itself.
In some cases we are able to use these computations to obtain a Central
Limit Theorem.
Introduction
The condition for specularity at a point (x, W(x)) of the random curve reads

    W′(x) = (α2 r1 − α1 r2)/(x(r2 − r1)),        (1)

where α_i := h_i − W(x) and r_i := √(x² + α_i²), i = 1, 2, h1 denoting the height of the
light source and h2 that of the observer.
The points (x, W(x)) of the curve such that x is a solution of (1) are called
specular points. We denote by SP1(A) the number of specular points
such that x ∈ A, for each Borel subset A of the real line. One of our aims
in this paper is to study the probability distribution of SP1(A).
The following approximation, which turns out to be very accurate in practice for ocean waves, was introduced long ago by Longuet-Higgins (see [13]
and [14]): suppose that h1 and h2 are big with respect to W(x) and x; then r_i =
α_i + x²/(2α_i) + O(h_i⁻³), and (1) can be approximated by

    W′(x) ≈ (x/2)(α1 + α2)/(α1α2) ≈ (x/2)(h1 + h2)/(h1h2) = kx,        (2)

where

    k := (1/2)(1/h1 + 1/h2).
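The quality of the expansion r_i = α_i + x²/(2α_i) + O(h_i⁻³) is easy to quantify numerically; in the sketch below α_i is simply taken equal to h_i, i.e., W(x) is neglected against the heights:

```python
import numpy as np

for h, x in ((100.0, 1.0), (300.0, 2.0)):
    alpha = h                        # alpha_i ~ h_i when h_i >> W(x)
    r = np.hypot(x, alpha)           # exact: sqrt(x^2 + alpha_i^2)
    err = abs(r - (alpha + x**2 / (2.0 * alpha)))
    assert err < x**4 / h**3         # next term in the expansion is -x^4/(8 alpha_i^3)
```

For h = 100 and x = 1 the error is of order 10⁻⁷, which explains the accuracy of the Longuet-Higgins approximation for ocean-wave geometries.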
Denote by Y(x) := W′(x) − kx and by SP2(A) the number of roots of Y(x)
belonging to the set A, an approximation of SP1(A) under this asymptotic.
The first part of Section 3 below will be devoted to obtaining some results
on the distribution of the random variable SP2(R).
Consider now the same problem as above, but adding a time variable t,
that is, W becomes a random function parameterized by the pair (x, t).
We denote by W_x, W_t, W_{xt}, . . . the partial derivatives of W.
We use the Longuet-Higgins approximation (2), so that the approximate
specular points at time t are the points (x, W(x, t)) where

    W_x(x, t) = kx.

Generally speaking, this equation defines a finite number of points which
move with time. The implicit function theorem, when it can be applied,
shows that the x-coordinate of a specular point moves at speed

    dx/dt = −W_{xt}/(W_{xx} − k).

The right-hand side diverges whenever W_{xx} − k = 0, in which case a flash
appears and the point is called a twinkle. We are interested in the
(random) number of flashes lying in a set A of space and in an interval
[0, T] of time. If we put

    Y(x, t) := (W_x(x, t) − kx, W_{xx}(x, t) − k)ᵀ,        (3)
We will also need formulas for the expectation of integrals over level sets, of the form

    E[ ∫_{C_Q^W(u)} Y(s) dσ_{d−1}(s) ],

where C_Q^W(u) := {s ∈ Q : W(s) = u} and Y(s) is some random field defined on the level set. Cabaña [7],
Wschebor [19] (d = 1), Azaïs and Wschebor [4] and, in a weak form,
Zähle [20] have studied these types of formulas. See Theorems 5 and 6.
Another interesting problem is the study of phase singularities (dislocations) of random wavefronts. They correspond to lines of darkness in light
propagation, or threads of silence in sound [6]. In a mathematical framework, they can be defined as the loci of points where the amplitude of the wave
vanishes. If we represent the wave as

    W(x, t) = ξ(x, t) + iη(x, t),   where x ∈ Rᵈ,

where ξ, η are independent homogeneous Gaussian random fields, the dislocations are the intersection of the two random surfaces ξ(x, t) = 0, η(x, t) =
0. We consider a fixed time, for instance t = 0. In the case d = 2, we will
study the expectation of the following random variable:

    #{x ∈ S : ξ(x, 0) = η(x, 0) = 0}.

In the case d = 3, one important quantity is the length of the level curve

    L{x ∈ S : ξ(x, 0) = η(x, 0) = 0}.
All these situations are related to integral geometry. For a general treatment
of the basic theory, the classical reference is Federer's Geometric Measure Theory [9].
The aims of this paper are: 1) to re-formulate some known results in a
modern language or in the standard form of probability theory; 2) to prove
new results, such as computations in the exact models and variance computations
in cases in which only first moments had been known, thus improving the
statistical methods; and 3) in some cases, to obtain Central Limit Theorems.
The structure of the paper is the following. In Section 2 we review without
proofs some formulas for the moments of the relevant random variables. In
Section 3 we study the expectation, variance and asymptotic behavior of specular
points. Section 4 is devoted to the study of the distribution of the normal to the
level curve. Section 5 presents three numerical applications. Finally, in Section
6 we study dislocations of wavefronts following the paper by Berry and Dennis [6].
Rice formulas
We give here a quick account of Rice formulas, which allow one to express the
expectation and the higher moments of the size of level sets of random fields
by means of integral formulas. The simplest case occurs when both the
dimension of the domain and that of the range are equal to 1, for which the first results
date back to Rice [17] (see also Cramér and Leadbetter's book [8]). When
the dimension of the domain and the range are equal but bigger than 1, the
formula for the expectation is due to Adler [1] for stationary random fields. For
a general treatment of this subject, the interested reader is referred to the book
[4], Chapters 3 and 6, where one can find proofs and details.
Theorem 1 (Expectation of the number of crossings, d = d′ = 1) Let W =
{W(t) : t ∈ I}, I an interval in the real line, be a Gaussian process having C¹-paths. Assume that Var(W′(t)) ≠ 0 for every t ∈ I.
Then:

    E( N_I^W(u) ) = ∫_I E( |W′(t)| | W(t) = u ) p_{W(t)}(u) dt.        (4)
A similar formula holds, under appropriate conditions (see [4], Chapter 3), for the m-th factorial moment of the number of crossings:

    E[ N_I^W(u)(N_I^W(u) − 1) ⋯ (N_I^W(u) − m + 1) ]
      = ∫_{I^m} E( Π_{j=1}^{m} |W′(t_j)| | W(t1) = ⋯ = W(t_m) = u )
        × p_{W(t1),...,W(t_m)}(u, . . . , u) dt1 ⋯ dt_m.        (5)
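Theorem 1 can be sanity-checked on the simplest stationary process W(t) = ξ1 cos t + ξ2 sin t, with ξ1, ξ2 independent standard normals: then λ0 = λ2 = 1 and (4) predicts E N^W_{[0,2π)}(0) = 2; in fact W = R cos(t − θ) has exactly two zeros per period, almost surely. A sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0 * np.pi, 100000, endpoint=False)
rice_prediction = (2.0 * np.pi / np.pi) * np.sqrt(1.0 / 1.0) * np.exp(0.0)  # = 2
for _ in range(20):
    xi1, xi2 = rng.standard_normal(2)
    w = xi1 * np.cos(t) + xi2 * np.sin(t)
    s = np.sign(w)
    n = int(np.count_nonzero(s != np.roll(s, -1)))  # sign changes around the circle
    assert n == int(rice_prediction) == 2
```

Here the number of crossings is deterministic, so the Rice formula is verified without any Monte Carlo averaging.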
Proposition 1 Under the same conditions as in the above theorem, one has

    P({∃t ∈ A : W(t) = u, det(W′(t)) = 0}) = 0

if p_{W(t)}(x) ≤ C for all x in some neighborhood V(u) of u, and at least one of the two following conditions is satisfied:
a) the trajectories of W are twice continuously differentiable;
b) α(δ) := sup_{x∈V(u)} P{ |det W′(t)| < δ | W(t) = x } → 0 as δ → 0.

For level sets of positive dimension d − d′, W : Rᵈ → R^{d′} of class C¹, one has, under analogous non-degeneracy conditions,

    E[ σ_{d−d′}( B ∩ W^{−1}(u) ) ] = ∫_B E[ ( det(W′(t)W′(t)ᵀ) )^{1/2} | W(t) = u ] p_{W(t)}(u) dt        (7)

and, for a random field Y defined on the level set,

    E[ ∫_{B∩W^{−1}(u)} Y(t) dσ_{d−d′}(t) ]
      = ∫_B E[ Y(t) ( det(W′(t)W′(t)ᵀ) )^{1/2} | W(t) = u ] p_{W(t)}(u) dt.        (8)
3.1
Number of roots
Let W = {W(x) : x ∈ R} be Gaussian, centered and stationary, with spectral measure μ and spectral moments

    λi = ∫ λ^i dμ(λ),   i = 0, 2, 4, . . . .

Then the Rice formula (4) gives, for the mean number of roots of W(x) = u on an interval I,

    E( N_I^W(u) ) = (|I|/π) √(λ2/λ0) e^{−u²/(2λ0)}.        (9)
3.2
We consider first the one-dimensional static case, with the Longuet-Higgins approximation (2) for the number of specular points, that is,

    SP2(I) = #{x ∈ I : Y(x) = W′(x) − kx = 0},

so that

    E(SP2(I)) = ∫_I E( |W″(x) − k| | W′(x) = kx ) p_{W′(x)}(kx) dx
              = ∫_I G(k, σ(x)) (1/v) φ(kx/v) dx,        (10)

where v² := Var(W′(x)) (assumed constant), σ²(x) is the variance of W″(x) and G(μ, σ) := E(|Z|), Z with distribution
N(μ, σ²).
For the second equality in (10), in which we have erased the condition in the
conditional expectation, take into account that, since Var(W′(x)) is constant,
for each x the random variables W′(x) and W″(x) are independent (differentiate under the expectation sign and use the basic properties of the Gaussian
distribution).
An elementary computation gives:

    G(μ, σ) = μ[2Φ(μ/σ) − 1] + 2σφ(μ/σ),

where φ(·) and Φ(·) are, respectively, the density and the cumulative distribution
functions of the standard Gaussian distribution.
When the process W(x) is also stationary, v² = λ2 and σ²(x) is constant and
equal to λ4. If we look at the total number of specular points over the whole
line, we get

    E(SP2(R)) = G(k, √λ4)/k,        (11)

which is the result given by [14] (part II, formula (2.14), page 846). Note that
this quantity is an increasing function of λ4.
Since in the Longuet-Higgins approximation k ≈ 0, one can write a Taylor
expansion having the form:

    E(SP2(R)) ≈ √(2λ4/π) (1/k) [ 1 + k²/(2λ4) − k⁴/(24λ4²) + ⋯ ].        (12)
Let us turn to the variance of the number of specular points, under some
additional restrictions. First of all, we assume for this computation that the
process W is stationary, with λ0 = 1 and covariance function ρ(z) := E(W(x)W(x + z)). Write S := SP2(R). We use

    Var(S) = E(S(S − 1)) + E(S) − [E(S)]²        (13)

and the Rice formula for the second factorial moment,

    E(S(S − 1)) = ∫∫_{R²} E( |W″(x) − k||W″(y) − k| | W′(x) = kx, W′(y) = ky )
                  × p_{W′(x),W′(y)}(kx, ky) dx dy,        (14)

where

    p_{W′(x),W′(y)}(kx, ky) = (1/(2π√(λ2² − ρ″²(x − y))))
        × exp{ −(k²/2)(λ2x² + 2ρ″(x − y)xy + λ2y²)/(λ2² − ρ″²(x − y)) },        (15)

under the additional condition that the density (15) does not degenerate for
x ≠ y.
For the conditional expectation in (14), we perform a Gaussian regression of
W″(x) (resp., W″(y)) on the pair (W′(x), W′(y)). Putting z = x − y, we obtain:

    W″(x) = θ_y(x) + a_y(x)W′(x) + b_y(x)W′(y),
    a_y(x) = −ρ″(z)ρ‴(z)/(λ2² − ρ″²(z)),   b_y(x) = −λ2ρ‴(z)/(λ2² − ρ″²(z)),        (16)

where θ_y(x) is Gaussian centered, independent of (W′(x), W′(y)). The regression of W″(y) is obtained by permuting x and y.
The conditional expectation in (14) can now be rewritten as an unconditional
expectation:

    E[ |θ_y(x) − k(1 + ρ‴(z)(ρ″(z)x + λ2y)/(λ2² − ρ″²(z)))|
       × |θ_x(y) − k(1 − ρ‴(z)(ρ″(z)y + λ2x)/(λ2² − ρ″²(z)))| ].        (17)
Notice that the singularity on the diagonal x = y is removable, since a Taylor
expansion shows that, for z → 0:

    1 + ρ‴(z)(ρ″(z)x + λ2y)/(λ2² − ρ″²(z)) = (λ4/(2λ2)) xz + O(z²).        (18)
Moreover, with σ²(z) := Var(θ_y(x)) = Var(θ_x(y)),

    σ²(z) = λ4 − λ2ρ‴²(z)/(λ2² − ρ″²(z))        (19)

and

    E( θ_y(x)θ_x(y) ) = ρ⁽⁴⁾(z) + ρ″(z)ρ‴²(z)/(λ2² − ρ″²(z)).        (20)
In the sequel we use the function H(ρ; μ, ν) := E(|ξ + μ||η + ν|), where (ξ, η) is a centered Gaussian pair with Var(ξ) = Var(η) = 1 and E(ξη) = ρ. One has H(ρ; μ, ν) = H(ρ; 0, 0) + R2(ρ; μ, ν), with

    H(ρ; 0, 0) = (2/π)[ √(1 − ρ²) + ρ arctan(ρ/√(1 − ρ²)) ]

and

    |R2(ρ; μ, ν)| ≤ 3(μ² + ν²)

if μ² + ν² ≤ 1 and 0 ≤ ρ ≤ 1.
In the next theorem we compute an equivalent for the variance of the number
of specular points, under certain hypotheses on the random process and in
the Longuet-Higgins asymptotic. This result is new and useful for estimation
purposes since it implies that, as k → 0, the coefficient of variation of the
random variable S tends to zero at a known speed. Moreover, it will also appear
in a natural way when normalizing S to obtain a Central Limit Theorem.
Theorem 7 Assume that the centered Gaussian stationary process W = {W(x) :
x ∈ R} is δ-dependent, that is, ρ(z) = 0 if |z| > δ, and that it has C⁴-paths.
Then, as k → 0, we have:

    Var(S) = θ/k + O(1),        (21)

where

    θ = J/√(2π) + √(2λ4/π) − 2δλ4/(π^{3/2}√λ2),        (22)

    J = ∫_{−δ}^{+δ} σ²(z)H(μ(z); 0, 0)/√(2(λ2 + ρ″(z))) dz,        (23)

the functions H and σ²(z) have already been defined above, and

    μ(z) := (1/σ²(z)) [ ρ⁽⁴⁾(z) + ρ″(z)ρ‴²(z)/(λ2² − ρ″²(z)) ].

Equivalently,

    θ = √(2λ4/π) + ∫_{−δ}^{+δ} [ σ²(z)H(μ(z); 0, 0)/(2√(π(λ2 + ρ″(z)))) − λ4/(π^{3/2}√λ2) ] dz.
The proof of this extension can be performed following the same lines as
the one we give below, with some additional computations.
Proof of the Theorem: We use the notations and computations preceding
the statement of the theorem.
Divide the integral on the right-hand side of (14) into two parts, according as
|x − y| > δ or |x − y| ≤ δ, i.e.

E(S(S−1)) = ∫∫_{|x−y|>δ} ⋯ + ∫∫_{|x−y|≤δ} ⋯ = I₁ + I₂.   (24)
In the first term, the δ-dependence of the process implies that one can
factorize the conditional expectation and the density in the integrand. Taking
into account that for each x ∈ ℝ, the random variables W′(x) and W″(x) are
independent, we obtain for I₁:

I₁ = ∫∫_{|x−y|>δ} E(|W″(x) − k|) E(|W″(y) − k|) p_{W′(x)}(kx) p_{W′(y)}(ky) dx dy.

On the other hand, we know that W′(x) (resp. W″(x)) is centered normal with
variance λ₂ (resp. λ₄). Hence:

I₁ = G(k, √λ₄)² ∫∫_{|x−y|>δ} (1/(2πλ₂)) exp( −k²(x² + y²)/(2λ₂) ) dx dy,
To compute the integral on the right-hand side, notice that the integral over
the whole x, y plane is equal to 1/k², so that it suffices to compute the integral
over the set |x − y| ≤ δ. Changing variables, this last one is equal to

∫_{−∞}^{+∞} dx ∫_{x−δ}^{x+δ} (1/(2πλ₂)) exp( −k²(x² + y²)/(2λ₂) ) dy
= (1/(2πk²)) ∫_{−∞}^{+∞} e^{−u²/2} du ∫_{u−kδ/√λ₂}^{u+kδ/√λ₂} e^{−v²/2} dv
= δ/(√(πλ₂) k) + O(1),

where the last term is bounded if kδ is bounded (in fact, remember that we are
considering an approximation in which kδ → 0). So, we can conclude that:

∫∫_{|x−y|>δ} (1/(2πλ₂)) exp( −k²(x² + y²)/(2λ₂) ) dx dy = 1/k² − δ/(√(πλ₂) k) + O(1),

hence

I₁ = (2λ₄/π) [ 1/k² − δ/(√(πλ₂) k) ] + O(1).   (25)
I₂ = O(1) + 2 ∫∫_{|x−y|≤δ, x>y} E( |θ_y(x)+a_y(x)−k| |θ_x(y)+a_x(y)−k| )
  × (1/(2π√(λ₂² − ρ²(x−y)))) exp( −k²(λ₂x² + 2ρ(x−y)xy + λ₂y²)/(2(λ₂² − ρ²(x−y))) ) dx dy.

Setting z = x − y and completing the square in the exponent,

λ₂x² + 2ρ(z)xy + λ₂y² = 2(λ₂ + ρ(z))(x − z/2)² + ((λ₂ − ρ(z))/2) z²,

so that, integrating first in x by means of

∫_{−∞}^{+∞} exp( −k²(x − z/2)²/(λ₂ − ρ(z)) ) dx = √(π(λ₂ − ρ(z)))/k,

we obtain

I₂ = (1/k) ∫₀^{δ} σ²(z) H(ρ̄(z); 0, 0) / ( √(π(λ₂ + ρ(z))) ) exp( −k²z²/(4(λ₂ + ρ(z))) ) dz + O(1)   (26)
  = J/k + O(1),   (27)

since exp( −k²z²/(4(λ₂ + ρ(z))) ) = 1 + O(k²) uniformly on [0, δ].
To finish, put together (27) with (25), (24), (13) and (12).
Corollary 1 Under the conditions of Theorem 7, as k → 0:

√(Var(S)) / E(S) ∼ √( πθ/(2λ₄) ) · √k.

The proof follows immediately from the Theorem and the value of the expectation.
The computations made in this section are closely related to the results of
Theorem 4 in Kratz and León [12]. In that paper the random variable
SP₂(I) is expanded in the Wiener-Hermite chaos. This expansion yields the same
formula for the expectation and also provides a formula for the variance.
However, the expansion is difficult to manipulate in order to obtain the result
of Theorem 7.
Let us now turn to the Central Limit Theorem.

Theorem 8 Under the hypotheses of Theorem 7, as k → 0:

( S(k) − √(2λ₄/π) (1/k) ) / √(θ/k) ⇒ N(0, 1),

where S(k) denotes the number of specular points and θ is the constant of Theorem 7.

Proof: Choose 0 < β < 1, let the I_{jk}, |j| ≤ [k^{−β}], be consecutive intervals of
length δ covering [−[k^{−β}]δ, [k^{−β}]δ], and set

T(k) = Σ_{|j|≤[k^{−β}]} SP₂(I_{jk}).

Denote

V_k = Var(S(k))^{−1/2}.

We give the proof in two steps, which easily imply the statement. In the
first one, we prove that

V_k [ S(k) − T(k) ] = V_k [ Σ_{|j|>[k^{−β}]} SP₂(I_{jk}) + Z₁ + Z₂ ] → 0,   (28)

where

Z₁ = SP₂( (−∞, −[k^{−β}]δ + δ/2] ),
Z₂ = SP₂( [[k^{−β}]δ − δ/2, +∞) ).

Using the fact that E(SP₂(I)) ≤ (const) ∫_I exp( −k²x²/(2λ₂) ) dx for any interval I, we get

E( Σ_{|j|>[k^{−β}]} SP₂(I_{jk}) ) ≤ (const) ∫_{|x|≥[k^{−β}]δ} exp( −k²x²/(2λ₂) ) dx.   (29)
We already know that V_k² E( S(k) − T(k) ) → 0. Using the hypotheses of the
theorem, since each I_{jk} can be covered by a fixed number of intervals of size one,
we know that E( SP₂(I_{jk})(SP₂(I_{jk}) − 1) ) is bounded by a constant which does
not depend on k and j. We can write

V_k² Σ_{|j|<[k^{−β}]} Var( SP₂(I_{jk}) ) ≤ (const) k^{1−β},

which tends to zero because of the choice of β. The remaining two terms can
be bounded by calculations similar to those of the proof of Theorem 7.
Step 2.  T(k) is a sum of independent but not equi-distributed random
variables. To prove that it satisfies a Central Limit Theorem, we use a Lyapunov
condition based on fourth moments. Set:

M_j^m := E( ( SP₂(I_{jk}) − E(SP₂(I_{jk})) )^m ).

It suffices to prove that

(1/σ⁴) Σ_{|j|≤[k^{−β}]} M_j⁴ → 0 as k → 0,   (30)

where

σ² := Σ_{|j|≤[k^{−β}]} M_j².   (31)

Expanding M_j⁴ one obtains a sum of terms of the form E( S̄P_{i₁} S̄P_{i₂} S̄P_{i₃} S̄P_{i₄} ),
where S̄P_i stands for SP₂(I_i) − E(SP₂(I_i)). Since the size of all the intervals
is fixed and given the finiteness of fourth moments in the hypothesis, it
follows that each E( S̄P_{i₁} S̄P_{i₂} S̄P_{i₃} S̄P_{i₄} ) is bounded.
On the other hand, notice that the number of terms which do not vanish in this
sum is bounded: in fact, if one of the indices in (i₁, i₂, i₃, i₄) differs by more
than 1 from all the others, then E( S̄P_{i₁} S̄P_{i₂} S̄P_{i₃} S̄P_{i₄} ) vanishes. Hence
each M_j⁴ is bounded by a constant, so that Σ_{|j|≤[k^{−β}]} M_j⁴ = O(k^{−β}). Since
σ⁴ is of order k^{−2}, the inequality β < 2 implies the Lyapunov condition.
3.3
dx
a
(32)
m1
m1
(x, W (x))
(x, W (x))W (x),
x
w
m1
m1
(x, w))+
(x, w))m1 (x, w).
x
w
Using that for each x, W(x) and W′(x) are independent random variables
and performing a Gaussian regression of W′(x) on W(x), we can write (32) in
the form:
dx
a
E | 2 w K(x, w)|
1
1
m2 (x, w)
exp (w2 + 1
) dw.
2
2
2 2
(33)
1
2
4 22
2
dx
a
m2 (x, w)
1
) dw,
G(m, 1) exp (w2 + 1
2
2
(34)
where
m = m(x, w) =
2 w + K(x, w)
4 22
3.4
Number of twinkles
(35)
40 =
4 (d, d )
20 =
2 (d, d ),
where μ is the spectral measure of the stationary random field W(x, t). The
density in (35) satisfies
pY(x,t) (0) = (20 )1/2 kx(20 )1/2 (40 )1/2 k(40 )1/2 .
On the other hand
Y (x, t) =
Wxx (x, t) k
Wxxx (x, t)
Wxt (x, t)
Wxxt (x, t)
231
, for the first coordinate
40
240
, for the second coordinate
20
(36)
(37)
It follows that:
E | det(Y (x, t))| Y(x, t) = 0 = G
31
k,
40
22
40
231
.G
kx,
40
20
60
240
20
Summing up:
1
E T W(R, T ) =
T
1
40
=
40
k
40
setting 6 := 60
III page 853).
3.5
31
k,
40
240
20
22
31
k,
40
231
40
G
R
22
40
kx,
20
231 1
40 k
60
1
240
20
20
6 20 + 40
6
20 40
6 + 240
kx
20
(38)
Set rᵢ := √(x² + y² + hᵢ²), i = 1, 2, as in the one-parameter case. One easily
computes:

Wx = ( x/(x² + y²) ) ( h₂r₁ − h₁r₂ )/( r₂ − r₁ ),   (39)

Wy = ( y/(x² + y²) ) ( h₂r₁ − h₁r₂ )/( r₂ − r₁ ).   (40)
Let us define:

Y(x, y) := ( Wx(x, y) − kx, Wy(x, y) − ky )ᵀ.   (41)

Under very general conditions, for example on the spectral measure of {W(x, y) :
x, y ∈ ℝ}, the random field {Y(x, y) : x, y ∈ ℝ} satisfies the conditions of
Theorem 3, and we can write:

E( SP₂(Q) ) = ∫_Q E( |det Y′(x, y)| ) p_{Y(x,y)}(0) dx dy   (42)

since for fixed (x, y) the random matrix Y′(x, y) and the random vector Y(x, y)
are independent, so that the condition in the conditional expectation can be
erased.
The density in the right-hand side of (42) has the expression

p_{Y(x,y)}(0) = p_{(Wx,Wy)}(kx, ky)
= ( 1/(2π√(λ₂₀λ₀₂ − λ₁₁²)) ) exp( −k² (λ₀₂x² − 2λ₁₁xy + λ₂₀y²) / (2(λ₂₀λ₀₂ − λ₁₁²)) ).   (43)
To compute the expectation of the absolute value of the determinant in the right-hand side of (42), which does not depend on x, y, we use the method of [6]. Set

Δ := det Y′(x, y) = (Wxx − k)(Wyy − k) − Wxy².

We have

E(|Δ|) = E( (1/π) ∫_{−∞}^{+∞} (1 − cos(Δt)) t^{−2} dt ).   (44)
Define

h(t) := E( exp( it[ (Wxx − k)(Wyy − k) − Wxy² ] ) ).

Then

E(|Δ|) = (2/π) ∫₀^{+∞} (1 − Re[h(t)]) t^{−2} dt.   (45)
Set

A = ( 0  1/2  0 ; 1/2  0  0 ; 0  0  −1 ),   Σ := ( λ₄₀ λ₂₂ λ₃₁ ; λ₂₂ λ₀₄ λ₁₃ ; λ₃₁ λ₁₃ λ₂₂ ),

Σ being the covariance matrix of (Wxx, Wyy, Wxy). If P is an orthogonal matrix
diagonalizing Σ^{1/2}AΣ^{1/2}, with eigenvalues Δ₁, Δ₂, Δ₃, then

h(t) = E exp( it[ (Δ₁Z₁² − k(s₁₁+s₂₁)Z₁) + (Δ₂Z₂² − k(s₁₂+s₂₂)Z₂) + (Δ₃Z₃² − k(s₁₃+s₂₃)Z₃) ] ),   (46)

where (Z₁, Z₂, Z₃) is standard normal and sᵢⱼ are the entries of Σ^{1/2}Pᵀ.
One can check that if ξ is a standard normal variable and λ, β are real
constants, λ > 0:

E( e^{iλ(ξ+β)²} ) = (1 + 4λ²)^{−1/4} exp( −2λ²β²/(1+4λ²) + i( λβ²/(1+4λ²) + φ ) ),

where

φ = (1/2) arctan(2λ),  0 < φ < π/4.
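The one-dimensional identity above can be verified directly by numerical integration against the standard normal density (a sketch; the truncation at ±10 and the test values are our choices):

```python
import numpy as np
from scipy.integrate import quad

def char_numeric(lam, beta):
    # E[exp(i*lam*(xi+beta)^2)] for xi ~ N(0,1), by direct numerical integration
    dens = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    re, _ = quad(lambda x: np.cos(lam * (x + beta)**2) * dens(x), -10, 10, limit=400)
    im, _ = quad(lambda x: np.sin(lam * (x + beta)**2) * dens(x), -10, 10, limit=400)
    return re + 1j * im

def char_closed(lam, beta):
    # (1+4 lam^2)^(-1/4) exp(-2 lam^2 beta^2/(1+4 lam^2) + i(lam beta^2/(1+4 lam^2) + phi))
    phi = 0.5 * np.arctan(2 * lam)
    mod = (1 + 4 * lam**2)**(-0.25) * np.exp(-2 * lam**2 * beta**2 / (1 + 4 * lam**2))
    return mod * np.exp(1j * (lam * beta**2 / (1 + 4 * lam**2) + phi))

for lam, beta in [(0.5, 0.0), (1.0, 1.5), (2.0, -0.7)]:
    assert abs(char_numeric(lam, beta) - char_closed(lam, beta)) < 1e-6
print("closed form for E exp(i*lam*(xi+beta)^2) verified")
```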
Replacing in (46), we obtain for Re[h(t)] the formula:

Re[h(t)] = ∏_{j=1}^{3} dⱼ(t, k) (1 + 4Δⱼ²t²)^{−1/4} · cos( Σ_{j=1}^{3} [ φⱼ(t) + k²t ψⱼ(t) ] ),   (47)

where, for j = 1, 2, 3:

dⱼ(t, k) = exp( −(k²t²/2) (s₁ⱼ+s₂ⱼ)² / (1 + 4Δⱼ²t²) ),

φⱼ(t) = (1/2) arctan(2Δⱼt),  0 < φⱼ < π/4,

ψⱼ(t) = −(s₁ⱼ+s₂ⱼ)² Δⱼ t² / (1 + 4Δⱼ²t²).
Introducing these expressions in (45) and using (43) we obtain a new formula
which has the form of a rather complicated integral. However, it is well adapted
to numerical evaluation.
On the other hand, this formula allows us to compute the equivalent as
k → 0 of the expectation of the total number of specular points under the
Longuet-Higgins approximation. In fact, a first order expansion of the terms in
the integrand gives a somewhat more accurate result, that we state as a theorem:

Theorem 9

E( SP₂(ℝ²) ) = m₂/k² + O(1),   (48)
where

m₂ = (2/π) ∫₀^{+∞} [ 1 − ∏_{j=1}^{3} (1 + 4Δⱼ²t²)^{−1/4} cos( Σ_{j=1}^{3} φⱼ(t) ) ] t^{−2} dt
   = (2/π) ∫₀^{+∞} [ 1 − 2^{−3/2} ∏_{j=1}^{3} ( Aⱼ(1 + Aⱼ) )^{1/2} (1 − B₁B₂ − B₂B₃ − B₃B₁) ] t^{−2} dt,   (49)

where

Aⱼ = Aⱼ(t) = (1 + 4Δⱼ²t²)^{−1/2},  Bⱼ = Bⱼ(t) = √( (1 − Aⱼ)/(1 + Aⱼ) ).
Theorem 10 Assume, in addition to the hypotheses above, that the random
field W is δ-dependent, that is, its covariance vanishes as soon as
‖x − y‖ > δ.   (50)
Then, for k small enough:

Var( SP₂(ℝ²) ) ≤ L/k²,

where L is a positive constant depending upon the law of the random field.
A direct consequence of Theorems 9 and 10 is the following:
Corollary 2 Under the same hypotheses as in Theorem 10, for k small enough,
one has:

√( Var(SP₂(ℝ²)) ) / E( SP₂(ℝ²) ) ≤ L₁ k,

where L₁ is a new positive constant.
Proof of Theorem 10. For short, let us denote T = SP₂(ℝ²). We have:

Var(T) = E(T(T−1)) + E(T) − [E(T)]²,   (51)

E(T(T−1)) = ∫∫_{‖x−y‖>δ} ⋯ dx dy + ∫∫_{‖x−y‖≤δ} ⋯ dx dy = J₁ + J₂.   (52)

For J₁, the δ-dependence allows one to factorize as in the proof of Theorem 7,
obtaining

J₁ = m₂²/k⁴ + O(1/k²),   (53)

which cancels the main term of [E(T)]².
For J₂, one integrates the conditional expectation against the joint density
p_{Y(x),Y(y)}(0, 0). On the event {Y(x) = Y(y) = 0}, writing n := (y − x)/‖y − x‖
for the unit vector in the direction y − x, a Taylor argument shows that

|∂ₙY(x)| ≤ ‖y − x‖ sup_{s∈[x,y]} ‖Y″(s)‖.   (54)

So,

E( |det Y′(x)| |det Y′(y)| │ Y(x) = 0, Y(y) = 0 )
  ≤ ‖y − x‖² E( sup_{s∈[x,y]} ‖Y″(s)‖² │ ⋯ )
  = ‖z‖² E( sup_{s∈[0,z]} ‖Y″(s)‖² │ ⋯ ),  z = y − x,

where the last equality is again a consequence of the stationarity of the random
field {W(x) : x ∈ ℝ²}.
At this point, we perform a Gaussian regression on the condition. For the
condition, use again a Taylor expansion, the non-degeneracy hypothesis and the
independence of W′(0) and W″(0). Then, use the finiteness of the moments of
the supremum of bounded Gaussian processes (see for example [4], Ch. 2), and take
into account that ‖z‖ ≤ δ to get the inequality:

E( |det Y′(x)| |det Y′(y)| │ Y(x) = 0, Y(y) = 0 ) ≤ C₄ ‖z‖² (1 + k‖x‖)²,   (55)

where C₄ is a positive constant. Summing up, we have the following bound for
J₂:

J₂ ≤ C₁C₄ δ² ∫_{ℝ²} (1 + k‖x‖)² exp( −C₂ k² (‖x‖ − C₃)² ) dx
   = C₁C₄ δ² 2π ∫₀^{+∞} (1 + kρ)² ρ exp( −C₂ k² (ρ − C₃)² ) dρ = O(1/k²),   (56)

which completes the proof.
Let us consider a model of the sea W(x, y, t) as a function of two space variables
and one time variable. The usual models are centered Gaussian stationary,
with a particular form of the spectral measure that we discuss briefly below.
We denote the covariance by Γ(x, y, t) = E( W(0, 0, 0) W(x, y, t) ).
In practice, one is frequently confronted with the following situation: several
images of the sea taken over a time interval [0, T] are stored and some properties
or magnitudes are observed on them. If the time T and the number of images are
large, and if the process is ergodic in time, the frequency of images satisfying
a certain property will converge to the probability that this property holds at
a fixed time.
Let us illustrate this with the angle of the normal to the level curve at a
point chosen "at random". We first consider the number of crossings of a level
u by the process W(·, y, t) for fixed t and y, defined as

N^{W(·,y,t)}_{[0,M₁]}(u) = #{ x : 0 ≤ x ≤ M₁ ; W(x, y, t) = u },

and the time average

(1/T) ∫₀^{T} N^{W(·,y,t)}_{[0,M₁]}(u) dt.   (57)

If the ergodicity assumption in time holds true, we can conclude that a.s.:

(1/T) ∫₀^{T} N^{W(·,y,t)}_{[0,M₁]}(u) dt → E( N^{W(·,0,0)}_{[0,M₁]}(u) ) = M₁ (1/π) √(λ₂₀₀/λ₀₀₀) e^{−u²/(2λ₀₀₀)},

where

λ_{abc} = ∫_{ℝ³} λₓᵃ λᵧᵇ λₜᶜ dμ(λₓ, λᵧ, λₜ),

μ being the spectral measure of W. In the same way,

E( ∫₀^{M₂} N^{W(·,y,0)}_{[0,M₁]}(u) dy ) = σ₂(Q) (1/π) √(λ₂₀₀/λ₀₀₀) e^{−u²/(2λ₀₀₀)},   (58)

where Q = [0, M₁] × [0, M₂]. We have a similar formula when we consider
sections of the set [0, M₁] × [0, M₂] in the other direction. In fact, (58) can be
generalized to obtain the Palm distribution of the angle θ of the normal to the
level curve.
Set h_{θ₁,θ₂} = 1_{[θ₁,θ₂]}, and for θ₁ < θ₂ define

F(θ₂) − F(θ₁) := E( σ₁({ x ∈ Q : W(x, 0) = u ; θ₁ ≤ θ(x, s) ≤ θ₂ }) )
= σ₂(Q) E[ h_{θ₁,θ₂}( ∂ᵧW/∂ₓW ) ( (∂ₓW)² + (∂ᵧW)² )^{1/2} ] (2πλ₀₀₀)^{−1/2} exp( −u²/(2λ₀₀₀) ).   (59)

Denoting Δ = λ₂₀₀λ₀₂₀ − λ₁₁₀² and assuming σ₂(Q) = 1 for ease of notation, we
readily obtain
F (2 ) F (1 )
u2
e 2000
=
(2)3/2 ()1/2 000
R2
h1 ,2 () x2 + y 2 e 2 (02 x
211 xy+20 y 2 )
dxdy
u2
e 200
=
(2)3/2 (+ )1/2 000
2 exp(
2
(+ cos2 ( ) + sin2 ( )))dd
2+
where + are the eigenvalues of the covariance matrix of the random vector
(x W (0, 0, 0), y W (0, 0, 0)) and is the angle of the eigenvector associated to
+ . Remarking that the exponent in the integrand can be written as
1/ (1 2 sin2 ( )) with 2 := 1 + / and that
+
0
2 exp
H2
2
2H
F (2 ) F (1 ) = (const)
1 2 sin2 ( )
1/2
d.
From this relation we get the density g(θ) of the Palm distribution, simply by
dividing by the total mass:

g(θ) = ( 1 − γ² sin²(θ−α) )^{−3/2} / ∫₀^{2π} ( 1 − γ² sin²(ψ−α) )^{−3/2} dψ.   (60)

The normalizing integral is a complete elliptic integral. This density
characterizes the distribution of the angle of the normal at a point chosen
"at random" on the level curve.
In the case of a random field which is isotropic in (x, y), we have λ₂₀₀ = λ₀₂₀
and moreover λ₁₁₀ = 0, so that g turns out to be the uniform density over
the circle (Longuet-Higgins states that over the contour the distribution of the
angle is uniform (cf. [15], p. 348)).
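The normalizing constant of a density proportional to (1 − m sin²ψ)^{−p/2} on the circle is a complete elliptic integral. A minimal numerical check of the two standard identities that arise in this kind of computation, in scipy's parameter convention (the argument of `ellipk`/`ellipe` is m = γ², not the modulus γ):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ellipk, ellipe

m = 0.5  # m = gamma^2, an illustrative value

# int_0^{2pi} (1 - m sin^2 psi)^(-1/2) d psi = 4 K(m)
i_half, _ = quad(lambda p: (1 - m * np.sin(p)**2)**(-0.5), 0, 2 * np.pi)
assert abs(i_half - 4 * ellipk(m)) < 1e-8

# int_0^{2pi} (1 - m sin^2 psi)^(-3/2) d psi = 4 E(m)/(1 - m)
i_three, _ = quad(lambda p: (1 - m * np.sin(p)**2)**(-1.5), 0, 2 * np.pi)
assert abs(i_three - 4 * ellipe(m) / (1 - m)) < 1e-8

# isotropic case m = 0: the density reduces to the uniform law 1/(2 pi)
assert abs(4 * ellipk(0.0) - 2 * np.pi) < 1e-12
print("elliptic-integral normalizations verified")
```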
Let now W = {W(x, t) : t ∈ ℝ⁺, x = (x, y) ∈ ℝ²} be a stationary zero
mean Gaussian random field modeling the height of the sea waves. It has the
following spectral representation:

W(x, y, t) = ∫ e^{i(λ₁x + λ₂y + ωt)} √( f(λ₁, λ₂, ω) ) dM(λ₁, λ₂, ω),

where M is a Gaussian orthogonal random measure and f is the spectral density,
often expressed in polar coordinates by means of a directional spectrum G(ω, θ).
where H(x, t) = H( W(x, t), ∇W(x, t) ), ∇W = (Wx, Wy) denoting the gradient in
the space variables, and H is some measurable function such that the integral is
well defined. This is exactly our case in (59). The process {Z(t) : t ∈ ℝ} is
strictly stationary, and in our case it has a finite mean and is Riemann-integrable.
By the Birkhoff-Khintchine ergodic theorem ([8], page 151), a.s. as T → +∞,

(1/T) ∫₀^{T} Z(s) ds → E_B[ Z(0) ],

where B is the σ-algebra of time-invariant events.
In order to compute second moments, we use the Rice formula for integrals over
level sets (cf. Theorem 6), applied to the vector-valued random field

X(x₁, x₂, s₁, s₂) = ( W(x₁, s₁), W(x₂, s₂) )ᵀ,  for 0 ≤ s₁ ≤ t, 0 ≤ s₂ ≤ t.

The level set can be written as:

C_{Q²}(u, u) = { (x₁, x₂) ∈ Q×Q : X(x₁, x₂, s₁, s₂) = (u, u) }.

So, we get

Var( Z(t) ) = (2/t) ∫₀^{t} (1 − s/t) I(u, s) ds,

where

I(u, s) = ∫_{Q²} E( ⋯ │ W(x₁, 0) = u ; W(x₂, s) = u ) p_{W(x₁,0),W(x₂,s)}(u, u) dx₁ dx₂
  − ( E[ H(u, ∇W(0, 0)) ‖∇W(0, 0)‖ ] p_{W(0,0)}(u) )².

Assuming that the given random field is time-δ-dependent, that is,
Γ(x, y, t) = 0 whenever t > δ, we readily obtain

t Var( Z(t) ) → 2 ∫₀^{δ} I(u, s) ds.

Using now a variant of the Hoeffding-Robbins theorem [11] for sums of δ-dependent random variables, we get the CLT:
Numerical computations
the result with the exact formula is around 2·10⁻² larger, but the difference is
almost hidden by the precision of the numerical computation of the integral.
If we consider the case (90, 110, 3), the results are respectively 136.81 and
137.7.
In the case (100, 300, 3), the results differ significantly, and Figure 1 displays
the densities (32) and (10).
( 114  0
   0  81 )
Figure 2: Density of the Palm distribution of the angle of the normal to the
level curve in the case γ = 0.5 and α = π/4.
and the matrix of Section 3.5 is
3 0
11 0
0 3
9
= 104 3
0
90
Figure 4: Intensity function of the specular points for the Jonswap spectrum
http://www.math.univ-toulouse.fr/~azais/prog/pro
In this section we follow the article by Berry and Dennis [6]. Like these authors, we
are interested in dislocations of wavefronts. These are lines in space or points in
the plane where the phase χ of the complex scalar wave ψ(x, t) = ρ(x, t)e^{iχ(x,t)}
is undefined (x = (x₁, x₂) is a two-dimensional space variable). With respect
to light they are lines of darkness; with respect to sound, threads of silence.
Dislocations are the points where

ψ(x, t) = 0.
We assume an isotropic Gaussian model. This means that we will consider the
wavefront as an isotropic Gaussian field

ψ(x, t) = ∫_{ℝ²} exp( i⟨k, x⟩ ) ( Π(|k|)/|k| )^{1/2} dW(k),

which can be written as ψ(x) = ξ(x) + iη(x), with

ξ(x) = ∫_{ℝ²} cos(⟨k, x⟩) ( Π(k)/k )^{1/2} dW₁(k) − ∫_{ℝ²} sin(⟨k, x⟩) ( Π(k)/k )^{1/2} dW₂(k),   (61)

η(x) = ∫_{ℝ²} cos(⟨k, x⟩) ( Π(k)/k )^{1/2} dW₂(k) + ∫_{ℝ²} sin(⟨k, x⟩) ( Π(k)/k )^{1/2} dW₁(k),   (62)

and covariance

E[ ξ(x)ξ(x′) ] = ρ(|x − x′|) = ∫₀^{∞} J₀(k|x − x′|) Π(k) dk.   (63)

Moreover, in the three-dimensional case,

E[ ξ(r₁)ξ(r₂) ] = ∫₀^{∞} ( sin(k|r₁ − r₂|) / (k|r₁ − r₂|) ) Π(k) dk.   (64)

The same formula holds true for the process η, and also E[ ξ(r₁)η(r₂) ] = 0 for
any r₁, r₂, showing that the two coordinates are independent Gaussian fields.
6.1 Mean number and length of dislocations

For a two-dimensional wave the mean number of dislocation points per unit area
is d₂ = K₂/(4π), K₂ denoting the second moment of the spectral measure Π. In
dimension 3 the computation reduces to

E[ ( det( Z′(x) Z′(x)ᵀ ) )^{1/2} ].

Again,

E[ ( det( Z′(x) Z′(x)ᵀ ) )^{1/2} ] = σ² E(V),

where σ² is the common variance of the entries and V is the surface area of the
parallelogram generated by two independent standard Gaussian vectors in ℝ³. A
similar method to compute the expectation of this random area gives:

E(V) = E( √(χ²(3)) ) · E( √(χ²(2)) ) = (2√2/√π) · √(π/2) = 2,

leading eventually to

d₃ = 2K₂/(3π).
6.2 Variance of the number of dislocation points

Let N(S) denote the number of dislocation points of ψ in a Borel subset S of the
plane. Then

Var( N(S) ) = ∫_{S²} A(s₁, s₂) ds₁ ds₂ + d₂ σ₂(S) − d₂² σ₂(S)².

Taking into account that the law of the random field is invariant under translations and orthogonal transformations of ℝ², we have

A(s₁, s₂) = A( (0, 0), (r, 0) ) = A(r),  with r = ‖s₁ − s₂‖.

The "Rice function" A(r) has two intuitive interpretations. First, it can be
viewed as

A(r) = lim_{ε→0} ( 1/(π²ε⁴) ) E[ N( B((0,0), ε) ) · N( B((r,0), ε) ) ].
Second, by the Rice formula for the second factorial moment,

A(r) = E( |(∂₁ξ ∂₂η − ∂₂ξ ∂₁η)(0, 0)| |(∂₁ξ ∂₂η − ∂₂ξ ∂₁η)(r, 0)| │ Z(0, 0) = Z(r, 0) = 0₂ ) p_{Z(0,0),Z(r,0)}(0₄),   (65)

where

p_{Z(0,0),Z(r,0)}(0₄) = 1 / ( (2π)² (1 − ρ²(r)) ),  with ρ(r) = ∫₀^{∞} J₀(kr) Π(k) dk.
We use now the same device as above to compute the conditional expectation
of the modulus of the product of determinants; that is, we write:

|w| = (1/π) ∫_{−∞}^{+∞} (1 − cos(wt)) t^{−2} dt.   (66)
Following the notation of [6], set

C := ρ(r),  E := −ρ′(r),  H := E/r,  F := −ρ″(r),  F₀ := F(0).

The regression formulas imply that the conditional covariance matrix of the vector

W = ( ∂₁ξ(0), ∂₁ξ(r,0), ∂₂ξ(0), ∂₂ξ(r,0), ∂₁η(0), ∂₁η(r,0), ∂₂η(0), ∂₂η(r,0) ),

given that Z(0, 0) = Z(r, 0) = 0, is

Σ = Diag( A, B, A, B ),

with

A = ( F₀ − E²/(1−C²)   F − E²C/(1−C²) ;  F − E²C/(1−C²)   F₀ − E²/(1−C²) ),

B = ( F₀   H ;  H   F₀ ).
Using (66) twice,

E(|w₁w₂|) = (1/π²) ∫∫ (dt₁ dt₂ / (t₁² t₂²)) [ 1 − ½T(t₁, 0) − ½T(−t₁, 0) − ½T(0, t₂) − ½T(0, −t₂)
  + ¼T(t₁, t₂) + ¼T(−t₁, −t₂) + ¼T(t₁, −t₂) + ¼T(−t₁, t₂) ],   (67)

where

T(t₁, t₂) = E( exp( i(w₁t₁ + w₂t₂) ) ),

with

w₁ = ∂₁ξ(0) ∂₂η(0) − ∂₂ξ(0) ∂₁η(0) = W₁W₇ − W₃W₅,
w₂ = ∂₁ξ(r,0) ∂₂η(r,0) − ∂₂ξ(r,0) ∂₁η(r,0) = W₂W₈ − W₄W₆.
Notice that w₁t₁ + w₂t₂ = WᵀHW, where

H = ( 0  0  0  D ;  0  0  −D  0 ;  0  −D  0  0 ;  D  0  0  0 ),   D = ½ ( t₁  0 ;  0  t₂ ).

A standard diagonalization argument shows that

WᵀHW = Σ_{j=1}^{8} λⱼ ξⱼ²,

where the ξⱼ's are independent with standard normal distribution and the λⱼ
are the eigenvalues of Σ^{1/2}HΣ^{1/2}. Using the characteristic function of the χ²(1)
distribution:

E( exp(iWᵀHW) ) = ∏_{j=1}^{8} (1 − 2iλⱼ)^{−1/2}.   (68)
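Identity (68) — the characteristic function of a Gaussian quadratic form as a product over the eigenvalues of Σ^{1/2}HΣ^{1/2} — is easy to test by simulation (the matrix size and the random Σ, H below are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# an arbitrary covariance matrix Sigma and symmetric matrix H for illustration
G = rng.standard_normal((n, n)); Sigma = G @ G.T + n * np.eye(n)
Bm = rng.standard_normal((n, n)); H = (Bm + Bm.T) / 2

# eigenvalues of Sigma^{1/2} H Sigma^{1/2} (same spectrum as Sigma H)
lams = np.linalg.eigvals(Sigma @ H).real

# closed form (68): E exp(i W^T H W) = prod_j (1 - 2 i lam_j)^(-1/2)
closed = np.prod((1 - 2j * lams) ** (-0.5))

# Monte Carlo with W ~ N(0, Sigma)
W = rng.multivariate_normal(np.zeros(n), Sigma, size=400_000)
mc = np.mean(np.exp(1j * np.einsum('ij,jk,ik->i', W, H, W)))
assert abs(mc - closed) < 1e-2
print("chi-squared(1) product formula verified by simulation")
```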
Clearly

Σ^{1/2} = Diag( A^{1/2}, B^{1/2}, A^{1/2}, B^{1/2} ),

and Σ^{1/2}HΣ^{1/2} has the same block anti-diagonal structure as H:

Σ^{1/2}HΣ^{1/2} = ( 0  0  0  M ;  0  0  −Mᵀ  0 ;  0  −M  0  0 ;  Mᵀ  0  0  0 ),  with M = A^{1/2} D B^{1/2}.
Since the eigenvalues come in ± pairs, the product in (68) reduces to

det( I − 2i Σ^{1/2}HΣ^{1/2} )^{1/2} = 1 + 4 tr(DBDA) + 16 det(DBDA),

where

4 tr(DBDA) = (t₁² + t₂²) F₀ ( F₀ − E²/(1−C²) ) + 2 t₁t₂ H ( F − E²C/(1−C²) ),   (69)

16 det(DBDA) = t₁² t₂² ( F₀² − H² ) [ ( F₀ − E²/(1−C²) )² − ( F − E²C/(1−C²) )² ].   (70)
giving

T(t₁, t₂) = E( exp(iWᵀHW) ) = [ 1 + 4 tr(DBDA) + 16 det(DBDA) ]^{−1}.

Rescaling t₁, t₂ by ( F₀(F₀ − E²/(1−C²)) )^{−1/2} and replacing in (67), we obtain

E(|w₁w₂|) = (4A₁/π²) ∫₀^{∞} (dt₁/t₁²) ∫₀^{∞} (dt₂/t₂²) [ 1 − 1/(1+t₁²) − 1/(1+t₂²)
  + ½ ( 1 + (t₁²+t₂²) − 2A₂t₁t₂ + t₁²t₂²Z )^{−1} + ½ ( 1 + (t₁²+t₂²) + 2A₂t₁t₂ + t₁²t₂²Z )^{−1} ],   (71)

where A₁ := F₀ ( F₀ − E²/(1−C²) ) and

A₂ = H ( F(1−C²) − E²C ) / ( F₀ ( F₀(1−C²) − E² ) ),

Z = ( 1 − H²/F₀² ) ( 1 − ( F − E²C/(1−C²) )² ( F₀ − E²/(1−C²) )^{−2} ).   (72)
In this form, and up to a sign change, this result is equivalent to Formula (4.43)
of [6] (note that A₂² = Y in [6]).
In order to compute the integral (71), first we obtain

∫₀^{∞} (1/t₂²) ( 1 − 1/(1+t₂²) ) dt₂ = π/2.
We split the other term into two integrals; for the first one,

½ ∫₀^{∞} (1/t₂²) [ ( 1 + (t₁²+t₂²) − 2A₂t₁t₂ + t₁²t₂²Z )^{−1} − 1/(1+t₁²) ] dt₂
= −( 1/(2(1+t₁²)) ) ∫₀^{∞} ( t₂² − 2Z₁t₁t₂ ) / ( t₂² ( t₂² − 2Z₁t₁t₂ + Z₂ ) ) dt₂ =: I₁,

where Z₂ = (1 + Zt₁²)/(1 + t₁²) and Z₁ = A₂/(1 + t₁²).
Similarly, for the second integral we get
Similarly for the second integral we get
38
1
2
1
1
1
dt2
2
2
2
2
2
t2 1 + (t1 + t2 ) + 2A2 t1 t2 + t1 t2 Z
1 + t21
=
I1 + I2 =
1
2(1 + t21 )
1
2(1 + t21 )
1
=
(1 + t21 )
t22 + 2Z1 t1 t2
1
dt2 = I2
2
2
t2 t2 + 2Z1 t1 t2 + Z2
t22 + 2Z1 t1 t2
t22 2Z1 t1 t2
1
+
dt2
t22 t22 2Z1 t1 t2 + Z2
t22 + 2Z1 t1 t2 + Z2
In the third line we have used the formula provided by the method of residues.
In fact, if the polynomial X² − SX + P, with P > 0, has no root in [0, ∞), then

∫₀^{∞} t² / ( t⁴ − St² + P ) dt = π / ( 2 (2√P − S)^{1/2} ).

In our case Δ = (Z₂ − 4Z₁²t₁²), S = 2(Z₂ − 2Z₁²t₁²) and P = Z₂².
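The residue evaluation quoted above can be confirmed numerically; in the form ∫₀^∞ t²/(t⁴ − St² + P) dt = π/(2√(2√P − S)) it is a one-line check with scipy (the test values of S, P are arbitrary admissible choices):

```python
import numpy as np
from scipy.integrate import quad

def lhs(S, P):
    # numerical value of int_0^inf t^2/(t^4 - S t^2 + P) dt
    val, _ = quad(lambda t: t**2 / (t**4 - S * t**2 + P), 0, np.inf)
    return val

def rhs(S, P):
    # residue evaluation, valid when X^2 - S X + P has no root in [0, +inf)
    return np.pi / (2.0 * np.sqrt(2.0 * np.sqrt(P) - S))

for S, P in [(0.0, 1.0), (-1.0, 2.0), (1.0, 4.0)]:
    assert abs(lhs(S, P) - rhs(S, P)) < 1e-6
print("residue formula verified")
```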
Therefore we get

A(r) = ( A₁ / (4π³(1 − C²)) ) ∫₀^{∞} t₁² / ( (1 + t₁²) √Z₂ ( Z₂ − Z₁²t₁² ) ) dt₁.
Acknowledgement
This work has received financial support from European Marie Curie Network
SEAMOCS.
References
[1] R. J. Adler, The Geometry of Random Fields, Wiley (1981).
[2] R. J. Adler and J. Taylor, Random Fields and Geometry, Springer (2007).
[3] J-M. Azaïs, J. León and J. Ortega, Geometrical characteristics of Gaussian sea waves, Journal of Applied Probability, 42, 1-19 (2005).
[4] J-M. Azaïs and M. Wschebor, Level Sets and Extrema of Random Processes and Fields, Wiley (2009).
[5] J-M. Azaïs and M. Wschebor, On the distribution of the maximum of a Gaussian field with d parameters, Annals of Applied Probability, 15 (1A), 254-278 (2005).
[6] M.V. Berry and M.R. Dennis, Phase singularities in isotropic random waves, Proc. R. Soc. Lond. A, 456, 2059-2079 (2000).
[7] E. Cabaña, Esperanzas de integrales sobre conjuntos de nivel aleatorios, Actas del 2º Congreso Latinoamericano de Probabilidad y Estadística Matemática (Spanish), Caracas, 65-82 (1985).
[8] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes, Wiley (1967).
[9] H. Federer, Geometric Measure Theory, Springer (1969).
[10] E. Flores and J.R. León, Random seas, level sets and applications, preprint (2009).
[11] W. Hoeffding and H. Robbins, The central limit theorem for dependent random variables, Duke Math. J., 15, 773-780 (1948).
[12] M. Kratz and J. R. León, Level curves crossings and applications for Gaussian models, Extremes, DOI 10.1007/s10687-009-0090-x (2009).
[13] P. Krée and C. Soize, Mécanique Aléatoire, Dunod (1983).
[14] M. S. Longuet-Higgins, Reflection and refraction at a random surface. I, II, III, Journal of the Optical Society of America, 50, No. 9, 838-856 (1960).
[15] M. S. Longuet-Higgins, The statistical geometry of random surfaces, Proc. Symp. Appl. Math., Vol. XIII, AMS, Providence, RI, 105-143 (1962).
[16] D. Nualart and M. Wschebor, Intégration par parties dans l'espace de Wiener et approximation du temps local, Probab. Theory Related Fields, 90, 83-109 (1991).
[17] S.O. Rice, Mathematical analysis of random noise, Bell System Tech. J., 23, 282-332; 24, 45-156 (1944-1945).
[18] WAFO-group, WAFO - A Matlab Toolbox for Analysis of Random Waves and Loads - A Tutorial, Math. Stat., Center for Math. Sci., Lund Univ., Lund, Sweden, ISBN XXXX, URL http://www.maths.lth.se/matstat/wafo (2000).
[19] M. Wschebor, Surfaces Aléatoires, Lecture Notes in Math. 1147, Springer (1985).
[20] U. Zähle, A general Rice formula, Palm measures, and horizontal-window conditioning for random fields, Stoch. Process. Appl., 17, 265-283 (1984).
CHAPTER 1
This initial chapter contains a number of elements that are used repeatedly in the
book and constitute necessary background. We will need to study the paths of
random processes and fields; the analytical properties of these functions play a
relevant role. This raises a certain number of basic questions, such as whether the
paths belong to a certain regularity class of functions, what one can say about
their global or local extrema and about local inversion, and so on. A typical
situation is that the available knowledge on the random function is given by
its probability law, so one is willing to know what one can deduce from this
probability law about these kinds of properties of paths. Generally speaking,
the result one can expect is the existence of a version of the random function
having good analytical properties. A version is a random function which, at
each parameter value, coincides almost surely with the one given. These are the
contents of Section 1.4, which includes the classical theorems due to Kolmogorov
and the results of Bulinskaya and Ylvisaker about the existence of critical points
or local extrema having given values. The essence of all this has been well known
for a long time, and in some cases proofs are only sketched. In other cases we
give full proofs and some refinements that will be necessary for further use.
As for the earlier sections, Section 1.1 contains starting notational conventions
and a statement of the Kolmogorov extension theorem of measure theory, and
Sections 1.2 and 1.3 provide a quick overview of the Gaussian distribution and
some connected results. Even though this is completely elementary, we call the
reader's attention to Proposition 1.2, the Gaussian regression formula, which
Level Sets and Extrema of Random Processes and Fields, By Jean-Marc Azaïs and Mario Wschebor
Copyright © 2009 John Wiley & Sons, Inc.
will appear now and again in the book and can be considered as the basis of
calculations using the Gaussian distribution.
Consider a family of probability measures

{ P_{t₁,t₂,...,tₙ} : t₁, t₂, ..., tₙ ∈ T distinct, n = 1, 2, ... },   (1.2)
as follows: for each n = 1, 2, . . . and each n-tuple t₁, t₂, . . . , tₙ of distinct elements of T, P_{t₁,t₂,...,tₙ} is a probability measure on the Borel sets of the product space X_{t₁} × X_{t₂} × ⋯ × X_{tₙ}, where X_t = ℝ for each t ∈ T (so that this product space is canonically identified as ℝⁿ).
We say that the probability measures (1.2) satisfy the consistency condition if
for any choice of n = 1, 2, . . . and distinct t₁, . . . , tₙ, tₙ₊₁ ∈ T, we have

P_{t₁,...,tₙ,tₙ₊₁}(B × ℝ) = P_{t₁,...,tₙ}(B)

for any Borel set B in X_{t₁} × ⋯ × X_{tₙ}. The following is the basic Kolmogorov
extension theorem, which we state but do not prove here.
Theorem 1.1 (Kolmogorov). {P_{t₁,t₂,...,tₙ}}, t₁, t₂, . . . , tₙ ∈ T; n = 1, 2, . . . ,
satisfy the consistency condition if and only if there exists one and only one
probability measure P on σ(C) such that

P( C(t₁, . . . , tₙ; B₁, . . . , Bₙ) ) = P_{t₁,...,tₙ}(B₁ × ⋯ × Bₙ).   (1.3)
Notice that in the case of cylinders, if one wants to know whether a given
function g : T → ℝ belongs to C(t₁, . . . , tₙ; B₁, . . . , Bₙ), it suffices to look at
the values of g at the finite set of points t₁, . . . , tₙ and check whether g(tⱼ) ∈ Bⱼ
for j = 1, . . . , n.
for j = 1, . . . , n. However, if one takes, for example, T = Z (the integers) and
considers the sets of functions
A = {g : g : T R, lim g(t) exists and is finite}
t+
or
B = {g : g : T R, sup |g(t)| 1},
tT
it is clear that these sets are in σ(C) but are not cylinders (they depend on an
infinite number of coordinates).
2. In general, σ(C) is strictly smaller than the family of all subsets of ℝ^T. To
see this, one can check that

σ(C) = { A ⊂ ℝ^T : ∃ T_A ⊂ T, T_A countable, and B_A a Borel set in ℝ^{T_A},
such that g ∈ A if and only if g|_{T_A} ∈ B_A }.   (1.4)
The proof of (1.4) follows immediately from the fact that the right-hand side is
a σ-algebra containing C. Equation (1.4) says that a subset of ℝ^T is a Borel set
if and only if it depends only on a countable set of parameter values. Hence,
if T is uncountable, the set
{ g ∈ ℝ^T : g is a bounded function }

or

{ g ∈ ℝ^T : g is a bounded function, |g(t)| ≤ 1 for all t ∈ T }

does not belong to σ(C). Another simple example is the following: if T = [0, 1],
then

{ g ∈ ℝ^T : g is a continuous function }

is not a Borel set in ℝ^T, since it is obvious that there does not exist a countable
subset of [0, 1] having the determining property in (1.4). These examples lead to
the notion of separable process that we introduce later.
3. In the special case when

Ω = ℝ^T,  A = σ(C),  and X(t)(ω) = ω(t),

{X(t) : t ∈ T} is called a canonical process.
4. We say that the stochastic process {Y (t) : t T } is a version of the process
{X(t) : t T } if P(X(t) = Y (t)) = 1 for each t T .
Recall that the Fourier transform of a probability measure μ on ℝ^d,

μ̂(z) = ∫_{ℝ^d} exp( i⟨z, x⟩ ) μ(dx),

is a positive semidefinite function, that is, for any z₁, . . . , zₙ ∈ ℝ^d and complex
c₁, . . . , cₙ:

Σ_{j,k=1}^{n} μ̂(z_j − z_k) c_j c̄_k ≥ 0.
The random vector ξ with values in ℝ^d is said to have the normal distribution,
or the Gaussian distribution, with parameters (m, Σ) [m ∈ ℝ^d and Σ a
d × d positive semidefinite matrix] if the Fourier transform of the probability
distribution of ξ is equal to

ψ(z) = exp( i⟨m, z⟩ − ½⟨Σz, z⟩ ),  z ∈ ℝ^d.
We use the notation

φ(x) = (1/√(2π)) e^{−x²/2}  and  Φ(x) = ∫_{−∞}^{x} φ(y) dy

for the density and the cumulative distribution function of a standard normal
random variable, respectively.
If Σ is nonsingular, ξ is said to be nondegenerate, and one can verify that it
has a density with respect to Lebesgue measure given by

p_ξ(x) = ( 1 / ( (2π)^{d/2} (det Σ)^{1/2} ) ) exp( −½ (x − m)ᵀ Σ^{−1} (x − m) ).
From the definition above it follows that if the random vector ξ with values
in ℝ^d has a normal distribution with parameters m and Σ, A is a real matrix
with n rows and d columns, and b is a nonrandom element of ℝⁿ, then the
random vector Aξ + b with values in ℝⁿ has a normal distribution with parameters (Am + b, AΣAᵀ). In particular, if Σ is nonsingular, the coordinates of the
random vector Σ^{−1/2}(ξ − m) are independent random variables with standard
normal distribution on the real line.
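This stability under affine maps is straightforward to confirm by simulation (the particular m, Σ, A, b below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
m = np.array([1.0, -2.0, 0.5])
G = rng.standard_normal((3, 3)); Sigma = G @ G.T + np.eye(3)
A = rng.standard_normal((2, 3)); b = np.array([0.3, -1.1])

# sample xi ~ N(m, Sigma) and form the affine image eta = A xi + b
xi = rng.multivariate_normal(m, Sigma, size=500_000)
eta = xi @ A.T + b

# eta must be N(A m + b, A Sigma A^T)
assert np.allclose(eta.mean(axis=0), A @ m + b, atol=0.02)
assert np.allclose(np.cov(eta.T), A @ Sigma @ A.T, atol=0.1)
print("affine image of a Gaussian vector has parameters (A m + b, A Sigma A^T)")
```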
Assume now that we have a pair ξ and η of random vectors in ℝ^d and ℝ^{d′},
respectively, having finite moments of order 2. We define the d × d′ covariance
matrix as

Cov(ξ, η) = E( (ξ − E(ξ)) (η − E(η))ᵀ ).

It follows that if the distribution of the random vector (ξ, η) in ℝ^{d+d′} is normal
and Cov(ξ, η) = 0, the random vectors ξ and η are independent. A consequence
of this is the following useful formula, which is standard in statistics and gives
a version of the conditional expectation of a function of ξ given the value of η.
Proposition 1.2. Let ξ and η be two random vectors with values in ℝ^d and ℝ^{d′},
respectively, and assume that the distribution of (ξ, η) in ℝ^{d+d′} is normal and
Var(η) is nonsingular. Then, for any bounded function f : ℝ^d → ℝ, we have

E( f(ξ) | η = y ) = E( f(ζ + Cy) ),   (1.5)

where

C = Cov(ξ, η) Var(η)^{−1},   (1.6)

ζ = ξ − Cη.   (1.7)

Proof. The proof consists of choosing the matrix C so that the random vector
ζ = ξ − Cη becomes independent of η. For this purpose, we need the fact that

Cov(ξ − Cη, η) = 0,

and this leads to the value of C given by (1.6). The parameters in (1.7) follow
immediately.
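Formula (1.5) can be illustrated numerically in the scalar case: conditioning by brute force on a thin window around {η = y} agrees with the unconditional expectation of f(ζ + Cy) (the covariance values and test function below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
# centered jointly Gaussian pair (xi, eta): Var(xi)=2.0, Var(eta)=1.5, Cov=0.8
vx, v, c = 2.0, 1.5, 0.8
L = np.linalg.cholesky(np.array([[vx, c], [c, v]]))
xi, eta = L @ rng.standard_normal((2, 2_000_000))

f = lambda x: np.cos(x)      # any bounded test function
y = 0.7                      # conditioning value eta = y
C = c / v                    # regression coefficient Cov(xi, eta) Var(eta)^{-1}
zeta = xi - C * eta          # independent of eta by construction

# Gaussian regression: E(f(xi) | eta = y) = E(f(zeta + C y))
reg_val = f(zeta + C * y).mean()
# brute-force conditioning on a thin window around eta = y
cond_val = f(xi[np.abs(eta - y) < 0.01]).mean()
assert abs(reg_val - cond_val) < 0.02
print("Gaussian regression formula checked")
```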
In what follows, we call the version of the conditional expectation given by
formula (1.5), Gaussian regression. To close this brief list of basic properties,
Σ := (( r(t_j, t_k) ))_{j,k=1,...,n}.
It is easily verified that the set of probability measures {P_{t₁,...,tₙ}} verifies the
consistency condition, so that Kolmogorov's theorem applies and there exists
a unique probability measure P on the measurable space (ℝ^T, σ(C)) which,
restricted to the cylinder sets depending on t₁, . . . , tₙ, is P_{t₁,...,tₙ} for any choice
of distinct parameter values t₁, . . . , tₙ. P is called the Gaussian measure generated by the pair (m, r). If {X(t) : t ∈ T} is a real-valued stochastic process with
distribution P, one verifies that:
r(τ) = ∫_{ℝ^d} exp( i⟨τ, x⟩ ) μ(dx),

where μ is a Borel measure on ℝ^d with total mass equal to r(0). μ is called the
spectral measure of the process. We usually assume that r(0) = 1: that is, that
μ is a probability measure, which is obtained simply by replacing the original
process {X(t) : t ∈ ℝ^d} by the process {X(t)/(r(0))^{1/2} : t ∈ ℝ^d}.
Example: let μ be the measure on ℝ given by

μ({xₙ}) = μ({−xₙ}) = ½cₙ for n = 1, 2, . . . ;  μ({0}) = c₀;  Σ_{n=0}^{∞} cₙ = 1.

The corresponding centered Gaussian stationary process can be represented as

X(t) = c₀^{1/2} ξ₀ + Σ_{n=1}^{∞} cₙ^{1/2} ( ξₙ cos(xₙt) + ηₙ sin(xₙt) ),  t ∈ ℝ,   (1.8)

where the ξₙ, ηₙ are independent standard normal random variables. We denote

λₖ = ∫_ℝ x^k μ(dx),  k = 0, 1, 2, . . . ,   (1.9)

whenever the integral exists; λₖ is the kth spectral moment of the process.
An extension of the preceding class of examples is the following. Let
(T, T, μ) be a measure space, H = L²_ℝ(T, T, μ) the Hilbert space of real-valued
square-integrable functions on it, and {φₙ(t)}_{n=1,2,...} an orthonormal sequence
in H. We assume that each function φₙ : T → ℝ is bounded and denote
Mₙ = sup_{t∈T} |φₙ(t)|. In addition, let {cₙ}_{n=1,2,...} be a sequence of positive
numbers such that
Σ_{n=1}^{∞} cₙ < ∞,  Σ_{n=1}^{∞} cₙ Mₙ² < ∞.

Then

X(t) = Σ_{n=1}^{∞} cₙ^{1/2} ξₙ φₙ(t),   (1.10)

where the ξₙ are i.i.d. standard normal, defines a centered Gaussian process with
covariance

r(s, t) = E( X(s)X(t) ) = Σ_{n=1}^{∞} cₙ φₙ(s) φₙ(t).
We will prove a 0–1 law for Gaussian processes in this section without attempting
full generality. This will be sufficient for our requirements in what follows. For
a more general treatment, see Fernique (1974).
Definition 1.3. Let X = {X(t) : t ∈ T} and Y = {Y(t) : t ∈ S} be real-valued
stochastic processes defined on some probability space (Ω, A, P). X and
Y are said to be independent if for any choice of the parameter values
t₁, . . . , tₙ ∈ T; s₁, . . . , sₘ ∈ S, n, m ≥ 1, the random vectors

(X(t₁), . . . , X(tₙ)), (Y(s₁), . . . , Y(sₘ))

are independent.
Proposition 1.4. Let the processes X and Y be independent and E (respectively,
F) belong to the σ-algebra generated by the cylinders in ℝ^T (respectively, ℝ^S).
Then

P( X(·) ∈ E, Y(·) ∈ F ) = P( X(·) ∈ E ) P( Y(·) ∈ F ).   (1.11)
Proof. Equation (1.11) holds true for cylinders. Uniqueness in the extension
theorem provides the result.
Theorem 1.5 (0–1 Law for Gaussian Processes). Let X = {X(t) : t ∈ T}
be a real-valued centered Gaussian process defined on some probability space
(Ω, A, P) and (E, E) a measurable space, where E is a linear subspace of ℝ^T
and the σ-algebra E has the property that for any choice of the scalars a, b ∈ ℝ,
the function (x, y) ↦ ax + by defined on E × E is measurable with respect
to the product σ-algebra. We assume that the function X : Ω → E defined as
X(ω) = X(·, ω) is measurable (Ω, A) → (E, E). Then, if L is a measurable
subspace of E, one has

P( X(·) ∈ L ) = 0 or 1.
Proof. Let {X(1)(t) : t ∈ T} and {X(2)(t) : t ∈ T} be two independent processes,
each having the same distribution as that of the given process {X(t) : t ∈ T}.
For each θ, 0 < θ < π/2, consider a new pair of stochastic processes, defined
for t ∈ T by

Zθ(1)(t) = X(1)(t) cos θ + X(2)(t) sin θ,
Zθ(2)(t) = −X(1)(t) sin θ + X(2)(t) cos θ.   (1.12)

Set q = P(X(·) ∈ L) and Eθ = {Zθ(1)(·) ∈ L, Zθ(2)(·) ∉ L}; since Zθ(1) and Zθ(2)
are independent, each with the same law as X, one has P(Eθ) = q(1 − q).
If θ, θ′ ∈ (0, π/2), θ ≠ θ′, the events Eθ and Eθ′ are disjoint. In fact, the matrix

( cos θ    sin θ  )
( cos θ′   sin θ′ )

is nonsingular, and (1.12) implies that if at the same time Zθ(1) ∈ L, Zθ′(1) ∈ L,
then X(1)(·), X(2)(·) ∈ L also, since X(1)(·), X(2)(·) are linear combinations of
Zθ(1) and Zθ′(1). Hence Zθ(2), Zθ′(2) ∈ L, and Eθ, Eθ′ cannot occur simultaneously.
To finish, the only way in which we can have an infinite family {Eθ}0<θ<π/2 of
pairwise disjoint events with equal probability is for this probability to be zero.
That is, q(1 − q) = 0, so that q = 0 or 1.
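The algebraic heart of the proof — that X(1), X(2) can be recovered from Zθ(1), Zθ′(1) because the matrix of cosines and sines is nonsingular — can be checked directly. The numbers below are arbitrary stand-ins for the values of X(1), X(2) at a fixed t.

```python
import math

x1, x2 = 0.7, -1.3        # arbitrary stand-ins for X(1)(t), X(2)(t)

def z1(theta):
    # Z_theta^(1)(t) as defined in (1.12)
    return x1 * math.cos(theta) + x2 * math.sin(theta)

theta_a, theta_b = 0.4, 1.1
# determinant of ((cos a, sin a), (cos b, sin b)) equals sin(b - a) != 0:
det = math.cos(theta_a) * math.sin(theta_b) - math.sin(theta_a) * math.cos(theta_b)

# solve the 2x2 linear system for X(1), X(2):
x1_rec = (z1(theta_a) * math.sin(theta_b) - z1(theta_b) * math.sin(theta_a)) / det
x2_rec = (z1(theta_b) * math.cos(theta_a) - z1(theta_a) * math.cos(theta_b)) / det
```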
In case the parameter set T is countable, the above shows directly that
any measurable linear subspace of R^T has probability 0 or 1 under a centered
Gaussian law. If T is a σ-compact topological space, E the set of real-valued
continuous functions defined on T, and E the σ-algebra generated by the topology
of uniform convergence on compact sets, one can conclude, for example, that
the subspace of E of bounded functions has probability 0 or 1 under a centered
Gaussian measure. The theorem can be applied in a variety of situations similar
to standard function spaces. For example, put a measure on the space (E, E) and
take for L an Lp space of this measure space.
Theorem 1.6. Let Y = {Y(t) : t ∈ [0, 1]} be a real-valued stochastic process
such that, for each pair t, t + h ∈ [0, 1],

P( |Y(t + h) − Y(t)| ≥ ψ(h) ) ≤ φ(h),

where ψ and φ are even functions, increasing for h > 0 and such that

Σ_{n=1}^∞ ψ(2^−n) < ∞,   Σ_{n=1}^∞ 2^n φ(2^−n) < ∞.

Then there exists a version X = {X(t) : t ∈ [0, 1]} of the process Y such that the
paths t → X(t) are continuous on [0, 1].

Proof. For n = 1, 2, ...; k = 0, 1, ..., 2^n − 1, let

E_{k,n} = { |Y((k + 1)/2^n) − Y(k/2^n)| ≥ ψ(2^−n) },   E_n = ∪_{k=0}^{2^n−1} E_{k,n}.

Then P(E_n) ≤ 2^n φ(2^−n), so that Σ_n P(E_n) < ∞ and, by the Borel–Cantelli
lemma, P(lim sup_n E_n) = 0. In other words, if ω ∉ lim sup_n E_n, one can find
n0(ω) such that if n ≥ n0(ω), one has

|Y((k + 1)/2^n) − Y(k/2^n)| < ψ(2^−n)   for all k = 0, 1, ..., 2^n − 1.
REGULARITY OF PATHS
Denote by Y(n) the function whose graph is the polygonal line with vertices
(k/2^n, Y(k/2^n)), k = 0, 1, ..., 2^n; that is, if k/2^n ≤ t ≤ (k + 1)/2^n, one has

Y(n)(t) = (k + 1 − 2^n t) Y(k/2^n) + (2^n t − k) Y((k + 1)/2^n).

Then

‖Y(n+1) − Y(n)‖ ≤ ψ(2^−(n+1))   for n + 1 ≥ n0(ω)

(here ‖·‖ denotes the sup norm on [0, 1]). Since Σ_{n=1}^∞ ψ(2^−(n+1)) < ∞ by
the hypothesis, the sequence of functions {Y(n)} converges uniformly on [0, 1]
to a continuous limit function that we denote X(t), t ∈ [0, 1].
We set X(t) ≡ 0 when ω ∈ lim sup_n E_n. To finish the proof, it suffices to
show that for each t ∈ [0, 1], P(X(t) = Y(t)) = 1.
If t is a dyadic point, say t = k/2^n, then given the definition of the sequence
of functions Y(n), it is clear that Y(m)(t) = Y(t) for m ≥ n. Hence, for
ω ∉ lim sup_n E_n, one has X(t) = lim_{m→∞} Y(m)(t) = Y(t). The result
follows from P((lim sup_n E_n)^C) = 1 (A^C is the complement of the set A).
If t is not a dyadic point, for each n, n = 1, 2, ..., let kn be an integer such
that |t − kn/2^n| ≤ 2^−n, kn/2^n ∈ [0, 1]. Set

Fn = { |Y(t) − Y(kn/2^n)| ≥ ψ(2^−n) }.

Then P(Fn) ≤ φ(2^−n), so that Σ_n P(Fn) < ∞ and, almost surely, only finitely
many of the events Fn occur; hence Y(kn/2^n) → Y(t) a.s. Since X and Y coincide
a.s. at the (countably many) dyadic points and the paths of X are continuous,
X(kn/2^n) = Y(kn/2^n) a.s. for every n and X(kn/2^n) → X(t), so that
P(X(t) = Y(t)) = 1.
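The dyadic polygonal approximation used in the proof is easy to implement; the quadratic sample path below is an arbitrary stand-in for a trajectory of Y.

```python
def polygonal(f, n):
    """Y^(n): piecewise-linear interpolation of f at the dyadic points k/2^n."""
    def y(t):
        k = min(int(t * 2**n), 2**n - 1)
        w = t * 2**n - k
        return (1.0 - w) * f(k / 2**n) + w * f((k + 1) / 2**n)
    return y

f = lambda t: t * (1.0 - t)   # stand-in for a continuous path
y3, y4 = polygonal(f, 3), polygonal(f, 4)

# Y^(n) interpolates f at all dyadics of order <= n:
interpolates = all(abs(y3(k / 8.0) - f(k / 8.0)) < 1e-12 for k in range(9))
# ||Y^(n+1) - Y^(n)|| is bounded by the largest increment of f over adjacent
# dyadics of order n+1 (the quantity controlled by psi(2^-(n+1)) in the proof):
gap = max(abs(y4(k / 16.0) - y3(k / 16.0)) for k in range(17))
incr = max(abs(f((k + 1) / 16.0) - f(k / 16.0)) for k in range(16))
```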
Corollary 1.7. Let Y = {Y(t) : t ∈ [0, 1]} be a real-valued stochastic process.
(a) Assume that for some p > 0, r > p, and K > 0,

E( |Y(t + h) − Y(t)|^p ) ≤ K |h| / |log |h||^{1+r}   (1.13)

for all t, t + h ∈ [0, 1]. Then the conclusion of Theorem 1.6 holds.
(b) If Y is Gaussian and centered and

Var( Y(t + h) − Y(t) ) ≤ C / |log |h||^a   (1.14)

for some C > 0 and a > 3, the conclusion of Theorem 1.6 also holds.

Proof
(a) Set

ψ(h) = 1 / |log |h||^b,   φ(h) = K |h| / |log |h||^{1+r−bp},   with 1 < b < r/p.

Markov's inequality gives P(|Y(t + h) − Y(t)| ≥ ψ(h)) ≤ ψ(h)^{−p} E(|Y(t + h) − Y(t)|^p) ≤ φ(h),
and the two series of Theorem 1.6 are easily seen to converge for this choice.
(b) Set

ψ(h) = 1 / |log |h||^b with 1 < b < (a − 1)/2,   φ(h) = exp( −(1/(4C)) |log |h||^{a−2b} ).

Then

P( |Y(t + h) − Y(t)| ≥ ψ(h) ) = P( |ξ| ≥ ψ(h) / (Var(Y(t + h) − Y(t)))^{1/2} ),

where ξ stands for a standard normal variable. We use the following usual bound
for Gaussian tails, valid for u > 0:

P(|ξ| ≥ u) = 2 P(ξ ≥ u) = (2/π)^{1/2} ∫_u^∞ e^{−(1/2)x²} dx ≤ (2/π)^{1/2} (1/u) e^{−(1/2)u²}.

With the foregoing choice of ψ(·) and φ(·), if |h| is small enough, one has
ψ(h) / (Var(Y(t + h) − Y(t)))^{1/2} > 1 and

P( |Y(t + h) − Y(t)| ≥ ψ(h) ) ≤ (const) φ(h),

where (const) denotes a generic constant that may vary from line to line. On the
other hand, Σ_{n≥1} ψ(2^−n) < ∞ and Σ_{n≥1} 2^n φ(2^−n) < ∞ are easily verified.
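A quick numerical check of part (b): with the assumed values C = 1, a = 10, b = 2 (any a > 3 and 1 < b < (a − 1)/2 would do), both series of Theorem 1.6 have rapidly decaying terms.

```python
import math

C, a, b = 1.0, 10.0, 2.0                       # assumed sample values
psi = lambda h: 1.0 / abs(math.log(h))**b
phi = lambda h: math.exp(-abs(math.log(h))**(a - 2.0 * b) / (4.0 * C))

terms_phi = [2.0**n * phi(2.0**-n) for n in range(1, 40)]  # sum 2^n phi(2^-n)
terms_psi = [psi(2.0**-n) for n in range(1, 40)]           # sum psi(2^-n)
```

Since a − 2b > 1, the factor exp(−(n log 2)^{a−2b}/(4C)) eventually crushes 2^n, which is what makes the Borel–Cantelli series converge.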
Some Examples

1. Gaussian stationary processes. Let {Y(t) : t ∈ R} be a real-valued Gaussian centered stationary process with covariance Γ(τ) = E(Y(t) Y(t + τ)). Since Var(Y(t + τ) − Y(t)) = 2[Γ(0) − Γ(τ)], condition (1.14) is equivalent to

Γ(0) − Γ(τ) ≤ C / |log |τ||^a   (1.15)

for sufficiently small |τ|, with the same meaning for C and a.

2. Wiener process. Take T = R+. The function r(s, t) = s ∧ t is positive
semidefinite. In fact, if 0 ≤ s1 < ⋯ < sn and x1, ..., xn ∈ R, one has

Σ_{j,k=1}^n (sj ∧ sk) xj xk = ∫_0^∞ ( Σ_{k=1}^n xk 1I{u ≤ sk} )² du ≥ 0.

3. Itô integrals. Consider

Y(t) = ∫_0^t a_s dW(s).
The Riemann-type sums

S_Q = Σ_{j=0}^{m−1} a_{t_j} ( W(t_{j+1}) − W(t_j) )

converge in probability to Y(t) when N_Q = sup{(t_{j+1} − t_j) : 0 ≤ j ≤ m − 1}
tends to 0. Here Q denotes the partition 0 = t0 < t1 < ⋯ < tm = t of the interval
[0, t], and {a_t : t ≥ 0} is an adapted stochastic process, bounded by a constant C1.
The aim is to prove a bound of the form

E( S_Q^4 ) ≤ (const) h²,   (1.16)

where (const) does not depend on t, h, and Q, and then apply Fatou's lemma
when N_Q → 0.
Let us compute the left-hand side of (1.16). Set Δ_j = W(t_{j+1}) − W(t_j). We
have

E( S_Q^4 ) = Σ_{j1,j2,j3,j4=0}^{m−1} E( Π_{h=1}^4 a_{t_{jh}} Δ_{jh} ).   (1.17)

If one of the indices, say j4, is strictly larger than the other three, then

E( Π_{h=1}^4 a_{t_{jh}} Δ_{jh} ) = E[ E( Π_{h=1}^4 a_{t_{jh}} Δ_{jh} | F_{t_{j4}} ) ]
= E[ Π_{h=1}^3 ( a_{t_{jh}} Δ_{jh} ) a_{t_{j4}} E( Δ_{j4} | F_{t_{j4}} ) ] = 0,

since E(Δ_{j4} | F_{t_{j4}}) = E(Δ_{j4}) = 0 and Π_{h=1}^3 ( a_{t_{jh}} Δ_{jh} ) a_{t_{j4}} is
F_{t_{j4}}-measurable. The same argument shows that the terms of the form
E( a³_{t_j} Δ³_j a_{t_{j4}} Δ_{j4} ) vanish: if j4 > j, condition on F_{t_{j4}} as before, and
if j > j4, condition on F_{t_j} and use

E( Δ³_j | F_{t_j} ) = E( Δ³_j ) = 0.

So only two kinds of terms remain. First, the fourth powers: since |a_t| ≤ C1 and
E( Δ_j^4 ) = 3 (t_{j+1} − t_j)²,

Σ_{j=0}^{m−1} E( a^4_{t_j} Δ^4_j ) ≤ C1^4 Σ_{j=0}^{m−1} 3 (t_{j+1} − t_j)² ≤ 3 C1^4 h².

Second, the terms in which the largest index appears exactly twice; they carry
multiplicity 6:

6 Σ_{j3=1}^{m−1} Σ_{0≤j1,j2<j3} E( a_{t_{j1}} Δ_{j1} a_{t_{j2}} Δ_{j2} a²_{t_{j3}} Δ²_{j3} )
= 6 Σ_{j3=1}^{m−1} (t_{j3+1} − t_{j3}) E( a²_{t_{j3}} ( Σ_{j=0}^{j3−1} a_{t_j} Δ_j )² )
≤ 6 C1² Σ_{j3=1}^{m−1} (t_{j3+1} − t_{j3}) Σ_{j=0}^{j3−1} E( a²_{t_j} ) (t_{j+1} − t_j) ≤ 6 C1^4 h²,

where we used E( Δ²_{j3} | F_{t_{j3}} ) = t_{j3+1} − t_{j3} and the F_{t_{j3}}-measurability
of a²_{t_{j3}} ( Σ_{j<j3} a_{t_j} Δ_j )².
Using (1.17), one obtains (1.16), and hence the existence of a version of the Itô
integral possessing continuous paths.
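The bookkeeping of surviving terms can be checked in the simplest case a_t ≡ 1, where S_Q = W(t) and E(S_Q⁴) = 3t² exactly; the partition below is arbitrary.

```python
deltas = [0.1, 0.25, 0.05, 0.3, 0.3]   # arbitrary partition increments of [0, t]
t = sum(deltas)

# Fourth-power terms: E(Delta_j^4) = 3 (t_{j+1} - t_j)^2.
fourth = sum(3.0 * d * d for d in deltas)
# Squared-pair terms (multiplicity 6): in E(Delta_{j1} Delta_{j2} Delta_{j3}^2)
# only j1 = j2 = j < j3 survives, with E(Delta_j^2) = delta_j.
pairs = 6.0 * sum(deltas[j3] * sum(deltas[:j3]) for j3 in range(1, len(deltas)))

total = fourth + pairs                 # should equal E(W(t)^4) = 3 t^2
```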
sup_{t∈V∩D} X(t)(ω) = sup_{t∈V} X(t)(ω),   inf_{t∈V∩D} X(t)(ω) = inf_{t∈V} X(t)(ω),

and

sup_{t∈J∩D} X(t)(ω) = sup_{t∈J} X(t)(ω),   inf_{t∈J∩D} X(t)(ω) = inf_{t∈J} X(t)(ω),

since t → X(t)(ω) is continuous.
In a similar way, one proves that Y (s)() X(s)().
The separability condition is usually met when the paths have some minimal
regularity (see Exercise 1.7). For example, if {X(t) : t ∈ R} is a real-valued
process having a.s. càdlàg paths (i.e., paths that are right-continuous with left
limits), it is separable. All processes considered in the sequel are separable.

Some Additional Remarks and References. A reference for Kolmogorov's
extension theorem and the regularity of paths, at the level of generality we have
considered here, is the book by Cramér and Leadbetter (1967), where the reader
can find proofs that we have skipped as well as related results, examples, and
details. For d-parameter Gaussian processes, a subject that we consider in more
detail in Chapter 6, necessary and sufficient conditions for continuous paths are
due to Fernique (see his 1974 St. Flour lecture notes) in the stationary case and
to Talagrand (1987) in the general nonstationary case. In the Gaussian
stationary case, Belyaev (1961) has shown that either with probability 1 the paths
are continuous, or with probability 1 the supremum (respectively, the infimum)
on every interval is +∞ (respectively, −∞). General references on Gaussian
processes are the books by Adler (1990) and Lifshits (1995).
In this section we state some results, without detailed proofs; they follow the
lines of the preceding section.

Theorem 1.10. Let Y = {Y(t) : t ∈ [0, 1]} be a real-valued stochastic process
that satisfies the hypotheses of Theorem 1.6 and, additionally, for any triplet
t − h, t, t + h ∈ [0, 1], one has

P( |Y(t + h) + Y(t − h) − 2Y(t)| ≥ ψ1(h) ) ≤ φ1(h),

where ψ1 and φ1 are two even functions, increasing for h > 0 and such that

Σ_{n=1}^∞ 2^n ψ1(2^−n) < ∞,   Σ_{n=1}^∞ 2^n φ1(2^−n) < ∞.

Then there exists a version X = {X(t) : t ∈ [0, 1]} of the process Y such that almost
surely the paths of X are of class C¹.
Sketch of the Proof. Consider the sequence {Y(n)(t) : t ∈ [0, 1]}, n = 1, 2, ..., of
polygonal processes introduced in the proof of Theorem 1.6. We know that
a.s. this sequence converges uniformly to X = {X(t) : t ∈ [0, 1]}, a continuous
version of Y. Define:

Y(n)′(t) := Y(n)′(t−)   for 0 < t ≤ 1,
Y(n)′(0) := Y(n)′(0+)   (right derivative).

In particular, for a Gaussian stationary process, it suffices that

... ≤ C2 / |log |τ||^a,

with C2 > 0, a > 3. Then there exists a version of Y with paths of class C¹. For
the proof, apply Theorem 1.10.
A related result is the following; the proof is left to the reader.

Proposition 1.11 (Hölder Conditions). Assume that

E( |Y(t + h) − Y(t)|^p ) ≤ K |h|^{1+r}.   (1.18)
(1/(hk)) [ r(t + h, t + k) − r(t, t + k) − r(t, t + h) + r(t, t) ] → r11(t, t)   as (k, h) → (0, 0),

where we write r01(s, t) = ∂r/∂t (s, t) and r11(s, t) = ∂²r/∂s∂t (s, t); one checks
similarly that the covariance of X′ = {X′(t) : t ∈ R} is r11(s, t). Now let X
be a Gaussian process and X′ its derivative in quadratic mean. If this satisfies,
for example, the criterion in Corollary 1.7(b), it admits a continuous version
Y′ = {Y′(t) : t ∈ R}. Set

Y(t) := X(0) + ∫_0^t Y′(s) ds.

Clearly, Y has C¹ paths and E(X(s) Y(s)) = r(s, 0) + ∫_0^s r01(s, t) dt = r(s, s).
In the same way, E(Y(s)²) = r(s, s), so that E([X(s) − Y(s)]²) = 0. As a
consequence, X admits a version with C¹ paths.
Using this construction inductively, one can prove the following: let X be a
Gaussian process with mean of class C^k and covariance of class C^{2k}, and such
that its kth derivative in quadratic mean satisfies the weak condition of Corollary
1.7(b). Then X admits a version with paths of class C^k.
If X is a Gaussian process with mean of class C^∞ and covariance of class
C^∞, then X admits a version with paths of class C^∞.
E( ( (X(h) − X(0)) / h )² ) = 2 ∫ (1 − cos hx)/h² μ(dx),

which remains bounded as h → 0 if and only if the second spectral moment
∫ x² μ(dx) is finite, that is, if and only if

lim sup_{h→0} E( ( (X(h) − X(0)) / h )² ) < ∞.
In this section we consider the case when the parameter of the process lies in Rd
or, more generally, in some general metric space. We begin with an extension of
Theorem 1.6.
Theorem 1.14. Let Y = {Y(t) : t ∈ [0, 1]^d} be a real-valued random field that
satisfies the condition:

(Kd) For each pair t, t + h ∈ [0, 1]^d,

P( |Y(t + h) − Y(t)| ≥ ψ(|h|) ) ≤ φ(|h|),

where ψ and φ are even functions, increasing for h > 0 and such that

Σ_{n=1}^∞ ψ(2^−n) < ∞,   Σ_{n=1}^∞ 2^{nd} φ(2^−n) < ∞.

Then there exists a version X = {X(t) : t ∈ [0, 1]^d} of the process Y such
that the paths t → X(t) are continuous on [0, 1]^d.
Proof. The main change with respect to the proof of Theorem 1.6 is that
we replace the polygonal approximation, adapted to one-variable functions, by
another interpolating procedure. Denote by Dn the set of dyadic points of order
n in [0, 1]^d; that is,

Dn = { t = (t1, ..., td) : ti = ki/2^n, ki integers, 0 ≤ ki ≤ 2^n, i = 1, ..., d }.

To a function f one associates interpolating functions f(n) with the following
properties:

• f(n) is continuous.
• f(n)(t) = f(t) for all t ∈ Dn.
• ‖f(n+1) − f(n)‖ = max_{t∈Dn+1\Dn} |f(t) − f(n)(t)|, where ‖·‖ denotes the
sup-norm on [0, 1]^d.

A way to define f(n) is the following. Let us consider a cube C_{t,n} of the
nth-order partition of [0, 1]^d; that is,

C_{t,n} = t + [0, 1/2^n]^d,

where t ∈ Dn, with the obvious notation for the sum. For each vertex τ of C_{t,n}, set
f(n)(τ) = f(τ).
Now, for each permutation π of {1, 2, ..., d}, let Sπ be the simplex

Sπ = { t + s : s = (s1, ..., sd), 0 ≤ s_{π(1)} ≤ ⋯ ≤ s_{π(d)} ≤ 1/2^n }.

It is clear that C_{t,n} is the union of the Sπ over all permutations π. Extend f(n)
to each Sπ, in a unique way, as an affine function. It is then easy to verify the
aforementioned properties and that

‖f(n+1) − f(n)‖ ≤ sup_{s,t∈Dn+1, |t−s|=2^{−(n+1)}} |f(s) − f(t)|.
The analogue of Corollary 1.7 holds with

E( |Y(t + h) − Y(t)|^p ) ≤ Kd |h|^d / |log |h||^{1+r}   (1.19)

in case (a) and, in the Gaussian case,

Var( Y(t + h) − Y(t) ) ≤ C / |log |h||^a   (1.20)

in case (b).
Define the canonical distance d(s, t) = ( E(X(t) − X(s))² )^{1/2} and denote by
N(ε) the minimal number of d-balls of radius ε needed to cover T. The sufficient
condition for the existence of a version with continuous paths is the convergence
of the entropy integral:

∫_0 ( log N(ε) )^{1/2} dε < ∞.
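The entropy integral can be evaluated numerically in the model case of the Wiener process on [0, 1], where d(s, t) = √|t − s|, so that a d-ball of radius ε is an interval of length 2ε² and N(ε) ≤ ⌈1/(2ε²)⌉. The grid below is an illustrative quadrature choice.

```python
import math

def N(eps):
    # covering-number bound for [0,1] under d(s,t) = sqrt(|t-s|)
    return max(1, math.ceil(1.0 / (2.0 * eps * eps)))

# midpoint rule on a geometric grid; the integrand vanishes once N(eps) = 1
total, lo = 0.0, 1e-8
grid = [lo * (1.0 / lo) ** (k / 2000.0) for k in range(2001)]
for left, right in zip(grid, grid[1:]):
    mid = 0.5 * (left + right)
    if N(mid) > 1:
        total += (right - left) * math.sqrt(math.log(N(mid)))
```

The integrand behaves like √(2 log(1/ε)) near 0, which is integrable — consistent with the continuity of Brownian paths.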
This condition can be compared with Kolmogorov's theorem. The reader can
check that Theorem 1.19 permits us to weaken the condition of Corollary 1.7(b)
to a > 1. On the other hand, one can construct counterexamples (i.e., processes
not having continuous paths) such that (1.14) holds true with a = 1. This shows
that the condition of Corollary 1.7(b) is nearly optimal and sufficient for most
applications. When the Gaussian process is no longer stationary, M. Talagrand
has given necessary and sufficient conditions for sample path continuity in terms
of the existence of majorizing measures (see Talagrand, 1987).
The problem of differentiability can be addressed in the same manner as for
d = 1. A sufficient condition for a Gaussian process to have a version with C k
sample paths is for its mean to be C k , its covariance C 2k , and its kth derivative
in quadratic mean to satisfy some of the criteria of continuity above.
In this section we give two classical results that are used several times in the book.
The first gives a simple sufficient condition for a one-parameter random process
not to have a.s. critical points at a certain specified level. The second result
states that under mild conditions, a Gaussian process defined on a quite general
parameter set with probability 1 does not have local extrema at a given level.
We will use systematically the following notation: If is a random variable with
values in Rd and its distribution has a density with respect to Lebesgue measure,
this density is denoted as
p (x)
x Rd .
Let ε > 0 be given; choose δ > 0 so that P(E_δ) < ε, and m so that ℓ/m < δ
and [u − ℓ/m, u + ℓ/m] ⊂ V. We have

P( T_u^X ∩ J ≠ ∅ ) ≤ P(E_δ) + Σ_{j=0}^{m−1} P( { T_u^X ∩ [t_j, t_{j+1}] ≠ ∅ } ∩ E_δ^C )
≤ ε + Σ_{j=0}^{m−1} P( |X(t_j) − u| ≤ δ ℓ/m )
= ε + Σ_{j=0}^{m−1} ∫_{|x−u| ≤ δℓ/m} p_{X(t_j)}(x) dx.
for every t T .
E(X(t)) 0
p_{M^X}(u) ≤ ψ(u)   for every u ∈ R,   (1.21)

where ψ is expressed by means of the Gaussian integral ∫ exp(−v²/2) dv.
Denote by gn the density of Mn = max_{1≤k≤n} X(tk). One has

gn(x) = Σ_{k=1}^n (2π)^{−1/2} e^{−(1/2)(x−m(tk))²} P( X(tj) < x, j = 1, ..., n; j ≠ k | X(tk) = x )
= φ(x) Gn(x),   (1.22)

with

Gn(x) = Σ_{k=1}^n e^{m(tk)x − (1/2)m(tk)²} P( X(tj) < x, j = 1, ..., n; j ≠ k | X(tk) = x ).

Set

Yj = X(tj) − m(tj),   j = 1, ..., n,   and   cjk = E(Yj Yk),

so that the random variables Yj − cjk Yk and Yk are independent. Then the conditional
probability becomes

P( Yj − cjk Yk < x − m(tj) − cjk (x − m(tk)), j = 1, ..., n; j ≠ k ).
Since ∫_{−∞}^{+∞} gn(x) dx ≤ 1 and Gn is nondecreasing,

Gn(b) ∫_b^{+∞} φ(x) dx ≤ ∫_b^{+∞} Gn(x) φ(x) dx = ∫_b^{+∞} gn(x) dx ≤ 1,

so that

P{a < Mn ≤ b} = ∫_a^b gn(x) dx = ∫_a^b Gn(x) φ(x) dx ≤ Gn(b) ∫_a^b φ(x) dx
≤ ( ∫_a^b φ(x) dx ) / ( ∫_b^{+∞} φ(x) dx ).
Consider now the process X(t) = (Z(t) − a)/σ(t), which satisfies Var X(t) = 1;
its mean is (m(t) − a)/σ(t) and

| (m(t) − a)/σ(t) | ≤ ( |m̄| + |a| )/σ0 ≤ ( |m̄| + A )/σ0.

Set

μ1 = ( |m̄| + A )/σ0,   μ2 = ( |m̄| + A )/σ0 + (b − a)/σ0.

It follows that

P( a < M_Z ≤ b ) ≤ ∫_{μ1}^{μ2} ψ(u) du = ∫_a^b (1/σ0) ψ( (v − a + |m̄| + A)/σ0 ) dv.
Theorem 1.21 follows directly from Theorem 1.22, since under the hypotheses
of Theorem 1.21, we can write
{EuX = }
EXERCISES
1.1. Let T = N be the set of natural numbers. Prove that the following sets
belong to σ(C).
(a) c0 (the set of real-valued sequences {an} such that an → 0). Suggestion:
note that

c0 = ∩_{k=1}^∞ ∪_{m=1}^∞ ∩_{n≥m} { |an| < 1/k }.

(b) ℓ² (the set of real-valued sequences {an} such that Σ_n |an|² < ∞).
(c) The set of real-valued sequences {an} such that lim_{n→∞} an ≤ 1.
1.2. Take T = R, T = B_R, and consider, for each ω, the function t → X(t, ω).
Define

X(n)(t, ω) = Σ_{k=−∞}^{+∞} X_{k/2^n}(ω) 1I{k/2^n ≤ t < (k+1)/2^n}.   (1.23)
its Fourier transform

μ̂(t) = ∫ exp(itx) μ(dx).

Prove that μ̂(·) is of class C^k and that, for the spectral moments λ2p,

μ̂(t) = 1 − λ2 t²/2! + λ4 t⁴/4! − ⋯ + (−1)^p λ_{2p} t^{2p}/(2p)! + o(t^{2p}).

Assume moreover that ∫ x^k μ(dx) ≤ A. Prove the bound

(k!/|t|^k) | μ̂(t) − ( 1 − λ2 t²/2! + ⋯ + (−1)^{(k−2)/2} λ_{k−2} t^{k−2}/(k−2)! ) | ≤ A.
(b) r(t, t) = Σ_{n=1}^∞ cn φn(t)².
(c) {φn}n=1,2,... are eigenfunctions—with eigenvalues {cn}n=1,2,...,
respectively—of the linear operator A : H → H defined by

(Af)(s) = ∫_T r(s, t) f(t) dt.
1.7. Let {X(t) : t ∈ T} be a stochastic process defined on some separable topological space T.
(a) Prove that if X(t) has continuous paths, it is separable.
(b) Let T = R. Prove that if the paths of X(t) are càdlàg, X(t) is separable.
1.8. Let {X(t) : t ∈ R^d} be a separable stochastic process defined on some
(complete) probability space (Ω, A, P).
(a) Prove that the subset {ω : X(·, ω) is continuous} is in A.
(b) Prove that the conclusion in part (a) remains valid if one replaces "continuous" by "upper semicontinuous", "lower semicontinuous", or "continuous on
the right" [a real-valued function f defined on R^d is said to be continuous on the right if for each t, f(t) is equal to the limit of f(s) when
each coordinate of s tends to the corresponding coordinate of t on its
right].
1.9. Show that in the case of the Wiener process, condition (1.18) holds for
every p ≥ 2, with r = p/2 − 1. Hence, the proposition implies that a.s.
the paths of the Wiener process satisfy a Hölder condition with exponent
α, for every α < 1/2.
1.10. (Wiener integral) Let {W1(t) : t ≥ 0} and {W2(t) : t ≥ 0} be two independent Wiener processes defined on some probability space (Ω, A, P), and
denote by {W(t) : t ∈ R} the process defined as

W(t) = W1(t) if t ≥ 0   and   W(t) = W2(−t) if t ≤ 0.

L²(R, λ) denotes the standard L² space of real-valued measurable
functions on the real line with respect to Lebesgue measure, and
L²(Ω, A, P) the L² space of the probability space. C¹_K(R) denotes the subspace
of L²(R, λ) of C¹ functions with compact support. Define the function
I : C¹_K(R) → L²(Ω, A, P) as

I(f) = −∫_R f′(t) W(t) dt   (1.24)

for each nonrandom f ∈ C¹_K(R). Equation (1.24) is well defined for each ω,
since the integrand is a continuous function with compact support.
(c) Prove that the stochastic process {C_H^{−1/2} I(K_t) : t ≥ 0} has a version
with continuous paths. This normalized version with continuous paths
is usually called fractional Brownian motion with Hurst exponent
H and is denoted {W_H(t) : t ≥ 0}.
(d) Show that if H = 1/2, then {W_H(t) : t ≥ 0} is the standard Wiener process.
(e) Prove that for any δ > 0, almost surely the paths of the fractional
Brownian motion with Hurst exponent H satisfy a Hölder condition
with exponent H − δ.
1.12. (Local time) Let {W(t) : t ≥ 0} be a Wiener process defined on a probability
space (Ω, A, P). For u ∈ R, an interval I ⊂ [0, +∞), and δ > 0, define

ℓδ(u, I) = (1/(2δ)) ∫_I 1I{|W(t)−u|<δ} dt = (1/(2δ)) λ( { t ∈ I : |W(t) − u| < δ } ).
Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes
Jean-Marc Azaïs (Université de Toulouse). Joint work with Alan Genz (Washington State University), Cécile Mercadier (Lyon, France), and Mario Wschebor.
Introduction
Testing

The maximum of the absolute value of the series is 3.0224. An estimation
of the covariance with WAFO gives:

(figure: estimated covariance function)
The quantity of interest is the Gaussian integral

I := ∫_{l1}^{u1} ⋯ ∫_{ln}^{un} φΣ(x) dx,   (1)

where φΣ is the centered Gaussian density with covariance Σ. After a Cholesky
factorization of Σ with factor T, for example for n = 2,

I := ∫_{l1/T11}^{u1/T11} φ(z1) dz1 ∫_{(l2 − T12 z1)/T22}^{(u2 − T12 z1)/T22} φ(z2) dz2.   (2)
A further change of variables zi = Φ^{−1}(ti) maps the domain of integration
toward the unit cube:

I := ∫_{Φ(l1/T11)}^{Φ(u1/T11)} dt1 ∫_{Φ((l2 − T12 Φ^{−1}(t1))/T22)}^{Φ((u2 − T12 Φ^{−1}(t1))/T22)} dt2,   (3)

and, after rescaling each coordinate to [0, 1],

I = ∫_{[0,1]^n} h(t) dt.   (4)
QMC

In the form (4) the MC evaluation is based on

Î = (1/M) Σ_{i=1}^M h(ti).
Theorem (Nuyens and Cools, 2006). Assume that h is the tensor product of
periodic functions that belong to a Korobov space (an RKHS). Then the
minimax sequence and the worst-case error can be calculated by a
polynomial algorithm. Numerical results show that the convergence is
roughly O(M^{−1}).

This result concerns the worst case, so it is not so relevant here.
A meta theorem
MCQMC

Let (ti, i = 1, ..., M) be the lattice sequence. The estimation of the integral
can be made random but exactly unbiased by setting

Î = (1/M) Σ_{i=1}^M h( {ti + U} ),

where U is uniformly distributed on [0, 1]^n and {·} denotes the componentwise
fractional part (Cranley–Patterson rotation).
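A minimal sketch of such a randomly shifted lattice rule. The Fibonacci lattice with M = 144 and generator (1, 89), the test function h(x, y) = xy, and the shifts are illustrative choices, not from the slides.

```python
M, z1g, z2g = 144, 1, 89                   # Fibonacci lattice in dimension 2

def estimate(h, u1, u2):
    """Shifted lattice rule: average of h at ({i*z1/M + u1}, {i*z2/M + u2})."""
    s = 0.0
    for i in range(M):
        t1 = (i * z1g / M + u1) % 1.0      # fractional part
        t2 = (i * z2g / M + u2) % 1.0
        s += h(t1, t2)
    return s / M

h = lambda x, y: x * y                     # integral over [0,1]^2 equals 1/4
est_a = estimate(h, 0.0, 0.0)
est_b = estimate(h, 0.3, 0.7)
```

Averaging over a uniform shift U makes the estimator exactly unbiased, while the lattice structure keeps the variance far below plain Monte Carlo for smooth h.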
Do processes exist?

In this part X(t) is a Gaussian process defined on a compact interval [0, T].

Since such a process is always observed at a finite set of times, and since the
previous methods work with, say, n = 1000, is it relevant to consider the
continuous case?

Answer: yes. Random processes occur as limit statistics. Consider for
example the simple mixture model

H0 : Y ~ N(0, 1)
H1 : Y ~ p N(0, 1) + (1 − p) N(μ, 1),   p ∈ [0, 1], μ ∈ R.   (5)
The limiting Gaussian process in this testing problem has covariance

r(s, t) = (e^{st} − 1) / ( (e^{s²} − 1)(e^{t²} − 1) )^{1/2}.   (6)
An example
Extensions

Treat all the cases: maximum of the absolute value, non-centered,
non-stationary. In each case some tricks have to be used.

A great challenge is to use such formulas for fields.
References
THANK-YOU
MERCI
GRACIAS
URL: http://www.emath.fr/ps/
Abstract. In this article we use Rice's method (Rice, 1944–1945) to obtain upper
and lower bounds for the distribution function of the maximum of a regular stationary
Gaussian process. We derive simplified expressions for the first two terms of the
Rice series (Miroshin, 1974; Azaïs and Wschebor, 1997), which suffice for the desired
bounds. Our main contribution is a simpler form of the second factorial moment of
the number of upcrossings, which is, in a certain sense, a generalization of the formula
of Steinberg et al. (Cramér and Leadbetter, 1967, p. 212). We then present a numerical
application and asymptotic expansions that give a new interpretation of a result of
Piterbarg (1981).
AMS Subject Classification. 60Exx, 60Gxx, 60G10, 60G15, 60G70, 62E17, 65U05.
Received June 4, 1998. Revised June 8, 1999.
1. Introduction
1.1. Framework
Many statistical models involve nuisance parameters. This is the case for example for mixture models [10],
gene detection models [5,6], projection pursuit [20]. In such models, the distributions of test statistics are those
of the maximum of stochastic Gaussian processes (or their squares). Dacunha-Castelle and Gassiat [8] give for
example a theory for the so-called locally conic models.
Thus, the calculation of threshold or power of such tests leads to the calculation of the distribution of the
maximum of Gaussian processes. This problem is largely unsolved [2].
Keywords and phrases: Asymptotic expansions, extreme values, stationary Gaussian process, Rice series, upcrossings.
This paper is dedicated to Mario Wschebor on the occasion of his 60th birthday.
Miroshin [13] expressed the distribution function of this maximum as the sum of a series, the so-called Rice
series. Recently, Azaïs and Wschebor [3, 4] proved the convergence of this series under certain conditions and
proposed a method giving the exact distribution of the maximum for a class of processes including smooth
stationary Gaussian processes with real parameter.
The formula given by the Rice series is rather complicated, involving multiple integrals with complex expressions. Fortunately, for some processes, the convergence is very fast, so the present paper studies the bounds
given by the first two terms, which are in some cases sufficient for applications.
We give identities that yield simpler expressions of these terms in the case of stationary processes. Generalization to other processes is possible using our techniques but will not be detailed, for shortness and simplicity.
For other processes, the calculation of more than two terms of the Rice series is necessary. In such a case,
the identities contained in this paper (and other similar ones) give a list of numerical tricks used by a program under
construction by Croquette.
We then use Maple to derive asymptotic expansions of some terms involved in these bounds. Our bounds
are shown to be sharp, and our expansions are made for a fixed time interval and a level tending to infinity.
Other approaches can be found in the literature [12]. For example, Kratz and Rootzén [11] propose asymptotic
expansions for a size of time interval and a level tending jointly to infinity.
We consider a real-valued centred stationary Gaussian process with continuous paths X = {Xt ; t ∈ [0, T] ⊂ R}.
We are interested in the random variables

X* = sup_{t∈[0,T]} Xt   or   X** = sup_{t∈[0,T]} |Xt|.

For shortness and simplicity, we will focus attention on the variable X*; the necessary modifications for adapting
our method to X** are easy to establish [5].
We denote by dF(λ) the spectral measure of the process X and by λp the spectral moment of order p when it
exists. The spectral measure is supposed to have a finite second moment and a continuous component. This
implies ([7], p. 203) that the process is differentiable in quadratic mean and that for all pairwise different time
points t1, ..., tn in [0, T], the joint distribution of Xt1, ..., Xtn, X′t1, ..., X′tn is nondegenerate.
For simplicity, we will assume moreover that the process admits C¹ sample paths. We will denote by r(·) the
covariance function of X and, without loss of generality, we will suppose that λ0 = r(0) = 1.
Let u be a real number; the number of upcrossings of the level u by X, denoted by Uu, is defined as

Uu = #{ t ∈ [0, T] : Xt = u, X′t > 0 }.
For k ∈ N*, we denote by νk(u, T) the factorial moment of order k of Uu and by ν̃k(u, T) the factorial moment of
order k of Uu 1I{X0≤u}. We also define ν̄k(u, T) = νk(u, T) − ν̃k(u, T). These factorial moments can be calculated
by Rice formulae. For example:

ν1(u, T) = E(Uu) = (T √λ2 / 2π) e^{−u²/2},

ν2(u, T) = ∫_0^T ∫_0^T Ast(u) ds dt

with Ast(u) = E( (X′s)⁺ (X′t)⁺ | Xs = Xt = u ) ps,t(u, u), where (X′)⁺ is the positive part of X′ and ps,t the
joint density of (Xs, Xt).
These two formulae are proved to hold under our hypotheses ([7], p. 204). See also Wschebor [21],
Chapter 3, for the case of more general processes.
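As a sanity check of the first Rice formula, its constants can be evaluated for the process used in Section 3 below, Γ(t) = exp(−t²/2), for which λ2 = −Γ″(0) = 1.

```python
import math

gamma = lambda t: math.exp(-t * t / 2.0)

# second spectral moment via a central second difference: lambda_2 = -Gamma''(0)
h = 1e-4
lam2 = -(gamma(h) - 2.0 * gamma(0.0) + gamma(-h)) / (h * h)

def nu1(u, T):
    """E(U_u) = T sqrt(lambda_2)/(2 pi) exp(-u^2/2)."""
    return T * math.sqrt(lam2) / (2.0 * math.pi) * math.exp(-u * u / 2.0)
```

Note that ν1(0, 1) ≈ 0.1592 dominates the value ν̃1 = 0.15272 appearing in the table of Section 3, as it must, since ν̃1 ≤ ν1.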
We will denote by φ the density of the standard Gaussian distribution. In order to have simpler expressions
of rather complicated formulae, we will use the following three functions: φ(x) = (2π)^{−1/2} e^{−x²/2},

Φ(x) = ∫_{−∞}^x φ(y) dy   and   Φ̄(x) = 1 − Φ(x).

For a nonnegative integer-valued random variable ξ, one has the classical bounds

E(ξ) − (1/2) E(ξ(ξ − 1)) ≤ P(ξ > 0) ≤ E(ξ).
Noting that P-almost surely {X* > u} = {X0 > u} ∪ {X0 ≤ u, Uu > 0} and that E( Uu(Uu − 1) 1I{X0≤u} ) ≤ ν2,
we get:

P(X0 > u) + ν̃1(u, T) − ν2(u, T)/2 ≤ P(X* > u) ≤ P(X0 > u) + ν̃1(u, T).   (1.1)

Here ν̃1 can be computed by a Rice-type formula:

ν̃1(u, T) = ∫_0^T dt ∫_{−∞}^u dx E( (X′t)⁺ | X0 = x, Xt = u ) p_{X0,Xt}(x, u).

More generally, the Rice series gives

P(X* > u) = P(X0 > u) + Σ_{m=1}^∞ (−1)^{m+1} ν̃m(u, T)/m!,   (1.2)

and truncating the series yields bounds; in particular,

P(X0 > u) + ν̃1(u, T) − ν̃2(u, T)/2 ≤ P(X* > u) ≤ P(X0 > u) + ν̃1(u, T).   (1.3)

Since ν2(u, T) ≥ ν̃2(u, T), we see that, except for this last modification, which gives a simpler expression, Main
inequality (1.1) is relation (1.3) with n = 1.
Remark 1.1. In order to calculate these bounds, we are interested in the quantity ν̃1(u, T). For asymptotic
calculations, and to compare our results with Piterbarg's, we will also consider the quantity ν̄k(u, T). From
a numerical point of view, ν̃k(u, T) and ν̄k(u, T) are worth distinguishing because they are not of the same
order of magnitude as u → +∞. In the following sections we will work with ν̃1(u, T).
2. Some identities

First let us introduce some notation that will be used in the rest of the paper. We set:

μ(t) = E( X′0 | X0 = Xt = u ) = −r′(t) u / (1 + r(t)),

σ²(t) = Var( X′0 | X0 = Xt = u ) = λ2 − r′²(t) / (1 − r²(t)),

ρ(t) = Cor( X′0, X′t | X0 = Xt = u ) = ( −r″(t)(1 − r²(t)) − r(t) r′²(t) ) / ( λ2 (1 − r²(t)) − r′²(t) ).
Note that, since the spectrum of the process X admits a continuous component, |ρ(t)| ≠ 1.
In the sequel the variable t will be omitted when it is not confusing, and we will write r, r′, μ, σ, ρ, k, b instead
of r(t), r′(t), μ(t), σ(t), ρ(t), k(t), b(t).
Proposition 2.1. (i) If (X, Y) has a centred normal bivariate distribution with covariance matrix

( 1   ρ )
( ρ   1 )

then for all a ∈ R+:

P(X > a, Y > a) = (1/π) arctan( ((1 + ρ)/(1 − ρ))^{1/2} ) − 2 ∫_0^a φ(x) Φ̄( ((1 − ρ)/(1 + ρ))^{1/2} x ) dx.
2 (T t)
(iii) 2 (u, T ) =
0
1 r 2
u
1+r
1
2
1 r2 (t)
1+
x
1
1r
r
u (b)
1+r
1 r2
u
1 + r(t)
dx
dt
with:
T1 (t) = 2 (t)
(2.1)
(2.2)
b(t)
(2.3)
1
arctan (k(t)) 2
b(t)
(k(t) x) (x) dx .
0
(2.4)
Remark 2.2.
1. Formula (i) is analogous to formula (2.10.4) given in Cramér and Leadbetter [7],
p. 27:

P(X > a, Y > a) = Φ̄(a)² + ∫_0^ρ (1/(2π (1 − z²)^{1/2})) exp( −a²/(1 + z) ) dz.

Our formula is easier to prove and is better adapted to numerical application because, when t → 0,
ρ(t) → 1 and the integrand in Cramér and Leadbetter's formula tends to infinity.
2. Utility of these formulae: they permit the computation of Main inequality (1.1) at the cost of a
double integral with finite bounds. This is a notable reduction of complexity with respect to the original
form. The form (2.4) is better adapted to effective computation because it involves an integral over a
bounded interval; this method has been implemented in an S+ program that needs about one second of
CPU time to run an example. It has been applied to a genetical problem in Cierco and Azaïs [6].
The form (iii) has some consequences both for numerical and theoretical purposes. The calculation of ν2(u, T)
presents some numerical difficulties around t = 0, where the sum of the three terms is infinitely small with
respect to each term. To remove the diagonal from the computation, we use formula (iii) and Maple to
calculate an equivalent of the integrand in the neighbourhood of t = 0 at fixed u.
Recall that ν2(u, T) = ∫_0^T ∫_0^T Ast(u) ds dt. The following proposition gives the Taylor expansion
of At(u) := A0t(u) at zero.

Proposition 2.3.

At(u) = ( (λ2 λ6 − λ4²) u⁴ / ( 1296 (λ4 − λ2²)^{1/2} 2π λ2² ) ) exp( −λ4 u² / (2(λ4 − λ2²)) ) t⁴ + O(t⁵).
Piterbarg [17] or Wschebor [21] proved that At(u) = O( Φ̄(u(1 + δ)) ) for some δ > 0. Our result is more precise.
Our formulae give asymptotic expansions as u → +∞ for ν̃1(u, T) and ν2(u, T) for small T.
Proposition 2.4. Assume that λ8 is finite. Then there exists a value T0 such that, for every T < T0,

ν̄1(u, T) = ( 27 (λ4 − λ2²)^{11/2} / ( 4 λ2^5 (λ2 λ6 − λ4²)^{3/2} ) ) ( φ( u (λ4/(λ4 − λ2²))^{1/2} ) / u⁶ ) ( 1 + O(1/u) ),

ν̄2(u, T) = ( 3 √3 T (λ4 − λ2²)^{9/2} / ( λ2^{9/2} (λ2 λ6 − λ4²) ) ) ( φ( u (λ4/(λ4 − λ2²))^{1/2} ) / u⁵ ) ( 1 + O(1/u) )

as u → +∞.
3. A numerical example
In the following example, we show how the upper and lower bounds (1.1) make it possible to evaluate the distribution of the maximum with an error less than 10⁻⁴.
We consider the centred stationary Gaussian process with covariance r(t) := exp(−t²/2) on the interval I = [0, 1], and the levels u = −3, −2.5, ..., 3. The term P(X0 ≤ u) is evaluated by the S-PLUS function pnorm, and ν̄1 and ν̄2 using Proposition 2.1 and Simpson's method. Though it is rather difficult to assess the exact precision of these evaluations, it is clear that it is considerably smaller than 10⁻⁴. So, the main source of error is due to the difference between the upper and lower bounds in (1.1).
  u     P(X0 ≤ u)   ν̄1        ν̄2        lower bound  upper bound
 −3     0.00135    0.00121   0          0.00014     0.00014
 −2.5   0.00621    0.00518   0          0.00103     0.00103
 −2     0.02275    0.01719   0          0.00556     0.00556
 −1.5   0.06681    0.04396   0.00001    0.02285     0.02285
 −1     0.15866    0.08652   0.00002    0.07213     0.07214
 −0.5   0.30854    0.13101   0.00004    0.17753     0.17755
  0     0.50000    0.15272   0.00005    0.34728     0.34731
  0.5   0.69146    0.13731   0.00004    0.55415     0.55417
  1     0.84134    0.09544   0.00002    0.74591     0.74592
  1.5   0.93319    0.05140   0.00001    0.88179     0.88180
  2     0.97725    0.02149   0          0.95576     0.95576
  2.5   0.99379    0.00699   0          0.98680     0.98680
  3     0.99865    0.00177   0          0.99688     0.99688
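For the process of this example, λ2 = −r″(0) = 1, so Rice's formula gives E(U_u) = (T/2π) e^{−u²/2} on [0, T]. The Monte Carlo sketch below (a discretized simulation of ours, not part of the paper) counts discrete upcrossings on a grid and reproduces this value, which is the dominant term behind the ν̄1 column above.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 200
t = np.linspace(0.0, T, n)
# Covariance r(t) = exp(-t^2/2) of the example, so lambda2 = -r''(0) = 1
C = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2)
L = np.linalg.cholesky(C + 1e-8 * np.eye(n))  # jitter: C is very ill-conditioned

def mean_upcrossings(u, n_paths=4000):
    """Average number of discrete upcrossings of level u over simulated paths."""
    X = L @ rng.standard_normal((n, n_paths))
    up = (X[:-1] < u) & (X[1:] >= u)
    return float(up.sum(axis=0).mean())

u = 1.0
rice = T / (2 * np.pi) * np.exp(-u * u / 2)  # Rice: E(U_u) = T*sqrt(lambda2)/(2*pi)*e^{-u^2/2}
est = mean_upcrossings(u)
```

With 200 grid points the discretization bias is small compared with the Monte Carlo error, and the estimate matches the Rice value to about two decimal places.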
4. Proofs

Proof of Proposition 2.1

Proof of point (i). We first search for P(X > a, Y > a).
Put ρ = cos(θ), θ ∈ [0, π[, and use the orthogonal decomposition Y = ρX + √(1 − ρ²) Z, where Z is standard normal and independent of X.
Then {Y > a} = { Z > (a − ρX)/√(1 − ρ²) }. Thus:

P(X > a, Y > a) = ∫∫_D φ(x) φ(z) dx dz,

where D is the domain located between the two half straight lines starting from the point ( a, a √((1−ρ)/(1+ρ)) ).
Using a symmetry with respect to the straight line with angle θ/2 passing through the origin, we get:

P(X > a, Y > a) = 2 ∫_a^{+∞} φ(x) Φ̄( √((1−ρ)/(1+ρ)) x ) dx.        (4.1)

Now,

P(X > a, Y > a) = Φ̄(a) − P(X > a, Y < a) = Φ̄(a) − P(X > a, (−Y) > −a).

Applying relation (4.1) and the identity

∫_0^{+∞} Φ̄(k x) φ(x) dx = 1/4 − (1/2π) arctan(k),

which is checked by differentiating both sides with respect to k, yields the announced formula.
Proof of point (ii). By Rice's formula, the upcrossing intensity at t is

E(Z⁺) = ∫_0^{+∞} x p_{0,t}(x, u) dx,

where p_{0,t}(x, u) denotes the joint density of (X′_t, X_t). Conditionally on X_t = u, the pair (X_0, X′_t) is Gaussian with means (r(t)u, 0), variances (1 − r²(t), λ2) and covariance r′(t). Splitting off the term

I2 = ∫_0^T dt ∫_0^{+∞} x ( −r′(t)(x − r(t)u)/(1 − r²(t)) ) ... dx,

integrating I2 by parts and using λ2 − r′²(t)/(1 − r²(t)) = σ²(t), we obtain:

ν̄1(u, T) = φ(u) ∫_0^T (√λ2/√(2π)) Φ( (√λ2 u/σ(t)) √((1−r)/(1+r)) ) dt − φ(u) ∫_0^T ( r′(t)/√(1−r²(t)) ) φ( u √((1−r)/(1+r)) ) Φ̄(b(t)) dt,

which is the formula of point (ii).

The computation of point (iii) requires the Gaussian joint moments

J_{ij} := ∫_0^{+∞} ∫_0^{+∞} x^i y^j (1/(2π√(1−ρ²))) exp( −(x² − 2ρxy + y²)/(2(1−ρ²)) ) dx dy.        (4.2)

Setting v(x, y) := exp( −(x² − 2ρxy + y²)/(2(1−ρ²)) ) and using x v = ρ y v − (1−ρ²) ∂v/∂x, integration by parts yields closed expressions (4.3)-(4.6) for the J_{ij}: they are combinations of φ(b), φ(k b), Φ̄(k b) and of the bounded integral ∫_0^{b} Φ(k x) φ(x) dx, and produce the decomposition T1 + T2 + T3 of point (iii). In the sequel we write a(t, u) ≈ b(t, u) when the ratio of the two sides tends to 1.
Note 4.2. Many results of this section are based on tedious Taylor expansions. These expansions have been
made or checked by a computer algebra system (Maple). They are not detailed in the proofs.
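The same checks can be reproduced with any computer algebra system. As an illustration, the following Python/SymPy sketch (ours, mirroring the Maple session reproduced in the appendix) verifies the leading term σ²(t) = ((λ4 − λ2²)/4) t² + O(t⁴) of the conditional variance.

```python
import sympy as sp

t, l2, l4, l6 = sp.symbols('t lambda2 lambda4 lambda6', positive=True)

# Covariance expansion r(t) = 1 - l2 t^2/2! + l4 t^4/4! - l6 t^6/6!
r = 1 - l2 * t**2 / 2 + l4 * t**4 / 24 - l6 * t**6 / 720

# Conditional variance sigma^2(t) = lambda2 - r'(t)^2 / (1 - r(t)^2)
sig2 = l2 - sp.diff(r, t) ** 2 / (1 - r ** 2)

# Leading coefficient of the t^2 term: should be mu4/4 = (lambda4 - lambda2^2)/4
lead = sp.simplify(sp.limit(sig2 / t**2, t, 0))
```

The limit is taken rather than a direct series expansion to avoid the removable singularity of r′²/(1 − r²) at t = 0.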
Proof of Proposition 2.3. Use form (iii) and remark that, when t is small,

k(t) = √( (1 + ρ(t))/(1 − ρ(t)) ) = O(t)

is small, and, since Φ(ε) = 1/2 + (1/√(2π))(ε − ε³/6) + O(ε⁵) as ε → 0, we get:

∫_0^{b(t)} (Φ(k(t)x) − 1/2) φ(x) dx = (k(t)/√(2π)) ∫_0^{b(t)} x φ(x) dx − (k³(t)/(6√(2π))) ∫_0^{b(t)} x³ φ(x) dx + O(t⁵)
  = (k(t)/√(2π)) ( φ(0) − φ(b(t)) ) − (k³(t)/(6√(2π))) ( 2φ(0) − (b²(t)+2) φ(b(t)) ) + O(t⁵),

so that

T2(t) = 2σ²(t)(ρ(t) − b²(t)) [ (1/2π) arctan(k(t)) − (k(t)/√(2π)) ( φ(0) − φ(b(t)) ) + (k³(t)/(6√(2π))) ( 2φ(0) − (b²(t)+2) φ(b(t)) ) ] + O(t⁵).

In the same way:

T3(t) = (2σ²(t)/√(2π)) φ(b(t)) ( k(t) b²(t) − k³(t) b⁴(t)/6 ) + O(t⁵).

And then, assuming λ8 finite, use Maple to get the result.
Lemma 4.3. Let l be of class C¹ in a neighbourhood of 0 with l(0) = 0 and c := l′(0) > 0, put b(t) := l(t) u, and suppose k(t) b(t) = d u t + u O(t³) with d > 0. Then, as u → +∞:

(i) Ip := ∫_0^T t^p Φ̄(k(t) b(t)) φ(b(t)) dt = ( M_{p+1} / (2√(2π)) ) (c u)^{−(p+1)} ∫_0^{arctan(d/c)} (cos θ)^p dθ · ( 1 + O(1/u) ),

(ii) Jp := ∫_0^T t^p φ(l(t) u) dt = ( M_p / 2 ) (c u)^{−(p+1)} ( 1 + O(1/u) ),

where M_q := E|Z|^q and Z is a standard Gaussian random variable. In the case of interest,

d = (1/6) √( λ2 (λ2λ6 − λ4²) ) / (λ4 − λ2²).
Proof of Lemma 4.3. Since the derivative of l at zero is non-zero, l is invertible in some neighbourhood of zero, and its inverse satisfies l⁻¹(t) = t/c + O(t²), (l⁻¹)′(t) = 1/c + O(t).
We first consider Ip and use the change of variable y = l(t) u; then

Ip = ∫_0^{l(T)u} ( l⁻¹(y/u) )^p Φ̄( (k b)(l⁻¹(y/u)) ) φ(y) (l⁻¹)′(y/u) (dy/u),

with l⁻¹(y/u) = y/(c u) + u⁻¹ O_U(y²/u²) and, by the hypothesis on k b,

(k b)(l⁻¹(y/u)) = (d/c) y + u O_U(y²/u²).        (4.7)

Hence

Ip = (c u)^{−(p+1)} ∫_0^{l(T)u} y^p Φ̄( (d/c) y ) φ(y) ( 1 + O_U(y/u) ) dy.        (4.8)

Put Kp(u) := ∫_0^{l(T)u} y^p Φ̄( (d/c) y ) φ(y) dy.
Moreover,

Kp(∞) = ∫_0^{+∞} y^p Φ̄( (d/c) y ) φ(y) dy = (1/2π) ∫_0^{+∞} ∫_{(d/c)y}^{+∞} y^p exp( −(y² + z²)/2 ) dz dy.

Then, using polar coordinates, we derive that

Kp(∞) = ( M_{p+1} / (2√(2π)) ) ∫_0^{arctan(d/c)} (cos θ)^p dθ.

So we can see that the contribution of the term O_U(y/u) in formula (4.8) is O(u^{−(p+2)}), which gives the desired result for Ip. The proof of (ii) is analogous.

We now turn to the proof of Proposition 2.4. Writing the integrand of point (ii) as A1(t), we have ν̄1(u, T) = ∫_0^T A1(t) dt. For the asymptotics we use the expansion, as z → +∞:

Φ̄(z) = φ(z) ( 1/z − 1/z³ + 3/z⁵ + O(1/z⁷) ).        (4.9)
Replacing Φ̄ by (4.9), with z = √( λ2(1 − r(t)) / (σ²(t)(1 + r(t))) ) u for the first term and z = b(t) for the second, we get

A1(t) = (1/√(2π)) φ(u) φ( √((1−r)/(1+r)) (√λ2/σ) u ) [ ( (σ/(√λ2 ·)) ... terms in 1/u, 1/u³ ) √λ2 + ( 1/b − 1/b³ ) r′/√(1−r²) ] + Remainder,

and, after Taylor expansion (Maple):

A1(t) = (√2/16) ( (λ4−λ2²)^{5/2} / ( λ2^{7/2} π^{3/2} u³ ) ) exp( −λ4 u² / (2(λ4 − λ2²)) ) t² + O(t⁴).

To use Lemma 4.3, point (ii), to calculate ν̄1(u, T), it is necessary to have a Taylor expansion of the coefficient of u in the argument of the second φ. We have

lim_{t→0} λ2(1 − r(t)) / ( σ²(t)(1 + r(t)) ) = λ2² / (λ4 − λ2²),

therefore, we set:

l²(t) := λ2(1 − r(t)) / ( σ²(t)(1 + r(t)) ) − λ2² / (λ4 − λ2²),

so that

l(t) = ( √( 2 λ2 (λ2λ6 − λ4²) ) / ( 6 (λ4 − λ2²) ) ) t + O(t²).

Applying Lemma 4.3 (ii) with p = 2,

∫_0^T t² φ(l(t) u) dt = (1/2) ( 6 (λ4 − λ2²) / ( √(2 λ2 (λ2λ6 − λ4²)) u ) )³ ( 1 + O(1/u) ),

and, gathering the pieces with φ( √(λ4/(λ4−λ2²)) u ), we get the equivalent of ν̄1(u, T) stated in Proposition 2.4.
We now compute the equivalent of ν̄2(u, T). By point (iii),

ν̄2(u, T) = 2 ∫_0^T (T − t) [ φ²( u/√(1+r(t)) ) / √(1−r²(t)) ] ( T1(t) + T2(t) + T3(t) ) dt.        (4.10)

The term T2 involves the integral ∫_0^{b} Φ(k x) φ(x) dx. The function x → (x² − 1)φ(x) being bounded, we have

φ(k x) = φ(k b) + k φ′(k b)(x − b) + (k²/2) φ″(k b)(x − b)² + O_U( k³ (x − b)³ ),        (4.11)

where Landau's symbol O_U has here the same meaning as in Lemma 4.3.
Moreover, using the expansion of Φ̄ given in formula (4.9), it is easy to check that, as z → +∞:

∫_z^{+∞} (x − z) φ(x) dx = φ(z)/z² − 3φ(z)/z⁴ + O( φ(z)/z⁶ ),
∫_z^{+∞} (x − z)² φ(x) dx = 2φ(z)/z³ + O( φ(z)/z⁵ ),
∫_z^{+∞} (x − z)³ φ(x) dx = O( φ(z)/z⁴ ).

Therefore, multiplying formula (4.11) by φ(x), integrating and applying formula (4.9) once again yields an expansion of T2 whose terms are of the forms Φ̄(k b) φ(b)/b^j and k^i φ(k b) φ(b)/b^j, with remainders O( φ(k b) φ(b)/b⁶ ) and O( k³ φ(b)/b⁴ ). Note that the penultimate term can be forgotten. Then, remarking that, as u → +∞ at fixed t, b ≈ (λ2/√(λ4−λ2²)) u and k ≈ (1/6)√( (λ2λ6−λ4²)/(λ2(λ4−λ2²)) ) t, T2 reduces to a sum of nine terms of these types.

Remark 4.5. As will be seen later on, Lemma 4.3 shows that the contribution of the remainder to the integral (4.10) can be neglected, since the degrees in t and 1/u of each term are greater than 5. So, in the sequel, we will denote the sum of these terms (and other terms that will appear later) by Remainder, and we set:

T2 = U1 + U2 + U3 + U4 + U5 + U6 + U7 + U8 + U9 + Remainder.
Now, we have:

U1 + T3 = 0;
√(1 − ρ²) = (1 − ρ) k, so that U7 + T1 = −(1 + ρ) σ² k φ(k b) φ(b);
U2 + U3 = (2σ²/b) (1 + ρ) Φ̄(k b) φ(b);
U4 + U5 = −(4σ²/b³) Φ̄(k b) φ(b) ( 1 + O(t²) );
U8 + U9 = (4σ²/b²) k φ(k b) φ(b) ( 1 + O(t²) ).

By the same remark as Remark 4.5 above, the terms O(t²) can be neglected. Consequently,

T1 + T2 + T3 = (2σ²/b)(1+ρ) Φ̄(k b) φ(b) − (4σ²/b³) Φ̄(k b) φ(b)
  − (1+ρ) σ² k φ(k b) φ(b) + 2 σ² k³ φ(k b) φ(b) + (4σ²/b²) k φ(k b) φ(b) + Remainder.

Therefore, we are led to use Lemma 4.3 in order to calculate the following integrals:

∫_0^T (T − t) m1(t) Φ̄(k b) exp( −(1/2)( b² + 2u²/(1+r) ) ) dt,
∫_0^T (T − t) m2(t) Φ̄(k b) exp( −(1/2)( b² + 2u²/(1+r) ) ) dt,
∫_0^T (T − t) m3(t) exp( −(1/2)( b²(1+k²) + 2u²/(1+r) ) ) dt,
∫_0^T (T − t) m4(t) exp( −(1/2)( b²(1+k²) + 2u²/(1+r) ) ) dt,
∫_0^T (T − t) m5(t) exp( −(1/2)( b²(1+k²) + 2u²/(1+r) ) ) dt,
with:

m1(t) = 2(1+ρ(t)) σ²(t) / ( π b(t) √(1−r²(t)) ) = ( μ6 √μ4 / (36 π λ2^{5/2} u) ) t³ + O(t⁵),
m2(t) = −4 σ²(t) / ( π b³(t) √(1−r²(t)) ) = −( μ4^{5/2} / (π λ2^{7/2} u³) ) t + O(t³),
m3(t) = −(1+ρ(t)) σ²(t) k(t) / ( π √(2π (1−r²(t))) ) = −( √2 μ6^{3/2} / (864 π^{3/2} λ2² √μ4) ) t⁴ + O(t⁶),
m4(t) = 2 σ²(t) k³(t) / ( π √(2π (1−r²(t))) ) = ( √2 μ6^{3/2} / (864 π^{3/2} λ2² √μ4) ) t⁴ + O(t⁶),
m5(t) = 4 σ²(t) k(t) / ( π b²(t) √(2π (1−r²(t))) ) = ( √2 √μ6 μ4^{3/2} / (12 π^{3/2} λ2³ u²) ) t² + O(t⁴),

where μ4 := λ4 − λ2² and μ6 := λ2λ6 − λ4². Lemma 4.3 shows that we can neglect the terms issued from the t part of the factor T − t in formula (4.10). Moreover,

lim_{t→0} [ b²/u² + 2/(1+r) ] = λ4 / (λ4 − λ2²),
lim_{t→0} [ b²(1+k²)/u² + 2/(1+r) ] = λ4 / (λ4 − λ2²).

Therefore, we set:

l1²(t) := b²(t)/u² + 2/(1+r(t)) − λ4/(λ4−λ2²) = ( λ2 μ6 / (18 μ4²) ) t² + O(t⁴),
l2²(t) := b²(t)(1+k²(t))/u² + 2/(1+r(t)) − λ4/(λ4−λ2²) = ( λ2 μ6 / (12 μ4²) ) t² + O(t⁴).

This leads to

ν̄2(u, T) = T exp( −λ4 u² / (2(λ4 − λ2²)) ) [ opm1 · I3 + opm2 · I1 + opm5 · J2 ] ( 1 + O(1/u) ),

where opm1, opm2, opm5 denote the leading coefficients of m1, m2, m5 above (the contributions of m3 and m4 cancel at leading order).
Noting that, over the relevant interval [0, arctan(1/√2)],

∫ (cos θ)³ dθ = 8√3/27   and   ∫ cos θ dθ = √3/3,

we find:

I3 = ( 144√3 μ4⁴ / ( √(2π) λ2² μ6² ) ) u⁻⁴ ( 1 + O(1/u²) ),
I1 = ( 3√3 μ4² / ( √(2π) λ2 μ6 ) ) u⁻² ( 1 + O(1/u²) ),
J2 = ( 12√3 μ4³ / ( λ2^{3/2} μ6^{3/2} ) ) u⁻³ ( 1 + O(1/u²) ).

Finally, gathering the pieces, we obtain the desired expression of ν̄2.
5. Discussion

Using the general relation (1.3) with n = 1, we get

P(X0 > u) + ν̄1(u, T) − ν̄2(u, T)/2 ≤ P(X̄ > u) ≤ P(X0 > u) + ν̄1(u, T) − ν̄2(u, T)/2 + ν̄3(u, T)/6.

A conjecture is that the orders of magnitude of ν̄3(u, T) and of the higher-order terms are considerably smaller than those of ν̄1(u, T) and ν̄2(u, T). Admitting this conjecture, Proposition 2.4 implies that for T small enough

P(X̄ ≤ u) = Φ(u) − (T√λ2/(2π)) e^{−u²/2} + ( 3√3 T (λ4−λ2²)^{9/2} / ( 2π λ2^{9/2} (λ2λ6 − λ4²) ) ) φ( √(λ4/(λ4−λ2²)) u ) / u⁵ · ( 1 + O(1/u) ),

which is Piterbarg's theorem with a better remainder ([15], Th. 3.1, p. 703). Piterbarg's theorem is, as far as we know, the most precise expansion of the distribution of the maximum of smooth Gaussian processes. Moreover, very tedious calculations would give extra terms of the Taylor expansion.
References

[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA (1990).
[3] J.-M. Azaïs and M. Wschebor, Une formule pour calculer la distribution du maximum d'un processus stochastique. C.R. Acad. Sci. Paris Sér. I Math. 324 (1997) 225-230.
[4] J.-M. Azaïs and M. Wschebor, The Distribution of the Maximum of a Stochastic Process and the Rice Method, submitted.
[5] C. Cierco, Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif. PhD dissertation, University of Toulouse, France (1996).
[6] C. Cierco and J.-M. Azaïs, Testing for Quantitative Gene Detection in Dense Map, submitted.
[7] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes. J. Wiley & Sons, New York (1967).
[8] D. Dacunha-Castelle and E. Gassiat, Testing in locally conic models, and application to mixture models. ESAIM: Probab. Statist. 1 (1997) 285-317.
[9] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247-254.
[10] J. Ghosh and P. Sen, On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related results, in Proc. of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, L.M. Le Cam and R.A. Olshen, Eds. (1985).
GENERAL FORMULAE

> phi := t -> exp(-t*t/2)/sqrt(2*pi);
        φ := t → e^(−t²/2)/√(2π)
We introduce mu4 = lambda4 − lambda2² and mu6 = lambda2*lambda6 − lambda4² to make the outputs clearer.

> assume(t>0); assume(lambda2>0); assume(mu4>0); assume(mu6>0);
> interface(showassumed=2);
> Order := 12;
> r := t -> 1 - lambda2*t^2/2! + lambda4*t^4/4! - lambda6*t^6/6! + lambda8*t^8/8!;
        r := t → 1 − (1/2)λ2 t² + (1/24)λ4 t⁴ − (1/720)λ6 t⁶ + (1/40320)λ8 t⁸
> siderels := {lambda4 = mu4 + lambda2^2, lambda2*lambda6 - lambda4^2 = mu6}:
> I_r2 := t -> 1 - r(t)*r(t);
        I_r2 := t → 1 − r(t)²
> simplify(simplify(series(I_r2(t), t=0, 8), siderels));
        λ2 t² − ( (1/3)λ2² + (1/12)μ4 ) t⁴ + O(t⁶)
> rp := t -> diff(r(t), t);
> eval(rp(t));
        −λ2 t + (1/6)λ4 t³ − (1/120)λ6 t⁵ + (1/5040)λ8 t⁷
> rs := t -> diff(r(t), t$2);
> eval(rs(t));
        −λ2 + (1/2)λ4 t² − (1/24)λ6 t⁴ + (1/720)λ8 t⁶
> mu := t -> -u*rp(t)/(1+r(t));
        μ := t → −u rp(t)/(1 + r(t))
> sig2 := t -> lambda2 - rp(t)*rp(t)/I_r2(t);
> simplify(taylor(sig2(t), t=0, 8), siderels);
        (1/4)μ4 t² + O(t⁴)
> sigma := t -> sqrt(sig2(t));
> simplify(taylor(sigma(t), t=0, 6), siderels);
        (1/2)√μ4 t + O(t³)
> b := t -> mu(t)/sigma(t);
> simplify(taylor(b(t), t=0, 6), siderels);
        u λ2/√μ4 + O(t²)
> sig2rho := t -> -rs(t) - r(t)*rp(t)*rp(t)/I_r2(t);
> simplify(taylor(sig2rho(t), t=0, 8), siderels);
        −(1/4)μ4 t² + O(t⁴)
> rho := t -> sig2rho(t)/sig2(t);
> simplify(taylor(rho(t), t=0, 8), siderels);
        −1 + (1/18)( μ6/(λ2 μ4) ) t² + O(t⁴)
> k2 := t -> (1+rho(t))/(1-rho(t));
> sk2 := simplify(taylor(k2(t), t=0), siderels);
        sk2 := (1/36)( μ6/(λ2 μ4) ) t² + O(t⁴)
> k := t -> taylor(sqrt(sk2), t=0);
> simplify(taylor(k(t), t=0, 3), siderels);
        (1/6)√( μ6/(λ2 μ4) ) t + O(t³)
> sqrtI_rho2 := t -> k(t)*(1-rho(t));
> T1 := t -> sig2(t)*sqrtI_rho2(t)*phi(b(t))*phi(k(t)*b(t));
> simplify(simplify(series(T1(t), t=0, 6), siderels), power);
        (1/(24π)) √( μ6 μ4/λ2 ) e^{−u²λ2²/(2μ4)} t³ + O(t⁵)
> T2 := t -> 2*sig2(t)*(rho(t)-(b(t))^2)*(arctan(k(t))/(2*pi)
    - k(t)/sqrt(2*pi)*(phi(0)-phi(b(t))-k(t)^2/6*(2*phi(0)-((b(t))^2+2)*phi(b(t)))));
> simplify(simplify(series(T2(t), t=0, 6), siderels), power);
        −(1/(24π)) √( μ6/(λ2 μ4) ) ( u²λ2² + μ4 ) e^{−u²λ2²/(2μ4)} t³ + O(t⁵)
> T3 := t -> (2*sig2(t)*(k(t)*b(t)^2))/sqrt(2*pi)*(1-(k(t)*b(t))^2/6)*phi(b(t));
> simplify(simplify(series(T3(t), t=0, 6), siderels), power);
        (1/(24π)) √( μ6/μ4 ) λ2^{3/2} u² e^{−u²λ2²/(2μ4)} t³ + O(t⁵)
> A := t -> ((phi(u/sqrt(1+r(t))))^2/sqrt(I_r2(t)))*(T1(t)+T2(t)+T3(t));
> simplify(simplify(series(A(t), t=0, 6), siderels), power);
        O(t⁴)
> Cphib := t -> phi(t)/t - phi(t)/t^3;
> sq := t -> sqrt((1-r(t))/(1+r(t)));
> simplify(simplify(series(sq(t), t=0, 4), siderels), power);
        (1/2)√λ2 t + ( (3λ2² − λ4)/(48√λ2) ) t³ + O(t⁵)
> nsigma := t -> sigma(t)/sqrt(lambda2);
> A1 := t -> (1/sqrt(2*pi))*phi(u)*phi(sq(t)*u/nsigma(t))*((nsigma(t)/(sq(t)*u)
    - (nsigma(t)/(sq(t)*u))^3)*sqrt(lambda2) + (1/b(t) - 1/b(t)^3)*rp(t)/sqrt(I_r2(t)));
> SA1 := simplify(simplify(series(A1(t), t=0, 6), siderels), power);
        SA1 := (√2/16) ( μ4^{5/2} / (λ2^{7/2} π^{3/2} u³) ) e^{−u²(μ4+λ2²)/(2μ4)} t² + O(t⁴)
> L2 := t -> (1-r(t))/((1+r(t))*nsigma(t)^2) - (lambda4-mu4)/mu4;
> SL2 := simplify(simplify(series(L2(t), t=0, 6), siderels), power);
        SL2 := (1/18)( λ2 μ6/μ4² ) t² + O(t⁴)
We define c as the square root of the coefficient of t²:
> c := sqrt(op(1, SL2));
        c := (√2/6) √(λ2 μ6) / μ4
> nu1b := (sqrt(2*pi))*op(1, SA1)*(c^(-3)*u^(-3)/2);
        nu1b := (27√2/8) ( μ4^{11/2} / (π λ2⁵ μ6^{3/2} u⁶) ) e^{−u²(μ4+λ2²)/(2μ4)}
PROOF OF THE EQUIVALENT OF NU2

> m1 := t -> (1+rho(t))*2*sigma(t)^2/(pi*b(t)*sqrt(I_r2(t)));
> sm1 := simplify(simplify(series(m1(t), t=0, 8), siderels), power);
        sm1 := (1/(36π)) ( μ6 √μ4 / (λ2^{5/2} u) ) t³ + O(t⁵)
> m2 := t -> (-4/pi)*sigma(t)^2*b(t)^(-3)/sqrt(I_r2(t));
> sm2 := simplify(simplify(series(m2(t), t=0, 6), siderels), power);
        sm2 := −(1/π) ( μ4^{5/2} / (λ2^{7/2} u³) ) t + O(t³)
> m3 := t -> -(1+rho(t))*sigma(t)^2*k(t)/(pi*sqrt((2*pi)*I_r2(t)));
> sm3 := simplify(simplify(series(m3(t), t=0, 6), siderels), power);
        sm3 := −(√2/864) ( μ6^{3/2} / (π^{3/2} λ2² √μ4) ) t⁴ + O(t⁶)
> m4 := t -> (2/pi)*sigma(t)^2*k(t)^3/sqrt(2*pi*I_r2(t));
> sm4 := simplify(simplify(series(m4(t), t=0, 6), siderels), power);
        sm4 := (√2/864) ( μ6^{3/2} / (π^{3/2} λ2² √μ4) ) t⁴ + O(t⁶)
> m5 := t -> (4/pi)*sigma(t)^2*k(t)*b(t)^(-2)/sqrt(2*pi*I_r2(t));
> sm5 := simplify(simplify(series(m5(t), t=0, 6), siderels), power);
        sm5 := (√2/12) ( √μ6 μ4^{3/2} / (λ2³ π^{3/2} u²) ) t² + O(t⁴)
> l12 := t -> (b(t)/u)^2 + 2/(1+r(t)) - lambda4/mu4;
> simplify(simplify(series(l12(t), t=0, 8), siderels), power);
        (1/18)( λ2 μ6/μ4² ) t² + O(t⁴)
> l22 := t -> (b(t)^2*(1+k(t)^2))/u^2 + 2/(1+r(t)) - lambda4/mu4;
> simplify(simplify(series(l22(t), t=0, 8), siderels), power);
        (1/12)( λ2 μ6/μ4² ) t² + O(t⁴)
> opm1 := op(1, sm1);  opm2 := op(1, sm2);  opm5 := op(1, sm5);
> c1 := 144*sqrt(3)*mu4^4*u^(-4)/(sqrt(2*pi)*lambda2^2*mu6^2);
        c1 := (72√6/√π) μ4⁴ / (u⁴ λ2² μ6²)
> c2 := 3*sqrt(3)*mu4^2*u^(-2)/(sqrt(2*pi)*lambda2*mu6);
        c2 := (3√6/(2√π)) μ4² / (u² λ2 μ6)
> c5 := 12*sqrt(3)*mu4^3*u^(-3)/(lambda2^(3/2)*mu6^(3/2));
        c5 := 12√3 μ4³ / (u³ λ2^{3/2} μ6^{3/2})
> B := opm1*c1 + opm2*c2 + opm5*c5;
> simplify(B);
        B := (3√6/2) μ4^{9/2} / (π^{3/2} u⁵ λ2^{9/2} μ6)
Motivation
Maximum of a process on the line
Random fields

Spatial statistics and simultaneous statistics: maxima of processes

Rennes, 24 March 2009
Jean-Marc Azaïs
IMT, Toulouse. Laboratoire de Statistique et Probabilités
Outline
Motivation
  Examples
  A small one-dimensional example
Random fields
The "signal + noise" model
In spatial statistics one is often led to consider the model
    signal + Gaussian noise.
Examples of such situations are given by:
- precision agriculture,
- neuroscience,
- wave-modelling problems.
Precision agriculture
Yield is measured by a GPS-equipped combine harvester.
Neuroscience
A 2- or 3-dimensional model is used for the brain, and one wants to know whether there exists a zone that is particularly activated by a given activity.
Swell spectrum
The wave spectrum is measured locally in time, and one wants to detect change instants: the transitions between sea states.
Motivation: a small one-dimensional example
[Figure: the annual lynx-trapping series (114 observations); counts range from about 1000 to 7000.]
[Figure: the normalized series Y_i, plotted against index 0-120.]
One-dimensional testing
We make the following debatable hypotheses:
- the observations are Gaussian;
- the error series is stationary and mixing;
- the pseudo-periodicity is random, due to a predator-prey model;
- the length of the series, 114, is sufficient to estimate the variance (1/n) Σ X_i².
Under the null hypothesis of absence of signal, Y_i approximately follows a standard normal (centred and reduced) law. Hence the test rule:
    if |Y_i| > 1.96,
declare that there is a signal at the point i considered.
[Figure: the normalized series with the thresholds ±1.96.]
Simultaneous risk
The most rudimentary method (but not always the worst) is Bonferroni's method, which consists in performing each elementary test at level α = α₀/114 in our case:
    qnorm(1 − 0.025/114) = 3.5157,
and, after multiplication by the standard deviation 1.28, this gives 4.5. This detects nothing. Can we do better?
In general, the distribution of the maximum is unknown even in the simplest cases: random walks, autoregressive processes of order 1. One method is to write the density of the Gaussian vector,
    (2π)^{−n/2} (det Σ)^{−1/2} exp( −x′ Σ⁻¹ x / 2 ),
and to integrate it over a hyper-rectangle [−u, u]^n. This can be done numerically, by rather complex methods that I shall not describe, for sizes up to about 1000.
Using the estimated covariance matrix, one finds a significance level of 0.4978, which is clearly non-significant.
Maximum of a process on the line
We assume that:
- the function X(t) of a real variable is observed entirely;
- the random phenomenon considered is regular (differentiable);
- we consider the maximum M (without absolute value, to simplify) over a bounded interval, for example [0, T].
We use the Rice method, which is based on the following basic inequalities:
    P{M > u} ≤ P{X(0) > u} + P{U_u > 0} ≤ P{X(0) > u} + E(U_u)
By Rice's formula,
    E(U_u) = ∫_0^T E( (X′_t)⁺ | X_t = u ) p_{X_t}(u) dt.
A super-exponential precision
Under certain hypotheses,
    P{M_T > u} = 1 − Φ(u) + T (√λ2/√(2π)) φ(u) + O( Φ̄(u(1 + ε)) )
for some ε > 0.
Random fields
Back to the problems of the introduction. We consider a random function on R² (to simplify): a random field. The number of crossings becomes a (level) curve and does not permit the construction of bounds. Neglecting boundary effects, P{M > u} can nevertheless be approximated as before.
Theorem
Consider the square [0, T]². If the field is centred, with unit variance and smooth paths, then
    P{M > u} = 1 − Φ(u) + T ( √Λ11 + √Λ22 ) φ(u)/√(2π) + T² (det Λ)^{1/2} u φ(u)/(2π) + O( Φ̄(u(1 + ε)) ),
where Λ is the variance-covariance matrix of the gradient.
Results due to various authors under various conditions: Piterbarg (1981), Taylor, Takemura and Adler (2005), Azaïs and Wschebor (2008).
Conclusion
Thank you.
Abstract
This paper deals with the problem of obtaining methods to compute the distribution of the maximum of a one-parameter stochastic process on a fixed interval, mainly in the Gaussian case. The main point is the relationship between the values of the maximum and the crossings of the paths, via the so-called Rice formulae for the factorial moments of crossings.
We prove that, for some general classes of Gaussian processes, the so-called Rice series is convergent and can be used to compute the distribution of the maximum. It turns out that the formulae are well adapted to the numerical computation of this distribution and become more efficient than other numerical methods, namely simulation of the paths or standard bounds on the tails of the distribution.
We have included some relevant numerical examples to illustrate the power of the method.
Introduction
Let X = {X_t : t ∈ R} be a stochastic process with real values and continuous paths, defined on a probability space (Ω, A, P), and M_T := max{X_t : t ∈ [0, T]}.
The computation of the distribution function of the random variable M_T,
    F(T, u) := P(M_T ≤ u),  u ∈ R,
by means of a closed formula based upon natural parameters of the process X is known only for a very restricted number of stochastic processes (and trivial functions of them): the Brownian motion {W_t : t ≥ 0}; the Brownian bridge, B_t := W_t − tW_1 (0 ≤ t ≤ 1); B_t − ∫_0^1 B_s ds (Darling, 1983); the Brownian motion with a linear drift (Shepp, 1979); ∫_0^t W_s ds + yt (McKean, 1963; Goldman, 1971; Lachal, 1991); the stationary Gaussian processes with covariance equal to:
1. r(t) = e^{−|t|} (Ornstein-Uhlenbeck process; DeLong, 1981),
2. r(t) = (1 − |t|)⁺, T a positive integer (Slepian process; Slepian, 1961; Shepp, 1971),
3. r(t) even, periodic with period 2, r(t) = 1 − α|t| for 0 ≤ |t| ≤ 1, 0 < α ≤ 2 (Shepp and Slepian, 1976),
4. r(t) = 1 |t|/1 /1 , |t| < 1 /, 0 < 1/2, T = (1 )/
(Cressie 1980),
5. r(t) = cos t.
Given the interest in F(T, u) for a large diversity of theoretical and technical purposes, an extensive literature has been developed, of which we give a sample of references pointing in various directions:
1. Obtaining inequalities for F(T, u): Slepian (1962); Landau and Shepp (1970); Marcus and Shepp (1972); Fernique (1974); Borell (1975); Ledoux (1996); Talagrand (1996) and references therein. A general review of a certain number of classical results is in Adler (1990, 2000).
2. Describing the behaviour of F(T, u) under various asymptotics: Qualls and Watanabe (1973); Piterbarg (1981, 1996); Leadbetter, Lindgren and Rootzén (1983); Berman (1985a, b, 1992); Talagrand (1988); Berman and Kôno (1992); Sun (1993); Wschebor (2000); Azaïs, Bardet and Wschebor (2000).
called the Davies bound (1977) or, more accurately, the first term in the Rice series, to obtain approximations for F(T, u). But as T increases, for moderate values of u, the Davies bound is far from the true value and one requires the computation of the successive terms. The numerical results are shown in the case of four Gaussian stationary processes for which no closed formula is known.
An asymptotic approximation of F(T, u) as u → +∞ was recently obtained by Azaïs, Bardet and Wschebor (2000). It extends to any T a previous result by Piterbarg (1981) for sufficiently small T.
One of the key points in the computation is the numerical approximation of the factorial moments of upcrossings by means of Rice integral formulae. For that purpose, the main difficulty is the precise description of the behaviour of the integrands appearing in these formulae near the diagonal, which is again an old subject that is interesting on its own (see Belyaev (1966), Cuzick (1975)) and remains widely open. We have included in the section "Computation of Moments" some new results that give partial answers and are helpful to improve the numerical methods.
The extension to processes with non-smooth trajectories can be done by smoothing the paths by means of a deterministic device, applying the previous methods to the regularized process and estimating the error as a function of the smoothing width. We have not included these types of results here since, for the time being, they do not appear to be of practical use.
The Note (Azaïs and Wschebor, 1997) contains a part of the results of the present paper, without proofs.
Notations
Let f : I → R be a function defined on the interval I of the real numbers. Then
    C_u(f; I) := {t ∈ I : f(t) = u},
    N_u(f; I) := #( C_u(f; I) )
denote respectively the set of roots of the equation f(t) = u on the interval I and the number of these roots, with the convention N_u(f; I) = +∞ if the set C_u is infinite. N_u(f; I) is called the number of crossings of f with the level u on the interval I. In what follows, I will be the interval [0, T] if it is not stated otherwise.
In the same way, if f is a differentiable function, the number of upcrossings of f is defined by means of
    U_u(f; I) := #( {t ∈ I : f(t) = u, f′(t) > 0} ).
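On a grid, these counts can be approximated by sign changes; a small sketch (a discrete approximation of ours, not part of the paper) illustrates the definitions on a deterministic function.

```python
import math

def crossings(f, a, b, u, n=10000):
    """Approximate N_u(f; [a, b]) and U_u(f; [a, b]) on a regular grid."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    vals = [f(x) - u for x in xs]
    # A crossing is a sign change; an upcrossing goes from below u to above u.
    N = sum(1 for v0, v1 in zip(vals, vals[1:]) if v0 * v1 < 0)
    U = sum(1 for v0, v1 in zip(vals, vals[1:]) if v0 < 0 <= v1)
    return N, U

N, U = crossings(math.sin, 0.0, 2 * math.pi, 0.5)
# sin crosses the level 0.5 twice on [0, 2*pi]: one upcrossing, one downcrossing
```

For a smooth function and a fine enough grid the sign-change count is exact; for rough paths this is precisely where the discretization fails, which motivates the analytic approach of the paper.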
The factorial moments of upcrossings are given by the Rice formulae:

ν_m := E( U_u (U_u − 1) ··· (U_u − m + 1) ) = ∫_{[0,T]^m} A_{t_1,...,t_m}(u) dt_1 ... dt_m        (1)

with

A_{t_1,...,t_m}(u) = ∫_{[0,+∞)^m} x_1 ··· x_m p_{X_{t_1},...,X_{t_m},X′_{t_1},...,X′_{t_m}}(u, ..., u; x_1, ..., x_m) dx_1 ... dx_m.        (2)

(References for conditions for this formula to hold true that suffice for our present purposes, and also for proofs, can be found, for example, in Marcus (1977) and in Wschebor (1985).)
This section contains two main results. The first is Theorem 2.1, which requires the process to have C^∞ paths and contains a general condition enabling one to compute F(T, u) as the sum of a series. The second is Theorem 2.2, which covers the same situation for Gaussian stationary processes under conditions on the covariance. As for Theorem 2.3, it contains upper and lower bounds on F(T, u) for processes with C^k paths satisfying some additional conditions.
Theorem 2.1 Assume that a.s. the paths of the stochastic process X are of class C^∞ and that the density p_{X_{T/2}} is bounded by some constant D.

(i) If there exists a sequence of positive numbers {c_k}_{k=1,2,...} such that

γ_k := P(‖X^{(2k−1)}‖_∞ ≥ c_k T^{−(2k−1)}) + (2^{2−2k} D c_k)/((2k−1)!) = o(2^{−k})   (k → ∞),   (3)

then

P(M > u) = P(X_0 > u) + Σ_{m=1}^{∞} ((−1)^{m+1}/m!) ν̃_m.   (4)

(ii) In formula (4), the error when one replaces the infinite sum by its m_0-th partial sum is bounded by γ*_{m_0+1}, where

γ*_m := sup_{k≥m} 2^{k+1} γ_k.

We will call the series on the right-hand side of (4) the Rice series.
For the proof we will assume, with no loss of generality, that T = 1.

We start with the following lemma on the Cauchy remainder for polynomial interpolation (Davis 1975, Th. 3.1.1).

Lemma 2.1 a) Let I be an interval in the real line, f : I → IR a function of class C^k, k a positive integer, t_1, ..., t_k k points in I, and let P(t) be the (unique) interpolation polynomial of degree k − 1 such that f(t_i) = P(t_i) for i = 1, ..., k, taking into account possible multiplicities. Then, for t ∈ I:

f(t) − P(t) = (1/k!) (t − t_1) ... (t − t_k) f^{(k)}(ξ),

where min(t_1,...,t_k,t) ≤ ξ ≤ max(t_1,...,t_k,t).

b) If I = [0, 1] and t = 1/2, it follows that

|f(1/2) − P(1/2)| ≤ (1/(k! 2^k)) ‖f^{(k)}‖_∞.
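A quick numerical sanity check of the midpoint bound in part b); the test function and nodes are our own choice:

```python
import math
import numpy as np

# Midpoint bound of Lemma 2.1 b): with k nodes in [0, 1],
# |f(1/2) - P(1/2)| <= ||f^(k)||_inf / (k! 2^k).
# Here f(t) = sin(5t) and k = 3, so ||f'''||_inf = 125 on [0, 1].
f = lambda t: np.sin(5 * t)
nodes = np.array([0.1, 0.4, 0.9])
coeffs = np.polyfit(nodes, f(nodes), deg=2)     # interpolant of degree k - 1
err = abs(f(0.5) - np.polyval(coeffs, 0.5))
bound = 125.0 / (math.factorial(3) * 2 ** 3)
print(err <= bound)  # -> True
```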
The next combinatorial lemma plays the central role in what follows. A proof is
given in Lindgren (1972).
Lemma 2.2 Let ν be a non-negative integer-valued random variable having finite moments of all orders. Let k, m, M (k ≥ 0, m ≥ 1, M ≥ 1) be integers and denote

p_k := P(ν = k);   ν_m := E(ν^[m]);   S_M := Σ_{m=1}^{M} (−1)^{m+1} ν_m/m!.

Then

(i) For each M:

S_{2M} ≤ Σ_{k=1}^{∞} p_k ≤ S_{2M+1}.   (5)

(ii) The sequence {S_M; M = 1, 2, ...} has a finite limit if and only if ν_m/m! → 0 as m → ∞, and in that case:

P(ν ≥ 1) = Σ_{k=1}^{∞} p_k = Σ_{m=1}^{∞} (−1)^{m+1} ν_m/m!.   (6)
Remark. A by-product of Lemma 2.2 that will be used in the sequel is the following: if in (6) one substitutes the M-th partial sum for the infinite sum, the absolute value ν_{M+1}/((M+1)!) of the first neglected term is an upper bound for the error in the computation of P(ν ≥ 1).
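Lemma 2.2 can be illustrated with a Poisson variable, whose factorial moments are available in closed form (this example is ours, not from the text):

```python
import math

# For a Poisson(lam) variable nu, E(nu^[m]) = lam^m, so
# S_M = sum_{m=1}^M (-1)^(m+1) lam^m / m!  and  P(nu >= 1) = 1 - exp(-lam).
lam = 1.3
target = 1.0 - math.exp(-lam)
S = lambda M: sum((-1) ** (m + 1) * lam ** m / math.factorial(m)
                  for m in range(1, M + 1))
# Enveloping property: even partial sums under-estimate, odd ones over-estimate.
print(S(4) <= target <= S(5))  # -> True
```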
Lemma 2.3 With the same notations as in Lemma 2.2, we have the equality

E(ν^[m]) = m Σ_{k=m}^{∞} (k−1)^[m−1] P(ν ≥ k)   (m = 1, 2, ...).

Proof. Since k^[m] − (k−1)^[m] = m (k−1)^[m−1], one has j^[m] = m Σ_{k=m}^{j} (k−1)^[m−1], so that

E(ν^[m]) = Σ_{j=m}^{∞} j^[m] P(ν = j) = Σ_{j=m}^{∞} P(ν = j) · m Σ_{k=m}^{j} (k−1)^[m−1] = m Σ_{k=m}^{∞} (k−1)^[m−1] P(ν ≥ k).
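The identity of Lemma 2.3 is exact and can be checked numerically on any distribution with bounded support; a small sketch with a toy distribution of our own choosing:

```python
def falling(x, m):
    """Falling factorial x^[m] = x (x-1) ... (x-m+1)."""
    out = 1
    for i in range(m):
        out *= (x - i)
    return out

# Check E(nu^[m]) = m * sum_{k>=m} (k-1)^[m-1] P(nu >= k)
# for a toy distribution on {0,...,6}.
p = [0.3, 0.25, 0.2, 0.1, 0.08, 0.05, 0.02]
m = 3
lhs = sum(falling(k, m) * p[k] for k in range(len(p)))
tail = lambda k: sum(p[k:])
rhs = m * sum(falling(k - 1, m - 1) * tail(k) for k in range(m, len(p)))
print(abs(lhs - rhs) < 1e-12)  # -> True
```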
Lemma 2.4 Suppose that a.s. the paths of the process X belong to C^∞ and that p_{X_{1/2}} is bounded by the constant D. Then, for any sequence {c_k, k = 1, 2, ...} of positive numbers, one has

E((U_u)^[m]) ≤ m Σ_{k=m}^{∞} (k−1)^[m−1] [ P(‖X^{(2k−1)}‖_∞ ≥ c_k) + (2^{2−2k} D c_k)/((2k−1)!) ].   (7)

Proof. By Lemma 2.3 it suffices to bound P(U_u ≥ k). One has

P(U_u ≥ k) ≤ P(‖X^{(2k−1)}‖_∞ ≥ c_k) + P(U_u ≥ k, ‖X^{(2k−1)}‖_∞ < c_k).

On the event {U_u ≥ k}, the path has at least 2k − 1 crossings of the level u, so that Lemma 2.1 b), applied to X − u with these crossings as interpolation points, gives

{U_u ≥ k, ‖X^{(2k−1)}‖_∞ < c_k} ⊂ {|X_{1/2} − u| ≤ c_k/(2^{2k−1}(2k−1)!)},

and the probability of the latter event is at most 2^{2−2k} D c_k/((2k−1)!).
As for part (ii): for k ≥ m one has γ_k ≤ 2^{−(k+1)} γ*_m, so that (7) together with the identity

Σ_{k=m}^{∞} k^[m] 2^{−(k+1)} = 2^{−(m+1)} (d^m/dx^m)(1/(1−x)) |_{x=1/2} = m!

implies ν̃_m/m! ≤ γ*_m, and the result follows from γ*_m → 0.
Remarks

One can replace the condition p_{X_{T/2}}(x) ≤ D for all x by p_{X_{T/2}}(x) ≤ D for x in some neighbourhood of u. In this case, the statement of Theorem 2.1 holds if one adds to γ_k the probability that X_{T/2} falls outside that neighbourhood.

In the Gaussian stationary case (Theorem 2.2), denoting by λ_j the spectral moments and by Φ̄ the standard Gaussian tail, one has

P(‖X^{(2k−1)}‖_∞ ≥ c_k) ≤ P(|X^{(2k−1)}_0| ≥ c_k) + 2 P(U_{c_k}(X^{(2k−1)}, I) ≥ 1) ≤ 2 Φ̄(c_k/λ_{4k−2}^{1/2}) + (1/π)(λ_{4k}/λ_{4k−2})^{1/2} exp(−c_k²/(2λ_{4k−2})).

Choose

c_k := (B_1 k λ_{4k−2})^{1/2}   if λ_{4k} ≤ B_1 k λ_{4k−2},
c_k := (λ_{4k})^{1/2}   if λ_{4k} > B_1 k λ_{4k−2}.

This choice yields

P(‖X^{(2k−1)}‖_∞ ≥ c_k) ≤ (1 + (B_1 k)^{1/2}) e^{−B_1 k/2}

and, for a suitable constant C_1,

γ_k ≤ (1 + 2(C_1 + 1)k)^{1/2} 2^{−2k}   (k = 1, 2, ...),   (9)

whence

γ*_m ≤ 8 (1 + 2(C_1 + 1)m)^{1/2} 2^{−m}   (m = 1, 2, ...).   (10)
Remarks
a) If one is willing to use Rice formulae to compute the factorial moments ν̃_m, it is enough to verify that the distribution of

(X_{t_1}, ..., X_{t_k}, X'_{t_1}, ..., X'_{t_k})

is non-degenerate for any choice of k = 1, 2, ... and (t_1, ..., t_k) ∈ I^k \ D_k(I). For Gaussian stationary processes, a sufficient condition for non-degeneracy is that the spectral measure not be purely atomic (see Cramér and Leadbetter (1967) for a proof). The same kind of argument shows that the conclusion remains valid if the spectral measure is purely atomic and the set of its atoms has an accumulation point in IR. Sufficient conditions for the finiteness of ν_m are also given in Nualart & Wschebor (1991, Lemma 1.2).
b) If instead of requiring the paths of the process X to be of class C^∞, one relaxes this condition to a certain order of differentiability, one can still get upper and lower bounds for P(M > u).
Theorem 2.3 Let X = {X_t : t ∈ I} be a real-valued stochastic process. Suppose that p_{X_t}(x) is bounded for t ∈ I, x ∈ IR, and that the paths of X are of class C^{p+1}. Then

P(M > u) ≤ P(X_0 > u) + Σ_{m=1}^{2K+1} (−1)^{m+1} ν̃_m/m!   if 2K + 1 ≤ p,

and

P(M > u) ≥ P(X_0 > u) + Σ_{m=1}^{2K} (−1)^{m+1} ν̃_m/m!   if 2K ≤ p.
Note that all the moments in the above formulae are finite.
The proof is a straightforward application of Lemma 2.2 and Lemma 1.2 in
Nualart & Wschebor (1991).
When the level u is high, the results by Piterbarg (1981, 1996), which were until recently the sharpest known asymptotic bounds for the tail of the distribution of the maximum, state that for some δ > 0,

|P(M > u) − P(X_0 > u) − ν_1(u)| ≤ (const) e^{−u²(1+δ)/2}   (11)

as u → +∞.
Computation of Moments
An efficient numerical computation of the factorial moments of crossings relies on a fine description of the behaviour, as the k-tuple (t_1, ..., t_k) approaches the diagonal D_k(I), of the integrands

A_{t_1,...,t_k}(u, ..., u) = ∫_{[0,+∞)^k} x_1 ... x_k p_{X_{t_1},...,X_{t_k},X'_{t_1},...,X'_{t_k}}(u,...,u; x_1,...,x_k) dx_1 ... dx_k

and its variant A⁺_{t_1,...,t_k}(u, ..., u), which appear respectively in the Rice formulae for the k-th factorial moment of upcrossings and for the k-th factorial moment of upcrossings with the additional condition X_0 ≤ u (see formula (2)).

For example, in Azaïs, Cierco and Croquette (1999) it is proved that if X is Gaussian, stationary, centered, with unit variance and λ_8 < ∞, then

A⁺_{s,t}(u, u) ≃ (λ_2λ_6 − λ_4²)/(1296 π λ_2² (λ_4 − λ_2²)^{1/2}) exp(−(λ_4 u²)/(2(λ_4 − λ_2²))) (t − s)^4,   (12)

as t − s → 0.
(12) can be extended to non-stationary Gaussian processes, obtaining an equivalence of the form

A⁺_{s,t}(u, u) ≃ J(t*)(t − s)^4   as s, t → t*,   (13)

where J(t*) is a continuous non-zero function of t* depending on u, which can be expressed in terms of the mean and covariance functions of the process and its derivatives. We give a proof of an equivalence of the form (13) in the next proposition.
One can take advantage of this equivalence to improve the numerical methods used to compute ν̃_2 (the second factorial moment of the number of upcrossings restricted to X_0 ≤ u). Equivalence formulae such as (12) or (13) can be used to avoid numerical degeneracies near the diagonal D_2(I). Note that even if X is stationary to begin with, after conditioning on X_0 the process to be taken into account in the actual computation of the factorial moments of upcrossings appearing in the Rice series (4) will be non-stationary, so that equivalence (13) is the appropriate tool.
Proposition 3.1 Suppose that X is a Gaussian process with C^5 paths and that for each t ∈ I the joint distribution of (X_t, X'_t, X^{(2)}_t, X^{(3)}_t) does not degenerate. Then (13) holds true.
Proof. Denote by ζ = (ζ_1, ζ_2)^T a two-dimensional random vector having as probability distribution the conditional distribution of (X'_s, X'_t) given X_s = X_t = u. One has:

A⁺_{s,t}(u, u) = E(ζ_1⁺ ζ_2⁺) · p_{X_s,X_t}(u, u).   (14)

Put τ = t − s and check the following Taylor expansions around the point s:

E(ζ_1) = m_1 τ + m_2 τ² + L_1 τ³   (15)

E(ζ_2) = m_1 τ + m_2 τ² + L_2 τ³   (16)
Var(ζ) = ( aτ² + bτ³ + cτ⁴ + L_{11}τ⁵      ((b+b̄)/2)τ³ + dτ⁴ + L_{12}τ⁵
           ((b+b̄)/2)τ³ + dτ⁴ + L_{12}τ⁵    aτ² + b̄τ³ + c̄τ⁴ + L_{22}τ⁵ )   (17)

Moreover,

Var(ζ_1) ≃ (1/4) (det Var(X_s, X'_s, X^{(2)}_s)^T / det Var(X_s, X'_s)^T) τ²,   (18)

where ≃ denotes equivalence as τ → 0. So

a = (1/4) det Var(X_s, X'_s, X^{(2)}_s)^T / det Var(X_s, X'_s)^T,

which is a continuous non-vanishing function for s ∈ I. Note that the coefficient of τ³ in the Taylor expansion of Cov(ζ_1, ζ_2) is equal to (b + b̄)/2. This follows either by direct computation or by noting that det Var(ζ) is a symmetric function of the pair s, t.
Put

Δ(s, t) = det Var(ζ).

The behaviour of Δ(s, t) as s, t → t* can be obtained by noting that

Δ(s, t) = det Var(X_s, X_t, X'_s, X'_t)^T / det Var(X_s, X_t)^T

and applying Lemma 3.2 in Azaïs and Wschebor (2000) or Lemma 4.3, p. 76 in Piterbarg (1996), which provide an equivalent for the numerator, so that

Δ(s, t) ≃ Δ̄(t*) τ⁶   (19)

with

Δ̄(t) = (1/144) det Var(X_t, X'_t, X^{(2)}_t, X^{(3)}_t)^T / det Var(X_t, X'_t)^T.

The non-degeneracy hypothesis implies that Δ̄(t) is continuous and non-zero.
Then:

E(ζ_1⁺ ζ_2⁺) = (1/(2π [Δ(s,t)]^{1/2})) ∫_0^{+∞} ∫_0^{+∞} xy exp(−F(x, y)/(2Δ(s,t))) dx dy   (20)

where

F(x, y) = Var(ζ_2)(x − E(ζ_1))² + Var(ζ_1)(y − E(ζ_2))² − 2 Cov(ζ_1, ζ_2)(x − E(ζ_1))(y − E(ζ_2)).

Substituting the expansions (15), (16), (17) in the integrand of (20) and making the change of variables x = τ²v, y = τ²w, we get, as s, t → t*:

E(ζ_1⁺ ζ_2⁺) ≃ (τ⁵/(2π [Δ̄(t*)]^{1/2})) ∫_0^{+∞} ∫_0^{+∞} vw exp(−F̄(v, w)/(2Δ̄(t*))) dv dw,   (21)

where F̄ is the quadratic form obtained from F by keeping the leading coefficients. Together with (14) and the equivalent for p_{X_s,X_t}(u, u), this proves (13).
When computing the integral of A⁺_{t_1,...,t_k}(u) over I^k, instead of choosing the point (t_1, t_2, ..., t_k) at random in the cube I^k with a uniform distribution, we do it with a probability law that has a density proportional to the function ∏_{1≤i<j≤k}(t_j − t_i)^4. For the proof we will use the following auxiliary proposition, which has its own interest and extends (19) to any k.
If t_1, t_2, ..., t_k → t*:

det Var(X_{t_1},...,X_{t_k}, X'_{t_1},...,X'_{t_k})^T ≃ (D_{2k−1}(t*)/[2!...(2k−1)!]²) ∏_{1≤i<j≤k}(t_j − t_i)^8,   (22)

where D_j(t) denotes det Var(X_t, X'_t, ..., X^{(j)}_t)^T.
Proof. With no loss of generality, we consider only k-tuples (t_1, ..., t_k) such that t_i ≠ t_j if i ≠ j.

Suppose f : I → IR is a function of class C^{2m}, m ≥ 1, and t_1, ..., t_m are pairwise different points in I. We use the following notations for interpolating polynomials:

P_m(t; f) is the polynomial of degree 2m − 1 such that P_m(t_j; f) = f(t_j) and P'_m(t_j; f) = f'(t_j) for j = 1, ..., m.

Q_m(t; f) is the polynomial of degree 2m − 2 such that Q_m(t_j; f) = f(t_j) for j = 1, ..., m and Q'_m(t_j; f) = f'(t_j) for j = 1, ..., m − 1.

From Lemma 2.1 we know that

f(t) − P_m(t; f) = (1/(2m)!) (t − t_1)² ... (t − t_m)² f^{(2m)}(ξ)   (23)

f(t) − Q_m(t; f) = (1/(2m−1)!) (t − t_1)² ... (t − t_{m−1})² (t − t_m) f^{(2m−1)}(ξ̄)   (24)

where ξ = ξ(t_1, ..., t_m, t), ξ̄ = ξ̄(t_1, ..., t_m, t) and

ξ, ξ̄ ∈ [min(t_1, ..., t_m, t), max(t_1, ..., t_m, t)].

In particular,

f'(t_m) − Q'_m(t_m; f) = (1/(2m−1)!) (t_m − t_1)² ... (t_m − t_{m−1})² f^{(2m−1)}(ξ̄_m),   (25)

where ξ̄_m = ξ̄(t_1, ..., t_m, t_m).

Since P_m(t; f) is a linear functional of (f(t_1), ..., f(t_m), f'(t_1), ..., f'(t_m)) and Q_m(t; f) is a linear functional of (f(t_1), ..., f(t_m), f'(t_1), ..., f'(t_{m−1})), with coefficients depending (in both cases) only on t_1, ..., t_m, t, it follows that

det Var(X_{t_1},...,X_{t_k}, X'_{t_1},...,X'_{t_k})^T = det Var(X_{t_1}, X'_{t_1}, X_{t_2} − P_1(t_2; X), X'_{t_2} − Q'_2(t_2; X), ...)^T ≃ [2!...(2k−1)!]^{−2} ∏_{1≤i<j≤k}(t_j − t_i)^8 · D_{2k−1}(t*),

with D_j(t) := det Var(X_t, X'_t, X^{(2)}_t, ..., X^{(j)}_t)^T.
Proposition 3.3 Suppose that X is a centered Gaussian process with C^{2k−1} paths and that, for each set of pairwise distinct values of the parameter t_1, t_2, ..., t_k ∈ I, the joint distribution of (X_{t_h}, X'_{t_h}, ..., X^{(2k−1)}_{t_h}, h = 1, 2, ..., k) is non-degenerate. Then, as t_1, t_2, ..., t_k → t*:

A⁺_{t_1,...,t_k}(0, ..., 0) ≃ J_k(t*) ∏_{1≤i<j≤k}(t_j − t_i)^4.   (26)

Proof. Note first that the argument proving (22), applied to the values alone, gives

det Var(X_{t_1},...,X_{t_k})^T ≃ (1/[2!...(k−1)!]²) ∏_{1≤i<j≤k}(t_j − t_i)² · D_{k−1}(t*).

For pairwise different values t_1, t_2, ..., t_k, let Z = (Z_1, ..., Z_k)^T be a random vector having the conditional distribution of (X'_{t_1}, ..., X'_{t_k})^T given X_{t_1} = X_{t_2} = ... = X_{t_k} = 0. The (Gaussian) distribution of Z is centered and we denote its covariance matrix by Σ. Also put:

Σ^{−1} = (1/det Σ) (Σ^{ij})_{i,j=1,...,k},

Σ^{ij} being the cofactor of the position (i, j) in the matrix Σ. Then one can write:

A⁺_{t_1,...,t_k}(0, ..., 0) = E(Z_1⁺ ... Z_k⁺) · p_{X_{t_1},...,X_{t_k}}(0, ..., 0)   (27)

and

E(Z_1⁺ ... Z_k⁺) = (1/((2π)^{k/2} (det Σ)^{1/2})) ∫_{(R⁺)^k} x_1 ... x_k exp(−F(x_1, ..., x_k)/(2 det Σ)) dx_1 ... dx_k,   (28)

where

F(x_1, ..., x_k) = Σ_{i,j=1}^{k} Σ^{ij} x_i x_j.
In the same way as above,

det Σ ≃ (1/[k!...(2k−1)!]²) ∏_{1≤i<j≤k}(t_j − t_i)^6 · D_{2k−1}(t*)/D_{k−1}(t*).
We consider now the behaviour of the Σ^{ij} (i, j = 1, ..., k). Let us first look at Σ^{11}. Using the same method as above, now applied to the cofactor of the position (1, 1) in Σ, one has:

Σ^{11} ≃ ( [2!...(2k−2)!]^{−2} ∏_{2≤i<j≤k}(t_j − t_i)^8 ∏_{2≤h≤k}(t_1 − t_h)^4 · D_{2k−2}(t*) ) / ( [2!...(k−1)!]^{−2} ∏_{1≤i<j≤k}(t_j − t_i)² · D_{k−1}(t*) )

= (1/[k!...(2k−2)!]²) ∏_{2≤i<j≤k}(t_j − t_i)^6 ∏_{2≤h≤k}(t_1 − t_h)² · D_{2k−2}(t*)/D_{k−1}(t*),

and similarly for the other cofactors. In the integral in (28) we make the change of variables

x_j = ∏_{i=1,i≠j}^{k} (t_i − t_j)² · y_j,   j = 1, ..., k,
so that

E(Z_1⁺ ... Z_k⁺) = (1/((2π)^{k/2} (det Σ)^{1/2})) ∏_{1≤i<j≤k}(t_j − t_i)^8 ∫_{(R⁺)^k} y_1 ... y_k exp(−G(y_1, ..., y_k)/(2 det Σ)) dy_1 ... dy_k,

where

G(y_1, ..., y_k) = Σ_{i,j=1}^{k} Σ^{ij} ∏_{h=1,h≠i}^{k} (t_h − t_i)² ∏_{h=1,h≠j}^{k} (t_h − t_j)² y_i y_j,

so that, as t_1, t_2, ..., t_k → t*,

G(y_1, ..., y_k)/det Σ → [(2k−1)!]² (D_{2k−2}(t*)/D_{2k−1}(t*)) (Σ_{i=1}^{k} y_i)².

Now, passage to the limit under the integral sign in (28), which is easily justified by application of the Lebesgue theorem, leads to

E(Z_1⁺ ... Z_k⁺) ≃ (1/(2π)^{k/2}) ∏_{1≤i<j≤k}|t_j − t_i|^5 · k!...(2k−1)! · (D_{k−1}(t*)/D_{2k−1}(t*))^{1/2} I_k(β),

where

I_k(β) := ∫_{(R⁺)^k} y_1 ... y_k exp(−(β/2)(Σ_{i=1}^{k} y_i)²) dy_1 ... dy_k   and   β = [(2k−1)!]² D_{2k−2}(t*)/D_{2k−1}(t*).

Since I_k(β) = β^{−k} I_k(1), combining this with the equivalent for p_{X_{t_1},...,X_{t_k}}(0, ..., 0) given at the beginning of the proof yields (26), with J_k(t*) expressed by means of I_k(1) and of the determinants D_{k−1}(t*), D_{2k−2}(t*) and D_{2k−1}(t*).
Numerical examples
4.1
First, let us compare the numerical computation based upon Theorem 2.1 with
the Monte-Carlo method based on the simulation of the paths. We do this for
stationary Gaussian processes that satisfy the hypotheses of Theorem 2.2 and also
the non-degeneracy condition that ensures that one is able to compute the factorial
moments of crossings by means of Rice formulae.
Suppose that we want to compute P(M > u) with an error bounded by ε, where ε is a given positive number.
To proceed by simulation, we discretize the paths by means of a uniform partition
{tj := j/n, j = 0, 1, ..., n}. Denote
M(n) := sup_{0≤j≤n} X_{t_j}.
Using Taylor's formula at the time where the maximum M of X(·) occurs, one gets:

0 ≤ M − M(n) ≤ ‖X'‖_∞/(2n).

It follows that

0 ≤ P(M > u) − P(M(n) > u) = P(M > u, M(n) ≤ u) ≤ P(u < M ≤ u + ‖X'‖_∞/(2n)).
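For concreteness, here is a minimal Monte-Carlo sketch of P(M(n) > u) for the process with covariance ρ_1(t) = exp(−t²/2) used in Section 4.2; the spectral simulation below is only approximately Gaussian, and the grid, frequency count and sample size are our choices:

```python
import numpy as np

# Monte-Carlo estimate of P(M(n) > u) for the stationary centered Gaussian
# process with covariance rho(t) = exp(-t^2/2), whose (probability-normalized)
# spectral density is standard normal.
rng = np.random.default_rng(0)

def sample_max(T=6.0, n=200, n_freq=128):
    """One realization of max_j X(t_j) on a uniform grid of [0, T],
    via the random-phase spectral representation."""
    t = np.linspace(0.0, T, n + 1)
    w = rng.standard_normal(n_freq)               # frequencies ~ spectral density
    phi = rng.uniform(0.0, 2 * np.pi, n_freq)     # independent uniform phases
    x = np.sqrt(2.0 / n_freq) * np.cos(np.outer(t, w) + phi).sum(axis=1)
    return x.max()

u = 0.0
est = np.mean([sample_max() > u for _ in range(1000)])
print(round(est, 2))
```

For u = 0 and T = 6 the estimate should be close to the Rice-series values around 0.95 reported in Section 4.2.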
The bound for γ*_m in Equation (10) implies that computing a partial sum with (const)·log(1/ε) terms ensures that the tail of the Rice series is bounded by ε. If one computes each ν̃_m by means of a Monte-Carlo method for the multiple integrals appearing in the Rice formulae, then the number of elementary operations for the whole procedure will be of the form (const) ε^{−2} log(1/ε). Hence, this is better than simulation as ε tends to zero.

As usual, for given ε > 0, the values of the generic constants decide the comparison between the two methods.
More important is the fact that the enveloping property of the Rice series implies that the actual number of terms required by the application of Theorem 2.1 can be much smaller than the one resulting from the a priori bound on γ*_m. More precisely, suppose that the factorial moments have been computed with a precision δ, that is,

|ν̂_m − ν̃_m| ≤ δ,

and that m_0 has been chosen so that

ν̂_{m_0+1}/((m_0 + 1)!) ≤ δ.   (29)

Then

| Σ_{m=1}^{m_0} (−1)^{m+1} ν̂_m/m! − Σ_{m=1}^{∞} (−1)^{m+1} ν̃_m/m! | ≤ δ Σ_{m=1}^{∞} 1/m! + ν̃_{m_0+1}/((m_0 + 1)!) ≤ δ(e − 1) + 2δ = δ(e + 1).

Putting δ = ε/(e + 1), we get the desired bound. In other words, one can profit from the successive numerical approximations of the ν̃_m to determine a new m_0, which turns out to be, in certain interesting examples, much smaller than the one deduced from the a priori bounds.
4.2
Next, we will give the results of the evaluation of P(M_T > u) using up to three terms of the Rice series in a certain number of typical cases. We compare these results with the classical evaluation using what is often called the Davies (1977) bound. In fact this bound seems to have been widely used since the work of Rice (1944). It is an upper bound with no control on the error, given by:

P(M > u) ≤ P(X_0 > u) + E(U_u([0, T])).   (30)

The above-mentioned result by Piterbarg (11) shows that, for fixed T and high level u, this bound is in fact sharp. In general, using more than one term of the Rice series supplies a remarkable improvement in the computation.
We consider several stationary centered Gaussian processes listed in the following
table, where the covariances and the corresponding spectral densities are indicated.
process   covariance                                   spectral density
X_1       ρ_1(t) = exp(−t²/2)                          f_1(x) = (2π)^{−1/2} exp(−x²/2)
X_2       ρ_2(t) = (ch(t))^{−1}                        f_2(x) = (2 ch(πx/2))^{−1}
X_3       ρ_3(t) = 3^{−1/2} t^{−1} sin(3^{1/2} t)      f_3(x) = 12^{−1/2} 1I_{{−√3 < x < √3}}
X_4       ρ_4(t) (the Fourier transform of f_4)        f_4(x) = 105 (5 + x²)^{−4}
u     ρ_1         ρ_2         ρ_3         ρ_4
-2    1.00        1.00        1.00        1.00
-1    0.98-1.00   0.98-1.00   0.99        0.98-1.00
 0    0.90-1.00   0.87-1.00   0.92-1.00   0.88-1.00
 1    0.74-0.77   0.70-0.76   0.76-0.78   0.72-0.77
 2    0.22        0.21        0.22        0.22
 3    0.02        0.02        0.02        0.02

Table 1: Values of P(M > u) for T = 10 for the stationary centered Gaussian processes with covariances ρ_1, ρ_2, ρ_3 and ρ_4 (one column per process). The calculation uses three terms of the Rice series for the upper bound and two terms for the lower bound. Both bounds are rounded to two decimals and, when they differ, both are displayed.
One, three, or two terms of the Rice series (R1, R3, R2 in the sequel), that is,

P(X_0 > u) + Σ_{m=1}^{K} (−1)^{m+1} ν̃_m/m!

with K = 1, 3 or 2. Note that the bound D differs from R1 due to the difference between ν_1 and ν̃_1. These bounds are evaluated for T = 4, 6, 8, 10, 15 and also for T = 20 and T = 40 when they fall in the range [0, 1]. Between these values an ordinary spline interpolation has been performed.
In addition we illustrate the complete detailed calculation in three chosen cases. They correspond to zero and positive levels u. For negative u, it is easy to check that the Davies bound is often greater than 1, and thus non-informative.

For u = 0, T = 6, ρ = ρ_1, we have P(X_0 > u) = 0.5, ν_1 = 0.955, ν̃_1 = 0.602, ν̃_2/2 = 0.150, ν̃_3/6 = 0.004, so that:

D = 1.455,  R1 = 1.103,  R3 = 0.957,  R2 = 0.953.

R2 and R3 give a rather good evaluation of the probability, while the Davies bound gives no information.
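The value ν_1 = 0.955 (and ν_1 = 0.215 in the third case below) can be checked against the classical Rice formula for the mean number of upcrossings of a stationary centered Gaussian process with unit variance; the helper function is ours:

```python
import math

# Rice's formula: for a stationary centered Gaussian process with unit
# variance and second spectral moment lambda2,
#   E[U_u([0, T])] = (T / (2*pi)) * sqrt(lambda2) * exp(-u**2 / 2).
def mean_upcrossings(T, u, lambda2=1.0):
    return T / (2 * math.pi) * math.sqrt(lambda2) * math.exp(-u ** 2 / 2)

# rho_1(t) = exp(-t^2/2) has lambda2 = -rho_1''(0) = 1:
print(round(mean_upcrossings(6, 0.0), 3))   # -> 0.955
# rho_3(t) = sin(sqrt(3) t)/(sqrt(3) t) also has lambda2 = 1:
print(round(mean_upcrossings(10, 2.0), 3))  # -> 0.215
```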
For u = 1.5, T = 15, ρ = ρ_2, we have P(X_0 > u) = 0.067, ν_1 = 0.517, ν̃_1 = 0.488, ν̃_2/2 = 0.08, ν̃_3/6 = 0.013, so that:

D = 0.584,  R1 = 0.555,  R3 = 0.488,  R2 = 0.475.

In this case the Davies bound is not sharp, and a very clear improvement is provided by the two bounds R2 and R3.

For u = 2, T = 10, ρ = ρ_3, we have P(X_0 > u) = 0.023, ν_1 = 0.215, ν̃_1 = 0.211, ν̃_2/2 = 0.014, ν̃_3/6 = 3·10^{−4}, so that:

D = 0.238,  R1 = 0.234,  R3 = 0.220,  R2 = 0.220.

In this case the Davies bound is rather sharp.

As a conclusion, these numerical results show that it is worth using several terms of the Rice series. In particular, the first three terms are relatively easy to compute and provide a good evaluation of the distribution of M under a rather broad set of conditions.
Acknowledgements
We thank C. Delmas for computational assistance. This work has received support from the ECOS program U97E02.
References
Figure 1: For the process with covariance ρ_1 and the level u = 1, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as functions of the length T of the interval.
Figure 2: For the process with covariance ρ_2 and the level u = 0, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as functions of the length T of the interval.
Figure 3: For the process with covariance ρ_3 and the level u = 2, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as functions of the length T of the interval.
Figure 4: For the process with covariance ρ_4 and the level u = 1.5, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as functions of the length T of the interval.
Elisabeth Gassiat, Cécile Mercadier
Introduction
Mixtures of populations are a modelling tool widely used in applications and the literature
on the subject is vast. For finite mixtures, the first task is the choice of the number of
components in the mixture. Some estimation or testing procedures have been proposed
for this purpose, see for instance the books of Titterington et al. (1985), Lindsay (1995)
and McLachlan and Peel (2000) or the papers of James et al. (2001), Gassiat (2002) and
references therein. Asymptotic optimality of the likelihood ratio test (LRT) in several
parametric contexts is well known. Using the LRT for testing the number of components
in a mixture appears quite natural. On the one hand, simulation studies show that the LRT performs well in various situations (see Goffinet et al., 1992). On the other hand, the asymptotic distribution and power of the test have to be evaluated for comparison with other known tests.
In this paper, we focus on the asymptotic properties of the LRT for testing that i.i.d.
observations X1 , . . . , Xn come from a mixture of p0 populations in a parametric set of densities F (null hypothesis H0 ) against a mixture of p populations (alternative H1 ), where the
integers p0 and p satisfy p0 < p.
In Section 2 we apply results of Gassiat (2002) to obtain the asymptotic distribution of
the LRT statistic for testing (H0 ) against (H1 ) under the null hypothesis as well as under
contiguous alternatives. Indeed, Gassiat (2002) gives a quite weak assumption under which the asymptotic distribution of the LRT statistic can be derived in the general situation where one has to test a small model within a larger one, under the null hypothesis as well as under contiguous alternatives. This applies to testing the number of components in a mixture of populations in a parametric set, possibly with an unknown nuisance parameter. However, apart from smoothness assumptions, the main point is that these asymptotic results require the parameter set to be bounded.
In Sections 3 and 4 we study what happens when the set of parameters becomes larger
and larger. For simplicity we restrict our attention to the simplest model: the contamination model for a family of distributions indexed by a single real parameter. Indeed, roughly
speaking, the LRT statistic converges in distribution to half the square of the supremum of
some Gaussian process indexed by a compact set of scores. But when this set of scores is
enlarged, the covariance of the Gaussian process is close to 0 for sufficiently distant scores,
so that the supremum of the Gaussian process may become arbitrarily large. Thus one also
knows that for unbounded sets of parameters, the LRT statistic tends to infinity in probability, as Hartigan first noted for normal mixtures (see Hartigan, 1985). Here, we prove
that under some extreme circumstances the LRT can have less power than moment tests or
goodness-of-fit tests. At the end of the introduction we carefully draw practical conclusions
from this result.
More precisely, let T be [−T, T] and F = {f_t, t ∈ T} be a parametric set of probability densities on R with respect to the Lebesgue measure. Using i.i.d. observations X_1, ..., X_n, we consider the testing problem for the density g of the observations:

(H0): g = f_0   against   (H1): g = (1 − π)f_0 + π f_t, 0 ≤ π ≤ 1, t ∈ T.   (1)
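As a concrete sketch, the LRT statistic 2λ_n for problem (1) can be approximated by maximizing the mixture log-likelihood over a grid of (π, t); here f_t = N(t, 1), and the grid maximization and the simulated sample are our simplifications, not the paper's procedure:

```python
import numpy as np

# Grid-based sketch of the LRT statistic for the contamination problem (1)
# with f0 = N(0,1) and f_t = N(t,1).
rng = np.random.default_rng(1)

def phi(x, t=0.0):
    """Gaussian density N(t, 1) evaluated at x."""
    return np.exp(-0.5 * (x - t) ** 2) / np.sqrt(2 * np.pi)

def lrt(x, T=3.0):
    """2 * (sup log-likelihood under H1 - log-likelihood under H0)."""
    pis = np.linspace(0.0, 1.0, 51)
    ts = np.linspace(-T, T, 121)
    ll0 = np.sum(np.log(phi(x)))
    best = ll0
    for t in ts:
        mix = np.outer(1 - pis, phi(x)) + np.outer(pis, phi(x, t))
        best = max(best, np.log(mix).sum(axis=1).max())
    return 2.0 * (best - ll0)

x = rng.standard_normal(300)   # data drawn under H0
print(lrt(x) >= 0.0)           # -> True
```

Since π = 0 reduces the alternative to the null, the statistic is always non-negative.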
We prove that:
For general parametric sets F and T = [−T, T] with T large enough, under contiguous alternatives the LRT for (1) has asymptotic power close to the asymptotic level, under some smoothness assumptions; see Theorem 7.
A set of assumptions is given under which Theorem 7 applies in the case of translation mixtures, that is, when f_t(·) = f_0(· − t); see Corollary 1. This is done in Section 3.
When ft is the standard Gaussian with mean t we get the normal mixture problem.
When the set of means is not a priori bounded, that is T = R, Liu and Shao (2004)
obtained the asymptotic distribution of the LRT under the null hypothesis by using
the strong approximation proved in Bickel and Chernoff (1993). We prove in Theorem
8 of Section 4 that the asymptotic power under contiguous alternatives is equal to the
asymptotic level.
The way to obtain these results is to gather together: the expansion of the LRT obtained in Gassiat (2002), to identify contiguity and apply Le Cam's third lemma (see van der Vaart, 1998); the behaviour of the supremum of a Gaussian process on an interval with bounds tending to infinity, as obtained in Azaïs and Mercadier (2004); and the normal comparison inequality as refined in Li and Shao (2002). Proofs of most results of Sections 3 and 4 are detailed in Section 5.
Independently of our work, for the Gaussian model with unbounded means, Hall and
Stewart (2004) obtained the speed of separation of alternatives that ensures asymptotic
power to be bigger than the asymptotic level. Their result indicates that the critical speed should be of order √(log log n)/√n.
Practical application
Tests that have power less than or equal to the level are sometimes called worthless (see for example van der Vaart, 1998), but this word could be misleading, because the practical interpretation of our result must take into account the following points:

It is well known that for mixtures of populations the convergence to the asymptotic distribution is in general very slow. For example, for a very simple test such as the skewness test, Boistard (2003) showed that n = 10³ observations are needed to reach the asymptotic distribution.
For maximum likelihood estimates (MLE) and tests, the problem of the speed of convergence to the asymptotic distribution is very difficult to address, since in practice MLEs are computed through iterative algorithms and are only approximate. The most famous one is the EM algorithm and its variants. All these algorithms depend on tuning constants, in particular concerning the stopping rule. It is shown for example in Table 6.3 of McLachlan and Peel (2000) (based on results by Seidel et al., 2000) that the distribution of the LRT depends heavily on these tuning constants. Simulation results by Liu and Shao (2004) suggest that the asymptotic distribution is not reached for n = 5·10³ observations.
Nowadays some results and software are available to compute the distribution of the maximum of Gaussian processes. See for example Garel (2001), Delmas (2003) and Mercadier (2004). In particular, these results show that, as soon as the means are contained in a moderately sized set, the asymptotic power under contiguous alternatives of the LRT is generally better than that of moment tests or of goodness-of-fit tests. Nevertheless, the LRT is not uniformly most powerful.
Our result showing that the LRT is asymptotically less powerful than moment tests is valid in practice only for very large data sets. For all the reasons above, it is very difficult to say precisely how large. Simulations have shown that, in practice, LRTs based on Monte-Carlo calculation of the threshold, or on bootstrapping, behave well for unbounded parameters (see Goffinet et al., 1992).

Our opinion is that the main consequence of our result for large or unbounded parameter sets is that the study of the LRT for mixtures in the compact case is the most relevant one.
mixture models, allowing an unknown nuisance parameter and setting a general result in
such situations.
Assume one would like to use the LRT for testing (H0): g ∈ M_0 against (H1): g ∈ M, where g is the generic density of the i.i.d. observations X_1, ..., X_n, and M_0 ⊂ M are sets of densities with respect to some measure μ on R^k (or, more generally, on some Polish space). Let ℓ_n(g) = Σ_{i=1}^{n} log g(X_i) be the log-likelihood, and let

λ_n = sup_{g∈M} ℓ_n(g) − sup_{g∈M_0} ℓ_n(g)

be the LRT statistic. Let also g_0 be a density in M_0 that will denote the true (unknown) density of the observations. In the first considered examples, and without loss of generality, we will assume that g_0 coincides with f_0.
Throughout the paper we use ‖·‖_2 to denote the norm in L²(g_0).

When studying ℓ_n(g) − ℓ_n(g_0), the functions (g − g_0)/g_0 appear naturally. Define the set S as the subset of the unit sphere in L²(g_0) consisting of such functions when normalized:

S = { ((g − g_0)/g_0) / ‖(g − g_0)/g_0‖_2 , g ∈ M \ {g_0} },   (2)

S_0 = { ((g − g_0)/g_0) / ‖(g − g_0)/g_0‖_2 , g ∈ M_0 \ {g_0} }.   (3)
A bracket [L, U] of length δ is the set of functions b such that L ≤ b ≤ U, where L and U are functions in L²(g_0) such that ‖U − L‖_2 ≤ δ. Define H_{[],2}(S, δ), the entropy with bracketing of S with respect to the norm ‖·‖_2, as the logarithm of the number of brackets of length δ needed to cover S. To apply the theorem in Gassiat (2002), the only assumption needed is:

∫_0^1 √(H_{[],2}(S, δ)) dδ < +∞.   (4)
This assumption implies in particular that S is Donsker and that its closure is compact. As said before, when M is parameterized, S is also parameterized, and smoothness properties will allow one to verify (4). But in general the parameterization will not be continuous throughout S. The delicate point may be that one has to find all possible limit points, in L²(g_0), of sequences ((g_n − g_0)/g_0) / ‖(g_n − g_0)/g_0‖_2 when ‖(g_n − g_0)/g_0‖_2 tends to 0. The set D (resp. D_0) of limit points of sequences ((g_n − g_0)/g_0) / ‖(g_n − g_0)/g_0‖_2 where ‖(g_n − g_0)/g_0‖_2 tends to 0, g_n ∈ M \ {g_0} (resp. g_n ∈ M_0 \ {g_0}), will be parameterized in such a way that Lipschitz properties can be used on subsets.
Let us for example see how this applies to the simple contamination mixture model (1). In this case,

M_0 = {f_0},  M = { g_{π,t} = (1 − π)f_0 + π f_t , 0 ≤ π ≤ 1, t ∈ [−T, T] }

for a given positive real number T. Since M_0 is a singleton, we do not need to define S_0 and D_0. One has (g_{π,t} − g_0)/g_0 = π (f_t − f_0)/f_0, so that

S = { ((f_t − f_0)/f_0) / ‖(f_t − f_0)/f_0‖_2 , t ∈ [−T, 0) ∪ (0, T] }.

If ‖(g_{π_n,t_n} − g_0)/g_0‖_2 tends to 0, a limit point exists only if t_n converges. Setting

d_t = ((f_t − f_0)/f_0) / ‖(f_t − f_0)/f_0‖_2 ,  t ∈ [−T, 0) ∪ (0, T],

one obtains

D = { d_t , t ∈ [−T, 0) ∪ (0, T] } ∪ { d_{0−} = −(f'_0/f_0)/‖f'_0/f_0‖_2 , d_{0+} = (f'_0/f_0)/‖f'_0/f_0‖_2 }.

Here derivatives are taken with respect to the parameter t. Again under smoothness assumptions, it will be possible to prove, considering {d_t, t ∈ [−T, 0), d_{0−}} and {d_t, t ∈ (0, T], d_{0+}}, that the number of brackets of length δ needed to cover S is of order at most O(1/δ), so that Assumption (4) holds. (A complete proof is given below for contamination models with multidimensional parameterization.)
In general, when M_0 contains more than one density, D_0 ⊂ D, and if the parameterization is smooth enough, it will be possible to define a set U in R^{k_0} × R^{k_1} and a set U_0 in R^{k_0} such that

D = {d_u, u ∈ U} and D_0 = {d_{(v,0)}, v ∈ U_0}.
Define the covariance function r(·,·) on U × U by

r(u_1, u_2) = ∫ d_{u_1} d_{u_2} g_0 dμ.

Then

2λ_n = sup_{u∈U} ( max{ n^{−1/2} Σ_{i=1}^{n} d_u(X_i), 0 } )² − sup_{v∈U_0} ( max{ n^{−1/2} Σ_{i=1}^{n} d_{(v,0)}(X_i), 0 } )² + o_{P_0}(1),   (5)

where Z(·) is the Gaussian process on U with covariance r(·,·) and P_0 is the joint distribution of the observations X_1, ..., X_n under the null hypothesis. In the particular case when M_0 is reduced to a single element, a direct application of Corollary 3.1 of Gassiat (2002) gives that 2λ_n converges in distribution to

sup_{u∈U} ( max{ Z(u), 0 } )².   (6)
It will be seen in the examples below that r(·,·) is in general not continuous everywhere on the closure of U × U. Z(·) is then not a continuous Gaussian field, though the isonormal process on D is continuous, so that the suprema involved in (5) are a.s. finite. In general, r(·,·) is continuous almost everywhere. In the simple contamination mixture model (1), for non-null s and t,

r(s, t) = ∫ ( ((f_t − f_0)/f_0)/‖(f_t − f_0)/f_0‖_2 ) ( ((f_s − f_0)/f_0)/‖(f_s − f_0)/f_0‖_2 ) f_0 dμ;   (7)

r is continuous for non-zero s and t and admits the following limits:

r(0+, 0+) = r(0−, 0−) = 1,  r(0+, 0−) = −1.
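These limiting covariance values can be checked explicitly in the Gaussian translation case f_t = N(t, 1), where the inner product of the normalized scores has a closed form (the formula exp(st) − 1 is an elementary computation of ours):

```python
import numpy as np

# In the Gaussian translation case, (f_t - f0)/f0 = exp(t*x - t^2/2) - 1 and
# its L2(f0) inner products give <h_s, h_t> = exp(s*t) - 1, so
#   r(s, t) = (exp(s*t) - 1) / sqrt((exp(s*s) - 1) * (exp(t*t) - 1)).
def r(s, t):
    num = np.exp(s * t) - 1.0
    return num / np.sqrt((np.exp(s * s) - 1.0) * (np.exp(t * t) - 1.0))

eps = 1e-3
print(round(r(eps, eps), 3))   # -> 1.0  (same-sign scores)
print(round(r(eps, -eps), 3))  # -> -1.0 (opposite-sign scores)
```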
It is also proved in Gassiat (2002) that if the densities g_n in M \ M_0 are such that ((g_n − g_0)/g_0) / ‖(g_n − g_0)/g_0‖_2 converges to some d_{u_0}, with √n ‖(g_n − g_0)/g_0‖_2 tending to a positive constant c, then the distributions (g_0)^{⊗n} and (g_n)^{⊗n} are mutually contiguous, and 2λ_n converges in distribution under this contiguous alternative to

sup_{u∈U} ( max{ Z(u) + c r(u, u_0), 0 } )² − sup_{v∈U_0} ( max{ Z(v, 0) + c r((v, 0), u_0), 0 } )².   (8)

In general, (5) and (8) reduce to the square of only one supremum, due to the particular structure of the Gaussian process.
We will see, in the subsequent subsections, examples such as translation mixtures and exponential families, in particular Bernoulli or Gaussian mixtures.
2.1

Contamination mixture

We consider here the contamination mixture model where the parameter t may be multidimensional: t ∈ T, T being a compact subset of R^k such that 0 belongs to the interior of T. Let ‖·‖ and ⟨·,·⟩ denote the Euclidean norm and scalar product in R^k. Again,

M_0 = {f_0},  M = { g_{π,t} = (1 − π)f_0 + π f_t , 0 ≤ π ≤ 1, t ∈ T },

S = { d_t = ((f_t − f_0)/f_0) / ‖(f_t − f_0)/f_0‖_2 , t ∈ T, t ≠ 0 }.
We shall use the following Assumptions (CM), ensuring smoothness and some non-degeneracy:

(CM)

f_t = f_0 a.e. if and only if t = 0.

There exist a positive real η and a function B ∈ L²(f_0) that upper bounds all of the following functions:

|f_t/f_0|, |(1/f_0) ∂f_t/∂t_i|,  i = 1, ..., k, t ∈ T,

|(1/f_0) ∂²f_t/∂t_i∂t_j|,  i, j = 1, ..., k, t ∈ T, ‖t‖ ≤ η.

Notice that in this assumption the real number η is fixed. We shall prove that condition (4) holds true for S by splitting it into the two sets

S_1 = {d_t, t ∈ T, ‖t‖ ≥ η} and S_2 = {d_t, t ∈ T, 0 < ‖t‖ < η}.
Since ‖(g_{π,t} − f_0)/f_0‖_2 = π ‖(f_t − f_0)/f_0‖_2 tends to 0 as soon as π or ‖t‖ tends to 0, it is easy to see that a limit point exists only if either t converges to a limit different from 0, or t/‖t‖ converges to some λ. One obtains easily

D = { d_t , t ∈ T, t ≠ 0 } ∪ { d_λ = ( Σ_{i=1}^{k} λ_i (1/f_0) ∂f_t/∂t_i |_{t=0} ) / ‖ Σ_{i=1}^{k} λ_i (1/f_0) ∂f_t/∂t_i |_{t=0} ‖_2 , ‖λ‖ = 1 }.

Set h_t = (f_t − f_0)/f_0.
t
dt
= ti
ti
ht 2
ht
ht
ht
f0 d
ht
.
ht 2
This proves that there exists a constant C such that, for all t and s with ‖t‖ ≥ ε and ‖s‖ ≥ ε, |d_t − d_s| ≤ C B ‖t − s‖, so that the number of brackets of length δ needed to cover S1 is at most of order O(1/δ^k), and Condition (4) holds true for the set S1.
Now, for any γ ∈ R^k such that ‖γ‖ = 1, one has, letting t = ργ, ρ ∈ R,

∂/∂ρ ( d_{ργ} ) = [ Σ_{i=1}^k γ_i ∂h_t/∂t_i ] / ‖h_t‖_2 − ( ∫ [ Σ_{i=1}^k γ_i ∂h_t/∂t_i ] h_t f_0 dμ ) h_t / ‖h_t‖_2^3 ,

all derivatives being taken at t = ργ. But using Taylor expansions, there exist τ and τ̃ in [0, ρ] such that

h_{ργ} = ρ Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} + (ρ^2/2) Σ_{i,j=1}^k γ_i γ_j ∂^2 h_t/∂t_i ∂t_j |_{t=τγ} ,

Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=ργ} = Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} + ρ Σ_{i,j=1}^k γ_i γ_j ∂^2 h_t/∂t_i ∂t_j |_{t=τ̃γ} ,

so that

∂/∂ρ ( d_{ργ} ) = H_{ργ} / ‖h_{ργ}‖_2 − ( ∫ H_{ργ} h_{ργ} f_0 dμ ) h_{ργ} / ‖h_{ργ}‖_2^3 ,

with

H_{ργ} = Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=ργ} − (1/ρ) h_{ργ} = ρ [ Σ_{i,j=1}^k γ_i γ_j ∂^2 h_t/∂t_i ∂t_j |_{t=τ̃γ} − (1/2) Σ_{i,j=1}^k γ_i γ_j ∂^2 h_t/∂t_i ∂t_j |_{t=τγ} ].
But using (CM), this implies that for some constant C and all ρ ∈ (0, ε],

| ∂/∂ρ ( d_{ργ} ) | ≤ C B,

and that

lim_{ρ→0} ∂/∂ρ ( d_{ργ} ) = (1/2) Σ_{i,j=1}^k γ_i γ_j ∂^2 h_t/∂t_i ∂t_j |_{t=0} / ‖ Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} ‖_2 − ( ∫ (1/2) ( Σ_{i,j=1}^k γ_i γ_j ∂^2 h_t/∂t_i ∂t_j |_{t=0} ) ( Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} ) f_0 dμ ) Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} / ‖ Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} ‖_2^3 .

In particular d_{ργ} extends continuously at ρ = 0 with value d_γ, and for ρ, ρ′ ∈ (0, ε] and unit vectors γ, γ′,

‖ d_{ργ} − d_{ρ′γ′} ‖ ≤ ‖ d_{ργ} − d_γ ‖ + ‖ d_γ − d_{γ′} ‖ + ‖ d_{γ′} − d_{ρ′γ′} ‖ ≤ C B (ρ + ρ′) + ‖ d_γ − d_{γ′} ‖.

Moreover, using (CM), there exists a positive constant C such that

inf_{‖γ‖=1} ‖ Σ_{i=1}^k γ_i ∂h_t/∂t_i |_{t=0} ‖_2 ≥ C,

so that, for ‖γ‖ = ‖γ′‖ = 1,

‖ d_γ − d_{γ′} ‖ ≤ C B ‖ γ − γ′ ‖.

This yields the entropy bound for S2, and Condition (4) holds true for the whole set S.
Set

r(s,t) = ∫ d_s d_t f_0 dμ,  (9)

and let Z(·) be the Gaussian field on T \ {0} with covariance r. Notice that, on each direction γ such that t → 0 with t/‖t‖ → γ, one may extend r(·,·) by continuity, setting

r(γ, t) = r(t, γ) = ∫ d_γ d_t f_0 dμ ;  r(γ, γ′) = ∫ d_γ d_{γ′} f_0 dμ.  (10)
Theorem 1 Assume (CM). Then (f_0)^{⊗n} and [ (1−π_n) f_0 + π_n f_{t_n} ]^{⊗n} are mutually contiguous, and 2 λ_n converges under (f_0)^{⊗n} in distribution to

sup_{t∈T} ( max{ Z(t), 0 } )^2 = ( sup_{t∈T} Z(t) )^2,

and under the contiguous alternative to

sup_{t∈T} ( max{ Z(t) + m(t), 0 } )^2 = ( sup_{t∈T} ( Z(t) + m(t) ) )^2,

with

m(t) = c r(t, t_0) if t_n → t_0 ≠ 0, and m(t) = c r(t, γ) if t_n → 0 and t_n/‖t_n‖ → γ.  (11)

Remark: Set m ≡ 0 under (f_0)^{⊗n} and m as in (11) under the alternative. Letting t go to 0 radially in two opposite directions and using the covariance properties in the neighbourhood of 0, we see that almost surely sup_{t∈T} ( Z(t) + m(t) ) > 0, which justifies the equalities in the preceding theorem.
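The limiting variable of Theorem 1 can be approximated by Monte Carlo on a finite grid. The sketch below is illustrative only: it uses the covariance r(s,t) = (exp(st)−1)/√((exp(s²)−1)(exp(t²)−1)) of the one-dimensional Gaussian contamination case (Section 2.1.2); the grid, sample size and jitter are arbitrary choices, not taken from the paper.

```python
import numpy as np

# Monte Carlo sketch of sup_t (max{Z(t),0})^2 on a finite grid, for the
# covariance r of the 1-d Gaussian contamination case (Section 2.1.2).
def r(s, t):
    return (np.exp(s * t) - 1.0) / np.sqrt((np.exp(s * s) - 1.0) * (np.exp(t * t) - 1.0))

def simulate_sup_statistic(grid, n_rep, rng):
    # Covariance matrix of (Z(t), t in grid), with a small jitter so that
    # the Cholesky factorization is numerically safe.
    K = np.array([[r(s, t) for t in grid] for s in grid]) + 1e-9 * np.eye(len(grid))
    L = np.linalg.cholesky(K)
    Z = rng.standard_normal((n_rep, len(grid))) @ L.T   # rows ~ N(0, K)
    return np.maximum(Z.max(axis=1), 0.0) ** 2          # sup_t (max{Z(t),0})^2

rng = np.random.default_rng(0)
grid = np.linspace(0.1, 2.0, 40)
stats = simulate_sup_statistic(grid, 5000, rng)
print(stats.mean(), np.quantile(stats, 0.95))
```

On a finer grid and for larger T the simulated quantiles can serve as approximate rejection thresholds for 2 λ_n.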
Let us give applications of this theorem to particular models:
2.1.1
Translation mixtures
We consider the translation mixture model, where μ is the Lebesgue measure and

f_t(·) = f_0(· − t).
Then, it is easy to see that Theorem 1 applies as soon as the following Assumptions (CTM) hold:

(CTM)
- f_0 is positive on R^k,
- There exists a function B ∈ L^2(f_0) that upper bounds all of the following functions:
  f_0(x−t)/f_0(x) and (1/f_0(x)) |∂f_0/∂x_i (x−t)| , i = 1, ..., k, t ∈ T,
  (1/f_0(x)) |∂^2 f_0/∂x_i ∂x_j (x−t)| , i, j = 1, ..., k, t ∈ T, ‖t‖ ≤ ε.
Indeed, since ∂f_t/∂t_i (x) = −(∂f_0/∂x_i)(x − t), if γ is such that Σ_{i=1}^k γ_i ∂f_0/∂x_i = 0 a.e., then γ = 0; so that Σ_{i=1}^k γ_i ∂f_t/∂t_i |_{t=0} = 0 a.e. is impossible unless γ = 0.
Here are some examples of situations in which these assumptions are met: f_0 being the inverse of a polynomial with degree at least 2 (among which the Cauchy density), the Gaussian densities, and the normalization of cosh(x)^{−1}.
The covariance function r is given for non-null s and t by

r(s,t) = [ ∫ f_0(x−s) f_0(x−t)/f_0(x) dμ(x) − 1 ] / √( [ ∫ f_0(x−s)^2/f_0(x) dμ(x) − 1 ] [ ∫ f_0(x−t)^2/f_0(x) dμ(x) − 1 ] ),

and if the dimension k = 1, one may define r(0+, 0−) = −1 and, for non-null t,

r(0+, t) = −r(0−, t) = [ ∫ f_0′(x) f_0(x−t)/f_0(x) dμ(x) ] / √( [ ∫ f_0(x−t)^2/f_0(x) dμ(x) − 1 ] ∫ f_0′(x)^2/f_0(x) dμ(x) ).
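The covariance above is easy to evaluate numerically. The sketch below does so for the Cauchy density (one of the examples meeting (CTM)); the integration grid and the plain Riemann sum are illustrative choices, and the tail truncation introduces a small error.

```python
import numpy as np

# Numerical sketch of the translation-mixture covariance r(s,t) for the
# Cauchy density f0; integrands decay like x^-2 so a truncated grid works.
x = np.linspace(-400.0, 400.0, 400001)
dx = x[1] - x[0]

def f0(u):
    return 1.0 / (np.pi * (1.0 + u * u))

def N(s, t):
    # N(s,t) = integral of f0(x-s) f0(x-t) / f0(x) dx
    return np.sum(f0(x - s) * f0(x - t) / f0(x)) * dx

def r(s, t):
    return (N(s, t) - 1.0) / np.sqrt((N(s, s) - 1.0) * (N(t, t) - 1.0))

print(r(0.5, 1.0), r(1.0, 1.0))
```

By construction r(t,t) = 1, and the numerical values stay within [−1, 1] up to truncation error.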
2.1.2
Gaussian mixtures
Without loss of generality we may assume that f_0 is the standard normal density. Let K be a bound for ‖t‖, t ∈ T. Then the following bounds show that the function B exists for any ε:

f_0(x−t)/f_0(x) = exp( ⟨x,t⟩ − ‖t‖^2/2 ) ≤ exp( K ‖x‖ ),

(1/f_0(x)) |∂f_0/∂x_i (x−t)| = |x_i − t_i| f_0(x−t)/f_0(x) ≤ ( ‖x‖ + K ) exp( K ‖x‖ ),

(1/f_0(x)) |∂^2 f_0/∂x_i ∂x_j (x−t)| = |x_i − t_i| |x_j − t_j| f_0(x−t)/f_0(x) ≤ ( ‖x‖ + K )^2 exp( K ‖x‖ ), i ≠ j,

(1/f_0(x)) |∂^2 f_0/∂x_i^2 (x−t)| = |(x_i − t_i)^2 − 1| f_0(x−t)/f_0(x) ≤ [ 1 + ( ‖x‖ + K )^2 ] exp( K ‖x‖ ).
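For k = 1 and standard normal f_0, the (non-normalized) score covariance E[h_s(X) h_t(X)] with h_t(x) = exp(t x − t²/2) − 1 equals exp(st) − 1, which yields the closed-form correlation given below for the Gaussian case. The quadrature check of this identity is a sketch; grid choices are illustrative.

```python
import numpy as np

# Quadrature check of E[h_s h_t] = exp(st) - 1 for X ~ N(0,1),
# h_t(x) = exp(t x - t^2/2) - 1, using a plain Riemann sum.
x = np.linspace(-12.0, 12.0, 20001)
dx = x[1] - x[0]
phi = np.exp(-x * x / 2.0) / np.sqrt(2.0 * np.pi)

def h(t):
    return np.exp(t * x - t * t / 2.0) - 1.0

def cov(s, t):
    return np.sum(h(s) * h(t) * phi) * dx

s, t = 0.7, 1.3
lhs = cov(s, t)
rhs = np.exp(s * t) - 1.0
corr = lhs / np.sqrt(cov(s, s) * cov(t, t))
closed = rhs / np.sqrt((np.exp(s * s) - 1.0) * (np.exp(t * t) - 1.0))
print(lhs, rhs, corr, closed)
```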
So (CTM) holds, and Theorem 1 applies, as soon as f_0 is some Gaussian density on R^k and T is compact. The covariance of the process Z is

r(s,t) = ( exp(⟨t,s⟩) − 1 ) / √( ( exp(‖t‖^2) − 1 )( exp(‖s‖^2) − 1 ) ).

2.1.3 Binomial mixtures
Here μ is the measure with density k!/(x!(k−x)!) with respect to the counting measure on the set {0, 1, ..., k}. We consider the binomial family Bi(k, θ) with density θ^x (1−θ)^{k−x}, x = 0, 1, ..., k. Let θ_0 ∈ (0,1) and let f_t be the density of Bi(k, θ_0 + t). The most relevant case for genetic applications is the case θ_0 = 1/2; see Problem 1 in Chernoff and Lander (1995). We have
f_t(x) = (t + θ_0)^x (1 − t − θ_0)^{k−x},

∂f_t/∂t (x) = [ x/(t + θ_0) − (k−x)/(1 − t − θ_0) ] f_t(x),

∂^2 f_t/∂t^2 (x) = [ ( x/(t + θ_0) − (k−x)/(1 − t − θ_0) )^2 − x/(t + θ_0)^2 − (k−x)/(1 − t − θ_0)^2 ] f_t(x).

It is clear that f_t(x) and ∂f_t/∂t (x) are uniformly upper bounded and that ∂^2 f_t/∂t^2 (x) is upper bounded for t small enough, proving Assumptions (CM). Direct calculations lead to
(f_t − f_0)/f_0 (x) = ( 1 + t/θ_0 )^x ( 1 − t/(1−θ_0) )^{k−x} − 1,

r(s,t) = Λ(s,t) / √( Λ(s,s) Λ(t,t) ),

with

Λ(s,t) = Σ_{x=0}^k [ (1 + s/θ_0)^x (1 − s/(1−θ_0))^{k−x} − 1 ] [ (1 + t/θ_0)^x (1 − t/(1−θ_0))^{k−x} − 1 ] ( k!/(x!(k−x)!) ) θ_0^x (1−θ_0)^{k−x}.
2.1.4 Exponential family mixtures

This case generalizes the preceding one. Let f_t be a regular exponential family with exhaustive statistic T(x) = (T_1(x), ..., T_k(x)):

f_t(x) = f_0(x) exp( Σ_{i=1}^k t_i T_i(x) − φ(t) ),

and assume T is a compact subset in the interior of the definition set of the exponential family. Then t ↦ f_t is infinitely differentiable on T. Let F(x) = sup_{t∈T} exp( Σ_{i=1}^k t_i T_i(x) ).
Assumption (CEM) will be:
Assumption (CEM) will be:
(CEM)
There exists B in L2 (f0 ) that upper bounds all following functions: F , |Ti |F ,
|Ti Tj |F , i, j = 1, . . . , k.
One can see easily that (CEM) implies (CM), so that Theorem 1 applies to exponential families as soon as (CEM) holds. Direct calculations again lead to

(f_t − f_0)/f_0 (x) = exp( Σ_{i=1}^k t_i T_i(x) − φ(t) ) − 1,

r(s,t) = ( exp( φ(s+t) − φ(s) − φ(t) ) − 1 ) / √( ( exp( φ(2s) − 2φ(s) ) − 1 )( exp( φ(2t) − 2φ(t) ) − 1 ) ).
2.2 Testing one population against two

We consider here the case where one wants to test a single population in the family of densities f_t, t ∈ T, T a compact subset of R^k, against a mixture of two such populations. That is,

M0 = { f_t , t ∈ T },

and

M = { g_{π,t1,t2} = (1−π) f_{t1} + π f_{t2} , 0 ≤ π ≤ 1, t1 ∈ T, t2 ∈ T }.
We suppose moreover that 0 is an interior point of T and that f0 is the unknown distribution
of the observations (with no loss of generality). We shall use Assumptions (TP), insuring
smoothness and some non degeneracy:
(TP)
- (1−π) f_{t1} + π f_{t2} = f_0 a.e. if and only if (π = 0 and t1 = 0) or (π = 1 and t2 = 0) or (t1 = 0 and t2 = 0),
- t ↦ f_t is three times continuously differentiable a.e. at any t ∈ T,
- for all γ ∈ R^k, s ∈ T and a ≥ 0: a (f_s − f_0) + Σ_{i=1}^k γ_i ∂f_t/∂t_i |_{t=0} = 0 a.e. if and only if a (f_s − f_0) = 0 a.e. and γ = 0,
- there exists ε > 0 such that, for all γ ∈ R^k and all t ∈ T with ‖t‖ ≤ ε, Σ_{i,j=1}^k γ_i γ_j ∂^2 f_t/∂t_i ∂t_j = 0 a.e. if and only if γ = 0,
- there exists a function B ∈ L^2(f_0) that upper bounds all of the following functions:
  f_t/f_0, (1/f_0) ∂f_t/∂t_i and (1/f_0) ∂^2 f_t/∂t_i ∂t_j , i, j = 1, ..., k, t ∈ T,
  (1/f_0) ∂^3 f_t/∂t_i ∂t_j ∂t_l , i, j, l = 1, ..., k, t ∈ T, ‖t‖ ≤ ε.
The set of limit scores is now

D = { d_{t,a,γ} = [ a (f_t − f_0)/f_0 + Σ_{i=1}^k γ_i (1/f_0) ∂f_0/∂t_i ] / ‖ a (f_t − f_0)/f_0 + Σ_{i=1}^k γ_i (1/f_0) ∂f_0/∂t_i ‖_2 :
t ∈ T \ {0}, γ ∈ R^k, a ≥ 0, a + ‖γ‖ = 1 },

D0 = { d_{0,0,γ} , ‖γ‖ = 1 }.
Let r(·,·) be as in Section 2.1:

r(s,t) = ∫ ( h_s/‖h_s‖_2 )( h_t/‖h_t‖_2 ) f_0 dμ,

with h_t = (f_t − f_0)/f_0. Let W be the k-dimensional centered Gaussian vector with covariance Γ given by

Γ_{i,j} = ∫ [ (1/f_0) ∂f_0/∂t_i / ‖(1/f_0) ∂f_0/∂t_i‖_2 ] [ (1/f_0) ∂f_0/∂t_j / ‖(1/f_0) ∂f_0/∂t_j‖_2 ] f_0 dμ, i, j = 1, ..., k,

and for any t, let C(t) be the k-dimensional vector of covariances of Z(t) and W:

C(t)_i = ∫ [ (1/f_0) ∂f_0/∂t_i / ‖(1/f_0) ∂f_0/∂t_i‖_2 ] ( h_t/‖h_t‖_2 ) f_0 dμ, i = 1, ..., k.
The limit under the null involves

sup_{ a ≥ 0, t ∈ T, γ ∈ R^k, a^2 + γᵀΓγ + 2a γᵀC(t) = 1 } ( a Z(t) + ⟨γ, W⟩ ).  (12)

Remark that

sup_{ γᵀΓγ = 1 } ⟨γ, W⟩ = √( Wᵀ Γ^{−1} W ),  (13)

and that the supremum is attained for γ colinear to Γ^{−1} W. Then consider the matrix

Σ(t) = ( 1 , C(t)ᵀ ; C(t) , Γ ),

whose inverse may be written blockwise with upper-left entry 1 + C(t)ᵀ M(t) C(t), where M(t) = ( Γ − C(t) C(t)ᵀ )^{−1}. For fixed t, the supremum in (12) without the constraint a ≥ 0 equals

√( ( Z(t), Wᵀ ) Σ(t)^{−1} ( Z(t), W )ᵀ ),  (14)

which is equal to

√( [ √(1 + C(t)ᵀM(t)C(t)) ( Z(t) − C(t)ᵀM(t)W / (1 + C(t)ᵀM(t)C(t)) ) ]^2 + Wᵀ Γ^{−1} W ).

Taking the constraint a ≥ 0 into account, the supremum of (12) squared equals

sup_{t∈T} ( max{ √(1 + C(t)ᵀM(t)C(t)) ( Z(t) − C(t)ᵀM(t)W / (1 + C(t)ᵀM(t)C(t)) ), 0 } )^2 + Wᵀ Γ^{−1} W,  (15)

the second term being the squared supremum over the scores of the null model M0.
Indeed one may see, letting t go to 0 radially in two opposite directions, that the supremum of the Gaussian process involved in formula (15) is non-negative. Let now π_n, t1_n and t2_n be sequences such that

[ ( (1−π_n) f_{t1_n} + π_n f_{t2_n} − f_0 )/f_0 ] / ‖ ( (1−π_n) f_{t1_n} + π_n f_{t2_n} − f_0 )/f_0 ‖_2

tends to some d_{t_0,a_0,γ_0} in the closure of D, with lim_{n→+∞} √n ‖ ( (1−π_n) f_{t1_n} + π_n f_{t2_n} − f_0 )/f_0 ‖_2 = c for some positive constant c. Then, using the same tricks again:
Theorem 2 Assume (TP). Then (f_0)^{⊗n} and [ (1−π_n) f_{t1_n} + π_n f_{t2_n} ]^{⊗n} are mutually contiguous, and 2 λ_n converges under (f_0)^{⊗n} in distribution to

sup_{t∈T} ( max{ √(1 + C(t)ᵀM(t)C(t)) ( Z(t) − C(t)ᵀM(t)W / (1 + C(t)ᵀM(t)C(t)) ), 0 } )^2,

and under the contiguous alternative to the same expression with (Z(·), W) shifted by its covariance with c d_{t_0,a_0,γ_0}, where if t_0 = 0 then a_0 = 0.

Notice that, when t_0 = 0, d_{0,a_0,γ_0} = d_{0,0,γ_0}, and c ⟨ d_{0,0,γ_0} , d_{t,a,γ} ⟩ = c a C(t)ᵀ γ_0 + c γᵀ Γ γ_0. This is why one has to take a_0 = 0 when t_0 = 0 in the last formula of Theorem 2.
2.2.1 Examples.

For Bernoulli mixtures and, more generally, for exponential family mixtures as in Section 2.1.4, W is the Gaussian vector with covariance the correlation matrix of the vector (T_1(X), ..., T_k(X)) when X has density f_0. Recall that the variance matrix of the vector (T_1(X), ..., T_k(X)) when X has density f_0 is the matrix D^2 φ of second derivatives of the function φ at point 0, and the vector C(t) is given by

C(t)_i = ( ∂φ/∂t_i (t) − ∂φ/∂t_i (0) ) / ( √( ∂^2 φ/∂t_i^2 (0) ) ‖h_t‖_2 ), i = 1, ..., k.

2.3 Contamination mixtures with a nuisance parameter
We consider here the contamination mixture model with some unknown parameter which is the same for all populations. A typical example may be that of mixtures of Gaussian distributions with the same unknown variance, or translation mixtures with the same unknown scale parameter. We shall assume that the nuisance parameter is identifiable, so that its maximum likelihood estimator is consistent. This allows us to restrict the possible nuisance parameters in the definition of the set S to a neighbourhood of the true unknown one (recall that S is only a theoretical tool used to verify that the theorems apply and to compute the set of normalized scores, so that this does not restrict the model M, in which the nuisance parameter is not restricted to a neighbourhood of the true one).
Let F = { f_{t,α} , t ∈ T, α ∈ A } be a set of densities with respect to some dominating measure μ, where T is a compact subset of R^k and A is a compact subset of R^h. We consider here the case where

M0 = { f_{0,α} , α ∈ A },

and

M = { g_{π,t,α} = (1−π) f_{0,α} + π f_{t,α} , 0 ≤ π ≤ 1, t ∈ T, α ∈ A }.

The unknown true distribution of the observations will be f_{0,α_0}. We suppose that (0, α_0) is an interior point of T × A. We shall use Assumptions (CMN), insuring smoothness and some non-degeneracy:
(CMN)
- (1−π) f_{0,α} + π f_{t,α} = f_{0,α_0} a.e. if and only if α = α_0 and [π = 0 or t = 0],
- (t, α) ↦ f_{t,α} is twice continuously differentiable a.e. at any (t, α) ∈ T × A,
- there exists ε > 0 such that, for all β ∈ R^h, t ∈ T, α ∈ A with ‖α − α_0‖ ≤ ε and all a ≥ 0:
  a ( f_{t,α_0} − f_{0,α_0} ) + Σ_{i=1}^h β_i ∂f_{0,α}/∂α_i |_{α=α_0} = 0 a.e. if and only if a ( f_{t,α_0} − f_{0,α_0} ) = 0 a.e. and β = 0,
  and, for all γ ∈ R^k, ‖t‖ ≤ ε, ‖α − α_0‖ ≤ ε: Σ_{i=1}^k γ_i ∂f_{t,α_0}/∂t_i + Σ_{i=1}^h β_i ∂f_{0,α}/∂α_i = 0 a.e. if and only if γ = 0 and β = 0.
- There exists a function B ∈ L^2(f_{0,α_0}) that upper bounds all of the following functions:
  f_{t,α}/f_{0,α_0}, (1/f_{0,α_0}) ∂f_{t,α}/∂t_i , i = 1, ..., k, and (1/f_{0,α_0}) ∂f_{t,α}/∂α_i , i = 1, ..., h, for (t, α) ∈ T × A, ‖α − α_0‖ ≤ ε,
  (1/f_{0,α_0}) ∂^2 f_{t,α}/∂t_i ∂t_j , i, j = 1, ..., k, (1/f_{0,α_0}) ∂^2 f_{t,α}/∂t_i ∂α_j , i = 1, ..., k, j = 1, ..., h, and
  (1/f_{0,α_0}) ∂^2 f_{t,α}/∂α_i ∂α_j , i, j = 1, ..., h, for (t, α) ∈ T × A, ‖α − α_0‖ ≤ ε, ‖t‖ ≤ ε.
Then, since the maximum likelihood estimator of the parameter α is consistent, one only needs to verify Assumption (4) for

S = { [ (1−π) f_{0,α} + π f_{t,α} − f_{0,α_0} ] / f_{0,α_0} / ‖ [ (1−π) f_{0,α} + π f_{t,α} − f_{0,α_0} ] / f_{0,α_0} ‖_2 : 0 ≤ π ≤ 1, t ∈ T, α ∈ A, ‖α − α_0‖ ≤ ε },

where we restrict the definition to π, t and α such that (1−π) f_{0,α} + π f_{t,α} differs from f_{0,α_0}. One has also

S0 = { ( f_{0,α} − f_{0,α_0} ) / f_{0,α_0} / ‖ ( f_{0,α} − f_{0,α_0} ) / f_{0,α_0} ‖_2 : α ∈ A, ‖α − α_0‖ ≤ ε }.
The limit scores are

d_{t,a,γ,β} = ( H_{t,a,γ,β} / f_{0,α_0} ) / ‖ H_{t,a,γ,β} / f_{0,α_0} ‖_2 ,

with

H_{t,a,γ,β} = a ( f_{t,α_0} − f_{0,α_0} ) + Σ_{i=1}^k γ_i ∂f_{0,α_0}/∂t_i + Σ_{i=1}^h β_i ∂f_{0,α_0}/∂α_i , a ≥ 0, a + ‖γ‖ + ‖β‖ = 1,

and

D0 = { d_{0,0,0,β} , β ∈ R^h, ‖β‖ = 1 }.

Note that, due to the existence of the nuisance parameter, which is fixed to α_0 in the definition of the limit scores, D does not contain S.
It will be possible to obtain the asymptotic distributions in the same way as in Section 2.2. Let again

r(s,t) = ∫ ( h_s/‖h_s‖_2 )( h_t/‖h_t‖_2 ) f_{0,α_0} dμ

with h_t = ( f_{t,α_0} − f_{0,α_0} ) / f_{0,α_0}, and Z(·) the associated Gaussian field. Note that this process is the same as the one of Section 2.1 if we set f_0 = f_{0,α_0}. Let also W and C(t) be the same as in Section 2.2, replacing ∂f_0/∂t_i by ∂f_{0,α_0}/∂t_i.
Let V be the h-dimensional centered Gaussian variable with variance Θ:

Θ_{i,j} = ∫ [ (1/f_{0,α_0}) ∂f_{0,α_0}/∂α_i / ‖(1/f_{0,α_0}) ∂f_{0,α_0}/∂α_i‖_2 ] [ (1/f_{0,α_0}) ∂f_{0,α_0}/∂α_j / ‖(1/f_{0,α_0}) ∂f_{0,α_0}/∂α_j‖_2 ] f_{0,α_0} dμ, i, j = 1, ..., h,

and for any t, let G(t) be the h-dimensional vector of covariances of Z(t) and V:

G(t)_i = ∫ [ (1/f_{0,α_0}) ∂f_{0,α_0}/∂α_i / ‖(1/f_{0,α_0}) ∂f_{0,α_0}/∂α_i‖_2 ] ( h_t/‖h_t‖_2 ) f_{0,α_0} dμ, i = 1, ..., h.
2
f0,0
f
f
Si,j =
1
f0,0
1
f0,0
1
f0,0
0,0
i
f0,0
i
1
f0,0
0,0
tj
f0,0
tj
N (t) =
f0,0 d, i = 1, . . . , h, j = 1, . . . , k.
C(t)T
G(t)
,
1
ST
S
U (t)U (t)T
Theorem 3 Assume (CMN). Then (f_{0,α_0})^{⊗n} and [ (1−π_n) f_{0,α_n} + π_n f_{t_n,α_n} ]^{⊗n} are mutually contiguous, and 2 λ_n converges under (f_{0,α_0})^{⊗n} in distribution to

sup_{t∈T} ( max{ √(1 + U(t)ᵀN(t)U(t)) ( Z(t) − U(t)ᵀN(t) (W ; V) / (1 + U(t)ᵀN(t)U(t)) ), 0 } )^2 + ( Wᵀ, Vᵀ ) ( Γ , Sᵀ ; S , Θ )^{−1} ( W ; V ) − Vᵀ Θ^{−1} V,

and under the contiguous alternative to the same expression with W and V replaced by W + c γ_0 + c a_0 C(t_0) and V + c β_0 + c a_0 G(t_0).
2.3.1 Translation mixtures with unknown scale

Assumptions (CTMN) are:
- There exists a function B ∈ L^2(f_{0,1}) that upper bounds all of the following functions:
  f_{0,1}(x−t)/f_{0,1}(x), (1 + |x_i|) (1/f_{0,1}(x)) |∂f_{0,1}/∂x_i (x−t)| , i = 1, ..., k, t ∈ T,
  (1 + |x_i||x_j|) (1/f_{0,1}(x)) |∂^2 f_{0,1}/∂x_i ∂x_j (x−t)| , i, j = 1, ..., k, t ∈ T, ‖t‖ ≤ ε.

These assumptions are met when f_{0,1} is the inverse of a polynomial with degree at least 2 (among which the Cauchy density), or the Gaussian densities and the normalization of cosh(x)^{−1}.
2.3.2 Gaussian mixtures with unknown variance

Here h = k(k+1)/2, since α is the unknown variance. It is easy to see that Assumptions (CTMN) hold, and Theorem 3 applies, as soon as the f_{t,α} are the Gaussian distributions N(t, α) on R^k, T is compact, and A is a compact subset of the set of symmetric positive definite matrices.
2.4 Testing p_0 populations against p populations

We now consider mixtures of p populations,

g_{p,π,T,α} = Σ_{i=1}^p π_i f_{t_i,α} ,  (16)

with

M0 = { g_{p_0,π,T,α} : Σ_{i=1}^{p_0} π_i = 1, π_i ≥ 0, i = 1, ..., p_0 },

M = { g_{p,π,T,α} : Σ_{i=1}^p π_i = 1, π_i ≥ 0, i = 1, ..., p }.
To understand what happens and how to do the computations, the main point is to understand how two mixtures with possibly different numbers of populations may become close.

The main weak identifiability Assumption (WID) will be that g_{p,π,T,α} = g_{q,π′,T′,α′} if and only if α = α′ and Σ_{i=1}^p π_i δ_{t_i} = Σ_{i=1}^q π′_i δ_{t′_i}, where δ_z is the Dirac measure at z.

Then, if the parameterization (t, α) ↦ f_{t,α} is smooth enough, two mixtures become close if their parameters α become close and their mixing measures become close in the weak topology.
Let now g_0 = g_{p_0,π_0,T_0,α_0} be a particular mixture in M0 which has exactly p_0 populations and not fewer; it will denote the true unknown density of the observations. We denote by t_{0,i} the elements of T_0. Since the parameter α is identifiable, its maximum likelihood estimator is consistent under weak smoothness assumptions, so that, to define the sets S and S0 by (2) and (3), one may restrict α by ‖α − α_0‖ ≤ ε for some small ε. Then, as seen in the previous subsections, the main point is to find D and D0, so as to be able to:

- understand how parameterization and smoothness may be used to compute the order of the bracketing entropy,
- define the Gaussian process that is used in the limiting distribution.

For these points, smoothness assumptions and bounding with a square integrable function have to be used together with some non-degeneracy of the functions that appear in the norm in the denominator, when this norm goes to zero. In fact, if it is degenerate, one has to go further in the Taylor expansion until non-degeneracy holds. This, of course, depends on the particular examples.
A rather general situation is the following. Let q = p − p_0. Denote by D_t f_{t,α} the k-dimensional vector of derivatives of f_{t,α} with respect to t, by D_α f_{t,α} the h-dimensional vector of derivatives of f_{t,α} with respect to α, and by D²_t f_{t,α} the k × k matrix of second derivatives of f_{t,α} with respect to t. Introduce Assumptions (GM):
(GM)
- (t, α) ↦ f_{t,α} is three times continuously differentiable a.e. at any (t, α) ∈ T × A,
- there exists ε > 0 such that, for all β_i ∈ R^h, γ_i ∈ R^k, λ_i ∈ R, i = 1, ..., p_0, and all ρ_1, ..., ρ_q ≥ 0 with ‖t_i − t_{0,i}‖ ≤ ε and Σ_{i=1}^q ρ_i + Σ_{i=1}^{p_0} λ_i = 0:

Σ_{i=1}^q ρ_i f_{t_i,α_0} + Σ_{i=1}^{p_0} λ_i f_{t_{0,i},α_0} + Σ_{i=1}^{p_0} ⟨β_i, D_α f_{t_{0,i},α_0}⟩ + Σ_{i=1}^{p_0} ⟨γ_i, D_t f_{t_{0,i},α_0}⟩ = 0 a.e.

if and only if

Σ_{i=1}^q ρ_i f_{t_i,α_0} + Σ_{i=1}^{p_0} λ_i f_{t_{0,i},α_0} = 0, β_1 = 0, ..., β_{p_0} = 0 and γ_1 = 0, ..., γ_{p_0} = 0;

- for any subset J of at most inf{p_0, q} points in T such that each of them is at distance at most ε from one of the t_{0,i}'s, for any vectors (γ^j)_{j∈J} of R^k and any β_1, ..., β_{p_0} in R^h:

Σ_{i=1}^{p_0} ⟨β_i, D_α f_{t_{0,i},α_0}⟩ + Σ_{j∈J} (γ^j)ᵀ D²_t f_{j,α_0} (γ^j) = 0 a.e. if and only if β_1 = 0, ..., β_{p_0} = 0 and γ^j = 0, j ∈ J;
- there exists a function B ∈ L^2(g_0) that upper bounds all of the following functions:
  f_{t,α}/g_0, (1/g_0) ∂f_{t,α}/∂t_i , i = 1, ..., k, and (1/g_0) ∂f_{t,α}/∂α_i , i = 1, ..., h, for (t, α) ∈ T × A, ‖α − α_0‖ ≤ ε,
  (1/g_0) ∂^2 f_{t,α}/∂t_i ∂t_j , i, j = 1, ..., k, (1/g_0) ∂^2 f_{t,α}/∂t_i ∂α_j , i = 1, ..., k, j = 1, ..., h, and (1/g_0) ∂^2 f_{t,α}/∂α_i ∂α_j , i, j = 1, ..., h,
  (1/g_0) ∂^3 f_{t,α}/∂t_i ∂t_j ∂t_l , i, j, l = 1, ..., k, and (1/g_0) ∂^3 f_{t,α}/∂t_i ∂t_j ∂α_l , i, j = 1, ..., k, l = 1, ..., h,
  for (t, α) ∈ T × A, ‖α − α_0‖ ≤ ε, ‖t − t_{0,i}‖ ≤ ε for some i.
Set β = ((β_1)ᵀ, ..., (β_{p_0})ᵀ), with β_i ∈ R^h; γ = ((γ_1)ᵀ, ..., (γ_{p_0})ᵀ), with γ_i ∈ R^k; T = (t_1, ..., t_q) ∈ T^q; ρ = (ρ_1, ..., ρ_q) ∈ R^q; λ = (λ_1, ..., λ_{p_0}) ∈ R^{p_0}, and

H_{T,ρ,λ,β,γ} = Σ_{i=1}^q ρ_i f_{t_i,α_0} + Σ_{i=1}^{p_0} λ_i f_{t_{0,i},α_0} + Σ_{i=1}^{p_0} ⟨β_i, D_α f_{t_{0,i},α_0}⟩ + Σ_{i=1}^{p_0} ⟨γ_i, D_t f_{t_{0,i},α_0}⟩,

d_{T,ρ,λ,β,γ} = ( H_{T,ρ,λ,β,γ} / g_0 ) / ‖ H_{T,ρ,λ,β,γ} / g_0 ‖_2 .

Define now:

K = { (T, ρ, λ, β, γ) : ρ_1, ..., ρ_q ≥ 0; Σ_{i=1}^q ρ_i + Σ_{i=1}^{p_0} λ_i = 0; Σ_{i=1}^q ρ_i^2 + Σ_{i=1}^{p_0} λ_i^2 + ‖β‖^2 + ‖γ‖^2 = 1; H_{T,ρ,λ,β,γ} ≠ 0 }.

Then:

D = { d_{T,ρ,λ,β,γ} , (T, ρ, λ, β, γ) ∈ K },

and

D0 = { d_{0,0,λ,β,γ} , (0, 0, λ, β, γ) ∈ K }.
It will be possible to obtain the asymptotic distributions in the same way as in Section 2.2, under Assumptions (WID) and (GM). Define the Gaussian field Z(T, ρ, λ, β, γ) on K with covariance

r( (T, ρ, λ, β, γ), (T′, ρ′, λ′, β′, γ′) ) = ∫ d_{T,ρ,λ,β,γ} d_{T′,ρ′,λ′,β′,γ′} g_0 dμ.

Notice that, as in the previous sections, K is not closed and r(·,·) is not continuous at some limiting points, but may be extended in some sense, as has been done for instance in Section 2.1.

Let also p_n, π_n, T_n, α_n be such that √n ‖ ( g_{p_n,π_n,T_n,α_n} − g_0 )/g_0 ‖_2 tends to some positive constant c, with [ ( g_{p_n,π_n,T_n,α_n} − g_0 )/g_0 ] / ‖ ( g_{p_n,π_n,T_n,α_n} − g_0 )/g_0 ‖_2 tending to d̄ in the closure of D.
Theorem 4 If (WID) and (GM) hold, then (g_0)^{⊗n} and (g_{p_n,π_n,T_n,α_n})^{⊗n} are mutually contiguous, and 2 λ_n converges under (g_0)^{⊗n} in distribution to

sup_{(T,ρ,λ,β,γ)∈K} ( max{ Z(T, ρ, λ, β, γ), 0 } )^2 − sup_{(0,0,λ,β,γ)∈K} ( max{ Z(0, 0, λ, β, γ), 0 } )^2,

and under the contiguous alternative to

sup_{(T,ρ,λ,β,γ)∈K} ( max{ Z(T, ρ, λ, β, γ) + c ∫ d_{T,ρ,λ,β,γ} d̄ g_0 dμ, 0 } )^2 − sup_{(0,0,λ,β,γ)∈K} ( max{ Z(0, 0, λ, β, γ) + c ∫ d_{0,0,λ,β,γ} d̄ g_0 dμ, 0 } )^2.  (17)
It is possible to reduce the formula of the asymptotic distributions in Theorem 4 to only one supremum, using linear algebra computations as in the previous sections. We shall not give the result for all situations, since it involves too long and complicated formulas. However, in case q = 1 the result takes a simpler form, which we now give. For this some notation is needed. When q = 1, ρ reduces to a scalar and T reduces to t, so that elements of D may be written as d_{t,μ,β,γ} with

H_{t,μ,β,γ} = Σ_{i=1}^{p_0} μ_i ( f_{t,α_0} − f_{t_{0,i},α_0} ) + Σ_{i=1}^{p_0} ⟨β_i, D_α f_{t_{0,i},α_0}⟩ + Σ_{i=1}^{p_0} ⟨γ_i, D_t f_{t_{0,i},α_0}⟩,

where Σ_{i=1}^{p_0} μ_i ≥ 0.
Let W be the p_0(h + k)-dimensional centered Gaussian random variable with variance Γ such that, for all β and γ,

Var( (βᵀ, γᵀ) W ) = ‖ H_{0,0,β,γ} / g_0 ‖_2^2 .

Let Z(t) be the p_0-dimensional centered Gaussian field with covariance the p_0 × p_0 matrix ζ(t_1, t_2) such that, for all t_1, t_2,

ζ(t_1, t_2)_{i,j} = ∫ [ ( f_{t_1,α_0} − f_{t_{0,i},α_0} )/g_0 / ‖( f_{t_1,α_0} − f_{t_{0,i},α_0} )/g_0‖_2 ] [ ( f_{t_2,α_0} − f_{t_{0,j},α_0} )/g_0 / ‖( f_{t_2,α_0} − f_{t_{0,j},α_0} )/g_0‖_2 ] g_0 dμ,
and let C(t) be the p_0 × p_0(h+k) matrix of covariances of Z(t) and W, whose i-th row is the covariance vector of Z(t)_i with W, i = 1, ..., p_0. Set

A(t) = ( ζ(t,t) − C(t) Γ^{−1} C(t)ᵀ )^{−1},  (18)

U(t) = − Γ^{−1} C(t)ᵀ A(t),  (19)

so that

A(t) Z(t) + U(t)ᵀ W = A(t) ( Z(t) − C(t) Γ^{−1} W ).  (20)
Let 1 denote the p_0-dimensional vector with all coordinates equal to 1. Then:

Theorem 5 Assume (WID) and (GM), and p = p_0 + 1. Then 2 λ_n converges under (g_0)^{⊗n} in distribution to

sup_{t∈T} ( A Z + Uᵀ W )ᵀ [ A^{−1} − ( 1 1ᵀ / (1ᵀ A 1) ) 1_{ (A Z + Uᵀ W)ᵀ 1 < 0 } ] ( A Z + Uᵀ W ),

where A = A(t), U = U(t) and Z = Z(t).
The distribution under contiguous alternatives is rather difficult to express in its full generality so it is omitted for simplicity. The proof of Theorem 5 is given in Section 5.
In the case of Gaussian mixtures with the same unknown variance, the second non-degeneracy condition of (GM) does not hold. Indeed, second derivatives with respect to t are proportional to derivatives with respect to α. In this case it is necessary to go further in the Taylor expansion: when taking third derivatives with respect to t, the non-degeneracy condition holds. Also, all derivatives up to fourth order may be uniformly upper bounded by some function B, as needed. Since the limiting points of the process Z need not be known at boundary values of K to define the asymptotic distribution of λ_n, the following result holds:
Theorem 6 The asymptotic distributions under the null hypothesis and under contiguous
hypotheses given in Theorem 4 and Theorem 5 hold for Gaussian mixtures with the same
unknown variance matrix.
Note also that, for testing one population against two (or p_0 against p), the LRT with bounded parameter set is not invariant by translation or change of scale.

Several solutions to the first point exist. Threshold calculation can be conducted under the worst form of the null hypothesis (see Delmas, 2003), or one can use a plug-in, that is, an estimate of f_0. It remains that the results would be nicer if one were able to get rid of the compactness assumption. This section and the next one answer in the negative, by showing that in the simplest case, contamination for translation mixtures on R, the LRT is theoretically less powerful than moment tests under contiguous alternatives. As already said in the introduction, the convergence in this result is very slow, so it is not so relevant in practice. It mainly shows that it is difficult to construct an unbounded asymptotic theory for the LRT.
We consider in this section the contamination mixture model (1) with T = [−T, T] for a given positive real number T, and μ the Lebesgue measure. We use the notations and results of Section 2.1. Let π_n and t_n be sequences as in Theorem 1, with m(·) the corresponding drift given by (11). Write

r_{ij}(s,t) = ∂^{i+j} r(s,t) / ∂s^i ∂t^j ,

set R(t) = ∫_0^t √( r_{11}(u,u) ) du, and

a_t = √( 2 log R(t) ),  b_t = a_t − log(2π)/a_t and b̃_t = a_t − log(π)/a_t .

We shall use the following assumptions (G):

(G2) lim_{t→+∞} R(t) = +∞,
(G3) for every δ > 0, sup_{ |R(s)−R(t)| > δ } |r(s,t)| < 1,
(G4) for every δ > 0, r_{01}^Y and r_{04}^Y are bounded on { (s,t) ∈ R², |s| > δ and |t| > δ },
(G5) the drift m is bounded.
We have:

Theorem 7 Assume (CM) and (G), and let M(−T, T) denote the maximum of the process Z + m on [−T, T]. Then, as T tends to infinity, a_T M(−T, T) − b_T tends in distribution to the Gumbel distribution, when m ≡ 0 as well as when m is given by (11). In other words, if c_{T,α} is the threshold of the test defined by

lim_{n→+∞} P_0( λ_n > c_{T,α} ) = α,

then for any contiguous alternative the limiting power of the LRT equals its level:

lim_{T→+∞} lim_{n→+∞} P( λ_n > c_{T,α} ) = α.
We shall use the following assumptions (H) on f_0:

(H1) f_0 is positive and four times continuously differentiable, and there exist constants K_i such that |f_0^{(i)}(x)| ≤ K_i f_0(x) for all x ∈ R, i = 1, ..., 4,
(H2) for all x ∈ R, lim_{t→±∞} f_0(t)/f_0(x+t) = 1,
(H3) there exists M > 0 such that, for all x, t ∈ R, f_0(x) f_0(t)/f_0(x+t) ≤ M,
(H4) there exists F ∈ L^1(μ) such that sup_{|t|≥1} ( log|t| )^2 f_0(x+t) ≤ F(x),
(H5) lim_{t→±∞} ( log|t| )^2 f_0(t) = 0.
Assumptions (H1) to (H5) are essentially conditions on the tail of f_0. (H4) and (H5) are very weak and hold for all usual distributions. (H1) to (H3), though rather weak, are more restrictive. They hold, for example, if f_0(t) behaves as a power t^{−a} for some a > 0 as t → +∞ and as |t|^{−b} for some b > 0 as t → −∞. For instance, they hold for f_0 being the inverse of a polynomial, and in particular for the Cauchy density.

The proof relies on the verification of the assumptions of Theorem 7. In particular, the asymptotic behaviour of the covariance r and its derivatives has to be checked. Assumptions (H) only express sufficient conditions under which the asymptotic analysis can be done with some generality. However, though (H2) does not hold for the Gaussian density, we also verified that Theorem 7 holds for other densities such as the Gaussian one and the normalization of cosh(x)^{−1}, in spite of different justifications.
The LRT has to be compared with other testing procedures, such as the sample mean or Kolmogorov–Smirnov testing procedures.

- Denote by μ_i = ∫ x^i f_0(x) dμ(x). Without loss of generality one can assume that μ_1 = 0. If μ_2 < +∞, applying Le Cam's third Lemma, that is, Theorem 6.6 of van der Vaart (1998), √n X̄_n converges in distribution under P_{π_n,t_n} to a Gaussian variable with variance μ_2 and non-zero mean proportional to c (involving ∫ x ( f_{t_0} − f_0 )(x) dμ(x) / ‖( f_{t_0} − f_0 )/f_0‖_2 if t_n → t_0 ≠ 0). The test based on √n X̄_n therefore has asymptotic power strictly greater than its level.

- Consider W_{1n} = √n ‖F_n − F_0‖_∞ and W_{2n} = n ∫ (F_n − F_0)^2 dF_0, where F_n is the empirical distribution function. Set on [0,1]

ψ(x) = lim_{n→+∞} [ F_0( F_0^{−1}(x) − t_n ) − x ] / t_n .

Then W_{1n} converges in distribution to sup_{[0,1]} |U + c ψ| and W_{2n} to ∫_0^1 (U + c ψ)^2 dI under P_{π_n,t_n}, where U is a Brownian bridge, and to sup_{[0,1]} |U| and ∫_0^1 U^2 dI under P_0. See Shorack and Wellner (1986) for a version of these convergences. Simulations show that in both cases the distribution under P_{π_n,t_n} is stochastically greater than that under P_0. Consequently the asymptotic power is greater than the level.
4 Asymptotic distribution of the LRT for Gaussian contamination mixtures with unbounded mean under contiguous alternatives.
Consider T = R (no prior upper bound) and the testing problem (1) with

f_t(x) = (1/√(2π)) exp( −(x−t)^2/2 ).

Set

g_0 = f_0 and g_{π,t} = (1−π) f_0 + π f_t , 0 ≤ π ≤ 1, t ∈ T,

and

λ_n = sup_{π∈[0,1], t∈R} Σ_{i=1}^n log( 1 + π ( exp( t X_i − t^2/2 ) − 1 ) ).
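The statistic λ_n can be computed directly by maximizing the expression above over a grid of (π, t). The sketch below does this crudely on simulated null data; the grid resolution and sample size are illustrative choices, not from the paper.

```python
import numpy as np

# Crude grid-search computation of lambda_n for the Gaussian contamination
# model; pi = 0 (or t = 0) gives log-likelihood ratio 0, so lambda_n >= 0.
def lambda_n(x, t_grid, pi_grid):
    best = 0.0
    for t in t_grid:
        u = np.exp(t * x - t * t / 2.0) - 1.0    # u > -1, so log1p is safe
        for pi in pi_grid:
            best = max(best, np.sum(np.log1p(pi * u)))
    return best

rng = np.random.default_rng(1)
x = rng.standard_normal(500)                     # data from the null f0
t_grid = np.linspace(-3.0, 3.0, 61)
pi_grid = np.linspace(0.0, 1.0, 21)
lam = lambda_n(x, t_grid, pi_grid)
print(lam)
```

With an unbounded t-range and growing n, 2 λ_n grows like log log n, which is the slow rate behind Theorem 8.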
Then:

Theorem 8 As n tends to infinity, 2 λ_n − log log n + log(2π^2) tends in distribution to the Gumbel distribution under P_0 as well as under P_{π_n,t_n} for any π and t_0. In other words, if we define as rejection region (λ_n > c_{α,n}) with

c_{α,n} = (1/2) ( G^{−1}(1−α) + log log n − log(2π^2) ),

where G is the Gumbel distribution function, then

lim_{n→+∞} P_0( λ_n > c_{α,n} ) = α and lim_{n→+∞} P_{π_n,t_n}( λ_n > c_{α,n} ) = α.
The theorem says that, asymptotically, the LRT cannot distinguish the null hypothesis from any contiguous alternative. This has to be compared with other testing procedures, such as moment testing procedures. For example, if X̄_n is the sample mean, applying Le Cam's third Lemma, √n X̄_n converges in distribution under P_{π_n,t_n}, as n tends to infinity, to a Gaussian N(m, 1) with non-zero mean m. Thus the test based on the statistic √n X̄_n has an asymptotic power that is strictly greater than its level. As mentioned in the introduction, this makes sense in practice only for very large data sets.
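The mean-shift behaviour of √n X̄_n under a contiguous alternative is easy to see by simulation. In the sketch below the alternative is (1−π_n) N(0,1) + π_n N(t_0,1) with π_n = c/√n, so that the mean of √n X̄_n is approximately √n π_n t_0 = c t_0; all numeric choices are illustrative.

```python
import numpy as np

# Distribution of sqrt(n) * X_bar under a contiguous contamination
# alternative; its mean is shifted by roughly c * t0 while its variance
# stays close to 1.
rng = np.random.default_rng(2)
n, c, t0, reps = 20000, 2.0, 1.0, 200
pi_n = c / np.sqrt(n)

stats = []
for _ in range(reps):
    contaminated = rng.random(n) < pi_n
    x = rng.standard_normal(n) + t0 * contaminated
    stats.append(np.sqrt(n) * x.mean())
stats = np.array(stats)
print(stats.mean())   # close to c * t0
```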
Proof of Theorem 8

The separation of the hypotheses is greater when π = 1; using Lemma 14.31 of van der Vaart (1998), it is easy to see that this is the only case to consider. Moreover, by symmetry, we can also suppose that t_n > 0. Let us introduce the empirical process S_n defined by

S_n(t) = (1/√n) Σ_{i=1}^n ( exp( t X_i − t^2 ) − exp( −t^2/2 ) ).
Liu and Shao (2004, Theorem 1) recall results obtained by Bickel and Chernoff (1993) on the process S_n:

sup_{t∈R} S_n(t) = sup_{|t|∈A_{2,n}} S_n(t) + o_{P_0}(1),  (21)

where A_{2,n} = [τ_n, υ_n], τ_n = √(2 log log log n) and υ_n = √(log n)/2 − √(2 log log n). Through the proof of their Theorem 2, Liu and Shao (2004) state that

2 λ_n = ( max{ sup_{|t|∈A_{2,n}} Y_n(t), 0 } )^2 + o_{P_0}(1), with Y_n(t) = S_n(t)/√(1 − e^{−t^2}),

and that, in distribution,

sup_{|t|∈A_{2,n}} Y_n(t) = sup_{|t|∈A_{2,n}} S_0(t)/√( Λ(t,t) ) + o_{P_0}(1),  (22)
where S_0 is the zero-mean non-stationary Gaussian process with covariance function

Λ(s,t) = exp( −(s−t)^2/2 ) − exp( −s^2/2 ) exp( −t^2/2 ).

In their paper, Bickel and Chernoff remark that this process is very close to a stationary process. Because we need it later, we use here another way. We define the standardized version of S_0,

Y_0(t) = S_0(t)/√( Λ(t,t) ) = S_0(t)/√( 1 − e^{−t^2} ),

in order to be able to apply the Normal Comparison Lemma (Li and Shao, 2002, Theorem 2.1). Y_0 is a zero-mean non-stationary Gaussian process, with unit variance and covariance function

r(s,t) = ( exp(st) − 1 ) / √( ( exp(s^2) − 1 )( exp(t^2) − 1 ) ).  (23)
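The covariance Λ can be checked by pure moment arithmetic: with X ~ N(0,1) and g_t(X) = exp(t X − t²), one has E[g_t] = exp(−t²/2) since E[exp(aX)] = exp(a²/2). A tiny sketch:

```python
import math

# Check that Cov(g_s, g_t) with g_t(X) = exp(t X - t^2), X ~ N(0,1),
# equals Lambda(s,t) = exp(-(s-t)^2/2) - exp(-s^2/2)exp(-t^2/2).
def mgf(a):          # E[exp(a X)] for X ~ N(0,1)
    return math.exp(a * a / 2.0)

def cov(s, t):       # E[g_s g_t] - E[g_s] E[g_t], via moments
    return math.exp(-s*s - t*t) * mgf(s + t) - math.exp(-s*s/2.0) * math.exp(-t*t/2.0)

def Lam(s, t):       # closed form
    return math.exp(-(s - t)**2 / 2.0) - math.exp(-s*s/2.0) * math.exp(-t*t/2.0)

print(cov(0.6, 1.4), Lam(0.6, 1.4))
```

Dividing by √(Λ(s,s) Λ(t,t)) and multiplying numerator and denominator by exp((s²+t²)/2) recovers formula (23).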
We have

0 ≤ sup_{|t|∈A_{2,n}} | Y_0(t) − S_0(t) | ≤ sup_{|t|∈A_{2,n}} ( 1 − √( Λ(t,t) ) ) | Y_0(t) |.

Now the function r satisfies the conditions of Corollary 1 of Azaïs and Mercadier (2004). Consequently we know the exact order of the maximum:

sup_{|t|∈A_{2,n}} Y_0(t) = ( 1 + o_P(1) ) √( log log n ).

This last equation can also be deduced from standard results on the maximum of stationary Gaussian processes, using the stationary process introduced by Bickel and Chernoff (1993). On the other side, the maximum of 1 − Λ(t,t) on A_{2,n} is attained at τ_n. This permits us to write

sup_{|t|∈A_{2,n}} | Y_0(t) − S_0(t) | ≤ ( 1 − √( 1 − e^{−τ_n^2} ) ) sup_{|t|∈A_{2,n}} | Y_0(t) | = o_P(1).  (24)
For all t_0 and all π, using arguments close to those that lead to formula (7) in Gassiat (2002), we have

log dP_{π_n,t_n}/dP_0 (X_1, ..., X_n) = C(π, t_0) Y_n(t_n) − C(π, t_0)^2/2 + o_{P_0}(1),  (25)

with C(π, t_0) = lim_{n→+∞} √n π_n √( e^{t_n^2} − 1 ); when t_0 > 0 this limit is proportional to √( e^{t_0^2} − 1 ). Since π can be supposed positive, C(π, t_0) is positive. A detailed proof of formula (25) is given in Section 5.
Using formula (39) of Bickel and Chernoff (1993), we can replace Y_n by Y_0, to get

log dP_{π_n,t_n}/dP_0 (X_1, ..., X_n) = C(π, t_0) Y_0(t_n) − C(π, t_0)^2/2 + o_{P_0}(1).  (26)

Then, as soon as one proves Lemma 1 (an asymptotic independence property between the maximum of Y_0 over A_{2,n} and Y_0(t_n)), the theorem follows from a generalization of Le Cam's third Lemma. The proof of Lemma 1 relies on a suitably chosen discretization, following ideas in Azaïs and Mercadier (2004), and on an application of the normal comparison lemma as refined in Li and Shao (2002).
5 Proofs

5.1 Proof of Theorem 5
With the notation of Section 2.4 (case q = 1), the limiting variable is the supremum of ⟨η, Z(t)⟩ + ⟨(β, γ), W⟩ over t ∈ T and coefficients (η, β, γ) subject to

ηᵀ ζ(t,t) η + 2 ηᵀ C(t) (β ; γ) + (β ; γ)ᵀ Γ (β ; γ) = 1,  (27)

ηᵀ 1 ≥ 0.  (28)

Consider the supremum under the first constraint only. Then, similarly to the proof of Theorem 2, the value of the supremum is

√( ( A Z + Uᵀ W )ᵀ A^{−1} ( A Z + Uᵀ W ) + Wᵀ Γ^{−1} W ),

and it is attained at some η such that ηᵀ 1 has the same sign as ( A Z + Uᵀ W )ᵀ 1. If ( A Z + Uᵀ W )ᵀ 1 < 0, then the supremum under (27) and (28) equals the supremum under the constraints

(27) and ηᵀ 1 = 0.  (29)

Computation of this supremum using Lagrange multipliers leads to the fact that it is equal to

√( ( A Z + Uᵀ W )ᵀ ( A^{−1} − 1 1ᵀ/(1ᵀ A 1) ) ( A Z + Uᵀ W ) + Wᵀ Γ^{−1} W ).

Subtracting the corresponding supremum over the null model, which equals Wᵀ Γ^{−1} W, the Theorem is proved.
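The two closed forms used in this Lagrange computation can be verified numerically: the maximum of ⟨η, z⟩ subject to ηᵀΛη = 1 is √(zᵀΛ⁻¹z), and adding the linear constraint ηᵀ1 = 0 projects out one direction. The matrix and vector below are arbitrary illustrative choices.

```python
import numpy as np

# Numeric sketch: constrained maxima of <eta, z> over eta' Lam eta = 1,
# with and without the extra constraint eta' 1 = 0.
rng = np.random.default_rng(3)
p = 4
B = rng.standard_normal((p, p))
Lam = B @ B.T + p * np.eye(p)          # positive definite
z = rng.standard_normal(p)
one = np.ones(p)
Li = np.linalg.inv(Lam)

free = np.sqrt(z @ Li @ z)             # unconstrained closed form
P = Li - np.outer(Li @ one, Li @ one) / (one @ Li @ one)
constrained = np.sqrt(z @ P @ z)       # closed form with eta' 1 = 0

# Brute-force comparison over random feasible directions.
eta = rng.standard_normal((200000, p))
eta /= np.sqrt(np.einsum('ij,jk,ik->i', eta, Lam, eta))[:, None]
vals = eta @ z
print(free, constrained, vals.max())
```

No random feasible direction exceeds the closed form, and the optimal direction for the constrained problem, proportional to P z, satisfies the linear constraint exactly.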
5.2 Proof of Theorem 7

Set u_{T,x} = x/a_T + b_T, and let M(a, b) denote the maximum of the process Y = Z + m on [a, b]. We have

P( M(−T, T) ≤ u_{T,x} ) = P( M^V( −R(T), R(T) ) ≤ u_{T,x} ),

where M^V is the maximum of the process after the time change defined by R. Now, applying, with p = 2, D_1 = (0, R(T)) and D_2 = (−R(T), 0), Theorem 4 of Azaïs and Mercadier (2004), we obtain

P( a_T M(0, T) − b_T ≤ x ) = P( a_T M(0, T) − b̃_T ≤ x + log 2 ) + o(1) = G( x + log 2 ) + o(1).

Since the same equality holds on (−T, 0), one can conclude that

P( M(−T, T) ≤ u_{T,x} ) = G( x + log 2 )^2 + o(1) = G(x) + o(1).
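The last step is the elementary Gumbel identity G(x + log 2)² = G(x), with G(x) = exp(−exp(−x)), which glues the two one-sided maxima into the two-sided result. A tiny arithmetic check:

```python
import math

# G(x + log 2)^2 = exp(-2 e^{-x-log 2}) = exp(-e^{-x}) = G(x).
def G(x):
    return math.exp(-math.exp(-x))

for x in (-1.0, 0.0, 0.7, 3.0):
    print(x, G(x + math.log(2.0))**2, G(x))
```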
5.3 Proof of Corollary 1

First note that, for every T > 0,

inf_{t∈[−T,T]} f_0(t) = β_T > 0.
Set

N(s,t) = ∫ f_0(x−s) f_0(x−t) / f_0(x) dμ(x).

Differentiation of r, for s and t in R \ {0}, is a consequence of that of N(s,t). Now, for any integers i ≤ 4 and j ≤ 4, using (H1) and (H3),

| f_0^{(i)}(x−t) f_0^{(j)}(x−s) | / f_0(x) ≤ K_i K_j f_0(x−t) f_0(x−s) / f_0(x) ≤ K_i K_j M^2 f_0(x) / ( f_0(t) f_0(s) ),

and f_0(t) f_0(s) is positively lower bounded on the neighbourhood of any (s_0, t_0), which proves that N is differentiable at any (s,t) ∈ (R \ {0})^2, with

∂^{i+j} N / ∂t^i ∂s^j (s,t) = (−1)^{i+j} ∫ f_0^{(i)}(x−t) f_0^{(j)}(x−s) / f_0(x) dμ(x).
In particular,

r_{11}(t,t) = ∫ f_0′(x−t)^2 / f_0(x) dμ(x) / ( ∫ f_0(x−t)^2/f_0(x) dμ(x) − 1 ) − [ ∫ f_0′(x−t) f_0(x−t) / f_0(x) dμ(x) / ( ∫ f_0(x−t)^2/f_0(x) dμ(x) − 1 ) ]^2 .

To study its behaviour as t → +∞, set

A(t) = ∫ f_0(x)^2 / f_0(x+t) dμ(x),  B(t) = ∫ f_0′(x)^2 / f_0(x+t) dμ(x),  C(t) = ∫ f_0(x) f_0′(x) / f_0(x+t) dμ(x).

Thanks to (H1) and (H3), the integrands of A(t) f_0(t), B(t) f_0(t) and C(t) f_0(t) are respectively dominated by M f_0(x), K_1^2 M f_0(x) and K_1 M f_0(x). By application of (H2) and the Lebesgue Theorem, we conclude using the following convergences:

lim_{t→+∞} A(t) f_0(t) = ∫ f_0^2 dμ,  lim_{t→+∞} B(t) f_0(t) = ∫ f_0′^2 dμ,  lim_{t→+∞} C(t) f_0(t) = ∫ f_0 f_0′ dμ = 0,  (30)

so that r_{11}(t,t) converges, as t → +∞, to the positive limit ∫ f_0′^2 dμ / ∫ f_0^2 dμ, and (G2) holds.
Proof of (31): we prove that

lim_{|s−t|→+∞} r(s,t) log|s−t| = 0.  (31)

Using (H3),

f_0(t) f_0^2(x) / f_0(x+t) ≤ M f_0(x),

so that, using (H2) and the Lebesgue Theorem,

lim_{t→+∞} f_0(t) ∫ f_0^2(x)/f_0(x+t) dμ(x) = ∫ f_0^2(x) dμ(x),

and the denominator of r(s,t) is bounded below by a constant times 1/√( f_0(s) f_0(t) ) when |s| and |t| are large. On the other hand, using (H3) twice,

1/f_0(x) ≤ M / √( f_0(x−s) f_0(s) f_0(x−t) f_0(t) ),

so that

√( f_0(s) f_0(t) ) ∫ f_0(x−t) f_0(x−s) / f_0(x) dμ(x) ≤ M ∫ √( f_0(x) f_0(x+s−t) ) dμ(x).

Since

lim_{|s−t|→+∞} log|s−t| √( f_0(x+s−t) ) = 0

for every x, one may apply the Lebesgue Theorem using (H4) to obtain (31).
Proof of (G5): (G5) is a consequence of (G2) and of formula (11) giving m(t).

Proof of (G3): Using (30) and the fact that r_{11} > 0, one just has to prove that for any δ > 0,

sup_{|s−t|>δ} |r(s,t)| < 1.  (32)

First of all, r(s,t) is a continuous function of (s,t), and |r(s,t)| < 1 as soon as s ≠ t, by the Cauchy–Schwarz inequality. Thus, for any δ > 0 and any compact set K,

sup_{ |s−t|>δ, t∈K, s∈K } |r(s,t)| < 1.

On the other hand, because of (31), for |s−t| sufficiently large r(s,t) is bounded away from 1, so we may suppose that |s−t| is bounded. Suppose that there exist s_n and t_n such that |s_n − t_n| is bounded, |s_n − t_n| > δ and r(s_n, t_n) → 1. By compactness it would be possible to choose subsequences s_{φ(n)} and t_{φ(n)} such that s_{φ(n)} − t_{φ(n)} → c. But using the same tricks as before (using (H2), (H3) and the Lebesgue Theorem),

lim_{n→+∞} r( s_{φ(n)}, t_{φ(n)} ) = ∫ f_0(x) f_0(x+c) dμ(x) / ∫ f_0^2(x) dμ(x).

Since |c| ≥ δ > 0, this value differs from 1. Hence we get a contradiction, which proves (32).
Proof of (G4): using the expression of the derivatives of N and the preceding estimates, |r_{01}(s,t)| may be bounded by a finite combination of terms of the form

∫ |f_0′(x−t)| f_0(x−s)/f_0(x) dμ(x),  ∫ f_0(x−t)^2/f_0(x) dμ(x),  ∫ |f_0′(x−t)| f_0(x−t)/f_0(x) dμ(x),

divided by the corresponding normalizations. This upper bound is a continuous function of t; by making f_0(t) appear as above, it is easily seen that it converges, as t tends to infinity, to a finite limit involving ∫ f_0′^2 dμ and ∫ f_0^2 dμ. Moreover, for any δ > 0, the denominator is lower bounded on D_δ = { (s,t), s ∈ R, |t| > δ }. Consequently, for any δ > 0, the function (s,t) ↦ r_{01}(s,t) is bounded on D_δ.
Using easy but tedious computations and the Cauchy–Schwarz inequality once more, |r_{04}(s,t)| may be bounded by a finite sum of products of terms of the form ∫ |f_0^{(i_{jk})}(x−t)| f_0(x−t)/f_0(x) dμ(x), suitably normalized, where the sums on i and j are finite and where, for any i and j, Σ_{k=1}^4 i_{jk} = i. The previous arguments run again and permit us to assert that, for any δ > 0, the function (s,t) ↦ r_{04}(s,t) is bounded on D_δ, which completes the proof of (G4).
5.4 Proof of formula (25)

One has

log dP_{π_n,t_n}/dP_0 (X_1, ..., X_n) = Σ_{i=1}^n log( 1 + π_n ( e^{t_n X_i − t_n^2/2} − 1 ) )

= π_n Σ_{i=1}^n ( e^{t_n X_i − t_n^2/2} − 1 ) − (π_n^2/2) Σ_{i=1}^n ( e^{t_n X_i − t_n^2/2} − 1 )^2 + S,  (33)

with, for some constant L,

|S| ≤ L n π_n^3 [ max_{i=1,...,n} | e^{t_n X_i − t_n^2/2} − 1 | ] (1/n) Σ_{i=1}^n ( e^{t_n X_i − t_n^2/2} − 1 )^2 .
Now it suffices to remark that the random variables ( e^{t_n X − t_n^2/2} − 1 )/t_n , n = 1, 2, ..., for X of distribution N(0,1), have bounded third moment. Applying the Markov inequality,

max_{i=1,...,n} ( e^{t_n X_i − t_n^2/2} − 1 )/t_n = o_{P_0}( √n ).

Since the relevant class of functions of x is Glivenko–Cantelli in probability (indeed, it is the square of a Donsker class, as a consequence of Section 2.1), we get S = o_{P_0}(1) and

(1/n) Σ_{i=1}^n ( e^{t_n X_i − t_n^2/2} − 1 )^2 = ( e^{t_n^2} − 1 )( 1 + o_{P_0}(1) ),

so that

log dP_{π_n,t_n}/dP_0 (X_1, ..., X_n) = √n π_n √( e^{t_n^2} − 1 ) Y_n(t_n) − ( n π_n^2/2 )( e^{t_n^2} − 1 ) + o_{P_0}(1).  (34)

Since √n π_n √( e^{t_n^2} − 1 ) converges to C(π, t_0), we have

log dP_{π_n,t_n}/dP_0 (X_1, ..., X_n) = C(π, t_0) Y_n(t_n) − C(π, t_0)^2/2 + o_{P_0}(1).

5.5 Proof of Lemma 1
Fix x and y. The quantity to control is the difference between the joint distribution of ( sup_{t∈A_{2,n}} Y_0(t), Y_0(t_0) ) and the product of its marginals. After discretizing A_{2,n} with a mesh q_n, this difference is bounded by a sum of three terms: two discretization errors and a comparison term on the grid A_{2,n}^{q_n}.  (35)

The task is now to prove that, for fixed x and y, each component of the upper bound converges to 0. We define the following modification of the function r:

r̄(t_0, t) = 0 for t ∈ A_{2,n}^{q_n}, t ≠ t_0, and r̄(s,t) = r(s,t) for s, t ∈ A_{2,n}^{q_n} otherwise.

Note that under the Gaussian distribution defined by r̄, the value of the process at t_0 is independent of the values of the process at the other locations, whose distribution does not change. This proves that r̄ is a covariance function. We define ρ(t) = sup_{u : |u−t_0|>t} |r(u, t_0)|.
A standard Gaussian tail bound yields a contribution of order exp( −(x + c_n)² / (2(1 + δ(ε_n))) ), which tends to 0 as n → ∞.
The Normal Comparison Lemma (Li and Shao, 2002, Theorem 2.1) gives bounds for terms of the type
P( Y_1 ≤ u_1, . . . , Y_n ≤ u_n ) − P( Ȳ_1 ≤ u_1, . . . , Ȳ_n ≤ u_n ),
where Y and Ȳ are two centered Gaussian vectors with the same variances and possibly different covariances Λ_ij and Λ̄_ij, i, j = 1, . . . , n. It says that
P( Y_1 ≤ u_1, . . . , Y_n ≤ u_n ) − P( Ȳ_1 ≤ u_1, . . . , Ȳ_n ≤ u_n )
≤ (1/2π) Σ_{1≤i<j≤n} [ arcsin(Λ_ij) − arcsin(Λ̄_ij) ]⁺ exp( −(u_i² + u_j²) / (2(1 + ρ_ij)) ),  (36)
where ρ_ij = max(|Λ_ij|, |Λ̄_ij|).
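As a quick self-contained sanity check of (36), added here for illustration and not part of the original argument, take n = 2 and u_1 = u_2 = 0: Sheppard's formula gives the bivariate orthant probability in closed form, and at zero levels the right-hand side of (36) coincides exactly with the left-hand side. The function names below are illustrative.

```python
import math

def orthant2(rho):
    # Sheppard's formula: P(Y1 <= 0, Y2 <= 0) for a centered bivariate
    # normal vector with unit variances and correlation rho.
    return 0.25 + math.asin(rho) / (2 * math.pi)

def comparison_bound(rho, rho_bar, u1=0.0, u2=0.0):
    # Right-hand side of (36) in the case n = 2.
    r = max(abs(rho), abs(rho_bar))
    gap = max(math.asin(rho) - math.asin(rho_bar), 0.0)
    return gap / (2 * math.pi) * math.exp(-(u1 ** 2 + u2 ** 2) / (2 * (1 + r)))

rho, rho_bar = 0.6, 0.3
diff = orthant2(rho) - orthant2(rho_bar)
bound = comparison_bound(rho, rho_bar)
assert 0.0 <= diff <= bound + 1e-12
assert abs(diff - bound) < 1e-12  # equality at u1 = u2 = 0
```

This also shows that the exponential factor in (36) is sharp at zero levels.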
Applying this lemma to the covariances r and r̄ on A^{q_n}_{2,n}, the comparison term is bounded by
(Const) Σ_{t ∈ A^{q_n}_{2,n}, t ≠ t_0} |r(t, t_0)| exp( −((x + c_n)² + y²) / (2(1 + |r(t, t_0)|)) )
≤ (Const/q_n) exp( −c_n² / (2(1 + δ(ε_n))) ) ∫ δ(t) dt,
since the Riemann sum over the grid of mesh q_n is comparable to the integral of δ; this bound tends to 0.
To deal with the first term of (35), we denote by U_z and U^{q_n}_z the point processes of up-crossings of the level z by Y_0 and by its q_n-polygonal approximation (linear interpolation), respectively. For any subset B of R,
U_z(B) ≥ U^{q_n}_z(B) = #{ l ∈ Z : q_n(l − 1) ∈ B, q_n l ∈ B, Y_0(q_n(l − 1)) < z < Y_0(q_n l) }.
Hence the first term of (35) is bounded by
P( Y_0(τ_n) > x + c_n ) + P( Y_0(τ_n) ≤ x + c_n, U_{x+c_n}(A_{2,n}) ≥ 1, U^{q_n}_{x+c_n}(A_{2,n}) = 0 )
≤ 1 − Φ(x + c_n) + E[ U_{x+c_n}(A_{2,n}) − U^{q_n}_{x+c_n}(A_{2,n}) ],
where the last upper bound is due to the Markov inequality. The first term above tends trivially to 0. For the second term, it is easy to check that, since E[ U_{x+c_n}(A_{2,n}) ] is bounded, the condition of application of Lemma 2 of Azaïs and Mercadier (2004) is met, and
E[ U_{x+c_n}(A_{2,n}) − U^{q_n}_{x+c_n}(A_{2,n}) ] = o(1).
References
Azaïs, J.-M. and Mercadier, C. (2004), Asymptotic Poisson character of extremes in non-stationary Gaussian models, to appear in Extremes.
Bickel, P. and Chernoff, H. (1993), Asymptotic distribution of the likelihood ratio statistic in a prototypical non regular problem, in Statistics and Probability: A Raghu Raj Bahadur Festschrift (J.K. Ghosh, S.K. Mitra, K.R. Parthasarathy and B.L.S. Prakasa Rao, eds), 83–96, Wiley Eastern Ltd.
Boistard, H. (2003), Test of goodness-of-fit for mixtures of populations, Technical report, Université Toulouse III, France.
Chen, H. and Chen, J. (2001), Large sample distribution of the likelihood ratio test for normal mixtures, Statist. Probab. Lett., 2, 125–133.
Chernoff, H. and Lander, E. (1995), Asymptotic distribution of the likelihood ratio test that a mixture of two binomials is a single binomial, J. Statist. Plann. Inf., 43, 19–40.
Dacunha-Castelle, D. and Gassiat, É. (1997), Testing in locally conic models, and application to mixture models, ESAIM Probab. Statist., 1, 285–317.
Dacunha-Castelle, D. and Gassiat, É. (1999), Testing the order of a model using locally conic parameterization: population mixtures and stationary ARMA processes, Ann. Statist., 27(4), 1178–1209.
Delmas, C. (2003), On likelihood ratio tests in Gaussian mixture models, Sankhyā, 65(3), 1–19.
Gassiat, É. (2002), Likelihood ratio inequalities with applications to various mixtures, Ann. Inst. H. Poincaré Probab. Statist., 6, 897–906.
Garel, B. (2001), Likelihood ratio test for univariate Gaussian mixture, J. Statist. Plann. Inference, 96(2), 325–350.
Ghosh, J. and Sen, P. (1985), On the asymptotic performance of the log likelihood ratio statistic for the mixture model and related results, in Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II, 789–806, Wadsworth, Belmont, CA.
Goffinet, B., Loisel, P. and Laurent, B. (1992), Testing in normal mixture models when the proportions are known, Biometrika, 79, 842–846.
Hall, P. and Stewart, M. (2004), Theoretical analysis of power in a two-component normal mixture model, private communication.
Hartigan, J.A. (1985), A failure of likelihood asymptotics for normal mixtures, in Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer (Berkeley, Calif., 1983), Vol. II, 807–810, Wadsworth, Belmont, CA.
James, L.F., Priebe, C.E. and Marchette, D.J. (2001), Consistent estimation of mixture complexity, Ann. Statist., 29, 1281–1296.
Lemdani, M. and Pons, O. (1997), Likelihood ratio tests for genetic linkage, Statist. Probab. Lett., 33(1), 15–22.
Lemdani, M. and Pons, O. (1999), Likelihood ratio tests in contamination models, Bernoulli, 5(4), 705–719.
Li, W.V. and Shao, Q. (2002), A normal comparison inequality and its applications, Probab. Theory Related Fields, 122(4), 494–508.
Lindsay, B.G. (1995), Mixture Models: Theory, Geometry, and Applications, NSF-CBMS Regional Conference Series in Probability and Statistics, Vol. 5, Institute of Mathematical Statistics, Hayward, CA.
Liu, X. and Shao, Y. (2004), Asymptotics for the likelihood ratio test in two-component normal mixture models, J. Statist. Plann. Inference, 123(1), 61–81.
McLachlan, G. and Peel, D. (2000), Finite Mixture Models, Wiley Series in Probability and Statistics: Applied Probability and Statistics, Wiley-Interscience, New York.
Mercadier, C. (2004), Computing the distribution of the maximum of random processes and fields: how far are the Rice and Euler characteristics valid, preprint, Université Toulouse III, France, http://www.lsp.ups-tlse.fr/Fp/Mercadier.
Mosler, K. and Seidel, W. (2001), Testing for homogeneity in an exponential mixture model, Aust. N. Z. J. Stat., 43(3), 231–247.
Seidel, W., Mosler, K. and Alker, M. (2000), A cautionary note on likelihood ratio tests in mixture models, Ann. Inst. Statist. Math., 52, 481–487.
Shorack, G.R. and Wellner, J.A. (1986), Empirical Processes with Applications to Statistics, Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, New York.
Titterington, D.M., Smith, A.F.M. and Makov, U.E. (1985), Statistical Analysis of Finite Mixture Distributions, Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics, John Wiley & Sons Ltd.
van der Vaart, A.W. (1998), Asymptotic Statistics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge.
URL: http://www.emath.fr/ps/
The precise knowledge of the distribution of the random variable M is essential in many statistical problems; for example, in methodological statistics (see Davies [8]) and in biostatistics (see Azaïs and Cierco-Ayrolles [4]). But a closed formula based upon natural parameters of the process is only known for a very restricted number of stochastic processes X: for instance, the Brownian motion, the Brownian bridge or the Ornstein–Uhlenbeck process (a list is given in Azaïs and Wschebor [6]). An interesting review of the problem can be found in Adler [2].
We are interested here in a precise expansion of the tail of the distribution of M for a smooth Gaussian stationary process. First, let us specify some notation:
r(t) := E[ X(s) X(s + t) ] denotes the covariance function of X; with no loss of generality we will also assume λ_0 = r(0) = 1;
μ is its spectral measure and λ_k (k = 0, 1, 2, . . . ) its spectral moments whenever they exist;
φ(x) = (1/√(2π)) exp(−x²/2) and Φ(x) = ∫_{−∞}^{x} φ(t) dt.
Keywords and phrases: Tail of distribution of the maximum, stationary Gaussian processes.
J.-M. AZAÏS, J.-M. BARDET AND M. WSCHEBOR
Throughout this paper we will assume that λ_8 < ∞ and that, for every pair of parameter values s and t, 0 ≤ s ≠ t ≤ T, the six-dimensional random vector (X(s), X′(s), X″(s), X(t), X′(t), X″(t)) has a non-degenerate distribution.
Piterbarg [11] (Th. 2.2) proved (under the weaker condition λ_4 < ∞ instead of λ_8 < ∞) that for each T > 0 and any u ∈ R:
| 1 − Φ(u) + (√λ_2 / 2π) T φ(u) − P(M > u) | ≤ B exp( −u²(1 + δ)/2 )  (1)
for some positive constants B and δ. It is easy to see (see for example Miroshin [10]) that the expression inside the modulus is non-negative, so that in fact:
0 ≤ 1 − Φ(u) + (√λ_2 / 2π) T φ(u) − P(M > u) ≤ B exp( −u²(1 + δ)/2 ).  (2)
The problem of improving relation (2) does not seem to have been solved in a satisfactory manner until now. A crucial step was taken by Piterbarg in the same paper (Th. 3.1), in which he proved that if T is small enough, then as u → +∞:
P(M > u) = 1 − Φ(u) + (√λ_2 / 2π) T φ(u) − ( 3√3 (λ_4 − λ_2²)^{9/2} / ( λ_2^{9/2} (λ_2 λ_6 − λ_4²) ) ) (T/u⁵) φ( u √( λ_4 / (λ_4 − λ_2²) ) ) [1 + o(1)].  (3)
The same result has been obtained by other methods (Azaïs and Bardet [3]; see also Azaïs et al. [5]).
However, Piterbarg's equivalent (3) is of limited interest for applications, since it contains no information on the meaning of the expression "T small enough".
The aim of this paper is to show that formula (3) is in fact valid for any length T, under appropriate conditions that will be described below.
Consider the function F(t) defined by
F(t) := λ_2 (1 − r(t))² / ( λ_2 (1 − r²(t)) − r′²(t) ).
Lemma 1. The even function F is well defined, has a continuous extension at zero and:
1. F(0) = λ_2² / (λ_4 − λ_2²);
2. F′(0) = 0;
3. 0 < F″(0) = λ_2 (λ_2 λ_6 − λ_4²) / ( 9 (λ_4 − λ_2²)² ) < ∞.
Proof.
1. The denominator of F(t) is equal to (1 − r²(t)) Var( X′(0) | X(0), X(t) ), thus non-zero due to the non-degeneracy hypothesis. A direct Taylor expansion gives the value of F(0).
2. The expression of F′(t) below shows that F′(0) = 0:
F′(t) = …  (4)
3. A Taylor expansion of (4) provides the value of F″(0). Note that λ_4 − λ_2² can vanish only if there exists some real ω such that μ({ω}) = μ({−ω}) = 1/2. Similarly, λ_2 λ_6 − λ_4² can vanish only if there exist some real ω and p ≥ 0 such that μ({ω}) = μ({−ω}) = p, μ({0}) = 1 − 2p. These cases are excluded by the non-degeneracy hypothesis.
We will say that the function F satisfies hypothesis (H) if it has a unique minimum at t = 0. The next proposition contains some sufficient conditions for this to take place.
Proposition 1. (a) If r′(t) < 0 for 0 < t ≤ T then (H) is satisfied.
(b) Suppose that X is defined on the whole line and that:
1. λ_4 > 2λ_2²;
2. r(t), r′(t) → 0 as t → ∞;
3. there exists no local maximum of r(t) (other than the one at t = 0) with value greater than or equal to a threshold depending only on λ_2 and λ_4 (through λ_4 − 2λ_2²).
As an example, consider the covariance r(t) = ((1 + cos(βt))/2) e^{−t²/2}, for which λ_4 = 3 + 3β² + β⁴/2. On [0, ∞), this covariance attains its second largest local maximum in an interval around 2π/β, and the conditions above can be checked directly.
Theorem 1. If the process X satisfies hypothesis (H), then (3) holds true.
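As a numerical illustration of Lemma 1 and Proposition 1(a), added here as a sketch and not part of the original text, take the covariance r(t) = exp(−t²/2), for which λ_2 = 1 and λ_4 = 3, so that r′(t) < 0 for t > 0 and hypothesis (H) holds:

```python
import math

lam2, lam4 = 1.0, 3.0  # spectral moments of r(t) = exp(-t**2/2)

def r(t):
    return math.exp(-t * t / 2)

def rp(t):  # r'(t)
    return -t * math.exp(-t * t / 2)

def F(t):
    # F(t) = lam2*(1 - r(t))**2 / (lam2*(1 - r(t)**2) - r'(t)**2)
    return lam2 * (1 - r(t)) ** 2 / (lam2 * (1 - r(t) ** 2) - rp(t) ** 2)

# Lemma 1(1): F extends continuously at 0 with F(0) = lam2**2/(lam4 - lam2**2) = 1/2.
assert abs(F(0.01) - lam2 ** 2 / (lam4 - lam2 ** 2)) < 1e-4

# Proposition 1(a): r'(t) < 0 for t > 0, and indeed F attains its minimum at t = 0.
assert all(F(t) > F(0.01) for t in [0.05 * k for k in range(1, 100)])
```

The grid and tolerances are arbitrary choices for the check, not quantities from the paper.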
2. Proofs
Notations:
p_ξ(x) is the density (when it exists) of the random variable ξ at the point x ∈ Rⁿ;
1l_C denotes the indicator function of the event C;
U_u([a, b]), u ∈ R, is the number of upcrossings on the interval [a, b] of the level u by the process X, defined as follows:
U_u([a, b]) = #{ t ∈ [a, b] : X(t) = u, X′(t) > 0 };
for k a positive integer, ν_k(u, [a, b]) is the kth-order factorial moment of U_u([a, b]):
ν_k(u, [a, b]) = E[ U_u([a, b]) (U_u([a, b]) − 1) · · · (U_u([a, b]) − k + 1) ].
We define also
ν̃_k(u, [a, b]) = E[ U_u([a, b]) (U_u([a, b]) − 1) · · · (U_u([a, b]) − k + 1) 1l_{X(0)>u} ].
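For intuition, the first moment given by the Rice formula, E[ U_u([0, T]) ] = (T √λ_2 / 2π) e^{−u²/2} for a stationary process with unit variance, can be checked by simulation on the random cosine wave X(t) = ξ_1 cos t + ξ_2 sin t, which has λ_2 = 1. This sketch is an addition for illustration (sample sizes and tolerances are arbitrary), not part of the original proofs.

```python
import math
import random

random.seed(0)

def upcrossings(u, n_grid=200):
    # One draw of the random cosine wave X(t) = xi1*cos(t) + xi2*sin(t)
    # on [0, 2*pi]; count grid-detected upcrossings of the level u.
    xi1, xi2 = random.gauss(0, 1), random.gauss(0, 1)
    ts = [2 * math.pi * k / n_grid for k in range(n_grid + 1)]
    xs = [xi1 * math.cos(t) + xi2 * math.sin(t) for t in ts]
    return sum(1 for a, b in zip(xs, xs[1:]) if a <= u < b)

u, n_rep = 1.0, 3000
est = sum(upcrossings(u) for _ in range(n_rep)) / n_rep
# Rice formula: (T/(2*pi)) * sqrt(lam2) * exp(-u**2/2) with T = 2*pi, lam2 = 1.
rice = math.exp(-u * u / 2)
assert abs(est - rice) < 0.05
```

For this process each realization has at most one upcrossing per period, so the Monte Carlo average converges quickly.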
Lemma 2. Let f (respectively g) be a real-valued function of class C² (respectively C^k for some integer k ≥ 0) defined on the interval [0, T] of the real line, verifying the conditions:
1. f has a unique minimum on [0, T] at the point t = t*, with f′(t*) = 0 and f″(t*) > 0;
2. k = inf{ j : g^{(j)}(t*) ≠ 0 } is finite.
Define
h(u) = ∫₀^T g(t) exp( −½ u² f(t) ) dt.
Then, as u → ∞,
h(u) ≃ ( g^{(k)}(t*) / k! ) ( ∫_J x^k exp( −¼ f″(t*) x² ) dx ) (1/u^{k+1}) exp( −½ u² f(t*) ),  (5)
where J = (−∞, +∞) if t* is interior to [0, T], J = [0, +∞) if t* = 0 and J = (−∞, 0] if t* = T.
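A minimal numerical check of Lemma 2, added for illustration with arbitrarily chosen parameters: take f(t) = t² and g(t) ≡ 1 on [0, 1], so t* = 0, f″(t*) = 2, k = 0 and J = [0, +∞); then (5) predicts h(u) ≈ ( ∫₀^∞ e^{−x²/2} dx ) / u = √(π/2)/u.

```python
import math

def h(u, T=1.0, n=200000):
    # Midpoint-rule evaluation of h(u) = ∫_0^T exp(-u**2 * t**2 / 2) dt,
    # i.e. Lemma 2 with f(t) = t**2, g(t) = 1, t* = 0, f''(t*) = 2, k = 0.
    dt = T / n
    return sum(math.exp(-(u * (i + 0.5) * dt) ** 2 / 2) for i in range(n)) * dt

u = 40.0
val = h(u)
approx = math.sqrt(math.pi / 2) / u  # prediction of (5)
assert abs(val - approx) / approx < 1e-3
```

The agreement improves as u grows, as the o(1) term in (5) suggests.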
We will also use the classical expansion
∫_u^{+∞} exp( −½ a y² ) dy = ( 1/(au) − 1/(a²u³) + 3/(a³u⁵) + O(1/u⁷) ) exp( −½ a u² ),  (6)
where the O(1/u⁷) term is bounded by K/u⁷, K a constant depending only on a₀, uniformly for a ≥ a₀ > 0.
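The expansion (6) is easy to verify numerically (an added sketch; the values a = 1, u = 10 are arbitrary), since the left-hand side is expressible through the complementary error function:

```python
import math

def tail_integral(a, u):
    # Exact ∫_u^∞ exp(-a*y**2/2) dy via the complementary error function.
    return math.sqrt(math.pi / (2 * a)) * math.erfc(u * math.sqrt(a / 2))

def expansion(a, u):
    # Three-term right-hand side of (6), without the O(1/u**7) remainder.
    return math.exp(-a * u * u / 2) * (
        1 / (a * u) - 1 / (a ** 2 * u ** 3) + 3 / (a ** 3 * u ** 5))

a, u = 1.0, 10.0
exact, approx = tail_integral(a, u), expansion(a, u)
assert abs(exact - approx) / exact < 1e-4
```

The relative error behaves like 1/u⁶, consistent with the O(1/u⁷) remainder relative to the leading term 1/(au).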
Proof of Theorem 1.
Step 1: The proof is based on an extension of Piterbarg's result to intervals of any length. Let α > 0; the following relation is clear:
P( M_{[0,α]} > u ) = P( X(0) > u ) + P( U_u([0, α]) 1l_{X(0)≤u} ≥ 1 )
= 1 − Φ(u) + P( U_u([0, α]) ≥ 1 ) − P( U_u([0, α]) 1l_{X(0)>u} ≥ 1 ).
In the sequel a term will be called negligible if it is O( u^{−6} exp( −λ_4 u² / (2(λ_4 − λ_2²)) ) ) as u → +∞. We use the following relations, to be proved later:
(7)
with N_1, . . . , N_4 negligible. Applying (7) repeatedly and on account of Piterbarg's theorem, which states that (3) is valid if T is small enough, one gets the result.
Step 2: Proof of (i). Using Markov's inequality:
P( U_u([0, T]) 1l_{X(0)>u} ≥ 1 ) ≤ ν̃_1(u, [0, T]),
where ν̃_1 is evaluated using the Rice formula (Cramér and Leadbetter [7]):
ν̃_1(u, [0, T]) = ∫₀^T dt ∫_u^{+∞} dx E[ X′(t)⁺ | X(0) = x, X(t) = u ] p_{X(0),X(t)}(x, u).  (8)
Computing the conditional expectation by Gaussian regression and plugging into (8), one obtains (see details in Azaïs et al. [5])
ν̃_1(u, [0, T]) = ∫₀^T B(t, u) dt,
where B(t, u) is an explicit function of r(t), r′(t), F(t) and u, involving φ, Φ and integrals of the type ∫ exp( −½ λ_2 F(t) y² ) dy. Clearly, since r″(0) = −λ_2 < 0, there exists T_0 such that r′ < 0 on (0, T_0]. Divide the integral into two parts, [0, T_0] and [T_0, T]. Using formula (6) on [0, T_0], together with Lemma 2 and hypothesis (H), one checks that ∫₀^{T_0} B(t, u) dt is negligible. On the other hand, since inf_{t∈[T_0,T]} F(t) is strictly larger than F(0), it follows easily from
∫_u^{+∞} exp( −½ a y² ) dy ≤ (const) a^{−1/2} exp( −½ a u² ),  a > 0, u ≥ 0,
that ∫_{T_0}^{T} B(t, u) dt is negligible.
Step 3: Proof of (ii). Use once more Markov's inequality:
P( U_u([0, δ]) U_u([δ, 2δ]) ≥ 1 ) ≤ E[ U_u([0, δ]) U_u([δ, 2δ]) ].
Because of the Rice formula (Cramér and Leadbetter [7]),
E[ U_u([0, δ]) U_u([δ, 2δ]) ] = ∫₀^{2δ} ( t ∧ (2δ − t) ) A_t(u) dt,
with
A_t(u) = E[ X′(0)⁺ X′(t)⁺ | X(0) = X(t) = u ] p_{X(0),X(t)}(u, u).  (9)
A Gaussian regression computation gives
A_t(u) = p_{X(0),X(t)}(u, u) σ² ( T_1(t, u) + T_2(t, u) + T_3(t, u) ),  (10)
where T_1, T_2, T_3 are explicit functions of b and k built from φ, Φ̃ and the one-dimensional integral ∫_b^{+∞} Φ̃(kx) φ(x) dx, and where:
μ = μ(t, u) = E[ X′(0) | X(0) = X(t) = u ] = −r′ u / (1 + r);
σ² = σ²(t) = Var( X′(0) | X(0), X(t) ) = λ_2 − r′² / (1 − r²);
ρ = ρ(t) = Cor( X′(0), X′(t) | X(0), X(t) ) = −( r″ (1 − r²) + r r′² ) / ( λ_2 (1 − r²) − r′² );
k = k(t) = √( (1 + ρ)/(1 − ρ) );  b = b(t, u) = μ/σ;
Φ̃(z) = ∫₀^z φ(v) dv.
Since T_1(u) + T_2(u) + T_3(u) is non-negative, majorizing Φ̃(kb) and φ(kb) by 1 we get
A_t(u) ≤ (const) ( σ² / √(1 − r²(t)) ) (1 + ρ) Q(k, 1/b) exp( −½ (1 + F(t)) u² ),
where Q is a polynomial. Elementary expansions for t small give σ² ≤ (const) t², 1 + ρ ≤ (const) t², √(1 − r²(t)) ≥ (const) t and b ≥ (const) t u, so that the right-hand side can be controlled and, applying Lemma 2,
∫₀^{T_0} ( t ∧ (2δ − t) ) A_t(u) dt ≤ (const) u^{−6} exp( −λ_4 u² / (2(λ_4 − λ_2²)) ),
thus negligible.
For t ≥ T_0, remark that T_1(u) + T_2(u) + T_3(u) does not change when μ (and consequently b) changes sign. Thus μ and b can be supposed to be non-negative. Forgetting negative terms in formula (10) and majorizing Φ̃ by 1, 1 − Φ(b) by (const) φ(b) and μ by (const) u, we get
A_t(u) ≤ (const) u² exp( −½ (1 + F(t)) u² ).
We conclude as in Step 2.
Proof of Proposition 1. Let us prove statement (a). The expression (4) of F′ shows that it is positive for 0 < t ≤ T, since r′(t) < 0 and
r′²(t) − (λ_2 − r″(t))(1 − r(t)) = −¼ [ Var( X(t) − X(0) ) Var( X′(t) + X′(0) ) − Cov²( X(t) − X(0), X′(t) + X′(0) ) ] < 0,  (11)
the strict inequality following from the Cauchy–Schwarz inequality and the non-degeneracy hypothesis.
The proof of statement (b) rests on the inequality
λ_2 (1 − r(t*)) / (1 + r(t*)) > λ_2² / (λ_4 − λ_2²)
at a local maximum t* of r.
The authors thank Professors P. Carmona and C. Delmas for useful talks on the subject of this paper.
References
[1] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. Dover, New York (1972).
[2] R.J. Adler, An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes. IMS, Hayward, CA (1990).
[3] J.-M. Azaïs and J.-M. Bardet, Unpublished manuscript (2000).
[4] J.-M. Azaïs and C. Cierco-Ayrolles, An asymptotic test for quantitative gene detection. Ann. Inst. H. Poincaré Probab. Statist. (to appear).
[5] J.-M. Azaïs, C. Cierco-Ayrolles and A. Croquette, Bounds and asymptotic expansions for the distribution of the maximum of a smooth stationary Gaussian process. ESAIM: P&S 3 (1999) 107–129.
[6] J.-M. Azaïs and M. Wschebor, The distribution of the maximum of a Gaussian process: Rice method revisited, in In and Out of Equilibrium: Probability with a Physical Flavour. Birkhäuser, Coll. Progress in Probability (2002) 321–348.
[7] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes. J. Wiley & Sons, New York (1967).
[8] R.B. Davies, Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 (1977) 247–254.
[9] J. Dieudonné, Calcul Infinitésimal. Hermann, Paris (1980).
[10] R.N. Miroshin, Rice series in the theory of random functions. Vestn. Leningrad Univ. Math. 1 (1974) 143–155.
[11] V.I. Piterbarg, Comparison of distribution functions of maxima of Gaussian processes. Theoret. Probab. Appl. 26 (1981) 687–705.
[see the monographs by Adler (1981, 1990), Berman (1992), Leadbetter, Lindgren and Rootzén (1983), Ledoux and Talagrand (1991), and Piterbarg (1996)].
It turns out that the exact distribution of Z is known for the Wiener process, the Brownian bridge B(t), the process B(t) − ∫₀¹ B(u) du [Darling (1983)], the integrated Wiener process [see, e.g., Lachal (1991)], a class of sawtooth processes [see, e.g., Cressie (1980)] and the random cosine wave X(t) = ξ_1 cos t + ξ_2 sin t, where ξ_1 and ξ_2 are i.i.d. N(0, 1).
Otherwise, two directions have been mainly explored. The first one consists of deriving, under minimal restrictions, upper and lower bounds for P(Z > a) for a large enough, or first-order asymptotics for P(Z > a) as a → ∞. At this level of generality, these bounds often involve unknown constants and are not sharp enough to be used as p-values in statistical tests and stochastic modelization, where precise estimates of P(Z > a), lying between 0.1 and 0.01, are required. Works of the second category try to obtain precise asymptotics for P(Z > a) under more rigid restrictions on the process (stationarity,
Received July 1994; revised October 1995.
¹ Research supported by the CNRS and the NSF.
² Research supported by the Swiss National Science Foundation.
AMS 1991 subject classifications. Primary 60G15, 60G70; secondary 60G17.
Key words and phrases. Differential geometry, Gaussian processes, extreme value, nonasymptotic formulas, density.
X(t) = σ_X(t) Σ_{j≥1} ξ_j g_j(t).  (1)
his results render the well-known expressions (Theorem 18.1, page 37):
P(Z > a) ∼ (L/2π) exp(−a²/2),  a → ∞,  (2)
where L = ∫₀^T ( Σ_{j=1}^{n} g′_j(t)² )^{1/2} dt, and
P(Z > a) ≤ (L/2π) exp(−a²/2) + (2π)^{−1/2} ∫_a^{+∞} exp(−x²/2) dx,  a > 0.  (3)
The main term appearing in both expressions results from Rice's formula, which measures the expected number of upcrossings of a fixed level a [Marcus (1977)]. In subsection 2.3, we improve the upper bound (3) for smooth stationary processes, providing a higher-order expansion for P(Z > a).
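For the random cosine wave, where P(Z > a) is known exactly, the main term of (2) is in fact exact and (3) can be verified directly. The following check is an added illustration, not part of the original text:

```python
import math

# Random cosine wave X(t) = xi1*cos(t) + xi2*sin(t), t in [0, 2*pi]:
# n = 2, g(t) = (cos t, sin t), |g'(t)| = 1, hence L = 2*pi, and
# Z = sup X(t) = sqrt(xi1**2 + xi2**2) is Rayleigh, so P(Z > a) = exp(-a**2/2).
L = 2 * math.pi

def exact_tail(a):
    return math.exp(-a * a / 2)

def rice_term(a):  # main term of (2)
    return L / (2 * math.pi) * math.exp(-a * a / 2)

def upper_bound3(a):  # right-hand side of (3); 1 - Phi(a) = erfc(a/sqrt(2))/2
    return rice_term(a) + 0.5 * math.erfc(a / math.sqrt(2))

for a in (1.0, 2.0, 3.0):
    assert abs(rice_term(a) - exact_tail(a)) < 1e-15  # (2) is exact here
    assert exact_tail(a) <= upper_bound3(a)           # (3) holds
```

This also shows why (3) is conservative: the Gaussian tail term it adds is superfluous for this particular process.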
Johnstone and Siegmund (1989) consider processes of the form (1) with σ_X(t) ≡ 1, finite n and (ξ_1, . . . , ξ_n) uniformly distributed on the unit sphere. By making use of the connection between the standard Gaussian distribution in Rⁿ and the uniform distribution on the unit sphere of Rⁿ, we can adapt their result (Theorem 3.3, page 190) to our context. It turns out that the resulting upper bound is (3).
Sun (1993) investigates an asymptotic expansion for the tail probabilities of the maximum of smooth Gaussian random fields with unit variance. In the special case of processes, her results concern periodic processes of the form (1) with σ_X(t) ≡ 1. For finite n, Sun obtains the asymptotic formula (2) (Theorem 3.1, page 40) as a consequence of Weyl's formula for the volume of tubes around a manifold embedded in the unit sphere. For infinite n, (2) still holds under additional assumptions; otherwise it becomes an upper bound (Theorems 3.2 and 3.3, page 41).
The sharpest results concerning smooth Gaussian processes are due to Piterbarg (1981, 1988) and Konstant and Piterbarg (1993), who produce very precise asymptotic formulas for P(Z > a). In subsection 2.3, our results are compared to theirs. In particular, we provide rates of approximation for suitable variants of the asymptotic formulas given in Konstant and Piterbarg (1993).
Our approach is based on the interpretation of the functions g_j(t), j ≥ 1, as a parameterization of a curve Γ embedded in the unit sphere of Rⁿ or of the space of square-summable sequences. With the canonical moving frame induced by this parameterization, we describe each level manifold {z = b}, b ∈ R, of the functional
z(x) = sup_{t∈I} σ_X(t) Σ_{j≥1} x_j g_j(t).
Vect(x_1, . . . , x_d) and Vect⊥(x_1, . . . , x_d) denote the linear subspace spanned by x_1, . . . , x_d and its orthogonal, respectively; Gram(x_1, . . . , x_d) is the determinant of the matrix G with entries G_ij = ⟨x_i, x_j⟩ [note that det G = det²(x_1 · · · x_d)]. γ_n is the Gaussian measure on Rⁿ with density φ_n(x) = (2π)^{−n/2} exp(−‖x‖²/2); φ(x) = φ_1(x) and Φ(x) = ∫_{−∞}^{x} φ(y) dy. By convention, Φ(−∞) = 0 and Φ(+∞) = 1.
C^m(A) denotes the set of functions A → R having kth-order continuous derivatives for k = 1, . . . , m. The partial derivatives ∂^{k+l} r(x, y)/∂x^k ∂y^l, where r ∈ C^{k+l}(A), are written D_{kl} r(x, y). The Jacobian matrix of a differentiable mapping p: Rⁿ → Rⁿ is denoted Dp.
2. Main results.
2.1. An integral formula. Let X(t), t ∈ I = [0, T], be a Gaussian process with mean 0 and variance σ²_X(t) > 0, of the form
X(t) = σ_X(t) U(t),  U(t) = ⟨ξ, g(t)⟩,  (4)
where σ_X(t) = (Var X(t))^{1/2}, g(t) = (g_1(t), . . . , g_n(t)), n ≥ 2, and ξ = (ξ_1, . . . , ξ_n) is a Gaussian r.v. with zero mean and identity covariance matrix. With this representation, the covariance function r_X(t_1, t_2) of X(t) is given by
r_X(t_1, t_2) = σ_X(t_1) σ_X(t_2) r_U(t_1, t_2) = σ_X(t_1) σ_X(t_2) ⟨g(t_1), g(t_2)⟩
and D_{kl} r_U(t_1, t_2) = ⟨g^{(k)}(t_1), g^{(l)}(t_2)⟩. Since σ²_X(t) = r_X(t, t) = σ²_X(t) ‖g(t)‖², we have ‖g(t)‖² = r_U(t, t) ≡ 1, and g(t) parameterizes a curve Γ embedded in the unit sphere S^{n−1} of Rⁿ. Let us denote λ_{kl}(t) = D_{kl} r_U(t_1, t_2)|_{t_1=t_2=t}. In this subsection, we assume that:
Condition 1. σ_X(t) is in C²(I) and r_U(t_1, t_2) has continuous partial derivatives D_{kl} r_U(t_1, t_2) for 0 ≤ k, l ≤ 2.
Condition 2. λ_{11}(t) ≠ 0 for all t ∈ I.
Condition 3. {t : c_g(t) = 0} ∩ {t : σ_X(t) σ_X′(t) λ_{12}(t)/λ_{11}(t) + σ_X″(t)/λ_{11}(t) > 0} = ∅, where the function c_g(t) ≥ 0 defines the geodesic curvature of Γ.
1108
The key idea of our approach is to transform this problem into a geometric
problem concerning the standard Gaussian measure of certain convex subsets
of Rn . We obtain an integral formula for the density fZ of Z which is stated in
Theorem 1. The derivation of this formula which is sketched below is greatly
simplified if we parameterize with unit speed. This can be done without
loss of generality when Condition 2 holds. Let us define the Gaussian process
Y s , s J, as
1
Y s = Y
s V s
V s = f s
1/2
where Y s = X 1 s , f s = g 1 s and s = t = 0 11 u t dt
defines a unit speed parameterization of , J = 0 L with L = = T .
Then we have
Z = sup X t = sup Y s
tI
sJ
For x ∈ Rⁿ, write Y(s)(x) = σ_Y(s) ⟨x, f(s)⟩. Then
P(Z ≤ a) = γ_n(C_a),  a ∈ R,
where
C_a = { x ∈ Rⁿ : sup_{s∈J} Y(s)(x) ≤ a },
and
γ_n(C_a) = ∫_{−∞}^{a} (∂/∂b) γ_n(C_b) db = ∫_{−∞}^{a} f_Z(b) db.  (6)
Such a decomposition can be worked out basically (for simplicity, we assume Y(s), X(t) periodic here; the aperiodic case can be treated essentially in the same way) because it is possible (see Lemma 9) to parameterize ∂C_b by
p_b(s, u) = c_1(b, s) f(s) + c_2(b, s) T(s) + Σ_{j=1}^{n−2} u_j K_j(s),
where s ∈ J, u = (u_1, . . . , u_{n−2}) ∈ D_b(s), c_1(b, s) and c_2(b, s) are defined in terms of b, σ_Y(s) and σ_Y′(s), and D_b(s) is a closed convex subset of R^{n−2}. We show in Lemmas 11 and 12 that the transformation p: (b, s, u) → p_b(s, u) is a C¹-diffeomorphism from an open subset of Rⁿ into Rⁿ. By the change-of-variable formula and Fubini's theorem, we have, for all A ⊆ Rⁿ,
γ_n(A) = ∫_{(b,s,u)∈p^{−1}(A)} φ_n( p(b, s, u) ) Gram^{1/2}( Dp(b, s, u) ) db ds du
= ∫_{b∈R} ( ∫_{(s,u) : p_b(s,u)∈A∩∂C_b} φ_n( p_b(s, u) ) Gram^{1/2}( Dp(b, s, u) ) ds du ) db.
Differentiating in b yields
f_Z(b) = ∫_J ∫_{D_b(s)} σ_Y(s) ( b (σ_Y(s) + σ_Y″(s)) − u_1 c_g(s) ) φ_n( p_b(s, u) ) du_1 · · · du_{n−2} ds + Λ_Z(b),  (8)
where Λ_Z(b) = 0 if Y(s) is L-periodic and, otherwise, Λ_Z(b) is an explicit boundary term carried by the two faces of ∂C_b at s = 0 and s = L, involving the Gaussian measures γ_{n−1}(G_b(0)) and γ_{n−1}(G_b(L)). Here
D_b(s) = { u = (u_1, . . . , u_{n−2}) ∈ R^{n−2} : sup_{s′∈J} Y(s′)( p_b(s, u) ) ≤ b },
and, for l = 0, L, G_b(l) is the analogous closed convex subset of R^{n−1} obtained by parameterizing the face of C_b at s = l by
p_b(l, v) = b σ_Y^{−1}(l) f(l) + v_{n−1} T(l) + Σ_{j=1}^{n−2} v_j K_j(l).
Theorem 2. f_Z(b) ≤ M(b), where
M(b) = (b/2π) ∫₀^L σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2 (σ_Y²(s) + σ_Y′²(s))) ) Φ( b (σ_Y(s) + σ_Y″(s)) / c_g(s) ) ds
+ (1/2π) ∫₀^L σ_Y(s) c_g(s) exp( −b² / (2 (σ_Y²(s) + σ_Y′²(s))) ) φ( b (σ_Y(s) + σ_Y″(s)) / c_g(s) ) ds + M_Λ(b),  (9)
where M_Λ(b) = 0 if Y(s) is L-periodic and, otherwise, M_Λ(b) is the explicit boundary term obtained by bounding Λ_Z(b), involving φ(b/σ_Y(0)) and φ(b/σ_Y(L)).
The corresponding bound for X(t) is obtained by replacing σ_Y(s) by σ_X(t), σ_Y′(s) by σ_X′(t)/λ_{11}^{1/2}(t) and σ_Y″(s) by σ_X″(t)/λ_{11}(t) − λ_{12}(t) σ_X′(t)/λ_{11}^{3/2}(t).
The upper bound M(b) can be used to derive a lower bound for f_Z(b), b > 0. Indeed, the integral formula (8) can be rewritten as
f_Z(b) = (b/2π) ∫₀^L σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) γ_{n−2}( D_b(s) ) ds
− (1/2π) ∫₀^L σ_Y(s) c_g(s) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) ( ∫_{D_b(s)} u_1 φ_{n−2}(u) du_1 · · · du_{n−2} ) ds + Λ_Z(b).  (10)
The Gaussian measure γ_{n−2}( D_b(s) ) has a probabilistic interpretation:
γ_{n−2}( D_b(s) ) = P( sup_{s′∈J} Y(s′) ≤ b | Y(s) = b, Y′(s) = 0 ),
where the conditional process is driven by ζ = (ζ_1, . . . , ζ_{n−2}), a Gaussian r.v. with mean 0 and identity covariance matrix. The Gaussian process Y(s′) given {Y(s) = b, Y′(s) = 0} has mean b σ_Y(s′) σ_Y^{−1}(s) ⟨f(s), f(s′)⟩ and variance σ_Y²(s′) ω_s²(s′), where ω_s²(s′) = 1 − ⟨f(s), f(s′)⟩² − ⟨T(s), f(s′)⟩² > 0 by Condition 6. Therefore,
γ_{n−2}( D_b(s) ) = P( for all s′ ∈ J, Σ_{j=1}^{n−2} ζ_j ⟨K_j(s), f(s′)⟩ ≤ b β_s(s′) ) = P( sup_{s′∈J} W_s(s′) ≤ b ),  (11)
where β_s(s′) is explicit in terms of σ_Y, ⟨f(s), f(s′)⟩ and ⟨T(s), f(s′)⟩, and where the normalized vector k_s(s′) = (k_{s,1}(s′), . . . , k_{s,n−2}(s′)) parameterizes a curve Γ_s on the unit sphere of R^{n−2}. This curve is the normalized orthogonal projection of Γ on Vect⊥( f(s), T(s) ), and consequently s′ → k_s(s′) is not unit speed. As in subsection 2.1, we need the autocovariance function of k_s(s′), which is given by
r_s(s_1, s_2) = ⟨k_s(s_1), k_s(s_2)⟩ = ω_s^{−1}(s_1) ω_s^{−1}(s_2) [ ⟨f(s_1), f(s_2)⟩ − ⟨f(s), f(s_1)⟩⟨f(s), f(s_2)⟩ − ⟨T(s), f(s_1)⟩⟨T(s), f(s_2)⟩ ];
if s_1 = s and s_2 ≠ s, r_s(s, s_2) = ⟨f(s) + f″(s), f(s_2)⟩ / ( c_g(s) ω_s(s_2) ), and r_s(s, s) = 1. By Condition 5, r_s(s_1, s_2) has continuous partial derivatives D_{kl} r_s(s_1, s_2) for 0 ≤ k, l ≤ 2. Let us denote λ_{kl,s}(s′) = D_{kl} r_s(s_1, s_2)|_{s_1=s_2=s′}.
In order to apply Theorem 2 to Z_s = sup_{s′∈J} W_s(s′), it remains to determine ‖k_s′(s′)‖ and the geodesic curvature c_{g,s}(s′) of Γ_s. We have ‖k_s′(s′)‖ = λ_{11,s}^{1/2}(s′) > 0 by Condition 6 and c²_{g,s}(s′) = ( λ_{11,s}(s′) λ_{22,s}(s′) − λ²_{12,s}(s′) ) / λ³_{11,s}(s′) − 1.
Theorem 3. f_Z(b) ≥ m(b), where
m(b) = (b/2π) ∫_{J_+} σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) ( 1 − ∫_b^{+∞} M_s(b′) db′ ) ds
− (1/2π) ∫_{J_+} σ_Y(s) c_g(s) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) τ_s^{−1} M_s(b) ds,  (12)
where 0 < τ_s = inf_{s′∈J} β_s(s′) ∧ c_g^{−1}(s) (σ_Y(s) + σ_Y″(s)), and M_s(b) denotes the upper bound of Theorem 2 applied to the auxiliary process W_s(s′) (with ‖k_s′‖ = λ_{11,s}^{1/2} and geodesic curvature c_{g,s}), including its boundary term M_{s,Λ}(b) when Y(s) is not L-periodic.
Remark. Expressions in terms of t = θ^{−1}(s) are obtained by applying the transformations of Theorem 2 and by replacing the quantities β_s, ω_s and λ_{kl,s} by their counterparts in the original parameterization, which are explicit in terms of r_U, D_{10} r_U and λ_{11}.
with an error term of the form B exp( −a²(1 + δ)/2 ). We do not give such a precise result in the nonperiodic case [Theorem 4(ii)], but we do [Theorem 4(i)] in the periodic case (not treated by Piterbarg). However, under slightly stronger conditions than those of Theorem 2.2, Piterbarg (1981) obtains the highest-order expansion to our knowledge. His Theorem 2.3 states that there is an L_0 small enough such that, for all L ≤ L_0,
P(Z > a) = ( (L/2π) exp(−a²/2) + 1 − Φ(a) ) ( 1 + o(1) ).
Theorem 4. As b → ∞,
M(b) ≤ e(b) ( 1 + R_M(b) ),  R_M(b) > 0,  R_M(b) → 0,
m(b) ≥ e(b) ( 1 − R_m(b) ),  R_m(b) > 0,  R_m(b) → 0,
where:
(i) if J ∖ J_+ has Lebesgue measure 0 and Y(s) is periodic, or not periodic with min( σ_Y(0), σ_Y(L) ) > inf_{s∈J} σ_Y(s),
e(b) = (b/2π) ∫₀^L σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) ds;  (13)
in particular, in the stationary case with unit variance,
e(b) = (b L / 2π) exp(−b²/2).
Remark. The results stated in Theorem 4(iii) and (iv) can be easily adapted to the case where σ_Y(s) reaches its absolute minimum on a finite set of points, by adding the asymptotics over these points.
Note that when Y(s) reaches its maximum at the boundaries with high probability, the main contribution to the density is given by the additional term Λ_Z(b). This phenomenon affects the good behavior of m(b), since we have chosen to take 0 as the lower bound for Λ_Z(b) for the sake of brevity. However, it would be possible to improve m(b) by introducing a term 0 < m_Λ(b) ≤ Λ_Z(b) which corrects this imperfection. This subject is under current research.
Theorem 5. Assume that Conditions 2, 5 and 6 hold, J ∖ J_+ has Lebesgue measure 0 and, for a.e. s ∈ J, the function s′ → β_s(s′) reaches its infimum β_s > 0 at a finite number of points s_i, i = 1, . . . , k, in J, with β_s″(s_i) > 0 for s_i ∈ int J and β_s′(s_i) ≠ 0 or β_s″(s_i) > 0 if s_i ∈ {0, L}. Then there exists for a.e. s ∈ J a positive number κ_s such that, for b → ∞,
f_Z(b) ≤ (b/2π) ∫₀^L σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) u(s, b) ds + Λ_Z(b),
where
u(s, b) = 1 − Φ(b β_s) (1 + o(1)) + κ_s c_g(s) ( b (σ_Y(s) + σ_Y″(s)) )^{−1} φ(b β_s) (1 + o(1)),
and
f_Z(b) ≥ (b/2π) ∫₀^L σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) l(s, b) ds,
where
l(s, b) = 1 − Φ(b β_s) (1 + o(1)) − κ_s c_g(s) ( b (σ_Y(s) + σ_Y″(s)) )^{−1} φ(b β_s) (1 + o(1)).
The number κ_s is given by κ_s = Σ_{i=1}^{k} κ_{i,s}, where each κ_{i,s} is computed from the local behavior of β_s at s_i.
In the stationary unit-variance case these bounds become completely explicit. Theorem 5 then yields two-sided expansions of the form
f_Z(b) = (b L / 2π) exp(−b²/2) ( 1 + O( φ(b β)/b ) ),
where β > 0, the sign of the O-term being controlled from above and from below through c_g. After integration this gives an upper bound
P(Z > a) ≤ (L/2π) exp(−a²/2) + ( 1 − Φ(a) ) − C_+ exp( −a² (1 + β²)/2 ) ( 1 + o(1) ),
which improves the well-known upper bound (3) for a large enough, and a matching lower bound
P(Z > a) ≥ (L/2π) exp(−a²/2) + ( 1 − Φ(a) ) − C_− exp( −a² (1 + β²)/2 ) ( 1 + o(1) ),
with explicit constants C_± depending on L, c_g and β.
Denote by ν² = inf_{s∈J∖J_+} ( σ_Y²(s) + σ_Y′²(s) ) the infimum over the complement of J_+; by assumption, the corresponding exponential gap η > 0. From (9) and (12), we have M(b) = e(b) + M_Λ(b) + O( b exp(−b²(1 + η)/2) ) and m(b) = e(b) ( 1 − O(b^{−κ}) ) as b → ∞, with κ = inf_{s∈J} κ_s > 0. As min( σ_Y(0), σ_Y(L) ) > inf_{s∈J} σ_Y(s), it results that
f_Z(b) = e(b) ( 1 + O(b^{−κ}) ).
For infinite n, we write ⟨g(t_1), g(t_2)⟩ = Σ_{j=1}^{∞} g_j(t_1) g_j(t_2). The functions g(t) = (g_1(t), g_2(t), . . . ) parameterize a curve Γ embedded in the unit sphere of ℓ². Let us denote λ_{kl}(t) = D_{kl} r_U(t_1, t_2)|_{t_1=t_2=t}. We assume that:
Condition D1. The function σ_X(t) is in C³(I) and r_U(t_1, t_2) has continuous partial derivatives D_{kl} r_U(t_1, t_2) for 0 ≤ k, l ≤ 4.
Condition D2. λ_{11}(t) ≠ 0 for all t ∈ I.
4.1. Proof of Theorem 1. We give a detailed proof for the case Y(s) periodic and sketch the straightforward adaptation for the other case. A complete proof of all results can be found in Diebolt and Posse (1995) and is available upon request.
If C_a ≠ ∅, then ∂C_a = { x ∈ Rⁿ : sup_{s∈J} Y(s)(x) = a }.  (15)
Lemma 9. (i) The set of points x satisfying ⟨x, f(s)⟩ = c_1(a, s) and ⟨x, T(s)⟩ = c_2(a, s) can be parameterized by
p_a(s, u) = c_1(a, s) f(s) + c_2(a, s) T(s) + Σ_{j=1}^{n−2} u_j K_j(s),  (16)
with s ∈ J and u = (u_1, . . . , u_{n−2}) ∈ R^{n−2}.
(ii) The hypersurface ∂C_a can be parameterized by p_a(s, u) with s ∈ J and u ∈ D_a(s), where D_a(s) is the closed convex subset of R^{n−2} (possibly empty) defined by the set of inequalities sup_{s′∈J} Y(s′)( p_a(s, u) ) ≤ a.
Lemma 10. Let s_0 ∈ J be given.
(i) If u ∈ D_a(s_0) ≠ ∅ then d(a, s_0) − u_1 c_g(s_0) ≥ 0.
(ii) If u ∈ int D_a(s_0) ≠ ∅ and c_g(s_0) > 0, then d(a, s_0) − u_1 c_g(s_0) > 0,
where d(a, s) = a ( σ_Y(s) + σ_Y″(s) ).
Proof. (i) For each fixed u ∈ R^{n−2} and s_0, the function h_u(s_0, s) = Y(s)( p_a(s_0, u) ), s ∈ J, is twice differentiable and h_u′(s_0, s_0) = 0. Furthermore, since f″(s) = c_g(s) K(s) − f(s) for all s ∈ J, h_u″(s_0, s_0) = −( d(a, s_0) − u_1 c_g(s_0) ) / σ_Y(s_0). If u ∈ D_a(s_0) ≠ ∅, h_u(s_0, s) reaches its maximum value at s = s_0, implying that h_u″(s_0, s_0) ≤ 0.
(ii) Suppose that u ∈ int D_a(s_0) ≠ ∅. Let us show by contradiction that h_u″(s_0, s_0) < 0. Otherwise, we would have h_u″(s_0, s_0) = 0 by (i). If h_u″(s_0, s_0) = 0, since c_g(s_0) > 0 and u ∈ int D_a(s_0), we can pick v ∈ D_a(s_0) (close enough to u) such that h_v″(s_0, s_0) > 0 (by taking v_1 > u_1), which contradicts (i).
Let us define the C¹-function
p̃(b, s, u) = p_b(s, u),  b ∈ R, s ∈ J, u = (u_1, . . . , u_{n−2}) ∈ R^{n−2}.
Lemma 12. The function p̃ is a one-to-one mapping from V₀ᵃ onto C₀ᵃ.
Proof. Using Condition 4, the proof is similar to the proof of Lemma 10(ii).
Lemma 13.
γ_n(C₀ᵃ) = ∫_{V₀ᵃ} σ_Y(s) ( d(b, s) − u_1 c_g(s) ) φ_n( p̃(b, s, u) ) du_1 · · · du_{n−2} ds db.  (17)
Proof. According to Lemmas 11 and 12, the function p̃ is a C¹-diffeomorphism from V₀ᵃ onto C₀ᵃ. Moreover, according to Lemmas 10 and 11, det Dp̃(b, s, u) > 0 for all (b, s, u) ∈ V₀ᵃ. Then (17) results from the change of variable x = p̃(b, s, u) applied to the integral γ_n(C₀ᵃ) = ∫_{C₀ᵃ} φ_n(x) dx.
4.2. Proof of Theorem 2. (i) Under Conditions 1–4, the inequality (9) is a direct consequence of Lemma 10.
(ii) To enlarge the scope of this inequality, we use the following perturbation argument. Let us consider the auxiliary Gaussian process Yε(s), s ∈ J,
Yε(s) = σ_Y(s) Vε(s),  Vε(s) = ( V(s) + ε Σ_{j=n+1}^{n+k} ξ_j f_j(s) ) (1 + ε²)^{−1/2},
where ξ_j, j = n + 1, . . . , n + k, are independent standard Gaussian r.v.'s, independent of ξ_j, j = 1, . . . , n, and where Σ_{j=n+1}^{n+k} f_j²(s) ≡ 1 for s ∈ J. The process Yε satisfies Conditions 1–4, so that, applying (9) to Yε,
f_{Zε}(b) ≤ (b/2π) ∫₀^L σ_Y(s) (σ_Y(s) + σ_Y″(s)) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) ds
+ (1/2π) ∫₀^L σ_Y(s) c_{g,ε}(s) exp( −b² / (2(σ_Y²(s) + σ_Y′²(s))) ) ds + M_Λ(b),
where M_Λ(b) is independent of ε. For 0 ≤ ε ≤ ε̄, the expression on the right-hand side is bounded above by an integrable function M̄(b) = O( b exp(−b² κ/2) ) for some κ > 0 determined by σ_Y := inf_{s∈J} σ_Y(s) > 0.
Finally, P(Zε ≤ a) → P(Z ≤ a) for all a as ε → 0, since
|Zε − Z| ≤ ( sup_{s∈J} σ_Y(s) ) [ | (1 + ε²)^{−1/2} − 1 | sup_{s∈J} |V(s)| + ε (1 + ε²)^{−1/2} sup_{s∈J} | Σ_{j=n+1}^{n+k} ξ_j f_j(s) | ] → 0 a.s.
4.3. Proof of Theorem 3. Define
W_s(s′) = Σ_{j=1}^{n−2} ζ_j ⟨K_j(s), f(s′)⟩ / ω_s(s′),  s′ ≠ s;  W_s(s) = ζ_1 c_g(s) / ( σ_Y(s) + σ_Y″(s) ),
with ζ = (ζ_1, . . . , ζ_{n−2}) a Gaussian r.v. with zero mean and identity covariance matrix. With this definition, W_s(s′) is defined and continuous on J, with Var( W_s(s) ) = c_g²(s) / ( σ_Y(s) + σ_Y″(s) )². Moreover, W_s(s′) is of the form (4) and satisfies Conditions 1 and 2: W_s(s′) = ς_s(s′) ⟨ζ, k_s(s′)⟩, where ς_s(s′) = Var( W_s(s′) )^{1/2} and k_s(s′) = (k_{s,1}(s′), . . . , k_{s,n−2}(s′)) with k_{s,j}(s′) = ⟨K_j(s), f(s′)⟩ / ω_s(s′) for s′ ≠ s, k_{s,1}(s) = 1 and k_{s,j}(s) = 0, j ≥ 2. Therefore, we can apply Theorem 2 to Z_s to obtain an upper bound for 1 − γ_{n−2}( D_b(s) ).
This interpretation can also be used to derive an upper bound for the second term in (10). Indeed, by Stokes' theorem [Berger and Gostiaux (1988), page 195], the integral
∫_{D_b(s)} u_1 φ_{n−2}(u) du_1 · · · du_{n−2}
can be rewritten as a boundary integral over ∂D_b(s) which, after parameterizing ∂D_b(s) as above, is controlled by the density f_{Z_s} of Z_s:
| ∫_{D_b(s)} u_1 φ_{n−2}(u) du_1 · · · du_{n−2} | ≤ ( inf_{s′∈J} ς_s(s′) )^{−1} f_{Z_s}(b).
4.4. Proof of Theorem 4. (i) Assume Y(s) periodic and that J \ J_+ has Lebesgue measure 0. Let us denote

e_1(b) = (b/2) ∫_0^L σ_Y(s) ( σ_Y(s) + σ_Ẏ(s) ) exp( − b² / ( 2( σ_Y²(s) + σ_Ẏ²(s) ) ) ) ds.

By (9),

M(b) = e_1(b) + (b/2) ∫_0^L σ_Y(s) ( σ_Y(s) + σ_Ẏ(s) ) exp( − b² / ( 2( σ_Y²(s) + σ_Ẏ²(s) ) ) ) H( b ( σ_Y(s) + σ_Ẏ(s) ) / c_g(s) ) ds,
where 0 < H(x) = x^{−1} φ(x) − ( 1 − Φ(x) ) ≤ x^{−3} φ(x) for all x > 0 and H(∞) = 0. Since c_g(s) is continuous and σ_Y(s) + σ_Ẏ(s) is continuous and positive on J, it follows that γ_1 = inf_{s∈J} ( σ_Y(s) + σ_Ẏ(s) ) / c_g(s) > 0 and M(b) ≤ e_1(b) ( 1 + (b γ_1)^{−3} ). By (12), m(b) = e_1(b) − A − B, where A involves the factor

exp( − b² ( σ_{Y,0}² − σ_Y²(s) − σ_Ẏ²(s) ) / 2 ),   σ_{Y,0}² := sup_{s∈J} ( σ_Y²(s) + σ_Ẏ²(s) ),

so that

A ≤ C_5 ∫_0^L σ_Y(s) ( σ_Y(s) + σ_Ẏ(s) ) exp( − b² ( σ_{Y,0}² − σ_Y²(s) − σ_Ẏ²(s) ) / 2 ) ds.

For δ > 0 small enough, the subset J_δ = { s ∈ J : σ_{Y,0}² − σ_Y²(s) − σ_Ẏ²(s) ≤ δ²/2 } of J contains a nonempty interval [around a global minimum of σ_{Y,0}² − σ_Y²(s) − σ_Ẏ²(s)]. Therefore, it has positive Lebesgue measure. For such a δ > 0,

∫_0^L σ_Y(s) ( σ_Y(s) + σ_Ẏ(s) ) exp( − b² ( σ_{Y,0}² − σ_Y²(s) − σ_Ẏ²(s) ) / 2 ) ds
 ≥ exp( − b² δ² / 2 ) ∫_{J_δ} σ_Y(s) ( σ_Y(s) + σ_Ẏ(s) ) ds = C_6(δ) exp( − b² δ² / 2 ).

Hence m(b) = e_1(b) ( 1 − O(b^{−3}) ), that is, M(b) = e_1(b) ( 1 + O(b^{−3}) ). Finally, from the continuity and positivity of σ_s(s′) at s′ = 0 and s′ = L, sup_{s∈J} M_s(b) = e_1(b) O(b⁴).
(ii) Straightforward.
(iii) Let us take δ > 0 such that [s* − δ, s* + δ] ⊂ J_+. Such a δ exists since the integrand is positive for s close to s* and s* ∈ J. Then there exists η > 0 such that

σ_Y²(s) + σ_Ẏ²(s) + η ≤ σ_Y²(s*) + σ_Ẏ²(s*)   for |s − s*| ≥ δ,

and

M(b) = (b/2) ∫_{s*−δ}^{s*+δ} σ_Y(s) ( σ_Y(s) + σ_Ẏ(s) ) exp( − b² / ( 2( σ_Y²(s) + σ_Ẏ²(s) ) ) ) ds ( 1 + O(b^{−1}) ),

up to an exponentially smaller remainder, by Laplace's method around s = s*.
(iv) Analogous to (iii).
4.5. Proof of Theorem 5. This follows directly from a straightforward adaptation of the proof of Theorem 3 and the application of Laplace's formula to M_s(b).
4.6. Proof of Theorem 6. Lemma 14 shows that the process X(t) admits the representation (1). Therefore, by truncation and renormalization, we can construct a sequence of Gaussian processes X_n(t) of the form (4) which converges to X(t). Moreover, under Conditions D1 and D2, X_n(t) satisfies Conditions 1 and 2 for all n sufficiently large. Similarly, under Conditions D2–D4, X_n(t) satisfies Conditions 2, 5 and 6 for n large enough. Hence the density f_{Z_n}(b) of Z_n = sup_{t∈I} X_n(t) has an upper bound M_n(b) of the form (9) and a lower bound m_n(b) of the form (12).
We will show that M_n(b) → M(b) for all b, m_n(b) → m(b) for all b > 0, and that the sequence f_{Z_n} is weakly relatively compact in L¹(R). Together with an inequality due to Dmitrovskii [Lifshits (1986)], this implies that Z has a density f_Z(b) which is the limit of f_{Z_n}(b) and satisfies m(b) ≤ f_Z(b) ≤ M(b).
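The truncation step in this proof can be mimicked numerically. The following is an illustrative sketch only (the squared-exponential covariance, the grid, and the eigendecomposition route are assumptions of the example, not the paper's construction): it shows the supremum of the truncation-error variance decreasing as more terms are kept.

```python
import numpy as np

# Approximate a smooth Gaussian process by a finite Karhunen-Loeve-type
# expansion built from the eigendecomposition of its covariance matrix on a
# grid, and watch delta_n^2 = sup_t Var(U(t) - U_n(t)) decrease with n.
t = np.linspace(0.0, 1.0, 200)
R = np.exp(-(t[:, None] - t[None, :]) ** 2)    # assumed smooth covariance
w, v = np.linalg.eigh(R)                        # eigenvalues ascending
w, v = w[::-1], v[:, ::-1]                      # reorder: largest first

def trunc_var(n):
    # Var(U - U_n) on the grid is the diagonal of the residual covariance
    Rn = (v[:, :n] * w[:n]) @ v[:, :n].T
    return np.max(np.diag(R - Rn))

errs = [trunc_var(n) for n in (2, 5, 10, 20)]
assert all(e2 <= e1 + 1e-12 for e1, e2 in zip(errs, errs[1:]))
print(errs)
```

The rapid decay of the truncation error reflects the smoothness of the covariance, which is what Conditions D1–D2 exploit.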
Lemma 14. Under Condition D1:
(i) there exists a representation X(t) = Σ_{j=1}^{∞} ξ_j g_j(t) of X(t), where the ξ_j, j ≥ 1, are i.i.d. N(0, 1);
(ii) the functions g_j(t), j ≥ 1, are in C³(I);
(iii) sup_{(t_1,t_2)∈I×I} | D^{kl} r_U(t_1, t_2) − Σ_{j=1}^{n} g_j^{(k)}(t_1) g_j^{(l)}(t_2) | → 0 as n → ∞, for 0 ≤ k, l ≤ 3.
The map

g̃_n(t) = g_n(t) / ‖g_n(t)‖,   t ∈ I,

is well defined for all n ≥ N_1 and g̃_n^{(k)}(t) → g^{(k)}(t) uniformly on I for 0 ≤ k ≤ 3. The functions g̃_n(t), t ∈ I, parameterize a curve Γ_n on the unit sphere of R^n × {0}. Moreover, there exists N_2 such that inf_{t∈I} ‖g̃′_n(t)‖ > 0 for all n ≥ N_2. The corresponding unit speed parameterization of Γ_n is defined by

f_n( φ_n(t) ) = ( f_{n,1}(φ_n(t)), …, f_{n,n}(φ_n(t)), 0, … ),   where φ_n(t) = ∫_0^t ‖g̃′_n(t′)‖ dt′.

With this notation, Z_n = sup_{t∈I} X_n(t) = sup_{t∈I} Y_n(φ_n(t)), where U_n(t) = Σ_{j=1}^{n} ξ_j g̃_{n,j}(t) and V_n(φ_n(t)) = Σ_{j=1}^{n} ξ_j f_{n,j}(φ_n(t)).
δ_n² = sup_{t∈I} Var( U(t) − U_n(t) )

and

d_n²(t_1, t_2) = E[ ( ( U(t_1) − U_n(t_1) ) − ( U(t_2) − U_n(t_2) ) )² ] ≤ C |t_1 − t_2|²

for n large enough, by Lemma 14. On the other hand, we have the following inequality due to Dmitrovskii [Lifshits (1986)]:

P( sup_{t∈I} | U(t) − U_n(t) | > u ) ≤ 2 exp( − u² / (2 δ_n²) ) q_n(u),
Along the unit-speed spherical curve f we use the moving frame (T, K_1, K_2, …, K_{n−2}), with T = f′. Then f″ = −f + c_g K_1, sin²θ = 1/c² and then c_g² = ‖f″‖² − 1. Note that c_g = 0 iff cos θ = 0, that is, f″ = −f. Finally, it can be shown that K_1 = ( f″ + f ) / c_g.
If g(t), t ∈ I = [0, T], is a general parameterization of Γ with ‖ġ(t)‖ > 0 for all t ∈ I, the corresponding unit speed parameterization of Γ is given by f(s) = g( φ^{−1}(s) ), s ∈ J, where s = φ(t) = ∫_0^t ‖ġ(u)‖ du, t ∈ I, is the arc length of Γ from 0 to t. We have

T(s) = f′(s) = (dg/dt)(dt/ds) = ġ(t) / ‖ġ(t)‖,

c(s) N(s) = f″(s) = d²f/ds² = (d/dt)( ġ(t)/‖ġ(t)‖ ) (dt/ds) = g̈(t)/‖ġ(t)‖² − ġ(t) ⟨ġ(t), g̈(t)⟩ / ‖ġ(t)‖⁴,

and

c_g²(t) = ‖g̈(t)‖² / ‖ġ(t)‖⁴ − ⟨ġ(t), g̈(t)⟩² / ‖ġ(t)‖⁶ − 1.
Acknowledgments. The draft of this paper was prepared while the authors were visiting the Department of Statistics at Stanford University. We are also indebted to one referee for bringing to our attention Piterbarg's work.
REFERENCES
Adler, R. J. (1981). The Geometry of Random Fields. Wiley, New York.
Adler, R. J. (1990). An Introduction to Continuity, Extrema, and Related Topics for General Gaussian Processes. IMS, Hayward, CA.
Azaïs, J.-M. and Florens-Zmirou, D. (1987). Approximation du temps local des processus gaussiens stationnaires par régularisation des trajectoires. Probab. Theory Related Fields 76 121–132.
Berger, M. and Gostiaux, B. (1988). Differential Geometry: Manifolds, Curves and Surfaces. Springer, New York.
Berman, S. (1988). Sojourns and extremes of a stochastic process defined as a random linear combination of arbitrary functions. Comm. Statist. Stochastic Models 4 1–43.
Berman, S. (1992). Sojourns and Extremes of Stochastic Processes. Wadsworth, Belmont, CA.
School of Statistics
College of Liberal Arts
270A Vincent Hall
206 Church Street
University of Minnesota
Minneapolis, Minnesota 55455
E-mail: cposse@stat.washington.edu
Introduction
MCQMC computations of Gaussian integrals
Maxima of Gaussian processes
Jean-Marc Azaïs (Université de Toulouse). Joint work with: Alan Genz (Washington State University), Cécile Mercadier (Lyon, France) and Mario Wschebor.
Introduction
Testing
The maximum of the absolute value of the series is 3.0224. An estimate of the covariance obtained with WAFO gives:
[Figure: estimated covariance function of the series]
I := ∫_{l_1}^{u_1} ⋯ ∫_{l_n}^{u_n} φ_Σ(x) dx    (1)

With T the Cholesky factor of the covariance Σ, the substitution x = Tz makes the integral sequential:

I := ∫_{l_1/T_{11}}^{u_1/T_{11}} φ(z_1) dz_1 ∫_{(l_2 − T_{12} z_1)/T_{22}}^{(u_2 − T_{12} z_1)/T_{22}} φ(z_2) dz_2 ⋯    (2)
Substituting z_i = Φ^{−1}(t_i):

I := ∫_{Φ(l_1/T_{11})}^{Φ(u_1/T_{11})} dt_1 ∫_{Φ((l_2 − T_{12} Φ^{−1}(t_1))/T_{22})}^{Φ((u_2 − T_{12} Φ^{−1}(t_1))/T_{22})} dt_2 ⋯    (3)

and, after rescaling each variable to [0, 1],

I = ∫_{[0,1]^n} h(t) dt.    (4)
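The chain (1)–(4) can be sketched for n = 2. The helper names and the crude bisection for Φ^{−1} below are assumptions of this illustration; note that numpy's Cholesky factor is lower triangular, so the off-diagonal entry appears as T[1, 0].

```python
import numpy as np
from math import erf, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))

def PhiInv(p):
    # crude bisection inverse of Phi; adequate for a sketch
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p: lo = mid
        else: hi = mid
    return 0.5 * (lo + hi)

def h(t1, T, l, u):
    # integrand of (4) for n = 2: sequential transformation of the limits
    d1, e1 = Phi(l[0] / T[0, 0]), Phi(u[0] / T[0, 0])
    z1 = PhiInv(d1 + t1 * (e1 - d1))
    d2 = Phi((l[1] - T[1, 0] * z1) / T[1, 1])
    e2 = Phi((u[1] - T[1, 0] * z1) / T[1, 1])
    return (e1 - d1) * (e2 - d2)

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
T = np.linalg.cholesky(Sigma)
l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])

t = (np.arange(2000) + 0.5) / 2000.0            # midpoint rule on [0, 1]
I = np.mean([h(ti, T, l, u) for ti in t])

# brute-force Monte Carlo on the original rectangle, for comparison
rng = np.random.default_rng(0)
x = rng.multivariate_normal([0, 0], Sigma, size=200000)
I_mc = np.mean(np.all((x > l) & (x < u), axis=1))
print(I, I_mc)
```

The transformed integrand is smooth on [0, 1], which is exactly what makes the (Q)MC methods of the next slides effective.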
QMC
In the form (4) the MC evaluation is based on

Î = (1/M) Σ_{i=1}^{M} h(t_i).
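A minimal sketch of this crude MC estimator, with an assumed test integrand whose integral over the unit cube is 1, so the error is visible:

```python
import numpy as np

# Crude Monte Carlo: I_hat = (1/M) sum h(t_i), t_i uniform on [0,1]^3.
rng = np.random.default_rng(1)
h = lambda t: np.prod(np.pi / 2 * np.sin(np.pi * t), axis=-1)  # integral = 1

for M in (10**3, 10**5):
    t = rng.random((M, 3))
    est = h(t).mean()
    print(M, est, abs(est - 1.0))   # error shrinks roughly like M**-0.5
```

The O(M^{−1/2}) rate is what the lattice rules below improve upon for smooth integrands.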
Theorem
(Nuyens and Cools, 2006) Assume that h is the tensor product of periodic functions that belong to a Korobov space (an RKHS). Then the minimax sequence and the worst-case error can be computed by a polynomial-time algorithm. Numerical results show that the convergence is roughly O(M^{−1}).
This result concerns the worst case, so it is not entirely relevant here.
A meta theorem
MCQMC
Let (t_i, i ≥ 1) be the lattice sequence. The estimation of the integral can be made random but exactly unbiased by setting

Î = (1/M) Σ_{i=1}^{M} h( {t_i + U} ),

where U is uniform on [0, 1]^n and {·} denotes the componentwise fractional part.
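A sketch of the randomly shifted lattice rule above. The generating vector z is an arbitrary choice for illustration, not an optimized (Nuyens–Cools) one, and the integrand is an assumed smooth periodic test function with integral 1:

```python
import numpy as np

# Randomly shifted rank-1 lattice rule: t_i = {i*z/M}, estimate
# I_hat = (1/M) sum h({t_i + U}); averaging a few independent shifts
# gives an unbiased estimate together with an error estimate.
rng = np.random.default_rng(2)
h = lambda t: np.prod(1.0 + np.sin(2 * np.pi * t) / 2, axis=-1)  # integral = 1

M, z = 4093, np.array([1, 1487, 775])           # hypothetical generator
lattice = (np.outer(np.arange(M), z) / M) % 1.0
ests = []
for _ in range(8):                               # 8 independent shifts
    U = rng.random(3)
    ests.append(h((lattice + U) % 1.0).mean())
est, se = np.mean(ests), np.std(ests, ddof=1) / np.sqrt(8)
print(est, se)
```

For this low-frequency periodic integrand the lattice rule is essentially exact, illustrating the near-O(M^{−1}) (or better) behaviour on smooth periodic h.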
Do processes exist?
In this part X(t) is a Gaussian process defined on a compact interval [0, T].
Since such a process is always observed at a finite set of times, and since the previous methods work with, say, n = 1000, is it relevant to consider the continuous case?
Answer: yes. Random processes occur as limit statistics. Consider for example the simple mixture model

H_0 : Y ∼ N(0, 1)
H_1 : Y ∼ p N(0, 1) + (1 − p) N(μ, 1),  p ∈ [0, 1], μ ∈ R.    (5)
The limit process of the likelihood ratio test statistic is Gaussian with covariance

r(s, t) = ( e^{st} − 1 ) / √( ( e^{s²} − 1 )( e^{t²} − 1 ) ).    (6)
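A simulation sketch of this limit process; the grid, the truncation away from s = 0 (where (6) is 0/0) and the eigendecomposition route are choices of this illustration:

```python
import numpy as np

# Discretize covariance (6) on a grid, simulate paths, inspect the supremum.
s = np.linspace(0.05, 1.0, 120)                 # avoid s = 0
R = (np.exp(np.outer(s, s)) - 1.0) / np.sqrt(
    np.outer(np.exp(s**2) - 1.0, np.exp(s**2) - 1.0))

w, v = np.linalg.eigh(R)
A = v * np.sqrt(np.clip(w, 0.0, None))          # R = A A^T up to rounding

rng = np.random.default_rng(3)
sup = (A @ rng.standard_normal((len(s), 20000))).max(axis=0)
print(sup.mean(), np.quantile(sup, 0.95))
```

This is the discretized object to which the Gaussian-integral machinery of the previous section can be applied.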
An example
Extensions
Treat all the cases: maximum of the absolute value, non-centered, non-stationary. In each case some specific tricks have to be used.
A great challenge is to use such formulas for random fields.
THANK-YOU
MERCI
GRACIAS
( X_{t_1}, …, X_{t_n}, X′_{t_1}, …, X′_{t_n}, …, X^{(k)}_{t_1}, …, X^{(k)}_{t_n} )

is non-degenerate.
We denote by m(t) and r(s, t) the mean and covariance functions of X, that is, m(t) := E(X_t), r(s, t) := E( (X_s − m(s))(X_t − m(t)) ), and by r_{ij} := ∂^{i+j} r / ( ∂s^i ∂t^j ) (i, j = 0, 1, …) the partial derivatives of r, whenever they exist.
Our main results are the following:
Theorem 1.1. Let X = {X_t : t ∈ [0, 1]} be a stochastic process satisfying H_{2k}. Denote by F(u) = P(M ≤ u) the distribution function of M.
Then F is of class C^k and its successive derivatives can be computed by repeated application of Lemma 3.3.
Corollary 1.1. Let X be a stochastic process verifying H_{2k} and assume also that E(X_t) = 0 and Var(X_t) = 1.
Then, as u → +∞, F^{(k)}(u) is equivalent to

(−1)^{k−1} ( u^k / (2π) ) e^{−u²/2}.    (1)
2. Crossings
Our methods are based on well-known formulae for the moments of crossings of the paths of stochastic processes with fixed levels, which have been obtained by a variety of authors, starting from the fundamental work of S. O. Rice (1944–1945). In this section we review, without proofs, some of these and related results.
Let f : I → IR be a function defined on the interval I of the real numbers, and let

C_u(f; I) := { t ∈ I : f(t) = u },   N_u(f; I) := ♯ C_u(f; I),

denote respectively the set of roots of the equation f(t) = u on the interval I and the number of these roots, with the convention N_u(f; I) = +∞ if the set C_u is infinite. N_u(f; I) is called the number of crossings of f with the level u on the interval I.
In the same way, if f is a differentiable function, the numbers of upcrossings and downcrossings of f are defined by means of

U_u(f; I) := ♯ { t ∈ I : f(t) = u, f′(t) > 0 },
D_u(f; I) := ♯ { t ∈ I : f(t) = u, f′(t) < 0 }.
For a more general definition of these quantities see Cramér and Leadbetter (1967).
In what follows, ‖f‖_p is the norm of f in L^p(I, λ), 1 ≤ p ≤ +∞, λ denoting the Lebesgue measure. The joint density of the finite set of real-valued random variables X_1, …, X_n at the point (x_1, …, x_n) will be denoted p_{X_1,…,X_n}(x_1, …, x_n) whenever it exists. φ(t) := (2π)^{−1/2} exp(−t²/2) is the density of the standard normal distribution and Φ(t) := ∫_{−∞}^{t} φ(u) du its distribution function.
The following proposition (sometimes called Kac's formula) is a common tool to count crossings.
Proposition 2.1. Let f : I = [a, b] → IR be of class C¹ with f(a), f(b) ≠ u. If f does not have local extrema with value u on the interval I, then

N_u(f; I) = lim_{δ→0} (1/(2δ)) ∫_I 1I_{{|f(t)−u|<δ}} |f′(t)| dt.

For (t_1, …, t_k) ∈ I^k, set

A_{t_1,…,t_k}(x_1, …, x_k) := ∫_{IR^k} Π_{j=1}^{k} |x′_j| p_{X_{t_1},…,X_{t_k},X′_{t_1},…,X′_{t_k}}( x_1, …, x_k, x′_1, …, x′_k ) dx′_1 ⋯ dx′_k,

where it is understood that the density in the integrand of the definition of A_{t_1,…,t_k}(x_1, …, x_k) exists almost everywhere and that the integrals above can take the value +∞.
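Kac's formula can be illustrated numerically. A sketch with an assumed test function f(t) = sin(6πt), which crosses the level u = 0.3 six times on [0, 1]:

```python
import numpy as np

# (1/(2*delta)) * int 1{|f-u|<delta} |f'| dt approaches N_u(f; I) as
# delta -> 0, for a C^1 function with no tangencies at level u.
t = np.linspace(0.0, 1.0, 400001)
f = np.sin(6 * np.pi * t)
fp = 6 * np.pi * np.cos(6 * np.pi * t)
u = 0.3
dt = t[1] - t[0]

direct = np.sum(np.abs(np.diff(np.sign(f - u))) > 0)    # sign changes
for delta in (0.1, 0.01, 0.001):
    kac = np.sum((np.abs(f - u) < delta) * np.abs(fp)) * dt / (2 * delta)
    print(delta, kac)
print(direct)
```

As delta shrinks, the weighted occupation measure of the thin band around the level converges to the crossing count.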
1. the density

p_{X_{t_1},…,X_{t_k},X′_{s_1},…,X′_{s_k}}

exists for (t_1, …, t_k), (s_1, …, s_k) ∈ I^k \ D_k(I) and is a continuous function of (t_1, …, t_k) and of x_1, …, x_k at the point (u, …, u);
2. the function

(t_1, …, t_k, x_1, …, x_k) → A_{t_1,…,t_k}(x_1, …, x_k)

is continuous for (t_1, …, t_k) ∈ I^k \ D_k(I) and x_1, …, x_k belonging to a neighbourhood of u;
3. (additional technical condition).
Then

E[ N_u ( N_u − 1 ) ⋯ ( N_u − k + 1 ) ] = ∫_{I^k} A_{t_1,…,t_k}(u, …, u) dt_1 ⋯ dt_k,    (2)

and in particular, for k = 1,

E[ N_u(X; I) ] = ∫_I dt ∫_{IR} |x′| p_{X_t, X′_t}(u, x′) dx′.    (3)
(b) Simple variations of (3), valid under the same hypotheses, are:

E[ U_u(X; I) ] = ∫_I dt ∫_0^{+∞} x′ p_{X_t, X′_t}(u, x′) dx′,    (4)

E[ D_u(X; I) ] = ∫_I dt ∫_{−∞}^{0} |x′| p_{X_t, X′_t}(u, x′) dx′.    (5)
In the same way one can obtain formulae for the factorial moments of marked crossings, that is, crossings such that some additional condition holds true. For example, if Y = {Y_t : t ∈ IR} is some other real-valued stochastic process such that for every t, (Y_t, X_t, X′_t) admits a joint density, −∞ ≤ a < b ≤ +∞ and

N_u^{a,b}(X; I) := ♯ { t : t ∈ I, X_t = u, a < Y_t < b },

then

E[ N_u^{a,b}(X; I) ] = ∫_a^b dy ∫_I dt ∫_{IR} |x′| p_{Y_t, X_t, X′_t}(y, u, x′) dx′.    (6)
In particular, if M⁺_{a,b} is the number of strict local maxima of X(·) on the interval I such that the value of X(·) lies in the interval (a, b), then M⁺_{a,b} = D_0^{a,b}(X′, I) and:

E[ M⁺_{a,b} ] = ∫_a^b dy ∫_I dt ∫_{−∞}^{0} |x″| p_{X_t, X′_t, X″_t}(y, 0, x″) dx″.    (7)

Sufficient conditions for the validity of (6) and (7) are similar to those for (3).
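Formula (4) can be checked by simulation in the stationary case, where it reduces to the classical Rice formula E[U_u] = (T/2π)√(−r″(0)) e^{−u²/2} for a centered process with unit variance. The covariance r(τ) = exp(−τ²/2), with −r″(0) = 1, is an assumed example:

```python
import numpy as np

# Simulate a stationary Gaussian process on a fine grid via an
# eigendecomposition of its covariance matrix, count upcrossings of u,
# and compare with the Rice formula.
T, u, n, reps = 10.0, 1.0, 2000, 2000
t = np.linspace(0.0, T, n)
R = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2)
w, v = np.linalg.eigh(R)
A = v * np.sqrt(np.clip(w, 0.0, None))

rng = np.random.default_rng(4)
X = A @ rng.standard_normal((n, reps))
up = np.mean(np.sum((X[:-1] < u) & (X[1:] >= u), axis=0))
rice = T / (2 * np.pi) * np.exp(-u**2 / 2)
print(up, rice)
```

Agreement up to Monte Carlo and discretization error illustrates the content of (4) in the simplest setting.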
(c) Proofs of (2) for Gaussian processes satisfying certain conditions can be found in Belyaev (1966) and Cramér–Leadbetter (1967). Marcus (1977) contains various extensions. The present statement of Proposition 2.2 is from Wschebor (1985).
(d) It may be nontrivial to verify the hypotheses of Proposition 2.2, but some general criteria are available. For example, if X is a Gaussian process with C¹ paths and the densities

p_{X_{t_1},…,X_{t_k},X′_{s_1},…,X′_{s_k}}

are non-degenerate for (t_1, …, t_k), (s_1, …, s_k) ∈ I^k \ D_k, then conditions 1, 2, 3 of Proposition 2.2 hold true (cf. Wschebor, 1985, p. 37, for a proof and also for some manageable sufficient conditions in non-Gaussian cases).
(e) Another point related to Rice formulae is the nonexistence of local extrema at a given level. We mention here two well-known results:
Proposition 2.3 (Bulinskaya, 1961). Suppose that X has C¹ paths and that for every t ∈ I, X_t has a density p_{X_t}(x) bounded for x in a neighbourhood of u. Then, almost surely, X has no tangencies at the level u, in the sense that if

T_u^X := { t ∈ I : X_t = u, X′_t = 0 },

then P(T_u^X = ∅) = 1.
Proposition 2.4 (Ylvisaker's theorem, 1968). Suppose that {X_t : t ∈ T} is a real-valued Gaussian process with continuous paths, defined on a compact separable topological space T, and that Var(X_t) > 0 for every t ∈ T. Then, for each u ∈ IR, with probability 1, the function t → X_t does not have any local extrema with value u.
3. Proofs and related results
Let ξ be a random variable with values in IR^k whose distribution admits a density with respect to the Lebesgue measure λ; the density will be denoted p_ξ(·). Further, suppose E is an event. It is clear that the measure

µ(B; E) := P( {ξ ∈ B} ∩ E ),

defined on the Borel sets B of IR^k, is also absolutely continuous with respect to λ. We will call the density of ξ related to E the Radon derivative:

p_ξ(x; E) := ( dµ(·; E) / dλ )(x).
Theorem 3.1. Suppose that X has C² paths, that (X_t, X′_t, X″_t) admits a joint density for every t, that for every t, X_t has a bounded density p_{X_t}(·), and that the function

I(x, z) := ∫_0^1 E( |X″_t| 1I_{{X″_t<0}} | X_t = x, X′_t = z ) p_{X_t, X′_t}(x, z) dt

is uniformly continuous in z for (x, z) in some neighbourhood of (u, 0). Then the distribution of M admits a density p_M(·) satisfying a.e.

p_M(u) ≤ p_{X_0}(u; X′_0 < 0) + p_{X_1}(u; X′_1 > 0) + I(u, 0).    (8)
Using Proposition 2.3, with probability 1, X′(·) has no tangencies at the level 0; thus an upper bound for this expectation follows from Kac's formula:

M⁺_{u−h,u} = lim_{δ→0} (1/(2δ)) ∫_0^1 1I_{{u−h<X(t)≤u}} 1I_{{|X′(t)|<δ, X″(t)<0}} |X″(t)| dt   a.s.,

and taking expectations,

(1/(2δ)) ∫_{−δ}^{δ} dz ∫_{u−h}^{u} I(x, z) dx → ∫_{u−h}^{u} I(x, 0) dx.
Introducing the normalized covariance

ρ(s, t) := r(s, t) / ( σ(s) σ(t) ),

with

ρ(t, t) = 1, ρ_{11}(t, t) = 1, ρ_{10}(t, t) = 0, ρ_{12}(t, t) = 0, ρ_{02}(t, t) = −1,

after some calculations we get exactly their bound M(u) (their formula (9)) for the density of the maximum.
Let us illustrate formula (8) explicitly when the process is Gaussian, centered, with unit variance. By means of a deterministic time change, one can also assume that the process has unit speed (Var(X′_t) ≡ 1). Let L be the length of the new time interval. Clearly, for all t, m(t) = 0, r(t, t) = 1, r_{11}(t, t) = 1, r_{10}(t, t) = 0, r_{12}(t, t) = 0, r_{02}(t, t) = −1. Note that

Z ∼ N(µ, σ²) ⟹ E(Z⁻) = σ φ(µ/σ) − µ Φ(−µ/σ).

The formulae for regression imply that conditionally on X_t = u, X′_t = 0, X″_t has expectation −u and variance r_{22}(t, t) − 1. Formula (8) reduces to

p_M(u) ≤ p⁺(u) := φ(u) [ 1 + (2π)^{−1/2} ∫_0^L ( C_g(t) φ( u/C_g(t) ) + u Φ( u/C_g(t) ) ) dt ],

with C_g(t) := ( r_{22}(t, t) − 1 )^{1/2}. As x → +∞,

Φ(x) = 1 − φ(x)/x + φ(x)/x³ + O( φ(x)/x⁵ ),

so that

p⁺(u) = φ(u) + (2π)^{−1} u L e^{−u²/2} + O( φ(u) φ(u/C⁺) ),

with C⁺ := sup_{t∈[0,L]} C_g(t).
Furthermore, the exact equivalent of p_M(u) as u → +∞ is

(2π)^{−1} u L exp(−u²/2),

as we will see in Corollary 1.1.
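The identity for E(Z⁻) used above is easy to check by simulation; the parameter values below are arbitrary:

```python
import numpy as np
from math import erf, exp, pi, sqrt

# For Z ~ N(mu, sigma^2): E(Z^-) = sigma*phi(mu/sigma) - mu*Phi(-mu/sigma).
phi = lambda x: exp(-x * x / 2) / sqrt(2 * pi)
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))

rng = np.random.default_rng(5)
for mu, sigma in [(-1.0, 2.0), (0.5, 1.0), (2.0, 0.5)]:
    z = rng.normal(mu, sigma, 2_000_000)
    mc = np.mean(np.maximum(-z, 0.0))                    # Monte Carlo E(Z^-)
    exact = sigma * phi(mu / sigma) - mu * Phi(-mu / sigma)
    print(mu, sigma, mc, exact)
```

This is the conditional expectation that produces the integrand of p⁺(u).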
The following theorem is a special case of Lemma 3.3. We state it separately since we use it below to compare the results that follow from it with known results.
Theorem 3.2. Suppose that X is a Gaussian process satisfying H_2. Then M has a continuous density p_M given for every u by

p_M(u) = p_{X_0}(u; M ≤ u) + p_{X_1}(u; M ≤ u)
 + ∫_0^1 E( |X″_t| 1I_{{X″_t<0}} 1I_{{M≤u}} | X_t = u, X′_t = 0 ) p_{X_t, X′_t}(u, 0) dt.    (9)
and if x″ < 0:

P( M ≤ u | X_t = u, X′_t = 0, X″_t = x″ )
 ≥ 1 − E( D_u([0, t]) + U_u([t, 1]) | X_t = u, X′_t = 0, X″_t = x″ ).

If we plug these lower bounds into formula (9) and replace the expectations of upcrossings and downcrossings by means of integral formulae of (4), (5) type, we obtain the lower bound:

p_M(u) ≥ p_{X_0}(u; X′_0 < 0) + p_{X_1}(u; X′_1 > 0)
 + ∫_0^1 dt E( |X″_t| 1I_{{X″_t<0}} | X_t = u, X′_t = 0 ) p_{X_t, X′_t}(u, 0)
 − ∫_0^1 dt ∫_0^t ds ∫_{−∞}^{0} |x′| E( |X″_t| 1I_{{X″_t<0}} | X_s = u, X′_s = x′, X_t = u, X′_t = 0 ) p_{X_s, X′_s, X_t, X′_t}(u, x′, u, 0) dx′
 − ∫_0^1 dt ∫_t^1 ds ∫_0^{+∞} x′ E( |X″_t| 1I_{{X″_t<0}} | X_t = u, X′_t = 0, X_s = u, X′_s = x′ ) p_{X_t, X′_t, X_s, X′_s}(u, 0, u, x′) dx′.    (10)
Simpler expressions for (10), also adapted to numerical computation, can be found in Cierco (1996).
Finally, some sharper upper bounds for p_M(u) are obtained when replacing the event {M > u} by {X_0 + X_1 > 2u}, the probability of which can be expressed using the conditional expectation and variance of X_0 + X_1; we are only able to express these bounds in integral form.
We now turn to the proofs of our main results.
Z̃_s = (1 − s)^{−1} ( Z_s − ā(s) Z_1 ),   s ∈ [0, 1),    (12)

Ẑ^t_s = 2 (s − t)^{−2} ( Z_s − b_t(s) Z_t − c_t(s) Z′_t ),   s ∈ [0, 1], s ≠ t,    (13)

and the map (t, s) → Ẑ^t_s is continuous.
Proof. (a) and (b) follow in a direct way, computing the regression coefficients a(s), ā(s), b_t(s), c_t(s) and substituting into formulae (11), (12), (13). Note that (b) also follows from (a) by applying it to Z + f and to Z. We now prove (c), which is a consequence of the following:
Suppose Z(t_1, …, t_k) is a Gaussian field with C^p sample paths (p ≥ 2) defined on [0, 1]^k, with no degeneracy in the same sense as in the definition of hypothesis H_k(3) for one-parameter processes. Then the Gaussian fields defined by means of:
Z̄(t_1, …, t_k) = t_k^{−1} ( Z(t_1, …, t_{k−1}, t_k) − a(t_1, …, t_k) Z(t_1, …, t_{k−1}, 0) )   for t_k ≠ 0,

Z̃(t_1, …, t_k) = (1 − t_k)^{−1} ( Z(t_1, …, t_{k−1}, t_k) − ā(t_1, …, t_k) Z(t_1, …, t_{k−1}, 1) )   for t_k ≠ 1,

Ẑ(t_1, …, t_k, t_{k+1}) = 2 (t_{k+1} − t_k)^{−2} ( Z(t_1, …, t_{k−1}, t_{k+1})
 − b(t_1, …, t_k, t_{k+1}) Z(t_1, …, t_k) − c(t_1, …, t_k, t_{k+1}) ∂Z/∂t_k (t_1, …, t_k) )   for t_{k+1} ≠ t_k,
can be extended to [0, 1]^k (respectively [0, 1]^k, [0, 1]^{k+1}) into fields with paths in C^{p−1} (respectively C^{p−1}, C^{p−2}). In the above formulae,
- a(t_1, …, t_k) is the regression coefficient of Z(t_1, …, t_k) on Z(t_1, …, t_{k−1}, 0),
- ā(t_1, …, t_k) is the regression coefficient of Z(t_1, …, t_k) on Z(t_1, …, t_{k−1}, 1),
- b(t_1, …, t_k, t_{k+1}), c(t_1, …, t_k, t_{k+1}) are the regression coefficients of Z(t_1, …, t_{k−1}, t_{k+1}) on the pair ( Z(t_1, …, t_k), ∂Z/∂t_k (t_1, …, t_k) ).
k
Let us prove the statement on Ẑ; the other two are simpler. Denote by V the subspace of L²(Ω, 𝔄, P) generated by the pair ( Z(t_1, …, t_k), ∂Z/∂t_k (t_1, …, t_k) ), and for a random variable Y put

π_V(Y) := Y − ( b Z(t_1, …, t_k) + c ∂Z/∂t_k (t_1, …, t_k) ),

where b, c are the regression coefficients of Y on that pair. Note that if {Y_α : α ∈ A} is a random field with continuous paths and such that α → Y_α is continuous in L²(Ω, 𝔄, P), then a.s.

(α, t_1, …, t_k) → π_V(Y_α)

is continuous.
From the definition and Taylor's formula,

Z(t_1, …, t_{k−1}, t_{k+1}) = Z(t_1, …, t_k) + (t_{k+1} − t_k) ∂Z/∂t_k (t_1, …, t_k) + R_2(t_1, …, t_k, t_{k+1}),

with

R_2(t_1, …, t_k, t_{k+1}) = ∫_{t_k}^{t_{k+1}} ∂²Z/∂t_k² (t_1, …, t_{k−1}, τ) (t_{k+1} − τ) dτ,

so that

Ẑ(t_1, …, t_k, t_{k+1}) = 2 (t_{k+1} − t_k)^{−2} π_V( R_2(t_1, …, t_k, t_{k+1}) ).    (14)
It is clear that the paths of the random field Ẑ are p − 1 times continuously differentiable for t_{k+1} ≠ t_k. Relation (14) shows that they have a continuous extension to [0, 1]^{k+1} with

Ẑ(t_1, …, t_k, t_k) = π_V( ∂²Z/∂t_k² (t_1, …, t_k) ).

In fact,
Ẑ(s_1, …, s_k, s_{k+1}) = 2 (s_{k+1} − s_k)^{−2} ∫_{s_k}^{s_{k+1}} π_V( ∂²Z/∂t_k² (s_1, …, s_{k−1}, τ) ) (s_{k+1} − τ) dτ.

According to our choice of the version of the orthogonal projection π_V, a.s. the integrand is a continuous function of the parameters therein, so that, a.s.:

Ẑ(s_1, …, s_k, s_{k+1}) → π_V( ∂²Z/∂t_k² (t_1, …, t_k) )   when (s_1, …, s_k, s_{k+1}) → (t_1, …, t_k, t_k).

This proves (c). In the same way, when p ≥ 3, we obtain the continuity of the partial derivatives of Ẑ up to the order p − 2.
The following lemma has its own interest besides being required in our proof of Lemma 3.3. It is a slight improvement, in the case of one-parameter processes, of Lemma 4.3, p. 76, in Piterbarg (1996).
Lemma 3.2. Suppose that X is a Gaussian process with C³ paths and that for all s ≠ t the distributions of (X_s, X′_s, X_t, X′_t) and of (X_t, X′_t, X^{(2)}_t, X^{(3)}_t) do not degenerate. Then there exists a constant K (depending on the process) such that

p_{X_s, X_t, X′_s, X′_t}(x_1, x_2, x′_1, x′_2) ≤ K (t − s)^{−4}

for all x_1, x_2, x′_1, x′_2 ∈ IR and all s, t ∈ [0, 1], s ≠ t.
Proof.

p_{X_s, X_t, X′_s, X′_t}(x_1, x_2, x′_1, x′_2) ≤ (2π)^{−2} [ DetVar(X_s, X_t, X′_s, X′_t) ]^{−1/2},

where DetVar stands for the determinant of the variance matrix. Since by hypothesis the distribution does not degenerate outside the diagonal s = t, the conclusion of the lemma is trivially true on a set of the form {|s − t| ≥ δ}, δ > 0. By a compactness argument it is sufficient to prove it for s, t in a neighbourhood of (t_0, t_0) for each t_0 ∈ [0, 1]. For this last purpose we use a generalization of a technique employed by Belyaev (1966). Since the determinant is invariant under adding a linear combination of rows (resp. columns) to another row (resp. column),
DetVar(X_s, X_t, X′_s, X′_t) = DetVar(X_s, X′_s, X̄^{(2)}_s, X̄^{(3)}_s),

with

X̄^{(2)}_s = X_t − X_s − (t − s) X′_s ≃ ( (t − s)² / 2 ) X^{(2)}_{t_0},
X̄^{(3)}_s = X′_t − X′_s − ( 2/(t − s) ) X̄^{(2)}_s ≃ ( (t − s)² / 6 ) X^{(3)}_{t_0}.

The equivalence refers to (s, t) → (t_0, t_0). Since the paths of X are of class C³,

( X_s, X′_s, 2 (t − s)^{−2} X̄^{(2)}_s, 6 (t − s)^{−2} X̄^{(3)}_s )

tends almost surely to ( X_{t_0}, X′_{t_0}, X^{(2)}_{t_0}, X^{(3)}_{t_0} ) as (s, t) → (t_0, t_0). This implies the convergence of the variance matrices. Hence

DetVar(X_s, X_t, X′_s, X′_t) ≃ ( (t − s)^8 / 144 ) DetVar( X_{t_0}, X′_{t_0}, X^{(2)}_{t_0}, X^{(3)}_{t_0} ).
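The determinant equivalence can be checked numerically for an assumed smooth stationary covariance r(τ) = exp(−τ²/2), whose spectral moments give DetVar(X, X′, X″, X‴) explicitly:

```python
import numpy as np

# Check: as t -> s, DetVar(X_s, X_t, X'_s, X'_t)
#        ~ ((t-s)^8 / 144) * DetVar(X, X', X'', X''').
def cov_matrix(s, t):
    # covariances of (X_s, X_t, X'_s, X'_t) from r and its derivatives
    r  = lambda tau: np.exp(-tau**2 / 2)
    r1 = lambda tau: -tau * np.exp(-tau**2 / 2)            # r'
    r2 = lambda tau: (tau**2 - 1) * np.exp(-tau**2 / 2)    # r''
    d = t - s
    return np.array([
        [r(0.0),  r(d),    0.0,      r1(d)],
        [r(d),    r(0.0),  -r1(d),   0.0],
        [0.0,     -r1(d),  -r2(0.0), -r2(d)],
        [r1(d),   0.0,     -r2(d),   -r2(0.0)],
    ])

# For this r the spectral moments are lambda_2, lambda_4, lambda_6 = 1, 3, 15,
# so the covariance of (X, X', X'', X''') is:
M = np.array([
    [1.0,  0.0, -1.0,  0.0],
    [0.0,  1.0,  0.0, -3.0],
    [-1.0, 0.0,  3.0,  0.0],
    [0.0, -3.0,  0.0, 15.0],
])
lim = np.linalg.det(M)
for d in (0.2, 0.1, 0.05):
    ratio = np.linalg.det(cov_matrix(0.0, d)) / (d**8 / 144 * lim)
    print(d, ratio)   # ratio tends to 1
```

The ratio approaching 1 reproduces the (t − s)^8/144 rate used in the proof.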
For t ∈ [0, 1], set

ṽ^t_{v,u} = G( ( (t_1 − t)²/2 ) Ẑ^t_{t_1} − β(t_1) u + β̂^t(t_1)(u − v), …, ( (t_m − t)²/2 ) Ẑ^t_{t_m} − β(t_m) u + β̂^t(t_m)(u − v) ).

Then

F′_v(u) = β(0) E( ṽ_{v,u} 1I_{A_u(Z̄, β̄)} ) p_{Z_0}( β(0) u )
 + β(1) E( ṽ_{v,u} 1I_{A_u(Z̃, β̃)} ) p_{Z_1}( β(1) u )
 − ∫_0^1 β(t) E( ṽ^t_{v,u} ( Ẑ^t_t − β̂^t(t) u ) 1I_{A_u(Ẑ^t, β̂^t)} ) p_{Z_t, Z′_t}( β(t) u, β′(t) u ) dt.    (15)
Proof . We start by showing that the arguments of Theorem 3.1 can be extended to
our present case to establish that Fv is absolutely continuous. This proof already
contains a first approximation to the main ideas leading to the proof of the lemma.
Step 1. Assume, with no loss of generality, that u ≥ 0, and write for h > 0:

F_v(u) − F_v(u − h) = E( v · 1I_{A_u \ A_{u−h}} ) − E( v · 1I_{A_{u−h} \ A_u} ).    (16)

Note that

P( M^{(1)}_{u−h,u} ≥ 1 ) ≤ E( M^{(1)}_{u−h,u} ),    (17)

and the formula for the expectation of the number of local maxima, applied to the process t → Z_t − β(t)(u − h), implies

| E( v · 1I_{A_u \ A_{u−h}} ) | ≤ E( |v| 1I_{{β(0)(u−h) < Z_0 ≤ β(0)u}} 1I_{{β(0)>0}} )
 + E( |v| 1I_{{β(1)(u−h) < Z_1 ≤ β(1)u}} 1I_{{β(1)>0}} )
 + ∫_0^1 β(t) h E( ⋯ ) 1I_{{β(t)>0}} dt.    (18)
where    (19), (20)

|R_1(h)| ≤ E( |v| 1I_{{β(0)(u−h) < Z_0 ≤ β(0)u, β(1)(u−h) < Z_1 ≤ β(1)u}} 1I_{{β(0)>0, β(1)>0}} )
 + E( |v| 1I_{{β(0)(u−h) < Z_0 ≤ β(0)u, M^{(1)}_{u−h,u} ≥ 1}} 1I_{{β(0)>0}} )
 + E( |v| 1I_{{β(1)(u−h) < Z_1 ≤ β(1)u, M^{(1)}_{u−h,u} ≥ 1}} 1I_{{β(1)>0}} )
 + E( |v| 1I_{{M^{(1)}_{u−h,u} ≥ 2}} )
 =: T_1(h) + T_2(h) + T_3(h) + T_4(h).
Let us consider T_2(h). Using the integral formula for the expectation of the number of local maxima, and splitting the time integral at a small ε > 0,

T_2(h) ≤ 1I_{{β(0)>0}} ∫_0^ε dt ∫_0^{β(0)h} dz_0 ∫_0^{β(t)h} E( ⋯ | V_2 = (z_0, z) ) p_{V_2}(z_0, z) dz
 + 1I_{{β(0)>0}} ∫_ε^1 dt ∫_0^{β(0)h} dz_0 ∫_0^{β(t)h} E( ⋯ | V_2 = (z_0, z) ) p_{V_2}(z_0, z) dz,

where the random vector V_2 is the same as in (18). Since the conditional expectation as well as the density are bounded for u in a bounded set and 0 < h < 1, the first integral is bounded by (const) ε h.
As for the second integral, when t is between ε and 1 the Gaussian vector

( Z_0 − β(0)(u − h), Z_t − β(t)(u − h), Z′_t − β′(t)(u − h) )

has a bounded density, so that the integral is bounded by C_ε h², where C_ε is a constant depending on ε.
Since ε > 0 is arbitrarily small, this proves that T_2(h) = o(h). T_3(h) is similar to T_2(h).
We now consider T_4(h). Put:

E_h = { h^{1/4} ≤ |v| ≤ h^{−1/4} }.

By the Hölder inequality,

E( |v| 1I_{E_h^C} M^{(1)}_{u−h,u} ) ≤ [ E( |v|⁴ ) ]^{1/4} [ E( ( M^{(1)}_{u−h,u} )⁴ ) ]^{1/4} [ P( E_h^C ) ]^{1/2}.    (21)

The polynomial bound on G, plus the fact that Z has finite moments of all orders, implies that E(|v|⁴) is uniformly bounded.
Also, M^{(1)}_{u−h,u} ≤ D_0( Z(·) − β(·)(u − h), [0, 1] ) =: D (recall that D_0(g; I) denotes the number of downcrossings of level 0 by the function g). A bound for E(D⁴)
E( |v| 1I_{E_h^C} M^{(1)}_{u−h,u} ) ≤ (const) ( C_1 e^{−C_2 h^{−1/2}} + h^{q/4} [ E( |v|^q ) ]^{1/2} ),

where C_1, C_2 are positive constants and q is any positive number. The bound on the first term follows from the Landau–Shepp (1971) inequality (see also Fernique, 1974): even though the process depends on h, it is easy to see that the bound is uniform in h, 0 < h < 1. The bound on the second term is simply the Markov inequality. Choosing q > 8, we see that the second term in (21) is o(h).
For the first term in (21) one can use the formula for the second factorial moment of M^{(1)}_{u−h,u} to write it in the form:

∫_0^1 ∫_0^1 1I_{{β(s)>0, β(t)>0}} ds dt ∫_0^{β(s)h} dz_1 ∫_0^{β(t)h} dz_2 E( |v| 1I_{E_h} ⋯ | Z_s, Z_t, ⋯ ) p( ⋯ ),

together with the Taylor-type bound

| ∫_{t_1}^{y} ( Z‴(τ) − Z‴(t_1) ) dτ | = | ∫_{t_1}^{y} dτ ∫_{t_1}^{τ} Z^{(4)}(σ) dσ | ≤ ( (t_2 − t_1)² / 2 ) sup_{[t_1, t_2]} | Z^{(4)} |.
Collecting the above estimates, we obtain

F_v(u) − F_v(u − h) = ∫_{u−h}^{u} H_1(x, h) dx + o(h),    (24)

where:

H_1(x, h) = 1I_{{β(0)>0}} β(0) E( v · 1I_{A_u} | Z_0 = β(0)x ) p_{Z_0}( β(0)x )
 + 1I_{{β(1)>0}} β(1) E( v · 1I_{A_u} | Z_1 = β(1)x ) p_{Z_1}( β(1)x )
 + ∫_0^1 1I_{{β(t)>0}} β(t) E( v · 1I_{A_u} ⋯ | Z_t = β(t)x, Z′_t = β′(t)x ) p_{Z_t, Z′_t}( β(t)x, β′(t)x ) dt.    (25)
Step 3. Our next aim is to prove that for each u the limit

lim_{h↓0} ( F_v(u) − F_v(u − h) ) / h

exists and admits the representation (15) in the statement of the lemma. For that purpose, we will prove the existence of the limit

lim_{h↓0} (1/h) E( v · 1I_{A_u \ A_{u−h}} ),    (26)

which, by (24), reduces to studying lim_{h↓0, u−h<x<u} H_1(x, h).
Consider the first term in expression (25). We apply Lemma 3.1(a) and, with the same notation therein,

Z_t = a(t) Z_0 + t Z̄_t,   β(t) = a(t) β(0) + t β̄_t,   t ∈ [0, 1].    (27)

On the event {Z_0 = β(0)x}, the condition Z_t ≤ β(t)u becomes Z̄_t ≤ β̄_t u + a(t) β(0)(u − x)/t for all t ∈ [ε, 1], and note that

sup { |a(t) β(0)| / t : t ∈ [ε, 1] } < ∞.
We prove that, as x ↑ u,

E( ṽ_{v,x} 1I_{B(u,x)} ) → E( ṽ_{v,u} 1I_{B(u,u)} ).    (28)

We have

| E( ṽ_{v,x} 1I_{B(u,x)} ) − E( ṽ_{v,u} 1I_{B(u,u)} ) | ≤ E( | ṽ_{v,x} − ṽ_{v,u} | ) + | E( ṽ_{v,u} ( 1I_{B(u,x)} − 1I_{B(u,u)} ) ) |.    (29)

From the definition of ṽ_{v,x} it is immediate that the first term tends to 0 as x → u. For the second term it suffices to prove that

P( B(u, x) Δ B(u, u) ) → 0 as x → u.    (30)
The first term is equal to zero because of Proposition 2.4. The second term decreases to zero as ε ↓ 0, since {M_{[ε,1]} ≤ 0, M_{[0,ε]} > 0} decreases to the empty set.
It is easy to prove that the function (u, v) → E( ṽ_{v,u} 1I_{A_u(Z̄, β̄)} ) is continuous. The only difficulty comes from the indicator function 1I_{A_u(Z̄, β̄)}, although again the fact that the distribution function of the maximum of the process Z̄(·) − β̄(·)u has no atoms implies the continuity in u in much the same way as above.
So, the first term in the right-hand member of (25) has the continuous limit:

1I_{{β(0)>0}} β(0) E( ṽ_{v,u} 1I_{A_u(Z̄, β̄)} ) p_{Z_0}( β(0) u ).

With minor changes, we obtain for the second term the limit:

1I_{{β(1)>0}} β(1) E( ṽ_{v,u} 1I_{A_u(Z̃, β̃)} ) p_{Z_1}( β(1) u ),

where Z̄, Z̃, β̄, β̃ are as in Lemma 3.1 and ṽ_{v,u} is as in the statement of Lemma 3.3.
The third term can be treated in a similar way. The only difference is that the regression must be performed on the pair (Z_t, Z′_t) for each t ∈ [0, 1], applying again Lemma 3.1 (a), (b), (c). The passage to the limit presents no further difficulties, even if the integrand depends on h.
Finally, note that conditionally on Z_t = β(t)u, Z′_t = β′(t)u, one has

Z_t − β(t)u = Ẑ^t_t − β̂^t(t)u

and
so that the third term has the limit

− ∫_0^1 β(t) 1I_{{β(t)>0}} E( ṽ^t_{v,u} ( Ẑ^t_t − β̂^t(t)u ) 1I_{A_u(Ẑ^t, β̂^t)} ) p_{Z_t, Z′_t}( β(t)u, β′(t)u ) dt,

which proves the representation (15). Next consider the regularized processes Z^ε and functions β^ε, and set

F^ε_v(u) = E( v_ε 1I_{A_u(Z^ε, β^ε)} ),

where v_ε = G( Z^ε_{t_1} − β^ε(t_1)v, …, Z^ε_{t_m} − β^ε(t_m)v ) is continuously differentiable and its derivative verifies (15) with the obvious changes, that is:

(F^ε_v)′(u) = β^ε(0) E( ṽ^ε_{v,u} 1I_{A_u(Z̄^ε, β̄^ε)} ) p_{Z^ε_0}( β^ε(0)u )
 + β^ε(1) E( ṽ^ε_{v,u} 1I_{A_u(Z̃^ε, β̃^ε)} ) p_{Z^ε_1}( β^ε(1)u )
 − ∫_0^1 β^ε(t) E( ṽ^{ε,t}_{v,u} ( (Ẑ^ε)^t_t − (β̂^ε)^t(t)u ) 1I_{A_u((Ẑ^ε)^t, (β̂^ε)^t)} ) p_{Z^ε_t, (Z^ε)′_t}( β^ε(t)u, (β^ε)′(t)u ) dt.    (31)
Let ε ↓ 0. We prove next that (F^ε_v)′(u) converges, for fixed (u, v), to a limit function F̃_v(u) that is continuous in (u, v). On the other hand, it is easy to see that for fixed (u, v), F^ε_v(u) → F_v(u). Also, from (31) it is clear that for each v there exists ε_0 > 0 such that if ε ∈ (0, ε_0), (F^ε_v)′(u) is bounded by a fixed constant when u varies in a bounded set, because of the hypothesis on the functions G and β and the non-degeneracy of the one- and two-dimensional distributions of the process Z.
So it follows that F′_v(u) = F̃_v(u), and the same computation implies that F′_v(u) satisfies (15).
Let us show how to proceed with the first term in the right-hand member of (31); the remaining terms are similar.
Clearly, almost surely, as ε → 0 one has Z^ε_t → Z_t, (Z^ε)′_t → Z′_t, (Z^ε)″_t → Z″_t uniformly for t ∈ [0, 1], so that the definition of Z̄ in (11) implies that (Z̄^ε)_t → Z̄_t uniformly for t ∈ [0, 1], since the regression coefficient a^ε(t) converges to a(t) uniformly for t ∈ [0, 1] (with the obvious notation).
Similarly, for fixed (u, v):

(β̄^ε)_t → β̄_t,   ṽ^ε_{v,u} → ṽ_{v,u},

uniformly for t ∈ [0, 1].
Let us prove that

E( ṽ^ε_{v,u} 1I_{A_u(Z̄^ε, β̄^ε)} ) → E( ṽ_{v,u} 1I_{A_u(Z̄, β̄)} ).

This is implied by

P( A_u(Z̄^ε, β̄^ε) Δ A_u(Z̄, β̄) ) → 0   as ε → 0.    (32)

Denote, for γ > 0, ε ≥ 0,

C_{u,ε} = A_u( Z̄^ε, β̄^ε ),

so that the claim reads P( C_{u,ε} Δ C_{u,0} ) → 0.
(32)
sup
uK,t[0,1]
and
Fu, = sup
t[0,1]
Zt (t)u .
< .
, ,
Eu, \ Cu,
D c, Fu, ,
91
sup
t[0,1]
Zt (t)u = 0 = 0.
) = P Au Z ,
| sup
t[0,1]
P Au Z ,
Zt (t).u || h |
E( Ȳ′_{t_1} 1I_{A(Y)} | Y_{t_1} = 0 ) p_{Y_{t_1}}(0).    (33)
This expression is exactly the expression in (9) with the indicated notational changes, after using the fact that the process is Gaussian, via the regression on the conditioning in each term. Note that, according to the definition of the Y-process,
E( 1I_{A(Y)} ) = E( 1I_{A_u(X^δ, ·)} ).
The right-hand side of (33) can be differentiated with respect to u term by term, which yields an expression of the form
F′(u) = p′_{Y_0}(0) + p′_{Y_1}(0) + ∫_0^1 [ … ] dt_1,    (34)
in which each term contains densities of Y_{t_1}, joint densities of (Y_{t_1}, Y′_{t_1}) at (0, 0), conditional expectations of Ȳ′_{t_1} 1I_{A(Y^{t_1})}, and their derivatives.
In this formula, p′_{Y_0}, p′_{Y_1} and p^{(1,0)}_{Y_{t_1},Y′_{t_1}}(0, 0) stand respectively for the derivative of p_{Y_0}(·), the derivative of p_{Y_1}(·) and the derivative with respect to the first variable of p_{Y_{t_1},Y′_{t_1}}(·, ·).
To validate the above formula, note that:
The first two lines are obtained by differentiating with respect to u the densities
p_{Y_0}(0) = p_{X_0}(u),  p_{Y_1}(0) = p_{X_1}(u),  p_{Y_{t_1},Y′_{t_1}}(0, 0) = p_{X_{t_1},X′_{t_1}}(u, 0).
Lines 3 and 4 come from the application of Lemma 3.3 to differentiate E(1I_{A(Y)}). The lemma is applied with Z = X^δ and the obvious identifications.
Similarly, lines 5 and 6 contain the derivative of E(1I_{A(Y)}).
The remaining terms correspond to differentiating the function
E( Ȳ′_{t_1} 1I_{A(Y^{t_1})} ) = E( (X̄′_{t_1} − β′(t_1)u) 1I_{A_u(X^{t_1}, ·)} )
in the integrand of the third term in (33). The first term in line 7 comes from the simple derivative
Iterating this procedure, each term that appears is associated with a process of the form Y^{t_1,…,t_m}, where:
• 1 ≤ m ≤ k;
• t_1, …, t_m either belong to [0, 1] or take one of two special boundary values;
• s_1, …, s_p, 0 ≤ p ≤ m, are the elements in {t_1, …, t_m} that belong to [0, 1] (that is, which are not boundary values). When p = 0, no integral sign is present;
• Q(s_1, …, s_p) is a polynomial in the variables s_1, …, s_p;
• Π is a product of values of Ȳ^{t_1,…,t_m} at some locations belonging to {s_1, …, s_p};
• K_1(s_1, …, s_p) is a product of values of some ancestors of the auxiliary function associated with Y^{t_1,…,t_m} at some locations belonging to the set {s_1, …, s_p} ∪ {0, 1};
• K_2(s_1, …, s_p) is a sum of products of densities and derivatives of densities of the random variables Z_s at the point 0, or of the pairs (Z_s, Z′_s) at the point (0, 0), where s ∈ {s_1, …, s_p} ∪ {0, 1} and the process Z is some ancestor of Y^{t_1,…,t_m}.
c_t(s) = \frac{r_{01}(s, t)}{r_{11}(t, t)},    \frac{1 − r(s, t)}{(t − s)^2}.
where
L(u) = L_1(u) + L_2(u) + L_3(u),
L_1(u) = P( A_u(X, β) ),  L_2(u) = P( A_u(X̄, β̄) ),
L_3(u) = ∫_0^1 E[ · ] \frac{dt}{(2π r_{11}(t, t))^{1/2}},
the integrand containing the factor \frac{u}{(2π)^{1/2}} \frac{r_{02}(t, t)}{(r_{11}(t, t))^{1/2}}.
Differentiating, one gets a Leibniz-type expression
Σ_{h=2}^{h=k} \binom{k−1}{h−1} φ^{(k−h)}(u) L^{(h−1)}(u),    (37)
so that it suffices to bound the successive derivatives of L; these satisfy bounds of the form
|L^{(j)}(u)| ≤ (const) φ(u/a_j),  j = 1, …, k − 1,    (38)
for some constants a_j > 0. Note that
β(s) = \frac{1 − r(s, 0)}{s}  for 0 < s ≤ 1,  β(0) = 0.
Differentiating L_1 produces a boundary term containing the density p_{X_1}(β(1)u) and an integral term containing, for each t, a conditional expectation restricted to A_u(X^t, β^t) times the joint density p_{X_t,(X̄)′_t} evaluated at the level β(t)u.
Notice that β(1) is non-zero, so that the first term is bounded by a constant times a non-degenerate Gaussian density. Even though β(0) = 0, the second term is also bounded by a constant times a non-degenerate Gaussian density, because the joint distribution of the pair (X_t, (X̄)′_t) is non-degenerate and (β(t), β′(t)) ≠ (0, 0) for every t ∈ [0, 1].
Applying a similar argument to the successive derivatives, we obtain (38) with L_1 instead of L.
The same follows with no changes for L_2(u) = P( A_u(X̄, β̄) ).
For the third term,
L_3(u) = ∫_0^1 E[ · ] \frac{dt}{(2π r_{11}(t, t))^{1/2}},
we proceed similarly, taking into account that the corresponding auxiliary function does not vanish for s ∈ [0, 1]. So (38) follows and we are done.
Remark. Suppose that X satisfies the hypotheses of the Corollary with k ≥ 2. Then it is possible to refine the result as follows.
For j = 1, …, k:
F^{(j)}(u) = (−1)^{j−1} (j − 1)! h_{j−1}(u) φ(u) [ 1 + (2π)^{1/2} u ∫_0^1 ( · ) dt ] + ε_j(u),    (39)
where h_j(u) = (−1)^j \frac{1}{j!} \frac{φ^{(j)}(u)}{φ(u)} is the standard j-th Hermite polynomial (j = 0, 1, 2, …) and
| ε_j(u) | ≤ C_j exp(−δ u²),
where C_1, C_2, … are positive constants and δ > 0 does not depend on j.
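With this normalization one has h_j(u) = He_j(u)/j!, where He_j are the probabilists' Hermite polynomials, since φ^{(j)}(u) = (−1)^j He_j(u) φ(u). A minimal sketch (the function name is ours) that generates h_j from the classical three-term recurrence He_{j+1}(u) = u·He_j(u) − j·He_{j−1}(u):

```python
def h(j, u):
    """h_j(u) = He_j(u)/j!, with He_j the probabilists' Hermite
    polynomials: He_0 = 1, He_1 = u, He_{j+1} = u*He_j - j*He_{j-1}."""
    if j == 0:
        return 1.0
    he_prev, he = 1.0, u
    for m in range(1, j):                # build He_j by recurrence
        he_prev, he = he, u * he - m * he_prev
    fact = 1.0
    for m in range(2, j + 1):            # j!
        fact *= m
    return he / fact
```

For instance h_2(u) = (u² − 1)/2 and h_3(u) = (u³ − 3u)/6.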
The proof of (39) consists of a slight modification of the proof of the Corollary.
Note first that, from the above computation of β(s), it follows that: 1) if X̄_0 < 0 then, if u is large enough, X̄_s − β(s)u ≤ 0 for all s ∈ [0, 1]; and 2) if X̄_0 > 0, then X̄_0 − β(0)u > 0, so that
L_1(u) = P( X̄_s − β(s)u ≤ 0 for all s ∈ [0, 1] ) → 1/2  as u → +∞.
Hence
L_1(u) = 1/2 − ∫_u^{+∞} L′_1(v) dv,  with  ∫_u^{+∞} L′_1(v) dv ≤ D_1 exp(−δ_1 u²),
and
L_3(u) = ∫_0^1 E( X̄^t_t − β^t(t)u ) \frac{dt}{(2π r_{11}(t, t))^{1/2}} − ∫_0^1 E( (X̄^t_t − β^t(t)u) 1I_{(A_u(X^t, β^t))^C} ) \frac{dt}{(2π r_{11}(t, t))^{1/2}}.    (40)
For the second integral in (40) one uses the bound involving (2π)^{1/2} u and the infimum over s, t ∈ [0, 1] of the relevant variances. Then:
P( (A_u(X^t, β^t))^C ) ≤ D_3 exp(−δ_3 u²),
where D_3, δ_3 are positive constants, the last inequality being a consequence of the Landau-Shepp-Fernique inequality.
The remainder follows in the same way as the proof of the Corollary.
Acknowledgements. This work has received support from CONICYT-BID-Uruguay, grant 91/94, and from the ECOS program U97E02.
References
1. Adler, R.J.: An Introduction to Continuity, Extrema and Related Topics for General Gaussian Processes, IMS, Hayward, CA (1990)
2. Azaïs, J-M., Wschebor, M.: Régularité de la loi du maximum de processus gaussiens réguliers, C. R. Acad. Sci. Paris, t. 328, série I, 333-336 (1999)
3. Belyaev, Yu.: On the number of intersections of a level by a Gaussian stochastic process, Theory Probab. Appl., 11, 106-113 (1966)
4. Berman, S.M.: Sojourns and Extremes of Stochastic Processes, Wadsworth and Brooks, Probability Series (1992)
5. Bulinskaya, E.V.: On the mean number of crossings of a level by a stationary Gaussian stochastic process, Theory Probab. Appl., 6, 435-438 (1961)
6. Cierco, C.: Problèmes statistiques liés à la détection et à la localisation d'un gène à effet quantitatif, PhD dissertation, University of Toulouse, France (1996)
7. Cramér, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes, J. Wiley & Sons, New York (1967)
8. Diebolt, J., Posse, C.: On the Density of the Maximum of Smooth Gaussian Processes, Ann. Probab., 24, 1104-1129 (1996)
9. Fernique, X.: Régularité des trajectoires des fonctions aléatoires gaussiennes, École d'Été de Probabilités de Saint-Flour, Lecture Notes in Mathematics, 480, Springer-Verlag, New York (1974)
10. Landau, H.J., Shepp, L.A.: On the supremum of a Gaussian process, Sankhyā Ser. A, 32, 369-378 (1971)
11. Leadbetter, M.R., Lindgren, G., Rootzén, H.: Extremes and Related Properties of Random Sequences and Processes, Springer-Verlag, New York (1983)
12. Lifshits, M.A.: Gaussian Random Functions, Kluwer, The Netherlands (1995)
13. Marcus, M.B.: Level Crossings of a Stochastic Process with Absolutely Continuous Sample Paths, Ann. Probab., 5, 52-71 (1977)
14. Nualart, D., Vives, J.: Continuité absolue de la loi du maximum d'un processus continu, C. R. Acad. Sci. Paris, 307, 349-354 (1988)
15. Nualart, D., Wschebor, M.: Intégration par parties dans l'espace de Wiener et approximation du temps local, Probab. Theory Relat. Fields, 90, 83-109 (1991)
16. Piterbarg, V.I.: Asymptotic Methods in the Theory of Gaussian Processes and Fields, American Mathematical Society, Providence, Rhode Island (1996)
17. Rice, S.O.: Mathematical Analysis of Random Noise, Bell System Technical J., 23, 282-332; 24, 45-156 (1944-1945)
18. Tsirelson, V.S.: The Density of the Maximum of a Gaussian Process, Theory Probab. Appl., 20, 817-856 (1975)
19. Weber, M.: Sur la densité du maximum d'un processus gaussien, J. Math. Kyoto Univ., 25, 515-521 (1985)
20. Wschebor, M.: Surfaces aléatoires. Mesure géométrique des ensembles de niveau, Lecture Notes in Mathematics, 1147, Springer-Verlag (1985)
21. Ylvisaker, D.: A Note on the Absence of Tangencies in Gaussian Sample Paths, Ann. Math. Stat., 39, 261-262 (1968)
Abstract
This paper deals with the problem of obtaining methods to compute the distribution of the maximum of a one-parameter stochastic process on a fixed interval, mainly in the Gaussian case. The main point is the relationship between the values of the maximum and the crossings of the paths, via the so-called Rice formulae for the factorial moments of crossings.
We prove that for some general classes of Gaussian processes the so-called Rice series is convergent and can be used to compute the distribution of the maximum. It turns out that the formulae are well adapted to the numerical computation of this distribution and become more efficient than other numerical methods, namely simulation of the paths or standard bounds on the tails of the distribution.
We have included some relevant numerical examples to illustrate the power
of the method.
Introduction
Let X = {X_t : t ∈ IR} be a stochastic process with real values and continuous paths, defined on a probability space (Ω, A, P), and M_T := max{X_t : t ∈ [0, T]}.
The computation of the distribution function of the random variable M_T,
F(T, u) := P(M_T ≤ u),  u ∈ IR,
by means of a closed formula based upon natural parameters of the process X is
known only for a very restricted number of stochastic processes (and trivial functions of them): the Brownian Motion {W_t : t ≥ 0}; the Brownian Bridge B_t := W_t − tW_1 (0 ≤ t ≤ 1); B_t − ∫_0^1 B_s ds (Darling, 1983); the Brownian Motion with a linear drift (Shepp, 1979); ∫_0^t W_s ds + yt (McKean, 1963; Goldman, 1971; Lachal, 1991); and the stationary Gaussian processes with covariance equal to:
1. r(t) = e^{−|t|} (Ornstein-Uhlenbeck process; DeLong, 1981),
2. r(t) = (1 − |t|)^+, T a positive integer (Slepian process; Slepian, 1961; Shepp, 1971),
3. r(t) even, periodic with period 2, r(t) = 1 − α|t| for 0 ≤ |t| ≤ 1, 0 < α ≤ 2 (Shepp and Slepian, 1976),
4. r(t) = 1 − γ|t|/(1 − γ), |t| < (1 − γ)/γ, 0 < γ ≤ 1/2, T = (1 − γ)/γ (Cressie, 1980),
5. r(t) = cos t.
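For the Brownian Motion the closed formula comes from the reflection principle: P(max_{[0,T]} W_t > u) = 2 P(W_T > u) for u ≥ 0, so F(T, u) = 2Φ(u/√T) − 1. A minimal sketch (function names are ours) that also checks the formula against a crude random-walk simulation:

```python
import math, random

def F_brownian_max(T, u):
    """F(T, u) = P(max_{[0,T]} W_t <= u) = 2*Phi(u/sqrt(T)) - 1 for u >= 0,
    by the reflection principle."""
    phi = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0 * T)))  # Phi(u/sqrt(T))
    return 2.0 * phi - 1.0

def F_brownian_max_mc(T, u, n_steps=500, n_paths=5000, seed=0):
    """Monte-Carlo check: simulate W on a grid and count the paths whose
    discrete maximum stays below u (slightly biased upward, since the
    discrete maximum underestimates the true one)."""
    rng = random.Random(seed)
    sd = math.sqrt(T / n_steps)
    count = 0
    for _ in range(n_paths):
        w, m = 0.0, 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, sd)
            if w > m:
                m = w
        if m <= u:
            count += 1
    return count / n_paths
```

For T = 1, u = 1 the exact value is 2Φ(1) − 1 ≈ 0.6827, and the simulated value lands slightly above it because of the discretization bias discussed later in the paper.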
Given the interest in F(T, u) for a large diversity of theoretical and technical purposes, an extensive literature has been developed, of which we give a sample of references pointing in various directions:
1. Obtaining inequalities for F (T, u) : Slepian (1962); Landau & Shepp (1970);
Marcus & Shepp (1972); Fernique (1974); Borell (1975); Ledoux (1996); Talagrand (1996) and references therein. A general review of a certain number of
classical results is in Adler (1990, 2000).
2. Describing the behaviour of F(T, u) under various asymptotics: Qualls and Watanabe (1973); Piterbarg (1981, 1996); Leadbetter, Lindgren and Rootzén (1983); Berman (1985a, b, 1992); Talagrand (1988); Berman & Kono (1992); Sun (1993); Wschebor (2000); Azaïs, Bardet and Wschebor (2000).
called the Davies bound (1977) or, more accurately, the first term in the Rice series, to obtain approximations for F(T, u). But as T increases, for moderate values of u the Davies bound is far from the true value and one requires the computation of the successive terms. The numerical results are shown in the case of four Gaussian stationary processes for which no closed formula is known.
An asymptotic approximation of F(T, u) as u → +∞ was recently obtained by Azaïs, Bardet and Wschebor (2000). It extends to any T a previous result by Piterbarg (1981) for sufficiently small T.
One of the key points in the computation is the numerical approximation of the
factorial moments of upcrossings by means of Rice integral formulae. For that purpose, the main difficulty is the precise description of the behaviour of the integrands
appearing in these formulae near the diagonal, which is again an old subject that is interesting on its own - see Belyaev (1966), Cuzick (1975) - and remains widely open.
We have included in the Section "Computation of Moments" some new results that give partial answers and help to improve the numerical methods.
The extension to processes with non-smooth trajectories can be done by smoothing the paths by means of a deterministic device, applying the previous methods
to the regularized process and estimating the error as a function of the smoothing
width. We have not included this type of result here since, for the time being, they do not appear to be of practical use.
The Note (Azaïs & Wschebor 1997) contains a part of the results of the present paper, without proofs.
Notations
Let f : I → IR be a function defined on the interval I of the real numbers. Then
C_u(f; I) := {t ∈ I : f(t) = u},   N_u(f; I) := #( C_u(f; I) )
denote respectively the set of roots of the equation f(t) = u on the interval I and the number of these roots, with the convention N_u(f; I) = +∞ if the set C_u is infinite.
N_u(f; I) is called the number of crossings of f with the level u on the interval I. In what follows, I will be the interval [0, T] if it is not stated otherwise.
In the same way, if f is a differentiable function, the number of upcrossings of f is defined by means of
U_u(f; I) := #({t ∈ I : f(t) = u, f′(t) > 0}).
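On a discrete grid these definitions translate into sign-change counting. The sketch below (names and grid size are our choices) approximates N_u and U_u for a smooth f by detecting level crossings between consecutive grid points:

```python
import math

def crossings_and_upcrossings(f, u, a, b, n=10000):
    """Approximate N_u(f; [a,b]) and U_u(f; [a,b]) by sign changes of
    f - u between consecutive points of a uniform grid."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    vals = [f(t) - u for t in ts]
    n_cross = up_cross = 0
    for v0, v1 in zip(vals, vals[1:]):
        if v0 == 0.0 or v0 * v1 >= 0.0:
            continue                      # no sign change on this cell
        n_cross += 1
        if v0 < 0.0 < v1:
            up_cross += 1                 # f passes u from below: upcrossing
    return n_cross, up_cross
```

For f = cos on [0, 2π] and u = 0, the two zeros are at π/2 (a downcrossing) and 3π/2 (an upcrossing), so the function returns (2, 1).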
ν_m = ∫_{[0,T]^m} E( |X′_{t_1} … X′_{t_m}| | X_{t_1} = … = X_{t_m} = u ) p_{X_{t_1},…,X_{t_m}}(u, …, u) dt_1 … dt_m,    (1)
ν̃_m = ∫_{[0,T]^m} dt_1 … dt_m ∫_{[0,+∞)^m} x_1 … x_m p_{X_{t_1},…,X_{t_m},X′_{t_1},…,X′_{t_m}}(u, …, u, x_1, …, x_m) dx_1 … dx_m.    (2)
(References for conditions for these formulae to hold true, sufficient for our present purposes, and also for proofs, can be found, for example, in Marcus (1977) and in Wschebor (1985).)
This section contains two main results. The first is Theorem 2.1, which requires the process to have C^∞ paths and contains a general condition enabling one to compute F(T, u) as the sum of a series. The second is Theorem 2.2, which illustrates the same situation for Gaussian stationary processes, starting from conditions on the covariance. As for Theorem 2.3, it contains upper and lower bounds on F(T, u) for processes with C^k paths verifying some additional conditions.
Theorem 2.1 Assume that a.s. the paths of the stochastic process X are of class C^∞ and that the density p_{X_{T/2}} is bounded by some constant D.
(i) If there exists a sequence of positive numbers {c_k}_{k=1,2,…} such that
γ_k := P( ‖X^{(2k−1)}‖_∞ ≥ c_k T^{−(2k−1)} ) + \frac{2 D c_k}{2^{2k−1}(2k−1)!} = o(2^{−k})  (k → ∞),    (3)
then:
P(M > u) = P(X_0 > u) + Σ_{m=1}^{∞} (−1)^{m+1} \frac{ν̃_m}{m!}.    (4)
(ii) In formula (4), the error when one replaces the infinite sum by its m_0-th partial sum is bounded by γ*_{m_0+1}, where
γ*_m := sup_{k≥m} 2^{k+1} γ_k.
We will call the series in the right-hand term of (4) the Rice Series.
For the proof we will assume, with no loss of generality, that T = 1.
We start with the following lemma on the Cauchy remainder for polynomial interpolation (Davis 1975, Th. 3.1.1).
Lemma 2.1 a) Let I be an interval in the real line, f : I → IR a function of class C^k, k a positive integer, t_1, …, t_k k points in I, and let P(t) be the - unique - interpolation polynomial of degree k − 1 such that f(t_i) = P(t_i) for i = 1, …, k, taking into account possible multiplicities.
Then, for t ∈ I:
f(t) − P(t) = \frac{1}{k!} (t − t_1) … (t − t_k) f^{(k)}(ξ),
where ξ ∈ [ min(t_1, …, t_k, t), max(t_1, …, t_k, t) ].
b) In particular, if t_1, …, t_k ∈ [0, 1], then
| f(1/2) − P(1/2) | ≤ \frac{1}{k! 2^k} ‖f^{(k)}‖_∞.
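A quick numerical illustration of the lemma (a sketch; the function f = exp and the interpolation points are arbitrary choices of ours): interpolate at k = 3 points of [0, 1] and check the Cauchy bound at t = 1/2.

```python
import math

def lagrange_eval(xs, ys, t):
    """Evaluate the Lagrange interpolation polynomial through (xs, ys) at t."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        w = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                w *= (t - xj) / (xi - xj)
        total += yi * w
    return total

# f = exp on [0, 1], k = 3 interpolation points
xs = [0.0, 0.4, 1.0]
ys = [math.exp(x) for x in xs]
t = 0.5
err = abs(math.exp(t) - lagrange_eval(xs, ys, t))
# Cauchy bound: (1/k!) |prod (t - t_i)| * max_{[0,1]} |f'''|, with f''' = exp
bound = (1.0 / math.factorial(3)) * abs((t - xs[0]) * (t - xs[1]) * (t - xs[2])) * math.e
```

The error is positive and below the bound, which in turn is below the coarser part-b) bound ‖f‴‖∞/(3!·2³) = e/48.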
The next combinatorial lemma plays the central role in what follows. A proof is
given in Lindgren (1972).
Lemma 2.2 Let ξ be a non-negative integer-valued random variable having finite moments of all orders. Let k, m, M (k ≥ 0, m ≥ 1, M ≥ 1) be integers and denote
p_k := P(ξ = k);  μ_m := E(ξ^{[m]});  S_M := Σ_{m=1}^{M} (−1)^{m+1} \frac{μ_m}{m!},
where ξ^{[m]} := ξ(ξ − 1) … (ξ − m + 1) denotes the m-th factorial power.
Then:
(i) For each M:
S_{2M} ≤ Σ_{k=1}^{∞} p_k ≤ S_{2M+1}.    (5)
(ii) The sequence {S_M; M = 1, 2, …} has a finite limit if and only if μ_m/m! → 0 as m → ∞, and in that case:
P(ξ ≥ 1) = Σ_{k=1}^{∞} p_k = Σ_{m=1}^{∞} (−1)^{m+1} \frac{μ_m}{m!}.    (6)
Remark. A by-product of Lemma 2.2 that will be used in the sequel is the following: if in (6) one substitutes the infinite sum by the M-th partial sum, the absolute value μ_{M+1}/((M + 1)!) of the first neglected term is an upper bound for the error in the computation of P(ξ ≥ 1).
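The enveloping property (5) is easy to verify numerically. For a Poisson variable ξ with mean λ the factorial moments are μ_m = λ^m, so the partial sums can be computed in closed form and compared with P(ξ ≥ 1) = 1 − e^{−λ}. A sketch (the Poisson choice is only for convenience):

```python
import math

def partial_sums(mus, M):
    """S_M = sum_{m=1}^{M} (-1)^(m+1) mu_m / m!  (Lemma 2.2 notation;
    mus[m-1] holds the m-th factorial moment)."""
    return sum((-1) ** (m + 1) * mus[m - 1] / math.factorial(m)
               for m in range(1, M + 1))

lam = 1.3
mus = [lam ** m for m in range(1, 12)]       # mu_m = lambda^m for Poisson
p_ge_1 = 1.0 - math.exp(-lam)                # P(xi >= 1)

# Even partial sums bound from below, odd ones from above:
for M in (1, 2, 3, 4):
    assert partial_sums(mus, 2 * M) <= p_ge_1 <= partial_sums(mus, 2 * M + 1)
```

The successive sums squeeze the true probability from both sides, which is exactly the mechanism exploited by the Rice series bounds R2 and R3 in the numerical section.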
Lemma 2.3 With the same notations as in Lemma 2.2, we have the equality:
E(ξ^{[m]}) = m Σ_{k=m}^{∞} (k − 1)^{[m−1]} P(ξ ≥ k)   (m = 1, 2, …).
Proof. Use the identity k^{[m]} − (k − 1)^{[m]} = m (k − 1)^{[m−1]}, which gives j^{[m]} = m Σ_{k=m}^{j} (k − 1)^{[m−1]}. Then, interchanging the order of summation,
E(ξ^{[m]}) = Σ_{j=m}^{∞} j^{[m]} P(ξ = j) = Σ_{j=m}^{∞} P(ξ = j) m Σ_{k=m}^{j} (k − 1)^{[m−1]} = m Σ_{k=m}^{∞} (k − 1)^{[m−1]} P(ξ ≥ k).
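The identity can be checked numerically against, e.g., a Poisson variable, for which E(ξ^{[m]}) = λ^m (a sketch with truncated sums; the truncation point K is arbitrary):

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def falling(x, m):
    """x^[m] = x(x-1)...(x-m+1)."""
    out = 1
    for i in range(m):
        out *= (x - i)
    return out

lam, m, K = 2.0, 3, 60                       # K truncates the infinite sums
lhs = lam ** m                               # E(xi^[m]) = lambda^m for Poisson
tail = lambda k: 1.0 - sum(poisson_pmf(lam, j) for j in range(k))  # P(xi >= k)
# Right side of Lemma 2.3: m * sum_{k>=m} (k-1)^[m-1] P(xi >= k)
rhs = m * sum(falling(k - 1, m - 1) * tail(k) for k in range(m, K))
```

Both sides evaluate to λ³ = 8 up to the (negligible) truncation error.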
Lemma 2.4 Suppose that a.s. the paths of the process X belong to C^∞ and that p_{X_{1/2}} is bounded by the constant D. Then for any sequence {c_k, k = 1, 2, …} of positive numbers, one has
E( (U_u)^{[m]} ) ≤ m Σ_{k=m}^{∞} (k − 1)^{[m−1]} [ P( ‖X^{(2k−1)}‖_∞ ≥ c_k ) + \frac{2 D c_k}{2^{2k−1}(2k−1)!} ].    (7)
Proof. By Lemma 2.3 applied to ξ = U_u, it suffices to bound P(U_u ≥ k). Clearly
P(U_u ≥ k) ≤ P( ‖X^{(2k−1)}‖_∞ ≥ c_k ) + P( U_u ≥ k, ‖X^{(2k−1)}‖_∞ < c_k ).
On the event {U_u ≥ k, ‖X^{(2k−1)}‖_∞ < c_k}, the function X − u has at least 2k − 1 zeros in [0, 1] (the k upcrossings, plus at least k − 1 downcrossings between them), so that Lemma 2.1 b), applied with the constant interpolation polynomial P ≡ u, gives
{U_u ≥ k, ‖X^{(2k−1)}‖_∞ < c_k} ⊂ { |X_{1/2} − u| ≤ \frac{c_k}{2^{2k−1}(2k−1)!} },
and the probability of the latter event is at most 2Dc_k/(2^{2k−1}(2k−1)!), since the density of X_{1/2} is bounded by D. This proves (7).
Proof of Theorem 2.1. Plugging the definition of γ_k into (7) gives
\frac{ν̃_m}{m!} ≤ \frac{m}{m!} Σ_{k=m}^{∞} (k − 1)^{[m−1]} γ_k ≤ γ*_m \frac{m}{m!} Σ_{k=m}^{∞} (k − 1)^{[m−1]} 2^{−(k+1)},
and the last sum can be computed from Σ_{j} j^{[m−1]} x^j = x^{m−1}(m − 1)!/(1 − x)^m, that is, the (m − 1)-st derivative of 1/(1 − x), evaluated at x = 1/2. Hence ν̃_m/m! ≤ (const) γ*_m → 0, and the result follows from ν̃_m/m! → 0 and Lemma 2.2 (ii).
Remarks
One can replace the condition p_{X_{T/2}}(x) ≤ D for all x by the same bound for x in some neighbourhood of u; in this case, the statement of Theorem 2.1 holds if one adds the corresponding localization in (3).
To bound the tail in (3), one can use
P( ‖X^{(2k−1)}‖_∞ ≥ c_k ) ≤ P( |X_0^{(2k−1)}| ≥ c_k ) + 2 P( U_{c_k}(X^{(2k−1)}, I) ≥ 1 ),
and bound the last probability by the expected number of crossings, which, for a stationary Gaussian process, is expressed through the spectral moments λ_{4k−2} and λ_{4k}:
P( ‖X^{(2k−1)}‖_∞ ≥ c_k ) ≤ 2 Φ̄( c_k / λ_{4k−2}^{1/2} ) + \frac{1}{π} ( \frac{λ_{4k}}{λ_{4k−2}} )^{1/2} exp( − \frac{c_k^2}{2 λ_{4k−2}} ).
Choose
c_k := (B_1 k λ_{4k−2})^{1/2}  if  λ_{4k}/λ_{4k−2} ≤ B_1 k,
c_k := (λ_{4k})^{1/2}  if  λ_{4k}/λ_{4k−2} > B_1 k.
After some computation, this choice yields
γ_k ≤ (1 + 2(C_1 + 1)k)^{1/2} 2^{−2k}   (k = 1, 2, …),    (9)
and hence
γ*_m ≤ 8 (1 + 2(C_1 + 1)m)^{1/2} 2^{−m}   (m = 1, 2, …).    (10)
Remarks
a) If one is willing to use Rice formulae to compute the factorial moments ν_m, it is enough to verify that the distribution of
(X_{t_1}, …, X_{t_k}, X′_{t_1}, …, X′_{t_k})
is non-degenerate for any choice of k = 1, 2, … and (t_1, …, t_k) ∈ I^k \ D_k(I). For Gaussian stationary processes, a sufficient condition for non-degeneracy is that the spectral measure not be purely atomic (see Cramér and Leadbetter (1967) for a proof). The same kind of argument shows that the conclusion remains valid if the spectral measure is purely atomic and the set of its atoms has an accumulation point in IR. Sufficient conditions for the finiteness of ν_m are also given in Nualart & Wschebor (Lemma 1.2, 1991).
b) If, instead of requiring the paths of the process X to be of class C^∞, one relaxes this condition to a certain order of differentiability, one can still get upper and lower bounds for P(M > u).
Theorem 2.3 Let X = {X_t : t ∈ I} be a real-valued stochastic process. Suppose that p_{X_t}(x) is bounded for t ∈ I, x ∈ IR, and that the paths of X are of class C^{p+1}. Then
P(M > u) ≤ P(X_0 > u) + Σ_{m=1}^{2K+1} (−1)^{m+1} \frac{ν̃_m}{m!}   if 2K + 1 ≤ p + 1,
and
P(M > u) ≥ P(X_0 > u) + Σ_{m=1}^{2K} (−1)^{m+1} \frac{ν̃_m}{m!}   if 2K ≤ p + 1.
Note that all the moments in the above formulae are finite.
The proof is a straightforward application of Lemma 2.2 and Lemma 1.2 in Nualart & Wschebor (1991).
When the level u is high, the results by Piterbarg (1981, 1996), which were until recently the sharpest known asymptotic bounds for the tail of the distribution of the maximum, state that
0 ≤ 1 − Φ(u) + E( U_u([0, T]) ) − P(M > u) ≤ (const) e^{−u²(1+δ)/2}    (11)
for some δ > 0.
Computation of Moments
An efficient numerical computation of the factorial moments of crossings requires a fine description of the behaviour, as the k-tuple (t_1, …, t_k) approaches the diagonal D_k(I), of the integrands
A_{t_1,…,t_k}(u, …, u)  and  A^+_{t_1,…,t_k}(u, …, u)
that appear respectively in the Rice formulae for the k-th factorial moment of upcrossings and for the k-th factorial moment of upcrossings with the additional condition X_0 ≤ u (see formula (2)).
For example, in Azaïs, Cierco and Croquette (1999) it is proved that if X is Gaussian, stationary, centered and λ_8 < ∞, then the integrand A^+_{s,t}(u, u) satisfies the equivalence
A^+_{s,t}(u, u) ≈ J (t − s)^4  as t − s → 0,    (12)
where the constant J depends on u and on the spectral moments λ_2, λ_4, λ_6: it is proportional to λ_2λ_6 − λ_4², involves (λ_4 − λ_2²)^{1/2}, and contains the exponential factor exp( − λ_4 u² / (2(λ_4 − λ_2²)) ).
(12) can be extended to non-stationary Gaussian processes, obtaining an equivalence of the form:
A^+_{s,t}(u, u) ≈ J(t*) (t − s)^4  as s, t → t*,    (13)
where J(t*) is a continuous non-zero function of t* depending on u, that can be expressed in terms of the mean and covariance functions of the process and its derivatives. We give a proof of an equivalence of the form (13) in the next proposition.
One can take advantage of this equivalence to improve the numerical methods to compute ν̃_2 (the second factorial moment of the number of upcrossings restricted to X_0 ≤ u). Equivalence formulae such as (12) or (13) can be used to avoid numerical degeneracies near the diagonal D_2(I). Note that even if X is stationary to begin with, under conditioning on X_0 the process that must be taken into account in the actual computation of the factorial moments of upcrossings appearing in the Rice series (4) will be non-stationary, so that equivalence (13) is the appropriate tool.
Proposition 3.1 Suppose that X is a Gaussian process with C^5 paths and that, for each t ∈ I, the joint distribution of (X_t, X′_t, X_t^{(2)}, X_t^{(3)}) does not degenerate. Then (13) holds true.
Proof. Denote by ζ = (ζ_1, ζ_2)^T a two-dimensional random vector having as probability distribution the conditional distribution of (X′_s, X′_t)^T given X_s = X_t = u. One has:
A^+_{s,t}(u, u) = E( ζ_1^+ ζ_2^+ ) p_{X_s,X_t}(u, u).    (14)
Put τ = t − s and check the following Taylor expansions around the point s:
E(ζ_1) = m_1 τ + m_2 τ² + L_1 τ³,    (15)
E(ζ_2) = m_1 τ + m_2 τ² + L_2 τ³,    (16)
Var(ζ) = ( a τ² + b τ³ + c τ⁴ + ρ_{11} τ⁵ ,  a τ² + \frac{b+b′}{2} τ³ + d τ⁴ + ρ_{12} τ⁵ ;
          a τ² + \frac{b+b′}{2} τ³ + d τ⁴ + ρ_{12} τ⁵ ,  a τ² + b′ τ³ + c τ⁴ + ρ_{22} τ⁵ ).    (17)
One has
Var(ζ_1) ≈ \frac{1}{4} \frac{det Var(X_s, X′_s, X_s^{(2)})^T}{det Var(X_s, X′_s)^T} τ²,
where ≈ denotes equivalence as τ → 0. So,
a = \frac{1}{4} \frac{det Var(X_s, X′_s, X_s^{(2)})^T}{det Var(X_s, X′_s)^T},    (18)
which is a continuous non-vanishing function for s ∈ I. Note that the coefficient of τ³ in the Taylor expansion of Cov(ζ_1, ζ_2) is equal to \frac{b+b′}{2}. This follows either by direct computation or by noting that det Var(ζ) is a symmetric function of the pair s, t.
Put
Δ(s, t) = det Var(ζ).
The behaviour of Δ(s, t) as s, t → t* can be obtained by noting that
Δ(s, t) = \frac{det Var(X_s, X_t, X′_s, X′_t)^T}{det Var(X_s, X_t)^T}
and applying Lemma 3.2 in Azaïs and Wschebor (2000) or Lemma 4.3, p. 76 in Piterbarg (1996), which provide an equivalent for the numerator, so that:
Δ(s, t) ≈ Δ*(t*) τ⁶,    (19)
with Δ*(t*) expressed by means of det Var(X_{t*}, X′_{t*}, X_{t*}^{(2)}, X_{t*}^{(3)})^T. The non-degeneracy hypothesis implies that Δ*(t*) is continuous and non-zero.
Then:
E( ζ_1^+ ζ_2^+ ) = \frac{1}{2π [Δ(s,t)]^{1/2}} ∫_0^{+∞} ∫_0^{+∞} x y exp( − \frac{F(x, y)}{2 Δ(s, t)} ) dx dy,    (20)
where
F(x, y) = Var(ζ_2)(x − E(ζ_1))² + Var(ζ_1)(y − E(ζ_2))² − 2 Cov(ζ_1, ζ_2)(x − E(ζ_1))(y − E(ζ_2)).
Substituting the expansions (15), (16), (17) in the integrand of (20) and making the change of variables x = τ² v, y = τ² w, we get, as s, t → t*:
E( ζ_1^+ ζ_2^+ ) ≈ \frac{τ⁵}{2π [Δ*(t*)]^{1/2}} ∫_0^{+∞} ∫_0^{+∞} v w exp( − \frac{F̄(v, w)}{2 Δ*(t*)} ) dv dw,    (21)
where F̄(v, w) is the limit quadratic form, whose coefficients involve a, \frac{b−b′}{2} and the m_i. Together with (14), this yields (13).
When computing the integral of A^+_{t_1,…,t_k}(u) over I^k, instead of choosing the point (t_1, t_2, …, t_k) at random in the cube I^k with a uniform distribution, we do it with a probability law that has a density proportional to the function Π_{1≤i<j≤k} (t_j − t_i)^4. For its proof we will use the following auxiliary proposition, which has its own interest and extends (19) to any k.
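This importance law can be sampled by simple rejection, since Π_{i<j}(t_j − t_i)^4 ≤ 1 on the unit cube. A sketch for k = 2 (function names are ours; the acceptance bound 1 is valid only on [0, 1]²):

```python
import random

def sample_pair(rng):
    """Rejection sampling of (t1, t2) in [0,1]^2 with density
    proportional to (t2 - t1)^4, the k = 2 importance law."""
    while True:
        t1, t2 = rng.random(), rng.random()
        if rng.random() < (t2 - t1) ** 4:   # acceptance ratio; bound = 1
            return t1, t2

rng = random.Random(42)
gaps = [abs(t2 - t1) for t1, t2 in (sample_pair(rng) for _ in range(4000))]
mean_gap = sum(gaps) / len(gaps)
# Under the uniform law E|t2 - t1| = 1/3; the weight (t2 - t1)^4 pushes
# the points apart, away from the diagonal D_2(I): here E|t2 - t1| = 5/7.
```

The mean gap concentrates near 5/7 ≈ 0.714 instead of 1/3, showing how the sampled points avoid the near-diagonal region where the integrand degenerates.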
Proposition 3.2 Suppose that X is a Gaussian process with C^{2k−1} paths satisfying the corresponding non-degeneracy condition. If t_1, t_2, …, t_k → t*, then
det Var( X_{t_1}, …, X_{t_k}, X′_{t_1}, …, X′_{t_k} )^T ≈ \frac{1}{[2! … (2k−1)!]²} Π_{1≤i<j≤k} (t_j − t_i)^8 · D_{2k−1}(t*),    (22)
where D_m(t) := det Var( X_t, X′_t, …, X_t^{(m)} )^T.
Proof. With no loss of generality, we consider only k-tuples (t_1, t_2, …, t_k) such that t_i ≠ t_j if i ≠ j.
Suppose f : I → IR is a function of class C^{2m}, m ≥ 1, and t_1, t_2, …, t_m are pairwise different points in I. We use the following notations for interpolating polynomials:
P_m(t; f) is the polynomial of degree 2m − 1 such that P_m(t_j; f) = f(t_j) and P′_m(t_j; f) = f′(t_j) for j = 1, …, m.
Q_m(t; f) is the polynomial of degree 2m − 2 such that Q_m(t_j; f) = f(t_j) for j = 1, …, m; Q′_m(t_j; f) = f′(t_j) for j = 1, …, m − 1.
From Lemma 2.1 we know that
f(t) − P_m(t; f) = \frac{1}{(2m)!} (t − t_1)² … (t − t_m)² f^{(2m)}(ξ),    (23)
f(t) − Q_m(t; f) = \frac{1}{(2m−1)!} (t − t_1)² … (t − t_{m−1})² (t − t_m) f^{(2m−1)}(ξ̄),    (24)
where ξ = ξ(t_1, …, t_m, t), ξ̄ = ξ̄(t_1, …, t_m, t) and
ξ, ξ̄ ∈ [ min(t_1, …, t_m, t), max(t_1, …, t_m, t) ].
Note that the function g(t) := f^{(2m−1)}(ξ̄(t_1, …, t_m, t)) is continuous and that, letting t → t_m in (24),
f′(t_m) − Q′_m(t_m; f) = \frac{1}{(2m−1)!} (t_m − t_1)² … (t_m − t_{m−1})² f^{(2m−1)}(ξ̄(t_1, …, t_m, t_m)).    (25)
Put
ξ_m = ξ(t_1, …, t_m, t_m),  ξ̄_m = ξ̄(t_1, …, t_m, t_m).
Since P_m(t; f) is a linear functional of
(f(t_1), …, f(t_m), f′(t_1), …, f′(t_m))
and Q_m(t; f) is a linear functional of
(f(t_1), …, f(t_m), f′(t_1), …, f′(t_{m−1})),
with coefficients depending (in both cases) only on t_1, t_2, …, t_m, t, it follows that
det Var( X_{t_1}, …, X_{t_k}, X′_{t_1}, …, X′_{t_k} )^T = det Var( X_{t_1}, X′_{t_1}, X_{t_2} − P_1(t_2; X), X′_{t_2} − Q_2(t_2; X), … )^T,
and replacing each difference by the corresponding remainder (23), (24), (25) yields the equivalent
\frac{1}{[2! … (2k−1)!]²} Π_{1≤i<j≤k} (t_j − t_i)^8 · D_{2k−1}(t*),
with D_m(t) := det Var( X_t, X′_t, X_t^{(2)}, …, X_t^{(m)} )^T.
Proposition 3.3 Suppose that X is a centered Gaussian process with C^{2k−1} paths and that, for each set of pairwise distinct values of the parameter t_1, t_2, …, t_k ∈ I, the joint distribution of (X_{t_h}, X′_{t_h}, …, X_{t_h}^{(2k−1)}, h = 1, 2, …, k) is non-degenerate. Then, as t_1, t_2, …, t_k → t*:
A^+_{t_1,…,t_k}(0, …, 0) ≈ J_k(t*) Π_{1≤i<j≤k} (t_j − t_i)^4.    (26)
Proof. The same interpolation argument as in Proposition 3.2, applied without the derivatives, gives
det Var( X_{t_1}, …, X_{t_k} )^T ≈ \frac{1}{[2! … (k−1)!]²} Π_{1≤i<j≤k} (t_j − t_i)² · D_{k−1}(t*).
For pairwise different values t_1, t_2, …, t_k, let Z = (Z_1, …, Z_k)^T be a random vector having the conditional distribution of (X′_{t_1}, …, X′_{t_k})^T given X_{t_1} = X_{t_2} = … = X_{t_k} = 0. The (Gaussian) distribution of Z is centered and we denote its covariance matrix by Σ. Also put:
Σ^{−1} = \frac{1}{det(Σ)} ( Σ_{ij} )_{i,j=1,…,k},
Σ_{ij} being the cofactor of the position (i, j) in the matrix Σ. Then, one can write:
A^+_{t_1,…,t_k}(0, …, 0) = E( Z_1^+ … Z_k^+ ) · p_{X_{t_1},…,X_{t_k}}(0, …, 0)    (27)
and
E( Z_1^+ … Z_k^+ ) = \frac{1}{(2π)^{k/2} (det(Σ))^{1/2}} ∫_{(IR^+)^k} x_1 … x_k exp( − \frac{F(x_1, …, x_k)}{2 det(Σ)} ) dx_1 … dx_k,    (28)
where
F(x_1, …, x_k) = Σ_{i,j=1}^{k} Σ_{ij} x_i x_j.
In the same way as in Proposition 3.2 one obtains
det(Σ) ≈ \frac{1}{[k! … (2k−1)!]²} Π_{1≤i<j≤k} (t_j − t_i)^6 · \frac{D_{2k−1}(t*)}{D_{k−1}(t*)}.
We consider now the behaviour of the cofactors Σ_{ij} (i, j = 1, …, k). Let us first look at Σ_{11}. Using the same method as above, now applied to the cofactor of the position (1, 1) in Σ, one has:
Σ_{11} ≈ \frac{ \frac{1}{[2! … (2k−2)!]²} Π_{2≤i<j≤k} (t_j − t_i)^8 Π_{2≤h≤k} (t_1 − t_h)^4 }{ \frac{1}{[2! … (k−1)!]²} Π_{1≤i<j≤k} (t_j − t_i)² } · \frac{D_{2k−2}(t*)}{D_{k−1}(t*)}
= \frac{1}{[k! … (2k−2)!]²} Π_{2≤i<j≤k} (t_j − t_i)^6 Π_{2≤h≤k} (t_1 − t_h)² · \frac{D_{2k−2}(t*)}{D_{k−1}(t*)},
and similarly for the other cofactors, exchanging the roles of the indices.
After the change of variables
x_j = Π_{i=1, i≠j}^{k} (t_i − t_j)² · y_j,  j = 1, …, k,
the integral in (28) becomes
∫_{(IR^+)^k} y_1 … y_k exp( − \frac{G(y_1, …, y_k)}{2 det(Σ)} ) dy_1 … dy_k,
where
G(y_1, …, y_k) = Σ_{i,j=1}^{k} Σ_{ij} Π_{h=1,h≠i}^{k} (t_h − t_i)² Π_{h=1,h≠j}^{k} (t_h − t_j)² y_i y_j,
so that, as t_1, t_2, …, t_k → t*,
\frac{G(y_1, …, y_k)}{det(Σ)} → [(2k−1)!]² \frac{D_{2k−2}(t*)}{D_{2k−1}(t*)} ( Σ_{i=1}^{k} y_i )².
Now, passage to the limit under the integral sign in (28), which is easily justified by application of the Lebesgue Theorem, leads to
E( Z_1^+ … Z_k^+ ) ≈ \frac{1}{(2π)^{k/2}} \frac{ Π_{1≤i<j≤k} |t_j − t_i| }{ k! … (2k−1)! } ( \frac{D_{k−1}(t*)}{D_{2k−1}(t*)} )^{1/2} I_k(η*),
where
I_k(η) := ∫_{(IR^+)^k} y_1 … y_k exp( − η ( Σ_{i=1}^{k} y_i )² ) dy_1 … dy_k
and
η* = \frac{1}{2} [(2k−1)!]² \frac{D_{2k−2}(t*)}{D_{2k−1}(t*)}.
Combining this with the equivalent of p_{X_{t_1},…,X_{t_k}}(0, …, 0) gives (26), with a constant J_k(t*) expressed in terms of the factorials 2!, …, (2k−1)!, the integral I_k and the determinants D_{k−1}(t*), D_{2k−2}(t*), D_{2k−1}(t*).
Numerical examples
4.1
First, let us compare the numerical computation based upon Theorem 2.1 with
the Monte-Carlo method based on the simulation of the paths. We do this for
stationary Gaussian processes that satisfy the hypotheses of Theorem 2.2 and also
the non-degeneracy condition that ensures that one is able to compute the factorial
moments of crossings by means of Rice formulae.
Suppose that we want to compute P(M > u) with an error bounded by ε, where ε > 0 is a given positive number.
To proceed by simulation, we discretize the paths by means of a uniform partition
{tj := j/n, j = 0, 1, ..., n}. Denote
M (n) := sup Xtj .
0jn
Using Taylor's formula at the time where the maximum M of X(·) occurs, one gets:
0 ≤ M − M^{(n)} ≤ ‖X″‖_∞ / (8n²),
since the maximizing time lies within 1/(2n) of a grid point and the first derivative vanishes there. It follows that
0 ≤ P(M > u) − P(M^{(n)} > u) = P( M > u, M^{(n)} ≤ u ) ≤ P( u < M ≤ u + ‖X″‖_∞/(8n²) ).
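A simulation sketch of this discretized Monte-Carlo approach for the process with covariance ρ_1(t) = exp(−t²/2) on [0, 6] (grid size, sample count and seed are arbitrary choices; the covariance matrix is factored through an eigendecomposition with clipping, since for smooth covariances it is numerically near-singular and a plain Cholesky factorization can fail):

```python
import numpy as np

def estimate_p_max_gt(u, T, rho, n=60, n_paths=20000, seed=1):
    """Monte-Carlo estimate of P(M^(n) > u) for a stationary centered
    Gaussian process sampled on a uniform grid of n+1 points of [0, T]."""
    rng = np.random.default_rng(seed)
    ts = np.linspace(0.0, T, n + 1)
    cov = rho(np.abs(ts[:, None] - ts[None, :]))
    # eigendecomposition + clipping of tiny negative eigenvalues
    w, v = np.linalg.eigh(cov)
    root = v * np.sqrt(np.clip(w, 0.0, None))   # root @ root.T == cov
    z = rng.standard_normal((n_paths, n + 1))
    paths = z @ root.T                          # rows: sampled grid paths
    return float(np.mean(paths.max(axis=1) > u))

p_hat = estimate_p_max_gt(0.0, 6.0, lambda t: np.exp(-t * t / 2.0))
```

For u = 0, T = 6, the estimate lands close to the value 0.953-0.957 obtained from the Rice series in Section 4.2, up to the (downward) discretization bias and the Monte-Carlo error.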
The bound for γ*_m in Equation (10) implies that computing a partial sum with (const) log(1/ε) terms assures that the tail of the Rice series is bounded by ε. If one computes each ν̃_m by means of a Monte-Carlo method for the multiple integrals appearing in the Rice formulae, then the number of elementary operations for the whole procedure will have the form (const) ε^{−2} log(1/ε). Hence, this is better than simulation as ε tends to zero.
As usual, for given ε > 0, the value of the generic constants decides the comparison between both methods.
More important is the fact that the enveloping property of the Rice series implies that the actual number of terms required by the application of Theorem 2.1 can be much smaller than the one resulting from the a priori bound on γ*_m. More precisely, suppose that the successive factorial moments are computed with an error δ, that is,
| ν̃*_m − ν̃_m | ≤ δ,
and that m_0 is chosen so that the first neglected term satisfies
\frac{ν̃*_{m_0+1}}{(m_0 + 1)!} ≤ δ.    (29)
Then
| Σ_{m=1}^{m_0} (−1)^{m+1} \frac{ν̃*_m}{m!} − Σ_{m=1}^{∞} (−1)^{m+1} \frac{ν̃_m}{m!} | ≤ δ (e + 1).
Putting δ = ε/(e + 1), we get the desired bound. In other words, one can use the successive numerical approximations of ν̃_m to determine a new m_0 which turns out to be - in certain interesting examples - much smaller than the one deduced from the a priori bounds.
4.2
Next, we will give the results of the evaluation of P(M_T > u) using up to three terms of the Rice series in a certain number of typical cases. We compare these results with the classical evaluation using what is often called the Davies (1977) bound. In fact this bound seems to have been widely used since the work of Rice (1944). It is an upper bound with no control on the error, given by:
P(M > u) ≤ P(X_0 > u) + E( U_u([0, T]) ).    (30)
The above mentioned result by Piterbarg (11) shows that in fact, for fixed T and
high level u this bound is sharp. In general, using more than one term of the Rice
series supplies a remarkable improvement in the computation.
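For a stationary centered Gaussian process with unit variance and second spectral moment λ_2, Rice's formula gives E(U_u([0, T])) = (T λ_2^{1/2} / 2π) e^{−u²/2}, so the Davies bound (30) is elementary to evaluate. A sketch (assuming unit variance), checked against the worked examples below, where ρ_1(t) = exp(−t²/2) has λ_2 = −ρ″_1(0) = 1:

```python
import math

def davies_bound(u, T, lambda2):
    """Davies bound (30): P(X_0 > u) + E U_u([0,T]) for a stationary
    centered Gaussian process with unit variance."""
    p_tail = 0.5 * math.erfc(u / math.sqrt(2.0))           # P(X_0 > u)
    mean_upcrossings = T * math.sqrt(lambda2) / (2.0 * math.pi) \
        * math.exp(-0.5 * u * u)                            # Rice's formula
    return p_tail + mean_upcrossings
```

With u = 0, T = 6, λ_2 = 1 this reproduces D = 1.455 of the first worked example, and with u = 2, T = 10 it reproduces D = 0.238 of the third.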
We consider several stationary centered Gaussian processes listed in the following
table, where the covariances and the corresponding spectral densities are indicated.
process   covariance                              spectral density
X1        ρ1(t) = exp(−t²/2)                      f1(x) = (2π)^{−1/2} exp(−x²/2)
X2        ρ2(t) = (cosh t)^{−1}                   f2(x) = (2 cosh(πx/2))^{−1}
X3        ρ3(t) = (√3 t)^{−1} sin(√3 t)           f3(x) = 12^{−1/2} 1I_{−√3<x<√3}
X4                                                f4(x) ∝ (5 + x²)^{−4}
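Each (covariance, spectral density) pair satisfies ρ(t) = ∫ cos(tx) f(x) dx; for X3, the band-limited density f_3 integrates in closed form to sin(√3 t)/(√3 t). A numerical check (midpoint rule; the step count is arbitrary):

```python
import math

def rho3_from_spectrum(t, n=20000):
    """rho_3(t) = integral of cos(t x) f_3(x) dx, with
    f_3(x) = 12^(-1/2) on (-sqrt(3), sqrt(3)) and 0 elsewhere."""
    a = math.sqrt(3.0)
    h = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h      # midpoint of the i-th cell
        total += math.cos(t * x)
    return total * h / math.sqrt(12.0)

t = 0.7
exact = math.sin(math.sqrt(3.0) * t) / (math.sqrt(3.0) * t)
approx = rho3_from_spectrum(t)
```

The numerical integral matches the closed-form covariance to high accuracy, confirming the table entry for X3.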
u       ρ1          ρ2          ρ3          ρ4
−2      1.00        1.00        1.00        1.00
−1      0.98-1.00   0.98-1.00   0.99        0.98-1.00
 0      0.90-1.00   0.87-1.00   0.92-1.00   0.88-1.00
 1      0.74-0.77   0.70-0.76   0.76-0.78   0.72-0.77
 2      0.22        0.21        0.22        0.22
 3      0.02        0.02        0.02        0.02

Table 1: Values of P(M > u) for the different processes (T = 10). Each cell contains the values corresponding to the stationary centered Gaussian process with the indicated covariance ρ1, ρ2, ρ3 or ρ4. The calculation uses three terms of the Rice series for the upper bound and two terms for the lower bound. Both bounds are rounded to two decimals and, when they differ, both are displayed.
One, three, or two terms of the Rice series (R1, R3, R2 in the sequel), that is,
P(X_0 > u) + Σ_{m=1}^{K} (−1)^{m+1} \frac{ν̃_m}{m!}
with K = 1, 3 or 2.
Note that the bound D differs from R1 due to the difference between ν_1 and ν̃_1. These bounds are evaluated for T = 4, 6, 8, 10, 15 and also for T = 20 and T = 40 when they fall in the range [0, 1]. Between these values an ordinary spline interpolation has been performed.
In addition, we illustrate the complete detailed calculation in three chosen cases. They correspond to zero and positive levels u. For u negative, it is easy to check that the Davies bound is often greater than 1, thus non-informative.
For u = 0, T = 6, ρ = ρ1, we have P(X_0 > u) = 0.5, ν_1 = 0.955, ν̃_1 = 0.602, ν̃_2/2 = 0.150, ν̃_3/6 = 0.004, so that:
D = 1.455, R1 = 1.103, R3 = 0.957, R2 = 0.953.
R2 and R3 give a rather good evaluation of the probability; the Davies bound gives no information.
For u = 1.5, T = 15, ρ = ρ2, we have P(X_0 > u) = 0.067, ν_1 = 0.517, ν̃_1 = 0.488, ν̃_2/2 = 0.080, ν̃_3/6 = 0.013, so that:
D = 0.584, R1 = 0.555, R3 = 0.488, R2 = 0.475.
In this case the Davies bound is not sharp, and a very clear improvement is provided by the two bounds R2 and R3.
For u = 2, T = 10, ρ = ρ3, we have P(X_0 > u) = 0.023, ν_1 = 0.215, ν̃_1 = 0.211, ν̃_2/2 = 0.014, ν̃_3/6 = 3·10^{−4}, so that:
D = 0.238, R1 = 0.234, R3 = 0.220, R2 = 0.220.
In this case the Davies bound is rather sharp.
As a conclusion, these numerical results show that it is worth using several terms of the Rice series. In particular, the first three terms are relatively easy to compute and provide a good evaluation of the distribution of M under a rather broad set of conditions.
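The bounds above are simply truncations of the Rice series, and the reported numbers can be reproduced from the listed moments with a small bookkeeping sketch (the 0.001 discrepancies come from the rounding of the published moments):

```python
def rice_partial_sums(p0, nu_terms):
    """Given P(X_0 > u) and the already-divided terms nu~_m/m!,
    return the successive truncations R1, R2, R3, ..."""
    sums, s = [], p0
    for m, term in enumerate(nu_terms, start=1):
        s += term if m % 2 == 1 else -term   # alternating Rice series
        sums.append(s)
    return sums

# First worked example: u = 0, T = 6, rho = rho_1
r1, r2, r3 = rice_partial_sums(0.5, [0.602, 0.150, 0.004])
d = 0.5 + 0.955          # Davies bound uses nu_1 instead of nu~_1
```

R2 ≤ P(M > u) ≤ R3 by the enveloping property, while R1 and D are one-term upper bounds.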
Acknowledgements
We thank C. Delmas for computational assistance. This work has received support from the ECOS program U97E02.
Figure 1: For the process with covariance ρ1 and the level u = 1, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as a function of the length T of the interval.
Figure 2: For the process with covariance ρ2 and the level u = 0, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as a function of the length T of the interval.
Figure 3: For the process with covariance ρ3 and the level u = 2, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as a function of the length T of the interval.
Figure 4: For the process with covariance ρ4 and the level u = 1.5, representation of the three upper bounds D, R1, R3 and the lower bound R2 (from top to bottom) as a function of the length T of the interval.
M_T = sup{X(t) : t ∈ T}
The methods used to find formulas for the distribution of the supremum of these processes are ad hoc, hence not transposable to more general random functions, even in the Gaussian context.
Given the interest in the distribution of the random variable MT ,
arising in a diversity of theoretical and technical questions, a large
body of mathematics has been developed beyond these particular formulas.
INEQUALITIES.
We give some fundamental examples for Gaussian processes.
Assume that X is centered Gaussian and that there exists a countable subset D ⊂ T such that almost surely M_T = sup_{t∈D} X(t). [In particular, this condition holds true if X is separable.]
Put σ²(t) = E[X²(t)] and σ_T² := sup_{t∈T} σ²(t). Typical inequalities state that
P( M_T − E(M_T) > u ) ≤ exp( − \frac{1}{2} \frac{u²}{σ_T²} ),    (1)
P( M_T − med(M_T) > u ) ≤ exp( − \frac{1}{2} \frac{u²}{σ_T²} ),    (2)
and that, for every ε > 0, there exists a constant C = C(ε) such that
P( M_T > u ) ≤ C exp( − \frac{1}{2} \frac{u²}{σ_T² + ε} ).    (3)
Grosso modo, this says that the tail of the distribution of the random variable M_T is bounded (except for a multiplicative constant) by the value of the centered normal density having variance larger than, and arbitrarily close to, σ_T².
The problem is that C can grow (and tend to infinity) as
decreases to zero. Even for fixed , in general, one can only
have rough bounds for C. This implies serious limitations for
the use of these inequalities in Statistics and in other fields.
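As a quick numerical illustration of the concentration bound, here is a minimal sketch (the toy process, the grid and the sample sizes are arbitrary choices for the demonstration, not taken from the text): it simulates a finite random cosine expansion with unit variance, so that σ_T² = 1, and compares the empirical tail of M_T − E(M_T) with the Gaussian bound exp(−u²/(2σ_T²)).

```python
import math, random

def sample_max(n_grid=120, n_terms=6, rng=random):
    # Toy stationary Gaussian process on [0, 1]:
    # X(t) = sum_j (a_j cos(2 pi j t) + b_j sin(2 pi j t)) / sqrt(n_terms),
    # so that Var X(t) = 1 for every t (hence sigma_T^2 = 1).
    a = [rng.gauss(0, 1) for _ in range(n_terms)]
    b = [rng.gauss(0, 1) for _ in range(n_terms)]
    def X(t):
        s = sum(a[j] * math.cos(2 * math.pi * (j + 1) * t) +
                b[j] * math.sin(2 * math.pi * (j + 1) * t)
                for j in range(n_terms))
        return s / math.sqrt(n_terms)
    return max(X(i / n_grid) for i in range(n_grid + 1))

random.seed(0)
samples = [sample_max() for _ in range(1500)]
m = sum(samples) / len(samples)        # empirical mean, a stand-in for E[M_T]
u = 1.5
emp_tail = sum(1 for s in samples if s - m >= u) / len(samples)
borell_bound = math.exp(-u * u / 2.0)  # exp(-u^2 / (2 sigma_T^2)), sigma_T^2 = 1
print(emp_tail, borell_bound)
```

The empirical tail should fall well below the bound, illustrating why these inequalities, although universal, are not sharp.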
These inequalities are essential for the development of the mathematical theory. However, in a wide range of applications they are not sharp enough, one reason being that they depend on certain constants (the expectation or the median of M_T) that one is unable to estimate, or for which the available estimates differ substantially from the true values. As a consequence, the bounds become exponentially larger than the true values as u → +∞.
Since the 1990s several methods have been introduced with the aim of obtaining more precise results.
Examples: the double-sum method (Piterbarg, 1996); the Euler-Poincaré characteristic approximation (EPC; Taylor, Takemura and Adler, 2005; Adler and Taylor's book, 2007); the tube method (Sun, 1993); the use of Rice series (Miroshin, 1984; Azaïs-Wschebor, 2002); the record method (Rychlik; Mercadier, 2005).
See the book by Azaïs and Wschebor (Wiley, 2009) for a more detailed account.
We call the first (respectively the second) term on the right-hand side of (4) the first (resp. second) order approximation of P{M > u}.
The first order approximation has been considered by Taylor, Takemura and Adler (2005) and also by Adler and Taylor (2007) by means of the expectation of the EPC of the excursion set E_u := {t ∈ S : X(t) > u}. This works for large values of u. The same authors have considered the second order approximation, that is, how fast the difference between P{M > u} and the expected EPC tends to zero as u → +∞.
As far as I know, the only result giving a precise description of the second order term in the asymptotics of P(M_T > u) as u → +∞ is the following (Piterbarg, 1981 for sufficiently small T; Azaïs-Bardet-Wschebor, 2002 for general T):
Let X be a one-parameter centered Gaussian stationary process satisfying certain regularity conditions. Then, as u → +∞:
P(M_T > u) = 1 − Φ(u) + ( √λ₂ T / 2π ) e^{−u²/2} − C T u^{−1} φ(θu) [1 + o(1)],

where C > 0 and θ > 1 are explicit constants depending only on the spectral moments λ₂ and λ₄, so that the third term is exponentially smaller than the second one. Here Φ (resp. φ) denotes the standard normal distribution function (resp. density), and λ_k is the k-th spectral moment of X.
Theorem 3 below gives an exact, implicit formula for the density of the maximum, of the form

p_M(x) = Σ_{t∈S₀} (⋯) + Σ_j ∫_{S_j} (⋯) dt,

where the S_j are the faces of the parameter set S.
REMARKS ON THEOREM 3.
Theorem 3 implies, in particular, the existence of a continuous density for the distribution of M. For one-parameter processes, this kind of formula can be iterated and used to prove higher-order differentiability of F(x). This is a hard and very interesting subject.
The proof of this theorem is based on a variant of the Rice formula, which permits writing the expectation of the total mass of weighted roots of a random field as an integral. [See Azaïs-Wschebor, 2009, Chapters 6 and 7.]
Schematically, the density decomposes as

p_M(x) = Σ_{t∈S₀} (⋯) + Σ_{j=1}^{d₀} (−1)^j ∫_{S_j} E( (⋯) | X(t) = x, X'_j(t) = 0 ) σ_j(dt).

In what follows the covariance is assumed to take the isotropic form

Γ(s, t) = ρ( ‖s − t‖² ).   (8)
Theorem 5 Assume that the random field X is centered Gaussian, satisfies conditions A1-A5, and has a covariance of the form (8). Let S have polyhedral shape. Then,
p(x) = φ(x) [ Σ_{t∈S₀} σ₀(t) + Σ_{j=1}^{d₀} |ρ'|^{j/2} ( H̄_j(x) + R_j(x) ) ḡ_j ],   (9)

where

ḡ_j := σ_{j−1}( C_{t,j} ∩ S^{j−1} ) / σ_{j−1}( S^{j−1} ),  for j = 0, …, d − 1,   (10)

S^{j−1} denoting the unit sphere, C_{t,j} the relevant normal cone at t, with the convention

σ_d(t) = 1.   (11)
- The correction terms are explicit:

R_j(x) = 2 (2π)^{−(j+1)/2} ∫_{−∞}^{+∞} T_j(v) e^{−y²/2} dy,  v := (2)^{1/2} (1 − γ²)^{−1/2} y − x,   (12)

γ being a constant determined by |ρ'| and ρ'',

T_j(v) := Σ_{k=0}^{j−1} ( H_k²(v) / (2^k k!) ) e^{−v²/2} − ( H_j(v) / (2^j (j−1)!) ) I_{j−1}(v),   (13)

I_n(v) = 2 e^{−v²/2} Σ_{k=0}^{⌊(n−1)/2⌋} ( (n−1)!! / (n−1−2k)!! ) ( H_{n−1−2k}(v) / 2^k ) + 1I_{n even} 2^{n/2} (n−1)!! √(2π) ( 1 − Φ(v) ).   (14)
Dropping the terms R_j(x) in (9) yields

φ(x) [ Σ_{t∈S₀} σ₀(t) + Σ_{j=1}^{d₀} |ρ'|^{j/2} H̄_j(x) ḡ_j ],   (15)

which is the product of a standard normal density and a polynomial of degree d₀. Integrating once, we get, in this case, the formula for the expectation of the EPC of the excursion set given in Adler and Taylor (2007).
Theorem 7 Assume that X is centered, satisfies hypotheses A1-A5, and that the covariance has the form (8) with ρ'(0) = −1/2 and ρ(x) ≥ 0 for x ≥ 0. Let S be a convex set with d₀ = d ≥ 1. Then

lim_{x→+∞} x^{−2} log( p̄(x) − p_M(x) ) = −(1/2)( 1 + δ(ρ) ),   (16)

for an explicit constant δ(ρ) > 0, so that the approximation error is exponentially smaller than the density itself.
Remarks
1.- Since S is convex, the added hypothesis, namely that the maximal dimension d₀ such that S_{d₀} is non-empty equals d, is not an actual restriction.
2.- ρ'(0) = −1/2 is not an actual restriction either: one can always reduce the problem to this case by means of a scale change. As for ρ(x) ≥ 0 for x ≥ 0, it is always verified by the so-called Schoenberg covariances, which form exactly the class of functions ρ such that ρ(‖t − s‖²) is a covariance in every dimension d.
SOME REFERENCES
Adler, R.J. (1981). The Geometry of Random Fields. J. Wiley and Sons, New York.
Adler, R.J. and Taylor, J. (2007). Random Fields and Geometry. Springer-Verlag.
Azaïs, J-M., Bardet, J-M. and Wschebor, M. (2002). On the tails of the distribution of the maximum of a smooth stationary Gaussian process. ESAIM: Probability and Statistics, 6, 177-184.
Azaïs, J-M. and Wschebor, M. (2008). A general formula for the distribution of the maximum of a Gaussian field and the approximation of the tail. Stoch. Proc. Appl., 118 (7), 1190-1218.
Azaïs, J-M. and Wschebor, M. (2009). Level Sets and Extrema of Random Processes and Fields. J. Wiley and Sons, New York.
Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30.
Non asymptotic bounds for the distribution of the maximum of Random fields
Jean-Marc Azaïs
Institut de Mathématiques, Université de Toulouse
Outline
Examples
The record method
The maxima method
Precision agriculture
Representation of the yield per unit area recorded by a GPS harvester.
Neuroscience
The activity of the brain is recorded during a particular task, and the same question is asked.
Sea-waves spectrum
Locally in time and frequency, the spectrum of the waves is recorded. We want to localize transition periods.
The record method
Hypothesis
S is a regular set of R² (compact, simply connected, with a piecewise C¹ parametrization of the boundary by arc length). X is such that:
- the bivariate process Z = (X, ∂X/∂t₂) has C¹ sample paths and a non-degenerate Gaussian distribution;
- the distributions of (X(t), X'(t)) and of (X(t), ∂X/∂t₂(t), ∂²X/∂t₂²(t)) do not degenerate, so that at a record point one has ∂²X/∂t₂²(t) < 0.
The record method bounds P{M > u} by a boundary term plus

∫_S E( |det Z'(t)| 1I_{X₀₂(t)<0} 1I_{X₁₀(t)>0} | X(t) = u, X₀₁(t) = 0 ) p_{X(t), X₀₁(t)}(u, 0) dt.
The key point is that, under the condition {X(t) = u, X₀₁(t) = 0}, the quantity

|det Z'(t)| = | det ( X₁₀  X₀₁ ; X₁₁  X₀₂ ) |

is simply equal to |X₁₀ X₀₂|. Taking the indicator conditions into account, we get the following expression for the second integral:

∫_S E( X₁₀(t)⁺ X₀₂(t)⁻ | X(t) = u, X₀₁(t) = 0 ) p_{X(t), X₀₁(t)}(u, 0) dt.
The maxima method
In fact, results are simpler (and stronger) in terms of the density p_M(x) of the maximum; bounds for the distribution are obtained by integration.

Theorem

p_M(x) ≤ p̄_M(x) := (1/2) [ p̃_M(x) + p^{EC}_M(x) ],  with

p̃_M(x) := ∫_S (⋯) dt   and   p^{EC}_M(x) := (−1)^d ∫_S (⋯) dt,

where the omitted integrands are explicit conditional expectations involving det(X''(t)) given X(t) = x, X'(t) = 0.
The quantity p^{EC}_M(x) is easy to compute, using the work of Adler and the symmetry properties of the order-4 variance tensor of X''(t) (under the conditional distribution).

Lemma

E( det(X''(t)) | X(t) = x, X'(t) = 0 ) = det(Λ) H̄_d(x),

where H̄_d(x) is the d-th Hermite polynomial and Λ := Var(X'(t)).
The main advantage of the Euler characteristic method lies in this result.
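The Hermite polynomials appearing in the Lemma can be evaluated by the standard three-term recurrence. A minimal sketch (probabilists' convention; the check against the Gaussian-moment identity H_n(x) = E[(x + iZ)^n], Z standard normal, is an illustration, not taken from the text):

```python
import math

def hermite(n, x):
    # Probabilists' Hermite polynomials via H_{n+1}(x) = x H_n(x) - n H_{n-1}(x)
    h0, h1 = 1.0, x
    if n == 0:
        return h0
    for k in range(1, n):
        h0, h1 = h1, x * h1 - k * h0
    return h1

def hermite_via_moments(n, x, m=20001, cut=10.0):
    # Check against H_n(x) = E[(x + iZ)^n], Z ~ N(0,1),
    # evaluated by a trapezoid rule on [-cut, cut].
    step = 2 * cut / (m - 1)
    total = 0.0
    for i in range(m):
        y = -cut + i * step
        w = 1.0 if 0 < i < m - 1 else 0.5
        total += w * ((x + 1j * y) ** n).real * math.exp(-y * y / 2)
    return total * step / math.sqrt(2 * math.pi)

print(hermite(4, 1.3), hermite_via_moments(4, 1.3))
```

For instance, H_4(x) = x^4 - 6x^2 + 3 in this convention, which the recurrence reproduces.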
Computation of p̃_M.
Theorem
Assume that the random field X is centered, Gaussian, stationary, isotropic and regular. Let S have polyhedral shape. Then,

p(x) = φ(x) [ Σ_{t∈S₀} σ₀(t) + Σ_{j=1}^{d₀} |ρ'|^{j/2} ( H̄_j(x) + R_j(x) ) ḡ_j ].   (1)
Theorem (continued)

R_j(x) = 2 (2π)^{−(j+1)/2} ∫_{−∞}^{+∞} T_j(v) e^{−y²/2} dy,  v := (2)^{1/2} (1 − γ²)^{−1/2} y − x,  with γ := |ρ'| (ρ'')^{−1/2},   (2)

T_j(v) := Σ_{k=0}^{j−1} ( H_k²(v) / (2^k k!) ) e^{−v²/2} − ( H_j(v) / (2^j (j−1)!) ) I_{j−1}(v),   (3)

I_n(v) = 2 e^{−v²/2} Σ_{k=0}^{⌊(n−1)/2⌋} ( (n−1)!! / (n−1−2k)!! ) ( H_{n−1−2k}(v) / 2^k ) + 1I_{n even} 2^{n/2} (n−1)!! √(2π) ( 1 − Φ(v) ).   (4)
Second order

Theorem
Under the conditions above, plus Var(X(t)) ≡ 1, one has

lim_{x→+∞} (2/x²) log( p̄_M(x) − p_M(x) ) ≤ − ( 1 + inf_{t∈S} 1/( σ_t² + λ(t)² σ_t⁴ ) ),

where σ_t² := sup (…) is an explicit variance parameter attached to the point t.
José R. León
Mario Wschebor
October 5, 2009
Abstract
We use Rice's formulas in order to compute the moments of some level functionals which are linked to problems in oceanography and optics: for instance, the number of specular points in one or two dimensions, the number of twinkles, the distribution of the normal angle of level curves, and the number or the length of dislocations in random wavefronts. We compute expectations and, in some cases, also second moments of such functionals. Moments of order greater than one are more involved, but one needs them whenever one wants to perform statistical inference on some parameters of the model or to test the model itself. In some cases we are able to use these computations to obtain a Central Limit Theorem.
Introduction
Let W(x) model the height of the sea surface, illuminated from a source at height h₁ and seen by an observer at height h₂. The x-coordinates of the specular points satisfy

W'(x) = ( h₂ r₁ − h₁ r₂ ) / ( x ( r₂ − r₁ ) ),   (1)

where r_i := √( x² + h_i² ), i = 1, 2.
The points (x, W (x)) of the curve such that x is a solution of (1) are called
specular points. We denote by SP1 (A) the number of specular points
such that x A, for each Borel subset A of the real line. One of our aims
in this paper is to study the probability distribution of SP1 (A).
The following approximation, which turns out to be very accurate in practice for ocean waves, was introduced long ago by Longuet-Higgins (see [13]
and [14]):
Suppose that h₁ and h₂ are large with respect to W(x) and x; then r_i = h_i + x²/(2h_i) + O(h_i^{−3}), and (1) can be approximated by

W'(x) ≈ x ( h₁ + h₂ ) / ( 2 h₁ h₂ ) = kx,   (2)

where

k := (1/2) ( 1/h₁ + 1/h₂ ).
Denote Y(x) := W'(x) − kx and by SP₂(A) the number of roots of Y(x) belonging to the set A, an approximation of SP₁(A) under this asymptotic. The first part of Section 3 below is devoted to obtaining some results on the distribution of the random variable SP₂(R).
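Counting the roots of Y(x) = W'(x) − kx is straightforward numerically. A minimal sketch (the finite random cosine expansion standing in for W', its frequencies and the value of k are all hypothetical choices for the demonstration): since W' is bounded, all roots lie in a band |x| ≤ S, and counting sign changes on a fine grid counts the (almost surely simple) roots.

```python
import math, random

def count_roots(f, a, b, n=40000):
    # Count sign changes of f on a regular grid of [a, b]; for a C^1 function
    # with simple roots this equals the number of roots once n is large enough.
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    vals = [f(x) for x in xs]
    return sum(1 for v0, v1 in zip(vals, vals[1:]) if v0 * v1 < 0)

random.seed(1)
freqs = [0.5, 1.0, 1.7, 2.3]                         # illustrative frequencies
amps = [random.gauss(0, 1) for _ in freqs]
phases = [random.uniform(0, 2 * math.pi) for _ in freqs]

def W_prime(x):
    return sum(a * math.cos(f * x + p) for a, f, p in zip(amps, freqs, phases))

k = 0.05
# Outside [-S, S] we have |kx| > max|W'|, so no specular point can occur there;
# the "+ 1" guarantees strict signs at the endpoints.
S = (sum(abs(a) for a in amps) + 1) / k
n_spec = count_roots(lambda x: W_prime(x) - k * x, -S, S)
print(n_spec)
```

By construction Y(−S) > 0 and Y(S) < 0, so the number of roots is odd, which is a useful sanity check.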
Consider now the same problem as above, but adding a time variable t,
that is, W becomes a random function parameterized by the pair (x, t).
We denote Wx , Wt , Wxt , ... the partial derivatives of W .
We use the Longuet-Higgins approximation (2), so that the approximate
specular points at time t are (x, W (x, t)) where
Wx (x, t) = kx.
Generally speaking, this equation defines a finite number of points which
move with time. The implicit function theorem, when it can be applied,
shows that the x-coordinate of a specular point moves at speed
dx/dt = − W_{xt} / ( W_{xx} − k ).
The right-hand side diverges whenever W_{xx} − k = 0, in which case a flash appears and the point is called a twinkle. We are interested in the (random) number of flashes lying in a set A of space and in an interval [0, T] of time. If we put

Y(x, t) := ( W_x(x, t) − kx , W_{xx}(x, t) − k ),   (3)
then the twinkles are the roots of Y in A × [0, T]. More generally, one is led to expectations of integrals over level sets C_Q^W(u) of the form

E( ∫_{C_Q^W(u)} Y(s) dσ(s) ),

where Y(s) is some random field defined on the level set. Cabaña [7], Wschebor [19] (d = 1), Azaïs and Wschebor [4] and, in a weak form, Zähle [20] have studied these types of formulas. See Theorems 5 and 6.
Another interesting problem is the study of phase singularities (dislocations) of random wavefronts. They correspond to lines of darkness in light propagation, or threads of silence in sound [6]. In a mathematical framework, they can be defined as the loci of points where the amplitude of the wave vanishes. If we represent the wave as

W(x, t) = ξ(x, t) + i η(x, t),  where x ∈ R^d,

and ξ, η are independent homogeneous Gaussian random fields, the dislocations are the intersections of the two random surfaces ξ(x, t) = 0, η(x, t) = 0. We consider a fixed time, for instance t = 0. In the case d = 2 we will study the expectation of the random variable

#{ x ∈ S : ξ(x, 0) = η(x, 0) = 0 }.

In the case d = 3 one important quantity is the length of the level curve

L{ x ∈ S : ξ(x, 0) = η(x, 0) = 0 }.
All these situations are related to integral geometry. For a general treatment of the basic theory, the classical reference is Federer's Geometric Measure Theory [9].
The aims of this paper are: 1) to reformulate some known results in a modern language or in the standard form of probability theory; 2) to prove new results, such as computations in the exact models and variance computations in cases where only first moments were known, thus improving the statistical methods; and 3) in some cases, to obtain Central Limit Theorems.
The structure of the paper is the following: In Section 2 we review without
proofs some formulas for the moments of the relevant random variables. In
Section 3 we study expectation, variance and asymptotic behavior of specular
points. Section 4 is devoted to the study of the distribution of the normal to the
level curve. Section 5 presents three numerical applications. Finally, in Section
6 we study dislocations of wavefronts following a paper by Berry & Dennis [6].
Rice formulas
We give here a quick account of Rice formulas, which allow expressing the expectation and the higher moments of the size of level sets of random fields by means of integral formulas. The simplest case occurs when both the dimension of the domain and that of the range are equal to 1, for which the first results date back to Rice [17] (see also Cramér and Leadbetter's book [8]). When the dimensions of the domain and the range are equal but greater than 1, the formula for the expectation is due to Adler [1] for stationary random fields. For a general treatment of this subject, the interested reader is referred to the book [4], Chapters 3 and 6, where one can find proofs and details.
Theorem 1 (Expectation of the number of crossings, d = d' = 1) Let W = {W(t) : t ∈ I}, I an interval in the real line, be a Gaussian process having C¹ paths. Assume that Var(W(t)) ≠ 0 for every t ∈ I.
Then:

E( N_I^W(u) ) = ∫_I E( |W'(t)| | W(t) = u ) p_{W(t)}(u) dt.   (4)

Under analogous non-degeneracy conditions, the m-th factorial moment is given by

E( N_I^W(u) ( N_I^W(u) − 1 ) ⋯ ( N_I^W(u) − m + 1 ) ) = ∫_{I^m} E( ∏_{j=1}^{m} |W'(t_j)| | W(t₁) = ⋯ = W(t_m) = u ) p_{W(t₁),…,W(t_m)}(u, …, u) dt₁ ⋯ dt_m.   (5)
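Formula (4) can be checked by simulation when the law of the process is explicit. A minimal sketch (the single-frequency random cosine process and the grid and sample sizes below are illustrative choices, not taken from the text): for X(t) = a cos(ωt) + b sin(ωt) with a, b independent standard normals, λ₀ = 1 and λ₂ = ω², so Rice's formula gives E N_u([0, T]) = (T/π) ω e^{−u²/2}.

```python
import math, random

def crossings(a, b, omega, u, T, n=600):
    # Count u-crossings of a*cos(omega t) + b*sin(omega t) on [0, T]
    # by sign changes on a grid fine enough for the oscillation scale.
    f = lambda t: a * math.cos(omega * t) + b * math.sin(omega * t) - u
    ts = [T * i / n for i in range(n + 1)]
    vs = [f(t) for t in ts]
    return sum(1 for v0, v1 in zip(vs, vs[1:]) if v0 * v1 < 0)

random.seed(2)
omega, u, T = 2.0, 1.0, 10.0
trials = 3000
mean_N = sum(crossings(random.gauss(0, 1), random.gauss(0, 1), omega, u, T)
             for _ in range(trials)) / trials
# Rice's formula for this process: lambda_0 = 1, lambda_2 = omega**2, so
# E N_u([0, T]) = (T / pi) * omega * exp(-u**2 / 2)  (approximately 3.86 here).
rice = (T / math.pi) * omega * math.exp(-u * u / 2)
print(mean_N, rice)
```

The Monte Carlo average and the Rice value should agree up to sampling error.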
Proposition 1 Under the same conditions as in the theorem above, one has

P( { ∃ t ∈ A : W(t) = u, det(W'(t)) = 0 } ) = 0

provided that p_{W(t)}(x) ≤ C for all x in some neighborhood of u, and that at least one of the two following conditions is satisfied:
a) the trajectories of W are twice continuously differentiable;
b) α(δ) := sup_{x∈V(u)} P( |det W'(t)| < δ | W(t) = x ) → 0 as δ → 0.
For level sets of positive dimension d − d' one has analogous formulas. Assume that t ↦ W(t) is of class C¹; then

E( σ_{d−d'}( B ∩ W^{−1}(u) ) ) = ∫_B E( ( det( W'(t) W'(t)^T ) )^{1/2} | W(t) = u ) p_{W(t)}(u) dt,   (7)

and, for a suitable random field Y defined on the level set,

E( ∫_{B∩W^{−1}(u)} Y(s) dσ_{d−d'}(s) ) = ∫_B E( Y(t) ( det( W'(t) W'(t)^T ) )^{1/2} | W(t) = u ) p_{W(t)}(u) dt.   (8)
3.1
Number of roots
In the stationary one-dimensional case, the expected number of roots per unit length at the level u is

(1/π) √( λ₂/λ₀ ) e^{ −u²/(2λ₀) },   (9)

where λ_i = ∫ λ^i dμ(λ), i = 0, 2, 4, …, are the moments of the spectral measure μ.
3.2
We consider first the one-dimensional static case with the Longuet-Higgins approximation (2) for the number of specular points, that is,

SP₂(I) = #{ x ∈ I : Y(x) = W'(x) − kx = 0 },

so that

E( SP₂(I) ) = ∫_I E( |W''(x) − k| | W'(x) = kx ) p_{W'(x)}(kx) dx = ∫_I G( k, σ(x) ) p_{W'(x)}(kx) dx,   (10)

where σ²(x) is the variance of W''(x) and G(μ, σ) := E(|Z|), Z with distribution N(μ, σ²).
For the second equality in (10), in which we have erased the condition in the conditional expectation, take into account that, since Var(W'(x)) is constant, for each x the random variables W'(x) and W''(x) are independent (differentiate under the expectation sign and use the basic properties of the Gaussian distribution).
An elementary computation gives:

G(μ, σ) = μ [ 2Φ(μ/σ) − 1 ] + 2σ φ(μ/σ),

where φ(·) and Φ(·) are respectively the density and the cumulative distribution function of the standard Gaussian distribution.
When the process W(x) is also stationary, σ²(x) is constant, equal to λ₄. If we look at the total number of specular points over the whole line, we get

E( SP₂(R) ) = G( k, √λ₄ ) / k,   (11)

which is the result given in [14] (part II, formula (2.14), page 846). Note that this quantity is an increasing function of √λ₄/k.
Since in the Longuet-Higgins approximation k → 0, one can write a Taylor expansion having the form:

E( SP₂(R) ) ≈ √( 2λ₄/π ) (1/k) [ 1 + k²/(2λ₄) − k⁴/(24λ₄²) + ⋯ ].   (12)
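A quick numerical check of the exact formula against the small-k expansion (the value of λ₄ below is a hypothetical choice for illustration):

```python
import math

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def G(mu, sigma):
    # G(mu, sigma) = E|Z| for Z ~ N(mu, sigma^2)
    return mu * (2 * Phi(mu / sigma) - 1) + 2 * sigma * phi(mu / sigma)

lam4 = 0.7  # hypothetical spectral moment lambda_4
for k in (0.1, 0.05, 0.01):
    exact = G(k, math.sqrt(lam4)) / k
    series = math.sqrt(2 * lam4 / math.pi) / k * (
        1 + k ** 2 / (2 * lam4) - k ** 4 / (24 * lam4 ** 2))
    print(k, exact, series)
```

As k decreases, the two values agree to more and more digits, and G(0, 1) = sqrt(2/pi) recovers the half-normal mean.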
Let us turn to the variance of the number of specular points, under some additional restrictions. We use the decomposition

Var(S) = E( S(S − 1) ) + E(S) − [E(S)]²,  S := SP₂(R),   (13)

together with the Rice formula for the second factorial moment:

E( S(S − 1) ) = ∫_{R²} E( |W''(x) − k| |W''(y) − k| | W'(x) = kx, W'(y) = ky ) p_{W'(x),W'(y)}(kx, ky) dx dy,   (14)

where

p_{W'(x),W'(y)}(kx, ky) = (2π)^{−1} ( λ₂² − ρ''²(x−y) )^{−1/2} exp( − (k²/2) ( λ₂x² + 2ρ''(x−y)xy + λ₂y² ) / ( λ₂² − ρ''²(x−y) ) ),   (15)

under the additional condition that the density (15) does not degenerate for x ≠ y.
For the conditional expectation in (14) we perform a Gaussian regression of W''(x) (resp. W''(y)) on the pair (W'(x), W'(y)). Putting z = x − y, we obtain

W''(x) = θ_y(x) − ρ'''(z) ( ρ''(z) W'(x) + λ₂ W'(y) ) / ( λ₂² − ρ''²(z) ),   (16)

where θ_y(x) is Gaussian centered and independent of (W'(x), W'(y)). The regression of W''(y) is obtained by permuting x and y.
The conditional expectation in (14) can now be rewritten as an unconditional expectation:

E( | θ_y(x) − k − kρ'''(z) ( ρ''(z)x + λ₂y ) / ( λ₂² − ρ''²(z) ) | · | θ_x(y) − k + kρ'''(z) ( λ₂x + ρ''(z)y ) / ( λ₂² − ρ''²(z) ) | ).   (17)

Notice that the singularity on the diagonal x = y is removable, since a Taylor expansion shows that for z → 0:

ρ'''(z) ( ρ''(z)x + λ₂y ) / ( λ₂² − ρ''²(z) ) = −1 + (1/2)(λ₄/λ₂) x z + O(z³).   (18)

One also has

σ²(z) := Var( θ_y(x) ) = λ₄ − ρ'''²(z) λ₂ / ( λ₂² − ρ''²(z) )   (19)

and

E( θ_y(x) θ_x(y) ) = ρ⁽⁴⁾(z) + ρ'''²(z) ρ''(z) / ( λ₂² − ρ''²(z) ).   (20)
Writing the expectation (17) by means of the function H(ρ; μ, ν) := E(|ξη|), (ξ, η) a Gaussian pair with correlation ρ and means μ, ν, one has

H(ρ; 0, 0) = (2/π) [ √(1 − ρ²) + ρ arctan( ρ/√(1 − ρ²) ) ]

and

|R₂(ρ; μ, ν)| ≤ 3(μ² + ν²)  if μ² + ν² ≤ 1 and 0 ≤ ρ ≤ 1.
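The closed form for H(ρ; 0, 0), read here as E|ξη| for a standard correlated Gaussian pair (which is how the notation is used above), can be verified by simulation; the correlation value and sample size below are arbitrary choices:

```python
import math, random

def H00(rho):
    # E|xi * eta| for a standard bivariate normal pair with correlation rho:
    # (2/pi) * (sqrt(1 - rho^2) + rho * arcsin(rho))
    return (2 / math.pi) * (math.sqrt(1 - rho * rho) + rho * math.asin(rho))

random.seed(3)
rho, n = 0.6, 200000
c = math.sqrt(1 - rho * rho)
acc = 0.0
for _ in range(n):
    z1 = random.gauss(0, 1)
    z2 = rho * z1 + c * random.gauss(0, 1)   # correlated standard normal pair
    acc += abs(z1 * z2)
print(acc / n, H00(rho))
```

Note that arcsin(rho) = arctan(rho / sqrt(1 - rho^2)), so this matches the arctan form of the display above.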
In the next theorem we compute the equivalent of the variance of the number
of specular points, under certain hypotheses on the random process and with
the Longuet-Higgins asymptotic. This result is new and useful for estimation
purposes since it implies that, as k 0, the coefficient of variation of the
random variable S tends to zero at a known speed. Moreover, it will also appear
in a natural way when normalizing S to obtain a Central Limit Theorem.
Theorem 7 Assume that the centered Gaussian stationary process W = {W(x) : x ∈ R} is δ-dependent, that is, ρ(z) = 0 if |z| > δ, and that it has C⁴ paths.
Then, as k → 0 we have:

Var(S) = θ (1/k) + O(1),   (22)

where θ is a positive constant with an explicit expression in terms of λ₂, λ₄, δ and the integral

J = ∫_{−δ}^{+δ} σ²(z) H( ρ̄(z); 0, 0 ) / √( 2π · 2( λ₂ + ρ''(z) ) ) dz;   (23)

the functions H and σ²(z) have already been defined above, and

ρ̄(z) := ( ρ⁽⁴⁾(z) + ρ'''²(z) ρ''(z) / ( λ₂² − ρ''²(z) ) ) / σ²(z)

is the correlation corresponding to the covariance (20). A similar expression, with the integral extended to the whole line, holds under weaker dependence assumptions.
The proof of this extension can be performed following the same lines as
the one we give below, with some additional computations.
Proof of the Theorem: We use the notations and computations preceding the statement of the theorem.
Divide the integral on the right-hand side of (14) into two parts, according as |x − y| > δ or |x − y| ≤ δ, i.e.

E( S(S − 1) ) = ∫∫_{|x−y|>δ} ⋯ + ∫∫_{|x−y|≤δ} ⋯ = I₁ + I₂.   (24)
In the first term, the δ-dependence of the process implies that one can factorize the conditional expectation and the density in the integrand. Taking into account that for each x ∈ R the random variables W'(x) and W''(x) are independent, we obtain for I₁:

I₁ = ∫∫_{|x−y|>δ} E( |W''(x) − k| ) E( |W''(y) − k| ) p_{W'(x)}(kx) p_{W'(y)}(ky) dx dy.

On the other hand, we know that W'(x) (resp. W''(x)) is centered normal with variance λ₂ (resp. λ₄). Hence:

I₁ = G²( k, √λ₄ ) ∫∫_{|x−y|>δ} ( 1/(2πλ₂) ) exp( −k²(x² + y²)/(2λ₂) ) dx dy.
To compute the integral on the right-hand side, notice that the integral over the whole (x, y) plane is equal to 1/k², so that it suffices to compute the integral over the set |x − y| ≤ δ. Changing variables, this last one is equal to

∫_{−∞}^{+∞} dx ∫_{x−δ}^{x+δ} ( 1/(2πλ₂) ) exp( −k²(x² + y²)/(2λ₂) ) dy
 = ( 1/(2πk²) ) ∫_{−∞}^{+∞} e^{−u²/2} du ∫_{u−kδ/√λ₂}^{u+kδ/√λ₂} e^{−v²/2} dv = δ/( k √(πλ₂) ) + O(1),

where the last term is bounded if kδ is bounded (in fact, remember that we are considering an approximation in which k → 0). So, we can conclude that

∫∫_{|x−y|>δ} ( 1/(2πλ₂) ) exp( −k²(x² + y²)/(2λ₂) ) dx dy = 1/k² − δ/( k √(πλ₂) ) + O(1),

and hence

I₁ = ( 2λ₄/π ) [ 1/k² − δ/( k √(πλ₂) ) ] + O(1).   (25)
The inner term I₂ is evaluated by means of the regression representation (16)-(17). Setting z = x − y and using (15), the diagonal strip contributes

I₂ = O(1) + 2 ∫∫_{0<x−y≤δ} σ²(z) H( ρ̄(z); 0, 0 ) (2π)^{−1} ( λ₂² − ρ''²(z) )^{−1/2} exp( − (k²/2) ( λ₂x² + 2ρ''(z)xy + λ₂y² ) / ( λ₂² − ρ''²(z) ) ) dx dy.   (26)

For fixed z, the change of variables ξ = c(z) k (x − z/2), with c(z) proportional to ( 2(λ₂ + ρ''(z)) )^{−1/2}, shows that the inner integral in x is of order 1/k, whence

I₂ = J/k + O(1),   (27)

with J given by (23).
To finish, put together (27) with (25), (24), (13) and (12).
Corollary 1 Under the conditions of Theorem 7, as k → 0:

√( Var(S) ) / E(S) = O( √k ).

The proof follows immediately from the theorem and the value of the expectation.
The computations made in this section are in close relation with the results of Theorem 4 in Kratz and León [12]. In that paper the random variable SP₂(I) is expanded in the Wiener-Hermite chaos. The aforementioned expansion yields the same formula for the expectation and also allows obtaining a formula for the variance. However, this expansion is difficult to manipulate in order to get the result of Theorem 7.
Let us now turn to the Central Limit Theorem.
Theorem 8 (Central Limit Theorem) Under the hypotheses of Theorem 7, as k → 0,

( S − E(S) ) / √( θ/k ) → N(0, 1) in distribution.

Proof sketch. Partition a growing interval into sub-intervals I_j^k of fixed length and set

T(k) := Σ_{|j|≤[k^{−α}]} SP₂(I_j^k),  V_k := ( Var(S(k)) )^{1/2},

for a suitable exponent α. We give the proof in two steps, which easily imply the statement.
Step 1. We prove that V_k^{−1} [ S(k) − T(k) ] → 0 in L². One can write S(k) − T(k) as a sum over the covered sub-intervals plus Z₁ + Z₂, where Z₁ and Z₂ count the specular points in the two unbounded tails left uncovered by the intervals I_j^k. Bounding E(SP₂) by a constant times the integral of the Gaussian density p_{W'(x)}(kx) over the tails shows that E(Z₁) + E(Z₂) → 0.   (29)
We already know that V_k^{−2} E( ( S(k) − T(k) )² ) → 0. Using the hypotheses of the theorem, since each I_j^k can be covered by a fixed number of intervals of size one, E( SP₂(I_j^k)( SP₂(I_j^k) − 1 ) ) is bounded by a constant which does not depend on k and j; the corresponding sum, divided by V_k², tends to zero because of the choice of α. The remaining two terms can be bounded by calculations similar to those of the proof of Theorem 7.
Step 2. T(k) is a sum of independent but not equi-distributed random variables. To prove that it satisfies a Central Limit Theorem, we use a Lyapunov condition based on fourth moments. Set M_j^m := E( ( SP₂(I_j^k) − E SP₂(I_j^k) )^m ). The Lyapunov condition reads

(1/τ⁴) Σ_{|j|≤[k^{−α}]} M_j⁴ → 0 as k → 0,   (30)   where τ² := Σ_{|j|≤[k^{−α}]} M_j².   (31)

Expand M_j⁴ as a sum of terms E( SP_{i₁} SP_{i₂} SP_{i₃} SP_{i₄} ), where SP_i stands for SP₂(I_i) − E SP₂(I_i). Since all the intervals have the same fixed size and fourth moments are finite by hypothesis, each term E( SP_{i₁} SP_{i₂} SP_{i₃} SP_{i₄} ) is bounded. On the other hand, the number of non-vanishing terms in this expansion is O(p²), p being the number of sub-intervals involved: if one of the indices in (i₁, i₂, i₃, i₄) differs by more than 1 from all the others, then E( SP_{i₁} SP_{i₂} SP_{i₃} SP_{i₄} ) vanishes by δ-dependence. Hence Σ_{|j|≤[k^{−α}]} M_j⁴ is of smaller order than τ⁴, which gives the Lyapunov condition.
3.3
A variant of the above computation handles level functionals weighted by a function m₁(x, w) and its partial derivatives ∂m₁/∂x and ∂m₁/∂w. One obtains an integral of the form

∫_a^b dx ∫_R E( | λ₂ w − K(x, w) | ) ( ⋯ ) exp( −(1/2)( w² + m₁²(x, w) ) ) dw.   (32)-(33)

Using that for each x the random variables W'(x) and W''(x) are independent, and performing a Gaussian regression, (33) can be written in the form

( √( λ₄ − λ₂² ) / (2π) ) ∫_a^b dx ∫_R G( m, 1 ) exp( −(1/2)( w² + m₁²(x, w) ) ) dw,   (34)

where

m = m(x, w) = ( λ₂ w + K(x, w) ) / √( λ₄ − λ₂² ).
3.4
Number of twinkles
Using the Rice formula for the random field Y(x, t) = ( W_x(x, t) − kx, W_{xx}(x, t) − k ), the expected number of twinkles is

E( TW(R, T) ) = ∫ E( |det Y'(x, t)| | Y(x, t) = 0 ) p_{Y(x,t)}(0) dx dt,   (35)

where the spectral moments

λ₄₀ = ∫ λ_x⁴ dν(λ_x, λ_t),  λ₂₀ = ∫ λ_x² dν(λ_x, λ_t)

are taken with respect to the spectral measure ν of the stationary random field W(x, t). The density in (35) satisfies

p_{Y(x,t)}(0) = ( λ₂₀ )^{−1/2} φ( kx ( λ₂₀ )^{−1/2} ) ( λ₄₀ )^{−1/2} φ( k ( λ₄₀ )^{−1/2} ).   (36)

On the other hand,

Y'(x, t) = ( W_{xx}(x,t) − k   W_{xt}(x,t) ; W_{xxx}(x,t)   W_{xxt}(x,t) ),

and a Gaussian regression of the entries of Y'(x, t) on the condition Y(x, t) = 0 expresses

E( |det Y'(x, t)| | Y(x, t) = 0 )   (37)

as a product of two factors of the form G(·, ·), with parameters determined by the spectral moments λ₂₀, λ₂₂, λ₃₁, λ₄₀ and λ₆₀. Summing up, (1/T) E( TW(R, T) ) admits a closed expression in terms of these spectral moments (setting λ₆ := λ₆₀), which coincides with the one in Longuet-Higgins [14] (part III, page 853).   (38)
3.5
Set r_i = √( x² + y² + h_i² ), i = 1, 2, as in the one-parameter case. The exact equations for the specular points are now

W_x = ( x / (x² + y²) ) ( h₂ r₁ − h₁ r₂ ) / ( r₂ − r₁ ),   (39)

W_y = ( y / (x² + y²) ) ( h₂ r₁ − h₁ r₂ ) / ( r₂ − r₁ ).   (40)

Under the Longuet-Higgins approximation, let us define:

Y(x, y) := ( W_x(x, y) − kx , W_y(x, y) − ky ).   (41)
Under very general conditions, for example on the spectral measure of {W(x, y) : x, y ∈ R}, the random field {Y(x, y) : x, y ∈ R} satisfies the conditions of Theorem 3, and we can write:

E( SP₂(Q) ) = ∫_Q E( |det Y'(x, y)| ) p_{Y(x,y)}(0) dx dy,   (42)

since for fixed (x, y) the random matrix Y'(x, y) and the random vector Y(x, y) are independent, so that the condition in the conditional expectation can be erased.
The density on the right-hand side of (42) has the expression

p_{Y(x,y)}(0) = p_{(W_x,W_y)}(kx, ky)
 = (2π)^{−1} ( λ₂₀λ₀₂ − λ₁₁² )^{−1/2} exp( − ( k² / ( 2( λ₂₀λ₀₂ − λ₁₁² ) ) ) ( λ₀₂ x² − 2λ₁₁ xy + λ₂₀ y² ) ).   (43)
To compute the expectation of the absolute value of the determinant on the right-hand side of (42), which does not depend on x, y, we use the method of [6]. Set

Δ := det Y'(x, y) = ( W_{xx} − k )( W_{yy} − k ) − W_{xy}².

We have

E(|Δ|) = (1/π) E( ∫_{−∞}^{+∞} ( 1 − cos(tΔ) ) / t² dt ).   (44)

Define

h(t) := E( exp( it [ ( W_{xx} − k )( W_{yy} − k ) − W_{xy}² ] ) ).

Then

E(|Δ|) = (2/π) ∫_0^{+∞} ( 1 − Re[h(t)] ) / t² dt.   (45)

To compute h(t), write Δ as a quadratic form in ζ = ( W_{xx}, W_{yy}, W_{xy} ): Δ = ζ^T A ζ − k( ζ₁ + ζ₂ ) + k², with

A = ( 0 1/2 0 ; 1/2 0 0 ; 0 0 −1 ),   Σ := ( λ₄₀ λ₂₂ λ₃₁ ; λ₂₂ λ₀₄ λ₁₃ ; λ₃₁ λ₁₃ λ₂₂ )

the covariance matrix of ζ.
Diagonalizing, with P orthogonal and δ₁, δ₂, δ₃ the eigenvalues of Σ^{1/2} A Σ^{1/2}, one gets, up to a non-random phase factor,

h(t) = E exp( it [ ( δ₁Z₁² − k(s₁₁ + s₂₁)Z₁ ) + ( δ₂Z₂² − k(s₁₂ + s₂₂)Z₂ ) + ( δ₃Z₃² − k(s₁₃ + s₂₃)Z₃ ) ] ),   (46)

where (Z₁, Z₂, Z₃) is standard normal and s_{ij} are the entries of Σ^{1/2} P^T.
One can check that if ξ is a standard normal variable and δ, β are real constants, δ > 0:

E( e^{it(δξ² + βξ)} ) = ( 1 + 4δ²t² )^{−1/4} exp( − ( β²t²/2 ) / ( 1 + 4δ²t² ) + i [ ψ − β²δt³ / ( 1 + 4δ²t² ) ] ),

where

ψ := (1/2) arctan(2δt),  0 < ψ < π/4.
Replacing in (46), we obtain for Re[h(t)] the formula:
Re[h(t)] = ∏_{j=1}^{3} [ d_j(t, k) / ( 1 + 4δ_j²t² )^{1/4} ] cos( Σ_{j=1}^{3} ( ψ_j(t) + k²t θ_j(t) ) ),   (47)

where, for j = 1, 2, 3:

d_j(t, k) = exp( − ( k²t²/2 ) ( s₁ⱼ + s₂ⱼ )² / ( 1 + 4δ_j²t² ) ),

ψ_j(t) = (1/2) arctan( 2δ_j t ),  0 < ψ_j < π/4,

θ_j(t) = − ( s₁ⱼ + s₂ⱼ )² δ_j t² / ( 1 + 4δ_j²t² ).
Introducing these expressions in (45) and using (43) we obtain a new formula
which has the form of a rather complicated integral. However, it is well adapted
to numerical evaluation.
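The identity behind (44)-(45), namely |λ| = (2/π) ∫₀^∞ (1 − cos(tλ))/t² dt, is also what makes the numerical evaluation stable. A minimal sketch checking it by direct quadrature (the test value 1.7, the truncation T and the grid size are arbitrary choices):

```python
import math

def abs_via_cf(lam, T=2000.0, n=400000):
    # (2/pi) * integral over (0, T] of (1 - cos(t*lam)) / t^2, midpoint rule.
    # The integrand is bounded near 0 (about lam^2/2) and the tail beyond T
    # contributes O(1/T), so the result approximates |lam|.
    step = T / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * step
        s += (1 - math.cos(t * lam)) / (t * t)
    return 2 / math.pi * s * step

print(abs_via_cf(1.7))
```

Replacing cos(tλ) by Re[h(t)] under the expectation gives exactly (45), so the same quadrature applies to E(|Δ|).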
On the other hand, this formula allows us to compute the equivalent as
k 0 of the expectation of the total number of specular points under the
Longuet-Higgins approximation. In fact, a first order expansion of the terms in
the integrand gives a somewhat more accurate result, that we state as a theorem:
Theorem 9

E( SP₂(R²) ) = m₂ / k² + O(1),   (48)
where

m₂ = ∫_0^{+∞} [ 1 − ∏_{j=1}^{3} ( 1 + 4δ_j²t² )^{−1/4} cos( Σ_{j=1}^{3} ψ_j(t) ) ] t^{−2} dt
  − 2^{3/2} ∫_0^{+∞} ∏_{j=1}^{3} ( A_j / (1 + A_j) )^{1/2} ( 1 − B₁B₂ − B₂B₃ − B₃B₁ )^{−1/2} t^{−2} dt,   (49)

where

A_j = A_j(t) = ( 1 + 4δ_j²t² )^{−1/2},  B_j = B_j(t) = √( (1 − A_j)/(1 + A_j) ).
Theorem 10 Assume, moreover, that the covariance of the random field is δ-dependent, that is, vanishes whenever ‖x − y‖ > δ.
Then, for k small enough:

Var( SP₂(R²) ) ≥ L / k²,

where L is a positive constant depending upon the law of the random field.
A direct consequence of Theorems 9 and 10 is the following:
A direct consequence of Theorems 9 and 10 is the following:
Corollary 2 Under the same hypotheses as in Theorem 10, for k small enough, one has:

√( Var( SP₂(R²) ) ) / E( SP₂(R²) ) ≥ L₁ k,

where L₁ is a new positive constant.
Proof of Theorem 10. For short, let us denote T = SP₂(R²). We have:

Var(T) = E( T(T − 1) ) + E(T) − [E(T)]²   (51)

and

E( T(T − 1) ) = ∫∫_{‖x−y‖>δ} ⋯ dx dy + ∫∫_{‖x−y‖≤δ} ⋯ dx dy = J₁ + J₂.   (52)

As in the one-dimensional case, δ-dependence allows factorizing the integrand in J₁, which yields

J₁ = m₂²/k⁴ + O(1/k²).   (53)
For J₂ one needs to control the integrand near the diagonal. On the one hand, the non-degeneracy hypothesis gives a uniform bound for the density p_{Y(x),Y(y)}(0, 0) over 0 < ‖y − x‖ ≤ δ.   (54)
On the other hand, conditionally on Y(x) = Y(y) = 0, a Taylor expansion of Y along the segment [x, y] shows that

E( |det Y'(x)| |det Y'(y)| | Y(x) = 0, Y(y) = 0 ) ≤ ‖y − x‖² E( sup_{s∈[x,y]} ‖Y''(s)‖² | ⋯ ) = ‖z‖² E( sup_{s∈[0,z]} ‖Y''(s)‖² | ⋯ ),
where the last equality is again a consequence of the stationarity of the random
field {W (x) : x R2 }.
At this point, we perform a Gaussian regression on the condition. For the condition, use again the Taylor expansion, the non-degeneracy hypothesis and the independence of W(0) and W'(0). Then, use the finiteness of the moments of the supremum of bounded Gaussian processes (see for example [4], Ch. 2) and take into account that ‖z‖ ≤ δ, to get the inequality:
E( |det Y'(x)| |det Y'(y)| | Y(x) = 0, Y(y) = 0 ) ≤ C₄ ‖z‖² ( 1 + k‖x‖ ),   (55)

where C₄ is a positive constant. Summing up, we have the following bound for J₂:

J₂ ≤ C₁ C₄ δ² ∫_{R²} ( 1 + k‖x‖ ) exp( −C₂ k² ( ‖x‖ − C₃ )² ) dx
  = C₁ C₄ δ² 2π ∫_0^{+∞} ( 1 + kρ ) exp( −C₂ k² ( ρ − C₃ )² ) ρ dρ = O(1/k²).   (56)
Let us consider a modeling of the sea W (x, y, t) as a function of two space variables and one time variable. Usual models are centered Gaussian stationary
with a particular form of the spectral measure that we discuss briefly below.
We denote the covariance by (x, y, t) = E(W (0, 0, 0)W (x, y, t)).
In practice, one is frequently confronted with the following situation: several pictures of the sea, taken over a time interval [0, T], are stored, and some properties or magnitudes are observed. If the time T and the number of pictures are large, and if the process is ergodic in time, the frequency of pictures satisfying a certain property will converge to the probability of that property happening at a fixed time.
Let us illustrate this with the angle of the normal to the level curve at a point chosen at random. We consider first the number of crossings of a level u by the process W(·, y, t), for fixed t and y, defined as N^{W(·,y,t)}_{[0,M₁]}(u), together with its time average

(1/T) ∫_0^T N^{W(·,y,t)}_{[0,M₁]}(u) dt.   (57)

If the ergodicity assumption in time holds true, we can conclude that a.s.

(1/T) ∫_0^T N^{W(·,y,t)}_{[0,M₁]}(u) dt → E( N^{W(·,0,0)}_{[0,M₁]}(u) ) = (M₁/π) √( λ₂₀₀/λ₀₀₀ ) e^{ −u²/(2λ₀₀₀) },

where

λ_{abc} = ∫_{R³} λ_x^a λ_y^b λ_t^c dν( λ_x, λ_y, λ_t ),

ν being the spectral measure of W. Integrating in y:

E( ∫_0^{M₂} N^{W(·,y,0)}_{[0,M₁]}(u) dy ) = σ₂(Q) (1/π) √( λ₂₀₀/λ₀₀₀ ) e^{ −u²/(2λ₀₀₀) },   (58)
where Q = [0, M₁] × [0, M₂]. We have a similar formula when we consider sections of the set [0, M₁] × [0, M₂] in the other direction. In fact, (58) can be generalized to obtain the Palm distribution of the angle θ of the normal to the level curve.
Set h_{θ₁,θ₂} = 1I_{[θ₁,θ₂]}, and for θ₁ < θ₂ define

F(θ₂) − F(θ₁) := E( σ₁( { x ∈ Q : W(x, 0) = u ; θ₁ ≤ θ(x) ≤ θ₂ } ) )   (59)
 = σ₂(Q) E( h_{θ₁,θ₂}( ∂_yW/∂_xW ) ( (∂_xW)² + (∂_yW)² )^{1/2} ) e^{ −u²/(2λ₀₀₀) } / √( 2πλ₀₀₀ ).

Denoting Δ := λ₂₀₀λ₀₂₀ − λ₁₁₀², and assuming σ₂(Q) = 1 for ease of notation, we readily obtain

F(θ₂) − F(θ₁) = ( e^{ −u²/(2λ₀₀₀) } / ( (2π)^{3/2} Δ^{1/2} λ₀₀₀^{1/2} ) ) ∫_{R²} h_{θ₁,θ₂}(θ) √( x² + y² ) e^{ −(1/(2Δ)) ( λ₀₂₀ x² − 2λ₁₁₀ xy + λ₂₀₀ y² ) } dx dy
 = ( e^{ −u²/(2λ₀₀₀) } / ( (2π)^{3/2} ( Δ₊Δ₋ )^{1/2} λ₀₀₀^{1/2} ) ) ∫∫ ρ² exp( −( ρ²/(2Δ₊Δ₋) ) ( Δ₊ cos²(θ − β) + Δ₋ sin²(θ − β) ) ) dρ dθ,

where Δ₊ ≥ Δ₋ are the eigenvalues of the covariance matrix of the random vector ( ∂_xW(0, 0, 0), ∂_yW(0, 0, 0) ) and β is the angle of the eigenvector associated with Δ₊. Remarking that the exponent in the integrand can be written as ( ρ²/(2Δ₋) ) ( 1 − κ² sin²(θ − β) ) with κ² := 1 − Δ₋/Δ₊, and computing the integral in ρ, one obtains

F(θ₂) − F(θ₁) = (const) ∫_{θ₁}^{θ₂} ( 1 − κ² sin²(θ − β) )^{−1/2} dθ.

From this relation we get the density g(ψ) of the Palm distribution, simply by dividing by the total mass:

g(ψ) = ( 1 − κ² sin²(ψ − β) )^{−1/2} / ∫_0^{2π} ( 1 − κ² sin²(θ − β) )^{−1/2} dθ = ( 1 − κ² sin²(ψ − β) )^{−1/2} / ( 4K(κ²) ).   (60)
Here K is the complete elliptic integral of the first kind. This density characterizes the distribution of the angle of the normal at a point chosen at random
on the level curve.
In the case of a random field which is isotropic in (x, y), we have λ₂₀₀ = λ₀₂₀ and moreover λ₁₁₀ = 0, so that g turns out to be the uniform density over the circle (Longuet-Higgins states that over the contour the distribution of the angle is uniform; cf. [15], p. 348).
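The density in (60) is easy to evaluate numerically. A minimal sketch (β is set to 0 and the value of κ² is an arbitrary illustration; K is computed by the arithmetic-geometric mean, a standard method, not taken from the text):

```python
import math

def K(m):
    # Complete elliptic integral of the first kind, parameter m = kappa^2,
    # via the arithmetic-geometric mean: K(m) = pi / (2 * AGM(1, sqrt(1 - m))).
    a, b = 1.0, math.sqrt(1 - m)
    for _ in range(60):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return math.pi / (2 * a)

def palm_density(psi, kappa2):
    # g(psi) = (1 - kappa^2 sin^2 psi)^(-1/2) / (4 K(kappa^2)), taking beta = 0
    return 1.0 / (math.sqrt(1 - kappa2 * math.sin(psi) ** 2) * 4 * K(kappa2))

# Sanity check: the density integrates to 1 over [0, 2*pi)
kappa2, n = 0.75, 20000
total = sum(palm_density((i + 0.5) * 2 * math.pi / n, kappa2)
            for i in range(n)) * 2 * math.pi / n
print(total)
# Isotropic case kappa = 0: K(0) = pi/2, so g is uniform, equal to 1/(2*pi)
print(palm_density(1.0, 0.0), 1 / (2 * math.pi))
```

The second check reproduces the isotropic case discussed above: for κ = 0 the angle is uniform on the circle.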
Let now W = {W(x, t) : t ∈ R₊, x = (x, y) ∈ R²} be a stationary zero-mean Gaussian random field modeling the height of the sea waves. It has the following spectral representation:

W(x, y, t) = ∫ e^{ i( λ₁x + λ₂y + ωt ) } f( λ₁, λ₂, ω ) dM( λ₁, λ₂, ω ).
Consider a functional of the form Z(t) := ∫_Q H(x, t) dx, where H(x, t) = H( W(x, t), ∇W(x, t) ), ∇W = (W_x, W_y) denotes the gradient in the space variables, and H is some measurable function such that the integral is well-defined. This is exactly our case in (59). The process {Z(t) : t ∈ R} is strictly stationary and, in our case, has a finite mean and is Riemann-integrable. By the Birkhoff-Khintchine ergodic theorem ([8], page 151), a.s. as T → +∞,

(1/T) ∫_0^T Z(s) ds → E_B( Z(0) ),

where B denotes the σ-algebra of time-invariant events.
1
t
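The ergodic averaging above can be illustrated on a toy discrete-time stand-in for Z — a stationary Gaussian AR(1) sequence (our choice, not from the paper), whose time average converges to the ensemble mean 0:

```python
import numpy as np

# Stationary, ergodic Gaussian sequence; time average -> ensemble mean (= 0).
rng = np.random.default_rng(0)
phi, n = 0.8, 200_000
eps = rng.normal(size=n)
z = np.empty(n)
z[0] = eps[0]
for k in range(1, n):
    # AR(1) recursion scaled so that the stationary variance is 1
    z[k] = phi * z[k - 1] + np.sqrt(1.0 - phi ** 2) * eps[k]
time_average = z.mean()
```

With these parameters the standard deviation of the time average is about 0.007, so it is close to 0 with high probability.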
In order to compute second moments, we use the Rice formula for integrals over level sets (cf. Theorem 6), applied to the vector-valued random field

X(x₁, x₂, s₁, s₂) = (W(x₁, s₁), W(x₂, s₂))ᵀ,   0 ≤ s₁, s₂ ≤ t.

The level set can be written as

C_{Q²}(u, u) = {(x₁, x₂) ∈ Q × Q : X(x₁, x₂, s₁, s₂) = (u, u)}.

So, using stationarity in time, we get

Var Z(t) = (2/t) ∫₀^t (1 − s/t) I(u, s) ds,

where

I(u, s) = ∫_{Q²} E[ H(x₁, 0) H(x₂, s) ‖∇W(x₁, 0)‖ ‖∇W(x₂, s)‖ | W(x₁, 0) = u ; W(x₂, s) = u ] p_{W(x₁,0),W(x₂,s)}(u, u) dx₁ dx₂ − ( E[ H(u, ∇W(0, 0)) ‖∇W(0, 0)‖ ] p_{W(0,0)}(u) )².
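The reduction from the double time integral to the single weighted integral with the factor (1 − s/t) is the standard stationarity identity. A numerical check with a toy covariance C(s) = e^{−s} (our choice, purely illustrative):

```python
import numpy as np
from scipy.integrate import quad, dblquad

# For a stationary process,
#   Var((1/t) * integral_0^t Z) = (1/t^2) * double integral of C(|s1 - s2|)
#                               = (2/t) * integral_0^t (1 - s/t) C(s) ds.
t = 3.0
C = lambda s: np.exp(-abs(s))
double, _ = dblquad(lambda s1, s2: C(s1 - s2) / t ** 2,
                    0.0, t, lambda x: 0.0, lambda x: t)
single, _ = quad(lambda s: (2.0 / t) * (1.0 - s / t) * C(s), 0.0, t)
```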
Assuming that the given random field has a finite dependence range in time, that is, Γ(x, y, t) = 0 whenever t > τ, we readily obtain that t · Var Z(t) converges to a finite limit σ² as t → +∞. Using now a variant of the Hoeffding–Robbins theorem [11] for sums of dependent random variables, we get the CLT:

√t ( (1/t) ∫₀^t Z(s) ds − E[Z(0)] ) ⇒ N(0, σ²).
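The Hoeffding–Robbins theorem concerns sums of m-dependent variables. A minimal illustration (our toy example, not from the paper) is a 1-dependent moving-average sequence, whose CLT variance is γ₀ + 2γ₁:

```python
import numpy as np

# 1-dependent sequence X_i = e_i + e_{i+1}, e_i i.i.d. N(0,1).
# CLT variance: gamma_0 + 2*gamma_1 = 2 + 2*1 = 4; exact Var(S_n)/n = (4n-2)/n.
rng = np.random.default_rng(1)
blocks, n = 4000, 1000
e = rng.normal(size=(blocks, n + 1))
x = e[:, :-1] + e[:, 1:]            # independent 1-dependent rows
s = x.sum(axis=1) / np.sqrt(n)      # normalized block sums
empirical_var = s.var()
theoretical_var = (4.0 * n - 2.0) / n
```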
Numerical computations

The result with the exact formula is around 2·10⁻² larger, but the difference is almost hidden by the precision of the computation of the integral. If we consider the case (90, 110, 3), the results are respectively 136.81 and 137.7. In the case (100, 300, 3), the results differ significantly, and Figure 1 displays the densities (32) and (10).
Figure 2: Density of the Palm distribution of the angle of the normal to the level curve in the case γ = 0.5 and ω = π/4.
and the matrix of Section 3.5 is
3 0
11 0
0 3
9
= 104 3
0
90
Figure 4: Intensity function of the specular points for the Jonswap spectrum
In this section we follow the article by Berry and Dennis [6]. Like these authors, we are interested in dislocations of wavefronts. These are lines in space, or points in the plane, where the phase χ of the complex scalar wave ψ(x, t) = ρ(x, t)e^{iχ(x,t)} is undefined (x = (x₁, x₂) is a two-dimensional space variable). With respect to light they are lines of darkness; with respect to sound, threads of silence. They occur at the points where

ψ(x, t) = 0.
We assume an isotropic Gaussian model. This means that we consider the wavefront as an isotropic Gaussian field

ψ(x, t) = ∫_{R²} e^{i(⟨k, x⟩ − ω(|k|)t)} (Π(|k|)/|k|)^{1/2} dW(k),

whose real and imaginary parts at fixed time, say ζ₁ and ζ₂, can be written as

ζ₁(x) = ∫_{R²} cos(⟨k, x⟩) (Π(k)/k)^{1/2} dW₁(k) − ∫_{R²} sin(⟨k, x⟩) (Π(k)/k)^{1/2} dW₂(k),      (61)

ζ₂(x) = ∫_{R²} cos(⟨k, x⟩) (Π(k)/k)^{1/2} dW₂(k) + ∫_{R²} sin(⟨k, x⟩) (Π(k)/k)^{1/2} dW₁(k).      (62)

The covariance of each coordinate is

E[ζ₁(x)ζ₁(x′)] = ∫₀^∞ J₀(k|x − x′|) Π(k) dk.      (63)

Moreover, in the three-dimensional case, writing

ψ(r) = ∫ exp(i⟨k, r⟩) (Π(k)/k²)^{1/2} dW(k),

one has

E[ζ₁(r₁)ζ₁(r₂)] = 4π ∫₀^∞ ( sin(k|r₁ − r₂|) / (k|r₁ − r₂|) ) Π(k) dk.      (64)

The same formula holds true for the process ζ₂, and moreover E[ζ₁(r₁)ζ₂(r₂)] = 0 for any r₁, r₂, showing that the two coordinates are independent Gaussian fields.
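The Bessel kernel J₀ in (63) comes from averaging a plane wave over directions in the plane. This is the identity J₀(z) = (1/2π)∫₀^{2π} cos(z cos θ) dθ, which can be checked numerically (the values of k and r below are ours, purely illustrative):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

# Isotropy in the plane: the angular average of cos(k * r * cos(theta))
# equals the Bessel kernel J0(k*r) appearing in the covariance (63).
k, r = 2.0, 1.7
avg, _ = quad(lambda th: np.cos(k * r * np.cos(th)) / (2.0 * np.pi),
              0.0, 2.0 * np.pi)
bessel = j0(k * r)
```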
6.1 Mean length of the dislocation curves

The computation of the mean length of the dislocation curves per unit volume reduces, via the Rice formula for level sets, to the evaluation of

E[ (det(Z′(x)Z′(x)ᵀ))^{1/2} ],

where Z := (ζ₁, ζ₂). Again,

E[ (det(Z′(x)Z′(x)ᵀ))^{1/2} ] = σ² E(V),

where V is the surface area of the parallelogram generated by two standard Gaussian vectors in R³. A similar method to compute the expectation of this random area gives:

E(V) = E( (χ²(3))^{1/2} ) E( (χ²(2))^{1/2} ) = (4/√(2π)) · (√(2π)/2) = 2,

leading eventually to the closed-form value of the mean length d₃.
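The factorization behind E(V) = 2 is that ‖X × Y‖ = ‖X‖ · ‖Y⊥‖, the product of independent χ(3) and χ(2) variables. A Monte Carlo check (sample size and seed are ours):

```python
import numpy as np

# E(V) for the parallelogram spanned by two independent standard Gaussian
# vectors in R^3: E||X x Y|| = E(chi_3) * E(chi_2) = (4/sqrt(2*pi)) * (sqrt(2*pi)/2) = 2.
rng = np.random.default_rng(2)
n = 200_000
X = rng.normal(size=(n, 3))
Y = rng.normal(size=(n, 3))
area = np.linalg.norm(np.cross(X, Y), axis=1)
mc_mean = area.mean()
exact = (4.0 / np.sqrt(2.0 * np.pi)) * (np.sqrt(2.0 * np.pi) / 2.0)  # = 2
```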
6.2 Variance

The second factorial moment of the number N of dislocation points gives

Var(N) = ∫_{S×S} A(s₁, s₂) ds₁ ds₂ + d₂ − d₂²,

where d₂ := E(N) and A is the two-point function considered next.
Taking into account that the law of the random field is invariant under translations and orthogonal transformations of R², we have

A(s₁, s₂) = A((0, 0), (r, 0)) =: A(r),   with r = ‖s₁ − s₂‖.

The Rice function A(r) has two intuitive interpretations. First, it can be viewed as

A(r) = lim_{ε→0} (1/(π²ε⁴)) E[ N(B((0, 0), ε)) · N(B((r, 0), ε)) ].

Second, by the Rice formula for the second factorial moment,

A(r) = E[ |det Z′(0, 0)| |det Z′(r, 0)| | Z(0, 0) = Z(r, 0) = 0₂ ] p_{Z(0,0),Z(r,0)}(0₄),      (65)
where

p_{Z(0,0),Z(r,0)}(0₄) = 1 / ( (2π)² (1 − ρ²(r)) ),   with ρ(r) := ∫₀^∞ J₀(kr) Π(k) dk.
We use now the same device as above to compute the conditional expectation of the modulus of the product of determinants; that is, we write

|w| = (1/π) ∫_{−∞}^{+∞} (1 − cos(wt)) t⁻² dt.      (66)
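Identity (66) is equivalent to ∫₀^∞ (1 − cos(wt))/t² dt = π|w|/2, which can be checked numerically (the value of w and the truncation point are ours; the tail beyond the truncation is bounded by 2/500):

```python
import numpy as np
from scipy.integrate import quad

# Check the one-sided version of (66): integral over (0, inf) equals pi*|w|/2.
w = 2.5
f = lambda t: (1.0 - np.cos(w * t)) / t ** 2
head, _ = quad(f, 0.0, 1.0)               # integrand -> w^2/2 as t -> 0, no singularity
tail, _ = quad(f, 1.0, 500.0, limit=2000)  # oscillatory but absolutely convergent
one_sided = head + tail
```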
Set

C := ρ(r),   E := −ρ′(r),   H := E/r,   F := −ρ″(r),   F₀ := −ρ″(0).

The regression formulas imply that the conditional variance matrix of the vector

W = ( ∂_xζ₁(0), ∂_xζ₁(r, 0), ∂_yζ₁(0), ∂_yζ₁(r, 0), ∂_xζ₂(0), ∂_xζ₂(r, 0), ∂_yζ₂(0), ∂_yζ₂(r, 0) ),

given Z(0, 0) = Z(r, 0) = 0₂, is

Σ = Diag(A, B, A, B),

with

A = [ F₀ − E²/(1 − C²)     F − E²C/(1 − C²) ]
    [ F − E²C/(1 − C²)     F₀ − E²/(1 − C²) ],

B = [ F₀   H ]
    [ H    F₀ ].
By (66), applied twice,

E|w₁w₂| = (1/π²) ∫_{R²} (dt₁ dt₂ / (t₁² t₂²)) [ 1 − ½T(t₁, 0) − ½T(−t₁, 0) − ½T(0, t₂) − ½T(0, −t₂)
          + ¼T(t₁, t₂) + ¼T(−t₁, t₂) + ¼T(t₁, −t₂) + ¼T(−t₁, −t₂) ],      (67)

where

T(t₁, t₂) := E[ exp(i(w₁t₁ + w₂t₂)) ],

with

w₁ = ∂_xζ₁(0)∂_yζ₂(0) − ∂_yζ₁(0)∂_xζ₂(0) = W₁W₇ − W₃W₅,
w₂ = ∂_xζ₁(r, 0)∂_yζ₂(r, 0) − ∂_yζ₁(r, 0)∂_xζ₂(r, 0) = W₂W₈ − W₄W₆.
Writing w₁t₁ + w₂t₂ = WᵀHW, the matrix H has the block anti-diagonal form

H = [ 0   0   0   D ]
    [ 0   0  −D   0 ]
    [ 0  −D   0   0 ]
    [ D   0   0   0 ],      D := (1/2) [ t₁  0 ]
                                       [ 0   t₂ ].

A standard diagonalization argument shows that

WᵀHW = Σ_{j=1}^{8} λⱼ ξⱼ²,

where the ξⱼ are independent standard normal random variables and the λⱼ are the eigenvalues of Σ^{1/2}HΣ^{1/2}. Using the characteristic function of the χ²(1) distribution:

T(t₁, t₂) = E[ exp(iWᵀHW) ] = Π_{j=1}^{8} (1 − 2iλⱼ)^{−1/2}.      (68)
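The χ²(1) characteristic function used in (68), E exp(iλZ²) = (1 − 2iλ)^{−1/2} (principal branch), can be verified by quadrature against the standard normal density (the value of λ and grid are ours):

```python
import numpy as np

# E exp(i*lam*Z^2) for Z ~ N(0,1), by trapezoid-like summation on a fine grid.
lam = 0.3
z = np.linspace(-12.0, 12.0, 200_001)
dz = z[1] - z[0]
phi = np.exp(-z ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
numeric = np.sum(np.exp(1j * lam * z ** 2) * phi) * dz  # integrand ~ 0 at endpoints
analytic = (1.0 - 2.0j * lam) ** -0.5
```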
Clearly

Σ^{1/2} = Diag(A^{1/2}, B^{1/2}, A^{1/2}, B^{1/2}),

and Σ^{1/2}HΣ^{1/2} has the same block anti-diagonal structure as H, with the blocks M := A^{1/2}DB^{1/2} and Mᵀ in the place of ±D. Computing the eigenvalues, one obtains

T(t₁, t₂) = [ 1 + 4 tr(DBDA) + 16 det(DBDA) ]^{−1},      (69)

where

DBDA = (1/4) [ t₁²F₀α + t₁t₂Hβ     t₁²F₀β + t₁t₂Hα ]
             [ t₁t₂Hα + t₂²F₀β     t₁t₂Hβ + t₂²F₀α ],      (70)

with the shorthand

α := F₀ − E²/(1 − C²),      β := F − E²C/(1 − C²).
Expanding the trace and the determinant, this gives

T(t₁, t₂) = E[ exp(iWᵀHW) ]
= [ 1 + (t₁² + t₂²)F₀α + 2t₁t₂Hβ + t₁²t₂²(F₀² − H²)(α² − β²) ]^{−1}.      (71)
Replacing in (67) and making the change of variables tᵢ → tᵢ(F₀α)^{−1/2}, we obtain

A(r) = A₁ ∫₀^∞ (dt₁/t₁²) ∫₀^∞ (dt₂/t₂²) [ 1 − 1/(1 + t₁²) − 1/(1 + t₂²)
       + ½ (1 + t₁² + t₂² − 2A₂t₁t₂ + t₁²t₂²Z)^{−1} + ½ (1 + t₁² + t₂² + 2A₂t₁t₂ + t₁²t₂²Z)^{−1} ],      (72)

where A₁ collects the constant factors and

A₂ := Hβ/(F₀α) = H(F − E²C/(1 − C²)) / ( F₀(F₀ − E²/(1 − C²)) ),

Z := ( (F₀² − H²)/F₀² ) ( 1 − β²/α² ).

In this form, and up to a sign change, this result is equivalent to Formula (4.43) of [6] (note that A₂² = Y in [6]).
In order to compute the integral (72), first we obtain

∫₀^∞ (1/t₂²) ( 1 − 1/(1 + t₂²) ) dt₂ = ∫₀^∞ dt₂/(1 + t₂²) = π/2.
We split the remaining term into two integrals. For the first one,

I₁ := ½ ∫₀^∞ (1/t₂²) [ 1/(1 + t₁²) − (1 + t₁² + t₂² − 2A₂t₁t₂ + t₁²t₂²Z)^{−1} ] dt₂
    = (1/(2(1 + t₁²))) ∫₀^∞ (1/t₂²) (t₂² − 2Z₁t₁t₂)/(t₂² − 2Z₁t₁t₂ + Z₂) dt₂,

where Z₁ := A₂/(1 + Zt₁²) and Z₂ := (1 + t₁²)/(1 + Zt₁²). Similarly, for the second integral we get

I₂ = (1/(2(1 + t₁²))) ∫₀^∞ (1/t₂²) (t₂² + 2Z₁t₁t₂)/(t₂² + 2Z₁t₁t₂ + Z₂) dt₂,

so that

I₁ + I₂ = (1/(2(1 + t₁²))) ∫₀^∞ (1/t₂²) [ (t₂² − 2Z₁t₁t₂)/(t₂² − 2Z₁t₁t₂ + Z₂) + (t₂² + 2Z₁t₁t₂)/(t₂² + 2Z₁t₁t₂ + Z₂) ] dt₂
        = (1/(1 + t₁²)) ∫₀^∞ (t₂² + Δ)/(t₂⁴ + St₂² + P) dt₂.
In the last line we have combined the two fractions; the resulting integral is evaluated by the method of residues. In fact, if the polynomial X² + SX + P, with P > 0, has no root in [0, +∞), then

∫₀^∞ (t² + Δ)/(t⁴ + St² + P) dt = π (Δ + √P) / ( 2√P (S + 2√P)^{1/2} ).

In our case Δ = Z₂ − 4Z₁²t₁², S = 2(Z₂ − 2Z₁²t₁²) and P = Z₂².
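The residue evaluation above follows from the partial-fraction identities ∫₀^∞ t²/((t²+a²)(t²+b²)) dt = π/(2(a+b)) and ∫₀^∞ dt/((t²+a²)(t²+b²)) = π/(2ab(a+b)), with ab = √P and (a+b)² = S + 2√P. A numerical check (the values of Δ, S, P are ours):

```python
import numpy as np
from scipy.integrate import quad

# Check: integral_0^inf (t^2 + Delta)/(t^4 + S*t^2 + P) dt
#        = pi*(Delta + sqrt(P)) / (2*sqrt(P)*sqrt(S + 2*sqrt(P))).
Delta, S, P = 0.5, 1.0, 4.0
numeric, _ = quad(lambda t: (t ** 2 + Delta) / (t ** 4 + S * t ** 2 + P),
                  0.0, np.inf)
closed = np.pi * (Delta + np.sqrt(P)) / (2.0 * np.sqrt(P) * np.sqrt(S + 2.0 * np.sqrt(P)))
```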
Therefore, using this formula with the above values of Δ, S and P,

I₁ + I₂ = π (Z₂ − 2Z₁²t₁²) / ( 2 (1 + t₁²) Z₂ (Z₂ − Z₁²t₁²)^{1/2} ),

and we get

A(r) = A₁ ∫₀^∞ (dt₁/t₁²) [ π/2 − π (Z₂ − 2Z₁²t₁²) / ( 2 (1 + t₁²) Z₂ (Z₂ − Z₁²t₁²)^{1/2} ) ],

where the constant A₁ collects the factors coming from (65), (66) and the changes of variables.
Acknowledgement
This work received financial support from the European Marie Curie Network SEAMOCS.
References
[1] R. J. Adler, The Geometry of Random Fields, Wiley,(1981).
[2] R. J. Adler and J. Taylor, Random Fields and Geometry. Springer, (2007).
[3] J-M. Azaïs, J. León and J. Ortega, Geometrical Characteristics of Gaussian Sea Waves, Journal of Applied Probability, 42, 1-19 (2005).
[4] J-M. Azaïs and M. Wschebor, Level Sets and Extrema of Random Processes and Fields, Wiley (2009).
[5] J-M. Azaïs and M. Wschebor, On the Distribution of the Maximum of a Gaussian Field with d Parameters, Annals of Applied Probability, 15 (1A), 254-278 (2005).
[6] M.V. Berry and M.R. Dennis, Phase Singularities in Isotropic Random Waves, Proc. R. Soc. Lond. A, 456, 2059-2079 (2000).
[7] E. Cabaña, Esperanzas de Integrales sobre Conjuntos de Nivel Aleatorios, in: Actas del 2º Congreso Latinoamericano de Probabilidad y Estadística Matemática (in Spanish), Sociedad Bernoulli, sección de Latinoamérica, Caracas, 65-82 (1985).
[8] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes, Wiley (1967).
[9] H. Federer, Geometric Measure Theory, Springer (1969).
[10] E. Flores and J.R. León, Random Seas, Level Sets and Applications, Preprint (2009).
[11] W. Hoeffding and H. Robbins, The Central Limit Theorem for Dependent Random Variables, Duke Math. J., 15, 773-780 (1948).
[12] M. Kratz and J.R. León, Level Curves Crossings and Applications for Gaussian Models, Extremes, DOI 10.1007/s10687-009-0090-x (2009).
[13] P. Krée and C. Soize, Mécanique Aléatoire, Dunod (1983).
[14] M. S. Longuet-Higgins, Reflection and refraction at a random surface. I, II,
III, Journal of the Optical Society of America, vol. 50, No.9, 838-856 (1960).
[15] M. S. Longuet-Higgins, The statistical geometry of random surfaces. Proc.
Symp. Appl. Math., Vol. XIII, AMS Providence R.I., 105-143 (1962).
[16] D. Nualart and M. Wschebor, Intégration par parties dans l'espace de Wiener et approximation du temps local, Probab. Theory Related Fields, 90, 83-109 (1991).
[17] S.O. Rice, Mathematical Analysis of Random Noise, Bell System Tech. J., 23, 282-332 (1944); 24, 45-156 (1945).
[18] WAFO-group . WAFO - A Matlab Toolbox for Analysis of Random Waves
and Loads - A Tutorial. Math. Stat., Center for Math. Sci., Lund Univ., Lund,
Sweden. ISBN XXXX, URL http://www.maths.lth.se/matstat/wafo.(2000)
[19] M. Wschebor, Surfaces Aléatoires, Lecture Notes in Math. 1147, Springer (1985).
[20] U. Zähle, A General Rice Formula, Palm Measures, and Horizontal-Window Conditioning for Random Fields, Stochastic Process. Appl., 17, 265-283 (1984).
February 2, 2008
Let X = {X(t) : t ∈ S} be a real-valued random field defined on some parameter set S, and let M := sup_{t∈S} X(t) be its supremum.
The study of the probability distribution of the random variable M, i.e. the function FM(u) := P{M ≤ u}, is a classical problem in probability theory. When the process is Gaussian,
general inequalities allow one to give bounds on 1 − FM(u) = P{M > u}, as well as asymptotic results for u → +∞. A partial account of this well-established theory, since the founding paper by Landau and Shepp [20], should contain - among a long list of contributors - the works of Marcus and Shepp [24], Sudakov and Tsirelson [30], Borell [13], [14], Fernique [17], Ledoux and Talagrand [22], Berman [11], [12], Adler [2], Talagrand [32] and Ledoux [21].
During the last fifteen years, several methods have been introduced with the aim of obtaining more precise results than those arising from the classical theory, at least under certain restrictions on the process X, which are interesting from the point of view of the mathematical
theory as well as in many significant applications. These restrictions include the requirement that the domain S have a certain finite-dimensional geometrical structure and that the paths of the random field have a certain regularity.
Some examples of these contributions are the double sum method by Piterbarg [28]; the Euler-Poincaré characteristic (EPC) approximation, Taylor, Takemura and Adler [34], Adler and Taylor [3]; the tube method, Sun [31]; and the well-known Rice method, revisited by Azaïs and Delmas [5] and Azaïs and Wschebor [6]. See also Rychlik [29] for numerical computations.
The results in the present paper are based upon Theorem 3, which is an extension of Theorem 3.1 in Azaïs and Wschebor [8] and allows one to express the density pM of FM by means of a general formula. Even though this is an exact formula, it is only implicit as an expression for the density, since the relevant random variable M appears in the right-hand side. However, it can be usefully employed for various purposes.
First, one can use Theorem 3 to obtain bounds for pM(u), and thus for P{M > u}, for every u, by replacing some indicator function in (4) by the condition that the normal derivative is extended outward (see below for the precise meaning). This will be called the direct method. Of course, this is interesting whenever the resulting expression can be handled, which is the actual situation when the random field has a law that is stationary and isotropic. Our method relies on the application of some known results on the spectrum of random matrices.
Second, one can use Theorem 3 to study the asymptotics of P{M > u} as u → +∞. More precisely, one wants to write, whenever it is possible,

P{M > u} = A(u) exp( −u²/(2σ²) ) + B(u),      (1)
method.
In all cases, the second order approximation for the direct method provides an upper bound
for the one arising from the EPC method.
Our proofs use almost no differential geometry, except for some elementary notions in Euclidean space. Let us remark also that we have separated the conditions on the law of the
process from the conditions on the geometry of the parameter set.
Third, Theorem 3 and related results in this paper in fact refer to the density pM of the maximum. On integration, they immediately imply a certain number of properties of the probability distribution FM, such as the behaviour of the tail as u → +∞.
Theorem 3 implies that FM has a density, and we have an implicit expression for it. The proof of this fact given here appears to be simpler than previous ones (see Azaïs and Wschebor [8]), even in the case in which the process has a one-dimensional parameter (Azaïs and Wschebor [7]). Let us remark that Theorem 3 holds true for non-Gaussian processes, under appropriate conditions allowing the application of the Rice formula.
Our method can be exploited to study higher order differentiability of FM (as it has been
done in [7] for one-parameter processes) but we will not pursue this subject here.
This paper is organized as follows. Section 2 includes an extension of the Rice formula, which gives an integral expression for the expectation of the weighted number of roots of a random system of d equations with d real unknowns. A complete proof of this formula, in a form adapted to our needs in this paper, can be found in [9]. There is an extensive literature on Rice formulae in various contexts (see for example Belyaev [10], Cramér-Leadbetter [15], Marcus [23], Adler [1], Wschebor [35]).
In Section 3, we obtain the exact expression for the distribution of the maximum as a consequence of the Rice-like formula of the previous section. This immediately implies the existence of the density and gives the implicit formula for it. The proof avoids unnecessary technicalities that we have used in previous work, even in cases that are much simpler than the ones considered here.
In Section 4, we compute (Theorem 4) the first order approximation in the direct method
for stationary isotropic processes defined on a polyhedron, from which a new upper bound for
P{M > u} for all real u follows.
In Section 5, we consider second-order approximations, both for the direct method and for the EPC approximation method. This is the content of Theorems 5, 6 and 7.
Section 6 contains some examples.
Notation: φ(·) and Φ(·) denote, respectively, the density and the cumulative distribution of the standard normal law,

Φ(x) = ∫_{−∞}^x φ(y) dy.

Assume that the random vectors ξ, η have a joint Gaussian distribution, where η has values in some finite-dimensional Euclidean space. When it is well defined,

E( f(ξ) / η = x )

is the version of the conditional expectation obtained using Gaussian regression.

Eu := {t ∈ S : X(t) > u} is the excursion set above u of the function X(·), and Au := {M ≤ u} is the event that the maximum is not larger than u.

⟨·, ·⟩ and ‖·‖ denote, respectively, the inner product and the norm in a finite-dimensional real Euclidean space; λd is the Lebesgue measure on R^d; S^{d−1} is the unit sphere; A^c is the complement of the set A. If M is a real square matrix, M ≻ 0 denotes that it is positive definite.
This is well known and follows easily from the next lemma (called Bulinskaya's lemma), which we state without proof, for completeness.

Lemma 1 Let Z(t) be a stochastic process defined on some neighborhood of a set T embedded in some Euclidean space. Assume that the Hausdorff dimension of T is at most the integer m and that the values of Z lie in R^{m+k} for some positive integer k. Suppose, in addition, that Z has C¹ paths and that the density p_{Z(t)}(v) is bounded for t ∈ T and v in some neighborhood of u ∈ R^{m+k}. Then, a.s., there is no point t ∈ T such that Z(t) = u.
With respect to A5, one has the following sufficient conditions: assume A1, A2, A3 and, as an additional hypothesis, one of the following two:

(a) t ↦ X(t) is of class C³;

(b) the joint density p_{X(t),X′(t)}(x, x′) is bounded for t ∈ S and x′ in some neighborhood V(0) of 0.
In this section we review the Rice formula for the expectation of the number of roots of a random system of equations. For proofs, see for example [8], or [9], where a simpler one is given.

Theorem 1 (Rice formula) Let Z : U → R^d be a random field, U an open subset of R^d, and u ∈ R^d a fixed point in the codomain. Assume that:

(i) Z is Gaussian,
(ii) almost surely the function t ↦ Z(t) is of class C¹,
(iii) for each t ∈ U, Z(t) has a non-degenerate distribution (i.e. Var(Z(t)) ≻ 0),
(iv) P{∃t ∈ U, Z(t) = u, det Z′(t) = 0} = 0.

Then, for every Borel set B contained in U, one has

E( N_u^Z(B) ) = ∫_B E( |det Z′(t)| / Z(t) = u ) p_{Z(t)}(u) dt.      (2)
Theorem 2 Let Z be a random field that verifies the hypotheses of Theorem 1. Assume that for each t ∈ U one has another random field Y^t : W → R^{d′}, where W is some topological space, verifying the following conditions:

a) Y^t(w) is a measurable function of (ω, t, w) and, almost surely, (t, w) ↦ Y^t(w) is continuous.

b) For each t ∈ U, the pair (Z, Y^t) is Gaussian, and g : U × C(W, R^{d′}) → R is a bounded function which is continuous when one puts on C(W, R^{d′}) the topology of uniform convergence on compact sets. Then, for each compact subset I of U, one has

E( Σ_{t∈I, Z(t)=u} g(t, Y^t) ) = ∫_I E( g(t, Y^t) |det Z′(t)| / Z(t) = u ) p_{Z(t)}(u) dt.      (3)
Remarks:
1. We have already mentioned in the previous section sufficient conditions implying hypothesis (iv) in Theorem 1.
2. With the hypotheses of Theorem 1, it follows easily that if J is a subset of U with λd(J) = 0, then P{N_u^Z(J) = 0} = 1 for each u ∈ R^d.
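The Rice formula (2) can be checked by hand on the classical cosine process X(t) = ξ cos t + η sin t (which also appears later in this paper), for which the number of crossings of a level u on [0, 2π] is exactly 2·1{R > |u|} with R = √(ξ² + η²), so E(N_u) = 2e^{−u²/2}, which is precisely the value given by the Rice formula. The Monte Carlo comparison below is ours, purely illustrative:

```python
import numpy as np

# Cosine process X(t) = xi*cos(t) + eta*sin(t) on [0, 2*pi]:
# X(t) = R*cos(t - phase), so it crosses level u exactly twice iff R > |u|.
# Rice formula value: E(N_u) = 2*exp(-u^2/2).
rng = np.random.default_rng(3)
u, n = 1.0, 400_000
xi, eta = rng.normal(size=n), rng.normal(size=n)
R = np.hypot(xi, eta)
mc_mean = (2.0 * (R > abs(u))).mean()
rice = 2.0 * np.exp(-u ** 2 / 2.0)
```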
pM(x) = Σ_{t∈S₀} E( 1I_{Ax} / X(t) = x ) p_{X(t)}(x)
      + Σ_{j=1}^{d} ∫_{Sj} E( |det(X″j(t))| 1I_{Ax} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt),      (4)
Remark: One can replace |det(X″j(t))| in the conditional expectation by (−1)^j det(X″j(t)), since under the conditioning, and whenever M ≤ x holds true, X″j(t) is negative semi-definite.
Proof of Theorem 3. Let Nj(u), j = 0, …, d, be the number of global maxima of X(·) on S that belong to Sj and are larger than u. From the hypotheses it follows that, a.s., Σ_{j=0,…,d} Nj(u) is equal to 0 or 1, so that

P{M > u} = Σ_{j=0,…,d} P{Nj(u) = 1} = Σ_{j=0,…,d} E(Nj(u)).      (5)

The proof will be finished as soon as we show that each term in (5) is the integral over (u, +∞) of the corresponding term in (4). This is self-evident for j = 0. Let us consider the term j = d. We apply the weighted Rice formula of Section 2 as follows: Z is the random field X′ defined on Sd; for each t ∈ Sd, put W = S and define Y^t : S → R² as

Y^t(w) := ( X(w) − X(t), X(t) ).

Notice that the second coordinate in the definition of Y^t does not depend on w.
In the place of the function g, we take, for each n = 1, 2, …, the function gn defined as follows:

gn(t, f₁, f₂) = gn(f₁, f₂) = [ 1 − Fn( sup_{w∈S} f₁(w) ) ] · [ 1 − Fn( u − f₂(w) ) ],      (6)

where the Fn are continuous non-decreasing approximations of the indicator function of [0, +∞).
The weighted Rice formula gives

E( Σ_{t∈Sd, X′(t)=0} gn(Y^t) ) = ∫_{Sd} E( gn(Y^t) |det(X″(t))| / X′(t) = 0 ) p_{X′(t)}(0) σd(dt).      (7)

Notice that the formula holds true for each compact subset of Sd in the place of Sd, hence for Sd itself by monotone convergence. Let now n → ∞ in (7). Clearly gn(Y^t) → 1I_{X(s)−X(t)≤0, ∀s∈S} · 1I_{X(t)>u}. The passage to the limit does not present any difficulty since 0 ≤ gn(Y^t) ≤ 1 and the sum in the left-hand side is bounded by the random variable N₀^{X′}(Sd), which is in L¹ because of the Rice formula. We get

E( Nd(u) ) = ∫_{Sd} E( 1I_{X(s)≤X(t), ∀s∈S} 1I_{X(t)>u} |det(X″(t))| / X′(t) = 0 ) p_{X′(t)}(0) σd(dt).
Ct,j := { λ ∈ R^d : ‖λ‖ = 1 ; ∃ sn ∈ S (n = 1, 2, …) such that sn → t and (t − sn)/‖t − sn‖ → λ as n → +∞ },

whenever this set is non-empty, and Ct,j := {0} if it is empty. We will denote by Ĉt,j the dual cone of Ct,j, that is:

Ĉt,j := { z ∈ R^d : ⟨z, λ⟩ ≥ 0 for all λ ∈ Ct,j }.

Notice that these definitions easily imply that Tt,j ⊂ Ct,j and Ĉt,j ⊂ Nt,j. Remark also that for j = d₀, Ct,j = Nt,j. We will say that the function X(·) has an extended outward derivative at the point t in Sj, j ≤ d₀, if X′_{j,N}(t) ∈ Ĉt,j.
Corollary 1 Under assumptions A1 to A5, one has:

(a) pM(x) ≤ p(x), where

p(x) := Σ_{t∈S₀} E( 1I_{X′(t)∈Ĉt,0} / X(t) = x ) p_{X(t)}(x)
      + Σ_{j=1}^{d₀} ∫_{Sj} E( |det(X″j(t))| 1I_{X′_{j,N}(t)∈Ĉt,j} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt);      (8)

(b) P{M > u} ≤ ∫_u^{+∞} p(x) dx.

Proof. (a) follows from Theorem 3 and the observation that if t ∈ Sj, one has {M ≤ X(t)} ⊂ {X′_{j,N}(t) ∈ Ĉt,j}. (b) is an obvious consequence of (a).
The actual interest of this corollary depends on the feasibility of computing p(x). It turns out that this can be done in some relevant cases, as we will see in the remainder of this section. Our result can be compared with the approximation of P{M > u} by means of ∫_u^{+∞} pE(x) dx given by [3], [34], where

pE(x) := Σ_{t∈S₀} E( 1I_{X′(t)∈Ĉt,0} / X(t) = x ) p_{X(t)}(x)
       + Σ_{j=1}^{d₀} (−1)^j ∫_{Sj} E( det(X″j(t)) 1I_{X′_{j,N}(t)∈Ĉt,j} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt).

Under certain conditions, ∫_u^{+∞} pE(x) dx is the expected value of the EPC of the excursion set Eu (see [3]). The advantage of pE(x) over p(x) is that one can obtain nice expressions for it in quite general situations. Conversely, p(x) has the obvious advantage that it is an upper bound of the true density pM(x), and hence provides, upon integrating once, an upper bound for the tail probability, for every u value. It is not known whether a similar inequality holds true for pE(x).

On the other hand, under additional conditions, both provide good first-order approximations for pM(x) as x → ∞, as we will see in the next section. In the special case in which the process X is centered and has a law that is invariant under isometries and translations, we describe below a procedure to compute p(x).
For one-parameter centered Gaussian processes having constant variance and satisfying certain regularity conditions, a general bound for pM(x) has been computed in [8], pp. 75-77. In the two-parameter case, Mercadier [26] has shown a bound for P{M > u}, obtained by means of a method especially suited to dimension 2. When the parameter is one- or two-dimensional, these bounds are sharper than the ones below, which, on the other hand, apply to any dimension, but in a more restricted context. We will assume from now on that the process X is centered Gaussian, with a covariance function that can be written as

E( X(s)·X(t) ) = ρ( ‖s − t‖² ),      (10)

which implies that the law of X is stationary and isotropic.
One can check the following properties (ρ′, ρ″ denote the derivatives of ρ, evaluated at zero):

1. E( ∂X/∂t_i(t) · X(t) ) = 0;

2. E( ∂X/∂t_i(t) · ∂X/∂t_k(t) ) = −2ρ′ δ_{ik}, and ρ′ < 0;

3. E( ∂²X/∂t_i∂t_k(t) · X(t) ) = 2ρ′ δ_{ik},   E( ∂²X/∂t_i∂t_k(t) · ∂X/∂t_j(t) ) = 0;

4. E( ∂²X/∂t_i∂t_k(t) · ∂²X/∂t_{i′}∂t_{k′}(t) ) = 4ρ″ ( δ_{ii′}δ_{kk′} + δ_{ik}δ_{i′k′} + δ_{ik′}δ_{ki′} );

5. ρ″ − ρ′² ≥ 0;

6. if t ∈ Sj, the conditional distribution of X″j(t) given X(t) = x, X′j(t) = 0 is the same as the unconditional distribution of the random matrix

Z + 2ρ′ x Ij,

where Z = (Z_{ik} : i, k = 1, …, j) is a symmetric j × j matrix with centered Gaussian entries, independent of the pair (X(t), X′(t)), such that, for i ≤ k, i′ ≤ k′, one has:

E( Z_{ik} Z_{i′k′} ) = 4 [ (2ρ″ δ_{ii′} + ρ″ − ρ′²) δ_{ik} δ_{i′k′} + ρ″ δ_{ii′} δ_{kk′} (1 − δ_{ik}) ].
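The moment identities 2-4 above follow by differentiating the covariance r(s, t) = ρ(‖s − t‖²). As a sanity check (our own, with the illustrative choice ρ(x) = e^{−x}, so ρ′(0) = −1 and ρ″(0) = 1), finite differences of r in the variable h = s − t recover E(X_iX_k) = 2, E(X_{11}X_{11}) = 12 and E(X_{11}X_{22}) = 4:

```python
import numpy as np

# r as a function of h = s - t, with rho(x) = exp(-x) in dimension d = 2.
f = lambda h1, h2: np.exp(-(h1 ** 2 + h2 ** 2))
h = 0.05

# E(X_1 X_1) = -d^2 f / dh1^2 at 0  (expected: -2*rho'(0) = 2)
d2 = (f(h, 0.0) - 2.0 * f(0.0, 0.0) + f(-h, 0.0)) / h ** 2
first = -d2

# E(X_11 X_11) = d^4 f / dh1^4 at 0  (expected: 12*rho''(0) = 12)
d4 = (f(2 * h, 0.0) - 4 * f(h, 0.0) + 6 * f(0.0, 0.0)
      - 4 * f(-h, 0.0) + f(-2 * h, 0.0)) / h ** 4

# E(X_11 X_22) = d^4 f / dh1^2 dh2^2 at 0  (expected: 4*rho''(0) = 4)
g = lambda h2: (f(h, h2) - 2.0 * f(0.0, h2) + f(-h, h2)) / h ** 2
mixed = (g(h) - 2.0 * g(0.0) + g(-h)) / h ** 2
```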
Let us introduce some additional notation. Hn(x), n = 0, 1, …, are the standard Hermite polynomials, i.e.

Hn(x) := e^{x²} (−1)^n (dⁿ/dxⁿ) e^{−x²}.

Define

Jn(x) := ∫_{−∞}^{+∞} e^{−y²/2} Hn(ȳ) dy,   n = 0, 1, 2, …,      (11)

where ȳ stands for the linear form ȳ = ay + bx, and a, b are real parameters that satisfy a² + b² = 1/2. Then, differentiating under the integral sign and using H′n = 2nH_{n−1},

J′n(x) = 2nb ∫_{−∞}^{+∞} e^{−y²/2} H_{n−1}(ȳ) dy = 2nb J_{n−1}(x).      (12)

Also:

Jn(0) = ∫_{−∞}^{+∞} e^{−y²/2} Hn(ay) dy = 0 when n is odd,      (13)

J_{2p}(0) = (−1)^p (2b)^{2p} (2p − 1)!! √(2π) = (−1)^p (2b²)^p ((2p)!/p!) √(2π).      (14)
Now we can go back to (12) and integrate successively for n = 1, 2, … on the interval [0, x], using the initial value given by (14) when n = 2p and Jn(0) = 0 when n is odd. Writing

Jn(x) = √(2π) bⁿ Qn(x),      (15)

one obtains:

Q′n(x) = 2n Q_{n−1}(x),      (16)
Qn(0) = 0 if n is odd,      (17)
Q_{2p}(0) = (−1)^p 2^p (2p)!/p!.      (18)

It is now easy to show that in fact Qn(x) = H̄n(x), n = 0, 1, 2, …, using for example that

H̄n(x) = 2^{n/2} Hn(x/√2).
The integrals

In(v) := ∫_v^{+∞} e^{−t²/2} Hn(t) dt

will appear in our computations. They are computed in the next lemma, which can be proved easily using the standard properties of Hermite polynomials.
Lemma 4

In(v) = 2 e^{−v²/2} Σ_{k=0}^{[(n−1)/2]} 2^k ( (n − 1)!! / (n − 1 − 2k)!! ) H_{n−1−2k}(v)
        + 1I_{n even} 2^{n/2} (n − 1)!! √(2π) (1 − Φ(v)).      (19)
Theorem 4 Assume that the process X is centered Gaussian and satisfies conditions A1-A5, with a covariance having the form (10) and verifying the regularity conditions of the beginning of this section. Moreover, let S be a polyhedron. Then p(x) can be expressed by means of the following formula:

p(x) = φ(x) [ Σ_{t∈S₀} σ̂₀(t) + Σ_{j=1}^{d₀} (|ρ′|/π)^{j/2} ( H̄j(x) + Rj(x) ) ĝj ],      (22)

where

ĝj := ∫_{Sj} σ̂j(t) σj(dt),      (23)

and σ̂j(t) is the normalized solid angle of the cone Ĉt,j in Nt,j, that is:

σ̂j(t) := σ_{d−j−1}( Ĉt,j ∩ S^{d−j−1} ) / σ_{d−j−1}( S^{d−j−1} ),      (24)

with the convention σ̂d(t) = 1.      (25)

Notice that for convex or other usual polyhedra, σ̂j(t) is constant for t ∈ Sj, so that ĝj is equal to this constant multiplied by the j-dimensional geometric measure of Sj.
For j = 1, …, d₀,

Rj(x) := ( 2^{j/2} |ρ′|^{j/2} / Γ((j + 1)/2) ) ∫_{−∞}^{+∞} Tj(v) exp(−y²/2) dy,      (26)

where

v := 2^{1/2} (1 − γ²)^{1/2} y + γ x,   with γ := |ρ′| (ρ″)^{−1/2},      (27)

and

Tj(v) := Σ_{k=0}^{j−1} ( H_k²(v) / (2^k k!) ) e^{−v²/2} − ( Hj(v) / (2^j (j − 1)!) ) I_{j−1}(v).      (28)
In the sequel, qn(κ) denotes the function given by the explicit Hermite expansion (29): (n)^{−1} qn(κ) is the density at κ of one eigenvalue of an n × n GOE random matrix Gn. The key fact is:

E( |det(Gn − κ In)| ) = e^{κ²/2} (cn / c_{n+1}) ( q_{n+1}(κ) / (n + 1) ),      (30)

where cn is the normalizing constant of the joint eigenvalue density of Gn, given in (31) below.
Proof: Denote by λ₁, …, λn the eigenvalues of Gn. It is well known (Mehta [25], Kendall et al. [19]) that the joint density fn of the n-tuple of random variables (λ₁, …, λn) is given by the formula

fn(λ₁, …, λn) = cn exp( −Σ_{i=1}^n λi²/2 ) Π_{1≤i<k≤n} |λk − λi|,

with cn the corresponding normalizing constant (expressible through the products of Γ(1 + i/2)). Then,

E( |det(Gn − κIn)| ) = E( Π_{i=1}^n |λi − κ| )
= ∫_{R^n} Π_{i=1}^n |λi − κ| cn exp( −Σ_{i=1}^n λi²/2 ) Π_{1≤i<k≤n} |λk − λi| dλ₁ … dλn
= e^{κ²/2} (cn/c_{n+1}) ∫_{R^n} f_{n+1}(λ₁, …, λn, κ) dλ₁ … dλn = e^{κ²/2} (cn q_{n+1}(κ)) / (c_{n+1} (n + 1)),      (31)

since (n + 1)^{−1} q_{n+1}(κ) is precisely the marginal density of one eigenvalue of G_{n+1} evaluated at κ.
We now compute the terms in formula (8). Since the distribution of X′(t) is centered Gaussian with variance −2ρ′ Id, it is independent of X(t) and invariant under rotations; similarly, for t ∈ Sj, the normal derivative X′_{j,N}(t) is independent of the pair (X(t), X′j(t)). It follows that, if t ∈ S₀,

E( 1I_{X′(t)∈Ĉt,0} / X(t) = x ) = σ̂₀(t),      (34)
and if t ∈ Sj, j ≥ 1, condition 6 allows one to write X″j(t) (conditionally) as √(8ρ″) Gj + (2ρ′x + 2(ρ″ − ρ′²)^{1/2} y) Ij, with Gj a GOE matrix and y standard normal, so that

E( |det(X″j(t))| 1I_{X′_{j,N}(t)∈Ĉt,j} / X(t) = x, X′j(t) = 0 )
= σ̂j(t) (8ρ″)^{j/2} ∫_{−∞}^{+∞} E( |det(Gj − κ(x, y) Ij)| ) φ(y) dy,      (35)

with κ(x, y) := −(2ρ′x + 2(ρ″ − ρ′²)^{1/2} y)(8ρ″)^{−1/2}. Replacing (30) and performing the computations, one gets

pE(x) = φ(x) [ Σ_{t∈S₀} σ̂₀(t) + Σ_{j=1}^{d₀} (|ρ′|/π)^{j/2} H̄j(x) ĝj ],      (36)

which is the product of a standard Gaussian density and a polynomial of degree d₀. Integrating once, we get - in our special case - the formula for the expectation of the EPC of the excursion set as given by [3].
The complementary term, given by

φ(x) Σ_{j=1}^{d₀} Rj(x) ĝj,      (37)

can be computed by means of the formulas that follow from the statement of the theorem above. These formulae will in general be quite unpleasant, due to the complicated form of Tj(v). However, for low dimensions they are simple. For example:

T₁(v) = e^{−v²/2} − √(2π) v (1 − Φ(v)),      (38)
T₂(v) = 2√(2π) φ(v),      (39)
T₃(v) = (3/2)(2v² + 1) e^{−v²/2} − (1/2)√(2π) (2v³ − 3v) (1 − Φ(v)).      (40)
Second-order asymptotics for pM(x) as x → +∞ will be mainly considered in the next section. However, we state already that the complementary term (37) is equivalent, as x → +∞, to

K_{d₀} ĝ_{d₀} x^{2d₀−4} exp( −(x²/2) (3ρ″ − ρ′²)/(3ρ″ − 2ρ′²) ),      (41)

where K_{d₀} is a positive constant depending only on d₀, ρ′ and ρ″, whose explicit value is given in (42). We are not going to go through this calculation, which is elementary but requires some work. An outline of it is the following. Replace the Hermite polynomials in the expression for Tj(v) given by (28) by the well-known expansion:
Hj(v) = j! Σ_{i=0}^{[j/2]} (−1)^i (2v)^{j−2i} / ( i! (j − 2i)! ).      (43)
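Expansion (43) is the standard closed form of the physicists' Hermite polynomials; a quick cross-check against numpy's Hermite evaluation (the helper name H_expansion is ours):

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial

def H_expansion(j, v):
    # Expansion (43) for the physicists' Hermite polynomial H_j(v).
    return factorial(j) * sum(
        (-1) ** i * (2.0 * v) ** (j - 2 * i) / (factorial(i) * factorial(j - 2 * i))
        for i in range(j // 2 + 1)
    )

v = 1.5
# hermval with the j-th basis coefficient set to 1 evaluates H_j(v).
vals = [(H_expansion(j, v), hermval(v, [0.0] * j + [1.0])) for j in range(6)]
```

For instance H₃(1.5) = 8·1.5³ − 12·1.5 = 9, and both evaluations agree.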
One finds that, for large |v|,

Tj(v) ≈ ( 2^{j−1} / (j − 1)! ) v^{2j−4} e^{−v²/2}.      (44)

Using now the definition of Rj(x) and changing variables in the integral in (26), one gets for Rj(x) the equivalent

Kj x^{2j−4} exp( −x² ρ′² / (2(3ρ″ − 2ρ′²)) ).      (45)

In particular, the equivalent of (37) is given by the highest-order non-vanishing term in the sum.
Consider now the case in which S is the sphere S^{d−1} and the process satisfies the same conditions as in the theorem. Even though the theorem cannot be applied directly, it is possible to deal with this example to compute p(x), performing only some minor changes. In this case, only the term that corresponds to j = d − 1 in (8) does not vanish, Ct,d−1 = Nt,d−1, so that 1I_{X′_{d−1,N}(t)∈Ĉ_{t,d−1}} = 1 for each t ∈ S^{d−1}, and one can use the invariance of the law under rotations to reduce the computation to

E( |det( Z + 2ρ′ x I_{d−1} + (2|ρ′|)^{1/2} ζ I_{d−1} )| ),      (46)

where ζ is a standard normal variable independent of Z: the normal derivative at each point is centered Gaussian with variance 2|ρ′| and independent of the tangential derivative. So, we apply the previous computation, replacing x by x + (2|ρ′|)^{−1/2} y, and obtain the expression:

p(x) = φ(x) ( 2π^{d/2}/Γ(d/2) ) (|ρ′|/π)^{(d−1)/2} ∫_{−∞}^{+∞} [ H̄_{d−1}( x + (2|ρ′|)^{−1/2} y ) + R_{d−1}( x + (2|ρ′|)^{−1/2} y ) ] φ(y) dy.      (47)
Asymptotics as x → +∞

In this section we consider the errors in the direct and the EPC methods for large values of the argument x. These errors are:

p(x) − pM(x) = Σ_{t∈S₀} E( 1I_{X′(t)∈Ĉt,0} 1I_{M>x} / X(t) = x ) p_{X(t)}(x)
 + Σ_{j=1}^{d₀} ∫_{Sj} E( |det(X″j(t))| 1I_{X′_{j,N}(t)∈Ĉt,j} 1I_{M>x} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt),      (48)

pE(x) − pM(x) = Σ_{t∈S₀} E( 1I_{X′(t)∈Ĉt,0} 1I_{M>x} / X(t) = x ) p_{X(t)}(x)
 + Σ_{j=1}^{d₀} (−1)^j ∫_{Sj} E( det(X″j(t)) 1I_{X′_{j,N}(t)∈Ĉt,j} 1I_{M>x} / X(t) = x, X′j(t) = 0 ) p_{X(t),X′j(t)}(x, 0) σj(dt).      (50)
Proof: Let W be an open neighborhood of the compact subset Sv of S such that dist(W, (S \ S_{d₀})) > 0, where dist denotes the Euclidean distance in R^d. For t ∈ Sj ∩ W^c, the density p_{X(t),X′j(t)}(x, 0) can be written as the product of the density of X′j(t) at the point 0 and the conditional density of X(t) at the point x given that X′j(t) = 0, which is Gaussian with some bounded expectation and a conditional variance smaller than the unconditional variance, hence bounded by some constant smaller than 1. Since the conditional expectations in (48) are uniformly bounded by some constant, due to standard bounds on the moments of the Gaussian law, one can deduce that

p(x) − pM(x) = ∫_{W∩S_{d₀}} E( |det(X″_{d₀}(t))| 1I_{M>x} / X(t) = x, X′_{d₀}(t) = 0 ) p_{X(t),X′_{d₀}(t)}(x, 0) σ_{d₀}(dt) + O( exp(−(1/2)(1 + δ₁)x²) )      (51)

as x → +∞, for some δ₁ > 0. Our following task is to choose W such that one can assure that the first term in the right-hand side of (51) has the same form as the second, with a possibly different constant δ₁.
To do this, for s ∈ S and t ∈ S_{d₀}, let us write the Gaussian regression formula of X(s) on the pair (X(t), X′_{d₀}(t)):

X(s) = a^t(s) X(t) + ⟨ b^t(s), X′_{d₀}(t) ⟩ + (‖t − s‖²/2) X^t(s),      (52)

where the regression coefficients a^t(s), b^t(s) are, respectively, real-valued and R^{d₀}-valued. From now onwards, we will only be interested in those t ∈ W. In this case, since W does not contain boundary points of S \ S_{d₀}, it follows that Ct,d₀ = Nt,d₀ and 1I_{X′_{d₀,N}(t)∈Ĉ_{t,d₀}} = 1. Moreover, whenever s ∈ S is close enough to t, necessarily s ∈ S_{d₀}, and one can show that the Gaussian process {X^t(s) : t ∈ W ∩ S_{d₀}, s ∈ S} is bounded, in spite of the fact that its trajectories are not continuous at s = t. For each t, {X^t(s) : s ∈ S} is a helix process; see [8] for a proof of boundedness. On the other hand, conditionally on X(t) = x, X′_{d₀}(t) = 0, the event {M > x} can be written as

{ X^t(s) > β^t(s) x, for some s ∈ S },

where

β^t(s) := 2(1 − a^t(s)) / ‖t − s‖².      (53)
Our next goal is to prove that if one can choose W in such a way that

inf{ β^t(s) : t ∈ W ∩ S_{d₀}, s ∈ S, s ≠ t } > 0,      (54)

then we are done. In fact, apply the Cauchy-Schwarz inequality to the conditional expectation in (51). Under the conditioning, the elements of X″_{d₀}(t) are sums of affine functions of x with bounded coefficients plus centered Gaussian variables with bounded variances; hence, the absolute value of the conditional expectation is bounded by an expression of the form

Q(t, x) [ P( sup_{s∈S\{t}} X^t(s)/β^t(s) > x ) ]^{1/2},      (55)

where Q(t, x) is a polynomial in x of degree 2d₀ with bounded coefficients. For each t ∈ W ∩ S_{d₀}, the second factor in (55) is bounded by

[ P( sup{ X^t(s)/β^t(s) : t ∈ W ∩ S_{d₀}, s ∈ S, s ≠ t } > x ) ]^{1/2} ≤ C₂^{1/2} exp(−δ₂ x²/2),

for some positive constants C₂, δ₂ and any x > 0, since the process is bounded and (54) holds. Also, the same argument as above for the density p_{X(t),X′_{d₀}(t)}(x, 0) shows that it is bounded by a constant times the standard Gaussian density. To finish, it suffices to replace these bounds in the first term at the right-hand side of (51).
It remains to choose W for (54) to hold true. Consider the auxiliary process

Y(s) := X(s) / √(r(s, s)),   s ∈ S.      (56)

Differentiating under the expectation sign at s = t, it follows that X(t) and X′_{d₀}(t) are independent; hence, in the regression formula (52) the coefficients are easily computed and a^t(s) = r(s, t), which is strictly smaller than 1 if s ≠ t, because of the non-degeneracy condition. Then

β^t(s) = 2(1 − r(s, t)) / ‖t − s‖² = 2(1 − r^Y(s, t)) / ‖t − s‖².      (57)

Since r^Y(s, s) = 1 for every s ∈ S, the Taylor expansion of r^Y(s, t) as a function of s, around s = t, takes the form:

r^Y(s, t) = 1 + ⟨ s − t, r^Y_{20,d₀}(t, t)(s − t) ⟩ + o(‖s − t‖²),      (58)

where r^Y_{20,d₀}(t, t) = −Var(Y′_{d₀}(t)),      (59)

the last equality following by differentiation in (56) and putting s = t. (59) implies that −r^Y_{20,d₀}(t, t) is uniformly positive definite on t ∈ Sv, meaning that its minimum eigenvalue has a strictly positive lower bound. This, on account of (57) and (58), already shows that

inf{ β^t(s) : t ∈ Sv, s ∈ S, s ≠ t } > 0.      (60)
Suppose now that no admissible neighborhood W exists; then one can find sequences tn → t₀ ∈ Sv and sn ∈ S, sn ≠ tn, with β^{tn}(sn) → ℓ ≤ 0. If s₀ := lim sn ≠ t₀, then

2(1 − a^{t₀}(s₀)) / ‖t₀ − s₀‖² = β^{t₀}(s₀),      (61)

which is strictly positive by (60), a contradiction. If sn → t₀, write the Taylor expansion of a^t(s) around s = t, where (a^t)′_{d₀}(s) is a column vector of size d₀ and (a^t)″_{d₀}(s) is a d₀ × d₀ matrix. One must have a^t(t) = 1 and (a^t)′_{d₀}(t) = 0. Thus

β^{tn}(sn) = −u_nᵀ (a^{t₀})″_{d₀}(t₀) u_n + o(1),

where u_n := (sn − tn)/‖sn − tn‖. Since t₀ ∈ Sv, we may apply (61), and the limit of β^{tn}(sn) cannot be non-positive.
A straightforward application of Theorem 5 is the following
Corollary 2 Under the hypotheses of Theorem 5, there exists positive constants C, such that,
for every u > 0 :
+
+
u
and

    δ1 := lim_{x→+∞} −2x^{−2} log( p(x) − p_M(x) ),    (62)

    δ2 := lim_{x→+∞} −2x^{−2} log | p^E(x) − p_M(x) |,    (63)
whenever these limits exist. In general, we are unable to compute the limits (62) or (63), or even to prove that they actually exist or differ. Our more general results (as well as those in [3], [34]) only contain lower bounds for the liminf as x → +∞. This is already interesting since it gives some upper bounds for the speed of approximation of pM (x) either by p(x) or pE (x). On the other hand, in Theorem 7 below, we are able to prove the existence of the limit and compute it for a relevant class of Gaussian processes.
For the next theorem we need an additional condition on the parameter set S. For S
verifying A1 we define
(S) = sup
sup
0jd0 tSj
sup
sS,s=t
(64)
    σ_t² := sup_{s∈S\{t}} Var( X(s) / X(t), X′(t) ) / (1 − r(s, t))²    (65)

and

    μ_t := sup_{s∈S\{t}} dist( Λ_t^{−1} r01(s, t), C_{t,j} ) / (1 − r(s, t)),    (66)

where Λ_t := Var(X′(t)), λ(t) is the maximum eigenvalue of Λ_t and, in (66), j is such that t ∈ Sj (j = 0, 1, . . . , d0).
The quantity in the right-hand side of (65) is strictly bigger than 1.
Remark. In formula (65) it may happen that the denominator in the right-hand side is identically zero, in which case we put +∞ for the infimum. This is the case of the one-parameter process X(t) = ξ cos t + η sin t, where ξ, η are standard independent Gaussian random variables, and S is an interval having length strictly smaller than π.
Proof of Theorem 6
Let us first prove that sup_{t∈S} μ_t < ∞. For each t ∈ S, let us write the Taylor expansions

    r01(s, t) = r01(t, t) + r11(t, t)(s − t) + O(‖s − t‖²)
              = Λ_t (s − t) + O(‖s − t‖²),

so that

    dist( Λ_t^{−1} r01(s, t), C_{t,j} ) / (1 − r(s, t)) ≤ L3 ‖s − t‖² / (1 − r(s, t)) + L4 ≤ L3 κ(S) + L4.    (67)
With the same notations as in the proof of Theorem 5, using (4) and (8), one has:
d0
+
j=1
Sj
j,N (t)Ct,j .
t2
1
,
+ (t)2t
Conditionally on X(t) = x, X′_j(t) = 0, the events

    {M > x} and { R_t(s) > (1 − r(s, t)) x − r01ᵀ(s, t) Λ_t^{−1} X′_{j,N}(t) for some s ∈ S }    (69)

coincide.
Denote by (X′_{j,N}(t) / X′_j(t) = 0) the regression of X′_{j,N}(t) on X′_j(t) = 0. So, the probability in (69) can be written as

    P{ ζ_t(s) > x − r01ᵀ(s, t) Λ_t^{−1} x′ / (1 − r(s, t)) for some s ∈ S },   x′ ∈ C̆_{t,j},    (70)

where

    ζ_t(s) := R_t(s) / (1 − r(s, t)).
If Λ_t^{−1} r01(s, t) ∈ C_{t,j}, one has

    r01ᵀ(s, t) Λ_t^{−1} x′ ≤ 0.

In the general case, write the decomposition

    Λ_t^{−1} r01(s, t) = z + z′,

with z ∈ C_{t,j} and ‖z′‖ ≤ μ_t (1 − r(s, t)). So, if x′ ∈ C̆_{t,j}:

    r01ᵀ(s, t) Λ_t^{−1} x′ / (1 − r(s, t)) = ( zᵀ x′ + z′ᵀ x′ ) / (1 − r(s, t)) ≤ μ_t ‖x′‖,

using that zᵀ x′ ≤ 0 and the Cauchy–Schwarz inequality. It follows that in any case, if x′ ∈ C̆_{t,j}, the expression in (70) is bounded by

    P{ ζ_t(s) > x − μ_t ‖x′‖ for some s ∈ S }.    (71)
To obtain a bound for the probability in the integrand of (71) we will use the classical inequality for the tail of the distribution of the supremum of a Gaussian process with bounded paths.
The Gaussian process (s, t) ↦ ζ_t(s), defined on (S × S) \ {s = t}, has continuous paths. As the pair (s, t) approaches the diagonal of S × S, ζ_t(s) may not have a limit but, almost surely, it is bounded (see [8] for a proof). (For fixed t, ζ_t(·) is a helix process with a singularity at s = t, a class of processes that we have already met above.)
We set

    m_t(s) := E( ζ_t(s) )  (s ≠ t),   m̄ := sup_{s,t∈S, s≠t} |m_t(s)|,   μ := E( | sup_{s,t∈S, s≠t} ( ζ_t(s) − m_t(s) ) | ).

The almost sure boundedness of the paths of ζ_t(s) implies that m̄ < ∞ and μ < ∞. Applying the Borell–Sudakov–Tsirelson type inequality (see for example Adler [2] and references therein) to the centered process s ↦ ζ_t(s) − m_t(s) defined on S \ {t}, we get, whenever x − μ_t ‖x′‖ − m̄ − μ > 0:

    P{ ζ_t(s) > x − μ_t ‖x′‖ for some s ∈ S } ≤ exp( − ( x − μ_t ‖x′‖ − m̄ − μ )² / ( 2 σ_t² ) ).
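The inequality applied here can be illustrated numerically. The following sketch (ours, not part of the proof) checks the expectation version of the Borell–Sudakov–Tsirelson bound, P{ sup ζ − E(sup ζ) ≥ u } ≤ exp(−u²/(2σ²)) with σ² the maximal pointwise variance, on a finite-dimensional Gaussian vector standing in for the process; all sizes and the factor model are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# a centered Gaussian "process" on a 50-point grid, driven by m = 8 factors
n, m = 50, 8
A = rng.normal(size=(n, m)) / np.sqrt(m)
sigma2 = (A ** 2).sum(axis=1).max()      # sup of the pointwise variances

G = rng.normal(size=(50000, m))
sups = (G @ A.T).max(axis=1)             # sup over the grid, 50000 replications
e_sup = sups.mean()

for u in [0.5, 1.0, 1.5, 2.0]:
    emp = ((sups - e_sup) >= u).mean()   # empirical tail of sup - E(sup)
    bound = np.exp(-u ** 2 / (2 * sigma2))
    print(u, emp, bound)                 # the empirical tail stays below the bound
```

The bound is loose for small u, but it captures the Gaussian decay rate in u² that drives the whole proof.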
jd
2
x mj,N (t)
exp
2j (t)
where β_j(t) and β̄_j(t) are respectively the minimum and maximum eigenvalues of Var( X′_{j,N}(t) / X′_j(t) ), and m_{j,N}(t) is the conditional expectation E( X′_{j,N}(t) / X′_j(t) = 0 ). Notice that β_j(t), β̄_j(t), m_{j,N}(t) are bounded, β_j(t) is bounded below by a positive constant, and β̄_j(t) ≤ λ(t).
    P( { X′_{j,N}(t) ∈ C̆_{t,j} } ∩ { M > x } / X(t) = x, X′_j(t) = 0 )
      ≤ (2π β_j(t))^{−(d−j)/2} ∫_{C̆_{t,j}} exp( − ‖x′ − m_{j,N}(t)‖² / (2λ(t)) − ( x − μ_t ‖x′‖ − m̄ − μ )² / (2σ_t²) ) dx′
      + P( ‖ ( X′_{j,N}(t) / X′_j(t) = 0 ) ‖ ≥ ( x − m̄ − μ ) / μ_t ),    (72)

where it is understood that the second term in the right-hand side vanishes if μ_t = 0.
Let us consider the first term in the right-hand side of (72). We have:

    ‖x′ − m_{j,N}(t)‖² / (2λ(t)) + ( x − μ_t ‖x′‖ − m̄ − μ )² / (2σ_t²)
      ≥ A(t) ‖x′‖² + B(t)( x − m̄ − μ ) ‖x′‖ + C(t) + ( x − m̄ − μ − μ_t ‖m_{j,N}(t)‖ )² / ( 2σ_t² + 2λ(t)μ_t² ),

where the last inequality is obtained after some algebra, A(t), B(t), C(t) are bounded functions and A(t) is bounded below by some positive constant.
So the first term in the right-hand side of (72) is bounded by:

    2 (2π β_j)^{−(d−j)/2} exp( − ( x − m̄ − μ − μ_t ‖m_{j,N}(t)‖ )² / ( 2σ_t² + 2λ(t)μ_t² ) ) ∫_{R^{d−j}} exp( −A(t)‖x′‖² ) dx′
      ≤ L exp( − ( x − m̄ − μ − μ_t ‖m_{j,N}(t)‖ )² / ( 2σ_t² + 2λ(t)μ_t² ) ),    (73)

where L is some constant. The last inequality follows easily using polar coordinates.
Consider now the second term in the right-hand side of (72). Using the form of the conditional density p_{X′_{j,N}(t)/X′_j(t)=0}(x′), it follows that it is bounded by

    P( ‖ ( X′_{j,N}(t) / X′_j(t) = 0 ) − m_{j,N}(t) ‖ ≥ ( x − m̄ − μ ) / μ_t − ‖m_{j,N}(t)‖ )
      ≤ L1 |x|^{d−j−2} exp( − ( x − m̄ − μ − μ_t ‖m_{j,N}(t)‖ )² / ( 2λ(t)μ_t² ) ),    (74)

where L1 is some constant. Putting together (73) and (74) with (72), we obtain (69).
The following two corollaries are straightforward consequences of Theorem 6:
    lim inf_{x→+∞} −2x^{−2} log( p(x) − p_M(x) ) ≥ 1 + 1 / sup_{t∈S} ( σ_t² + λ(t) μ_t² ).
d0
p (x) =
j=0
Examples
1 v (t0 )/2
1/k
kCk
E || 2k 1 x11/k (x),
(75)
where ξ is a standard normal random variable and Ck = (1/(2k)!) v^{(2k)}(t0) + (1/4) [v′′(t0)]² 1I_{k=2}. The proof is a direct application of the Laplace method. The result is new for the density of the maximum, but if we integrate the density from u to +∞, the corresponding bound for P{M > u} is known under weaker hypotheses (Piterbarg [28]).
2) Let the process X be centered and satisfy A1–A5. Assume that the law of the process is isotropic and stationary, so that the covariance has the form (10) and verifies the regularity condition of Section 4. We add the simple normalization −ρ′(0) = 1/2. One can easily check that

    σ_t² = sup_{s∈S\{t}} [ 1 − ρ²(‖s − t‖²) − 4 ρ′²(‖s − t‖²) ‖s − t‖² ] / [ 1 − ρ(‖s − t‖²) ]².    (76)

Furthermore, if

    ρ′(x) ≤ 0 for x ≥ 0,    (77)

one can show that the sup in (76) is attained as ‖s − t‖ → 0 and is independent of t. Its value is

    σ_t² = 12 ρ′′(0) − 1.

The proof is elementary (see [4] or [34]).
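For a concrete check (ours; the covariance ρ(z) = exp(−z/2) is an arbitrary choice satisfying −ρ′(0) = 1/2 and condition (77)), one can evaluate the ratio in (76) numerically and watch it approach 12ρ′′(0) − 1 = 2 as ‖s − t‖ → 0:

```python
import math

def rho(z):       # isotropic covariance r(s,t) = rho(||s-t||^2); here rho(z) = exp(-z/2)
    return math.exp(-z / 2.0)

def drho(z):      # rho'(z); note rho'(0) = -1/2, the normalization in the text
    return -0.5 * math.exp(-z / 2.0)

def sigma2(z):    # the ratio inside the sup in (76), with z = ||s-t||^2
    num = 1.0 - rho(z) ** 2 - 4.0 * drho(z) ** 2 * z
    den = (1.0 - rho(z)) ** 2
    return num / den

# rho''(0) = 1/4, so the claimed limit is 12 * rho''(0) - 1 = 2
limit = 12 * 0.25 - 1
print(sigma2(1e-3))                                    # close to 2
print(max(sigma2(k * 0.01) for k in range(1, 1001)))   # sup over (0, 10] is attained near 0
```

With this ρ the ratio is decreasing in z, so the supremum is indeed reached in the limit s → t, as the text asserts under (77).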
Let S be a convex set. For t ∈ Sj, s ∈ S:

    dist( r01(s, t), C_{t,j} ) = dist( 2 ρ′(‖s − t‖²)(t − s), C_{t,j} ).    (78)

The convexity of S implies that (s − t) ∈ C_{t,j}. Since C_{t,j} is a convex cone and 2ρ′(‖s − t‖²) ≤ 0, one can conclude that r01(s, t) = −2ρ′(‖s − t‖²)(s − t) ∈ C_{t,j}, so that the distance in (78) is equal to zero. Hence

    μ_t = 0 for every t ∈ S,

and an application of Theorem 6 gives the inequality

    lim inf_{x→+∞} −(2/x²) log( p(x) − p_M(x) ) ≥ 1 + 1/( 12 ρ′′(0) − 1 ).    (79)
A direct consequence is that the same inequality holds true when replacing p(x) − p_M(x) by |p^E(x) − p_M(x)| in (79), thus obtaining the main explicit example in Adler and Taylor [3], or in Taylor et al. [34].
Next, we improve (79). In fact, under the same hypotheses, we prove that the liminf is an ordinary limit and the ≥ sign is an equality sign. We state this as

Theorem 7 Assume that X is centered, satisfies hypotheses A1–A5, and the covariance has the form (10) with −ρ′(0) = 1/2, ρ′(x) ≤ 0 for x ≥ 0. Let S be a convex set, and d0 = d ≥ 1. Then

    lim_{x→+∞} −(2/x²) log( p(x) − p_M(x) ) = 1 + 1/( 12 ρ′′(0) − 1 ).    (80)
Remark Notice that since S is convex, the added hypothesis that the maximum dimension d0
such that Sj is not empty is equal to d is not an actual restriction.
Proof of Theorem 7
In view of (79), it suffices to prove that

    lim sup_{x→+∞} −(2/x²) log( p(x) − p_M(x) ) ≤ 1 + 1/( 12 ρ′′(0) − 1 ).    (81)
Using (4) and the definition of p(x) given by (8), one has the inequality

    p(x) − p_M(x) ≥ (2π)^{−d/2} φ(x) ∫_{S_d} E( | det(X′′(t))| 1I_{M>x} / X(t) = x, X′(t) = 0 ) dt,    (82)

where our lower bound only contains the term corresponding to the largest dimension and we have already replaced the density pX(t),X′(t)(x, 0) by its explicit expression using the law of the process. Under the condition {X(t) = x, X′(t) = 0}, if v0ᵀ X′′(t) v0 > 0 for some v0 ∈ S^{d−1}, a Taylor expansion implies that M > x. It follows that

    E( | det(X′′(t))| 1I_{M>x} / X(t) = x, X′(t) = 0 )
      ≥ E( | det(X′′(t))| 1I_{ sup_{v∈S^{d−1}} vᵀX′′(t)v > 0 } / X(t) = x, X′(t) = 0 ).    (83)
We now apply Lemma 2, which describes the conditional distribution of X′′(t) given X(t) = x, X′(t) = 0. Using the notations of this lemma, we may write the right-hand side of (83) as:

    ∫_{−∞}^{+∞} E( | det(Z − xId)| 1I_{ sup_{v∈S^{d−1}} vᵀZv > x } ) ( σ √(2π) )^{−1} exp( −y²/(2σ²) ) dy,    (84)

where Z is the symmetric d × d matrix

         ( y      Z12       . . .   Z1d  )
    Z := ( Z12    ξ2 + y    Z23 . . Z2d  )
         ( . . .                         )
         (                       ξd + y  )

and the random variables {ξ2, . . . , ξd, Zik, 1 ≤ i < k ≤ d} are independent centered Gaussian with

    Var(Zik) = 4ρ′′  (1 ≤ i < k ≤ d);   Var(ξi) = (4ρ′′ − 1)/(12ρ′′ − 1)  (i = 2, . . . , d);   σ² = 16ρ′′(8ρ′′ − 1)/(12ρ′′ − 1).
The expression in (84) is bounded below by

    ( σ √(2π) )^{−1} ∫_{x(1+δ0)}^{x(1+δ0)+1} exp( −y²/(2σ²) ) E( | det(Z − xId)| ) dy
      ≥ ( σ √(2π) )^{−1} ∫_{x(1+δ0)}^{x(1+δ0)+1} exp( −y²/(2σ²) ) L δ0 x^d dy

for x large enough, since on this range of y the diagonal terms of Z − xId are, with probability bounded away from zero, all larger than a constant times x. On account of (82), (83), (84), we conclude that for x large enough,

    p(x) − p_M(x) ≥ L1 x^d exp( − x²/2 − ( x(1+δ0) + 1 )² / (2σ²) ),

for some new positive constant L1. Since δ0 can be chosen arbitrarily small, this implies (81).
3) Consider the same processes of Example 2, but now defined on the non-convex set {a ≤ ‖t‖ ≤ b}, 0 < a < b. The same calculations as above show that μ_t = 0 if a < ‖t‖ ≤ b, and that μ_t is given by a maximum analogous to the one in (66) when ‖t‖ = a.
4) Let us keep the same hypotheses as in Example 2, but without assuming that the covariance is decreasing as in (77). The variance σ_t² is still given by (76) but μ_t is not necessarily equal to zero. More precisely, relation (78) shows that

    μ_t ≤ sup_{s∈S\{t}} 2 ( ρ′(‖s − t‖²) )₊ ‖s − t‖ / ( 1 − ρ(‖s − t‖²) ).

The normalization −ρ′(0) = 1/2 implies that the process X is "identity speed", that is, Var(X′(t)) = Id, so that λ(t) = 1. An application of Theorem 6 gives
    lim inf_{x→+∞} −(2/x²) log( p(x) − p_M(x) ) ≥ 1 + 1/σ_Z²,    (85)

where

    σ_Z² := sup_{z∈(0,δ]} [ 1 − ρ²(z²) − 4ρ′²(z²) z² ] / [ 1 − ρ(z²) ]² + max_{z∈(0,δ]} 4 [ ( ρ′(z²) )₊ z ]² / [ 1 − ρ(z²) ]²,

and δ is the diameter of S.
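The "identity speed" property used above, Var(X′(t)) = −2ρ′(0) Id = Id, can be verified numerically; here is a sketch of our own, again with the hypothetical covariance ρ(z) = exp(−z/2):

```python
import math

def r(s, t):
    # isotropic covariance r(s,t) = rho(||s-t||^2) with rho(z) = exp(-z/2),
    # so that -rho'(0) = 1/2 (the normalization in the text)
    z = sum((si - ti) ** 2 for si, ti in zip(s, t))
    return math.exp(-z / 2.0)

def var_dXi(t, i, eps=1e-4):
    # Var(X_i'(t)) = d^2 r / ds_i dt_i evaluated at s = t, by central finite differences
    def shift(p, j, delta):
        q = list(p); q[j] += delta; return tuple(q)
    return (r(shift(t, i, eps), shift(t, i, eps)) - r(shift(t, i, eps), shift(t, i, -eps))
            - r(shift(t, i, -eps), shift(t, i, eps)) + r(shift(t, i, -eps), shift(t, i, -eps))) / (4 * eps ** 2)

t = (0.3, -0.7)
print(var_dXi(t, 0), var_dXi(t, 1))  # both close to 1: Var(X'(t)) = Id
```

The mixed second derivative of r at the diagonal equals −2ρ′(0) in each coordinate, which the finite-difference values reproduce to within discretization error.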
5) Suppose that:
- the process X is stationary with covariance Γ(t) := Cov(X(s), X(s + t)) that satisfies Γ(s1, . . . , sd) = Π_{i=1,...,d} Γi(si), where Γ1, . . . , Γd are d covariance functions on R which are monotone, positive on [0, +∞) and of class C⁴;
- S is a rectangle

    S = Π_{i=1,...,d} [ai, bi],   ai < bi.

Then, adding an appropriate non-degeneracy condition, conditions A2–A5 are fulfilled and Theorem 6 applies.
It is easy to see that

    r01(s, t) = ( Γ1′(s1 − t1) Γ2(s2 − t2) · · · Γd(sd − td), . . . , Γ1(s1 − t1) · · · Γd−1(sd−1 − td−1) Γd′(sd − td) )ᵀ

belongs to C_{t,j} for every s ∈ S. As a consequence, μ_t = 0 for all t ∈ S. On the other hand,
standard regression formulae show that

    Var( X(s) / X(t), X′(t) ) / (1 − r(s, t))² = [ 1 − Γ1² · · · Γd² − Γ1′² Γ2² · · · Γd² − · · · − Γ1² · · · Γd−1² Γd′² ] / ( 1 − Γ1 · · · Γd )²,

where the Γi and Γi′ are evaluated at si − ti.
References
[1] Adler, R.J. (1981). The Geometry of Random Fields. Wiley, New York.
[2] Adler, R.J. (1990). An Introduction to Continuity, Extrema and Related Topics for General
Gaussian Processes. IMS, Hayward, Ca.
[3] Adler, R.J. and Taylor, J.E. (2005). Random Fields and Geometry. Book, to appear.
[4] Azaïs, J-M., Bardet, J-M. and Wschebor, M. (2002). On the tails of the distribution of the maximum of a smooth stationary Gaussian process. ESAIM: P. and S., 6, 177-184.
[5] Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the maximum of a Gaussian random field. Extremes, 5(2), 181-212.
[6] Azaïs, J-M. and Wschebor, M. (2002). The distribution of the maximum of a Gaussian process: Rice method revisited. In and Out of Equilibrium: Probability with a Physical Flavour, Progress in Probability, 321-348, Birkhäuser.
[7] Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
[8] Azaïs, J-M. and Wschebor, M. (2005). On the distribution of the maximum of a Gaussian field with d parameters. Annals of Applied Probability, 15 (1A), 254-278.
[9] Azaïs, J-M. and Wschebor, M. (2006). A self-contained proof of the Rice formula for random fields. Preprint available at http://www.lsp.ups-tlse.fr/Azais/publi/completeproof.pdf.
[10] Belyaev, Y. (1966). On the number of intersections of a level by a Gaussian Stochastic
process. Theory Prob. Appl., 11, 106-113.
[11] Berman, S.M. (1985a). An asymptotic formula for the distribution of the maximum of a Gaussian process with stationary increments. J. Appl. Prob., 22, 454-460.
[12] Berman, S.M. (1992). Sojourns and Extremes of Stochastic Processes. Wadsworth and Brooks/Cole, Probability Series.
[13] Borell, C. (1975). The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30, 207-216.
[14] Borell, C. (2003). The Ehrhard inequality. C.R. Acad. Sci. Paris, Ser. I, 337, 663-666.
[15] Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
[16] Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. Numer. Math., 94, 419-478.
[17] Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour (1974). Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
[18] Fyodorov, Y. (2004). Complexity of Random Energy Landscapes, Glass Transition and Absolute Value of the Spectral Determinant of Random Matrices. Physical Review Letters, 92, 240601 (4 pages); Erratum: ibid., 93 (2004), 149901 (1 page).
[19] Kendall, M.G., Stuart, A. and Ord, J.K. (1987). The Advanced Theory of Statistics, Vol. 3.
[20] Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369-378.
[21] Ledoux, M. (2001). The Concentration of Measure Phenomenon. American Math. Soc.,
Providence, RI.
[22] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces, Springer-Verlag,
New-York.
[23] Marcus, M.B. (1977). Level Crossings of a Stochastic Process with Absolutely Continuous
Sample Paths, Ann. Probab., 5, 52-71.
[24] Marcus, M.B. and Shepp, L.A. (1972). Sample behaviour of Gaussian processes. Proc.
Sixth Berkeley Symp. Math. Statist. Prob., 2, 423-442.
[25] Mehta, M.L. (2004). Random Matrices, 3rd ed. Academic Press.
[26] Mercadier, C. (2006). Numerical bounds for the distribution of the maximum of one- and
two-dimensional processes, to appear in Advances in Applied Probability, 38, (1).
[27] Piterbarg, V.I. (1981). Comparison of distribution functions of maxima of Gaussian processes. Theory Probab. Appl., 26, 687-705.
[28] Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and
Fields. American Mathematical Society. Providence. Rhode Island.
[29] Rychlik, I. (1990). New bounds for the first passage, wave-length and amplitude densities.
Stochastic Processes and their Applications, 34, 313-339.
[30] Sudakov, V.N. and Tsirelson, B.S. (1974). Extremal properties of half spaces for spherically
invariant measures (in Russian). Zap. Nauchn. Sem. LOMI, 45, 75-82.
27
[31] Sun, J. (1993). Tail Probabilities of the Maxima of Gaussian Random Fields, Ann. Probab.,
21, 34-71.
[32] Talagrand, M. (1996). Majorizing measures: the generic chaining. Ann. Probab., 24, 1049-1103.
[33] Taylor, J.E. and Adler, R. J. (2003). Euler characteristics for Gaussian fields on manifolds.
Ann. Probab., 31, 533-563.
[34] Taylor J.E., Takemura A. and Adler R.J. (2005). Validity of the expected Euler Characteristic heuristic. Ann. Probab., 33, 4, 1362-1396.
[35] Wschebor, M. (1985). Surfaces aléatoires. Mesure géométrique des ensembles de niveau. Lecture Notes in Mathematics, 1147, Springer-Verlag.
and FI (u) = P{MI ≤ u}, u ∈ R, the probability distribution function of the random variable MI. Our aim is to study the regularity of the function FI when d > 1.
There exist a certain number of general results on this subject, starting from
the papers by Ylvisaker (1968) and Tsirelson (1975) (see also Weber (1985), Lifshits
(1995), Diebolt and Posse (1996) and references therein). The main purpose of this
paper is to extend to d > 1 some of the results about the regularity of the function u ↦ FI (u) in Azaïs & Wschebor (2001), which concern the case d = 1.
Our main tool here is Rice Formula for the moments of the number of roots
NuZ (I) of the equation Z(t) = u on the set I, where {Z(t) : t I} is an Rd -valued
Gaussian field, I is a subset of Rd and u a given point in Rd . For d > 1, even
though it has been used in various contexts, as far as the authors know, a full proof
of Rice Formula for the moments of NuZ (I) seems to have only been published by R.
Adler (1981) for the first moment of the number of critical points of a real-valued
stationary Gaussian process with a d-dimensional parameter, and extended by Azaïs and Delmas (2002) to the case of processes with constant variance. Cabaña (1985) contains related formulae for random fields; see also the PhD thesis of Konakov
cited by Piterbarg (1996b). In the next section we give a more general result which
has an interest that goes beyond the application of the present paper. At the same
time the proof appears to be simpler than previous ones. We have also included
the proof of the formula for higher moments, which in fact follows easily from the
first moment. Both extend with no difficulties to certain classes of non-Gaussian
processes.
It should be pointed out that the validity of Rice Formula for Lebesgue-almost
every u Rd is easy to prove (Brillinger, 1972) but this is insufficient for a certain
number of standard applications. For example, assume X : I → R is a real-valued
random process and one is willing to compute the moments of the number of critical
points of X. Then, we must take for Z the random field Z(t) = X (t) and the
formula one needs is for the precise value u = 0 so that a formula for almost every
u does not solve the problem.
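To make the point concrete, here is a toy numerical version of this situation (our illustration, not from the paper): for a random trigonometric polynomial we take Z = X′ and count the roots of Z(t) = 0 — the critical points of X — exactly at the level u = 0:

```python
import math, random

random.seed(0)

# X(t) = sum_k (a_k cos(kt) + b_k sin(kt)): a smooth stationary Gaussian process on the circle
K = 5
a = [random.gauss(0, 1) for _ in range(K)]
b = [random.gauss(0, 1) for _ in range(K)]

def dX(t):
    # Z(t) := X'(t); the critical points of X are exactly the roots of Z(t) = 0,
    # so the formula is needed at the precise level u = 0
    return sum((k + 1) * (-a[k] * math.sin((k + 1) * t) + b[k] * math.cos((k + 1) * t))
               for k in range(K))

# count the roots of Z on [0, 2*pi) by sign changes on a fine grid
n = 20000
vals = [dX(2 * math.pi * i / n) for i in range(n + 1)]
roots = sum(1 for i in range(n) if vals[i] * vals[i + 1] < 0)
print(roots)  # between 2 and 2K for a trigonometric polynomial of degree K
```

A trigonometric polynomial of degree K has at most 2K zeros per period, and a non-constant periodic path always has at least two critical points, which bounds the count on both sides.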
We have added Rice Formula for processes defined on smooth manifolds. Even
though Rice Formula is local, this is convenient for various applications. We will
need a formula of this sort to state and prove the implicit formulae for the derivatives
of the distribution of the maximum (see Section 3).
2
The results on the differentiation of FI are partial extensions of Azas & Wschebor (2001). They concern only the first two derivatives and remain quite far away
from what is known for d = 1. The main result in that paper states that if X is
a real-valued Gaussian process defined on a certain compact interval I of the real
line, has C 2k paths (k integer, k 1) and satisfies a non-degeneracy condition, then
the distribution of MI is of class C k .
For Gaussian fields defined on a d-dimensional regular manifold (d > 1) and
possessing regular paths we obtain some improvements with respect to classical
and general results due to Tsirelson (1975) for Gaussian sequences. An example is Corollary 6.1, which provides an asymptotic formula for FI (u) as u → +∞ that is explicit in terms of the covariance of the process and can be compared with Theorem
4 in Tsirelson (1975) where an implicit expression depending on the function F itself
is given.
We use the following notations:
If Z is a smooth function U → Rd, U a subset of Rd, its successive derivatives are denoted Z′, Z′′, ..., Z(k), and considered respectively as linear, bilinear, ..., k-linear forms on Rd. For example, X(3)(t){v1, v2, v3} is the value of the third derivative at point t applied to the triplet (v1, v2, v3). The same notation is used for a derivative on a C∞ manifold.
I̊, ∂I and Ī are respectively the interior, the boundary and the closure of the set I. If ξ is a random vector with values in Rd, whenever they exist, we denote by pξ(x) the value of the density of ξ at the point x, by E(ξ) its expectation and by Var(ξ) its variance-covariance matrix. λ is Lebesgue measure.
If u, v are points in Rd, ⟨u, v⟩ denotes their usual scalar product and ‖u‖ the Euclidean norm of u.
For M a d × d real matrix, we denote ‖M‖ = sup_{‖x‖=1} ‖M x‖ (the operator norm).
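As a quick numerical companion to this definition (ours): the operator norm so defined coincides with the largest singular value of M, which NumPy computes directly; the matrix below is an arbitrary example:

```python
import numpy as np

M = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# ||M|| = sup_{||x||=1} ||Mx|| is the largest singular value (spectral norm)
op_norm = np.linalg.norm(M, 2)

# brute-force check over many unit vectors of R^2
theta = np.linspace(0.0, 2 * np.pi, 100000)
X = np.stack([np.cos(theta), np.sin(theta)])   # unit vectors as columns
brute = np.linalg.norm(M @ X, axis=0).max()
print(op_norm, brute)   # the two values agree
```

For this M, MᵀM has eigenvalues 45 and 5, so ‖M‖ = √45 ≈ 6.708.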
Rice formulae

    E( NuZ (I) ) = ∫_I E( | det(Z′(t))| / Z(t) = u ) pZ(t)(u) dt    (1)

    E( NuZ (I)( NuZ (I) − 1 ) · · · ( NuZ (I) − k + 1 ) )
      = ∫_{I^k} E( Π_{j=1}^k | det(Z′(tj))| / Z(t1) = · · · = Z(tk) = u ) pZ(t1),...,Z(tk)(u, . . . , u) dt1 . . . dtk
hold whenever hypotheses A0–A3 are verified. This follows immediately from the above statements. A standard extension argument shows that (1) holds true if one replaces I by any Borel subset of I.
Sufficient conditions for hypothesis A3 to hold are given by the next proposition.
Proposition 2.1 Let Z : I → Rd, I a compact subset of Rd, be a random field with paths of class C¹ and u ∈ Rd. Assume that
pZ(t)(x) ≤ C for all t ∈ I and x in some neighbourhood of u;
at least one of the two following hypotheses is satisfied:
a) a.s. t ↦ Z(t) is of class C²,
b) α(δ) = sup_{t∈I, x∈V(u)} P( | det(Z′(t)) | < δ / Z(t) = x ) → 0 as δ → 0.
Denote by det the modulus of continuity of | det(X (.))| and choose m large enough
so that
d
P(Fm, ) = P det (
) .
m
Consider the partition of I into m^d small cubes with sides of length 1/m. Let Ci1...id be such a cube and ti1...id its centre (1 ≤ i1, ..., id ≤ m). Then
c
c
Fm,
P GCi1 ...id EM
(3)
When the event in the term corresponding to i1 ...id of the last sum occurs, we have:
|Zj (ti1 ...id )|
M
d j = 1, ..., d
m
< .
So, if m is chosen sufficiently large so that V (0) contains the ball centred at 0 with radius M√d/m, one has:
P(GI ) 2 + md (
2M d
d) C()
m
(si )d 0 as m +.
i=1
So,
h(m)
NuZ (I)
= 0 P(EM ) +
c
NuZ (Ci ) = 0 EM
i=1
h(m)
+
i=1
d
P |Zj (ti ) uj | M si
j = 1, ..., d + C
2
h(m)
( dM si )d
i=1
=0
Thus
| Z (t1 )v i | = | Z (t1 )v i Z (i )v i |
d
k=1
|vk | Z (
) d
k=1
In conclusion
min Z (t1 ) Z (t1 )v Z (
)d,
that implies > .
Proof of Theorem 2.1: Consider a continuous non-decreasing function F such that

    F (x) = 0 for x ≤ 1/2,    F (x) = 1 for x ≥ 1.
1
inf min Z (s) + Z(s) u
2 sI
1F
d
Z ()
(4)
ball with diameter 2 centred at a point in I there is at most one root of the
equation Z(t) = u, and a compactness argument shows that NuZ (I ) is bounded by
a constant C(, I), depending only on and on the set I.
Take now any real-valued non-random continuous function f : Rd R with
compact support. Because of the coarea formula (Federer, 1969, Th 3.2.3), since
a.s. Z is Lipschitz and , (u).f (u) is integrable:
Rd
f (u)NuZ (I ), (u)du =
Rd
variable | det(Z′(t))| αδ,ε(u) is a functional defined on {(Z(s), Z′(s)) : s ∈ I}. Perform a Gaussian regression of (Z(s), Z′(s)) : s ∈ I with respect to the random variable Z(t), that is, write

    Z(s) = Y t(s) + αt(s) Z(t),
    Z′j(s) = Y t_j(s) + βt_j(s) Z(t),   j = 1, ..., d,

where Z′j(s) (j = 1, ..., d) denote the columns of Z′(s), Y t(s) and Y t_j(s) are Gaussian vectors, independent of Z(t) for each s ∈ I, and the regression matrices αt(s), βt_j(s) (j = 1, ..., d) are continuous functions of s, t (take into account A2). Replacing in the conditional expectation, we are now able to get rid of the conditioning, and using the fact that the moments of the supremum of an a.s. bounded Gaussian process are finite, the continuity in u follows by dominated convergence.
So, now we fix u ∈ Rd and make ε ↓ 0, δ ↓ 0 in that order, both in (i) and (ii). For (i) one can use Beppo Levi's Theorem. Note that almost surely

    NuZ (I−δ) ↑ NuZ (I̊) = NuZ (I),

where the last equality follows from Lemma 2.1. On the other hand, the same Lemma 2.1 plus A3 imply together that, almost surely:

    inf_{s∈I} ( min_{‖v‖=1} ‖Z′(s)v‖ + ‖Z(s) − u‖ ) > 0,

so that the first factor in the right-hand member of (4) increases to 1 as δ decreases to zero. Hence by Beppo Levi's Theorem:

    lim_{δ↓0} lim_{ε↓0} E( NuZ (I−δ) αδ,ε(u) ) = E( NuZ (I) ).
For (ii), one can proceed in a similar way after de-conditioning, obtaining (1). To
finish the proof, remark that standard Gaussian calculations show the finiteness of
the right-hand member of (1).
Proof of Theorem 2.2: For each δ > 0, define the domain

    Dk,δ(I) = { (t1, . . . , tk) ∈ I^k : ‖ti − tj‖ ≥ δ if i ≠ j, i, j = 1, . . . , k }

and the process Z̃(t1, . . . , tk) := ( Z(t1), . . . , Z(tk) ), (t1, . . . , tk) ∈ Dk,δ(I).
It is clear that Z̃ satisfies the hypotheses of Theorem 2.1 for every value (u, ..., u) ∈ (Rd)k. So,

    E( NZ̃_{(u,...,u)}( Dk,δ(I) ) )
      = ∫_{Dk,δ(I)} E( Π_{j=1}^k | det(Z′(tj))| / Z(t1) = ... = Z(tk) = u ) pZ(t1),...,Z(tk)(u, ..., u) dt1 ... dtk.    (5)
To finish, let δ ↓ 0, note that NuZ (I)( NuZ (I) − 1 ) · · · ( NuZ (I) − k + 1 ) is the monotone limit of NZ̃_{(u,...,u)}( Dk,δ(I) ), and that the diagonal Dk(I) = { (t1, ..., tk) ∈ I^k : ti = tj for some pair i, j, i ≠ j } has zero Lebesgue measure in (Rd)k.
Remark: Even though we will not use this in the present paper, we point out that it is easy to adapt the proofs of Theorems 2.1 and 2.2 to certain classes of non-Gaussian processes.
For example, the statement of Theorem 2.1 remains valid if one replaces hypotheses A0 and A2 respectively by the following B0 and B2:
B0: Z(t) = H(Y(t)) for t ∈ I, where Y : I → Rn is a Gaussian process with C¹ paths such that for each t ∈ I, Y(t) has a non-degenerate distribution, and H : Rn → Rd is a C¹ function.
B2: for each t ∈ I, Z(t) has a density pZ(t) which is continuous as a function of (t, u).
Note that B0 and B2 together imply that n ≥ d. The only change to be introduced in the proof of the theorem is in the continuity of (ii), where the regression is performed on Y(t) instead of Z(t).
Similarly, the statement of Theorem 2.2 remains valid if we replace A0 by B0 and add the requirement that the joint density of Z(t1), ..., Z(tk) be a continuous function of t1, ..., tk, u for pairwise different t1, ..., tk.
Now consider a process X from I to R and define

    MXu,1(I) = { t ∈ I : X(·) has a local maximum at the point t, X(t) > u },
    MXu,2(I) = { t ∈ I : X′(t) = 0, X(t) > u }.
The problem of writing Rice formulae for the factorial moments of these random variables can be considered as a particular case of the previous one and the proofs are the same, mutatis mutandis. For further use, we state as a theorem Rice formula for the expectation. For short, we do not state the equivalent of Theorem 2.2, which holds true similarly.
Theorem 2.3 Let X : I → R, I a compact subset of Rd, be a random field. Let u ∈ R and define MXu,i(I), i = 1, 2, as above. For each d × d real symmetric matrix M, we put φ1(M) := | det(M)| 1I_{M≺0}, φ2(M) := | det(M)|.
Assume:
A0: X is Gaussian,
A1: a.s. t ↦ X(t) is of class C²,
A2: for each t ∈ I, (X(t), X′(t)) has a non-degenerate distribution in R¹ × Rd,
A3: either
a.s. t ↦ X(t) is of class C³
or
    α(δ) = sup_{t∈I, x′∈V(0)} P( | det(X′′(t)) | < δ / X′(t) = x′ ) → 0 as δ → 0.
Then, for k = 1, 2:

    E( MXu,k(I) ) = ∫_I dt ∫_u^{+∞} E( φk(X′′(t)) / X(t) = x, X′(t) = 0 ) pX(t),X′(t)(x, 0) dx.

2.1 Abstract manifold
Proposition 2.2 For k = 1, 2, the quantity μk which is expressed in every chart with coordinates s1, . . . , sd as

    μk = ∫ ∫_u^{+∞} E( φk(Y′′(s)) / Y(s) = x, Y′(s) = 0 ) pY(s),Y′(s)(x, 0) dx Π_{i=1}^d dsi    (6)

satisfies μk = E( MXu,k(S) ).
Denote by s1i and s2i, i = 1, . . . , d, the coordinates in each chart, so that Y1(s1) = Y2(s2) with s2 = H(s1). We have

    ∂Y1/∂s1i = Σ_{i′} ( ∂Y2/∂s2i′ ) ( ∂Hi′/∂s1i ),

    ∂²Y1/∂s1i ∂s1j = Σ_{i′,j′} ( ∂²Y2/∂s2i′ ∂s2j′ ) ( ∂Hi′/∂s1i )( ∂Hj′/∂s1j ) + Σ_{i′} ( ∂Y2/∂s2i′ ) ( ∂²Hi′/∂s1i ∂s1j ).
    pY1(s1),Y1′(s1)(x, 0) = pY2(s2),Y2′(s2)(x, 0) | det(H′(s1)) |^{−1},

and at a singular point

    Y1′′(s1) = H′(s1)ᵀ Y2′′(s2) H′(s1).
d(s)
(S)
(7)
Since MXu,k(S) is equal to MYu,k(φ(S)), we see that the result is a direct consequence of Theorem 2.3.

2.1.2 Riemannian manifold
The form in (6) is intrinsic (in the sense that it does not depend on the parametrization) but the terms inside the integrand are not. It is possible to give a completely intrinsic expression in the case when U is equipped with a Riemannian metric. When such a Riemannian metric is not given, it is always possible to use the metric g induced by the process itself (see Taylor and Adler, 2002) by setting

    gs(Y, Z) = E( Y(X) Z(X) )

for Y, Z belonging to the tangent space T(s) at s ∈ U. Y(X) (resp. Z(X)) denotes the action of the tangent vector Y (resp. Z) on the function X. This metric leads to very simple expressions for centred variance-1 Gaussian processes.
The main point is that at a singular point of X the second order derivative D²X is intrinsic, since it defines locally the Taylor expansion. Given the Riemannian metric gs, the second differential can be represented by an endomorphism that will be denoted ∇²X(s):

    D²X(s){Y, Z} = Y(Z(X)) = Z(Y(X)) = gs( ∇²X(s) Y, Z ).    (8)

In fact, at a singular point the definition given by formula (8) coincides with the definition of the Hessian read in an orthonormal basis. This endomorphism is intrinsic, and so of course is its determinant. So in a chart

    det( ∇²X(s) ) = det( D²X(s) ) det(gs)^{−1},    (9)

    ∇²X(s) = gs^{−1/2} D²X(s) gs^{−1/2},    (10)

where we have omitted the tilde above X(s) for simplicity. This is the Riemannian intrinsic expression.

2.1.3 Embedded manifold
the natural derivative on Rm. The manifold is equipped with the metric induced by the Euclidean metric in Rm. Considering the form (10), clearly the Riemannian volume is just the geometric measure on U.
Following Milnor (1965), we assume that the process Xt is defined on an open neighbourhood of U, so that the ordinary derivatives X′(s) and X′′(s) are well defined for s ∈ U. Denoting the projectors onto the tangent and normal spaces by PT(s) and PN(s), we have

    ∇X(s) = PT(s)( X′(s) ).

We now define the second fundamental form II of U embedded in Rm, which can be defined in our simple case as the bilinear application (see Kobayashi–Nomizu 199?, T 2, chap. 7 for details)

    Y, Z ∈ T(s) ↦ PN(s)( ∇Y Z ).    (11), (12)
dx
u
dt
0
    ∫∫ dt1 dt2 E( | det(Z1′(t1))| | det(Z2′(t2))| / Z1(t1) = u1, Z2(t2) = u2 ) pZ1(t1),Z2(t2)(u1, u2),    (14)
g( Y t(·) + αt(·) u ) is continuous at u = u0. Then the formula:
E NuZ (I) =
holds true.
We will be particularly interested in the function g = 1I_{MI < v} for some v ∈ R. We will see later on that it satisfies the above conditions under certain hypotheses on the process Z.
Our main goals in this and the next section are to prove existence and regularity of the derivatives of the function u ↦ FI (u) and, at the same time, that they satisfy some implicit formulae that can be used to provide bounds on them. In the following we assume that I is a d-dimensional C∞ manifold embedded in RN, N ≥ d, where the function φ1 has been defined in the statement of Theorem 2.3 and X̃ denotes the restriction of X to the boundary ∂I.
Even for d = 1 (one-parameter processes) and X Gaussian and stationary, inequality (15) provides reasonably good upper bounds for F′I (u) (see Diebolt and Posse (1996), Azaïs and Wschebor (2001)). We will see an example for d = 2 at the end of this section.
In the next section, we are able to prove that FI (u) is a C¹ function and that formula (17) can be essentially simplified by getting rid of the conditional expectation, thus obtaining the second form for the derivative. This is done under weaker regularity conditions, but the assumption that X is Gaussian becomes essential.
In case the dimension d of the parameter is equal to 1, this is the starting point to continue the differentiation procedure, and under hypotheses H2k one is able to prove that FI is a C k function and to obtain implicit formulae for F(k)I (see Azaïs & Wschebor, 2001).
When d > 1, a certain number of difficulties arise and it is not clear that the process can continue beyond k = 2. With the purpose of establishing such formulae for F′′I, we introduce in Section 4 the helix-processes, which appear in a natural way in these formulae and have paths possessing singularities of a certain form that will be described precisely in that section.
Definition 3.1 Let X : I → R be a real-valued stochastic process defined on a subset I of Rd. We will say that X satisfies condition (Hk), k a positive integer, if the following three conditions hold true:
X is Gaussian;
a.s. the paths of X are of class C k;
for any choice of pairwise different values of the parameter t1, ...., tn, the joint distribution of the random variables

    X(t1), ..., X(tn), X′(t1), ..., X′(tn), ....., X(k)(t1), ..., X(k)(tn)    (16)

has maximum rank. Note that the number of distinct real-valued Gaussian variables belonging to this set (16), on account of exchangeability of the order of differentiation, is equal to

    n [ 1 + C(d, d−1) + C(d+1, d−1) + · · · + C(k+d−1, d−1) ],

where C(m, d−1) stands for the binomial coefficient.
The next proposition shows that there exist processes that satisfy (Hk).
Proposition 3.1 Let X = {X(t) : t ∈ Rd} be a centred stationary Gaussian process having continuous spectral density f X. Assume that f X(x) > 0 for every x ∈ Rd and that, for any α > 0, f X(x) ≤ Cα ‖x‖^{−α} holds true for some constant Cα and all x ∈ Rd.
Then, X satisfies (Hk) for every k = 1, 2, ...
Z=
h=1
where Σk denotes summation over all the d-tuples of non-negative integers k1, k2, ..., kd such that k1 + k2 + ... + kd ≤ k, and the γ_{k1,k2,...,kd,h} are complex numbers; then E(|Z|²) = 0 implies γ_{k1,k2,...,kd,h} = 0 for any choice of the indices k1, k2, ..., kd, h in the sum. Using the spectral representation, and denoting x = (x1, ..., xd),
n
E |Z|
. exp [i x, th th ] f X (x)dx
where the inner sum is over all 2d-tuples of non-negative integers k1, k2, ..., kd, k1′, k2′, ..., kd′ such that k1 + k2 + ... + kd ≤ k, k1′ + k2′ + ... + kd′ ≤ k. Hence,
    E( |Z|² ) = ∫_{Rd} | Σ_{h=1}^n Σk γ_{k1,...,kd,h} x1^{k1} · · · xd^{kd} exp[ i⟨x, th⟩ ] |² f X(x) dx.
The result follows from the fact that the set of functions x1^{k1} · · · xd^{kd} exp[ i⟨x, th⟩ ], where k1, k2, ..., kd, h vary as above, is linearly independent.
Proof: For u < v and S (respectively S̃) a subset of I̊ (resp. ∂I), let us denote
Mu,v (S) =
Mu,v (S) =
1 Muh,u (I) 1
Muh,u (I)
(18)
(dt)(dt)
I I
dxdx
uh
(dt)
I\I
uh
= sup |g(t)|
tI
,k
sup
k1 +k2 +..+kd k
21
k1 ,k2 ...,kd g
For fixed β > 0 (to be chosen later on) and h > 0, we denote by Eh the event

    Eh = { ‖X‖∞,4 ≤ h^{−β} }.
Because of the Landau–Shepp–Fernique inequality (see Landau and Shepp (1970) or Fernique (1975)) there exist positive constants C1, C2 such that

    P(Eh^c) ≤ C1 exp( −C2 h^{−2β} ) = o(h) as h ↓ 0,

so that to have (17) it suffices to show that, as h ↓ 0:

    E( 1I_{Mu−h,u(I) ≥ 1} 1I_{MI ≤ u} 1I_{Eh} ) = o(h),    (20)

    E( Mu−h,u(I) ( Mu−h,u(I) − 1 ) 1I_{Eh} ) = o(h).    (21)
For the first of these, the Rice formula gives

E[ M_{u−h,u}(I) (M_{u−h,u}(I) − 1) 1I_{E_h} ] = ∫∫_{I×I} ds dt ∫∫_{(u−h,u]²} dx1 dx2 A_{s,t},

where

A_{s,t} := E[ |det X″(s)| |det X″(t)| 1I_{E_h} | X(s) = x1, X(t) = x2, X′(s) = 0, X′(t) = 0 ] p_{X(s),X(t),X′(s),X′(t)}(x1, x2, 0, 0)

for s, t in the interior of I, s ≠ t.
So it is enough to prove that A_{s,t} = o(h) for ‖t − s‖ small, and we may assume that s and t are in the same chart (U, φ). Writing the process in this chart we may assume that I is a ball or a half ball in R^d. Let s, t be two such points and define the process Y = Y^{s,t} by

Y(τ) := X(s + τ(t − s)),  τ ∈ [0, 1].

Under the conditioning, Y(0) = x1, Y(1) = x2, Y′(0) = Y′(1) = 0. Let Q be the polynomial of degree 3 determined by

Q(0) = x1,  Q(1) = x2,  Q′(0) = Q′(1) = 0.

Check that

Q(y) = x1 + (x2 − x1) y² (3 − 2y),  Q″(0) = −Q″(1) = 6 (x2 − x1).

Denote

Z(τ) := Y(τ) − Q(τ),  0 ≤ τ ≤ 1.

Under the conditioning, one has

Z(0) = Z(1) = Z′(0) = Z′(1) = 0,

and if also the event E_h occurs, an elementary calculation shows that

sup_{[0,1]} |Z″(τ)| ≤ (1/2!) sup_{[0,1]} |Z^(4)(τ)| = (1/2!) sup_{[0,1]} |Y^(4)(τ)| ≤ (const) ‖t − s‖⁴ h^{−δ}.   (24)
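The stated properties of the interpolation polynomial Q can be verified directly; a minimal check (x1, x2 arbitrary):

```python
# Verify the cubic Q(y) = x1 + (x2 - x1) * y^2 * (3 - 2y) used above:
# Q(0) = x1, Q(1) = x2, Q'(0) = Q'(1) = 0, Q''(0) = -Q''(1) = 6 (x2 - x1).
x1, x2 = 0.7, -1.3

def Q(y):   return x1 + (x2 - x1) * y**2 * (3 - 2 * y)
def dQ(y):  return (x2 - x1) * (6 * y - 6 * y**2)   # first derivative by hand
def d2Q(y): return (x2 - x1) * (6 - 12 * y)         # second derivative by hand

assert Q(0) == x1 and Q(1) == x2
assert dQ(0) == 0 and dQ(1) == 0
assert d2Q(0) == 6 * (x2 - x1) and d2Q(1) == -6 * (x2 - x1)
print("ok")
```

Note also that Q″(0) + Q″(1) = 0, the cancellation exploited in the bound below.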
Next, set v1 := (t − s)/‖t − s‖ and complete it to an orthonormal basis {v1, ..., vd} of R^d; let B be the matrix of X″(s) in this basis, so that |det X″(s)| = |det(B)|.   (25)

Note that in that case, the elements of matrix B are of the form ⟨X″(s) v_j, v_k⟩, hence bounded by (const) h^{−δ} on the event E_h, and that ⟨X″(s) v1, v1⟩ = ‖t − s‖^{−2} Y″(0). So, on E_h,

|det X″(s)| |det X″(t)| ≤ ‖t − s‖^{−4} C_d² h^{−2δ(d−1)} (Y″(0))⁻ (Y″(1))⁻,

where C_d depends only on d. Since Q″(0) + Q″(1) = 0, denoting by C the conditioning in the definition of A_{s,t},

E[ (Y″(0))⁻ (Y″(1))⁻ 1I_{E_h} | C ] ≤ E[ ((Y″(0) + Y″(1))/2)² 1I_{E_h} | C ] = E[ ((Z″(0) + Z″(1))/2)² 1I_{E_h} | C ] ≤ (const) ‖t − s‖⁸ h^{−2δ},

because of (24).
We now turn to the density in (22) using the following Lemma which is similar
to Lemma 4.3., p. 76, in Piterbarg (1996).
Lemma 3.1 For all s, t ∈ I, s ≠ t:

p_{X(s),X(t),X′(s),X′(t)}(0, 0, 0, 0) ≤ D ‖t − s‖^{−(d+3)},   (26)

where D is a constant (the same bound then holds at any argument, a centred Gaussian density being maximal at the origin).

Proof. Assume that (26) does not hold, i.e., that there exist two convergent sequences {sn}, {tn} in I, sn → s*, tn → t*, such that

‖tn − sn‖^{d+3} p_{X(sn),X(tn),X′(sn),X′(tn)}(0, 0, 0, 0) → +∞.   (27)

If s* ≠ t*, (27) cannot hold, since the non-degeneracy condition assures that this sequence has the finite limit ‖t* − s*‖^{d+3} p_{X(s*),X(t*),X′(s*),X′(t*)}(0, 0, 0, 0). So s* = t*.
Since one can assume with no loss of generality that I is a ball or a half ball, the segment [sn, tn] is contained in I. Denote the unit vector e_{1,n} := (tn − sn)/‖tn − sn‖, complete it to an orthonormal basis {e_{1,n}, e_{2,n}, ..., e_{d,n}} of R^d and take a subsequence of the integers {n_k} so that e_{j,n_k} → e_j as k → +∞ for j = 1, ..., d. In what follows, without loss of generality, we assume that {n_k} is the sequence of all positive integers. For each ξ ∈ R^d we denote by ξ_{1,n}, ..., ξ_{d,n} the coordinates of ξ in the basis {e_{1,n}, ..., e_{d,n}}. Note that tn − sn has coordinates (t_{1,n} − s_{1,n}, 0, ..., 0) = (‖tn − sn‖, 0, ..., 0). Also, we denote by ξ_1, ..., ξ_d the coordinates of ξ in the basis {e_1, ..., e_d}.
The following computation is similar to the proof of Lemma 3.2 in Azaïs & Wschebor (2001). We have:

Δn := det Var( X(sn), X(tn), X′(sn), X′(tn) )
    = det Var( X(sn), X(tn), ∂X/∂ξ_{1,n}(sn), ∂X/∂ξ_{1,n}(tn), ..., ∂X/∂ξ_{d,n}(sn), ∂X/∂ξ_{d,n}(tn) )
    = det Var( X(sn), ∂X/∂ξ_{1,n}(sn), Y_{1,n}, Z_{1,n}, ∂X/∂ξ_{2,n}(sn), Z_{2,n}, ..., ∂X/∂ξ_{d,n}(sn), Z_{d,n} ),

where

Y_{1,n} := X(tn) − X(sn) − ∂X/∂ξ_{1,n}(sn) (t_{1,n} − s_{1,n}),
Z_{1,n} := ∂X/∂ξ_{1,n}(tn) − ∂X/∂ξ_{1,n}(sn) − (2/(t_{1,n} − s_{1,n})) Y_{1,n},
Z_{2,n} := ∂X/∂ξ_{2,n}(tn) − ∂X/∂ξ_{2,n}(sn), ..., Z_{d,n} := ∂X/∂ξ_{d,n}(tn) − ∂X/∂ξ_{d,n}(sn),

the determinant being unchanged by these linear operations on the variables.
Using now Taylor expansions and taking into account the integrability of the supremum of bounded Gaussian processes, we have:

Y_{1,n} = ((t_{1,n} − s_{1,n})²/2) ∂²X/∂ξ_{1,n}²(sn) + α_{1,n} (t_{1,n} − s_{1,n})³,
Z_{1,n} = ((t_{1,n} − s_{1,n})²/6) ∂³X/∂ξ_{1,n}³(sn) + β_n (t_{1,n} − s_{1,n})³,
Z_{2,n} = (t_{1,n} − s_{1,n}) ∂²X/∂ξ_{2,n}∂ξ_{1,n}(sn) + γ_{2,n} (t_{1,n} − s_{1,n})², ...,
Z_{d,n} = (t_{1,n} − s_{1,n}) ∂²X/∂ξ_{d,n}∂ξ_{1,n}(sn) + γ_{d,n} (t_{1,n} − s_{1,n})²,

where the random variables α_{1,n}, β_n, γ_{2,n}, ..., γ_{d,n} are uniformly bounded in L² of the underlying probability space.
Substituting into Δn it follows that:

144 (t_{1,n} − s_{1,n})^{−[8+2(d−1)]} Δn → det Var( X(s*), ∂X/∂ξ1(s*), ∂²X/∂ξ1²(s*), ∂³X/∂ξ1³(s*), ∂X/∂ξ2(s*), ∂²X/∂ξ2∂ξ1(s*), ..., ∂X/∂ξd(s*), ∂²X/∂ξd∂ξ1(s*) ),

which is positive by the non-degeneracy condition, so that Δn ≥ (const) ‖tn − sn‖^{2(d+3)} for n large; this contradicts (27) and proves the Lemma.

Combining the previous bounds with Lemma 3.1 we get A_{s,t} ≤ (const) ‖t − s‖^{−d+1} h^{−2δd}, so that

E[ M_{u−h,u}(I) (M_{u−h,u}(I) − 1) 1I_{E_h} ] ≤ ∫∫_{I×I} ds dt ∫∫_{(u−h,u]²} dx1 dx2 (const) ‖t − s‖^{−d+1} h^{−2δd} ≤ (const) h^{2−2δd},

since the function (s, t) ↦ ‖t − s‖^{−d+1} is Lebesgue-integrable in I × I. The last constant depends only on the dimension d and the set I. Taking δ small enough, (20) follows.
An example: Let {X(s, t)} be a real-valued two-parameter Gaussian, centred, stationary, isotropic process with covariance Γ. Assume that its spectral measure μ is absolutely continuous with density

μ(ds, dt) = f(ρ) ds dt,  ρ := (s² + t²)^{1/2},

so that ∫_0^∞ f(ρ) dρ = 1.
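For a concrete feel of this example, such a centred stationary isotropic field can be simulated by the spectral (random cosine) method. This is an illustrative sketch only: the Rayleigh radial law below stands in for f and is not the paper's choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_field(N=500):
    # One realization of a centred stationary isotropic field on R^2 via the
    # random-cosine method: frequencies with a rotation-invariant law
    # (illustrative Rayleigh radius, uniform angle), uniform phases.
    rho = rng.rayleigh(scale=1.0, size=N)
    theta = rng.uniform(0, 2 * np.pi, size=N)
    W = np.stack([rho * np.cos(theta), rho * np.sin(theta)], axis=1)
    phase = rng.uniform(0, 2 * np.pi, size=N)
    return lambda s, t: np.sqrt(2.0 / N) * np.sum(
        np.cos(W @ np.array([s, t]) + phase))

# Across independent realizations, X(s, t) is centred with variance exactly 1
# (E cos^2 = 1/2), and its law does not depend on (s, t): stationarity.
vals = np.array([sample_field()(1.3, -0.7) for _ in range(2000)])
assert abs(vals.mean()) < 0.1 and abs(vals.var() - 1.0) < 0.15
print("ok")
```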
Using (15), which is a consequence of Theorem 3.1, and the invariance of the law of the process, we have

F′_I(u) = E[ |det X″(0, 0)| 1I_{X″(0,0) ≺ 0} | X(0, 0) = u, X′(0, 0) = (0, 0) ] p_{X(0,0),X′(0,0)}(u, (0, 0))
 + 2π E[ (X̄″(1, 0))⁻ | X̄(1, 0) = u, X̄′(1, 0) = 0 ] p_{X̄(1,0),X̄′(1,0)}(u, 0) = I1 + I2.   (28)

We denote by X, X′, X″ the values of the different processes at some point (s, t); by Xss, Xst, Xtt the entries of the matrix X″; and by φ and Φ the standard normal density and distribution.
One can easily check that:

X′ is independent of X and X″, and has variance matrix J3 · Id;
Xst is independent of X, X′, Xss and Xtt, and has variance (1/4) J5;
conditionally on X = u, the random variables Xss and Xtt have
expectation: −J3 u,
variance: (3/4) J5 − (J3)²,
covariance: (1/4) J5 − (J3)².

Using an elementary computation we get that the expectation of the negative part of a Gaussian variable with expectation μ and variance σ² is equal to

σ φ(μ/σ) − μ Φ(−μ/σ).
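This closed form can be confirmed by simulation; a sketch (μ, σ arbitrary, φ and Φ written via math.erf):

```python
import math, random

def neg_part_mean(mu, sigma):
    # E[xi^-] for xi ~ N(mu, sigma^2):  sigma*phi(mu/sigma) - mu*Phi(-mu/sigma)
    phi = math.exp(-(mu / sigma) ** 2 / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(-(mu / sigma) / math.sqrt(2)))
    return sigma * phi - mu * Phi

random.seed(0)
mu, sigma = 0.8, 1.7
# Monte Carlo estimate of E[max(-xi, 0)]
mc = sum(max(-random.gauss(mu, sigma), 0.0) for _ in range(200000)) / 200000
assert abs(mc - neg_part_mean(mu, sigma)) < 0.02
print("ok")
```

For μ = 0 the formula reduces to E[ξ⁻] = σ φ(0), as it should by symmetry.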
We obtain

I2 = (2π/J3)^{1/2} φ(u) [ ((3/4) J5 − (J3)²)^{1/2} φ(bu) + J3 u Φ(bu) ],

with

b := J3 ( (3/4) J5 − (J3)² )^{−1/2}.
As for I1 we remark that, conditionally on X = u, Xss + Xtt and Xss − Xtt are independent, so that a direct computation gives:

I1 = (1/(8πJ3)) φ(u) E[ ( (ζ − 2J3 u)² − J5 (ξ2² + ξ3²) ) 1I{ζ < 2J3 u} 1I{(ζ − 2J3 u)² − J5 (ξ2² + ξ3²) > 0} ],   (29)

where ζ, ξ2, ξ3 are independent centred Gaussian variables, ζ with variance σ1² := 2J5 − 4(J3)² and ξ2, ξ3 standard. Setting a := 2J3 u, c := (J5)^{1/2} and integrating in polar coordinates in the (ξ2, ξ3)-plane:

I1 = (1/(8πJ3)) φ(u) ∫_0^∞ { (σ1² + a² − c²x²) Φ((a − cx)/σ1) + σ1 [2a − (a − cx)] φ((a − cx)/σ1) } x e^{−x²/2} dx.
We choose, once and for all in this section, a finite atlas A for I. Then, to every t ∈ I it is possible to associate a fixed chart that will be denoted (U_t, φ_t). When t ∈ ∂I, φ_t(U_t) can be chosen to be a half ball with φ_t(t) belonging to the hyperplane limiting this half ball. For t ∈ I, let V_t be an open neighbourhood of t whose closure is included in U_t, and ψ_t a C^∞ function such that

ψ_t ≡ 1 on V_t,   (30)
ψ_t ≡ 0 on U_t^c.   (31)
For s in the same chart as t, define

n(t, s) := (1/2) ‖s − t‖².   (32)

When t ∈ ∂I, replace this by

n(t, s) := ‖(s − t)_N‖ + (1/2) ‖s − t‖²,

where (s − t)_N is the normal component of (s − t) with respect to the hyperplane delimiting the half ball φ_t(U_t). The rest of the definition is the same.
f(s) = a_{ts} f(t) + ⟨b_{ts}, f′(t)⟩ + n(t, s) f^t(s).

Then s ↦ X^t(s) and s ↦ Ẋ^t(s) are helix processes with pole t satisfying H_{t,k}. With v := (s − t)/‖s − t‖,

X^t(s) = 2 ∫_0^1 ⟨ X″((1 − α)t + αs) v, v ⟩ (1 − α) dα,   (33)

and a similar expression, with an extra factor ‖t − s‖, holds for Ẋ^t(s).
p_M(u) ≤ ψ(u) := exp(−u²/2) / ∫_u^{+∞} exp(−v²/2) dv  for every u ∈ R.   (34)
Let now Z satisfy the hypotheses of the theorem. For given a, b ∈ R, a < b, choose A ∈ R⁺ so that |a| < A and consider the process:

X(t) := ( Z(t) − a + ‖m‖_∞ + A ) / σ0,

where m(t) := E(Z(t)), ‖m‖_∞ := sup_{t∈I} |m(t)| and σ0² := Var(Z(t)). Then

E X(t) = ( m(t) − a + ‖m‖_∞ + A ) / σ0 ≥ (A − |a|)/σ0 > 0,  E X(t) ≤ ( 2‖m‖_∞ + |a| + A ) / σ0,

and

Var X(t) = 1,

so that (34) holds for the process X. On the other hand:

{a < M_Z ≤ b} ⊂ { (‖m‖_∞ + A)/σ0 < M_X ≤ (‖m‖_∞ + A)/σ0 + (b − a)/σ0 }.

And it follows that

P( a < M_Z ≤ b ) ≤ ∫_{(‖m‖_∞+A)/σ0}^{(‖m‖_∞+A+b−a)/σ0} ψ(u) du = ∫_a^b (1/σ0) ψ( (v − a + ‖m‖_∞ + A)/σ0 ) dv.
FI (u) = 1
+ 1
d1
I
(36)
(37)
1/2
Now, observe that our improved version of Ylvisaker's theorem (Theorem 4.1) applies to the process s ↦ X^t(s) − β_t(s)u defined on I\{t}. This implies that the first term in (37) tends to zero as h → 0. An analogous argument applies to the second term. Finally, the continuity of F′_I(u) follows from the fact that one can pass to the limit under the integral sign in (35).
To finish the proof we still have to show that the added hypotheses are in fact unnecessary for the validity of the conclusion. Suppose now that the process X satisfies only the hypotheses of the theorem and define

X_ε(t) = Z_ε(t) + ε Y(t),   (38)

where for each ε > 0, Z_ε is a real-valued Gaussian process defined on I, measurable with respect to the σ-algebra generated by {X(t) : t ∈ I}, possessing C^∞ paths and such that almost surely Z_ε(t), Z′_ε(t), Z″_ε(t) converge uniformly on I to X(t), X′(t), X″(t) respectively as ε → 0. One standard way to construct such an approximation process Z_ε is to use a C^∞ partition of unity on I and to approximate locally the composition of a chart with the function X by means of a convolution with a C^∞ kernel.
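The convolution approximation can be illustrated numerically. A minimal sketch, assuming a single chart and a truncated Gaussian kernel standing in for the C^∞ kernel (both simplifying assumptions):

```python
import numpy as np

# Approximate a path t -> x(t) by convolution with a narrow smooth kernel;
# as eps -> 0 the smoothed path converges uniformly on compact sets.
t = np.linspace(-np.pi, 3 * np.pi, 8001)
dt = t[1] - t[0]
x = np.sin(t)                                  # stand-in for a smooth path

def smooth(x, eps):
    k = np.arange(-int(5 * eps / dt), int(5 * eps / dt) + 1) * dt
    ker = np.exp(-k**2 / (2 * eps**2))
    ker /= ker.sum()                           # normalized discrete kernel
    return np.convolve(x, ker, mode="same")

interior = slice(1000, -1000)                  # avoid boundary effects
err1 = np.abs(smooth(x, 0.2)[interior] - x[interior]).max()
err2 = np.abs(smooth(x, 0.05)[interior] - x[interior]).max()
assert err2 < err1 and err2 < 1e-2             # error shrinks with eps
print("ok")
```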
In (38), Y denotes the restriction to I of a Gaussian centred stationary process satisfying the hypotheses of Proposition 3.1, defined on R^N, and independent of X. Clearly X_ε satisfies condition (Hk) for every k, since it has C^∞ paths, and the independence of the two terms in (38) ensures that X_ε inherits from Y the non-degeneracy condition in Definition 3.1. So, if

M_I^ε = max_{t∈I} X_ε(t)  and  F_{I,ε}(u) = P{ M_I^ε ≤ u },
one has
F′_{I,ε}(u) = (−1)^d ∫_I E[ det( X″_ε(t) − β_{t,ε}(t) u ) 1I_{A_u(X^t_ε, β_{t,ε})} ] p_{X_ε(t),X′_ε(t)}(u, 0) dt
 + (−1)^{d−1} ∫_{∂I} E[ det( X̄″_ε(t) − β̄_{t,ε}(t) u ) 1I_{A_u(X̄^t_ε, β̄_{t,ε})} ] p_{X̄_ε(t),X̄′_ε(t)}(u, 0) σ(dt).   (39)
We want to pass to the limit as ε → 0 in (39). We prove that the right-hand member is bounded if ε is small enough and converges to a continuous function of u as ε → 0. Since M_I^ε → M_I, this implies that the limit is continuous and coincides with F′_I(u) by a standard argument on convergence of densities. We consider only the first term in (39); the second is similar.

The convergence of X_ε and of its first and second derivatives, together with the non-degeneracy hypothesis, implies that, uniformly on t ∈ I as ε → 0,

p_{X_ε(t),X′_ε(t)}(u, 0) → p_{X(t),X′(t)}(u, 0)  and  det( X″_ε(t) − β_{t,ε}(t)u ) → det( X″(t) − β_t(t)u ),

on account of the form of the regression coefficients and the definitions of X^t and β_t. The only difficulty is to prove that, for fixed u:

P{ C_ε Δ C } → 0 as ε → 0,   (40)

where

C_ε := A_u(X^t_ε, β_{t,ε}),
C := A_u(X^t, β_t). We prove that

a.s. 1I_{C_ε} → 1I_C as ε → 0,   (41)

which implies (40) by dominated convergence. By Theorem 4.1 applied to the process s ↦ X^t(s) − β_t(s)u on I\{t},

P{ sup_{s∈I\{t}} ( X^t(s) − β_t(s)u ) = 0 } = 0,

and on the event { sup_{s∈I\{t}} ( X^t(s) − β_t(s)u ) ≠ 0 }, the uniform convergence of X^t_ε(s) − β_{t,ε}(s)u to X^t(s) − β_t(s)u when ε > 0 is small implies 1I_{C_ε} → 1I_C.
Second derivative
FI (u) = 1
(1,0)
t (t)
E
I
i,j
i,j=1
ds t (s)
dt
I
I
t
E det X (s) (s)u det X (t) (t)u 1IAu /X t (s) = t (s)u, X t (s) = t (s)u
pX t (s),X t (s) t (s)u, t (s)u pX(t),X (t) u, 0 +
S d1
(42)
(1,0)
d
I
(43)
The derivative of the integrand in (43) is the sum of the three derivatives corresponding to the three locations where the variable u appears, namely:

in the density p_{X(t),X′(t)}(u, 0), which is clearly differentiable with bounded derivative

∂/∂u p_{X(t),X′(t)}(u, 0).

This gives the first term in (42).

In the derivative with respect to the first occurrence of u in

E[ det( X″(t) − β_t(t)u ) 1I_{A_u}(X^t, β_t) ],

the derivative of which is

− Σ_{i,j=1}^{d} (β_t(t))_{i,j} E[ C_{i,j}(u) 1I_{A_u}(X^t, β_t) ],

where C_{i,j}(u) is the cofactor of location (i, j) in the matrix X″(t) − β_t(t)u. This quantity is uniformly bounded when u varies in a compact interval, which follows easily from an expression of the type (33). This gives the second term in (42).

In the derivative with respect to the second occurrence of u in

E[ det( X″(t) − β_t(t)u ) 1I_{A_u}(X^t, β_t) ].

To evaluate this derivative define v as in (36) and set, for ρ sufficiently small:

I_ρ := I \ B(t, ρ) ;  A^ρ_u = A^ρ_u(X^t, β_t) := { X^t(s) ≤ β_t(s)u, s ∈ I_ρ },

B(t, ρ) being the ball with centre t and radius ρ in the chart (φ_t, U_t). By dominated convergence

E[ v ( 1I_{A^ρ_{u+h}(X^t, β_t)} − 1I_{A^ρ_u(X^t, β_t)} ) ]
dx
I
u+h
dx
u
S(t,)
t (s) = t (s)x
where S(t, ρ) is the sphere with centre t and radius ρ, Y^t(s) := X^t(s) − β_t(s)x and Ẏ^t(s) := Ẋ^t(s) − β̇_t(s)x.
Let us prove that the first integral converges as ρ → 0. The only problem is the behaviour around t, so it is sufficient to prove the convergence locally around t in the chart (φ_t, U_t), with s ∈ V_t, which implies that n(s, t) = (1/2) ‖t − s‖². Without loss of generality we may assume that the representation of t in this chart is the point 0 in R^d. To study the behaviour of the integrand as s → 0, we choose an orthonormal basis with s/‖s‖ as first vector and set s = (ρ, 0, ..., 0)^T. Around s = 0 the process X^t and its derivative have the following expansions (for short, derivatives are indicated by sub-indices).
The Taylor remainder X̃(s) := X(s) − X(0) − ⟨X′(0), s⟩ satisfies

X̃(s) = (ρ²/2) X̄_11 + (ρ³/6) X̄_111 + (ρ⁴/24) X̄_1111 + o(ρ⁴),   (45)
X̃_1(s) = ρ X̄_11 + (ρ²/2) X̄_111 + (ρ³/6) X̄_1111 + o(ρ³),   (46)

where X̄_11 = ∂²X/∂s1²(0), X̄_111 = ∂³X/∂s1³(0) and X̄_1111 = ∂⁴X/∂s1⁴(0).
Since X^t(s) = m(s) X̃(s) with

m(s) := 1/n(s, 0) = 2/(s1² + ··· + sd²),   (47)

we have

m(ρ, 0, ..., 0) = 2/ρ²,  ∂m/∂s1(ρ, 0, ..., 0) = −4/ρ³,  ∂m/∂si(ρ, 0, ..., 0) = 0 (i ≠ 1),   (48)

∂²m/∂s1²(ρ, 0, ..., 0) = 12/ρ⁴,  ∂²m/∂si²(ρ, 0, ..., 0) = −4/ρ⁴ (i ≠ 1),   (49)

∂²m/∂si∂sj(ρ, 0, ..., 0) = 0 (i ≠ j).   (50)

Using derivation rules, we get the corresponding expansions (51)-(53) for X^t(s), X^t_i(s) and X^t_{ii}(s),
where the notation X^t_{ii}(s) has an obvious meaning. The condition

C(s) := { X^t(s) = β_t(s)x ; Ẋ^t(s) = β̇_t(s)x }

converges as s → 0 to the condition

{ X̄_11 = β̄_11 x ; X̄_111 = β̄_111 x ; X^t_i(0) = β^t_i(0) x (i = 2, ..., d) },

which is non-singular (again the notations β̄_11, β̄_111, β^t_i are obvious). Consider a Gaussian variable which is measurable with respect to the process; under this limiting conditioning one checks that

(1/2)( X̄_ii − β̄_ii x ) + (1/2)( X̄_11 − β̄_11 x ) = O_P(ρ²).

Since β̈_t(s) is bounded, we see that the integrand is O(ρ^{1−d}), which ensures convergence of the first integral as ρ → 0. One easily checks that the bound for the integrand is uniform in t.
We consider now the limit of I2 as 0. It is enough to prove that for each
x R the expression
u+h
d1
(1)
(ds) t (s)
dx
u
S(t,)
t
t
t (s) = t (s)x p t t
E det X (s) v 1IAx /X t (s) = t (s)x, X
X (s),X (s) (s)x, (s)x .
(54)
(1)d1
u
2(d1)
(dw) t (t + w)
dx
S d1
X tN T (w)
X tT (w)
Corollary 6.1 Suppose that the process X satisfies the conditions of Theorem 4.2 and that in addition E(X_t) = 0 and Var(X_t) = 1. Then, as u → +∞, F′(u) is equivalent to

( u^d e^{−u²/2} / (2π)^{(d+1)/2} ) ∫_I ( det Λ(t) )^{1/2} dt,   (57)

where Λ(t) := Var(X′(t)).
Define

r_{i;}(s, t) := ∂r/∂si (s, t),  r_{ij;}(s, t) := ∂²r/∂si∂sj (s, t),  r_{i;j}(s, t) := ∂²r/∂si∂tj (s, t).

Thus X(t) and X′(t) are independent. Regression formulae imply that

a_{ts} = r(s, t),  β_t(s) = (1 − r(t, s)) / n(s, t).
This implies that β_t(t) = Λ(t) and that the possible limit values of β_t(s) as s → t are in the set { v^T Λ(t) v : v ∈ S^{d−1} }. Due to the non-degeneracy condition these quantities are bounded below by a positive constant. On the other hand, for s ≠ t, β_t(s) > 0. This shows that for every t ∈ I one has inf_{s∈I} β_t(s) > 0. Since for every t ∈ I the process X^t is bounded, it follows that

a.s. 1I_{A_u(X^t, β_t)} → 1 as u → +∞.

Also, det( X″(t) − β_t(t)u ) is equivalent to (−u)^d det Λ(t). A dominated convergence argument shows that the first term in (35) is equivalent to

∫_I u^d det Λ(t) (2π)^{−1/2} e^{−u²/2} (2π)^{−d/2} ( det Λ(t) )^{−1/2} dt = ( u^d e^{−u²/2} / (2π)^{(d+1)/2} ) ∫_I ( det Λ(t) )^{1/2} dt.

The same kind of argument shows that the second term is O( u^{d−1} e^{−u²/2} ), which completes the proof.
References
Azaïs, J-M. and Delmas, C. (2002). Asymptotic expansions for the distribution of the maximum of Gaussian random fields. To appear in Extremes.
Azaïs, J-M. and Wschebor, M. (1999). Régularité de la loi du maximum de processus gaussiens réguliers. C.R. Acad. Sci. Paris, t. 328, série I, 333-336.
Azaïs, J-M. and Wschebor, M. (2001). On the regularity of the distribution of the maximum of one-parameter Gaussian processes. Probab. Theory Relat. Fields, 119, 70-98.
Brillinger, D. R. (1972). On the number of solutions of systems of random equations. The Annals of Math. Statistics, 43, 534-540.
Cabaña, E. M. (1985). Esperanzas de integrales sobre conjuntos de nivel aleatorios (Spanish). Actas del Segundo Congreso Latinoamericano de Probabilidades y Estadística Matemática, Caracas, 65-81.
Cramér, H. and Leadbetter, M.R. (1967). Stationary and Related Stochastic Processes. J. Wiley & Sons, New York.
Cucker, F. and Wschebor, M. (2003). On the Expected Condition Number of Linear Programming Problems. To appear in Numerische Mathematik.
Diebolt, J. and Posse, C. (1996). On the Density of the Maximum of Smooth Gaussian Processes. Ann. Probab., 24, 1104-1129.
Federer, H. (1969). Geometric Measure Theory. Springer-Verlag, New York.
Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. École d'Été de Probabilités de Saint-Flour. Lecture Notes in Mathematics, 480, Springer-Verlag, New York.
Kobayashi, S. and Nomizu, K. (199?). Foundations of Differential Geometry. J. Wiley & Sons, New York.
Landau, H.J. and Shepp, L.A. (1970). On the supremum of a Gaussian process. Sankhyā Ser. A, 32, 369-378.
Lifshits, M.A. (1995). Gaussian Random Functions. Kluwer, The Netherlands.
Milnor, J. W. (1965). Topology from the Differentiable Viewpoint. The University Press of Virginia, Charlottesville.
Piterbarg, V. I. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. American Mathematical Society, Providence, Rhode Island.
Piterbarg, V. I. (1996b). Rice's Method for Large Excursions of Gaussian Random Fields. Technical Report No. 478, University of North Carolina. Translation of Rice's method for Gaussian random fields.
Taylor, J.E. and Adler, R. (2002). Euler characteristics for Gaussian fields on manifolds. Preprint.
Tsirelson, V.S. (1975). The Density of the Maximum of a Gaussian Process. Th. Probab. Appl., 20, 847-856.