
Introduction to Statistical Analysis of Time Series

Richard A. Davis
Department of Statistics
Outline

Modeling objectives in time series


General features of ecological/environmental time series
Components of a time series
Frequency domain analysis-the spectrum
Estimating and removing seasonal components
Other cyclical components
Putting it all together

Time Series: A collection of observations x_t, each one being recorded at time t. (Time could be discrete, t = 1, 2, 3, ..., or continuous, t > 0.)

Objective of Time Series Analysis

Data compression
-provide compact description of the data.

Explanatory
-seasonal factors
-relationships with other variables (temperature,
humidity, pollution, etc)

Signal processing
-extracting a signal in the presence of noise

Prediction
-use the model to predict future values of the time series

General features of ecological/environmental time series


Examples.

1. Mauna Loa (CO2, Oct '58-Sept '90)

[Figure: CO2 (axis 320-350) plotted against year, 1960-1990]

Features
- increasing trend (linear, quadratic?)
- seasonal (monthly) effect.

2. Ave-max monthly temp (vegetation=tundra, 1895-1993)

[Figure: temp (axis -5 to 20) plotted against observation index, 0-1200]

Features
- seasonal (monthly effect)
- more variability in Jan than in July

Go to ITSM Demo

[Figure: temp series for July and Jan, horizontal axis 0-100]
July: mean = 21.95, var = .6305
Jan: mean = -.486, var = 2.637

[Figure: temp series for Sept with fitted line, horizontal axis 0-100]
Sept: mean = 17.25, var = 1.466
Line: 16.83 + .00845 t

Components of a time series

Classical decomposition
X_t = m_t + s_t + Y_t
- m_t = trend component (slowly changing in time)
- s_t = seasonal component (known period d = 24 (hourly), d = 12 (monthly))
- Y_t = random noise component (might contain irregular cyclical components of unknown frequency + other stuff).

Go to ITSM Demo

Estimation of the components.

X_t = m_t + s_t + Y_t

Trend m_t
- filtering. E.g., for monthly data use
  $\hat m_t = (.5 x_{t-6} + x_{t-5} + \cdots + x_{t+5} + .5 x_{t+6}) / 12$
- polynomial fitting
  $\hat m_t = a_0 + a_1 t + \cdots + a_k t^k$
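
The monthly filter above can be sketched in a few lines of Python. This is only an illustration: the function name `trend_filter_12` and the synthetic series are assumptions, not part of the slides or of ITSM.

```python
import math


def trend_filter_12(x):
    """Centered moving average for monthly data.

    Weights are .5 at lags -6 and +6 and 1 at lags -5..+5, divided by 12.
    Returns a list as long as x, with None where the window does not fit.
    """
    n = len(x)
    trend = [None] * n
    for t in range(6, n - 6):
        inner = sum(x[t - 5 : t + 6])  # x[t-5] ... x[t+5], 11 terms
        trend[t] = (0.5 * x[t - 6] + inner + 0.5 * x[t + 6]) / 12.0
    return trend


# Hypothetical monthly series: linear trend plus a period-12 cycle.
x = [10.0 + 0.1 * t + 3.0 * math.cos(2 * math.pi * t / 12) for t in range(48)]
m_hat = trend_filter_12(x)
print(m_hat[6], m_hat[24])
```

Because the weights are symmetric and sum to one, the filter passes a linear trend unchanged and averages a period-12 cycle to zero, so here the estimate at each interior t is 10 + 0.1 t. The first and last six values are left undefined.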

Estimation of the components (cont).

X_t = m_t + s_t + Y_t

Seasonal s_t
- Use seasonal (monthly) averages after detrending (standardize so that s_t sums to 0 across the year).
  $\hat s_t = (x_t + x_{t+12} + x_{t+24} + \cdots) / N$, N = number of years
- harmonic components fit to the time series using least squares.
  $\hat s_t = A \cos(\tfrac{2\pi}{12} t) + B \sin(\tfrac{2\pi}{12} t)$

Irregular component Y_t
  $\hat Y_t = X_t - \hat m_t - \hat s_t$
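
A minimal sketch of the seasonal-average approach and of the irregular component, assuming a trend estimate is already available. The helper names `seasonal_means` and `irregular` and the synthetic data are hypothetical and serve only to illustrate the steps.

```python
def seasonal_means(detrended, d=12):
    """Average detrended values season by season, then center the d effects to sum to 0.

    Entries equal to None (no trend estimate available) are skipped.
    Assumes every season is observed at least once.
    """
    sums = [0.0] * d
    counts = [0] * d
    for t, v in enumerate(detrended):
        if v is not None:
            sums[t % d] += v
            counts[t % d] += 1
    means = [s / c for s, c in zip(sums, counts)]
    overall = sum(means) / d
    return [m - overall for m in means]


def irregular(x, trend, seasonal, d=12):
    """Y_t = X_t - m_t - s_t, left as None where the trend is unavailable."""
    return [None if m is None else xt - m - seasonal[t % d]
            for t, (xt, m) in enumerate(zip(x, trend))]


# Synthetic example: known trend plus a seasonal pattern that sums to 0.
season = [2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
trend = [100.0 + 0.5 * t for t in range(36)]
x = [trend[t] + season[t % 12] for t in range(36)]

detrended = [xt - m for xt, m in zip(x, trend)]
s_hat = seasonal_means(detrended)
y_hat = irregular(x, trend, s_hat)
print([round(s, 3) for s in s_hat])
```

With noise-free input the seasonal pattern is recovered and the irregular part is zero. With real data the trend would itself be estimated, for example by filtering.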

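The spectrum and frequency domain analysis

Toy example. (n=6)

[Figure: plots of the six basis vectors c_0, c_1, s_1, c_2, s_2, c_3]

$c_0 = (\cos(\tfrac{2\pi 0}{6}), \ldots, \cos(\tfrac{2\pi 0}{6} 6))' / \sqrt{6}$
$c_1 = (\cos(\tfrac{2\pi}{6}), \ldots, \cos(\tfrac{2\pi}{6} 6))' / \sqrt{3}$
$s_1 = (\sin(\tfrac{2\pi}{6}), \ldots, \sin(\tfrac{2\pi}{6} 6))' / \sqrt{3}$
$c_2 = (\cos(\tfrac{2\pi 2}{6}), \ldots, \cos(\tfrac{2\pi 2}{6} 6))' / \sqrt{3}$
$s_2 = (\sin(\tfrac{2\pi 2}{6}), \ldots, \sin(\tfrac{2\pi 2}{6} 6))' / \sqrt{3}$
$c_3 = (\cos(\tfrac{2\pi}{2}), \ldots, \cos(\tfrac{2\pi}{2} 6))' / \sqrt{6}$

X = (4.24, 3.26, -3.14, -3.24, 0.739, 3.04) = 2c_0 + 5(c_1 + s_1) - 1.5(c_2 + s_2) + .5c_3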
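
The toy example can be checked numerically. The sketch below builds the six basis vectors, forms the stated linear combination, and recovers the coefficients by inner products. The helper names are illustrative only.

```python
import math

n = 6
t_vals = range(1, n + 1)


def cos_vec(j, norm):
    """(cos(2*pi*j*t/n), t = 1..n) divided by sqrt(norm)."""
    return [math.cos(2 * math.pi * j * t / n) / math.sqrt(norm) for t in t_vals]


def sin_vec(j, norm):
    """(sin(2*pi*j*t/n), t = 1..n) divided by sqrt(norm)."""
    return [math.sin(2 * math.pi * j * t / n) / math.sqrt(norm) for t in t_vals]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


c0, c3 = cos_vec(0, 6), cos_vec(3, 6)
c1, s1 = cos_vec(1, 3), sin_vec(1, 3)
c2, s2 = cos_vec(2, 3), sin_vec(2, 3)

# x = 2 c0 + 5 (c1 + s1) - 1.5 (c2 + s2) + .5 c3
x = [2 * c0[i] + 5 * (c1[i] + s1[i]) - 1.5 * (c2[i] + s2[i]) + 0.5 * c3[i]
     for i in range(n)]
print([round(v, 3) for v in x])

# Because the basis is orthonormal, inner products recover the coefficients.
basis = [c0, c1, s1, c2, s2, c3]
coeffs = [dot(x, b) for b in basis]
print([round(c, 3) for c in coeffs])
```

The recovered coefficients are 2, 5, 5, -1.5, -1.5 and .5, in the order c_0, c_1, s_1, c_2, s_2, c_3.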

Fact: Any vector of 6 numbers, x = (x_1, ..., x_6), can be written as a linear combination of the vectors c_0, c_1, c_2, s_1, s_2, c_3.

More generally, any time series x = (x_1, ..., x_n) of length n (assume n is odd) can be written as a linear combination of the basis (orthonormal) vectors c_0, c_1, c_2, ..., c_[n/2], s_1, s_2, ..., s_[n/2]. That is,

$x = a_0 c_0 + a_1 c_1 + b_1 s_1 + \cdots + a_m c_m + b_m s_m, \quad m = [n/2]$

where, with $\omega_j = 2\pi j / n$,

$c_0 = \frac{1}{n^{1/2}} (1, 1, \ldots, 1)'$
$c_j = \left(\frac{2}{n}\right)^{1/2} (\cos(\omega_j), \cos(2\omega_j), \ldots, \cos(n\omega_j))'$
$s_j = \left(\frac{2}{n}\right)^{1/2} (\sin(\omega_j), \sin(2\omega_j), \ldots, \sin(n\omega_j))'$

$x = a_0 c_0 + a_1 c_1 + b_1 s_1 + \cdots + a_m c_m + b_m s_m, \quad m = [n/2]$

Properties:
1. The set of coefficients {a_0, a_1, b_1, ...} is called the discrete Fourier transform

$a_0 = (x, c_0) = \frac{1}{n^{1/2}} \sum_{t=1}^{n} x_t$
$a_j = (x, c_j) = \frac{2^{1/2}}{n^{1/2}} \sum_{t=1}^{n} x_t \cos(\omega_j t)$
$b_j = (x, s_j) = \frac{2^{1/2}}{n^{1/2}} \sum_{t=1}^{n} x_t \sin(\omega_j t)$

2. Sum of squares.

$\sum_{t=1}^{n} x_t^2 = a_0^2 + \sum_{j=1}^{m} (a_j^2 + b_j^2)$

3. ANOVA (analysis of variance table)

Source              DF    Sum of Squares
ω_0 = 0             1     a_0^2 = I(0)
ω_1 = 2π/n          2     a_1^2 + b_1^2 = 2 I(ω_1)
...
ω_m = 2πm/n         2     a_m^2 + b_m^2 = 2 I(ω_m)
Total               n     Σ_t x_t^2

I(ω) is the periodogram.
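
As a rough numerical check of the sum-of-squares identity, the sketch below computes a_0, a_j and b_j for a short series of odd length and compares the two sides. The function name and the data values are arbitrary illustrations.

```python
import math


def fourier_coefficients(x):
    """Return (a0, [(a_1, b_1), ..., (a_m, b_m)]) for odd n, with m = (n - 1) / 2."""
    n = len(x)
    if n % 2 == 0:
        raise ValueError("this sketch assumes n is odd")
    m = n // 2
    a0 = sum(x) / math.sqrt(n)
    scale = math.sqrt(2.0 / n)
    pairs = []
    for j in range(1, m + 1):
        w = 2 * math.pi * j / n  # Fourier frequency omega_j
        a = scale * sum(xt * math.cos(w * t) for t, xt in enumerate(x, start=1))
        b = scale * sum(xt * math.sin(w * t) for t, xt in enumerate(x, start=1))
        pairs.append((a, b))
    return a0, pairs


x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0]  # arbitrary values, n = 7
a0, pairs = fourier_coefficients(x)

total = sum(v * v for v in x)
by_frequency = a0 ** 2 + sum(a * a + b * b for a, b in pairs)
print(total, round(by_frequency, 6))
```

Each term a_j^2 + b_j^2 is the contribution of frequency ω_j to the total sum of squares, which is what the ANOVA table tabulates.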

Applied to toy example

Source                       DF    Sum of Squares
ω_0 = 0 (period 0)           1     a_0^2 = 4.0
ω_1 = 2π/6 (period 6)        2     a_1^2 + b_1^2 = 50.0
ω_2 = 2π2/6 (period 3)       2     a_2^2 + b_2^2 = 4.5
ω_3 = 2π3/6 (period 2)       1     a_3^2 = 0.25
Total                        6     Σ_t x_t^2 = 58.75

Test that period 6 is significant

$H_0: X_t = \mu + \varepsilon_t, \quad \{\varepsilon_t\} \sim$ independent noise
$H_1: X_t = \mu + A \cos(2\pi t/6) + B \sin(2\pi t/6) + \varepsilon_t$

Test Statistic: $(n-3) I(\omega_1) / (\sum_t x_t^2 - I(0) - 2 I(\omega_1)) \sim F(2, n-3)$
(6-3)(50/2)/(58.75-4-50) = 15.79, p-value = .026
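
The arithmetic of the test can be sketched as follows. The closed-form upper tail used for the p-value holds only for an F distribution with 2 numerator degrees of freedom. It is an added convenience, not something taken from the slides, and the function names are hypothetical.

```python
def f_statistic(total_ss, mean_ss, freq_ss, n):
    """(n - 3) I / (sum x_t^2 - I(0) - 2 I), where freq_ss = a_j^2 + b_j^2 = 2 I."""
    residual_ss = total_ss - mean_ss - freq_ss
    return (n - 3) * (freq_ss / 2.0) / residual_ss


def f_upper_tail_2df(f, denom_df):
    """P(F > f) for F(2, denom_df); valid only when the numerator df is 2."""
    return (1.0 + 2.0 * f / denom_df) ** (-denom_df / 2.0)


# Values from the toy ANOVA table.
f = f_statistic(total_ss=58.75, mean_ss=4.0, freq_ss=50.0, n=6)
p = f_upper_tail_2df(f, denom_df=3)
print(round(f, 2), round(p, 3))
```

For other numerator degrees of freedom a general F distribution routine would be needed instead of the closed form.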

The spectrum and frequency domain analysis

Ex. Sinusoid with period 12.

$x_t = 5 \cos(\tfrac{2\pi}{12} t) + 3 \sin(\tfrac{2\pi}{12} t), \quad t = 1, 2, \ldots, 120.$

Ex. Sinusoid with periods 4 and 12.

Ex. Mauna Loa

ITSM DEMO
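
For the period-12 sinusoid, a short sketch shows where the sum of squares concentrates. It evaluates a_j^2 + b_j^2 at the Fourier frequencies 2πj/n for 0 < j < n/2. The helper name `component_ss` is an assumption for illustration.

```python
import math

n = 120
x = [5 * math.cos(2 * math.pi * t / 12) + 3 * math.sin(2 * math.pi * t / 12)
     for t in range(1, n + 1)]


def component_ss(x, j):
    """a_j^2 + b_j^2 at Fourier frequency 2*pi*j/n, for 0 < j < n/2."""
    n = len(x)
    w = 2 * math.pi * j / n
    scale = math.sqrt(2.0 / n)
    a = scale * sum(xt * math.cos(w * t) for t, xt in enumerate(x, start=1))
    b = scale * sum(xt * math.sin(w * t) for t, xt in enumerate(x, start=1))
    return a * a + b * b


ss = {j: component_ss(x, j) for j in range(1, n // 2)}
peak = max(ss, key=ss.get)
print(peak, n / peak, round(ss[peak], 1))
```

The only non-negligible contribution is at j = 10, that is, frequency 2π·10/120 = 2π/12 and period 12. It equals the total sum of squares of the series, (5^2 + 3^2)·120/2 = 2040.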

Differencing at lag 12

Sometimes, a seasonal component with period 12 in the time series can be removed by differencing at lag 12. That is, the differenced series is

$y_t = x_t - x_{t-12}$

Now suppose x_t is the sinusoid with period 12 + noise.

$x_t = 5 \cos(\tfrac{2\pi}{12} t) + 3 \sin(\tfrac{2\pi}{12} t) + \varepsilon_t, \quad t = 1, 2, \ldots, 120.$

Then

$y_t = x_t - x_{t-12} = \varepsilon_t - \varepsilon_{t-12}$

which has correlation at lag 12.
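
A small simulation illustrates the cancellation. The Gaussian noise, its unit variance, and the seed are assumptions made for the sketch, since the slides do not specify the noise distribution.

```python
import math
import random


def difference(x, lag=12):
    """y_t = x_t - x_{t-lag}; the result is `lag` values shorter than x."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


random.seed(0)  # arbitrary seed, only for reproducibility
n = 120
signal = [5 * math.cos(2 * math.pi * t / 12) + 3 * math.sin(2 * math.pi * t / 12)
          for t in range(1, n + 1)]
eps = [random.gauss(0.0, 1.0) for _ in range(n)]  # assumed noise model
x = [s + e for s, e in zip(signal, eps)]

y = difference(x)
eps_diff = difference(eps)

# The sinusoid cancels, so y_t matches eps_t - eps_{t-12} up to rounding.
max_gap = max(abs(a - b) for a, b in zip(y, eps_diff))
print(len(y), max_gap < 1e-9)
```

For independent noise with common variance, y_t and y_{t+12} share the term ε_t with opposite signs, so their theoretical correlation is -1/2. This is the lag-12 correlation mentioned above.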

Other cyclical components; searching for hidden cycles

Ex. Sunspots.
- period ~ 2π/.62684 = 10.02 years
- Fisher's test for significance
What model should we use?

ITSM DEMO

Noise.

The time series {X_t} is white or independent noise if the sequence of random variables is independent and identically distributed.

[Figure: x_t plotted against time, 0-120]

Battery of tests for checking whiteness.

In ITSM, choose statistics => residual analysis => Tests of Randomness

Residuals from Mauna Loa data.

[Figure: scatterplots of lagged residuals against x_t, both axes running from -1.0 to 1.5, with a table of the first few residuals r_t alongside their lagged values]

Cor(X_t, X_{t+1}) = .824
Cor(X_t, X_{t+2}) = .736
Cor(X_t, X_{t+3}) = .654
Cor(X_t, X_{t+25}) = .074

Autocorrelation function (ACF):

[Figure: sample ACF of the Mauna Loa residuals, lags 0-40]

Conf Bds: ± 1.96/sqrt(n)

[Figure: sample ACF of white noise, lags 0-40]
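
A minimal sketch of the sample ACF and the ±1.96/sqrt(n) bounds. It uses the common estimator that divides by n at every lag. The slides do not spell out the estimator, so treat that choice, the function name, and the toy series as assumptions.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations rho(0..max_lag), dividing by n at every lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / n / c0
            for h in range(max_lag + 1)]


# Toy series that is clearly not white: it alternates in sign.
x = [1.0, -1.0] * 50
rho = sample_acf(x, 5)
bound = 1.96 / math.sqrt(len(x))
outside = [h for h in range(1, 6) if abs(rho[h]) > bound]
print([round(r, 2) for r in rho], round(bound, 3), outside)
```

For white noise, roughly 95% of the sample autocorrelations at nonzero lags should fall inside the bounds. Here every lag falls outside.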

Putting it all together

Example: Mauna Loa

[Figure: four panels plotted against year, 1960-1990: CO2, trend, seasonal, irregular part]

Strategies for modeling the irregular part {Y_t}.

- Fit an autoregressive process
- Fit a moving average process
- Fit an ARMA (autoregressive-moving average) process

In ITSM, choose the best fitting AR or ARMA using the menu option
Model => Estimation => Preliminary => AR estimation
or
Model => Estimation => Autofit
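
To give a feel for what fitting an autoregressive process involves, here is a sketch of the Yule-Walker (method of moments) estimate for an AR(1) model, Y_t = φ Y_{t-1} + Z_t. It is not a description of what ITSM does internally. The function name, the simulated data, and the seed are assumptions.

```python
import random


def yule_walker_ar1(y):
    """Return (phi_hat, sigma2_hat) for an AR(1) fitted to the mean-corrected series."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    gamma0 = sum(d * d for d in dev) / n                         # sample variance
    gamma1 = sum(dev[t] * dev[t + 1] for t in range(n - 1)) / n  # lag-1 autocovariance
    phi = gamma1 / gamma0
    return phi, gamma0 * (1.0 - phi * phi)


# Simulate an AR(1) with known coefficient and unit noise variance.
random.seed(1)  # arbitrary seed
true_phi = 0.7
y = [0.0]
for _ in range(4999):
    y.append(true_phi * y[-1] + random.gauss(0.0, 1.0))

phi_hat, sigma2_hat = yule_walker_ar1(y)
print(round(phi_hat, 3), round(sigma2_hat, 3))
```

With a long simulated series the estimates should land close to 0.7 and 1. Higher-order AR and ARMA fits follow the same idea but require solving for several coefficients at once.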

How well does the model fit the data?

1. Inspection of residuals.
   Are they compatible with white (independent) noise?
   - no discernible trend
   - no seasonal component
   - variability does not change in time.
   - no correlation in residuals or squares of residuals
   Are they normally distributed?

2. How well does the model predict?
   - values within the series (in-sample forecasting)
   - future values

3. How well do the simulated values from the model capture the characteristics in the observed data?

ITSM DEMO with Mauna Loa
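
One common way to check for correlation in the residuals or in their squares is a portmanteau statistic such as Ljung-Box, Q = n(n+2) Σ_{k=1}^{h} ρ(k)^2 / (n-k). The sketch below is illustrative only. The function name, the toy "residuals", and the choice h = 20 are assumptions, and the slides do not say which tests are intended here.

```python
import math


def ljung_box(x, h):
    """Ljung-Box statistic based on the first h sample autocorrelations."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, h + 1):
        rho_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        q += rho_k * rho_k / (n - k)
    return n * (n + 2) * q


CHI2_95_DF20 = 31.41  # 0.95 quantile of the chi-square distribution with 20 df

# Deliberately non-white toy "residuals": a slowly varying sinusoid.
residuals = [math.sin(0.3 * t) for t in range(100)]
q_resid = ljung_box(residuals, 20)
q_squares = ljung_box([r * r for r in residuals], 20)
print(q_resid > CHI2_95_DF20, q_squares > CHI2_95_DF20)
```

Both statistics exceed the 5% critical value here, which would flag the series as incompatible with white noise. When the input is residuals from a fitted ARMA(p, q) model, the reference degrees of freedom are usually reduced to h - p - q.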

Model refinement and Simulation

- Residual analysis can often lead to model refinement
- Do simulated realizations reflect the key features present in the original data?

Two examples
- Sunspots
- NEE (Net ecosystem exchange).

Limitations of existing models
- Seasonal components are fixed from year to year.
- Stationary through the seasons
- Add intervention components (forest fires, volcanic eruptions, etc.)

Other directions

Structural model formulation for trend and seasonal components
- Local level model
  $m_t = m_{t-1} + \text{noise}_t$
- Seasonal component with noise
  $s_t = -s_{t-1} - s_{t-2} - \cdots - s_{t-11} + \text{noise}_t$
- $X_t = m_t + s_t + Y_t + \varepsilon_t$
- Easy to add intervention terms in the above formulation.

Periodic models (allows more flexibility in modeling transitions from one season to the next).
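
A rough simulation sketch of the structural formulation, with a random-walk level and a noisy seasonal component. To keep it short, the stationary term Y_t is omitted and all noise terms are taken to be Gaussian. Both choices, and every name and parameter value below, are assumptions made for illustration.

```python
import random


def simulate_structural(n, level0, season0, sd_level, sd_season, sd_obs, seed=0):
    """Simulate X_t = m_t + s_t + e_t.

    m_t = m_{t-1} + noise_t                          (local level)
    s_t = -(s_{t-1} + ... + s_{t-11}) + noise_t      (seasonal with noise)
    season0 holds the 11 most recent seasonal values, newest first.
    Returns (observations, seasonal components).
    """
    rng = random.Random(seed)
    m = level0
    recent = list(season0)
    xs, seasonals = [], []
    for _ in range(n):
        m = m + rng.gauss(0.0, sd_level)
        s = -sum(recent) + rng.gauss(0.0, sd_season)
        recent = [s] + recent[:-1]  # keep only the 11 most recent values
        seasonals.append(s)
        xs.append(m + s + rng.gauss(0.0, sd_obs))
    return xs, seasonals


start = [float(k) for k in range(1, 12)]  # hypothetical starting seasonal values
xs, s = simulate_structural(36, 10.0, start, sd_level=0.1, sd_season=0.0, sd_obs=0.1)
print([round(v, 1) for v in s[:12]])
```

With `sd_season` set to 0 the seasonal component repeats exactly every 12 steps and sums to zero over any year, which is the fixed seasonal of the classical decomposition. A positive `sd_season` lets the pattern drift from year to year.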