
by

David X. Li

and

H. J. Turtle

David X. Li
Riskmetrics Group
44 Wall Street, 22nd Floor
New York, NY 10005
tel: (212) 981-7453
fax: (212) 981-7402
email: david.li@riskmetrics.com
web: http://www.riskmetrics.com

H. J. Turtle
Department of Finance, Insurance, and Real Estate
College of Business and Economics
Washington State University
PO Box 644746
Pullman, Washington, 99164-4746
tel: (509) 335-3797
fax: (509) 335-3857
email: hturtle@wsu.edu
web: http://www.cbe.wsu.edu/~hturtle

Current draft: April 1999

1

We thank John Kling, Tom McCurdy, Ieuan Morgan, seminar participants at the 1996 Northern

Finance Association Meetings, and three anonymous referees for helpful comments. Financial

assistance from the Social Sciences and Humanities Research Council (SSHRC) is gratefully

acknowledged (Turtle). The usual disclaimer applies.

Semiparametric ARCH Models: An Estimating Function Approach

autoregressive conditional heteroskedasticity (ARCH) models. We derive the

optimal estimating functions by combining linear and quadratic estimating

functions. The resultant estimators are more efficient than the quasi-maximum

likelihood estimator. If the assumption of conditional normality is imposed, the

estimator obtained by using the theory of estimating functions is identical to that

obtained by using the maximum likelihood method in finite samples. The relative

efficiencies of the estimating function approach in comparison with the quasi-

maximum likelihood estimator are developed. We illustrate the estimating function

approach using a univariate GARCH(1,1) model with conditional Normal,

Student-t, and Gamma distributions. The efficiency benefits of the estimating

function (EF) approach relative to the quasi-maximum likelihood approach are

substantial for the Gamma distribution with large skewness. Simulation analysis

shows that the finite sample properties of the estimators from the estimating

function approach are attractive. EF estimators tend to display less bias and root

mean squared error than the quasi-maximum likelihood estimator. The efficiency

gains are substantial for highly nonnormal distributions. An example demonstrates

that implementation of the method is straightforward.

1. INTRODUCTION

Recent financial studies show substantial interest in the sampling properties of estimators

that result from models of conditional volatility. Many volatility models follow from the seminal

work of Engle (1982) on autoregressive conditional heteroskedasticity (ARCH). Engle models

conditional variances as evolving according to a linear function of predetermined variables, most

notably squared prior disturbances. The generalizations and refinements to Engle’s pioneering

work are extensive (c.f., the GARCH model of Bollerslev (1986), the IGARCH model of Engle

and Bollerslev (1986), the ARCH-M model of Engle, Lilien and Robins (1987), the Quadratic

GARCH of Sentana (1991), the Student-t GARCH model of Engle and Bollerslev (1986) and

Bollerslev (1987), the log GARCH of Geweke (1986), the Exponential ARCH of Nelson (1991),

the nonlinear GARCH of Higgins and Bera (1992), or the threshold ARCH model of Glosten,

Jagannathan, and Runkle (1993)).

Volatility models have been successfully employed in pricing derivative securities, in

stochastic modeling of the term structure of interest rates, in applications related to fixed-income

portfolio management, and in asset pricing studies. The interested reader is referred to Bollerslev,

Chou and Kroner (1992) or Engle (1995) for an extensive survey of the ARCH methodology in

finance. For a review of the literature using stochastic volatility, Taylor (1994) provides an

excellent summary.

ARCH models are commonly estimated by maximum likelihood (ML) estimation assuming conditional normality, quasi-maximum likelihood

(QML) estimation (c.f., Weiss 1986, and Bollerslev and Wooldridge 1988), generalized method of

moments (GMM) estimation (e.g., Bodurtha and Mark 1991), or semiparametric estimation (c.f.,

Engle and Gonzalez-Rivera 1991, or Drost and Klaassen 1997). It is well known that GMM and

QML estimation procedures produce inefficient and possibly biased estimates relative to ML

estimates when the true distribution is known. We develop an estimation approach for ARCH

models that reduces bias and improves efficiency without any necessary assumptions regarding the

underlying variate distribution.

The purpose of this paper is twofold. First, we seek to introduce the theory of estimating

functions (EFs) into the finance literature. We show that the EF approach is well suited to

financial data. A related paper by Vinod (1996) considers the benefits of using the estimating

function approach in conjunction with bootstrapping to meaningfully shrink confidence intervals

in many econometric contexts. Second, we show how the EF approach can be applied to the

estimation of ARCH models. Many unsolved problems in the estimation of ARCH models may

be addressed using the EF approach. In particular, the optimality of QML and GMM estimation

is often based on asymptotic theory. Unfortunately, asymptotic findings do not apply to the small

sample sizes often used in practice. This problem is exacerbated because estimation of higher

moments often requires very large samples to obtain convergence to asymptotic results. Because

the EF approach is based on finite samples from the outset, these criticisms do not apply.

Nonetheless, under strong distributional assumptions, many standard results in the estimation of

ARCH models based on conditional normality are recoverable under the EF approach. Thus, in

addition to important finite sample properties, the EF approach provides a strong link to the

existing literature.


The remainder of the paper is organized as follows. Section 2 provides an introduction to

the theory of EFs. Sections 3 and 4 show how the EF approach can be used to estimate ARCH

and ARCH regression models, respectively. Section 5 discusses properties of optimal estimating

functions. Section 6 derives measures of the relative efficiency of EF estimators. Section 7

performs Monte Carlo analysis to examine the behavior of EF estimators in terms of variance,

bias, and root mean squared errors in both a moderate and large sample setting (500 and 1000

observations). Section 8 demonstrates the use of the EF approach in a simple example using daily

observations on the S&P500 index. Finally, in section 9 we offer concluding comments.

In this section we introduce preliminary concepts and results from the theory of EFs

required for our development. We draw extensively on the work of Godambe (1960, 1976, 1985,

1991), Godambe and Thompson (1984, 1989), and Heyde (1989) to present the important theory

and application of the EF approach to ARCH models.

Traditional estimation theory, such as the method of ML or the method of least squares

(LS), focuses on properties of estimators that are functions of observations. The optimality of a

subclass of estimators is typically established according to an optimality criterion, such as

minimum mean squared error, or uniform minimum variance unbiasedness. An alternative

approach is to focus on functions of both the observation x and the unknown parameter θ, and to

study estimators as the solution of some equation,

g ( x ,θ )= 0


The function g is denoted an estimating function, while the underlying equation is called an

estimating equation. For a given optimal estimating function, an estimate of θ may be obtained.

Many estimation approaches can be viewed as special cases of the estimating function approach.

For example, the ML estimator is typically obtained by setting the score function equal to zero.

In the estimating function approach, it is the estimating function itself, rather than the estimator,

which is the subject of study. This change of emphasis from the estimator to the estimating

function has the following advantages:

1. Optimality criteria are defined for the estimating function itself, not the estimator. Thus,

for example, optimality can be based on finite sample properties.

2. Parametric models and semiparametric models can be studied with equal ease by the

approach of estimating functions.

3. Information from multiple sources can be readily combined using the concept of

orthogonal estimating functions.

Given these advantages, the estimating function approach has been successfully applied to

research in areas such as biostatistics, statistical inference in stochastic processes, and survey

sampling. The focus on the EFs directly implies that the resultant estimators need not be

unbiased. We address these issues theoretically in section 5, and empirically in section 7.

We present without proof a number of important definitions and theorems related to the

theory of EFs (the interested reader is referred to Godambe (1960, 1976, 1985, 1991), Godambe

and Thompson (1984, 1989), and Heyde (1989)). Suppose that $X = (x_1, x_2, \ldots, x_T)$ is a vector
random variable on a probability space. The distribution family of this vector random variable is
parameterized by $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$. If there is a one-to-one mapping from the distributional

family H to the parameter space θ, the model under study is called a parametric model;

otherwise, it is called a nonparametric model. We first present results for the scalar parameter

case and then we extend our results to the multiparameter case (beginning with Definition 3).

Definition 1. An EF g is said to be unbiased if

$$E[g(X, \theta)] = 0$$

for all F ∈ H such that θ(F) = θ.

Godambe also imposed some regularity conditions on unbiased estimating functions to form a
class ς of regular unbiased EFs. An estimate of θ based on the EF is obtained by solving the

estimating equation

g ( X ,θ ) = 0 .

In many applications, the number of EFs is set to equal the number of parameters so that a unique

θ can be obtained. If f(x, θ) is the probability density function for observation x for given θ, the

score function is defined as

$$S(x, \theta) = \frac{\partial}{\partial \theta} \log f(x, \theta). \qquad (1)$$

Because $E[S(x, \theta)] = 0$ under standard regularity conditions (c.f., Lehmann 1983, p. 118), the
score function is an unbiased EF.

Definition 2 (Godambe 1960). Within the class ς of all regular unbiased EFs, a function g
belonging to ς is an optimal EF for θ if, for any F ∈ H with θ = θ(F), it minimizes the quotient

$$\frac{E[g^2]}{\left\{E\left[\frac{\partial g}{\partial \theta}\right]\right\}^2}. \qquad (2)$$

The intuition of this optimality criterion is twofold. First, we desire the smallest numerator,
$E[g^2]$, possible for a given denominator. The numerator can be interpreted as the variance of an
unbiased EF, g. We also seek a denominator, $\{E[\partial g / \partial \theta]\}^2$, that is as large as possible, so that the EF is as sensitive as
possible to changes in the parameter θ. This optimality criterion for estimation of a single
parameter is due to Godambe (1960). Multiparameter versions are discussed in Durbin (1960),
Godambe and Heyde (1987), and Godambe and Thompson (1989).

Definition 3 (Kale 1962). For a given set of unbiased EFs, g = (g1, g2, ... , gm) from the class ς,
the EF g* ∈ ς is said to be optimal if, for any distribution F ∈ H, the following is satisfied

$$J - H (H^{*})^{-1} J^{*} (H^{*\prime})^{-1} H' \geq 0$$

for all g ∈ ς, where $J = \mathrm{Cov}_F(g)$, $J^{*} = \mathrm{Cov}_F(g^{*})$, $H = -E_F\left(\partial g / \partial \theta\right)$, and $H^{*} = -E_F\left(\partial g^{*} / \partial \theta\right)$.

When the form of density function f ( x , θ ) is specified, the score function given by equation

(1) is the optimal EF.

Theorem 1 (Godambe 1960). In the parametric model, the score function is the optimal EF in ς.

This theorem justifies the use of ML in parametric models from the vantage point of the theory of

EFs. When the form of the density function is not specified, the optimal EF which minimizes the

quotient (2) still provides the highest correlation with any possible score functions S ( x ,θ ) (c.f.,

Godambe 1985).

Suppose there exist two matrix functions, D(θ) and V(θ) > 0, such that for any F ∈ H
satisfying θ = θ(F),

$$E\left(\frac{\partial g}{\partial \theta}\right) = D(\theta) \quad \text{and} \quad \mathrm{Var}(g) = V(\theta).$$

Then

$$g^{*} = D' V^{-1} g$$

is an optimal EF of θ in the class ς. This can be readily verified by the multiparameter definition
of optimality. A sufficient condition for g* to be optimal in ς is given by the following lemma.

Lemma 1 (Godambe 1985). An EF g* in ς is optimal if $E[g^{*}(g_s - g_s^{*})] = 0$ for any g ∈ ς, where

$$g_s = \frac{E\left(\frac{\partial g}{\partial \theta}\right)}{E(g^2)}\, g.$$

Let X = {x} be an abstract sample space and Ω = {θ} be the parameter vector space. Let p_j,
j = 1, 2, ..., k, be any real functions defined on the product space $X \times \Omega = \{(x, \theta) : x \in X, \theta \in \Omega\}$ that satisfy

$$E_F\left[\,p_j(X, \theta(F)) \mid \Im_j\,\right] = 0 \quad \text{for } F \in H, \qquad (3)$$

where $E_F[\,\cdot \mid \Im_j]$ is the expectation under F, conditional on $\Im_j$, a σ-algebra generated by a partition
of the sample space X. The σ-algebra $\Im_j$, j = 1, 2, ..., k, is chosen according to the underlying application. In many situations, simple EFs can

be formed from relationships involving only the first few moments. A better EF can then be

formed using optimal orthogonal combinations based on the following definition.

Definition 4 (Godambe and Thompson 1989). The EFs pj, j=1,2,...,k satisfying equation (3) are

mutually orthogonal, if

$$E(p_j p_i \mid \Im_i) = 0 \quad \text{and} \quad E(p_j p_i \mid \Im_j) = 0$$

for F ∈ H and i ≠ j, i, j=1,2,...,k.

We can now form a class of linear combinations of unbiased EFs as follows,

$$l = \sum_{i=1}^{k} a_i p_i, \qquad (4)$$

where the p_i's satisfy equation (3) and each a_i is a function of the observation X and parameter θ,
which is measurable with respect to the σ-algebra $\Im_i$. Theorem 2 shows how to construct an
optimal EF within this class.

Theorem 2 (Godambe and Thompson 1989). In the class of EFs l, the optimal EF is given by

$$l^{*} = \sum_{i=1}^{k} \frac{E\left(\frac{\partial p_i}{\partial \theta} \mid \Im_i\right)}{E\left(p_i^2 \mid \Im_i\right)}\, p_i$$

if the functions p_i are mutually orthogonal, and assuming the existence of the involved derivatives
and their expectations.

This result is very general and can apply to many problems that have been studied by

different estimation methods including least squares, and maximum likelihood. In the following

section, we apply the EF approach to the estimation of ARCH models.
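As a minimal illustration of Theorem 2 (our own toy example, not taken from the paper), consider independent observations x_i with common mean θ and known variances σ_i². The elementary EFs p_i = x_i − θ are unbiased and mutually orthogonal, with E(∂p_i/∂θ) = −1 and E(p_i²) = σ_i², so the optimal combination is l* = −Σ (x_i − θ)/σ_i², whose root is the inverse-variance weighted mean. The function names and the data below are hypothetical.

```python
def optimal_ef(theta, xs, variances):
    """Optimal combination l* = sum_i [E(dp_i/dtheta) / E(p_i^2)] * p_i
    for the elementary EFs p_i = x_i - theta (Theorem 2)."""
    total = 0.0
    for x, var in zip(xs, variances):
        weight = -1.0 / var          # E(dp_i/dtheta) = -1 and E(p_i^2) = var
        total += weight * (x - theta)
    return total


def solve_optimal_ef(xs, variances):
    """Root of l*(theta) = 0. The EF is linear in theta, so the root is
    available in closed form as the inverse-variance weighted mean."""
    precisions = [1.0 / v for v in variances]
    return sum(w * x for w, x in zip(precisions, xs)) / sum(precisions)


# Hypothetical data: three observations with different known variances.
xs = [1.0, 2.0, 4.0]
variances = [1.0, 4.0, 0.25]
theta_hat = solve_optimal_ef(xs, variances)
print(theta_hat, optimal_ef(theta_hat, xs, variances))
```

The least squares estimator (equal weights) solves a different, non-optimal member of the same class of linear combinations whenever the variances differ.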


3. ARCH MODEL

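$$y_t \mid \Im_{t-1} \sim (0, h_t), \qquad (5)$$

and

$$h_t = h(y_{t-1}, y_{t-2}, \ldots, y_{t-q}, \alpha), \qquad (6)$$

where $\Im_{t-1}$ represents the information set available at time t−1, q is the order of the ARCH
process, and α is the vector of unknown parameters. The basic EFs used below are

$$g_t = y_t^2 - h_t, \quad t = 1, 2, \ldots, T. \qquad (7)$$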

In general, the choice of an estimating function can be viewed in a manner analogous to the

selection of moment conditions in the Generalized Method of Moments (GMM) approach of

Hansen (1982). Specific motivations for estimating functions may arise from economic or

statistical theory. For example, moment conditions may naturally arise from the expectations of

economic agents in a given economic problem.

We can easily verify that, conditional on the information set $\{\Im_{t-1}, t = 1, 2, \ldots, T\}$, the g_t's are
unbiased and mutually orthogonal. Consider the linear combinations of the basic EFs

$$l = \sum_{t=1}^{T} a_t g_t, \qquad (8)$$

where the weights, a_t, are any functions of y_t and α that are measurable with respect to the
information set $\{\Im_{t-1}, t = 1, 2, \ldots, T\}$. By Theorem 2, the optimal EF in this class is

$$l^{*} = \sum_{t=1}^{T} a_t^{*} g_t, \qquad (9)$$

where $a_t^{*} = E\left(\partial g_t / \partial \alpha \mid \Im_{t-1}\right) / E\left(g_t^2 \mid \Im_{t-1}\right)$.

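Now based on (7), $E\left(\partial g_t / \partial \alpha \mid \Im_{t-1}\right) = -\partial h_t / \partial \alpha$ and $E\left(g_t^2 \mid \Im_{t-1}\right) = E\left(y_t^4 \mid \Im_{t-1}\right) - h_t^2$. Thus, the optimal
EF can be written as

$$l^{*} = -\sum_{t=1}^{T} \frac{\frac{\partial h_t}{\partial \alpha}\left(y_t^2 - h_t\right)}{E\left(y_t^4 \mid \Im_{t-1}\right) - h_t^2}. \qquad (10)$$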

For emphasis, we stress that (10) is based upon the finite sample and it does not depend on

any distributional assumptions for yt conditional on ℑ t − 1 .

Assuming conditional normality as in Engle (1982), we also have $E\left(y_t^4 \mid \Im_{t-1}\right) = 3h_t^2$, and
equation (10) simplifies to

$$l^{*} = -\sum_{t=1}^{T} \frac{1}{2h_t} \frac{\partial h_t}{\partial \alpha} \left(\frac{y_t^2}{h_t} - 1\right). \qquad (11)$$

Comparing this with the first order condition of equation (7) in Engle (1982), we note that they

are equivalent up to a sign change. Therefore, we conclude that, under the additional assumption

of normality, the theory of EFs and the maximum likelihood method give the same estimate for

parameters in the ARCH model.
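The equivalence between equation (11) and the Gaussian first order condition can be checked numerically. The sketch below assumes an ARCH(1) variance h_t = a0 + a1·y²_{t−1} (a specific form chosen here for illustration), evaluates equation (10) with a user-supplied ratio E(y_t⁴ | ℑ_{t−1}) / h_t², and compares it with a finite-difference gradient of the Gaussian log likelihood. The function names and the short data vector are hypothetical.

```python
import math


def arch1_variance(y, a0, a1):
    """h_t = a0 + a1 * y_{t-1}^2 for t = 1..T-1; the first observation
    only supplies the lag."""
    return [a0 + a1 * y[t - 1] ** 2 for t in range(1, len(y))]


def optimal_ef(y, a0, a1, fourth_moment_ratio=3.0):
    """Equation (10) for alpha = (a0, a1), assuming
    E(y_t^4 | F_{t-1}) = fourth_moment_ratio * h_t^2.
    The default ratio of 3 corresponds to equation (11)."""
    h = arch1_variance(y, a0, a1)
    l0 = l1 = 0.0
    for t in range(1, len(y)):
        ht = h[t - 1]
        g = y[t] ** 2 - ht                              # basic EF g_t
        denom = (fourth_moment_ratio - 1.0) * ht ** 2   # E(y^4 | F) - h^2
        l0 -= 1.0 * g / denom                           # dh_t/da0 = 1
        l1 -= y[t - 1] ** 2 * g / denom                 # dh_t/da1 = y_{t-1}^2
    return l0, l1


def gaussian_loglik(y, a0, a1):
    """Conditional Gaussian log likelihood, constants dropped."""
    h = arch1_variance(y, a0, a1)
    return sum(-0.5 * math.log(ht) - 0.5 * y[t] ** 2 / ht
               for t, ht in zip(range(1, len(y)), h))


y = [0.3, -1.2, 0.8, 0.1, -0.5, 1.7]   # hypothetical data
print(optimal_ef(y, 0.5, 0.3))
```

Passing a different `fourth_moment_ratio` gives the semiparametric form (10); only the default value reproduces the normal-theory score up to sign.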

The EF method we develop is based on a semiparametric model that is not fully specified by

the parameters of interest; whereas the ML method is based on a parametric model, in which the

model is fully described under the assumption of conditional normality. Equation (10) is valid for

any conditional distributions satisfying the mean-variance structure assumed in equation (5) and

(6). From an EF viewpoint, equation (11) is valid assuming conditional normality, or any other

conditional distribution in which $E\left(y_t^4 \mid \Im_{t-1}\right) = 3h_t^2$. If the exact distribution is unknown, equation

(10) should be used instead of (11). Godambe and Thompson (1989) show that equation (10) can

be interpreted as a quasi-score function because it possesses properties similar to an ordinary

score function. In this context, we can define the information matrix as the expectation of the

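Hessian averaged over all observations. (The information matrix can be estimated by

$$\frac{1}{T} \sum_{t=1}^{T} \frac{1}{(\gamma_{2t} + 2)\, h_t^2} \frac{\partial h_t}{\partial \alpha} \frac{\partial h_t}{\partial \alpha'},$$

which is identical to the estimate provided in Engle (1982) equation (14) when $\gamma_{2t} = 0$, that is, under conditional normality.)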

$$y_t \mid \Im_{t-1} \sim (x_t \beta, h_t)$$

$$h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \cdots + \alpha_q \varepsilon_{t-q}^2$$

$$g_{1t} = y_t - x_t \beta, \quad \text{and}$$

$$g_{2t}' = (y_t - x_t \beta)^2 - h_t.$$

Unfortunately, $g_{2t}'$ is not orthogonal to the linear EF, $g_{1t}$. We adopt the orthogonalization

$$g_{2t} = (y_t - x_t \beta)^2 - h_t - \gamma_{1t} h_t^{1/2} (y_t - x_t \beta), \qquad (12)$$

where $\gamma_{1t} = E\left[(y_t - x_t \beta)^3 \mid \Im_{t-1}\right] / h_t^{3/2}$ is the skewness of $y_t$ conditional on $\Im_{t-1}$.
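The effect of the orthogonalization in equation (12) can be seen in a small numerical check. The sketch below is a simplification under stated assumptions: the mean x_t β is set to zero, h_t is fixed at 1, and conditional moments are replaced by sample moments of a standardized draw from a skewed distribution (a Gamma with shape 2, chosen arbitrarily). With these substitutions the sample cross moment of g_1t and g_2t is zero by construction, and the sample second moment of g_2t equals γ_2 + 2 − γ_1², the scalar that multiplies h_t² in the optimal weights.

```python
import random


def standardize(xs):
    """Rescale to sample mean 0 and sample variance 1 (divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [(x - mean) / var ** 0.5 for x in xs]


def sample_mean(values):
    return sum(values) / len(values)


random.seed(0)
# Skewed innovations; after standardization h_t = 1 for every t.
e = standardize([random.gammavariate(2.0, 1.0) for _ in range(5000)])

gamma1 = sample_mean([x ** 3 for x in e])         # sample skewness
gamma2 = sample_mean([x ** 4 for x in e]) - 3.0   # sample excess kurtosis

g1 = e                                            # linear EF with zero mean
g2_raw = [x ** 2 - 1.0 for x in e]                # quadratic EF before orthogonalization
g2 = [x ** 2 - 1.0 - gamma1 * x for x in e]       # equation (12) with h_t = 1

cross_raw = sample_mean([a * b for a, b in zip(g1, g2_raw)])
cross_orth = sample_mean([a * b for a, b in zip(g1, g2)])
var_g2 = sample_mean([b ** 2 for b in g2])
print(cross_raw, cross_orth, var_g2, gamma2 + 2.0 - gamma1 ** 2)
```

The raw cross moment equals the sample skewness, which is why the quadratic EF needs no correction for symmetric distributions.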

We now form the linear combination of these basic EFs to estimate the coefficient vectors α

and β

$$l_1 = \sum_{t=1}^{T} a_{1t} g_{1t} + \sum_{t=1}^{T} a_{2t} g_{2t}$$

$$l_2 = \sum_{t=1}^{T} b_{1t} g_{1t} + \sum_{t=1}^{T} b_{2t} g_{2t}. \qquad (13)$$

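Consider the class of all EFs $(l_1, l_2)$ given by (13). Following Godambe and Thompson
(1989), the optimal EFs $l_1^{*}$ and $l_2^{*}$ are given by (13) with

$$a_{1t}^{*} = \frac{E\left(\frac{\partial g_{1t}}{\partial \alpha} \mid \Im_{t-1}\right)}{E\left(g_{1t}^2 \mid \Im_{t-1}\right)} = 0, \qquad a_{2t}^{*} = \frac{E\left(\frac{\partial g_{2t}}{\partial \alpha} \mid \Im_{t-1}\right)}{E\left(g_{2t}^2 \mid \Im_{t-1}\right)} = \frac{-\frac{\partial h_t}{\partial \alpha}}{h_t^2\left(\gamma_{2t} + 2 - \gamma_{1t}^2\right)},$$

$$b_{1t}^{*} = \frac{E\left(\frac{\partial g_{1t}}{\partial \beta} \mid \Im_{t-1}\right)}{E\left(g_{1t}^2 \mid \Im_{t-1}\right)} = -\frac{\frac{\partial x_t \beta}{\partial \beta}}{h_t}, \quad \text{and} \quad b_{2t}^{*} = \frac{E\left(\frac{\partial g_{2t}}{\partial \beta} \mid \Im_{t-1}\right)}{E\left(g_{2t}^2 \mid \Im_{t-1}\right)} = \frac{h_t^{1/2} \gamma_{1t} \frac{\partial x_t \beta}{\partial \beta} - \frac{\partial h_t}{\partial \beta}}{h_t^2\left(\gamma_{2t} + 2 - \gamma_{1t}^2\right)}.$$

In general, $a_{1t}^{*}$, $a_{2t}^{*}$, $b_{1t}^{*}$, and $b_{2t}^{*}$ will be vector quantities with dimensions determined by the
dimensions of α and β, and

$$\gamma_{2t} = \frac{E\left[(y_t - x_t \beta)^4 \mid \Im_{t-1}\right]}{h_t^2} - 3$$

represents the standardized kurtosis.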

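$$l_1^{*} = -\sum_{t=1}^{T} \frac{\frac{\partial h_t}{\partial \alpha}}{h_t^2\left(\gamma_{2t} + 2 - \gamma_{1t}^2\right)}\, g_{2t}$$

$$l_2^{*} = -\sum_{t=1}^{T} \frac{\frac{\partial x_t \beta}{\partial \beta}}{h_t}\, g_{1t} + \sum_{t=1}^{T} \frac{h_t^{1/2} \gamma_{1t} \frac{\partial x_t \beta}{\partial \beta} - \frac{\partial h_t}{\partial \beta}}{h_t^2\left(\gamma_{2t} + 2 - \gamma_{1t}^2\right)}\, g_{2t}. \qquad (14)$$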

We again stress that this result is very general in the sense that no distributional assumptions on

y t ℑ t − 1 are made. The usual result obtained under conditional normality is recoverable, by

imposing γ1t = 0 , and γ2t = 0 . In this case the optimal EFs become,

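$$l_1^{*} = -\sum_{t=1}^{T} \frac{1}{2h_t} \frac{\partial h_t}{\partial \alpha} \left(\frac{\varepsilon_t^2}{h_t} - 1\right)$$

$$l_2^{*} = -\sum_{t=1}^{T} \frac{\varepsilon_t x_t'}{h_t} - \sum_{t=1}^{T} \frac{1}{2h_t} \frac{\partial h_t}{\partial \beta} \left(\frac{\varepsilon_t^2}{h_t} - 1\right).$$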

The resulting optimal estimating equations l1* = 0 and l2* = 0 are equivalent to the first order

conditions of equations (7) and (20) in Engle (1982) under the assumption of conditional


normality (up to a sign change). The contribution of γ1t and γ2t to the estimation of α and β will

be important when the underlying conditional distribution displays third and fourth moments that

deviate from normality. The simplified quasi-likelihood equations obtained under the normality

assumption will be inefficient for distributions with nonzero values of γ1t and γ2t (c.f., the

discussion in Engle and Gonzalez-Rivera (1991)). In contrast, the results obtained from (14) will

be more efficient even if only an approximate specification of γ1t and γ2t is available. The

orthogonality of the functions g1t and g2t holds for any value of γ2t . This suggests that even an

approximate value for γ2t can be used to give near optimal estimating functions l1* and l 2* . We

document these benefits in sections 6 and 7 after discussing some properties of optimal estimating

functions and their resultant estimators in section 5.

The optimal EFs l1* and l 2* obtained in section 4 are martingales; thus, they are sometimes

called the optimal martingale estimating functions. According to the martingale central limit

theorems given in Hall and Heyde (1980) and some mild conditions, the optimal EFs obtained

after orthogonalization, standardization and optimal combinations have the property

$$T^{1/2}\left(\hat{\theta}_{EF} - \theta\right) \to MVN\left(0, V_{EF}^{-1}\right),$$

where $V_{EF} = \left[E\left(\partial l_i^{*} / \partial \theta_j\right)\right]$, i, j = 1, 2. (Interested readers are referred to Godambe and Heyde (1987),
Anh (1988), Heyde and Lin (1992), or a more recent book by Heyde (1997).)

Crowder (1986) discusses explicit conditions under which the estimators from the EFs

converge in probability to the true parameters. Using Theorem 3.3 from Crowder (1986), weak

convergence of our estimator can be established using the optimal estimating functions presented

in section 4. Recently, Chen (1993) has provided proper conditions and a rigorous proof of

strong consistency for both linear and quadratic EFs. Optimal estimating functions provide an

approximation to the underlying true score functions and have similar properties to a score

function. Hence the results of Hutton and Nelson (1986) on the asymptotic consistency and

normality of maximum quasi-likelihood estimates for semi-martingales may be applied, assuming a
martingale difference structure for both the mean and variance of $y_t \mid \Im_{t-1}$.

In the theory of EFs, the emphasis of study is the EFs themselves, rather than the resultant

estimates. Efficiency is measured with respect to the EFs. Bhapkar (1972) proposed an

efficiency measure that is essentially the inverse of Godambe’s criterion in the case of one

dimension. The Bhapkar efficiency measure is the variance of the estimator derived from the

corresponding EF. In this paper, we follow the tradition of comparing competing estimators by

their variance.

Definition 5. The relative efficiency of the estimate for the parameter θ, derived from the optimal

EF, is the ratio of the variance obtained by the ML method when the true density function is

assumed, to the variance derived from the optimal EF when only the first few moments are

assumed, i.e.,

$$RE_{\theta} = \frac{\mathrm{Var}\left(\hat{\theta}_{ML}\right)}{\mathrm{Var}\left(\hat{\theta}_{EF}\right)}.$$

Given our interest in the finite sample properties of our estimators, we also report the bias

and root mean squared error for alternative estimators in our empirical analysis.

FUNCTION APPROACH IN ARCH MODELS

Weiss (1986), and Bollerslev and Wooldridge (1988) show that under a correct specification

of the first and second moments, consistent estimates of the parameters of the ARCH model can

be obtained by maximizing a likelihood function constructed under the assumption of conditional

normality, even when the true density deviates from normality. This approach is now called the

quasi-maximum likelihood (QML) method. Engle and Gonzalez-Rivera (1991) quantify the loss

of efficiency that results when the QML estimator is employed, using Monte Carlo simulations for

two densities -- one leptokurtic and the other positively skewed. They find that the efficiency of the

QML method for a Gamma distribution is particularly low and conclude (p. 347), “It is

worthwhile searching for estimators that can improve on QMLE.”


We adopt the theory of EFs to estimate parameters in ARCH type models. In this section

we state the relative efficiency measures in the special case of a Student t or Gamma distribution

to demonstrate the potential efficiency gains from the EF approach. In the next section, we report

the behavior of EF estimators in a Monte Carlo experiment to allow us to compare the finite

sample performance of the EF approach to ML, QML, and other semiparametric approaches such

as Engle and Gonzalez-Rivera (1991), or Drost and Klaassen (1997).

Consider a GARCH (1,1) process to describe the dynamics of asset returns for scalar valued

parameters α and β ,

$$y_t \mid \Im_{t-1} \sim (0, h_t)$$

$$h_t = (1 - \alpha - \beta) + \alpha y_{t-1}^2 + \beta h_{t-1}.$$

The optimal EFs for the GARCH(1,1) model can now be written as,

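$$l_1^{*} = -\sum_{t=1}^{T} \frac{\frac{\partial h_t}{\partial \alpha}}{h_t^2\left(\gamma_{2t} + 2 - \gamma_{1t}^2\right)}\, g_{2t}, \quad \text{and}$$

$$l_2^{*} = -\sum_{t=1}^{T} \frac{\frac{\partial h_t}{\partial \beta}}{h_t^2\left(\gamma_{2t} + 2 - \gamma_{1t}^2\right)}\, g_{2t}. \qquad (15)$$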
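Equation (15) can be evaluated with a simple recursion for h_t and its derivatives. The sketch below is one possible implementation under assumptions that are ours rather than the paper's: the recursion is started at the unconditional variance h_0 = 1 with zero initial derivatives, and γ_1t and γ_2t are treated as constants. The function names are hypothetical.

```python
def garch11_path(y, alpha, beta):
    """Return h_t, dh_t/dalpha and dh_t/dbeta for
    h_t = (1 - alpha - beta) + alpha * y_{t-1}^2 + beta * h_{t-1},
    started (as an assumption) at h_0 = 1 with zero derivatives."""
    h, dh_da, dh_db = [1.0], [0.0], [0.0]
    for t in range(1, len(y)):
        h.append((1.0 - alpha - beta) + alpha * y[t - 1] ** 2 + beta * h[t - 1])
        # Differentiate the recursion term by term.
        dh_da.append(-1.0 + y[t - 1] ** 2 + beta * dh_da[t - 1])
        dh_db.append(-1.0 + h[t - 1] + beta * dh_db[t - 1])
    return h, dh_da, dh_db


def optimal_efs(y, alpha, beta, gamma1, gamma2):
    """Equation (15) with constant skewness gamma1 and excess kurtosis gamma2."""
    h, dh_da, dh_db = garch11_path(y, alpha, beta)
    l1 = l2 = 0.0
    for t in range(1, len(y)):
        # Orthogonalized quadratic EF, equation (12) with zero mean.
        g2 = y[t] ** 2 - h[t] - gamma1 * h[t] ** 0.5 * y[t]
        scale = h[t] ** 2 * (gamma2 + 2.0 - gamma1 ** 2)
        l1 -= dh_da[t] / scale * g2
        l2 -= dh_db[t] / scale * g2
    return l1, l2


y = [0.5, -1.0, 2.0, 0.3, -0.7]   # hypothetical data
print(optimal_efs(y, alpha=0.1, beta=0.8, gamma1=0.0, gamma2=0.0))
```

An estimate of (α, β) is then any root of the pair (l1, l2), which can be located with a general-purpose numerical solver or minimizer.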

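The asymptotic variance-covariance matrix of the coefficient vector (α, β)′ may be written as the
2 by 2 matrix $V^{-1}$, where

$$V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$$

has elements given by

$$V_{11} = E\left(\frac{\partial l_1^{*}}{\partial \alpha} \mid \Im_{t-1}\right), \quad V_{12} = V_{21} = E\left(\frac{\partial l_1^{*}}{\partial \beta} \mid \Im_{t-1}\right), \quad \text{and} \quad V_{22} = E\left(\frac{\partial l_2^{*}}{\partial \beta} \mid \Im_{t-1}\right).$$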

In sections 6.1 and 6.2 we consider the relative efficiency of the EF approach for the

Student t and Gamma distributions.

Assume that the conditional density of y_t follows a Student's t distribution with v
(v ≥ 5) degrees of freedom,

$$f(y_t \mid \Im_{t-1}) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right) \sqrt{\pi (v-2) h_t}} \left(1 + \frac{y_t^2}{(v-2) h_t}\right)^{-(v+1)/2}. \qquad (16)$$

The moment structure for this distribution up to the fourth order is

$$E(y_t \mid \Im_{t-1}) = 0, \quad E\left(y_t^2 \mid \Im_{t-1}\right) = h_t, \quad \gamma_{1t} = 0, \quad \gamma_{2t} = \frac{6}{v-4}.$$

This distribution is symmetric about 0 and exhibits leptokurtosis. As v tends to infinity, this
distribution tends to the Normal distribution, $N(0, h_t)$.

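The relative efficiency measures for α and β may be stated as

$$RE_{\alpha} = \frac{(v-1)}{2(v-4)} \cdot \frac{\displaystyle \sum_{t=1}^{T} \frac{1}{h_t^2} \left(\frac{\partial h_t}{\partial \beta}\right)^2 \left(1 - \frac{(v+1)\, y_t^2}{y_t^2 + h_t (v-2)}\right)^2 \Big/ |I|}{\displaystyle \sum_{t=1}^{T} \frac{1}{h_t^2} \left(\frac{\partial h_t}{\partial \beta}\right)^2 \Big/ |V|}, \quad \text{and}$$

$$RE_{\beta} = \frac{(v-1)}{2(v-4)} \cdot \frac{\displaystyle \sum_{t=1}^{T} \frac{1}{h_t^2} \left(\frac{\partial h_t}{\partial \alpha}\right)^2 \left(1 - \frac{(v+1)\, y_t^2}{y_t^2 + h_t (v-2)}\right)^2 \Big/ |I|}{\displaystyle \sum_{t=1}^{T} \frac{1}{h_t^2} \left(\frac{\partial h_t}{\partial \alpha}\right)^2 \Big/ |V|}, \quad \text{respectively,} \qquad (17)$$

where I is the information matrix for α and β, and |I| is the determinant of I.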


6.2 Gamma Distribution

Suppose that the conditional density of yt follows a Gamma distribution with shape

parameter c. In this case, the conditional density and first four moments are,

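$$f(y_t \mid \Im_{t-1}) = \frac{\sqrt{c}}{\sqrt{h_t}\, \Gamma(c)} \left(\frac{\sqrt{c}\, y_t}{\sqrt{h_t}} + c\right)^{c-1} \exp\left\{-\left(\frac{\sqrt{c}\, y_t}{\sqrt{h_t}} + c\right)\right\}, \qquad (18)$$

$$E(y_t \mid \Im_{t-1}) = 0, \quad \mathrm{Var}(y_t \mid \Im_{t-1}) = h_t,$$

where $\gamma_1 = 2/\sqrt{c}$ and $\gamma_2 = 6/c$.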

Following the development in the appendix, the relative efficiency measures for α and β

under the Gamma distribution may be computed as,

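$$RE_{\alpha} = \frac{2(1+c)}{c} \cdot \frac{\displaystyle \sum_{t=1}^{T} \frac{1}{4 h_t^2} \left(\frac{\partial h_t}{\partial \beta}\right)^2 \left(\frac{c - u_t^2}{c + u_t}\right)^2 \Big/ |I|}{\displaystyle \sum_{t=1}^{T} \frac{1}{h_t^2} \left(\frac{\partial h_t}{\partial \beta}\right)^2 \Big/ |V|}, \quad \text{and}$$

$$RE_{\beta} = \frac{2(1+c)}{c} \cdot \frac{\displaystyle \sum_{t=1}^{T} \frac{1}{4 h_t^2} \left(\frac{\partial h_t}{\partial \alpha}\right)^2 \left(\frac{c - u_t^2}{c + u_t}\right)^2 \Big/ |I|}{\displaystyle \sum_{t=1}^{T} \frac{1}{h_t^2} \left(\frac{\partial h_t}{\partial \alpha}\right)^2 \Big/ |V|}. \qquad (19)$$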

Comparing our results with the QML measures of relative efficiency in Engle and Gonzalez-

Rivera (1991), we find that the two approaches produce identical results when the conditional

distribution of yt follows a Student’s t distribution; however, assuming a conditional Gamma

distribution leads to a substantial contrast in the approaches. The equivalence in the results for

the Student’s t distribution occurs because of the symmetry in the distribution ( γ1t = 0 ), and the

special relationship between the second and fourth moments. The Gamma distribution differs

from the Normal distribution primarily with respect to skewness. For quadratic EFs to perform

well, information regarding both third and fourth moments is required. The quasi-maximum

likelihood (QML) estimator maximizes the Normal log likelihood function based on the mean and

variance. This approach will be inappropriate when the data displays serious departures from

normality in the third and fourth moments. An interesting alternative procedure used in


generalized statistical models is the method of maximum quasi-likelihood (MQL) estimation (c.f.,

Godambe and Heyde 1987, or Heyde and Lin 1992). This approach uses a quasi-likelihood

function based only on the first few moments of the distribution.

Comparing our results with the QML results of Engle and Gonzalez-Rivera (1991) under

the assumption of a conditional Gamma distribution, we find that the optimal EF differs from the

score function and hence produces different estimators. The relative efficiency of the EF

approach is equal to that of the QML estimator (c.f., Engle and Gonzalez-Rivera 1991) multiplied

by the constant factor of (3+c)/(1+c). Therefore, the optimal EF estimator is always more

efficient than the QML estimator. For small c values the efficiency gain is substantial; however, as

c increases this factor approaches 1 and the variance of the QML estimator approaches its lower

bound. Thus, the QML estimator becomes more efficient as c tends to infinity and the Gamma

distribution converges to the Normal distribution. The EF approach uses the additional

information inherent in the density’s skewness and kurtosis to form the optimal EF. The

prevalence of deviations from normality in third and fourth moments in empirical studies of stock

returns, short term interest rates, and exchange rates suggests incorporating this information into

the estimation approach is important (c.f., Rogalski and Vinso 1978, Hsieh 1989, Schwert 1989,

Engle, Ng, and Rothschild 1990, and Mills 1995, among others). The theory of EFs provides a

direct method to capitalize on this information.
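The factor (3+c)/(1+c) quoted above can be checked numerically. The sketch below rests on an interpretation that is ours and is not spelled out in this section: that the QML and EF variances differ only through the scalar factors γ_2 + 2 and γ_2 + 2 − γ_1², with the Gamma moments γ_1 = 2/√c and γ_2 = 6/c. The function names are hypothetical.

```python
def gamma_moments(c):
    """Skewness and excess kurtosis of a Gamma distribution with shape c."""
    return 2.0 / c ** 0.5, 6.0 / c


def ef_gain_over_qml(gamma1, gamma2):
    """Ratio of the assumed QML variance factor (gamma2 + 2) to the
    EF variance factor (gamma2 + 2 - gamma1^2)."""
    return (gamma2 + 2.0) / (gamma2 + 2.0 - gamma1 ** 2)


# Shape values 12 and 30 are those used in the simulations; 2 and 1000
# are added to show the small-c and large-c behaviour.
for c in (2, 12, 30, 1000):
    g1, g2 = gamma_moments(c)
    print(c, ef_gain_over_qml(g1, g2), (3.0 + c) / (1.0 + c))
```

For a symmetric distribution such as the Student t, γ_1 = 0 and the ratio is exactly 1, which is consistent with the equivalence of the two approaches noted above.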

In this section we report Monte Carlo results demonstrating the benefits of using the EF

approach in the context of nonnormal data. We present finite sample properties for EF estimators

relative to both ML and QML estimators for both a moderate and large sample of observations.

In each simulation we consider a moderately and highly persistent GARCH process.

We adopt a simplistic specification for skewness and kurtosis to examine the potential of the

EF approach. To admit meaningful comparisons with prior simulation results of Engle and

Gonzalez-Rivera (1991), and Drost and Klaassen (1997), we generate a GARCH(1,1) of length

T=500 or T=1,000 for various values of (α , β) given by (.1, .8) or (.05, .9). The GARCH

process with parameters α and β is described in detail in section 6. For each (α , β) pair, we


consider six error distributions: a Normal distribution; a Student t distribution with 5, 8, or 12

degrees of freedom; and a Gamma distribution with shape parameter of 12 or 30. In all cases, the

unconditional mean and variance are zero and one, respectively. For each generated series, we

then estimate (α , β ) using ML, QML or the EF approach.
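The data-generating step for the Gamma case might look as follows. This is a sketch under our own assumptions, not the authors' simulation code: innovations are Gamma(c, 1) draws recentred and rescaled to mean zero and variance one, the recursion starts at h = 1, and a burn-in of 200 observations is discarded. The function names, the seed, and the burn-in length are hypothetical.

```python
import random


def standardized_gamma(c, rng):
    """Draw z ~ Gamma(shape=c, scale=1) and rescale to mean 0, variance 1."""
    return (rng.gammavariate(c, 1.0) - c) / c ** 0.5


def simulate_garch11(T, alpha, beta, c, seed=0, burn_in=200):
    """Simulate y_t = sqrt(h_t) * e_t with
    h_t = (1 - alpha - beta) + alpha * y_{t-1}^2 + beta * h_{t-1}.
    The burn-in length and the starting value h = 1 are assumptions."""
    rng = random.Random(seed)
    h, y_prev = 1.0, 0.0
    ys = []
    for t in range(T + burn_in):
        if t > 0:
            h = (1.0 - alpha - beta) + alpha * y_prev ** 2 + beta * h
        y_prev = h ** 0.5 * standardized_gamma(c, rng)
        if t >= burn_in:
            ys.append(y_prev)
    return ys


# Moderate persistence, moderate sample size, Gamma shape 12.
y = simulate_garch11(T=500, alpha=0.1, beta=0.8, c=12)
print(len(y))
```

The intercept (1 − α − β) keeps the unconditional variance at one, matching the normalization stated above.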

Given that the true error distribution is unknown, ML estimation represents an unattainable

outcome in practice. Nonetheless, ML results provide a meaningful bound for comparison

purposes. To maintain comparability with the work of Drost and Klaassen, initial values for the

EF approach are given by the QML estimates. Our empirical application of the EF approach

numerically minimizes the sum of the squared optimal estimating functions from equation (15),
$l_1^{*2} + l_2^{*2}$. To define the optimal EFs requires a specification for the skewness and kurtosis
parameters, $\gamma_{1t}$ and $\gamma_{2t}$. As a first approximation, we propose the following simple approach.

For each conditional series analyzed, we standardize the series to have a sample mean of zero and

a sample variance of one. The skewness and kurtosis parameters used in estimation are the

sample means of the third power of the standardized series, and the fourth power of the

standardized series, less 3. Future research is warranted to examine the possible benefits that are

attainable through more complex estimation strategies for these nuisance parameters. Possible

alternatives worth consideration include allowing these parameters to follow temporal processes,

or updating them iteratively within the estimation process.
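The simple moment estimates described in this paragraph can be sketched as below. The function name is hypothetical, and the use of the divisor n (rather than n − 1) when standardizing is an assumption; the text only says the series is standardized to a sample mean of zero and a sample variance of one.

```python
import math

def skewness_kurtosis(series):
    """Return (gamma1, gamma2): sample skewness and excess kurtosis."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n  # assumption: divisor n
    sd = math.sqrt(var)
    z = [(v - mean) / sd for v in series]  # standardized series
    gamma1 = sum(v ** 3 for v in z) / n  # mean of third powers
    gamma2 = sum(v ** 4 for v in z) / n - 3.0  # mean of fourth powers, less 3
    return gamma1, gamma2

g1, g2 = skewness_kurtosis([-1.0, 0.0, 1.0])
```

In this sketch both values are held fixed over time, which matches the constant nuisance parameters used in the simulations.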

Table 1 reports summary measures for each of the estimated parameters based on 2500

replications of the above experiment. The first column of the table describes the error distribution

used to generate the data. For each (α, β) pair considered, we estimate the GARCH parameters

$(\hat{\alpha}, \hat{\beta})$ under ML, QML or the EF approach. The sample means of $\hat{\alpha}$ and $\hat{\beta}$, standard deviations

$(\hat{\sigma}_{\hat{\alpha}}, \hat{\sigma}_{\hat{\beta}})$, biases $(\mathrm{bias}_{\hat{\alpha}}, \mathrm{bias}_{\hat{\beta}})$, and root mean squared errors $(\mathrm{rmse}_{\hat{\alpha}}, \mathrm{rmse}_{\hat{\beta}})$ for each estimation

approach are presented in the remaining columns of the table.
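A sketch of how these four summary measures relate to one another may be useful. The function name is hypothetical, and the divisor n for the standard deviation is an assumption. With that divisor, the squared rmse equals the squared standard deviation plus the squared bias exactly.

```python
import math

def summarize(estimates, true_value):
    """Mean, standard deviation, bias and rmse of replicated estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / n)  # divisor n assumed
    bias = mean - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return {"mean": mean, "sd": sd, "bias": bias, "rmse": rmse}

# Four made-up replications of an estimate of alpha = 0.1
stats = summarize([0.08, 0.10, 0.12, 0.14], 0.1)
```

The same identity can be used to cross-check rows of Table 1: for the Student t (5) QML row of panel A, a standard deviation of .066 and a bias of .012 imply an rmse of about .067, as reported.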

Panels A and B of Table 1 report the simulation results for our moderate sample size

experiment with T=500 observations. Moderate and high persistence series are given by

(α , β )=(.1, .8) or (.05, .9), respectively. In the first row of each panel we report the simulation

results for the Normal distribution (in which case, ML and QML estimation are equivalent). The

next three rows of each panel present summary results for Student t innovations with 5, 8, or 12

degrees of freedom. The final two rows of the panel detail the results for the Gamma distribution

with shape parameter given by 12 or 30.

The reported standard errors of the estimated parameters show the familiar result that the

QML estimators are inefficient relative to the unattainable ML estimators. As an example,

consider the case of moderate persistence for the heavily tailed Student-t distribution with five

degrees of freedom in panel A. The standard deviation of the 2500 estimated α parameters is .05

for the ML approach and .066 for the QML estimates. Thus, in this example the QML estimator

suffers a loss in efficiency relative to the ML estimator. The EF approach partially recovers this

loss in efficiency as suggested by the reported standard deviation of .062 for α .
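The size of the partial recovery in this example can be worked out directly from the three reported standard deviations. The helper below is a hypothetical illustration; the "recovered share" is simply the fraction of the QML-versus-ML gap in standard deviations that the EF estimator closes, and is not a measure defined in the paper.

```python
def recovered_share(sd_ml, sd_qml, sd_ef):
    """Fraction of the QML efficiency loss (in standard deviation) recovered by EF."""
    return (sd_qml - sd_ef) / (sd_qml - sd_ml)

# Student t (5), panel A, standard deviations of the alpha estimates
share = recovered_share(0.050, 0.066, 0.062)  # about one quarter
```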

The EF approach is based on unbiased estimating functions in the finite sample, not unbiased

estimators of the underlying parameters. For this reason, we report the finite sample bias and root

mean squared errors for each of the estimators. Continuing with our previous example, we

observe that the EF approach shows a smaller finite sample bias and root mean squared error

(rmse) for both α and β relative to the QML estimates.

The remaining results in panel A show that the EF approach partially recovers the QML

efficiency loss in virtually every instance. In addition, the finite sample bias of the EF approach is

often less than the QML bias, and the EF rmse is always less than the QML rmse. The only

exception to this finding is in the case of normality where the rmse for α is improved and the

rmse for β is slightly worsened.

The high persistence results in panel B display a similar pattern. For nonnormal data, the EF

standard deviations are never larger than the comparable QML standard deviations. Similarly, in all

cases of nonnormal data, the EF rmse matches or improves upon the

QML estimator rmse. In the case of normality the rmse results for the ML, QML and EF

approaches are virtually identical.

Panels C and D of Table 1 report the estimation results assuming an underlying GARCH

process given by (α , β )=(.1, .8) or (.05, .9), respectively, for our large sample experiment with

T=1,000 observations. The presentation format follows that in panels A and B.

In every case considered in panels C and D, we observe that the ML estimator displays less

absolute bias than the comparable QML estimator. Surprisingly, we also find that the EF

estimator always displays less bias than the QML estimator (with the exception of the case when

ML and QML are equivalent under normality). This finding suggests that the EF approach can be

used to improve the location of QML estimates even with our proposed simple specification for

skewness and kurtosis.

The standard deviations of the EF estimator relative to those of the ML and QML estimators in panels

C and D also behave well in the larger sample. The information available from empirical

third and fourth moments can be used to improve the efficacy of the QML estimator. Comparison

of the reported EF standard deviations to the QML standard deviations suggests a marked

improvement in virtually every case. The sole exception to this result for nonnormal data occurs

for the Student t distribution with 12 degrees of freedom in panel C. In this case, we observe

$(\hat{\sigma}_{\hat{\alpha}}, \hat{\sigma}_{\hat{\beta}})$ equal to (.032, .097) and (.033, .085) for the QML and EF estimators, respectively. In

general we conclude that the EF focus on the finite sample from the outset leads to a substantial

increase in efficiency. This finding is especially important given the lack of bias found in the EF

estimator.

Drost and Klaassen (DK, 1997) propose an alternative semiparametric estimator to that of

Engle and Gonzalez-Rivera (1991). Based on a simulation experiment similar to ours, they show

substantial efficiency gains for their 1-step estimator relative to the QML estimator. The

simulation framework of DK is also based on 2,500 replications of draws from a GARCH(1,1)

series of length T=1,000. In contrast to our research design, DK employ a variance specification

with an unconditional variance of $1/(1 - \alpha^* - \beta^*)$ for given GARCH parameters $\alpha^*$ and $\beta^*$. In spite

of the differences between our studies, both proposed estimators behave comparably. The

primary difference between the EF and DK estimators relates to bias. We find that, relative to the

QML estimator, the EF estimator leads to a reduction in bias for both $\hat{\alpha}$ and $\hat{\beta}$; DK find a

substantial increase in the bias for $\hat{\alpha}^*$, and a commensurate reduction in the bias for $\hat{\beta}^*$ relative

to their QML estimator.

using the Standard and Poor’s 500 daily composite stock index (SP500) series for the sample

period from Thursday, January 23, 1941 through Monday, January 15, 1996. The SP500 is a

value-weighted index of common stock prices. Prior to March of 1957, the index was composed

of only 90 stocks. Subsequent to March of 1957, the index was expanded to 500 stocks. Finally,

in July of 1976, the index included a group of financial stocks, some of which now trade over the

counter (c.f., French, Schwert and Stambaugh 1987, or Gallant, Rossi, and Tauchen 1992 for

further discussion). This series does not include dividend distributions; however, for ease of

discussion we use the terms return and percentage price change interchangeably.

Our primary focus is to present an appropriate and accurate representation of the GARCH

process governing second moment evolution of this series. To cleanse the raw series of any

conditional mean effects and any deterministic variance effects, we adopt a procedure similar to

Gallant, Rossi, and Tauchen (1992). The general issue of whitening the data for our example is

not trivial. We seek to model the variance process for a realistic series displaying zero conditional

mean and unit variance. Deviations from normality in the third and fourth moments will not cause

estimation difficulties; however, we wish to cleanse any known deterministic effects from the first

and second moments. Removal of effects related to events like Black Monday, Oct. 19, 1987,

will depend on the researcher's ideology in treating outliers, as well as the goal of the research

undertaken. For completeness, we consider three levels of filtering.

We begin by regressing percentage changes in the SP500 daily index on dummy variables

related to calendar effects, wartime years, changes in the composition of the index, and

autoregressive mean effects in the following mean adjustment equation,

\[ y_t = x_t'\beta + u_t, \tag{20} \]

where $y_t$ is the original percentage change in the SP500 series, $x_t$ is a vector of regressors, and

$u_t$ is the disturbance term. The least squares residuals from equation (20) are then fit using an

AR(10) process to remove any possible remaining temporal persistence in the conditional mean

that might influence later variance estimates. These AR(10) residuals are then transformed for use

as a dependent variable in the variance specification,

\[ \log(e_t^2) = x_t'\gamma + \varepsilon_t. \tag{21} \]

The final series used in our examples are the standardized residuals, constructed as,

\[ z_t = \frac{e_t}{\exp(x_t'\gamma / 2)}. \tag{22} \]
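The three whitening steps can be sketched in simplified form. This illustration is not the authors' procedure: it assumes the only regressors are a full set of day-of-week dummies (so the least squares fitted values reduce to group means), and it skips the AR(10) step. The function names are hypothetical.

```python
import math
from collections import defaultdict

def group_means(values, groups):
    """Least squares fit on a full set of group dummies: one mean per group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}

def whiten(returns, weekdays):
    # Mean adjustment (equation 20): remove the day-of-week mean.
    mean_fit = group_means(returns, weekdays)
    e = [r - mean_fit[d] for r, d in zip(returns, weekdays)]
    # Variance specification (equation 21): fit log squared residuals.
    # A residual of exactly zero would make the logarithm fail.
    log_e2 = [math.log(v * v) for v in e]
    var_fit = group_means(log_e2, weekdays)
    # Standardized residuals (equation 22).
    return [v / math.exp(var_fit[d] / 2.0) for v, d in zip(e, weekdays)]

# Made-up returns for illustration only
returns = [0.5, -1.0, 2.0, 0.3, -0.7, 1.1, -0.2, 0.9]
weekdays = ["Mon", "Tue"] * 4
z = whiten(returns, weekdays)
```

Because the variance regression is fit in logs, the resulting series need not have exactly unit sample variance, which is consistent with the further standardization applied before estimation.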

Table 2 reports the estimation results for three alternative sets of whitening regressors. In a

similar context, Gallant, Rossi, and Tauchen (1992) also consider additional dummy variables for

i) the months of February, March, April, May, June, July, August, September, October, and

November, ii) trading gaps of 1, 2, 3, and 4 days, and iii) time trends and quadratic time trends for

the variance specification. In contrast to Gallant, Rossi, and Tauchen (1992), we also filter out

effects related to i) each of the 10 trading days following Black Monday, 1987, for the conditional

mean and variance, ii) a dummy variable for changes in the composition of the index in March of

1957 and July of 1976, and iii) temporal components from the conditional mean using an AR(10)

process. A good discussion of the financial literature surrounding these filters can be found in

Lakonishok and Smidt (1988), or Gallant, Rossi, and Tauchen (1992).

The first column of Table 2 contains the estimated autoregressive terms for the raw series

without any additional adjustments. We observe significant autocorrelation coefficients over the

first six lags, possibly related to weekly effects. The next two columns of the table report day of

the week effects for both the mean and variance specifications. With the exception of

Wednesdays and Thursdays, all days of the week display significantly negative returns as shown in

the mean column. The extreme negative coefficient for Mondays is the familiar weekend effect.

The autoregressive coefficients in the conditional mean column are qualitatively similar to the

purely autoregressive analysis reported in the first column. The estimated day-of-the-week

variance effects suggest that the Thursday and Friday returns display significantly lower variance

than the earlier portion of the week. In total, the reported day-of-the-week effects suggest a

reduction in both expected returns and risks when additional conditioning information is

considered.

The final two columns of the table present preliminary estimation results, when all of the

dummy variables and autoregressive components are considered. The conditional mean

coefficients for days of the week and the autoregressive component are comparable, although the

Monday effect is not as extreme given that the effect of Monday, Oct. 19, 1987 has been mitigated.

The January effect can be observed in the positive returns for the last week of December and first

week of January. Interestingly, we find a large positive effect in the last weeks of both December

and January. The conditional variance effects reported during the ten-day crash period must be

interpreted with caution. During these days the conditional mean effect is clearly negative and

substantial (conditional upon knowledge of the crash). The dummy variable in the conditional

mean removes the primary effect on a day-by-day basis, resulting in a very good fit for the day

considered, and leaving a relatively small amount of variability to explain in the variance equation.

The remaining variables included have relatively little effect on the conditional mean of the series.

Table 3 reports summary statistics describing the raw SP500 percentage price changes as

well as the three whitened and standardized series. The first column of the table reports the

summary statistics for the raw SP500 percentage price change series. The conditional daily

effective mean is .00032, with a standard deviation of .00799. Thus, the unconditional sample

reward-to-variability (or Sharpe) ratio is .040. The data are somewhat left skewed and highly

leptokurtic. The departures from normality are severe as indicated by the reported Jarque-Bera

test statistics. The Ljung-Box portmanteau statistics demonstrate that the raw series displays

serious temporal persistence in the conditional mean specification and the conditional variance

specification when considering lags of 15, 20 or 25 trading days. The reported robust Q statistics

(c.f., Lo and MacKinlay 1989, and Lobato, Nankervis, and Savin 1998) support these findings for

lags of 15, 20 or 25 trading days.
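As a check on the scale of the reported normality statistics, the standard Jarque-Bera formula can be applied to the skewness and excess kurtosis reported for the raw series in Table 3. The sketch assumes the sample size of 14,342 stated in the note to Table 2; the function name is hypothetical.

```python
def jarque_bera(skewness, excess_kurtosis, n):
    """Standard Jarque-Bera statistic: n/6 * (S^2 + K^2 / 4)."""
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)

# Raw SP500 series: skewness -1.3062, excess kurtosis 36.9623
jb_raw = jarque_bera(-1.3062, 36.9623, 14342)
```

The result is roughly 820,500, in line with the value reported for the raw series, and it is dominated almost entirely by the kurtosis term.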

The final three columns of Table 3 report the same results for the three series generated

from the whitening procedure detailed in Table 2. All series have been standardized to have zero

mean and unit variance for our estimating function example. Skewness is significantly negative

in only two of the series, while all transformed series remain highly leptokurtic. Thus, even after

substantial filtering, the data display serious departures from normality, requiring an estimation

methodology that is robust to such departures. The conditional means of all remaining series

appear to be relatively free of temporal dependence in the mean; however, the conditional

variances remain highly persistent. It is this remaining persistence that we hope to fit with our

GARCH specification.

Table 4 reports the estimated parameters and diagnostic statistics for the standardized

residuals for the GARCH(1,1) process for each of the three filtered and whitened series described

in Table 3. Our application of the EF approach uses a simple sample mean of the third power of

the standardized series for the skewness parameter, and a sample average of the fourth power of

the standardized series, less 3, for the central kurtosis parameter.

The reported results show that both GARCH parameters are highly significant and suggest a

lengthy conditional variance decay process for all series. In this large sample empirical application

we find little change in the estimated parameters for two of the three cases. In the final column of

Table 4, where all filters are applied, we observe a substantial reduction in the estimate for α .

The earlier simulation results suggest that the efficiency of the reported estimates should be

improved over QML estimates given the extreme nonnormality in all of the series. The means of

the standardized residuals from the estimation are consistently negative, suggesting that smaller

conditional variance terms are more often associated with negative mean errors. The skewness

and kurtosis of the standardized residuals retain their nonnormal characteristics of significantly

negative skewness and leptokurtism. The final three rows of the table demonstrate that the

temporal dependence in the conditional variance up to lag 25 has been adequately captured by the

GARCH(1,1) process.
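The portmanteau diagnostics referred to here can be sketched with the standard Ljung-Box formula. The sketch assumes that the statistic on squared values is what the tables label Q_XX (an assumption, since the definition is not restated in this section); the function names are hypothetical.

```python
def ljung_box(series, max_lag):
    """Ljung-Box Q = n (n + 2) * sum_k rho_k^2 / (n - k), k = 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)  # must be nonzero (series not constant)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho * rho / (n - k)
    return n * (n + 2) * total

def ljung_box_squares(series, max_lag):
    """Same statistic applied to squared values, to detect variance persistence."""
    return ljung_box([v * v for v in series], max_lag)

q1 = ljung_box([1.0, -1.0] * 5, 1)  # strongly alternating series
```

A small value of the squared-series statistic for the standardized residuals, relative to a chi-square reference, is what indicates that the fitted GARCH(1,1) has absorbed the persistence in the conditional variance.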

Our example demonstrates that even after extensive filtering of the raw data, serious

nonnormalities remain in the data analyzed. The EF methodology explicitly uses this information

to improve the efficiency of the estimator and to lessen the bias in estimated GARCH parameters.

Further research is warranted to expand the estimation procedure to allow for temporal changes

in skewness and kurtosis, and to allow the nuisance parameters to change iteratively.

9. CONCLUDING COMMENTS

We have demonstrated the benefits of using the estimating function (EF) approach for

modeling data drawn from nonnormal conditional distributions. The approach naturally takes

advantage of departures from normality to improve the efficiency of estimated parameters given a

finite sample of data. In comparison with asymptotically based procedures, the focus on the finite

sample in the EF approach is important. We find efficiency gains from the EF approach are

substantial. Thus, the estimating function approach will be most useful in cases with serious

departures from normality where efficiency is important. The estimating function approach

should be a natural forecasting technology, when accurate and small confidence bounds are

sought.

Our simulation results suggest that the finite sample bias and variance of EF estimators are

desirable relative to other alternatives such as QML or other semiparametric techniques. In

particular, we find that both the finite sample bias and variance of the EF approach are virtually

always less than the QML estimator bias and variance.

Our empirical example demonstrates that the estimating function approach is readily

implemented in a simple example with a highly nonnormal data series. The approach is able to

successfully eliminate second order effects; however, nonnormal higher moments persist in

standardized residuals. The finite sample approach with unrestricted parameters for skewness and

kurtosis departures appears well suited to the data. Future research may seek to extend the

specifications for third and fourth moments to improve the efficiency of second moment estimates.

APPENDIX: COMPUTING THE RELATIVE EFFICIENCY OF THE

ESTIMATING FUNCTION APPROACH (GAMMA DISTRIBUTION)

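Assuming the conditional density of $y_t$ follows a Gamma distribution with shape parameter

$c$, the optimal EFs are,

\[ l_1^* = -\frac{c}{2(1+c)} \sum_{t=1}^{T} \frac{1}{h_t^2} \frac{\partial h_t}{\partial \alpha} \left( y_t^2 - h_t - \frac{2}{\sqrt{c}}\, h_t^{1/2} y_t \right), \quad \text{and} \]

\[ l_2^* = -\frac{c}{2(1+c)} \sum_{t=1}^{T} \frac{1}{h_t^2} \frac{\partial h_t}{\partial \beta} \left( y_t^2 - h_t - \frac{2}{\sqrt{c}}\, h_t^{1/2} y_t \right). \]

\[ \mathrm{Var}(\hat{\alpha}_{EF}) = \frac{V_{22}}{|V|} = \frac{c}{2(1+c)\,|V|} \sum_{t=1}^{T} \frac{1}{h_t^2} \left( \frac{\partial h_t}{\partial \beta} \right)^2, \quad \text{and} \]

\[ \mathrm{Var}(\hat{\beta}_{EF}) = \frac{V_{11}}{|V|} = \frac{c}{2(1+c)\,|V|} \sum_{t=1}^{T} \frac{1}{h_t^2} \left( \frac{\partial h_t}{\partial \alpha} \right)^2, \]

where $V^{-1}$ is again the asymptotic variance-covariance matrix of the EF estimators for α and β,

and $|V|$ is the determinant of $V$.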
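The Gamma-case estimating functions and variance expressions above can be evaluated numerically as in the following sketch. The function names are hypothetical, and the off-diagonal element V12 is an assumption that follows the pattern of the diagonal terms, since only V11 and V22 are written out. The sketch also relies on the fact that a standardized Gamma variate with shape c has skewness 2/√c and excess kurtosis 6/c, so that the common factor c/(2(1+c)) equals 1/(γ2 + 2 − γ1²).

```python
import math

def gamma_ef_weight(c):
    """Common factor c / (2 (1 + c)) in the Gamma-case EFs."""
    return c / (2.0 * (1.0 + c))

def gamma_efs(y, h, dh_dalpha, dh_dbeta, c):
    """Evaluate (l1*, l2*) given the series, variances and their derivatives."""
    w = gamma_ef_weight(c)
    skew = 2.0 / math.sqrt(c)  # skewness of a standardized Gamma(c) variate
    l1 = l2 = 0.0
    for yt, ht, da, db in zip(y, h, dh_dalpha, dh_dbeta):
        core = (yt * yt - ht - skew * math.sqrt(ht) * yt) / (ht * ht)
        l1 -= w * da * core
        l2 -= w * db * core
    return l1, l2

def ef_variances(h, dh_dalpha, dh_dbeta, c):
    """Return (Var(alpha_EF), Var(beta_EF)) = (V22/|V|, V11/|V|)."""
    w = gamma_ef_weight(c)
    v11 = w * sum(da * da / (ht * ht) for ht, da in zip(h, dh_dalpha))
    v22 = w * sum(db * db / (ht * ht) for ht, db in zip(h, dh_dbeta))
    # Assumption: the off-diagonal element follows the same pattern.
    v12 = w * sum(da * db / (ht * ht)
                  for ht, da, db in zip(h, dh_dalpha, dh_dbeta))
    det = v11 * v22 - v12 * v12
    return v22 / det, v11 / det

# Toy inputs for illustration only
l1, l2 = gamma_efs([1.0], [1.0], [1.0], [2.0], 4.0)
var_a, var_b = ef_variances([1.0, 1.0], [1.0, 2.0], [2.0, 1.0], 4.0)
```

In an estimation routine the variances and derivatives would be produced by the GARCH recursion at trial parameter values, and the sum l1*² + l2*² would be driven toward zero.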

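\[ l_1^* = -\sum_{t=1}^{T} \frac{1}{2} \frac{1}{h_t} \frac{\partial h_t}{\partial \alpha} \left( \frac{c - u_t^2}{c + u_t} \right), \quad \text{and} \]

\[ l_2^* = -\sum_{t=1}^{T} \frac{1}{2} \frac{1}{h_t} \frac{\partial h_t}{\partial \beta} \left( \frac{c - u_t^2}{c + u_t} \right), \]

where $u_t = \sqrt{c / h_t}\; y_t$.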
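The functions in terms of u_t have the form of a likelihood score. A sketch of a numerical check follows, under the assumption that y_t is a standardized Gamma variate, y_t = h_t^{1/2} (X − c)/√c with X distributed Gamma(c, 1). This parameterization is an assumption, since the conditional density is not written out in this section. Under it, the derivative of the log density with respect to h_t equals −(1/(2h_t)) (c − u_t²)/(c + u_t), which is the per-observation term before multiplying by ∂h_t/∂α or ∂h_t/∂β.

```python
import math

def log_density(y, h, c):
    """Log density of y = sqrt(h) * (X - c) / sqrt(c), X ~ Gamma(shape c, scale 1)."""
    x = c + math.sqrt(c / h) * y  # implied Gamma variate, must be positive
    return 0.5 * math.log(c / h) + (c - 1.0) * math.log(x) - x - math.lgamma(c)

def score_wrt_h(y, h, c):
    """Analytic derivative of the log density with respect to h."""
    u = math.sqrt(c / h) * y
    return -0.5 / h * (c - u * u) / (c + u)

# Central finite difference at an arbitrary point
y0, h0, c0, eps = 0.7, 1.3, 12.0, 1e-6
numeric = (log_density(y0, h0 + eps, c0) - log_density(y0, h0 - eps, c0)) / (2 * eps)
analytic = score_wrt_h(y0, h0, c0)
```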

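The estimated information matrix multiplied by the sample size T,

\[ I = \begin{pmatrix} I_{11} & I_{12} \\ I_{12} & I_{22} \end{pmatrix}, \]

has elements,

\[ I_{11} = \sum_{t=1}^{T} \frac{1}{4} \frac{1}{h_t^2} \left( \frac{\partial h_t}{\partial \alpha} \right)^2 \left( \frac{c - u_t^2}{c + u_t} \right)^2, \]

\[ I_{12} = I_{21} = \sum_{t=1}^{T} \frac{1}{4} \frac{1}{h_t^2} \frac{\partial h_t}{\partial \beta} \frac{\partial h_t}{\partial \alpha} \left( \frac{c - u_t^2}{c + u_t} \right)^2, \quad \text{and} \]

\[ I_{22} = \sum_{t=1}^{T} \frac{1}{4} \frac{1}{h_t^2} \left( \frac{\partial h_t}{\partial \beta} \right)^2 \left( \frac{c - u_t^2}{c + u_t} \right)^2. \]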

Using the inverse of the information matrix as the variance-covariance matrix for α and β

yields the Gamma distribution relative efficiency measures as reported in equation (19).
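The last step, building the information matrix from per-observation terms and inverting it, can be sketched as follows. The function names and the toy inputs are hypothetical, and the relative efficiency measure of equation (19) is not reproduced here because that equation lies outside this section.

```python
import math

def gamma_information(y, h, dh_dalpha, dh_dbeta, c):
    """Return (I11, I12, I22) from squared per-observation score terms."""
    i11 = i12 = i22 = 0.0
    for yt, ht, da, db in zip(y, h, dh_dalpha, dh_dbeta):
        u = math.sqrt(c / ht) * yt
        s2 = ((c - u * u) / (c + u)) ** 2 / (4.0 * ht * ht)
        i11 += da * da * s2
        i12 += da * db * s2
        i22 += db * db * s2
    return i11, i12, i22

def invert_symmetric_2x2(a, b, d):
    """Inverse of [[a, b], [b, d]] returned as (inv11, inv12, inv22)."""
    det = a * d - b * b
    return d / det, -b / det, a / det

# Toy inputs for illustration only
i11, i12, i22 = gamma_information([0.0, 0.5], [1.0, 1.0], [1.0, 2.0], [2.0, 1.0], 4.0)
var_alpha, cov_ab, var_beta = invert_symmetric_2x2(i11, i12, i22)
```

The diagonal elements of the inverse play the role of the ML variances for α and β against which the EF variances are compared.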

REFERENCES

Anh, V.V. (1988), "Nonlinear Least Squares and Maximum Likelihood Estimation of Heteroscedastic Regression

Model," Stochastic Processes and their Applications, 29, 317-333.

Bhapkar, V. P. (1972), “On a Measure of Efficiency of an Estimating Function,” Sankhya, 34, 467-472.

Bodurtha, J. N. and Mark, N. C. (1991), “Testing the CAPM with Time-Varying Risks and Returns,” Journal of

Finance, 46, 1485-1505.

Bollerslev, T. (1986), “Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics, 31, 307-

327.

Bollerslev, T. (1987), “A Conditional Heteroskedastic Time Series Model for Speculative Prices and Rates of Return,”

Review of Economics and Statistics, 69, 542-547.

Bollerslev, T., Chou, R. Y. and Kroner, K. F. (1992), “ARCH Modelling in Finance: A Review of the Theory and

Empirical Evidence,” Journal of Econometrics, 52, 5-59.

Bollerslev, T. and Wooldridge, J. M. (1988), “Quasi-Maximum Likelihood Estimation of Dynamic Models with Time

Varying Covariance,” Econometric Reviews, 11, 143-172.

Crowder, M. (1986), “On Consistency and Inconsistency of Estimating Equations,” Econometric Theory, 2, 305-330.

Chen, Y. (1993), “Asymptotic Theory of Optimal Estimating Functions,” Technical Report Series, STAT-93-01,

University of Waterloo.

Doob, J. L. (1953), Stochastic Processes, New York: John Wiley and Sons.

Drost, F. C., and Klaassen, C. A. J. (1997), "Efficient Estimation in Semiparametric GARCH Models," Journal of

Econometrics, 81, 193-221.

Durbin, J. (1960), "Estimation of Parameters in Time Series Regression Models," Journal of the Royal Statistical

Society, Series B, 22, 139-153.

Engle, R. F. (1982), “Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of U. K. Inflation,”

Econometrica, 50, 987-1008.

Engle, R. F. and Bollerslev, T. (1986), “Modeling the Persistence of Conditional Variances,” Econometric Reviews, 5,

1-50, pp. 81-87.

Engle, R. F., Lilien, D. M. and Robins, R. P. (1987), “Estimating Time Varying Risk Premia in the Term Structure:

The ARCH-M Model,” Econometrica, 55, 391-407.

Engle, Robert F., Ng, Victor K. and Rothschild, Michael (1990), “Asset Pricing with a Factor-ARCH Covariance

Structure: Empirical Estimates for Treasury Bills,” Journal of Econometrics, 45, 213-237.

Engle, R. F. and Gonzalez-Rivera, G. (1991), “Semiparametric ARCH Models,” Journal of Business & Economic

Statistics, 9, 345-359.

French, K., Schwert, G. W. and Stambaugh, R. (1987), “Expected Stock Returns and Volatility,” Journal of Financial

Economics, 19, 3-30.

Gallant, A. R., Rossi, P. E. and Tauchen, G. (1992), “Stock Prices and Volume,” Review of Financial Studies, 5, 199-

242.

Geweke, J. (1986), "Modelling the Persistence of Conditional Variances: A Comment," Econometric Reviews, 5, 1, 57-

61.

Glosten, L. R., Jagannathan, R. and Runkle, D. (1993), “On the Relation between the Expected Value and the Volatility

of the Nominal Excess Return on Stocks,” Journal of Finance, 48, 1779-1802.

Godambe, V. P. (1960), “An Optimum Property of Regular Maximum Likelihood Estimation,” The Annals of

Mathematical Statistics, 31, 1208-12.

Godambe, V. P. (1976), “Conditional Likelihood and Unconditional Optimum Estimating Equations,” Biometrika. 63,

277-84.

Godambe, V. P. (1985), “The Foundation of Finite Sample Estimation in Stochastic Processes,” Biometrika. 72, 419-

28.

Godambe, V. P., Ed. (1991), Estimating Functions, Oxford: Oxford University Press.

Godambe, V. P. and Heyde, C. C. (1987), “Quasi-likelihood and Optimal Estimation,” International Statistical Review,

55, 231-44.

Godambe, V. P. and Thompson, M. E. (1984), “Robust Estimation Through Estimating Equation,” Biometrika 71, 115-

25.

Godambe, V. P. and Thompson, M. E. (1989), “An Extension of Quasi-Likelihood Estimation (with discussion),”

Journal of Statistical Planning and Inference, 22, 137-72.

Hall , P. and Heyde, C. C. (1980), Martingale Limit Theory and Its Application, New York: Academic Press.

Heyde, C. C. (1989), “Quasi-likelihood and Optimality of Estimating Function: Some Current Unifying Themes,”

Bulletin of the International Statistical Institute, Book 1, 19-29.

Heyde, C. C., and Lin, Y. X. (1992), "On Quasi-likelihood Methods and Estimation for Branching Processes and

Heteroscedastic Regression Models," Australian Journal of Statistics, 34, 2, 199-206.

Higgins, M. L., and Bera, A. K. (1992), “A Class of Nonlinear ARCH Models,” International Economic Review, 33,

137-158.

Hsieh, D. A., (1989), “Modelling Heteroscedasticity in Daily Foreign Exchange Rates,” Journal of Business and

Economic Statistics, 7, 307-317.

Hutton, J. E. and Nelson, P. I. (1986), “Quasi-likelihood Estimation for Semimartingales,” Stochastic Processes and

their Applications, 22, 245-257.

Kale, B.K. (1962), "An Extension of the Cramer-Rao Inequality for Statistical Estimating Functions," Scandinavian

Actuarial Journal, 45, 60-89.

Liang, K. Y., and Zeger, S. L. (1986), “Longitudinal Data Analysis Using Generalized Linear Models,” Biometrika, 73,

13-22.

Lakonishok, J. and Smidt, S. (1988), “Are Seasonal Anomalies Real? A Ninety-Year Perspective,” Review of Financial

Studies, 1, 403-425.

Lo, A. W., and MacKinlay, A. C. (1989), “The Size and Power of the Variance Ratio Test in Finite Samples: A Monte

Carlo Investigation,” Journal of Econometrics, 40, 203-238.

Lobato, I., Nankervis, J. C., and Savin, N. E. (1998), “Testing that Stock Returns are Uncorrelated Using a Modified

Box-Pierce Q-Test,” working paper, University of Iowa.

Mills, T. (1995), “Modelling Skewness and Kurtosis in the London Stock Exchange FT-SE Index Return Distributions,”

The Statistician, 44, 323-332.

Nelson, D., (1991), “Conditional Heteroscedasticity in Asset Returns: A New Approach,” Econometrica, 59, 347-370.

Rogalski, R. J., and Vinso, J. D. (1978), “Empirical Properties of Foreign Exchange Rates,” Journal of International

Business Studies, 9, 69-79.

Schwert, G. William (1989), “Why Does Stock Market Volatility Change Over Time,” Journal of Finance, 44, 1115-

1153.

Sentana, E., (1991), “Quadratic ARCH models: A potential re-interpretation of ARCH models,” Unpublished working

paper, CEMFI, Madrid.

Mathematical Statistics, Symposium on Estimating Functions.

Weiss, A. A. (1986), “Asymptotic Theory for ARCH Models: Estimation and Testing,” Econometric Theory, 2, 107-

131.

Table 1

Finite Sample Properties of ML, QML and EF estimates

Distribution    Estimator  mean α̂  mean β̂  σ̂(α̂)   σ̂(β̂)   bias α̂  bias β̂  rmse α̂  rmse β̂

Panel A. α = 0.1, β = 0.8, T=500
Normal          ML=QML     0.104   0.774   0.045   0.131   0.004   -0.026   0.046   0.133
                EF         0.103   0.771   0.044   0.130   0.003   -0.029   0.044   0.134
Student t (5)   ML         0.105   0.775   0.050   0.134   0.005   -0.025   0.050   0.137
                QML        0.112   0.759   0.066   0.170   0.012   -0.041   0.067   0.174
                EF         0.109   0.769   0.062   0.155   0.009   -0.031   0.063   0.158
Student t (8)   ML         0.105   0.772   0.047   0.137   0.005   -0.028   0.048   0.140
                QML        0.105   0.773   0.053   0.145   0.005   -0.027   0.053   0.148
                EF         0.107   0.767   0.051   0.143   0.007   -0.033   0.052   0.147
Student t (12)  ML         0.103   0.775   0.046   0.135   0.003   -0.025   0.046   0.138
                QML        0.104   0.771   0.049   0.146   0.004   -0.029   0.049   0.149
                EF         0.104   0.771   0.047   0.140   0.004   -0.029   0.048   0.143
Gamma (12)      ML         0.102   0.777   0.039   0.118   0.002   -0.023   0.039   0.120
                QML        0.105   0.769   0.048   0.143   0.005   -0.031   0.048   0.146
                EF         0.102   0.777   0.043   0.126   0.002   -0.023   0.043   0.128
Gamma (30)      ML         0.102   0.778   0.042   0.128   0.002   -0.022   0.042   0.130
                QML        0.104   0.769   0.045   0.141   0.004   -0.031   0.035   0.144
                EF         0.103   0.775   0.043   0.130   0.003   -0.025   0.043   0.133

Panel B. α = 0.05, β = 0.9, T=500
Normal          ML=QML     0.054   0.863   0.032   0.139   0.004   -0.037   0.033   0.144
                EF         0.055   0.858   0.033   0.138   0.005   -0.042   0.034   0.144
Student t (5)   ML         0.056   0.865   0.036   0.138   0.006   -0.035   0.036   0.142
                QML        0.061   0.853   0.051   0.164   0.011   -0.047   0.052   0.171
                EF         0.063   0.846   0.051   0.158   0.013   -0.054   0.052   0.167
Student t (8)   ML         0.056   0.860   0.035   0.142   0.006   -0.040   0.035   0.147
                QML        0.056   0.860   0.040   0.152   0.006   -0.040   0.041   0.157
                EF         0.057   0.858   0.039   0.140   0.007   -0.042   0.039   0.146
Student t (12)  ML         0.055   0.861   0.035   0.144   0.005   -0.039   0.035   0.149
                QML        0.055   0.864   0.035   0.141   0.005   -0.036   0.035   0.145
                EF         0.055   0.864   0.034   0.131   0.005   -0.036   0.034   0.136
Gamma (12)      ML         0.053   0.873   0.030   0.123   0.003   -0.027   0.030   0.126
                QML        0.055   0.864   0.035   0.139   0.005   -0.036   0.036   0.143
                EF         0.054   0.864   0.032   0.126   0.004   -0.036   0.033   0.131
Gamma (30)      ML         0.053   0.868   0.030   0.122   0.003   -0.032   0.030   0.126
                QML        0.054   0.864   0.035   0.139   0.004   -0.036   0.035   0.144
                EF         0.054   0.866   0.032   0.128   0.004   -0.034   0.033   0.132

Table 1 (continued)

Distribution    Estimator  mean α̂  mean β̂  σ̂(α̂)   σ̂(β̂)   bias α̂  bias β̂  rmse α̂  rmse β̂

Panel C. α = 0.1, β = 0.8, T=1,000
Normal          ML=QML     0.102   0.788   0.028   0.076   0.002   -0.012   0.028   0.077
                EF         0.101   0.786   0.027   0.078   0.001   -0.014   0.027   0.079
Student t (5)   ML         0.102   0.787   0.032   0.083   0.002   -0.013   0.032   0.083
                QML        0.108   0.771   0.044   0.123   0.008   -0.029   0.045   0.126
                EF         0.107   0.775   0.042   0.109   0.007   -0.025   0.043   0.112
Student t (8)   ML         0.101   0.788   0.030   0.082   0.001   -0.012   0.030   0.083
                QML        0.104   0.777   0.035   0.106   0.004   -0.023   0.036   0.108
                EF         0.103   0.785   0.034   0.088   0.003   -0.015   0.034   0.089
Student t (12)  ML         0.102   0.785   0.029   0.079   0.002   -0.015   0.029   0.081
                QML        0.104   0.779   0.032   0.097   0.004   -0.021   0.032   0.100
                EF         0.102   0.786   0.033   0.085   0.002   -0.014   0.033   0.087
Gamma (12)      ML         0.101   0.791   0.025   0.063   0.001   -0.009   0.025   0.063
                QML        0.103   0.782   0.032   0.100   0.003   -0.018   0.032   0.102
                EF         0.102   0.790   0.029   0.074   0.002   -0.010   0.029   0.074
Gamma (30)      ML         0.102   0.787   0.028   0.075   0.002   -0.013   0.028   0.076
                QML        0.103   0.781   0.031   0.094   0.003   -0.019   0.031   0.095
                EF         0.101   0.787   0.028   0.079   0.001   -0.013   0.028   0.080

Panel D. α = 0.05, β = 0.9, T=1,000
Normal          ML=QML     0.052   0.883   0.022   0.087   0.002   -0.017   0.022   0.088
                EF         0.054   0.874   0.022   0.099   0.004   -0.026   0.022   0.103
Student t (5)   ML         0.052   0.888   0.023   0.070   0.002   -0.012   0.023   0.071
                QML        0.059   0.842   0.036   0.185   0.009   -0.058   0.038   0.194
                EF         0.057   0.869   0.031   0.112   0.007   -0.031   0.032   0.116
Student t (8)   ML         0.051   0.887   0.021   0.068   0.001   -0.013   0.021   0.070
                QML        0.057   0.846   0.028   0.172   0.007   -0.054   0.029   0.180
                EF         0.055   0.873   0.025   0.096   0.005   -0.027   0.026   0.099
Student t (12)  ML         0.052   0.883   0.022   0.082   0.002   -0.017   0.022   0.084
                QML        0.055   0.848   0.025   0.177   0.005   -0.052   0.026   0.184
                EF         0.054   0.876   0.023   0.090   0.004   -0.024   0.024   0.093
Gamma (12)      ML         0.052   0.884   0.019   0.072   0.002   -0.016   0.019   0.074
                QML        0.057   0.844   0.025   0.175   0.007   -0.056   0.026   0.184
                EF         0.053   0.879   0.021   0.088   0.003   -0.021   0.021   0.091
Gamma (30)      ML         0.053   0.880   0.020   0.081   0.003   -0.020   0.020   0.084
                QML        0.055   0.851   0.025   0.168   0.005   -0.049   0.025   0.175
                EF         0.053   0.877   0.021   0.089   0.003   -0.023   0.022   0.092

Table 2. Data Adjustments for S&P500 Daily Index Returns

Whitened and Standardized Data

                  Only AR Terms   AR Terms and Day of the       AR Terms and All Dummies
                                  Week Effects
                  Mean            Mean           Variance       Mean           Variance

Constant                          .07880***      -1.8879***     .07605***      -1.5448***

Day of the Week
  Monday                          -.20029***     -.04107        -.18390***     -.06841
  Tuesday                         -.08540***     -.06017        -.08482***     -.04610
  Thursday                        -.06434*       -.16872**      -.06087**      -.17843***
  Friday                          -.04391***     -.22623***     -.04128*       -.22504***

January Effect
  Dec. 1 - 7                                                    .10247*        .30894**
  Dec. 8 - 14                                                   -.02406        -.01504
  Dec. 15 - 21                                                  .05941         .11450
  Dec. 22 - 31                                                  .10040***      -.87696***
  Jan. 1 - 7                                                    .07911         -.71494***
  Jan. 8 - 14                                                   -.04995        .47067***
  Jan. 15 - 21                                                  -.01437        .12693
  Jan. 22 - 31                                                  .11882**       .13879

Black Monday
  Oct. 19                                                       -25.536***     .81099***
  Oct. 20                                                       6.5203***      -1.2401***
  Oct. 21                                                       11.273***      -3.8493***
  Oct. 22                                                       -4.9666***     -2.0639***
  Oct. 23                                                       -.08961***     -2.0673***
  Oct. 26                                                       -10.294***     -2.8112***
  Oct. 27                                                       3.0037***      -9.2526***
  Oct. 28                                                       -.06746***     -2.8453***
  Oct. 29                                                       6.1099***      -3.8248***
  Oct. 30                                                       3.5152***      -5.2992***

War Years                                                       -.00043        .02935
March, 1957                                                     .00231         -.29899***
July, 1976                                                      -.03025        -.48008***

AR terms
  1               .12064***       .12136***                     .12614***
  2               -.05160***      -.05024***                    -.03823***
  3               .00046          .00178                        .00123
  4               -.01641*        -.01579*                      -.00402
  5               .02253***       .01809**                      -.00077
  6               -.02908***      -.02773***                    -.03146***
  7               .00628          .00737                        -.00020
  8               .00594          .00729                        .01916**
  9               -.01164         -.01105                       -.01125
  10              .00127          -.00304                       .00432

NOTE: The sample period contains the 14,342 return observations from Thursday, January 23, 1941 through Monday, January

15, 1996. Significance at the 1, 5, and 10 percent levels is denoted by ***, **, and *, respectively.

Table 3. Summary Statistics for the Raw and Whitened S&P500 Daily Index Return Data

                                         Whitened and Standardized Data
                       Raw Series        Only AR Terms     AR Terms and Day      AR T
                                                           of the Week Effects

Mean                   .00032            .00000            .00000
Minimum                -.20457           -25.254           -24.514
Maximum                .09099            10.378            10.255
Standard Deviation     .00799            1.0000            1.0000
γ1                     -1.3062***        -1.2123***        -1.1347***
γ2                     36.9623***        35.3327***        32.0499***
Jarque-Bera            820500***         749540***         616910***

Ljung-Box portmanteau
  Q_X(15)              229.14***         4.71              4.53
  Q_X(20)              241.58***         14.98             12.44
  Q_X(25)              244.21***         17.37             14.00
  Q_XX(15)             1442.68***        1650.97***        1807.81***
  Q_XX(20)             1474.68***        1686.01***        1849.12***
  Q_XX(25)             1497.32***        1713.87***        1881.78***

Robust Q statistics
  Q*_X(15)             39.39***          2.39              2.23
  Q*_X(20)             46.44***          8.29              6.71
  Q*_X(25)             47.97***          9.64              7.57

NOTE: Significance at the 1, 5, and 10 percent levels is denoted by ***, **, and *, respectively.

Table 4. Estimated Parameters and Diagnostic Analysis

| Statistic | Only AR Terms | AR Terms and Day of the Week Effects | AR Term Dum |
|---|---|---|---|
| $\hat{\alpha}$ | .06670*** | .06703*** | .05 |
| $\hat{\beta}$ | .92330*** | .92297*** | .92 |
| Mean | -.00915 | -.00864 | -.00 |
| Minimum | -15.776 | -15.733 | -15. |
| Maximum | 10.784 | 10.207 | 9.0 |
| Standard Deviation | 1.2775 | 1.2777 | 1.1 |
| $\gamma_1$ | -.60732*** | -.62492*** | -.57 |
| $\gamma_2$ | 6.53001*** | 6.72728*** | 6.7 |
| Jarque-Bera | 26363.*** | 27978.*** | 282 |
| Ljung-Box portmanteau | | | |
| $Q_{XX}(15)$ | 19.57 | 16.45 | 15 |
| $Q_{XX}(20)$ | 20.86 | 17.83 | 17 |
| $Q_{XX}(25)$ | 23.71 | 20.80 | 18 |

NOTE: Significance at the 1, 5, and 10 percent levels is denoted by ***, **, and *, respectively.
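For readers who want to reproduce diagnostics of this kind, the following is a minimal sketch of the standard Ljung-Box portmanteau statistic, Q(m) = n(n+2) Σ_{k=1..m} ρ_k² / (n−k). It assumes that the Q_XX rows apply this statistic to the squared series, which is a guess at the table's notation rather than something the table states. The function names and the toy series are hypothetical, and this is not the authors' code.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box(x, max_lag):
    """Standard Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(x)
    rho = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


# Toy residual series (made up for illustration, not data from the paper).
residuals = [0.3, -1.2, 0.8, 0.1, -0.4, 1.5, -0.9, 0.2, 0.6, -0.7]

q_level = ljung_box(residuals, 3)                         # on the series itself
q_squared = ljung_box([r * r for r in residuals], 3)      # on squares (assumed Q_XX analogue)
print(q_level, q_squared)
```

Under the null of no serial correlation the statistic is conventionally compared with a chi-square distribution with `max_lag` degrees of freedom, with an adjustment when the series consists of residuals from a fitted model.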
