
ARTICLE IN PRESS

Journal of Wind Engineering and Industrial Aerodynamics 95 (2007) 165–182
www.elsevier.com/locate/jweia

The r largest order statistics model for extreme wind speed estimation
Ying An1, M.D. Pandey
Department of Civil Engineering, University of Waterloo, Waterloo, Ont., Canada N2L 3G1
Received 14 November 2005; received in revised form 6 April 2006; accepted 22 May 2006
Available online 28 July 2006

Abstract

The paper presents the statistical estimation of extreme wind speed using annually r largest order
statistics (r-LOS) extracted from the time series of wind data. The method is based on a joint
generalized extreme value distribution of r-LOS derived from the theory of Poisson process. The
parameter estimation is based on the method of maximum likelihood. The hourly wind speed data
collected at 30 stations in Ontario, Canada, are analyzed in the paper. The results of r-LOS method
are compared with those obtained from the method of independent storms (MIS) and specifications
of the Canadian National Building Code (CNBC-1995). The CNBC estimates are apparently a
conservative upper bound owing to the large sampling error associated with annual maxima analysis. Using
the r-LOS method, the paper shows that the wind pressure data can be suitably modelled by the
Gumbel distribution.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Wind speed; Extreme value estimation; Generalized extreme value distribution; Order statistics;
Annual maxima; Maximum likelihood method; Method of independent storms

1. Introduction

The estimation of design wind speed corresponding to a long return period is generally
based on the extreme value theory, which derives the three asymptotic domains of attraction,
namely, the Gumbel, Frechet and Weibull distributions [1]. These three distributions can be
written in a unified form, referred to as the Generalized Extreme Value distribution (GEV).

Corresponding author. Tel.: +1 519 888 4567x5858; fax: +1 519 888 4349.
E-mail address: mdpandey@uwaterloo.ca (M.D. Pandey).
1 Graduate student.

0167-6105/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jweia.2006.05.008
166 Y. An, M.D. Pandey / J. Wind Eng. Ind. Aerodyn. 95 (2007) 165–182

Traditionally, a sample of annual maximum wind speed is fitted with the Gumbel
distribution using the methods of moments or least squares. However, the statistical
extrapolation to estimate wind speed corresponding to 500–1000 year return period is
seriously contaminated by sampling and model uncertainty, if data are available for a
limited period (20–30 years). This has motivated the development of approaches to enlarge
the sample of extreme values beyond the annual maxima.
The method of independent storms (MIS), proposed by Cook [2] and refined by Harris
[3,4], considers several wind storm maxima, rather than just annual maxima. The extremes
of storm maxima are fitted with the Gumbel distribution. The MIS method is limited to the
Gumbel distribution, and it discounts the possibility of GEV model representing the data.
The Peaks-Over-Threshold (POT) method is another alternative that models the peaks of
wind speed time series exceeding a threshold by the Generalized Pareto Distribution
(GPD), which is shown to be the domain of attraction of the peaks [5,6]. However, the
application of POT is confounded by an erratic variation of a quantile estimate with
respect to the threshold used in creating the sample of peaks [7].
The paper presents an alternate extreme value analysis of the Canadian wind speed data
that is based on estimation of the joint distribution of annually r largest order statistics
(r-LOS) of data. Assuming that r-LOS are generated by an underlying inhomogeneous
Poisson process, they can be modelled by a joint GEV distribution [8,9]. The paper
shows that the r-LOS method provides a systematic approach to (1) ascertain whether data
belong to the Gumbel or the GEV distribution, and (2) estimate the sampling error
associated with quantile estimates. Although the theoretical basis of the r-LOS method is
well established, the paper illustrates its versatility in the estimation of extreme wind speed.
The paper is organized as follows. A brief review of extreme value theory and MIS is
presented in Section 2. The proposed r-LOS method is described in Section 3. Section 4
presents a detailed analysis of wind data collected at 30 sites in Ontario (Canada) using
r-LOS and MIS methods. The wind speed quantile estimates are compared with the design
values specified in the Canadian National Building Code (CNBC) 1995. Section 5
summarizes the findings of this paper.

2. Extreme value estimation methods

2.1. Background

2.1.1. General concept


Consider a sample of iid random variables, $(X_1, X_2, \ldots, X_n)$, with a cumulative
distribution function (CDF) $F_X(x)$, and denote the maximum value in the sample as
$M_n = \max(X_1, \ldots, X_n)$. The CDF of $M_n$ can be obtained as

$$\Pr\{M_n < x\} = [F_X(x)]^n. \tag{1}$$

According to extreme value theory, if there exist some constants $a_n$ and $b_n$, then the
distribution of extremes converges to a non-trivial result as

$$\Pr\{(M_n - b_n)/a_n \le x\} = [F_X(a_n x + b_n)]^n \to G_X(x) \quad \text{as } n \to \infty. \tag{2}$$
The asymptotic distribution, GX(x), must converge to one of the three types of
distributions, namely, the Gumbel, Frechet and Weibull forms [1]. The three distributions

can be expressed in a unified form as the GEV distribution [10]:

$$G_X(x) = \exp\left\{-\left[1 + \frac{\xi(x - \mu)}{\sigma}\right]^{-1/\xi}\right\}, \tag{3}$$

where $\mu$, $\sigma$ and $\xi$ are the location, scale and shape parameters, respectively. If $\xi > 0$, the
GEV is known as the Type II (Frechet) distribution with an unbounded upper tail
($\mu - \sigma/\xi < x < \infty$). The case of $\xi < 0$ is called the Type III (reverse Weibull) distribution with a
finite upper tail ($-\infty < x < \mu - \sigma/\xi$). As $\xi \to 0$, the Type I (Gumbel) distribution is obtained:

$$G_X(x) = \exp\left\{-\exp\left[-(x - \mu)/\sigma\right]\right\}. \tag{4}$$
Since the GEV in a theoretical sense encompasses all the three types of extreme value
distributions, it has become a popular choice for extreme value analysis without any
presumption about the Gumbel distribution.
It should be noted that the GEV converges to the Gumbel distribution only in an
asymptotic sense, because Eq. (3) is not applicable to the case of $\xi = 0$ due to a singularity.
This has an important practical consequence that a non-zero value of the shape parameter
obtained during the distribution fitting cannot be accepted as it is. In fact, a statistical
significance test is required to test whether or not a non-zero shape parameter is indicative
of the GEV or Gumbel distribution.
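
The singularity at zero shape parameter can be handled in code by switching to the Gumbel form explicitly. The following is a minimal Python sketch of Eqs. (3) and (4), not the authors' implementation; the function name `gev_cdf` and the switching tolerance `tol` are assumptions introduced here for illustration.

```python
import math

def gev_cdf(x, mu, sigma, xi, tol=1e-9):
    """GEV CDF of Eq. (3); uses the Gumbel form of Eq. (4) when |xi| < tol.

    `tol` is a hypothetical numerical threshold, not a value from the paper.
    """
    z = (x - mu) / sigma
    if abs(xi) < tol:
        # Gumbel limit, Eq. (4)
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below the lower bound when xi > 0,
        # above the finite upper bound mu - sigma/xi when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))
```

For a very small but non-zero shape parameter the two branches agree closely, which is why a formal test, rather than the fitted value alone, is needed to choose between the models.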

2.1.2. The Gumbel versus GEV distribution: convergence issue


A key issue related to the application of extreme value distribution is the convergence, or
lack of it, of $[F_X(x)]^n$ to the asymptotic distribution $G_X(x)$.
Cook [2] showed that the exponential parent distribution most rapidly converges to the
Gumbel form when $n \approx 100$. However, the rate of convergence can be quite slow for other
parents. For example, there is a significant departure between the tails of the Rayleigh
parent and Gumbel distributions for n as high as 10,000. For this reason, prior to the
extreme value analysis a suitable transformation of original data is desirable to bring it
closer to an exponential form. Assuming that the parent wind speed distribution
approximately follows the Rayleigh distribution, the dynamic pressure, which involves a
square of the wind speed, would be closer to the exponential distribution. Hence, extremes
of pressure should be fitted with the Gumbel distribution, due to their much faster rate of
convergence than the wind speed data [2–4].
The empirical data analysis typically shows that the three parameter GEV distribution
provides a superior fit to the data that follow a curve on the Gumbel plot due to added
flexibility associated with an additional parameter. However, in many cases the GEV shape
parameter tends to be negative implying a bounded upper tail. It has sparked a debate due
to possible underestimation of extreme values by a bounded distribution. The proponents
of GEV justify its use due to its general form, better quality of empirical fit and a possible
physical upper limit to wind speed. The opponents of the GEV argue that a lack of
convergence to the Gumbel distribution is manifested by the curvature of sample data (i.e.,
empirical distribution) on the Gumbel plot, and propose data transformation as a way to
accelerate the convergence.
The extreme value theory is based on solid mathematical foundation, though practical
applications are confounded by limited data that do not satisfy the conditions of

asymptotic convergence. The r-LOS method will be applied to examine this issue in the
context of the Canadian wind speed data analysis.

2.2. The annual maxima (Gumbel) method

It is a simple and straightforward method, adopted by many national design codes
worldwide, in which the Gumbel distribution is used to fit a sample of annual maximum wind
speed. The annual maxima are plotted on the Gumbel probability paper and parameters
are estimated from the method of least squares.
The design wind speeds specified in the CNBC 1995 are based on the Gumbel analysis of
annual maxima of wind speed data [11]. Given a sample of n year maxima, the method of
moments is used to estimate the Gumbel parameters [12]:
$$\sigma = s_x \sqrt{6}/\pi, \qquad \mu = \bar{x} - 0.5772\,\sigma, \tag{5}$$

where $\bar{x}$ is the mean and $s_x$ is the standard deviation of the annual maxima data, and $\mu$ and
$\sigma$ are the location and scale parameters, respectively. In CNBC, the 30-year wind speed quantile
is chosen as a reference speed. The mean 30-year wind speed ($\bar{V}_{30}$) is first estimated for
every station in Canada from the Gumbel analysis of annual maxima data, varying from
15 to 25 years in duration [12]. The data uncertainty is added to the mean estimate to
obtain the final design value.
The sampling error associated with the 30-year quantile was estimated as $e_s = 2.96\,s_x/\sqrt{n}$
[12]. This formula is based on the sampling error associated with the Gumbel quantile written
as a function of sample moments. The additional sources of error are climate variability,
siting uncertainty, anemometer height uncertainty and uncertainty associated with the log law
for height correction, which were assumed to be 25%, 20%, 15% and 15% of the sampling
error ($e_s$), respectively. The final data uncertainty ($e_d$) was obtained by adding the variances of
all these components, i.e., $e_d = \sqrt{1 + 0.25^2 + 0.20^2 + 0.15^2 + 0.15^2}\,e_s = 1.0712\,e_s$. The 30-year wind speed and the associated error
were plotted on a map, and the final extreme wind contour map was prepared using expert
judgement. The design speed appears to be specified as $V_{30} = \bar{V}_{30} + e_d$, i.e., it corresponds
to 68% upper bound confidence interval.
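
The method-of-moments fit of Eq. (5) and the uncertainty combination described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the use of the sample (n − 1) standard deviation for the standard deviation of the annual maxima is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics

def gumbel_moments(annual_maxima):
    """Method-of-moments Gumbel fit, Eq. (5). Returns (location, scale)."""
    s_x = statistics.stdev(annual_maxima)      # assumption: sample (n-1) estimator
    sigma = s_x * math.sqrt(6.0) / math.pi     # scale
    mu = statistics.mean(annual_maxima) - 0.5772 * sigma  # location
    return mu, sigma

def data_uncertainty(annual_maxima, extra_fractions=(0.25, 0.20, 0.15, 0.15)):
    """Sampling error e_s and combined data uncertainty e_d.

    The extra error sources are expressed as fractions of e_s and are
    combined with e_s by adding variances.
    """
    n = len(annual_maxima)
    e_s = 2.96 * statistics.stdev(annual_maxima) / math.sqrt(n)
    factor = math.sqrt(1.0 + sum(f * f for f in extra_fractions))
    return e_s, factor * e_s
```

With the four fractions quoted in the text, the combination factor evaluates to about 1.0712, which matches the multiplier used for the data uncertainty.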
The design speed corresponding to any other T-year return period can be calculated in
terms of $V_{30}$ as

$$V_T = V_{30} - 0.7797\,s_x\left[3.3843 + \ln\left(\ln\frac{T}{T-1}\right)\right]. \tag{6}$$

The CNBC provides an alternate formula in which a T-year quantile is expressed in
terms of the 10- and 30-year quantiles as

$$x_T = x_{30} + \frac{x_{30} - x_{10}}{1.1339}\,\ln\left[\frac{0.0339}{-\ln(1 - 1/T)}\right]. \tag{7}$$
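
The constants in Eqs. (6) and (7) follow from the Gumbel reduced variate, $-\ln[-\ln(1 - 1/T)]$, which is about 3.3843 for T = 30 and about 2.2504 for T = 10 (a difference of 1.1339). The Python sketch below uses hypothetical function names and writes the Gumbel scale in Eq. (7) as $(x_{30} - x_{10})/1.1339$, so that the two formulas give the same quantile when applied to the same fitted distribution.

```python
import math

def gumbel_reduced_variate(T):
    """Reduced variate -ln(-ln(1 - 1/T)) for a T-year return period."""
    return -math.log(-math.log(1.0 - 1.0 / T))

def v_T_from_v30(v30, s_x, T):
    """Eq. (6): T-year speed from the 30-year speed and the standard deviation s_x."""
    return v30 - 0.7797 * s_x * (3.3843 + math.log(math.log(T / (T - 1.0))))

def x_T_from_x10_x30(x10, x30, T):
    """Eq. (7): T-year quantile from the 10- and 30-year quantiles."""
    sigma = (x30 - x10) / 1.1339   # Gumbel scale implied by the two quantiles
    return x30 + sigma * math.log(0.0339 / (-math.log(1.0 - 1.0 / T)))
```

Setting T = 10 in the second function returns the 10-year quantile and T = 30 returns the 30-year quantile, which is a quick consistency check on the constants.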

2.3. Method of independent storms (MIS)

This method enlarges the sample for extreme value analysis by including the wind storm
maxima, typically 100 storms per year. Through the examination of continuous records of

wind speed, independent wind storms are identified between each pair of lulls, and
maximum value within each storm is extracted to form a sample of storm maxima. The
wind speed data are converted into dynamic pressure. The top r order statistics of storm
maxima data are plotted on the Gumbel probability paper and extreme quantiles are
obtained by extrapolation of the straight line fitted to the data. The main concepts of the
improved MIS, as discussed by Harris [3,4], are summarized below:
Suppose $N$ independent storm maxima are extracted from $S$ years of records, and $F_P(x)$
denotes their CDF. The probability distribution of annual maxima can be given as
$F_A(x) = [F_P(x)]^m$, assuming that it is generated by independent storms with an annual
occurrence rate of $m = N/S$. Arrange the storm maxima in decreasing order and
denote the kth order statistic as $Y_k$, such that $Y_1$ is the largest and $Y_N$ is the
smallest value. The probability density function (PDF) of $Y_k$ is given as [4]

$$f_{Y_k}(y_k) = \frac{N!}{(k-1)!\,(N-k)!}\,[F_P(y_k)]^{N-k}\,[1 - F_P(y_k)]^{k-1}\,f_P(y_k). \tag{8}$$

Using the probability integral transformation, $z_k = F_P(y_k)$, the PDF of the cumulative
frequency or the plotting position, $z_k$, of $Y_k$ can be derived as

$$f_{Z_k}(z_k) = f_{Y_k}\{F_P^{-1}(z_k)\}\,\frac{dy_k}{dz_k} \quad \text{and} \quad dz_k = f_P(y_k)\,dy_k. \tag{9}$$

Combining Eqs. (8) and (9) leads to

$$f_{Z_k}(z_k) = \frac{N!}{(k-1)!\,(N-k)!}\,z_k^{N-k}\,(1 - z_k)^{k-1}. \tag{10}$$

The standard solution of the mean plotting position is obtained as
$E[z_k] = \int_0^1 z_k f_{Z_k}(z_k)\,dz_k = (N-k+1)/(N+1)$, where $E[\cdot]$ denotes the expectation operation.
Since $F_A(x) = [F_P(x)]^m = z^m$ is assumed to converge to the Gumbel distribution, Harris [4]
recommended plotting the expected value of the Gumbel plotting positions,
$E[-\ln\{-\ln(z_k^m)\}] = \int_0^1 [-\ln(-\ln z_k) - \ln m]\,f_{Z_k}(z_k)\,dz_k$, against the order statistics,
$Y_k$, of the storm maxima. The annual storm rate, $m$, has a fairly small effect on the analysis,
because $\ln(m)$ is involved in the plotting position formula and its value does not change
much even with large variations in the value of $m$.
Harris also presented a computer program for accurate computation of the
mean and variance of the Gumbel plotting positions for various values of k. Another
refinement that Harris proposed was to use the weighted least-squares method to fit a
straight line to the data on the Gumbel plot, instead of the classical least squares. The
weight associated with kth plotting position is taken as inversely proportional to its
standard deviation [4].
In the MIS method, only r top-order statistics out of N storm maxima are plotted on the
Gumbel paper. The value of r is recommended so that approximately 2.5 observations per
year can be included in the plot. In summary, MIS method uses a larger data set than
annual maxima, employs preconditioning of data (i.e., square transformation) and plots
them consistently on the Gumbel paper.
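
The plotting positions used in the MIS can be illustrated numerically. The Python sketch below is not Harris's program; it is a crude midpoint-rule integration of the expectation over the density of Eq. (10), with hypothetical function names and a hypothetical `steps` parameter, and it is only meant to show the structure of the calculation for modest N.

```python
import math

def mean_plotting_position(k, N):
    """Mean cumulative frequency of the kth largest of N storm maxima."""
    return (N - k + 1) / (N + 1)

def mean_gumbel_plotting_position(k, N, rate, steps=100_000):
    """Midpoint-rule estimate of E[-ln(-ln(z_k**rate))] under Eq. (10).

    `rate` is the annual storm rate; `steps` is a hypothetical grid size.
    """
    # log of N! / ((k-1)! (N-k)!)
    log_c = math.lgamma(N + 1) - math.lgamma(k) - math.lgamma(N - k + 1)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        log_f = log_c + (N - k) * math.log(z) + (k - 1) * math.log1p(-z)
        value = -math.log(-math.log(z)) - math.log(rate)
        total += value * math.exp(log_f) * h
    return total
```

Because the storm rate enters only through an additive logarithmic term, changing the rate shifts every plotting position by the same constant, which is consistent with its small influence on the fitted line.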

3. r-Largest order statistics (r-LOS) model

3.1. Analysis

This method selects the r largest observations in each year of the data collected and derives
their joint distribution based on the theory of the Poisson process [13]. As seen from
Eqs. (2) and (3), the sample extreme value distribution, $[F_X(x)]^n$, asymptotically converges to
the GEV as

$$[F_X(u)]^n = G_X(u) = \exp\left\{-\left[1 + \xi(u - \mu)/\sigma\right]^{-1/\xi}\right\} \tag{11}$$

and it can be rewritten as

$$n \log[F_X(u)] = -\left[1 + \xi(u - \mu)/\sigma\right]^{-1/\xi}. \tag{12}$$

The probability that wind speed exceeds a threshold, $u$, is $p = 1 - F_X(u)$ and the number
of exceedances in $n$ trials follows the binomial distribution with parameters $n$ and $p$. As
$n \to \infty$ and $p \to 0$, $np \to \Lambda$, a constant. The distribution of the number of exceedances
converges to the Poisson distribution with $\Lambda$ as the intensity measure. Using a first-order
approximation, $\log[F_X(u)] \approx -[1 - F_X(u)]$, the Poisson intensity measure can be obtained
from Eq. (12) as

$$-n \log[F_X(u)] \approx n[1 - F_X(u)] = np \to \Lambda(u) = \left[1 + \xi(u - \mu)/\sigma\right]^{-1/\xi} \quad (n \to \infty,\; p \to 0). \tag{13}$$

Denoting the rth LOS in a sample of $n$ as $M_n^{(r)}$, its distribution, $G_r(y)$, can be related to the
Poisson distribution as

$$P[M_n^{(r)} < y] = G_r(y) = \sum_{k=0}^{r-1} \frac{[\Lambda(y)]^k}{k!}\,\exp[-\Lambda(y)]. \tag{14}$$

It is interesting that the distribution of the rth LOS incorporates information about the 1st to
$(r-1)$th order statistics (OS) of the sample. The increased information is expected to reduce
the uncertainty of estimation. In the special case of $r = 1$, Eq. (14) reduces to the GEV
distribution.
The joint distribution of an r-LOS vector $(X_r < X_{r-1} < \cdots < X_1)$ can be derived using the
theory of Poisson process as

$$f_{1 \ldots r}(x_1, \ldots, x_r) = \exp[-\Lambda(x_r)] \prod_{k=1}^{r} \frac{1}{\sigma}\,[\Lambda(x_k)]^{1+\xi}. \tag{15}$$

For details of the derivation of this distribution, readers are referred to [9, Chapter 7].
The joint distribution (15) is the basis for inference, and is referred to as the r-LOS model.
As $\xi \to 0$, the Poisson rate parameter converges in the limit to

$$\Lambda(y) = \exp\left[-\frac{y - \mu}{\sigma}\right] \quad (\xi \to 0), \tag{16}$$

which is the Gumbel analog of the r-LOS model.
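
The relation between the Poisson intensity and the distribution of the rth largest value, Eqs. (13), (14) and (16), can be sketched in Python as follows. The function names and the tolerance `tol` used to switch to the Gumbel limit are assumptions made for this illustration.

```python
import math

def intensity(y, mu, sigma, xi, tol=1e-9):
    """Poisson intensity of Eq. (13), with the Gumbel limit of Eq. (16)."""
    z = (y - mu) / sigma
    if abs(xi) < tol:
        return math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:
        # Below the lower bound (xi > 0) every value is exceeded;
        # above the upper bound (xi < 0) no exceedance is possible.
        return math.inf if xi > 0 else 0.0
    return t ** (-1.0 / xi)

def rth_largest_cdf(y, r, mu, sigma, xi):
    """Eq. (14): probability that the rth largest value is below y."""
    lam = intensity(y, mu, sigma, xi)
    if math.isinf(lam):
        return 0.0
    return math.exp(-lam) * sum(lam ** k / math.factorial(k) for k in range(r))
```

With r = 1 the sum has a single term and the function returns the GEV (or Gumbel) CDF, in line with the special case noted in the text.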
The r-LOS model has emerged as a versatile method for extreme value analysis in many
areas of science and engineering. It has been effectively applied for the estimation of

extremes of sea levels [5], wave heights [14] and rainfall [15]. A monograph by Coles
illustrates several interesting applications of this method [9]. This paper presents for the
first time a comprehensive application of r-LOS model to wind speed estimation problem.

3.2. Statistical estimation

3.2.1. Maximum likelihood method


The joint distribution of r-LOS given by Eq. (15) provides a basis for maximum
likelihood (ML) method. Suppose j years of data are available and r-LOS are extracted
from each year of data. It is reasonable to assume that wind speed data from separate years
are independent, which makes a vector of r-LOS as independent and identically distributed
across j years. Thus, the likelihood function can be constructed as a product of the j
densities (15) corresponding to the observed sample:
$$L(\mu, \sigma, \xi) = \prod_{i=1}^{j} \left\{ \exp[-\Lambda(x_{r,i})] \prod_{k=1}^{r} \frac{1}{\sigma}\,[\Lambda(x_{k,i})]^{1+\xi} \right\}. \tag{17}$$

Here, $x_{k,i}$ denotes the kth largest OS in the ith year of wind data. The distribution parameters
are obtained by maximizing the log-likelihood function. In the special case of $r = 1$, it
reduces to the GEV model for annual maxima. A quantile estimate corresponding to a T-year
return period is obtained as

$$x_T = \mu - \frac{\sigma}{\xi}\left\{1 - \left[-\log(1 - 1/T)\right]^{-\xi}\right\}. \tag{18}$$
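
As an illustration of the likelihood and the quantile formula, the Python sketch below evaluates the negative log-likelihood of the r-LOS model, built as a product of the yearly joint densities of Eq. (15) (including its $1/\sigma$ factor), and the T-year quantile of Eq. (18). The function names, the input layout (`yearly_top_r` as a list of per-year lists) and the Gumbel switching tolerance are assumptions; a general-purpose numerical optimizer would be needed to minimize the function, and none is shown here.

```python
import math

def rlos_neg_log_likelihood(params, yearly_top_r):
    """Negative log-likelihood of the r-LOS model.

    params: (mu, sigma, xi); yearly_top_r: one list of the r largest
    values per year (hypothetical input layout).
    """
    mu, sigma, xi = params
    if sigma <= 0.0:
        return math.inf
    nll = 0.0
    for year in yearly_top_r:
        log_lams = []
        for x in year:
            z = (x - mu) / sigma
            if abs(xi) < 1e-9:
                log_lams.append(-z)                 # Gumbel limit
            else:
                t = 1.0 + xi * z
                if t <= 0.0:
                    return math.inf                 # outside the support
                log_lams.append(-math.log(t) / xi)
        # The rth largest value has the largest intensity in the year.
        nll += math.exp(max(log_lams))
        nll += sum(math.log(sigma) - (1.0 + xi) * ll for ll in log_lams)
    return nll

def gev_quantile(T, mu, sigma, xi):
    """Eq. (18), with the Gumbel form used when xi is numerically zero."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

Passing lists of length one reproduces the annual-maximum GEV likelihood, which mirrors the special case mentioned above.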

3.2.2. Discrimination between the Gumbel and GEV models


A given data set can be fitted with either the joint GEV (i.e., $\xi \ne 0$) or the Gumbel (i.e., $\xi = 0$)
version of the r-LOS model. A formal statistical test, referred to as the likelihood ratio test,
can be applied to discriminate between the Gumbel and GEV models [15]. Let $L_1$ denote the
maximized likelihood associated with the GEV assumption (i.e., $\xi \ne 0$) and $L_2$ the maximized
likelihood associated with the Gumbel model, obtained by setting $\xi = 0$. The likelihood ratio statistic is defined as

$$D = -2 \log(L_2/L_1). \tag{19}$$

It is distributed as a $\chi^2$ variable with one degree of freedom. Thus, at the 5% significance
level, the Gumbel model would be preferred if $D < \chi^2_{1,0.95} = 3.841$ [15]. This criterion is used
in the paper to select the appropriate distribution.
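
The decision rule can be written as a short Python sketch. The function name is hypothetical, and the statistic is computed as twice the difference between the maximized log-likelihoods of the GEV and Gumbel fits, which is non-negative because the Gumbel model is nested in the GEV.

```python
CHI2_1_95 = 3.841  # 95th percentile of chi-square with one degree of freedom

def likelihood_ratio_test(loglik_gumbel, loglik_gev, critical=CHI2_1_95):
    """Return the statistic D and the model preferred at the 5% level."""
    d = 2.0 * (loglik_gev - loglik_gumbel)
    return d, ("Gumbel" if d < critical else "GEV")
```

As a usage example under a stated assumption: if the log-likelihood magnitudes reported for the Kingston wind pressure data (770.5 for Gumbel, 769.8 for GEV) are read as negative log-likelihoods, then `likelihood_ratio_test(-770.5, -769.8)` gives a statistic of about 1.4 and selects the Gumbel model.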

3.2.3. Sampling uncertainty


Smith [8] showed that the inverse of the information matrix is a good approximation of
the variance–covariance matrix of the ML estimators. The information matrix is the
Hessian of the negative log-likelihood evaluated at the maximum of the likelihood
function. The variance of a quantile value is calculated in terms of the variance of the
distribution parameters based on the delta method [16]. The estimated variance is used to
calculate the confidence intervals for a quantile [9].
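
A minimal Python sketch of the delta method for a quantile is given below. It assumes that the variance-covariance matrix of the parameter estimates (`cov`, a 3 x 3 nested list) is already available from the inverse information matrix, and it approximates the gradient of the quantile by central differences rather than by closed-form derivatives; the function names, the step-size rule and the normal multiplier 1.96 for a 95% interval are choices made for this illustration.

```python
import math

def gev_quantile(T, mu, sigma, xi):
    """T-year quantile of the GEV (Gumbel form when xi is numerically zero)."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

def quantile_variance(T, params, cov):
    """Delta method: grad^T * cov * grad, with a central-difference gradient."""
    grad = []
    for i, p in enumerate(params):
        h = 1e-6 * max(1.0, abs(p))   # hypothetical step-size rule
        up = list(params)
        dn = list(params)
        up[i] = p + h
        dn[i] = p - h
        grad.append((gev_quantile(T, *up) - gev_quantile(T, *dn)) / (2.0 * h))
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(3) for j in range(3))

def confidence_interval(T, params, cov, z=1.96):
    """Symmetric normal-approximation interval for the T-year quantile."""
    q = gev_quantile(T, *params)
    half = z * math.sqrt(quantile_variance(T, params, cov))
    return q - half, q + half
```

The symmetric interval is the simplest choice; it relies on the large-sample normality of the ML estimators.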

3.2.4. Choice of r largest order statistics


There is a bias-variance trade-off associated with the number of order statistics, r, used
in the analysis. A small value of r can result in large variance, but a large r is likely to cause

the bias and violate the assumption of Poisson process generating the extreme values [8].
A practical criterion is that r should be selected such that it minimizes the variance
associated with a required quantile estimate. Tawn [17] also concluded that the results for
$r = 3$–7 are very stable, showing that, provided r is not too large, this method leads to
consistent results. The analysis presented in Section 4 shows that $r = 5$ is sufficient to
provide minimum variance quantile estimates. Thus, $r = 5$ is considered in the rest of the
analysis.

4. Analysis of Canadian wind speed data

4.1. Wind speed data

This section presents the analysis of time series of the aviation wind (HLY 01) recorded
by the Environment Canada at 30 stations in the Province of Ontario. The location and
other details of these stations are given in Table 1 and Fig. 1. The data for a site contain

Table 1
Information about wind stations in Ontario, Canada

No.  Name of site              Site ID   Begin year  End year  Total years
1    Wawa A                    6059D09   1977        2004      28
2    Red Lake A                6016975   1953        2004      52
3    Kenora A                  6034075   1953        2004      52
4    Sioux Lookout A           6037775   1953        2004      52
5    Geraldton A               6042716   1981        2004      24
6    Thunder Bay A             6048261   1954        2004      51
7    Sault Ste Marie A         6057592   1961        2004      44
8    Sudbury A                 6068150   1954        2004      51
9    Earlton A                 6072225   1953        2004      52
10   Kapuskasing A             6073975   1953        2004      52
11   Timmins                   6078285   1955        2004      50
12   North Bay A               6085700   1953        2004      52
13   Gore Bay A                6092925   1953        2004      52
14   Kingston A                6104146   1967        2004      38
15   Ottawa M-C Int’l A        6106000   1954        2004      51
16   Petawawa A                6106398   1969        2004      36
17   Muskoka A                 6115525   1953        2004      52
18   Wiarton                   6119500   1953        2004      52
19   Sarnia A                  6127514   1967        2004      38
20   St. Catharines A          6137287   1971        2004      34
21   Simcoe                    6137730   1962        1986      25
22   Windsor                   6139525   1953        2004      52
23   London A                  6144475   1954        2004      51
24   Mount Forest              6145503   1962        1986      25
25   Waterloo Wellington A     6149387   1966        2002      37
26   Hamilton A                6153194   1970        2004      35
27   Toronto Island Airport    6158665   1957        2004      48
28   Toronto Pearson Airport   6158733   1953        2004      52
29   Trenton                   6158875   1953        2004      52
30   Peterborough A            6166418   1969        2004      36

Note: The letter ‘A’ in the name of a site means that data were recorded at the local airport field.

Fig. 1. Locations of 30 sites in Ontario, Canada.

hourly wind speed, which is the 2-min average of the wind speed recorded just before the
hour. The largest daily wind speed is the maximum value of these hourly wind speeds
(24 values) within a day. The design wind estimates given in CNBC were derived from
these data [11].
The main assumption in the extreme value analysis is that the wind speed maxima are
independent. To reduce mutual dependence in the data, the annual time series of daily
maximum speed are partitioned into blocks that are equal to or larger than the duration of
typical storms in days (4–8 days). Using a procedure described by Simiu and Heckert [5],
new time series of 4-day independent maxima were created for each station. The wind
speed (V) time series were converted into dynamic pressure series using the relation
$0.5\rho V^2$, where $\rho = 1.29$ kg/m$^3$ is the air density in Ontario [12].
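
The two preprocessing steps described above can be sketched in Python. This is a simplified illustration, not the procedure of Simiu and Heckert [5]: it uses non-overlapping fixed-length blocks, assumes that the input speeds are in km/h and that pressures are wanted in Pa, and uses hypothetical function names.

```python
AIR_DENSITY = 1.29  # kg/m^3, the value quoted in the text

def dynamic_pressure(speed_kmh):
    """Dynamic pressure in Pa from a wind speed in km/h (unit choice assumed)."""
    v = speed_kmh / 3.6            # convert km/h to m/s
    return 0.5 * AIR_DENSITY * v * v

def block_maxima(daily_maxima, block_days=4):
    """Maximum of each consecutive, non-overlapping block of days."""
    return [max(daily_maxima[i:i + block_days])
            for i in range(0, len(daily_maxima), block_days)]
```

For example, a speed of 36 km/h (10 m/s) corresponds to a dynamic pressure of 64.5 Pa with this air density.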
The processed data sets were analyzed using the following three methods:

(1) r-largest order statistics (r-LOS)
This method, as described in Section 3, was implemented using the algorithms provided
by Coles [9].
(2) Method of Independent Storms (MIS)
The computer code described by Harris [4] was programmed to implement this
method. The purpose is to compare its performance with r-LOS and CNBC estimates.
(3) Annual maxima Gumbel method (AMG)
The Gumbel analysis of annual maxima data using the method of moments was
performed to confirm the estimates of CNBC [11].

4.2. Numerical results

4.2.1. Illustration of extreme value analysis methods


The steps involved in the extreme value estimation methods are explained through a
detailed analysis of Kingston station data. The 38 years of daily maximum hourly speed
data (13,750 values) were processed to create a time series of 3466 independent values of
4-day maximum wind pressure.
In the r-LOS method, a sample of size 38r was extracted from the data, and the analysis
was repeated for $r = 1$–10. ML estimates of the three GEV parameters and the associated
standard errors (SEs) were calculated. The SEs associated with the shape parameter ($\xi$)
for various values of r are shown in Fig. 2. The variability decreases with an increase in r
up to 5, but there is no appreciable change in SE for r between 5 and 10. The SEs
associated with the 50- and 500-year quantiles are shown in Fig. 3. A decreasing trend in SE is
seen with a modest increase in r, which levels off for $r > 5$. Therefore, an optimum choice of
r is expected to be close to 5. Similar to the work of Smith [5], we have chosen $r = 5$ as a
fixed value in the subsequent analysis. Thus, the r-LOS sample for Kingston includes 190
values of wind pressure.
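
Extracting the r largest values in each year can be sketched as follows. The function name and the input layout, a sequence of `(year, value)` pairs of already declustered maxima, are assumptions made for this illustration.

```python
from collections import defaultdict

def r_largest_per_year(records, r=5):
    """Group (year, value) pairs by year and keep the r largest values,
    sorted in decreasing order, for each year."""
    by_year = defaultdict(list)
    for year, value in records:
        by_year[year].append(value)
    return {year: sorted(values, reverse=True)[:r]
            for year, values in by_year.items()}
```

With 38 years of data and r = 5, the resulting sample contains 38 × 5 = 190 values, matching the Kingston sample size quoted above.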
To apply the likelihood ratio test, the ML analysis was repeated under the assumption of
the Gumbel distribution (i.e., $\xi = 0$). As shown in Table 2, the likelihood ratio test discerns
the Gumbel model as a statistically representative distribution of the wind pressure.
An interesting observation is made when the likelihood test was reapplied to wind speed
data, which showed a preference for the GEV distribution with $\xi = 0.0715$ (Table 2). The
analysis of other stations, presented in the next section, also shows that the wind pressure
distribution is much closer to the Gumbel form than the wind speed data.
The MIS analysis included 88 top-order statistics out of 3466 storm maxima of the
dynamic pressure series, i.e., it uses approximately 2.3 maxima/year. The annual maxima
data (38 values) were also fitted with the Gumbel distribution, and its parameters were
estimated from the method of moments. The Gumbel parameters estimated from the three
methods are compared in Table 3.

Fig. 2. The GEV shape parameter with 95% confidence interval (Kingston, Ont.). Axes: shape parameter of GEV versus r.

Fig. 3. 50 and 500 year wind speed with 95% confidence interval (Kingston, Ont.). Two panels; axes: design speed (km/h) versus r.

Table 2
Likelihood ratio test applied to r-LOS data (Kingston, Ont.)

Data type           Log-likelihood of  Log-likelihood of  Likelihood ratio   Selection
                    r-LOS Gumbel       r-LOS GEV          test               of model
Wind pressure (Pa)  770.5              769.8              D = 1.41 < 3.841   Gumbel
Wind speed (km/h)   453.0              447.4              D = 11.24 > 3.841  GEV

Table 3
Parameters of the Gumbel distribution (Kingston, Ont.)

Method                 $\sigma$ scale (Pa)  $\mu$ location (Pa)
Annual maxima Gumbel   83.97                3.08
MIS (Gumbel)           86.11                2.98
r-LOS (Gumbel)         85.86                1.97

Fig. 4 compares the wind speed quantiles for return periods ranging from 50 to 1000
years. It is interesting that the MIS and r-LOS curves are in close agreement in this case.
The design wind speeds specified in the CNBC are higher than those obtained from the
r-LOS and MIS methods. The reason, as discussed in Section 2.1, is the higher uncertainty
associated with estimates obtained from annual maxima data.

4.3. Discussion of results for other stations in Ontario

The data from the remaining 29 stations in Ontario were analyzed as described in the
previous section. The quantity of data used by each of the three methods varies, as shown
in Fig. 5. Some of the important trends observed from the results are discussed below.

Fig. 4. Comparison of wind speed estimates (Kingston, Ont.). Axes: quantile (km/h) versus return period (years); series: NBCC 95, MIS, r-LOS.

Fig. 5. Comparison of the sample size used by different methods in Ontario data analysis. Axes: number of samples versus site no.; series: r-LOS, MIS, AMG.

4.3.1. Reduction in sampling variability


The effect of considering $r > 1$ order statistics is the reduction in sampling uncertainty.
To illustrate this point, the SEs associated with the r-LOS estimates of 500-year wind speed
are shown in Fig. 6. It is noteworthy that the SE on average is reduced by 50% as r is
increased from 1 to 5. It is a remarkable advantage of using the r-LOS method over the
classical annual maxima method.

4.3.2. The GEV shape parameter


For all 30 stations, the mean value and 95% confidence intervals of the GEV shape
parameters estimated from the r-LOS ($r = 5$) analysis of the wind pressure data are plotted
in Fig. 7. The mean shape parameter is mostly close to zero or in the positive range which
implies unbounded upper tail of the distribution. The results of the likelihood ratio test, as

Fig. 6. The number of order statistics (r) versus the standard error associated with 500-year speed. Axes: standard error (km/h) versus site no.; series: r = 1, r = 3, r = 5.

Fig. 7. The GEV shape parameter with 95% confidence interval: wind pressure data (Ont., Canada). Axes: shape parameter versus site no.

shown in Fig. 8, confirm that the Gumbel model is preferred for 27 stations. The GEV can
be used only for three stations numbered 1 (Wawa), 19 (Sarnia) and 28 (Mount Forest).
The estimated shape parameters for these stations are highly negative: $-0.11$, $-0.12$ and
$-0.19$, respectively.
In contrast, the shape parameter (mean and 95% confidence intervals) estimated from
the wind speed data is mostly negative (Fig. 9). The likelihood ratio test confirms that 22
stations follow the GEV model (Fig. 10). The negative shape parameter means that the
upper tail is bounded at $(\mu - \sigma/\xi)$, and it corresponds to the reverse Weibull (Type III)
domain of attraction. These results are in line with conclusions of the POT results

Fig. 8. Results of likelihood ratio test: wind pressure data (Ont., Canada). Axes: likelihood ratio versus site no.; criterion line at 3.841 separating the GEV (above) and Gumbel (below) regions.

Fig. 9. The GEV shape parameter with 95% confidence interval: wind speed data (Ont., Canada). Axes: shape parameter versus site no.

reported in the literature [5], in which the shape parameter for the generalized Pareto
distribution is predominantly negative.
On the other hand, the comparison of Figs. 8 and 10 appears to support the notion that
a square transformation of wind speed (to wind pressure) accelerates the convergence to
the Gumbel distribution [2,3,18].

Fig. 10. Results of likelihood ratio test: Wind speed data (Ont., Canada). Axes: likelihood ratio versus site no.; criterion line at 3.841 separating the GEV (above) and Gumbel (below) regions.

Fig. 11. Comparison of 50-year wind speed estimates. Axes: 50-year speed (km/h) versus site no.; series: CNBC 95, r-LOS, MIS.

4.3.3. Comparison of quantile estimates


The quantile estimates obtained from r-LOS analysis of wind pressure data are
compared with those obtained from MIS and CNBC specifications. The comparisons of
50-year and 500-year quantiles are given in Figs. 11 and 12, respectively.
It is interesting to note that mean estimates of 50-year wind speed obtained from MIS
and r-LOS are in close agreement, as the two methods are conceptually similar. However,

Fig. 12. Comparison of 500-year wind speed estimates. Axes: 500-year speed (km/h) versus site no.; series: CNBC 95, r-LOS, MIS.

Fig. 13. Comparison of 500-year wind speed estimated from RLOS and NBC 95 methods. Axes: quantile (km/h) versus site no.; series: NBCC 95, r-LOS + 1.07 × Std. Err., mean of r-LOS.

r-LOS estimates of 500-year wind speed are slightly higher than the corresponding MIS
values.
For the sake of a consistent comparison, mean r-LOS estimates are converted into the
design values by adding the data uncertainty (e_d = 1.0712 e_s), similar to the CNBC approach.
Fig. 13 shows that r-LOS estimates of 500-year design speed are much lower than CNBC
specifications. This suggests that the conservatism associated with CNBC can be reduced by
improved extreme value analysis of the wind speed data.
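
The conversion from a mean estimate to a design value can be sketched as follows. The sketch assumes that the uncertainty allowance is 1.0712 times the standard error of the quantile estimate, which is how the Fig. 13 legend ("r-LOS+1.07*Std.Err.") appears to read. The function name and the numerical inputs are hypothetical.

```python
# Multiplier applied to the standard error, as quoted in the text.
UNCERTAINTY_FACTOR = 1.0712


def design_value(mean_estimate, std_error, factor=UNCERTAINTY_FACTOR):
    """Add an uncertainty allowance (factor * std_error) to a mean quantile estimate."""
    if std_error < 0:
        raise ValueError("std_error must be non-negative")
    return mean_estimate + factor * std_error


# Hypothetical 500-year mean estimate (km/h) and its standard error.
print(round(design_value(100.0, 5.0), 3))
```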

5. Conclusions

The paper presents the statistical estimation of extreme wind speed using the annual
r-LOS extracted from the time series of wind data. The method is based on a joint
generalized extreme value distribution of r largest order statistics derived from the theory
of Poisson process. The parameter estimation is based on the method of maximum likelihood (ML). A formal
likelihood ratio test is applied to discern between the GEV and Gumbel distributions. The
data collected at 30 stations in Ontario, Canada, are analyzed in the paper. The results
of the r-LOS method are compared with those obtained from the MIS and the specifications of the
CNBC.
The r-LOS analysis of the Ontario data shows that the Gumbel distribution is
statistically preferable to the GEV distribution with a bounded tail due to a negative
shape parameter. In this sense, the r-LOS results support the basis of the MIS method,
which adopts the Gumbel distribution for wind pressure. However, the r-LOS method is
more versatile than the MIS method, as it also retains the flexibility of adopting the GEV
distribution.
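
The bounded tail mentioned above can be illustrated numerically. The sketch assumes the common GEV parameterization G(z) = exp{-[1 + xi (z - mu)/sigma]^(-1/xi)}, under which a negative shape parameter xi implies a finite upper end point at mu - sigma/xi. The sign convention and the parameter values are assumptions made for illustration, not values from the paper.

```python
import math


def gev_upper_bound(mu, sigma, xi):
    """Upper end point of the GEV distribution; infinite unless xi < 0."""
    return mu - sigma / xi if xi < 0 else math.inf


def gev_return_level(mu, sigma, xi, return_period):
    """T-year quantile of the GEV distribution (Gumbel limit when xi == 0)."""
    y = -math.log(1.0 - 1.0 / return_period)
    if xi == 0:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters: with xi = -0.2 the quantiles level off below the bound.
mu, sigma, xi = 80.0, 8.0, -0.2
print(gev_upper_bound(mu, sigma, xi))
print(round(gev_return_level(mu, sigma, xi, 500), 2))
print(round(gev_return_level(mu, sigma, 0.0, 500), 2))
```

With a negative shape parameter the long-return-period quantiles approach the end point, whereas the Gumbel quantile keeps growing with the logarithm of the return period.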

Acknowledgements

The authors are grateful to the Natural Sciences and Engineering Research Council of Canada (NSERC)
and the University Network of Excellence for Nuclear Engineering (UNENE) for
providing the financial support for this study. The authors are also thankful to
Environment Canada for providing the wind speed data. The authors gratefully
acknowledge the use of computer programs provided by Coles for the r-LOS model and
Simiu for filtering of the data.

References

[1] E.J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.
[2] N.J. Cook, Towards better estimation of extreme winds, J. Wind Eng. Ind. Aerodyn. 9 (1982)
295–323.
[3] R.I. Harris, Gumbel re-visited—a new look at extreme value statistics applied to wind speeds, J. Wind Eng.
Ind. Aerodyn. 59 (1996) 1–22.
[4] R.I. Harris, Improvements to the method of independent storms, J. Wind Eng. Ind. Aerodyn. 80 (1999)
1–30.
[5] E. Simiu, N.A. Heckert, Extreme wind distribution tails: a peaks over threshold approach, J. Struct. Eng. 122
(1996) 539–547.
[6] J. Pickands III, Statistical inference using extreme order statistics, Ann. Stat. 3 (1975) 119–131.
[7] M.D. Pandey, An adaptive exponential model for extreme wind speed estimation, J. Wind Eng. Ind.
Aerodyn. 90 (2002) 839–866.
[8] R.L. Smith, Extreme value theory based on the r largest annual events, J. Hydrol. 86 (1986) 27–43.
[9] S. Coles, An Introduction to Statistical Modeling of Extreme Values, Springer, Berlin, 2001.
[10] A.F. Jenkinson, The frequency distribution of the annual maximum (or minimum) of meteorological
elements, Quart. J. R. Meteorol. Soc. 81 (1955) 158–171.
[11] National Building Code of Canada, 1995, National Research Council.
[12] T.C. Yip, H. Auld, Updating the 1995 National Building Code of Canada wind pressures, in: Paper
presented to Canadian Electricity Association, 1993, pp. 1–9.
[13] I. Weissman, Estimation of parameters and large quantiles based on the k largest observations, J. Am. Stat.
Assoc. 73 (1978) 812–815.

[14] C.G. Soares, M.G. Scotto, Application of the r largest-order statistics for long-term predictions of significant
wave height, Coast. Eng. 51 (2004) 387–394.
[15] S. Nadarajah, Extremes of daily rainfall in west central Florida, Clim. Change 69 (2005) 325–342.
[16] G.W. Oehlert, A note on the delta method, Am. Statist. 46 (1992) 27–29.
[17] J.A. Tawn, An extreme-value theory model for dependent observations, J. Hydrol. 101 (1988)
227–250.
[18] A. Naess, Estimation of long return period design values for wind speeds, J. Eng. Mech. 124 (1998)
252–259.