

Evaluation of Regional Flood Frequency Analysis With a Region of Influence Approach

Article  in  Water Resources Research · October 1990


DOI: 10.1029/WR026i010p02257



WATER RESOURCES RESEARCH, VOL. 26, NO. 10, PAGES 2257-2265, OCTOBER 1990

Evaluation of Regional Flood Frequency Analysis With a Region of Influence Approach

DONALD H. BURN

Department of Civil Engineering, University of Manitoba, Winnipeg, Canada

A novel approach to regional flood frequency analysis is presented and evaluated. The technique is
referred to as the region of influence approach in that every site can have a potentially unique set of
gauging stations for use in the estimation of at-site extremes. The rationale for the methodology is
discussed, and several options for incorporating the approach into regional flood frequency analysis
are developed and compared with traditional regional estimation procedures. Through a Monte Carlo
experiment, the region of influence approach is demonstrated to provide improved at-site estimates of
extreme flow quantiles in terms of network average root mean squared error and comparable results
for bias. The method is further shown to have attractive features for estimating extremes for unusual
sites in a network of gauging stations.

Copyright 1990 by the American Geophysical Union.
Paper number 90WR01192.
0043-1397/90/90WR-01192$05.00

INTRODUCTION

Water resources engineers are often faced with the task of estimating the probability of exceedance associated with a selected flow value at a particular location on a river. This typically involves estimating the relationship between an extreme flow value and the associated return period (the so-called Q-T relationship). A complicating factor is that the return periods of interest, from a design point of view, often exceed the length of data record at the site. Thus one may be faced with the task of estimating the value of the 100-year event based on, say, 30 years of streamflow record. To alleviate this problem, regional flood frequency analysis is often employed in an attempt to compensate for an insufficient temporal characterization of the at-site Q-T relationship by utilizing information pooled from other gauging stations that are in some way similar to the at-site location.

Recent research has explored various ways of delineating a set of gauging stations that can be considered to constitute a region with sufficient homogeneity in extreme flow characteristics. Tasker [1982] compared different strategies for determining regions involving different bases for measuring similarity between gauging stations. Wiltshire [1986] has developed a statistical measure that can be used to evaluate the homogeneity of a proposed partitioning of a network of gauging stations. Lettenmaier et al. [1987] demonstrated, through a Monte Carlo experiment, the importance of identifying homogeneous regions for regional flood frequency analysis. Potter [1987] gives a summary of recent research into regionalization techniques. In a novel approach to regionalization, Cook et al. [1988] used indices of landscape and rainfall characteristics to predict streamflow indices which were then used as a basis for predicting similarity in catchment hydrology.

This paper examines a variation on the classical approach to regionalization in that a potentially unique "region" is defined for each gauging station. This novel approach to the regionalization process, which appears to have been first suggested by Acreman and Wiltshire [1987] and Acreman [1987], is referred to herein as the region of influence (ROI) approach. The fundamental premise of the approach is that there is no need for distinct boundaries between different regions; rather, each site should utilize, in the estimation of at-site extremes, information from all stations that are sufficiently similar to it. The objective of this paper is to refine and evaluate a region of influence approach for regional flood frequency analysis.

The remainder of this paper is organized in the following manner. The next section presents the ROI concept and develops different options for incorporating the general scheme into the regional flood frequency process. The following section describes a Monte Carlo experiment designed to evaluate the performance of the technique. A presentation of results ensues wherein the relevant features of the ROI approach are illustrated. The paper concludes with a discussion of the results and the identification of possible avenues for further research.

MODEL DEVELOPMENT

The region of influence approach allows for a potentially unique set of gauging stations to be used in the at-site estimation of extremes for every station in a collection of gauging stations. The ROI approach can be regarded as a natural extension of the search for homogeneous regions that has become standard in regional flood frequency analysis [Wiltshire, 1986; Lettenmaier et al., 1987; Burn, 1989]. Although the identification of homogeneous regions will often lead to an effective and efficient spatial transfer of information, there are still problems and inconsistencies with the use of these techniques. Consider, for example, the case of a station which lies on the boundary between two homogeneous regions (we need not concern ourselves now with the issue of what measures should be used to define station similarity). It has been suggested by Acreman and Wiltshire [1987] that such a station could be regarded as being a partial member in both of the regions that it borders on. Expanding this concept leads to the realization that there is no need to define boundaries between regions but rather each site can have its own "region" consisting of those stations that are sufficiently similar to the site of interest.

The starting point for the ROI approach to regionalization is the selection of a distance metric defining the closeness of each station to every other station. It is suggested that an
appropriate metric results from the weighted Euclidean distance in $M$ space, where $M$ represents the number of attributes used to define station similarity. The distance metric can be defined through

$$D_{ij} = \left[ \sum_{m=1}^{M} W_m \left( X_m^i - X_m^j \right)^2 \right]^{1/2} \tag{1}$$

where $D_{ij}$ is the weighted distance between station $i$ and station $j$, $W_m$ is the weight applied to attribute $m$ reflecting the relative importance of the attribute, and $X_m^i$ is the value of attribute $m$ for station $i$.

There are two general types or classes of attributes that can be incorporated into the distance metric. The attributes can be based on physical features of the contributing drainage area for a station (e.g., soil type classification) or on statistical measures of the data record at each site (e.g., coefficient of variation (CV) of the annual flow data). The latter approach, although not feasible for ungauged locations, can be expected to provide an effective similarity measure for gauged locations due to the expected relationship between statistical measures and the at-site extreme flow relationship. Thus sites that have similar statistical measures could be expected to have similar extreme flow responses. The former approach is the only alternative for the case of ungauged sites. Similarities in physical basin features can be expected to be indicative of similar extreme flow behavior due to similar rainfall-runoff response.

The selection of a set of attributes for inclusion in the distance metric from the array of possible attributes can be accomplished using a screening process to identify those attributes that are most indicative of similarity in extreme flow behavior. The screening process can involve plotting values for the attributes versus measures of at-site extreme flow or calculating the correlation between potential attributes and sample extreme flow measures. For the case where many attributes are available, a multivariate analysis may be required to identify either a reduced set of attributes or a set of combinations of attributes. The intent in the screening process is not only to identify a reduced set of attributes but also to determine appropriate weighting values for each selected attribute based on the perceived relative importance of each of the attributes.

After the definition of an appropriate set of attributes, two tasks remain to complete the definition of a station's ROI. The first of the remaining requirements is to determine a threshold value to define a cutoff for the inclusion of stations into the ROI for a site. Any station with a distance value in excess of the threshold will not be included in a site's ROI. The set of stations in a site's ROI is thus defined as

$$I_i = \{ j : D_{ij} \le \theta_i \} \tag{2}$$

where $I_i$ is the set of stations in the ROI for site $i$, and $\theta_i$ is the threshold for site $i$. As can be seen from the form of (2), the threshold value may be site specific, implying that $\theta_i$ is a function of some set of conditions at site $i$.

The remaining feature of the ROI technique is a weighting function that reflects the relative closeness (in $M$-dimensional attribute space) to the site of each station in a site's ROI. The weighting function will generally be of the form

$$\eta_{ij} = f(D_{ij}, \beta) \quad \forall\, j \in I_i \tag{3a}$$

$$\eta_{ij} = 0 \quad \forall\, j \notin I_i \tag{3b}$$

where $\eta_{ij}$ is the weight for station $j$ in the ROI for site $i$, $f(\cdot)$ is a functional relationship defining the weight, and $\beta$ is a parameter vector for the weighting function. The weighting function is used in the pooling of information from all stations in the ROI for a site.

There are numerous options that could be formulated to capture the characteristics of the ROI approach to regionalization. Different options can be derived by adopting different schemes for defining the threshold value, $\theta_i$, and the weighting function, $\eta_{ij}$. One of the intents of this paper is to explore the relative performance of different ROI options in terms of the estimation of at-site extreme flow quantiles. One possible approach would be to choose a sufficiently large threshold value such that all stations were included in the ROI of every other station. In combination with this type of threshold, it would be prudent to select a weighting function that would result in a comparatively low weighting assigned to the stations that are most separated, in attribute space, from the reference site. The opposite extreme would result from choosing a very restrictive threshold, such that the average size of $I_i$ is quite small, but also selecting a functional form for $\eta_{ij}$ that gave weighting values that are substantially different from zero to all stations in the ROI.

The above two examples of strategies for formulating ROI options represent different philosophies for combining information for regional flood frequency analysis. The options outlined below were formulated to be representative of the types of strategies that might be selected in an attempt to maximize the information return resulting from spatial data transfer. Although not overly complex, the options will subsequently be shown to result in efficient spatial information transfer.

Option 1

This option has a threshold value defined through

$$\theta_i = \theta_L \qquad NS_i \ge NST \tag{4}$$

$$\theta_i = \theta_L + (\theta_U - \theta_L) \left( \frac{NST - NS_i}{NST} \right) \qquad NS_i < NST \tag{5}$$

where $\theta_L$ is a lower threshold value defining a desired proximity for stations to be included in the ROI for site $i$, $NS_i$ is the number of stations in the ROI for site $i$ with the threshold at $\theta_L$, $NST$ is the target number of stations for a region of influence, and $\theta_U$ is an upper threshold value for sites with fewer than $NST$ stations in the ROI. The philosophy behind this option is that all stations within a distance corresponding to $\theta_L$ should be included in the ROI for the site, and if the number of stations included is less than the desired number, a less restrictive threshold should apply. The threshold for the latter eventuality is a function of an absolute upper bound for the threshold, $\theta_U$, and the number of stations with distance measures less than the lower threshold. The weighting function for this option is defined through

$$\eta_{ij} = 1 - \left( \frac{D_{ij}}{TP} \right)^n \tag{6}$$
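As an illustration of how (1), (2), and (4)-(6) fit together for Option 1, the following Python sketch computes the weighted distances from one site, applies the lower threshold, relaxes it when fewer than NST stations qualify, and then assigns the weights. The function names, the toy attribute values, and the parameter values are hypothetical and chosen only for illustration. The sketch also assumes that the site itself (with D_ii = 0) counts as a member of its own ROI, which the text does not state explicitly.

```python
import math


def weighted_distance(x_i, x_j, w):
    """Weighted Euclidean distance between two attribute vectors, as in (1)."""
    return math.sqrt(sum(wm * (a - b) ** 2 for wm, a, b in zip(w, x_i, x_j)))


def option1_roi(dist_row, theta_l, theta_u, nst, tp, n):
    """Threshold and weights for one site under Option 1.

    dist_row[j] holds D_ij for the site of interest i (including D_ii = 0).
    Returns (theta_i, {station index: weight}) following (2) and (4)-(6).
    """
    # NS_i: stations within the lower threshold (assumption: the site counts itself).
    ns_i = sum(1 for d in dist_row if d <= theta_l)
    if ns_i >= nst:
        theta_i = theta_l                                             # (4)
    else:
        theta_i = theta_l + (theta_u - theta_l) * (nst - ns_i) / nst  # (5)
    # Membership by (2), weights by (6).
    weights = {j: 1.0 - (d / tp) ** n
               for j, d in enumerate(dist_row) if d <= theta_i}
    return theta_i, weights


# Hypothetical network: four stations described by two attributes, unit weights.
attributes = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0), (6.0, 8.0)]
w = (1.0, 1.0)
row = [weighted_distance(attributes[0], x, w) for x in attributes]  # site 0

theta_0, weights_0 = option1_roi(row, theta_l=2.0, theta_u=8.0, nst=4, tp=10.0, n=2.0)
print(theta_0, weights_0)
```

In this toy case only two stations lie within the lower threshold, so (5) relaxes the threshold to 5.0; station 1 then joins the ROI with a reduced weight, while station 3 remains excluded.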
In (6), $TP$ and $n$ are parameters of the weighting function. Option 1 requires the definition of an upper and lower threshold value, a target number of stations for the ROI, and the parameters of the weighting function.

Option 2

This option has a constant threshold value which is given as

$$\theta_i = \theta_U \tag{7}$$

where $\theta_U$ is the constant threshold value. The weighting function for this option is given as

$$\eta_{ij} = 1 - \left( \frac{D_{ij}}{TN} \right)^n \qquad D_{ij} > \theta_L \tag{8}$$

$$\eta_{ij} = 1 \qquad \text{otherwise} \tag{9}$$

where $\theta_L$ defines a lower threshold for the distance metric, and $TN$ and $n$ are parameters of the weighting function with $TN$ defined through

$$TN = \max(TL_i, TPP) \tag{10}$$

with $TPP$ a parameter of the weighting function, and $TL_i$ given as

$$TL_i = \max_{j} (D_{ij}) \tag{11}$$

The information requirements for this option are the two threshold values and the two parameter values of the weighting function.

Option 3

The final option considered involves including all sites in the ROI such that the threshold is defined as

$$\theta_i = TL_i \tag{12}$$

and the weighting function is defined through (8)-(11). This option requires the specification of a threshold parameter and two weighting function parameters.

The first option presented above entails including a limited number of stations in the ROI for each site. The resulting stations are then expected to be very similar in extreme flow response to the site of interest. Options 2 and 3 represent variations on the contrasting approach to ROI formulation in that a comparatively large number of stations are included in each ROI (for option 3, all stations) and the weighting function is then used to reflect the relative proximity of stations. Methods of selecting parameter values for each of the options, in keeping with the modeling philosophies indicated above, will be discussed in a subsequent segment of this paper.

With the definition of the station membership for each region of influence and the determination of the weight assigned to each station in the ROI, it is possible to estimate at-site extremes incorporating information from all stations in the ROI. The methodology for combining information from all of the included stations will necessarily be somewhat specific to the distribution function selected for extreme flows and the parameter estimation technique used. In this work, the generalized extreme value (GEV) distribution is used in conjunction with the method of probability weighted moments (PWM) since this combination was identified by Potter [1987] as an efficient basis for combining extreme flow data. Other distribution functions and parameter estimation techniques could be employed, possibly necessitating revised procedures for effecting the spatial data transfer.

The GEV distribution is given as

$$F(x) = \exp\left\{ -\left[ 1 - g(x - \xi)/\alpha \right]^{1/g} \right\} \qquad g \ne 0 \tag{13}$$

$$F(x) = \exp\left\{ -\exp\left[ -(x - \xi)/\alpha \right] \right\} \qquad g = 0 \tag{14}$$

The three parameters of this distribution can be estimated using three PWMs obtained from the sample data through [Hosking et al., 1985]

$$M_r = \frac{1}{np} \sum_{i=1}^{np} P_i^r x_i \qquad r = 0, 1, 2 \tag{15}$$

where $P_i = (i - 0.35)/np$ is the plotting position for data point $x_i$ and $np$ is the number of years of record at the station. From the PWMs calculated for each station, scaled values can be obtained through

$$t_{1,j} = M_1^j / M_0^j \tag{16}$$

$$t_{2,j} = M_2^j / M_0^j \tag{17}$$

where the index $j$ denotes the station number. PWMs for the ROI are then derived from the scaled PWMs of the stations in the ROI through

$$T_k^i = \sum_{j \in I_i} t_{k,j}\, np_j\, \eta_{ij} \Big/ \sum_{j \in I_i} np_j\, \eta_{ij} \qquad k = 1, 2 \tag{18}$$

where $I_i$ is the set of stations in the ROI for site $i$, and $np_j$ is the number of years of record for station $j$. The index $i$ on the regionalized PWM indicates the site for which the weighted PWM is calculated. From the weighted PWM values for each ROI, the three parameters of the GEV distribution can be estimated via [Hosking et al., 1985]

$$c = (2T_1^i - 1)/(3T_2^i - 1) - \log(2)/\log(3) \tag{19}$$

$$g = 7.859c + 2.955c^2 \tag{20}$$

$$\alpha = (2T_1^i - 1)\, g / \{ \Gamma(1 + g)(1 - 2^{-g}) \} \tag{21}$$

$$\xi = 1 + \alpha \{ \Gamma(1 + g) - 1 \} / g \tag{22}$$

where $\Gamma(\cdot)$ signifies the gamma function and each parameter should be regarded as having an index $i$ associated with it referring to the station for which the set of parameters was calculated. Hosking et al. [1985] indicate that the above equations provide satisfactory parameter estimates for $-0.5 \le g \le 0.5$. A dimensionless growth curve can then be estimated from

$$x_T^i = \xi + \frac{\alpha}{g} \left\{ 1 - \left[ -\log\left( 1 - \frac{1}{T} \right) \right]^g \right\} \tag{23}$$

where $x_T^i$ is the estimate of the dimensionless T-year flow value for site $i$. An estimate of the T-year event for any site can be obtained from
$$X_T^i = M^i x_T^i \tag{24}$$

where $X_T^i$ is the T-year flow at site $i$, and $M^i$ is the mean annual flood for site $i$.

MONTE CARLO EXPERIMENT

Experimental Design

A common problem in evaluating methods for estimating flood quantiles is that the available data record represents only one realization of what could be regarded as a stochastic flood generation process. The "true" value for any flood quantile at a particular location is therefore inherently unknowable. Thus one must often resort to Monte Carlo sampling to evaluate the relative performance of different flood estimators. A disadvantage of this approach arises from the need to specify the form and parameters of the parent extreme flow distribution for all sites. To avoid arbitrariness in this process, it is essential that the selected parent distributions be representative of conditions that could occur (see, for example, Lettenmaier et al. [1987]). One possible approach to selecting realistic parent distribution function characteristics is to allow the available data record to suggest parent distributions for each station under consideration. This approach, previously used by Burn [1988], involves calculating distribution parameters for every site, and setting the "parent" parameters equal to the calculated values. Sample sets of annual flow data are then generated for each site using the assumed parent parameter values and the length of record observed at the site. Generated data sets for all sites are then used to evaluate the performance of the ROI options outlined in the previous section and to compare this approach with a traditional regionalization procedure involving fixed regions and also with results from using all available stations.

The setting up of the Monte Carlo experiment involves the following steps: (1) Choose a set of gauging stations, and for each station in the network, estimate the at-site parameters for the GEV distribution; these parameter values will define the parent distribution for the station. (2) Select a set of attributes to define station similarity following the procedure outlined below. (3) Determine parameter values for each of the ROI options based on the station similarity measure and the emphasis of the particular option. (4) Determine characteristics of the traditional regionalization approach following the approach of Burn [1989].

Data Set Network Description

The data set used to evaluate the ROI technique consisted of 45 gauging stations located on natural rivers in southern Manitoba. The drainage area for the sites included in the network ranged from 46 to 4200 km^2 with a median value of 414 km^2. The number of years of record at the gauging stations ranged from 20 to 42 years with a median value of 25 years. Further information on the stations used to define the parent distributions for the network of stations is summarized in Table 1.

Attribute Selection

All of the attributes considered as candidates for inclusion in the distance metric must be readily available for all stations in the data network. As previously noted, all of the attributes should be related to the extreme flow response at the station, but, as well, reliable estimates of the attribute values should be obtainable from the available data base which comprises the annual flow record and limited information describing physical features of the contributing drainage area. The set of candidate attributes consisted of the following: (1) the coefficient of variation (CV) of the annual flow series, (2) a plotting position estimate of the 10-year flow event (Q10) interpolated from the available annual flow series, (3) a variation on the Pearson skewness (PS) measure defined as

$$PS = \frac{\mu - m}{\sigma} \tag{25}$$

where $\mu$ is the mean, $m$ is the median, and $\sigma$ is the standard deviation, of the annual flow series, (4) the skewness coefficient (SK) of the annual flow series with a bias correction for data set length [Kite, 1977], and (5) the drainage area (DA) of the basin contributing to the flow at the gauging station.

From the candidate attributes, a reduced set of attributes was identified by comparing the attributes with an at-site estimate of the 100-year flow event obtained from the annual flow series assuming the GEV distribution. Two pieces of information are required from the selection process. The set of attributes to include must be identified and a relative importance must be assigned for each attribute. To accomplish both of these tasks, the correlation between the 100-year event and each attribute was calculated, which led to the selection of CV, Q10, and PS as the attributes. The selected weighting values corresponded to the observed correlation between the attribute and the extreme flow estimate. In addition to calculating correlations, the three selected attributes were confirmed as desirable measures by plotting the attribute values versus the extreme flow estimates. The resulting plots indicated essentially linear relationships for each of the selected attributes and were also useful for identifying "unusual" stations which showed up as outliers on the scatter diagrams.

Parameters for ROI Options

The parameters for the ROI options are selected considering the philosophy incorporated into the individual options and the characteristics of the gauging stations that make up the data network. An important part of the basis for choosing parameter values is the matrix of distance metric values which contains the weighted distance from every station to every other station. While the diagonal elements of this matrix are zero, the terms above the diagonal (or the terms below the diagonal) include all observed nonzero distance values. The elements above the diagonal were sorted by magnitude, resulting in something analogous to a distribution for weighted distance values between station pairs. Thus one could determine the median distance, the largest or smallest distance, or a particular percentile of the distance values (e.g., the 90th percentile distance value would imply that only 10% of the distance values were greater than the selected value). Selected percentiles of the sorted distance values were used as a guideline for selecting threshold values for the ROI options. The chosen percentiles acted as a
guideline only because intuitively the threshold values should correspond to breakpoints in the array of distance values. The procedure was thus to choose a particular percentile of the distance distribution and then look for a gap in the distribution close to the selected percentile. Although an element of judgment is required to select the percentile values to use for a particular threshold, preliminary investigations have indicated that the methodology is not overly sensitive to the values selected for the thresholds.

For the data set examined, the lower threshold parameter, $\theta_L$, was set equal to the 25th percentile while the 75th percentile was selected for the upper threshold value, $\theta_U$. The target number of stations, $NST$, for an ROI was set at 15 (one third of the available stations). The percentiles and target values selected reflect the diversity of the stations that constitute the network for this data set. The parameter values for the weighting function, $\eta_{ij}$, were chosen considering the modeling approach taken with the individual options. Option 1 involves an ROI containing only those sites that are reasonably similar to the target station so the weighting function values should be substantially different from zero even at the upper threshold. To incorporate this behavior, the value of $n$ was set to 2.5 and $TP$ was given a value corresponding to the 85th percentile of the distance value distribution. In contrast, the weighting functions for options 2 and 3 should give comparatively low weights to stations at the threshold since both of these options entail including a large number of stations in the ROI. As such, the value of $n$ was set to 0.10 and $TPP$ was also set to the 85th percentile of the distance value distribution.

TABLE 1. Characteristics of Sites Used in the Monte Carlo Experiment

Station   Mean Flow,   Coefficient    Skewness      Drainage
Number    m^3/s        of Variation   Coefficient   Area, km^2
 1        37.9         0.757          2.83          1580.
 2        46.5         0.830          3.38           697.
 3         5.5         0.949          4.97            78.
 4        11.6         0.846          1.27           165.
 5        22.5         0.822          1.38           598.
 6        38.8         0.969          3.52           344.
 7        31.6         0.877          2.92           837.
 8        21.3         0.676          1.73          2000.
 9        49.6         0.803          3.27           974.
10        59.4         0.713          1.41          2110.
11        10.1         0.865          0.861          251.
12        43.1         0.722          1.09          1480.
13        17.9         0.786          1.40           471.
14        55.2         0.877          2.05          1140.
15        57.6         0.716          0.298         1900.
16         7.1         0.868          0.995          262.
17         8.0         0.872          1.75            73.
18        87.9         0.764          0.926         4220.
19         4.4         0.823          1.91           156.
20        18.3         0.776          1.18          1120.
21         2.9         0.911          2.66            88.
22        14.6         0.906          1.64           635.
23        39.5         0.929          1.23          1050.
24         6.8         1.00           1.43           153.
25        22.3         0.880          1.81           394.
26         2.2         0.883          1.38           399.
27         2.7         0.982          1.37            46.
28        18.2         1.04           1.20           453.
29        13.3         0.968          1.54          1060.
30        15.3         0.804          1.19           212.
31        28.0         0.715          1.50           572.
32         7.5         1.02           1.54           107.
33         2.5         0.996          1.70            75.
34        24.1         1.21           1.97          3210.
35         6.2         1.11           2.78           871.
36        17.6         1.22           2.15           389.
37         6.2         1.50           4.69           290.
38         6.1         1.15           2.76           132.
39        21.3         1.20           1.75           168.
40        11.2         1.23           2.52          1180.
41         8.5         1.28           3.00           262.
42         3.6         1.02           1.18            50.
43         2.0         1.09           2.00           370.
44         6.8         1.20           2.17           979.
45         6.3         1.03           1.70           414.

Parameters for Identifying Fixed Regions

The traditional regionalization approach, used as one benchmark for comparing the performance of the ROI options, was based on clustering in the three-dimensional attribute space used to define the distance metric for the ROI approach. Within the clustering approach, several pieces of
information are required [Burn, 1989]. Each attribute forming part of the distance metric requires a weight, reflecting the relative importance of the attribute in defining basin similarity. The correlation coefficient between each attribute and the 100-year event was used as a weighting value in a similar approach to that used to assign relative importance to the attributes with the ROI technique. The number of regions for the 45 stations in the network was set at three following results from Burn [1989]. The division of the stations based on attribute values derived from the at-site parent distribution function resulted in three regions which each passed the regional homogeneity test described by Wiltshire [1986], indicating that the traditional regionalization approach represents a reasonable data partitioning. To further evaluate the fixed regions, a convenient, but simple, measure of regional heterogeneity is the normalized regional range in the CV values [Lettenmaier et al., 1987] defined as

$$R^*(CV) = \frac{R(CV)}{M(CV)} \tag{26}$$

where $R(CV)$ is the range of CV values for the region and $M(CV)$ is the median CV value for the region. Calculating the normalized regional range for the three parent regions resulted in values of 0.356, 0.351, and 0.401 for regions 1, 2, and 3, respectively. The stations are listed in Table 1 according to parent region membership with the first 19 stations constituting region 1, the next 14 comprising region 2, and the final 12 stations making up region 3. The normalized regional range for the entire network of 45 stations is 0.909, indicating again that the regionalization process yields a reasonable partitioning of the stations.

TABLE 2. Performance Measures and Standard Errors (in Parentheses)

                      Return Period, years
Option    25          50          100         200

RMSE
1         0.103       0.143       0.189       0.241
          (0.00203)   (0.00246)   (0.00335)   (0.00479)
2         0.097       0.138       0.185       0.240
          (0.00165)   (0.00227)   (0.00334)   (0.00488)
3         0.089       0.123       0.161       0.203
          (0.00117)   (0.00186)   (0.00255)   (0.00349)
R         0.125       0.179       0.240       0.308
          (0.00191)   (0.00279)   (0.00401)   (0.00562)
R1        0.142       0.188       0.248       0.298
          (0.00075)   (0.00162)   (0.00240)   (0.00342)

BIAS
1         -0.025      -0.031      -0.035      -0.036
          (0.00062)   (0.00089)   (0.00119)   (0.00152)
2         -0.014      -0.015      -0.014      -0.010
          (0.00060)   (0.00089)   (0.00121)   (0.00156)
3         -0.006      -0.004      0.000       0.006
          (0.00062)   (0.00094)   (0.00130)   (0.00168)
R         -0.015      -0.015      -0.010      -0.001
          (0.00059)   (0.00089)   (0.00123)   (0.00161)
R1        0.000       0.005       0.012       0.023
          (0.00062)   (0.00095)   (0.00132)   (0.00172)
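The normalized regional range of (26) can be recomputed from the CV column of Table 1, using the stated region membership (stations 1-19, 20-33, and 34-45). The short Python sketch below does this. The variable and function names are illustrative only, and the CV values are the rounded figures printed in Table 1, so small differences from the reported values (on the order of 0.002) are to be expected.

```python
from statistics import median

# Coefficient of variation for stations 1-45, in the order listed in Table 1.
CV = [
    0.757, 0.830, 0.949, 0.846, 0.822, 0.969, 0.877, 0.676, 0.803, 0.713,
    0.865, 0.722, 0.786, 0.877, 0.716, 0.868, 0.872, 0.764, 0.823,        # region 1
    0.776, 0.911, 0.906, 0.929, 1.00, 0.880, 0.883, 0.982, 1.04, 0.968,
    0.804, 0.715, 1.02, 0.996,                                            # region 2
    1.21, 1.11, 1.22, 1.50, 1.15, 1.20, 1.23, 1.28, 1.02, 1.09, 1.20,
    1.03,                                                                 # region 3
]

REGIONS = {1: CV[:19], 2: CV[19:33], 3: CV[33:]}


def normalized_range(values):
    """R*(CV) of (26): range of the CV values divided by their median."""
    return (max(values) - min(values)) / median(values)


for region, cvs in REGIONS.items():
    print(f"region {region}: R*(CV) = {normalized_range(cvs):.3f}")
print(f"network:  R*(CV) = {normalized_range(CV):.3f}")
```

Each fixed region gives a value near 0.35 to 0.40, against roughly 0.91 for the pooled network, which is the contrast the text uses to argue that the three-region partitioning is reasonable.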

Simulation of Data Sets

With the characteristics of the various regionalization options defined, it is possible to generate data sets and evaluate the approaches. To accomplish this, 1000 samples of extreme flow data for each site were generated in accordance with the parent parameter values and the number of years of annual flow data at the site. For each of the 1000 samples, it was necessary to complete the following steps: (1) Define the region of influence and weighting function values for each station and each option in accordance with the procedures outlined above. (2) Determine at-site estimates for extreme flow quantiles for every site and compare with theoretical values. (3) Determine traditional fixed regions and estimate at-site extremes for every site based on regional growth curves and compare with theoretical values. (4) Estimate at-site extremes using all 45 stations in the network and again compare with the theoretical values.

PRESENTATION OF RESULTS

The relative performance of the region of influence options outlined above was evaluated in terms of measures of the accuracy and precision of the estimates of at-site extreme flow quantiles. In addition, the performance of the ROI options was compared with the performance of the traditional regional estimator based on clustering and the estimation using all available stations, which is a regional estimator with one region. The regional estimators involved including all the stations in each region for the at-site estimation of extremes for every site in the region. All stations were assigned weighting values, $\eta_{ij}$, of unity for the combination of at-site PWMs through (18). All of the options were evaluated in terms of root mean squared error (RMSE) and bias as defined through

$$\mathrm{RMSE}_T = \frac{1}{NS} \sum_{k=1}^{NS} \left[ \frac{1}{N} \sum_{j=1}^{N} \left( \frac{\hat{Q}_{jk}^T - Q_k^T}{Q_k^T} \right)^2 \right]^{1/2} \tag{27}$$

$$\mathrm{BIAS}_T = \frac{1}{NS} \sum_{k=1}^{NS} \frac{1}{N} \sum_{j=1}^{N} \frac{\hat{Q}_{jk}^T - Q_k^T}{Q_k^T} \tag{28}$$

where $\mathrm{RMSE}_T$ is the root mean squared error for return period $T$, $NS$ is the number of sites in the data set, $N$ is the number of Monte Carlo samples, $\hat{Q}_{jk}^T$ is the estimate for the T-year event at site $k$ for sample $j$, $Q_k^T$ is the theoretical value for the T-year event at site $k$, and $\mathrm{BIAS}_T$ is the average bias for return period $T$. The two performance measures were calculated for each of the alternatives with the results summarized in Table 2, along with standard errors for each performance measure estimate.

The results in Table 2 indicate that the ROI options are uniformly better than the regional estimators in terms of RMSE. In comparing ROI options, the close agreement between options 1 and 2, particularly on the RMSE measure, is interesting in that the two options represent very different philosophies for defining a region of influence. In option 1, only those sites that are relatively close in attribute space to the candidate site are included in the ROI, whereas option 2 seeks to include many sites with correspondingly reduced weightings for the more remote sites. The results in Table 2
also illustratethe effect of includingall stationsin the ROI 0.4


for each site as is done for option 3. As could be expected,
the RMSE for this option is the best of all alternatives
investigatedas a resultof the efficientutilizationof informa- 0.3
tion from all available stations. In terms of the bias measure,
the differencesbetween the three ROI options are not as
dramatic,althoughoption3 wouldappearto be thebestin 02
termsof averagebias. •
The traditional approachof usingfixed regions,labeled as =,
option
R in Table2, results
in comparable
performance
to X 0.1
the ROI options in terms of average bias, but substantially •
poorer results in terms of RMSE. The network average •
results
implythatthestations
identified
using
theclustering• 00
technique represent reasonably homogeneousgroupings, as z
evidenced by the bias results that are not dramatically v'
r-r -0.1
different from the ROI options (especially options 1 and 2). •
However, the RMSE performancefor this alternativeis •_
muchpoorerthantheresults
fortheROI options
implying -0.2
that additional information can be productively used in the
estimation of the at-site extremes for the sites in the net-
work. The regional option involving including all sites in the -0.3

estimation of at-site extremes, labeled as R1 in Table 2,


yields favorable bias results but very poor results in terms of
RMSE. This canbe explainedby the fact that the entireset -0.4
of 45 stations does not constitute a homogeneousregion (as
evidenced by the value for the heterogeneity measure, for Fig. 2. Box plots (5, 25, 50, 75, and 95 percentiles) for bias and
the entire network, given above). Although this option RMSE for the 50-year return period event. Plots are included for
involves the inclusion of as many sites as any of the other (from left to right) ROI options 1, 2, and 3, as well as regional
options, the information included for estimation of at-site options R and R I.
extremes does not constitute sufficiently similar information
to result in an efficient estimation. It is noted that the relative
performancefor the R1 option improves for the longer return period events indicating that even dissimilar information can
enhancethe at-site estimation of the more extreme quantiles.
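The network-average measures reported in Table 2, and the percentiles of the site measures summarized in the box plots, can be sketched in a few lines of Python. This is a minimal illustration rather than the computation used in the study: it assumes that errors are expressed relative to the theoretical quantile and that the RMSE is formed at each site before averaging over the network. The function names (`site_measures`, `network_summary`, `box_plot_percentiles`) and the toy numbers are hypothetical.

```python
import math
import statistics


def site_measures(estimates, theoretical):
    """Return (bias, rmse) at one site for one return period.

    estimates   -- Monte Carlo quantile estimates at the site (N samples)
    theoretical -- theoretical quantile at the site
    Assumption: errors are normalized by the theoretical quantile.
    """
    rel_err = [(q_hat - theoretical) / theoretical for q_hat in estimates]
    bias = sum(rel_err) / len(rel_err)
    rmse = math.sqrt(sum(e * e for e in rel_err) / len(rel_err))
    return bias, rmse


def network_summary(estimates_by_site, theoretical_by_site):
    """Average the per-site measures over all NS sites in the network."""
    per_site = [
        site_measures(est, q)
        for est, q in zip(estimates_by_site, theoretical_by_site)
    ]
    biases = [b for b, _ in per_site]
    rmses = [r for _, r in per_site]
    return {
        "bias": sum(biases) / len(biases),
        "rmse": sum(rmses) / len(rmses),
        "site_biases": biases,
        "site_rmses": rmses,
    }


def box_plot_percentiles(values):
    """5, 25, 50, 75, and 95 percentiles of a set of site measures."""
    # n=20 yields cut points at 5%, 10%, ..., 95% (needs >= 2 values).
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return {5: cuts[0], 25: cuts[4], 50: cuts[9], 75: cuts[14], 95: cuts[18]}


# Toy network: one site overestimates by 30%, the other underestimates by 30%.
summary = network_summary([[130.0, 130.0], [70.0, 70.0]], [100.0, 100.0])
print("network bias:", summary["bias"])
print("network RMSE:", summary["rmse"])
print("bias percentiles:", box_plot_percentiles(summary["site_biases"]))
```

In the toy data the two site biases cancel, so the network-average bias is essentially zero while the network-average RMSE remains near 0.3. The spread of the site measures carries information that the averages do not.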
Fig. 1. Box plots (5, 25, 50, 75, and 95 percentiles) for bias and RMSE for the 25-year return period event. Plots are included for (from left to right) ROI options 1, 2, and 3, as well as regional options R and R1.

While Table 2 summarizes average performance measures for the network of gauging stations, it is also beneficial to consider the performance of the estimators at individual sites. This is particularly appropriate since average performance measures can tend to mask very poor performance at individual sites, especially for bias values where the combination of large negative and large positive biases can result in an average bias of near zero. To examine this characteristic, box plots of the performance measures for the 25-, 50-, 100-, and 200-year events are presented in Figures 1 to 4, respectively. Each figure presents a box plot for the five options giving the 5, 25, 50, 75, and 95 percentiles of the distribution of site performance measures for bias and RMSE. A perusal of Figures 1-4 reveals that there are indeed differences in the performance for the various options that were not apparent from the average values in Table 2. In terms of variability of performance measure, options 1 and R are quite similar. This is consistent with the nature of these two options which both involve including a limited number of stations, with a correspondingly high degree of similarity, in the at-site estimation of extremes. Options 2 and 3 result in somewhat more variability in bias and RMSE as could be expected from the fact that these two options involve including information from stations with lesser degrees of similarity. Option R1 can be seen to result in extreme variability in performance measure response across the sites in the network. Although option R1 is a better option for some sites than any other option is, it is also much worse than all of the options for some of the other sites. This behavior can be explained by considering the nature of this option. Since all stations are
included in the at-site extreme flow estimation for all other sites, it can be expected that this option will perform relatively well for sites which are essentially average sites, in terms of parent distribution characteristics. These sites will benefit from incorporating information from many other sites because few sites will be dramatically different from the site in question. At the other extreme, we have sites which are atypical of the entire network, such as the sites at the top or bottom portion of Table 1. In this case, including all of the sites will be clearly inappropriate and large biases and RMSEs can be expected for these sites.

Fig. 3. Box plots (5, 25, 50, 75, and 95 percentiles) for bias and RMSE for the 100-year return period event. Plots are included for (from left to right) ROI options 1, 2, and 3, as well as regional options R and R1.

From the results presented above, it is clear that the ROI approach is preferred to the regional approaches. To distinguish between the ROI options, both average performance and the performance across all sites in the network should be considered. The performance of options 1 and 2 is quite similar and clearly inferior to the performance for option 3, in terms of network average performance measures. Although option 3 exhibits somewhat more variability in performance measure response, especially for bias, the differences are not dramatic. Option 3 also gives the lowest average bias and lowest median bias of the ROI options. Thus option 3 should be selected as the preferred ROI option due to its superior performance for the conditions examined herein.

Fig. 4. Box plots (5, 25, 50, 75, and 95 percentiles) for bias and RMSE for the 200-year return period event. Plots are included for (from left to right) ROI options 1, 2, and 3, as well as regional options R and R1.

DISCUSSION

There are several avenues for further research that could be pursued to expand and refine the methodology outlined herein. An obvious and fairly natural extension would be to consider the case of ungauged sites. For the estimation of at-site quantiles in this case, it would be necessary to identify a region of influence consisting of a set of gauged stations with similar attributes to the ungauged site. Since the site of interest is ungauged, it would be necessary to define the attributes in the similarity measure based solely on physical features of the contributing drainage area and other available information such as precipitation inputs. A possible refinement of the basic technique would be to consider a more flexible region of influence such that the stations in the ROI and/or the weightings are a function of the return period of interest. For less frequently occurring events, it may be advantageous to relax the criterion for including stations in the ROI and to alter the weighting function as well.

CONCLUSIONS

The main findings of this research can be summarized as follows:

1. The region of influence approach is a versatile and efficient mechanism for combining extreme flow information. The technique can result in improved quantile estimates on a network average basis and also at individual stations which may be characterized as difficult sites.

2. A seemingly robust approach to the ROI technique involves including a fairly large number of stations in each site's region of influence. The weighting values associated with such a set of stations can be defined to reflect the lack of similarity for some of the stations included in the ROI.

3. The technique would appear to be fairly easily extended to the case of ungauged locations with necessary revisions to the type of attributes selected for the distance metric. The procedure for identifying appropriate attributes
would be identical to that outlined herein, but the set of candidate attributes would be limited to measures not relying on annual flow data.

REFERENCES

Acreman, M. C., Regional flood frequency analysis in the U.K.: Recent research-new ideas, report, Inst. of Hydrol., Wallingford, United Kingdom, 1987.
Acreman, M. C., and S. E. Wiltshire, Identification of regions for regional flood frequency analysis, (abstract), Eos Trans. AGU, 68(44), 1262, 1987.
Burn, D. H., Delineation of groups for regional flood frequency analysis, J. Hydrol., 104, 345-361, 1988.
Burn, D. H., Cluster analysis as applied to regional flood frequency, J. Water Resour. Plann. Manage. Div. Am. Soc. Civ. Eng., 115(5), 567-582, 1989.
Cook, B. G., P. Laut, M. P. Austin, D. N. Body, D. P. Faith, M. J. Goodspeed, and R. Srikanthan, Landscape and rainfall indices for prediction of streamflow similarities in the Hunter Valley, Australia, Water Resour. Res., 24(8), 1283-1298, 1988.
Hosking, J. R., J. R. Wallis, and E. F. Wood, Estimation of the generalized extreme-value distribution by the method of probability-weighted moments, Technometrics, 27(3), 251-261, 1985.
Kite, G. W., Frequency and Risk Analysis in Hydrology, Water Resources Publications, Littleton, Colo., 1977.
Lettenmaier, D. P., J. R. Wallis, and E. F. Wood, Effect of heterogeneity on flood frequency estimation, Water Resour. Res., 23(2), 313-323, 1987.
Potter, K. W., Research on flood frequency analysis: 1983-1986, Rev. Geophys., 25(2), 113-118, 1987.
Tasker, G. D., Comparing methods of hydrologic regionalization, Water Resour. Bull., 18(6), 965-970, 1982.
Wiltshire, S. E., Regional flood frequency analysis, I, Homogeneity statistics, Hydrol. Sci. J., 31(3), 321-333, 1986.

D. H. Burn, Department of Civil Engineering, University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2.

(Received November 15, 1989; revised May 11, 1990; accepted May 24, 1990.)
