Hydrological statistics
Mostly concerned with the statistical analysis of hydrological time series in relation to extremes, i.e. floods and droughts.
Extreme events
Want to know: What is the probability that a flood of a given size occurs? What is the size of the flood that belongs to a given design frequency? First question that must be answered is:
[Figure: annual maximum discharge Qmax (m3/s) against year, 1902 to 2002]
In this case a single probability distribution can be assumed for the maximum values.
$$ F(y) = \Pr(Y \le y) $$
$$ P(y) = \Pr(Y > y) $$
$$ T(y) = \frac{1}{P(y)} = \frac{1}{1 - F(y)} $$
Recurrence time: the average number of years between two consecutive flood events of a given size.
Note: the actual number of years between flood events of a given size is itself random.
Estimates from the ranked maxima, with i the rank of y in ascending order and n the number of maxima:
$$ F(y) = \frac{i}{n+1}, \qquad P(y) = 1 - F(y), \qquad T(y) = \frac{1}{1 - F(y)} $$

y (m3/s)     i      F(y)       P(y)       T(y) (years)
...         ...     ...        ...        ...
 9140        90    0.87379    0.12621      7.9231
 9300        91    0.88350    0.11650      8.5833
 9372        92    0.89320    0.10680      9.3636
 9413        93    0.90291    0.09709     10.3000
 9510        94    0.91262    0.08738     11.4444
 9707        95    0.92233    0.07767     12.8750
 9785        96    0.93204    0.06796     14.7143
 9850        97    0.94175    0.05825     17.1667
10274        98    0.95146    0.04854     20.6000
11100        99    0.96117    0.03883     25.7500
11365       100    0.97087    0.02913     34.3333
11931       101    0.98058    0.01942     51.5000
12280       102    0.99029    0.00971    103.0000
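The table entries can be reproduced with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and it assumes the maxima are ranked in ascending order so that rank i = n belongs to the largest flood.

```python
def empirical_probabilities(i, n):
    """Return (F, P, T) for the maximum with ascending rank i out of n."""
    F = i / (n + 1)   # estimated probability of non-exceedance
    P = 1.0 - F       # estimated probability of exceedance
    T = 1.0 / P       # recurrence time in years
    return F, P, T


def rank_maxima(maxima):
    """Pair each maximum with its ascending rank and its (F, P, T) estimates."""
    ranked = sorted(maxima)
    n = len(ranked)
    return [(y, i) + empirical_probabilities(i, n)
            for i, y in enumerate(ranked, start=1)]


# Rank 90 out of n = 102 maxima, as in the first row of the table.
F, P, T = empirical_probabilities(90, 102)
print(f"F = {F:.5f}, P = {P:.5f}, T = {T:.4f}")
# prints: F = 0.87379, P = 0.12621, T = 7.9231
```

Dividing by n + 1 rather than n keeps the largest observation at a finite recurrence time (here 103 years for rank 102).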
pdf: $$ f_Y(y) = b\, e^{-b(y-a)} \exp\left(-e^{-b(y-a)}\right) $$
cpdf: $$ F_Y(y) = \exp\left(-e^{-b(y-a)}\right) $$
Gumbel paper
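Taking the double logarithm of the cpdf:
$$ \ln[F_Y(y)] = \ln\left[\exp\left(-e^{-b(y-a)}\right)\right] = -e^{-b(y-a)} $$
$$ -\ln\{-\ln[F_Y(y)]\} = b(y-a) $$
$$ y_T = a - \frac{1}{b} \ln\{-\ln[F_Y(y)]\} $$
With $F_Y(y) = 1 - \frac{1}{T(y)}$:
$$ y_T = a - \frac{1}{b} \ln\left\{-\ln\left[1 - \frac{1}{T(y)}\right]\right\} $$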
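The quantile relation derived above can be evaluated directly. The following is a minimal sketch with hypothetical function names; the values of a and b are the ones quoted for the Rhine data set elsewhere in this lecture, and the recurrence times are chosen only for illustration.

```python
import math


def gumbel_cdf(y, a, b):
    """Gumbel cumulative distribution F_Y(y) = exp(-exp(-b (y - a)))."""
    return math.exp(-math.exp(-b * (y - a)))


def gumbel_quantile(T, a, b):
    """Flood size y_T with recurrence time T years (T > 1)."""
    F = 1.0 - 1.0 / T                      # non-exceedance probability
    return a - math.log(-math.log(F)) / b  # double logarithm inverted


a, b = 5604.0, 0.0005877  # Rhine values quoted in the lecture
for T in (10, 100, 1000):
    y_T = gumbel_quantile(T, a, b)
    # Round trip: the cdf at y_T should return 1 - 1/T.
    print(T, round(y_T), round(gumbel_cdf(y_T, a, b), 4))
```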
1. Plot the maxima $y_i$ against $-\ln\{-\ln[\frac{i}{n+1}]\}$ or, plot maximum $y_i$ vs $T(y_i)$ on special Gumbel (double logarithmic) paper.
2. Fit a straight line by eye or by regression to determine a and b.
Gumbel paper
Rhine data set
a = 5604, b = 0.0005877
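The regression variant of the Gumbel-paper fit can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, ordinary least squares of y on the reduced variate is assumed, and the sample is synthetic (generated from the quoted a and b), not the Rhine record.

```python
import math


def fit_gumbel_by_regression(maxima):
    """Fit y = a + (1/b) x by least squares, with x = -ln(-ln(i/(n+1)))."""
    ranked = sorted(maxima)
    n = len(ranked)
    x = [-math.log(-math.log(i / (n + 1))) for i in range(1, n + 1)]
    x_mean = sum(x) / n
    y_mean = sum(ranked) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, ranked))
    slope = sxy / sxx                    # slope of the line equals 1/b
    intercept = y_mean - slope * x_mean  # intercept equals a
    return intercept, 1.0 / slope


# Synthetic sample lying exactly on the Gumbel line (NOT measured data).
a_true, b_true, n = 5604.0, 0.0005877, 102
synthetic = [a_true - math.log(-math.log(i / (n + 1))) / b_true
             for i in range(1, n + 1)]
a_fit, b_fit = fit_gumbel_by_regression(synthetic)
print(round(a_fit, 3), b_fit)
```

Because the synthetic points lie exactly on the line, the fit should return the input parameters up to rounding; real maxima scatter around the line and the estimates then depend on the plotting position used.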
Method of moments
Mean and variance of a Gumbel variate Y:
$$ \mu_Y = a + \frac{0.5772}{b}, \qquad \sigma_Y^2 = \frac{\pi^2}{6 b^2} $$
1) Estimate mean $m_Y$ and variance $s_Y^2$
2) Equate these to the above expressions
3) Solve for a and b:
$$ \frac{1}{b^2} = \frac{6 s_Y^2}{\pi^2}, \qquad a = m_Y - \frac{0.5772}{b} $$
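The three steps translate into a short routine. The sketch below uses hypothetical function names and takes the sample variance with the n - 1 divisor (Python's `statistics.variance`), which is an assumption; the lecture does not say which estimator is meant.

```python
import math
import statistics

EULER_GAMMA = 0.5772  # constant as rounded in the lecture


def gumbel_moments(a, b):
    """Mean and variance of a Gumbel variate with parameters a and b."""
    return a + EULER_GAMMA / b, math.pi ** 2 / (6.0 * b ** 2)


def fit_gumbel_by_moments(m_y, s2_y):
    """Solve the moment equations for a and b."""
    b = math.pi / math.sqrt(6.0 * s2_y)  # from 1/b^2 = 6 s^2 / pi^2
    a = m_y - EULER_GAMMA / b
    return a, b


def fit_from_sample(maxima):
    """Steps 1 to 3 applied to a list of annual maxima."""
    return fit_gumbel_by_moments(statistics.mean(maxima),
                                 statistics.variance(maxima))


# Round trip with the Rhine parameters quoted in the lecture.
mean, variance = gumbel_moments(5604.0, 0.0005877)
a_mom, b_mom = fit_gumbel_by_moments(mean, variance)
print(round(mean), round(math.sqrt(variance)), round(a_mom), b_mom)
```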
[Figure: method of moments fit, vertical axis Qmax (m3/s)]
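$$ \hat{y}_T - t_{95}\, s_Y(T) \le y_T \le \hat{y}_T + t_{95}\, s_Y(T) $$
with $t_{95}$ the 95-point of Student's t-distribution and the standard error of prediction $s_Y(T)$ estimated as:
$$ s_Y^2(T) = \left[ 1 + \frac{1}{N} + \frac{(x(T) - \bar{x})^2}{\sum_{i=1}^{N} (x(T_i) - \bar{x})^2} \right] \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 $$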
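A rough sketch of how these limits might be computed is given below. Several things are assumptions rather than part of the lecture: the function name, the reading of x(T) as the regressor of a straight-line fit (for the Gumbel plot, x(T) = -ln{-ln[1 - 1/T]}), the N - 1 divisor taken from the formula above, and the value of t95, which the caller has to look up in a t-table.

```python
import math


def prediction_interval(x, y, x_new, t95):
    """Return (lower, estimate, upper) for a straight-line fit of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Sum of squared residuals between observed and fitted values.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Squared standard error of prediction at x_new.
    s2 = (1.0 + 1.0 / n + (x_new - x_mean) ** 2 / sxx) * sse / (n - 1)
    s = math.sqrt(s2)
    y_hat = intercept + slope * x_new
    return y_hat - t95 * s, y_hat, y_hat + t95 * s


# Tiny made-up example, chosen only so the arithmetic is easy to follow.
lower, estimate, upper = prediction_interval([0.0, 1.0, 2.0],
                                             [0.0, 2.0, 1.0],
                                             x_new=1.0, t95=2.0)
print(lower, estimate, upper)
```

The interval widens as x_new moves away from the mean of the regressor, which is why extrapolation to long recurrence times is so uncertain.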
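Examples:
For an event with an annual exceedance probability of 0.01 (T = 100 years), assuming independent years:
Pr(one or more flood events occur in 10 years) = 1 - Pr(no events) = $1 - 0.99^{10} = 0.0956$.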
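The same calculation for other recurrence times and planning horizons can be sketched as follows. The function name is hypothetical, and the calculation assumes that the years are independent and that the exceedance probability is the same every year.

```python
def prob_at_least_one(T, years):
    """Probability of one or more exceedances of the T-year flood in `years` years."""
    p_annual = 1.0 / T                      # annual exceedance probability
    return 1.0 - (1.0 - p_annual) ** years  # 1 - Pr(no events)


print(round(prob_at_least_one(100, 10), 4))
# prints: 0.0956
```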
[Figure: Gumbel, vertical axis Qmax (m3/s)]
$$ Q = \frac{\sum_{i=1}^{n-1} (Y_{i+1} - Y_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} $$
Lower critical area. If Q is larger than the critical value: no evidence that the data are dependent!
Rhine data set: Q = 2.071; upper boundary of the critical area at $\alpha = 0.05$: 1.618 -> no evidence for dependence
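A minimal sketch of this independence check is given below. The ratio of squared successive differences to squared deviations from the mean is commonly known as the von Neumann ratio; the function names are hypothetical, and the critical value has to be supplied by the caller since it depends on the series length and the significance level.

```python
def von_neumann_ratio(series):
    """Q = sum of squared successive differences / sum of squared deviations."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum((series[i + 1] - series[i]) ** 2 for i in range(n - 1))
    denominator = sum((value - mean) ** 2 for value in series)
    return numerator / denominator


def no_evidence_of_dependence(q, critical_value):
    """Lower critical area: Q above the critical value gives no evidence of dependence."""
    return q > critical_value


# A steadily rising series is strongly dependent and gives a small Q.
print(von_neumann_ratio([1.0, 2.0, 3.0, 4.0]))
# Values quoted for the Rhine data set in the lecture.
print(no_evidence_of_dependence(2.071, 1.618))
```

For independent data Q is expected to lie close to 2, so small values point to persistence in the series.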
[Figure: Qmax (m3/s) against year, 1920 to 2000]
There is an upper critical area for the null hypothesis that the data follow the proposed distribution.
Rhine data and proposed Gumbel distribution and 20 classes: $\chi^2 = 23.70$
Rhine data and proposed lognormal distribution and 20 classes: $\chi^2 = 14.36$
Lower boundary of the critical area for m - 1 = 19 degrees of freedom and $\alpha = 0.05$: 30.144 -> neither distribution can be rejected!
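One way such a chi-square statistic might be computed is sketched below. The lecture does not say how the 20 classes were chosen; the sketch assumes m classes of equal probability under the proposed distribution, so every class has the same expected count n/m. The function name is hypothetical, and the critical value is taken from the text rather than computed.

```python
def chi_square_statistic(sample, cdf, m):
    """Chi-square statistic for m equiprobable classes under the proposed cdf."""
    n = len(sample)
    expected = n / m                     # same expected count in every class
    counts = [0] * m
    for y in sample:
        k = min(int(cdf(y) * m), m - 1)  # class index from the cdf value
        counts[k] += 1
    return sum((c - expected) ** 2 / expected for c in counts)


# Decision with the values quoted in the lecture (upper critical area).
critical = 30.144  # 19 degrees of freedom, alpha = 0.05
for name, chi2 in (("Gumbel", 23.70), ("lognormal", 14.36)):
    print(name, "rejected" if chi2 > critical else "not rejected")
```

A usage call would pass the annual maxima together with the fitted distribution, for example a function returning exp(-exp(-b (y - a))) for the Gumbel case.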