CIVE 445 - ENGINEERING HYDROLOGY
CHAPTER 6: FREQUENCY ANALYSIS

The term frequency analysis refers to techniques whose objective is to analyze the occurrence of hydrologic variables within a statistical framework.
Frequency analysis can be used with rainfall or runoff data.
In engineering hydrology, frequency analysis is used to calculate flood discharges.
Frequency analysis is most often applied to large catchments, because these are more likely to be gaged and to have longer periods of record.
For ungaged catchments, frequency analysis can be used in a regional context for hydrologically homogeneous regions.

Given n years of daily streamflow records for stream S, what is the maximum flow Q that is likely to recur with a frequency of once in T years on the average?
What is the maximum flow Q associated with a T-yr return period?
What is the return period T associated with a maximum flow Q?

6.1 CONCEPTS OF STATISTICS AND PROBABILITY

A random variable follows a certain probability distribution.
A probability distribution expresses in mathematical terms the relative chance of occurrence of each of all possible outcomes of the random variable.
An example of a random variable and its probability distribution is shown in Fig. 6-1.
A cumulative discrete distribution is shown in Fig. 6-2.

Properties of statistical distributions

The properties of statistical distributions are described by the following measures:

central tendency (first moment)
variability (second moment)
skewness (third moment)

The first moment is the arithmetic mean, which expresses the distance from the origin to the centroid of the distribution.

x_m = (1/n) Σ x_i

The mean is shown in Fig. 6-3 (a).
The median divides the probability distribution into two equal portions (or areas).
The median is shown in Fig. 6-3 (b).
The mode is the value that occurs most frequently.
The mode is shown in Fig. 6-3 (c).
The variance is:

s² = [1/(n-1)] Σ (x_i - x_m)²

The standard deviation s is the square root of the variance.
The standard deviation is shown in Fig. 6-3 (d).
The variance coefficient (or coefficient of variation) is:

C_v = s/(x_m)

The skewness is:

a = { n / [(n-1) (n-2)] } Σ (x_i - x_m)³

The skew coefficient is:

C_s = a/ s³

For symmetrical distributions, the skewness is zero, and C_s = 0.
For right skewness (tail to the right) C_s > 0.
For left skewness (tail to the left) C_s < 0.
The skew coefficient is shown in Fig. 6-3 (e).
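
The measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name sample_moments and the sample values are made up, and the skewness uses the factor n/[(n-1)(n-2)].

```python
import math

def sample_moments(x):
    """Return mean, standard deviation, C_v and C_s of a sample (hypothetical helper)."""
    n = len(x)
    if n < 3:
        raise ValueError("at least three values are needed for the skew coefficient")
    x_m = sum(x) / n                                   # first moment: arithmetic mean
    var = sum((xi - x_m) ** 2 for xi in x) / (n - 1)   # second moment: variance
    s = math.sqrt(var)                                 # standard deviation
    a = n / ((n - 1) * (n - 2)) * sum((xi - x_m) ** 3 for xi in x)  # skewness
    return {"mean": x_m, "std": s, "cv": s / x_m, "skew": a / s ** 3}

# Made-up annual peak flows, for illustration only
stats = sample_moments([120.0, 95.0, 210.0, 150.0, 430.0, 180.0])
print(stats)
```

A long right tail (one or two very large floods) shows up as a positive skew coefficient.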

Continuous probability distributions

The normal distribution has two parameters: (1) mean μ and (2) standard deviation σ.
The PDF of the normal distribution is:

f(x) = {1/[σ (2π)^(1/2)]} e^{-(x - μ)²/(2σ²)}

By means of the transformation:

z = (x - μ) / σ

the normal distribution can be converted into a one-parameter distribution:

f(z) = [1/(2π)^(1/2)] e^{-z²/2}

z is the standard unit or frequency factor in the following:

x = μ + z σ

Integration of the PDF leads to the cumulative distribution function (CDF):

F(z) = [1/(2π)^(1/2)] ∫ e^{-u²/2} du

between the limits of -∞ and z.
Table of values of F(z) as a function of z.
Example 6-2.
Solution.
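
Where a table of F(z) is not at hand, the integral can be evaluated with the error function, since F(z) = 0.5 [1 + erf(z / 2^(1/2))]. The sketch below assumes only the Python standard library; the function names are illustrative.

```python
import math

def normal_cdf(z):
    """F(z): probability of nonexceedence of the standard unit z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def nonexceedence_probability(x, mu, sigma):
    """Convert x to the standard unit z = (x - mu)/sigma, then evaluate F(z)."""
    z = (x - mu) / sigma
    return normal_cdf(z)

print(normal_cdf(0.0))   # 0.5, by symmetry
print(normal_cdf(1.0))   # about 0.8413
```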

Lognormal distribution
The lognormal distribution substitutes y = ln (x) in the equation for the normal distribution.
The parameters of the lognormal distribution are the mean and the standard deviation of y: μ_y and σ_y.

Gamma distribution
The gamma distribution is used in many applications of engineering hydrology.
See Equations.

Pearson distributions
The Pearson distributions are used in many applications of engineering hydrology.
See Equations.

Extreme value distributions
These distributions (Type I, II, and III) are based on the theory of extreme values.
Extreme value theory implies that if a random variable Q is a maximum in a sample of size n from some population of x values, then provided n is sufficiently large, the distribution of Q is one of three asymptotic types, depending on the distribution of x.
The extreme value distributions can be combined and expressed as the Generalized Extreme Value (GEV), used in the UK and Europe.
The CDF of the GEV distribution is:

F(x) = e^{-[1 - k(x - u)/α]^(1/k)}

in which k, u, and α are parameters.
For k = 0, the distribution reduces to the Type I (Gumbel).
For k less than 0, the distribution reduces to Type II (Frechet).
For k greater than 0, the distribution reduces to Type III (Weibull).
Gumbel has fitted the extreme value Type I distribution to long records of river flow from many countries.
The CDF of the Gumbel distribution is the double exponential:

F(x) = e ^{- e^-y}

in which y = (x - u)/α is the Gumbel (reduced) variate.
The mean and standard deviation of the Gumbel variate are functions of record length, as shown in Table A-8.
When the record length n approaches ∞, the mean approaches the value of the Euler constant (0.5772) and the standard deviation approaches the value π/6^(1/2), or about 1.2825.
The skew coefficient of the Gumbel distribution is 1.14.
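
The GEV CDF and its Gumbel limit can be sketched as follows. This is an illustration under the sign convention used above (k < 0 for Type II, k > 0 for Type III); the function name gev_cdf and the handling of values outside the support are assumptions of this sketch.

```python
import math

def gev_cdf(x, u, alpha, k):
    """Probability of nonexceedence F(x) for the GEV with parameters u, alpha, k."""
    if abs(k) < 1e-12:
        # k = 0: Type I (Gumbel), the double exponential in the reduced variate y
        y = (x - u) / alpha
        return math.exp(-math.exp(-y))
    t = 1.0 - k * (x - u) / alpha
    if t <= 0.0:
        # Outside the support: above the upper bound for k > 0 (Type III),
        # below the lower bound for k < 0 (Type II)
        return 1.0 if k > 0 else 0.0
    return math.exp(-t ** (1.0 / k))

# For very small k the GEV value is close to the Gumbel value
print(gev_cdf(1.0, 0.0, 1.0, 0.0), gev_cdf(1.0, 0.0, 1.0, 1e-6))
```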

6.2 FLOOD FREQUENCY ANALYSIS

Selection of data series

The complete record of streamflows at a given station is called the complete data series.
To perform a flood frequency analysis, it is necessary to extract the flood series.
There are two types of flood series:

The partial duration series: Consists of floods whose magnitude is greater than a certain value (Peaks-Over-Threshold or POT).
The extreme value series: Consists of the series of annual maxima.

When the number of floods in the partial duration series is equal to the record length in years, the series is called the annual exceedence series.
The difference between the annual exceedence and annual maxima series is marked for short records.
The annual exceedence series is used for record lengths of less than 10 yr.
The annual maxima series is used for record lengths of more than 10 yr.
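
The three series can be extracted from a list of flood peaks as sketched below. The input format, a list of (year, peak flow) pairs for independent flood events, and the function names are assumptions of this illustration; the data are made up.

```python
def annual_maxima(peaks):
    """Extreme value series: the largest peak of each year, in year order."""
    best = {}
    for year, q in peaks:
        if year not in best or q > best[year]:
            best[year] = q
    return [best[year] for year in sorted(best)]

def partial_duration(peaks, threshold):
    """Partial duration (POT) series: all peaks above the threshold, largest first."""
    return sorted((q for _, q in peaks if q > threshold), reverse=True)

def annual_exceedence(peaks):
    """Annual exceedence series: the n largest peaks, with n = years of record."""
    n_years = len({year for year, _ in peaks})
    return sorted((q for _, q in peaks), reverse=True)[:n_years]

# Made-up record: two floods in 2001 and 2003, one small flood in 2002
peaks = [(2001, 50), (2001, 80), (2002, 30), (2003, 60), (2003, 70)]
print(annual_maxima(peaks))      # [80, 30, 70]
print(annual_exceedence(peaks))  # [80, 70, 60]
```

In this short record the second flood of 2003 displaces the 2002 maximum, which is why the two series differ.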

Return period, frequency, and risk

The return period is the average time elapsed between successive peak flows exceeding a certain flow Q.
The relationship between probability of exceedence P(Q) and return period T is:

P(Q)= 1/T

The terms frequency and return period are used interchangeably, although strictly speaking, frequency is the reciprocal of return period.
A frequency of 1/T, or once in T years, corresponds to a return period of T years.
The probability of nonexceedence P(Q)_bar is the complementary probability of the probability of exceedence, defined as:

P(Q)_bar = 1 - P(Q) = 1 - (1/T)

The probability of nonexceedence P(Q)_bar in n successive years is:

P(Q)_bar = [1 - (1/T)]ⁿ

The probability or risk R that Q will be exceeded at least once in n successive years is:

R = 1 - P(Q)_bar = 1 - [1 - (1/T)]ⁿ
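
A short numerical sketch of the risk formula and its inverse follows. The function names and the 100-yr, 50-yr example are illustrative choices, not part of the original notes.

```python
def risk(T, n):
    """Risk R that the T-yr flood is exceeded at least once in n successive years."""
    return 1.0 - (1.0 - 1.0 / T) ** n

def return_period_for_risk(R, n):
    """Return period T that gives an accepted risk R over n successive years."""
    return 1.0 / (1.0 - (1.0 - R) ** (1.0 / n))

print(risk(100, 50))                   # about 0.395
print(return_period_for_risk(0.1, 50))
```

The first result shows that a structure designed for the 100-yr flood has roughly a 40 percent chance of seeing that flood exceeded during a 50-yr life.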

Plotting positions

Frequency distributions are plotted using probability papers.
One of the scales is a probability scale; the other is either arithmetic or logarithmic.
Normal and extreme value probability distributions are often used.
Arithmetic probability paper: normal probability and arithmetic scale.
Log probability paper: normal probability and log scale.
Extreme value probability paper: extreme-value probability and arithmetic scale.
Data fitting a normal distribution plot as a straight line on arithmetic probability paper.
Data fitting a lognormal distribution plot as a straight line on log probability paper.
Data fitting a log Pearson III distribution with zero skewness plot as a straight line on log probability paper.
Data fitting a log Pearson III distribution with nonzero skewness plot as a curve on log probability paper.
Data fitting a Gumbel distribution plot as a straight line on extreme-value probability paper.
For a series of n annual maxima, the following ratio holds:

x_bar / N = m / ( n + 1)

in which

x_bar = mean number of exceedences;
N = number of trials,
n= number of values in the series,
m = rank of descending values, with largest equal to 1.

For example, if n = 79, the second largest value in the series (m = 2) will be exceeded twice on the average (x_bar = 2) in 80 trials (N = 80).
The largest value in the series (m = 1) will be exceeded once on the average (x_bar = 1) in 80 trials (N = 80).
Since return period T is associated with x_bar = 1:

1 / T = P = m / ( n + 1)

This is the Weibull plotting position formula.
A general plotting position formula is:

1 / T = P = (m - a) / ( n + 1 - 2a)

The Blom formula, with a = 0.375, is appropriate for the normal distribution.
The Gringorten formula, with a = 0.44, is appropriate for the Gumbel distribution.
The Weibull formula, with a = 0, is appropriate for the uniform distribution.
Example 6-3.
Solution.
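
The general plotting position formula can be applied to a flood series as sketched below. The function name and the stand-in data are illustrative; a = 0 reproduces the Weibull formula, and other values of a can be passed for the Blom or Gringorten formulas.

```python
def plotting_positions(flows, a=0.0):
    """Return (rank m, flow, exceedence probability P, return period T) per value."""
    n = len(flows)
    ranked = sorted(flows, reverse=True)   # rank m = 1 is the largest value
    rows = []
    for m, q in enumerate(ranked, start=1):
        p = (m - a) / (n + 1 - 2 * a)      # general plotting position formula
        rows.append((m, q, p, 1.0 / p))
    return rows

# With n = 79 and the Weibull formula, the largest value plots at P = 1/80, T = 80 yr
rows = plotting_positions(list(range(1, 80)))   # stand-in values, not real flows
print(rows[0])
```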

Frequency factors

Any value of a random variable may be represented in the following form:

x = x_bar + Δ x

The departure from the mean Δx can be expressed as Δx = K s, so that:

x = x_bar + K s

where K is a frequency factor, and s is the standard deviation.

Log Pearson III Method

The Log Pearson III method of flood frequency analysis is described in Bulletin 17B: Guidelines for Determining Flood Flow Frequency, published by the U.S. Interagency Advisory Committee on Water Data, Reston, Virginia.
To apply the methodology, the following steps are necessary:

Assemble the annual flood series: x_i
Calculate the logarithms of the annual flood series: y_i = log x_i
Calculate the mean y_bar, standard deviation s_y and skew coefficient C_sy of the logarithms.
Calculate the logarithms of the flood discharges: log Q_j for each of several chosen probability levels P_j using the following frequency formula:
log Q_j = y_bar + K_j s_y
in which K_j is the frequency factor, a function of the probability P_j and the skew coefficient C_sy (Table A-6).
Calculate the flood discharges Q_j by taking the antilogarithms of log Q_j.
Plot the flood discharges Q_j against probability levels P_j on log probability paper, with discharges in the log scale. Fit the data with a smooth curve. For C_sy = 0, the curve reduces to a straight line.
Example 6-4.
Solution.
Table 6-4.
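
The steps above can be sketched in Python as follows. Two assumptions are made and should be read as such: logarithms are taken to base 10, and the frequency factor K is computed with the Wilson-Hilferty approximation as a stand-in for the lookup in Table A-6, which is reasonable only for moderate skew. The sample skew of the logarithms is used directly. Function names and flows are made up.

```python
import math
from statistics import NormalDist

def frequency_factor(p_exceed, c_s):
    """Approximate K for exceedence probability P and skew C_s (Wilson-Hilferty).

    Assumption: replaces the Table A-6 lookup; not the tabulated values.
    """
    z = NormalDist().inv_cdf(1.0 - p_exceed)   # standard normal variate
    if abs(c_s) < 1e-6:
        return z                               # zero skew: K equals z
    k = c_s / 6.0
    return (2.0 / c_s) * ((1.0 + k * z - k * k) ** 3 - 1.0)

def log_pearson_iii(flows, p_levels):
    """Flood discharge Q_j for each exceedence probability P_j."""
    y = [math.log10(q) for q in flows]                       # logarithms
    n = len(y)
    y_bar = sum(y) / n                                       # mean of the logs
    s_y = math.sqrt(sum((v - y_bar) ** 2 for v in y) / (n - 1))
    a = n / ((n - 1) * (n - 2)) * sum((v - y_bar) ** 3 for v in y)
    c_sy = a / s_y ** 3                                      # skew of the logs
    # log Q_j = y_bar + K_j s_y, then antilogarithm
    return {p: 10.0 ** (y_bar + frequency_factor(p, c_sy) * s_y) for p in p_levels}

# Made-up annual flood series, for illustration only
flows = [1200.0, 850.0, 2300.0, 1600.0, 990.0, 3100.0, 1450.0, 1900.0, 760.0, 2700.0]
print(log_pearson_iii(flows, [0.5, 0.1, 0.02, 0.01]))
```

For actual design work the tabulated K values should be used in place of the approximation.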

Gumbel's Extreme Value Type I Method

The Extreme Value Type I or Gumbel method has been widely used in the U.S. and the world.
The method is a special case of the three-parameter GEV distribution described in the British Flood Studies Report.
The cumulative distribution function (CDF) F(x) (the probability of nonexceedence) of the Gumbel method is the double exponential function:

F(x) = e^{-e^-y}

In flood frequency analysis, the probability of interest is the probability of exceedence G(x):

G(x) = 1 - F(x) = 1 - e^{-e^-y}

The return period is the reciprocal of the probability of exceedence G(x):

1/T = 1 - e^{-e^-y}

from which the Gumbel reduced variate y is:

y = - ln ln [T/(T-1)]

In the Gumbel method, values of flood discharge are obtained from the frequency formula:

x = x_bar + Ks

The frequency factor K is evaluated with the frequency formula:

y = y_bar_n + Kσ_n

in which y = Gumbel reduced variate, a function of return period;
y_bar_n and σ_n are the mean and the standard deviation of the Gumbel variate.
These values are a function of the record length (Table A-8).
In the previous equation, for K = 0, x is equal to x_bar, the mean annual flood.
Likewise, for K = 0, y is equal to y_bar_n.
The limiting value of y_bar_n, for n approaching ∞ (infinity), is the Euler constant, 0.5772.
In the relation between y and T, for y = 0.5772, T = 2.33 years.
T = 2.33 years is taken as the return period of the mean annual flood.
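
The relation between y and T, and the 2.33-yr value, can be checked numerically. The function names below are illustrative.

```python
import math

def gumbel_variate(T):
    """Gumbel reduced variate y for return period T: y = -ln ln [T/(T-1)]."""
    return -math.log(math.log(T / (T - 1.0)))

def return_period(y):
    """Return period T for reduced variate y: 1/T = 1 - exp(-exp(-y))."""
    return 1.0 / (1.0 - math.exp(-math.exp(-y)))

print(return_period(0.5772))   # about 2.33
print(gumbel_variate(100.0))   # about 4.60
```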
The final Gumbel formula is:

x = x_bar + [(y - y_bar_n)/σ_n] s

To apply the Gumbel method, the following steps are necessary:

Assemble the annual flood series: x_i
Calculate the mean x_bar and standard deviation s of the flood series.
Use Table A-8 to determine the mean y_bar_n and standard deviation σ_n of the Gumbel variate as a function of the record length n.
Select several return periods T_j and associated probabilities P_j.
Calculate the Gumbel variates y_j corresponding to the return periods T_j
Calculate the flood discharge using the previous equation.
Plot the flood discharges Q_j against y_j or T_j or P_j on Gumbel paper, with discharges in the ordinates (arithmetic) scale. The data should fit a straight line.
Example 6-6.
Solution.
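
The steps above can be sketched as follows. Since Table A-8 is not reproduced here, the sketch defaults to the limiting values for n approaching infinity (0.5772 and π/6^(1/2)); this is an assumption, and for a real record the tabulated y_bar_n and σ_n for the actual record length should be passed in. The function name and flows are made up.

```python
import math

EULER = 0.5772                          # limiting mean of the Gumbel variate
SIGMA_INF = math.pi / math.sqrt(6.0)    # limiting standard deviation, about 1.2825

def gumbel_floods(flows, return_periods, y_bar_n=EULER, sigma_n=SIGMA_INF):
    """Flood discharge for each return period by the Gumbel method.

    y_bar_n and sigma_n default to the asymptotic values (assumption);
    pass the Table A-8 values for the record length when available.
    """
    n = len(flows)
    x_bar = sum(flows) / n
    s = math.sqrt(sum((q - x_bar) ** 2 for q in flows) / (n - 1))
    result = {}
    for T in return_periods:
        y = -math.log(math.log(T / (T - 1.0)))   # Gumbel reduced variate
        K = (y - y_bar_n) / sigma_n              # frequency factor
        result[T] = x_bar + K * s                # x = x_bar + K s
    return result

# Made-up annual flood series, for illustration only
flows = [1200.0, 850.0, 2300.0, 1600.0, 990.0, 3100.0, 1450.0, 1900.0, 760.0, 2700.0]
print(gumbel_floods(flows, [2, 10, 50, 100]))
```

Plotting the results against y gives a straight line, since x is linear in y for fixed x_bar, s, y_bar_n, and σ_n.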

Comparison between flood frequency methods

In 1966, the Hydrology Subcommittee of the Water Resources Council began work on selecting a method of flood frequency analysis that could be recommended for use in the U.S.
The committee tested the following six distributions:

lognormal
log Pearson III
Hazen
gamma
Gumbel (EV1)
log Gumbel (EV2)

The committee recommended the log Pearson III method.
The same type of analysis was performed in the United Kingdom. The methods tested were:

gamma
log gamma
log normal
Gumbel (EV1)
GEV
Pearson III
log Pearson III

The committee found the GEV and log Pearson III methods to be the best.

6.3 LOW-FLOW FREQUENCY ANALYSIS

Sustained low flows can lead to droughts.
A drought is defined as a lack of rainfall so great and continuing so long as to adversely affect the plant and animal life of a region.
Drought refers to a period of unusually low water supplies, regardless of the demand.
The regions most subject to droughts are those with the greatest variability in annual rainfall.
Arid and semiarid regions are prone to recurrent droughts.
There is a tendency for droughts to last more than one year, up to five years.
There is a need to study the severity, duration, and frequency of droughts (see Characterization of drought, Link 3147).
Low-flow frequency analysis can be used in the assessment of droughts or low flow, for purposes of water supply, hydropower, water quality, and inland navigation.
The analysis of low flow is made by abstracting the minimum flows over a period of several consecutive days.
For instance, for each year, the 7-day period with the minimum flow volume is abstracted.
A frequency analysis (using, for instance, the Gumbel method) results in a function describing the probability (or return period) of a certain average low flow value lasting a certain number of consecutive days.
Figure 6-7.
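
The abstraction of the minimum d-day flow from one year of daily records can be sketched as below. The function name, the default of 7 days, and the sample values are assumptions of this illustration; repeating the calculation for each year gives the annual low-flow series to which the frequency analysis is applied.

```python
def min_d_day_flow(daily_flows, d=7):
    """Lowest average flow over any d consecutive days in the record."""
    if len(daily_flows) < d:
        raise ValueError("record is shorter than the averaging period")
    window = sum(daily_flows[:d])          # flow volume of the first d-day window
    lowest = window
    for i in range(d, len(daily_flows)):
        # Slide the window one day: add the new day, drop the oldest
        window += daily_flows[i] - daily_flows[i - d]
        lowest = min(lowest, window)
    return lowest / d

# Made-up daily flows with one week of low flow in the middle
daily = [10] * 10 + [3] * 7 + [10] * 10
print(min_d_day_flow(daily))   # 3.0
```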


040322