By Roger Daley*
Journal of the Meteorological Society of Japan, Vol. 75, No. 1B, pp.319-329, 1997 (Manuscript received 23 May 1995, in revised form 15 February 1996) *Naval Research Laboratory, 7 Grace Hopper Avenue, Monterey CA 93943-5502, USA
DEFINITION
Data assimilation is an analysis technique in which the observed information is accumulated into the model state by taking advantage of consistency constraints with laws of time evolution and physical properties.
-F. Bouttier and P. Courtier
OBJECTIVE (1)
to produce a regular, physically consistent 4 dimensional representation of the state of the atmosphere from a heterogeneous array of in situ and remote instruments which sample imperfectly and irregularly in space and time.
-Roger Daley
OBJECTIVE (2)
to provide a dynamically consistent motion picture of the atmosphere and oceans, in three space dimensions, with known error bars. -M. Ghil and P. Malanotte-Rizzoli
OBJECTIVE (3)
to produce an analysis with the minimum possible analysis error variance; in the scalar case the analysis error variance σa² satisfies 1/σa² = 1/σo² + 1/σb²
THE L2 NORM
(1)
In most meteorological practice, L2 norms are used because they lead to linear analysis equations.
L2 norm estimation yields the mean, L1 estimation gives the median, and L∞ estimation determines the mid-range.
THE L2 NORM
(2)
Example: estimation based on 5 observations. Assume that each observation has the same observation error variance σo². The 5 observation values: -22.5, 1.1, 1.2, 1.3 and 650. It seems likely that there were severe measurement problems in the first and last observations.
THE L2 NORM
The mean value is 126.22 (L2). The median value is 1.2 (L1). The mid-range value is 313.75 (L∞).
(3)
In this example, minimization with respect to the L1 norm gives the most credible estimate. The L1 norm is much superior to the L2 norm when it comes to detecting and removing gross errors.
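The three estimates in the example above are only a few lines of arithmetic; this sketch computes the L2 mean, L1 median and L∞ mid-range of the 5 observations:

```python
# Estimates of a quantity from the 5 observations above, one per norm:
# L2 -> mean, L1 -> median, L-infinity -> mid-range.
obs = [-22.5, 1.1, 1.2, 1.3, 650.0]

l2_mean = sum(obs) / len(obs)              # minimizes sum of squared residuals
l1_median = sorted(obs)[len(obs) // 2]     # minimizes sum of absolute residuals
linf_midrange = (min(obs) + max(obs)) / 2  # minimizes the maximum residual

print(l2_mean, l1_median, linf_midrange)
```

The mean (126.22) and mid-range (313.75) are dragged far from the credible cluster near 1.2 by the two gross errors, while the median is unaffected.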
In atmospheric data assimilation, there are situations where errors are not normally distributed.
THE L2 NORM
(4)
However, in meteorological practice, a quality control procedure is used prior to analysis in order to remove gross errors.
Under these circumstances, the use of an L2 norm is more justifiable.
All observing systems have their limitations, problems, and failures, resulting in the reported measurements being sometimes incorrect. Such data must be identified and rejected by the data assimilation system in order to avoid corruption of the analysis. Due to the amount of data handled this is done by automatic routines, both in the form of preprocessing and during the data assimilation stage.
-Xiang-Yu Huang and Henrik Vedel
The systems used to quality control the observational data include:
Bad reporting practice check.
Blacklist check. For stations which are found to always report erroneous data.
Gross check. Against some limits, e.g. from climatology.
Background check. Based on the deviation between the observation and the expectation based on a short-term forecast.
Buddy check. Checking against nearby observations.
Redundancy check.
-Xiang-Yu Huang and Henrik Vedel
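As a rough illustration of how two of these checks might look in an automatic routine (the function name, argument list and threshold k are hypothetical choices, not from Huang and Vedel):

```python
def quality_control(y_obs, y_background, climate_min, climate_max,
                    sigma_o, sigma_b, k=4.0):
    """Sketch of two of the checks listed above (illustrative only).

    Gross check: reject observations outside climatological limits.
    Background check: reject observations whose departure from the
    short-term forecast exceeds k times the expected spread of the
    departure, sqrt(sigma_o^2 + sigma_b^2).
    """
    if not (climate_min <= y_obs <= climate_max):
        return False                               # fails gross check
    expected_spread = (sigma_o ** 2 + sigma_b ** 2) ** 0.5
    if abs(y_obs - y_background) > k * expected_spread:
        return False                               # fails background check
    return True
```

For instance, with climatological temperature limits of -40 to 50, the observation 650 from the earlier example fails the gross check, and -22.5 passes the gross check but fails the background check against a forecast near 1.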
R = <eo(eo)T> is the observation error covariance matrix
Pb = <ef(ef)T> is the forecast error covariance matrix
These matrices have real, positive eigenvalues and the inverses exist.
The analysis error covariance Pa = < ea(ea)T> is given by [Pa]-1 = R-1 + [Pb]-1
The main diagonal elements of Pa are the analysis error variances at each grid point.
Note that the observation and forecast error covariances R = <eo(eo)T> and Pb = <ef(ef)T> correspond to σo² and σb² in the scalar case.
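A small numeric sketch of the relation [Pa]-1 = R-1 + [Pb]-1, first in the scalar case and then in matrix form (the particular covariance values are illustrative):

```python
import numpy as np

# Scalar analogue: 1/sigma_a^2 = 1/sigma_o^2 + 1/sigma_b^2.
# Combining the two variances always yields a smaller analysis variance.
sigma_o2, sigma_b2 = 1.0, 4.0
sigma_a2 = 1.0 / (1.0 / sigma_o2 + 1.0 / sigma_b2)   # 0.8

# Matrix form with illustrative 2x2 covariances:
R = np.diag([1.0, 2.0])
Pb = np.array([[4.0, 1.0],
               [1.0, 3.0]])
Pa = np.linalg.inv(np.linalg.inv(R) + np.linalg.inv(Pb))
# The main diagonal of Pa holds the analysis error variances; each is
# smaller than the corresponding diagonal entry of both R and Pb.
```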
We define the operator H(x) (forward or observation model) as that operator which maps the forecast/analyzed variables on the analysis grid to the observed variables at the observation location. e.g. The forecast variable might be the temperature and the observed variable might be the radiance seen by an orbiting radiometer. The operator H would combine both spatial interpolation together with the appropriate form of the radiative transfer equation.
Note: H(x) is frequently a non-linear operator, but it can be linearized by defining the tangent linear operator H = (∂H/∂x).
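A minimal sketch of the tangent linear idea, with a toy non-linear H standing in for, e.g., a radiative transfer calculation (the particular functions are illustrative, not from the text): for a small perturbation dx, H(x + dx) − H(x) should agree with H applied to dx.

```python
import numpy as np

# Toy non-linear observation operator H(x) and its tangent linear H.
def H(x):
    return np.array([x[0] ** 2 + x[1], np.sin(x[1])])

def H_tl(x):
    # Jacobian dH/dx evaluated at x
    return np.array([[2 * x[0], 1.0],
                     [0.0, np.cos(x[1])]])

x = np.array([1.0, 0.5])
dx = np.array([1e-6, -1e-6])

# Tangent linear test: H(x + dx) - H(x) ~= H_tl(x) @ dx for small dx
lhs = H(x + dx) - H(x)
rhs = H_tl(x) @ dx
```

The two sides differ only at second order in dx, which is the basis for using H in the analysis equations.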
This is done by incorporating linear diagnostic relations (such as geostrophic and hydrostatic balance) between two variables in the forecast error covariance Pb
OTHER METHODS
xa can be obtained by direct minimization of J using conjugate gradient or quasi-Newton procedures.
Direct minimization of J is referred to as a three-dimensional variational (3DVAR) algorithm. Some advantages of 3DVAR over OI:
Can be done over the whole domain rather than locally.
Forward model (or its transpose) can be used for a direct assimilation of satellite radiances.
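A minimal sketch of such a direct minimization, assuming the standard quadratic form of J and using plain gradient descent in place of a conjugate gradient or quasi-Newton solver (the covariances, observation operator and step size are illustrative). For a linear H the minimizer should match the closed-form OI analysis:

```python
import numpy as np

# Minimize J(x) = (x-xb)^T Pb^-1 (x-xb) + (y-Hx)^T R^-1 (y-Hx)
# by gradient descent, then compare with the closed-form OI analysis.
Pb = np.array([[2.0, 0.5],
               [0.5, 1.0]])          # forecast error covariance (illustrative)
R = np.array([[1.0]])                # observation error covariance
Hm = np.array([[1.0, 0.0]])          # observe the first state variable
xb = np.array([0.0, 0.0])            # background (forecast)
y = np.array([1.0])                  # observation

Pbi, Ri = np.linalg.inv(Pb), np.linalg.inv(R)

def grad_J(x):
    return 2 * Pbi @ (x - xb) - 2 * Hm.T @ Ri @ (y - Hm @ x)

x = xb.copy()
for _ in range(2000):                # small fixed step, enough to converge here
    x -= 0.05 * grad_J(x)

# Closed-form OI analysis for comparison:
K = Pb @ Hm.T @ np.linalg.inv(Hm @ Pb @ Hm.T + R)
xa = xb + K @ (y - Hm @ xb)
```

Note how the unobserved second variable is still corrected (through the off-diagonal term of Pb), which is exactly how background error covariances spread observational information.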
TEMPORAL ASPECTS
The analysis xa = xf + PbHT[HPbHT + R]-1[yo − H(xf)] is a combination of the observations with a forecast.
Where did this forecast (and forecast error covariance) come from?
Forecasts are produced by models from analyses at an earlier time and we will incorporate the model explicitly into the equations.
Remember:
xf = M(xa) : the forecast, produced by the model M from the previous analysis
yo − H(xf) : innovation/observation increment
xa − xf : correction/analysis increment
Pb : forecast error covariance
While the forecast xf is responsive to all the complexities of atmospheric flow simulated by the model, the forecast error covariance Pb specified in the OI and 3DVAR algorithms is completely insensitive to the flow. Modern assimilation techniques attempt to generate the analysis weights dynamically, explicitly using the model M.
One way of doing this is through an explicit evolution equation for the forecast error covariance. Defining the tangent linear model M = (∂M/∂x), and assuming that the model M is imperfect, the forecast error covariance at time tn is related to the analysis error covariance at time tn-1 by
Pb(tn) = M Pa(tn-1) MT + Qn
where Qn = <ηn(ηn)T> is the model error covariance and ηn is the model error.
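The covariance propagation step can be sketched numerically (the tangent linear model, analysis covariance and Q below are illustrative values, not from the text):

```python
import numpy as np

# Kalman-filter covariance propagation: Pb(tn) = M Pa(tn-1) M^T + Qn,
# for an illustrative two-variable tangent linear model M.
M = np.array([[1.0, 0.1],
              [0.0, 0.9]])           # tangent linear model
Pa = np.array([[0.5, 0.0],
               [0.0, 0.5]])          # analysis error covariance at t_{n-1}
Q = 0.01 * np.eye(2)                 # model error covariance Qn

Pb = M @ Pa @ M.T + Q                # forecast error covariance at t_n
# Pb is now flow-dependent through M, and the Q term accounts for the
# imperfect model by inflating the propagated variances.
```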
The non-linear form of the Kalman filter is referred to as the extended Kalman filter (EKF). The EKF is a sequential algorithm that makes use of the past and present observations.
It is necessary to invoke a moment closure assumption to produce a tractable algorithm. This closure assumption may cause the EKF to diverge from the observations if they are too sparse. The fixed-lag Kalman smoother (FLKS) is an algorithm that also makes use of future observations. The FLKS is applied to non-forecast problems such as environmental monitoring or climate studies, where it is not necessary that the forecast/analysis cycle be run in strict real time.
The 4DVAR algorithm is implementable for global forecast models. The algorithm naturally makes use of future information. It assumes that the model is perfect.
It is appropriate for a fixed time interval and does not naturally cycle in time.
To resolve this, it is necessary to incorporate the Kalman filter to compute the error covariance.
THANK YOU
MATRIX TRANSPOSE
The transpose of an m by n matrix is defined to be an n by m matrix that results from interchanging the rows and columns of the matrix. The transpose of a matrix is designated by the superscript T.
http://comp.uark.edu/~jjrencis/femur/Learning-Modules/LinearAlgebra/mtxdef/transpose/transpose.html
In the most general context, G represents the governing equations that relate the model parameters to the observed data (i.e. the governing physics).
Kalman filters are a way to take a bunch of noisy measurements of something, and perhaps also some predictions of how that something is changing, and maybe even some forces we're applying to that something, and to efficiently compute an accurate estimate of that something's true value.
Kalman filters use a weighted average to pick a point somewhere between our 72 degree guess and the 75 degree measurement.
weight = 2 / (2 + 5) ≈ 0.29
If the weight is large (approaching 1.0), we mostly trust our thermometer. If the weight is small, we mostly trust our guess and ignore the thermometer. A 29% weight means we'll trust our guess more than the thermometer, which makes sense, because we think our guess is good to 2 degrees, whereas the thermometer was only good to 5 degrees.
So we think our estimate is correct to ±1.43 degrees: we now have a guess that the temperature in the room is 72.87 degrees, ±1.43 degrees.
1.11 = (1.43 × 5) / (1.43 + 5)
So after the second measurement, we estimate that the actual temperature is 72.46 degrees, ±1.11 degrees.
If we decide to turn on the air conditioner, so that the temperature in the room starts decreasing, we can expand our calculations to include that "control" variable, and use it to update our estimates by "predicting" how much colder it'll be at each measurement.
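The first update of this temperature example can be written out directly; following the text, the 2 degree and 5 degree spreads are combined by the same weighted-average rule used above (the variable names are illustrative):

```python
# First Kalman update of the temperature example above.
guess, guess_err = 72.0, 2.0         # prior guess and its spread
meas, meas_err = 75.0, 5.0           # thermometer reading and its spread

weight = guess_err / (guess_err + meas_err)                      # 2/7, about 0.29
estimate = guess + weight * (meas - guess)                       # about 72.86
estimate_err = (guess_err * meas_err) / (guess_err + meas_err)   # about 1.43
```

The new spread (10/7 ≈ 1.43 degrees) is smaller than either the 2 degree guess or the 5 degree measurement: combining two noisy pieces of information always tightens the estimate.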
EXAMPLE OF INITIALIZATION
∇p : horizontal gradient operator applied with pressure held constant; pressure is the independent vertical coordinate.
Sp : static stability parameter for isobaric coordinates; Sp ≈ 5 x 10-4 K/Pa in the mid-troposphere. Sp is positive provided that the lapse rate is less than the dry adiabatic lapse rate.
The LHS of the equations has been linearized about a state of rest with a domain-averaged temperature field.
INITIALIZATION
Primitive equations are written as:
ż + Wz z = Rz
ẏ + Wy y = Ry
Notes:
Amplitudes of high-frequency modes are denoted by vector z and the corresponding frequencies by the diagonal matrix Wz.
Amplitudes of slower (Rossby) modes are denoted by vector y and the corresponding frequencies by the diagonal matrix Wy.
Rz and Ry are projections of Rv and R onto the fast and slow modes, respectively, and each is a non-linear function of both z and y.
INITIALIZATION
Assume at initial time t = t0 that ż(t0) = 0. This is referred to as the Machenhauer condition and implies that
z(t0) = -Wz-1 Rz(t0)
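When Wz is diagonal, the Machenhauer balance can be applied directly; this numerical sketch uses illustrative fast-mode frequencies and forcing values (not from the text), with the sign convention of the formula above:

```python
import numpy as np

# Machenhauer condition, z(t0) = -Wz^-1 Rz(t0), for illustrative
# diagonal fast-mode frequencies and projected forcing.
Wz = np.diag([1.2e-4, 2.5e-4])       # fast (gravity) mode frequencies, s^-1
Rz = np.array([3.0e-9, -1.0e-9])     # forcing projected onto the fast modes

z0 = -np.linalg.solve(Wz, Rz)        # balanced fast-mode amplitudes
# Initializing z at these values enforces z-dot(t0) = 0 under the
# convention above, suppressing high-frequency oscillations at the
# start of the forecast.
```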
High frequencies can also be suppressed in 4DVAR using normal mode theory. The most successful approach uses the Machenhauer condition as a constraint in the minimization of J = Jp + Σ Jn (sum from n = 0 to N).