
ATMOSPHERIC DATA ASSIMILATION

By Roger Daley*
Journal of the Meteorological Society of Japan, Vol. 75, No. 1B, pp. 319-329, 1997 (Manuscript received 23 May 1995, in revised form 15 February 1996) *Naval Research Laboratory, 7 Grace Hopper Avenue, Monterey CA 93943-5502, USA

DEFINITION
Data assimilation is an analysis technique in which the observed information is accumulated into the model state by taking advantage of consistency constraints with laws of time evolution and physical properties.
-F. Bouttier and P. Courtier

OBJECTIVE (1)
To produce a regular, physically consistent four-dimensional representation of the state of the atmosphere from a heterogeneous array of in situ and remote instruments which sample imperfectly and irregularly in space and time.

-Roger Daley

OBJECTIVE (2)
To provide a dynamically consistent motion picture of the atmosphere and oceans, in three space dimensions, with known error bars.
-M. Ghil and P. Malanotte-Rizzoli

OBJECTIVE (3)

Extracts the signal from noisy observations (filtering)
Interpolates in space and time (interpolation)
Reconstructs state variables that are not sampled by the observation network (completeness)
-R. Daley

KEEP IN MIND (1)


What is the purpose of the DA: weather prediction, physical understanding, signal detection, environmental monitoring, etc.?
What are the physical characteristics of the phenomenon of interest? What are its temporal and spatial characteristics, and what relations exist between state variables?
What are the characteristics of other physical phenomena which might obscure the desired signal?

KEEP IN MIND (2)


What are the characteristics of the observing system?
Is the observing system largely under the control of the scientist (as in a field experiment), or is it given?
Is it possible to influence the design of the observing system? Can DA techniques be used in observing system design?

KEEP IN MIND (3)


All models and observations are approximate.
The resulting analyses will be approximate.
The observations must be combined in some optimal fashion.
It is better to have enough observations to overdetermine the problem.
The model is used to provide the preliminary estimate.
The final estimate should fit the observations to within their (presumed) observation error.

MAXIMUM LIKELIHOOD ESTIMATION (1)


Zero-dimensional/scalar case, with a variable x to be estimated.
Observation xo and a forecast xf (produced by a model).
Observation error: εo = xo - x
Forecast (model) error: εf = xf - x
These errors are assumed to be random, unbiased, and normally distributed.

MAXIMUM LIKELIHOOD ESTIMATION (2)

A variable ε which is normally distributed with mean 0 and variance σ² has the probability density

p(ε) = (2πσ²)-0.5 exp(-ε²/2σ²)

The joint probability density of the observation and forecast errors is

p(εo, εf) = (2π σo σf)-1 exp(-εo²/2σo² - εf²/2σf²)

where σo² and σf² are the observation and forecast error variances.

MAXIMUM LIKELIHOOD ESTIMATION (3)

The maximum likelihood estimate is
xa = xf + σf²(σo² + σf²)-1(xo - xf)
The analysis is a weighted mean of the observed and forecast values.
Analysis error: εa = xa - x
The analysis error is unbiased, and the inverse of its variance is
1/σa² = 1/σo² + 1/σf²
Note that σa² is smaller than both σo² and σf². A minimal numeric sketch of this scalar update follows.
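A minimal Python sketch of this scalar update (function and variable names are illustrative, not from the source):

def scalar_analysis(x_f, var_f, x_o, var_o):
    # Maximum likelihood combination of a scalar forecast and observation,
    # assuming unbiased, normal, mutually uncorrelated errors.
    weight = var_f / (var_f + var_o)            # weight given to the innovation
    x_a = x_f + weight * (x_o - x_f)            # xa = xf + sf2 (so2 + sf2)-1 (xo - xf)
    var_a = 1.0 / (1.0 / var_o + 1.0 / var_f)   # 1/sa2 = 1/so2 + 1/sf2
    return x_a, var_a

# Example: forecast 20.0 (variance 1.0), observation 22.0 (variance 4.0).
x_a, var_a = scalar_analysis(20.0, 1.0, 22.0, 4.0)
print(x_a, var_a)  # 20.4 0.8 -- the analysis variance is below both inputs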

MINIMUM VARIANCE ESTIMATION


Unbiased linear estimate of x: xe = co xo + cf xf (co, cf non-negative, co + cf = 1)
Unbiased linear estimate error: εe = xe - x
Expected error variance of xe:

<(εe)²> = (co)² σo² + (cf)² σf²

Minimizing this variance subject to co + cf = 1 gives the Best Linear Unbiased Estimate:
xe = (σf² xo + σo² xf)(σo² + σf²)-1

MAXIMUM LIKELIHOOD ESTIMATE VS. MINIMUM VARIANCE ESTIMATE


The maximum likelihood estimate xa = xf + σf²(σo² + σf²)-1(xo - xf) finds the mode.
The Best Linear Unbiased Estimate xe = (σf² xo + σo² xf)(σo² + σf²)-1 finds the mean.
When the error probabilities are normally distributed, as in p(ε) = (2πσ²)-0.5 exp(-ε²/2σ²), the mean and the mode coincide, and the minimum variance and maximum likelihood estimates are the same.

THE L2 NORM (1)

In most meteorological practice, L2 norms are used because they lead to linear analysis equations.

L2 norm estimation yields the mean, L1 estimation gives the median, and L∞ estimation determines the mid-range.

THE L2 NORM (2)

Example: estimation based on 5 observations, each with the same observation error variance σ².
The 5 observation values: -22.5, 1.1, 1.2, 1.3 and 650.
It seems likely that there were severe measurement problems in the first and last observations.

THE L2 NORM (3)

The mean value of the observations is 126.22 (L2).
The median value is 1.2 (L1).
The mid-range value is 313.75 (L∞).

In this example, minimization with respect to the L1 norm gives the most credible estimate. The L1 norm is much superior to the L2 norm when it comes to detecting and removing gross errors. A short sketch computing these three estimates follows.
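A minimal Python sketch (values from the example above; NumPy assumed available):

import numpy as np

obs = np.array([-22.5, 1.1, 1.2, 1.3, 650.0])

l2_estimate = obs.mean()                       # L2: minimizes the sum of squared deviations
l1_estimate = np.median(obs)                   # L1: minimizes the sum of absolute deviations
linf_estimate = 0.5 * (obs.min() + obs.max())  # L-infinity: minimizes the maximum deviation

print(l2_estimate, l1_estimate, linf_estimate) # 126.22, 1.2, 313.75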

In atmospheric data assimilation, there are situations where errors are not normally distributed.

THE L2 NORM (4)

However, in meteorological practice, a quality control procedure is applied prior to the analysis in order to remove gross errors.
Under these circumstances, the use of an L2 norm is more justifiable.

QUALITY CONTROL OF OBSERVATIONS

All observing systems have their limitations, problems, and failures, resulting in the reported measurements being sometimes incorrect. Such data must be identified and rejected by the data assimilation system in order to avoid corruption of the analysis. Due to the amount of data handled, this is done by automatic routines, both in the form of preprocessing and during the data assimilation stage.
-Xiang-Yu Huang and Henrik Vedel

The systems used to quality control the observational data include:

Bad reporting practice check.
Blacklist check. For stations which are found to always report erroneous data.
Gross check. Against some limits, e.g. from climatology.
Background check. Based on the deviation between the observation and the expectation based on a short-term forecast (a sketch appears after this list).
Buddy check. Checking against nearby observations.
Redundancy check.
-Xiang-Yu Huang and Henrik Vedel
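As an illustration of the background check, a minimal sketch (the rejection threshold k and all names/values are illustrative assumptions, not from the source):

import numpy as np

def background_check(y_o, hx_f, var_o, var_f, k=4.0):
    # Flag observations whose departure from the short-term forecast is too
    # large relative to the combined error standard deviation.
    departure = y_o - hx_f                  # observation-minus-background
    tolerance = k * np.sqrt(var_o + var_f)  # k standard deviations
    return np.abs(departure) <= tolerance   # True = accept, False = reject

y_o  = np.array([1.1, 1.2, 650.0])   # observed values (last one is a gross error)
hx_f = np.array([1.0, 1.3, 1.2])     # forecast mapped to observation locations
print(background_check(y_o, hx_f, var_o=1.0, var_f=1.0))  # [ True  True False]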

THREE-DIMENSIONAL SPATIAL ANALYSIS

THE VECTOR CASE (1)


Assume that the observation network coincides with the analysis/forecast grid, and that the observed variable is the same as the analyzed/forecast variable.
Column vector of forecast values: xf, error vector: ef = xf - x
Column vector of analyzed values: xa, error vector: ea = xa - x
Column vector of observations: xo, error vector: eo = xo - x
True state: x

THE VECTOR CASE (2)


Assume that the observation and forecast errors are unbiased, normally distributed, and not mutually correlated. That is,
<eo(ef)T> = <ef(eo)T> = 0
where T denotes the matrix transpose.

THE VECTOR CASE (3) THE COST FUNCTION


The maximum likelihood estimate is obtained by minimizing the cost function: J = 0.5[xo-xa]TR-1[xo-xa] + 0.5[xf-xa]T[Pb]-1[xf-xa]
Where

R = <eo(eo)T> is the observation error covariance matrix
Pb = <ef(ef)T> is the forecast error covariance matrix

These matrices have real, positive eigenvalues and the inverses exist.

THE VECTOR CASE (4) THE COST FUNCTION


Minimizing the cost function gives: xa = xf + Pb[Pb + R]-1[xo-xf]
xa is the maximum likelihood estimate
xf is the background, prior, or first guess (a forecast from earlier information)

The analysis error covariance Pa = < ea(ea)T> is given by [Pa]-1 = R-1 + [Pb]-1

The main diagonal elements of Pa are the analysis error variances at each grid point.

SCALAR VS. VECTOR


Scalar: xa = xf + σf²(σo² + σf²)-1(xo - xf)
Vector: xa = xf + Pb[Pb + R]-1[xo - xf]

Note that the observation and forecast error covariances R = <eo(eo)T> and Pb = <ef(ef)T> correspond to σo² and σf² in the scalar case. A minimal matrix-form sketch follows.
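A minimal NumPy sketch of the vector analysis and its error covariance (the matrices and values are illustrative assumptions; observations here sit on the grid, so H = I):

import numpy as np

def vector_analysis(x_f, P_b, x_o, R):
    # xa = xf + Pb [Pb + R]^-1 (xo - xf), with [Pa]^-1 = R^-1 + [Pb]^-1.
    K = P_b @ np.linalg.inv(P_b + R)        # gain matrix
    x_a = x_f + K @ (x_o - x_f)             # analysis
    P_a = np.linalg.inv(np.linalg.inv(R) + np.linalg.inv(P_b))
    return x_a, P_a

# Two grid points; forecast errors correlated, observation errors not.
x_f = np.array([20.0, 21.0])
P_b = np.array([[1.0, 0.5], [0.5, 1.0]])
x_o = np.array([22.0, 20.5])
R   = np.diag([4.0, 4.0])
x_a, P_a = vector_analysis(x_f, P_b, x_o, R)
print(x_a)
print(np.diag(P_a))  # analysis error variances at each grid point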

GENERAL OBSERVATION NETWORKS (1)


In general, the observations are not located at the analysis gridpoints. Observed variables need not be the same as the forecast/analyzed variables.

GENERAL OBSERVATION NETWORKS (2) THE FORWARD OPERATOR

We define the operator H(x) (the forward or observation model) as the operator which maps the forecast/analyzed variables on the analysis grid to the observed variables at the observation locations. For example, the forecast variable might be temperature and the observed variable might be the radiance seen by an orbiting radiometer; the operator H would then combine spatial interpolation with the appropriate form of the radiative transfer equation. A minimal interpolation-only sketch follows.
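A minimal sketch of a purely spatial forward operator: a linear interpolation matrix H mapping grid values to observation locations (the grid and observation positions are illustrative assumptions; a radiance operator would add radiative transfer on top of this):

import numpy as np

# Analysis grid at x = 0, 1, 2, 3; two observations at x = 0.5 and 2.25.
grid = np.array([0.0, 1.0, 2.0, 3.0])
obs_locations = np.array([0.5, 2.25])

# Build H so that H @ x_grid linearly interpolates to the observation points.
H = np.zeros((len(obs_locations), len(grid)))
for i, s in enumerate(obs_locations):
    j = np.searchsorted(grid, s) - 1              # left neighbour index
    w = (s - grid[j]) / (grid[j + 1] - grid[j])   # interpolation weight
    H[i, j], H[i, j + 1] = 1.0 - w, w

x_grid = np.array([10.0, 12.0, 11.0, 9.0])        # forecast values on the grid
print(H @ x_grid)                                 # H(xf) at the obs points: [11.0, 10.5]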

GENERAL OBSERVATION NETWORKS (3) THE MODIFIED COST FUNCTION

Orig: J = 0.5[xo-xa]TR-1[xo-xa] + 0.5[xf-xa]T[Pb]-1[xf-xa]
Mod: J = 0.5[yo-H(xa)]TR-1[yo-H(xa)] + 0.5[xf-xa]T[Pb]-1[xf-xa]


Reason: the observed and forecast variables are not necessarily the same. The observed variable is denoted yo and the forecast variable xf.

GENERAL OBSERVATION NETWORKS (4) THE MODIFIED MAXIMUM LIKELIHOOD ESTIMATE

Orig: xa = xf + Pb[Pb + R]-1[xo - xf]
Mod: xa = xf + PbHT[HPbHT + R]-1[yo - H(xf)]

Note: H(x) is frequently a non-linear operator, but it can be linearized by defining the tangent linear operator H = ∂H(x)/∂x.

GENERAL OBSERVATION NETWORKS (5) THE MODIFIED ANALYSIS ERROR COVARIANCE

Orig: [Pa]-1 = R-1 + [Pb]-1

Mod: [Pa]-1 = HTR-1H + [Pb]-1

THE OPTIMAL INTERPOLATION (OI) TECHNIQUES (1)


OI techniques are a special case of
xa = xf + PbHT[HPbHT + R]-1[yo - H(xf)]
in which the forward interpolation operator H is not considered explicitly. Thus,
the matrix HPbHT is replaced by an estimate of the forecast error covariance between observation stations; and PbHT by the forecast error covariance between observation stations and analysis gridpoints.

THE OPTIMAL INTERPOLATION (OI) TECHNIQUES (2)


OI is often simplified so that it does not produce a whole-domain analysis, but rather a number of local analyses at each gridpoint or small grid volume.
OI methodology is sufficiently powerful to perform credible multivariate analysis, that is, where the observations of one variable (temperature, say) are used in the analysis of another variable (wind, say).

This is done by incorporating linear diagnostic relations (such as geostrophic and hydrostatic balance) between the two variables in the forecast error covariance Pb.

OTHER METHODS
xa can be obtained by direct minimization of J using conjugate gradient or quasi-Newton procedures.
Direct minimization of J is referred to as a three-dimensional variational (3DVAR) algorithm. Some advantages of 3DVAR over OI:
It can be done over the whole domain rather than locally.
The forward model (or its transpose) can be used for direct assimilation of satellite radiances.
A minimal minimization sketch follows this list.
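A minimal sketch of direct minimization of J with a quasi-Newton method (SciPy's BFGS here; the small linear H, covariances, and values are illustrative assumptions):

import numpy as np
from scipy.optimize import minimize

x_f = np.array([10.0, 12.0, 11.0, 9.0])        # background (first guess)
P_b = np.eye(4) + 0.5 * np.eye(4, k=1) + 0.5 * np.eye(4, k=-1)
H   = np.array([[0.5, 0.5, 0.0, 0.0],
                [0.0, 0.0, 0.75, 0.25]])       # linear forward operator
y_o = np.array([11.5, 10.0])                   # observations
R   = np.diag([0.25, 0.25])                    # observation error covariance

R_inv, Pb_inv = np.linalg.inv(R), np.linalg.inv(P_b)

def cost(x):   # J = 0.5 (yo-Hx)T R-1 (yo-Hx) + 0.5 (xf-x)T Pb-1 (xf-x)
    d_o, d_b = y_o - H @ x, x_f - x
    return 0.5 * d_o @ R_inv @ d_o + 0.5 * d_b @ Pb_inv @ d_b

def grad(x):   # gradient of J, supplied to speed up the minimization
    return -H.T @ R_inv @ (y_o - H @ x) - Pb_inv @ (x_f - x)

result = minimize(cost, x_f, jac=grad, method="BFGS")
print(result.x)   # the 3DVAR analysis xa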

TEMPORAL ASPECTS

The analysis xa = xf + PbHT[HPbHT + R]-1[yo - H(xf)] is a combination of the observations with a forecast.
Where did this forecast (and forecast error covariance) come from?

Forecasts are produced by models from analyses at an earlier time, and we will now incorporate the model explicitly into the equations.

THE FORECAST/ANALYSIS CYCLE (1)

We modify the notation to introduce the forecast vector xf(tn) at time tn and the analysis vector xa(tn-1) at time tn-1. We define a model Mn, which marches forward in time from tn-1 to tn:
xf(tn) = Mn[xa(tn-1)]

THE FORECAST/ANALYSIS CYCLE (2) THE MODIFIED MAXIMUM LIKELIHOOD ESTIMATE

Orig: xa = xf + PbHT[HPbHT + R]-1[yo - H(xf)]
Mod: xa(tn) = xf(tn) + Pb(tn)HT[HPb(tn)HT + R]-1[yo(tn) - H(xf(tn))]

Remember:
xf(tn) = Mn[xa(tn-1)]
yo(tn) - H(xf(tn)) is the innovation/observation increment
xa(tn) - xf(tn) is the correction/analysis increment
Pb(tn) is the forecast error covariance

THE FORECAST/ANALYSIS CYCLE (3) THE CYCLE

DYNAMICALLY-GENERATED ANALYSIS WEIGHTS (1) THE KALMAN FILTER

While the forecast xf(tn) is responsive to all the complexities of atmospheric flow simulated by the model, the forecast error covariances Pb specified in the OI and 3DVAR algorithms are completely insensitive to the flow. Modern assimilation techniques attempt to generate the analysis weights dynamically, explicitly using the model M.

DYNAMICALLY-GENERATED ANALYSIS WEIGHTS (2) THE KALMAN FILTER

One way of doing this is through an explicit evolution equation for the forecast error covariance. Defining the tangent linear model Mn = ∂Mn(x)/∂x, and assuming that the model M is imperfect, the forecast error covariance at time tn is related to the analysis error covariance at time tn-1 by

Pb(tn) = Mn Pa(tn-1) MnT + Qn

where Qn = <ηn(ηn)T> is the model error covariance and ηn is the model error.

DYNAMICALLY-GENERATED ANALYSIS WEIGHTS (3) THE KALMAN FILTER

The fundamental equation of the Kalman filter algorithm is

Pb(tn) = Mn Pa(tn-1) MnT + Qn

(an equation for the propagation of second-moment error statistics).

The non-linear form of the Kalman filter is referred to as the extended Kalman filter (EKF). The EKF is a sequential algorithm that makes use of past and present observations.

Statistical error moments higher than the second are generated. A minimal sketch of one filter cycle follows.
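A minimal sketch of one Kalman filter cycle, covariance propagation followed by the analysis update (the matrices M, Q, H, R and all values are illustrative assumptions for a two-variable linear model):

import numpy as np

M = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # tangent linear (here simply linear) model
Q = 0.01 * np.eye(2)                # model error covariance Qn
H = np.array([[1.0, 0.0]])          # observe only the first variable
R = np.array([[0.25]])              # observation error covariance

x_a = np.array([1.0, 0.5])          # analysis at t(n-1)
P_a = 0.5 * np.eye(2)               # analysis error covariance at t(n-1)

# Forecast step: propagate the state and the error covariance to t(n).
x_f = M @ x_a
P_b = M @ P_a @ M.T + Q             # Pb(tn) = Mn Pa(tn-1) MnT + Qn

# Analysis step: update with the observation at t(n).
y_o = np.array([1.2])
K = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)   # gain matrix
x_a = x_f + K @ (y_o - H @ x_f)                    # new analysis
P_a = (np.eye(2) - K @ H) @ P_b                    # new analysis error covariance
print(x_a, np.diag(P_a))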

DYNAMICALLY-GENERATED ANALYSIS WEIGHTS (4) THE KALMAN FILTER

It is necessary to invoke a moment closure assumption to produce a tractable algorithm. This closure assumption may cause the EKF to diverge from the observations, if they are too sparse. The fixed-lag Kalman smoother (FLKS) is an algorithm that also makes use of future observations. The FLKS is applied to non-forecast problems such as environmental monitoring or climate studies, where it is not necessary that the forecast/analysis cycle be run in strict real time. <<Kalman filter example>>

DYNAMICALLY-GENERATED ANALYSIS WEIGHTS (5) THE 4-DIMENSIONAL VARIATIONAL ALGORITHM

The 4DVAR algorithm is implementable for global forecast models. The algorithm naturally makes use of future information. It assumes that the model is perfect.

It is appropriate for a fixed time interval and does not naturally cycle in time. To resolve this, it is necessary to incorporate the Kalman filter to compute the error covariance. <<click here for high frequency interference>>

THANK YOU

MATRIX TRANSPOSE
The transpose of an m by n matrix is defined to be an n by m matrix that results from interchanging the rows and columns of the matrix. The transpose of a matrix is designated by the superscript T.

http://comp.uark.edu/~jjrencis/femur/Learning-Modules/LinearAlgebra/mtxdef/transpose/transpose.html <<back>>

THE INVERSE PROBLEM (1)

The inverse problem can be conceptually formulated as follows:
Data → Model parameters
The inverse problem is considered the "inverse" of the forward problem, which relates the model parameters to the data that we observe:
Model parameters → Data

THE INVERSE PROBLEM (2)


The objective of an inverse problem is to find the best model m such that (at least approximately) d = G(m), where G is an operator describing the explicit relationship between the observed data d and the model parameters. In various contexts, the operator G is called the forward operator, observation operator, or observation function.

In the most general context, G represents the governing equations that relate the model parameters to the observed data (i.e. the governing physics). <<back>>

SIMPLE KALMAN FILTER EXAMPLE (1)


SOURCE: http://credentiality2.blogspot.com/2010/08/simple-kalman-filter-example.html

Kalman filters are a way to take a bunch of noisy measurements of something, and perhaps also some predictions of how that something is changing, and maybe even some forces we're applying to that something, and to efficiently compute an accurate estimate of that something's true value.

SIMPLE KALMAN FILTER EXAMPLE (2)


Let's say we want to measure the temperature in a room. We think it's about 72 degrees, ±2 degrees. And we have a thermometer that gives uniformly random results within a range of ±5 degrees of the true temperature.
We take a measurement with the thermometer and it reads 75. So what's our best estimate of the true temperature?

Kalman filters use a weighted average to pick a point somewhere between our 72 degree guess and the 75 degree measurement.

SIMPLE KALMAN FILTER EXAMPLE (3)

Here's how we choose the optimal weight, given the accuracy of our guess and the accuracy of the thermometer:

weight = temperature variance / (temperature variance + thermometer variance)
0.29 = 2 / (2 + 5)

If the weight is large (approaching 1.0), we mostly trust our thermometer. If the weight is small, we mostly trust our guess and ignore the thermometer. A 29% weight means we'll trust our guess more than the thermometer, which makes sense, because we think our guess is good to 2 degrees, whereas the thermometer was only good to 5.

SIMPLE KALMAN FILTER EXAMPLE (4)

Now we do the weighted average:

estimate = guess + weight*(measurement - guess)
72.87 = 72 + 0.29*(75 - 72)
(we went 29% of the way from 72 to 75)

OR

estimate = (1-weight)*guess + weight*measurement
72.87 = 0.71*72 + 0.29*75
(we took 71% of the guess plus 29% of the measurement)

SIMPLE KALMAN FILTER EXAMPLE (5)

How confident are we in our estimate of 72.87 degrees?

estimate variance = (temperature variance * thermometer variance) / (temperature variance + thermometer variance)
1.43 = (2*5) / (2 + 5)

So we think our estimate is correct to 1.43 degrees: we have a guess that the temperature in the room is 72.87 degrees, ±1.43 degrees.

SIMPLE KALMAN FILTER EXAMPLE (6)

Now we have the guess that the temperature in the room is 72.87 degrees, ±1.43 degrees. And we still have a thermometer that tells the temperature to ±5 degrees. That's basically the situation where we started, so we can run the whole algorithm again. First we compute the weight, using our new, more accurate guess confidence:

weight = temperature variance / (temperature variance + thermometer variance)
0.22 = 1.43 / (1.43 + 5)

SIMPLE KALMAN FILTER EXAMPLE (7)

We take a measurement, and this time let's say it comes up as 71 degrees.
Now we can compute the weighted average of our old guess and our new measurement:

estimate = (1-weight)*guess + weight*measurement
72.46 = 0.78*72.87 + 0.22*71
(we took 78% of the guess plus 22% of the measurement)

SIMPLE KALMAN FILTER EXAMPLE (8)

And the new confidence level:

estimate variance = (temperature variance * thermometer variance) / (temperature variance + thermometer variance)
1.11 = (1.43*5) / (1.43 + 5)

So after the second measurement, we estimate that the actual temperature is 72.46 degrees, ±1.11 degrees.

SIMPLE KALMAN FILTER EXAMPLE (9)

In Kalman filters, we don't have to remember the whole history of measurements and estimates. We just keep track of our most recent estimate and our confidence level in that estimate.

If we decide to turn on the air conditioner, so that the temperature in the room starts decreasing, we can expand our calculations to include that "control" variable, and use it to update our estimates by "predicting" how much colder it'll be at each measurement. A runnable version of this two-measurement walkthrough follows.
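As a check on the arithmetic above, a minimal Python sketch of the scalar filter (the quoted accuracies are treated as variances, as in the walkthrough):

def kalman_step(guess, guess_var, measurement, meas_var):
    # One scalar Kalman update: a weighted average of guess and measurement.
    weight = guess_var / (guess_var + meas_var)
    estimate = guess + weight * (measurement - guess)
    estimate_var = guess_var * meas_var / (guess_var + meas_var)
    return estimate, estimate_var

guess, var = 72.0, 2.0              # initial guess; thermometer variance is 5.0
for measurement in (75.0, 71.0):
    guess, var = kalman_step(guess, var, measurement, 5.0)
    print(round(guess, 2), round(var, 2))
# 72.86 1.43   (the walkthrough rounds to 72.87)
# 72.44 1.11   (the walkthrough's rounded intermediates give 72.46)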
<<back>>

HIGH FREQUENCY INTERFERENCE (1)

The atmosphere has a number of timescales. In general, only a limited frequency band will be important for a given atmospheric DA application.
For synoptic and planetary scale forecast/analysis purposes, it is timescales of approximately one hour to one week which are of interest.
For mesoscale or convective scale modeling, the relevant timescales are minutes to hours.
For climate/environmental monitoring purposes, timescales from weeks to decades are of interest.

HIGH FREQUENCY INTERFERENCE (2)

In the environmental modeling problem, it is desirable to be able to detect slow and very subtle changes in the environment (on decadal timescales) due to natural or anthropogenic causes. It is quite easy to obscure these changes if the analysis algorithm is altered on yearly or monthly timescales. Rapid algorithmic changes have been the rule. To eliminate such effects on low frequency phenomena, years of data must be analyzed using a fixed algorithm.

HIGH FREQUENCY INTERFERENCE (3)


In the atmosphere, high frequency modes generally have relatively low amplitude. However, when models are integrated from analyses produced from observations by some analysis algorithm, high frequency oscillations of an amplitude much larger than observed in nature may be excited. These oscillations may completely obscure the phenomena of interest, or may cause perfectly good observations to be rejected.

HIGH FREQUENCY INTERFERENCE (4)

Undesirable oscillations may be suppressed in two ways:
Modify the governing equations of the model, so that they don't admit the undesirable high frequency solutions.
Modify the analysis and initialization procedures.

HIGH FREQUENCY INTERFERENCE (5)

Illustration of the effect of initialization.

EXAMPLE OF INITIALIZATION

THE MOMENTUM EQUATION

[Equation figure. Notes: the horizontal gradient operator is applied with pressure held constant; pressure is the independent vertical coordinate.]

THE THERMODYNAMIC EQUATION

[Equation figure. Notes: Sp is the static stability parameter for isobaric coordinates; Sp ≈ 5 x 10-4 K/Pa in the mid-troposphere. Sp is positive provided that the lapse rate is less than the dry adiabatic lapse rate.]

THE CONTINUITY EQUATION

[Equation figure. Notes: the divergence of the horizontal velocities is taken at constant pressure; omega is the vertical motion in pressure coordinates, the pressure change following the motion. The equation makes no reference to the density field and does not involve time variations.]

THE PRIMITIVE EQUATIONS


∂V/∂t + f k × V + ∇Φ = Rv
∂Φ/∂t + Φ̄ ∇·V = RΦ

The LHS of the equations have been linearized about a state of rest with a domain-averaged temperature field; Rv and RΦ collect the remaining terms.

INITIALIZATION

The primitive equations, projected onto normal modes, are written as:
ż + Wz z = Rz
ẏ + Wy y = Ry

Notes:
Amplitudes of the high frequency modes are denoted by the vector z, and the corresponding frequencies by the diagonal matrix Wz.
Amplitudes of the slower (Rossby) modes are denoted by the vector y, and the corresponding frequencies by the diagonal matrix Wy.
Rz and Ry are the projections of Rv and RΦ onto the fast and slow modes, respectively; each is a non-linear function of both z and y.

INITIALIZATION

Assume at the initial time t = t0 that ż(t0) = 0. This is referred to as the Machenhauer condition, and it implies that
z(t0) = -Wz-1Rz(t0)
Because Rz depends on z, this condition is solved iteratively in practice; a minimal sketch follows.
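A minimal sketch of the resulting fixed-point iteration (the two-mode system and the function rz are illustrative assumptions, not from the source):

import numpy as np

Wz = np.diag([10.0, 15.0])          # fast-mode frequencies (illustrative)

def rz(z, y):
    # Hypothetical projection of the non-linear terms onto the fast modes.
    return 0.1 * z * z.sum() + 0.05 * y

y = np.array([1.0, -0.5])           # slow (Rossby) mode amplitudes, held fixed
z = np.zeros(2)
for _ in range(20):                 # iterate z = -Wz-1 Rz(z, y) to a fixed point
    z = -np.linalg.solve(Wz, rz(z, y))

print(z)                            # balanced fast-mode amplitudes: z-dot(t0) ≈ 0

<<back>>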

SUPPRESSION OF HIGH FREQUENCIES IN KF AND 4DVAR

KF: high frequencies can be suppressed using normal mode theory applied to Q.
4DVAR: the most successful approach is to use the Machenhauer condition as a constraint in the minimization of J = Jp + Σ Jn (summed from n = 0 to N).
