Nonlin. Processes Geophys., 22, 403–411, 2015
https://npg.copernicus.org/articles/22/403/2015/npg-22-403-2015.html
doi:10.5194/npg-22-403-2015

Verification against perturbed analyses and observations

N. E. Bowler (neill.bowler@metoffice.gov.uk), M. J. P. Cullen, and C. Piccolo
Met Office, Fitzroy Road, Exeter, EX1 3PB, UK

Received: 16 December 2013; Revised: 13 April 2015; Accepted: 15 June 2015; Published: 24 July 2015

This work is licensed under a Creative Commons Attribution 3.0 Unported License (http://creativecommons.org/licenses/by/3.0/).

Abstract
It has long been known that verification of a forecast against the sequence
of analyses used to produce those forecasts can under-estimate the magnitude
of forecast errors. Here we show that under certain conditions the
verification of a short-range forecast against a perturbed analysis coming
from an ensemble data assimilation scheme can give the same root-mean-square
error as verification against the truth. This means that a perturbed analysis
can be used as a reliable proxy for the truth. However, the conditions
required for this result to hold are rather restrictive: the analysis must be
optimal, the ensemble spread must be equal to the error in the mean, the
ensemble size must be large and the forecast being verified must be the
background forecast used in the data assimilation. Although these criteria
are unlikely to be met exactly, it becomes clear that in most cases
verification against a perturbed analysis gives better results than
verification against an unperturbed analysis.
We demonstrate the application of these results in an idealised model
framework and a numerical weather prediction context. In deriving this result
we recall that an optimal (Kalman) analysis is one for which the analysis
increments are uncorrelated with the analysis errors.
1 Introduction
Verification of forecasts is an important aspect in the development of those
forecasts. Any improvement in the forecasting system should be tested to
demonstrate that the forecasts are genuinely improved. Each forecast is
typically launched from an analysis state which is a combination of
observations with a previous short-range forecast from the system. A common
practice is to use the analysis from such a system as the truth against which
to verify. Since each analysis depends on the forecasts from previous cycles
this is a dangerous practice, particularly at short forecast lead times.
Nonetheless, the convenience of performing verification against a state which
is available on the model grid means that this remains a common practice,
with its attendant problems.
One solution to the problem of verification against analyses is to verify
forecasts against observations. The observations do not depend on the
forecast, and so provide an independent measurement of the true state of the
system (although any time correlation in observation errors can create a
correlation between forecast and observation errors). However, observations
themselves are contaminated by errors. Methods exist to account for the
effect of these errors on verification statistics (e.g. Ciach and Krajewski,
1999; Saetra et al., 2004; Bowler, 2006; Candille and Talagrand, 2008).
However, these errors are often poorly known, so accounting for their effect
is difficult.
Additionally, there are often few conventional observations over the oceans,
which means that verification statistics can be blind to these areas.
As an alternative solution to these problems, we offer the idea of performing
the verification against a perturbed analysis.
2 Verification against perturbed analysis
We are looking to verify a forecast $x^{f}$ using the root-mean-square (RMS)
error. This forecast is a single realisation, and so could either be a
forecast from a deterministic system or an ensemble-mean forecast. Ideally
one would verify this forecast against the true state of the system,
$x^{t}$, but this state is generally unknown.
Given that the truth is unknown we choose to verify instead some other state,
in this case an analysis. We consider that rather than having a single
analysis we have an ensemble of analyses and verify against a randomly chosen
analysis ensemble member. We assume that the analysis ensemble represents its
own errors correctly. Since we are considering mean-square errors, we only
need this last statement to hold to second order; that is, we require that
\[
\langle |\bar{x}^{a} - x^{t}|^{2} \rangle = \langle |x_{i}^{a} - \bar{x}^{a}|^{2} \rangle ,
\tag{1}
\]
where $|x|^{2} = x^{\mathrm{T}} x$ denotes the inner product, $\mathrm{T}$
indicates the transpose, and the angle brackets $\langle \cdot \rangle$
indicate an average over a large number of cases. The ensemble states are
denoted by $x_{i}^{a}$, where $i$ is the ensemble member number, and the
overbar ($\bar{x}$) indicates the ensemble mean.
Given the above definitions, we consider the RMS error calculated against a
perturbed analysis, that is, a randomly chosen member of the analysis
ensemble. The mean-square error of the forecast against this analysis is
\[
\begin{aligned}
\langle |x^{f} - x_{i}^{a}|^{2} \rangle
&= \langle |(x^{f} - \bar{x}^{a}) - (x_{i}^{a} - \bar{x}^{a})|^{2} \rangle \\
&= \langle |x^{f} - \bar{x}^{a}|^{2} \rangle + \langle |x_{i}^{a} - \bar{x}^{a}|^{2} \rangle
 - 2 \langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (x_{i}^{a} - \bar{x}^{a}) \rangle .
\end{aligned}
\tag{2}
\]
In this case we are considering the verification against a given, chosen
ensemble member i, not against each ensemble member in turn. However, since
all ensemble members are typically exchangeable, this distinction is not
important. We do not include a time index in this notation since all
quantities are valid at the same time.
To continue the analysis, we consider that there exists the truth state,
xt, against which we would ideally conduct the
verification. Using this we expand one of the terms appearing on the
right-hand side of Eq. (2):
\[
\begin{aligned}
\langle |x^{f} - \bar{x}^{a}|^{2} \rangle
&= \langle |(x^{f} - x^{t}) - (\bar{x}^{a} - x^{t})|^{2} \rangle \\
&= \langle |x^{f} - x^{t}|^{2} \rangle + \langle |\bar{x}^{a} - x^{t}|^{2} \rangle
 - 2 \langle (x^{f} - x^{t})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle .
\end{aligned}
\tag{3}
\]
Combining Eqs. (2) and (3), we find that
\[
\begin{aligned}
\langle |x^{f} - x_{i}^{a}|^{2} \rangle
&= \langle |x^{f} - x^{t}|^{2} \rangle + \langle |\bar{x}^{a} - x^{t}|^{2} \rangle
 + \langle |x_{i}^{a} - \bar{x}^{a}|^{2} \rangle \\
&\quad - 2 \langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (x_{i}^{a} - \bar{x}^{a}) \rangle
 - 2 \langle (x^{f} - x^{t})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle .
\end{aligned}
\tag{4}
\]
The last term in this equation can be further re-arranged:
\[
\begin{aligned}
\langle (x^{f} - x^{t})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle
&= \langle ((x^{f} - \bar{x}^{a}) + (\bar{x}^{a} - x^{t}))^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle \\
&= \langle |\bar{x}^{a} - x^{t}|^{2} \rangle + \langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle .
\end{aligned}
\tag{5}
\]
We have previously assumed that the ensemble of analyses is ideal (Eq. 1).
Using this assumption and substituting Eq. (5) into Eq. (4), various terms
cancel and we find
\[
\langle |x^{f} - x_{i}^{a}|^{2} \rangle = \langle |x^{f} - x^{t}|^{2} \rangle
 - 2 \langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (x_{i}^{a} - \bar{x}^{a}) \rangle
 - 2 \langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle .
\tag{6}
\]
So, if the last two terms in this equation are zero (or cancel), then we
would expect that verifying against a perturbed analysis would give the same
result as verification against the truth.
In the second-to-last term, the second bracket is the difference between a
randomly chosen analysis ensemble member and the ensemble mean. If this term
were averaged over all possible choices of the random member, it would be
precisely zero, since the mean of the second bracket is zero by construction.
If all the ensemble members are statistically equivalent, then this term
should vanish when the number of cases is large enough.
For the final term to vanish, we require that the data-assimilation system be
in some sense optimal. If the final term were not
zero, then it would be possible to make the ensemble mean analysis closer to
the truth by post-processing it using the difference
xf-x‾a. A
statistically optimal analysis will not benefit from post-processing in this
way because it is by design as close to the truth as possible, and so the
final term must also be zero. This is a somewhat different definition of an
“optimal” data assimilation scheme from the usual one. This difference is
explored in more detail in Sect. 4.
Therefore, we conclude that verification against a perturbed analysis will
give the same RMS error as verification against the truth if the analysis
ensemble is ideal (the spread equals the error of the mean analysis) and the
analysis is statistically optimal (could not be improved by simple
post-processing). In a sense Eq. (6) is a simple
result, since we have assumed that the analysis ensemble correctly represents
the errors in the ensemble mean analysis. However, this re-arrangement allows
us to see that all that is required for perturbed analysis to be a good proxy
for the truth is for two cross-terms to be zero. The first of these is
straightforwardly zero; the condition for the second to be zero is more
challenging, as will be seen below.
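As a concrete illustration of the result above, the following Python sketch (not from the paper; all numerical values are assumed) applies an optimal scalar analysis and verifies that the RMSE measured against a perturbed analysis matches the RMSE against the truth, while the unperturbed analysis under-reports it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000                           # number of verification cases
sb, so = 1.0, 0.5                     # assumed background/observation error std devs

xt = np.zeros(n)                      # truth (zero without loss of generality)
xf = xt + rng.normal(0.0, sb, n)      # background forecast, error ~ N(0, sb^2)
y = xt + rng.normal(0.0, so, n)       # observation, error ~ N(0, so^2)

k = sb**2 / (sb**2 + so**2)           # optimal (Kalman) gain for a scalar state
xa_mean = xf + k * (y - xf)           # optimal (ensemble-mean) analysis
sa = np.sqrt((1.0 - k) * sb**2)       # analysis error standard deviation
xa_pert = xa_mean + rng.normal(0.0, sa, n)   # an ideal perturbed-analysis member

rmse_truth = np.sqrt(np.mean((xf - xt) ** 2))      # against the truth
rmse_pert = np.sqrt(np.mean((xf - xa_pert) ** 2))  # against a perturbed analysis
rmse_mean = np.sqrt(np.mean((xf - xa_mean) ** 2))  # against the unperturbed analysis
print(rmse_truth, rmse_pert, rmse_mean)
```

Here the perturbed-analysis RMSE agrees with the truth-based RMSE (both close to `sb`), while verification against the unperturbed analysis under-reports the error by a factor of roughly `sqrt(k)` in this scalar setting.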
3 Verification against perturbed observations
It might be thought that, since a true observation is statistically
indistinguishable from a random member of a set of perturbed observations,
then verification against perturbed observations would also be equivalent to
verification against the truth. However, we show that this is not the case.
Consider the final term in Eq. (6). If we replace the references to the
analysis with the observations, then this term becomes
\[
\langle (H x^{f} - y)^{\mathrm{T}} (y - H x^{t}) \rangle ,
\tag{7}
\]
where $y$ is the vector of observations and $H$ is the observation operator,
which we will assume to be linear for simplicity. Now, we choose to define
the observation by its departure from the truth, $\epsilon^{o}$:
\[
y = H x^{t} + \epsilon^{o} .
\tag{8}
\]
Using this definition, we find
\[
\langle (H x^{f} - y)^{\mathrm{T}} (y - H x^{t}) \rangle
 = \langle (H (x^{f} - x^{t}) - \epsilon^{o})^{\mathrm{T}} \epsilon^{o} \rangle .
\tag{9}
\]
If we assume that forecast and observation errors are uncorrelated, then this
reduces to
\[
\langle (H x^{f} - y)^{\mathrm{T}} (y - H x^{t}) \rangle
 = - \langle (\epsilon^{o})^{\mathrm{T}} \epsilon^{o} \rangle = - \mathrm{Tr}(R) ,
\tag{10}
\]
where $\mathrm{Tr}(R)$ is the trace of the observation error covariance
matrix. Therefore verification against perturbed observations will not give
the same result as verification against the truth; indeed, the mean-square
difference between the forecast and a perturbed observation exceeds the
mean-square forecast error (in observation space) by $2\,\mathrm{Tr}(R)$.
Although the use of perturbed observations is unhelpful, it is possible to
subtract the estimated observation error from the mean-square error
calculated using unperturbed observations. This has been used successfully by
some authors (e.g. Saetra et al., 2004; Bowler, 2008), but it retains the
limitation that observations do not cover the globe uniformly.
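The inflation of the mean-square error by the observation error, and its removal by subtraction, can be illustrated with a short Python sketch (not from the paper; a scalar case with assumed error magnitudes and an identity observation operator):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
sb, so = 1.0, 0.5                      # assumed forecast/observation error std devs

xt = np.zeros(n)                       # truth
xf = xt + rng.normal(0.0, sb, n)       # forecast
y = xt + rng.normal(0.0, so, n)        # real observations
y_pert = y + rng.normal(0.0, so, n)    # perturbed observations

mse_truth = np.mean((xf - xt) ** 2)
mse_obs = np.mean((xf - y) ** 2)       # inflated by Tr(R) = so**2
mse_pert = np.mean((xf - y_pert) ** 2) # inflated by 2 * so**2
mse_corrected = mse_obs - so**2        # subtract the estimated observation error
print(mse_obs - mse_truth, mse_pert - mse_truth, mse_corrected - mse_truth)
```

Subtracting the observation error variance from the raw mean-square error against observations recovers the truth-based value, whereas perturbing the observations doubles the inflation.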
4 Definitions of an optimal analysis
Earlier we indicated that an analysis for which
$\langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle = 0$
should be considered an optimal analysis, since it would not be possible
to improve this analysis by simple post-processing. This is the same as
saying that the analysis increments are orthogonal to the analysis errors.
However, a more usual definition of an optimal analysis is one which uses the
Kalman gain in calculating the analysis state. In the following we will
demonstrate that these two definitions of an optimal analysis are equivalent.
The orthogonality of analysis increments and errors for an optimal filter has
been known for many years (see, for instance, Kailath, 1968). We include
a derivation of this fact here as it highlights certain assumptions which
need to be made.
To calculate an analysis state we use the following formula:
\[
x^{a} = x^{f} + K (y - H x^{f}) .
\tag{11}
\]
In this equation and the following paragraphs we refer to $x^{a}$ and
$x^{f}$ without an overbar because this derivation can apply to any
forecast and analysis and not simply one coming from an ensemble system.
$K$ is the gain matrix applied to the innovations; this does not need to be
the optimal (Kalman) gain. As in Eq. (8) the observation is defined by its
departure from the truth, $\epsilon^{o}$. This allows us to re-arrange
Eq. (11) as
\[
K H (x^{a} - x^{t}) = (I - K H)(x^{f} - x^{a}) + K \epsilon^{o} .
\tag{12}
\]
We post-multiply this equation by $(x^{f} - x^{a})^{\mathrm{T}}$ and take
the average over a large number of cases. This yields
\[
K H \langle (x^{a} - x^{t})(x^{f} - x^{a})^{\mathrm{T}} \rangle
 = (I - K H) \langle (x^{f} - x^{a})(x^{f} - x^{a})^{\mathrm{T}} \rangle
 + K \langle \epsilon^{o} (x^{f} - x^{a})^{\mathrm{T}} \rangle ,
\tag{13}
\]
where we have assumed that $K$ and $H$ are constant in time. Note that in
this equation the terms appear as $\langle x x^{\mathrm{T}} \rangle$, the
outer product, whereas previously we were dealing with terms like
$\langle x^{\mathrm{T}} x \rangle$, the inner product.
Now, to deal with the terms on the right-hand side of this equation, we
re-arrange the analysis update Eq. (11) to be
\[
x^{f} - x^{a} = K H (x^{f} - x^{t}) - K \epsilon^{o} .
\tag{14}
\]
We can square this equation and take the average over a long time series to
give
\[
\langle (x^{f} - x^{a})(x^{f} - x^{a})^{\mathrm{T}} \rangle
 = K H \langle (x^{f} - x^{t})(x^{f} - x^{t})^{\mathrm{T}} \rangle H^{\mathrm{T}} K^{\mathrm{T}}
 + K \langle \epsilon^{o} (\epsilon^{o})^{\mathrm{T}} \rangle K^{\mathrm{T}} ,
\tag{15}
\]
where we have assumed that the forecast and observation errors are
uncorrelated. We re-write the forecast and observation covariance matrices
using their usual symbols $B$ and $R$ to give
\[
\langle (x^{f} - x^{a})(x^{f} - x^{a})^{\mathrm{T}} \rangle
 = K H B H^{\mathrm{T}} K^{\mathrm{T}} + K R K^{\mathrm{T}} .
\tag{16}
\]
Returning to Eq. (14), we may multiply it by $\epsilon^{o}$ to estimate the
second term as
\[
\langle \epsilon^{o} (x^{f} - x^{a})^{\mathrm{T}} \rangle
 = \langle \epsilon^{o} (x^{f} - x^{t})^{\mathrm{T}} \rangle H^{\mathrm{T}} K^{\mathrm{T}}
 - \langle \epsilon^{o} (\epsilon^{o})^{\mathrm{T}} \rangle K^{\mathrm{T}} .
\tag{17}
\]
If we assume that forecast and observation errors are uncorrelated, then we
find that
\[
\langle \epsilon^{o} (x^{f} - x^{a})^{\mathrm{T}} \rangle = - R K^{\mathrm{T}} .
\tag{18}
\]
Substituting Eqs. (16) and (18) into Eq. (13), we find that
\[
K H \langle (x^{a} - x^{t})(x^{f} - x^{a})^{\mathrm{T}} \rangle
 = (I - K H)(K H B H^{\mathrm{T}} K^{\mathrm{T}} + K R K^{\mathrm{T}})
 - K R K^{\mathrm{T}} .
\tag{19}
\]
Expanding the right-hand side and cancelling terms, we get
\[
\langle (x^{a} - x^{t})(x^{f} - x^{a})^{\mathrm{T}} \rangle
 = B H^{\mathrm{T}} K^{\mathrm{T}} - K H B H^{\mathrm{T}} K^{\mathrm{T}}
 - K R K^{\mathrm{T}} .
\tag{20}
\]
In Eq. (20) we have not made any assumption about the form of $K$, and the
terms labelled $B$ and $R$ are the true forecast- and observation-error
covariance matrices. Previously we argued that Eq. (20) should be zero for
an optimal analysis. So, we substitute the Kalman gain
$K = B H^{\mathrm{T}} (H B H^{\mathrm{T}} + R)^{-1}$
for some of the terms in Eq. (20) to give
\[
\begin{aligned}
\langle (x^{a} - x^{t})(x^{f} - x^{a})^{\mathrm{T}} \rangle
&= B H^{\mathrm{T}} K^{\mathrm{T}}
 - B H^{\mathrm{T}} (H B H^{\mathrm{T}} + R)^{-1} H B H^{\mathrm{T}} K^{\mathrm{T}} \\
&\quad - B H^{\mathrm{T}} (H B H^{\mathrm{T}} + R)^{-1} R K^{\mathrm{T}} = 0 .
\end{aligned}
\tag{21}
\]
So, if we assume that the gain used in the data assimilation is the Kalman
gain, then the key cross-term in Eq. (6) is zero. This is one of the
conditions required for verification against a perturbed analysis to give
the same RMS error as verification against the truth.
Now, Eq. (21) states that the outer product of the analysis errors with the
analysis increment is zero. However, for verification against a perturbed
analysis to be a suitable substitute for verification against the truth, we
require the inner product of these two terms to be zero. If we have two
vectors $y$ and $x$, then stating that the average of the outer product of
these vectors is zero, $\langle y x^{\mathrm{T}} \rangle = 0$, is the same
as stating that
\[
\langle y_{i} x_{j} \rangle = 0 \quad \text{for all } i, j .
\tag{22}
\]
If the inner product is to be zero, then we require only that
\[
\langle y^{\mathrm{T}} x \rangle
 = \Big\langle \sum_{i=1}^{N} y_{i} x_{i} \Big\rangle = 0 .
\tag{23}
\]
This demonstrates that Eq. (21) implies that
$\langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle = 0$.
In this calculation the forecast $x^{f}$ is the one used in calculating the
new analysis. Given that the analysis referred to in the last term of
Eq. (6) is an ensemble mean, $x^{f}$ should be the ensemble-mean background
forecast of the data assimilation. That is, we must re-write Eq. (6) as
\[
\begin{aligned}
\langle |\bar{x}^{f} - x_{i}^{a}|^{2} \rangle
&= \langle |\bar{x}^{f} - x^{t}|^{2} \rangle
 - 2 \langle (\bar{x}^{f} - \bar{x}^{a})^{\mathrm{T}} (x_{i}^{a} - \bar{x}^{a}) \rangle
 - 2 \langle (\bar{x}^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle \\
&\approx \langle |\bar{x}^{f} - x^{t}|^{2} \rangle ,
\end{aligned}
\tag{24}
\]
where $\bar{x}^{f}$ is the ensemble-mean background for the ensemble data
assimilation. Thus the above argument does not apply to deterministic
forecasts or to forecasts at longer lead times. The issue of longer lead
times is discussed further in Sect. 7.
This derivation also informs how the analysis ensemble should be created.
Following Eq. (11), the update of the ensemble mean will follow
\[
\bar{x}^{a} = \bar{x}^{f} + K (y - H \bar{x}^{f}) ,
\tag{25}
\]
where $K$ is the optimal (Kalman) gain matrix. In Sect. 2 we assumed that
the analysis ensemble perturbations are drawn from the same distribution as
the analysis errors. One way to ensure this is to update each ensemble
member according to
\[
x_{i}^{a} = x_{i}^{f} + K (y + y_{i} - H x_{i}^{f}) ,
\tag{26}
\]
where $y_{i}$ is a perturbation to the observations created using the (true)
observation error covariance matrix, $R$. Note that in both of the above
equations $K$ is the Kalman gain calculated using the true (unknown)
background and observation error covariance matrices. This matrix is
approximated in the ensemble Kalman filter and the ensemble-variational
methods used with geophysical models (e.g. Evensen, 1994; Houtekamer et al.,
1996; Clayton et al., 2013). In the following tests we use
$K = B H^{\mathrm{T}} (H B H^{\mathrm{T}} + R)^{-1}$ with $B$ and $R$ fixed.
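A minimal scalar sketch of this perturbed-observation update (all numerical values are assumed, and the background ensemble is taken to be ideal, i.e. its spread equals the error of its mean) confirms that the resulting analysis ensemble is also ideal in this sense:

```python
import numpy as np

rng = np.random.default_rng(3)
n_cases, n_ens = 5_000, 400
sb, so = 1.0, 0.5                    # assumed true background/observation error std devs
k = sb**2 / (sb**2 + so**2)          # Kalman gain from the true covariances (scalar state)

xt = np.zeros(n_cases)                                        # truth per case
xbar_f = xt + rng.normal(0.0, sb, n_cases)                    # error of the mean background
xf = xbar_f[:, None] + rng.normal(0.0, sb, (n_cases, n_ens))  # ideal background ensemble
y = xt + rng.normal(0.0, so, n_cases)                         # one observation per case
y_i = rng.normal(0.0, so, (n_cases, n_ens))                   # per-member obs perturbations

xa = xf + k * (y[:, None] + y_i - xf)                         # perturbed-observation update
xa_mean = xa.mean(axis=1)

err_mean = np.sqrt(np.mean((xa_mean - xt) ** 2))    # RMS error of the mean analysis
spread = np.sqrt(np.mean(xa.var(axis=1, ddof=1)))   # analysis ensemble spread
print(err_mean, spread)   # both close to sqrt(1 - k) * sb
```

For a large ensemble, the error of the mean analysis and the ensemble spread agree, as the construction of the analysis ensemble requires.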
5 Testing using a simple model
A toy-model data assimilation system was created to test whether the above
assumptions can hold in an idealised context. For this, the logistic map was
used (see, for instance, Peitgen et al., 1992). The logistic map is a
single-variable chaotic map, iterated according to
\[
x_{n+1} = C x_{n} (1 - x_{n}) ,
\tag{27}
\]
where $C$ is a constant. The basin of attraction for this map is the range
$(0, 1)$, and states $x > 1$ will diverge towards infinity. The map is
chaotic for $C > 3.57$ (approximately) and has a Hausdorff dimension of
about 0.538 (Grassberger and Procaccia, 1983). In our experiments we choose
$C = 3.7$, as for this value the map exhibits chaotic behaviour.
We initialise an ensemble by randomly choosing states in the interval (0, 1).
The logistic map is applied to each member to create a forecast ensemble. The
forecast ensemble is transformed into an analysis ensemble by each member
assimilating a perturbed observation. The observations are created by adding
a perturbation to the run of the truth model. These perturbations are
distributed as $N(0, 0.001)$. The perturbed observations are created from
the observations by adding a further perturbation sampled from the same
distribution. The assimilation always uses a fixed background error
variance, $B$, and we test the formulas derived above by varying the value
of $B$. A fixed $B$ is a poor approximation to the true background errors;
this assimilation will not be optimal, and we may find that
$\langle (x^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle$
is non-zero. We examine this later. Observations are assimilated every time
step, and Eq. (27) is used to iterate both the ensemble members and the
truth run. The first 2000 assimilation cycles are rejected as a
spin-up period. Analysis states which fall outside the basin of attraction
are reset to lie within it. The assimilation is run for a further
200 000 assimilation cycles and 400 ensemble members are used. Confidence
intervals were calculated using the bootstrap method, assuming each
assimilation cycle gives an independent sample of the analysis error. Since
we use a long run, the estimated confidence intervals are very narrow,
corresponding approximately to the line width in the plots; they are
therefore not shown, to aid clarity. To be consistent with the results of
the previous section, the only forecasts verified are the ensemble-mean
background forecasts. All results shown here have used the logistic map.
Similar results have also been found with the models of Lorenz (1963) and
Lorenz (1995).
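A reduced version of this experiment can be sketched in Python. This is not the paper's code: it uses a much shorter run and a smaller ensemble than the 200 000 cycles and 400 members described above, and a simple scalar gain, but it reproduces the qualitative behaviour that an under-weighted $B$ makes verification against a perturbed analysis under-report the true RMSE:

```python
import numpy as np

C = 3.7   # logistic-map parameter, chosen in the chaotic regime

def run_cycle(B_var, n_cycles=20_000, n_ens=100, R_var=0.001, spin_up=2_000, seed=0):
    """Cycle a scalar perturbed-observation ensemble analysis on the logistic
    map with a fixed assumed background error variance B_var.  Returns the
    RMSE of the ensemble-mean background against the truth and against a
    randomly chosen (perturbed) analysis member."""
    rng = np.random.default_rng(seed)
    truth = 0.4
    ens = rng.uniform(0.0, 1.0, n_ens)
    k = B_var / (B_var + R_var)              # gain implied by the assumed B
    e_truth, e_pert = [], []
    for t in range(n_cycles):
        truth = C * truth * (1.0 - truth)
        ens = C * ens * (1.0 - ens)          # background ensemble forecast
        xbar_f = ens.mean()
        y = truth + rng.normal(0.0, np.sqrt(R_var))
        y_i = y + rng.normal(0.0, np.sqrt(R_var), n_ens)  # perturbed observations
        ens = ens + k * (y_i - ens)          # member-by-member analysis update
        ens = np.clip(ens, 1e-6, 1.0 - 1e-6) # keep states in the basin (0, 1)
        if t >= spin_up:
            e_truth.append((xbar_f - truth) ** 2)
            e_pert.append((xbar_f - ens[rng.integers(n_ens)]) ** 2)
    return np.sqrt(np.mean(e_truth)), np.sqrt(np.mean(e_pert))

# With too little weight on the observations (B below its optimal value),
# the perturbed-analysis RMSE sits below the truth-based RMSE:
rmse_truth, rmse_pert = run_cycle(B_var=0.02**2)
print(rmse_truth, rmse_pert)
```

Because the analysis here gives insufficient weight to the observations, the analysis errors remain correlated with the background errors, and verification against the (perturbed) analysis is too optimistic.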
Figure 1 shows the RMS background-forecast and
analysis errors as a function of B. When B is small the forecast and
analysis errors (dark blue line and red line, respectively) are large and the
system is sub-optimal for these values. Verification against a perturbed
analysis gives a systematically lower RMS error (RMSE) than verification
against the truth (dark blue line) for small values of B, since
insufficient weight is given to the observations. The RMS error for
verification against a perturbed analysis becomes equal to that when
verifying against the truth for moderate values of B (∼ 0.049). This
point is also where the RMS error crosses the diagonal, indicating that the
background errors used in the assimilation are equal to the actual background
errors, and the assimilation is optimal. Verification against observations
gives RMS errors which are systematically higher than all the other
estimates. If observation errors are accounted for, then verification against
observations becomes very similar to verification against the truth (not
shown). Verification against unperturbed analyses gives smaller RMSEs than
all the other methods.
The circles in Fig. 1 indicate the points at which the RMS errors are
minimised for each curve. The minimum RMSE for verification against the
(unperturbed) analysis (purple line) occurs at a value of $B$ around 0.026,
which is much lower than the optimal value of B for verification against
the truth. The black line shows verification against perturbed analyses and
the minimum RMS error for this curve is when B is around 0.03. This is much
larger than the value of B for the minimum RMS error for verification
against (unperturbed) analyses. However, this value of B, around 0.03, is
much lower than the optimal (Kalman) value of around 0.049. When verifying
forecasts against the truth (dark blue line) the minimum value of the
forecast error is found for B around 0.036, lower than the optimal (Kalman)
value. This statement may seem counter-intuitive – the lowest forecast error
is found when the value of B used in the analysis is not equal to the
forecast error. However, recall that the logistic map is a non-linear map and
that the Kalman filter is only optimal for linear models. We have found a
similar result with other models (those of Lorenz, 1963, and Lorenz, 1995). For both of these models the forecast error is minimised when the
value of B used is larger than the actual forecast error (the value given
for the Kalman filter). For the logistic map the value of B which minimises
the analysis error is around 0.044, closer to the Kalman value than for the
forecast error – this appears to be a result consistent across the different
models.
Figure 1. RMS error of the forecast and analysis using the logistic model as a
function of the background error standard deviation used in calculating the
analysis. The red and blue lines show the RMSE for the analysis and forecast
measured against the truth state. The other lines show the RMSE of the
forecast, when verified against a different proxy for the truth. Verification
is calculated over 200 000 analysis and forecast cycles.
The vertical line in Fig. 1 marks the point at which the cross-term
$\langle (\bar{x}^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle$
(the last term of Eq. 6) is zero. We can see that this vertical line lies
at approximately the value of $B$ at which the assumed background error
equals the actual background error. The cross-term is plotted in Fig. 2, as
the solid green line, as a function of $B$. Also plotted is the correlation
between the forecast and analysis errors,
$\langle (\bar{x}^{f} - x^{t})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle$
(blue dashed line). This is non-zero for all the values
of B run in these experiments. This demonstrates the problem in verifying
against an unperturbed analysis that for all the values of B used here the
errors in the forecast are correlated with the errors in the analysis.
Figure 2. Important cross-terms calculated from a long analysis cycle using
the logistic model, as a function of the background error standard deviation
used in calculating the analysis. These are
$\langle (\bar{x}^{f} - x^{t})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle$ (blue
dashed) and
$\langle (\bar{x}^{f} - \bar{x}^{a})^{\mathrm{T}} (\bar{x}^{a} - x^{t}) \rangle$
(green solid).
One of the conditions required for verification against perturbed analyses
to give similar results to verification against the truth is for the
analysis ensemble spread to equal the RMS analysis error (Eq. 1). The
analysis and forecast ensemble spread and error are plotted in Fig. 3. The
ensembles appear to be well calibrated for most values of $B$. This may
change if model error were introduced into the system.
Figure 3. RMS error and ensemble spread of the forecast and analysis using the
logistic model, as a function of the background error standard deviation used
in calculating the analysis. The ensembles were created by each ensemble
member using the same assimilation method, assimilating perturbed observations.
6 Considering the effects of ensemble size
Next, we consider whether these results change substantially if fewer
ensemble members are used. Results with a 10-member ensemble are shown in
Fig. 4. This figure is rather similar to Fig. 1, with the most notable
difference being that the vertical line no longer passes through the point
where the other lines cross.
Figure 4. RMS error of the forecast and analysis as plotted in
Fig. 1, but using an ensemble with only 10 members.
To understand how the ensemble size can affect the results, we need to
return to estimates of the analysis error and spread. In deriving Eq. (6)
we relied on a cancellation of the analysis ensemble spread with the error
of the ensemble mean. For a limited-size ensemble this cancellation does not
hold precisely: the RMS error of an ensemble mean is slightly increased by
effects related to the limited ensemble size. To show this, consider that
the true state and each ensemble member are random draws from the same
distribution, which has mean $\mu$ and variance $\sigma^{2}$. We can thus
write the truth as the mean of this distribution plus a deviation from the
mean,
\[
x^{t} = \mu + s ,
\tag{28}
\]
where $\langle s \rangle = 0$ and $\langle s^{2} \rangle = \sigma^{2}$. For
an analysis ensemble member we would have
\[
x_{i}^{a} = \mu + w_{i} ,
\tag{29}
\]
where $w_{i}$ is a random draw from the same distribution as $s$. Thus we
may write the ensemble mean as
\[
\bar{x}^{a} = \mu + \bar{w} = \mu + \frac{1}{N} \sum_{i=1}^{N} w_{i} .
\tag{30}
\]
We see that $\bar{w}$ has mean zero and variance $\sigma^{2}/N$, where $N$
is the ensemble size. Using this, it can be shown that the mean-square error
of the ensemble mean is
\[
\langle (\bar{x}^{a} - x^{t})^{2} \rangle
 = \langle \bar{w}^{2} - 2 \bar{w} s + s^{2} \rangle
 = \sigma^{2} \left(1 + \frac{1}{N}\right) ,
\tag{31}
\]
since $\langle \bar{w} s \rangle = 0$. Because the ensemble mean is not
exactly equal to the mean of the distribution, the error of the ensemble
mean is slightly larger than the variance of the distribution. This is a
standard mathematical result (see, for instance, Hoel, 1984, p. 128). A
similar argument lets us now consider the ensemble perturbations:
\[
\langle (x_{i}^{a} - \bar{x}^{a})^{2} \rangle
 = \langle w_{i}^{2} - 2 w_{i} \bar{w} + \bar{w}^{2} \rangle .
\tag{32}
\]
From the definition of $\bar{w}$, and recalling that the $w_{i}$ are
independent samples, we see that
\[
\langle w_{i} \bar{w} \rangle = \frac{1}{N} \langle w_{i}^{2} \rangle
 = \frac{\sigma^{2}}{N}
\tag{33}
\]
and so
\[
\langle (x_{i}^{a} - \bar{x}^{a})^{2} \rangle
 = \sigma^{2} \left(1 - \frac{1}{N}\right) .
\tag{34}
\]
So, the ensemble spread is slightly smaller than the variance of the
distribution, owing to correlations between the perturbations and the
deviations of the ensemble mean from the distribution mean. This is often
accounted for by using the unbiased estimator of the ensemble variance.
Putting all this together, we find that for a well-calibrated ensemble
\[
\frac{\langle (\bar{x}^{a} - x^{t})^{2} \rangle}
     {\langle (x_{i}^{a} - \bar{x}^{a})^{2} \rangle}
 = \frac{N + 1}{N - 1} .
\tag{35}
\]
As the ensemble size goes to infinity this ratio tends to 1 and Eq. (1)
holds. However, for a limited ensemble size these differences mean that
verification against a perturbed analysis is not the same as verification
against the truth, even when the other conditions hold. This could be
corrected for if the analysis spread is known.
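This ratio is easy to confirm by simulation. The following Python sketch (an assumed standard-normal distribution with $N = 10$ members; not from the paper) estimates both the error of the mean and the spread:

```python
import numpy as np

rng = np.random.default_rng(4)
n_cases, N = 400_000, 10

# Truth and each of the N members are independent draws from N(0, 1).
draws = rng.normal(0.0, 1.0, (n_cases, N + 1))
xt, ens = draws[:, 0], draws[:, 1:]
xbar = ens.mean(axis=1)

err_mean = np.mean((xbar - xt) ** 2)          # approaches sigma^2 (1 + 1/N)
spread = np.mean((ens - xbar[:, None]) ** 2)  # approaches sigma^2 (1 - 1/N)
print(err_mean / spread, (N + 1) / (N - 1))   # the two ratios agree
```

For $N = 10$ the ratio is $11/9 \approx 1.22$, i.e. the squared error of the mean exceeds the (biased) spread estimate by over 20 %, which is why the small-ensemble curves differ from the large-ensemble ones.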
7 Longer lead times
As was discussed in Sect. 4, the argument that the final term in Eq. (24)
is zero requires the forecast being verified to be the background for the
analysis. However, we might expect this term to be zero at longer lead times
too, since otherwise it would be possible to produce a superior analysis. To
investigate this further we turn to the simple-model tests used earlier.
Verification results for longer lead times using the system described in
Sect. 5 are given in Fig. 5. This shows the ratio of the RMSE measured
against the truth to the RMSE measured against perturbed analyses. This
ratio is plotted for two choices of $B$. When the
Kalman value of B is used the two verifications give the same RMS error at
the first lead time (i.e. where the forecast is the background for the
analysis). At longer lead times the RMS error when verifying against a
perturbed analysis becomes larger than when verifying against the truth. This
is caused by the final term in Eq. (24) giving a positive contribution to
the verification against the perturbed analysis. The interpretation is that
$x^{t} - \bar{x}^{a}$ and $\bar{x}^{f} - \bar{x}^{a}$ are positively
correlated: errors in the analysis are anti-correlated with differences
between the forecast and the analysis. The correlation of analysis errors
with forecast-analysis differences may be related to the use of a nonlinear
model; the nonlinearity can lead to non-randomness of the errors, which in
turn leads to the correlation.
Figure 5. Ratio of the RMS errors of forecasts verified against truth and
perturbed analyses using the logistic map for various lead times. For the
solid line the background error was taken as the approximate Kalman value.
For the dashed line B was taken for the value which minimises the
short-period forecast error.
Also shown in Fig. 5 is the ratio when B is chosen to be
the value which gives the minimum forecast error – for the logistic map this
value is lower than the Kalman value for B. In this case verification
against perturbed analysis gives smaller RMSEs than verification against the
truth at short lead times. At longer lead times the verifications cross over
and the RMSE against perturbed analyses is greater than the RMSE against the
truth.
This behaviour at long lead times suggests that verification against a
perturbed analysis is most useful at short lead times. Nonetheless it avoids
the worst problems of verification against an unperturbed analysis.
Therefore, we argue that it is still a useful replacement for that method of verification.
8 Verification of NWP forecasts
In order to understand whether this method can be applied to numerical
weather prediction (NWP) systems we calculated the RMS error of a forecast
ensemble mean against observations and perturbed analyses. The RMS error
against analyses was calculated at observation locations so that the
quantities are directly comparable.
Figure 6. RMS errors of the MOGREPS ensemble mean as a function of forecast lead
time for forecasts of 500 hPa geopotential height. The forecast errors are
reported for verification against observations and perturbed and unperturbed
analyses.
Figure 6 shows the RMS error of the forecast ensemble mean as a
function of lead time for 500 hPa geopotential height for the Met Office
Global and Regional Ensemble Prediction System, MOGREPS (Bowler et al., 2008).
At the time the forecasts were taken the MOGREPS ensemble consisted of a
random sample of 11 members selected from 22 perturbed members used to cycle
the ETKF every 6 h, plus the control member. The time average has been taken
over 1 month of data. The different panels in Fig. 6 represent means over
different geographical areas: Northern Hemisphere, tropics, Southern
Hemisphere and the whole globe. Each panel shows the RMS error of the
ensemble mean against the unperturbed analysis in red, against the perturbed
analyses in black, against the observations in blue, and against the
observations with observation errors accounted for in green. An observation
error of 9.4 m (RMS) has been assumed.
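The observation-error correction behind the green curves amounts to removing the assumed observation-error contribution in quadrature; a minimal helper for this (the 15 m raw RMSE in the example is hypothetical) is:

```python
import numpy as np

def corrected_rmse(rmse_vs_obs, obs_error_rms):
    """Remove an assumed (uncorrelated) observation-error contribution,
    in quadrature, from an RMSE computed against observations."""
    return float(np.sqrt(max(rmse_vs_obs**2 - obs_error_rms**2, 0.0)))

# e.g. a hypothetical raw RMSE of 15 m with the assumed 9.4 m observation error:
print(corrected_rmse(15.0, 9.4))
```

The `max(..., 0.0)` guards against a raw RMSE smaller than the assumed observation error, which would otherwise give an imaginary result; in practice that situation signals an over-estimated observation error.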
Verification against observations gives RMS errors which are systematically
higher than all other estimates, while verification against unperturbed
analyses provides smaller RMS error than verification against observations
and perturbed analyses. This is in agreement with
Fig. 1. The exception is for the Southern
Hemisphere, where the error against observations becomes smaller than the
estimates against analyses after T+ 60 h. When observation errors are
accounted for, the verification against the observations is very similar to
the verification against perturbed analyses from T+ 0 h to
T+ 36 h for the Northern and Southern hemispheres, while for longer
lead times it gives lower RMS errors. This does not happen in the tropics
since it is likely that verification includes the contribution of systematic
errors which are not accounted for in the analysis perturbations. This is
expected since 500 hPa geopotential height does not provide a good
representation of what happens in the tropics.
The consistency of the RMS errors for short lead times in the northern and
southern extra-tropics when calculated against perturbed analyses and
observations (when subtracting observation error) suggests that this ensemble
meets many of the required criteria. At longer lead times verification
against perturbed and unperturbed analyses gives larger errors than
verification against observations, when subtracting the observation error.
This is consistent with the results in Fig. 5: when analysis and forecast
errors are no longer correlated, the effect of analysis error is to
over-estimate the RMSE.
9 Conclusions
We have shown that verification against a perturbed analysis gives the same
RMS errors as verification against the truth, under certain conditions. These
conditions require that the analysis ensemble is ideal (its RMS spread
matches the RMS error in the mean analysis), that the analysis is optimal and
that the ensemble size is large. Although NWP data assimilation systems are
typically well tuned (to maximise forecast performance), none of these
conditions is likely to hold exactly in practice. Additionally, the above
results only apply to a forecast which is the background for the analysis
against which it is verified.
In spite of these limitations we believe that this may be a useful approach
to verification. Firstly it will give more realistic results than
verification against an unperturbed analysis in most situations. Secondly the
alternative is to verify against observations and explicitly account for the
effect of observation error. Given the difficulty in estimating observation
error and the fact that many parts of the world are sparsely observed, this
has its own limitations. The verification results for NWP forecasts indicate
that our method gives very similar results to verification against
observations, when observation error is accounted for, at short lead times
in the
extra-tropics. Given that the problems of verification against unperturbed
analyses are most pronounced at short lead times, our method is potentially
valuable for verification of short-term NWP forecasts.
It would be interesting to further explore some of the aspects of this
method. For instance, what is the effect of using an analysis ensemble which
is over-spread in some areas and under-spread in others? This study also
demonstrated that for a non-linear model the Kalman filter solution may not
minimise the system's forecast error. We feel that a better understanding of
this result would be beneficial.
Acknowledgements
The analysis of limited ensemble size came about through discussion with
Jonathan Flowerdew. Rob Darvell gave extensive assistance in the verification
of the NWP forecasts.
Edited by: O. Talagrand
Reviewed by: five anonymous referees
References
Berre, L., Stefanescu, S., and Pereira, M.: The representation of the analysis
effect in three error simulation techniques, Tellus A, 58, 196–209, 2006.
Bowler, N. E.: Explicitly accounting for observation error in categorical
verification of forecasts, Mon. Weather Rev., 134, 1600–1606, 2006.
Bowler, N. E.: Accounting for the effect of observation errors on verification
of MOGREPS, Meteorol. Appl., 15, 199–205, 2008.
Bowler, N. E., Arribas, A., Mylne, K. R., Robertson, K. B., and Beare, S. E.:
The MOGREPS short-range ensemble prediction system, Q. J. Roy. Meteorol.
Soc., 134, 703–722, 2008.
Buizza, R., Houtekamer, P., Toth, Z., Pellerin, G., Wei, M., and Zhu, Y.: A
comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems,
Mon. Weather Rev., 133, 1076–1097, 2005.
Candille, G. and Talagrand, O.: Impact of observational error on the validation
of ensemble prediction systems, Q. J. Roy. Meteorol. Soc., 134, 957–971, 2008.
Ciach, G. J. and Krajewski, W. F.: On the estimation of radar rainfall error
variance, Adv. Water Resour., 22, 585–595, 1999.
Clayton, A. M., Lorenc, A. C., and Barker, D. M.: Operational implementation
of a hybrid ensemble/4D-Var global data assimilation system at the Met
Office, Q. J. Roy. Meteorol. Soc., 139, 1445–1461, 2013.
Evensen, G.: Sequential data assimilation with a nonlinear quasi-geostrophic
model using Monte Carlo methods to forecast error statistics, J.
Geophys. Res.-Oceans, 99, 10143–10162, 1994.
Grassberger, P. and Procaccia, I.: Measuring the strangeness of strange
attractors, Physica D, 9, 189–208, 1983.
Hoel, P. G.: Introduction to mathematical statistics, 5th Edn., Wiley, 1984.
Houtekamer, P., Lefaivre, L., Derome, J., Ritchie, H., and Mitchell, H.: A
system simulation approach to ensemble prediction, Mon. Weather Rev., 124, 1225–1242, 1996.
Kailath, T.: An innovations approach to least-squares estimation, Part I:
linear filtering in additive white noise, IEEE T. Autom. Control, 13, 646–655, 1968.
Lorenz, E. N.: Deterministic nonperiodic flow, J. Atmos. Sci., 20, 130–148, 1963.
Lorenz, E. N.: Predictability: a problem partly solved, in: Proceedings of
the seminar on predictability, vol. I, ECMWF, Reading, Berkshire, UK, 1–18, 1995.
Peitgen, H. O., Jurgens, H., and Saupe, D.: Chaos and Fractals: New Frontiers
of Science, Springer-Verlag, New York, 1992.
Saetra, O., Hersbach, H., Bidlot, J.-R., and Richardson, D. S.: Effects of
observation errors on the statistics for ensemble spread and reliability,
Mon. Weather Rev., 132, 1487–1501, 2004.
Weigel, A. P.: Ensemble forecasts, in: Forecast verification: A practitioner's
guide in atmospheric science, edited by: Jolliffe, I. T. and Stephenson,
D. B., Wiley-Blackwell, Chichester, England, p. 144, 2011.