It has long been known that verification of a forecast against the sequence of analyses used to produce those forecasts can underestimate the magnitude of forecast errors. Here we show that, under certain conditions, the verification of a short-range forecast against a perturbed analysis coming from an ensemble data assimilation scheme can give the same root-mean-square error as verification against the truth. This means that a perturbed analysis can be used as a reliable proxy for the truth. However, the conditions required for this result to hold are rather restrictive: the analysis must be optimal, the ensemble spread must be equal to the error in the mean, the ensemble size must be large and the forecast being verified must be the background forecast used in the data assimilation. Although these criteria are unlikely to be met exactly, in most cases verification against a perturbed analysis still gives better results than verification against an unperturbed analysis.

We demonstrate the application of these results in an idealised model framework and in a numerical weather prediction context. In deriving this result we recall that an optimal (Kalman) analysis is one for which the analysis increments are uncorrelated with the analysis errors.

Verification of forecasts is an important aspect of the development of a
forecasting system. Any change to the system should be tested to
demonstrate that the forecasts are genuinely improved. Each forecast is
typically launched from an analysis state, which combines
observations with a previous short-range forecast from the same system. A common
practice is to use the analysis from such a system as the truth against which
to verify (for instance see

One solution to the problem of verification against analyses is to verify
forecasts against observations. The observations do not depend on the
forecast, and so provide an independent measurement of the true state of the
system (although any time correlation in observation errors can create a
correlation between forecast and observation errors). However, observations
themselves are contaminated by errors. Methods exist to account for the
effect of these errors on verification statistics. As an alternative solution
to these problems, we offer the idea of performing the verification against a
perturbed analysis.

We are looking to verify a forecast

Given the above definitions we consider the RMS error calculated against a
perturbed analysis, that is, a randomly chosen member of the analysis
ensemble. The mean-square error of the forecast against this analysis is

To continue the analysis, we introduce the truth state,

We have previously assumed that the ensemble of analyses is ideal
(Eq.

In the second-to-last term, the second bracket is the difference between a randomly chosen analysis ensemble member and the ensemble mean. Averaged over all possible choices of the random member, this bracket is precisely zero, so the term vanishes in expectation. Since all ensemble members are statistically equivalent, the term should also vanish in practice provided the number of verification cases is large enough.

The final term vanishes if the data-assimilation system is in some sense
optimal: if it were not zero, then it would be possible to bring the
ensemble-mean analysis closer to the truth by post-processing it using the
difference

Therefore, we conclude that verification against a perturbed analysis will
give the same RMS error as verification against the truth if the analysis
ensemble is ideal (the spread equals the error of the mean analysis) and the
analysis is statistically optimal (could not be improved by simple
post-processing). In a sense Eq. (
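This equality can be checked in a scalar Monte Carlo sketch. The Gaussian setting and the error standard deviations below are our illustrative assumptions, not values from the paper; the analysis is the standard scalar Kalman update and the ensemble is "ideal" by construction (a random member is the mean analysis plus a perturbation with the analysis error statistics).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000                      # number of verification cases
sigma_b, sigma_o = 1.0, 0.8      # assumed background and observation error std

truth = rng.standard_normal(n)
background = truth + sigma_b * rng.standard_normal(n)  # forecast to verify
obs = truth + sigma_o * rng.standard_normal(n)

# Statistically optimal (Kalman) scalar analysis
k = sigma_b**2 / (sigma_b**2 + sigma_o**2)
analysis = background + k * (obs - background)
sigma_a = np.sqrt((1.0 - k) * sigma_b**2)  # analysis error std

# Ideal ensemble: a random member = mean analysis + analysis-error noise
member = analysis + sigma_a * rng.standard_normal(n)

rmse_truth = np.sqrt(np.mean((background - truth) ** 2))
rmse_member = np.sqrt(np.mean((background - member) ** 2))
rmse_mean = np.sqrt(np.mean((background - analysis) ** 2))
```

With these settings the RMSE against the random member matches the RMSE against the truth (both close to `sigma_b`), while the RMSE against the unperturbed mean analysis is substantially smaller, illustrating the underestimation described above.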

It might be thought that, since a true observation is statistically indistinguishable from a random member of a set of perturbed observations, verification against perturbed observations would also be equivalent to verification against the truth. However, we show that this is not the case.

Consider the final term in Eq. (
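The inflation can also be seen directly in a scalar Monte Carlo sketch (the error standard deviations below are illustrative assumptions): the perturbation is added to a quantity that already contains observation error, so the two error sources accumulate rather than cancel.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
sigma_f, sigma_o = 0.5, 0.3      # assumed forecast and observation error std

truth = rng.standard_normal(n)
forecast = truth + sigma_f * rng.standard_normal(n)
obs = truth + sigma_o * rng.standard_normal(n)
pert_obs = obs + sigma_o * rng.standard_normal(n)  # perturbed observation

rmse_truth = np.sqrt(np.mean((forecast - truth) ** 2))
rmse_pert_obs = np.sqrt(np.mean((forecast - pert_obs) ** 2))
# In expectation: rmse_pert_obs**2 = sigma_f**2 + 2 * sigma_o**2
```

The mean-square error against the perturbed observation picks up twice the observation error variance, so it overestimates the true forecast error rather than reproducing it.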

Although the use of perturbed observations is unhelpful, it is possible to
subtract the estimated observation error from the RMSE calculated using
unperturbed observations. This has been used successfully by some authors
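In the same illustrative scalar setting (assumed error magnitudes, forecast and observation errors independent), the correction amounts to subtracting the known observation error variance from the mean-square error before taking the square root:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
sigma_f, sigma_o = 0.5, 0.3      # assumed forecast and observation error std

truth = rng.standard_normal(n)
forecast = truth + sigma_f * rng.standard_normal(n)
obs = truth + sigma_o * rng.standard_normal(n)

mse_obs = np.mean((forecast - obs) ** 2)
# Remove the (assumed known) observation error variance
rmse_corrected = np.sqrt(mse_obs - sigma_o**2)
rmse_truth = np.sqrt(np.mean((forecast - truth) ** 2))
```

The corrected value recovers the RMSE against the truth, but only as accurately as the observation error variance itself is known.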

Earlier we indicated that an analysis for which

To calculate an analysis state we use the following formula:

Now, Eq. (

In this calculation the forecast

This derivation also informs how the analysis ensemble is created. Following
Eq. (

A toy-model data assimilation system was created to test whether the above
assumptions can hold in an idealised context. For this, the logistic map was
used

We initialise an ensemble by randomly choosing states in the interval (0, 1).
The logistic map is applied to each member to create a forecast ensemble. The
forecast ensemble is transformed into an analysis ensemble by each member
assimilating a perturbed observation. The observations are created by adding
a perturbation to the run of the truth model. These perturbations are
distributed according to
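A rough sketch of such a cycling experiment might look as follows. The map parameter, error magnitudes, gain and ensemble size are our assumptions, not the paper's settings, and each member assimilates its perturbed observation with a fixed scalar Kalman gain; states are clipped to keep the map inside (0, 1), a pragmatic choice not described in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
r = 4.0                            # assumed (fully chaotic) logistic parameter
sigma_o = 0.05                     # assumed observation error std
sigma_b = 0.05                     # assumed background error std used in the gain
n_mem, n_cycles = 100, 20_000

def logistic(x):
    return r * x * (1.0 - x)

truth = rng.uniform(0.2, 0.8)
ens = rng.uniform(0.0, 1.0, n_mem)          # initial analysis ensemble
k = sigma_b**2 / (sigma_b**2 + sigma_o**2)  # fixed scalar Kalman gain

sq_truth, sq_pert = [], []
for _ in range(n_cycles):
    truth = logistic(truth)
    fcst = logistic(ens)                    # background (forecast) ensemble
    obs = truth + sigma_o * rng.standard_normal()
    # each member assimilates an independently perturbed observation
    ens = fcst + k * (obs + sigma_o * rng.standard_normal(n_mem) - fcst)
    ens = np.clip(ens, 1e-6, 1.0 - 1e-6)    # keep states inside (0, 1)
    member = ens[rng.integers(n_mem)]       # randomly chosen analysis member
    sq_truth.append((fcst.mean() - truth) ** 2)
    sq_pert.append((fcst.mean() - member) ** 2)

rmse_truth = np.sqrt(np.mean(sq_truth))
rmse_pert = np.sqrt(np.mean(sq_pert))
```

With a well-tuned gain the RMSE against a random analysis member should be similar to the RMSE against the truth, although with these ad hoc settings exact agreement is not expected.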

Figure

The circles in Fig.

RMS error of the forecast and analysis using the logistic model as a function of the background error standard deviation used in calculating the analysis. The red and blue lines show the RMSE for the analysis and forecast measured against the truth state. The other lines show the RMSE of the forecast, when verified against a different proxy for the truth. Verification is calculated over 200 000 analysis and forecast cycles.

The vertical line in Fig.

Important cross-terms calculated from a long analysis cycle using
the logistic model, as a function of the background error standard deviation
used in calculating the analysis. These are

One of the conditions required for verification against perturbed analyses to
give similar results to verification against the truth is for the analysis
ensemble spread to equal the RMS analysis errors (Eq.

RMS error and ensemble spread of the forecast and analysis using the logistic model, as a function of the background error standard deviation used in calculating the analysis. The ensembles were created by each ensemble member using the same assimilation method, assimilating perturbed observations.

Next, we consider whether these results change substantially if fewer
ensemble members are used. Results with a 10 member ensemble are shown in
Fig.

RMS error of the forecast and analysis as plotted in
Fig.

To understand how ensemble size can affect the results, we need to return to
estimates of the analysis error and spread. In Eq. (

As was discussed in Sect.

Verification for longer lead times using the system described in
Sect.

Ratio of the RMS errors of forecasts verified against truth and
perturbed analyses using the logistic map for various lead times. For the
solid line the background error was taken as the approximate Kalman value.
For the dashed line

Also shown in Fig.

This behaviour at long lead times suggests that verification against a perturbed analysis is most useful at short lead times. Nonetheless, it avoids the worst problems of verification against an unperturbed analysis, and we therefore argue that it remains a useful replacement for that method of verification.

In order to understand whether this method can be applied to numerical weather prediction (NWP) systems we calculated the RMS error of a forecast ensemble mean against observations and perturbed analyses. The RMS error against analyses was calculated at observation locations so that the quantities are directly comparable.

RMS errors of MOGREPS ensemble mean as a function of forecast lead time for forecasts of 500 hPa geopotential height. The forecast errors are reported for verification against observations and perturbed and unperturbed analyses.

Figure

Verification against observations gives RMS errors which are systematically
higher than all other estimates, while verification against unperturbed
analyses provides smaller RMS error than verification against observations
and perturbed analyses. This is in agreement with
Fig.

The consistency at short lead times of the RMS errors in the northern and
southern extra-tropics, when calculated against perturbed analyses and
against observations (after subtracting observation error), suggests that
this ensemble meets many of the required criteria. At longer lead times
verification against perturbed and unperturbed analyses gives larger errors
than verification against observations with observation error subtracted. This
is consistent with the results in Fig.

We have shown that verification against a perturbed analysis gives the same RMS errors as verification against the truth, under certain conditions. These conditions require that the analysis ensemble is ideal (its RMS spread matches the RMS error in the mean analysis), that the analysis is optimal and that the ensemble size is large. Although NWP data assimilation systems are typically well tuned (to maximise forecast performance), none of these conditions is likely to hold exactly in practice. Additionally, the above results only apply to a forecast which is the background for the analysis against which it is verified.

In spite of these limitations we believe that this may be a useful approach to verification. Firstly, it will give more realistic results than verification against an unperturbed analysis in most situations. Secondly, the alternative is to verify against observations and explicitly account for the effect of observation error; given the difficulty in estimating observation error and the fact that many parts of the world are sparsely observed, this has its own limitations. The verification results for NWP forecasts indicate that, for short lead times in the extra-tropics, the method gives very similar results to verification against observations once observation error is accounted for. Given that the problems of verification against unperturbed analyses are most pronounced at short lead times, our method is potentially valuable for verification of short-term NWP forecasts.

It would be interesting to further explore some of the aspects of this method. For instance, what is the effect of using an analysis ensemble which is over-spread in some areas and under-spread in others? This study also demonstrated that for a non-linear model the Kalman filter solution may not minimise the system's forecast error. We feel that a better understanding of this result would be beneficial.

The analysis of limited ensemble size came about through discussion with Jonathan Flowerdew. Rob Darvell gave extensive assistance in the verification of the NWP forecasts.

Edited by: O. Talagrand
Reviewed by: five anonymous referees