Embedding reconstruction methodology for short time series - application to large El Nino events

. We propose an alternative approach for the embedding space reconstruction method for short time series. An m -dimensional embedding space is reconstructed with a set of time delays including the relevant time scales characterizing the dynamical properties of the system. By using a maximal predictability criterion a d -dimensional subspace is selected with its associated set of time delays, in which a local nonlinear blind forecasting prediction performs the best reconstruction of a particular event of a time series. An locally unfolded d -dimensional embedding space is then obtained. The efﬁciency of the methodology, which is mathematically consistent with the fundamental deﬁnitions of the local nonlinear long time-scale predictability, was tested with a chaotic time series of the Lorenz system. When applied to the Southern Oscillation Index (SOI) (observational data associated with the El Ni ˜ no-Southern Oscillation phenomena (ENSO)) an optimal set of embedding parameters exists, that allows constructing the main characteristics of the El Ni ˜ no 1982–1983 and 1997–1998 events, directly from measure-ments up to 3 to 4 years in advance.


Introduction
Predictability is a fundamental physical property of deterministic systems. Predicting the time evolution of a system is an important problem in many disciplines of science, such as economics, dynamical systems and weather forecasting. It is indeed a defiant problem if the only information available from the system comes from time series of some univariate experimental data. Numerous tools have been developed precisely for this purpose (see, e.g. Kantz and Schreiber, 2005). An important tool is the method Correspondence to: H. F. Astudillo (hastudil@udec.cl) of phase space (or state space) reconstruction introduced by Packard et al. (1980) and mathematically stated as the embedding theorem by Takens (1981), and also later by Sauer et al. (1991). The embedding theorem states that the dynamics of a physical system recorded in a time series s (t 1 ),s (t 2 ),...,s (t k )..,s (t n ) can be fully captured or embedded in the m-dimensional phase space defined by x(t k ) = [s (t k ),s (t k + τ 0 ),........,s (t k + (m − 1)τ 0 )], where τ 0 is a time delay. If the attractor is of dimension d A , then, given any time delay τ 0 , an embedding dimension m ≥ 2d A + 1 is required. Under that condition nearly all delay reconstructions are one to one and faithful, appropriately diffeomorphic to the original phase space. That is, under certain generic conditions the state space reconstruction is equivalent to the original phase space. This equivalence ensures that differential information is preserved and allows for both qualitative and quantitative analysis. However, these theorems are existence proofs (Abarbanel et al., 1993) and they do not directly explain how to get a suitable time delay or embedding dimension from a finite time series (Pecora et al., 2007). Implementation of the reconstruction method requires that the time delay and the dimension be determined by using heuristic and/or statistical criteria (for reviews Abarbanel et al., 1993;Schreiber, 1999; see also Cellucci et al., 2003;Letellier et al., 2008). It is important to stress that the embedding dimension required by the embedding theorem is a sufficient condition and could be larger than the necessary embedding dimension (Abarbanel et al., 1993). To find the necessary minimal embedding dimension Pecora et al. (1995Pecora et al. ( , 2007 proposed an alternative approach that reduced the problems of choosing all embedding parameters to one. This problem is addressable by using a single statistical test formulated directly from the reconstruction theorems, a continuity test. This is a global test in the sense that it uses information on the whole attractor. For the Lorenz system they found that three dimensions (for τ = [0,τ 0 ,4τ 0 ] with τ 0 = 14) are required to unfold the Lorenz attractor efficiently. This is indeed much better than Published by Copernicus Publications on behalf of the European Geosciences Union and the American Geophysical Union. the 16 components required with the fixed time delay embedding methodology (Pecora et al., 2007). Let's note that varying time delays for the Lorenz attractor were addressed also by Garcia and Almeida (2005), indicating that different time delays are more efficient than the conventional use of a fixed time delay. Variable embedding lag vectors have been also used in some global modelling strategies (see e.g., Judd and Mees, 1998).
Predictability is one of the system's properties that can be used to obtain a set of parameters for the embedding space reconstruction method (Sugihara and May, 1990;Casdagli et al., 1991;Alparslan et al., 1998). It's precisely by using the prediction power that Regonda et al. (2005) developed an approach based on local polynomial regression for ensemble forecasting of time series. It was applied to two kinds of time series: chaotic (Hennon and Lorenz) and observational data (Great Salt Lake and NIÑO3 Index). Later, Meng and Peng (2007) (see also Zhao et al., 2009), also using the prediction power criterion, proposed a new local linear prediction model to obtain optimal embedding parameters. They showed that its prediction performance was superior to the traditional local linear prediction. In addition they also indicated (for the Lorenz system) that the optimal parameters found, change accordingly with the initial conditions used. In line with previous results by Regonda et al. (2005), that indicated that the forecasts initiated from several contiguous starting points show a change in the local predictability.
The basic idea in non-linear model prediction relies on using adequately the information on the temporal evolution of orbits which lie on a compact attractor in phase space. Each orbit has near it a whole neighbourhood of points in phase space which also evolve under the dynamics to new points (e.g. Abarbanel et al., 1993). Using the information about how neighbours evolve, it is possible to use phase space information to construct a map. The mapping function can be estimated using local models in which the function approximation at each time step is done from data sets of the local neighbourhood only in a piecewise manner, or global models in which the function approximation is done for the whole domain (Abarbanel et al., 1993).
In this paper we report an alternative methodological approach for the embedding space reconstruction method for short time series. We restrict our study to local prediction models. Our proposition relies on the use of a maximal predictability criterion to reconstruct an embedding space. This is performed in the sense that the dynamical evolution of the system recorded in the experimental data can be reproduced during a given time span. Under this criterion, we also obtain an optimal embedding dimension by selecting the optimal d-dimensional subspace from a larger m-dimensional embedding space. The m-dimensional embedding space is reconstructed so that the relevant time scales that characterize the phenomenon are contained in the set of time delays of the embedding. We apply this methodology first to a time series from the Lorenz system with chaotic regime param-eters, then to an observational time series, a benchmark in climatic science, the Southern Oscillation Index (SOI), the main index describing the temporal evolution of the El Niño-Southern Oscillation (ENSO) phenomena. In Sect. 2 we describe the methodology and in Sect. 3 we show its theoretical basis, while in Sects. 4 and 5 we present the results for both the Lorenz system and SOI that are later discussed in Sect. 6.

Methodology
The first step of our methodology consists in reconstructing an m-dimensional embedding space by defining an mdimensional delay phase space vector with components x i = s(t +τ i ), where τ i = [i −1]τ 0 and τ 0 is a basic time scale. The embedding dimension m is chosen such that τ m = (m − 1)τ 0 is larger than the largest of the most relevant time scales characterizing the phenomenon. In case of unknown phenomena the relevant time scales can be identified through different time series methods, for example Fourier spectrum. For the time delay τ 0 we choose the zero autocorrelation time or the 68% decay time of autocorrelation of the time serie s (Abarbanel et al., 1993;Cellucci et al., 2003). Similar to the unified approach to resolve the attractor reconstruction problem proposed by Pecora et al. (2007), we seek a set of time delays τ j , or equivalently, a set of delay vector components x j (t) = s(t + τ j ) defined in a d-dimensional space, which is a subspace of the m-dimensional reconstructed embedding space. In this way, we obtain a vector of variable delay times, which is optimal because it allows a better description of the evolution of the system. To determine when the optimal embedding parameters are obtained, we use the maximal local predictability criterion. We seek, for the reconstruction of a particular event, to begin at some "starting" (or "present") point and reconstruct or finish p time steps into the "future". Under this criterion we assume that, when the whole selected event is optimally reconstructed with a local blind forecast technique, the attractor is optimally unfolded, at least in a neighbourhood of the "starting" point. The set of time delays τ j , with j = 1,...,d < m is the optimal set for time delays of a d-dimensional subspace of the m−dimensional embedding space.
To perform a blind forecast p time steps into the future we use a linear function (Farmer and Sidorowich, 1987) where t is the time step and the a ij coefficients are locally determined by considering k future near neighbours that belong to the near orbits (closest orbits). The forecasting time T = p t is large enough so that the whole event is included within the forecast time. To this end, it is important to notice that in order to evaluate the local topology of the embedding Nonlin. Processes Geophys., 17, 753-764, 2010 www.nonlin-processes-geophys.net/17/753/2010/ k k+1 k+2 n n+1 n−1 Fig. 1. A schematic view illustrating the collection of spatial neighbours and their time evolution in the state space. In the scheme, n corresponds to the starting point or present point, n − 1 is the nearest neighbour in the same orbit, n + 1 correspond to the forecast for the first step. The k's correspond to the forward neighbours of the nearest orbit. To forecast the next steps, the vector containing the indices of the close neighbours (small black dots) is incremented by one for each step. space, we use information contained only in past trajectories of the attractor. In particular, in the present study we use only the closest orbit. We use the term "blind forecast" in the following sense: at the starting point the nearest neighbour is selected in the embedding space by computing Euclidean distances from the starting point to each older point in the time series. The nearest neighbour is obtained as the minimum of the distance to the starting point. The determination of the nearest neighbour also determines the closest orbit. In our procedure, we do not determine the closest orbit again for the further steps of the forecast but we follow the closest orbit determined in the first step.
To monitor the accuracy of the reconstructions, we define the cumulative error between the event and its reconstruction by where σ s is the standard deviation of the time series and x d is the last embedding vector component. We seek the set of time delays that minimizes the cumulative error at the termination point, corresponding to the point when the event finishes.
The implementation of the proposed algorithm is as follows: 1. Select the event to be reconstructed.
2. Select the "starting" or "present" point and the termination point that marks the end of the event p time steps into the "future".
3. Reconstruct an m-dimensional embedding space with a set of time delays given by τ j = [j − 1]τ 0 , with j = 1,...,m, and τ 0 some basic time scale. The maximum time delay τ m has to be larger than the maximal relevant time scale of the phenomenon.
4. Compute a ij for each step using the least square method by feeding a number of near neighbours of the closest orbit (see Fig. 1).
5. Perform a blind forecast p time steps into the "future", for all subspaces of dimension d in the m-dimensional embedding space.
6. For each d-dimensional subspace, compute the cumulative error at the termination point.
7. Select the set of time delays (the optimal set of time delays) that minimizes the cumulative error at the termination point. The number of elements of the optimal set of time delays, d, is the optimal local embedding dimension.
Note that the application of this methodology requires the specification of the time interval to be analysed for the predictability of the event, the time prediction. Under this requirement blind forecasts are computed for a given dimension with all possible combinations of time delays.
In all the cases shown here, the first delay was scanned for delay times beginning with the time for which the autocorrelation is zero. In this way we avoid the reconstruction of a collapsed attractor. Other ways of choosing the lower limit for scanning the first delay can be easily studied. We focused our study on variations of the basic time scale. In our study we reduce the value of the basic time scale of the value for which the autocorrelation drops to 68%. As the results show, for the latter case, the methodology provides a better reproduction of the events. The method presented here may be improved by using nonlinear predictors (e.g. polynomial) and including weights to the nearest neighbours.
It is also important to stress that in our method the procedure starts first with the determination of the local phase space topology around the last known state when obtaining the nearest neighbour and the closest orbit. Forecasts of future states are obtained with a linear model following the nearest (older) trajectory on the attractor, which in essence provides information about the nonlinear characteristics of the local dynamics of the physical process.

Local predictability
To visualize this last point, let us recall some basic definitions on predictability in dynamic systems. Lyapunov exponents quantify predictability through globally averaged effective growth rates of uncertainty in the limits of large time and small uncertainty; thus by construction they are of limited use. To obtain a quantitative estimate of the accuracy of a particular forecast, the local dynamics of uncertainties about that initial condition are more relevant (Ziehmann et al., 2000). Local quantities provide a more detailed description of the dynamics since they probe the motion on short enough time-scales to distinguish the different states that the system passes through. In such situations, distributions of local Lyapunov exponents, namely the value of the Lyapunov exponents over finite segments of a trajectory, provide a better probe of the underlying nonuniform attractor. In particular, the local Lyapunov exponent can be negative even when the global Lyapunov exponent is positive (as on a typical chaotic attractor) or vice versa (on strange nonchaotic attractors) (Datta and Ramaswamy, 2003).
We follow and reproduce the definitions of linear and nonlinear finite time lyapunov exponents given by Ding and Li (2007). By definition, chaotic systems display sensitive dependence on initial conditions: two initially close trajectories can diverge exponentially in the phase space with a rate given by the largest Lyapunov exponent λ 1 . The lyapunov exponents are defined as follows: Let us consider an n-dimensional continuous-time dynamical system d dt where X = (x 1 ,x 2 ,...,x n ) T and F is an n-dimensional vector field. By performing an infinitesimal dispacement δ(t) = X(t) − X 0 (t) from the fiducial orbit X 0 (t), the linearized equations are given by where J(X) is the Jacobian matrix, and G(X,δ) are the high order nonlinear terms of the perturbations δ. For initial infinitesimal perturbation Eq. (4) is linearized dropping the nonlinear perturbation term. The linear propagator µ(X 0 ,t) is obtained integrating the linearized equations along a fiducial orbit X 0 (t), and δ µ (t) = µ(X 0 ,t)δ(0) gives the evolution of any infinitesimal initial error δ(0) forward for a time t to δ µ (t) (Ziehmann et al., 2000). Local or finite time lyapunov exponent is defined bỹ The largest lyapunov exponent is defined by: provides that the lyapunov exponent exists (Ott and Yorke, 2008). Because ergodicity the global largest lyapunov exponent do not depends on X 0 (Oseledec, 1968).
Nonlinear finite time lyapunov exponent are given by where the nonlinear propagator η(X 0 ,δ(0),t) is obtained integrating Eq. (4) along a fiducial orbit X 0 (t). Now, δ η (t) = η(X 0 ,δ(0),t)δ(0) gives the evolution of any initial error δ(0) forward for a time t to δ η (t). The mean nonlinear finite-time lyapunov exponent is given bȳ where <> N denotes the ensemble average. Numerical results demonstrate superiority of the nonlinear finite time lyapunov exponent in determining the limit of predictability of chaotic systems in comparison with linear one (Ding and Li, 2007). Local predictability limit gives a measure of long time-scale local predictability on the atractor (Ding et al., 2008). It is different from the local divergence rates or local Lyapunov exponent, which are restricted to conditions of sufficiently small perturbations and only applicable to reflect the short time-scale local predictability. Therefore, the distribution of the local predictability limit does not appear some regions of underlying high predictability or low predictability in phase space (Ziehmann et al., 2000;Ding et al., 2008).
In order to show the theoretical basis of our methology we write Eq. (1) as, X(n + 1) = a(n)X(n) + X(n) = A(n)X(n), where A(n) = a(n) + I, and I the identity matrix. It is important to note that the matrix a(n) is evaluated at each time step in our procedure. Let us denote by X 0 (0) the nearest neighbour of the nearest orbit (the fiducial orbit) to the present point, and X 0 (n) the n-steps future neighbour on the fiducial orbit. Then, we can also write so that, and, In our methodology we obtain the future of the present orbit by following the fiducial orbit, so that A 0 (n) = A(n). We can write δ(n + 1) = A(n)δ(n), or, δ(n + 1) = η(X 0 ,δ(0),n)δ(0).
Nonlin Because in most of the measured time series the present and the nearest orbit (the fiducial orbit) are not arbirarily close to each other, we identify Eq. (14) with the nonlinear propagated error δ η . We conclude that our procedure ressemble without any approximation the definition of the nonlinear finite time lyapunov exponents. The selected subspace of the embedding space minimizes the cumulative error at the termination point, that is the subspace where, locally, the reproduced attractor is less unstable.
The present methodology, which is mathematically consistent with the fundamental definitions of the local nonlinear predictability, provides a simple procedure to study nonlinear, long-term local predictability in observational time series.

Application to the Lorenz System
To test our methodology we use a time series obtained from the well known Lorenz system which is described by the following equations (Lorenz, 1963): The time series was produced using the parameters, σ = 16, r = 45.92, b = 4 and initial condition (x 0 ,y 0 ,z 0 ) as (1.,0.,0.) to generate a time series of 6000 observations. With these parameters the Lorenz system show chaotic behaviour. The standard fourth-order Runge-Kutta method is used to solve the equations with integration step of t = 0.05. A region with similar characteristics to that shown by By using the Fourier spectrum we determine the temporal time scales that drives the phenomena. Therein the largest statistically significant time scale is a period of 334 steps. The zero autocorrelation time for the time series is τ 0 = 15. Thus, we choose the largest time delay to be τ max = 345. To determine the a ij coefficients of Eq. (1) we select the nearest neighbour of the same orbit to the present point plus 8 forward near neighbours of the closest orbit.
We choose two "starting" points to show the performance of our method; a point near x = 0 and a point which is away from x = 0, 5748 and 5753, respectively. In panels (a-d) of Fig. 2 we show a 50 steps blind forecast (green dots) with starting point at 5748 with τ 0 = 15, and for dimensions from d = 4 to d = 7, respectively. In panels (a-d) of Fig. 3 we show a 50 steps blind forecast (green dots) with starting point at 5753 for τ 0 = 15, and for dimensions from d = 4 to d = 7. In Fig. 4 the cumulative error of the event reconstruction is plotted. In panel (a) we show the cumulative error for a 50 steps blind forecast starting at point 5748 for dimensions d = 4 to d = 7 for the Lorenz x-coordinate. In panel (b) we show the cumulative error for a 50 steps blind forecast starting at point 5753 for dimensions d = 4 to d = 7 for Lorenz x coordinate.
It is clear that our method is able to reproduce the x coordinate time series for the Lorenz system in both cases and for several cycles for dimension d = 7 with a cumulative error slightly higher than 10% at step 50 in both cases.

Application to El Niño
In this section we show the performance of the proposed methodology when characterizing the largest ENSO events. El Niño Southern Oscillation (ENSO) is known as the dominant driver for year-to-year climate variability (Trenberth, 1997). The ENSO phenomenon originates in the coupled ocean−atmosphere dynamics of the tropical Pacific (Philander, 1990). Through teleconnections associated with atmospheric circulation and air-sea interaction outside the tropical Pacific, it deeply affects climate worldwide with large environmental and societal impacts (Glantz, 2001). ENSO is generally represented by the Southern Oscillation index (SOI) (Trenberth, 1984), representing the southern oscillation, i.e. the principal mode of surface pressure variability in the Tropics. The SOI index is the standardised anomaly of the mean sea level pressure difference between Tahiti and Darwin. Sustained negative values of the SOI often indicate El Niño episodes. These negative values are usually accompanied by a decrease in the strength of the Pacific Trade Winds and an anomaly warming of the central and eastern tropical Pacific Ocean. Instead positive values of the SOI, La Niña episodes, are associated with stronger Pacific trade winds and warmer sea temperatures over the western Pacific, while the central and eastern tropical Pacific Ocean is cooler. An important aspect of ENSO is that El Niño events are generally characterized by a larger magnitude than La Niña counterparts. The strongest El Niño events ever recorded being those of the 1982-1983 and 1997-1998 years. The oscillation does not have a specific period but occurs preferentially at inter-annual time scales. It has been also observed (Torrence and Webster, 1999) that the amplitude of ENSO undergoes changes on decade time scales. The SOI time series used here (see Fig. 5) extends monthly from 1866 to 2009 (http://www.cgd.ucar.edu/cas/catalog/climind/ soi.html). We low pass filtered (box car, 15 months) the anomaly time series to keep frequencies where the power of the system is concentrated and which are the most relevant, the low frequencies (interannual, decadal).
Indeed, the events selected to test our method are the strongest, that is the 1982-1983and 1997 In the signal, both events (oscillations) can be roughly described by two maxima and a minimum. The minimum is the peak amplitude of the EL Niño event. In the study, the termination point is located just one month after the second maximum. We place the "starting point" 60 months (5 years) before the termination point.
To determine the a ij coefficients of Eq. (1) at the "present" point, the nearest neighbour to the present point and 24 near forward neighbours from the nearest older orbit are selected (corresponding to the small black dots in Fig. 1). Here, we use τ 0 , the zero autocorrelation lag and the 68% autocorrelation decay time (for SOI these are 15 months and 7 months, respectively) to reconstruct the m-dimensional embedding space. The largest time scale relevant to the phenomenon used is 18 years (216 months), giving m = 16 and m = 34 for τ 0 = 15 and τ 0 = 7 months, respectively. The d-dimensional subspaces of dimension from 3 to 8 are scanned. The blind forecast time is T = 60 months.
To obtain a time delay set that characterizes a region of the attractor, we seek for a unique optimal set of time delays that minimizes, simultaneously, the sum of cumulative errors at the termination point for a number of successive starting points. In this study this region spans six months, and a mean forecast is computed by averaging all forecasts for the six successive starting points. Although individual performance give better results, as expected in local prediction (Ding et al., 2008), with the requirement of simultaneously, the sub space of the embedding space is determined where the limit of predictability is obtained by ensemble average.
In Fig. 6 the mean forecast is plotted (orange dots) for τ 0 = 15 and dimensions from d = 3 to d = 8. The individual forecast for each starting point is also plotted (green dots), as well as the SOI signal (black dots). In panel (a) we show three vertical segmented lines referencing successively the first and last starting point, as well as the termination point (TP) of the event. In each panel the corresponding set of time delays is shown.
In Fig. 7 is shown the performance of the methodology for the El Niño 1997-1998 event with the 68% decay time of the autocorrelation as basic time scale. The panels (a-f) show the mean forecast (orange dots) for τ 0 = 7, and for dimensions from d = 3 to d = 8, respectively. The individual forecasts, for each starting point, are also plotted (green dots), as well as the SOI signal (black dots). In panel (a) we show three vertical segmented lines referencing successively the first and last starting point, as well as the termination point (TP).
In Fig. 8 the performance of the methodology for the EL Niño 1982-1983 event is shown. The panels (a-f) show the mean forecast (orange dots) for τ 0 = 7, and for dimensions from d = 3 to d = 8, respectively. The individual forecasts, for each starting point, are also plotted (green dots), as well as the SOI signal (black dots). In panel (a) we show three vertical segmented lines referencing successively the first and last starting point, as well as the termination point (TP).
In Fig. 9, in panels (a) and (b) (for 1997-1998 event) and (c) (for 1982-1983 event), we plot the cumulative errors of the mean forecast for dimensions d = 3 to d = 8 corresponding to Figs. 6, 7, and 8, respectively.
If we compare Figs. 6 to 7, particularly panels (d) to (f), which correspond to embeddings from dimension d = 6 to d = 8, it is visually clear that the mean forecast fits better the event in Fig. 7. Thus, a global comparison of cumulative errors for the 1997-1998 event in Fig. 9 (panels a and b) show that using the 68% decay time for the autocorrelation (τ 0 = 7) offers a significantly better event reconstruction than the zero autocorrelation time (τ 0 = 15). The plotted cumulative errors of the mean forecast at the termination point for the computed dimensions in panel a) (with τ 0 = 15) show no significant differences in the range between 50% and 60%. Instead, the ones computed with the 68% decay time for the autocorrelation (τ 0 = 7, panel (b) show important differences with figures spreading over a range between 15% and 55%. Remarkably, dimension 7 has a significantly lower cumulative error (≈ 15%) performing a better event reconstruction than dimension 8. Thus, for this particular event, the use of this technique allows also to identify the minimal embedding dimension. For the El Niño 1997-1998 event, the embedding dimension 7 and time delays [0,29,85,92,162,190,197] produce a mean forecast that reconstructs the event with a cumulative error of (≈ 15%) at the termination point. The event's characteristics, the amplitude and the termination (at the termination point), are reconstructed roughly 3.5 and 4.5 years in advance. In panel (c) of Fig. 9 we plot cumulative errors of the mean forecast for the 1982-1983 event at the termination point for the computed dimensions with the 68% decay time for the autocorrelation (τ 0 = 7). It is also visually clear that, compared to the 1997-1998 event's (panel b of Fig. 9), the cumulative errors are larger, although, in Fig. 8 panels (d) to (f), timing and amplitude of the main event are correctly estimated.
For the El Niño 1982-1983 event, the results indicate a correct reconstruction for the timing and the amplitude of the main event (ranging from 1982 to 1984, see panels (d and e) in Fig. 8) but with relatively larger errors before (from 1980 to the beginning of 1982), as compared to 1997-1998's results. Probably, the better performance of the methodology in reconstructing the 1997-1998 event is due to the fact that the information of the 1982-1983 event is already present in the time series used for reconstructing the 1997-1998 event.
Finally, to complement the information provided in this paper we show in Fig. 10 some two-dimensional projections of the two events with the parameters of the case shown in

Conclusions
We report a methodology for applying Takens's embedding theorem to reconstruct an event taking into account only local information contained in a short time series. The proposed method makes use of a maximal predictability criterion to reconstruct locally an embedding space, in the sense that the dynamical evolution of the system registered in the experimental data can be reproduced during a given time span. It is important to reconstruct first an m-dimensional embedding space so that the relevant time scales that characterize the phenomenon are contained in the set of time delays of the embedding. The event reconstruction is performed with local nonlinear blind forecast p time steps into the future, using near neighbours of the closest orbit. The optimal embedding dimension is selected from all d-dimensional subspaces of a larger m-dimensional embedding space, the one presenting the minimal cumulative error at the termination point of the event. We find that it is possible to select an optimal set of time delays that performs a local long term reconstruction of an event, that is, there exist a subspace of an embedding space where local recontruction can reproduce a class of events in a time series of a measured observable of a complex dynamical system.
To test the efficiency of our methodology we use the monthly SOI time series. We studied the El Niño 1982Niño -1983Niño and 1997Niño -1998 events that are known to be the strongest measured (by instruments) events, and difficult to predict (Fedorov et al., 2003) more than two years in advance (Chen et al., 2004). The performance of the methodology for the 1997-1998 event can be evaluated by observing that the amplitude and the termination point can be reconstructed roughly 3.5 and 4.5 years in advance, respectively. Instead, a correct reconstruction for the timing and amplitude of the El Niño 1982-1983 event is performed but with relatively larger errors as compared to 1997-1998's results. This is probably related to the fact that there is no event with the comparable system dynamics already present in the time series, as was the case for the 1997-1998 event. Therefore, it is possible to conjecture that the proposed method may not give good results if within the time series, past events are not similar to the studied event, i.e. if there are no similar orbits. Likewise, the method can be applied to arbitrarily large sizes, but it requires a high computational cost.
These results outline the capabilities of the methodology, and stress that this local reconstruction method can be used to characterize the local and nonlinear properties of a given system dynamics.
Characterizing the onset of an El Niño or La Niña phase far in advance has long been a goal of climate science (Trenberth et al., 1998). Results by Chen et al. (2004) has shown that when the equatorial dynamics is known (based in the 1960-2000 events), prominent El Niño events could be predicted, throughout the past century, with lead times of up to two years.
In fact forecasts has been limited by the so called spring predictability barrier; forecasts issued before the preceding Northern Hemisphere spring, typically show very limited Nonlin. Processes Geophys., 17, 753-764, 2010 www.nonlin-processes-geophys.net/17/753/2010/ skill (Webster and Yang, 1992). Recent advances by Izumo et al. (2010) has shown that by taking into account both the Indian and Pacific ocean basins the prediction horizon of ENSO can be extended efficiently up to 14 months. Therefore new physical understanding (Izumo et al., 2010), and new statistical reconstruction methodologies (such as the one proposed here) that may allow longer-range forecasts, by transcending the spring predictability barrier, i.e., more than a year in advance, could help with climate forecast in other regions where ENSO has a strong influence.