A stochastic nonlinear oscillator model for glacial millennial-scale climate transitions derived from ice-core data

A stochastic Duffing-type oscillator model, i.e noise-driven motion with inertia in a potential landscape, is considered for glacial millennial-scale climate transitions. The potential and noise parameters are estimated from a Greenland ice-core record using a nonlinear Kalman filter. For the period from 60 to 20 ky before present, a bistable potential with a deep well corresponding to a cold stadial state and a shallow well corresponding to a warm interstadial state is found. The system is in the strongly dissipative regime and can be very well approximated by an effective one-dimensional Langevin equation.


Introduction
Past and future abrupt climate changes have been extensively discussed in recent years (e.g.Alley et al., 2003).A particular subject of investigation are the abrupt climate transitions between cold stadials and warm interstadials during the last glacial period, the so-called Dansgaard-Oeschger (DO) events (Dansgaard et al., 1993).Their origin and dynamical mechanism is still controversial; it is not obvious which part of the Earth's climate system is responsible for abrupt changes.DO events may be attributed to a temporary collapse and resumption of the Atlantic meridional overturning circulation (Ganopolski and Rahmstorf, 2002).Other hypotheses refer to internal oscillations in the climate system (Sakai and Peltier, 1997;Schulz et al., 2002;Timmermann et al., 2003) or external forcing mechanisms (Ganopolski and Rahmstorf, 2001;Braun et al., 2005).
Besides studies with complex numerical models, other have tried to reduce the system to low-order, box and conceptual models both in a deterministic and stochastic setting.
Different types of low-order models have been proposed to explain the dynamics of DO events.A bistable nonlinear system has been assumed in which shifts between the two distinctly different states are triggered randomly by stochastic forcing (Ditlevsen, 1999;Ditlevsen et al., 2005;Kwasniok and Lohmann, 2009;Livina et al., 2010).Stochastic resonance (Benzi et al., 1982;Alley et al., 2001;Ganopolski and Rahmstorf, 2002) may or may not play a role in such a model.Recently, the mechanism of ghost stochastic resonance has been proposed (Braun et al., 2009) as a noise-induced phenomenon beyond classical stochastic resonance.Moreover, a system of nonlinearly coupled relaxation oscillators has been postulated (Schulz et al., 2002).Each of the oscillators represents a fundamental climate mode with a characteristic frequency in the centennial-to-millennial band; nonlinear interaction between the oscillators may lead to phase synchronisation and the development of a new climate mode with a joint frequency.Furthermore, a nonlinear thermal oscillator has been proposed (Rial, 2004) where the timing of the deterministic external forcing is crucial for generating DOlike oscillations.In a conceptual model approach, Dima and Lohmann (2009) propose that millennial-scale variability can originate from rectification of an external forcing, and suggest that the thermohaline circulation, through a threshold response, could be the rectifier.The latter hypothesis is a combination of the externally driven and internal oscillation hypotheses.
In the present paper, a noise-driven nonlinear oscillator of the Duffing type is formulated as a model for DO events and its parameters estimated from ice-core data.The study extends recent work based on a one-dimensional potential model (Kwasniok and Lohmann, 2009;Livina et al., 2010) to a two-dimensional model.We will examine the question F. Kwasniok and G. Lohmann: Oscillator model for glacial millennial-scale climate transitions of if the DO events can be characterised by a stochastic nonlinear Duffing-type oscillator model.

Dynamical model
Abrupt palaeoclimatic changes are often assumed to be related to a shift between two different stable states in a noisedriven nonlinear system (e.g.Alley et al., 2001;Ganopolski and Rahmstorf, 2002).A very simple conceptual model is stochastically driven motion in a one-dimensional potential landscape (e.g.Ditlevsen, 1999;Kwasniok and Lohmann, 2009).The governing equation is where U (z) is a potential function, η is a white Gaussian noise with zero mean and unit variance and τ is the standard deviation of the stochastic forcing.The stationary probability density of the system is We here extend the dynamical framework and adopt a twodimensional model given by where V (z) is a potential function and γ is a dissipation constant.η 1 and η 2 are two independent Gaussian white noises with zero mean and unit variance; σ 1 and σ 2 are the standard deviations of the noise forcings.The variable z corresponds to a palaeotemperature proxy as given by an ice-core record; the variable v is a velocity-like quantity (e.g.temperature change) which is here not explicitly specified further.
In the case σ 1 = 0, the system of Eqs. ( 3) and ( 4) is the well-known Kramers model for Brownian motion with inertia in a potential (Kramers, 1940;Gardiner, 2010) with stationary probability density: It is then equivalent to the second-order equation This is actually a nonlinear stochastic oscillator; for the particular choice V (z) = 1 4 αz 4 + 1 2 βz 2 with α > 0, it is the (noise-driven) Duffing oscillator which for β < 0 exhibits bistability.
The stochastic terms are meant to account for model uncertainty and structural discrepancy between the model and the real system.Adding a noise term also in the equation for z (Eq.3), which is uncommon, considerably enhances the dynamical richness and realism of the model beyond the Kramers model.The system of Eqs. ( 3) and (4) for σ 1 = 0 breaks the condition of detailed balance and therefore the time-reversal symmetry of the stationary joint probability density: p(z, v, t; z , v , t ) = p(z , −v , t; z, −v, t ).It is capable of producing temporally asymmetric time series; in fact, the characteristic saw-tooth shape is an important nonlinear feature of DO events (Dansgaard et al., 1993).The stationary probability density p(z, v) is not straightforward any more, and z and v are no longer statistically independent.The one-dimensional potential model of Eq. ( 1) and the Kramers model are in detailed balance and are therefore not able to break the time-reversal symmetry.
In the limit of strong dissipation (Smoluchowski regime), v can be adiabatically eliminated from Eqs. ( 3) and ( 4) and the dynamics of z well approximated by an effective onedimensional Langevin equation given by Eq. (1) with and This can be shown by slightly extending standard results (Gardiner, 2010, Chap. 8.2).
The potential V (z) is here assumed to be a general fourthorder polynomial (Kwasniok and Lohmann, 2009): with free parameters {a i } 4 i=1 to be determined from data.
3 Model estimation

Unscented Kalman filter
The unscented Kalman filter (UKF) (Julier et al., 2000;Sitz et al., 2002;Julier and Uhlmann, 2004) is used for parameter estimation in the oscillator model.It allows for recursive estimation of unobserved states and parameters in stochastic nonlinear systems from incomplete, indirect and noisy observations.The UKF keeps the full nonlinear system dynamics rather than linearising it but truncates the filter probability density to a Gaussian in each iteration by only propagating first and second moments.We use the framework of a continuous-discrete nonlinear state space model.The evolution of the state vector z = (z, v) T of dimension n = 2 is governed by the continuoustime, nonlinear stochastic dynamical system: which is given by Eqs.
(3) and (4).λ = (a 4 , a 3 , a 2 , a 1 , γ ) T is the vector of system parameters of dimension p = 5 and ξ is a vector of Gaussian white noises with zero mean and covariance matrix: An augmented state vector x of dimension n a = n + p = 7 is formed by merging the state vector and the parameters.Equation ( 10), together with a constant dynamics for the parameters form the dynamical or state equation of the state space model: ξ a is the augmented noise process vector with corresponding covariance matrix Q a .At discrete times t k , observations y k of dimension m = 1 are available which are linked to the state by the observation equation: We here observe only the variable z; thus the observation operator is H = (1, 0, 0, 0, 0, 0, 0).ε k is white Gaussian observational noise with mean zero and variance R.
The UKF provides estimates of the system states and parameters given a time series of noisy observations {y k } N k=0 .Let xk−1|k−1 be the mean estimate of the augmented state vector and P k−1|k−1 its covariance matrix at time step k − 1 having processed all data up to time step k − 1.The filter density is represented by a small number of so-called sigma points that are propagated through the full nonlinear dynamical equations.The interval , each in augmented state space of dimension n a , given as are the columns of A where A can be any matrix satisfying AA T  = n a P l−1 .Here, we calculate A using the Cholesky decomposition of P l−1 .The sigma points are transformed as and new mean and covariance estimates are given by and The sequence is initialised with x0 = xk−1|k−1 and P 0 = P k−1|k−1 .We set xk|k−1 = xL and P k|k−1 = P L .Then the estimates of the states and the parameters as well as their uncertainties are updated using the new observation according to the Kalman update equations: Here is the innovation or residual, is the Kalman gain matrix and is the predicted residual covariance matrix.

Estimation of noise parameters
For estimation of the noise levels σ 1 and σ 2 , we follow the likelihood approach recently proposed by Kwasniok (2012).At time step k, the predictive probability density of the UKF for the residual ζ k is a Gaussian with zero mean and variance S k .Thus the log-likelihood function of the data set is The UKF is run for different noise parameters and the likelihood maximised.It is sufficient to just calculate the likelihood on a fine enough mesh in noise parameter space and find the maximum.The method for estimating the noise levels is based on internal consistency; for the correct noise parameters, the uncertainty propagated by the Kalman filter matches the true predictive uncertainty reflected in the data.

Simulated data
We first use simulated data to test the ability of the method to reliably identify system parameters and noise levels.A symmetric double-well potential is chosen given by V (z) = 10z 4 − 20z 2 .The damping constant is γ = 10; the noise levels are σ 1 = 1 and σ 2 = 10.The system is integrated using the Euler-Maruyama scheme with a step size of 10 −5 .Only the variable z is observed.No observational noise is added.Figure 1 displays a sample trajectory of the variable z in the system.The time scale of transitions between the two stable states is of the same order as in the ice-core record considered later (Fig. 4) when interpreting the system units as ky.
The system is quite dissipative but not in the limit that it could be described by a one-dimensional Langevin equation.Let z * = 0 be the maximum of the potential.Introducing the frequency ω * by ω 2 * = |V (z * )| we have γ /ω * ≈ 1.6 which is neither small nor large compared to 1, corresponding to an intermediate regime of γ .One may also look at the behaviour of the system in the vicinity of the minima z 0 = ±1.We introduce the deviation from the equilibrium x = z − z 0 and the frequency ω 0 by ω 2 0 = V (z 0 ).For small |x|, the deterministic part of the system is then given by the damped harmonic oscillator equation: The solution type is characterised by the damping ratio ρ = γ 2ω 0 .The values ρ = 0, 0 < ρ < 1, ρ = 1 and ρ > 1 correspond to the undamped, underdamped (oscillatory), critically damped and overdamped regimes, respectively.We here have ρ ≈ 0.56.The system exhibits damped oscillations with period 0.85 and e-folding time 0.2.
In order to have a comparison with the relatively short ice-core record, we use simulated time series of length N = 800 with sampling interval δt = 0.05, spanning a time period of 40 time units, and then perform multiple experiments.The UKF is initialised with the parameter setting (a 4 , a 3 , a 2 , a 1 , γ ) = (5, 5, −10, 5, 5).The initial state estimate is given by the initial observation; the initial velocity estimate is set to zero.The initial variances are all set to 100 except for the state where it is zero as there is no observational noise.The step size in the UKF is h = δt/100 = 0.0005; we set R = 10 −12 .The data set spanning a period of 40 time units with 800 data points is too small to obtain well-converged estimates for the parameters.Therefore, the data are processed 10 times to improve the estimates.Each new sweep is started with the final estimates for the parameters and uncertainties from the preceding sweep but the offdiagonal elements of the covariance matrix set to zero (Sitz et al., 2002).
Figure 2 shows the log-likelihood as a function of the noise levels σ 1 and σ 2 for a particular realisation taken over the last sweep through the data.Using a grid of size 0.01 for σ 1 and   0.5 for σ 2 , the maximum of the likelihood is at σ 1 = 0.99 and σ 2 = 11.5.Due to the shortness of the time series, estimation of the noise levels is quite ill-conditioned as the likelihood function is relatively flat.Figure 3 displays the parameter estimates together with the error standard deviations as the algorithm proceeds through the time series.Taking averages over the last sweep of 800 data points and the final error estimates, the values of the parameters are a 4 = 10.10 ± 0.59, a 3 = −0.30± 0.48, a 2 = −18.33±1.38, a 1 = −0.49±1.26 and γ = 9.83±0.66.The uncertainties in the estimates are considerable as one expects given the shortness of the time series.For all parameters, the true values are consistent with the mean and error estimates of the Kalman filter.The potential corresponding to the mean parameter estimates is plotted in Fig. 3f.
An ensemble of 100 realisations of time series of length N = 800 spanning a time interval of 40 time units was performed.The means and standard deviations calculated from this ensemble are a 4 = 10.55±1.79, a 3 = −0.04±1.33, a 2 = −20.74± 3.37, a 1 = 0.09 ± 3.78, γ = 10.21 ± 2.26, σ 1 = 1.00 ± 0.03 and σ 2 = 10.26 ± 2.47.The corresponding potential is very close to the true one (Fig. 3f); the errors in  a 4 and a 2 almost cancel over the z-range where the data are.The noise level σ 2 is slightly overestimated as is the damping coefficient γ .The standard deviation of σ 2 is quite large as v is not observed; this is somewhat mitigated by the existing positive correlation between the uncertainties in σ 2 and γ .We also processed the data with the extended Kalman filter, a simpler and more common nonlinear Kalman filter algorithm.The mean estimates and standard deviations obtained from the same ensemble of 100 realisations used before are a 4 = 10.50±1.90, a 3 = 0.01±1.28,a 2 = −19.05±3.56, a 1 = −0.08±3.73,γ = 10.38±2.46,σ 1 = 1.00±0.03and σ 2 = 10.22 ± 2.86.The estimate of the potential is considerably worse than that with the unscented Kalman filter.The fine structure of the potential is somehow missed; the wells are too shallow (Fig. 3f).The substantial dynamical noise level gives the system a truly stochastic character.Therefore the propagated state uncertainties are quite large; the superior covariance propagation of the unscented Kalman filter over the extended Kalman filter then has a visible effect.We conclude that in the present context there is some case for using the more advanced unscented Kalman filter.

Ice-core data
We investigate the record of δ 18 O as a proxy for Northern Hemisphere temperatures from the North Greenland Ice Core Project (NGRIP) (North Greenland Ice Core Project members, 2004).In order to focus on the DO activity, the time series for the period from 60 ky to 20 ky before present is actually used for the analysis (Fig. 4).The mean value for that period is −41.75 ‰; it is removed from the data set prior to the analysis as the dynamical model is formulated as an anomaly model.The data are equidistant with a sampling interval δt = 0.05 ky, resulting in N = 800 data points.The measurement error of the ice-core record is indicated to be as small as 0.07 % (North Greenland Ice Core Project members, 2004).The error due to uncertain dating is hard to quantify but is probably larger than the measurement error.Yet dynamical modelling (Kwasniok, 2012) confirms that the uncertainty is dominated by the large dynamical noise level and observational noise is negligible.The underlying dynamics of the ice-core data is strongly stochastic and determinism is weak as has been shown using the method of surrogate data (Kwasniok and Lohmann, 2009).Technically, R is not set exactly to zero but to a very small value, say 10 −12 , to guarantee that the covariance matrices in the Kalman filter are positive definite for the algorithm not to break down due to rounding errors.
The ice-core data are processed in the same way as the simulated data.The step size in the Kalman filter is set to h = δt/100 = 0.0005 ky. Figure 5 displays the log-likelihood as a function of the noise levels σ 1 and σ 2 calculated from the last sweep through the data.It turns out that approximately models with the same value τ 2 = σ 2 1 +σ 2 2 /γ 2 have the same likelihood; the likelihood contours in the (σ 1 , σ 2 /γ )plane are circles.The maximum of the likelihood is at a radius of about τ = 3.95.Moreover, models on the same likelihood contour have the same scaled potential V (z)/γ .From Eqs. ( 7) and ( 8), this is an indication that the system is in the limit of strong dissipation and models with the same effective one-dimensional noise level τ are equivalent.The exact value of γ is ill-determined and depends on its initial estimate and uncertainty, but it is certainly very large (γ ∼ 500-1000).V (z)/γ and σ 2 /γ are virtually independent of the precise value of γ .We can therefore fix γ to an indicative value, say γ = 1000.Given that the partition of the noise among σ 1 and σ 2 does not matter here, we restrict our attention in the following to the case σ 1 = 0, that is, the classical Kramers problem.
For the ice-core data, the parameter a 1 turns out to be illdetermined.This is probably a combined effect of the high noise level, the almost degenerate shape of the potential and    the structural model error.The problem is removed by applying the constraint that the mean state of the model matches the mean state of the data (cf.Kwasniok and Lohmann, 2009), that is, z = ∞ −∞ zp(z)dz = 0, leading to the condition (cf.Eq. 5) It is easily seen that, for fixed parameters a 4 , a 3 , a 2 , γ and noise level σ 2 , the mean state z tends to +∞ as a 1 goes to −∞, tends to −∞ as a 1 goes to +∞, and has a monotonic dependence on a 1 in between.Thus Eq. ( 25) uniquely determines a 1 for given a 4 , a 3 , a 2 , γ and σ 2 .The integral is evaluated numerically; the root is then found by running 15 iterations of the bisection algorithm starting with the interval [−10, 10] for a 1 /γ .The UKF is modified in that only a 4 , a 3 and a 2 are parameters to be estimated.We then have n = 2,    p = 3 and n a = 5. a 1 is treated as a constant in the Kalman filter and updated according to Eq. ( 25) at each data point after the Kalman update using the current estimates of a 4 , a 3 and a 2 .
Figure 6 shows the log-likelihood as a function of σ 2 /γ .There is a maximum at σ 2 /γ = 4 using a mesh of size 0.1.This is compared to directly fitting a one-dimensional Langevin model given by Eq. ( 1) with a potential U (z) = 4 i=1 a i z i and a noise level τ using the procedure of Kwasniok and Lohmann (2009).We then have a maximum of the likelihood at τ = 3.9.Apart from a small shift in the noise level, the two likelihood profiles are virtually the same; in particular the obtained maximum of the likelihood is the same.It is not clear where the shift in the noise level comes from; it is somehow in accordance with the F. Kwasniok and G. Lohmann: Oscillator model for glacial millennial-scale climate transitions slight overestimation of σ 2 in the simulated data.Figure 7 shows the parameter estimates and the derived potential.Averaging over the last sweep, the parameters are a 4 /γ = 0.16±0.01,a 3 /γ = −0.22±0.02, a 2 /γ = −0.87±0.07 and a 1 /γ = 1.39.The potential is bistable and asymmetric; it has a deep well corresponding to the cold stadial state and a shallow well corresponding to the warm interstadial state.The curvature of the potential gives γ /ω * ∼ 20 1, confirming that the system is clearly in the limit of strong dissipation.Also, when linearising the oscillator equation about the two equilibria of the potential, we get a damping ratio of ρ ∼ 5 for the cold state and ρ ∼ 10 for the warm state, both far into the overdamped regime.The potential obtained by directly fitting a one-dimensional potential model is also given; it is virtually identical.The parameters then are a 4 = 0.17±0.01,a 3 = −0.25 ± 0.02, a 2 = −0.91 ± 0.07 and a 1 = 1.50.Figure 7c also displays the stationary probability densities of the data and the two models; they are inflated by a factor of 25 to increase the readability of the plot.The probability densities were estimated using a Gaussian kernel estimator with the standard choice for the bandwidth (Silverman, 1986).Figure 8 gives sample trajectories of the oscillator model and the one-dimensional Langevin model contrasted with the anomaly ice-core record.As expected, the oscillator model is statistically indistinguishable from the potential model.

Conclusions
We conclude that a Duffing-type oscillator model can be determined from the ice-core data, but it is in the regime of strong dissipation and can be very well approximated by a one-dimensional effective Langevin equation.The additional dynamics offered by Eqs.(3) and (4) over Eq. ( 1) is not actually used by the system.The simpler one-dimensional model yields virtually the same likelihood as well as an equivalent potential and noise level.As already shown in Kwasniok and Lohmann (2009), such a model is able to capture some basic features of the ice-core record: the two modes of the probability density with approximately the correct population (Fig. 7c) as well as the amplitude and time scale of the switches between the stadial and interstadial state (Fig. 8).It cannot capture the pronounced temporal asymmetry of the DO events.A van der Pol-type relaxation oscillator might be more adequate here.But this is outside the scope of the present study and may be pursued elsewhere.
Edited by: R. Donner Reviewed by: P. D. Ditlevsen and M. Crucifix

Fig. 1 .Fig. 2 .
Fig. 1.Sample trajectory of z in the oscillator with symmetric double-well potential.

Fig. 1 .
Fig. 1.Sample trajectory of z in the oscillator with symmetric double-well potential.

Fig. 2 .
Fig. 2. Simulated data: Log-likelihood as a function of the noise levels σ 1 and σ 2 .Neighbouring contours differ by a factor of 2 in likelihood.

Fig. 2 .
Fig. 2. Simulated data: Log-likelihood as a function of the noise levels σ 1 and σ 2 .Neighbouring contours differ by a factor of 2 in likelihood.

Fig. 3 .
Fig. 3. Simulated data: Estimate and error standard deviation for the parameters a4 (a), a3 (b), a2 (c), a1 (d) and γ (e) obtained from a particular time series of length 40 time units, processed ten times.The dashed horizontal lines indicate the true parameter values.(f) True potential (solid) and reconstructed potentials as a mean over 100 realisations with the unscented Kalman filter (dashed) and the extended Kalman filter (dotted).The dot-dashed line gives the potential estimated from the particular realisation whose parameter estimates are displayed in panels (a)-(e).

Fig. 3 .
Fig. 3. Simulated data: Estimate and error standard deviation for the parameters a 4 (a), a 3 (b), a 2 (c), a 1 (d) and γ (e) obtained from a particular time series of length 40 time units, processed ten times.The dashed horizontal lines indicate the true parameter values.(f) True potential (solid) and reconstructed potentials as a mean over 100 realisations with the unscented Kalman filter (dashed) and the extended Kalman filter (dotted).The dot-dashed line gives the potential estimated from the particular realisation whose parameter estimates are displayed in panels (a)-(e).

Fig. 6 .
Fig.6.Ice-core data: Log-likelihood as a function of σ2/γ for the oscillator model (solid) and as a function of τ for the onedimensional potential model (dashed).

Fig. 4 .
Fig. 4. Record of δ 18 O from the NGRIP ice core for the last glacial period.

Fig. 5 .
Fig. 5. Ice-core data: Log-likelihood as a function of the noise levels σ 1 and σ 2 .Neighbouring contours differ by a factor of 100 in likelihood.

Fig. 5 .
Fig.5.Ice-core data: Log-likelihood as a function of the noise levels σ 1 and σ 2 .Neighbouring contours differ by a factor of 100 in likelihood.

Fig. 6 .
Fig.6.Ice-core data: Log-likelihood as a function σ 2 /γ for the oscillator model (solid) and as a function of τ for the onedimensional potential model (dashed).

Fig. 7 .
Fig. 7. Ice-core data: (a) Estimates for a4/γ and a3/γ with error standard deviation.(b) Estimates for a2/γ with error standard deviation and estimates for a1/γ.(c) Scaled potential V (z)/γ for the oscillator model (thick solid) and potential U (z) for the one-dimensional model (thick dotted).Probability densities of the ice-core data (solid), the oscillator model (dotted) and the onedimensional model (dashed).

Fig. 8 .
Fig. 8. (a) Record of δ 18 O from the NGRIP ice core for the last glacial period with the mean value removed.(b) Sample trajectory of the oscillator model.(c) Sample trajectory of the onedimensional potential model.

Fig. 7 .
Fig. 7.Ice-core data: (a) Estimates for a 4 /γ and a 3 /γ with error standard deviation.(b) Estimates for a 2 /γ with error standard deviation and estimates for a 1 /γ .(c) Scaled potential V (z)/γ for the oscillator model (thick solid) and potential U (z) for the one-dimensional model (thick dotted).densities of the ice-core data (solid), the oscillator model (dotted) and the onedimensional model (dashed).

Fig. 7 .
Fig. 7. Ice-core data: (a) Estimates for a4/γ and a3/γ with error standard deviation.(b) Estimates for a2/γ with error standard deviation and estimates for a1/γ.(c) Scaled potential V (z)/γ for the oscillator model (thick solid) and potential U (z) for the one-dimensional model (thick dotted).Probability densities of the ice-core data (solid), the oscillator model (dotted) and the onedimensional model (dashed).

Fig. 8 .
Fig. 8. (a) Record of δ 18 O from the NGRIP ice core for the last glacial period with the mean value removed.(b) Sample trajectory of the oscillator model.(c) Sample trajectory of the onedimensional potential model.

Fig. 8 .
Fig. 8. (a) Record of δ 18 O from the NGRIP ice core for the last glacial period with the mean value removed.(b) Sample trajectory of the oscillator model.(c) Sample trajectory of the onedimensional potential model.