Comment on npg-2021-5 Anonymous Referee # 1 Referee comment on " A Study of Capturing AMOC Regime Transition through Observation-Constrained Model Parameters

In this manuscript the authors describe a series of twin experiments in which the Ensemble Adjustment Kalman Filter (EAKF) is used for state and parameter estimation in models of the Atlantic Meridional Overturning Circulation (AMOC). The models in question are box models, driven by atmospheric input taken from the output of chaotic, rather than stochastic systems. The authors demonstrate that their methods allow reliable estimation of unknown parameters in the model AMOCs, which exhibit multiple stable equilibria. The results described here will be of interest to a broad segment of NPG readers, but the manuscript as it stands needs a great deal of work to make it acceptable.


Introduction 25
The Atlantic meridional overturning circulation (AMOC), the core of the thermohaline circulation (Roquet et al., 2017), is an essential component of the World Ocean circulations (e.g., Delworth and Greatbatch, 2000). One of its important characteristics is the existence of multiple equilibria (Mu et al., 2004). The research addressing this characteristic originates from Stommel (1961) who used two boxes with uniform temperature and salinity to simulate the equatorial ocean and the polar ocean respectively. This box model simulates multiple equilibria of thermohaline circulation including three steady solutions: 30 a stable thermal mode, an unstable thermal mode (mainly driven by heat), and a stable haline mode (mainly controlled by https://doi.org/10.5194/npg-2021-5 Preprint. Discussion started: 1 February 2021 c Author(s) 2021. CC BY 4.0 License. salinity). Using an idealized box model has since become one of the most efficient approaches in the studies of AMOC simulations.
The idealized box model, although of limited applicability in simulating the entire Atlantic circulation or even the global circulation, provides the most basic explanation for some of the important characteristics of the AMOC (Scott et al., 1999). 35 Besides Stommel's box model, which places two boxes side by side, Welander (1982) placed one box on top of the other to simulate the vertical structure of the real ocean. Then, the two-box model is extended to three boxes, and different three-box hemispheric models result in multiple equilibrium solutions (Birchfield, 1989;Guan and Huang, 2008;Shen et al., 2011). Also based on Stommel's box model, in some studies an additional box is added to simulate interhemispheric flows, constructing an idealized double-hemisphere model consisting of two high-latitude boxes and a low-latitude box. Multiple equilibria appear 40 in such box models, and the transition between multiple equilibrium states is related to salt flux or freshwater flux (Rooth, 1982;Rahmstorf, 1996;Scott et al., 1999). Extending Rooth's box model, with the equatorial box and the polar box connected at depth, results in nine equilibrium solutions, four of which are stable (Welander, 1986). The double-hemisphere model is closer to the real AMOC than the hemispheric model regarding the cross-equatorial flow in the Atlantic and upwelling flows in the Southern Ocean. 45 The multiple equilibrium states of the AMOC have been confirmed not only in simple idealized box models but also in comprehensive ocean general circulation models (Fürst and Levermann, 2012). In addition to the four different equilibrium states obtained in ocean circulation models with two basins representing the idealized Atlantic and Pacific oceans (Marotzke and Willebrand, 1991), it is even more encouraging that multiple equilibria are first simulated in a complex ocean general circulation model (Bryan, 1986), followed by two steady states in a global model of the coupled ocean-atmosphere system 50 (Manabe and Stouffer, 1988). While such a phenomenon of AMOC multiple equilibria as a reverse haline mode cannot be directly simulated in general circulation models (e.g., Stouffer et al., 2006;Weijer et al., 2019), it is instead replaced by a weak positive circulation or a collapsed AMOC state (e.g., Liu et al., 2013), generally referring to regime transitions.
The transition between different equilibrium states is related to many factors, one of which is freshwater, the most commonly considered, starting with Stommel's box model that illustrates the effect of freshwater input on thermohaline circulation 55 (Lambert et al., 2016). Changes in freshwater over a range of parameters may trigger shifts between different equilibrium states (e.g., Bryan, 1986;Birchfield et al., 1990;Marotzke and Willebrand, 1991;Nilsson and Walin, 2001;Stouffer et al., 2006;Nilsson and Walin, 2010). In addition to freshwater fluxes, the switch between different equilibria may also be caused by wind-driven gyre. The equilibrium solutions in both Stommel's box model and Rooth's box model will be eliminated by a strong enough wind-driven ocean gyre (Longworth et al., 2005). The same result can be obtained by replacing the buoyancy 60 constraint with an energy constraint, which shows that wind-driven gyre above a certain threshold may suppress the multiple equilibrium solutions (Guan and Huang, 2008). Wind is one of the important drivers of ocean circulation and its impact on AMOC may be related to the horizontal mixing effect of wind-driven circulation and its effect on upper ocean heat and salt transport (Ashkenazy and Tziperman, 2007). The study for AMOC regime transitions could be performed in idealized models.
For example, Sévellec and Fedorov (2014) developed an idealized low-order model to describe a multiple equilibria regime characteristic and investigated the predictability of the system regime shifts. In a conceptual box model, two types of transitions were described, with a temporary cessation of the downwelling (called F-type transition) or a full collapse of the AMOC (called S-type transition), and the F-type transitions might have been found in the direct observation (Castellana et al., 2019).
The thermal mode and the reverse haline mode correspond to different equilibrium states of the AMOC. For simplicity, we will refer to these different states as the stronger AMOC (on-state) and the weaker AMOC (off-state) in simple conceptual 70 models.
Constrained by the limited measurement technique and time length, the direct observation of AMOC is in general scarce, in terms of its nature of rich spectrum especially addressing low-frequency (e.g., Delworth et al., 1993). The direct observation of AMOC is mainly from the RAPID-MOC/MOCHA (Meridional Overturning Circulation and Heatflux Array) mooring array, which has been conducted at 26° N since 2004 (Cunningham et al., 2007;Smeed et al., 2014). The scope of direct observation 75 is difficult to cover the entire Atlantic Ocean, and it is difficult to achieve long-term continuous direct observation. Ocean temperature data could be used to derive a proxy index for the variability of the AMOC, so both observations from satellites and ocean temperature measurements from the ARGO program could be used to monitor AMOC, and historical variations of AMOC could be reconstructed from historical sea surface temperature (Zhang, 2008;Dima and Lohmann, 2011). Indicators representing AMOC can be established based on the physical relationship between AMOC and atmospheric indices or oceanic 80 variables (e.g., Delworth et al., 2016;Caesar et al., 2018). Previous studies have compared and evaluated some of these indicators with direct observations of AMOC and the results indicate that this approach is feasible for AMOC reconstruction (Sun et al., 2020). However, the direct observations from RAPID or the ocean temperature measurements from ARGO program are only available for about the last two decades, and the lack of multi-decadal observations makes it impossible to evaluate the multi-decadal AMOC variation (Roquet, 2017). Besides, Paleoclimate records from marine sediments or ice cores are often 85 used to investigate AMOC variations (e.g., Rühlemann et al., 2004;Lynch-Stieglitz, 2017). Paleoclimate data can be used as observations of the AMOC on centurial and millennial timescales. Analyses of paleoclimate data reveal that the strength and pattern of AMOC changed between the glacial and interglacial periods (e.g., Bryan, 1986). The two equilibrium solutions in the work of Birchfield (1989) correspond to the modern ocean and the warm saline Cretaceous ocean, respectively.
In summary, direct observations of AMOC are so scarce as to be unrepresentative in studies of multi-equilibria of AMOC 90 at long time scales, and paleoclimate data have considerable uncertainty, so numerical simulations using ocean circulation models and coupled climate models are the main method to study the multiple equilibria of AMOC at present. However, due to an incomplete understanding of the physical processes and the error of the default parameter values, the numerical model is problematic in simulating AMOC multiple equilibria. This study addresses the problem that for long time scale AMOC reanalysis data, the AMOC multiple equilibrium states simulated by different models are different and do not fully represent 95 the "real" AMOC multi-equilibrium transition path. How to simulate regime transition of AMOC with a model where influencing factors such as freshwater and wind-driven gyre change over time. Then the next key is how to make the simulation results closer to "reality" on the feature of regime transitions by constraining the parameter values by observation. Data assimilation that combines a model with observed data is a feasible approach to study the multi-equilibria of AMOC given the situation described above. A popular data assimilation scheme is the Kalman filter (Kalman, 1960;Kalman and 100 Bucy, 1961). The main idea is to adjust the model predictions according to the observational data to obtain an optimal estimation of model states. Combining the Kalman filter with the idea of ensemble prediction, the ensemble Kalman filter (EnKF) uses ensemble samples of system states to estimate the background error covariance (e.g., Evensen, 1994). As a variant of EnKF, the ensemble adjustment Kalman filter (EAKF) derives a linear operator from the product of the observational distribution and the prior distribution of the model state to update the model ensemble (Anderson, 2001). EAKF has been 105 applied to climate models to have developed fully coupled data assimilation systems (e.g., Zhang et al., 2007;Liu et al., 2014).
The method of parameter estimation is based on the theory of data assimilation, i.e. information estimation theory, or filtering theory (e.g., Jazwinski, 1970). Research on the use of observations to estimate model parameters has attracted extensive attention and has produced encouraging results in the literature (Annan et al., 2005;Aksoy et al., 2006aAksoy et al., , 2006bHansen and Penland, 2007;Kondrashov et al., 2008;Hu et al., 2010). 110 Here we present a method for improving the modeling of AMOC multi-equilibria. The new method is shown to simulate the AMOC transition between different equilibrium states accurately in two simple coupled models, the first combining a three-box overturning simulation model with a five-variable simple climate model, and the second with clearer physical meaning. Then, we apply EAKF to both AMOC models to establish an ensemble coupled model data assimilation and parameter estimation (CDAPE) system, respectively. Within a "twin" experiment framework, the "observation" information, 115 which is from the assimilation model simulation, is used to adjust the parameters of the model, thereby constraining the paths of transition between different AMOC equilibrium states, so that the path simulated by the model is close to the "real" path. This paper is organized as follows. After the introduction, the methodology is described in Sect. 2, including a general proposition of optimizing multi-equilibrium transition path of AMOC, the EAKF algorithm, and the design of twin experiments used throughout this study. Section 3 begins with a description of the three-box and five-variable models, and their combination 120 to simulate regime transitions of the AMOC, and then describes the optimization of the trajectory for the multi-equilibrium transition by CDAPE. In Sect. 4, the method of capturing regime transitions by CDAPE is applied to a similar simple model with a more explicit physical meaning. Finally, the summary and discussions are given in Sect. 5.

Simulation and optimization of AMOC regime transitions 125
The AMOC, as part of the thermohaline circulation, consists mainly of warmer and saltier water flowing from low to high latitudes in the upper ocean of the Atlantic, colder North Atlantic Deep Water (NADW) flowing southward in the deep ocean, and the corresponding upwelling and downwelling currents (Fig. 1a). Multiple equilibria exist in the system, for example, including the thermal mode (active AMOC or on-state) and the reverse haline mode (weak AMOC or off-state). The regime transitions of AMOC are simulated in simple idealized box models and complex ocean general circulation models. There are many influencing factors involved in the model, such as wind-driven gyre and freshwater flux etc, and their variations will result in different states for AMOC. Their specific values and parameterization schemes are often designed with respect to the model states x.
Generally, an ocean-atmosphere coupled model which contains complex physical processes, can be generally expressed as  direct observations y in real Earth system, such as atmospheric wind and temperature fields, ocean temperature and salinity etc. For longer timescales (centurial and millennial timescales), the observations of AMOC can be derived from the paleoclimate records y. The red "+" signs in Fig. 1b, derived from the direct observations y, represent the indirect 160 "observations" of the AMOC sampled from reality. The observations y are projected onto the model parameters β by CDAPE (the red arrow in Fig. 1b) so that β evolves over time with observation-dependent trend. Since the varying parameters allow the physical process of the model to be more flexible and the parameters β constrained by observation y gradually approach their true values, the model CDAPE simulation (purple line) results in a more realistic AMOC multi-equilibria (the blue arrow).
To explore how likely this idea is to be realized, we attempt to capture AMOC regime transitions, in a conceptual model 165 reflecting the characteristic of multi-equilibria and a more complex model with simple physical processes, and even planned in a more complex ocean-atmosphere coupled model representing the real Earth system.

Coupled model data assimilation and parameter estimation (CDAPE)
The ensemble adjustment Kalman filter (Anderson, 2001) is used for data assimilation and parameter estimation in this study.
The basic process of the two-step EAKF (Anderson, 2003;Zhang and Anderson, 2003;Zhang et al., 2007) is to project the 170 observational increment onto model states (relevant parameters) by calculating the error covariance between the prior ensemble of the model variable (parameter) and the model-estimated ensemble.
The first step is to compute the observational increment Δ at the observation location. Δ is formulated by =̄+ Δ ′ − , where the subscript i represents the ensemble sample index, superscript o represents the observational quantity and superscript p represents the prior quantity. While ̄ represents the shift of the ensemble mean and ′ represents the 175 formation of the new ensemble spread. ̄ is computed by where σ is the error standard deviation. Equation (1) shows that the ensemble mean shifts toward the prior model ensemble mean ̄ or the observation value , and whether it is ̄ or depends on who has the smaller variance. Then, ′ is formulated by 180 where represents the prior ensemble spread. Equation (2) denotes that the prior probability density function is squashed by a new observation.
The second step is to distribute the increments on to model states and this assimilation process can be expressed as For the model parameter estimation, the increments are distributed onto a relevant parameter. Therefore, the equation is where is the prior ensemble of the parameter. 190

Experimental design
To show the contribution of data assimilation and parameter estimation to capture AMOC regime transitions, a twin experiment containing a truth model and assimilation models is designed. Since this study focuses on the effect of parameter on multiple equilibria, it is assumed that the model bias is originated only from the incorrectly set parameter. The parameter in the truth model is set to the truth value β0, and the simulation results represent the real state of AMOC in reality. Similar to the 195 observation process in reality, the observations y are obtained by superimposing white noise on the real state and sampling at a certain frequency. The assimilation models differ from the truth model only in the parameter values, with the same initial conditions and other aspects such as the differential scheme. The parameter in the i-th assimilation model is assumed to be incorrectly guessed as βi, and all βi in the assimilation models have the mean of βm (βm ≠ β0) and the variance of βv. The role of data assimilation is shown by the model states constrained by the observations y, and furthermore, the role of parameter 200 estimation is shown by the model states obtained after the parameters are constrained by the observations y.

A three-box MOC model
In the classic two-box model of Stommel (1961), a buoyancy constraint on the thermohaline circulation was present. Following 205 the energy-constraint approach, the thermohaline circulation is driven and maintained by mechanical energy so that buoyancy constraint is replaced by an energy constraint (Guan and Huang, 2008). On this basis, considering the effect of wind-driven gyre, a three-box model is formulated (Shen et al., 2011).
The three-box model used in this study has an upper box representing the mid and low latitude surface ocean, a pole box representing the high-latitude ocean, and a lower box representing the mid and low latitude deep ocean, as illustrated in Fig.  210 2. The three-box model is designed with two different modes which are the thermal mode (driven by temperature) and the haline mode (driven by salinity). In the thermal mode, water flows from the pole box, passing the lower box, then flowing into the upper box by upwelling, and finally returning to the pole box (solid arrows in the boxes), while the circulation is reversed in the haline mode (dotted arrows). The horizontal and vertical water flow are represented by the terms u and v, respectively (more details are given in Shen et al., 2011). The heat balance equations and the salinity balance equations in each box are established firstly. By introducing the nondimensional variables, after derivation, the simple non-dimensional ordinary differential equations are finally obtained as 225 follows: where Ti and Si represent the temperature and salinity in box i, an overdot denotes time tendency, 0 * and 0 * are the mean temperature and mean salinity of the box model ocean, the subscript t stands for the thermal mode, ω represents the winddriven gyre, and p represents the freshwater flux. The non-dimensional continuity equation is 235 Under the energy constraint, the scale of overturning rate in the three-box model satisfies https://doi.org/10.5194/npg-2021-5 Preprint.
where e represents the strength of the external source of mechanical energy sustaining mixing, ρ0 is the mean density of the model ocean, α is the thermal expansion coefficient, and β is the saline expansion coefficient. 240 In the haline mode, the influence of wind-driven gyre is the same as it is in the thermal mode, but the circulation in three boxes is reversed. Accordingly, the non-dimensional equations in each box are where the subscript s stands for the haline mode. The non-dimensional continuity equation is The overturning rate's vs is formulated by Equations (5)-(7) are governing equations for the thermal mode, and Equations (8)-(10) are governing equations for the haline mode of the thermohaline circulation in a hemisphere three-box model. The time derivatives in Eqs. (5) and (8) are set to be zero. By solving the non-dimensional differential equations and the continuity equations, there are two stable solutions 255 and one unstable solution, which means that the three-box model has three equilibrium states. A more detailed description of this model can be found in Guan and Huang (2008) and Shen et al. (2011). The overturning rate (v) and multiple equilibria are affected by energy constraint e, freshwater flux p, and wind-driven gyre ω. The haline mode will switch to the thermal mode when e or ω is increased or p is decreased beyond the critical value (Shen and Guan, 2015).

A 5-variable conceptual climate model 260
Lorenz (1963) proposed a simple model with only three variables to simulate the chaotic characteristics of the atmosphere, where x1 is proportional to the intensity of the convective motion, x2 is proportional to the temperature difference between the ascending and descending currents, and x3 is proportional to the distortion of the vertical temperature profile from linearity.
However, its three variables only reflect the process of atmospheric convection, and they cannot represent the interaction of the atmosphere and ocean, as well as the low-frequency nature of climate evolution. On this basis, two ocean variables that 265 represent the slab ocean variable and the deep ocean pycnocline anomaly are added and coupled with the chaotic "atmosphere" to simulate the interactions between the atmosphere and the ocean (Zhang et al., 2012) as well as the upper and deep oceans (Zhang, 2011a(Zhang, , 2011b. The model equations are: https://doi.org/10.5194/npg-2021-5 Preprint. Discussion started: 1 February 2021 c Author(s) 2021. CC BY 4.0 License.
where x1, x2, and x3 are the high-frequency variables that represent the atmosphere, w and η are the low-frequency variables that conceptually simulate the simple variation characteristics of the upper ocean and the deep ocean, respectively. 275 The original σ, κ, and b sustain the chaotic nature of the atmosphere. The coupling between the fast atmosphere and the slow ocean is reflected by c1 and c2. The coefficient c1 represents the oceanic forcing on the atmosphere and c2 represents the atmospheric forcing on the ocean. To ensure that the time scale of the ocean is slower than the atmosphere, the heat capacity Om must be much larger than the damping rate Od.  (σ,κ,b,c1,c2,Om,Od,Sm,Ss,Spd,Γ,c3,c4,c5, c6) = (9.95, 28, 8/3, 10 -1 , 1, 1, 10, 10, 1, 10, 100, 10 -2 , 10 -2 , 1, 10 -3 ). 285

The 3-box MOC model coupled with the 5-variable model (MOC3B-5V)
The construction of the MOC3B-5V model starts with the three-box model of the previous study of Shen and Guan (2015), including the non-dimensional temperature and salinity differential equations, the continuity equations, and the equation for the overturning rate (Eqs. 5-10). To obtain the time series of overturning rate and simulate the AMOC transition between different equilibrium states in the time series, we no longer set the time derivatives in Eqs. (5) and (8) to be zero, instead using 290 a leapfrog time differencing scheme to forward the temperature and salinity so as the overturning rate.
To test the feasibility of the time differencing scheme, the values of e, p, ω in the three-box model are changed respectively from small to large, and the overturning rate is calculated when the temperature and salinity in the three boxes are almost steady, which means that the AMOC reaches a quasi-equilibrium state. By using a leapfrog time differencing, the three-box model is first spun up for 10 5 TUs (Time units, 1 TUs = 100 steps) starting from (T1, T2, T3, S1, S2, S3) = (20.0, 0.0, 15.0, 35.5, 295 35.0, 34.5) with the values of relevant parameters described in the previous study (Shen and Guan, 2015). The initial values of temperature and salinity at the equilibrium state are obtained. Then the value of e (from 0.0 to 3.0 × 10 -7 kg m -2 s -1 ), p (from 1.0 to 0.0 m yr -1 ) or ω (from 0.0 to 5.0 Sv, 1 Sv = 10 6 m 3 s -1 ) is changed within a certain range. Each time it changes, the threebox model is integrated for another 500 TUs for spin-up to reach an equilibrium state, and the overturning rate corresponding https://doi.org/10.5194/npg-2021-5 Preprint. to different values of e, p, ω is calculated. To distinguish the overturning rate in the haline mode from that in the thermal mode, 300 the overturning rate in the haline mode can be represented by -vs, which means that the circulation is reversed.
The results are consistent with previous results from the research on model stability in Shen et al. (2011). Figure 3a shows the effects of e on the circulation in the three-box model. The corresponding threshold of e exists in the haline mode. When e is less than the critical value, the overturning rate is less than zero. When the value of e is increased beyond the threshold, the haline mode switches to the thermal mode. Similarly, when p is decreased beyond the corresponding critical value (in Fig. 3b), 305 or when ω is increased beyond the corresponding critical value (in Fig. 3c), the AMOC transition from the haline mode to the thermal mode. To simulate the transition between different states of AMOC and achieve the shift from the thermal mode to the haline mode, the non-dimensional differential equations for temperature and salinity balance are adjusted by adding a term Qa which represents an additional freshwater flux from the atmosphere to the upper box and the pole box. Similar to the parameterization 315 scheme in Roebber (1995), a simple and more idealized parameterization scheme for Qa is devised, which assumes that the Thus, v or u greater (less) than zero represents the thermal (haline) mode with circulation flowing clockwise (counterclockwise) in the three boxes in Fig. 2.
Hence, the non-dimensional differential equations are 325 where the function θ(x) is a step function, which has the value 1 for a positive argument and the value zero otherwise. Through this function, we can represent the different circulation given by different sign of v. The continuity equation is and the overturning rate can take a form as 335 so that the sign of v will represent where equilibrium state the AMOC is. The intensity of overturning rate and the state of AMOC are mainly affected by e, p, and ω in the circulation control equations.
In reality, the circulation intensity and the AMOC state are affected by many factors, such as mechanical energy, which is directly used to sustain the vertical mixing in stratification, freshwater flux, and wind-driven circulation. These factors change 340 irregularly in the earth system. To make the variations of e, p, and ω in the model have chaotic characteristics, which is similar to reality, these three influencing factors and the five-variable model are combined. Since the energy constraint e is related to the upper ocean and the deep ocean, the freshwater flux p is related to the atmosphere and the upper ocean, and the winddriven circulation ω is directly related to the atmosphere, it is possible to conceptually idealize a simple equation for the relationship between e, p, ω with x2, w, η. 345 The energy constraint e reflects the strength of the external mechanical energy that sustains mixing, the main sources of which are the energy provided by the wind and tidal dissipation. In this process, kinetic energy is converted to potential energy through turbulence and internal waves (Huang, 2004). Such external mechanical energy is estimated to be about 2 TW (terawatts), with about 1.2 TW as the contribution of wind to mixing, including the generation of internal waves in the surface https://doi.org/10.5194/npg-2021-5 Preprint. Discussion started: 1 February 2021 c Author(s) 2021. CC BY 4.0 License.
ocean (Munk and Wunsch, 1998), and the energy from the wind can radiate throughout the ocean (Wunsch and Ferrari, 2004). 350 Besides, a previous study has estimated the energy provided by wind at 1 TW, which is also about half of the total external energy to sustain mixing (Wunsch, 1998). The other half of the total energy comes mainly from the tidal dissipation in the deep ocean, and to a lesser extent from the interactions of the eddy with the ocean bottom topography (Wunsch and Ferrari, 2004;Kuhlbrodt et al., 2007). The energy parameter e varies continuously with climate conditions, but it is difficult to establish the connection between them accurately, due to the large uncertainty in the estimation of these energy sources (Guan and 355 Huang, 2008).
For the idealized three-box model coupled with the simple conceptual climate model, the energy constraint e can only be conceptually constructed by approximate estimation. However, this paper focuses primarily on capturing regime transitions of the AMOC, so it will not be affected by the inaccurate energy constraint e that is conceptually established. The main sources of external energy to sustain mixing are tide and wind, so e is defined as e = Et + Ew, where Et represents the kinetic energy 360 originating from the abyssal tidal flow and Ew represents the energy from wind contribution to the ocean. Et is primarily associated with the deep ocean, so the equation is simply established as Et = a1 (η + b1). Since the wind affects the upper ocean directly and radiates throughout the ocean through the interaction of the upper ocean with the deep ocean, the equation is established as Ew = a2 (w + b2) + a3 (w + b2) (η + b1), where a1, a2, a3, b1, b2 are constants to be determined. The range of e has been estimated to be roughly 1 × 10 -7 to 3 × 10 -7 kg m -2 s -1 (Guan and Huang, 2008), and considering that the threshold for the 365 equilibrium state transition is near 1.0 × 10 -7 kg m -2 s -1 (in Fig. 3a), the mean value of e is taken to be about 1.5 × 10 -7 kg m -2 s -1 , with wind and tidal contributing half of the total, respectively. Scaling w and η based on the mean and range of variation of them, the values for a1, a2, a3, b1, b2 can be readily derived.
The freshwater flux mainly consists of river runoff (denoted by Pr), evaporation, and precipitation (jointly denoted by Pep), so p is formulated by p = Pr + Pep. Since Pr is primarily associated with the upper ocean, establish the equation Pr = a4 (w + 370 b2). Pep are related to the interaction of the upper ocean with the atmosphere, so the equation Pep = a5 (x2 + b3) + a6 (x2 + b3) (w + b2) is established. The river runoff accounts for a major portion of the total freshwater flux in the northern part of the Atlantic (Broecker et al., 1990), and concerning Fig. 3b, where the threshold for the equilibrium state transition is approximately 0.5 m yr -1 , the values of a4, a5, a6 and b3 can be readily derived. The wind-driven circulation is mainly related to the atmospheric forcing, and the equation is simply established as ω = a7 (x2 + b3). Based on the fact that the equilibrium state transition point 375 is near 2.5 Sv (in Fig. 3c) and scaling is performed on x2, the values of a7 can be estimated. Then, the relationships between e, p, ω from the three-box model and x2, w, η (red terms in Fig. 2) are established as follows = 1 ( + 1 ) + 2 ( + 2 ) + 3 ( + 2 )( + 1 ) = 4 ( + 2 ) + 5 ( 2 + 3 ) + 6 ( 2 + 3 )( + 2 ) = 7 ( 2 + 3 ) .

Experimental design 385
For the MOC3B-5V model, the model states (vector x) are the five variables, the ocean temperature variables, and the ocean salinity variables. The physical processes = ( , ) are represented by Eqs. (11) and (12) observations of x1, x2, x3, and w for estimation, κ = 28 is the "true" solution of the parameter. The truth model is run forward for 5000 TUs to establish the "truth" (see dashed line in Fig. 4 with x2 as a case). To simulate the observation errors, white noise is added to the true value, and the standard deviations of these observation errors are set to 2 for x1, x2, x3, and 0.5 for w. 395 Then, the true value with white noise is sampled at a certain frequency (5 time steps for x1, x2, x3, and 20 time steps for w) as observations. The "+" signs in Fig. 4 show an example of observations (x2). For the assimilation model, the initial conditions are the same as the above "observation". 20 model ensembles are set with different κ to simulate the transitions of AMOC in different models. 20 Gaussian random numbers are drawn for the parameter κ to be estimated, with the mean (̄ = 32) and the guessed variance ( 0 2 = 0.1). To capture the "true" regime transitions of 400 AMOC by the constraint of "observations", the 20 models are spun up from the same initial conditions for another 5000 TUs with different values of κ and standard values for other model parameters.

Sensitivity of model parameters
Several numerical models have shown that multiple equilibria exist in the thermohaline circulation, but the AMOC regime transitions obtained in different models are different. Changes in freshwater flux, energy constraint, wind-driven circulation, and other factors will cause the AMOC to switch between different equilibrium states. By combining the three-box model with 410 the five-variable model, it is simulated that AMOC switches between different equilibrium states in the time series.
As described in Sect. 3.2, the parameter κ, which affects the variation of e, p, and ω in the model, is erroneously guessed as 32. The errors of AMOC transitions caused by model parameters grow rapidly. For the twenty different ensemble members in the free model control ensemble simulations, although the values of κ are all close to 32 with a small difference and their variance is only 0.1, the simulation results (orange lines in Fig. 5a) are quite different which means that the equilibrium states 415 of AMOC are different. Meanwhile, the path of transition between different equilibrium states is also different and does not converge to the truth (red lines). The overturning rate greater than zero means that AMOC is in the thermal mode. By contrast, the value of overturning rate is negative because of the reversed direction of water flow in the three boxes.

Data assimilation and parameter estimation
AMOC simulation results of twenty ensemble members along different transition paths, which are different from the 420 "observations". To adjust the model to make the simulation results closer to the truth, the "observations" are assimilated into the model. Based on the method in Sect. 2.2, the observational increment and the covariance between the prior ensemble of model variable and the model-estimated ensemble are calculated first, after which each observational increment is applied to Eq. (3) to update the model variable ensemble. Obtain an updated prior ensemble of the variable in preparation for the next cycle of data assimilation. The results show that after data assimilation, although the twenty ensemble members (orange lines 425 in Fig. 5b) are in the same path of transition, where AMOC switches between different equilibrium states at the same pace, their transition paths are still different from the "observational" path (red lines). This is because there is a deviation between the parameter value in the model and its best estimate.
Parameterization can approximate many physics in the model, but the values of parameters are usually estimated roughly by summing up experiences in a large number of experiments. To reduce the error caused by parameter errors between model 430 simulation results and the truth, parameter estimation is performed next. The observational increment is applied to the error covariance between the model-estimated ensemble and the prior parameter ensemble by Eq. (4). Parameter estimation starts at 300th-unit. The result shows that after 300th-unit, the overturning rate in twenty ensemble members all follow the same transition path (orange lines in Fig. 5c), which is the same as the "observational" path (red lines). Meanwhile, the parameter κ is adjusted to around the best-estimated value 28 (Fig. 6). As an example of capturing regime transitions of the AMOC, Figure  435 7 shows the results in one of the twenty ensemble members in the free model control ensemble simulations with or without CDAPE. From Fig. 7, we learned that although the model parameter κ is erroneously guessed, constraining κ with observational https://doi.org/10.5194/npg-2021-5 Preprint.  Figure 5. Time series of the overturning rate in the "truth" simulation with κ = 28 (red) and the individual ensemble members (orange) in the free model control ensemble simulations a) without data assimilation and parameter estimation, b) with data assimilation or c) with data assimilation and parameter estimation using erroneously-guessed κ values with a Gaussian perturbation that has a mean value of 32 and a standard deviation of 0.1. The dashed black line denoting v = 0 is a division line between the thermal mode equilibrium state and the haline mode equilibrium state. The dotted-dashed black line denoting 300th-unit marks the start of parameter estimation.

The MOCBM
After proving that it is feasible to capture regime transitions by constraining parameters in an idealized conceptual model, we also use a MOCBM (Tardif et al., 2014;Zhao et al., 2019) with a better physical basis to study the problem of AMOC transition.
The MOCBM is a coupled lower-order ocean-atmosphere climate model constructed by Roebber (1995), which reflects the chaotic variability in the atmosphere and the oscillation or multi-equilibria in the ocean (Roebber, 1995;Taboada and Lorenzo, 465 2005). The atmospheric part of the model is represented by the wave-mean-flow atmospheric circulation model of Lorenz (1984). In contrast to Lorenz's 1963 model describing convection, Lorenz's 1984 model simulates the general atmospheric circulation at mid-latitudes (Lorenz, 1984). The ocean part of the model is represented by another three-box model of the Northern Atlantic Ocean at mid-latitude (Birchfield, 1989). A schematic illustration of this MOCBM can be found in Fig. 1 of Tardif et al. (2014). 470 In MOCBM, the model states (vector x) include atmospheric states X, Y, and Z, and oceanic states T1, T2, T3, S1, S2, and S3.
The governing equations of atmospheric model are: where X represents the intensity of the westerly wind current, Y and Z represent the magnitudes of cosine and sine phases of large-scale eddies, respectively. The terms XY and XZ represent amplification of the eddies through interaction with the westerly current. This amplification is at the expense of the westerly current, which is denoted by the term -(Y 2 + Z 2 ). The terms -bXZ and bXY represent displacement of the eddies by westerly current, while -aX, -Y, and -Z represent mechanical damping. Finally, F represents the diabatic heating contrasts between the low-and high-latitude ocean, and G represents the 480 longitudinal heating contrast between land and sea.
The simple non-dimensional ordinary differential equations are where KT represents the coefficient of heat exchange between the ocean and the atmosphere, KZ represents the coefficient of vertical interaction between the upper ocean and the deep ocean, TA1 and TA2 are the air surface temperature, and QS is the volume averaged equivalent salt flux. The non-dimensional variables r1, r2 and r3 are defined as rj = Vj / (V1 + V2 + V3), where Vj represent the volume of box j. The meridional overturning circulation q satisfies where α and β are the thermal and haline expansion coefficients of seawater, respectively, and µ is a proportionality constant.
The coupled interaction between the ocean box model and the atmospheric model is accomplished by the terms F, G, TA1, 495 TA2, and Qs. Superimposing background value and the variation in a seasonal cycle, as well as long-term variation associated with changes in upper ocean temperatures, F and G are defined as where F0, F1, F2, G0, G1, and G2 are constants, and ω is the annual frequency. Since X in the atmospheric model is directly 500 related to the temperature, the temperature is defined as where TA2 and γ are constants. Finally, the equivalent salt flux is formulated by S = runoff +̄W V + ′ WV , where runoff is the runoff into the ocean from the rivers, ̄W V and ′ WV are the mean and transient eddy components of the atmospheric water vapor transport, respectively. runoff and ̄W V are assumed to be constant and ′ WV is postulated to be linearly related 505 to the eddy sensible heat flux (Y 2 + Z 2 ) (Stone and Yao, 1990). Finally, Qs is obtained as follows: where c1 and c2 are constants to be determined. The parameters in this MOCBM are set to (a,b,r1,r2,r3,KT,Kz,TA2,α,β,µ,F0,F1,F2,G0,G1,G2,γ,) = (0.25,4.00,16.495,5.295,1.332,0.35,0.05276,1, psu -1 , 4 × 10 10 m 3 s -1 , 6.65, 2.0, 47.9, -3.60, 1.24, 3.81, 0.06364).
As in Roebber (1995), the value of Qs affects the solution of the thermohaline circulation, and Qs above a critical value will eventually lead to a complete reversal of the flow. To obtain this critical value of Qs, the equilibrium solution for the 515 thermohaline circulation is calculated as the value of Qs varies from 0.5 × 10 -3 to 4.0 × 10 -3 . As shown in Fig. 8, this critical value is near 2.05 × 10 -3 . Considering the mean and the range of variation of (Y 2 + Z 2 ), and also referring to the values taken in Roebber (1995) and Tardif et al. (2014), c1 and c2 are set here to 1.94 × 10 -3 and 4.05 × 10 -5 , respectively.

Parameter estimation with observational information
A twin experimental framework is designed to perform the study of capturing regime transitions using the MOCBM. Using a 520 fourth-order Runge-Kutta time differencing scheme with a time step of 3 h, the MOCBM is specified with parameter values described above. The parameter b is assumed to be incorrectly estimated as 4.0, while its "true" value is 3.8. The truth model is spun up for an initial 10 5 years starting from an initial value of q equal to 15 Sv. Then, another 60000 years are run forward to produce the "truth" states. The states of the atmosphere and the temperature and salinity of the surface ocean are considered as the variables to be observed. The white noise is added to the "truth" states, and the standard deviations of the observation 525 errors are set to 0.1 for X, Y, Z, 0.5 K for T1, T2, and 0.1 psu for S1, S2. The "observations" are eventually obtained by sampling these variables at a frequency of one year. In the twin experimental framework, the assimilation model is similar to the truth model except that parameter b is assumed to be incorrectly estimated. Thus, the mean value of all parameters b from the twenty assimilation models is 4.0 and their variance is 0.1. Similar to the analysis of model parameter sensitivity in the MOC3B-5V model in Sect. 3.2, the analysis for this model is not repeated here. 530 Figure 9 shows the time series of meridional overturning circulation q, where the positive and negative aspects of q reflect the reversed circulation, which represents the transition between two different states. The simulation results of the twenty assimilation models (orange line in Fig. 9a) are different, and they all have significant errors with the results from the truth model (red line). Then, the "observations" from the truth model are used to adjust the model states, as well as constrain the parameter b. The results of data assimilation and parameter estimation are shown in Fig. 9b, where the simulation results of 535 the twenty assimilation models (orange lines) are constrained by "observations", and the reconstructed transition path (red lines) are obtained.
Compared to the MOC3B-5V model, the MOCBM is more explicit in physical meaning, which is mainly reflected in the meaning of the model states. The meaning of the state variables in the atmospheric part of the MOCBM is more explicit, such as X for westerly wind current and Y and Z for large-scale eddies. The MOC3B-5V model, however, only describes the chaotic 540 characteristic of the atmosphere starting from a simple heating disturbance problem. It further describes the basic variation can be used to test the feasibility of the methodology in Sect. 2. By constraining the model parameters with observations, both 545 models result in capturing regime transitions of the AMOC. Figure 9. Time series of the meridional overturning circulation q in the "truth" simulation with b = 3.8 (red) and the individual ensemble members (orange) in the free model control ensemble simulations a) without or b) with data assimilation and parameter estimation using erroneously-guessed b values with a Gaussian perturbation that has a mean value of 4.0 and a standard deviation of 0.1. The dashed black 550 line denoting q = 0 is a division line between two equilibrium states.

Summary and discussions
A method for combining general AMOC simulation model with ensemble Kalman filtering is designed to form a CDAPE system. Given that the discrepancy exists between the influencing factors of AMOC in the real world and the corresponding parameters of models, parameter estimation is used to estimate the model parameters. With the CDAPE system, within a "twin" 555 experiment framework, twenty assimilation models are set with an incorrect parameter, while a model representing the "truth" that uses the parameters as the standard values. The assimilation models simulate twenty different paths for AMOC to switch between equilibrium states with disturbing parameters. The observational information from the "truth" is assimilated into the assimilation models, and the transition path of AMOC is optimized by parameter estimation, so that regime transitions of AMOC is captured correctly. Our results suggest that guided by estimation theory, appropriately constraining coupled model 560 parameters with observed data can make a climate model capture regime transitions of the AMOC. The research methodology is applied to simple climate models that can simulate AMOC multi-equilibria. The first model in this study provides conceptual https://doi.org/10.5194/npg-2021-5 Preprint. proof that the methodology is feasible, and the second model with more explicit physical meaning provides further demonstration through simulation results.
A simple model that consists of a 3-box ocean model and a 5-variable climate model (the MOC3B-5V model) has been 565 developed to simulate the basic characteristic of AMOC that transits between different equilibrium states. The parameters in the three-box model are linked to the atmospheric variables, the upper ocean variable, and the deep ocean variable in the 5variable model to construct energy constraint, wind-driven circulation, and freshwater flux, which dynamically change within a reasonable range. By projecting observational information of model states to the parameters, the AMOC regime transitions simulated by the model are much closer to reality. It has to be noted that after the change in the three-box model from the 570 stable haline mode to the stable thermal mode, a catastrophic change occurs in the system, which results in the disappearance of the stable haline mode (Guan and Huang, 2008). It is impossible to change the state from the thermal mode to the haline mode by changing parameters. Therefore, to adjust the transition between the thermal mode and the haline mode, additional atmospheric forcing (additional freshwater flux from the atmosphere) is added to the two boxes that have contact with the atmosphere. The effect of this forcing is small and does not affect the overall balance of the model. Although we acknowledge 575 that the effect of additional forcing still needs to be further investigated, the disappearance of the haline mode is not addressed in this study so that we may focus on capturing regime transitions by observation-constrained model parameters. It is important to emphasize that the simple conceptual model is not attempting to simulate a specific oceanic and atmospheric physical process, but rather the opposite: our objective is to explore whether the error between models and reality in terms of the AMOC transition can be reduced by incorporating observational information into the model parameters. 580 The MOCBM (Roebber, 1995) with clearer physical meaning is used in this study. Although both the MOCBM and the MOC3B-5V model are simple idealized hemispheric models, our concerns can be illustrated more clearly through them. Our effort here is to make the AMOC multiple equilibrium states from model simulations better reflect the features in reality. Our focus is on adjusting the model parameters by sampling the observations so that the simulation of the model is closer to the truth on the feature of regime transitions. The conceptual model, albeit simple, has demonstrated the importance of data 585 assimilation and parameter estimation. It is hoped that such a simple model study on AMOC transition will inspire the hypotheses and the optimization of parameters in Coupled General Circulation Models (CGCMs). Taking a study by Ashkenazy (2007) as an example, to understand the ocean general circulation model results, they constructed a simple threebox model to understand the behavior of the thermohaline circulation in more realistic parameter regimes (Ashkenazy, 2007).
We have already captured regime transitions of the AMOC in a conceptual model as well as in a simple model with clearer 590 physical meaning and will apply this method to more complex real systems such as CGCM. The characteristic of AMOC multi-equilibria has been simulated in box models (e.g., Stommel, 1961;Rooth, 1982;Welander, 1986;Birchfield, 1989), ocean circulation models (e.g., Marotzke and Willebrand, 1991), and coupled ocean-atmosphere models (e.g., Manabe and Stouffer, 1988). However, it should also be noted that AMOC multiple equilibria have not been directly simulated by some CGCMs. Tremendous research efforts thus have been put to tackle this issue. One focus was on the CGCM presentation of 595 Stommel's salt advection feedback (Rahmstorf, 1996). It has been suggested this feedback is distorted in CGCMs due to salinity https://doi.org/10.5194/npg-2021-5 Preprint. Discussion started: 1 February 2021 c Author(s) 2021. CC BY 4.0 License.
biases (Huisman et al., 2010;Jackson, 2013;Liu et al., 2017). Another argument is on ocean eddies. It has also been suggested that CGCMs with an eddy-permitting ocean allow for a simulation of AMOC multiple equilibria (Jackson and Wood, 2018) since ocean eddies modify the overall freshwater balance (Mecking et al., 2016). In follow-up studies, we will explore the contribution of a CDAPE system to AMOC multi-equilibrium using different resolutions, ranging from a coarse-resolution 600 CGCM with the ability to simulate AMOC multi-equilibrium characteristic, and eventually to a high-resolution and more realistic Earth system model. The general methodology of this study could be used for both S-type transition and F-type transition. The S-type transition with centurial and millennial timescales could use observations from paleoclimate records, and the methodology can be applied to paleoclimate models for capturing AMOC regime transitions. In current climate system, the F-type transition with very high transition probabilities on multi-decadal timescales (Castellana et al., 2019;Castellana and 605 Dijkstra, 2020) could use direct observations from RAPID or indirect observations from satellites or ARGO program.
Although the observation-constrained model simulates the transition between different equilibrium states of AMOC, this study only serves as the first step of capturing regime transitions and many challenges still exist. First, the deviations of AMOC transition paths simulated in different models are caused by not only parameter errors but also mismatches between real physical processes and model simulations (Zhang et al., 2012). Therefore, the performance of parameter estimation still needs 610 further experimentation with more realistic models. Second, the mechanism of AMOC transition needs further investigation.
The effect of stochastic forcing has been taken into account in previous work. Cessi (1994) studied the transition from one equilibrium to another in a modified Stommel model and she found that the transition could occur under stochastic white-noise forcing. In our study, the transition phenomenon of AMOC is ultimately affected by the model parameters. Usually, traditional state estimation with data assimilation has limited usage to detect the mechanism. Here, we aim at constraining the model 615 parameters through utilization of observational information, which eventually results in a more realistic model behavior in terms of the AMOC transitions. Future effort is needed on how the effect of the stochastic component will manifest in the AMOC system.
Besides, in the study of two simple models, the observational information, which is used for data assimilation and parameter estimation, only comes from the atmosphere and surface ocean. In the real Earth system, the flow of seawater located in the 620 deep ocean is an important part of AMOC, its measurement is difficult. The changes in each component of AMOC will affect the entire circulation. AMOC reconstruction heavily relies on comprehensive observational data. In the future, with the improvement of the Earth observing system, the coupled climate system model will be improved continuously, and the results of numerical simulation will have a higher credibility. These could lead to significant improvement of the reanalysis and prediction of the AMOC. 625 https://doi.org/10.5194/npg-2021-5 Preprint. Discussion started: 1 February 2021 c Author(s) 2021. CC BY 4.0 License.