the Creative Commons Attribution 4.0 License.
Explaining the high skill of Reservoir Computing methods in El Niño prediction
Abstract. Accurate prediction of the extreme phases of the El Niño Southern Oscillation (ENSO) is important to mitigate the socioeconomic impacts of this phenomenon. It has long been thought that prediction skill was limited to a 6-month lead time. However, Machine Learning methods have been shown to have skill at lead times of up to 21 months. In this paper we aim to explain the origin of this high skill for one class of such methods, i.e. Reservoir Computers (RCs). Using a Conditional Nonlinear Optimal Perturbation (CNOP) approach, we compare the initial error propagation in a deterministic Zebiak-Cane (ZC) ENSO model with that in an RC trained on synthetic observations derived from a stochastic ZC model. Optimal initial perturbations at long lead times in the RC involve both sea surface temperature and thermocline anomalies, which leads to decreased error propagation compared to the ZC model, where thermocline anomalies dominate the optimal initial perturbations. This reduced error propagation allows the RC to provide higher skill at long lead times than the deterministic ZC model.
Status: final response (author comments only)
CC1: 'Comment on npg-2024-24', Paul Pukite, 23 Nov 2024
Because of the importance of the thermocline in ENSO behavior, the impact of long-period tides in a reduced effective gravity environment has to be included in any predictive analysis. This is particularly appropriate for machine learning, where known tidal data can be included as straightforwardly as any other input. It is obvious from the paper that the focus is on natural responses (see the reproduced Fig. A2(a)), which clearly shows the damping characteristic of the eigenvalue solution to a differential equation, perhaps selected stochastically via noise.
"This distinction hinges on whether ENSO variability occurs as a sustained oscillation or limit cycle (supercritical) or is a damped oscillation excited by stochastic forcing (subcritical)."
Yet, it's more than likely that ENSO is the result of a forced response to tidal forces, with the annual nonlinear interaction creating an erratic cycling about the approximate 4-year mean period estimated from an index such as NINO3. For the main long-period tidal factors of Mf and Mm, the annually sidebanded periods are calculated at 3.8 and 3.9 years. The complete nonlinear solution of the shallow-water Laplace's tidal equations used to model oceanic fluid dynamics is described in [1]. A training/validation/test procedure similar to that used in machine learning is applied to find an optimal predictive fit. The main point in this type of modeling is that predictive analysis can conceivably be made years in advance. The continual forcing of the mixed lunar and annual cycles will create the requisite temporal boundary/guiding conditions to maintain coherence over a long range, much like conventional tides do for sea-level height (SLH) analysis.
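As a rough illustration of where sideband periods of this size can come from, the sketch below aliases each tidal frequency against the nearest integer number of cycles per year. The constituent periods (13.6608 days for Mf, 27.5545 days for Mm), the tropical year length, and the function name are assumptions introduced here, not values taken from the comment or from [1], and the nearest-harmonic rule is only one plausible reading of "annually sidebanded".

```python
# Sketch: beat period between a long-period tide and the annual cycle.
# All constants below are assumed nominal values, not taken from the comment.
TROPICAL_YEAR_DAYS = 365.2422
TIDAL_PERIODS_DAYS = {"Mf": 13.6608, "Mm": 27.5545}


def annual_sideband_period_years(period_days):
    """Return the period (in years) of the tide's offset from the nearest annual harmonic."""
    cycles_per_year = TROPICAL_YEAR_DAYS / period_days
    # Distance to the nearest integer number of cycles per year.
    offset = abs(cycles_per_year - round(cycles_per_year))
    return 1.0 / offset


sideband_years = {
    name: annual_sideband_period_years(period)
    for name, period in TIDAL_PERIODS_DAYS.items()
}

for name, years in sideband_years.items():
    print(f"{name}: about {years:.1f} years")
```

Under these assumed periods the two results land close to the 3.8 and 3.9 years quoted above.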
[1] Pukite, P., Coyne, D., & Challou, D. (2019). Mathematical Geoenergy: Discovery, Depletion, and Renewal (Vol. 241). John Wiley & Sons. https://agupubs.onlinelibrary.wiley.com/doi/10.1002/9781119434351.ch12. Also see the following site for recent information: https://geoenergymath.com/2024/11/10/lunar-torque-controls-all
Citation: https://doi.org/10.5194/npg-2024-24-CC1
RC1: 'Comment on npg-2024-24', Anonymous Referee #1, 18 Dec 2024
RC2: 'Comment on npg-2024-24', Anonymous Referee #2, 04 Jan 2025
Review on “Explaining the high skill of Reservoir Computing methods in El Niño prediction”
The Reservoir Computer (RC) is a special type of recurrent neural network (RNN) that has been applied to build ENSO prediction models, including in this study. This study aims to explain the high skill of RC in El Niño prediction theoretically. Based on idealized experiments, the author uses ZC models in different regimes to generate “observations” and trains an RC on these data so that it learns the ENSO dynamics of ZC and the Niño trajectory. The results show that RC performs well in both the subcritical and the supercritical regime. In addition, the sensitivities of RC and ZC to the initial field are explored through CNOP calculations, with meaningful results. However, this seems to run contrary to the main purpose of the study, and it does not seem to fully explain why RC can produce high El Niño forecasting skill. This is my biggest doubt about this work, and of course it is also the most interesting point. I hope the author can provide a more elegant explanation. In addition, the following are some thoughts and suggestions on this work or article:
1. The results in Figure 2 make me think deeply. They tell us that, when training a model, including richer data does not necessarily give better results. This seems to be related to the inherent dynamic characteristics of the system. Could you please explain why? For example, in the subcritical state, why is the prediction skill better when the wind field is not included in the training period? Although you attribute this to the sensitivity to wind noise, that is not specific enough. I think it can be discussed in more detail.
By the way, I do not understand the sentence in Line 115.
2. For the CNOP part, “lead time” in the manuscript means the optimization time, right?
3. Some figures need to be refined. For example, the abscissa and ordinate in Figure A1 should be given as longitude and latitude coordinates.
4. The calculation of CNOP in complicated climate models has always been a major problem. How did you use the gradient-free COBYLA optimization algorithm to solve it? In addition, it is necessary to verify that the obtained CNOP is truly the CNOP. I recommend adding small random perturbations to the obtained CNOP, projecting the result onto the constraint, and comparing the development of errors, so as to show that the CNOP solution is optimal.
5. Compared with linear regression, it seems that the advantages of RC are not particularly significant either. What's your view on this issue?
6. In RC, CNOP is not sensitive to the forecast duration, while the opposite is true in ZC (Fig. 5). Why is this the case and what does it imply?
7. Personally, I think that to explain the advantages of RC in ENSO prediction, the key is to focus on the extent to which RC has learned the ENSO dynamics or nonlinear behaviors.
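The check suggested in point 4 can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the authors' setup: the model is a damped two-variable oscillator rather than ZC or the RC, the candidate CNOP comes from a brute-force scan of the constraint boundary rather than from COBYLA, and every name and parameter value (`MU`, `OMEGA`, `EPS`, `BASE`, `propagate`, `error_growth`, `project`) is hypothetical.

```python
import math
import random

# Assumed toy setup (hypothetical values, not from the manuscript).
MU, OMEGA = -0.1, 1.0   # damped oscillator parameters
DT, STEPS = 0.01, 500   # forward-Euler step and optimization time
EPS = 0.05              # constraint: ||delta|| <= EPS
BASE = (0.3, 0.0)       # unperturbed initial state


def propagate(state):
    """Integrate the toy oscillator with forward Euler for STEPS steps."""
    x, y = state
    for _ in range(STEPS):
        r2 = x * x + y * y
        dx = MU * x - OMEGA * y - r2 * x
        dy = OMEGA * x + MU * y - r2 * y
        x, y = x + DT * dx, y + DT * dy
    return x, y


REFERENCE = propagate(BASE)


def error_growth(delta):
    """Squared final-time distance between the perturbed and reference runs."""
    x, y = propagate((BASE[0] + delta[0], BASE[1] + delta[1]))
    return (x - REFERENCE[0]) ** 2 + (y - REFERENCE[1]) ** 2


def project(delta):
    """Rescale delta onto the ball of radius EPS if it lies outside it."""
    norm = math.hypot(delta[0], delta[1])
    if norm <= EPS:
        return delta
    return (delta[0] * EPS / norm, delta[1] * EPS / norm)


# Candidate CNOP: best point found on a fine scan of the constraint boundary.
angles = [2 * math.pi * k / 1440 for k in range(1440)]
candidate = max(
    ((EPS * math.cos(a), EPS * math.sin(a)) for a in angles),
    key=error_growth,
)
candidate_growth = error_growth(candidate)

# Referee's check: perturb the candidate, project back, compare error growth.
rng = random.Random(0)
trial_growths = []
for _ in range(200):
    noisy = (
        candidate[0] + rng.gauss(0.0, 0.1 * EPS),
        candidate[1] + rng.gauss(0.0, 0.1 * EPS),
    )
    trial_growths.append(error_growth(project(noisy)))

best_trial = max(trial_growths)
print(f"candidate growth: {candidate_growth:.3e}")
print(f"best perturbed trial: {best_trial:.3e}")
```

If some perturbed and projected trial grew clearly faster than the candidate, the optimizer would have stopped short of the constrained maximum. A test of this kind probes only the neighborhood of the candidate, so it supports local optimality and says nothing about distant maxima.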
Citation: https://doi.org/10.5194/npg-2024-24-RC2
RC3: 'Comment on npg-2024-24', Anonymous Referee #2, 04 Jan 2025
Citation: https://doi.org/10.5194/npg-2024-24-RC3
RC4: 'Comment on npg-2024-24', Anonymous Referee #3, 06 Jan 2025
This manuscript investigates the prediction skill of a specific type of Recurrent Neural Network, known as Reservoir Computer (RC), in relation to ENSO forecasting. It finds that error propagation in RC is lessened compared to the Zebiak-Cane (ZC) model. While the RC demonstrates high prediction skill (e.g., an ACC greater than 0.6 at 18 months lead time), I believe this manuscript is not suitable for publication for several reasons:
1. The predictions are not based on real-world data. Both the training and testing datasets are generated from the ZC model, which does not reflect actual observations. It is unclear how well RC performs when predicting real-world events, such as the ENSO events of 2014-2015.
2. The prediction accuracy of RC is very similar to that achieved by linear regression (LR, as shown in Fig. 2). First, error bars should be included for the LR results. Second, the performance of LR is comparable to that of RC, particularly as indicated by the proximity of the red and blue lines at lead times of 1-9 months.
3. The influence of wind stress in RC is inconsistent. In some instances, incorporating wind stress enhances ENSO predictions, while in others, it does not. This inconsistency undermines the conclusions drawn, as it does not provide clear insights for real-world predictions, particularly regarding whether ENSO is damped or self-exciting in actual observations.
4. The results from the ZC model raise concerns. For instance, in Fig. A2, the Nino3 index only fluctuates between 0.1 and -0.1.
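One way the error bars requested in point 2 could be produced is a bootstrap over forecast/verification pairs. The sketch below is only an illustration: it uses synthetic stand-in data rather than ZC, RC, or LR output, it treats ACC as a plain Pearson correlation of anomalies, and the function names `pearson` and `bootstrap_interval` are introduced here. It also resamples pairs independently, which ignores serial correlation in a real index.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_interval(obs, pred, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the correlation between obs and pred."""
    rng = random.Random(seed)
    n = len(obs)
    scores = []
    for _ in range(n_resamples):
        # Resample forecast/verification pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(pearson([obs[i] for i in idx], [pred[i] for i in idx]))
    scores.sort()
    lo = scores[int((1 - level) / 2 * n_resamples)]
    hi = scores[int((1 + level) / 2 * n_resamples) - 1]
    return lo, hi


# Synthetic stand-in data: a noisy oscillation and an imperfect forecast of it.
rng = random.Random(1)
obs = [math.sin(2 * math.pi * t / 48) + rng.gauss(0.0, 0.3) for t in range(200)]
pred = [value + rng.gauss(0.0, 0.5) for value in obs]

acc = pearson(obs, pred)
lo, hi = bootstrap_interval(obs, pred)
print(f"ACC = {acc:.2f}, 95% interval = [{lo:.2f}, {hi:.2f}]")
```

Repeating this at each lead time for both methods would show whether the RC and LR curves are distinguishable within sampling uncertainty. A block bootstrap would be a more cautious choice for autocorrelated monthly series.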
I am not familiar with CNOP, so I will refrain from commenting on section 3.3.
Citation: https://doi.org/10.5194/npg-2024-24-RC4