<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">NPG</journal-id><journal-title-group>
    <journal-title>Nonlinear Processes in Geophysics</journal-title>
    <abbrev-journal-title abbrev-type="publisher">NPG</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Nonlin. Processes Geophys.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1607-7946</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/npg-31-535-2024</article-id><title-group><article-title>Learning extreme vegetation response to climate drivers with recurrent neural networks</article-title><alt-title>Learning vegetation response to climate drivers</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1 aff2 aff3">
          <name><surname>Martinuzzi</surname><given-names>Francesco</given-names></name>
          <email>martinuzzi@informatik.uni-leipzig.de</email>
        <ext-link>https://orcid.org/0000-0003-3249-3703</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1 aff2 aff3 aff4">
          <name><surname>Mahecha</surname><given-names>Miguel D.</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-3031-613X</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff5">
          <name><surname>Camps-Valls</surname><given-names>Gustau</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-1683-2138</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2 aff3 aff4">
          <name><surname>Montero</surname><given-names>David</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-9010-3286</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff5">
          <name><surname>Williams</surname><given-names>Tristan</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-0532-7013</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2 aff3">
          <name><surname>Mora</surname><given-names>Karin</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-3323-4490</ext-link></contrib>
        <aff id="aff1"><label>1</label><institution>Center for Scalable Data Analytics and Artificial Intelligence, Leipzig University, Leipzig, Germany</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Institute for Earth System Science &amp; Remote Sensing, Leipzig University, Leipzig, Germany</institution>
        </aff>
        <aff id="aff3"><label>3</label><institution>Remote Sensing Centre for Earth System Research, Leipzig University and UFZ, Leipzig, Germany</institution>
        </aff>
        <aff id="aff4"><label>4</label><institution>German Centre for Integrative Biodiversity Research (iDiv), Leipzig, Germany</institution>
        </aff>
        <aff id="aff5"><label>5</label><institution>Image Processing Laboratory (IPL), Universitat de València, València, Spain</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Francesco Martinuzzi (martinuzzi@informatik.uni-leipzig.de)</corresp></author-notes><pub-date><day>13</day><month>November</month><year>2024</year></pub-date>
      
      <volume>31</volume>
      <issue>4</issue>
      <fpage>535</fpage><lpage>557</lpage>
      <history>
        <date date-type="received"><day>13</day><month>October</month><year>2023</year></date>
           <date date-type="rev-request"><day>17</day><month>October</month><year>2023</year></date>
           <date date-type="rev-recd"><day>6</day><month>September</month><year>2024</year></date>
           <date date-type="accepted"><day>9</day><month>September</month><year>2024</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2024 Francesco Martinuzzi et al.</copyright-statement>
        <copyright-year>2024</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024.html">This article is available from https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024.html</self-uri><self-uri xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024.pdf">The full text article is available as a PDF file from https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e154">The spectral signatures of vegetation are indicative of ecosystem states and health. Spectral indices used to monitor vegetation are characterized by long-term trends, seasonal fluctuations, and responses to weather anomalies. This study investigates the potential of neural networks in learning and predicting vegetation response, including extreme behavior, from meteorological data. Machine learning methods, particularly neural networks, have advanced significantly in modeling nonlinear dynamics, and it has become standard practice to approach such problems using recurrent architectures capable of capturing nonlinear effects and accommodating both long- and short-term memory. We compare four recurrent learning models, which differ in their training and architecture, for predicting spectral indices at different forest sites in Europe: (1) recurrent neural networks (RNNs), (2) long short-term memory networks (LSTMs), (3) gated recurrent unit networks (GRUs), and (4) echo state networks (ESNs). While our results show minimal quantitative differences in their performances, ESNs exhibit slightly superior results across various metrics. Overall, we show that recurrent network architectures are generally suitable for vegetation state prediction yet exhibit limitations under extreme conditions, which highlights the need for further research on modeling extreme conditions within ecosystem dynamics.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>Sächsisches Staatsministerium für Wissenschaft und Kunst</funding-source>
<award-id>ScaDS.AI</award-id>
<award-id>232171353</award-id>
<award-id>3-7304/35/6-2021/48880</award-id>
</award-group>
<award-group id="gs2">
<funding-source>European Space Agency</funding-source>
<award-id>4000143500/23/I-DT</award-id>
</award-group>
<award-group id="gs3">
<funding-source>Deutsches Zentrum für Luft- und Raumfahrt</funding-source>
<award-id>50EE2201B</award-id>
</award-group>
<award-group id="gs4">
<funding-source>European Commission</funding-source>
<award-id>101003469</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e166">The recent increase in atmospheric CO<sub>2</sub> concentrations only partly reflects anthropogenic emissions, as oceans and land ecosystems contribute to the carbon uptake <xref ref-type="bibr" rid="bib1.bibx32 bib1.bibx71 bib1.bibx15" id="paren.1"/>. Forests and other terrestrial ecosystems absorb nearly a third of human-made emissions and establish an essential negative feedback within the global carbon cycle <xref ref-type="bibr" rid="bib1.bibx41 bib1.bibx70" id="paren.2"/>. However, during extreme events such as persistent droughts and heat waves, ecosystems may release more CO<sub>2</sub> into the atmosphere than they absorb, e.g., due to suppressed photosynthesis <xref ref-type="bibr" rid="bib1.bibx134 bib1.bibx121" id="paren.3"/>. Variations in the frequency and intensity of these events can lead to long-lasting environmental modifications, contributing to positive feedback loops that aggravate climate warming <xref ref-type="bibr" rid="bib1.bibx105" id="paren.4"/>. For example, increases in drought intensities have been consistently linked to excess tree mortality <xref ref-type="bibr" rid="bib1.bibx48 bib1.bibx36 bib1.bibx72 bib1.bibx31" id="paren.5"/>, negatively impacting carbon sequestration potential <xref ref-type="bibr" rid="bib1.bibx131" id="paren.6"/>. The frequency, intensity, and duration of extremes over the next few decades are expected to increase compared to previous decades <xref ref-type="bibr" rid="bib1.bibx118" id="paren.7"/>. Therefore, understanding how vegetation responds to climate drivers becomes crucial in land–atmosphere modeling <xref ref-type="bibr" rid="bib1.bibx81 bib1.bibx82" id="paren.8"/>.</p>
      <p id="d2e212">The vegetation response changes over time, showing seasonal patterns and long-term trends <xref ref-type="bibr" rid="bib1.bibx123 bib1.bibx79 bib1.bibx26 bib1.bibx27 bib1.bibx73 bib1.bibx117 bib1.bibx90" id="paren.9"/>. This variability is partially inherited from climate variations, reflected in the dynamics of solar radiation, air temperature, and precipitation, which affect vital biosphere processes such as phenology, photosynthesis, respiration, and growth. However, the relationship between climate and biosphere involves complex interactions due to the nonlinear response of vegetation to climate drivers <xref ref-type="bibr" rid="bib1.bibx37 bib1.bibx141 bib1.bibx93 bib1.bibx117" id="paren.10"/>. In particular, ecosystems exhibit long-range memory effects <xref ref-type="bibr" rid="bib1.bibx58 bib1.bibx96" id="paren.11"/> that can put their resilience at risk <xref ref-type="bibr" rid="bib1.bibx28" id="paren.12"/>. For instance, extreme heat waves can impair leaf growth and development and, when coupled with drought conditions, can induce tree mortality <xref ref-type="bibr" rid="bib1.bibx129" id="paren.13"/>. Extreme climate events can cause irreversible damage and reduce an ecosystem's resilience <xref ref-type="bibr" rid="bib1.bibx46 bib1.bibx5 bib1.bibx82" id="paren.14"/>. These factors collectively contribute to the challenge of predicting the coupled vegetation and climate system.</p>
      <p id="d2e234">Traditionally, terrestrial biosphere models have played a pivotal role in simulating ecosystem responses to climate variability <xref ref-type="bibr" rid="bib1.bibx122 bib1.bibx66" id="paren.15"/>. These process-based models are inherently complex, as they must not only adhere to physical laws but also reflect biotic responses that are derived from empirical observations and difficult to parametrize. Hence, terrestrial biosphere models sometimes fall short in capturing the complexity of ecosystem dynamics accurately <xref ref-type="bibr" rid="bib1.bibx94" id="paren.16"/>. The degree to which machine learning (ML) techniques can alleviate these issues has therefore been discussed <xref ref-type="bibr" rid="bib1.bibx107" id="paren.17"/>. These methods represent powerful modeling tools that are able to find patterns in data that process-based models may not be able to capture. As a consequence, applications of ML models in land–atmosphere interactions are wide-ranging and include, for example, local-to-global flux upscaling <xref ref-type="bibr" rid="bib1.bibx95 bib1.bibx59 bib1.bibx91" id="paren.18"/>, quantifying dynamic properties of seasonal behavior <xref ref-type="bibr" rid="bib1.bibx117 bib1.bibx90" id="paren.19"/>, and the prediction of ecosystem states <xref ref-type="bibr" rid="bib1.bibx60 bib1.bibx144 bib1.bibx101" id="paren.20"/>. More specifically, recurrent neural networks (RNNs) represent suitable architectures for modeling complex Earth system dynamics <xref ref-type="bibr" rid="bib1.bibx107 bib1.bibx14" id="paren.21"/> due to their ability to encode nonlinear temporal dependencies <xref ref-type="bibr" rid="bib1.bibx6 bib1.bibx68" id="paren.22"/> and their capacity to retain information from past inputs <xref ref-type="bibr" rid="bib1.bibx33" id="paren.23"/>.</p>
      <p id="d2e265">However, RNNs face technical challenges associated with gradient-based training, including the issues of vanishing and exploding gradients, which impede network convergence <xref ref-type="bibr" rid="bib1.bibx53 bib1.bibx97" id="paren.24"/>. To tackle these problems, specialized RNN architectures have been developed. Long short-term memory (LSTM) networks <xref ref-type="bibr" rid="bib1.bibx54" id="paren.25"/> maintain the gradient-based training of the original RNNs while addressing these problems through gating mechanisms. Another architecture, the gated recurrent unit (GRU) <xref ref-type="bibr" rid="bib1.bibx21" id="paren.26"/>, further refines the LSTM approach, providing comparable results with improved computational efficiency <xref ref-type="bibr" rid="bib1.bibx22" id="paren.27"/>. In contrast, echo state networks (ESNs) employ a distinct approach by training only the last layer through linear regression <xref ref-type="bibr" rid="bib1.bibx57" id="paren.28"/>. Because this training requires no gradient computation, vanishing and exploding gradients cannot arise, which offers an alternative to gating. The improvements provided by both the gated models and the ESNs allowed the application of these models to different tasks such as rainfall–runoff modeling <xref ref-type="bibr" rid="bib1.bibx64 bib1.bibx43" id="paren.29"/>, sea surface temperature estimation <xref ref-type="bibr" rid="bib1.bibx143 bib1.bibx135" id="paren.30"/>, and the forecasting of chaotic systems <xref ref-type="bibr" rid="bib1.bibx99 bib1.bibx100 bib1.bibx133 bib1.bibx18 bib1.bibx44" id="paren.31"/>, among others.</p>
      <p id="d2e294">Expanding on the utility of RNNs, particularly LSTMs, recent studies have demonstrated their effectiveness in addressing specific challenges within land–atmosphere interactions, modeling land fluxes from meteorological drivers <xref ref-type="bibr" rid="bib1.bibx106" id="paren.32"/>. Further studies reinforced the suitability of RNN approaches for land–atmosphere interactions <xref ref-type="bibr" rid="bib1.bibx20" id="paren.33"/>. In <xref ref-type="bibr" rid="bib1.bibx8" id="text.34"/>, the authors employ LSTMs to predict land fluxes from remote sensing data and climate variables to explore the memory effects of vegetation. Additionally, the work compares LSTMs with random forests, showing the better performance of deep learning models in this task. Further explorations in dynamic memory effects with LSTMs have been carried out by <xref ref-type="bibr" rid="bib1.bibx63" id="text.35"/>. Given the recent findings on the utility of RNNs and LSTMs in studying dynamics, it is timely to investigate if these tools can accurately predict extreme biosphere dynamics.</p>
      <p id="d2e309">Can recurrent neural networks learn the vegetation's extreme responses to climate drivers? Recurrent architectures can embed the dynamics of target systems into their higher dimensional representation <xref ref-type="bibr" rid="bib1.bibx42 bib1.bibx51" id="paren.36"/>. When considering the nature of extremes in dynamical systems, they can be understood as specific regions of the phase space <xref ref-type="bibr" rid="bib1.bibx35" id="paren.37"/>. Combining this with the embedding abilities of RNNs offers an explanation for the observed efficacy of ESNs and LSTMs in learning extreme events within controlled environments or “toy models” <xref ref-type="bibr" rid="bib1.bibx124 bib1.bibx69 bib1.bibx103 bib1.bibx104 bib1.bibx87 bib1.bibx92" id="paren.38"/>. However, the applicability of these findings to land–atmosphere interactions remains unclear. Unlike the systems investigated in previous studies, biosphere dynamics are characterized by stochasticity <xref ref-type="bibr" rid="bib1.bibx30" id="paren.39"/>, while their measurements present noise <xref ref-type="bibr" rid="bib1.bibx88" id="paren.40"/>. Therefore, exploring how recurrent architectures perform in the specific context of vegetation extremes in ecosystem dynamics is imperative.</p>
      <p id="d2e327">To address this question, we investigate the ability of various recurrent architectures to model the response of vegetation states, i.e., spectral reflectance indices, to climate drivers. Vegetation interacts with sunlight, showing specific spectral responses that can be altered during extreme events such as heat waves. Vegetation indices, obtained from the spectral response through linear or nonlinear transformations, are used to quantify these changes, isolating vegetation properties from other influences such as soil background <xref ref-type="bibr" rid="bib1.bibx142" id="paren.41"/>; detailed lists of these indices are provided by <xref ref-type="bibr" rid="bib1.bibx142" id="text.42"/> and <xref ref-type="bibr" rid="bib1.bibx89" id="text.43"/>. Our study focuses on the normalized difference vegetation index (NDVI) <xref ref-type="bibr" rid="bib1.bibx112" id="paren.44"/>, which indicates vegetation greenness. To build a model that can accurately predict NDVI responses to climate conditions, we use climate-related variables, such as air temperature and precipitation, as inputs.</p>
      <p id="d2e342">The goal of this study is threefold: (1) to further solidify the viability of the recurrent neural network approach to model ecosystem dynamics; (2) to investigate whether these models can capture extreme vegetation responses to climate drivers; and, finally, (3) to investigate whether a specific RNN architecture is more suited for these tasks. To evaluate the performance of the models, we use metrics such as the normalized root mean squared error (NRMSE) and symmetric mean absolute percentage error (SMAPE) in conjunction with information-theory-based measures, namely the entropy–complexity (EC) plots <xref ref-type="bibr" rid="bib1.bibx111" id="paren.45"/>. The latter approach quantifies each model's ability to capture the target dynamics. Using the model's residuals, defined as the difference between the prediction and the actual signal, the EC plots indicate how well the model captures dynamics beyond the seasonality <xref ref-type="bibr" rid="bib1.bibx120" id="paren.46"/>. Additionally, we investigate whether recurrent architectures can effectively capture vegetation responses to extreme events. To our knowledge, no study has compared the performance of recurrent models in the context of extreme events or ecosystem dynamics.</p>
      <p id="d2e351">The remainder of this paper is structured as follows. In Sect. <xref ref-type="sec" rid="Ch1.S2.SS1"/>, we present the data we used, including the site selection process and pre-processing steps.  Next, in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/>, we describe the architecture of the recurrent neural networks and formalize the task. Following this, Sect. <xref ref-type="sec" rid="Ch1.S2.SS3"/> and <xref ref-type="sec" rid="Ch1.S2.SS4"/> give a background on the methods of backpropagation training and echo state networks, respectively. In Sect. <xref ref-type="sec" rid="Ch1.S2.SS5"/> we describe the procedure to identify extreme events in the NDVI time series. We detail the metrics used in the study in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6"/>. More specifically, in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6.SSS1"/>, we introduce NRMSE and SMAPE; in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6.SSS2"/>, we illustrate the EC plots; finally, in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6.SSS3"/>, we describe the metrics used for evaluating model performance on predicting extreme events. We show our results in Sect. <xref ref-type="sec" rid="Ch1.S3"/>. In Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/> we compare model performances using NRMSE and SMAPE. Additionally, in Sect. <xref ref-type="sec" rid="Ch1.S3.SS2"/> we show the results for the EC plots. Finally, we illustrate the models' capability to predict extreme events in Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/>. We draw conclusions and discuss broader implications in Sect. <xref ref-type="sec" rid="Ch1.S4"/>.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Methods</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Data and pre-processing</title>
      <p id="d2e399">We used optical remote sensing data of forest sites to measure biosphere dynamics, specifically the normalized difference vegetation index (NDVI) <xref ref-type="bibr" rid="bib1.bibx112" id="paren.47"/>, which we define as the “target” variable. However, employing NDVI presents drawbacks <xref ref-type="bibr" rid="bib1.bibx13" id="paren.48"/>. Namely, it has a saturating and nonlinear connection to green biomass and responds to greenness rather than the actual plant photosynthesis process. Despite these limitations, this index has been used successfully for various purposes, including, but not limited to, evaluating ecosystem resilience <xref ref-type="bibr" rid="bib1.bibx140" id="paren.49"/> and tracking the decline of vegetation greenness in the Amazon forests <xref ref-type="bibr" rid="bib1.bibx52" id="paren.50"/>. Additionally, NDVI was shown to be a good indicator of vegetation response to extreme climate events <xref ref-type="bibr" rid="bib1.bibx74" id="paren.51"/>.</p>
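      <p>For reference, NDVI is conventionally computed as the normalized difference between near-infrared (NIR) and red reflectance, (NIR − Red)/(NIR + Red), which bounds the index between −1 and 1. The following is a minimal sketch of that standard definition only; the function name and the handling of a zero denominator are illustrative assumptions, and the dataset used in this study already provides NDVI values, so no such computation is part of the described workflow.</p>

<preformat preformat-type="code">
```python
import numpy as np


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # Leave NaN where the denominator is zero instead of dividing by zero.
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out
```
</preformat>

      <p>Because the denominator grows together with the numerator over dense canopies, the ratio flattens at high biomass, which is one way to see the saturation noted above.</p>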

      <fig id="Ch1.F1" specific-use="star"><label>Figure 1</label><caption><p id="d2e419"><italic>Chosen locations and pre-processing</italic>. In <bold>(a)</bold>, we show the available and used locations for the study and their forest type. We used locations with more than 80 % forest cover; the remaining ones, depicted in gray, were not used. In <bold>(b)</bold>, we show the process of including additional pixels into the cube for an example location (CZ-Stn). While different radii are shown, we use all the pixels in a single cube. Finally, in <bold>(c)</bold>, we show the NDVI signal corresponding to the mean of the pixels in the area with radius <inline-formula><mml:math id="M3" display="inline"><mml:mi>r</mml:mi></mml:math></inline-formula> previously shown for the same example location. The panel also indicates the percentage of the pixels that are flagged as forest (fc).</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f01.png"/>

        </fig>

      <p id="d2e446">We used NDVI values obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) in the FluxnetEO dataset v1.0 <xref ref-type="bibr" rid="bib1.bibx136" id="paren.52"/>. This dataset is a multi-dimensional array of values referred to as a data cube <xref ref-type="bibr" rid="bib1.bibx80" id="paren.53"/>. It comprises a collection of labeled univariate time series at the geographic location of eddy covariance towers. Eddy covariance towers are specialized measurement stations designed to capture and quantify land–atmosphere fluxes and meteorological conditions, providing insights into ecosystem health and functionality <xref ref-type="bibr" rid="bib1.bibx2" id="paren.54"/>. The FluxnetEO datasets are gap-filled using measurements from eddy covariance towers, thus providing a higher-resolution product in time and space compared to the raw MODIS data.</p>
      <p id="d2e459">The data cubes in the dataset have a spatial resolution of 500 m and a daily temporal resolution, covering the period 2000–2020 (inclusive). Each cube spans an area of <inline-formula><mml:math id="M4" display="inline"><mml:mrow><mml:mn mathvariant="normal">3</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mrow class="unit"><mml:mi mathvariant="normal">km</mml:mi></mml:mrow><mml:mo>×</mml:mo><mml:mn mathvariant="normal">3</mml:mn><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mrow class="unit"><mml:mi mathvariant="normal">km</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula> centered on the eddy covariance tower at the respective location. We further pre-processed the data by computing, for each time step, the mean NDVI over all pixels within the cube; see Fig. <xref ref-type="fig" rid="Ch1.F1"/>. Additionally, we smoothed the signal using a Savitzky–Golay filter <xref ref-type="bibr" rid="bib1.bibx115 bib1.bibx126" id="paren.55"/> with a 7 d window and a polynomial order of 3 <xref ref-type="bibr" rid="bib1.bibx19" id="paren.56"/> to reduce potential artifacts caused by noise.</p>
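      <p>The two pre-processing steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the array layout (time, y, x), the function name <monospace>preprocess_ndvi</monospace>, the use of NaN for missing pixels, and the synthetic test signal are all hypothetical. Only the 7 d window and the polynomial order of 3 are taken from the text; the filter itself is SciPy's <monospace>savgol_filter</monospace>.</p>

<preformat preformat-type="code">
```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess_ndvi(cube, window_days=7, polyorder=3):
    """Spatially average an NDVI cube and smooth the resulting daily series.

    cube: array of shape (time, y, x); NaN marks missing pixels (assumed layout).
    """
    cube = np.asarray(cube, dtype=float)
    # Mean over both spatial axes for every time step, ignoring missing pixels.
    series = np.nanmean(cube, axis=(1, 2))
    # At daily resolution, a 7-sample window corresponds to the 7 d window.
    return savgol_filter(series, window_length=window_days, polyorder=polyorder)


# Synthetic example: one year of daily data on a 6 x 6 pixel grid,
# i.e., 3 km at 500 m resolution. The signal itself is made up.
rng = np.random.default_rng(0)
days = np.arange(365)
seasonal = 0.5 + 0.3 * np.sin(2 * np.pi * (days - 100) / 365)
cube = seasonal[:, None, None] + 0.02 * rng.standard_normal((365, 6, 6))
smoothed = preprocess_ndvi(cube)
```
</preformat>

      <p>With a polynomial order of 3, the filter reproduces any locally cubic signal exactly, so slow seasonal variations pass through essentially unchanged while day-to-day noise is damped.</p>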
      <p id="d2e490">We selected study sites based on their forest cover percentage, ensuring over 80 % forest cover in each cube; see Fig. <xref ref-type="fig" rid="Ch1.F1"/>. The forest masks were obtained from the Copernicus Global Land Service (CGLS) product <xref ref-type="bibr" rid="bib1.bibx12" id="paren.57"/>, which has a resolution of 100 m. These sites represent three different forest types: evergreen broadleaved forests (EBF), mixed forests (MF), and deciduous broadleaved forests (DBF). We chose to study forest sites because of their importance to the carbon cycle and because they are the least affected by human influence at a daily timescale. Consequently, we employed the described approach, detailed in Fig. <xref ref-type="fig" rid="Ch1.F1"/>, to minimize further imperfections in the vegetation signal caused by human intervention. As a result of this selection criterion, the number of study sites is reduced from 42 to 20.</p>
      <p id="d2e500">In this study, we used climate variables as input variables to predict the target variable. Following machine learning terminology, we will refer to them as “features”. We selected air temperature (mean, minimum, and maximum), mean sea level pressure, mean global solar radiation, and precipitation as features. The climate data were obtained from the E-OBS product v26.0e <xref ref-type="bibr" rid="bib1.bibx23" id="paren.58"/>. Based on in situ observations, this dataset is spatially interpolated to cover most of the European continent. The spatial resolution is 0.1° (approximately <inline-formula><mml:math id="M5" display="inline"><mml:mrow><mml:mn mathvariant="normal">11.1</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mrow class="unit"><mml:mi mathvariant="normal">km</mml:mi></mml:mrow><mml:mo>×</mml:mo><mml:mn mathvariant="normal">11.1</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mrow class="unit"><mml:mi mathvariant="normal">km</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula>), and the temporal resolution is daily. The features cover the same period as the target variable, 2000 to 2020 (inclusive). We split the data chronologically: the years 2000 to 2013 (inclusive) are used for training and validation, and the remaining years, 2014 to 2020, are used for testing, which corresponds to a split of roughly 67 % for training and 33 % for testing.</p>
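      <p>The chronological split can be illustrated with a short sketch. The function name <monospace>split_by_year</monospace> and the placeholder arrays are assumptions made for this example; only the year boundaries and the number of features (six) come from the text.</p>

<preformat preformat-type="code">
```python
import numpy as np


def split_by_year(dates, features, target, first_test_year=2014):
    """Split daily data chronologically; earlier years train, later years test."""
    # Convert datetime64 days to calendar years (offsets are counted from 1970).
    years = dates.astype("datetime64[Y]").astype(int) + 1970
    test = years >= first_test_year
    train = ~test
    return (features[train], target[train]), (features[test], target[test])


# Daily time axis from 2000-01-01 through 2020-12-31 (inclusive).
dates = np.arange("2000-01-01", "2021-01-01", dtype="datetime64[D]")
features = np.zeros((dates.size, 6))  # placeholder for the six climate features
target = np.zeros(dates.size)         # placeholder for the NDVI series

(train_x, train_y), (test_x, test_y) = split_by_year(dates, features, target)
train_fraction = train_x.shape[0] / dates.size
```
</preformat>

      <p>Counting leap days, the period contains 7671 days, of which 5114 fall in 2000–2013 and 2557 in 2014–2020, which gives the two-thirds to one-third ratio stated above.</p>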
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Approach and models</title>
      <p id="d2e534">We aim to learn the NDVI behavior of forests as a proxy for ecosystem response to climate drivers. As features, we use air temperature (mean, minimum, and maximum), precipitation, mean sea level pressure, and global solar radiation. We assume that the knowledge of the target variable is constrained to a certain period, after which only the features are available. Our goal is to build a model that, using those features, can predict the target variable. This is achieved by training the models on the available time interval, which comprises the years 2000–2013 (inclusive) for this study. After the training, the models only use feature variables to predict the NDVI for the remaining period, 2014–2020. Further details on the training setup can be found in Appendix <xref ref-type="sec" rid="App1.Ch1.S2"/>.</p>

      <fig id="Ch1.F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e541"><italic>Training methods</italic>. Both diagrams illustrate input and predicted data denoted by <inline-formula><mml:math id="M6" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M7" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, respectively. Diagram <bold>(a)</bold> presents the conventional approach used by RNNs, GRUs, and LSTMs, involving backpropagation through time (BPTT). Here, training encompasses the input layer (comprising three neurons in this instance), stacked recurrent layers (two layers, each with four neurons), and the output layer (two neurons), and all these neurons are trained through backpropagation. In contrast, diagram <bold>(b)</bold> portrays the training methodology of ESNs. The initial two layers are randomly generated and remain untrained, while only the final layer undergoes one-shot training via linear regression (LR). The circular arrow atop signifies the recurrent layers. The number of neurons and internal recurrent layers serves visualization purposes only.</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f02.png"/>

        </fig>

      <p id="d2e589">The task can be formalized as follows. We aim to approximate the target variable <inline-formula><mml:math id="M8" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> using input data <inline-formula><mml:math id="M9" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. In the context of this study, we retain a unidimensional approach with <inline-formula><mml:math id="M10" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>, given that we have a single target variable, the NDVI. We assume that both <inline-formula><mml:math id="M11" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M12" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are components of the same dynamical system. 
Under this assumption, the behavior <inline-formula><mml:math id="M13" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is influenced by <inline-formula><mml:math id="M14" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, allowing us to leverage the information from <inline-formula><mml:math id="M15" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> to estimate <inline-formula><mml:math id="M16" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Our setup consists of two sets of data, <inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M18" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which can be measured over a given time period <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.33em"/><mml:mi mathvariant="normal">…</mml:mi><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>. 
After time step <inline-formula><mml:math id="M20" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:math></inline-formula>, only <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> remains measurable, and the goal of the recurrent architectures is to estimate <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> based solely on the available <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> data <xref ref-type="bibr" rid="bib1.bibx77" id="paren.59"/>. To summarize, the models use meteorological data of a given day to predict the response of the vegetation on that same day. Despite having different training methods, all the models in this study are built to perform this task.</p>
      <p id="d2e844">We consider four different recurrent architectures: recurrent neural networks (RNNs), long short-term memory networks (LSTMs), gated recurrent-unit networks (GRUs), and echo state networks (ESNs). Each of these models presents an internal state <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, which encodes the temporal dependencies of the input data <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:math></inline-formula> in our study, representing the dimension of the input data. The internal size <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is chosen to be <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>&gt;</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. A defining feature of these architectures is the recursive transmission of internal states, facilitating historical data retention as the model progresses through subsequent steps. 
The evolution of the internal state of a generic RNN is given by <xref ref-type="bibr" rid="bib1.bibx128" id="paren.60"/>
            <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M29" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>H</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>;</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M30" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> represents the weights and biases of the model, also called parameters, and <inline-formula><mml:math id="M31" display="inline"><mml:mi>H</mml:mi></mml:math></inline-formula> represents a generic RNN update function.</p>
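      <p>As an illustration of Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>), the following sketch implements one possible choice of the update function <italic>H</italic>, an Elman-type cell with a tanh nonlinearity. The weight names, the hidden size, and the random inputs are assumptions made for this example only and do not reflect the configuration used in this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def rnn_step(x_prev, u_t, theta):
    """One application of x(t) = H(x(t-1), u(t); theta) for an Elman-type cell."""
    W_in, W_rec, b = theta
    return np.tanh(W_in @ u_t + W_rec @ x_prev + b)

def run_rnn(U, theta, x0):
    """Apply the update recursively over a (T, d_u) input sequence."""
    x = x0
    states = []
    for u_t in U:
        x = rnn_step(x, u_t, theta)
        states.append(x)
    return np.stack(states)  # shape (T, d_x)

rng = np.random.default_rng(0)
d_u, d_x, T = 6, 16, 50  # d_u = 6 as in the text; d_x = 16 is an illustrative choice
theta = (
    rng.normal(scale=0.3, size=(d_x, d_u)),  # hypothetical input weights
    rng.normal(scale=0.3, size=(d_x, d_x)),  # hypothetical recurrent weights
    np.zeros(d_x),                           # bias
)
U = rng.normal(size=(T, d_u))  # synthetic stand-in for the meteorological inputs
X = run_rnn(U, theta, x0=np.zeros(d_x))
```
]]></preformat>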
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Training – backpropagation through time</title>
      <p id="d2e1018">Deep learning models like feed-forward neural networks (FFNNs) adjust their weights at each time step <inline-formula><mml:math id="M32" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> during training using a method called backpropagation (BP) <xref ref-type="bibr" rid="bib1.bibx114" id="paren.61"/>. BP relies on a “loss function” <inline-formula><mml:math id="M33" display="inline"><mml:mi mathvariant="script">L</mml:mi></mml:math></inline-formula> given by  <inline-formula><mml:math id="M34" display="inline"><mml:mrow><mml:mi mathvariant="script">L</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>T</mml:mi></mml:msubsup><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M35" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> stands for the network's parameters. Leveraging the loss function, BP minimizes the difference between the model's predicted and actual output by adjusting the model's weights. To do this, BP calculates the gradient of the loss function with respect to each weight and then updates the network weights in a direction that minimizes the loss. 
One of the most common approaches to minimizing the loss function is stochastic gradient descent (SGD) <xref ref-type="bibr" rid="bib1.bibx11" id="paren.62"/>. However, one of BP's limitations is that it does not account for time dependencies.</p>
      <p id="d2e1112">For time-dependent models, such as RNNs, LSTMs, and GRUs, handling sequential data poses additional challenges. Backpropagation through time (BPTT) <xref ref-type="bibr" rid="bib1.bibx114" id="paren.63"/> is a specialized training method designed for these architectures <xref ref-type="bibr" rid="bib1.bibx138" id="paren.64"/>. Central to BPTT is the notion of “unrolling” the network over time, effectively transforming it into an FFNN where backpropagation can be applied. This allows the model to calculate errors and update weights across the entire sequence, making it possible to capture long-term dependencies and patterns in time series data.</p>
      <p id="d2e1121">However, applying BPTT across the entire sequence can be computationally intense and time-consuming. Moreover, the error signal propagated back over many time steps often shrinks to a very small amount by the time it reaches the other end of the sequence. To limit these issues, we use a truncated version of BPTT, which limits the backpropagation to a fixed number of steps, denoted by <inline-formula><mml:math id="M36" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula> <xref ref-type="bibr" rid="bib1.bibx139" id="paren.65"/>, which is smaller than the total number of training time steps, <inline-formula><mml:math id="M37" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> <xref ref-type="bibr" rid="bib1.bibx1" id="paren.66"/>.</p>
      <p id="d2e1144">This truncated approach improves efficiency but requires transferring the last hidden state of each truncated section to the initial state of the following one. This step maintains some memory and ensures the continuity of the network's training process.</p>
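      <p>The hand-over of the hidden state between truncated sections can be sketched as follows. The example performs only the forward pass, with NumPy and an assumed tanh cell: it splits a sequence into windows of <italic>k</italic> steps and starts each window from the final state of the previous one, so that the concatenated states match those of an unsplit pass. In an actual truncated BPTT implementation, the gradient would additionally be cut at each window boundary; that part is omitted here, and all names and sizes are illustrative.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def forward(U, x0, W_in, W_rec):
    """Run an assumed tanh recurrent cell over U (T, d_u); return all states."""
    x, states = x0, []
    for u_t in U:
        x = np.tanh(W_in @ u_t + W_rec @ x)
        states.append(x)
    return np.stack(states)

def forward_in_windows(U, x0, W_in, W_rec, k):
    """Process U in windows of at most k steps, carrying the last state over."""
    x, chunks = x0, []
    for start in range(0, len(U), k):
        states = forward(U[start:start + k], x, W_in, W_rec)
        chunks.append(states)
        x = states[-1]  # becomes the initial state of the next window
        # In truncated BPTT, gradients would be computed for this window only
        # and the carried state treated as a constant (not shown here).
    return np.concatenate(chunks)

rng = np.random.default_rng(1)
d_u, d_x, T, k = 6, 8, 100, 16  # illustrative sizes; k is smaller than T
W_in = rng.normal(scale=0.3, size=(d_x, d_u))
W_rec = rng.normal(scale=0.3, size=(d_x, d_x))
U = rng.normal(size=(T, d_u))
x0 = np.zeros(d_x)

full = forward(U, x0, W_in, W_rec)
windowed = forward_in_windows(U, x0, W_in, W_rec, k)
```
]]></preformat>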
      <p id="d2e1148">The final output of all the models considered in this study comes from a feed-forward layer. The parameters of this layer are also trained using BP. The following equation describes the feed-forward layer:
            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M38" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>v</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>v</mml:mi></mml:msup><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the predicted output, <inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the hidden state of the model at time step <inline-formula><mml:math id="M41" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>v</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the weight matrix, and <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>v</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is a bias vector. Additionally, <inline-formula><mml:math id="M44" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> represents the activation function. 
This procedure is illustrated graphically in Fig. <xref ref-type="fig" rid="Ch1.F2"/>a. Finally, the full details of the models are given in Appendix <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/> for the RNNs, Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS2"/> for the LSTMs, and Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS3"/> for the GRUs. The RNNs, LSTMs, and GRUs have been implemented using the <monospace>PyTorch</monospace> library <xref ref-type="bibr" rid="bib1.bibx98" id="paren.67"/> accessed through <monospace>Skorch</monospace> <xref ref-type="bibr" rid="bib1.bibx130" id="paren.68"/>.</p>
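      <p>A minimal sketch of the feed-forward layer in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>) is given here to make the shapes of the quantities explicit. The sigmoid used for the activation function and the random values are assumptions for illustration; this section does not prescribe a specific activation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def readout(x_t, W_v, b_v):
    """Map a hidden state x(t) of size d_x to a prediction of size d_v."""
    z = W_v @ x_t + b_v
    return 1.0 / (1.0 + np.exp(-z))  # assumed sigmoid activation

rng = np.random.default_rng(2)
d_x, d_v = 16, 1                   # d_v = 1: a single target, the NDVI
W_v = rng.normal(size=(d_v, d_x))  # weight matrix, shape (d_v, d_x)
b_v = np.zeros(d_v)                # bias vector, shape (d_v,)
x_t = rng.normal(size=d_x)         # stand-in for a hidden state
v_hat = readout(x_t, W_v, b_v)
```
]]></preformat>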
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Training – echo state approach</title>
      <p id="d2e1348">Echo state networks <xref ref-type="bibr" rid="bib1.bibx57" id="paren.69"><named-content content-type="pre">ESNs; </named-content></xref>, along with liquid state machines <xref ref-type="bibr" rid="bib1.bibx78" id="paren.70"/>, belong to the larger family of reservoir computing (RC) models, based on a shared theoretical background <xref ref-type="bibr" rid="bib1.bibx132" id="paren.71"/>. The fundamental idea of ESNs is to project the training data into a higher-dimensional, nonlinear system named the “reservoir” through an input layer. This process transforms the input data into vectors called “states”. After the data pass through the reservoir, the states are collected. The model is then trained by regressing these states against the target data.</p>
      <p id="d2e1362">More specifically, the ESNs have three layers: an input layer <inline-formula><mml:math id="M45" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, a reservoir layer <inline-formula><mml:math id="M46" display="inline"><mml:mi mathvariant="bold">W</mml:mi></mml:math></inline-formula>, and an output layer <inline-formula><mml:math id="M47" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>out</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>. A distinctive feature of ESNs is that the weights in the input and reservoir layers are fixed. These weights are generated randomly and do not change during training. This approach contrasts with conventional neural networks, where weights are continuously updated during training to reduce errors. For each training time step <inline-formula><mml:math id="M48" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, the hidden states, indicated as <inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, are preserved and accumulated in a matrix <inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:mi mathvariant="bold">X</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi 
mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. Referred to as the “state matrix”, this matrix effectively represents the system's dynamics. The last layer of the ESN is obtained through a linear regression operation that uses the state matrix to generate a feed-forward layer, creating the network's output layer:
            <disp-formula id="Ch1.E3" content-type="numbered"><label>3</label><mml:math id="M51" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>out</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="bold">Y</mml:mi><mml:mtext>target</mml:mtext></mml:msup><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mo>⊤</mml:mo></mml:msup><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">XX</mml:mi><mml:mo>⊤</mml:mo></mml:msup><mml:mo>+</mml:mo><mml:mi mathvariant="italic">β</mml:mi><mml:mi mathvariant="bold">I</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M52" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula> is the identity matrix, and <inline-formula><mml:math id="M53" display="inline"><mml:mi mathvariant="italic">β</mml:mi></mml:math></inline-formula> is a regularization coefficient. The matrix <inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">Y</mml:mi><mml:mtext>target</mml:mtext></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is generated with the desired output <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> stacked column-wise. While layers of recurrent models trained through BPTT can be stacked, ESNs are usually computed from a single inner layer (reservoir).</p>
      <p id="d2e1566">The construction and training of the ESN allow for the fastest computational time among all the proposed models while also avoiding the vanishing and exploding gradient problems, since no derivatives are taken at any step of the process. An illustration of the ESN approach to training is provided in Fig. <xref ref-type="fig" rid="Ch1.F2"/>b. The full details for the ESNs are given in Appendix <xref ref-type="sec" rid="App1.Ch1.S1.SS4"/>. For the implementations of the ESNs, we relied on the Julia package <monospace>ReservoirComputing.jl</monospace> <xref ref-type="bibr" rid="bib1.bibx86" id="paren.72"/>.</p>
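      <p>The ESN training procedure can be sketched in a few lines of NumPy. This is an illustrative toy and not the <monospace>ReservoirComputing.jl</monospace> implementation used in the study: the reservoir size, the scaling of the random weights, the tanh state update, and the synthetic data are all assumptions. Only the output layer is computed, through the ridge regression of Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>).</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

rng = np.random.default_rng(3)
d_u, d_x, d_v, T = 6, 50, 1, 300  # illustrative sizes
beta = 1e-3                       # regularization coefficient (assumed value)

# Fixed random input and reservoir weights; they are never trained.
W_in = rng.uniform(-0.5, 0.5, size=(d_x, d_u))
W = rng.uniform(-0.5, 0.5, size=(d_x, d_x))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # assumed spectral radius of 0.9

U = rng.normal(size=(d_u, T))  # synthetic inputs, one column per time step
# Toy target of shape (d_v, T), depending on current and lagged inputs.
Y_target = np.tanh(U[:1] + 0.5 * np.roll(U[1:2], 1, axis=1))

# Drive the reservoir and collect the states column-wise in X (d_x, T).
X = np.zeros((d_x, T))
x = np.zeros(d_x)
for t in range(T):
    x = np.tanh(W_in @ U[:, t] + W @ x)
    X[:, t] = x

# W_out = Y X^T (X X^T + beta I)^(-1), solved without forming the inverse.
A = X @ X.T + beta * np.eye(d_x)              # symmetric, shape (d_x, d_x)
W_out = np.linalg.solve(A, X @ Y_target.T).T  # shape (d_v, d_x)

Y_fit = W_out @ X  # fitted output on the training period
```
]]></preformat>
      <p>Solving the linear system instead of inverting the matrix explicitly is a numerical convenience; both give the same output weights up to rounding.</p>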
</sec>
<sec id="Ch1.S2.SS5">
  <label>2.5</label><title>Anomalies and extremes in vegetation</title>
      <p id="d2e1588">In this study, we adopt the approach outlined by <xref ref-type="bibr" rid="bib1.bibx76" id="text.73"/> to define anomalies based on the seasonal variability observed in the signal. The process described in this section is depicted in Fig. <xref ref-type="fig" rid="Ch1.F3"/>. The anomalies at a given time <inline-formula><mml:math id="M56" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula>, denoted by <inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:mi>A</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, are defined as follows:
            <disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M58" display="block"><mml:mrow><mml:mi>A</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>s</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mover accent="true"><mml:mi>s</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
          Here, <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:mi>s</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the signal at time <inline-formula><mml:math id="M60" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi>s</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the mean of the signal at that time, and <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> refers to its standard deviation. In this context, the time variable <inline-formula><mml:math id="M63" display="inline"><mml:mi>l</mml:mi></mml:math></inline-formula> denotes a specific day of the year, e.g., the third of March, and the mean and standard deviation for that day are computed across multiple years. The normalization process ensures that the signal exhibits a zero mean, which facilitates identifying extreme events as data points that fall outside a specified distribution range.</p>

      <fig id="Ch1.F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e1730"><italic>Definition of extreme events</italic>. In <bold>(a)</bold>, the NDVI of the dataset is plotted at an example location, IT-La2. The dotted gray line in 2014 represents the separation between training and testing data. The training data have length <inline-formula><mml:math id="M64" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> (dotted line). In <bold>(b)</bold>, we show each yearly cycle of the training data: on the left are the full signals, and on the right are the mean and standard deviation (SD). In <bold>(c)</bold>, we show the result of the normalization of the training dataset using the mean and SD obtained in the previous step. At this stage, we also define the quantile (fixed at 0.90 in this example) and define the values of the thresholds <inline-formula><mml:math id="M65" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>+</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M66" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>. These values are then used in <bold>(d)</bold>, where we identify the extremes in the normalized testing data. The normalization is also done using the mean and SD of the training data. Finally, in <bold>(e)</bold>, the testing data are again shown in their raw form to showcase the extreme response of the vegetation. The extreme responses are highlighted by a red-shaded area.</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f03.png"/>

        </fig>

      <p id="d2e1786">After normalizing the data and delineating the anomalies, we characterize extreme events as data points falling outside a specific quantile, chosen to be 0.90, 0.91, <inline-formula><mml:math id="M67" display="inline"><mml:mi mathvariant="normal">…</mml:mi></mml:math></inline-formula>, or 0.99. Based on the selected quantile, we define two threshold parameters <inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>+</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M69" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> to represent positive and negative extremes, respectively. We determine these parameters individually for each site involved in our study.</p>
      <p id="d2e1819">During the training phase, which spans 2000 to 2014, we determine the threshold values <inline-formula><mml:math id="M70" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>+</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>. We apply these threshold values to the test dataset comprising the years from 2014 to 2020 to determine the extreme events in this latter dataset. The normalization of the data in the test datasets is done with the mean and SD values obtained from the training dataset.</p>
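      <p>The procedure of this section can be sketched as follows. The example uses a synthetic daily signal and makes several assumptions that are not taken from the study: years are simplified to 365 days, the thresholds are taken as the <italic>q</italic> and 1 − <italic>q</italic> quantiles of the training anomalies, and all function and variable names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def seasonal_stats(signal, doy, n_days=365):
    """Multi-year mean and SD of the signal for every day of the year."""
    mean = np.array([signal[doy == d].mean() for d in range(n_days)])
    sd = np.array([signal[doy == d].std() for d in range(n_days)])
    return mean, sd

def anomalies(signal, doy, mean, sd):
    """A(l) = (s(l) - mean(l)) / sd(l), with l the day of the year."""
    return (signal - mean[doy]) / sd[doy]

rng = np.random.default_rng(4)
n_train_years, n_test_years, n_days = 14, 6, 365
doy_train = np.tile(np.arange(n_days), n_train_years)
doy_test = np.tile(np.arange(n_days), n_test_years)

def synthetic_ndvi(doy):
    """Seasonal cycle plus noise; a stand-in for an observed NDVI series."""
    cycle = 0.5 + 0.3 * np.sin(2 * np.pi * doy / n_days)
    return cycle + rng.normal(0, 0.05, doy.size)

s_train, s_test = synthetic_ndvi(doy_train), synthetic_ndvi(doy_test)

# Statistics and thresholds come from the training period only.
mean, sd = seasonal_stats(s_train, doy_train)
a_train = anomalies(s_train, doy_train, mean, sd)
q = 0.90
kappa_plus = np.quantile(a_train, q)       # assumed definition of the upper threshold
kappa_minus = np.quantile(a_train, 1 - q)  # assumed definition of the lower threshold

# Test data are normalized with the training mean and SD, then thresholded.
a_test = anomalies(s_test, doy_test, mean, sd)
positive_extremes = a_test > kappa_plus
negative_extremes = a_test < kappa_minus
```
]]></preformat>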
</sec>
<sec id="Ch1.S2.SS6">
  <label>2.6</label><title>Metrics</title>
<sec id="Ch1.S2.SS6.SSS1">
  <label>2.6.1</label><title>General metrics</title>
      <p id="d2e1859">In this study, we evaluated the predictive accuracy of our model using two primary metrics: the normalized root mean square error (NRMSE) and the symmetric mean absolute percentage error (SMAPE).</p>
      <p id="d2e1862">The NRMSE is derived from the root mean square error (RMSE) <xref ref-type="bibr" rid="bib1.bibx56" id="paren.74"/> and is defined as
              <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M72" display="block"><mml:mrow><mml:mtext>NRMSE</mml:mtext><mml:mo>=</mml:mo><mml:msqrt><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mo>(</mml:mo><mml:mover accent="true"><mml:mi>v</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mi>v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mi>n</mml:mi></mml:mfrac></mml:mstyle></mml:msqrt><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mtext>max</mml:mtext></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mtext>min</mml:mtext></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:mi>v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the observed target variable at the <inline-formula><mml:math id="M74" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>th observation among a total of <inline-formula><mml:math id="M75" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> observations, and <inline-formula><mml:math id="M76" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi>v</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the corresponding model prediction. To facilitate comparisons across different sites, we normalized this metric using the range of the observed data, which is computed as the difference between the maximum <inline-formula><mml:math id="M77" display="inline"><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mtext>max</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> and minimum <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:msub><mml:mi>v</mml:mi><mml:mtext>min</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> observed values, with <inline-formula><mml:math id="M79" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e2036">In addition to NRMSE, we use SMAPE to assess the predictive performance of our model. SMAPE, which is a dimension-agnostic measure, is given by the formula
              <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M80" display="block"><mml:mrow><mml:mtext>SMAPE</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">100</mml:mn><mml:mi>n</mml:mi></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>|</mml:mo><mml:mi>v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mover accent="true"><mml:mi>v</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>v</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>|</mml:mo><mml:mo>+</mml:mo><mml:mo>|</mml:mo><mml:mover accent="true"><mml:mi>v</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M81" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> denotes the number of observations. This metric was chosen based on its ability to provide a symmetric measurement of the absolute percentage error, thereby affording a balanced view of the forecast accuracy <xref ref-type="bibr" rid="bib1.bibx83 bib1.bibx56" id="paren.75"/>. Both of the metrics proposed in this section indicate better model performance as their value approaches zero.</p>
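      <p>Both metrics are straightforward to compute; the sketch below follows Eqs. (<xref ref-type="disp-formula" rid="Ch1.E5"/>) and (<xref ref-type="disp-formula" rid="Ch1.E6"/>) directly. The function names and the small arrays are illustrative only and are not data from this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def nrmse(v, v_hat):
    """RMSE divided by the range of the observations."""
    v, v_hat = np.asarray(v, float), np.asarray(v_hat, float)
    rmse = np.sqrt(np.mean((v_hat - v) ** 2))
    return rmse / (v.max() - v.min())

def smape(v, v_hat):
    """Symmetric mean absolute percentage error, in percent."""
    v, v_hat = np.asarray(v, float), np.asarray(v_hat, float)
    return 100.0 * np.mean(np.abs(v - v_hat) / (np.abs(v) + np.abs(v_hat)))

v_obs = np.array([0.2, 0.4, 0.6, 0.8])       # made-up observations
v_pred = np.array([0.25, 0.35, 0.65, 0.75])  # made-up predictions
print(nrmse(v_obs, v_pred), smape(v_obs, v_pred))
```
]]></preformat>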
</sec>
<sec id="Ch1.S2.SS6.SSS2">
  <label>2.6.2</label><title>Entropy–complexity plots</title>
      <p id="d2e2146">In this study, we also use information-theory-based quantifiers to analyze the model's residuals, defined as the differences between the model's prediction and the actual measurements. Based on the approach proposed by <xref ref-type="bibr" rid="bib1.bibx120" id="text.76"/>, we employ entropy–complexity (EC) plots. These plots return a visual representation of the amount of information still present in the residuals. Higher information content would indicate that the models do not sufficiently approximate the target variable. On the other hand, values of EC closer to white noise would suggest that the models fully reproduce the target variable's behavior. We illustrate this approach, closely following the exposition provided by <xref ref-type="bibr" rid="bib1.bibx120" id="text.77"/>.</p>
      <p id="d2e2155">To generate the EC plots, we consider a metric <inline-formula><mml:math id="M82" display="inline"><mml:mrow><mml:mi mathvariant="script">H</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> of a probability distribution <inline-formula><mml:math id="M83" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>;</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>N</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, with <inline-formula><mml:math id="M84" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> possible states and with <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:msubsup><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>, to quantify the information content of a time series. One such metric is the Shannon entropy <inline-formula><mml:math id="M86" display="inline"><mml:mrow><mml:mi mathvariant="script">S</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>,
              <disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M87" display="block"><mml:mrow><mml:mi mathvariant="script">S</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mtext>ln</mml:mtext><mml:mo>[</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            which attains its maximum, <inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:mi mathvariant="script">S</mml:mi><mml:mo>[</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="script">S</mml:mi><mml:mtext>max</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:mtext>ln</mml:mtext><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula>, for the uniform distribution <inline-formula><mml:math id="M89" display="inline"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:mo>;</mml:mo><mml:mo>∀</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mi mathvariant="normal">…</mml:mi><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mo>,</mml:mo><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mi>N</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>. We can then define the normalized entropy
              <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M90" display="block"><mml:mrow><mml:mi mathvariant="script">H</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="script">S</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi mathvariant="script">S</mml:mi><mml:mtext>max</mml:mtext></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            which is the first metric used in the EC plots. In addition to the information content of a time series, we are interested in quantifying the complexity <inline-formula><mml:math id="M91" display="inline"><mml:mrow><mml:mi mathvariant="script">C</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. Following <xref ref-type="bibr" rid="bib1.bibx75" id="text.78"/>, we use a definition of complexity <inline-formula><mml:math id="M92" display="inline"><mml:mrow><mml:mi mathvariant="script">C</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, which is the product of a measure of information, such as entropy <inline-formula><mml:math id="M93" display="inline"><mml:mrow><mml:mi mathvariant="script">H</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, and a measure of disequilibrium <inline-formula><mml:math id="M94" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mi>J</mml:mi></mml:msub><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>.
              <disp-formula id="Ch1.E9" content-type="numbered"><label>9</label><mml:math id="M95" display="block"><mml:mrow><mml:mi mathvariant="script">C</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mi>J</mml:mi></mml:msub><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mi mathvariant="script">H</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></disp-formula>
            In this context, disequilibrium takes the meaning of distance from the uniform distribution of the available states of a system. The definition of disequilibrium <inline-formula><mml:math id="M96" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mi>J</mml:mi></mml:msub><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> makes use of the Jensen–Shannon divergence <inline-formula><mml:math id="M97" display="inline"><mml:mrow><mml:mi mathvariant="script">J</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>, which quantifies the difference between probability distributions <xref ref-type="bibr" rid="bib1.bibx49" id="paren.79"/>, and it is defined as
              <disp-formula id="Ch1.E10" content-type="numbered"><label>10</label><mml:math id="M98" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mi>J</mml:mi></mml:msub><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mi mathvariant="script">J</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mfenced close="}" open="{"><mml:mrow><mml:mi mathvariant="script">S</mml:mi><mml:mfenced open="[" close="]"><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:mo>(</mml:mo><mml:mi mathvariant="script">S</mml:mi><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>]</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="script">S</mml:mi><mml:mo>[</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M99" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is a normalization constant <xref ref-type="bibr" rid="bib1.bibx67 bib1.bibx111" id="paren.80"/> chosen such that <inline-formula><mml:math id="M100" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">Q</mml:mi><mml:mi>J</mml:mi></mml:msub><mml:mo>[</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>∈</mml:mo><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e2719">The computations of entropy and complexity rely on the probability distribution associated with the data. To determine this distribution, we use the method proposed by <xref ref-type="bibr" rid="bib1.bibx3" id="text.81"/>, which analyzes time series data by comparing neighboring values. It divides the data into a set of patterns based on the order of the values and then calculates the probability of each pattern occurring <xref ref-type="bibr" rid="bib1.bibx110" id="paren.82"/>. The patterns are obtained by embedding the time series in a <inline-formula><mml:math id="M101" display="inline"><mml:mi>D</mml:mi></mml:math></inline-formula>-dimensional space with a time lag <inline-formula><mml:math id="M102" display="inline"><mml:mi mathvariant="italic">τ</mml:mi></mml:math></inline-formula>. We set <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M104" display="inline"><mml:mrow><mml:mi mathvariant="italic">τ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>, as proposed by <xref ref-type="bibr" rid="bib1.bibx111" id="text.83"/> and <xref ref-type="bibr" rid="bib1.bibx120" id="text.84"/>. The theoretical upper and lower bounds of the complexity measure can then be computed <xref ref-type="bibr" rid="bib1.bibx84" id="paren.85"/> and are shown in the plots. The calculations for the EC plots were performed with the Julia package <monospace>ComplexityMeasures.jl</monospace> <xref ref-type="bibr" rid="bib1.bibx50" id="paren.86"/> from <monospace>DynamicalSystems.jl</monospace> <xref ref-type="bibr" rid="bib1.bibx25" id="paren.87"/>.</p>
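      <p>As an illustration only, the quantities defined above can be sketched in a few lines of Python. This is not the <monospace>ComplexityMeasures.jl</monospace> implementation used in this study: the function names are hypothetical, ties within a window are resolved by time order (an assumption), and the disequilibrium is taken as the Jensen–Shannon divergence to the uniform distribution divided by its maximum value, which is reached when a single state has probability one.</p>
      <preformat preformat-type="code">
```python
import math
import random
from collections import Counter
from itertools import permutations


def ordinal_distribution(series, D=6, tau=1):
    """Relative frequencies of all D! ordinal patterns (embedding D, lag tau)."""
    n_windows = len(series) - (D - 1) * tau
    counts = Counter()
    for start in range(n_windows):
        window = [series[start + k * tau] for k in range(D)]
        # A pattern is the index order that sorts the window; ties keep time order.
        counts[tuple(sorted(range(D), key=lambda k: window[k]))] += 1
    return [counts[pattern] / n_windows for pattern in permutations(range(D))]


def shannon(P):
    """Shannon entropy S[P] (Eq. 7), with the convention 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in P if p > 0)


def entropy_complexity(P):
    """Return (H, C): normalized entropy (Eq. 8) and complexity (Eq. 9)."""
    N = len(P)
    Pe = [1.0 / N] * N
    H = shannon(P) / math.log(N)
    mix = [(p + q) / 2 for p, q in zip(P, Pe)]
    J = shannon(mix) - 0.5 * (shannon(P) + shannon(Pe))
    # Largest possible J, obtained when P puts all mass on one state.
    J_max = -0.5 * ((N + 1) / N * math.log(N + 1) - 2 * math.log(2 * N) + math.log(N))
    return H, (J / J_max) * H


# Hypothetical demo: uncorrelated noise versus a strictly increasing series.
rng = random.Random(0)
noise = [rng.random() for _ in range(20000)]
print(entropy_complexity(ordinal_distribution(noise)))
print(entropy_complexity(ordinal_distribution(list(range(200)))))
```
      </preformat>
      <p>In this sketch, uncorrelated noise is expected to yield an entropy close to one and a complexity close to zero, whereas a strictly increasing series occupies a single ordinal pattern and therefore has zero entropy.</p>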
</sec>
<sec id="Ch1.S2.SS6.SSS3">
  <label>2.6.3</label><title>Extremes as binary events</title>
      <p id="d2e2797">To analyze extreme events, we set thresholds <inline-formula><mml:math id="M105" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>+</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>, as described earlier (in Sect. <xref ref-type="sec" rid="Ch1.S2.SS5"/>), to identify values as either extremes or not extremes. We adopt this binary approach for both the observed data and the data predicted by the models.</p>
      <p id="d2e2824">Following the method outlined by <xref ref-type="bibr" rid="bib1.bibx55" id="text.88"/>, we classify the outcomes into four categories: hits <inline-formula><mml:math id="M107" display="inline"><mml:mi>a</mml:mi></mml:math></inline-formula>, where the model correctly identifies an extreme event; false alarms <inline-formula><mml:math id="M108" display="inline"><mml:mi>b</mml:mi></mml:math></inline-formula>, where the model incorrectly flags a value as an extreme event; misses <inline-formula><mml:math id="M109" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula>, where the model fails to identify an extreme event; and correct rejections <inline-formula><mml:math id="M110" display="inline"><mml:mi>d</mml:mi></mml:math></inline-formula>, where the model correctly identifies a non-extreme event.</p>
      <p id="d2e2858">Given <inline-formula><mml:math id="M111" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> observation points, we employ the following metrics to assess the model's performance in detecting extremes <xref ref-type="bibr" rid="bib1.bibx4" id="paren.89"/>. We define the probability of detection (POD) as the ratio of correctly identified extreme events to the total number of extreme events, the probability of false detection (POFD) as the ratio of incorrectly flagged extreme events to all events that are not extreme, the probability of false alarms (POFA) as the ratio of false alarms to all predicted extreme events, and the proportion correct (PC) as the ratio of all accurate predictions (both hits and correct rejections) to the total number of observations, given by
              <disp-formula id="Ch1.E11" content-type="numbered"><label>11</label><mml:math id="M112" display="block"><mml:mtable class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mtext>POD</mml:mtext></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>a</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo><mml:mspace width="0.33em" linebreak="nobreak"/><mml:mtext>POFD</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>b</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.33em"/><mml:mtext>POFA</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>b</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mi>b</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mtext>PC</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi></mml:mrow><mml:mi>n</mml:mi></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
            respectively. These metrics are standard tools for evaluating deep learning models for predicting atmospheric variables such as wind speed <xref ref-type="bibr" rid="bib1.bibx116" id="paren.90"/> and precipitation <xref ref-type="bibr" rid="bib1.bibx119" id="paren.91"/>.</p>
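      <p>The four scores can be illustrated with a minimal Python sketch. It assumes that the observed and predicted series have already been converted to binary flags using the thresholds; the function name and the choice to report an undefined ratio as NaN are assumptions made for this example and are not taken from the study's code.</p>
      <preformat preformat-type="code">
```python
def contingency_scores(observed, predicted):
    """Compute POD, POFD, POFA and PC (Eq. 11) from two boolean sequences."""
    pairs = list(zip(observed, predicted))
    a = sum(1 for o, p in pairs if o and p)          # hits
    b = sum(1 for o, p in pairs if not o and p)      # false alarms
    c = sum(1 for o, p in pairs if o and not p)      # misses
    d = sum(1 for o, p in pairs if not o and not p)  # correct rejections
    n = a + b + c + d

    def ratio(numerator, denominator):
        # An empty denominator leaves the score undefined; report NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "POD": ratio(a, a + c),
        "POFD": ratio(b, b + d),
        "POFA": ratio(b, a + b),
        "PC": ratio(a + d, n),
    }


# Hypothetical flags for ten time steps (True marks an extreme).
observed = [True, True, True] + [False] * 7
predicted = [True, False, False, True] + [False] * 6
scores = contingency_scores(observed, predicted)
print(scores)
```
      </preformat>
      <p>With these hypothetical flags there is one hit, one false alarm, two misses, and six correct rejections. The example shows how a model can reach a high PC while detecting few extremes, because correct rejections dominate when extremes are rare.</p>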
</sec>
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Results</title>
      <p id="d2e2973">In the following, we first compare the different models by standard evaluation metrics such as the normalized root mean squared error, NRMSE, and the symmetric mean absolute percentage error, SMAPE (Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/>). Second, we extend the comparison to information-theoretic quantifiers in the entropy–complexity (EC) plane (Sect. <xref ref-type="sec" rid="Ch1.S3.SS2"/>). We perform this comparison for two cases: (i) the full time series (FS), which encompasses all data points in the year, and (ii) the meteorological growing season (GS), which encompasses only the months from May to September. This division helps us differentiate the models' ability to capture the seasonal cycle, dominated by an oscillation, from their performance in the more stable growing season conditions, where smaller variations are likely harder to represent. Last, we compare the models' performance on extreme events (Sect. <xref ref-type="sec" rid="Ch1.S3.SS3"/>).</p>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Comparison of prediction accuracy</title>
      <p id="d2e2989">We present our results in Table <xref ref-type="table" rid="Ch1.T1"/>, showing that the ESN outperformed all other models for both the FS and GS signals, closely followed by the LSTM. While the GRU exhibited comparable results to the LSTM for the FS signal, the differences became more pronounced in the GS signal. In contrast, the RNN consistently delivered the least favorable results across the board and exhibited the highest standard deviation (SD) among the analyzed models. In Fig. <xref ref-type="fig" rid="App1.Ch1.S3.F8"/>, we include the comparison of these metrics per site, which shows great variation in the models' performance across different sites.</p>

      <fig id="Ch1.F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e2998"><italic>Time series and predictions for selected locations</italic>. This figure illustrates the predictive performance of four distinct recurrent architectures at specified NDVI time-series locations. The target time series is shown in black; for each model, the prediction shown is a single run out of a set of 100. Panel <bold>(a)</bold> shows the results obtained at three selected locations: Germany, Italy, Czech Republic (CZ-Stn). Panel <bold>(b)</bold> offers a magnified view of the outcomes at the CZ-Stn location in 2018, with the extremes defined by a 90 % threshold highlighted by a gray area. Note that the predictions generated by the RNNs carry considerable noise, which affects the interpretation of the entropy–complexity plots.</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f04.png"/>

        </fig>

      <p id="d2e3015">The performance rankings of the models remained consistent when transitioning from the FS signal to the GS signal. Notably, the SMAPE metric indicated increased accuracy for all models, while the NRMSE metric suggested decreased accuracy. Similarly, we observe a reduction in the SD for the SMAPE but an increase in the NRMSE. This increase is very noticeable in the GRU and RNN models.</p>
      <p id="d2e3019">The models generally exhibited similar performance, with the ESN yielding slightly better results and the RNN demonstrating the least accurate forecasts. The gated methods showed similar performances. While these metrics provide an initial picture of the models' performance in the task, the results do not indicate a clear best performer among the models. These similarities in the results' metrics emphasize the need for further exploration using alternative evaluation tools.</p>

<table-wrap id="Ch1.T1" specific-use="star"><label>Table 1</label><caption><p id="d2e3025"><italic>Accuracy of the models</italic>. The table illustrates the performance of the four different models across various scenarios. “FS” denotes the full time series, i.e., the full dataset used for predictions without isolating specific events or months. “GS” denotes the “growing season”, which encompasses the peak summer months in addition to May and September. The model with the highest accuracy in each category is highlighted in bold. The arrows pointing down (<inline-formula><mml:math id="M113" display="inline"><mml:mo lspace="0mm">↓</mml:mo></mml:math></inline-formula>) near the metric's name indicate that smaller values represent higher accuracy. The means are calculated as the average of 100 runs for each of the 20 different locations, and then the mean of these 20 location-derived means is shown. Similarly, the standard deviations are calculated for each location over the 100 runs and then averaged to provide the overall standard deviation for each metric.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="6">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3">LSTM</oasis:entry>
         <oasis:entry colname="col4">GRU</oasis:entry>
         <oasis:entry colname="col5">RNN</oasis:entry>
         <oasis:entry colname="col6">ESN</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">FS</oasis:entry>
         <oasis:entry colname="col2">SMAPE <inline-formula><mml:math id="M114" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:mn mathvariant="normal">7.07</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">2.37</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M116" display="inline"><mml:mrow><mml:mn mathvariant="normal">7.15</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">2.35</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M117" display="inline"><mml:mrow><mml:mn mathvariant="normal">8.74</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">3.24</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M118" display="inline"><mml:mrow><mml:mn mathvariant="bold">4.80</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.24</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">NRMSE <inline-formula><mml:math id="M119" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.193</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">5.34</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M121" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.201</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">5.18</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M122" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.236</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">4.97</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M123" display="inline"><mml:mrow><mml:mn mathvariant="bold">0.138</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">3.45</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">GS</oasis:entry>
         <oasis:entry colname="col2">SMAPE <inline-formula><mml:math id="M124" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M125" display="inline"><mml:mrow><mml:mn mathvariant="normal">5.79</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">2.33</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M126" display="inline"><mml:mrow><mml:mn mathvariant="normal">5.96</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">2.01</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M127" display="inline"><mml:mrow><mml:mn mathvariant="normal">7.64</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">2.47</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M128" display="inline"><mml:mrow><mml:mn mathvariant="bold">3.26</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.12</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">NRMSE <inline-formula><mml:math id="M129" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M130" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.263</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">9.00</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M131" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.268</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">11.2</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M132" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.332</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">12.4</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M133" display="inline"><mml:mrow><mml:mn mathvariant="bold">0.157</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">5.24</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
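      <p>The two-stage aggregation described in the caption of Table 1 can be sketched as follows. This is a hypothetical Python illustration: the function name, the site labels, and the numbers are invented, and the use of the sample standard deviation is an assumption.</p>
      <preformat preformat-type="code">
```python
from statistics import mean, stdev


def aggregate(scores_by_location):
    """Average per-location means and per-location SDs over all locations.

    scores_by_location maps a location name to the metric values of its runs.
    """
    location_means = [mean(runs) for runs in scores_by_location.values()]
    # Sample standard deviation over the runs of each location (assumption).
    location_sds = [stdev(runs) for runs in scores_by_location.values()]
    return mean(location_means), mean(location_sds)


# Invented example: two locations with three runs each.
example = {"site_a": [1.0, 2.0, 3.0], "site_b": [4.0, 6.0, 8.0]}
overall_mean, overall_sd = aggregate(example)
print(overall_mean, overall_sd)
```
      </preformat>
      <p>Under this scheme, the reported standard deviation reflects the spread between runs at a given location and not the spread between locations.</p>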

</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Comparison of the entropy–complexity</title>
      <p id="d2e3439">Drawing from information theory, we use entropy–complexity (EC) plots (introduced in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6.SSS2"/>) to examine the residuals, defined as the differences between the model's predictions and measurements. Residuals convey valuable information about a model's performance. In an ideal scenario, where a model perfectly represents a system's dynamics, these residuals would resemble white noise and be positioned in the lower-right corner of the EC plane <xref ref-type="bibr" rid="bib1.bibx120" id="paren.92"/>.</p>
      <p id="d2e3447">Figure <xref ref-type="fig" rid="Ch1.F5"/> shows the EC plots of the models' residuals. Additionally, we show the mean of these metrics per model architecture across all sites. We find that the residuals cluster by model across all locations, with minimal overlap between each model's clusters. The LSTM and GRU models occupy the curve's ascending left side, indicating the presence of more structure in the residuals. In contrast, the ESN and, to a greater extent, the RNN are positioned closer to the white noise region, implying less structure in the residuals.</p>
      <p id="d2e3452">Comparing the FS signal to the GS signal shows no apparent differences in the results, underscoring the consistency of model performance. Based on the position of the RNN residuals in the EC plots, we would expect these models to perform substantially better than the other model architectures. However, inspecting the predicted time series suggests a different explanation: the RNN model predictions show large variability (Fig. <xref ref-type="fig" rid="Ch1.F4"/>), which obscures the underlying dynamics. Because the EC plots capture the noise in the resulting residuals, one can misinterpret the positioning of the RNNs as a favorable result.</p>

      <fig id="Ch1.F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e3460"><italic>Entropy–complexity curves of the models' residuals</italic>. The entropy–complexity plots are computed from the residuals of each model at each location (color). The mean of each model's performance over all locations is shown in black. These plots visualize the amount of information and complexity left in the residuals. Ideal values would reside in the lower-right corner, corresponding to white noise in the residuals. The gray lines denote the theoretical upper and lower bounds.</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f05.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>Extreme events</title>
      <p id="d2e3479">In this section, we use the metrics introduced in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6.SSS3"/> to evaluate the models' ability to capture extreme vegetation dynamics. Figure <xref ref-type="fig" rid="Ch1.F6"/>a shows a distinct division in the probability of detection (POD): the LSTM and ESN models exhibit low values, indicating an inability to detect extreme events, whereas the RNN and GRU models show higher values, indicating better performance. However, the POD values are generally low for all models. Furthermore, all models display some level of overlap in the confidence bands. Finally, the results worsen as the quantile increases. Figure <xref ref-type="fig" rid="Ch1.F6"/>b shows a lower probability of false detection (POFD) for the ESN, indicating that it is better than the other models at avoiding the prediction of extremes when there are none. Conversely, the RNN, which predicts noisy behavior, as shown in Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/>, exhibits higher POFD values. The probability of false alarms (POFA; Fig. <xref ref-type="fig" rid="Ch1.F6"/>c) follows a similar pattern, with the ESN outperforming the other models. The GRU and LSTM display more comparable behaviors, while the RNN lags behind, showing the least favorable performance among all models. For all models, the POFA consistently leans towards the higher end of its possible range. As with the POD, the POFA worsens as the quantile increases. Finally, Fig. <xref ref-type="fig" rid="Ch1.F6"/>d shows the proportion correct (PC). The ESN attains the highest values despite showing the lowest POD. Following closely are the LSTM and GRU, which exhibit similar performance. The RNN shows the poorest performance on all metrics except the POD, emphasizing its limitations in capturing extreme vegetation dynamics.</p>

      <fig id="Ch1.F6"><label>Figure 6</label><caption><p id="d2e3497"><italic>Extremes as binary events across all sites</italic>. The results of quantiles spanning 0.90 to 0.99 are shown. All values are expressed as percentages. In <bold>(a)</bold>, we show the detection percentage (POD), indicating how many extreme events have been detected. In <bold>(b)</bold>, we show the percentage of false detection (POFD). In <bold>(c)</bold> we depict the probability of false alarms (POFA). Finally, in <bold>(d)</bold> we show the proportion correct (PC). The error bars represent the standard deviation over 100 simulations with different initializations for each model. The arrows (up <inline-formula><mml:math id="M134" display="inline"><mml:mo>↑</mml:mo></mml:math></inline-formula> and down <inline-formula><mml:math id="M135" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula>) by the metric's name indicate the direction of the optimal values.</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f06.png"/>

        </fig>

      <p id="d2e3535">To examine the prediction performance for extreme NDVI reductions during the summer season, we focus on June, July, and August using only <inline-formula><mml:math id="M136" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">κ</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula>; the results are shown in Fig. <xref ref-type="fig" rid="Ch1.F7"/>. Figure <xref ref-type="fig" rid="Ch1.F7"/>a shows the probability of detection (POD) for the different models. The RNN performs best, outperforming the LSTM, GRU, and ESN, which deliver similar results. Figure <xref ref-type="fig" rid="Ch1.F7"/>b shows the probability of false detection (POFD). Here, the RNN performs the worst, while the ESN stands out with the best values and the lowest standard deviation (SD) among all models. The gating-based models, LSTM and GRU, perform almost identically. We find similar performances among models in Fig. <xref ref-type="fig" rid="Ch1.F7"/>c. The ESN shows a decrease in the probability of false alarms (POFA) at the 0.92 quantile. Any small gaps in performance observed for lower quantiles close at higher quantiles. Finally, Fig. <xref ref-type="fig" rid="Ch1.F7"/>d shows the ESN as the top-performing model in terms of the proportion correct (PC), with the lowest SD among the models. The gated models, LSTM and GRU, display similar performances, while the RNN ranks as the poorest-performing model in this analysis of NDVI summertime decreases. Figure <xref ref-type="fig" rid="App1.Ch1.S3.F9"/> shows the variability of the results across all analyzed locations.</p>

      <fig id="Ch1.F7"><label>Figure 7</label><caption><p id="d2e3565"><italic>Negative extreme events for summer months across all sites</italic>. Only the negative extremes during the summer period are considered here. The results of quantiles spanning 0.90 to 0.99 are shown. All values are expressed as percentages. In <bold>(a)</bold>, we show the detection percentage (POD), indicating how many extreme events have been detected. In <bold>(b)</bold>, we show the percentage of false detection (POFD). In <bold>(c)</bold>, we depict the probability of false alarms (POFA). Finally, in <bold>(d)</bold>, we show the proportion correct (PC). The error bars represent the standard deviation over 100 simulations with different initializations for each model. The arrows (up <inline-formula><mml:math id="M137" display="inline"><mml:mo>↑</mml:mo></mml:math></inline-formula> and down <inline-formula><mml:math id="M138" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula>) by the metric's name indicate the direction of the optimal values.</p></caption>
          <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f07.png"/>

        </fig>
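      <p>The four scores reported in Figs. <xref ref-type="fig" rid="Ch1.F6"/> and <xref ref-type="fig" rid="Ch1.F7"/> can be illustrated with a minimal sketch. It assumes the standard contingency-table definitions (hits, misses, false alarms, and correct negatives) and assumes that a negative extreme is any value at or below the (1 - q) quantile of the observed series. The function name <monospace>extreme_scores</monospace>, its arguments, and the thresholding rule are illustrative assumptions and not necessarily the implementation used in this study.</p>

```python
import numpy as np


def _percentage(numerator, denominator):
    """Return 100 * numerator / denominator, or NaN when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator > 0 else float("nan")


def extreme_scores(observed, predicted, q=0.95):
    """Contingency-table scores (in percent) for negative extremes.

    Assumption: an event is a value at or below the (1 - q) quantile of the
    observed series; the same threshold is applied to the predictions.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    threshold = np.quantile(observed, 1.0 - q)

    obs_event = threshold >= observed
    pred_event = threshold >= predicted
    obs_quiet = np.logical_not(obs_event)
    pred_quiet = np.logical_not(pred_event)

    hits = int(np.sum(np.logical_and(obs_event, pred_event)))
    misses = int(np.sum(np.logical_and(obs_event, pred_quiet)))
    false_alarms = int(np.sum(np.logical_and(obs_quiet, pred_event)))
    correct_negatives = int(np.sum(np.logical_and(obs_quiet, pred_quiet)))

    return {
        # share of observed extremes that were detected (higher is better)
        "POD": _percentage(hits, hits + misses),
        # share of non-extremes flagged as extremes (lower is better)
        "POFD": _percentage(false_alarms, false_alarms + correct_negatives),
        # share of predicted extremes that did not occur (lower is better)
        "POFA": _percentage(false_alarms, hits + false_alarms),
        # share of all time steps classified correctly (higher is better)
        "PC": _percentage(hits + correct_negatives, observed.size),
    }
```

      <p>Under these assumptions, a prediction identical to a non-constant observed series gives POD and PC of 100 and POFD and POFA of 0, which matches the direction of the arrows in the figure panels.</p>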

</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Discussion</title>
      <p id="d2e3612">We first discuss the performance of the models and their comparison in modeling vegetation greenness in response to climate drivers (Sect. <xref ref-type="sec" rid="Ch1.S4.SS1"/>). Then, we analyze their performance in learning the extremes in NDVI (Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/>).  Finally, we discuss some limitations of this study and future research directions (Sect. <xref ref-type="sec" rid="Ch1.S4.SS3"/>).</p>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Model performance and comparison</title>
      <p id="d2e3628">Similar to previous studies <xref ref-type="bibr" rid="bib1.bibx7 bib1.bibx62" id="paren.93"/>, this work seeks to understand the ability of recurrent neural network architectures to model vegetation greenness. Based on the analyzed metrics, gated architectures and ESNs appear almost equally effective, with ESNs having a slight advantage. The models' performance varies considerably across sites, as illustrated in Fig. <xref ref-type="fig" rid="App1.Ch1.S3.F8"/>. Such variability can explain the overlap in the standard deviation and is probably due to the different dynamics of the locations investigated that span different climate zones and forest types.</p>
      <p id="d2e3636">We also compare the models using entropy–complexity (EC) plots in Fig. <xref ref-type="fig" rid="Ch1.F5"/>. Contrary to the previous metrics, RNNs seem to outperform all other models in the EC plots. However, disproportionately high noise in the RNN predictions skewed the results towards the white noise range. Generally, residuals in the white noise range would indicate the absence of structure in the residuals and point to a good prediction. Further investigation into the actual dynamics of the prediction showed that, in this case, the result was instead due to the high level of noise in the prediction itself. Furthermore, the EC plots clarify the distinction between similarly performing models, i.e., gated architectures and ESNs. While these models show comparable NRMSE and SMAPE values, they are clearly distinct in the EC plots, highlighting the better performance of the ESNs. Thus, our results illustrate that it is crucial not to rely solely on a single set of metrics when evaluating results in similar studies.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Do the models capture extremes?</title>
      <p id="d2e3649">Our analysis of extreme vegetation responses to climate drivers using daily data showed that none of the models notably outperformed the others across all the metrics considered, as seen in Fig. <xref ref-type="fig" rid="Ch1.F6"/>. The ESNs performed well in avoiding false alarms but failed to obtain a high accuracy when predicting extremes, performing similarly to LSTMs. ESNs also showed the lowest standard deviation among all the models considered. RNNs performed worst across all metrics analyzed. Overall, all the models showed poor results for this task. When considering negative extremes in the summer season, which indicate a decrease in the productivity of vegetation, the results again showed that the recurrent architectures could not correctly capture extremes in vegetation responses to climate forcing (Fig. <xref ref-type="fig" rid="Ch1.F7"/>). The ESNs performed best both in avoiding false extremes and in overall accuracy, while the RNNs showed the worst prediction performance.</p>
      <p id="d2e3656">Motivated by recent ESN applications to data from laboratory experiments <xref ref-type="bibr" rid="bib1.bibx92" id="paren.94"/>, we expected ESNs to outperform the other models in the extreme prediction task. Instead, while the ESNs provided good results, these were not outstanding compared to those of the gated models trained with BPTT. Additionally, none of the models showed good results, in contrast to applications in hydrology, where recurrent models have been applied successfully to extremes <xref ref-type="bibr" rid="bib1.bibx38" id="paren.95"/>.</p>
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Limitations and future directions</title>
      <p id="d2e3673">Often, remote sensing studies are subject to a series of data limitations. In this case, the data went through rigorous quality checks and corrections <xref ref-type="bibr" rid="bib1.bibx136" id="paren.96"/>. Nevertheless, when sensed from space in discrete wavelengths, such optical data remain a coarse abstraction of the biological states of the forest under scrutiny. A more critical limitation arises from the availability of daily data spanning only the last 21 years. Although this is a long time span from a remote sensing perspective, it can be regarded as a short period for training machine learning models, which require abundant data points for optimal performance.</p>
      <p id="d2e3679">The metrics we used are known to be sensitive to noise. By using various metrics, we identified limitations in methods such as the EC plots (Fig. <xref ref-type="fig" rid="Ch1.F5"/>). For instance, the noise in the RNN's prediction was incorrectly attributed to residual noise, distorting the outcomes. Moreover, standard metrics like NRMSE and SMAPE provide a narrow view. This is evident when comparing ESNs and LSTMs: even though they had similar scores with these metrics, their distinctions were clear in the EC plots.</p>
      <p id="d2e3684">Most existing studies center on creating a single global model, which is key to understanding performances beyond the training sample <xref ref-type="bibr" rid="bib1.bibx63 bib1.bibx29" id="paren.97"/>. However, we opted to train individual models for each site to understand how, under optimal conditions, each architecture could capture the specific dynamics of an ecosystem without the need for extrapolation. In this way, we could focus on the models' inherent capability to understand each ecosystem's dynamics, and we can be confident that model differences do not arise from different data demands for generalization. Increasing the number of locations and features and expanding the models are necessary future steps for the continued investigation of modeling extremes in vegetation with ML. It has been shown that including additional features and multiple locations improves the ML model performance in hydrological applications <xref ref-type="bibr" rid="bib1.bibx65 bib1.bibx34" id="paren.98"/>. While similar studies do exist in the context of biosphere dynamics <xref ref-type="bibr" rid="bib1.bibx63" id="paren.99"/>, more investigation is needed. For example, including more locations, as highlighted in <xref ref-type="bibr" rid="bib1.bibx65" id="text.100"/>, represents the easiest way to improve performance. Investigating the effects of more locations on the performance of ML for vegetation extremes would be an important contribution to the continued adoption of ML models in predicting biosphere dynamics.</p>
      <p id="d2e3699">ESN and LSTM demonstrated comparable performance in modeling vegetation greenness. However, these models exhibited limited predictive capability for extreme events. Spatial information has proven helpful in similar studies <xref ref-type="bibr" rid="bib1.bibx108 bib1.bibx29 bib1.bibx109 bib1.bibx7 bib1.bibx62" id="paren.101"/>, and it could be beneficial to explore the extent to which it contributes to the prediction of extreme conditions. Furthermore, the models used could be tailored more closely to the task. In <xref ref-type="bibr" rid="bib1.bibx10" id="text.102"/>, the authors point out the limitations of the mean-squared-error loss function, a practice that is still widespread in the field and is adopted in this paper. In addition, <xref ref-type="bibr" rid="bib1.bibx113" id="text.103"/> show the superior performance of loss functions based on weighting and relative entropy compared to other loss functions in learning extreme events. Finally, further investigations could benefit from deeper explorations of ESNs. With their user-friendly nature, straightforward hyperparameter tuning, and fast training, they stand out as a robust modeling method, able to compete with top-tier deep learning architectures such as LSTMs.</p>
</sec>
</sec>
<sec id="Ch1.S5" sec-type="conclusions">
  <label>5</label><title>Conclusions</title>
      <p id="d2e3721">We compared the performance of four recurrent neural network architectures in modeling biosphere dynamics, i.e., spectral vegetation indices, in response to climate drivers. Using daily data, we assessed the effectiveness of these network architectures in capturing extreme anomalies within these vegetation dynamics. To discern variations in performance across different scenarios, we employed various metrics such as normalized root mean square error and symmetric mean absolute percentage error paired with information theory quantifiers. Our findings revealed that ESNs and LSTMs performed similarly across most analyzed metrics, indicating that no single model outperformed the others. Additionally, all the models under investigation failed to model the extreme responses of the vegetation. This work highlights the necessity to continue refining and developing specialized models that can more adeptly capture extreme vegetation response to meteorological drivers.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title>Recurrent neural network details</title>
      <p id="d2e3735">In this appendix, we provide the mathematical details behind the models used in this study. Starting from Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/>, we introduce the simple version of the RNN, which provides the basic equation for all following models. Subsequently, in Sects. <xref ref-type="sec" rid="App1.Ch1.S1.SS2"/> and <xref ref-type="sec" rid="App1.Ch1.S1.SS3"/>, we illustrate the gated approaches of LSTMs and GRUs, respectively. Finally, in Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS4"/>, we showcase the ESNs. While the general approach for all the models shows some similarities, different constructions exhibit variations in complexity <xref ref-type="bibr" rid="bib1.bibx40 bib1.bibx39" id="paren.104"/> and training speed <xref ref-type="bibr" rid="bib1.bibx16 bib1.bibx127" id="paren.105"/>.</p>
<sec id="App1.Ch1.S1.SS1">
  <label>A1</label><title>Recurrent neural networks</title>
      <p id="d2e3760">The most basic version of RNN was proposed by <xref ref-type="bibr" rid="bib1.bibx33" id="text.106"/>. The equations used to obtain the hidden state <inline-formula><mml:math id="M139" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> can be described as follows:
            <disp-formula id="App1.Ch1.S1.E12" content-type="numbered"><label>A1</label><mml:math id="M140" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>x</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>x</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>x</mml:mi></mml:msup><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M141" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> is the activation function, <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>x</mml:mi></mml:msubsup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M143" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>x</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> are the weight matrices, and <inline-formula><mml:math id="M144" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>x</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is a bias vector. In addition, <inline-formula><mml:math id="M145" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the input vector at time <inline-formula><mml:math id="M146" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>.</p>
      <p id="d2e3976">The main problem of this model is the vanishing or exploding gradient, which arises because backpropagation through time multiplies the gradient by the recurrent Jacobian once per time step <xref ref-type="bibr" rid="bib1.bibx137" id="paren.107"/>.</p>
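      <p>The recurrence in Eq. (A1) can be sketched in a few lines of NumPy. The sketch assumes a hyperbolic tangent activation and a zero initial hidden state. The function names <monospace>rnn_step</monospace> and <monospace>run_rnn</monospace>, the random weights, and the dimensions are illustrative choices rather than the configuration used in this study.</p>

```python
import numpy as np


def rnn_step(u_t, x_prev, W_in, W, b, activation=np.tanh):
    """One update of Eq. (A1): x(t) = sigma(W_in u(t) + W x(t-1) + b)."""
    return activation(W_in @ u_t + W @ x_prev + b)


def run_rnn(inputs, W_in, W, b, activation=np.tanh):
    """Apply rnn_step along a sequence of shape (T, d_u).

    Returns the hidden states stacked into an array of shape (T, d_x).
    """
    x = np.zeros(W.shape[0])  # assumption: zero initial hidden state
    states = []
    for u_t in inputs:
        x = rnn_step(u_t, x, W_in, W, b, activation)
        states.append(x)
    return np.stack(states)


rng = np.random.default_rng(0)
d_u, d_x, T = 3, 4, 5  # illustrative sizes, not taken from the study
W_in = rng.normal(scale=0.5, size=(d_x, d_u))
W = rng.normal(scale=0.5, size=(d_x, d_x))
b = np.zeros(d_x)
states = run_rnn(rng.normal(size=(T, d_u)), W_in, W, b)
print(states.shape)  # (5, 4)
```

      <p>Because the same matrix <monospace>W</monospace> enters every step of the loop, the gradient with respect to an early hidden state contains one factor involving <monospace>W</monospace> per time step, which is the source of the vanishing and exploding behavior.</p>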
</sec>
<sec id="App1.Ch1.S1.SS2">
  <label>A2</label><title>Long short-term memory</title>
      <p id="d2e3990">For LSTMs, the hidden state <inline-formula><mml:math id="M147" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is obtained as follows <xref ref-type="bibr" rid="bib1.bibx54" id="paren.108"/>:

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M148" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E13"><mml:mtd><mml:mtext>A2</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="bold-italic">f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>g</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>f</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>f</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>f</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E14"><mml:mtd><mml:mtext>A3</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>g</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>i</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mi 
mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E15"><mml:mtd><mml:mtext>A4</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>g</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>o</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>o</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>o</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E16"><mml:mtd><mml:mtext>A5</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mover accent="true"><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mtext>tanh</mml:mtext><mml:mo>(</mml:mo><mml:msubsup><mml:mi 
mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>c</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E17"><mml:mtd><mml:mtext>A6</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>⊙</mml:mo><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>⊙</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E18"><mml:mtd><mml:mtext>A7</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mi 
mathvariant="bold-italic">o</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>⊙</mml:mo><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">c</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

            where <inline-formula><mml:math id="M149" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the <italic>forget</italic> gate, <inline-formula><mml:math id="M150" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">i</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the <italic>input</italic> gate, and <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">o</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the <italic>output</italic> gate. The activation functions <inline-formula><mml:math id="M152" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>g</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are usually set to be sigmoid, and <inline-formula><mml:math id="M153" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is usually set to be the hyperbolic tangent, although it can be set to the identity function in some variations of the model (e.g., “peephole” LSTM, <xref ref-type="bibr" rid="bib1.bibx45" id="altparen.109"/>). 
The matrices <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>j</mml:mi></mml:msubsup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>j</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M156" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> are the weight matrices, while the vectors <inline-formula><mml:math id="M157" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>j</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M158" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> are bias vectors. 
The vector <inline-formula><mml:math id="M159" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> represents the input vector at time <inline-formula><mml:math id="M160" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>.</p>
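      <p>A single LSTM update can be sketched as follows. The sketch assumes sigmoid gates and a hyperbolic tangent for the output activation, and it applies the output activation to the cell state c(t), as in the standard LSTM formulation. Storing the weights in dictionaries keyed by the gate name is a hypothetical layout chosen for readability, and <monospace>lstm_step</monospace> is an illustrative name.</p>

```python
import numpy as np


def sigmoid(a):
    """Logistic function used for the three gates."""
    return 1.0 / (1.0 + np.exp(-a))


def lstm_step(u_t, x_prev, c_prev, W_in, W, b):
    """One LSTM update of the hidden state x(t) and cell state c(t).

    W_in, W, and b are dictionaries keyed by "f", "i", "o", and "c"
    (a hypothetical layout), holding arrays of shape (d_x, d_u),
    (d_x, d_x), and (d_x,), respectively.
    """
    def pre_activation(j):
        return W_in[j] @ u_t + W[j] @ x_prev + b[j]

    f = sigmoid(pre_activation("f"))        # forget gate
    i = sigmoid(pre_activation("i"))        # input gate
    o = sigmoid(pre_activation("o"))        # output gate
    c_tilde = np.tanh(pre_activation("c"))  # candidate cell state
    c = f * c_prev + i * c_tilde            # element-wise cell update
    x = o * np.tanh(c)                      # hidden state from the cell state
    return x, c


# Illustrative usage with all-zero parameters: every gate equals 0.5.
d_u, d_x = 2, 3
keys = ("f", "i", "o", "c")
W_in = {j: np.zeros((d_x, d_u)) for j in keys}
W = {j: np.zeros((d_x, d_x)) for j in keys}
b = {j: np.zeros(d_x) for j in keys}
x_t, c_t = lstm_step(np.ones(d_u), np.zeros(d_x), np.ones(d_x), W_in, W, b)
```

      <p>With all parameters set to zero, each gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved. This makes the role of the forget gate as a multiplicative memory factor easy to verify by hand.</p>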
</sec>
<sec id="App1.Ch1.S1.SS3">
  <label>A3</label><title>Gated recurrent units</title>
      <p id="d2e4691">The equations to obtain the hidden state <inline-formula><mml:math id="M161" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for GRUs are defined as follows <xref ref-type="bibr" rid="bib1.bibx21" id="paren.110"/>:

                <disp-formula specific-use="align" content-type="numbered"><mml:math id="M162" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E19"><mml:mtd><mml:mtext>A8</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>r</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>r</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>r</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E20"><mml:mtd><mml:mtext>A9</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>z</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>z</mml:mi></mml:msup><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>z</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E21"><mml:mtd><mml:mtext>A10</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mtext>tanh</mml:mtext><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>x</mml:mi></mml:msubsup><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>x</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>⊙</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>x</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E22"><mml:mtd><mml:mtext>A11</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>⊙</mml:mo><mml:mi 
mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>⊙</mml:mo><mml:mover accent="true"><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

            where <inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">r</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the <italic>reset</italic> gate,  <inline-formula><mml:math id="M164" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the <italic>update</italic> gate, and <inline-formula><mml:math id="M165" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the input signal. The activation functions <inline-formula><mml:math id="M166" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> are set to be sigmoid. 
As in the LSTM, the matrices <inline-formula><mml:math id="M167" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext><mml:mi>j</mml:mi></mml:msubsup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M168" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">W</mml:mi><mml:mi>j</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M169" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>r</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> are the weight matrices, while the vectors <inline-formula><mml:math id="M170" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold-italic">b</mml:mi><mml:mi>j</mml:mi></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M171" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>r</mml:mi><mml:mo>,</mml:mo><mml:mi>z</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> are bias vectors.</p>
</sec>
<sec id="App1.Ch1.S1.SS4">
  <label>A4</label><title>Echo state networks</title>
      <p id="d2e5234">The hidden state <inline-formula><mml:math id="M172" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> for the ESN is defined as <xref ref-type="bibr" rid="bib1.bibx57" id="paren.111"/>
            <disp-formula id="App1.Ch1.S1.E23" content-type="numbered"><label>A12</label><mml:math id="M173" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mo>)</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mtext>tanh</mml:mtext><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext></mml:msub><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="bold">W</mml:mi><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M174" display="inline"><mml:mi mathvariant="italic">α</mml:mi></mml:math></inline-formula> is the leaky coefficient, and <inline-formula><mml:math id="M175" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the input data. Similarly to before, the matrices <inline-formula><mml:math id="M176" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M177" display="inline"><mml:mrow><mml:mi mathvariant="bold">W</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> are the weight matrices, with the difference that here these matrices are not trained and remain fixed. Since they are kept fixed, their initialization also plays a role in the predictive performance of the model. 
The standard choice is to create <inline-formula><mml:math id="M178" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>in</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>, also called the <italic>input</italic> matrix, as a dense matrix with weights randomly sampled from a uniform distribution in the range <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">σ</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. The weight matrix <inline-formula><mml:math id="M180" display="inline"><mml:mi mathvariant="bold">W</mml:mi></mml:math></inline-formula> is referred to as the <italic>reservoir</italic> matrix, and it is usually built from an Erdős–Rényi graph configuration. This matrix is highly sparse, with a sparsity usually in the 1 %–10 % range, and its values are also randomly sampled from a uniform distribution <inline-formula><mml:math id="M181" display="inline"><mml:mrow><mml:mo>∈</mml:mo><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. Subsequently, the matrix is rescaled to obtain a chosen spectral radius <inline-formula><mml:math id="M182" display="inline"><mml:mrow><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold">W</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. The spectral radius, the size of the matrix, and its sparsity are the main hyperparameters for ESN models.</p>
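      <p>As an illustration of Eq. (A12) and of the initialization described above, the following minimal Python sketch builds an input matrix and a reservoir matrix and iterates the leaky state update. It is not the code used in this study: the function names, the toy input signal, the reading of the 1 %–10 % range as the fraction of nonzero reservoir entries (argument <monospace>density</monospace>), and the estimate of the spectral radius through Gelfand's formula are assumptions made for illustration only.</p>
<preformat preformat-type="code">
```python
import math
import random


def matmul(A, B):
    """Plain-Python product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def frobenius(A):
    """Frobenius norm of a matrix."""
    return math.sqrt(sum(v * v for row in A for v in row))


def spectral_radius(W, squarings=12):
    """Estimate rho(W) with Gelfand's formula ||W^k||^(1/k), k = 2**squarings."""
    norm = frobenius(W)
    if norm == 0.0:
        return 0.0
    A = [[v / norm for v in row] for row in W]
    log_norm, k = math.log(norm), 1      # invariant: log ||W^k|| == log_norm
    for _ in range(squarings):
        A = matmul(A, A)
        norm = frobenius(A)
        if norm == 0.0:                  # nilpotent matrix: all eigenvalues are zero
            return 0.0
        A = [[v / norm for v in row] for row in A]
        log_norm, k = 2.0 * log_norm + math.log(norm), 2 * k
    return math.exp(log_norm / k)


def init_input_matrix(d_x, d_u, sigma, rng):
    """Dense input matrix with entries drawn uniformly from [-sigma, sigma]."""
    return [[rng.uniform(-sigma, sigma) for _ in range(d_u)] for _ in range(d_x)]


def init_reservoir(d_x, density, rho, rng):
    """Random sparse matrix with uniform weights in [-1, 1], rescaled to radius rho."""
    W = [[rng.uniform(-1.0, 1.0) if density > rng.random() else 0.0
          for _ in range(d_x)] for _ in range(d_x)]
    current = spectral_radius(W)
    if current == 0.0:
        raise ValueError("reservoir has zero spectral radius; draw it again")
    return [[v * rho / current for v in row] for row in W]


def esn_step(x_prev, u, W_in, W, alpha):
    """Leaky update: x(t) = (1 - alpha) x(t-1) + alpha tanh(W_in u(t) + W x(t-1))."""
    new_state = []
    for i in range(len(x_prev)):
        pre = (sum(w * v for w, v in zip(W_in[i], u))
               + sum(w * v for w, v in zip(W[i], x_prev)))
        new_state.append((1.0 - alpha) * x_prev[i] + alpha * math.tanh(pre))
    return new_state


# Toy run: 30 reservoir units driven by a two-dimensional sinusoidal input.
rng = random.Random(0)
W_in = init_input_matrix(d_x=30, d_u=2, sigma=0.1, rng=rng)
W = init_reservoir(d_x=30, density=0.1, rho=0.9, rng=rng)
x = [0.0] * 30
for t in range(50):
    u = [math.sin(0.1 * t), math.cos(0.1 * t)]
    x = esn_step(x, u, W_in, W, alpha=0.7)
```
</preformat>
      <p>Because the state starts at zero and each update is a convex combination of the previous state and a <monospace>tanh</monospace> output, every component of the state stays within the interval from -1 to 1 in this sketch.</p>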
      <p id="d2e5512">The output is obtained through a linear feed-forward layer:
            <disp-formula id="App1.Ch1.S1.E24" content-type="numbered"><label>A13</label><mml:math id="M183" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>out</mml:mtext></mml:msub><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M184" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>out</mml:mtext></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is the <italic>output</italic> matrix. This matrix is the only one whose weights undergo training. Unlike the models illustrated before, this training is not done using BPTT but simple linear regression. During the training phase, all the inputs <inline-formula><mml:math id="M185" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> are passed through the ESN, and the respective expansions (hidden states) are saved column-wise in a <italic>state</italic> matrix <inline-formula><mml:math id="M186" display="inline"><mml:mrow><mml:mi mathvariant="bold">X</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>T</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M187" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> is the length of the training set <inline-formula><mml:math id="M188" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:math></inline-formula>. 
In a similar way, the matrix <inline-formula><mml:math id="M189" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">Y</mml:mi><mml:mtext>target</mml:mtext></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>T</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is built by stacking the desired outputs <inline-formula><mml:math id="M190" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">v</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>v</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> column-wise. This way, the output layer can be obtained using ridge regression with the following closed form:
            <disp-formula id="App1.Ch1.S1.E25" content-type="numbered"><label>A14</label><mml:math id="M191" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">W</mml:mi><mml:mtext>out</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="bold">Y</mml:mi><mml:mtext>target</mml:mtext></mml:msup><mml:msup><mml:mi mathvariant="bold">X</mml:mi><mml:mtext>T</mml:mtext></mml:msup><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">XX</mml:mi><mml:mtext>T</mml:mtext></mml:msup><mml:mo>+</mml:mo><mml:mi mathvariant="italic">β</mml:mi><mml:mi mathvariant="bold">I</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M192" display="inline"><mml:mi mathvariant="bold">I</mml:mi></mml:math></inline-formula> is the identity matrix, and <inline-formula><mml:math id="M193" display="inline"><mml:mi mathvariant="italic">β</mml:mi></mml:math></inline-formula> is a regularization coefficient.</p><table-wrap id="App1.Ch1.S1.T2"><label>Table A1</label><caption><p id="d2e5774">ESN hyperparameter grid search values.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="right"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="left"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry namest="col1" nameend="col4" align="center">ESN </oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Sparsity</oasis:entry>
         <oasis:entry colname="col2">Ridge coefficient</oasis:entry>
         <oasis:entry colname="col3">Leaky coeff.</oasis:entry>
         <oasis:entry colname="col4">Radius</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M194" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.01</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">0.1</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M195" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.0</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M196" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.5</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">0.05</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">1.0</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M197" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.9</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">0.05</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">1.5</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
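      <p>The closed form in Eq. (A14) and the linear output layer in Eq. (A13) can be sketched in plain Python as follows. This is an illustrative sketch rather than the implementation used in the study: the function names and the toy data are hypothetical, and the linear system is solved by Gauss–Jordan elimination instead of forming the inverse explicitly.</p>
<preformat preformat-type="code">
```python
def ridge_readout(X, Y, beta):
    """Return W_out = Y X^T (X X^T + beta I)^-1 for X (d_x by T) and Y (d_v by T)."""
    d_x, d_v = len(X), len(Y)
    # Gram matrix G = X X^T + beta I (symmetric) and cross term B = Y X^T.
    G = [[sum(a * b for a, b in zip(X[i], X[j])) + (beta if i == j else 0.0)
          for j in range(d_x)] for i in range(d_x)]
    B = [[sum(a * b for a, b in zip(Y[r], X[j])) for j in range(d_x)]
         for r in range(d_v)]
    # Since G is symmetric, W_out G = B is equivalent to G W_out^T = B^T.
    # Solve it by Gauss-Jordan elimination on the augmented matrix [G | B^T].
    M = [G[i] + [B[r][i] for r in range(d_v)] for i in range(d_x)]
    for col in range(d_x):
        pivot = max(range(col, d_x), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("singular system; use a positive beta")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_row = [v / M[col][col] for v in M[col]]
        M[col] = pivot_row
        for r in range(d_x):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], pivot_row)]
    # Column d_x + r of the reduced matrix holds row r of W_out.
    return [[M[i][d_x + r] for i in range(d_x)] for r in range(d_v)]


def readout(W_out, x):
    """Linear output layer: v(t) = W_out x(t)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W_out]


# Toy data: two state components over T = 3 steps, target y = 1 * x1 + 2 * x2.
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
Y = [[1.0, 2.0, 3.0]]
W_out = ridge_readout(X, Y, beta=0.0)        # exact fit: weights 1 and 2
W_out_reg = ridge_readout(X, Y, beta=1e-3)   # regularization shrinks the weights slightly
```
</preformat>
      <p>In the toy example, the unregularized solution reproduces the generating weights, while a positive regularization coefficient pulls them slightly towards zero.</p>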

</sec>
</app>

<app id="App1.Ch1.S2">
  <label>Appendix B</label><title>Computational details</title>
<sec id="App1.Ch1.S2.SS1">
  <label>B1</label><title>Experimental settings</title>
      <p id="d2e5937">All the figures in this paper (except Fig. <xref ref-type="fig" rid="Ch1.F1"/>) have been obtained using the Julia language <xref ref-type="bibr" rid="bib1.bibx9" id="paren.112"/> package <monospace>Makie.jl</monospace> <xref ref-type="bibr" rid="bib1.bibx24" id="paren.113"/>. All the models have been optimized using grid search. For each optimal set of hyperparameters, 100 different runs have been carried out for each model and site. Unless specified otherwise, the reported results are the mean and standard deviation over these 100 runs per location, aggregated over all locations. All the simulations were run on a machine fitted with an NVIDIA RTX A6000 graphics processing unit (GPU) and 504 GB of random access memory (RAM).</p>
</sec>
<sec id="App1.Ch1.S2.SS2">
  <label>B2</label><title>Details of the models</title>
      <p id="d2e5959">In the proposed models based on truncated BPTT, the parameters are optimized using the Adam stochastic optimization method <xref ref-type="bibr" rid="bib1.bibx61" id="paren.114"/>, while dropout <xref ref-type="bibr" rid="bib1.bibx125" id="paren.115"/> is used to protect against over-fitting. Additionally, early stopping is employed to halt training when the validation loss has stopped improving, with a patience of 50 epochs. The weights <inline-formula><mml:math id="M198" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">W</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are initialized using the technique proposed in <xref ref-type="bibr" rid="bib1.bibx47" id="paren.116"/>, drawing from a uniform distribution. The biases <inline-formula><mml:math id="M199" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are also initialized by drawing values from a uniform distribution.</p>
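      <p>The early-stopping rule can be sketched as follows, assuming that training stops once the validation loss has failed to improve on its best value for a number of consecutive epochs equal to the patience. The function name and the <monospace>min_delta</monospace> tolerance are hypothetical and are not taken from the code of this study.</p>
<preformat preformat-type="code">
```python
def early_stopping_epoch(val_losses, patience=50, min_delta=0.0):
    """Return the index of the epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:
            # New best validation loss: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # the patience was never exhausted


# Toy example with a short patience so that the rule triggers quickly.
losses = [1.0, 0.9, 0.8, 0.8, 0.8, 0.8]
stop_epoch = early_stopping_epoch(losses, patience=3)
```
</preformat>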
      <p id="d2e5993">Several recurrent layers are also stacked on top of each other, providing a deeper model. The number of layers and the number of weights per layer are also optimized. All the hyperparameters are optimized using grid search with the root mean square error (RMSE) as the guiding measure. Split temporal cross-validation is used <xref ref-type="bibr" rid="bib1.bibx17" id="paren.117"/>. In this technique, the training data are subdivided into smaller sections of increasing length, called folds. Each fold contains both training and validation data, with a chosen split between them. We used three folds for cross-validation, with 20 % of the training dataset left for validation in each fold.</p>
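      <p>A minimal sketch of such a split is given below. The exact fold boundaries are an assumption: here each fold covers a growing prefix of the training series, and the last 20 % of every fold is held out, so that the held-out samples always follow the samples used for fitting. The function name is hypothetical.</p>
<preformat preformat-type="code">
```python
def temporal_folds(n_samples, n_folds=3, holdout_fraction=0.2):
    """Return (fit_indices, holdout_indices) pairs for folds of increasing length."""
    folds = []
    for i in range(1, n_folds + 1):
        fold_end = n_samples * i // n_folds          # fold i covers samples [0, fold_end)
        n_holdout = max(1, int(round(fold_end * holdout_fraction)))
        split = fold_end - n_holdout
        # Earlier samples are used for fitting, the most recent ones are held out.
        folds.append((range(0, split), range(split, fold_end)))
    return folds


# Toy example: a training series of 300 time steps split into three folds.
folds = temporal_folds(300)
```
</preformat>
      <p>Keeping the held-out block at the end of each fold avoids fitting on samples that lie in the future of the samples used for evaluation.</p>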
</sec>
<sec id="App1.Ch1.S2.SS3">
  <label>B3</label><title>Grid search parameters</title>
      <p id="d2e6009">Tables <xref ref-type="table" rid="App1.Ch1.S1.T2"/> and <xref ref-type="table" rid="App1.Ch1.S2.T3"/> provide the parameters used for the grid search in this study. The values are given either as a list, separated by commas (<inline-formula><mml:math id="M200" display="inline"><mml:mrow><mml:mi>a</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi><mml:mo>,</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:math></inline-formula>), or as an interval, separated by colons (<inline-formula><mml:math id="M201" display="inline"><mml:mrow><mml:mi mathvariant="normal">start</mml:mi><mml:mo>:</mml:mo><mml:mi mathvariant="normal">step</mml:mi><mml:mo>:</mml:mo><mml:mi mathvariant="normal">stop</mml:mi></mml:mrow></mml:math></inline-formula>). In the first case, the values indicated are the values used. In the second case, the values used range from the first value to the last, with a step size given by the middle value.</p><table-wrap id="App1.Ch1.S2.T3"><label>Table B1</label><caption><p id="d2e6050">RNN/LSTM/GRU hyperparameter grid search values.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry namest="col1" nameend="col4" align="center">RNN/LSTM/GRU </oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Learning rate</oasis:entry>
         <oasis:entry colname="col2">Hidden dimension</oasis:entry>
         <oasis:entry colname="col3">No. layers</oasis:entry>
         <oasis:entry colname="col4">Dropout</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M202" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.0</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M203" display="inline"><mml:mrow><mml:mn mathvariant="normal">32</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">64</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">128</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M204" display="inline"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">3</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M205" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.1</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">0.1</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
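      <p>The notation above can be expanded into explicit candidate lists as in the following sketch, which uses the values listed in the two hyperparameter tables. The function and key names are hypothetical, and two assumptions are made: both endpoints of an interval are included, and every combination of values is evaluated.</p>
<preformat preformat-type="code">
```python
import itertools


def expand_range(start, step, stop):
    """Expand start:step:stop into a list, assuming both endpoints are included."""
    count = int((stop - start) / step + 1e-9)   # number of whole steps in the interval
    return [round(start + i * step, 10) for i in range(count + 1)]


def build_grid(axes):
    """Cartesian product of the per-hyperparameter value lists."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in itertools.product(*axes.values())]


rnn_axes = {
    "learning_rate": [1e-3, 1e-4, 1e-5],
    "hidden_dimension": [32, 64, 128],
    "n_layers": [2, 3, 4],
    "dropout": expand_range(0.1, 0.1, 0.4),
}
esn_axes = {
    "sparsity": expand_range(0.01, 0.01, 0.1),
    "ridge_coefficient": [1e-2, 1e-3, 1e-4, 1e-5],
    "leaky_coefficient": expand_range(0.5, 0.05, 1.0),
    "radius": expand_range(0.9, 0.05, 1.5),
}
rnn_grid = build_grid(rnn_axes)   # one dictionary per candidate configuration
esn_grid = build_grid(esn_axes)
```
</preformat>
      <p>Under these assumptions, the grids contain 3 × 3 × 3 × 4 = 108 candidate configurations for the RNN, LSTM, and GRU models and 10 × 4 × 11 × 13 = 5720 for the ESN.</p>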

</sec>
</app>

<app id="App1.Ch1.S3">
  <label>Appendix C</label><title>Site comparison</title>
      <p id="d2e6196">Here, we compute the models' performance across all study sites. Following the nomenclature introduced in Sect. <xref ref-type="sec" rid="Ch1.S2.SS6.SSS3"/>, the <inline-formula><mml:math id="M206" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> score is defined as follows <xref ref-type="bibr" rid="bib1.bibx102" id="paren.118"/>:
          <disp-formula id="App1.Ch1.S3.E26" content-type="numbered"><label>C1</label><mml:math id="M207" display="block"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>⋅</mml:mo><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mo>⋅</mml:mo><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mi>b</mml:mi><mml:mo>+</mml:mo><mml:mi>c</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
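      <p>As a small worked example of Eq. (C1), the sketch below counts the three quantities from two binary series and evaluates the score. The mapping of <monospace>a</monospace>, <monospace>b</monospace>, and <monospace>c</monospace> to hits, false alarms, and misses is an assumption based on the usual contingency-table convention (the definitions themselves are given in the section cited above and are not repeated here); since <monospace>b</monospace> and <monospace>c</monospace> enter the formula symmetrically, swapping them does not change the result. The function names and the toy series are hypothetical.</p>
<preformat preformat-type="code">
```python
def contingency(observed, predicted):
    """Count hits (a), false alarms (b), and misses (c) for binary extreme flags."""
    pairs = list(zip(observed, predicted))
    a = sum(1 for o, p in pairs if o and p)        # extreme observed and predicted
    b = sum(1 for o, p in pairs if p and not o)    # predicted but not observed
    c = sum(1 for o, p in pairs if o and not p)    # observed but not predicted
    return a, b, c


def f1_score(a, b, c):
    """F1 = 2a / (2a + b + c); undefined (NaN) when no extreme is observed or predicted."""
    denominator = 2 * a + b + c
    if denominator == 0:
        return float("nan")
    return 2 * a / denominator


# Toy series: 1 marks a time step flagged as extreme.
observed = [1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1]
a, b, c = contingency(observed, predicted)
score = f1_score(a, b, c)
```
</preformat>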
      <p id="d2e6253">The results are shown in Fig. <xref ref-type="fig" rid="App1.Ch1.S3.F9"/>, which provides an additional overview of the algorithms' performance with respect to extremes.</p>

      <fig id="App1.Ch1.S3.F8"><label>Figure C1</label><caption><p id="d2e6260"><italic>Mean results across locations</italic>. We show the metrics for all the analyzed locations and all models. Full season refers to results computed on the full dataset. Growing season indicates that only the months from May to September (inclusive) were used. The figure shows the mean of 100 runs per location.</p></caption>
        
        <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f08.png"/>

      </fig>

<fig id="App1.Ch1.S3.F9"><label>Figure C2</label><caption><p id="d2e6277"><italic>Mean</italic> <inline-formula><mml:math id="M208" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> <italic>score results across locations</italic>. We show the <inline-formula><mml:math id="M209" display="inline"><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> score for all the analyzed locations and all models. Full season refers to results computed on the full dataset. Summer indicates that only the months from May to September (inclusive) were used. The figure shows the mean of 100 runs per location.</p></caption>
        
        <graphic xlink:href="https://npg.copernicus.org/articles/31/535/2024/npg-31-535-2024-f09.png"/>

      </fig>

</app>
  </app-group><notes notes-type="codeavailability"><title>Code availability</title>

      <p id="d2e6319">The code for this study is available at <uri>https://github.com/MartinuzziFrancesco/rnn-ndvi</uri> <xref ref-type="bibr" rid="bib1.bibx85" id="paren.119"/>.</p>
  </notes><notes notes-type="dataavailability"><title>Data availability</title>

      <p id="d2e6331">The data used in this study are available online: <list list-type="bullet"><list-item>
      <p id="d2e6336">E-OBS dataset <xref ref-type="bibr" rid="bib1.bibx23" id="paren.120"/> at <uri>https://www.ecad.eu/download/ensembles/download.php</uri></p></list-item><list-item>
      <p id="d2e6345">FluxnetEO dataset <xref ref-type="bibr" rid="bib1.bibx136" id="paren.121"/> at <uri>https://meta.icos-cp.eu/collections/tEAkpU6UduMMONrFyym5-tUW</uri>.</p></list-item></list></p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e6357">FM and KM conceptualized the work, and FM carried out the simulations. KM, MDM, GCV, and TW provided suggestions for the analysis. KM and MDM supervised the work. DM formulated the data pre-processing pipeline, while FM implemented it. FM wrote the manuscript with contributions from all authors.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e6363">The contact author has declared that none of the authors has any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e6370">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e6376">We thank Sophia Walther for explaining the FluxnetEO dataset to us in detail. This research was supported by grants from the European Space Agency (ESA; AI4Science – Deep Extremes and Deep Earth System Data Lab). Francesco Martinuzzi and Miguel D. Mahecha acknowledge the financial support from the Federal Ministry of Education and Research of Germany and from the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the program of the Center of Excellence for AI Research “Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig”, project identification no. ScaDS.AI. Miguel D. Mahecha acknowledges support from the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus (SMWK project 232171353). We thank the Breathing Nature community for the interdisciplinary exchange. Karin Mora acknowledges funding by the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus (SMWK 3-7304/35/6-2021/48880). Karin Mora and Miguel D. Mahecha acknowledge the funding by the European Space Agency (ESA) AI4Science via the DeepFeatures project. David Montero and Miguel D. Mahecha acknowledge support from the “Digital Forest” project, Niedersächsisches Ministerium für Wissenschaft und Kultur (MWK) via the Niedersächsisches Vorab (ZN3679) program. Miguel D. Mahecha acknowledges support from the German Aerospace Center, DLR (ML4Earth; grant no. 50EE2201B). We thank the European Union for funding XAIDA via Horizon 2020 grant no. 101003469.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e6381">This research has been supported by the Federal Ministry of Education and Research of Germany and by the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus (ScaDS.AI); the European Space Agency (ESA) (contract no. 4000143500/23/I-DT); the Programm Niedersächsisches Vorab (grant no. ZN 3679); the German Aerospace Center (DLR; grant no. 50EE2201B); the European Commission (XAIDA; grant no. 101003469); and the Saxon State Ministry for Science, Culture and Tourism (SMWK; grant nos. 232171353 and 3-7304/35/6-2021/48880).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e6388">This paper was edited by Zoltan Toth and reviewed by two anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Aicher et al.(2020)Aicher, Foti, and Fox</label><mixed-citation>Aicher, C., Foti, N. J., and Fox, E. B.: Adaptively truncating backpropagation through time to control gradient bias, in: Uncertainty in Artificial Intelligence, PMLR, 799–808, <uri>http://proceedings.mlr.press/v115/aicher20a/aicher20a.pdf</uri> (last access: 4 November 2024), 2020.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Aubinet et al.(2012)Aubinet, Vesala, and Papale</label><mixed-citation>Aubinet, M., Vesala, T., and Papale, D.: Eddy covariance: a practical guide to measurement and data analysis, Springer Science &amp; Business Media, <ext-link xlink:href="https://doi.org/10.1007/978-94-007-2351-1" ext-link-type="DOI">10.1007/978-94-007-2351-1</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Bandt and Pompe(2002)</label><mixed-citation>Bandt, C. and Pompe, B.: Permutation entropy: a natural complexity measure for time series, Phys. Rev. Lett., 88, 174102, <ext-link xlink:href="https://doi.org/10.1103/PhysRevLett.88.174102" ext-link-type="DOI">10.1103/PhysRevLett.88.174102</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Barnes et al.(2009)Barnes, Schultz, Gruntfest, Hayden, and Benight</label><mixed-citation> Barnes, L. R., Schultz, D. M., Gruntfest, E. C., Hayden, M. H., and Benight, C. C.: Corrigendum: False alarm rate or false alarm ratio?, Weather Forecast., 24, 1452–1454, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Bastos et al.(2023)Bastos, Sippel, Frank, Mahecha, Zaehle, Zscheischler, and Reichstein</label><mixed-citation> Bastos, A., Sippel, S., Frank, D., Mahecha, M. D., Zaehle, S., Zscheischler, J., and Reichstein, M.: A joint framework for studying compound ecoclimatic events, Nat. Rev. Earth Environ., 4, 333–350, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Bengio et al.(1994)Bengio, Simard, and Frasconi</label><mixed-citation> Bengio, Y., Simard, P., and Frasconi, P.: Learning long-term dependencies with gradient descent is difficult, IEEE T. Neural Network., 5, 157–166, 1994.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Benson et al.(2024)Benson, Robin, Requena-Mesa, Alonso, Carvalhais, Cortés, Gao, Linscheid, Weynants, and Reichstein</label><mixed-citation>Benson, V., Robin, C., Requena-Mesa, C., Alonso, L., Carvalhais, N., Cortés, J., Gao, Z., Linscheid, N., Weynants, M., and Reichstein, M.: Multi-modal learning for geospatial vegetation forecasting, in: Conference on Computer Vision and Pattern Recognition, 16–22 June 2024, Seattle, Washington, United States, <ext-link xlink:href="https://doi.org/10.1109/CVPR52733.2024.02625" ext-link-type="DOI">10.1109/CVPR52733.2024.02625</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Besnard et al.(2019)Besnard, Carvalhais, Arain, Black, Brede, Buchmann, Chen, Clevers, Dutrieux, Gans et al.</label><mixed-citation>Besnard, S., Carvalhais, N., Arain, M. A., Black, A., Brede, B., Buchmann, N., Chen, J., Clevers, J. G. W., Dutrieux, L. P., Gans, F., Herold, M., Jung, M., Kosugi, Y., Knohl, A., Law, B. E., Paul-Limoges, E., Lohila, A., Merbold, L., Roupsard, O., Valentini, R., Wolf, S., Zhang, X., and Reichstein, M.: Memory effects of climate and vegetation affecting net ecosystem CO<sub>2</sub> fluxes in global forests, PLoS ONE, 14, e0211510, <ext-link xlink:href="https://doi.org/10.1371/journal.pone.0211510" ext-link-type="DOI">10.1371/journal.pone.0211510</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Bezanson et al.(2017)Bezanson, Edelman, Karpinski, and Shah</label><mixed-citation> Bezanson, J., Edelman, A., Karpinski, S., and Shah, V. B.: Julia: A fresh approach to numerical computing, SIAM Review, 59, 65–98, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Bonavita et al.(2023)Bonavita, Schneider, Arcucci, Chantry, Chrust, Geer, Le Saux, and Vitolo</label><mixed-citation>Bonavita, M., Schneider, R., Arcucci, R., Chantry, M., Chrust, M., Geer, A., Le Saux, B., and Vitolo, C.: 2022 ECMWF-ESA workshop report: current status, progress and opportunities in machine learning for Earth System observation and prediction, npj Climate and Atmospheric Science, 6, 87, <ext-link xlink:href="https://doi.org/10.1038/s41612-023-00387-2" ext-link-type="DOI">10.1038/s41612-023-00387-2</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Bottou(2012)</label><mixed-citation> Bottou, L.: Stochastic gradient descent tricks, in: Neural Networks: Tricks of the Trade: Second Edition, Springer, 421–436, ISBN 978-3-642-35288-1, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Buchhorn et al.(2020)Buchhorn, Smets, Bertels, Roo, Lesiv, Tsendbazar, Herold, and Fritz</label><mixed-citation>Buchhorn, M., Smets, B., Bertels, L., Roo, B. D., Lesiv, M., Tsendbazar, N.-E., Herold, M., and Fritz, S.: Copernicus Global Land Service: Land Cover 100m: collection 3: epoch 2019: Globe, Zenodo [data set], <ext-link xlink:href="https://doi.org/10.5281/zenodo.3939050" ext-link-type="DOI">10.5281/zenodo.3939050</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Camps-Valls et al.(2021a)Camps-Valls, Campos-Taberner, Moreno-Martínez, Walther, Duveiller, Cescatti, Mahecha, Muñoz-Marí, García-Haro, Guanter et al.</label><mixed-citation>Camps-Valls, G., Campos-Taberner, M., Moreno-Martínez, Á., Walther, S., Duveiller, G., Cescatti, A., Mahecha, M. D., Muñoz-Marí, J., García-Haro, F. J., Guanter, L., Jung, M., Gamon, J. A., Reichstein, M., and Running, S. W.: A unified vegetation index for quantifying the terrestrial biosphere, Science Advances, 7, eabc7447, <ext-link xlink:href="https://doi.org/10.1126/sciadv.abc7447" ext-link-type="DOI">10.1126/sciadv.abc7447</ext-link>, 2021a.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Camps-Valls et al.(2021b)Camps-Valls, Tuia, Zhu, and Reichstein</label><mixed-citation> Camps-Valls, G., Tuia, D., Zhu, X. X., and Reichstein, M.: Deep learning for the Earth Sciences: A comprehensive approach to remote sensing, climate science and geosciences, John Wiley &amp; Sons, ISBN 1119646146, 2021b.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Canadell et al.(2021)Canadell, Monteiro, Costa, Cotrim da Cunha, Cox, Eliseev, Henson, Ishii, Jaccard, Koven, Lohila, Patra, Piao, Rogelj, Syampungani, Zaehle, and Zickfeld</label><mixed-citation>Canadell, J., Monteiro, P., Costa, M., Cotrim da Cunha, L., Cox, P., Eliseev, A., Henson, S., Ishii, M., Jaccard, S., Koven, C., Lohila, A., Patra, P., Piao, S., Rogelj, J., Syampungani, S., Zaehle, S., and Zickfeld, K.: Global Carbon and other Biogeochemical Cycles and Feedbacks, in: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, edited by Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M., Huang, M., Leitzell, K., Lonnoy, E., Matthews, J., Maycock, T., Waterfield, T., Yelekçi, O., Yu, R., and Zhou, B., pp. 673–816, Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, <ext-link xlink:href="https://doi.org/10.1017/9781009157896.007" ext-link-type="DOI">10.1017/9781009157896.007</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Cerina et al.(2020)Cerina, Santambrogio, Franco, Gallicchio, and Micheli</label><mixed-citation>Cerina, L., Santambrogio, M. D., Franco, G., Gallicchio, C., and Micheli, A.: Efficient embedded machine learning applications using echo state networks, in: 2020 Design, Automation &amp; Test in Europe Conference &amp; Exhibition (DATE), IEEE, 1299–1302, <ext-link xlink:href="https://doi.org/10.23919/DATE48585.2020.9116334" ext-link-type="DOI">10.23919/DATE48585.2020.9116334</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Cerqueira et al.(2020)Cerqueira, Torgo, and Mozetič</label><mixed-citation> Cerqueira, V., Torgo, L., and Mozetič, I.: Evaluating time series forecasting models: An empirical study on performance estimation methods, Machine Learning, 109, 1997–2028, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Chattopadhyay et al.(2020)Chattopadhyay, Hassanzadeh, and Subramanian</label><mixed-citation>Chattopadhyay, A., Hassanzadeh, P., and Subramanian, D.: Data-driven predictions of a multiscale Lorenz 96 chaotic system using machine-learning methods: reservoir computing, artificial neural network, and long short-term memory network, Nonlin. Processes Geophys., 27, 373–389, <ext-link xlink:href="https://doi.org/10.5194/npg-27-373-2020" ext-link-type="DOI">10.5194/npg-27-373-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Chen et al.(2004)Chen, Jönsson, Tamura, Gu, Matsushita, and Eklundh</label><mixed-citation> Chen, J., Jönsson, P., Tamura, M., Gu, Z., Matsushita, B., and Eklundh, L.: A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky–Golay filter, Remote Sens. Environ., 91, 332–344, 2004.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Chen et al.(2021)Chen, Liu, Xu, Wu, Liang, Cao, and Chen</label><mixed-citation> Chen, Z., Liu, H., Xu, C., Wu, X., Liang, B., Cao, J., and Chen, D.: Modeling vegetation greenness and its climate sensitivity with deep-learning technology, Ecol. Evol., 11, 7335–7345, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Cho et al.(2014)Cho, Van Merriënboer, Gulcehre, Bahdanau, Bougares, Schwenk, and Bengio</label><mixed-citation>Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, October 2014, 1724–1734, <ext-link xlink:href="https://doi.org/10.3115/v1/D14-1179" ext-link-type="DOI">10.3115/v1/D14-1179</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Chung et al.(2014)Chung, Gulcehre, Cho, and Bengio</label><mixed-citation>Chung, J., Gulcehre, C., Cho, K., and Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1412.3555" ext-link-type="DOI">10.48550/arXiv.1412.3555</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Cornes et al.(2018)Cornes, van der Schrier, van den Besselaar, and Jones</label><mixed-citation>Cornes, R. C., van der Schrier, G., van den Besselaar, E. J., and Jones, P. D.: An ensemble version of the E-OBS temperature and precipitation data sets, J. Geophys. Res.-Atmos., 123, 9391–9409, 2018 (data available at: <uri>https://www.ecad.eu/download/ensembles/download.php</uri>, last access: 1 June 2023).</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Danisch and Krumbiegel(2021)</label><mixed-citation>Danisch, S. and Krumbiegel, J.: Makie.jl: Flexible high-performance data visualization for Julia, Journal of Open Source Software, 6, 3349, <ext-link xlink:href="https://doi.org/10.21105/joss.03349" ext-link-type="DOI">10.21105/joss.03349</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Datseris(2018)</label><mixed-citation>Datseris, G.: DynamicalSystems.jl: A Julia software library for chaos and nonlinear dynamics, Journal of Open Source Software, 3, 598, <ext-link xlink:href="https://doi.org/10.21105/joss.00598" ext-link-type="DOI">10.21105/joss.00598</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>De Jong et al.(2011)De Jong, de Bruin, de Wit, Schaepman, and Dent</label><mixed-citation> De Jong, R., de Bruin, S., de Wit, A., Schaepman, M. E., and Dent, D. L.: Analysis of monotonic greening and browning trends from global NDVI time-series, Remote Sens. Environ., 115, 692–702, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>De Jong et al.(2012)De Jong, Verbesselt, Schaepman, and De Bruin</label><mixed-citation> De Jong, R., Verbesselt, J., Schaepman, M. E., and De Bruin, S.: Trend changes in global greening and browning: contribution of short-term trends to longer-term change, Glob. Change Biol., 18, 642–655, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>De Keersmaecker et al.(2016)De Keersmaecker, van Rooijen, Lhermitte, Tits, Schaminée, Coppin, Honnay, and Somers</label><mixed-citation>De Keersmaecker, W., van Rooijen, N., Lhermitte, S., Tits, L., Schaminée, J., Coppin, P., Honnay, O., and Somers, B.: Species-rich semi-natural grasslands have a higher resistance but a lower resilience than intensively managed agricultural grasslands in response to climate anomalies, J. Appl. Ecol., 53, 430–439, <ext-link xlink:href="https://doi.org/10.1111/1365-2664.12595" ext-link-type="DOI">10.1111/1365-2664.12595</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Diaconu et al.(2022)Diaconu, Saha, Günnemann, and Zhu</label><mixed-citation>Diaconu, C.-A., Saha, S., Günnemann, S., and Zhu, X. X.: Understanding the Role of Weather Data for Earth Surface Forecasting using a ConvLSTM-based Model, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Louisiana, United States, 19–24 June 2022, 1362–1371, <uri>https://openaccess.thecvf.com/content/CVPR2022W/EarthVision/html/Diaconu_Understanding_the_Role_of_Weather_Data_for_Earth_Surface_Forecasting_CVPRW_2022_paper.html</uri> (last access: 4 November 2024), 2022.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>Dijkstra(2013)</label><mixed-citation>Dijkstra, H. A.: Nonlinear climate dynamics, Cambridge University Press, ISBN 9781139034135, <ext-link xlink:href="https://doi.org/10.1017/CBO9781139034135" ext-link-type="DOI">10.1017/CBO9781139034135</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>Dobbertin et al.(2007)Dobbertin, Wermelinger, Bigler, Bürgi, Carron, Forster, Gimmi, Rigling et al.</label><mixed-citation> Dobbertin, M., Wermelinger, B., Bigler, C., Bürgi, M., Carron, M., Forster, B., Gimmi, U., and Rigling, A.: Linking increasing drought stress to Scots pine mortality and bark beetle infestations, Sci. World J., 7, 231–239, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>Eggleton(2012)</label><mixed-citation>Eggleton, T.: A short introduction to climate change, Cambridge University Press, ISBN 9781139524353, <ext-link xlink:href="https://doi.org/10.1017/CBO9781139524353" ext-link-type="DOI">10.1017/CBO9781139524353</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Elman(1990)</label><mixed-citation> Elman, J. L.: Finding structure in time, Cognitive Sci., 14, 179–211, 1990.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Fang et al.(2022)Fang, Kifer, Lawson, Feng, and Shen</label><mixed-citation>Fang, K., Kifer, D., Lawson, K., Feng, D., and Shen, C.: The data synergy effects of time-series deep learning models in hydrology, Water Resour. Res., 58, e2021WR029583, <ext-link xlink:href="https://doi.org/10.1029/2021WR029583" ext-link-type="DOI">10.1029/2021WR029583</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Farazmand and Sapsis(2019)</label><mixed-citation>Farazmand, M. and Sapsis, T. P.: Extreme events: Mechanisms and prediction, Appl. Mech. Rev., 71, 050801, <ext-link xlink:href="https://doi.org/10.1115/1.4042065" ext-link-type="DOI">10.1115/1.4042065</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Fensham and Holman(1999)</label><mixed-citation> Fensham, R. and Holman, J.: Temporal and spatial patterns in drought-related tree dieback in Australian savanna, J. Appl. Ecol., 36, 1035–1050, 1999.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Foley et al.(1998)Foley, Levis, Prentice, Pollard, and Thompson</label><mixed-citation> Foley, J. A., Levis, S., Prentice, I. C., Pollard, D., and Thompson, S. L.: Coupling dynamic models of climate and vegetation, Glob. Change Biol., 4, 561–579, 1998.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Frame et al.(2022)Frame, Kratzert, Klotz, Gauch, Shalev, Gilon, Qualls, Gupta, and Nearing</label><mixed-citation>Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, <ext-link xlink:href="https://doi.org/10.5194/hess-26-3377-2022" ext-link-type="DOI">10.5194/hess-26-3377-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>Freire et al.(2024)Freire, Srivallapanondh, Spinnler, Napoli, Costa, Prilepsky, and Turitsyn</label><mixed-citation>Freire, P., Srivallapanondh, S., Spinnler, B., Napoli, A., Costa, N., Prilepsky, J. E., and Turitsyn, S. K.: Computational Complexity Optimization of Neural Network-Based Equalizers in Digital Signal Processing: A Comprehensive Approach, J. Lightwave Technol., 42, 4177–4201, <ext-link xlink:href="https://doi.org/10.1109/JLT.2024.3386886" ext-link-type="DOI">10.1109/JLT.2024.3386886</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>Freire et al.(2021)Freire, Osadchuk, Spinnler, Napoli, Schairer, Costa, Prilepsky, and Turitsyn</label><mixed-citation> Freire, P. J., Osadchuk, Y., Spinnler, B., Napoli, A., Schairer, W., Costa, N., Prilepsky, J. E., and Turitsyn, S. K.: Performance versus complexity study of neural network equalizers in coherent optical systems, J. Lightwave Technol., 39, 6085–6096, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Friedlingstein et al.(2006)Friedlingstein, Cox, Betts, Bopp, von Bloh, Brovkin, Cadule, Doney, Eby, Fung et al.</label><mixed-citation> Friedlingstein, P., Cox, P., Betts, R., Bopp, L., von Bloh, W., Brovkin, V., Cadule, P., Doney, S., Eby, M., Fung, I., Bala, G., John, J., Jones, C., Joos, F., Kato, T., Kawamiya, M., Knorr, W., Lindsay, K., Matthews, H. D., Raddatz, T., Rayner, P., Reick, C., Roeckner, E., Schnitzler, K.-G., Schnur, R., Strassmann, K., Weaver, A. J., Yoshikawa, C., and Zeng, N.: Climate–carbon cycle feedback analysis: results from the C4MIP model intercomparison, J. Climate, 19, 3337–3353, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Funahashi and Nakamura(1993)</label><mixed-citation> Funahashi, K.-i. and Nakamura, Y.: Approximation of dynamical systems by continuous time recurrent neural networks, Neural Networks, 6, 801–806, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Gauch et al.(2021)Gauch, Kratzert, Klotz, Nearing, Lin, and Hochreiter</label><mixed-citation>Gauch, M., Kratzert, F., Klotz, D., Nearing, G., Lin, J., and Hochreiter, S.: Rainfall–runoff prediction at multiple timescales with a single Long Short-Term Memory network, Hydrol. Earth Syst. Sci., 25, 2045–2062, <ext-link xlink:href="https://doi.org/10.5194/hess-25-2045-2021" ext-link-type="DOI">10.5194/hess-25-2045-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Gauthier et al.(2022)Gauthier, Fischer, and Röhm</label><mixed-citation>Gauthier, D. J., Fischer, I., and Röhm, A.: Learning unseen coexisting attractors, Chaos: An Interdisciplinary Journal of Nonlinear Science, 32, 113107, <ext-link xlink:href="https://doi.org/10.1063/5.0116784" ext-link-type="DOI">10.1063/5.0116784</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>Gers and Schmidhuber(2000)</label><mixed-citation>Gers, F. and Schmidhuber, J.: Recurrent nets that time and count, in: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium, IEEE, Como, Italy, 24–27 July 2000, <ext-link xlink:href="https://doi.org/10.1109/ijcnn.2000.861302" ext-link-type="DOI">10.1109/ijcnn.2000.861302</ext-link>, 2000.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Ghazoul et al.(2015)Ghazoul, Burivalova, Garcia-Ulloa, and King</label><mixed-citation> Ghazoul, J., Burivalova, Z., Garcia-Ulloa, J., and King, L. A.: Conceptualizing forest degradation, Trends Ecol. Evol., 30, 622–632, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Glorot and Bengio(2010)</label><mixed-citation>Glorot, X. and Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks, in: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, edited by Teh, Y. W. and Titterington, M., vol. 9 of Proceedings of Machine Learning Research, PMLR, Chia Laguna Resort, Sardinia, Italy, 13–15 May 2010, 249–256, <uri>https://proceedings.mlr.press/v9/glorot10a.html</uri> (last access: 25 October 2024), 2010.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Grant(1984)</label><mixed-citation> Grant, P. J.: Drought effect on high-altitude forests, Ruahine range, North Island, New Zealand, New Zeal. J. Bot., 22, 15–27, 1984.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Grosse et al.(2002)Grosse, Bernaola-Galván, Carpena, Román-Roldán, Oliver, and Stanley</label><mixed-citation>Grosse, I., Bernaola-Galván, P., Carpena, P., Román-Roldán, R., Oliver, J., and Stanley, H. E.: Analysis of symbolic sequences using the Jensen-Shannon divergence, Phys. Rev. E, 65, 041905, <ext-link xlink:href="https://doi.org/10.1103/PhysRevE.65.041905" ext-link-type="DOI">10.1103/PhysRevE.65.041905</ext-link>, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx50"><label>Haaga and Datseris(2023)</label><mixed-citation>Haaga, K. A. and Datseris, G.: JuliaDynamics/ComplexityMeasures.jl: v2.7.2, Zenodo [software], <ext-link xlink:href="https://doi.org/10.5281/zenodo.7862020" ext-link-type="DOI">10.5281/zenodo.7862020</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx51"><label>Hart et al.(2020)Hart, Hook, and Dawes</label><mixed-citation> Hart, A., Hook, J., and Dawes, J.: Embedding and approximation theorems for echo state networks, Neural Networks, 128, 234–247, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx52"><label>Hilker et al.(2014)Hilker, Lyapustin, Tucker, Hall, Myneni, Wang, Bi, Mendes de Moura, and Sellers</label><mixed-citation> Hilker, T., Lyapustin, A. I., Tucker, C. J., Hall, F. G., Myneni, R. B., Wang, Y., Bi, J., Mendes de Moura, Y., and Sellers, P. J.: Vegetation dynamics and rainfall sensitivity of the Amazon, P. Natl. Acad. Sci. USA, 111, 16041–16046, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx53"><label>Hochreiter(1998)</label><mixed-citation> Hochreiter, S.: The vanishing gradient problem during learning recurrent neural nets and problem solutions, Int. J. Uncertain. Fuzz., 6, 107–116, 1998.</mixed-citation></ref>
      <ref id="bib1.bibx54"><label>Hochreiter and Schmidhuber(1997)</label><mixed-citation> Hochreiter, S. and Schmidhuber, J.: Long short-term memory, Neural Comput., 9, 1735–1780, 1997.</mixed-citation></ref>
      <ref id="bib1.bibx55"><label>Hogan and Mason(2011)</label><mixed-citation>Hogan, R. J. and Mason, I. B.: Deterministic Forecasts of Binary Events, in: Forecast Verification: A Practitioner's Guide in Atmospheric Science, 2nd edn., <ext-link xlink:href="https://doi.org/10.1002/9781119960003.ch3" ext-link-type="DOI">10.1002/9781119960003.ch3</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx56"><label>Hyndman and Koehler(2006)</label><mixed-citation>Hyndman, R. J. and Koehler, A. B.: Another look at measures of forecast accuracy, Int. J. Forecasting, 22, 679–688, <ext-link xlink:href="https://doi.org/10.1016/j.ijforecast.2006.03.001" ext-link-type="DOI">10.1016/j.ijforecast.2006.03.001</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx57"><label>Jaeger(2001)</label><mixed-citation>Jaeger, H.: The “echo state” approach to analysing and training recurrent neural networks – with an erratum note, Bonn, Germany: German National Research Center for Information Technology GMD Technical Report, 148, 13, <uri>https://www.ai.rug.nl/minds/uploads/EchoStatesTechRep.pdf</uri> (last access: 25 October 2024), 2001.</mixed-citation></ref>
      <ref id="bib1.bibx58"><label>Johnstone et al.(2016)Johnstone, Allen, Franklin, Frelich, Harvey, Higuera, Mack, Meentemeyer, Metz, Perry et al.</label><mixed-citation> Johnstone, J. F., Allen, C. D., Franklin, J. F., Frelich, L. E., Harvey, B. J., Higuera, P. E., Mack, M. C., Meentemeyer, R. K., Metz, M. R., Perry, G. L. W., Schoennagel, T., and Turner, M. G.: Changing disturbance regimes, ecological memory, and forest resilience, Front. Ecol. Environ., 14, 369–378, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx59"><label>Jung et al.(2020)Jung, Schwalm, Migliavacca, Walther, Camps-Valls, Koirala, Anthoni, Besnard, Bodesheim, Carvalhais, Chevallier, Gans, Goll, Haverd, Köhler, Ichii, Jain, Liu, Lombardozzi, Nabel, Nelson, O'Sullivan, Pallandt, Papale, Peters, Pongratz, Rödenbeck, Sitch, Tramontana, Walker, Weber, and Reichstein</label><mixed-citation>Jung, M., Schwalm, C., Migliavacca, M., Walther, S., Camps-Valls, G., Koirala, S., Anthoni, P., Besnard, S., Bodesheim, P., Carvalhais, N., Chevallier, F., Gans, F., Goll, D. S., Haverd, V., Köhler, P., Ichii, K., Jain, A. K., Liu, J., Lombardozzi, D., Nabel, J. E. M. S., Nelson, J. A., O'Sullivan, M., Pallandt, M., Papale, D., Peters, W., Pongratz, J., Rödenbeck, C., Sitch, S., Tramontana, G., Walker, A., Weber, U., and Reichstein, M.: Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach, Biogeosciences, 17, 1343–1365, <ext-link xlink:href="https://doi.org/10.5194/bg-17-1343-2020" ext-link-type="DOI">10.5194/bg-17-1343-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx60"><label>Kang et al.(2016)Kang, Di, Deng, Yu, and Xu</label><mixed-citation>Kang, L., Di, L., Deng, M., Yu, E., and Xu, Y.: Forecasting vegetation index based on vegetation-meteorological factor interactions with artificial neural network, in: 2016 Fifth International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Tianjin, China, 18–20 July 2016, 1–6, <ext-link xlink:href="https://doi.org/10.1109/Agro-Geoinformatics.2016.7577673" ext-link-type="DOI">10.1109/Agro-Geoinformatics.2016.7577673</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx61"><label>Kingma and Ba(2014)</label><mixed-citation>Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.1412.6980" ext-link-type="DOI">10.48550/arXiv.1412.6980</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx62"><label>Kladny et al.(2024)Kladny, Milanta, Mraz, Hufkens, and Stocker</label><mixed-citation>Kladny, K.-R., Milanta, M., Mraz, O., Hufkens, K., and Stocker, B. D.: Enhanced prediction of vegetation responses to extreme drought using deep learning and Earth observation data, Ecol. Inform., 80, 102474, <ext-link xlink:href="https://doi.org/10.1016/j.ecoinf.2024.102474" ext-link-type="DOI">10.1016/j.ecoinf.2024.102474</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx63"><label>Kraft et al.(2019)Kraft, Jung, Körner, Requena Mesa, Cortés, and Reichstein</label><mixed-citation>Kraft, B., Jung, M., Körner, M., Requena Mesa, C., Cortés, J., and Reichstein, M.: Identifying dynamic memory effects on vegetation state using recurrent neural networks, Frontiers in Big Data, 2, 31, <ext-link xlink:href="https://doi.org/10.3389/fdata.2019.00031" ext-link-type="DOI">10.3389/fdata.2019.00031</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx64"><label>Kratzert et al.(2018)Kratzert, Klotz, Brenner, Schulz, and Herrnegger</label><mixed-citation>Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, <ext-link xlink:href="https://doi.org/10.5194/hess-22-6005-2018" ext-link-type="DOI">10.5194/hess-22-6005-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx65"><label>Kratzert et al.(2024)Kratzert, Gauch, Klotz, and Nearing</label><mixed-citation>Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin, Hydrol. Earth Syst. Sci., 28, 4187–4201, <ext-link xlink:href="https://doi.org/10.5194/hess-28-4187-2024" ext-link-type="DOI">10.5194/hess-28-4187-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx66"><label>Krinner et al.(2005)Krinner, Viovy, de Noblet-Ducoudré, Ogée, Polcher, Friedlingstein, Ciais, Sitch, and Prentice</label><mixed-citation>Krinner, G., Viovy, N., de Noblet-Ducoudré, N., Ogée, J., Polcher, J., Friedlingstein, P., Ciais, P., Sitch, S., and Prentice, I. C.: A dynamic global vegetation model for studies of the coupled atmosphere-biosphere system, Global Biogeochem. Cy., 19, GB1015, <ext-link xlink:href="https://doi.org/10.1029/2003GB002199" ext-link-type="DOI">10.1029/2003GB002199</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx67"><label>Lamberti et al.(2004)Lamberti, Martin, Plastino, and Rosso</label><mixed-citation> Lamberti, P. W., Martin, M., Plastino, A., and Rosso, O.: Intensive entropic non-triviality measure, Physica A, 334, 119–131, 2004.</mixed-citation></ref>
      <ref id="bib1.bibx68"><label>LeCun et al.(2015)LeCun, Bengio, and Hinton</label><mixed-citation> LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521, 436–444, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx69"><label>Lellep et al.(2020)Lellep, Prexl, Linkmann, and Eckhardt</label><mixed-citation>Lellep, M., Prexl, J., Linkmann, M., and Eckhardt, B.: Using machine learning to predict extreme events in the Hénon map, Chaos: An Interdisciplinary Journal of Nonlinear Science, 30, 013113, <ext-link xlink:href="https://doi.org/10.1063/1.5121844" ext-link-type="DOI">10.1063/1.5121844</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx70"><label>Le Quéré et al.(2009)Le Quéré, Raupach, Canadell, Marland, Bopp, Ciais, Conway, Doney, Feely, Foster et al.</label><mixed-citation> Le Quéré, C., Raupach, M. R., Canadell, J. G., Marland, G., Bopp, L., Ciais, P., Conway, T. J., Doney, S. C., Feely, R. A., Foster, P., Friedlingstein, P., Gurney, K., Houghton, R. A., House, J. I., Huntingford, C., Levy, P. E., Lomas, M. R., Majkut, J., Metzl, N., Ometto, J. P., Peters, G. P., Prentice, I. C., Randerson, J. T., Running, S. W., Sarmiento, J. L., Schuster, U., Sitch, S., Takahashi, T., Viovy, N., van der Werf, G. R., and Woodward, F. I.: Trends in the sources and sinks of carbon dioxide, Nat. Geosci., 2, 831–836, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx71"><label>Le Quéré et al.(2018)Le Quéré, Andrew, Friedlingstein, Sitch, Hauck, Pongratz, Pickers, Korsbakken, Peters, Canadell et al.</label><mixed-citation>Le Quéré, C., Andrew, R. M., Friedlingstein, P., Sitch, S., Hauck, J., Pongratz, J., Pickers, P. A., Korsbakken, J. I., Peters, G. P., Canadell, J. G., Arneth, A., Arora, V. K., Barbero, L., Bastos, A., Bopp, L., Chevallier, F., Chini, L. P., Ciais, P., Doney, S. C., Gkritzalis, T., Goll, D. S., Harris, I., Haverd, V., Hoffman, F. M., Hoppema, M., Houghton, R. A., Hurtt, G., Ilyina, T., Jain, A. K., Johannessen, T., Jones, C. D., Kato, E., Keeling, R. F., Goldewijk, K. K., Landschützer, P., Lefèvre, N., Lienert, S., Liu, Z., Lombardozzi, D., Metzl, N., Munro, D. R., Nabel, J. E. M. S., Nakaoka, S., Neill, C., Olsen, A., Ono, T., Patra, P., Peregon, A., Peters, W., Peylin, P., Pfeil, B., Pierrot, D., Poulter, B., Rehder, G., Resplandy, L., Robertson, E., Rocher, M., Rödenbeck, C., Schuster, U., Schwinger, J., Séférian, R., Skjelvan, I., Steinhoff, T., Sutton, A., Tans, P. P., Tian, H., Tilbrook, B., Tubiello, F. N., van der Laan-Luijkx, I. T., van der Werf, G. R., Viovy, N., Walker, A. P., Wiltshire, A. J., Wright, R., Zaehle, S., and Zheng, B.: Global Carbon Budget 2018, Earth Syst. Sci. Data, 10, 2141–2194, <ext-link xlink:href="https://doi.org/10.5194/essd-10-2141-2018" ext-link-type="DOI">10.5194/essd-10-2141-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx72"><label>Liang et al.(2003)Liang, Shao, Kong, and Lin</label><mixed-citation> Liang, E., Shao, X., Kong, Z., and Lin, J.: The extreme drought in the 1920s and its effect on tree growth deduced from tree ring analysis: a case study in North China, Ann. For. Sci., 60, 145–152, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx73"><label>Linscheid et al.(2020)Linscheid, Estupinan-Suarez, Brenning, Carvalhais, Cremer, Gans, Rammig, Reichstein, Sierra, and Mahecha</label><mixed-citation>Linscheid, N., Estupinan-Suarez, L. M., Brenning, A., Carvalhais, N., Cremer, F., Gans, F., Rammig, A., Reichstein, M., Sierra, C. A., and Mahecha, M. D.: Towards a global understanding of vegetation–climate dynamics at multiple timescales, Biogeosciences, 17, 945–962, <ext-link xlink:href="https://doi.org/10.5194/bg-17-945-2020" ext-link-type="DOI">10.5194/bg-17-945-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx74"><label>Liu et al.(2013)Liu, Liu, and Yin</label><mixed-citation>Liu, G., Liu, H., and Yin, Y.: Global patterns of NDVI-indicated vegetation extremes and their sensitivity to climate extremes, Environ. Res. Lett., 8, 025009, <ext-link xlink:href="https://doi.org/10.1088/1748-9326/8/2/025009" ext-link-type="DOI">10.1088/1748-9326/8/2/025009</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx75"><label>Lopez-Ruiz et al.(1995)Lopez-Ruiz, Mancini, and Calbet</label><mixed-citation> Lopez-Ruiz, R., Mancini, H. L., and Calbet, X.: A statistical measure of complexity, Phys. Lett. A, 209, 321–326, 1995.</mixed-citation></ref>
      <ref id="bib1.bibx76"><label>Lotsch et al.(2005)Lotsch, Friedl, Anderson, and Tucker</label><mixed-citation>Lotsch, A., Friedl, M. A., Anderson, B. T., and Tucker, C. J.: Response of terrestrial ecosystems to recent Northern Hemispheric drought, Geophys. Res. Lett., 32, L06705, <ext-link xlink:href="https://doi.org/10.1029/2004GL022043" ext-link-type="DOI">10.1029/2004GL022043</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx77"><label>Lu et al.(2017)Lu, Pathak, Hunt, Girvan, Brockett, and Ott</label><mixed-citation>Lu, Z., Pathak, J., Hunt, B., Girvan, M., Brockett, R., and Ott, E.: Reservoir observers: Model-free inference of unmeasured variables in chaotic systems, Chaos: An Interdisciplinary Journal of Nonlinear Science, 27, 041102, <ext-link xlink:href="https://doi.org/10.1063/1.4979665" ext-link-type="DOI">10.1063/1.4979665</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx78"><label>Maass et al.(2002)Maass, Natschläger, and Markram</label><mixed-citation> Maass, W., Natschläger, T., and Markram, H.: Real-time computing without stable states: A new framework for neural computation based on perturbations, Neural Comput., 14, 2531–2560, 2002.</mixed-citation></ref>
      <ref id="bib1.bibx79"><label>Mahecha et al.(2010)Mahecha, Fürst, Gobron, and Lange</label><mixed-citation> Mahecha, M. D., Fürst, L. M., Gobron, N., and Lange, H.: Identifying multiple spatiotemporal patterns: A refined view on terrestrial photosynthetic activity, Pattern Recogn. Lett., 31, 2309–2317, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx80"><label>Mahecha et al.(2020)Mahecha, Gans, Brandt, Christiansen, Cornell, Fomferra, Kraemer, Peters, Bodesheim, Camps-Valls et al.</label><mixed-citation>Mahecha, M. D., Gans, F., Brandt, G., Christiansen, R., Cornell, S. E., Fomferra, N., Kraemer, G., Peters, J., Bodesheim, P., Camps-Valls, G., Donges, J. F., Dorigo, W., Estupinan-Suarez, L. M., Gutierrez-Velez, V. H., Gutwin, M., Jung, M., Londoño, M. C., Miralles, D. G., Papastefanou, P., and Reichstein, M.: Earth system data cubes unravel global multivariate dynamics, Earth Syst. Dynam., 11, 201–234, <ext-link xlink:href="https://doi.org/10.5194/esd-11-201-2020" ext-link-type="DOI">10.5194/esd-11-201-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx81"><label>Mahecha et al.(2022)Mahecha, Bastos, Bohn, Eisenhauer, Feilhauer, Hartmann, Hickler, Kalesse-Los, Migliavacca, Otto et al.</label><mixed-citation> Mahecha, M. D., Bastos, A., Bohn, F. J., Eisenhauer, N., Feilhauer, H., Hartmann, H., Hickler, T., Kalesse-Los, H., Migliavacca, M., Otto, F. E. L., Peng, J., Quaas, J., Tegen, I., Weigelt, A., Wendisch, M., and Wirth, C.: Biodiversity loss and climate extremes – study the feedbacks, Nature, 612, 30–32, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx82"><label>Mahecha et al.(2024)Mahecha, Bastos, Bohn, Eisenhauer, Feilhauer, Hickler, Kalesse-Los, Migliavacca, Otto, Peng, Sippel, Tegen, Weigelt, Wendisch, Wirth, Al-Halbouni, Deneke, Doktor, Dunker, Duveiller, Ehrlich, Foth, García-García, Guerra, Guimarães-Steinicke, Hartmann, Henning, Herrmann, Hu, Ji, Kattenborn, Kolleck, Kretschmer, Kühn, Luttkus, Maahn, Mönks, Mora, Pöhlker, Reichstein, Rüger, Sánchez-Parra, Schäfer, Stratmann, Tesche, Wehner, Wieneke, Winkler, Wolf, Zaehle, Zscheischler, and Quaas</label><mixed-citation>Mahecha, M. D., Bastos, A., Bohn, F. J., Eisenhauer, N., Feilhauer, H., Hickler, T., Kalesse-Los, H., Migliavacca, M., Otto, F. E. L., Peng, J., Sippel, S., Tegen, I., Weigelt, A., Wendisch, M., Wirth, C., Al-Halbouni, D., Deneke, H., Doktor, D., Dunker, S., Duveiller, G., Ehrlich, A., Foth, A., García-García, A., Guerra, C. A., Guimarães-Steinicke, C., Hartmann, H., Henning, S., Herrmann, H., Hu, P., Ji, C., Kattenborn, T., Kolleck, N., Kretschmer, M., Kühn, I., Luttkus, M. L., Maahn, M., Mönks, M., Mora, K., Pöhlker, M., Reichstein, M., Rüger, N., Sánchez-Parra, B., Schäfer, M., Stratmann, F., Tesche, M., Wehner, B., Wieneke, S., Winkler, A. J., Wolf, S., Zaehle, S., Zscheischler, J., and Quaas, J.: Biodiversity and Climate Extremes: Known Interactions and Research Gaps, Earth's Future, 12, e2023EF003963, <ext-link xlink:href="https://doi.org/10.1029/2023EF003963" ext-link-type="DOI">10.1029/2023EF003963</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx83"><label>Makridakis(1993)</label><mixed-citation> Makridakis, S.: Accuracy measures: theoretical and practical concerns, Int. J. Forecasting, 9, 527–529, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx84"><label>Martin et al.(2006)Martin, Plastino, and Rosso</label><mixed-citation> Martin, M., Plastino, A., and Rosso, O.: Generalized statistical complexity measures: Geometrical and analytical properties, Physica A, 369, 439–462, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx85"><label>Martinuzzi(2023)</label><mixed-citation>Martinuzzi, F.: rnn-ndvi, GitHub [code], <uri>https://github.com/MartinuzziFrancesco/rnn-ndvi</uri> (last access: 4 November 2024), 2023.</mixed-citation></ref>
      <ref id="bib1.bibx86"><label>Martinuzzi et al.(2022)Martinuzzi, Rackauckas, Abdelrehim, Mahecha, and Mora</label><mixed-citation>Martinuzzi, F., Rackauckas, C., Abdelrehim, A., Mahecha, M. D., and Mora, K.: ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models, J. Mach. Learn. Res. [code], <uri>http://jmlr.org/papers/v23/22-0611.html</uri> (last access: 4 November 2024), 2022.</mixed-citation></ref>
      <ref id="bib1.bibx87"><label>Meiyazhagan et al.(2021)Meiyazhagan, Sudharsan, and Senthilvelan</label><mixed-citation>Meiyazhagan, J., Sudharsan, S., and Senthilvelan, M.: Model-free prediction of emergence of extreme events in a parametrically driven nonlinear dynamical system by deep learning, Eur. Phys. J. B, 94, 156, <ext-link xlink:href="https://doi.org/10.1140/epjb/s10051-021-00167-y" ext-link-type="DOI">10.1140/epjb/s10051-021-00167-y</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx88"><label>Merchant et al.(2017)Merchant, Paul, Popp, Ablain, Bontemps, Defourny, Hollmann, Lavergne, Laeng, De Leeuw et al.</label><mixed-citation>Merchant, C. J., Paul, F., Popp, T., Ablain, M., Bontemps, S., Defourny, P., Hollmann, R., Lavergne, T., Laeng, A., de Leeuw, G., Mittaz, J., Poulsen, C., Povey, A. C., Reuter, M., Sathyendranath, S., Sandven, S., Sofieva, V. F., and Wagner, W.: Uncertainty information in climate data records from Earth observation, Earth Syst. Sci. Data, 9, 511–527, <ext-link xlink:href="https://doi.org/10.5194/essd-9-511-2017" ext-link-type="DOI">10.5194/essd-9-511-2017</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx89"><label>Montero et al.(2023)Montero, Aybar, Mahecha, Martinuzzi, Söchting, and Wieneke</label><mixed-citation>Montero, D., Aybar, C., Mahecha, M. D., Martinuzzi, F., Söchting, M., and Wieneke, S.: A standardized catalogue of spectral indices to advance the use of remote sensing in Earth system research, Scientific Data, 10, 197, <ext-link xlink:href="https://doi.org/10.1038/s41597-023-02096-0" ext-link-type="DOI">10.1038/s41597-023-02096-0</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx90"><label>Mora et al.(2024)Mora, Rzanny, Wäldchen, Feilhauer, Kattenborn, Kraemer, Mäder, Svidzinska, Wolf, and Mahecha</label><mixed-citation>Mora, K., Rzanny, M., Wäldchen, J., Feilhauer, H., Kattenborn, T., Kraemer, G., Mäder, P., Svidzinska, D., Wolf, S., and Mahecha, M. D.: Macrophenological dynamics from citizen science plant occurrence data, Methods in Ecol. Evol., 15, 1422–1437, <ext-link xlink:href="https://doi.org/10.1111/2041-210X.14365" ext-link-type="DOI">10.1111/2041-210X.14365</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx91"><label>Nelson et al.(2024)Nelson, Walther, Gans, Kraft, Weber, Novick, Buchmann, Migliavacca, Wohlfahrt, Šigut, Ibrom, Papale, Göckede, Duveiller, Knohl, Hörtnagl, Scott, Zhang, Hamdi, Reichstein, Aranda-Barranco, Ardö, Op de Beeck, Billdesbach, Bowling, Bracho, Brümmer, Camps-Valls, Chen, Cleverly, Desai, Dong, El-Madany, Euskirchen, Feigenwinter, Galvagno, Gerosa, Gielen, Goded, Goslee, Gough, Heinesch, Ichii, Jackowicz-Korczynski, Klosterhalfen, Knox, Kobayashi, Kohonen, Korkiakoski, Mammarella, Mana, Marzuoli, Matamala, Metzger, Montagnani, Nicolini, O'Halloran, Ourcival, Peichl, Pendall, Ruiz Reverter, Roland, Sabbatini, Sachs, Schmidt, Schwalm, Shekhar, Silberstein, Silveira, Spano, Tagesson, Tramontana, Trotta, Turco, Vesala, Vincke, Vitale, Vivoni, Wang, Woodgate, Yepez, Zhang, Zona, and Jung</label><mixed-citation>Nelson, J. A., Walther, S., Gans, F., Kraft, B., Weber, U., Novick, K., Buchmann, N., Migliavacca, M., Wohlfahrt, G., Šigut, L., Ibrom, A., Papale, D., Göckede, M., Duveiller, G., Knohl, A., Hörtnagl, L., Scott, R. L., Zhang, W., Hamdi, Z. M., Reichstein, M., Aranda-Barranco, S., Ardö, J., Op de Beeck, M., Billdesbach, D., Bowling, D., Bracho, R., Brümmer, C., Camps-Valls, G., Chen, S., Cleverly, J. R., Desai, A., Dong, G., El-Madany, T. S., Euskirchen, E. S., Feigenwinter, I., Galvagno, M., Gerosa, G., Gielen, B., Goded, I., Goslee, S., Gough, C. M., Heinesch, B., Ichii, K., Jackowicz-Korczynski, M. A., Klosterhalfen, A., Knox, S., Kobayashi, H., Kohonen, K.-M., Korkiakoski, M., Mammarella, I., Mana, G., Marzuoli, R., Matamala, R., Metzger, S., Montagnani, L., Nicolini, G., O'Halloran, T., Ourcival, J.-M., Peichl, M., Pendall, E., Ruiz Reverter, B., Roland, M., Sabbatini, S., Sachs, T., Schmidt, M., Schwalm, C. R., Shekhar, A., Silberstein, R., Silveira, M. L., Spano, D., Tagesson, T., Tramontana, G., Trotta, C., Turco, F., Vesala, T., Vincke, C., Vitale, D., Vivoni, E. R., Wang, Y., Woodgate, W., Yepez, E. A., Zhang, J., Zona, D., and Jung, M.: X-BASE: the first terrestrial carbon and water flux products from an extended data-driven scaling framework, FLUXCOM-X, EGUsphere [preprint], <ext-link xlink:href="https://doi.org/10.5194/egusphere-2024-165" ext-link-type="DOI">10.5194/egusphere-2024-165</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx92"><label>Pammi et al.(2023)Pammi, Clerc, Coulibaly, and Barbay</label><mixed-citation>Pammi, V. A., Clerc, M. G., Coulibaly, S., and Barbay, S.: Extreme Events Prediction from Nonlocal Partial Information in a Spatiotemporally Chaotic Microcavity Laser, Phys. Rev. Lett., 130, 223801, <ext-link xlink:href="https://doi.org/10.1103/PhysRevLett.130.223801" ext-link-type="DOI">10.1103/PhysRevLett.130.223801</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx93"><label>Papagiannopoulou et al.(2017)Papagiannopoulou, Miralles, Decubber, Demuzere, Verhoest, Dorigo, and Waegeman</label><mixed-citation>Papagiannopoulou, C., Miralles, D. G., Decubber, S., Demuzere, M., Verhoest, N. E. C., Dorigo, W. A., and Waegeman, W.: A non-linear Granger-causality framework to investigate climate–vegetation dynamics, Geosci. Model Dev., 10, 1945–1960, <ext-link xlink:href="https://doi.org/10.5194/gmd-10-1945-2017" ext-link-type="DOI">10.5194/gmd-10-1945-2017</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx94"><label>Papale and Valentini(2003)</label><mixed-citation> Papale, D. and Valentini, R.: A new assessment of European forests carbon exchanges by eddy fluxes and artificial neural network spatialization, Glob. Change Biol., 9, 525–535, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx95"><label>Papale et al.(2015)Papale, Black, Carvalhais, Cescatti, Chen, Jung, Kiely, Lasslop, Mahecha, Margolis et al.</label><mixed-citation> Papale, D., Black, T. A., Carvalhais, N., Cescatti, A., Chen, J., Jung, M., Kiely, G., Lasslop, G., Mahecha, M. D., Margolis, H., Merbold, L., Montagnani, L., Moors, E., Olesen, J. E., Reichstein, M., Tramontana, G., van Gorsel, E., Wohlfahrt, G., and Ráduly, B.: Effect of spatial sampling from European flux towers for estimating carbon and water fluxes with artificial neural networks, J. Geophys. Res.-Biogeo., 120, 1941–1957, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx96"><label>Pappas et al.(2017)Pappas, Mahecha, Frank, Babst, and Koutsoyiannis</label><mixed-citation> Pappas, C., Mahecha, M. D., Frank, D. C., Babst, F., and Koutsoyiannis, D.: Ecosystem functioning is enveloped by hydrometeorological variability, Nature Ecology &amp; Evolution, 1, 1263–1270, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx97"><label>Pascanu et al.(2013)Pascanu, Mikolov, and Bengio</label><mixed-citation>Pascanu, R., Mikolov, T., and Bengio, Y.: On the difficulty of training recurrent neural networks, in: International conference on machine learning, Atlanta, GA, USA, 16–21 June 2013, PMLR, 1310–1318, <uri>https://proceedings.mlr.press/v28/pascanu13.pdf</uri> (last access: 4 November 2024), 2013.</mixed-citation></ref>
      <ref id="bib1.bibx98"><label>Paszke et al.(2019)Paszke, Gross, Massa, Lerer, Bradbury, Chanan, Killeen, Lin, Gimelshein, Antiga et al.</label><mixed-citation>Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S.: Pytorch: An imperative style, high-performance deep learning library, GitHub [code], <uri>https://github.com/pytorch/pytorch</uri>  (last access: 4 November 2024), 2019.</mixed-citation></ref>
      <ref id="bib1.bibx99"><label>Pathak et al.(2017)Pathak, Lu, Hunt, Girvan, and Ott</label><mixed-citation>Pathak, J., Lu, Z., Hunt, B. R., Girvan, M., and Ott, E.: Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data, Chaos: An Interdisciplinary Journal of Nonlinear Science, 27, 121102, <ext-link xlink:href="https://doi.org/10.1063/1.5010300" ext-link-type="DOI">10.1063/1.5010300</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx100"><label>Pathak et al.(2018)Pathak, Hunt, Girvan, Lu, and Ott</label><mixed-citation>Pathak, J., Hunt, B., Girvan, M., Lu, Z., and Ott, E.: Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach, Phys. Rev. Lett., 120, 024102, <ext-link xlink:href="https://doi.org/10.1103/PhysRevLett.120.024102" ext-link-type="DOI">10.1103/PhysRevLett.120.024102</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx101"><label>Peng et al.(2022)Peng, Li, Shen, He, Chen, Peng, and Yuan</label><mixed-citation>Peng, Q., Li, X., Shen, R., He, B., Chen, X., Peng, Y., and Yuan, W.: How well can we predict vegetation growth through the coming growing season?, Science of Remote Sensing, 5, 100043, <ext-link xlink:href="https://doi.org/10.1016/j.srs.2022.100043" ext-link-type="DOI">10.1016/j.srs.2022.100043</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx102"><label>Powers(2020)</label><mixed-citation>Powers, D. M.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2010.16061" ext-link-type="DOI">10.48550/arXiv.2010.16061</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx103"><label>Pyragas and Pyragas(2020)</label><mixed-citation>Pyragas, V. and Pyragas, K.: Using reservoir computer to predict and prevent extreme events, Phys. Lett. A, 384, 126591, <ext-link xlink:href="https://doi.org/10.1016/j.physleta.2020.126591" ext-link-type="DOI">10.1016/j.physleta.2020.126591</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx104"><label>Ray et al.(2021)Ray, Chakraborty, and Ghosh</label><mixed-citation>Ray, A., Chakraborty, T., and Ghosh, D.: Optimized ensemble deep learning framework for scalable forecasting of dynamics containing extreme events, Chaos: An Interdisciplinary Journal of Nonlinear Science, 31, 111105, <ext-link xlink:href="https://doi.org/10.1063/5.0074213" ext-link-type="DOI">10.1063/5.0074213</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx105"><label>Reichstein et al.(2013)Reichstein, Bahn, Ciais, Frank, Mahecha, Seneviratne, Zscheischler, Beer, Buchmann, Frank et al.</label><mixed-citation> Reichstein, M., Bahn, M., Ciais, P., Frank, D., Mahecha, M. D., Seneviratne, S. I., Zscheischler, J., Beer, C., Buchmann, N., Frank, D. C., Papale, D., Rammig, A., Smith, P., Thonicke, K., van der Velde, M., Vicca, S., Walz, A., and Wattenbach, M.: Climate extremes and the carbon cycle, Nature, 500, 287–295, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx106"><label>Reichstein et al.(2018)Reichstein, Besnard, Carvalhais, Gans, Jung, Kraft, and Mahecha</label><mixed-citation>Reichstein, M., Besnard, S., Carvalhais, N., Gans, F., Jung, M., Kraft, B., and Mahecha, M.: Modelling landsurface time-series with recurrent neural nets, in: IGARSS 2018–2018 IEEE International Geoscience and Remote Sensing Symposium,  Valencia, Spain, 22–27 July 2018, 7640–7643, <ext-link xlink:href="https://doi.org/10.1109/IGARSS.2018.8518007" ext-link-type="DOI">10.1109/IGARSS.2018.8518007</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx107"><label>Reichstein et al.(2019)Reichstein, Camps-Valls, Stevens, Jung, Denzler, and Carvalhais</label><mixed-citation> Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., and Carvalhais, N.: Deep learning and process understanding for data-driven Earth system science, Nature, 566, 195–204, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx108"><label>Requena-Mesa et al.(2021)Requena-Mesa, Benson, Reichstein, Runge, and Denzler</label><mixed-citation>Requena-Mesa, C., Benson, V., Reichstein, M., Runge, J., and Denzler, J.: EarthNet2021: A large-scale dataset and challenge for Earth surface forecasting as a guided video prediction task, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021, 1132–1142, <uri>https://doi.ieeecomputersociety.org/10.1109/CVPRW53098.2021.00124</uri> (last access: 4 November 2024), 2021.</mixed-citation></ref>
      <ref id="bib1.bibx109"><label>Robin et al.(2022)Robin, Requena-Mesa, Benson, Alonso, Poehls, Carvalhais, and Reichstein</label><mixed-citation>Robin, C., Requena-Mesa, C., Benson, V., Alonso, L., Poehls, J., Carvalhais, N., and Reichstein, M.: Learning to forecast vegetation greenness at fine resolution over Africa with ConvLSTMs, arXiv [preprint],  <ext-link xlink:href="https://doi.org/10.48550/arXiv.2210.13648" ext-link-type="DOI">10.48550/arXiv.2210.13648</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx110"><label>Rosso and Masoller(2009)</label><mixed-citation> Rosso, O. A. and Masoller, C.: Detecting and quantifying temporal correlations in stochastic resonance via information theory measures, Eur. Phys. J. B, 69, 37–43, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx111"><label>Rosso et al.(2007)Rosso, Larrondo, Martin, Plastino, and Fuentes</label><mixed-citation>Rosso, O. A., Larrondo, H., Martin, M. T., Plastino, A., and Fuentes, M. A.: Distinguishing noise from chaos, Phys. Rev. Lett., 99, 154102, <ext-link xlink:href="https://doi.org/10.1103/PhysRevLett.99.154102" ext-link-type="DOI">10.1103/PhysRevLett.99.154102</ext-link>,  2007.</mixed-citation></ref>
      <ref id="bib1.bibx112"><label>Rouse et al.(1974)Rouse, Haas, Schell, Deering et al.</label><mixed-citation>Rouse, J. W., Haas, R. H., Schell, J. A., and Deering, D. W.: Monitoring vegetation systems in the Great Plains with ERTS, NASA Spec. Publ., 351, 309, <uri>https://ntrs.nasa.gov/citations/19740022614</uri> (last access: 4 November 2024), 1974.</mixed-citation></ref>
      <ref id="bib1.bibx113"><label>Rudy and Sapsis(2023)</label><mixed-citation>Rudy, S. H. and Sapsis, T. P.: Output-weighted and relative entropy loss functions for deep learning precursors of extreme events, Physica D, 443, 133570, <ext-link xlink:href="https://doi.org/10.1016/j.physd.2022.133570" ext-link-type="DOI">10.1016/j.physd.2022.133570</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx114"><label>Rumelhart et al.(1986)Rumelhart, Hinton, and Williams</label><mixed-citation> Rumelhart, D. E., Hinton, G. E., and Williams, R. J.: Learning representations by back-propagating errors, Nature, 323, 533–536, 1986.</mixed-citation></ref>
      <ref id="bib1.bibx115"><label>Savitzky and Golay(1964)</label><mixed-citation> Savitzky, A. and Golay, M. J.: Smoothing and differentiation of data by simplified least squares procedures, Anal. Chem., 36, 1627–1639, 1964.</mixed-citation></ref>
      <ref id="bib1.bibx116"><label>Scheepens et al.(2023)Scheepens, Schicker, Hlaváčková-Schindler, and Plant</label><mixed-citation>Scheepens, D. R., Schicker, I., Hlaváčková-Schindler, K., and Plant, C.: Adapting a deep convolutional RNN model with imbalanced regression loss for improved spatio-temporal forecasting of extreme wind speed events in the short to medium range, Geosci. Model Dev., 16, 251–270, <ext-link xlink:href="https://doi.org/10.5194/gmd-16-251-2023" ext-link-type="DOI">10.5194/gmd-16-251-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx117"><label>Schulz et al.(2024)Schulz, Vollmer, Mahecha, and Mora</label><mixed-citation>Schulz, L., Vollmer, J., Mahecha, M. D., and Mora, K.: Nonlinear spectral analysis extracts harmonics from land-atmosphere fluxes, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2407.19237" ext-link-type="DOI">10.48550/arXiv.2407.19237</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx118"><label>Seneviratne et al.(2021)Seneviratne, Zhang, Adnan, Badi, Dereczynski, Di Luca, Ghosh, Iskandar, Kossin, Lewis, Otto, Pinto, Satoh, Vicente-Serrano, Wehner, and Zhou</label><mixed-citation>Seneviratne, S., Zhang, X., Adnan, M., Badi, W., Dereczynski, C., Di Luca, A., Ghosh, S., Iskandar, I., Kossin, J., Lewis, S., Otto, F., Pinto, I., Satoh, M., Vicente-Serrano, S., Wehner, M., and Zhou, B.: Weather and Climate Extreme Events in a Changing Climate, in: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M., Huang, M., Leitzell, K., Lonnoy, E., Matthews, J., Maycock, T., Waterfield, T., Yelekçi, O., Yu, R., and Zhou, B., Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA,  1513–1766, <ext-link xlink:href="https://doi.org/10.1017/9781009157896.013" ext-link-type="DOI">10.1017/9781009157896.013</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx119"><label>Shi et al.(2015)Shi, Chen, Wang, Yeung, Wong, and Woo</label><mixed-citation>Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K., and Woo, W.-C.: Convolutional LSTM network: A machine learning approach for precipitation nowcasting, Adv. Neur. In., 28, 802–810, <uri>https://dl.acm.org/doi/10.5555/2969239.2969329</uri> (last access: 4 November 2024), 2015.</mixed-citation></ref>
      <ref id="bib1.bibx120"><label>Sippel et al.(2016)Sippel, Lange, Mahecha, Hauhs, Bodesheim, Kaminski, Gans, and Rosso</label><mixed-citation>Sippel, S., Lange, H., Mahecha, M. D., Hauhs, M., Bodesheim, P., Kaminski, T., Gans, F., and Rosso, O. A.: Diagnosing the dynamics of observed and simulated ecosystem gross primary productivity with time causal information theory quantifiers, PLoS ONE, 11, e0164960, <ext-link xlink:href="https://doi.org/10.1371/journal.pone.0164960" ext-link-type="DOI">10.1371/journal.pone.0164960</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx121"><label>Sippel et al.(2018)Sippel, Reichstein, Ma, Mahecha, Lange, Flach, and Frank</label><mixed-citation> Sippel, S., Reichstein, M., Ma, X., Mahecha, M. D., Lange, H., Flach, M., and Frank, D.: Drought, heat, and the carbon cycle: a review, Current Climate Change Reports, 4, 266–286, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx122"><label>Sitch et al.(2003)Sitch, Smith, Prentice, Arneth, Bondeau, Cramer, Kaplan, Levis, Lucht, Sykes et al.</label><mixed-citation> Sitch, S., Smith, B., Prentice, I. C., Arneth, A., Bondeau, A., Cramer, W., Kaplan, J. O., Levis, S., Lucht, W., Sykes, M. T., Thonicke, K., and Venevsky, S.: Evaluation of ecosystem dynamics, plant geography and terrestrial carbon cycling in the LPJ dynamic global vegetation model, Glob. Change Biol., 9, 161–185, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx123"><label>Slayback et al.(2003)Slayback, Pinzon, Los, and Tucker</label><mixed-citation>Slayback, D. A., Pinzon, J. E., Los, S. O., and Tucker, C. J.: Northern hemisphere photosynthetic trends 1982–99, Glob. Change Biol., 9, 1–15, <ext-link xlink:href="https://doi.org/10.1046/j.1365-2486.2003.00507.x" ext-link-type="DOI">10.1046/j.1365-2486.2003.00507.x</ext-link>, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx124"><label>Srinivasan et al.(2019)Srinivasan, Guastoni, Azizpour, Schlatter, and Vinuesa</label><mixed-citation>Srinivasan, P. A., Guastoni, L., Azizpour, H., Schlatter, P., and Vinuesa, R.: Predictions of turbulent shear flows using deep neural networks, Physical Review Fluids, 4, 054603, <ext-link xlink:href="https://doi.org/10.1103/PhysRevFluids.4.054603" ext-link-type="DOI">10.1103/PhysRevFluids.4.054603</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx125"><label>Srivastava et al.(2014)Srivastava, Hinton, Krizhevsky, Sutskever, and Salakhutdinov</label><mixed-citation> Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting, J. Mach. Learn. Res., 15, 1929–1958, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx126"><label>Steinier et al.(1972)Steinier, Termonia, and Deltour</label><mixed-citation> Steinier, J., Termonia, Y., and Deltour, J.: Smoothing and differentiation of data by simplified least square procedure, Anal. Chem., 44, 1906–1909, 1972.</mixed-citation></ref>
      <ref id="bib1.bibx127"><label>Sun et al.(2022)Sun, Song, Cai, Zhang, Hong, and Li</label><mixed-citation> Sun, C., Song, M., Cai, D., Zhang, B., Hong, S., and Li, H.: A systematic review of echo state networks from design to application, IEEE Transactions on Artificial Intelligence, 5, 23–37, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx128"><label>Sutskever(2013)</label><mixed-citation> Sutskever, I.: Training recurrent neural networks, PhD thesis, University of Toronto, Toronto, ON, Canada, ISBN 9780499220660, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx129"><label>Teskey et al.(2015)Teskey, Wertin, Bauweraerts, Ameye, McGuire, and Steppe</label><mixed-citation> Teskey, R., Wertin, T., Bauweraerts, I., Ameye, M., McGuire, M. A., and Steppe, K.: Responses of tree species to heat waves and extreme heat events, Plant Cell Environ., 38, 1699–1712, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx130"><label>Tietz et al.(2017)Tietz, Fan, Nouri, Bossan, and skorch Developers</label><mixed-citation>Tietz, M., Fan, T. J., Nouri, D., Bossan, B., and skorch Developers: skorch: A scikit-learn compatible neural network library that wraps PyTorch, Skorch [code], <uri>https://skorch.readthedocs.io/en/stable/</uri> (last access: 25 October 2024), 2017.</mixed-citation></ref>
      <ref id="bib1.bibx131"><label>Van Mantgem et al.(2009)Van Mantgem, Stephenson, Byrne, Daniels, Franklin, Fulé, Harmon, Larson, Smith, Taylor et al.</label><mixed-citation> Van Mantgem, P. J., Stephenson, N. L., Byrne, J. C., Daniels, L. D., Franklin, J. F., Fulé, P. Z., Harmon, M. E., Larson, A. J., Smith, J. M., Taylor, A. H., and Veblen, T. T.: Widespread increase of tree mortality rates in the western United States, Science, 323, 521–524, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx132"><label>Verstraeten et al.(2007)Verstraeten, Schrauwen, d'Haene, and Stroobandt</label><mixed-citation> Verstraeten, D., Schrauwen, B., d'Haene, M., and Stroobandt, D.: An experimental unification of reservoir computing methods, Neural Networks, 20, 391–403, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx133"><label>Vlachas et al.(2020)Vlachas, Pathak, Hunt, Sapsis, Girvan, Ott, and Koumoutsakos</label><mixed-citation>Vlachas, P., Pathak, J., Hunt, B., Sapsis, T., Girvan, M., Ott, E., and Koumoutsakos, P.: Backpropagation algorithms and Reservoir Computing in Recurrent Neural Networks for the forecasting of complex spatiotemporal dynamics, Neural Networks, 126, 191–217, <ext-link xlink:href="https://doi.org/10.1016/j.neunet.2020.02.016" ext-link-type="DOI">10.1016/j.neunet.2020.02.016</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx134"><label>von Buttlar et al.(2018)von Buttlar, Zscheischler, Rammig, Sippel, Reichstein, Knohl, Jung, Menzer, Arain, Buchmann et al.</label><mixed-citation>von Buttlar, J., Zscheischler, J., Rammig, A., Sippel, S., Reichstein, M., Knohl, A., Jung, M., Menzer, O., Arain, M. A., Buchmann, N., Cescatti, A., Gianelle, D., Kiely, G., Law, B. E., Magliulo, V., Margolis, H., McCaughey, H., Merbold, L., Migliavacca, M., Montagnani, L., Oechel, W., Pavelka, M., Peichl, M., Rambal, S., Raschi, A., Scott, R. L., Vaccari, F. P., van Gorsel, E., Varlagin, A., Wohlfahrt, G., and Mahecha, M. D.: Impacts of droughts and extreme-temperature events on gross primary production and ecosystem respiration: a systematic assessment across ecosystems and climate zones, Biogeosciences, 15, 1293–1318, <ext-link xlink:href="https://doi.org/10.5194/bg-15-1293-2018" ext-link-type="DOI">10.5194/bg-15-1293-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx135"><label>Walleshauser and Bollt(2022)</label><mixed-citation>Walleshauser, B. and Bollt, E.: Predicting sea surface temperatures with coupled reservoir computers, Nonlin. Processes Geophys., 29, 255–264, <ext-link xlink:href="https://doi.org/10.5194/npg-29-255-2022" ext-link-type="DOI">10.5194/npg-29-255-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx136"><label>Walther et al.(2022)Walther, Besnard, Nelson, El-Madany, Migliavacca, Weber, Carvalhais, Ermida, Brümmer, Schrader et al.</label><mixed-citation>Walther, S., Besnard, S., Nelson, J. A., El-Madany, T. S., Migliavacca, M., Weber, U., Carvalhais, N., Ermida, S. L., Brümmer, C., Schrader, F., Prokushkin, A. S., Panov, A. V., and Jung, M.: Technical note: A view from space on global flux towers by MODIS and Landsat: the FluxnetEO data set, Biogeosciences, 19, 2805–2840, <ext-link xlink:href="https://doi.org/10.5194/bg-19-2805-2022" ext-link-type="DOI">10.5194/bg-19-2805-2022</ext-link>, 2022 (data available at: <uri>https://meta.icos-cp.eu/collections/tEAkpU6UduMMONrFyym5-tUW</uri>, last access: 25 October 2024).</mixed-citation></ref>
      <ref id="bib1.bibx137"><label>Werbos(1988)</label><mixed-citation> Werbos, P. J.: Generalization of backpropagation with application to a recurrent gas market model, Neural Networks, 1, 339–356, 1988.</mixed-citation></ref>
      <ref id="bib1.bibx138"><label>Werbos(1990)</label><mixed-citation> Werbos, P. J.: Backpropagation through time: what it does and how to do it, P. IEEE, 78, 1550–1560, 1990.</mixed-citation></ref>
      <ref id="bib1.bibx139"><label>Williams and Zipser(1995)</label><mixed-citation>Williams, R. J. and Zipser, D.: Gradient-based learning algorithms for recurrent networks and their computational complexity, in: Backpropagation: Theory, Architectures, and Applications, 433, 17, <uri>https://gwern.net/doc/ai/nn/rnn/1995-williams.pdf</uri> (last access: 4 November 2024), 1995.</mixed-citation></ref>
      <ref id="bib1.bibx140"><label>Yengoh et al.(2015)Yengoh, Dent, Olsson, Tengberg, and Tucker III</label><mixed-citation> Yengoh, G. T., Dent, D., Olsson, L., Tengberg, A. E., and Tucker III, C. J.: Use of the Normalized Difference Vegetation Index (NDVI) to assess Land degradation at multiple scales: current status, future trends, and practical considerations, Springer, ISBN 978-3-319-24112-8, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx141"><label>Zeng et al.(2002)Zeng, Hales, and Neelin</label><mixed-citation>Zeng, N., Hales, K., and Neelin, J. D.: Nonlinear dynamics in a coupled vegetation–atmosphere system and implications for desert–forest gradient, J. Climate, 15, 3474–3487, 2002.  </mixed-citation></ref>
      <ref id="bib1.bibx142"><label>Zeng et al.(2022)Zeng, Hao, Huete, Dechant, Berry, Chen, Joiner, Frankenberg, Bond-Lamberty, Ryu et al.</label><mixed-citation> Zeng, Y., Hao, D., Huete, A., Dechant, B., Berry, J., Chen, J. M., Joiner, J., Frankenberg, C., Bond-Lamberty, B., Ryu, Y., Xiao, J., Asrar, G. R., and Chen, M.: Optical vegetation indices for monitoring terrestrial ecosystems globally, Nature Reviews Earth &amp; Environment, 3, 477–493, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx143"><label>Zhang et al.(2017)Zhang, Wang, Dong, Zhong, and Sun</label><mixed-citation> Zhang, Q., Wang, H., Dong, J., Zhong, G., and Sun, X.: Prediction of sea surface temperature using long short-term memory, IEEE Geosci. Remote S., 14, 1745–1749, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx144"><label>Zhang et al.(2021)Zhang, Xin, and Li</label><mixed-citation>Zhang, Z., Xin, Q., and Li, W.: Machine Learning-Based Modeling of Vegetation Leaf Area Index and Gross Primary Productivity Across North America and Comparison With a Process-Based Model, J. Adv. Model. Earth Sy., 13, e2021MS002802, <ext-link xlink:href="https://doi.org/10.1029/2021MS002802" ext-link-type="DOI">10.1029/2021MS002802</ext-link>, 2021.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Learning extreme vegetation response to climate drivers with recurrent neural networks</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Aicher et al.(2020)Aicher, Foti, and Fox</label><mixed-citation>
      
Aicher, C., Foti, N. J., and Fox, E. B.: Adaptively truncating backpropagation
through time to control gradient bias, in: Uncertainty in Artificial
Intelligence, PMLR, 799–808, <a href="http://proceedings.mlr.press/v115/aicher20a/aicher20a.pdf" target="_blank"/> (last access: 4 November 2024),  2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Aubinet et al.(2012)Aubinet, Vesala, and Papale</label><mixed-citation>
      
Aubinet, M., Vesala, T., and Papale, D.: Eddy covariance: a practical guide to
measurement and data analysis, Springer Science &amp; Business Media, <a href="https://doi.org/10.1007/978-94-007-2351-1" target="_blank">https://doi.org/10.1007/978-94-007-2351-1</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Bandt and Pompe(2002)</label><mixed-citation>
      
Bandt, C. and Pompe, B.: Permutation entropy: a natural complexity measure for
time series, Phys. Rev. Lett., 88, 174102, <a href="https://doi.org/10.1103/PhysRevLett.88.174102" target="_blank">https://doi.org/10.1103/PhysRevLett.88.174102</a>, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Barnes et al.(2009)Barnes, Schultz, Gruntfest, Hayden, and
Benight</label><mixed-citation>
      
Barnes, L. R., Schultz, D. M., Gruntfest, E. C., Hayden, M. H., and Benight,
C. C.: Corrigendum: False alarm rate or false alarm ratio?, Weather
Forecast., 24, 1452–1454, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Bastos et al.(2023)Bastos, Sippel, Frank, Mahecha, Zaehle,
Zscheischler, and Reichstein</label><mixed-citation>
      
Bastos, A., Sippel, S., Frank, D., Mahecha, M. D., Zaehle, S., Zscheischler,
J., and Reichstein, M.: A joint framework for studying compound ecoclimatic
events, Nat. Rev. Earth Environ., 4, 333–350, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Bengio et al.(1994)Bengio, Simard, and Frasconi</label><mixed-citation>
      
Bengio, Y., Simard, P., and Frasconi, P.: Learning long-term dependencies with
gradient descent is difficult, IEEE T. Neural Network., 5,
157–166, 1994.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Benson et al.(2024)Benson, Robin, Requena-Mesa, Alonso, Carvalhais,
Cortés, Gao, Linscheid, Weynants, and
Reichstein</label><mixed-citation>
      
Benson, V., Robin, C., Requena-Mesa, C., Alonso, L., Carvalhais, N.,
Cortés, J., Gao, Z., Linscheid, N., Weynants, M., and Reichstein, M.:
Multi-modal learning for geospatial vegetation forecasting, in: Conference on
Computer Vision and Pattern Recognition, 16–22 June 2024, Seattle,
Washington, United States, <a href="https://doi.org/10.1109/CVPR52733.2024.02625" target="_blank">https://doi.org/10.1109/CVPR52733.2024.02625</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Besnard et al.(2019)Besnard, Carvalhais, Arain, Black, Brede,
Buchmann, Chen, Clevers, Dutrieux, Gans et al.</label><mixed-citation>
      
Besnard, S., Carvalhais, N., Arain, M. A., Black, A., Brede, B., Buchmann, N.,
Chen, J., Clevers, J. G. W., Dutrieux, L. P., Gans, F., Herold, M., Jung, M., Kosugi, Y., Knohl, A., Law, B. E., Paul-Limoges, E., Lohila, A., Merbold, L., Roupsard, O., Valentini, R., Wolf, S., Zhang, X., and Reichstein, M.: Memory
effects of climate and vegetation affecting net ecosystem CO<sub>2</sub> fluxes in
global forests, PLoS ONE, 14, e0211510, <a href="https://doi.org/10.1371/journal.pone.0211510" target="_blank">https://doi.org/10.1371/journal.pone.0211510</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Bezanson et al.(2017)Bezanson, Edelman, Karpinski, and
Shah</label><mixed-citation>
      
Bezanson, J., Edelman, A., Karpinski, S., and Shah, V. B.: Julia: A fresh
approach to numerical computing, SIAM review, 59, 65–98, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Bonavita et al.(2023)Bonavita, Schneider, Arcucci, Chantry, Chrust,
Geer, Le Saux, and Vitolo</label><mixed-citation>
      
Bonavita, M., Schneider, R., Arcucci, R., Chantry, M., Chrust, M., Geer, A.,
Le Saux, B., and Vitolo, C.: 2022 ECMWF-ESA workshop report: current status,
progress and opportunities in machine learning for Earth System observation
and prediction, npj Climate and Atmospheric Science, 6, 87, <a href="https://doi.org/10.1038/s41612-023-00387-2" target="_blank">https://doi.org/10.1038/s41612-023-00387-2</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Bottou(2012)</label><mixed-citation>
      
Bottou, L.: Stochastic gradient descent tricks, in: Neural Networks: Tricks of
the Trade: Second Edition,  Springer, 421–436, ISBN 978-3-642-35288-1, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Buchhorn et al.(2020)Buchhorn, Smets, Bertels, Roo, Lesiv,
Tsendbazar, Herold, and Fritz</label><mixed-citation>
      
Buchhorn, M., Smets, B., Bertels, L., Roo, B. D., Lesiv, M., Tsendbazar, N.-E.,
Herold, M., and Fritz, S.: Copernicus Global Land Service: Land Cover 100m:
collection 3: epoch 2019: Globe, Zenodo [data set], <a href="https://doi.org/10.5281/zenodo.3939050" target="_blank">https://doi.org/10.5281/zenodo.3939050</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Camps-Valls et al.(2021a)Camps-Valls, Campos-Taberner,
Moreno-Martínez, Walther, Duveiller, Cescatti, Mahecha,
Muñoz-Marí, García-Haro, Guanter et al.</label><mixed-citation>
      
Camps-Valls, G., Campos-Taberner, M., Moreno-Martínez, Á., Walther,
S., Duveiller, G., Cescatti, A., Mahecha, M. D., Muñoz-Marí, J.,
García-Haro, F. J., Guanter, L., Jung, M., Gamon, J. A., Reichstein, M.,
and Running, S. W.: A unified vegetation index for
quantifying the terrestrial biosphere, Science Advances, 7, eabc7447, <a href="https://doi.org/10.1126/sciadv.abc7447" target="_blank">https://doi.org/10.1126/sciadv.abc7447</a>,
2021a.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Camps-Valls et al.(2021b)Camps-Valls, Tuia, Zhu, and
Reichstein</label><mixed-citation>
      
Camps-Valls, G., Tuia, D., Zhu, X. X., and Reichstein, M.: Deep learning for
the Earth Sciences: A comprehensive approach to remote sensing, climate
science and geosciences, John Wiley &amp; Sons, ISBN 1119646146, 2021b.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Canadell et al.(2021)Canadell, Monteiro, Costa, Cotrim da Cunha, Cox,
Eliseev, Henson, Ishii, Jaccard, Koven, Lohila, Patra, Piao, Rogelj,
Syampungani, Zaehle, and Zickfeld</label><mixed-citation>
      
Canadell, J., Monteiro, P., Costa, M., Cotrim da Cunha, L., Cox, P., Eliseev,
A., Henson, S., Ishii, M., Jaccard, S., Koven, C., Lohila, A., Patra, P.,
Piao, S., Rogelj, J., Syampungani, S., Zaehle, S., and Zickfeld, K.: Global
Carbon and other Biogeochemical Cycles and Feedbacks, in: Climate Change
2021: The Physical Science Basis. Contribution of Working Group I to the
Sixth Assessment Report of the Intergovernmental Panel on Climate Change,
edited by Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S., Péan, C.,
Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M., Huang, M., Leitzell,
K., Lonnoy, E., Matthews, J., Maycock, T., Waterfield, T., Yelekçi, O., Yu,
R., and Zhou, B., pp. 673–816, Cambridge University Press, Cambridge, United
Kingdom and New York, NY, USA, <a href="https://doi.org/10.1017/9781009157896.007" target="_blank">https://doi.org/10.1017/9781009157896.007</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>Cerina et al.(2020)Cerina, Santambrogio, Franco, Gallicchio, and
Micheli</label><mixed-citation>
      
Cerina, L., Santambrogio, M. D., Franco, G., Gallicchio, C., and Micheli, A.:
Efficient embedded machine learning applications using echo state networks,
in: 2020 Design, Automation &amp; Test in Europe Conference &amp; Exhibition
(DATE), IEEE, 1299–1302, <a href="https://doi.org/10.23919/DATE48585.2020.9116334" target="_blank">https://doi.org/10.23919/DATE48585.2020.9116334</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>Cerqueira et al.(2020)Cerqueira, Torgo, and
Mozetič</label><mixed-citation>
      
Cerqueira, V., Torgo, L., and Mozetič, I.: Evaluating time series
forecasting models: An empirical study on performance estimation methods,
Machine Learning, 109, 1997–2028, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>Chattopadhyay et al.(2020)Chattopadhyay, Hassanzadeh, and
Subramanian</label><mixed-citation>
      
Chattopadhyay, A., Hassanzadeh, P., and Subramanian, D.: Data-driven predictions of a multiscale Lorenz 96 chaotic system using machine-learning methods: reservoir computing, artificial neural network, and long short-term memory network, Nonlin. Processes Geophys., 27, 373–389, <a href="https://doi.org/10.5194/npg-27-373-2020" target="_blank">https://doi.org/10.5194/npg-27-373-2020</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Chen et al.(2004)Chen, Jönsson, Tamura, Gu, Matsushita, and
Eklundh</label><mixed-citation>
      
Chen, J., Jönsson, P., Tamura, M., Gu, Z., Matsushita, B., and Eklundh, L.:
A simple method for reconstructing a high-quality NDVI time-series data set
based on the Savitzky–Golay filter, Remote Sens. Environ., 91,
332–344, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Chen et al.(2021)Chen, Liu, Xu, Wu, Liang, Cao, and
Chen</label><mixed-citation>
      
Chen, Z., Liu, H., Xu, C., Wu, X., Liang, B., Cao, J., and Chen, D.: Modeling
vegetation greenness and its climate sensitivity with deep-learning
technology, Ecol. Evol., 11, 7335–7345, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Cho et al.(2014)Cho, Van Merriënboer, Gulcehre, Bahdanau,
Bougares, Schwenk, and Bengio</label><mixed-citation>
      
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F.,
Schwenk, H., and Bengio, Y.: Learning phrase representations using RNN
encoder-decoder for statistical machine translation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, October 2014, 1724–1734,
<a href="https://doi.org/10.3115/v1/D14-1179" target="_blank">https://doi.org/10.3115/v1/D14-1179</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Chung et al.(2014)Chung, Gulcehre, Cho, and
Bengio</label><mixed-citation>
      
Chung, J., Gulcehre, C., Cho, K., and Bengio, Y.: Empirical evaluation of gated
recurrent neural networks on sequence modeling, arXiv [preprint],
<a href="https://doi.org/10.48550/arXiv.1412.3555" target="_blank">https://doi.org/10.48550/arXiv.1412.3555</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Cornes et al.(2018)Cornes, van der Schrier, van den Besselaar, and
Jones</label><mixed-citation>
      
Cornes, R. C., van der Schrier, G., van den Besselaar, E. J., and Jones, P. D.:
An ensemble version of the E-OBS temperature and precipitation data sets,
J. Geophys. Res.-Atmos., 123, 9391–9409, 2018 (data available at: <a href="https://www.ecad.eu/download/ensembles/download.php" target="_blank">https://www.ecad.eu/download/ensembles/download.php</a>, last access: 1 June 2023).

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Danisch and Krumbiegel(2021)</label><mixed-citation>
      
Danisch, S. and Krumbiegel, J.: Makie.jl: Flexible high-performance data
visualization for Julia, Journal of Open Source Software, 6, 3349,
<a href="https://doi.org/10.21105/joss.03349" target="_blank">https://doi.org/10.21105/joss.03349</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Datseris(2018)</label><mixed-citation>
      
Datseris, G.: DynamicalSystems.jl: A Julia software library for chaos and
nonlinear dynamics, Journal of Open Source Software, 3, 598,
<a href="https://doi.org/10.21105/joss.00598" target="_blank">https://doi.org/10.21105/joss.00598</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>De Jong et al.(2011)De Jong, de Bruin, de Wit, Schaepman, and
Dent</label><mixed-citation>
      
De Jong, R., de Bruin, S., de Wit, A., Schaepman, M. E., and Dent, D. L.:
Analysis of monotonic greening and browning trends from global NDVI
time-series, Remote Sens. Environ., 115, 692–702, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>De Jong et al.(2012)De Jong, Verbesselt, Schaepman, and
De Bruin</label><mixed-citation>
      
De Jong, R., Verbesselt, J., Schaepman, M. E., and De Bruin, S.: Trend changes
in global greening and browning: contribution of short-term trends to
longer-term change, Glob. Change Biol., 18, 642–655, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>De Keersmaecker et al.(2016)De Keersmaecker, van Rooijen, Lhermitte,
Tits, Schaminée, Coppin, Honnay, and
Somers</label><mixed-citation>
      
De Keersmaecker, W., van Rooijen, N., Lhermitte, S., Tits, L., Schaminée, J.,
Coppin, P., Honnay, O., and Somers, B.: Species-rich semi-natural grasslands
have a higher resistance but a lower resilience than intensively managed
agricultural grasslands in response to climate anomalies, J. Appl.
Ecol., 53, 430–439, <a href="https://doi.org/10.1111/1365-2664.12595" target="_blank">https://doi.org/10.1111/1365-2664.12595</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Diaconu et al.(2022)Diaconu, Saha, Günnemann, and
Zhu</label><mixed-citation>
      
Diaconu, C.-A., Saha, S., Günnemann, S., and Zhu, X. X.: Understanding the
Role of Weather Data for Earth Surface Forecasting using a ConvLSTM-based
Model, in: Proceedings of the IEEE/CVF Conference on Computer Vision and
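Pattern Recognition, New Orleans, Louisiana, United States, 12–24 June 2022, 1362–1371, <a href="https://openaccess.thecvf.com/content/CVPR2022W/EarthVision/html/Diaconu_Understanding_the_Role_of_Weather_Data_for_Earth_Surface_Forecasting_CVPRW_2022_paper.html" target="_blank">https://openaccess.thecvf.com/content/CVPR2022W/EarthVision/html/Diaconu_Understanding_the_Role_of_Weather_Data_for_Earth_Surface_Forecasting_CVPRW_2022_paper.html</a> (last access: 4 November 2024), 2022.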

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Dijkstra(2013)</label><mixed-citation>
      
Dijkstra, H. A.: Nonlinear climate dynamics, Cambridge University Press,  ISBN 9781139034135, <a href="https://doi.org/10.1017/CBO9781139034135" target="_blank">https://doi.org/10.1017/CBO9781139034135</a>, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>Dobbertin et al.(2007)Dobbertin, Wermelinger, Bigler, Bürgi,
Carron, Forster, Gimmi, Rigling et al.</label><mixed-citation>
      
Dobbertin, M., Wermelinger, B., Bigler, C., Bürgi, M., Carron, M., Forster,
B., Gimmi, U., and Rigling, A.: Linking increasing drought stress to
Scots pine mortality and bark beetle infestations, Sci. World
J., 7, 231–239, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>Eggleton(2012)</label><mixed-citation>
      
Eggleton, T.: A short introduction to climate change, Cambridge University
Press,  ISBN 9781139524353, <a href="https://doi.org/10.1017/CBO9781139524353" target="_blank">https://doi.org/10.1017/CBO9781139524353</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Elman(1990)</label><mixed-citation>
      
Elman, J. L.: Finding structure in time, Cognitive Sci., 14, 179–211, 1990.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Fang et al.(2022)Fang, Kifer, Lawson, Feng, and Shen</label><mixed-citation>
      
Fang, K., Kifer, D., Lawson, K., Feng, D., and Shen, C.: The data synergy
effects of time-series deep learning models in hydrology, Water Resour.
Res., 58, e2021WR029583, <a href="https://doi.org/10.1029/2021WR029583" target="_blank">https://doi.org/10.1029/2021WR029583</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Farazmand and Sapsis(2019)</label><mixed-citation>
      
Farazmand, M. and Sapsis, T. P.: Extreme events: Mechanisms and prediction,
Appl. Mech. Rev., 71, 050801, <a href="https://doi.org/10.1115/1.4042065" target="_blank">https://doi.org/10.1115/1.4042065</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Fensham and Holman(1999)</label><mixed-citation>
      
Fensham, R. and Holman, J.: Temporal and spatial patterns in drought-related
tree dieback in Australian savanna, J. Appl. Ecol., 36,
1035–1050, 1999.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Foley et al.(1998)Foley, Levis, Prentice, Pollard, and
Thompson</label><mixed-citation>
      
Foley, J. A., Levis, S., Prentice, I. C., Pollard, D., and Thompson, S. L.:
Coupling dynamic models of climate and vegetation, Glob. Change Biol., 4,
561–579, 1998.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Frame et al.(2022)Frame, Kratzert, Klotz, Gauch, Shalev, Gilon,
Qualls, Gupta, and Nearing</label><mixed-citation>
      
Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, <a href="https://doi.org/10.5194/hess-26-3377-2022" target="_blank">https://doi.org/10.5194/hess-26-3377-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Freire et al.(2024)Freire, Srivallapanondh, Spinnler, Napoli, Costa,
Prilepsky, and Turitsyn</label><mixed-citation>
      
Freire, P., Srivallapanondh, S., Spinnler, B., Napoli, A., Costa, N.,
Prilepsky, J. E., and Turitsyn, S. K.: Computational Complexity Optimization
of Neural Network-Based Equalizers in Digital Signal Processing: A
Comprehensive Approach, J. Lightwave Technol., 42, 4177–4201, <a href="https://doi.org/10.1109/JLT.2024.3386886" target="_blank">https://doi.org/10.1109/JLT.2024.3386886</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>Freire et al.(2021)Freire, Osadchuk, Spinnler, Napoli, Schairer,
Costa, Prilepsky, and Turitsyn</label><mixed-citation>
      
Freire, P. J., Osadchuk, Y., Spinnler, B., Napoli, A., Schairer, W., Costa, N.,
Prilepsky, J. E., and Turitsyn, S. K.: Performance versus complexity study of
neural network equalizers in coherent optical systems, J. Lightwave
Technol., 39, 6085–6096, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Friedlingstein et al.(2006)Friedlingstein, Cox, Betts, Bopp, von
Bloh, Brovkin, Cadule, Doney, Eby, Fung et al.</label><mixed-citation>
      
Friedlingstein, P., Cox, P., Betts, R., Bopp, L., von Bloh, W., Brovkin, V.,
Cadule, P., Doney, S., Eby, M., Fung, I., Bala, G., John, J., Jones, C., Joos, F., Kato, T., Kawamiya, M., Knorr, W., Lindsay, K., Matthews, H. D., Raddatz, T., Rayner, P., Reick, C.,
Roeckner, E., Schnitzler, K.-G., Schnur, R., Strassmann, K., Weaver, A. J., Yoshikawa, C., and Zeng, N.: Climate–carbon cycle
feedback analysis: results from the C4MIP model intercomparison, J.
Climate, 19, 3337–3353, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Funahashi and Nakamura(1993)</label><mixed-citation>
      
Funahashi, K.-i. and Nakamura, Y.: Approximation of dynamical systems by
continuous time recurrent neural networks, Neural Networks, 6, 801–806,
1993.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Gauch et al.(2021)Gauch, Kratzert, Klotz, Nearing, Lin, and
Hochreiter</label><mixed-citation>
      
Gauch, M., Kratzert, F., Klotz, D., Nearing, G., Lin, J., and Hochreiter, S.: Rainfall–runoff prediction at multiple timescales with a single Long Short-Term Memory network, Hydrol. Earth Syst. Sci., 25, 2045–2062, <a href="https://doi.org/10.5194/hess-25-2045-2021" target="_blank">https://doi.org/10.5194/hess-25-2045-2021</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Gauthier et al.(2022)Gauthier, Fischer, and
Röhm</label><mixed-citation>
      
Gauthier, D. J., Fischer, I., and Röhm, A.: Learning unseen coexisting
attractors, Chaos: An Interdisciplinary Journal of Nonlinear Science, 32,
113107,
<a href="https://doi.org/10.1063/5.0116784" target="_blank">https://doi.org/10.1063/5.0116784</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Gers and Schmidhuber(2000)</label><mixed-citation>
      
Gers, F. and Schmidhuber, J.: Recurrent nets that time and count, in:
Proceedings of the IEEE-INNS-ENNS International Joint Conference on
Neural Networks. IJCNN 2000. Neural Computing: New Challenges and
Perspectives for the New Millennium, IEEE, Como, Italy, 27 July 2000, <a href="https://doi.org/10.1109/ijcnn.2000.861302" target="_blank">https://doi.org/10.1109/ijcnn.2000.861302</a>,
2000.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Ghazoul et al.(2015)Ghazoul, Burivalova, Garcia-Ulloa, and
King</label><mixed-citation>
      
Ghazoul, J., Burivalova, Z., Garcia-Ulloa, J., and King, L. A.: Conceptualizing
forest degradation, Trends Ecol. Evol., 30, 622–632, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>Glorot and Bengio(2010)</label><mixed-citation>
      
Glorot, X. and Bengio, Y.: Understanding the difficulty of training deep
feedforward neural networks, in: Proceedings of the Thirteenth International
Conference on Artificial Intelligence and Statistics, edited by Teh, Y. W.
and Titterington, M., vol. 9 of Proceedings of Machine Learning
Research,  PMLR, Chia Laguna Resort, Sardinia, Italy, 13–15 May 2010, 249–256,
<a href="https://proceedings.mlr.press/v9/glorot10a.html" target="_blank">https://proceedings.mlr.press/v9/glorot10a.html</a> (last access: 25 October 2024), 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Grant(1984)</label><mixed-citation>
      
Grant, P. J.: Drought effect on high-altitude forests, Ruahine range, North
Island, New Zealand, New Zeal. J. Bot., 22, 15–27, 1984.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>Grosse et al.(2002)Grosse, Bernaola-Galván, Carpena,
Román-Roldán, Oliver, and Stanley</label><mixed-citation>
      
Grosse, I., Bernaola-Galván, P., Carpena, P., Román-Roldán, R.,
Oliver, J., and Stanley, H. E.: Analysis of symbolic sequences using the
Jensen-Shannon divergence, Phys. Rev. E, 65, 041905, <a href="https://doi.org/10.1103/PhysRevE.65.041905" target="_blank">https://doi.org/10.1103/PhysRevE.65.041905</a>, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>Haaga and Datseris(2023)</label><mixed-citation>
      
Haaga, K. A. and Datseris, G.: JuliaDynamics/ComplexityMeasures.jl: v2.7.2, Zenodo [software],
<a href="https://doi.org/10.5281/zenodo.7862020" target="_blank">https://doi.org/10.5281/zenodo.7862020</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>Hart et al.(2020)Hart, Hook, and Dawes</label><mixed-citation>
      
Hart, A., Hook, J., and Dawes, J.: Embedding and approximation theorems for
echo state networks, Neural Networks, 128, 234–247, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>Hilker et al.(2014)Hilker, Lyapustin, Tucker, Hall, Myneni, Wang, Bi,
Mendes de Moura, and Sellers</label><mixed-citation>
      
Hilker, T., Lyapustin, A. I., Tucker, C. J., Hall, F. G., Myneni, R. B., Wang,
Y., Bi, J., Mendes de Moura, Y., and Sellers, P. J.: Vegetation dynamics and
rainfall sensitivity of the Amazon, P. Natl. Acad.
Sci. USA, 111, 16041–16046, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>Hochreiter(1998)</label><mixed-citation>
      
Hochreiter, S.: The vanishing gradient problem during learning recurrent neural
nets and problem solutions, Int. J. Uncertain. Fuzz., 6, 107–116, 1998.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>Hochreiter and Schmidhuber(1997)</label><mixed-citation>
      
Hochreiter, S. and Schmidhuber, J.: Long short-term memory, Neural Comput.,
9, 1735–1780, 1997.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>Hogan and Mason(2011)</label><mixed-citation>
      
Hogan, R. J. and Mason, I. B.: Deterministic Forecasts of Binary Events, in: Forecast Verification: A Practitioner's Guide in Atmospheric Science, 2nd edn.,
<a href="https://doi.org/10.1002/9781119960003.ch3" target="_blank">https://doi.org/10.1002/9781119960003.ch3</a>, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>Hyndman and Koehler(2006)</label><mixed-citation>
      
Hyndman, R. J. and Koehler, A. B.: Another look at measures of forecast
accuracy, International J. Forecasting, 22, 679–688,
<a href="https://doi.org/10.1016/j.ijforecast.2006.03.001" target="_blank">https://doi.org/10.1016/j.ijforecast.2006.03.001</a>, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>Jaeger(2001)</label><mixed-citation>
      
Jaeger, H.: The “echo state” approach to analysing and training recurrent
neural networks-with an erratum note, Bonn, Germany: German National Research
Center for Information Technology GMD Technical Report, 148, 13, <a href="https://www.ai.rug.nl/minds/uploads/EchoStatesTechRep.pdf" target="_blank">https://www.ai.rug.nl/minds/uploads/EchoStatesTechRep.pdf</a> (last access:
25 October 2024), 2001.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>Johnstone et al.(2016)Johnstone, Allen, Franklin, Frelich, Harvey,
Higuera, Mack, Meentemeyer, Metz, Perry et al.</label><mixed-citation>
      
Johnstone, J. F., Allen, C. D., Franklin, J. F., Frelich, L. E., Harvey, B. J.,
Higuera, P. E., Mack, M. C., Meentemeyer, R. K., Metz, M. R., Perry, G. L. W., Schoennagel, T., and Turner, M. G.: Changing disturbance regimes, ecological memory, and forest
resilience, Front. Ecol. Environ., 14, 369–378, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>Jung et al.(2020)Jung, Schwalm, Migliavacca, Walther, Camps-Valls,
Koirala, Anthoni, Besnard, Bodesheim, Carvalhais, Chevallier, Gans, Goll,
Haverd, Köhler, Ichii, Jain, Liu, Lombardozzi, Nabel, Nelson, O'Sullivan,
Pallandt, Papale, Peters, Pongratz, Rödenbeck, Sitch, Tramontana, Walker,
Weber, and Reichstein</label><mixed-citation>
      
Jung, M., Schwalm, C., Migliavacca, M., Walther, S., Camps-Valls, G., Koirala, S., Anthoni, P., Besnard, S., Bodesheim, P., Carvalhais, N., Chevallier, F., Gans, F., Goll, D. S., Haverd, V., Köhler, P., Ichii, K., Jain, A. K., Liu, J., Lombardozzi, D., Nabel, J. E. M. S., Nelson, J. A., O'Sullivan, M., Pallandt, M., Papale, D., Peters, W., Pongratz, J., Rödenbeck, C., Sitch, S., Tramontana, G., Walker, A., Weber, U., and Reichstein, M.: Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach, Biogeosciences, 17, 1343–1365, <a href="https://doi.org/10.5194/bg-17-1343-2020" target="_blank">https://doi.org/10.5194/bg-17-1343-2020</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>Kang et al.(2016)Kang, Di, Deng, Yu, and Xu</label><mixed-citation>
      
Kang, L., Di, L., Deng, M., Yu, E., and Xu, Y.: Forecasting vegetation index
based on vegetation-meteorological factor interactions with artificial neural
network, in: 2016 Fifth International Conference on Agro-Geoinformatics
(Agro-Geoinformatics), Tianjin, China, 18–20 July 2016,  1–6, <a href="https://doi.org/10.1109/Agro-Geoinformatics.2016.7577673" target="_blank">https://doi.org/10.1109/Agro-Geoinformatics.2016.7577673</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>Kingma and Ba(2014)</label><mixed-citation>
      
Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv
[preprint],  <a href="https://doi.org/10.48550/arXiv.1412.6980" target="_blank">https://doi.org/10.48550/arXiv.1412.6980</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>Kladny et al.(2024)Kladny, Milanta, Mraz, Hufkens, and
Stocker</label><mixed-citation>
      
Kladny, K.-R., Milanta, M., Mraz, O., Hufkens, K., and Stocker, B. D.: Enhanced
prediction of vegetation responses to extreme drought using deep learning and
Earth observation data, Ecol. Inform., 80, 102474,
<a href="https://doi.org/10.1016/j.ecoinf.2024.102474" target="_blank">https://doi.org/10.1016/j.ecoinf.2024.102474</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>Kraft et al.(2019)Kraft, Jung, Körner, Requena Mesa, Cortés,
and Reichstein</label><mixed-citation>
      
Kraft, B., Jung, M., Körner, M., Requena Mesa, C., Cortés, J., and
Reichstein, M.: Identifying dynamic memory effects on vegetation state using
recurrent neural networks, Frontiers in Big Data, 2, 31,
<a href="https://doi.org/10.3389/fdata.2019.00031" target="_blank">https://doi.org/10.3389/fdata.2019.00031</a>,
2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>Kratzert et al.(2018)Kratzert, Klotz, Brenner, Schulz, and
Herrnegger</label><mixed-citation>
      
Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, <a href="https://doi.org/10.5194/hess-22-6005-2018" target="_blank">https://doi.org/10.5194/hess-22-6005-2018</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib65"><label>Kratzert et al.(2024)Kratzert, Gauch, Klotz, and
Nearing</label><mixed-citation>
      
Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin, Hydrol. Earth Syst. Sci., 28, 4187–4201, <a href="https://doi.org/10.5194/hess-28-4187-2024" target="_blank">https://doi.org/10.5194/hess-28-4187-2024</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib66"><label>Krinner et al.(2005)Krinner, Viovy, de Noblet-Ducoudré, Ogée,
Polcher, Friedlingstein, Ciais, Sitch, and Prentice</label><mixed-citation>
      
Krinner, G., Viovy, N., de Noblet-Ducoudré, N., Ogée, J., Polcher, J.,
Friedlingstein, P., Ciais, P., Sitch, S., and Prentice, I. C.: A dynamic
global vegetation model for studies of the coupled atmosphere-biosphere
system, Global Biogeochem. Cy., 19, GB1015, <a href="https://doi.org/10.1029/2003GB002199" target="_blank">https://doi.org/10.1029/2003GB002199</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib67"><label>Lamberti et al.(2004)Lamberti, Martin, Plastino, and
Rosso</label><mixed-citation>
      
Lamberti, P. W., Martin, M., Plastino, A., and Rosso, O.: Intensive entropic
non-triviality measure, Physica A, 334, 119–131, 2004.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib68"><label>LeCun et al.(2015)LeCun, Bengio, and Hinton</label><mixed-citation>
      
LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521, 436–444,
2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib69"><label>Lellep et al.(2020)Lellep, Prexl, Linkmann, and
Eckhardt</label><mixed-citation>
      
Lellep, M., Prexl, J., Linkmann, M., and Eckhardt, B.: Using machine learning
to predict extreme events in the Hénon map, Chaos: An Interdisciplinary
Journal of Nonlinear Science, 30, 013113, <a href="https://doi.org/10.1063/1.5121844" target="_blank">https://doi.org/10.1063/1.5121844</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib70"><label>Le Quéré et al.(2009)Le Quéré, Raupach, Canadell,
Marland, Bopp, Ciais, Conway, Doney, Feely, Foster et al.</label><mixed-citation>
      
Le Quéré, C., Raupach, M. R., Canadell, J. G., Marland, G., Bopp, L.,
Ciais, P., Conway, T. J., Doney, S. C., Feely, R. A., Foster, P., Friedlingstein, P.,
Gurney, K., Houghton, R. A., House, J. I., Huntingford, C., Levy, P. E., Lomas, M. R., Majkut, J., Metzl, N.,
Ometto, J. P., Peters, G. P., Prentice, I. C., Randerson, J. T., Running, S. W., Sarmiento, J. L., Schuster, U., Sitch, S.,
Takahashi, T., Viovy, N., van der Werf, G. R., and Woodward, F. I.:
Trends in the sources and sinks of carbon dioxide, Nat. Geosci., 2,
831–836, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib71"><label>Le Quéré et al.(2018)Le Quéré, Andrew,
Friedlingstein, Sitch, Hauck, Pongratz, Pickers, Korsbakken, Peters, Canadell
et al.</label><mixed-citation>
      
Le Quéré, C., Andrew, R. M., Friedlingstein, P., Sitch, S., Hauck, J., Pongratz, J., Pickers, P. A., Korsbakken, J. I., Peters, G. P., Canadell, J. G., Arneth, A., Arora, V. K., Barbero, L., Bastos, A., Bopp, L., Chevallier, F., Chini, L. P., Ciais, P., Doney, S. C., Gkritzalis, T., Goll, D. S., Harris, I., Haverd, V., Hoffman, F. M., Hoppema, M., Houghton, R. A., Hurtt, G., Ilyina, T., Jain, A. K., Johannessen, T., Jones, C. D., Kato, E., Keeling, R. F., Goldewijk, K. K., Landschützer, P., Lefèvre, N., Lienert, S., Liu, Z., Lombardozzi, D., Metzl, N., Munro, D. R., Nabel, J. E. M. S., Nakaoka, S., Neill, C., Olsen, A., Ono, T., Patra, P., Peregon, A., Peters, W., Peylin, P., Pfeil, B., Pierrot, D., Poulter, B., Rehder, G., Resplandy, L., Robertson, E., Rocher, M., Rödenbeck, C., Schuster, U., Schwinger, J., Séférian, R., Skjelvan, I., Steinhoff, T., Sutton, A., Tans, P. P., Tian, H., Tilbrook, B., Tubiello, F. N., van der Laan-Luijkx, I. T., van der Werf, G. R., Viovy, N., Walker, A. P., Wiltshire, A. J., Wright, R., Zaehle, S., and Zheng, B.: Global Carbon Budget 2018, Earth Syst. Sci. Data, 10, 2141–2194, <a href="https://doi.org/10.5194/essd-10-2141-2018" target="_blank">https://doi.org/10.5194/essd-10-2141-2018</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib72"><label>Liang et al.(2003)Liang, Shao, Kong, and Lin</label><mixed-citation>
      
Liang, E., Shao, X., Kong, Z., and Lin, J.: The extreme drought in the 1920s
and its effect on tree growth deduced from tree ring analysis: a case study
in North China, Ann. For. Sci., 60, 145–152, 2003.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib73"><label>Linscheid et al.(2020)Linscheid, Estupinan-Suarez, Brenning,
Carvalhais, Cremer, Gans, Rammig, Reichstein, Sierra, and
Mahecha</label><mixed-citation>
      
Linscheid, N., Estupinan-Suarez, L. M., Brenning, A., Carvalhais, N., Cremer, F., Gans, F., Rammig, A., Reichstein, M., Sierra, C. A., and Mahecha, M. D.: Towards a global understanding of vegetation–climate dynamics at multiple timescales, Biogeosciences, 17, 945–962, <a href="https://doi.org/10.5194/bg-17-945-2020" target="_blank">https://doi.org/10.5194/bg-17-945-2020</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib74"><label>Liu et al.(2013)Liu, Liu, and Yin</label><mixed-citation>
      
Liu, G., Liu, H., and Yin, Y.: Global patterns of NDVI-indicated vegetation
extremes and their sensitivity to climate extremes, Environ. Res.
Lett., 8, 025009, <a href="https://doi.org/10.1088/1748-9326/8/2/025009" target="_blank">https://doi.org/10.1088/1748-9326/8/2/025009</a>, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib75"><label>Lopez-Ruiz et al.(1995)Lopez-Ruiz, Mancini, and
Calbet</label><mixed-citation>
      
Lopez-Ruiz, R., Mancini, H. L., and Calbet, X.: A statistical measure of
complexity, Phys. Lett. A, 209, 321–326, 1995.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib76"><label>Lotsch et al.(2005)Lotsch, Friedl, Anderson, and
Tucker</label><mixed-citation>
      
Lotsch, A., Friedl, M. A., Anderson, B. T., and Tucker, C. J.: Response of
terrestrial ecosystems to recent Northern Hemispheric drought, Geophys.
Res. Lett., 32, L06705, <a href="https://doi.org/10.1029/2004GL022043" target="_blank">https://doi.org/10.1029/2004GL022043</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib77"><label>Lu et al.(2017)Lu, Pathak, Hunt, Girvan, Brockett, and
Ott</label><mixed-citation>
      
Lu, Z., Pathak, J., Hunt, B., Girvan, M., Brockett, R., and Ott, E.: Reservoir
observers: Model-free inference of unmeasured variables in chaotic systems,
Chaos: An Interdisciplinary Journal of Nonlinear Science, 27, 041102, <a href="https://doi.org/10.1063/1.4979665" target="_blank">https://doi.org/10.1063/1.4979665</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib78"><label>Maass et al.(2002)Maass, Natschläger, and
Markram</label><mixed-citation>
      
Maass, W., Natschläger, T., and Markram, H.: Real-time computing without
stable states: A new framework for neural computation based on perturbations,
Neural Comput., 14, 2531–2560, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib79"><label>Mahecha et al.(2010)Mahecha, Fürst, Gobron, and
Lange</label><mixed-citation>
      
Mahecha, M. D., Fürst, L. M., Gobron, N., and Lange, H.: Identifying
multiple spatiotemporal patterns: A refined view on terrestrial
photosynthetic activity, Pattern Recogn. Lett., 31, 2309–2317, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib80"><label>Mahecha et al.(2020)Mahecha, Gans, Brandt, Christiansen, Cornell,
Fomferra, Kraemer, Peters, Bodesheim, Camps-Valls et al.</label><mixed-citation>
      
Mahecha, M. D., Gans, F., Brandt, G., Christiansen, R., Cornell, S. E., Fomferra, N., Kraemer, G., Peters, J., Bodesheim, P., Camps-Valls, G., Donges, J. F., Dorigo, W., Estupinan-Suarez, L. M., Gutierrez-Velez, V. H., Gutwin, M., Jung, M., Londoño, M. C., Miralles, D. G., Papastefanou, P., and Reichstein, M.: Earth system data cubes unravel global multivariate dynamics, Earth Syst. Dynam., 11, 201–234, <a href="https://doi.org/10.5194/esd-11-201-2020" target="_blank">https://doi.org/10.5194/esd-11-201-2020</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib81"><label>Mahecha et al.(2022)Mahecha, Bastos, Bohn, Eisenhauer, Feilhauer,
Hartmann, Hickler, Kalesse-Los, Migliavacca, Otto
et al.</label><mixed-citation>
      
Mahecha, M. D., Bastos, A., Bohn, F. J., Eisenhauer, N., Feilhauer, H.,
Hartmann, H., Hickler, T., Kalesse-Los, H., Migliavacca, M., Otto, F. E. L.,
Peng, J.,
Quaas, J., Tegen, I., Weigelt, A., Wendisch, M., and
Wirth, C.: Biodiversity loss and climate extremes – study the feedbacks, Nature,
612, 30–32, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib82"><label>Mahecha et al.(2024)Mahecha, Bastos, Bohn, Eisenhauer, Feilhauer,
Hickler, Kalesse-Los, Migliavacca, Otto, Peng, Sippel, Tegen, Weigelt,
Wendisch, Wirth, Al-Halbouni, Deneke, Doktor, Dunker, Duveiller, Ehrlich,
Foth, García-García, Guerra, Guimarães-Steinicke, Hartmann, Henning,
Herrmann, Hu, Ji, Kattenborn, Kolleck, Kretschmer, Kühn, Luttkus, Maahn,
Mönks, Mora, Pöhlker, Reichstein, Rüger, Sánchez-Parra, Schäfer,
Stratmann, Tesche, Wehner, Wieneke, Winkler, Wolf, Zaehle, Zscheischler, and
Quaas</label><mixed-citation>
      
Mahecha, M. D., Bastos, A., Bohn, F. J., Eisenhauer, N., Feilhauer, H.,
Hickler, T., Kalesse-Los, H., Migliavacca, M., Otto, F. E. L., Peng, J.,
Sippel, S., Tegen, I., Weigelt, A., Wendisch, M., Wirth, C., Al-Halbouni, D.,
Deneke, H., Doktor, D., Dunker, S., Duveiller, G., Ehrlich, A., Foth, A.,
García-García, A., Guerra, C. A., Guimarães-Steinicke, C., Hartmann, H.,
Henning, S., Herrmann, H., Hu, P., Ji, C., Kattenborn, T., Kolleck, N.,
Kretschmer, M., Kühn, I., Luttkus, M. L., Maahn, M., Mönks, M., Mora, K.,
Pöhlker, M., Reichstein, M., Rüger, N., Sánchez-Parra, B., Schäfer, M.,
Stratmann, F., Tesche, M., Wehner, B., Wieneke, S., Winkler, A. J., Wolf, S.,
Zaehle, S., Zscheischler, J., and Quaas, J.: Biodiversity and Climate
Extremes: Known Interactions and Research Gaps, Earth's Future, 12,
e2023EF003963, <a href="https://doi.org/10.1029/2023EF003963" target="_blank">https://doi.org/10.1029/2023EF003963</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib83"><label>Makridakis(1993)</label><mixed-citation>
      
Makridakis, S.: Accuracy measures: theoretical and practical concerns,
Int. J. Forecasting, 9, 527–529, 1993.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib84"><label>Martin et al.(2006)Martin, Plastino, and
Rosso</label><mixed-citation>
      
Martin, M., Plastino, A., and Rosso, O.: Generalized statistical complexity
measures: Geometrical and analytical properties, Physica A, 369, 439–462, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib85"><label>Martinuzzi(2023)</label><mixed-citation>
      
Martinuzzi, F.: rnn-ndvi, GitHub [code], <a href="https://github.com/MartinuzziFrancesco/rnn-ndvi" target="_blank">https://github.com/MartinuzziFrancesco/rnn-ndvi</a> (last access: 4 November 2024), 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib86"><label>Martinuzzi et al.(2022)Martinuzzi, Rackauckas, Abdelrehim, Mahecha,
and Mora</label><mixed-citation>
      
Martinuzzi, F., Rackauckas, C., Abdelrehim, A., Mahecha, M. D., and Mora, K.:
ReservoirComputing.jl: An Efficient and Modular Library for Reservoir
Computing Models, J. Mach. Learn. Res. [code], <a href="http://jmlr.org/papers/v23/22-0611.html" target="_blank">http://jmlr.org/papers/v23/22-0611.html</a> (last access: 4 November 2024), 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib87"><label>Meiyazhagan et al.(2021)Meiyazhagan, Sudharsan, and
Senthilvelan</label><mixed-citation>
      
Meiyazhagan, J., Sudharsan, S., and Senthilvelan, M.: Model-free prediction of
emergence of extreme events in a parametrically driven nonlinear dynamical
system by deep learning, Eur. Phys. J. B, 94, 156, <a href="https://doi.org/10.1140/epjb/s10051-021-00167-y" target="_blank">https://doi.org/10.1140/epjb/s10051-021-00167-y</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib88"><label>Merchant et al.(2017)Merchant, Paul, Popp, Ablain, Bontemps,
Defourny, Hollmann, Lavergne, Laeng, De Leeuw
et al.</label><mixed-citation>
      
Merchant, C. J., Paul, F., Popp, T., Ablain, M., Bontemps, S., Defourny, P., Hollmann, R., Lavergne, T., Laeng, A., de Leeuw, G., Mittaz, J., Poulsen, C., Povey, A. C., Reuter, M., Sathyendranath, S., Sandven, S., Sofieva, V. F., and Wagner, W.: Uncertainty information in climate data records from Earth observation, Earth Syst. Sci. Data, 9, 511–527, <a href="https://doi.org/10.5194/essd-9-511-2017" target="_blank">https://doi.org/10.5194/essd-9-511-2017</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib89"><label>Montero et al.(2023)Montero, Aybar, Mahecha, Martinuzzi,
Söchting, and Wieneke</label><mixed-citation>
      
Montero, D., Aybar, C., Mahecha, M. D., Martinuzzi, F., Söchting, M., and
Wieneke, S.: A standardized catalogue of spectral indices to advance the use
of remote sensing in Earth system research, Scientific Data, 10, 197, <a href="https://doi.org/10.1038/s41597-023-02096-0" target="_blank">https://doi.org/10.1038/s41597-023-02096-0</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib90"><label>Mora et al.(2024)Mora, Rzanny, Wäldchen, Feilhauer, Kattenborn,
Kraemer, Mäder, Svidzinska, Wolf, and Mahecha</label><mixed-citation>
      
Mora, K., Rzanny, M., Wäldchen, J., Feilhauer, H., Kattenborn, T., Kraemer,
G., Mäder, P., Svidzinska, D., Wolf, S., and Mahecha, M. D.:
Macrophenological dynamics from citizen science plant occurrence data,
Methods Ecol. Evol., 15, 1422–1437,
<a href="https://doi.org/10.1111/2041-210X.14365" target="_blank">https://doi.org/10.1111/2041-210X.14365</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib91"><label>Nelson et al.(2024)Nelson, Walther, Gans, Kraft, Weber, Novick,
Buchmann, Migliavacca, Wohlfahrt, Šigut, Ibrom, Papale, Göckede,
Duveiller, Knohl, Hörtnagl, Scott, Zhang, Hamdi, Reichstein,
Aranda-Barranco, Ardö, Op de Beeck, Billdesbach, Bowling, Bracho,
Brümmer, Camps-Valls, Chen, Cleverly, Desai, Dong, El-Madany, Euskirchen,
Feigenwinter, Galvagno, Gerosa, Gielen, Goded, Goslee, Gough, Heinesch,
Ichii, Jackowicz-Korczynski, Klosterhalfen, Knox, Kobayashi, Kohonen,
Korkiakoski, Mammarella, Mana, Marzuoli, Matamala, Metzger, Montagnani,
Nicolini, O'Halloran, Ourcival, Peichl, Pendall, Ruiz Reverter, Roland,
Sabbatini, Sachs, Schmidt, Schwalm, Shekhar, Silberstein, Silveira, Spano,
Tagesson, Tramontana, Trotta, Turco, Vesala, Vincke, Vitale, Vivoni, Wang,
Woodgate, Yepez, Zhang, Zona, and Jung</label><mixed-citation>
      
Nelson, J. A., Walther, S., Gans, F., Kraft, B., Weber, U., Novick, K., Buchmann, N., Migliavacca, M., Wohlfahrt, G., Šigut, L., Ibrom, A., Papale, D., Göckede, M., Duveiller, G., Knohl, A., Hörtnagl, L., Scott, R. L., Zhang, W., Hamdi, Z. M., Reichstein, M., Aranda-Barranco, S., Ardö, J., Op de Beeck, M., Billdesbach, D., Bowling, D., Bracho, R., Brümmer, C., Camps-Valls, G., Chen, S., Cleverly, J. R., Desai, A., Dong, G., El-Madany, T. S., Euskirchen, E. S., Feigenwinter, I., Galvagno, M., Gerosa, G., Gielen, B., Goded, I., Goslee, S., Gough, C. M., Heinesch, B., Ichii, K., Jackowicz-Korczynski, M. A., Klosterhalfen, A., Knox, S., Kobayashi, H., Kohonen, K.-M., Korkiakoski, M., Mammarella, I., Mana, G., Marzuoli, R., Matamala, R., Metzger, S., Montagnani, L., Nicolini, G., O'Halloran, T., Ourcival, J.-M., Peichl, M., Pendall, E., Ruiz Reverter, B., Roland, M., Sabbatini, S., Sachs, T., Schmidt, M., Schwalm, C. R., Shekhar, A., Silberstein, R., Silveira, M. L., Spano, D., Tagesson, T., Tramontana, G., Trotta, C., Turco, F., Vesala, T., Vincke, C., Vitale, D., Vivoni, E. R., Wang, Y., Woodgate, W., Yepez, E. A., Zhang, J., Zona, D., and Jung, M.: X-BASE: the first terrestrial carbon and water flux products from an extended data-driven scaling framework, FLUXCOM-X, EGUsphere [preprint], <a href="https://doi.org/10.5194/egusphere-2024-165" target="_blank">https://doi.org/10.5194/egusphere-2024-165</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib92"><label>Pammi et al.(2023)Pammi, Clerc, Coulibaly, and
Barbay</label><mixed-citation>
      
Pammi, V. A., Clerc, M. G., Coulibaly, S., and Barbay, S.: Extreme Events
Prediction from Nonlocal Partial Information in a Spatiotemporally Chaotic
Microcavity Laser, Phys. Rev. Lett., 130, 223801,
<a href="https://doi.org/10.1103/PhysRevLett.130.223801" target="_blank">https://doi.org/10.1103/PhysRevLett.130.223801</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib93"><label>Papagiannopoulou et al.(2017)Papagiannopoulou, Miralles, Decubber,
Demuzere, Verhoest, Dorigo, and Waegeman</label><mixed-citation>
      
Papagiannopoulou, C., Miralles, D. G., Decubber, S., Demuzere, M., Verhoest, N. E. C., Dorigo, W. A., and Waegeman, W.: A non-linear Granger-causality framework to investigate climate–vegetation dynamics, Geosci. Model Dev., 10, 1945–1960, <a href="https://doi.org/10.5194/gmd-10-1945-2017" target="_blank">https://doi.org/10.5194/gmd-10-1945-2017</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib94"><label>Papale and Valentini(2003)</label><mixed-citation>
      
Papale, D. and Valentini, R.: A new assessment of European forests carbon
exchanges by eddy fluxes and artificial neural network spatialization, Glob.
Change Biol., 9, 525–535, 2003.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib95"><label>Papale et al.(2015)Papale, Black, Carvalhais, Cescatti, Chen, Jung,
Kiely, Lasslop, Mahecha, Margolis et al.</label><mixed-citation>
      
Papale, D., Black, T. A., Carvalhais, N., Cescatti, A., Chen, J., Jung, M.,
Kiely, G., Lasslop, G., Mahecha, M. D., Margolis, H., Merbold, L., Montagnani, L., Moors, E., Olesen, J. E., Reichstein, M., Tramontana,
G., van Gorsel, E., Wohlfahrt, G., and Ráduly, B.: Effect of
spatial sampling from European flux towers for estimating carbon and water
fluxes with artificial neural networks, J. Geophys. Res.-Biogeo., 120, 1941–1957, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib96"><label>Pappas et al.(2017)Pappas, Mahecha, Frank, Babst, and
Koutsoyiannis</label><mixed-citation>
      
Pappas, C., Mahecha, M. D., Frank, D. C., Babst, F., and Koutsoyiannis, D.:
Ecosystem functioning is enveloped by hydrometeorological variability, Nature
Ecology &amp; Evolution, 1, 1263–1270, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib97"><label>Pascanu et al.(2013)Pascanu, Mikolov, and
Bengio</label><mixed-citation>
      
Pascanu, R., Mikolov, T., and Bengio, Y.: On the difficulty of training
recurrent neural networks, in: International conference on machine learning, Atlanta, GA, USA, 16–21 June 2013, PMLR,
1310–1318, <a href="https://proceedings.mlr.press/v28/pascanu13.pdf" target="_blank">https://proceedings.mlr.press/v28/pascanu13.pdf</a> (last access: 4 November 2024), 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib98"><label>Paszke et al.(2019)Paszke, Gross, Massa, Lerer, Bradbury, Chanan,
Killeen, Lin, Gimelshein, Antiga et al.</label><mixed-citation>
      
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen,
T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E.,
DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S.,
Steiner, B., Fang, L., Bai, J., and Chintala, S.: Pytorch: An imperative
style, high-performance deep learning library, GitHub [code], <a href="https://github.com/pytorch/pytorch" target="_blank">https://github.com/pytorch/pytorch</a> (last access: 4 November 2024), 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib99"><label>Pathak et al.(2017)Pathak, Lu, Hunt, Girvan, and
Ott</label><mixed-citation>
      
Pathak, J., Lu, Z., Hunt, B. R., Girvan, M., and Ott, E.: Using machine
learning to replicate chaotic attractors and calculate Lyapunov exponents
from data, Chaos: An Interdisciplinary Journal of Nonlinear Science, 27,
121102,
<a href="https://doi.org/10.1063/1.5010300" target="_blank">https://doi.org/10.1063/1.5010300</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib100"><label>Pathak et al.(2018)Pathak, Hunt, Girvan, Lu, and
Ott</label><mixed-citation>
      
Pathak, J., Hunt, B., Girvan, M., Lu, Z., and Ott, E.: Model-free prediction of
large spatiotemporally chaotic systems from data: A reservoir computing
approach, Phys. Rev. Lett., 120, 024102,
<a href="https://doi.org/10.1103/PhysRevLett.120.024102" target="_blank">https://doi.org/10.1103/PhysRevLett.120.024102</a>,
2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib101"><label>Peng et al.(2022)Peng, Li, Shen, He, Chen, Peng, and
Yuan</label><mixed-citation>
      
Peng, Q., Li, X., Shen, R., He, B., Chen, X., Peng, Y., and Yuan, W.: How well
can we predict vegetation growth through the coming growing season?, Science
of Remote Sensing, 5, 100043, <a href="https://doi.org/10.1016/j.srs.2022.100043" target="_blank">https://doi.org/10.1016/j.srs.2022.100043</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib102"><label>Powers(2020)</label><mixed-citation>
      
Powers, D. M.: Evaluation: from precision, recall and F-measure to ROC,
informedness, markedness and correlation, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2010.16061" target="_blank">https://doi.org/10.48550/arXiv.2010.16061</a>,
2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib103"><label>Pyragas and Pyragas(2020)</label><mixed-citation>
      
Pyragas, V. and Pyragas, K.: Using reservoir computer to predict and prevent
extreme events, Phys. Lett. A, 384, 126591, <a href="https://doi.org/10.1016/j.physleta.2020.126591" target="_blank">https://doi.org/10.1016/j.physleta.2020.126591</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib104"><label>Ray et al.(2021)Ray, Chakraborty, and Ghosh</label><mixed-citation>
      
Ray, A., Chakraborty, T., and Ghosh, D.: Optimized ensemble deep learning
framework for scalable forecasting of dynamics containing extreme events,
Chaos: An Interdisciplinary Journal of Nonlinear Science, 31, 111105, <a href="https://doi.org/10.1063/5.0074213" target="_blank">https://doi.org/10.1063/5.0074213</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib105"><label>Reichstein et al.(2013)Reichstein, Bahn, Ciais, Frank, Mahecha,
Seneviratne, Zscheischler, Beer, Buchmann, Frank
et al.</label><mixed-citation>
      
Reichstein, M., Bahn, M., Ciais, P., Frank, D., Mahecha, M. D., Seneviratne,
S. I., Zscheischler, J., Beer, C., Buchmann, N., Frank, D. C., Papale, D., Rammig, A., Smith, P., Thonicke, K., van der Velde, M., Vicca, S.,
Walz, A., and Wattenbach, M.:
Climate extremes and the carbon cycle, Nature, 500, 287–295, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib106"><label>Reichstein et al.(2018)Reichstein, Besnard, Carvalhais, Gans, Jung,
Kraft, and Mahecha</label><mixed-citation>
      
Reichstein, M., Besnard, S., Carvalhais, N., Gans, F., Jung, M., Kraft, B., and
Mahecha, M.: Modelling landsurface time-series with recurrent neural nets,
in: IGARSS 2018–2018 IEEE International Geoscience and Remote Sensing
Symposium, Valencia, Spain, 22–27 July 2018, 7640–7643, <a href="https://doi.org/10.1109/IGARSS.2018.8518007" target="_blank">https://doi.org/10.1109/IGARSS.2018.8518007</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib107"><label>Reichstein et al.(2019)Reichstein, Camps-Valls, Stevens, Jung,
Denzler, and Carvalhais</label><mixed-citation>
      
Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., and
Carvalhais, N.: Deep learning and process understanding for data-driven Earth
system science, Nature, 566, 195–204, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib108"><label>Requena-Mesa et al.(2021)Requena-Mesa, Benson, Reichstein, Runge, and
Denzler</label><mixed-citation>
      
Requena-Mesa, C., Benson, V., Reichstein, M., Runge, J., and Denzler, J.:
EarthNet2021: A large-scale dataset and challenge for Earth surface
forecasting as a guided video prediction task, in: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021,
1132–1142, <a href="https://doi.ieeecomputersociety.org/10.1109/CVPRW53098.2021.00124" target="_blank">https://doi.ieeecomputersociety.org/10.1109/CVPRW53098.2021.00124</a> (last access: 4 November 2024), 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib109"><label>Robin et al.(2022)Robin, Requena-Mesa, Benson, Alonso, Poehls,
Carvalhais, and Reichstein</label><mixed-citation>
      
Robin, C., Requena-Mesa, C., Benson, V., Alonso, L., Poehls, J., Carvalhais,
N., and Reichstein, M.: Learning to forecast vegetation greenness at fine
resolution over Africa with ConvLSTMs, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2210.13648" target="_blank">https://doi.org/10.48550/arXiv.2210.13648</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib110"><label>Rosso and Masoller(2009)</label><mixed-citation>
      
Rosso, O. A. and Masoller, C.: Detecting and quantifying temporal correlations
in stochastic resonance via information theory measures, Eur.
Phys. J. B, 69, 37–43, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib111"><label>Rosso et al.(2007)Rosso, Larrondo, Martin, Plastino, and
Fuentes</label><mixed-citation>
      
Rosso, O. A., Larrondo, H., Martin, M. T., Plastino, A., and Fuentes, M. A.:
Distinguishing noise from chaos, Phys. Rev. Lett., 99, 154102, <a href="https://doi.org/10.1103/PhysRevLett.99.154102" target="_blank">https://doi.org/10.1103/PhysRevLett.99.154102</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib112"><label>Rouse et al.(1974)Rouse, Haas, Schell, Deering
et al.</label><mixed-citation>
      
Rouse, J. W., Haas, R. H., Schell, J. A., and Deering, D. W.: Monitoring
vegetation systems in the Great Plains with ERTS, NASA Spec. Publ., 351, 309, <a href="https://ntrs.nasa.gov/citations/19740022614" target="_blank">https://ntrs.nasa.gov/citations/19740022614</a> (last access: 4 November 2024),
1974.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib113"><label>Rudy and Sapsis(2023)</label><mixed-citation>
      
Rudy, S. H. and Sapsis, T. P.: Output-weighted and relative entropy loss
functions for deep learning precursors of extreme events, Physica D, 443, 133570, <a href="https://doi.org/10.1016/j.physd.2022.133570" target="_blank">https://doi.org/10.1016/j.physd.2022.133570</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib114"><label>Rumelhart et al.(1986)Rumelhart, Hinton, and
Williams</label><mixed-citation>
      
Rumelhart, D. E., Hinton, G. E., and Williams, R. J.: Learning representations
by back-propagating errors, Nature, 323, 533–536, 1986.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib115"><label>Savitzky and Golay(1964)</label><mixed-citation>
      
Savitzky, A. and Golay, M. J.: Smoothing and differentiation of data by
simplified least squares procedures, Anal. Chem., 36, 1627–1639,
1964.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib116"><label>Scheepens et al.(2023)Scheepens, Schicker,
Hlaváčková-Schindler, and Plant</label><mixed-citation>
      
Scheepens, D. R., Schicker, I., Hlaváčková-Schindler, K., and Plant, C.: Adapting a deep convolutional RNN model with imbalanced regression loss for improved spatio-temporal forecasting of extreme wind speed events in the short to medium range, Geosci. Model Dev., 16, 251–270, <a href="https://doi.org/10.5194/gmd-16-251-2023" target="_blank">https://doi.org/10.5194/gmd-16-251-2023</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib117"><label>Schulz et al.(2024)Schulz, Vollmer, Mahecha, and
Mora</label><mixed-citation>
      
Schulz, L., Vollmer, J., Mahecha, M. D., and Mora, K.: Nonlinear spectral
analysis extracts harmonics from land-atmosphere fluxes, arXiv [preprint],
<a href="https://doi.org/10.48550/arXiv.2407.19237" target="_blank">https://doi.org/10.48550/arXiv.2407.19237</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib118"><label>Seneviratne et al.(2021)Seneviratne, Zhang, Adnan, Badi, Dereczynski,
Di Luca, Ghosh, Iskandar, Kossin, Lewis, Otto, Pinto, Satoh, Vicente-Serrano,
Wehner, and Zhou</label><mixed-citation>
      
Seneviratne, S., Zhang, X., Adnan, M., Badi, W., Dereczynski, C., Di Luca, A.,
Ghosh, S., Iskandar, I., Kossin, J., Lewis, S., Otto, F., Pinto, I., Satoh,
M., Vicente-Serrano, S., Wehner, M., and Zhou, B.: Weather and Climate
Extreme Events in a Changing Climate, in: Climate Change 2021: The Physical
Science Basis. Contribution of Working Group I to the Sixth Assessment Report
of the Intergovernmental Panel on Climate Change, edited by: Masson-Delmotte,
V., Zhai, P., Pirani, A., Connors, S., Péan, C., Berger, S., Caud, N., Chen,
Y., Goldfarb, L., Gomis, M., Huang, M., Leitzell, K., Lonnoy, E., Matthews,
J., Maycock, T., Waterfield, T., Yelekçi, O., Yu, R., and Zhou, B.,
Cambridge University Press, Cambridge, United Kingdom and New
York, NY, USA, 1513–1766, <a href="https://doi.org/10.1017/9781009157896.013" target="_blank">https://doi.org/10.1017/9781009157896.013</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib119"><label>Shi et al.(2015)Shi, Chen, Wang, Yeung, Wong, and
Woo</label><mixed-citation>
      
Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K., and Woo, W.-C.:
Convolutional LSTM network: A machine learning approach for precipitation
nowcasting, Adv. Neur. In., 28, 802–810, <a href="https://dl.acm.org/doi/10.5555/2969239.2969329" target="_blank">https://dl.acm.org/doi/10.5555/2969239.2969329</a> (last access: 4 November 2024), 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib120"><label>Sippel et al.(2016)Sippel, Lange, Mahecha, Hauhs, Bodesheim,
Kaminski, Gans, and Rosso</label><mixed-citation>
      
Sippel, S., Lange, H., Mahecha, M. D., Hauhs, M., Bodesheim, P., Kaminski, T.,
Gans, F., and Rosso, O. A.: Diagnosing the dynamics of observed and simulated
ecosystem gross primary productivity with time causal information theory
quantifiers, PLoS ONE, 11, e0164960, <a href="https://doi.org/10.1371/journal.pone.0164960" target="_blank">https://doi.org/10.1371/journal.pone.0164960</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib121"><label>Sippel et al.(2018)Sippel, Reichstein, Ma, Mahecha, Lange, Flach, and
Frank</label><mixed-citation>
      
Sippel, S., Reichstein, M., Ma, X., Mahecha, M. D., Lange, H., Flach, M., and
Frank, D.: Drought, heat, and the carbon cycle: a review, Current Climate
Change Reports, 4, 266–286, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib122"><label>Sitch et al.(2003)Sitch, Smith, Prentice, Arneth, Bondeau, Cramer,
Kaplan, Levis, Lucht, Sykes et al.</label><mixed-citation>
      
Sitch, S., Smith, B., Prentice, I. C., Arneth, A., Bondeau, A., Cramer, W.,
Kaplan, J. O., Levis, S., Lucht, W., Sykes, M. T., Thonicke, K., and Venevsky, S.: Evaluation of
ecosystem dynamics, plant geography and terrestrial carbon cycling in the LPJ
dynamic global vegetation model, Glob. Change Biol., 9, 161–185, 2003.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib123"><label>Slayback et al.(2003)Slayback, Pinzon, Los, and
Tucker</label><mixed-citation>
      
Slayback, D. A., Pinzon, J. E., Los, S. O., and Tucker, C. J.: Northern
hemisphere photosynthetic trends 1982–99, Glob. Change Biol., 9, 1–15, <a href="https://doi.org/10.1046/j.1365-2486.2003.00507.x" target="_blank">https://doi.org/10.1046/j.1365-2486.2003.00507.x</a>,
2003.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib124"><label>Srinivasan et al.(2019)Srinivasan, Guastoni, Azizpour, Schlatter, and
Vinuesa</label><mixed-citation>
      
Srinivasan, P. A., Guastoni, L., Azizpour, H., Schlatter, P., and Vinuesa, R.:
Predictions of turbulent shear flows using deep neural networks, Physical
Review Fluids, 4, 054603, <a href="https://doi.org/10.1103/PhysRevFluids.4.054603" target="_blank">https://doi.org/10.1103/PhysRevFluids.4.054603</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib125"><label>Srivastava et al.(2014)Srivastava, Hinton, Krizhevsky, Sutskever, and
Salakhutdinov</label><mixed-citation>
      
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov,
R.: Dropout: a simple way to prevent neural networks from overfitting, J. Mach. Learn. Res., 15, 1929–1958, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib126"><label>Steinier et al.(1972)Steinier, Termonia, and
Deltour</label><mixed-citation>
      
Steinier, J., Termonia, Y., and Deltour, J.: Smoothing and differentiation of
data by simplified least square procedure, Anal. Chem., 44,
1906–1909, 1972.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib127"><label>Sun et al.(2022)Sun, Song, Cai, Zhang, Hong, and
Li</label><mixed-citation>
      
Sun, C., Song, M., Cai, D., Zhang, B., Hong, S., and Li, H.: A systematic
review of echo state networks from design to application, IEEE Transactions
on Artificial Intelligence, 5, 23–37, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib128"><label>Sutskever(2013)</label><mixed-citation>
      
Sutskever, I.: Training recurrent neural networks, PhD thesis, University of Toronto,
Toronto, ON, Canada, ISBN 9780499220660, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib129"><label>Teskey et al.(2015)Teskey, Wertin, Bauweraerts, Ameye, McGuire, and
Steppe</label><mixed-citation>
      
Teskey, R., Wertin, T., Bauweraerts, I., Ameye, M., McGuire, M. A., and Steppe,
K.: Responses of tree species to heat waves and extreme heat events, Plant
Cell Environ., 38, 1699–1712, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib130"><label>Tietz et al.(2017)Tietz, Fan, Nouri, Bossan, and skorch
Developers</label><mixed-citation>
      
Tietz, M., Fan, T. J., Nouri, D., Bossan, B., and skorch Developers: skorch:
A scikit-learn compatible neural network library that wraps PyTorch, Skorch [code],
<a href="https://skorch.readthedocs.io/en/stable/" target="_blank">https://skorch.readthedocs.io/en/stable/</a> (last access: 25 October 2024), 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib131"><label>Van Mantgem et al.(2009)Van Mantgem, Stephenson, Byrne, Daniels,
Franklin, Fulé, Harmon, Larson, Smith, Taylor et al.</label><mixed-citation>
      
Van Mantgem, P. J., Stephenson, N. L., Byrne, J. C., Daniels, L. D., Franklin,
J. F., Fulé, P. Z., Harmon, M. E., Larson, A. J., Smith, J. M., Taylor,
A. H., and Veblen, T. T.: Widespread increase of tree mortality rates in the western
United States, Science, 323, 521–524, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib132"><label>Verstraeten et al.(2007)Verstraeten, Schrauwen, d'Haene, and
Stroobandt</label><mixed-citation>
      
Verstraeten, D., Schrauwen, B., d'Haene, M., and Stroobandt, D.: An
experimental unification of reservoir computing methods, Neural Networks, 20,
391–403, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib133"><label>Vlachas et al.(2020)Vlachas, Pathak, Hunt, Sapsis, Girvan, Ott, and
Koumoutsakos</label><mixed-citation>
      
Vlachas, P., Pathak, J., Hunt, B., Sapsis, T., Girvan, M., Ott, E., and
Koumoutsakos, P.: Backpropagation algorithms and Reservoir Computing in
Recurrent Neural Networks for the forecasting of complex spatiotemporal
dynamics, Neural Networks, 126, 191–217,
<a href="https://doi.org/10.1016/j.neunet.2020.02.016" target="_blank">https://doi.org/10.1016/j.neunet.2020.02.016</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib134"><label>von Buttlar et al.(2018)von Buttlar, Zscheischler, Rammig, Sippel,
Reichstein, Knohl, Jung, Menzer, Arain, Buchmann
et al.</label><mixed-citation>
      
von Buttlar, J., Zscheischler, J., Rammig, A., Sippel, S., Reichstein, M., Knohl, A., Jung, M., Menzer, O., Arain, M. A., Buchmann, N., Cescatti, A., Gianelle, D., Kiely, G., Law, B. E., Magliulo, V., Margolis, H., McCaughey, H., Merbold, L., Migliavacca, M., Montagnani, L., Oechel, W., Pavelka, M., Peichl, M., Rambal, S., Raschi, A., Scott, R. L., Vaccari, F. P., van Gorsel, E., Varlagin, A., Wohlfahrt, G., and Mahecha, M. D.: Impacts of droughts and extreme-temperature events on gross primary production and ecosystem respiration: a systematic assessment across ecosystems and climate zones, Biogeosciences, 15, 1293–1318, <a href="https://doi.org/10.5194/bg-15-1293-2018" target="_blank">https://doi.org/10.5194/bg-15-1293-2018</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib135"><label>Walleshauser and Bollt(2022)</label><mixed-citation>
      
Walleshauser, B. and Bollt, E.: Predicting sea surface temperatures with coupled reservoir computers, Nonlin. Processes Geophys., 29, 255–264, <a href="https://doi.org/10.5194/npg-29-255-2022" target="_blank">https://doi.org/10.5194/npg-29-255-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib136"><label>Walther et al.(2022)Walther, Besnard, Nelson, El-Madany, Migliavacca,
Weber, Carvalhais, Ermida, Brümmer, Schrader et al.</label><mixed-citation>
      
Walther, S., Besnard, S., Nelson, J. A., El-Madany, T. S., Migliavacca, M., Weber, U., Carvalhais, N., Ermida, S. L., Brümmer, C., Schrader, F., Prokushkin, A. S., Panov, A. V., and Jung, M.: Technical note: A view from space on global flux towers by MODIS and Landsat: the FluxnetEO data set, Biogeosciences, 19, 2805–2840, <a href="https://doi.org/10.5194/bg-19-2805-2022" target="_blank">https://doi.org/10.5194/bg-19-2805-2022</a>, 2022 (data available at: <a href="https://meta.icos-cp.eu/collections/tEAkpU6UduMMONrFyym5-tUW" target="_blank">https://meta.icos-cp.eu/collections/tEAkpU6UduMMONrFyym5-tUW</a>, last access: 25 October 2024).

    </mixed-citation></ref-html>
<ref-html id="bib1.bib137"><label>Werbos(1988)</label><mixed-citation>
      
Werbos, P. J.: Generalization of backpropagation with application to a
recurrent gas market model, Neural Networks, 1, 339–356, 1988.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib138"><label>Werbos(1990)</label><mixed-citation>
      
Werbos, P. J.: Backpropagation through time: what it does and how to do it,
P. IEEE, 78, 1550–1560, 1990.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib139"><label>Williams and Zipser(1995)</label><mixed-citation>
      
Williams, R. J. and Zipser, D.: Gradient-based learning algorithms for
recurrent networks and their computational complexity, in: Backpropagation: Theory, Architectures, and Applications, 433, 17, <a href="https://gwern.net/doc/ai/nn/rnn/1995-williams.pdf" target="_blank">https://gwern.net/doc/ai/nn/rnn/1995-williams.pdf</a> (last access: 4 November 2024),
1995.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib140"><label>Yengoh et al.(2015)Yengoh, Dent, Olsson, Tengberg, and
Tucker III</label><mixed-citation>
      
Yengoh, G. T., Dent, D., Olsson, L., Tengberg, A. E., and Tucker III, C. J.:
Use of the Normalized Difference Vegetation Index (NDVI) to assess Land
degradation at multiple scales: current status, future trends, and practical
considerations, Springer, ISBN 978-3-319-24112-8, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib141"><label>Zeng et al.(2002)Zeng, Hales, and Neelin</label><mixed-citation>
      
Zeng, N., Hales, K., and Neelin, J. D.: Nonlinear dynamics in a coupled
vegetation–atmosphere system and implications for desert–forest gradient,
J. Climate, 15, 3474–3487, 2002.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib142"><label>Zeng et al.(2022)Zeng, Hao, Huete, Dechant, Berry, Chen, Joiner,
Frankenberg, Bond-Lamberty, Ryu et al.</label><mixed-citation>
      
Zeng, Y., Hao, D., Huete, A., Dechant, B., Berry, J., Chen, J. M., Joiner, J.,
Frankenberg, C., Bond-Lamberty, B., Ryu, Y., Xiao, J., Asrar, G. R., and
Chen, M.: Optical vegetation
indices for monitoring terrestrial ecosystems globally, Nature Reviews Earth
&amp; Environment, 3, 477–493, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib143"><label>Zhang et al.(2017)Zhang, Wang, Dong, Zhong, and
Sun</label><mixed-citation>
      
Zhang, Q., Wang, H., Dong, J., Zhong, G., and Sun, X.: Prediction of sea
surface temperature using long short-term memory, IEEE Geosci. Remote
S., 14, 1745–1749, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib144"><label>Zhang et al.(2021)Zhang, Xin, and Li</label><mixed-citation>
      
Zhang, Z., Xin, Q., and Li, W.: Machine Learning-Based Modeling of Vegetation
Leaf Area Index and Gross Primary Productivity Across North America and
Comparison With a Process-Based Model, J. Adv. Model. Earth
Sy., 13, e2021MS002802,
<a href="https://doi.org/10.1029/2021MS002802" target="_blank">https://doi.org/10.1029/2021MS002802</a>,
2021.

    </mixed-citation></ref-html>--></article>
