<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0"><?xmltex \makeatother\@nolinetrue\makeatletter?>
  <front>
    <journal-meta><journal-id journal-id-type="publisher">NPG</journal-id><journal-title-group>
    <journal-title>Nonlinear Processes in Geophysics</journal-title>
    <abbrev-journal-title abbrev-type="publisher">NPG</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Nonlin. Processes Geophys.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1607-7946</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/npg-26-227-2019</article-id><title-group><article-title>Joint state-parameter estimation of a nonlinear stochastic energy balance model from sparse noisy data</article-title><alt-title>Joint state-parameter estimation</alt-title>
      </title-group><?xmltex \runningtitle{Joint state-parameter estimation}?><?xmltex \runningauthor{F. Lu et al.}?>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Lu</surname><given-names>Fei</given-names></name>
          <email>feilu@math.jhu.edu</email>
        <ext-link>https://orcid.org/0000-0001-6842-7922</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2 aff3">
          <name><surname>Weitzel</surname><given-names>Nils</given-names></name>
          
        <ext-link>https://orcid.org/0000-0002-2735-2992</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff4">
          <name><surname>Monahan</surname><given-names>Adam H.</given-names></name>
          
        </contrib>
        <aff id="aff1"><label>1</label><institution>Department of Mathematics, Johns Hopkins University, Baltimore, Maryland, USA</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>Institut für Umweltphysik, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany</institution>
        </aff>
        <aff id="aff3"><label>3</label><institution>Institut für Geowissenschaften und Meteorologie, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany</institution>
        </aff>
        <aff id="aff4"><label>4</label><institution>School of Earth and Ocean Sciences, University of Victoria, Victoria, British Columbia, Canada</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Fei Lu (feilu@math.jhu.edu)</corresp></author-notes><pub-date><day>14</day><month>August</month><year>2019</year></pub-date>
      
      <volume>26</volume>
      <issue>3</issue>
      <fpage>227</fpage><lpage>250</lpage>
      <history>
        <date date-type="received"><day>8</day><month>April</month><year>2019</year></date>
           <date date-type="rev-request"><day>23</day><month>April</month><year>2019</year></date>
           <date date-type="rev-recd"><day>8</day><month>July</month><year>2019</year></date>
           <date date-type="accepted"><day>18</day><month>July</month><year>2019</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2019 Fei Lu et al.</copyright-statement>
        <copyright-year>2019</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019.html">This article is available from https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019.html</self-uri><self-uri xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019.pdf">The full text article is available as a PDF file from https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019.pdf</self-uri>
      <abstract><title>Abstract</title>
    <p id="d1e118">While nonlinear stochastic partial differential equations arise naturally in spatiotemporal modeling, inference for such systems often faces two major challenges: sparse noisy data and ill-posedness of the inverse problem of parameter estimation. To overcome these challenges, we introduce a strongly regularized posterior by normalizing the likelihood and by imposing physical constraints through priors on the parameters and states.</p>
    <p id="d1e121">We investigate joint parameter-state estimation by the regularized posterior in a physically motivated nonlinear stochastic energy balance model (SEBM) for paleoclimate reconstruction. The high-dimensional posterior is sampled by a particle Gibbs sampler that combines a Markov chain Monte Carlo (MCMC) method with an optimal particle filter exploiting the structure of the SEBM. In tests using either Gaussian or uniform priors based on the physical range of parameters, the regularized posteriors overcome the ill-posedness and lead to samples within physical ranges, quantifying the uncertainty in estimation.  Due to the ill-posedness and the regularization, the posterior of parameters presents a relatively large uncertainty, and consequently, the maximum of the posterior, which is the minimizer in a variational approach, can have a large variation. In contrast, the posterior of states generally concentrates near the truth, substantially filtering out observation noise and reducing uncertainty in the unconstrained SEBM.</p>
  </abstract>
    </article-meta>
  </front>
<body>
      

      <?xmltex \hack{\newpage}?>
<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d1e135">Physically motivated nonlinear stochastic (partial) differential equations (SDEs and SPDEs) are natural models of spatiotemporal processes with uncertainty in geoscience. In particular, such models arise in the problem of reconstructing geophysical fields from sparse and noisy data <xref ref-type="bibr" rid="bib1.bibx45 bib1.bibx21 bib1.bibx49" id="paren.1"><named-content content-type="pre">see, e.g.,</named-content><named-content content-type="post">and the references therein</named-content></xref>. The nonlinear differential equations, derived from physical principles, often come with unknown but physically constrained parameters that must also be determined from data. This motivates the problem of joint state-parameter estimation from sparse and noisy data.
When the parameters are interrelated, which is often the case in nonlinear models, their estimation can be an ill-posed inverse problem. Physical constraints on the parameters must then be taken into account. In variational approaches,  physical constraints are imposed using a regularization term in a cost function, whose minimizer provides an estimator of the parameters and states. In a Bayesian approach, the physical constraints are encoded in prior distributions, extending the regularized cost function in the variational approach to a posterior and quantifying the estimation uncertainty. When the true parameters are known, the Bayesian approach has demonstrated great success in state estimation, thanks to the developments in Monte Carlo sampling and data assimilation techniques <xref ref-type="bibr" rid="bib1.bibx8 bib1.bibx27 bib1.bibx53" id="paren.2"><named-content content-type="pre">see, e.g.,</named-content></xref>. However, the problem of joint state-parameter estimation, especially when the parameter estimation is ill-posed, has had relatively little success in nonlinear cases and remains a challenge <xref ref-type="bibr" rid="bib1.bibx25" id="paren.3"/>.</p>
      <p id="d1e153">In this paper, we investigate a Bayesian approach for<?pagebreak page228?> joint state and parameter estimation of a nonlinear two-dimensional stochastic energy balance model (SEBM) in the context of spatial–temporal paleoclimate reconstructions of temperature fields from sparse and noisy data <xref ref-type="bibr" rid="bib1.bibx48 bib1.bibx47 bib1.bibx16 bib1.bibx20" id="paren.4"/>. In particular, we consider a model of the energy balance of the atmosphere similar to those often used in idealized climate models <xref ref-type="bibr" rid="bib1.bibx17 bib1.bibx54 bib1.bibx44" id="paren.5"><named-content content-type="pre">e.g.,</named-content></xref> to study climate variability and climate sensitivity. The use of such a model in paleoclimate reconstruction aims at improving the physical consistency of temperature reconstructions during, e.g., the last deglaciation and the Holocene by combining indirect observations, so-called proxy data, with physically motivated stochastic models.</p>
      <p id="d1e164">The SEBM models surface air temperature, explicitly taking into account sinks, sources, and horizontal transport of energy in the atmosphere, with an additive stochastic forcing incorporated to account for unresolved processes and scales. The model takes the form of a nonlinear SPDE with unknown parameters to be inferred from data. These unknown parameters are associated with processes in the energy budget (e.g., radiative transfer, air–sea energy exchange) that are represented in a simplified manner in the SEBM, and may change with a changing climate. The parameters must fall within a prescribed range for the SEBM to be physically meaningful. Specifically, they must be in sufficiently close balance for the stationary temperature of the SEBM to be within a physically realistic range. As we will show, the parametric terms arising from this physically based model are strongly correlated, leading to a Fisher information matrix that is ill-conditioned. Therefore, the parameter estimation is an ill-posed inverse problem, and the maximum likelihood estimators of individual parameters have large variations and often fall outside the physical range.</p>
      <p id="d1e167">To overcome the ill-posedness in parameter estimation, we introduce a new strongly regularized posterior by normalizing the likelihood and by imposing priors on the parameters and the states that are based on physical constraints and the climatological distribution. In the regularized posterior, the prior has the same weight as the normalized likelihood, which forces the support of the posterior to lie within the physical range. Such a regularized posterior is a natural extension of the regularized cost function in a variational approach: the maximum of the posterior (MAP) coincides with the minimizer of the regularized cost function, but the posterior additionally quantifies the uncertainty in the estimator.</p>
      <p id="d1e171">The regularized posterior of the states and parameters is high-dimensional and non-Gaussian. It is represented by its samples, which provide an empirical approximation of the distribution and allow efficient computation of quantities of interest such as posterior means. The samples are drawn using a particle Gibbs sampler with ancestor sampling <xref ref-type="bibr" rid="bib1.bibx30" id="paren.6"><named-content content-type="pre">PGAS,</named-content></xref>, a special sampler in the family of particle Markov chain Monte Carlo (MCMC) methods <xref ref-type="bibr" rid="bib1.bibx2" id="paren.7"/> that combines the strengths of both MCMC and sequential Monte Carlo methods <xref ref-type="bibr" rid="bib1.bibx15" id="paren.8"><named-content content-type="pre">see, e.g.,</named-content></xref> to ensure the convergence of the empirical approximation to the high-dimensional posterior. In the PGAS, we use an optimal particle filter that exploits the forward structure of the SEBM.</p>
      <p id="d1e187">We consider two priors for the parameters, each based on their physical ranges: a uniform prior and a Gaussian prior with three standard deviations inside the range. We impose a prior for the states based on their overall climatological distribution. Tests show that the regularized posteriors overcome the ill-posedness and lead to samples of parameters and states within the physical ranges, quantifying the uncertainty in their estimation. Due to the regularization, the posterior of the parameters is supported on a relatively large range. Consequently, the MAP of the parameters has a large variation, and it is important to use the posterior to assess the uncertainty. In contrast, the posterior of the states generally concentrates near the truth, substantially filtering out the observational noise and reducing the uncertainty in state reconstruction.</p>
      <p id="d1e190">Tests also show that the regularized posterior is robust to spatial sparsity of observations, with sparser observations leading to larger uncertainties. However, due to the need for regularization to overcome ill-posedness, the uncertainty in the posterior of the parameters cannot be eliminated by increasing the number of observations in time. Therefore, we suggest alternative approaches, such as re-parametrization of the nonlinear function according to the climatological distribution or nonparametric Bayesian inference <xref ref-type="bibr" rid="bib1.bibx38 bib1.bibx19" id="paren.9"><named-content content-type="pre">see, e.g.,</named-content></xref>, to avoid ill-posedness.</p>
      <p id="d1e198">The rest of the paper is organized as follows. Section <xref ref-type="sec" rid="Ch1.S2"/> introduces the SEBM and its discretization, and formulates a state-space model. We also outline in this section the Bayesian approach to the joint parameter-state estimation and the particle MCMC samplers. Section <xref ref-type="sec" rid="Ch1.S3"/> analyzes the ill-posedness of the parameter estimation problem and introduces the regularized posterior. The regularized posterior is sampled by PGAS and numerical results are presented in Sect. <xref ref-type="sec" rid="Ch1.S4"/>. Discussions and conclusions are presented in Sects. <xref ref-type="sec" rid="Ch1.S5"/> and <xref ref-type="sec" rid="Ch1.S6"/>. Technical details of the estimation procedure are described in Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/>.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>State-space model formulation</title>
      <p id="d1e222">After a brief physical introduction to the SEBM, we present its discretization and the observation model, and represent them together as a state-space model suitable for the application of sequential Monte Carlo methods in Bayesian inference.</p>
<?pagebreak page229?><sec id="Ch1.S2.SS1">
  <label>2.1</label><title>The stochastic energy balance model</title>
      <p id="d1e232">The SEBM describes the evolution in space (both latitude and longitude) and time of the surface air temperature <inline-formula><mml:math id="M1" display="inline"><mml:mrow><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>:
            <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M2" display="block"><mml:mrow><mml:msub><mml:mo>∂</mml:mo><mml:mi>t</mml:mi></mml:msub><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M3" display="inline"><mml:mrow><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>∈</mml:mo><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>]</mml:mo><mml:mo>×</mml:mo><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is the two-dimensional coordinate on the sphere and the solution <inline-formula><mml:math id="M4" display="inline"><mml:mrow><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is periodic in longitude. Horizontal energy transport is represented as diffusion with diffusivity <inline-formula><mml:math id="M5" display="inline"><mml:mi mathvariant="italic">ν</mml:mi></mml:math></inline-formula>, while sources and sinks of atmospheric internal energy are represented by the nonlinear function  <inline-formula><mml:math id="M6" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>:
            <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M7" display="block"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mi>u</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:msup><mml:mi>u</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          with the unknown parameters <inline-formula><mml:math id="M8" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>.  Upper and lower bounds of these three parameters, shown in Table <xref ref-type="table" rid="Ch1.T1"/>, are derived from the energy balance model in <xref ref-type="bibr" rid="bib1.bibx17" id="text.10"/>, adjusted to current estimates of the Earth's global energy budget from <xref ref-type="bibr" rid="bib1.bibx51" id="text.11"/> using appropriate simplifications. The equilibrium solution of the SEBM for the average values of the parameters approximates the current global mean temperature closely, and the magnitude of sinks and sources approximates the respective magnitudes in <xref ref-type="bibr" rid="bib1.bibx51" id="text.12"/> well. The physical ranges of the parameters are very conservative and cover current estimates of the global mean temperature during the Quaternary <xref ref-type="bibr" rid="bib1.bibx46" id="paren.13"/>. The state variable and the parameters in the model have been nondimensionalized so that the equilibrium solution of Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) with <inline-formula><mml:math id="M9" display="inline"><mml:mrow><mml:mi>f</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> is approximately equal to 1 and 1 time unit represents a year.</p>
      <p id="d1e497">The nonlinear function <inline-formula><mml:math id="M10" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> aggregates parametrizations from <xref ref-type="bibr" rid="bib1.bibx17" id="text.14"/> for incoming short-wave radiation, outgoing long-wave radiation, radiative air–surface flux, sensible air–surface heat flux, and the latent heat flux into the atmosphere according to their polynomial order. The quartic nonlinearity of the function <inline-formula><mml:math id="M11" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> arises from the Stefan–Boltzmann dependence of long-wave radiative fluxes on atmospheric temperature, while a linear feedback is included to represent state dependence of, e.g., surface energy fluxes and albedo.  Inclusion of quadratic and cubic nonlinearities in <inline-formula><mml:math id="M12" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (to account for nonlinearities in the feedbacks just noted) was found to exacerbate the ill-posedness of the model without qualitatively changing the character of the model dynamics within the parameter range appropriate for the study of Quaternary climate variability (e.g., without admitting multiple deterministic equilibria associated with the ice–albedo feedback).   
In reality, the diffusivity <inline-formula><mml:math id="M13" display="inline"><mml:mi mathvariant="italic">ν</mml:mi></mml:math></inline-formula> and the parameters <inline-formula><mml:math id="M14" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M15" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>∈</mml:mo><mml:mo>{</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, depend on latitude, longitude, and time. We neglect this complexity in our idealized analysis.</p>
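      <p>The near-degeneracy among the terms of <italic>g</italic><sub><italic>θ</italic></sub> can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the nondimensional temperature <italic>u</italic> varies within a narrow band [0.95, 1.05] around the equilibrium value 1; this band and the helper name <monospace>pearson</monospace> are our assumptions, not quantities taken from the model. The sketch computes the sample correlation between the terms <italic>u</italic> and <italic>u</italic><sup>4</sup> that multiply <italic>θ</italic><sub>1</sub> and <italic>θ</italic><sub>4</sub>.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical temperature values: a uniform grid on [0.95, 1.05] (assumed band).
u_values = [0.95 + 0.10 * k / 200 for k in range(201)]
u4_values = [u ** 4 for u in u_values]

# Correlation between the regressors multiplying theta_1 and theta_4.
corr_u_u4 = pearson(u_values, u4_values)
print(f"correlation between u and u^4: {corr_u_u4:.5f}")
```
]]></preformat>
      <p>Under this assumption the correlation is very close to 1, so a change in <italic>θ</italic><sub>1</sub> can be largely compensated for by changes in <italic>θ</italic><sub>4</sub> and <italic>θ</italic><sub>0</sub>. This is one way to see why the individual parameters are hard to identify; the analysis of the ill-posedness itself is given in Sect. <xref ref-type="sec" rid="Ch1.S3"/>.</p>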

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T1"><?xmltex \currentcnt{1}?><label>Table 1</label><caption><p id="d1e600">The physical upper and lower bounds of the parameters in the SEBM. </p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M16" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M18" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Upper bound</oasis:entry>
         <oasis:entry colname="col2">32.57</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">22.70</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M20" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4.80</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Lower bound</oasis:entry>
         <oasis:entry colname="col2">27.64</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">25.46</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6.00</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>
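      <p>As a quick consistency check of Table 1 against Eq. (2), the following Python sketch computes the deterministic, spatially uniform equilibrium, i.e., the root of <italic>g</italic><sub><italic>θ</italic></sub>(<italic>u</italic>) = 0, for the midpoints of the parameter ranges. Reading the average values of the parameters as these midpoints, solving by Newton's method, and the function names used are our assumptions for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Physical bounds from Table 1, as (lower, upper) for theta_0, theta_1, theta_4.
BOUNDS = {0: (27.64, 32.57), 1: (-25.46, -22.70), 4: (-6.00, -4.80)}

def g(u, theta):
    """Source/sink term g_theta(u) = theta_0 + theta_1 u + theta_4 u^4 (Eq. 2)."""
    return theta[0] + theta[1] * u + theta[4] * u ** 4

def dg(u, theta):
    """Derivative of g_theta with respect to u."""
    return theta[1] + 4.0 * theta[4] * u ** 3

def equilibrium(theta, u0=1.0, tol=1e-12, max_iter=50):
    """Newton iteration for g_theta(u) = 0, started at u0 (assumed initial guess)."""
    u = u0
    for _ in range(max_iter):
        step = g(u, theta) / dg(u, theta)
        u -= step
        if abs(step) < tol:
            break
    return u

# Midpoint of each physical range (our reading of "average values").
theta_mid = {j: 0.5 * (lo + hi) for j, (lo, hi) in BOUNDS.items()}
u_star = equilibrium(theta_mid)
print(f"equilibrium u* = {u_star:.4f}, g'(u*) = {dg(u_star, theta_mid):.2f}")
```
]]></preformat>
      <p>For the midpoint parameters the root lies close to 1 (about 1.01), in line with the nondimensionalization described above, and the negative derivative of <italic>g</italic><sub><italic>θ</italic></sub> at the root indicates that this equilibrium is stable for the spatially uniform deterministic dynamics.</p>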

      <p id="d1e727">The stochastic term <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>,  which models the net effect of unresolved or oversimplified processes in the energy budget, is a centered Gaussian field that is white in time and colored in space, specified by an isotropic Matérn covariance function with order <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:mi mathvariant="italic">α</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> and scale <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>. That is,
            <disp-formula id="Ch1.E3" content-type="numbered"><label>3</label><mml:math id="M26" display="block"><mml:mrow><mml:mi mathvariant="double-struck">E</mml:mi><mml:mfenced open="[" close="]"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">η</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>=</mml:mo><mml:mi mathvariant="italic">δ</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mo>|</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>-</mml:mo><mml:mi mathvariant="italic">η</mml:mi><mml:mo>|</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
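          with the covariance kernel <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:mi>C</mml:mi><mml:mo>(</mml:mo><mml:mi>r</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi mathvariant="italic">α</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>r</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> being the Matérn kernel given by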
            <disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M28" display="block"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mi mathvariant="italic">α</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>r</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mn mathvariant="normal">2</mml:mn><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mi mathvariant="italic">α</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi mathvariant="normal">Γ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:msqrt><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">α</mml:mi></mml:mrow></mml:msqrt><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>r</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced><mml:mi mathvariant="italic">α</mml:mi></mml:msup><mml:msub><mml:mi>K</mml:mi><mml:mi mathvariant="italic">α</mml:mi></mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:msqrt><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">α</mml:mi></mml:mrow></mml:msqrt><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mi>r</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M29" display="inline"><mml:mi mathvariant="normal">Γ</mml:mi></mml:math></inline-formula> is the gamma function, <inline-formula><mml:math display="inline"><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:math></inline-formula> is the variance parameter, <inline-formula><mml:math id="M30" display="inline"><mml:mi mathvariant="italic">ρ</mml:mi></mml:math></inline-formula> is a scaling factor, and <inline-formula><mml:math id="M31" display="inline"><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi mathvariant="italic">α</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the modified Bessel function of the second kind. We focus on the estimation of the parameters <inline-formula><mml:math id="M32" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> and assume that <inline-formula><mml:math id="M33" display="inline"><mml:mi mathvariant="italic">ν</mml:mi></mml:math></inline-formula> and the parameters of <inline-formula><mml:math id="M34" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> are known. Estimation of <inline-formula><mml:math id="M35" display="inline"><mml:mi mathvariant="italic">ν</mml:mi></mml:math></inline-formula> in energy balance models with data assimilation methods is studied in <xref ref-type="bibr" rid="bib1.bibx3" id="text.15"/>, whereas estimation of the parameters of <inline-formula><mml:math id="M36" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> in the context of linear SPDEs is covered, for example, in <xref ref-type="bibr" rid="bib1.bibx29" id="text.16"/>.</p>
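      <p>For concreteness, Eq. (4) can be evaluated with a few lines of Python. The sketch below uses only the standard library and computes <italic>K</italic><sub><italic>α</italic></sub> from the integral representation <italic>K</italic><sub><italic>α</italic></sub>(<italic>x</italic>) = ∫<sub>0</sub><sup>∞</sup> exp(−<italic>x</italic> cosh <italic>t</italic>) cosh(<italic>αt</italic>) d<italic>t</italic> with the trapezoidal rule. The function names, the truncation of the integral, and the default values of the variance and scale (both set to 1) are illustrative assumptions rather than settings from this study.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

def bessel_k(alpha, x, n=2000):
    """Modified Bessel function of the second kind K_alpha(x), for x > 0.

    Uses K_alpha(x) = int_0^inf exp(-x cosh t) cosh(alpha t) dt, truncated
    where the integrand is negligible and evaluated by the trapezoidal rule.
    """
    t_max = math.acosh(max(1.0, 60.0 / x)) + 1.0
    h = t_max / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(alpha * t)
    return total * h

def matern(r, sigma2=1.0, rho=1.0, alpha=1.0):
    """Matern covariance C_alpha(r) of Eq. (4); C_alpha(0) = sigma2 by continuity."""
    if r == 0.0:
        return sigma2
    x = math.sqrt(2.0 * alpha) * r / rho
    prefactor = sigma2 * 2.0 ** (1.0 - alpha) / math.gamma(alpha)
    return prefactor * x ** alpha * bessel_k(alpha, x)

# Covariance at a few distances, with the assumed sigma2 = rho = 1 and alpha = 1.
for r in (0.0, 0.5, 1.0, 2.0):
    print(f"C({r}) = {matern(r):.4f}")
```
]]></preformat>
      <p>The kernel equals the variance parameter at zero distance and decays monotonically with distance, at a rate controlled by <italic>ρ</italic>. In practice a library routine for the modified Bessel function would normally replace the hand-written quadrature used here.</p>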
      <p id="d1e1008">In a paleoclimate context, temperature observations are sparse (in space and time) and derived from climatic proxies, such as pollen assemblages, isotopic compositions, and tree rings, which are indirect measures of the climate state.  To simplify our analysis, we neglect the potentially nonlinear transformations associated with the proxies and focus on the effect of observational sparseness. This is a common strategy in the testing of climate field reconstruction methods <xref ref-type="bibr" rid="bib1.bibx55" id="paren.17"><named-content content-type="pre">e.g.,</named-content></xref>. As such, we take the data to be noisy observations of the solution at <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> locations:
            <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M38" display="block"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>H</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">ξ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          for <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, where each <inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ξ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>∈</mml:mo><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>]</mml:mo><mml:mo>×</mml:mo><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="italic">π</mml:mi><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> is a location of observation, <inline-formula><mml:math id="M41" display="inline"><mml:mi>H</mml:mi></mml:math></inline-formula> is the observation operator, and <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∼</mml:mo><mml:mi mathvariant="script">N</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are independent identically distributed (iid) Gaussian noise. The data are sparse in the sense that only a small number of the spatial locations are observed.</p>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>State-space model representation</title>
      <p id="d1e1233">In practice, the differential equations are represented by their discretized systems and the observations are discrete in time; therefore, we consider only the state-space model based on a discretization of the SEBM. We refer the reader to <xref ref-type="bibr" rid="bib1.bibx43" id="text.18"/>, <xref ref-type="bibr" rid="bib1.bibx4" id="text.19"/>, <xref ref-type="bibr" rid="bib1.bibx22" id="text.20"/>, <xref ref-type="bibr" rid="bib1.bibx36" id="text.21"/>, and <xref ref-type="bibr" rid="bib1.bibx32" id="text.22"/> for studies of inference for SPDEs in a continuous-time setting.</p>
<?pagebreak page230?><sec id="Ch1.S2.SS2.SSS1">
  <label>2.2.1</label><title>The state model</title>
      <p id="d1e1258">We discretize the SPDE Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>) using linear finite elements in space and a semi-backward Euler method in time, together with the computationally efficient Gaussian Markov random field approximation of the Gaussian field by <xref ref-type="bibr" rid="bib1.bibx29" id="text.23"/> (see details in Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/>). We write the discretized equation as a standard state-space model:

                  <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M43" display="block"><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where  <inline-formula><mml:math id="M44" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>:</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup><mml:mo>→</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is  the deterministic function and  <inline-formula><mml:math id="M45" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is a sequence of iid Gaussian noise with mean zero and covariance <inline-formula><mml:math id="M46" display="inline"><mml:mi mathvariant="bold">R</mml:mi></mml:math></inline-formula> described in more detail in Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E40"/>). Here the subscript <inline-formula><mml:math id="M47" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> is a time index.
Therefore, the transition probability density  <inline-formula><mml:math id="M48" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, the probability density of <inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> conditional on <inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M51" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>, is</p>
      <p id="d1e1442"><disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M52" display="block"><mml:mtable class="split" columnspacing="1em" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">det</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi><mml:mi mathvariant="bold">R</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="bold">R</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi 
mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
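      <p>As a minimal sketch of Eqs. (<xref ref-type="disp-formula" rid="Ch1.E6"/>) and (<xref ref-type="disp-formula" rid="Ch1.E7"/>), the following Python snippet draws one step of a toy state model and evaluates the logarithm of its Gaussian transition density. The map mu_theta (a linear relaxation toward an equilibrium) and the diagonal noise covariance are illustrative assumptions, not the SEBM discretization, whose covariance is in general not diagonal.</p>
      <preformat preformat-type="code">
```python
import math
import random

def mu_theta(u, theta):
    # Hypothetical deterministic map: relax each component toward theta[0]
    # at rate theta[1]; it stands in for the discretized SEBM dynamics.
    return [x + theta[1] * (theta[0] - x) for x in u]

def step(u, theta, r_diag, rng):
    # One draw of U_{n+1} = mu_theta(U_n) + W_n with W_n ~ N(0, diag(r_diag)).
    mean = mu_theta(u, theta)
    return [m + rng.gauss(0.0, math.sqrt(r)) for m, r in zip(mean, r_diag)]

def log_transition_density(u_next, u, theta, r_diag):
    # log p_theta(u_{n+1} | u_n) for the Gaussian transition density;
    # the diagonal covariance is an assumption of this sketch.
    mean = mu_theta(u, theta)
    log_det = sum(math.log(2.0 * math.pi * r) for r in r_diag)
    quad = sum((a - m) ** 2 / r for a, m, r in zip(u_next, mean, r_diag))
    return -0.5 * (log_det + quad)

rng = random.Random(0)
theta = (1.0, 0.5)                 # (equilibrium, relaxation rate), hypothetical
u0 = [0.0, 2.0]
u1 = step(u0, theta, [0.1, 0.1], rng)
print(log_transition_density(u1, u0, theta, [0.1, 0.1]))
```
      </preformat>
      <p>Working with the log density avoids numerical underflow when many transition factors are multiplied together.</p>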
</sec>
<sec id="Ch1.S2.SS2.SSS2">
  <label>2.2.2</label><title>The observation model</title>
      <p id="d1e1602">In discrete form, we assume that the observation locations are nodes of the finite elements. Then the observation function in Eq. (<xref ref-type="disp-formula" rid="Ch1.E5"/>) is simply <inline-formula><mml:math id="M53" display="inline"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi mathvariant="normal">n</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>k</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, with <inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:msub><mml:mi>k</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> denoting the index of the node under observation, for <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>,
and we can write the observation model as
              <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M56" display="block"><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="bold">H</mml:mi><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:mi mathvariant="bold">H</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is called the observation matrix and <inline-formula><mml:math id="M58" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is a sequence of iid Gaussian noise with distribution <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:mi mathvariant="script">N</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="bold">Q</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M60" display="inline"><mml:mrow><mml:mi mathvariant="bold">Q</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="normal">Diag</mml:mi><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>i</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>. Equivalently, the probability of observing <inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> given state <inline-formula><mml:math id="M62" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is
              <disp-formula id="Ch1.E9" content-type="numbered"><label>9</label><mml:math id="M63" display="block"><mml:mtable rowspacing="0.2ex" class="split" columnspacing="1em" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">det</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi><mml:mi mathvariant="bold">Q</mml:mi><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi>exp⁡</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mi mathvariant="bold">H</mml:mi><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="bold">Q</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mi mathvariant="bold">H</mml:mi><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p>
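      <p>The observation model of Eqs. (<xref ref-type="disp-formula" rid="Ch1.E8"/>) and (<xref ref-type="disp-formula" rid="Ch1.E9"/>) can be sketched in Python as follows. Because the observation matrix only selects nodes, it is stored here as a list of node indices rather than as a dense matrix. The state values, the observed nodes, and the noise level are hypothetical, and the indices are 0-based, unlike in the text.</p>
      <preformat preformat-type="code">
```python
import math
import random

def observe(u, obs_nodes, sigma, rng):
    # Y_n = H U_n + eps_n, where H picks the observed nodes (0-based here)
    # and eps_n has independent components with standard deviations sigma.
    return [u[k] + rng.gauss(0.0, s) for k, s in zip(obs_nodes, sigma)]

def log_likelihood(y, u, obs_nodes, sigma):
    # log p(y_n | U_n) for a diagonal observation covariance Diag{sigma_i^2}.
    total = 0.0
    for y_i, k, s in zip(y, obs_nodes, sigma):
        total -= 0.5 * (math.log(2.0 * math.pi * s * s) + ((y_i - u[k]) / s) ** 2)
    return total

rng = random.Random(0)
u = [0.1 * k for k in range(12)]      # toy state on 12 nodes (hypothetical)
obs_nodes = [1, 5, 9]                 # sparse data: 3 of 12 nodes are observed
sigma = [0.05, 0.05, 0.05]
y = observe(u, obs_nodes, sigma, rng)
print(len(y), log_likelihood(y, u, obs_nodes, sigma))
```
      </preformat>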
</sec>
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Bayesian inference for SSM</title>
      <p id="d1e1974">Given observations <inline-formula><mml:math id="M64" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>:=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, our goal is to jointly estimate the state <inline-formula><mml:math id="M65" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>:=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and the parameter vector <inline-formula><mml:math id="M66" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>:=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> in the state-space model Eqs. (<xref ref-type="disp-formula" rid="Ch1.E6"/>)–(<xref ref-type="disp-formula" rid="Ch1.E9"/>).
The Bayesian approach estimates the joint distribution of <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> conditional on the observations by drawing samples to form an empirical approximation of the high-dimensional posterior.  The empirical posterior efficiently quantifies the uncertainty in the estimation. Therefore, the Bayesian approach has been widely used <xref ref-type="bibr" rid="bib1.bibx25" id="paren.24"><named-content content-type="pre">see the review of</named-content><named-content content-type="post">and the references therein</named-content></xref>.</p>
      <p id="d1e2122">Following Bayes' rule, the joint posterior distribution of <inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> can be written as
            <disp-formula id="Ch1.E10" content-type="numbered"><label>10</label><mml:math id="M69" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where <inline-formula><mml:math id="M70" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the prior of the parameters and <inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>∫</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the unknown marginal probability density function of the observations. 
In the importance sampling approximation to the posterior, we do not need to know the value of <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, because, as a normalizing constant, it cancels in the importance weights of the samples.
The quantity <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the likelihood of the observations <inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> conditional on the state <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> and the parameter <inline-formula><mml:math id="M76" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>,  which can be explicitly derived from the observation model Eq. (<xref ref-type="disp-formula" rid="Ch1.E8"/>):
            <disp-formula id="Ch1.E11" content-type="numbered"><label>11</label><mml:math id="M77" display="block"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∏</mml:mo><mml:mi mathvariant="normal">n</mml:mi></mml:munder><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          with <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> given in Eq. (<xref ref-type="disp-formula" rid="Ch1.E9"/>). Finally, the probability density function of the state <inline-formula><mml:math id="M79" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> given parameter <inline-formula><mml:math id="M80" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> can be derived from the state model Eq. (<xref ref-type="disp-formula" rid="Ch1.E6"/>):
            <disp-formula id="Ch1.E12" content-type="numbered"><label>12</label><mml:math id="M81" display="block"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:munderover><mml:mo movablelimits="false">∏</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:munderover><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          with <inline-formula><mml:math id="M82" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> specified by Eq. (<xref ref-type="disp-formula" rid="Ch1.E7"/>).</p>
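      <p>To illustrate how Eqs. (<xref ref-type="disp-formula" rid="Ch1.E10"/>)–(<xref ref-type="disp-formula" rid="Ch1.E12"/>) fit together, the following Python sketch evaluates the logarithm of the unnormalized joint posterior for a scalar toy model. The relaxation dynamics, the flat prior, the Gaussian initial density, and the assumption that every time step is observed are all hypothetical choices made for brevity; they are not the SEBM setup.</p>
      <preformat preformat-type="code">
```python
import math

def log_normal(x, mean, var):
    # Log density of a scalar Gaussian N(mean, var).
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def log_posterior_unnormalized(theta, u, y, r_var, q_var):
    # log p(theta) + log p_theta(u_{1:N}) + log p(y_{1:N} | u_{1:N}):
    # the log of the numerator of the joint posterior. The marginal density
    # of the observations is dropped, since its value is not needed.
    equilibrium, rate = theta
    # Hypothetical prior: flat on the equilibrium, uniform on rate in (0, 1].
    if 0.0 >= rate or rate > 1.0:
        return -math.inf
    # Hypothetical initial density p_theta(u_1): Gaussian around the equilibrium.
    logp = log_normal(u[0], equilibrium, r_var)
    # Product of transition densities (scalar toy dynamics).
    for n in range(len(u) - 1):
        mean = u[n] + rate * (equilibrium - u[n])
        logp += log_normal(u[n + 1], mean, r_var)
    # Product of observation densities.
    for y_n, u_n in zip(y, u):
        logp += log_normal(y_n, u_n, q_var)
    return logp

u = [1.0, 1.2, 0.9]
y = [1.1, 1.0, 1.0]
print(log_posterior_unnormalized((1.0, 0.5), u, y, 0.1, 0.05))
```
      </preformat>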
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Sampling the posterior by particle MCMC methods</title>
      <p id="d1e2761">In practice, we are interested in expectations of quantities of interest or in the probabilities of certain events. These computations involve integrals with respect to the posterior that can be computed neither analytically nor by numerical quadrature because of the curse of dimensionality: the posterior is a high-dimensional non-Gaussian distribution over variables whose dimension ranges from thousands to millions. Monte Carlo methods generate samples whose empirical distribution approximates the posterior, so that quantities of interest can be computed efficiently.</p>
      <p id="d1e2764">MCMC methods are popular Monte Carlo methods <xref ref-type="bibr" rid="bib1.bibx31" id="paren.25"><named-content content-type="pre">see, e.g.,</named-content></xref> that generate samples along a Markov chain with the posterior as the invariant measure. For joint distributions of parameters and states, a standard MCMC method is Gibbs sampling, which consists of alternately updating the state variable <inline-formula><mml:math id="M83" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> conditional on <inline-formula><mml:math id="M84" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> by sampling
            <disp-formula id="Ch1.E13" content-type="numbered"><label>13</label><mml:math id="M86" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:math></disp-formula>
          and then updating the parameter <inline-formula><mml:math id="M87" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> conditional on <inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> by sampling the conditional posterior of <inline-formula><mml:math id="M89" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>:
            <disp-formula id="Ch1.E14" content-type="numbered"><label>14</label><mml:math id="M90" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>∝</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
          Due to the high dimensionality of <inline-formula><mml:math id="M91" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>,  a major difficulty in sampling <inline-formula><mml:math id="M92" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the design of efficient proposal densities that can effectively explore the support of <inline-formula><mml:math id="M93" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
      <?pagebreak page231?><p id="d1e3153">Another group of rapidly developing MC methods are sequential Monte Carlo (SMC) methods <xref ref-type="bibr" rid="bib1.bibx7 bib1.bibx15" id="paren.26"/> that exploit the sequential structure of state-space models to approximate the posterior densities <inline-formula><mml:math id="M94" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> sequentially. SMC methods are efficient but suffer from the well-known problem of depletion (or degeneracy), in which the marginal distribution <inline-formula><mml:math id="M95" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> becomes concentrated on  a single sample as <inline-formula><mml:math id="M96" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>-</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:math></inline-formula> increases (see Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS2"/> for more details).</p>
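      <p>The depletion effect described above can be reproduced in a few lines. The following is a minimal sketch, assuming a scalar linear Gaussian toy model (not the SEBM of this study) and a bootstrap particle filter with multinomial resampling at every step; the function name, the model coefficients, and the particle count are illustrative assumptions. It tracks which initial particle each current particle descends from and records how many distinct initial ancestors survive after each step.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def surviving_ancestor_counts(num_particles=50, num_steps=1000, seed=0):
    """Count distinct initial ancestors after each step of a bootstrap filter.

    Toy model (an assumption for illustration only):
        u_{n+1} = 0.9 * u_n + N(0, 1),    y_n = u_n + N(0, 0.5^2).
    """
    rng = np.random.default_rng(seed)
    truth = 0.0
    particles = rng.normal(size=num_particles)
    # roots[i] is the index of the initial particle from which particle i descends.
    roots = np.arange(num_particles)
    counts = []
    for _ in range(num_steps):
        # Simulate the hidden state and a noisy observation of it.
        truth = 0.9 * truth + rng.normal()
        y = truth + 0.5 * rng.normal()
        # Propagate particles with the transition density (bootstrap proposal).
        particles = 0.9 * particles + rng.normal(size=num_particles)
        # Weight by the observation likelihood, normalized in a stable way.
        log_w = -0.5 * ((y - particles) / 0.5) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Multinomial resampling; the ancestry is carried along with the particles.
        idx = rng.choice(num_particles, size=num_particles, p=w)
        particles = particles[idx]
        roots = roots[idx]
        counts.append(int(np.unique(roots).size))
    return counts


if __name__ == "__main__":
    counts = surviving_ancestor_counts()
    print("distinct initial ancestors after 10 steps:", counts[9])
    print("distinct initial ancestors after the last step:", counts[-1])
```
]]></preformat>
      <p>Because each resampling step can only discard lineages, the count never increases, and for a long enough run all particles share a single initial ancestor. This is the concentration on a single sample that motivates combining SMC with MCMC moves.</p>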
      <p id="d1e3244">The particle MCMC methods introduced in <xref ref-type="bibr" rid="bib1.bibx2" id="text.27"/> provide a framework for systematically combining SMC methods with MCMC methods, exploiting the strengths of both techniques. In the particle MCMC samplers, SMC algorithms provide high-dimensional proposal distributions, and Markov transitions guide the SMC ensemble to sufficiently explore the target distribution. The transition is realized by a conditional SMC technique, in which a reference trajectory from the previous step is kept throughout the current step of SMC sampling.</p>
      <p id="d1e3251">In this study, we sample the posterior by PGAS <xref ref-type="bibr" rid="bib1.bibx30" id="paren.28"/>, a particle MCMC method that enhances the mixing of the Markov chain by sampling the ancestor of the reference trajectory. For the SMC, we use an optimal particle filter, which takes advantage of the linear Gaussian observation model and the Gaussian transition density of the state variables in our current SEBM. More generally, when the observation model is nonlinear and the transition density is non-Gaussian, the optimal particle filter can be replaced by implicit particle filters <xref ref-type="bibr" rid="bib1.bibx10 bib1.bibx37" id="paren.29"/> or local particle filters <xref ref-type="bibr" rid="bib1.bibx41 bib1.bibx42 bib1.bibx18" id="paren.30"/>;  we refer the reader to <xref ref-type="bibr" rid="bib1.bibx8" id="text.31"/>, <xref ref-type="bibr" rid="bib1.bibx27" id="text.32"/>, and <xref ref-type="bibr" rid="bib1.bibx53" id="text.33"/> for other data assimilation techniques.  The details of the algorithm are provided in Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS3"/>.</p>
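      <p>For a Gaussian transition density and a linear Gaussian observation model, the optimal proposal and its incremental weight are available in closed form. The sketch below illustrates this standard result under stated assumptions; it is not the implementation used in this study. The function name and the symbols <monospace>mu</monospace> (transition mean), <monospace>R</monospace> (transition covariance), <monospace>H</monospace> (observation matrix), and <monospace>Q</monospace> (observation noise covariance) are hypothetical names chosen for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def optimal_proposal(mu, R, H, Q, y):
    """Mean, covariance, and log weight of the optimal proposal.

    Assumed model: u_next ~ N(mu, R),  y = H @ u_next + noise,  noise ~ N(0, Q).
    The optimal proposal is p(u_next | u_prev, y), which is Gaussian, and the
    incremental weight is the predictive density p(y | u_prev).
    """
    R_inv = np.linalg.inv(R)
    Q_inv = np.linalg.inv(Q)
    # Covariance and mean of the Gaussian p(u_next | u_prev, y).
    cov = np.linalg.inv(R_inv + H.T @ Q_inv @ H)
    mean = cov @ (R_inv @ mu + H.T @ Q_inv @ y)
    # Log of the predictive density N(y; H mu, H R H^T + Q).
    S = H @ R @ H.T + Q
    resid = y - H @ mu
    _, logdet = np.linalg.slogdet(2.0 * np.pi * S)
    log_weight = -0.5 * (logdet + resid @ np.linalg.solve(S, resid))
    return mean, cov, log_weight


if __name__ == "__main__":
    # Two state components, only the first one observed.
    mu = np.array([1.0, 1.0])
    R = 0.01 * np.eye(2)
    H = np.array([[1.0, 0.0]])
    Q = np.array([[1e-4]])
    y = np.array([1.05])
    mean, cov, log_weight = optimal_proposal(mu, R, H, Q, y)
    print(mean, np.diag(cov), log_weight)
```
]]></preformat>
      <p>In this example the observed component is pulled strongly toward the observation because the observation noise is much smaller than the transition noise, while the unobserved component keeps its transition mean and variance.</p>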
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Ill-posedness and regularized posteriors</title>
      <p id="d1e3284">In this section, we first demonstrate and then analyze the failure of standard Bayesian inference of the parameters with the posterior in Eq. (<xref ref-type="disp-formula" rid="Ch1.E10"/>). Standard Bayesian inference fails in the sense that the posterior in Eq. (<xref ref-type="disp-formula" rid="Ch1.E10"/>) tends to place a large probability mass on nonphysical parameter values. When the posterior is approximated by samples, the samples often either hit the (upper or lower) bounds in Table <xref ref-type="table" rid="Ch1.T1"/> when we use a uniform prior or exceed these bounds when we use a Gaussian prior. As we shall show next, the standard Bayesian inverse problem is <italic>numerically ill-posed</italic> because the Fisher information matrix is ill-conditioned, which makes the inference numerically unreliable. Following the idea of regularization in variational approaches, we propose using regularized posteriors in the Bayesian inference. This approach unifies the Bayesian and variational approaches: the MAP is the minimizer of the regularized cost function in the variational approach, while the Bayesian approach additionally quantifies the uncertainty of the estimator through the posterior.</p>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Model settings and tests</title>
      <p id="d1e3303">Based on the physical upper and lower bounds in Table <xref ref-type="table" rid="Ch1.T1"/>, we consider two priors for the parameters: a uniform distribution on these intervals and a Gaussian distribution centered at the midpoint of each interval, with a standard deviation equal to one-third of the interval's half-width so that the bounds lie 3 standard deviations from the mean, as listed in Table <xref ref-type="table" rid="Ch1.T2"/>.</p>
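      <p>As a quick check of this construction, the following sketch derives the Gaussian prior from the interval bounds of the uniform prior. The function and variable names are illustrative assumptions; the bounds are those of the uniform prior in this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

# Bounds of the uniform prior for (theta_0, theta_1, theta_4).
LOWER = np.array([27.64, -25.46, -6.00])
UPPER = np.array([32.57, -22.70, -4.80])


def gaussian_prior_from_bounds(lower, upper, num_sd=3.0):
    """Center a Gaussian at the interval midpoint with bounds num_sd SDs away."""
    mean = 0.5 * (lower + upper)
    sd = 0.5 * (upper - lower) / num_sd
    return mean, sd


if __name__ == "__main__":
    mean, sd = gaussian_prior_from_bounds(LOWER, UPPER)
    print("mean:", mean)
    print("standard deviation:", sd)
```
]]></preformat>
      <p>Up to rounding to two decimals, the resulting means and standard deviations agree with the Gaussian prior used in this study.</p>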

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T2"><?xmltex \currentcnt{2}?><label>Table 2</label><caption><p id="d1e3313">The priors of <inline-formula><mml:math id="M97" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> based on the physical constraints in Table <xref ref-type="table" rid="Ch1.T1"/>. </p></caption><oasis:table frame="topbot"><oasis:tgroup cols="2">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Uniform prior</oasis:entry>
         <oasis:entry colname="col2">[27.64, 32.57]<inline-formula><mml:math id="M98" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula>[<inline-formula><mml:math id="M99" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">25.46</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M100" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">22.70</mml:mn></mml:mrow></mml:math></inline-formula>]<inline-formula><mml:math id="M101" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">[<inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6.00</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4.80</mml:mn></mml:mrow></mml:math></inline-formula>]</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Gaussian prior</oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M104" display="inline"><mml:mrow><mml:mi mathvariant="normal">mean</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">30.11</mml:mn><mml:mo>,</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">24.08</mml:mn><mml:mo>,</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">5.40</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M105" display="inline"><mml:mrow><mml:mi mathvariant="normal">covariance</mml:mi><mml:mo>=</mml:mo><mml:mi mathvariant="normal">Diag</mml:mi></mml:mrow></mml:math></inline-formula>(<inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">0.82</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">0.46</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>,  <inline-formula><mml:math id="M108" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">0.20</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>)</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T3"><?xmltex \currentcnt{3}?><label>Table 3</label><caption><p id="d1e3526">The settings of the stochastic energy balance model and its discretization.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="2">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M109" display="inline"><mml:mrow><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.1</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Diffusion constant</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M110" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.1</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Scale of the stochastic forcing</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M111" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Time step size</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M112" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">12</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Number of total nodes</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M113" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">o</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Number of observed nodes</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M114" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="italic">ϵ</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">SD of the observation noise</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e3671">Throughout this study, we shall consider a relatively small numerical mesh for the SPDE with only 12 nodes for the finite elements. Such a  small mesh provides a toy model that can neatly represent the spatial structure on the sphere while allowing for systematic assessments of statistical properties of the Bayesian inference with moderate computational costs. Numerical tests show that the above FEM semi-backward Euler scheme is stable for a time step size <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn></mml:mrow></mml:math></inline-formula> and a stochastic forcing with scale <inline-formula><mml:math id="M116" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.1</mml:mn></mml:mrow></mml:math></inline-formula> (see Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/> for more details about the discretization).  A typical realization of the solution is shown in Fig. <xref ref-type="fig" rid="Ch1.F1"/> (panels a and b), where we present the solution on the sphere at a fixed time with the 12-node finite-element mesh, as well as the trajectories of all 12 nodes.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F1" specific-use="star"><?xmltex \currentcnt{1}?><label>Figure 1</label><caption><p id="d1e3709">A typical realization of the solution to the SEBM. <bold>(a)</bold> The solution at time step <inline-formula><mml:math id="M117" display="inline"><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:math></inline-formula> on the sphere with the 12-node finite-element mesh. <bold>(b)</bold> The trajectories of all 12 nodes over 100 time steps. <bold>(c)</bold> Histogram estimates of the climatological probability distribution of all nodes of the true states (salmon) and the observations (blue). </p></caption>
          <?xmltex \igopts{width=497.923228pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f01.png"/>

        </fig>

      <p id="d1e3739">The standard deviation of the observation noise is set to <inline-formula><mml:math id="M118" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="italic">ϵ</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn></mml:mrow></mml:math></inline-formula>, i.e., 1 order of magnitude smaller than the stochastic forcing and 2 orders of magnitude smaller than the climatological mean.</p>
      <?pagebreak page232?><p id="d1e3757">We first assume that 6 out of the 12 nodes are observed; we discuss results obtained using sparser or denser observations in the discussion section. Figure <xref ref-type="fig" rid="Ch1.F1"/> also shows the climatological probability histogram of the true state variables and the partial noisy observations. The climatological distribution of the observations is close to that of the true state variables (with a slightly larger variance due to the noise). The histograms show that the state variables are centered around <inline-formula><mml:math id="M119" display="inline"><mml:mn mathvariant="normal">1</mml:mn></mml:math></inline-formula> and vary mostly in the interval <inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0.92</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1.05</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. We shall use a Gaussian approximation based on the climatological distribution of the partial noisy observations as a prior to constrain the state variables.</p>
      <p id="d1e3785">We summarize the settings of numerical tests in Table <xref ref-type="table" rid="Ch1.T3"/>.</p>
</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Ill-posedness of the standard Bayesian inference of parameters </title>
      <p id="d1e3798">By the Bernstein–von Mises theorem <xref ref-type="bibr" rid="bib1.bibx52" id="paren.34"><named-content content-type="pre">see, e.g.,</named-content><named-content content-type="post">chap. 10</named-content></xref>, the posterior distribution of the parameters conditional on the true state data approaches the likelihood distribution as the data size increases. That is, <inline-formula><mml:math id="M121" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> in Eq. (<xref ref-type="disp-formula" rid="Ch1.E14"/>) becomes close to the likelihood distribution <inline-formula><mml:math id="M122" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (which can be viewed as a distribution of <inline-formula><mml:math id="M123" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>) as the data size increases.
Therefore, if the likelihood distribution is numerically degenerate (in the sense that some components are undetermined), then the Bayesian posterior will also become close to degenerate, so that the Bayesian inference for parameter estimation will be ill-posed. In the following, we show that for this model the likelihood is degenerate even if the full states are observed with zero observation noise and that the maximum likelihood estimators have large nonphysical fluctuations (particularly when the states are noisy). As a consequence, the standard Bayesian parameter inference fails by yielding nonphysical samples.</p>
      <p id="d1e3869">We show first that the likelihood distribution is numerically degenerate because the Fisher information matrix is ill-conditioned. Following the transition density Eq. (<xref ref-type="disp-formula" rid="Ch1.E7"/>), the log likelihood of the state <inline-formula><mml:math id="M124" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is
            <disp-formula id="Ch1.E15" content-type="numbered"><label>15</label><mml:math id="M125" display="block"><mml:mtable columnspacing="1em" class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>l</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mi>c</mml:mi><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="bold">R</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi 
mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          where <inline-formula><mml:math id="M126" display="inline"><mml:mi>c</mml:mi></mml:math></inline-formula> is a constant independent of <inline-formula><mml:math id="M127" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.
Since <inline-formula><mml:math id="M128" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is linear in <inline-formula><mml:math id="M129" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> (cf. Eq. <xref ref-type="disp-formula" rid="App1.Ch1.S1.E40"/>), the log likelihood is quadratic in <inline-formula><mml:math id="M130" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>, and the corresponding scaled Fisher information matrix is
            <disp-formula id="Ch1.E16" content-type="numbered"><label>16</label><mml:math id="M131" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">F</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:msub><mml:mfenced close=")" open="("><mml:mrow><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:msub><mml:mi>G</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="bold">R</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mi>G</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mrow><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          where the vectors <inline-formula><mml:math id="M132" display="inline"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> are defined in Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E41"/>). As <inline-formula><mml:math id="M133" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>→</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:math></inline-formula>,  the Fisher information matrix converges, by ergodicity of the system, to its expectation <inline-formula><mml:math id="M134" display="inline"><mml:mrow><mml:msub><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msubsup><mml:mi mathvariant="double-struck">E</mml:mi><mml:mo>[</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="bold">A</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>∘</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:msubsup><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:msup><mml:mi mathvariant="bold">C</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mi mathvariant="bold">C</mml:mi><mml:msub><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi 
mathvariant="bold">A</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>∘</mml:mo><mml:mi>l</mml:mi></mml:mrow></mml:msup><mml:mo>]</mml:mo></mml:mrow></mml:mfenced><mml:mrow><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, where the matrices <inline-formula><mml:math id="M135" display="inline"><mml:mi mathvariant="bold">A</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M136" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M137" display="inline"><mml:mi mathvariant="bold">C</mml:mi></mml:math></inline-formula>, arising in the spatial–temporal discretization, are defined in Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS1"/>.
Intuitively, neglecting these matrices and viewing the vector <inline-formula><mml:math id="M138" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> as a scalar, this expectation matrix could be reduced to <inline-formula><mml:math id="M139" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msubsup><mml:mi mathvariant="double-struck">E</mml:mi><mml:mo>[</mml:mo><mml:msubsup><mml:mi>u</mml:mi><mml:mi>n</mml:mi><mml:mi>k</mml:mi></mml:msubsup><mml:msubsup><mml:mi>u</mml:mi><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msubsup><mml:mo>]</mml:mo><mml:msub><mml:mo>)</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, which is ill-conditioned
because <inline-formula><mml:math id="M140" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> has a distribution concentrated near 1 with a standard deviation at the scale of <inline-formula><mml:math id="M141" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> (see Fig. <xref ref-type="fig" rid="Ch1.F1"/>).</p>
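      <p>The scalar intuition above can be checked numerically. The sketch below is a simplified illustration and not the computation behind the results of this study: it assumes that the state is a scalar Gaussian variable with mean 1 and standard deviation 0.01, builds the matrix of moments with exponents 0, 1, and 4, and evaluates its condition number. The positive prefactor is omitted because it does not change the condition number. All function names and default values are assumptions made for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def gaussian_raw_moments(mean, sd, max_order):
    """Raw moments E[u^p], p = 0..max_order, of u ~ N(mean, sd^2)."""
    m = np.zeros(max_order + 1)
    m[0] = 1.0
    m[1] = mean
    for p in range(2, max_order + 1):
        # Recursion: E[u^p] = mean * E[u^(p-1)] + (p - 1) * sd^2 * E[u^(p-2)].
        m[p] = mean * m[p - 1] + (p - 1) * sd**2 * m[p - 2]
    return m


def scalar_fisher_condition_number(mean=1.0, sd=0.01, powers=(0, 1, 4)):
    """Condition number of the matrix (E[u^k u^l]) over the given powers."""
    moments = gaussian_raw_moments(mean, sd, 2 * max(powers))
    fisher = np.array([[moments[k + l] for l in powers] for k in powers])
    return np.linalg.cond(fisher)


if __name__ == "__main__":
    for sd in (0.1, 0.05, 0.01):
        cond = scalar_fisher_condition_number(sd=sd)
        print(f"sd = {sd:5.2f}   condition number = {cond:.2e}")
```
]]></preformat>
      <p>The condition number grows rapidly as the standard deviation shrinks, because the functions 1, <italic>u</italic>, and <italic>u</italic> to the fourth power become nearly collinear when <italic>u</italic> stays close to 1.</p>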
      <p id="d1e4479">Figure <xref ref-type="fig" rid="Ch1.F2"/> shows the means and standard deviations of the condition numbers (the ratio between the maximum and the minimum singular values) of the Fisher information matrices from 100 independent simulations. Each of these simulations generates a long trajectory of length <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">5</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> using a parameter drawn randomly from the prior and computes the Fisher information matrices using the true trajectory of all 12 nodes, for subsamples of lengths <inline-formula><mml:math id="M143" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> ranging from <inline-formula><mml:math id="M144" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> to <inline-formula><mml:math id="M145" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">5</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. For both the Gaussian and uniform priors, the condition numbers are on the scale of 10<inline-formula><mml:math id="M146" display="inline"><mml:msup><mml:mi/><mml:mn mathvariant="normal">8</mml:mn></mml:msup></mml:math></inline-formula>–10<inline-formula><mml:math id="M147" display="inline"><mml:msup><mml:mi/><mml:mn mathvariant="normal">11</mml:mn></mml:msup></mml:math></inline-formula>, and therefore the Fisher information matrix is ill-conditioned. In particular, the condition number increases as the data size increases, due to the ill-posedness of the inverse problem of parameter estimation.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F2" specific-use="star"><?xmltex \currentcnt{2}?><label>Figure 2</label><caption><p id="d1e4546">The mean and standard deviation of the condition numbers of the Fisher information matrices, computed using true trajectories, out of 100 simulations of length ranging from <inline-formula><mml:math id="M148" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> to <inline-formula><mml:math id="M149" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">5</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. The condition numbers are at the scale of 10<inline-formula><mml:math id="M150" display="inline"><mml:msup><mml:mi/><mml:mn mathvariant="normal">8</mml:mn></mml:msup></mml:math></inline-formula>–10<inline-formula><mml:math id="M151" display="inline"><mml:msup><mml:mi/><mml:mn mathvariant="normal">11</mml:mn></mml:msup></mml:math></inline-formula>, indicating that the Fisher information matrix is ill-conditioned. </p></caption>
          <?xmltex \igopts{width=341.433071pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f02.png"/>

        </fig>

      <p id="d1e4599">The ill-conditioned Fisher information matrix leads to highly variable maximum likelihood estimators (MLEs), computed from <inline-formula><mml:math id="M152" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">F</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi>N</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> with
<inline-formula><mml:math id="M153" display="inline"><mml:mrow><mml:msub><mml:mi>b</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:msubsup><mml:msub><mml:mi>G</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mi mathvariant="bold">R</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, which follows from Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E41"/>).</p>
      <?pagebreak page233?><p id="d1e4743">The ill-posedness is particularly problematic when <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is observed with noise, as the ill-conditioned Fisher information matrix amplifies the noise in the observations and leads to nonphysical estimators. Figure <xref ref-type="fig" rid="Ch1.F3"/> shows the means and standard deviations of the errors of the MLEs computed from true and noisy trajectories in 100 independent simulations. In each of these simulations, the “noisy” trajectory is obtained by adding white noise with standard deviation <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="italic">ϵ</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn></mml:mrow></mml:math></inline-formula> to a “true” trajectory generated from the system with a true parameter randomly drawn from the prior. For both Gaussian and uniform priors, the standard deviations and means of the errors of the MLEs from the noisy trajectories are 1 order of magnitude larger than those from the true trajectories. In particular, the variations are large when the data size is small.
For example, when <inline-formula><mml:math id="M156" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>,  the standard deviation of the MLE for <inline-formula><mml:math id="M157" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> from noisy observations is  on the order of <inline-formula><mml:math id="M158" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">3</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, 2 orders of magnitude larger than its physical range in Table <xref ref-type="table" rid="Ch1.T2"/>.</p>
      <p id="d1e4820">The standard deviations decrease as the data size increases, at the expected rate of <inline-formula><mml:math id="M159" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:msqrt><mml:mi>N</mml:mi></mml:msqrt></mml:mrow></mml:math></inline-formula>. However, the errors are too large to be reduced in practice by increasing the data size: for example, a data size <inline-formula><mml:math id="M160" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">10</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> is needed to reduce the standard deviation of the MLE for <inline-formula><mml:math id="M161" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> to less than <inline-formula><mml:math id="M162" display="inline"><mml:mn mathvariant="normal">0.1</mml:mn></mml:math></inline-formula> (which is about 10 % of the size of the physical range <inline-formula><mml:math id="M163" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6.00</mml:mn><mml:mo>,</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">4.80</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> as specified in Table <xref ref-type="table" rid="Ch1.T2"/>).
In summary, the ill-posedness leads to parameter estimators whose variations are large and that fall far outside the physical ranges of the parameters.</p>
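      <p>The data-size argument above can be sketched as simple arithmetic under the assumption that the standard deviation scales exactly as the inverse square root of the data size. The helper <monospace>required_data_size</monospace> and the reference values below are illustrative assumptions (a standard deviation of order 1000 at a data size of 100, the order quoted above for noisy observations), not values read from the figure.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def required_data_size(n_ref, std_ref, std_target):
    """Data size at which a std decaying like 1/sqrt(N) reaches std_target.

    Assumes std(N) = std_ref * sqrt(n_ref / N), so that
    N = n_ref * (std_ref / std_target) ** 2.
    """
    return n_ref * (std_ref / std_target) ** 2


# Illustrative reference point: std of order 1e3 at N = 100, target std 0.1
n_needed = required_data_size(n_ref=100, std_ref=1.0e3, std_target=0.1)
print(f"required data size: {n_needed:.1e}")
```
]]></preformat>
      <p>Under these assumed values, a reduction of the standard deviation by four orders of magnitude requires eight more orders of magnitude of data, which is why increasing the data size alone is impractical.</p>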

      <?xmltex \floatpos{t}?><fig id="Ch1.F3" specific-use="star"><?xmltex \currentcnt{3}?><label>Figure 3</label><caption><p id="d1e4894">The standard deviations and means of the errors of the MLEs, computed from true and noisy trajectories, out of 100 independent simulations with true parameters sampled from the Gaussian and uniform priors. In all cases, the deviations and biases (i.e., means of errors) are  large.  In  particular, in the case of noisy observations, the deviations are on orders ranging from 10 to 1000, far beyond the physical ranges of the parameters in Table <xref ref-type="table" rid="Ch1.T1"/>. Though the deviations decrease as data size increases, an impractically large data size is needed to reduce them to a physical range. Also, the means of errors are larger than the size of physical ranges of the parameters, with values that decay slowly as data size increases. </p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f03.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>Regularized posteriors</title>
      <p id="d1e4913">To overcome the ill-posedness of the parameter estimation problem, we introduce strongly regularized posteriors by normalizing the likelihood function. In addition, to prevent unphysical values of the states, we further regularize the state variables in the likelihood by an uninformative climatological prior. That is, consider the <italic>regularized posterior</italic>:
            <disp-formula id="Ch1.E17" content-type="numbered"><label>17</label><mml:math id="M164" display="block"><mml:mtable rowspacing="0.2ex" columnspacing="1em" class="split" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>N</mml:mi></mml:msup></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>Z</mml:mi></mml:mfrac></mml:mstyle><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:msup><mml:mfenced close="]" open="["><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          where <inline-formula><mml:math id="M165" display="inline"><mml:mrow><mml:mi>Z</mml:mi><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mo>∫</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:msup><mml:mfenced open="[" close="]"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle></mml:mfenced><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">d</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is a normalizing constant and 
<inline-formula><mml:math id="M166" display="inline"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the prior of the states  estimated from a Gaussian fit to climatological statistics of the observations, neglecting correlations.  That is, we set <inline-formula><mml:math id="M167" display="inline"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> as
            <disp-formula id="Ch1.E18" content-type="numbered"><label>18</label><mml:math id="M168" display="block"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>:=</mml:mo><mml:munderover><mml:mo movablelimits="false">∏</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:mi mathvariant="italic">π</mml:mi><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>c</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle><mml:mi>exp⁡</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>c</mml:mi></mml:msub><mml:msup><mml:mo>|</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mn mathvariant="normal">2</mml:mn><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>c</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          with <inline-formula><mml:math id="M169" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:msqrt><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>o</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M170" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M171" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the mean and standard deviation of the observations over all states. The multiplicative factor 2 widens the band so that the prior for the states is not overly narrow.</p>
      <p id="d1e5443">This prior can be viewed as a joint distribution of the state variables in which all components are assumed to be independent and identically Gaussian distributed with mean <inline-formula><mml:math id="M172" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and variance <inline-formula><mml:math id="M173" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi>c</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup></mml:mrow></mml:math></inline-formula>. It uses minimal information about the state variables, and we expect that it can be improved in practice by taking into account spatial correlations or additional field knowledge.</p>
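      <p>A minimal sketch of how the climatological prior described above could be assembled from observations is given below; it is an illustration under stated assumptions, not the code used in this study. The function names are hypothetical, the observation statistics are taken as the population mean and variance over all entries of the observation array, and the log density is returned only up to an additive constant so that no assumption is made about the normalizing factor. The sketch also assumes that the observation variance exceeds the noise variance, so that the square root is real.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np


def climatological_prior(obs, sigma_eps, factor=2.0):
    """Mean and width of the Gaussian state prior from observation statistics.

    obs       : array of observations (any shape); statistics use all entries
    sigma_eps : standard deviation of the observation noise
    factor    : widening factor (2 in the text)
    """
    u_bar = obs.mean()
    var_obs = obs.var()  # population variance (assumption: ddof = 0)
    if var_obs <= sigma_eps**2:
        raise ValueError("observation variance must exceed noise variance")
    sigma_c = factor * np.sqrt(var_obs - sigma_eps**2)
    return u_bar, sigma_c


def log_prior_unnormalized(u, u_bar, sigma_c):
    """Log of the climatological prior of the states, up to a constant."""
    return -np.sum((u - u_bar) ** 2) / (2.0 * sigma_c**2)


# Hypothetical observations: 2 times by 2 nodes
obs = np.array([[0.0, 2.0], [2.0, 0.0]])
u_bar, sigma_c = climatological_prior(obs, sigma_eps=0.6)
print(u_bar, sigma_c)
```
]]></preformat>
      <p>Subtracting the noise variance from the observation variance gives an estimate of the variance of the states themselves, and the widening factor then loosens the resulting prior.</p>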
      <?pagebreak page234?><p id="d1e5473">The regularized posterior can be viewed as an extension of the regularized cost function in the variational approach. In fact, the negative logarithm of the  regularized posterior is the same (up to a multiplicative factor <inline-formula><mml:math id="M174" display="inline"><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle></mml:math></inline-formula> and an additive constant <inline-formula><mml:math id="M175" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mi>Z</mml:mi><mml:mo>-</mml:mo><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:mi>log⁡</mml:mi><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>) as the cost function in variational approaches with regularization. More precisely, we have
            <disp-formula id="Ch1.E19" content-type="numbered"><label>19</label><mml:math id="M176" display="block"><mml:mtable class="split" columnspacing="1em" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>-</mml:mo><mml:mi>log⁡</mml:mi><mml:msup><mml:mi>p</mml:mi><mml:mi>N</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>+</mml:mo><mml:mi>log⁡</mml:mi><mml:mi>Z</mml:mi><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:mi>log⁡</mml:mi><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          where <inline-formula><mml:math id="M177" display="inline"><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the cost function with regularization:
            <disp-formula id="Ch1.E20" content-type="numbered"><label>20</label><mml:math id="M178" display="block"><mml:mtable rowspacing="0.2ex" class="split" columnspacing="1em" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>-</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mi>log⁡</mml:mi><mml:mo mathsize="1.1em">[</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo mathsize="1.1em">]</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>-</mml:mo><mml:mi>N</mml:mi><mml:mi>log⁡</mml:mi><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mi>log⁡</mml:mi><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          When the prior is Gaussian, the regularization corresponds to  Tikhonov regularization. Therefore, the regularized posterior extends the regularized cost function to a probability distribution, with the MAP being the minimizer of the regularized cost function.</p>
      <p id="d1e5852">The regularized posterior normalizes the likelihood by an exponent <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula>. This normalization places a larger weight (more trust) on the prior, which can then sufficiently regularize the singularity in the likelihood and thereby reduce the probability of nonphysical samples. Intuitively, it avoids the shrinking of the likelihood as the data size increases.
When the system is ergodic, the sum <inline-formula><mml:math id="M180" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:msubsup><mml:mi>log⁡</mml:mi><mml:mo mathsize="1.1em">[</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo mathsize="1.1em">]</mml:mo></mml:mrow></mml:math></inline-formula> converges to the spatial average <inline-formula><mml:math id="M181" display="inline"><mml:mrow><mml:mi mathvariant="double-struck">E</mml:mi><mml:mo>[</mml:mo><mml:mi>log⁡</mml:mi><mml:mo mathsize="1.1em">[</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> with respect to the invariant measure as <inline-formula><mml:math id="M182" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> increases. 
Although this factor is effective, it may not be optimal <xref ref-type="bibr" rid="bib1.bibx39" id="paren.35"/>, and we leave the exploration of optimal regularization factors to future work.</p>
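      <p>The effect of the exponent can be illustrated with a deliberately simple toy problem that is not the energy balance model of this study: estimating the mean of Gaussian data with known noise level under a Gaussian prior. In this conjugate setting, raising the likelihood to the power of one over the data size replaces the effective number of data points by one, so the prior keeps a fixed weight. The function name and all numerical values below are hypothetical.</p>
      <preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np


def gaussian_posterior(x, prior_mean, prior_std, noise_std, temper=False):
    """Posterior mean and std for the mean of i.i.d. Gaussian data.

    With temper=True the likelihood is raised to the power 1/N, which
    replaces the effective number of data points N by 1.
    """
    n = x.size
    weight = 1.0 if temper else float(n)
    precision = 1.0 / prior_std**2 + weight / noise_std**2
    mean = (prior_mean / prior_std**2
            + weight * x.mean() / noise_std**2) / precision
    return mean, 1.0 / np.sqrt(precision)


rng = np.random.default_rng(0)
for n in (10**2, 10**4):
    x = rng.normal(loc=0.3, scale=1.0, size=n)
    standard = gaussian_posterior(x, 0.0, 1.0, 1.0)
    tempered = gaussian_posterior(x, 0.0, 1.0, 1.0, temper=True)
    print(n, standard, tempered)
```
]]></preformat>
      <p>In this toy setting the standard posterior narrows as the data size grows, while the width of the tempered posterior does not depend on the data size; the data enter only through their sample mean.</p>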
      <p id="d1e6014">In the sampling of the regularized posterior, we update the state variable <inline-formula><mml:math id="M183" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> conditional on <inline-formula><mml:math id="M184" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M185" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> by sampling <inline-formula><mml:math id="M186" display="inline"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>c</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (with <inline-formula><mml:math id="M187" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> specified in Eq. <xref ref-type="disp-formula" rid="Ch1.E13"/>) using SMC methods. Compared to the standard PMCMC algorithm outlined in Sect. <xref ref-type="sec" rid="Ch1.S2.SS4"/>, the only difference occurs when we update the parameter <inline-formula><mml:math id="M188" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> conditional on the estimated states <inline-formula><mml:math id="M189" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. Instead of  Eq. (<xref ref-type="disp-formula" rid="Ch1.E14"/>), we draw a sample of <inline-formula><mml:math id="M190" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> from the regularized posterior
            <disp-formula id="Ch1.E21" content-type="numbered"><label>21</label><mml:math id="M191" display="block"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mi>N</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mo>∝</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo><mml:mo>[</mml:mo><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:msup><mml:mo>]</mml:mo><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
</sec>
</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Bayesian inference with regularized posteriors</title>
      <p id="d1e6292">The regularized posteriors are approximated by the empirical distribution of samples drawn using particle MCMC methods, specifically PGAS (see Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS3"/>) in combination with SMC using optimal importance sampling (see Sect. <xref ref-type="sec" rid="App1.Ch1.S1.SS2"/>). In the following section, we  first diagnose the Markov chain and choose a reasonable chain length for subsequent analyses. We then present the results of parameter estimation and state estimation.</p>
      <p id="d1e6299">In all the tests presented in this study, we use only <inline-formula><mml:math id="M192" display="inline"><mml:mrow><mml:mi>M</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula> particles for the SMC, because theoretical results indicate that the Markov chain produced by the particle MCMC methods converges to the target distribution <xref ref-type="bibr" rid="bib1.bibx2 bib1.bibx30" id="paren.36"><named-content content-type="pre">see</named-content></xref>. In general, using more particles improves the performance of the SMC algorithm (and hence of the particle MCMC methods), at the price of increased computational cost.</p>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Diagnosis of the Markov chain Monte Carlo algorithm</title>
      <p id="d1e6326">To ensure that the Markov chain generated by PGAS is well-mixed and to find a length for the chain such that the posterior is acceptably approximated, we assess the Markov chain by three criteria: the update rate of the states, the correlation length of the Markov chain, and the convergence of the marginal posteriors of the parameters. These empirical criteria are convenient and, as we discuss below, have been found to be effective in our study. We refer to
<xref ref-type="bibr" rid="bib1.bibx13" id="text.37"/> for a detailed review of various criteria for diagnosing MCMC.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F4" specific-use="star"><?xmltex \currentcnt{4}?><label>Figure 4</label><caption><p id="d1e6334">The update rate  of the states at different times along the trajectory. The high update rate at time <inline-formula><mml:math id="M193" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> is due to the initialization of the particles near the equilibrium and the ancestor sampling. The high update rate at the end time is due to the nature of the SMC filter. Note that the uniform prior has update rates close to 1 at all times.  </p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f04.png"/>

        </fig>

      <?xmltex \floatpos{t}?><fig id="Ch1.F5" specific-use="star"><?xmltex \currentcnt{5}?><label>Figure 5</label><caption><p id="d1e6357">The empirical autocorrelation functions (ACFs) of the Markov chain of parameters <inline-formula><mml:math id="M194" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and states <inline-formula><mml:math id="M195" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> at times <inline-formula><mml:math id="M196" display="inline"><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">10</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">40</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">90</mml:mn><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> and nodes <inline-formula><mml:math id="M197" display="inline"><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">8</mml:mn><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, computed from a Markov chain with length 10 000. The ACFs fall within a threshold of 0.1 around zero within a time lag of about 25 for the Gaussian prior, and a time lag of about 5 for the uniform prior.</p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f05.png"/>

        </fig>

      <?pagebreak page235?><p id="d1e6452">The update rate of states is computed at each time of the state trajectory <inline-formula><mml:math id="M198" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> along the Markov chain. That is, at each time, we say the state is updated from the previous step of the Markov chain if any entry of the state vector changes. The update rate measures the mixing of the Markov chain. In general, an update rate above 0.5 is preferred, but a high rate close to 1 is not necessarily the best. Figure <xref ref-type="fig" rid="Ch1.F4"/> shows the update rates of typical simulations for both the Gaussian prior and the uniform prior. For both priors, the update rates are above 0.5, indicating fast mixing of the chain. The rates tend to increase with time (except for the first time step) to a value close to 1 at the end of the trajectory. This behavior is consistent with the particle depletion inherent in the SMC filter: when tracing back in time to sample the ancestors, there are fewer particles and therefore the update rate is lower. The high update rate at time <inline-formula><mml:math id="M199" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> is due to our initialization of the particles near the equilibrium, which increases the probability of ancestor updates in PGAS.
We also note that the uniform prior has update rates close to 1 at all times, much higher than the rates of the Gaussian prior.
Higher update rates occur for the uniform prior because the deviations of parameter samples from the previous values are larger, resulting in an increased probability of updating the reference trajectory in the conditional SMC.</p>
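      <p>The update rate described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in this study: the function name <monospace>update_rate</monospace> and the array layout (chain step, time, state entry) are assumptions made for the example, and the toy chain is synthetic.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def update_rate(chain):
    """Fraction of chain steps at which the state at each time changed.

    chain: array of shape (L, N, K) holding L sampled trajectories, each
           with N time steps and K state entries (hypothetical layout).
    Returns an array of length N with values in [0, 1].
    """
    chain = np.asarray(chain)
    # A state counts as updated if any of its K entries differs
    # from the same state in the previous sample of the chain.
    changed = np.any(chain[1:] != chain[:-1], axis=2)
    return changed.mean(axis=0)

# Toy chain: 4 samples, 2 time steps, 1 state entry per time.
toy = np.array([[[0.0], [1.0]],
                [[0.0], [2.0]],
                [[0.5], [3.0]],
                [[0.5], [4.0]]])
rates = update_rate(toy)
```
]]></preformat>
      <p>In the toy chain, the state at the first time changes in one of the three transitions and the state at the second time changes in all three, so the two entries of <monospace>rates</monospace> are 1/3 and 1.</p>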
      <p id="d1e6485">We test the correlation length of the Markov chain by finding the smallest lag at which the empirical autocorrelation functions (ACFs) of the states and the parameters are close to zero.</p>
      <p id="d1e6488">Figure <xref ref-type="fig" rid="Ch1.F5"/> shows the empirical ACFs of the parameters and states at representative nodes, computed using a Markov chain with length 10 000. The ACFs approach zero within a time lag of around 25 (based on a threshold value of 0.1) for the Gaussian prior, and within a time lag of around 5 for the uniform prior. The smaller correlation length for the uniform prior is again due to the parameter samples varying more under the uniform prior than under the Gaussian prior.</p>
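      <p>The correlation-length criterion can be illustrated as follows. This is a sketch under stated assumptions, not the implementation used in this study: the helper names <monospace>empirical_acf</monospace> and <monospace>decorrelation_lag</monospace> are hypothetical, and the chain is a synthetic AR(1) series standing in for one component of the Markov chain. Only the threshold of 0.1 is taken from the text.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def empirical_acf(x, max_lag):
    """Empirical autocorrelation of a scalar chain x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    return np.array([np.dot(x[:n - k], x[k:]) / (n * var)
                     for k in range(max_lag + 1)])

def decorrelation_lag(acf, threshold=0.1):
    """Smallest lag at which |ACF| falls below the threshold, or None."""
    below = np.flatnonzero(np.abs(acf) < threshold)
    return int(below[0]) if below.size else None

# Synthetic AR(1) chain used only for illustration (not data from the paper).
rng = np.random.default_rng(0)
phi = 0.8
x = np.zeros(10_000)
for i in range(1, len(x)):
    x[i] = phi * x[i - 1] + rng.standard_normal()

acf = empirical_acf(x, max_lag=50)
lag = decorrelation_lag(acf, threshold=0.1)
```
]]></preformat>
      <p>Applying the same two functions to each parameter and to the states at selected times and nodes gives one decorrelation lag per component; the largest of these is a conservative estimate of the correlation length of the chain.</p>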

      <?xmltex \floatpos{t}?><fig id="Ch1.F6" specific-use="star"><?xmltex \currentcnt{6}?><label>Figure 6</label><caption><p id="d1e6495">The empirical marginal distributions of the samples from the posterior as the length of the Markov chain increases. Note that the marginal posteriors converge rapidly as the length of the chain increases. In particular, a chain with length 1000 provides a reasonable approximation to the posterior, capturing the shape and spread of the distribution.  </p></caption>
          <?xmltex \igopts{width=426.791339pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f06.png"/>

        </fig>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T4"><?xmltex \currentcnt{4}?><label>Table 4</label><caption><p id="d1e6507">The settings of the particle MCMC using SMC with optimal importance densities. </p></caption><oasis:table frame="topbot"><oasis:tgroup cols="2">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M200" display="inline"><mml:mrow><mml:mi>M</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Number of particles in SMC</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M201" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Length of the Markov chain</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><inline-formula><mml:math id="M202" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col2">Number of time steps of observations</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e6583">The relatively small decorrelation length of the Markov chain indicates that we can accurately approximate the posterior by a chain of a relatively short length. This result is demonstrated in Fig. <xref ref-type="fig" rid="Ch1.F6"/>, where we plot the empirical marginal<?pagebreak page236?> posteriors of the parameters, using Markov chains of three different lengths: <inline-formula><mml:math id="M203" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1000</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">5000</mml:mn></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M204" display="inline"><mml:mrow><mml:mn mathvariant="normal">10</mml:mn><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">000</mml:mn></mml:mrow></mml:math></inline-formula>. The marginal posteriors with <inline-formula><mml:math id="M205" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1000</mml:mn></mml:mrow></mml:math></inline-formula> are reasonably close to those with <inline-formula><mml:math id="M206" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, and those with <inline-formula><mml:math id="M207" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5000</mml:mn></mml:mrow></mml:math></inline-formula> are almost identical to those with <inline-formula><mml:math id="M208" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. 
In particular, the marginal posteriors with <inline-formula><mml:math id="M209" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">3</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> capture the shape and spread of the distributions for <inline-formula><mml:math id="M210" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. Therefore, a Markov chain with length <inline-formula><mml:math id="M211" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> provides a reasonably accurate approximation of the posterior. Hence, we use Markov chains with length <inline-formula><mml:math id="M212" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> in all simulations from here on.  This choice of chain length may be longer than necessary, but allows for confidence that the results are robust.</p>
      <p id="d1e6730">In summary, guided by the above diagnosis of the Markov chain generated by particle MCMC and by the need to run many simulations for statistical analysis of the algorithm at limited computational cost, we use chains with length <inline-formula><mml:math id="M213" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> to approximate the posterior. For the SMC algorithm, we use only five particles. The number of observation times is <inline-formula><mml:math id="M214" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>Parameter estimation</title>
      <p id="d1e6768">One of the main goals in Bayesian inference is to quantify the uncertainty in the parameter-state estimation by the posterior. We assess the parameter estimation by examining the samples of the posterior in a typical simulation, for which we consider the scatter plots and marginal distributions, the MAP, and the posterior mean. We also examine the statistics of the MAP and the posterior mean in 100 independent simulations. In each simulation, the parameters are drawn from the prior distribution of <inline-formula><mml:math id="M215" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>. Then, a realization of the SEBM is simulated. Finally, observations are created by applying the observation model to the SEBM realization.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F7" specific-use="star"><?xmltex \currentcnt{7}?><label>Figure 7</label><caption><p id="d1e6780">Posteriors of the parameters in a typical simulation, with both the Gaussian and the uniform prior. The true values of the parameters, as well as the data trajectory, are the same for both priors. The top row displays scatter plots of the samples (blue dots), with the true values of the parameters shown by asterisks. The bottom row displays the marginal posteriors (blue lines) of each component of the parameters and the priors (black dash-dot lines), with the posterior mean marked by diamonds and the true values marked by asterisks. The posterior correlations are <inline-formula><mml:math id="M216" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mn mathvariant="normal">01</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.20</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M217" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mn mathvariant="normal">04</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.19</mml:mn></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M218" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mn mathvariant="normal">14</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.57</mml:mn></mml:mrow></mml:math></inline-formula> in the case of the Gaussian prior and <inline-formula><mml:math id="M219" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mn mathvariant="normal">01</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.23</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M220" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mn 
mathvariant="normal">04</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M221" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ρ</mml:mi><mml:mn mathvariant="normal">14</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.05</mml:mn></mml:mrow></mml:math></inline-formula> in the case of the uniform prior.</p></caption>
          <?xmltex \igopts{width=426.791339pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f07.png"/>

        </fig>

      <p id="d1e6888">The empirical marginal posteriors of the parameters <inline-formula><mml:math id="M222" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> in two typical simulations, for the Gaussian and uniform priors, are shown in Fig. <xref ref-type="fig" rid="Ch1.F7"/>. The top row presents scatter plots of samples along with the true values of the parameters (asterisks) and the bottom row presents the marginal posteriors for each parameter in comparison with the priors.</p>
      <p id="d1e6927">In the case of the Gaussian prior, the scatter plots show a posterior that is far from Gaussian, with clear nonlinear dependence between <inline-formula><mml:math id="M223" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and the other parameters. The marginal posteriors of <inline-formula><mml:math id="M224" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M225" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> are close to their priors, with larger tails (to the left for <inline-formula><mml:math id="M226" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and to the right for <inline-formula><mml:math id="M227" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>). The marginal distribution of <inline-formula><mml:math id="M228" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> concentrates near the center of the prior with a larger tail to the right.  
The posterior has most of its probability mass near the true values of <inline-formula><mml:math id="M229" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M230" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, which lie in the high-probability region of the prior. However, it has no probability mass near the true value of <inline-formula><mml:math id="M231" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, which lies in a low-probability region of the prior.</p>
      <p id="d1e7030">In the case of the uniform prior, the scatter plots show a concentration of probability near the boundaries of the physical range.  The marginal posteriors of <inline-formula><mml:math id="M232" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M233" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> clearly deviate  from the priors, concentrating near the parameter bounds (the upper bound for <inline-formula><mml:math id="M234" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and the lower bound for <inline-formula><mml:math id="M235" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> in this realization); the marginal posterior of <inline-formula><mml:math id="M236" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is close to the prior, with slightly more probability mass for large values.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F8" specific-use="star"><?xmltex \currentcnt{8}?><label>Figure 8</label><caption><p id="d1e7090">The marginal posteriors with a different set of true values for the parameters. The marginal posteriors change little from those in Fig. <xref ref-type="fig" rid="Ch1.F7"/>.</p></caption>
          <?xmltex \igopts{width=426.791339pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f08.png"/>

        </fig>

      <p id="d1e7101">Further tests show that the posterior is not sensitive to changes in the true values of the parameters. This fact is demonstrated in Fig. <xref ref-type="fig" rid="Ch1.F8"/>, which presents the marginal distributions for another set of true values of the parameters (but<?pagebreak page237?> without changing the priors). Although the data change when the true parameters change, the posteriors change little from those in Fig. <xref ref-type="fig" rid="Ch1.F7"/>, for both the Gaussian and the uniform prior.</p>
      <p id="d1e7108">The non-Gaussianity of the posterior (including the concentration near the boundaries), its insensitivity to changes in the true parameter, and its limited reduction of uncertainty from the prior (Figs. <xref ref-type="fig" rid="Ch1.F7"/>–<xref ref-type="fig" rid="Ch1.F8"/>) are due to the degeneracy of the likelihood distribution and to the strong regularization. Recall that the degenerate likelihood leads to MLEs with large variations and biases, with the standard deviations of the estimators of <inline-formula><mml:math id="M237" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M238" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> being about 10 times larger than that of <inline-formula><mml:math id="M239" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> (see Fig. <xref ref-type="fig" rid="Ch1.F3"/>). As a result, when regularized by the Gaussian prior, the components <inline-formula><mml:math id="M240" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M241" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, which are more under-determined by the likelihood, are constrained mainly by the Gaussian prior, and therefore their marginal posteriors are close to their marginal priors. 
In contrast, the component <inline-formula><mml:math id="M242" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is forced to concentrate around the center of the prior but with a large tail. While dramatically reducing the large uncertainty of <inline-formula><mml:math id="M243" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M244" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> in the ill-conditioned likelihood, the regularized posterior still exhibits a slightly larger uncertainty than the prior for these two components.</p>
      <?pagebreak page238?><p id="d1e7207">In the case of the uniform prior, it is particularly noteworthy that the marginal posteriors of <inline-formula><mml:math id="M245" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M246" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> differ more from their priors than does that of <inline-formula><mml:math id="M247" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. These results are the opposite of what was found for the Gaussian prior. Such differences are due to the different mechanisms of “regularization” by the two priors. The Gaussian prior eliminates the ill-posedness by regularizing the ill-conditioned Fisher information matrix with the covariance of the prior. So, the information in the likelihood, e.g., the bias and the correlations between <inline-formula><mml:math id="M248" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M249" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, is preserved in the regularized posterior. The uniform prior, on the other hand, cuts the support of the degenerate likelihood and rejects out-of-range samples. 
As a result, the correlation between <inline-formula><mml:math id="M250" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M251" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is preserved in the regularized posterior because they feature similar variations, but the correlations between <inline-formula><mml:math id="M252" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M253" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> as well as <inline-formula><mml:math id="M254" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> are weakened (Fig. <xref ref-type="fig" rid="Ch1.F7"/>).</p>
      <p id="d1e7334">In practice, one is often interested in a point estimate of parameters.  Commonly used point estimators are the MAP and the posterior mean. Figures <xref ref-type="fig" rid="Ch1.F7"/>–<xref ref-type="fig" rid="Ch1.F8"/> show that both the MAP and the posterior mean  can be far away from the truth for Gaussian as well as uniform priors. In particular, in the case of the uniform prior, the MAP values are further away from the truth than the posterior mean. In the case of the Gaussian prior, the MAP values do not present a clear advantage or disadvantage over the posterior mean.</p>
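      <p>The two point estimators, and error statistics of the kind reported over independent simulations, can be sketched as follows. This is an illustration under explicit assumptions, not the procedure used in this study: we assume that an unnormalized log-posterior value is stored for every sample of the chain and take the MAP to be the sample that maximizes it. The function names and the toy numbers are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def point_estimates(samples, log_post):
    """Posterior mean and sample-based MAP from MCMC output.

    samples:  array of shape (L, d), one parameter vector per chain step.
    log_post: array of shape (L,), unnormalized log posterior of each
              sample (assumed to be available; hypothetical input).
    """
    samples = np.asarray(samples, dtype=float)
    log_post = np.asarray(log_post, dtype=float)
    post_mean = samples.mean(axis=0)
    # Sample-based MAP: the visited sample with the highest log posterior.
    map_est = samples[np.argmax(log_post)]
    return post_mean, map_est

def error_summary(estimates, truths):
    """Mean and standard deviation of the errors over simulations."""
    err = np.asarray(estimates, dtype=float) - np.asarray(truths, dtype=float)
    return err.mean(axis=0), err.std(axis=0)

# Toy numbers for illustration only (not results from the paper).
samples = np.array([[0.0, 0.0], [1.0, 2.0], [5.0, 4.0]])
log_post = np.array([-3.0, -1.0, -2.0])
post_mean, map_est = point_estimates(samples, log_post)

err_mean, err_std = error_summary([[1.0, 2.0], [3.0, 2.0]],
                                  [[0.0, 0.0], [0.0, 0.0]])
```
]]></preformat>
      <p>A sample-based MAP depends on how well the chain has explored the neighborhood of the mode, which is one reason why it can be more variable than the posterior mean.</p>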

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T5" specific-use="star"><?xmltex \currentcnt{5}?><label>Table 5</label><caption><p id="d1e7344">Means and standard deviations of the errors of the posterior mean and MAP in 100 independent simulations. </p></caption><oasis:table frame="topbot"><oasis:tgroup cols="5">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">

         <oasis:entry namest="col1" nameend="col5"><bold>(a)</bold> The case of observing 6 of the 12 nodes </oasis:entry>

       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">

         <oasis:entry colname="col1"/>

         <oasis:entry colname="col2"/>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M255" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M256" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M257" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row>

         <oasis:entry rowsep="1" colname="col1" morerows="1">Gaussian prior</oasis:entry>

         <oasis:entry colname="col2">Posterior mean</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M258" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.44</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.58</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M259" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.09</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.42</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M260" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.11</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.20</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row rowsep="1">

         <oasis:entry colname="col2">MAP</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M261" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.32</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.61</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M262" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.02</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.42</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M263" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.03</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.21</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row>

         <oasis:entry rowsep="1" colname="col1" morerows="1">Uniform prior</oasis:entry>

         <oasis:entry colname="col2">Posterior mean</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M264" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.75</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.06</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M265" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.31</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.07</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M266" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.02</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.35</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row rowsep="1">

         <oasis:entry colname="col2">MAP</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M267" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.02</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.53</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M268" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.51</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.49</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M269" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.15</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.43</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row rowsep="1">

         <oasis:entry namest="col1" nameend="col5"><bold>(b)</bold> The case of observing 2 of the 12 nodes. </oasis:entry>

       </oasis:row>
       <oasis:row rowsep="1">

         <oasis:entry colname="col1"/>

         <oasis:entry colname="col2"/>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M270" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M271" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M272" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row>

         <oasis:entry rowsep="1" colname="col1" morerows="1">Gaussian prior</oasis:entry>

         <oasis:entry colname="col2">Posterior mean</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M273" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.32</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.61</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M274" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.03</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.37</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M275" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.10</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.20</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row rowsep="1">

         <oasis:entry colname="col2">MAP</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M276" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.19</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.67</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M277" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.10</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.38</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M278" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.02</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.20</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row>

         <oasis:entry colname="col1" morerows="1">Uniform prior</oasis:entry>

         <oasis:entry colname="col2">Posterior mean</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M279" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.77</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.12</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M280" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.39</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.00</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M281" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.07</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.36</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
       <oasis:row>

         <oasis:entry colname="col2">MAP</oasis:entry>

         <oasis:entry colname="col3"><inline-formula><mml:math id="M282" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.06</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.55</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col4"><inline-formula><mml:math id="M283" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.61</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.42</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

         <oasis:entry colname="col5"><inline-formula><mml:math id="M284" display="inline"><mml:mrow><mml:mn mathvariant="normal">0.27</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.42</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>

       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

      <p id="d1e7867">Table <xref ref-type="table" rid="Ch1.T5"/>a shows the means and standard deviations of the errors of the posterior mean and MAP from 100 independent simulations. In each simulation and for each prior, we drew a parameter sample from the prior, generated a trajectory of observations, and then jointly estimated the parameters and states. The table shows that both the posterior mean and the MAP estimates are generally biased, consistent with the biases in Figs. <xref ref-type="fig" rid="Ch1.F7"/> and <xref ref-type="fig" rid="Ch1.F8"/>. More specifically, in the case of the Gaussian prior, the MAP has slightly smaller biases than the posterior mean, but the two have almost the same variances. Both are negatively biased for <inline-formula><mml:math id="M285" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and slightly positively biased for <inline-formula><mml:math id="M286" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M287" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. In the case of the uniform prior, the MAP features biases and standard deviations that are about 50 % larger than those of the posterior mean. 
Both estimators exhibit large positive biases in <inline-formula><mml:math id="M288" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, large negative biases in <inline-formula><mml:math id="M289" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, and small positive biases in <inline-formula><mml:math id="M290" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
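      <p>As a minimal sketch of how such summary entries could be computed, the following Python snippet assumes that each table entry is the sample mean and sample standard deviation of the error (estimate minus true value) across simulations. The function name <monospace>error_summary</monospace>, the use of the unbiased standard deviation, and the numbers in the example are illustrative assumptions, not code or data from the study.</p>
      <preformat><![CDATA[
```python
import numpy as np


def error_summary(estimates, true_values):
    """Mean and standard deviation of estimation errors across simulations.

    estimates:   array (n_simulations, n_parameters), e.g. posterior means or MAPs.
    true_values: array (n_simulations, n_parameters), the parameters drawn from
                 the prior in each simulation.
    """
    errors = np.asarray(estimates, dtype=float) - np.asarray(true_values, dtype=float)
    bias = errors.mean(axis=0)             # mean error per parameter
    spread = errors.std(axis=0, ddof=1)    # sample std (ddof=1 is an assumption)
    return bias, spread


# Illustrative numbers only (three simulations, two parameters).
estimates = np.array([[0.9, 0.2], [1.3, -0.1], [0.8, 0.5]])
true_values = np.array([[1.0, 0.0], [1.2, 0.1], [0.7, -0.2]])
bias, spread = error_summary(estimates, true_values)
for b, s in zip(bias, spread):
    print(f"{b:+.2f} +/- {s:.2f}")
```
]]></preformat>
      <p>Reporting the error as "mean ± standard deviation" in this way separates the systematic part of the error (the bias) from its simulation-to-simulation variability.</p>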

      <?xmltex \floatpos{t}?><fig id="Ch1.F9" specific-use="star"><?xmltex \currentcnt{9}?><label>Figure 9</label><caption><p id="d1e7945">The ensemble of sample trajectories of the state at an observed node. Top row: the sample trajectories (in cyan) concentrate around the true trajectory (in black dash-asterisk). The true trajectory is well-estimated by the ensemble mean (in blue dash-diamond) and is mostly enclosed by the 1-standard-deviation band (in magenta dash-dot lines). The relative error of the ensemble mean along the trajectory is 0.7 % and 0.8 %, filtering out 30 % and 20 % of the observation noise, respectively.
Bottom row: histograms of samples at three instants of time: <inline-formula><mml:math id="M291" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M292" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M293" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>. The histograms show that the samples concentrate around the true states. </p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f09.png"/>

        </fig>

</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>State estimates</title>
      <p id="d1e7999">The state estimation aims both to filter out the noise from the observed nodes and to estimate the states of unobserved nodes. We assess the state estimation by examining the ensemble of the posterior trajectories in a typical simulation, for which we consider the marginal distributions and the coverage probability of 90 % credible intervals. We also examine the statistics of these quantities in 100 independent simulations.</p>
      <p id="d1e8002">We present the ensemble of  posterior trajectories at an observed node in Fig. <xref ref-type="fig" rid="Ch1.F9"/> and at an unobserved node in Fig. <xref ref-type="fig" rid="Ch1.F10"/>. In each of these figures, we present the ensemble mean with a 1-standard-deviation band, in comparison with the true trajectories, superimposed on the ensembles of all sample trajectories at these nodes. We also present histograms of samples at three instants of time: <inline-formula><mml:math id="M294" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M295" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M296" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d1e8045">Figure <xref ref-type="fig" rid="Ch1.F9"/> shows that the trajectory of the observed node is well estimated by the ensemble mean, with a relative error of 0.7 %. Recall that the observation noise leads to a relative error of about 1 %, so the posterior filters out 30 % of the noise.  Also note that the ensemble quantifies the uncertainty of the estimation, with the true trajectory being mostly enclosed within a 1-standard-deviation band around the ensemble mean. Further, the histograms of samples at the three time instants show that the ensemble generally concentrates near the truth. In the Gaussian prior case, the peak of the histogram decreases as time increases, partially due to the degeneracy of SMC when we trace back the particles in time. In the uniform prior case, the ensembles are less concentrated than those in the Gaussian case, due to the wide spread of the parameter samples (Fig. <xref ref-type="fig" rid="Ch1.F7"/>).</p>
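      <p>The arithmetic behind the "30 %" figure can be sketched as follows. The function name <monospace>fraction_noise_removed</monospace> is hypothetical, and the definition 1 - (error of the estimate)/(error of the raw observations) is our reading of the sentence above rather than a formula stated in the text.</p>
      <preformat><![CDATA[
```python
def fraction_noise_removed(estimate_rel_error, observation_rel_error):
    """Share of the observation noise removed by the estimate (assumed definition)."""
    if observation_rel_error <= 0:
        raise ValueError("observation_rel_error must be positive")
    return 1.0 - estimate_rel_error / observation_rel_error


# Values quoted in the text: 0.7 % ensemble-mean error vs. about 1 % observation noise.
removed = fraction_noise_removed(0.7, 1.0)
print(f"{removed:.0%}")
```
]]></preformat>
      <p>With the same definition, a relative error of 0.8 % against 1 % observation noise corresponds to removing about 20 % of the noise, which matches the second value quoted in the caption of Fig. <xref ref-type="fig" rid="Ch1.F9"/>.</p>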

      <?xmltex \floatpos{t}?><fig id="Ch1.F10" specific-use="star"><?xmltex \currentcnt{10}?><label>Figure 10</label><caption><p id="d1e8055">The ensemble of sample trajectories of the state at an unobserved node. The ensembles exhibit a large uncertainty in both cases of priors, but the posterior means achieve relative errors of 0.8 % and 3.3 % in the cases of Gaussian and uniform priors, respectively. The 1-standard-deviation band covers the true trajectory at most times. Bottom row: the histogram of samples at three time instants, showing that the samples concentrate around the true states. Particularly, in the case of the Gaussian prior, the peaks of the histogram are close to the true states, even when the histograms form a multi-mode distribution. </p></caption>
          <?xmltex \igopts{width=398.338583pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f10.png"/>

        </fig>

      <p id="d1e8064">Figure <xref ref-type="fig" rid="Ch1.F10"/> shows sample trajectories of an unobserved node. Despite the fact that the node is unobserved, the<?pagebreak page239?> posterior means have relative errors of 0.8 % and 3.3 % in the cases of Gaussian and uniform priors, respectively, with a 1-standard-deviation band covering the true trajectory at most times. While the sparse observations do cause large uncertainties for both posteriors,  the histograms of samples show that the ensembles concentrate near the truth. Particularly, in the case of Gaussian priors, the peaks of the histogram are close to the true states, even when the histograms form a multi-modal distribution due to the  degeneracy of SMC.</p>
      <?pagebreak page240?><p id="d1e8069">We find that the posterior is able to filter out the noise in the observed nodes and reduce the uncertainty in the unobserved nodes from the climatological distribution.  In particular, in the case of the Gaussian prior, the ensemble of posterior samples concentrates near the true state at both observed and unobserved nodes and substantially reduces the uncertainty. In the case of the uniform prior, the ensemble of posterior samples spreads more widely and only slightly reduces the uncertainty.</p>
      <p id="d1e8072">The coverage probability (CP), the proportion of the states whose 90 % credible intervals contain the true values, is 95 % in the Gaussian prior case and 92 % for the uniform prior in the above simulation. The target probability is 90 %, as in this case 90 % of the true values would be covered by 90 % credible intervals. The values indicate statistically meaningful uncertainty estimates, for example larger uncertainty ranges at nodes with higher mean errors. The slight over-dispersiveness, i.e., higher CPs than the target probabilities, might be a result of the large uncertainty in the parameter estimates.</p>
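      <p>A coverage probability of this kind could be computed from an ensemble of posterior samples roughly as in the following sketch. It assumes equal-tailed credible intervals obtained from sample quantiles; the text does not specify how the intervals are constructed, so this choice, the function name <monospace>coverage_probability</monospace>, and the synthetic data are assumptions for illustration only.</p>
      <preformat><![CDATA[
```python
import numpy as np


def coverage_probability(samples, truth, level=0.90):
    """Fraction of states whose equal-tailed credible interval contains the truth.

    samples: array (n_samples, n_states) of posterior draws for each state.
    truth:   array (n_states,) of true state values.
    """
    samples = np.asarray(samples, dtype=float)
    truth = np.asarray(truth, dtype=float)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(samples, tail, axis=0)
    upper = np.quantile(samples, 1.0 - tail, axis=0)
    covered = (truth >= lower) & (truth <= upper)
    return covered.mean()


# Synthetic check (not data from the study): when the truth and the samples are
# drawn from the same distribution, the CP should be close to the nominal level.
rng = np.random.default_rng(0)
n_states, n_samples = 2000, 500
truth = rng.normal(size=n_states)
samples = rng.normal(size=(n_samples, n_states))
cp = coverage_probability(samples, truth)
print(f"CP = {cp:.3f}")
```
]]></preformat>
      <p>In this sketch, a CP well above the nominal level indicates intervals that are too wide (over-dispersive), and a CP well below it indicates intervals that are too narrow or centered away from the truth.</p>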
      <p id="d1e8075">Table <xref ref-type="table" rid="Ch1.T6"/>a shows the means and standard deviations of the relative errors and CPs in state estimation by the posterior mean in 100 independent simulations, averaging over observed and unobserved nodes. The relative errors at each time <inline-formula><mml:math id="M297" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> are computed by averaging the error of the ensemble mean (relative to the true value) over all the nodes. The relative error of the trajectory is the average over all times along the trajectory. The relative errors are 1.14 % and 2.39 %, respectively, for the cases of Gaussian and uniform priors. These numbers are a result of averaging over the observed and unobserved nodes. Note that the relative errors are similar at different times <inline-formula><mml:math id="M298" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mo>(</mml:mo><mml:mn mathvariant="normal">20</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">60</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">100</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, indicating that the MCMC is able to ameliorate the degeneracy of the SMC to faithfully sample the posterior of the states.</p>
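      <p>The two-stage averaging described above (first over nodes at each time, then over times) can be sketched as follows. The use of the absolute error divided by the absolute true value as the per-node relative error is an assumption, as are the function name <monospace>relative_errors</monospace> and the small hand-checkable example.</p>
      <preformat><![CDATA[
```python
import numpy as np


def relative_errors(ensemble, truth):
    """Relative error of the ensemble mean at each time and for the trajectory.

    ensemble: array (n_samples, n_times, n_nodes) of posterior state samples.
    truth:    array (n_times, n_nodes) of true states (assumed nonzero).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    truth = np.asarray(truth, dtype=float)
    ensemble_mean = ensemble.mean(axis=0)
    per_node = np.abs(ensemble_mean - truth) / np.abs(truth)
    per_time = per_node.mean(axis=1)   # average over nodes at each time
    trajectory = per_time.mean()       # average over all times
    return per_time, trajectory


# Hand-checkable example: two samples, two times, two nodes.
truth = np.array([[1.0, 2.0], [4.0, 5.0]])
ensemble = np.array([truth * 1.02, truth * 1.00])  # ensemble mean is 1 % too high
per_time, trajectory = relative_errors(ensemble, truth)
print(per_time, trajectory)
```
]]></preformat>
      <p>In this toy example the ensemble mean overshoots every state by 1 %, so both the per-time and the trajectory relative errors equal 0.01 up to rounding.</p>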
      <p id="d1e8111">In the Gaussian prior case, the CPs are above the target probability in the 100 independent simulations, with a mean of 96 %. This supports the finding from above that the posteriors are slightly over-dispersive due to the large uncertainty in the parameter estimates. The standard deviation is very small, at 2 %, which indicates the robustness of the Gaussian prior model. In the uniform prior case, the CPs are much lower, with a mean of 73 %. This might be a result of larger biases compared to the Gaussian prior case which are not compensated by larger uncertainty estimates. In addition, the standard deviation is much higher in the uniform prior case,<?pagebreak page241?> at 31 %. This shows that this case is less robust than the Gaussian prior case.</p>

<?xmltex \floatpos{t}?><table-wrap id="Ch1.T6" specific-use="star"><?xmltex \currentcnt{6}?><label>Table 6</label><caption><p id="d1e8118">Means and standard deviations of the relative errors of the posterior mean trajectories of all nodes and the relative errors at three instants of time,  computed from 100 independent simulations. In the last column, the mean and standard deviations of CPs are given in percent.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="6">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:colspec colnum="5" colname="col5" align="right"/>
     <oasis:colspec colnum="6" colname="col6" align="right"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry namest="col1" nameend="col6"><bold>(a)</bold> The case of observing 6 out of the 12 nodes. </oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Trajectory</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M299" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M300" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M301" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6">CP</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Gaussian prior (%)</oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M302" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.14</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.41</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M303" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.11</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.47</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M304" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.09</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.47</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M305" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.07</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.46</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M306" display="inline"><mml:mrow><mml:mn mathvariant="normal">96</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Uniform prior  (%)</oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M307" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.39</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.59</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M308" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.44</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.64</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M309" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.42</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.66</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M310" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.41</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.63</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M311" display="inline"><mml:mrow><mml:mn mathvariant="normal">73</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">31</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry namest="col1" nameend="col6"><bold>(b)</bold> The case of observing 2 out of the 12 nodes. </oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Trajectory</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M312" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M313" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M314" display="inline"><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6">CP</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Gaussian prior (%)</oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M315" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.43</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.44</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M316" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.38</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.53</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M317" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.43</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.51</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M318" display="inline"><mml:mrow><mml:mn mathvariant="normal">1.33</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">0.54</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M319" display="inline"><mml:mrow><mml:mn mathvariant="normal">92</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Uniform prior  (%)</oasis:entry>
         <oasis:entry colname="col2"><inline-formula><mml:math id="M320" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.46</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.28</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M321" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.47</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.35</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M322" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.49</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.33</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col5"><inline-formula><mml:math id="M323" display="inline"><mml:mrow><mml:mn mathvariant="normal">2.47</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.34</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col6"><inline-formula><mml:math id="M324" display="inline"><mml:mrow><mml:mn mathvariant="normal">75</mml:mn><mml:mo>±</mml:mo><mml:mn mathvariant="normal">25</mml:mn></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
</sec>
<sec id="Ch1.S5">
  <label>5</label><title>Discussion</title>
<sec id="Ch1.S5.SS1">
  <label>5.1</label><title>Observing fewer nodes</title>
      <p id="d1e8568">We tested the consequences of having sparser observations in space, e.g., observing only 2 out of the 12 nodes. In the Gaussian prior case, in a typical simulation with the same true parameters and observation data as in Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/>, the relative error in state estimation increases slightly, from 0.7 % to 0.8 % for the observed node and from 0.8 % to 1.1 % for the unobserved node.  As a result, the overall error increases. The parameter estimates show small but noticeable changes (see Fig. <xref ref-type="fig" rid="Ch1.F11"/>): the posteriors of the parameters have slightly wider support and the posterior means and MAPs exhibit slightly larger errors than those in Sect. <xref ref-type="sec" rid="Ch1.S4.SS2"/>.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F11" specific-use="star"><?xmltex \currentcnt{11}?><label>Figure 11</label><caption><p id="d1e8579">The case of observing 2 out of the 12 nodes: marginal posteriors of <inline-formula><mml:math id="M325" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula>. With the same true parameters and the same observation dataset as in Fig. <xref ref-type="fig" rid="Ch1.F7"/>, the marginal posteriors have slightly wider supports. </p></caption>
          <?xmltex \igopts{width=455.244094pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f11.png"/>

        </fig>

      <p id="d1e8597">We also ran 100 independent simulations to investigate sampling variability in the state and parameter estimates. Table <xref ref-type="table" rid="Ch1.T6"/>b reports the means and standard deviations of the relative errors of the posterior mean trajectory and of the CPs for state estimation in these simulations. The Gaussian prior case shows small increases in both the means and the standard deviations of errors, as well as slightly lower and less robust CPs. This confirms the results quoted above for a typical simulation. The uniform prior case shows almost negligible increases in error and CP. Table <xref ref-type="table" rid="Ch1.T5"/>b reports the means and standard deviations of the errors of the posterior mean and MAP for parameter estimation in these simulations. Only small changes relative to the results in Table <xref ref-type="table" rid="Ch1.T5"/>a are found. These small changes are due to the strong regularization that has been introduced to overcome the degeneracy of the likelihood.</p>
</sec>
<sec id="Ch1.S5.SS2">
  <label>5.2</label><title>Observing a longer trajectory</title>
      <p id="d1e8614">When the length <inline-formula><mml:math id="M326" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> of the trajectory of observation increases, the exponent of the regularized posterior Eq. (<xref ref-type="disp-formula" rid="Ch1.E19"/>), viewed as a function of <inline-formula><mml:math id="M327" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> only, tends to its expectation with respect to the ergodic measure of the system, i.e., <inline-formula><mml:math id="M328" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mover accent="true"><mml:mi mathvariant="italic">⟶</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mo>→</mml:mo><mml:mi mathvariant="normal">∞</mml:mi></mml:mrow></mml:mover><mml:mi mathvariant="double-struck">E</mml:mi><mml:mo>[</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> almost surely. 
As a result, the marginal posterior tends to be stable as <inline-formula><mml:math id="M329" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> increases. This result indicates that an increase in data size has a limited effect on the regularized posterior of parameters. This fact is verified by numerical tests with <inline-formula><mml:math id="M330" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1000</mml:mn></mml:mrow></mml:math></inline-formula>, in which the marginal posteriors only have a slightly wider support than those in Fig. <xref ref-type="fig" rid="Ch1.F7"/> with <inline-formula><mml:math id="M331" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d1e8760">In general, the number of observations needed for the posterior to reach a steady state depends on the dimension of the parameters and the speed of convergence to the ergodic measure of the system. Here we have only three parameters and the SEBM converges to its stationary measure exponentially (in fewer than 10 time steps); therefore, <inline-formula><mml:math id="M332" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula> is large enough for the posterior to be close to the steady state.</p>
      <p id="d1e8775">When the trajectory is long, a major issue is the computational cost from sampling the posterior of the states. Note that as <inline-formula><mml:math id="M333" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> increases, the dimension of the states in the posterior increases, demanding a longer Markov chain to explore the target distribution. In numerical tests with <inline-formula><mml:math id="M334" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1000</mml:mn></mml:mrow></mml:math></inline-formula>, the correlation length of the Markov chain is at least 100, about 4 times the correlation length found for <inline-formula><mml:math id="M335" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>. Therefore, to obtain the same number of effective samples as before, we would need a Markov chain with length at least 4 times the previous length, say,  <inline-formula><mml:math id="M336" display="inline"><mml:mrow><mml:mi>L</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">4</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mn mathvariant="normal">4</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>.  The computational cost increases linearly in <inline-formula><mml:math id="M337" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mi>L</mml:mi></mml:mrow></mml:math></inline-formula>, with each step requiring an integration of the SPDE. The high computational cost, an instance of the well-known “curse of dimensionality”, renders the direct sampling of the posterior unfeasible. Two groups of methods could reduce the computational cost and make the Bayesian inference feasible. 
The first group of methods, dynamical model reduction, exploits the low-dimensional structure of the stochastic process to develop low-dimensional dynamical models which efficiently reproduce the statistical–dynamical properties needed in the SMC <xref ref-type="bibr" rid="bib1.bibx11 bib1.bibx34 bib1.bibx9 bib1.bibx26" id="paren.38"><named-content content-type="pre">see, e.g.,</named-content><named-content content-type="post">and the references therein</named-content></xref>. The other group of methods approximates the marginal posterior of the parameter by reduced-order models for the response of the data to parameters <xref ref-type="bibr" rid="bib1.bibx35 bib1.bibx6 bib1.bibx14 bib1.bibx12 bib1.bibx33 bib1.bibx24" id="paren.39"><named-content content-type="pre">see, e.g.,</named-content></xref>. In a paleoclimate reconstruction context, the number of observations will generally be determined by available observations and the length of the reconstruction period rather than by computational considerations.  We leave these further developments of efficient sampling methods for long trajectories as a direction of future research.</p>
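      <p>As a rough illustration of the cost argument above, the following Python sketch computes the chain length needed to keep the number of effective samples fixed when the correlation length grows, together with the relative cost under the stated linear scaling in the product of N and L. The baseline chain length of 10 000 and correlation length of 25 for N = 100 are inferred from the figures quoted above (a fourfold increase in both quantities) and should be read as assumptions; the function names are hypothetical and are not taken from the authors' MATLAB code.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def effective_samples(chain_length, correlation_length):
    """Approximate number of effectively independent samples in a Markov chain."""
    return chain_length / correlation_length


def required_chain_length(target_effective_samples, correlation_length):
    """Chain length needed to reach a target number of effective samples."""
    return target_effective_samples * correlation_length


def relative_cost(n_obs, chain_length, n_obs_ref, chain_length_ref):
    """Cost ratio, assuming the cost grows linearly in N * L."""
    return (n_obs * chain_length) / (n_obs_ref * chain_length_ref)


# Assumed baseline for N = 100 (inferred from the text, not reported explicitly).
n_ref, length_ref, tau_ref = 100, 10_000, 25
# Longer trajectory: N = 1000 with a correlation length of at least 100.
n_long, tau_long = 1000, 100

target = effective_samples(length_ref, tau_ref)
length_long = required_chain_length(target, tau_long)
cost_ratio = relative_cost(n_long, length_long, n_ref, length_ref)

print(f"effective samples at N=100: {target:.0f}")
print(f"chain length needed at N=1000: {length_long:.0f}")
print(f"cost relative to N=100: {cost_ratio:.0f}x")
```
]]></preformat>
      <p>Under these assumptions, a tenfold longer trajectory combined with a fourfold longer chain multiplies the cost by the product of the two factors, which is why direct sampling quickly becomes impractical.</p>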
</sec>
<sec id="Ch1.S5.SS3">
  <label>5.3</label><title>Estimates of the nonlinear function</title>
      <p id="d1e8859">One goal of parameter estimation is to identify the nonlinear function <inline-formula><mml:math id="M338" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (specified in Eq. <xref ref-type="disp-formula" rid="Ch1.E2"/>) in the SEBM. The posterior of the parameters also quantifies the uncertainty in the identification of <inline-formula><mml:math id="M339" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>.
Figure <xref ref-type="fig" rid="Ch1.F12"/> shows  the nonlinear function <inline-formula><mml:math id="M340" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> associated with the true parameters and with the MAPs and posterior means presented in Fig. <xref ref-type="fig" rid="Ch1.F7"/>, superposed on an ensemble of the nonlinear function evaluated with all the samples. Note that in the Gaussian prior case, the true and estimated functions <inline-formula><mml:math id="M341" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are close even though <inline-formula><mml:math id="M342" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mn mathvariant="normal">4</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> is estimated with large biases by either the posterior mean or by the MAP. In the uniform prior case,  the posterior mean has a smaller error than the MAP and leads to a better estimate of the nonlinear function. In either case, the large band of the ensemble represents a large uncertainty in the estimates.</p>

      <?xmltex \floatpos{t}?><fig id="Ch1.F12" specific-use="star"><?xmltex \currentcnt{12}?><label>Figure 12</label><caption><p id="d1e8926">Top row: the true nonlinear function <inline-formula><mml:math id="M343" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and its estimators using the posterior mean and MAP, superposed on the ensemble of all estimators using the samples. Bottom row: the distribution of the equilibrium state <inline-formula><mml:math id="M344" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (i.e., the zero of the nonlinear function <inline-formula><mml:math id="M345" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>) and the distribution of <inline-formula><mml:math id="M346" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi>u</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, with <inline-formula><mml:math id="M347" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> being samples of the prior and of the posterior.</p></caption>
          <?xmltex \igopts{width=455.244094pt}?><graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-f12.png"/>

        </fig>

      <?pagebreak page242?><p id="d1e9011">For the Gaussian prior, neither the posterior distribution of the equilibrium state <inline-formula><mml:math id="M348" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">e</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (for which <inline-formula><mml:math id="M349" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">e</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>) nor that of the feedback strength <inline-formula><mml:math id="M350" display="inline"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">e</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> differs substantially from the corresponding prior; both show only a small reduction of uncertainty. In contrast, for the uniform prior the posterior distributions are narrower than the priors, although the posterior means and MAPs are both biased.</p>
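      <p>The two diagnostics discussed above can be computed for any parameter sample with a one-dimensional root search. The Python sketch below assumes, purely for illustration, a polynomial of the form theta0 + theta1 u + theta4 u^4 and made-up coefficient values; the actual form of the nonlinear function is the one specified in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>), and none of the names below come from the authors' code.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def g(u, theta):
    """Illustrative nonlinear function theta0 + theta1*u + theta4*u**4 (assumed form)."""
    theta0, theta1, theta4 = theta
    return theta0 + theta1 * u + theta4 * u**4


def dg_du(u, theta):
    """Derivative of the assumed g with respect to u."""
    _, theta1, theta4 = theta
    return theta1 + 4.0 * theta4 * u**3


def equilibrium_state(theta, lo, hi, tol=1e-12, max_iter=200):
    """Find u_e with g(u_e) = 0 by bisection on a bracket [lo, hi] where g changes sign."""
    g_lo, g_hi = g(lo, theta), g(hi, theta)
    if g_lo * g_hi > 0:
        raise ValueError("g does not change sign on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid, theta)
        if g_lo * g_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # root lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical parameter samples (theta0, theta1, theta4); not values from the paper.
samples = [(1.0, 0.5, -1.5), (1.2, 0.4, -1.6), (0.9, 0.6, -1.5)]

for theta in samples:
    u_e = equilibrium_state(theta, lo=0.0, hi=2.0)
    print(f"u_e = {u_e:.4f}, dg/du(u_e) = {dg_du(u_e, theta):.4f}")
```
]]></preformat>
      <p>Applying the same two functions to prior and posterior samples would yield the kind of histograms summarized in the bottom row of Fig. <xref ref-type="fig" rid="Ch1.F12"/>.</p>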
</sec>
<sec id="Ch1.S5.SS4">
  <label>5.4</label><title>Implications for paleoclimate reconstructions</title>
      <p id="d1e9085">Our analysis shows that assessing the well-posedness of  the inverse problem of parameter estimation is a necessary first step for paleoclimate reconstructions making use of physically motivated parametric models. When the problem is ill-posed, a straightforward Bayesian inference will lead to biased and unphysical parameter estimates.
We overcome this issue by using regularized posteriors, resulting in parameter estimates in the physically reasonable range with quantified uncertainty. However, it should be kept in mind that this approach relies strongly on high-quality prior distributions.</p>
      <p id="d1e9088">The ill-posedness of the parameter estimation problem for the model we have considered is of particular interest because the form of the nonlinear function <inline-formula><mml:math id="M351" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is not arbitrary but is motivated by the physics of the energy budget of the atmosphere. The fact that wide ranges of the parameters <inline-formula><mml:math id="M352" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are consistent with the “observations” even in this highly idealized setting indicates that surface temperature observations themselves may not be sufficient to constrain physically important parameters such as albedo, graybody thermal emissivity, or air–sea exchange coefficients separately.
While state-space modeling approaches allow reconstruction of past surface climate states, the associated climate forcing may not contain sufficient information to extract the relative contributions of the individual physical processes that produced it. Further research will be necessary to understand whether the contribution of a single process, e.g., graybody thermal emissivity, can be reliably<?pagebreak page243?> estimated from the observations if regularized posteriors are used to constrain the other parameters of <inline-formula><mml:math id="M353" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d1e9136">If the purpose of using the SEBM is to introduce physical structure into the state reconstructions without specific concern regarding the parametric form of <inline-formula><mml:math id="M354" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula>, re-parametrization or nonparametric Bayesian inference can be used to estimate the form of the nonlinear function <inline-formula><mml:math id="M355" display="inline"><mml:mi>g</mml:mi></mml:math></inline-formula> but avoid the ill-posedness of the parameter estimation problem. This is an option if the interest is in the posterior of the climate state and not in the individual contributions of energy sink and source processes.</p>
      <p id="d1e9153">State-of-the-art observation operators in paleoclimatology are often nonlinear and contain non-Gaussian elements <xref ref-type="bibr" rid="bib1.bibx23 bib1.bibx50" id="paren.40"/>. A locally linearized observation model with data coming from the interpolation of proxy data can be used in the modeling framework we have considered, along with the assumption of Gaussian observation noise. Alternatively, it is also possible to first compute offline point-wise reconstructions by inverting the full observation operator, potentially  interpolating the results in time, and  using a Gaussian approximation of the point-wise posterior distributions as observations in the SEBM <xref ref-type="bibr" rid="bib1.bibx40" id="paren.41"><named-content content-type="pre">e.g.,</named-content></xref>. We anticipate that such simplified observation operators will limit the accuracy of the parameter estimation but that the regularized posterior would still be able to distinguish the most likely states and quantify the uncertainty in the estimation. Directly using nonlinear, non-Gaussian observation operators requires a more sophisticated particle filter as optimal filtering is no longer possible. Such approaches will increase the computational cost and face difficulties in avoiding filter degeneracy.</p>
</sec>
</sec>
<sec id="Ch1.S6" sec-type="conclusions">
  <label>6</label><title>Conclusions and future work</title>
      <p id="d1e9173">We have investigated the joint state-parameter estimation of a nonlinear stochastic energy balance model (SEBM) motivated by the problem of spatial–temporal paleoclimate reconstruction from sparse and noisy data, for which parameter estimation is an ill-posed inverse problem. We introduced strongly regularized posteriors to overcome the ill-posedness by restricting the parameters and states to  physical ranges and by normalizing the likelihood function. We considered both a uniform prior and a more informative Gaussian prior based on the physical ranges of the parameters. We sampled the regularized high-dimensional posteriors by a particle Gibbs with ancestor sampling (PGAS) sampler that combines Markov chain Monte Carlo (MCMC) with an optimal particle filter to exploit the forward structure of the SEBM.</p>
      <p id="d1e9176">Results show that the regularization overcomes the ill-posedness in parameter estimation and leads to physical<?pagebreak page244?> posteriors quantifying the uncertainty in parameter-state estimation. Due to the ill-posedness, the posterior of the parameters features a relatively large uncertainty. This result implies that there can be a large uncertainty in point estimators such as the posterior mean or the maximum a posteriori (MAP) estimate, the latter of which corresponds to the minimizer in a variational approach with regularization. Despite the large uncertainty in parameter estimation, the marginal posteriors of the states generally concentrate near the truth, reducing the uncertainty in state reconstruction. In particular, the more informative Gaussian prior leads to much better estimates than the uniform prior: the uncertainty in the posterior is smaller, the MAP and posterior mean have smaller errors in both state and parameter estimates, and the coverage probabilities are higher and more robust.</p>
      <p id="d1e9179">Results also show that the regularized posterior is robust to spatial sparsity of observations, with sparser observations leading to slightly larger uncertainties due to less information.  However, due to the need for regularization to overcome ill-posedness, the uncertainty in the posterior of the parameters cannot be eliminated by increasing the number of observations in time. Therefore, we suggest alternative approaches, such as re-parametrization of the nonlinear function according to the climatological distribution or nonparametric Bayesian inference <xref ref-type="bibr" rid="bib1.bibx38 bib1.bibx19" id="paren.42"><named-content content-type="pre">see, e.g.,</named-content></xref>, to avoid ill-posedness.</p>
      <p id="d1e9187">This work shows that it is necessary to assess the well-posedness of the inverse problem of parameter estimation when reconstructing paleoclimate fields with physically motivated parametric stochastic models. In our case, the natural physical formulation of the SEBM is ill-posed. While climate states can be reconstructed, values of individual parameters are not strongly constrained by the observations. Regularized posteriors are a way to overcome the ill-posedness but retain a specific parametric form of the nonlinear function representing the climate forcings.</p>
</sec>

      
      </body>
    <back><notes notes-type="dataavailability"><title>Data availability</title>

      <p id="d1e9194">This study uses only synthetic simulations, so there are no data to be accessed. The MATLAB code for the numerical simulations is available on GitHub:
<uri>https://github.com/feilumath/InferSEBM.git</uri> (last access: 12 August 2019).</p>
  </notes><?xmltex \hack{\clearpage}?><app-group>

<?pagebreak page245?><app id="App1.Ch1.S1">
  <?xmltex \currentcnt{A}?><label>Appendix A</label><title>Technical details of the estimation procedure</title>
<sec id="App1.Ch1.S1.SS1">
  <label>A1</label><title>Discretization of the SEBM</title>
<sec id="App1.Ch1.S1.SS1.SSS1">
  <label>A1.1</label><title>Finite-element representation in space</title>
      <p id="d1e9225">We discretize the SEBM in space by finite-element methods <xref ref-type="bibr" rid="bib1.bibx1" id="paren.43"><named-content content-type="pre">see, e.g.,</named-content></xref>.
Denote by <inline-formula><mml:math id="M356" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> the finite-element basis functions, and approximate the solution <inline-formula><mml:math id="M357" display="inline"><mml:mrow><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> by
              <disp-formula id="App1.Ch1.S1.E22" content-type="numbered"><label>A1</label><mml:math id="M358" display="block"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:munderover><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
            The coefficients <inline-formula><mml:math id="M359" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are determined by the following weak Galerkin projection of the SEBM Eq. (<xref ref-type="disp-formula" rid="Ch1.E1"/>):
              <disp-formula id="App1.Ch1.S1.E23" content-type="numbered"><label>A2</label><mml:math id="M360" display="block"><mml:mtable rowspacing="0.2ex" class="split" columnspacing="1em" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>〈</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>〉</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mo>〈</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>〉</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>〉</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:mi>s</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mo>〈</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi 
mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>〉</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:mi>s</mml:mi><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mo>〈</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>〉</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
            where <inline-formula><mml:math id="M361" display="inline"><mml:mi mathvariant="italic">ϕ</mml:mi></mml:math></inline-formula> is a continuously differentiable compactly supported test function and the integral <inline-formula><mml:math id="M362" display="inline"><mml:mrow><mml:msubsup><mml:mo>∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:msubsup><mml:mo>〈</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula> is an Itô integral.</p>
      <p id="d1e9601">For convenience, we write this Galerkin approximate system in vector notation.  Denote

                  <disp-formula specific-use="gather" content-type="numbered"><mml:math id="M363" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E24"><mml:mtd><mml:mtext>A3</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>u</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mi>T</mml:mi></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E25"><mml:mtd><mml:mtext>A4</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi 
mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mi>T</mml:mi></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E26"><mml:mtd><mml:mtext>A5</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mi>U</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula>

              Taking  <inline-formula><mml:math id="M364" display="inline"><mml:mrow><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M365" display="inline"><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> in  Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E23"/>) and using the symmetry of the inner product, we obtain a stochastic integral equation for the coefficient <inline-formula><mml:math id="M366" display="inline"><mml:mrow><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>:
              <disp-formula id="App1.Ch1.S1.E27" content-type="numbered"><label>A6</label><mml:math id="M367" display="block"><mml:mtable rowspacing="0.2ex" columnspacing="1em" class="split" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>〉</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>〉</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:msup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>〉</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">d</mml:mi><mml:mi>s</mml:mi><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mo>〈</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msup><mml:mi>U</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo><mml:mi 
mathvariant="normal">Φ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>〉</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">d</mml:mi><mml:mi>s</mml:mi><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mo>〈</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>〉</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
            To simplify notation, we denote the mass and stiffness matrices by
              <disp-formula id="App1.Ch1.S1.E28" content-type="numbered"><label>A7</label><mml:math id="M368" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>〉</mml:mo><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="1em"/><mml:mspace width="1em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">∇</mml:mi><mml:msup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>〉</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            which are symmetric, tri-diagonal, positive definite matrices in <inline-formula><mml:math id="M369" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, and we denote the nonlinear term as
              <disp-formula id="App1.Ch1.S1.E29" content-type="numbered"><label>A8</label><mml:math id="M370" display="block"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>:=</mml:mo><mml:mo>〈</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msup><mml:mi>U</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>〉</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d1e10183">The above stochastic integral equation can then be written as
              <disp-formula id="App1.Ch1.S1.E30" content-type="numbered"><label>A9</label><mml:math id="M371" display="block"><mml:mtable rowspacing="0.2ex" class="split" columnspacing="1em" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">d</mml:mi><mml:mi>s</mml:mi><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:msub><mml:mi>G</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi>s</mml:mi><mml:mo>+</mml:mo><mml:munderover><mml:mo movablelimits="false">∫</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mi>t</mml:mi></mml:munderover><mml:mo>〈</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>s</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>〉</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
            The mesh on the sphere and the matrices <inline-formula><mml:math id="M372" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M373" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> are computed with the R package INLA <xref ref-type="bibr" rid="bib1.bibx28 bib1.bibx5" id="paren.44"/>.</p>
</sec>
<sec id="App1.Ch1.S1.SS1.SSS2">
  <label>A1.2</label><title>Representation of the nonlinear term</title>
      <p id="d1e10353">The parametric nonlinear functional <inline-formula><mml:math id="M374" display="inline"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is approximated using finite elements. We approximate each spatial integral over an element triangle in <inline-formula><mml:math id="M375" display="inline"><mml:mrow><mml:mo>〈</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>〉</mml:mo></mml:mrow></mml:math></inline-formula> by the volume of the triangular pyramid whose height is the value of the nonlinear function at the center of the element triangle <inline-formula><mml:math id="M376" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">T</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, i.e.,
              <disp-formula id="App1.Ch1.S1.E31" content-type="numbered"><label>A10</label><mml:math id="M377" display="block"><mml:mtable class="split" columnspacing="1em" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mo movablelimits="false">∫</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>u</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="italic">ξ</mml:mi></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>≈</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:msub><mml:mi mathvariant="script">T</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>⊂</mml:mo><mml:mi mathvariant="normal">supp</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">Area</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="script">T</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mn mathvariant="normal">3</mml:mn></mml:mfrac></mml:mstyle><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mfenced close=")" open="("><mml:mrow><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mi>i</mml:mi></mml:munder><mml:msub><mml:mi>U</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi 
mathvariant="italic">ξ</mml:mi><mml:mi>k</mml:mi><mml:mi>c</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
            where <inline-formula><mml:math id="M378" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">ξ</mml:mi><mml:mi>k</mml:mi><mml:mi>c</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> is the center of the triangle <inline-formula><mml:math id="M379" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">T</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. In the discretized system, we assume that this approximation has a negligible error and take it as our nonlinear functional. In vector notation, it reads
              <disp-formula id="App1.Ch1.S1.E32" content-type="numbered"><label>A11</label><mml:math id="M380" display="block"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi></mml:msub><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold">A</mml:mi><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            with <inline-formula><mml:math id="M381" display="inline"><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfenced close=")" open="("><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:mi mathvariant="normal">Area</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="script">T</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mn mathvariant="normal">3</mml:mn></mml:mfrac></mml:mstyle></mml:mfenced><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">e</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M382" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">e</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denoting the number of triangle elements and the matrix <inline-formula><mml:math id="M383" display="inline"><mml:mrow><mml:mi mathvariant="bold">A</mml:mi><mml:mo>=</mml:mo><mml:mfenced close=")" open="("><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="italic">ξ</mml:mi><mml:mi>k</mml:mi><mml:mi>c</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mfenced><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">e</mml:mi></mml:msub><mml:mo>×</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, such that the function <inline-formula><mml:math id="M384" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi 
mathvariant="bold">A</mml:mi><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is interpreted as an element-wise evaluation. For the nonlinear function <inline-formula><mml:math id="M385" display="inline"><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> in Eq. (<xref ref-type="disp-formula" rid="Ch1.E2"/>), we can write the above nonlinear term as
              <disp-formula id="App1.Ch1.S1.E33" content-type="numbered"><label>A12</label><mml:math id="M386" display="block"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:munder><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:msub><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold">A</mml:mi><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>∘</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M387" display="inline"><mml:mrow><mml:mo>∘</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:math></inline-formula> denotes the entry-wise <italic>k</italic>th power of the array.</p>
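      <p>As an illustration only, the evaluation of the nonlinear term in Eq. (A12) can be sketched in a few lines of NumPy. The function name <monospace>nonlinear_term</monospace>, the two-triangle toy mesh, and the coefficient values in <monospace>theta</monospace> are hypothetical and are not taken from the model configuration used in this paper; the sketch only assumes piecewise-linear basis functions, for which each vertex basis function equals 1/3 at the triangle center.</p>
      <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np

def nonlinear_term(U, A, A_T, theta):
    """Evaluate G_theta(U) = sum_k theta_k * A_T @ (A @ U)**k, as in Eq. (A12).

    U     : (d_b,) nodal coefficients
    A     : (d_e, d_b) basis functions evaluated at the triangle centers
    A_T   : (d_b, d_e) entries Area(T_k)/3 for triangles in the support of phi_l
    theta : dict mapping the power k to the coefficient theta_k
    """
    u_center = A @ U  # field value at each triangle center
    return sum(th * (A_T @ u_center**k) for k, th in theta.items())

# Hypothetical toy mesh: a unit square split into two triangles (not a sphere).
triangles = np.array([[0, 1, 2], [0, 2, 3]])
areas = np.array([0.5, 0.5])
d_b, d_e = 4, 2

A = np.zeros((d_e, d_b))
A_T = np.zeros((d_b, d_e))
for k, tri in enumerate(triangles):
    A[k, tri] = 1.0 / 3.0          # linear basis functions at the centroid
    A_T[tri, k] = areas[k] / 3.0   # pyramid-volume weight Area(T_k)/3

# Hypothetical coefficients for the powers k = 0, 1, 4.
theta = {0: 1.0, 1: -0.5, 4: 0.25}
G = nonlinear_term(np.ones(d_b), A, A_T, theta)
```
</preformat>
      <p>Because both matrices are sparse on a real mesh, a practical implementation would presumably store them in a sparse format; dense arrays are used here only to keep the sketch short.</p>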
</sec>
<sec id="App1.Ch1.S1.SS1.SSS3">
  <label>A1.3</label><title>Representation of the stochastic forcing</title>
      <?pagebreak page246?><p id="d1e10872">Following <xref ref-type="bibr" rid="bib1.bibx29" id="text.45"/>, the stochastic forcing <inline-formula><mml:math id="M388" display="inline"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is approximated by its linear finite-element truncation,
              <disp-formula id="App1.Ch1.S1.E34" content-type="numbered"><label>A13</label><mml:math id="M389" display="block"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:munderover><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            with the stochastic processes <inline-formula><mml:math id="M390" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> being spatially correlated and white in time. Note that for <inline-formula><mml:math id="M391" display="inline"><mml:mrow><mml:mi mathvariant="italic">ν</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M392" display="inline"><mml:mrow><mml:mi mathvariant="italic">ρ</mml:mi><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> in the Matérn covariance Eq. (<xref ref-type="disp-formula" rid="Ch1.E4"/>), the process <inline-formula><mml:math id="M393" display="inline"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the stationary solution of the stochastic Laplace equation
              <disp-formula id="App1.Ch1.S1.E35" content-type="numbered"><label>A14</label><mml:math id="M394" display="block"><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup><mml:mo>-</mml:mo><mml:mi mathvariant="italic">ν</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">△</mml:mi><mml:mo>)</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub><mml:mi>W</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ξ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M395" display="inline"><mml:mi>W</mml:mi></mml:math></inline-formula> is a spatio-temporal white noise <xref ref-type="bibr" rid="bib1.bibx56 bib1.bibx57" id="paren.46"/>. Computationally efficient approximations of the forcing process are obtained using the GMRF approximation of <xref ref-type="bibr" rid="bib1.bibx29" id="text.47"/>, which generates <inline-formula><mml:math id="M396" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>≡</mml:mo><mml:mfenced close=")" open="("><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> by solving Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E35"/>). That is, using the above finite-element notation, we solve, for each time <inline-formula><mml:math id="M397" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula>, the linear system
              <disp-formula id="App1.Ch1.S1.E36" content-type="numbered"><label>A15</label><mml:math id="M398" display="block"><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>)</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>=</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:msub><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi></mml:msub><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:mi>W</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>〉</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where the random vector <inline-formula><mml:math id="M399" display="inline"><mml:mrow><mml:mo>〈</mml:mo><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>,</mml:mo><mml:mi>W</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>〉</mml:mo><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>〈</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi>W</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>〉</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mo>〈</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi mathvariant="normal">b</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>W</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo><mml:mo>〉</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula> is Gaussian  with mean 0 and covariance <inline-formula><mml:math id="M400" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. Solving Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E36"/>) yields
              <disp-formula id="App1.Ch1.S1.E37" content-type="numbered"><label>A16</label><mml:math id="M401" display="block"><mml:mrow><mml:mi>F</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>∼</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="script">N</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M402" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:msub><mml:mo>:=</mml:mo><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
</sec>
<sec id="App1.Ch1.S1.SS1.SSS4">
  <label>A1.4</label><title>Semi-backward Euler time integration</title>
      <p id="d1e11474">Equation (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E30"/>) is integrated in time by a semi-backward Euler scheme:
              <disp-formula id="App1.Ch1.S1.E38" content-type="numbered"><label>A17</label><mml:math id="M403" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi>G</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msqrt><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msqrt><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msub><mml:mi>F</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            where <inline-formula><mml:math id="M404" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the approximation of <inline-formula><mml:math id="M405" display="inline"><mml:mrow><mml:mi>U</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M406" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi>n</mml:mi><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M407" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is a sequence of iid random vectors with distribution <inline-formula><mml:math id="M408" display="inline"><mml:mrow><mml:mi mathvariant="script">N</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfenced></mml:mrow></mml:math></inline-formula>, with the matrix <inline-formula><mml:math id="M409" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi 
mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>  denoting
              <disp-formula id="App1.Ch1.S1.E39" content-type="numbered"><label>A18</label><mml:math id="M410" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>:=</mml:mo><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
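      <p>A minimal sketch of one step of the scheme in Eq. (A17) is given below, assuming small dense matrices. The function name <monospace>semi_backward_euler_step</monospace>, the three-node toy matrices, and the linear damping used for the nonlinear term are hypothetical choices made for illustration; the noise vector is set to zero so that the result is deterministic. The diffusion term is treated implicitly through the solve with the matrix of Eq. (A18), while the nonlinear term and the noise are treated explicitly.</p>
      <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np

def semi_backward_euler_step(U, M0, M1, dt, G, F):
    """One step of Eq. (A17):
    (M0 + dt*M1) U_next = M0 U + dt*G(U) + sqrt(dt) * M0 F.
    """
    rhs = M0 @ U + dt * G(U) + np.sqrt(dt) * (M0 @ F)
    return np.linalg.solve(M0 + dt * M1, rhs)

# Hypothetical toy matrices: identity mass matrix and a 1-D stiffness matrix
# whose rows sum to zero (constant fields are not diffused).
M0 = np.eye(3)
M1 = np.array([[1.0, -1.0, 0.0],
               [-1.0, 2.0, -1.0],
               [0.0, -1.0, 1.0]])
dt = 0.1
U0 = np.ones(3)

# Hypothetical nonlinear term G(u) = -u and zero noise, for a deterministic check.
U1 = semi_backward_euler_step(U0, M0, M1, dt, G=lambda u: -u, F=np.zeros(3))
```
</preformat>
      <p>In this toy setting a spatially constant state stays constant and is damped by the factor 1 - dt per step. On a realistic mesh one would presumably factorize the sparse system matrix once and reuse the factorization at every step.</p>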
</sec>
<sec id="App1.Ch1.S1.SS1.SSS5">
  <label>A1.5</label><title>Efficient generation of the Gaussian field</title>
      <p id="d1e11724">It follows from Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S1.E36"/>) that <inline-formula><mml:math id="M411" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msub><mml:mi>F</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is Gaussian with mean zero and covariance <inline-formula><mml:math id="M412" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>. Note that while <inline-formula><mml:math id="M413" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is a sparse matrix, its inverse matrix <inline-formula><mml:math id="M414" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> is not. 
To exploit the sparsity of <inline-formula><mml:math id="M415" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> efficiently, following <xref ref-type="bibr" rid="bib1.bibx29" id="text.48"/>, we approximate <inline-formula><mml:math id="M416" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> by <inline-formula><mml:math id="M417" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold">M</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:mo>:=</mml:mo><mml:mi mathvariant="normal">diag</mml:mi><mml:mo>(</mml:mo><mml:mo>〈</mml:mo><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>〉</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and compute the noise <inline-formula><mml:math id="M418" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msub><mml:mi>F</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> by <inline-formula><mml:math id="M419" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">C</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="script">N</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M420" display="inline"><mml:mi mathvariant="bold">C</mml:mi></mml:math></inline-formula> is the Cholesky factor of the inverse of the covariance matrix (called 
the precision matrix) <inline-formula><mml:math id="M421" display="inline"><mml:mrow><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="bold">M</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mn mathvariant="normal">0</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:msub><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="bold">M</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mn mathvariant="normal">0</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mi mathvariant="italic">ρ</mml:mi></mml:msub><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="bold">M</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mn mathvariant="normal">0</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula>. The precision matrix is a sparse representation of the inverse of the covariance. Therefore, the matrix <inline-formula><mml:math id="M422" display="inline"><mml:mi mathvariant="bold">C</mml:mi></mml:math></inline-formula> is also sparse, and the noise sequence can be generated efficiently.</p>
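      <p>The idea of sampling through a Cholesky factor of the precision matrix can be sketched as follows. This is a dense NumPy illustration under stated assumptions, not the implementation used for the experiments: the function names, the three-node toy matrices, the value of the range parameter, and the use of the lumped mass matrix inside the matrix with subscript ρ are all hypothetical. The sketch uses the lower factor L of the precision matrix Q = L L<sup>T</sup> and solves with L<sup>T</sup>, so that the sample has covariance equal to the inverse of Q.</p>
      <preformat preformat-type="code" xml:space="preserve">
```python
import numpy as np

def precision_cholesky(m_hat, M_rho):
    """Lower Cholesky factor L of Q = D^-1 M_rho D^-1 M_rho D^-1, D = diag(m_hat)."""
    D_inv = np.diag(1.0 / m_hat)
    Q = D_inv @ M_rho @ D_inv @ M_rho @ D_inv
    return np.linalg.cholesky(Q)

def sample_noise(L, rng):
    """Draw x = L^-T z with z ~ N(0, I), so that Cov(x) = (L L^T)^-1 = Q^-1."""
    z = rng.standard_normal(L.shape[0])
    return np.linalg.solve(L.T, z)

# Hypothetical toy setup: lumped mass entries and a 1-D stiffness matrix.
m_hat = np.array([0.5, 1.0, 0.5])
M1 = np.array([[1.0, -1.0, 0.0],
               [-1.0, 2.0, -1.0],
               [0.0, -1.0, 1.0]])
rho = 0.5                                  # hypothetical range parameter
M_rho = np.diag(m_hat) / rho**2 + M1       # lumped mass used here as an assumption

L = precision_cholesky(m_hat, M_rho)
rng = np.random.default_rng(0)
x = sample_noise(L, rng)                   # one draw of the noise vector
```
</preformat>
      <p>With sparse matrices, the dense factorization and solve would be replaced by a sparse Cholesky factorization and a triangular solve, which is where the computational saving described above comes from.</p>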
      <p id="d1e11989">In summary, we can write the discretized SEBM in the form

                  <disp-formula id="App1.Ch1.S1.E40" content-type="numbered"><label>A19</label><mml:math id="M423" display="block"><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where the deterministic function <inline-formula><mml:math id="M424" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is given by

                  <disp-formula id="App1.Ch1.S1.E41" content-type="numbered"><label>A20</label><mml:math id="M425" display="block"><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi mathvariant="bold">M</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">4</mml:mn></mml:mrow></mml:munder><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:msub><mml:mi>G</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:mspace width="1em" linebreak="nobreak"/></mml:mrow></mml:math></disp-formula>

            with <inline-formula><mml:math id="M426" display="inline"><mml:mrow><mml:msub><mml:mi>G</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mi>A</mml:mi><mml:mi mathvariant="script">T</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="bold">A</mml:mi><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mrow><mml:mo>∘</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M427" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is a sequence of
iid Gaussian noise vectors with mean 0 and covariance <inline-formula><mml:math id="M428" display="inline"><mml:mi mathvariant="bold">R</mml:mi></mml:math></inline-formula>:

                  <disp-formula id="App1.Ch1.S1.E42" content-type="numbered"><label>A21</label><mml:math id="M429" display="block"><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold">R</mml:mi><mml:mo>=</mml:mo><mml:msubsup><mml:mi mathvariant="italic">σ</mml:mi><mml:mi mathvariant="normal">f</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msubsup><mml:msup><mml:mi mathvariant="bold">C</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:msup><mml:mi mathvariant="bold">C</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup><mml:msubsup><mml:mi mathvariant="bold">M</mml:mi><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
</sec>
</sec>
<sec id="App1.Ch1.S1.SS2">
  <label>A2</label><title>SMC with optimal importance sampling</title>
      <p id="d1e12311">SMC methods approximate the target density <inline-formula><mml:math id="M430" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>  sequentially by weighted random samples called particles (hereafter we drop the subindex <inline-formula><mml:math id="M431" display="inline"><mml:mi mathvariant="italic">θ</mml:mi></mml:math></inline-formula> to simplify notation):
            <disp-formula id="App1.Ch1.S1.E43" content-type="numbered"><label>A22</label><mml:math id="M432" display="block"><mml:mrow><mml:mover accent="true"><mml:mi>p</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>:=</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:msub><mml:mi mathvariant="italic">δ</mml:mi><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          with <inline-formula><mml:math id="M433" display="inline"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>. These weighted samples are drawn sequentially by importance sampling based on the recursive formula
            <disp-formula id="App1.Ch1.S1.E44" content-type="numbered"><label>A23</label><mml:math id="M434" display="block"><mml:mtable columnspacing="1em" class="split" rowspacing="0.2ex" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>p</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          More precisely, suppose that at time <inline-formula><mml:math id="M435" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>, we have weighted samples <inline-formula><mml:math id="M436" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>. One first draws a sample <inline-formula><mml:math id="M437" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> from an easy-to-sample importance density <inline-formula><mml:math id="M438" display="inline"><mml:mrow><mml:mi>q</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> that approximates the “incremental density” which is proportional to <inline-formula><mml:math id="M439" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi 
mathvariant="normal">n</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for each <inline-formula><mml:math id="M440" display="inline"><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:math></inline-formula> and computes incremental weights
            <disp-formula id="App1.Ch1.S1.E45" content-type="numbered"><label>A24</label><mml:math id="M441" display="block"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>q</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
          which  account for the discrepancy between the two densities. One then assigns normalized weights <inline-formula><mml:math id="M442" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mo>∝</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> to the concatenated sample trajectories <inline-formula><mml:math id="M443" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d1e12995">A clear drawback of the above procedure is that all but one of the weights <inline-formula><mml:math id="M444" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> will become close to zero as the number of iterations increases, due to the multiplication and normalization operations. To avoid this, one replaces the unevenly weighted samples  <inline-formula><mml:math id="M445" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> with uniformly weighted samples from the approximate density <inline-formula><mml:math id="M446" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>p</mml:mi><mml:mo mathvariant="normal" stretchy="true">^</mml:mo></mml:mover><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. This is the well-known <italic>resampling</italic> technique. In summary, the above operations are carried out as follows:
<list list-type="custom"><list-item><label>i.</label>
      <p id="d1e13104">draw random indices <inline-formula><mml:math id="M447" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> according to the discrete probability distribution <inline-formula><mml:math id="M448" display="inline"><mml:mrow><mml:mi mathvariant="double-struck">F</mml:mi><mml:mo>(</mml:mo><mml:mo>⋅</mml:mo><mml:mo>|</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> on the set  <inline-formula><mml:math id="M449" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>M</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, which is defined as<disp-formula id="App1.Ch1.S1.E46" content-type="numbered"><label>A25</label><mml:math id="M450" display="block"><mml:mrow><mml:mi mathvariant="double-struck">F</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>|</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="normal">for</mml:mi><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>M</mml:mi><mml:mo>;</mml:mo></mml:mrow></mml:math></disp-formula></p></list-item><list-item><label>ii.</label>
      <p id="d1e13272">for each <inline-formula><mml:math id="M451" display="inline"><mml:mi>m</mml:mi></mml:math></inline-formula>, draw a sample <inline-formula><mml:math id="M452" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> from <inline-formula><mml:math id="M453" display="inline"><mml:mrow><mml:mi>q</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and set <inline-formula><mml:math id="M454" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>:=</mml:mo><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>;</p></list-item><list-item><label>iii.</label>
      <p id="d1e13401">compute and normalize the weights<disp-formula id="App1.Ch1.S1.E47" content-type="numbered"><label>A26</label><mml:math id="M455" display="block"><mml:mtable rowspacing="0.2ex" columnspacing="1em" class="split" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>:=</mml:mo><mml:msub><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:msubsup><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>q</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn 
mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd/><mml:mtd><mml:mrow><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>:=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup><mml:msubsup><mml:mi mathvariant="italic">α</mml:mi><mml:mi mathvariant="normal">n</mml:mi><mml:mi>k</mml:mi></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p></list-item></list></p>
      <p id="d1e13601">The above SMC sampling procedure is called sequential importance sampling with resampling (SIR) <xref ref-type="bibr" rid="bib1.bibx15" id="paren.49"><named-content content-type="pre">see, e.g.,</named-content></xref> and is summarized in Algorithm 1.</p>
      <p id="d1e13609"><?xmltex \hack{\begin{figure*}[p]}?><?xmltex \igopts{width=497.923228pt}?><inline-graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-g01.png"/><?xmltex \hack{\end{figure*}}?></p>
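      <p>The three SIR steps (i)–(iii) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function <monospace>sir_step</monospace>, its four model callables, and the scalar toy model in <monospace>demo</monospace> (an AR(1) state with a bootstrap proposal, i.e. the importance density set equal to the transition density) are all assumptions introduced here for illustration.</p>
      <preformat>
```python
import numpy as np


def sir_step(particles, weights, y, sample_q, log_p_trans, log_p_obs, log_q, rng):
    """One SIR step: resample ancestors, propagate, reweight.

    particles has shape (M, d) and holds U_{n-1}^m; weights holds the
    normalized w_{n-1}^m. The four callables are hypothetical, user-supplied
    model functions (they are not defined in the article).
    """
    M = len(weights)
    # (i) ancestor indices A_{n-1}^m drawn from the discrete law F(. | w_{n-1}^{1:M})
    ancestors = rng.choice(M, size=M, p=weights)
    parents = particles[ancestors]
    # (ii) propose U_n^m from q(u_n | y_n, parent of particle m)
    new = sample_q(parents, y, rng)
    # (iii) incremental weights alpha_n^m, computed in log space for stability
    log_alpha = log_p_trans(new, parents) + log_p_obs(y, new) - log_q(new, y, parents)
    log_alpha -= log_alpha.max()
    alpha = np.exp(log_alpha)
    return new, alpha / alpha.sum(), ancestors


def demo(M=200, seed=0):
    """Run three SIR steps on a scalar toy model (assumed, not from the article)."""
    rng = np.random.default_rng(seed)
    a, r, s = 0.9, 0.5, 0.3  # toy AR(1) coefficient, state noise std, observation noise std

    def log_normal(x, mean, std):
        z = (x - mean) / std
        return (-0.5 * z**2 - np.log(std * np.sqrt(2.0 * np.pi))).sum(axis=-1)

    def sample_q(parents, y, rng):
        # bootstrap proposal: sample from the transition density itself
        return a * parents + r * rng.standard_normal(parents.shape)

    def log_p_trans(new, parents):
        return log_normal(new, a * parents, r)

    def log_q(new, y, parents):
        return log_normal(new, a * parents, r)

    def log_p_obs(y, new):
        return log_normal(y, new, s)

    particles = rng.standard_normal((M, 1))
    weights = np.full(M, 1.0 / M)
    for y in (0.4, 0.1, -0.2):  # made-up observations
        particles, weights, _ = sir_step(
            particles, weights, y, sample_q, log_p_trans, log_p_obs, log_q, rng
        )
    return particles, weights
```
      </preformat>
      <p>Because resampling is performed at every step, the previous weights do not enter the normalization in step (iii), in agreement with Eq. (A26). Working with log densities and subtracting the maximum before exponentiating is a common numerical safeguard and is a design choice of this sketch.</p>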
<?pagebreak page247?><sec id="App1.Ch1.S1.SS2.SSS1">
  <label>A2.1</label><title>Optimal importance sampling</title>
      <p id="d1e13625">Note that the conditional transition density of the states <inline-formula><mml:math id="M456" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mi mathvariant="italic">θ</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> in Eq. (<xref ref-type="disp-formula" rid="Ch1.E7"/>) is Gaussian and the observation model in Eq. (<xref ref-type="disp-formula" rid="Ch1.E8"/>) is linear and Gaussian.  These facts allow for a Gaussian optimal importance density <inline-formula><mml:math id="M457" display="inline"><mml:mrow><mml:mi>q</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> that is proportional to <inline-formula><mml:math id="M458" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for each <inline-formula><mml:math id="M459" 
display="inline"><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:math></inline-formula>:
              <disp-formula id="App1.Ch1.S1.E48" content-type="numbered"><label>A27</label><mml:math id="M460" display="block"><mml:mrow><mml:mi>q</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mo>∼</mml:mo><mml:mi mathvariant="script">N</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mi mathvariant="bold">Σ</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
            with the mean <inline-formula><mml:math id="M461" display="inline"><mml:mrow><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> and the covariance <inline-formula><mml:math id="M462" display="inline"><mml:mi mathvariant="bold">Σ</mml:mi></mml:math></inline-formula> given by

                  <disp-formula specific-use="align" content-type="numbered"><mml:math id="M463" display="block"><mml:mtable displaystyle="true"><mml:mlabeledtr id="App1.Ch1.S1.E49"><mml:mtd><mml:mtext>A28</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:msubsup><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mo>=</mml:mo><mml:mi mathvariant="italic">μ</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">RH</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold">Q</mml:mi><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">HRH</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mi mathvariant="bold">H</mml:mi><mml:mi mathvariant="italic">μ</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr><mml:mlabeledtr id="App1.Ch1.S1.E50"><mml:mtd><mml:mtext>A29</mml:mtext></mml:mtd><mml:mtd><mml:mrow><mml:mstyle displaystyle="true" class="stylechange"/><mml:mi mathvariant="bold">Σ</mml:mi></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mstyle class="stylechange" displaystyle="true"/><mml:mo>=</mml:mo><mml:mi mathvariant="bold">R</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mi mathvariant="bold">RH</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold">Q</mml:mi><mml:mo>+</mml:mo><mml:msup><mml:mi mathvariant="bold">HRH</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup><mml:mi mathvariant="bold">HR</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mlabeledtr></mml:mtable></mml:math></disp-formula></p>
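      <p>As a hedged illustration, the mean and covariance of the Gaussian importance density can be computed as below. The sketch uses the gain R H^T (Q + H R H^T)^-1 in both the mean and the covariance, which is the standard formula for conditioning a Gaussian on a linear Gaussian observation. The function name <monospace>optimal_proposal</monospace> and the reading of Q as the observation noise covariance are assumptions of this sketch.</p>
      <preformat>
```python
import numpy as np


def optimal_proposal(mu_prev, R, H, Q, y):
    """Mean and covariance of N(mu_n^m, Sigma) for one particle.

    mu_prev stands for mu(U_{n-1}^m); R is the transition noise covariance,
    H the linear observation operator, and Q the observation noise
    covariance (this reading of Q is an assumption of the sketch).
    """
    S = Q + H @ R @ H.T                # covariance of y_n given U_{n-1}^m
    # gain K = R H^T S^{-1}; the transpose trick relies on R and S being symmetric
    K = np.linalg.solve(S, H @ R).T
    mu = mu_prev + K @ (y - H @ mu_prev)
    Sigma = R - K @ H @ R
    return mu, Sigma
```
      </preformat>
      <p>Since the covariance does not depend on the particle index, it (or a Cholesky factor of it) need only be computed once per parameter value and can be reused for all particles and time steps.</p>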
</sec>
<sec id="App1.Ch1.S1.SS2.SSS2">
  <label>A2.2</label><title>Drawbacks of SMC</title>
      <p id="d1e13993">While the resampling technique prevents <inline-formula><mml:math id="M464" display="inline"><mml:mrow><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> from being degenerate at each current time <inline-formula><mml:math id="M465" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula>, SMC algorithms suffer from the degeneracy (or particle depletion) problem: the marginal distribution <inline-formula><mml:math id="M466" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi>p</mml:mi><mml:mo stretchy="true" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> becomes concentrated on a single particle as <inline-formula><mml:math id="M467" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>-</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:math></inline-formula> increases because each resampling step reduces the number of distinct particles of <inline-formula><mml:math id="M468" display="inline"><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. 
As a result, the estimate of the joint density <inline-formula><mml:math id="M469" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> of the trajectory deteriorates as time <inline-formula><mml:math id="M470" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> increases.</p>
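      <p>The depletion of early-time particles can be seen in a small experiment that tracks only the ancestry of the particles. The sketch below is illustrative and rests on an assumption made purely to isolate the effect of resampling: the weights are taken to be uniform at every step. The function name <monospace>distinct_ancestors</monospace> and the default values of M and N are likewise hypothetical.</p>
      <preformat>
```python
import numpy as np


def distinct_ancestors(M=100, N=200, seed=0):
    """Count the distinct time-1 particles that survive each resampling step.

    Weights are uniform at every step (an assumption of this sketch);
    uneven weights would deplete the ancestry even faster.
    """
    rng = np.random.default_rng(seed)
    roots = np.arange(M)  # label of the time-1 ancestor of each current particle
    counts = []
    for _ in range(N):
        # multinomial resampling: each particle picks a parent uniformly at random
        roots = roots[rng.integers(0, M, size=M)]
        counts.append(len(np.unique(roots)))
    return counts
```
      </preformat>
      <p>The returned counts can never increase, because each resampling step can only discard ancestral lines. With far more time steps than particles, the count typically drops to a handful of distinct time-1 particles, which is the concentration of the early-time marginals described above.</p>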
</sec>
</sec>
<sec id="App1.Ch1.S1.SS3">
  <label>A3</label><title>Particle Gibbs and PGAS</title>
      <p id="d1e14126">The framework of particle MCMC introduced in <xref ref-type="bibr" rid="bib1.bibx2" id="text.50"/> is a systematic combination of SMC and MCMC methods that exploits the strengths of both techniques. Among the various particle MCMC methods, we focus on the <italic>particle Gibbs sampler</italic> (PG), which uses a novel conditional SMC update <xref ref-type="bibr" rid="bib1.bibx2" id="paren.51"/>, and on its variant, the <italic>particle Gibbs with ancestor sampling</italic> (PGAS) sampler <xref ref-type="bibr" rid="bib1.bibx30" id="paren.52"/>, because they are best suited to sampling our joint parameter and state posterior.</p>
      <p id="d1e14144">The PG and PGAS samplers use a conditional SMC update to move between successive states of the Markov chain while ensuring that the target distribution is the stationary distribution of the chain. The basic procedure of a PG sampler is as follows.
<list list-type="bullet"><list-item>
      <p id="d1e14149">Initialization: draw <inline-formula><mml:math id="M471" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> from the prior distribution <inline-formula><mml:math id="M472" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Run an SMC algorithm to generate weighted samples <inline-formula><mml:math id="M473" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mi>N</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> for <inline-formula><mml:math id="M474" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and draw <inline-formula><mml:math id="M475" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> from these weighted samples.</p></list-item><list-item>
      <p id="d1e14288">Markov chain iteration: for <inline-formula><mml:math id="M476" display="inline"><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">⋯</mml:mi><mml:mo>,</mml:mo><mml:mi>L</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula>,
<list list-type="alpha-lower"><list-item>
      <p id="d1e14317">sample <inline-formula><mml:math id="M477" display="inline"><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> from the marginal posterior  <inline-formula><mml:math id="M478" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> given by Eq. (<xref ref-type="disp-formula" rid="Ch1.E14"/>);</p></list-item><list-item>
      <p id="d1e14385">run a <italic>conditional SMC algorithm</italic>, conditioned  on <inline-formula><mml:math id="M479" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, which is called the reference trajectory. That is, in the SMC algorithm, the <inline-formula><mml:math id="M480" display="inline"><mml:mi>M</mml:mi></mml:math></inline-formula>th particle is required to move along the reference trajectory by setting <inline-formula><mml:math id="M481" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi><mml:mi>M</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>. Draw other samples from the importance density, and normalize the weights and resample all the particles as usual. 
This leads to weighted samples <inline-formula><mml:math id="M482" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mi>N</mml:mi><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M483" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow><mml:mi>M</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>; and</p></list-item><list-item>
      <p id="d1e14525">draw <inline-formula><mml:math id="M484" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> from the above weighted samples.</p></list-item></list></p></list-item><list-item>
      <p id="d1e14555">Return the Markov chain <inline-formula><mml:math id="M485" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>L</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>.</p></list-item></list></p>
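      <p>The steps above can be sketched as a short generic loop. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the four callables <monospace>sample_prior</monospace>, <monospace>run_smc</monospace>, <monospace>sample_theta</monospace>, and <monospace>run_csmc</monospace> are hypothetical placeholders for the prior sampler, the initial SMC run, the parameter update of step (a), and the conditional SMC run of step (b).</p>
<preformat preformat-type="code"><![CDATA[
```python
import random


def particle_gibbs(sample_prior, run_smc, sample_theta, run_csmc, num_iters, rng):
    """Generic Particle Gibbs loop (sketch; all callables are hypothetical).

    sample_prior(rng)          -> theta drawn from the prior p(theta)
    run_smc(theta, rng)        -> (trajectories, weights) from a standard SMC run
    sample_theta(traj, rng)    -> theta drawn from p(theta | y, traj)
    run_csmc(theta, ref, rng)  -> (trajectories, weights) from a conditional SMC
                                  run that keeps `ref` as one of the trajectories
    """
    # Initialization: theta(1) from the prior, U(1) from an unconditional SMC run.
    theta = sample_prior(rng)
    trajs, weights = run_smc(theta, rng)
    traj = rng.choices(trajs, weights=weights, k=1)[0]
    chain = [(theta, traj)]

    # Markov chain iteration for l = 1, ..., L - 1.
    for _ in range(num_iters - 1):
        theta = sample_theta(traj, rng)                      # step (a)
        trajs, weights = run_csmc(theta, traj, rng)          # step (b)
        traj = rng.choices(trajs, weights=weights, k=1)[0]   # step (c)
        chain.append((theta, traj))
    return chain
```
]]></preformat>
      <p>Passing the model-specific pieces as arguments keeps the loop independent of the state-space model; the returned list holds the pairs of parameter and trajectory samples in chain order.</p>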
      <p id="d1e14604"><?xmltex \hack{\begin{figure*}[p]}?><?xmltex \igopts{width=497.923228pt}?><inline-graphic xlink:href="https://npg.copernicus.org/articles/26/227/2019/npg-26-227-2019-g02.png"/><?xmltex \hack{\end{figure*}}?></p>
      <p id="d1e14612">The conditional SMC algorithm is the core of PG samplers. It retains the reference path throughout the resampling steps by deterministically setting <inline-formula><mml:math id="M486" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow><mml:mi>M</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M487" display="inline"><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mi>n</mml:mi><mml:mi>M</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:math></inline-formula> for all <inline-formula><mml:math id="M488" display="inline"><mml:mi>n</mml:mi></mml:math></inline-formula> while sampling the remaining <inline-formula><mml:math id="M489" display="inline"><mml:mrow><mml:mi>M</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> particles according to a standard SMC algorithm. The reference path interacts with the other paths by contributing a weight <inline-formula><mml:math id="M490" display="inline"><mml:mrow><mml:msubsup><mml:mi>w</mml:mi><mml:mi>n</mml:mi><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>. This interaction is the key to ensuring that the PG Markov chain converges to the target distribution. A potential risk of the PG sampler is a poorly mixing Markov chain, because the reference trajectory tends to dominate the SMC ensemble trajectories.</p>
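      <p>A conditional SMC pass of this kind can be sketched as follows. The sketch rests on assumptions that are not part of the SEBM setting: a hypothetical scalar model with transition U_n = a U_{n-1} + sigma xi_n, observation y_n = U_n + tau eta_n, initial state drawn from N(0, sigma^2), a bootstrap importance density, and resampling at every step. The last particle plays the role of the reference particle and always keeps itself as ancestor.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
import random


def conditional_smc(y, ref, a, sigma, tau, num_particles, rng):
    """Conditional bootstrap SMC for a toy scalar model (illustrative assumption).

    The particle with index num_particles - 1 is pinned to the reference
    trajectory `ref`; the other particles are propagated and resampled as usual.
    Returns the full particle paths and their normalized final weights.
    """
    M = num_particles

    def log_lik(obs, u):
        # Gaussian observation log-likelihood, up to an additive constant.
        return -0.5 * ((obs - u) / tau) ** 2

    def normalize(log_w):
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        return [v / total for v in w]

    # Time 1: free particles from the initial density, reference particle pinned.
    paths = [[rng.gauss(0.0, sigma)] for _ in range(M - 1)] + [[ref[0]]]
    weights = normalize([log_lik(y[0], p[-1]) for p in paths])

    for n in range(1, len(y)):
        # Free particles draw ancestors from all M weights (reference included);
        # the reference particle keeps itself as ancestor.
        ancestors = rng.choices(range(M), weights=weights, k=M - 1) + [M - 1]
        new_paths = []
        for m, k in enumerate(ancestors):
            if m == M - 1:
                u_new = ref[n]
            else:
                u_new = a * paths[k][-1] + sigma * rng.gauss(0.0, 1.0)
            new_paths.append(paths[k] + [u_new])
        paths = new_paths
        weights = normalize([log_lik(y[n], p[-1]) for p in paths])
    return paths, weights
```
]]></preformat>
      <p>Because the reference weight takes part in every resampling draw, free particles can select the reference particle as their ancestor, which is how the reference path influences the rest of the ensemble in this sketch.</p>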
      <?pagebreak page248?><p id="d1e14701">The PGAS sampler improves the mixing of the chain by connecting the reference path to the histories of other particles: at each time step, it assigns an ancestor to the reference particle. This is accomplished by drawing a sample for the ancestor index <inline-formula><mml:math id="M491" display="inline"><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> of the reference particle, which is referred to as <italic>ancestor sampling</italic>. The distribution of the index <inline-formula><mml:math id="M492" display="inline"><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> is determined by the likelihood of connecting <inline-formula><mml:math id="M493" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> to the particles <inline-formula><mml:math id="M494" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula>, that is, according to the weights
            <disp-formula id="App1.Ch1.S1.E51" content-type="numbered"><label>A30</label><mml:math id="M495" display="block"><mml:mtable rowspacing="0.2ex" class="split" columnspacing="1em" displaystyle="true" columnalign="right left"><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">α</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent="true"><mml:mi>w</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn 
mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">α</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">α</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>k</mml:mi></mml:msubsup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
          The above weight <inline-formula><mml:math id="M496" display="inline"><mml:mrow><mml:msubsup><mml:mover accent="true"><mml:mi mathvariant="italic">α</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> can be seen as an unnormalized posterior probability, where the importance weight <inline-formula><mml:math id="M497" display="inline"><mml:mrow><mml:msubsup><mml:mi>w</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> is the prior probability of the particle <inline-formula><mml:math id="M498" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> and the product <inline-formula><mml:math id="M499" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi mathvariant="italic">θ</mml:mi><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>|</mml:mo><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mi 
mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the likelihood that <inline-formula><mml:math id="M500" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mi mathvariant="normal">n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> originates from <inline-formula><mml:math id="M501" display="inline"><mml:mrow><mml:msubsup><mml:mi>U</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> conditional on the observation <inline-formula><mml:math id="M502" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>.
In short, the PGAS sampler assigns the reference particle <inline-formula><mml:math id="M503" display="inline"><mml:mrow><mml:msub><mml:mi>U</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>l</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> an ancestor <inline-formula><mml:math id="M504" display="inline"><mml:mrow><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> that is drawn from the distribution
<inline-formula><mml:math id="M505" display="inline"><mml:mrow><mml:mi mathvariant="double-struck">F</mml:mi><mml:mo>(</mml:mo><mml:msubsup><mml:mi>A</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:msubsup><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>|</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mi>w</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:msubsup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mover accent="true"><mml:mi>w</mml:mi><mml:mo mathvariant="normal">̃</mml:mo></mml:mover><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>|</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>k</mml:mi></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></inline-formula></p>
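      <p>The ancestor-sampling weights of Eq. (A30) can be computed as in the following sketch, which works in the log domain for numerical stability. The function names and the callables <monospace>log_transition</monospace> and <monospace>log_obs</monospace> are hypothetical; they stand for the log transition density under the current parameter and the log observation density of whatever model is used.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
import random


def ancestor_weights(prev_particles, prev_weights, u_ref_n, y_n, log_transition, log_obs):
    """Normalized ancestor-sampling weights in the sense of Eq. (A30) (sketch).

    log_transition(u_next, u_prev) -> log p_theta(u_next | u_prev)   (hypothetical)
    log_obs(y, u)                  -> log p(y | u)                   (hypothetical)
    """
    log_alpha = []
    for u_prev, w in zip(prev_particles, prev_weights):
        log_w = math.log(w) if w > 0.0 else -math.inf
        # Unnormalized weight: w_{n-1}^m * p(U_n(l) | U_{n-1}^m) * p(y_n | U_n(l)).
        log_alpha.append(log_w + log_transition(u_ref_n, u_prev) + log_obs(y_n, u_ref_n))
    top = max(log_alpha)
    alpha = [math.exp(v - top) for v in log_alpha]
    total = sum(alpha)
    return [v / total for v in alpha]


def sample_ancestor(weights, rng):
    """Draw the ancestor index of the reference particle from the normalized weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```
]]></preformat>
      <p>Since the observation term does not depend on the particle index, it cancels in the normalization; it is kept in the sketch only to mirror the form of Eq. (A30).</p>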
      <p id="d1e15278">The above conditional SMC with ancestor sampling within PGAS  is summarized in Algorithm 2.</p><?xmltex \hack{\clearpage}?>
</sec>
</app>
  </app-group><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d1e15287">FL, NW, and AM formulated the project and designed the experiments. NW and AM derived the SEBM, and FL and NW carried out the experiments. FL developed the model code and performed the simulations with contributions from NW. FL prepared the manuscript with contributions from all the co-authors.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d1e15293">The authors declare that they have no conflict of interest.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d1e15300">Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.</p>
  </notes><ack><title>Acknowledgements</title><p id="d1e15306">The authors thank Colin Grudzien and the other reviewer for helpful comments.
This research started in a working group supported by the Statistical and Applied Mathematical Sciences Institute (SAMSI).  Fei Lu thanks Peter Jan van Leeuwen,  Kayo Ide,  Mauro Maggioni,  Xuemin Tu, and Wenjun Ma for helpful discussions. Fei Lu is supported by the National Science Foundation under grant DMS-1821211. Nils Weitzel thanks Andreas Hense and Douglas Nychka for inspiring discussions. Nils Weitzel was supported by the German Federal Ministry of Education and Research (BMBF) through the Palmod project (FKZ: 01LP1509D). Nils Weitzel thanks the German Research Foundation (code RE3994-2/1) for funding. Adam H. Monahan acknowledges support from the Natural Sciences and Engineering Research Council of Canada (NSERC), and thanks SAMSI for hosting him in the autumn of 2017.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d1e15311">This research has been supported by the National Science Foundation, USA (grant no. DMS-1821211), the German Federal Ministry of Education and Research (BMBF) (grant no. FKZ: 01LP1509D), the German Research Foundation (grant no. RE3994-2/1), and the Natural Sciences and Engineering Research Council of Canada (NSERC).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d1e15317">This paper was edited by Stefano Pierini and reviewed by Colin Grudzien and one anonymous referee.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Alberty et al.(1999)Alberty, Carstensen, and
Funken</label><?label alberty1999remarks?><mixed-citation>
Alberty, J., Carstensen, C., and Funken, S. A.: Remarks around 50 lines of
Matlab: short finite element implementation, Numer. Algorithms, 20, 117–137,
1999.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Andrieu et al.(2010)Andrieu, Doucet, and
Holenstein</label><?label andrieu2010particle?><mixed-citation>
Andrieu, C., Doucet, A., and Holenstein, R.: Particle Markov chain Monte
Carlo methods, J. R. Stat. Soc. B, 72, 269–342, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Annan et al.(2005)Annan, Hargreaves, Edwards, and
Marsh</label><?label annan2005_paramest?><mixed-citation>
Annan, J., Hargreaves, J., Edwards, N., and Marsh, R.: Parameter estimation in
an intermediate complexity Earth System Model using an ensemble Kalman
filter, Ocean Model., 8, 135–154, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Apte et al.(2007)Apte, Hairer, Stuart, and
Voss</label><?label apte2007_SamplingPosterior?><mixed-citation>
Apte, A., Hairer, M., Stuart, A., and Voss, J.: Sampling the Posterior: An
Approach to Non-Gaussian Data Assimilation, Physica D, 230, 50–64, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Bakka et al.(2018)Bakka, Rue, Fuglstad, Riebler, Bolin, Illian,
Krainski, Simpson, and Lindgren</label><?label bakka2018_inla?><mixed-citation>
Bakka, H., Rue, H., Fuglstad, G. A., Riebler, A., Bolin, D., Illian, J., Krainski, E., Simpson, D., and Lindgren, F.: Spatial modeling with R-INLA: A review, Wiley Interdisciplinary Reviews: Computational Statistics, 10, e1443, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Branicki and Majda(2013)</label><?label BM13?><mixed-citation>
Branicki, M. and Majda, A. J.: Fundamental limitations of polynomial chaos for
uncertainty quantification in systems with intermittent instabilities, Commun.
Math. Sci., 11, 55–103, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx7"><?xmltex \def\ref@label{{Capp{\'{e}} et~al.(2005)Capp{\'{e}}, Moulines, and Ryden}}?><label>Cappé et al.(2005)Cappé, Moulines, and Ryden</label><?label CMR05?><mixed-citation>
Cappé, O., Moulines, E., and Ryden, T.: Inference in Hidden Markov Models
(Springer Series in Statistics), Springer-Verlag, New York, NY, USA, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Carrassi et al.(2018)Carrassi, Bocquet, Bertino, and
Evensen</label><?label carrassi2018_dareview?><mixed-citation>Carrassi, A., Bocquet, M., Bertino, L., and Evensen, G.: Data assimilation in
the geosciences: An overview of methods, issues, and perspectives, WIRES
Clim. Change, 9, e535, <ext-link xlink:href="https://doi.org/10.1002/wcc.535" ext-link-type="DOI">10.1002/wcc.535</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Chekroun and Kondrashov(2017)</label><?label chekroun2017data?><mixed-citation>Chekroun, M. D. and Kondrashov, D.: Data-adaptive harmonic spectra and
multilayer Stuart-Landau models, Chaos, 27, 093110, <ext-link xlink:href="https://doi.org/10.1063/1.4989400" ext-link-type="DOI">10.1063/1.4989400</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Chorin and Tu(2009)</label><?label CT09?><mixed-citation>
Chorin, A. J. and Tu, X.: Implicit sampling for particle filters, P. Natl.
Acad. Sci. USA, 106, 17249–17254, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Chorin and Lu(2015)</label><?label CL15?><mixed-citation>
Chorin, A. J. and Lu, F.: Discrete approach to stochastic parametrization and
dimension reduction in nonlinear dynamics, P. Natl. Acad. Sci. USA, 112,
9804–9809, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Chorin et al.(2016)Chorin, Lu, Miller, Morzfeld, and Tu</label><?label CLMMT16?><mixed-citation>
Chorin, A. J., Lu, F., Miller, R. M., Morzfeld, M., and Tu, X.: Sampling,
feasibility, and priors in data assimilation, Discrete Contin. Dyn. Syst. A,
36, 4227–4246, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Cowles and Carlin(1996)</label><?label cowles1996markov?><mixed-citation>
Cowles, M. K. and Carlin, B. P.: Markov chain Monte Carlo convergence
diagnostics: a comparative review, J. Am. Stat. Assoc., 91, 883–904, 1996.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Cui et al.(2015)Cui, Marzouk, and Willcox</label><?label cui2015data?><mixed-citation>
Cui, T., Marzouk, Y. M., and Willcox, K. E.: Data-driven model reduction for
the Bayesian solution of inverse problems, Int. J. Numer. Methods Fluids,
102, 966–990, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Doucet and Johansen(2011)</label><?label DJ11?><mixed-citation>
Doucet, A. and Johansen, A. M.: A tutorial on particle filtering and smoothing:
fifteen years later, in: Oxford Handbook of Nonlinear Filtering,
656–704, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Fang and Li(2016)</label><?label fang2016_PaleoclimateData?><mixed-citation>Fang, M. and Li, X.: Paleoclimate Data Assimilation: Its Motivation,
Progress and Prospects, Sci. China Earth Sci., 59, 1817–1826,
<ext-link xlink:href="https://doi.org/10.1007/s11430-015-5432-6" ext-link-type="DOI">10.1007/s11430-015-5432-6</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Fanning and Weaver(1996)</label><?label fanning1996atmospheric?><mixed-citation>
Fanning, A. F. and Weaver, A. J.: An atmospheric energy-moisture balance model:
Climatology, interpentadal climate change, and coupling to an ocean general
circulation model, J. Geophys. Res.-Atmos., 101, 15111–15128, 1996.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Farchi and Bocquet(2018)</label><?label farchi2018comparison?><mixed-citation>Farchi, A. and Bocquet, M.: Review article: Comparison of local particle filters and new implementations, Nonlin. Processes Geophys., 25, 765–807, <ext-link xlink:href="https://doi.org/10.5194/npg-25-765-2018" ext-link-type="DOI">10.5194/npg-25-765-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Ghosal and Van der Vaart(2017)</label><?label ghosal2017fundamentals?><mixed-citation>
Ghosal, S. and Van der Vaart, A.: Fundamentals of nonparametric Bayesian
inference, vol. 44, Cambridge University Press, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Goosse et al.(2010)Goosse, Crespin, de Montety, Mann, Renssen, and
Timmermann</label><?label goosse2010_ReconstructingSurface?><mixed-citation>Goosse, H., Crespin, E., de Montety, A., Mann, M. E., Renssen, H., and
Timmermann, A.: Reconstructing Surface Temperature Changes over the Past 600
Years Using Climate Model Simulations with Data Assimilation, J. Geophys.
Res., 115, <ext-link xlink:href="https://doi.org/10.1029/2009JD012737" ext-link-type="DOI">10.1029/2009JD012737</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Guillot et al.(2015)Guillot, Rajaratnam, and
Emile-Geay</label><?label guillot2015_StatisticalPaleoclimate?><mixed-citation>
Guillot, D., Rajaratnam, B., and Emile-Geay, J.: Statistical Paleoclimate
Reconstructions via Markov Random Fields, Ann. Appl. Stat., 9, 324–352,
2015.</mixed-citation></ref>
      <?pagebreak page251?><ref id="bib1.bibx22"><label>Hairer et al.(2007)Hairer, Stuart, and
Voss</label><?label hairer2007_AnalysisSPDEs?><mixed-citation>Hairer, M., Stuart, A. M., and Voss, J.: Analysis of SPDEs Arising in Path
Sampling Part II: The Nonlinear Case, Ann. Appl.
Probab., 17, 1657–1706, <ext-link xlink:href="https://doi.org/10.1214/07-AAP441" ext-link-type="DOI">10.1214/07-AAP441</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Haslett et al.(2006)Haslett, Whiley, Bhattacharya, Salter-Townshend,
Wilson, Allen, Huntley, and Mitchell</label><?label hwb_06?><mixed-citation>
Haslett, J., Whiley, M., Bhattacharya, S., Salter-Townshend, M., Wilson, S. P.,
Allen, J., Huntley, B., and Mitchell, F.: Bayesian palaeoclimate
reconstruction, J. R. Stat. Soc. A, 169, 395–438, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Jiang and Harlim(2018)</label><?label jiang2018parameter?><mixed-citation>
Jiang, S. W. and Harlim, J.: Parameter estimation with data-driven
nonparametric likelihood functions, arXiv preprint arXiv:1804.03272, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Kantas et al.(2009)Kantas, Doucet, Singh, and Maciejowski</label><?label KDSM09?><mixed-citation>
Kantas, N., Doucet, A., Singh, S. S., and Maciejowski, J. M.: An Overview of
Sequential Monte Carlo Methods for Parameter Estimation in General
State-Space Models,  Proceedings of the IFAC Symposium on System
Identification (SYSID), Saint-Malo, France, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Khouider et al.(2003)Khouider, Majda, and Katsoulakis</label><?label KMK03?><mixed-citation>
Khouider, B., Majda, A. J., and Katsoulakis, M. A.: Coarse-grained stochastic
models for tropical convection and climate, P. Natl. Acad. Sci. USA,
100, 11941–11946, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>Law et al.(2015)Law, Stuart, and Zygalakis</label><?label LSZ15?><mixed-citation>
Law, K., Stuart, A., and Zygalakis, K.: Data Assimilation: A Mathematical
Introduction, Springer, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>Lindgren and Rue(2015)</label><?label lindren2015_inla?><mixed-citation>
Lindgren, F. and Rue, H.: Bayesian Spatial Modelling with R-INLA, J. Stat.
Softw., 63, 1–25, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx29"><?xmltex \def\ref@label{{Lindgren et~al.(2011)Lindgren, Rue, and
Lindstr\"{o}m}}?><label>Lindgren et al.(2011)Lindgren, Rue, and
Lindström</label><?label lindgren2011_ExplicitLink?><mixed-citation>
Lindgren, F., Rue, H., and Lindström, J.: An Explicit Link between
Gaussian Fields and Gaussian Markov Random Fields: The Stochastic
Partial Differential Equation Approach: Link between Gaussian Fields
and Gaussian Markov Random Fields, J. R. Stat. Soc. B, 73,
423–498, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx30"><?xmltex \def\ref@label{{Lindsten et~al.(2014)Lindsten, Jordan, and
Sch{\"{o}}n}}?><label>Lindsten et al.(2014)Lindsten, Jordan, and
Schön</label><?label lindsten2014particle?><mixed-citation>
Lindsten, F., Jordan, M. I., and Schön, T. B.: Particle Gibbs with ancestor
sampling, J. Mach. Learn. Res., 15, 2145–2184, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>Liu(2001)</label><?label Liu01?><mixed-citation>
Liu, J.: Monte Carlo Strategies in Scientific Computing, Springer, 2001.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>Llopis et al.(2018)Llopis, Kantas, Beskos, and
Jasra</label><?label llopis2018_ParticleFiltering?><mixed-citation>
Llopis, F. P., Kantas, N., Beskos, A., and Jasra, A.: Particle Filtering
for Stochastic Navier–Stokes Signal Observed with Linear Additive
Noise, SIAM J. Sci. Comput., 40, A1544–A1565, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Lu et al.(2015)Lu, Morzfeld, Tu, and Chorin</label><?label LMTC15?><mixed-citation>
Lu, F., Morzfeld, M., Tu, X., and Chorin, A. J.: Limitations of polynomial
chaos expansions in the Bayesian solution of inverse problems, J. Comput.
Phys., 282, 138–147, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Lu et al.(2017)Lu, Tu, and Chorin</label><?label LTC17?><mixed-citation>
Lu, F., Tu, X., and Chorin, A. J.: Accounting for Model Error from Unresolved
Scales in Ensemble Kalman Filters by Stochastic Parameterization, Mon. Weather
Rev., 145, 3709–3723, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Marzouk and Najm(2009)</label><?label MN09?><mixed-citation>
Marzouk, Y. M. and Najm, H. N.: Dimensionality reduction and polynomial chaos
acceleration of Bayesian inference in inverse problems, J. Comput. Phys.,
228, 1862–1902, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Maslowski and Tudor(2013)</label><?label maslowski2013_DriftParameter?><mixed-citation>
Maslowski, B. and Tudor, C. A.: Drift Parameter Estimation for
Infinite-Dimensional Fractional Ornstein-Uhlenbeck Process,
B. Sci. Math., 137, 880–901, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Morzfeld et al.(2012)Morzfeld, Tu, Atkins, and Chorin</label><?label MTAC12?><mixed-citation>
Morzfeld, M., Tu, X., Atkins, E., and Chorin, A. J.: A random map
implementation of implicit filters, J. Comput. Phys., 231, 2049–2066, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx38"><?xmltex \def\ref@label{{M{\"{u}}ller and Mitra(2013)}}?><label>Müller and Mitra(2013)</label><?label muller2013bayesian?><mixed-citation>
Müller, P. and Mitra, R.: Bayesian nonparametric inference – why and how,
Bayesian Anal., 8, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>O'Leary(2001)</label><?label oleary2001_NearOptimalParameters?><mixed-citation>
O'Leary, D. P.: Near-Optimal Parameters for Tikhonov and Other
Regularization Methods, SIAM J. Sci. Comput., 23, 1161–1171, 2001.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>Parnell et al.(2016)Parnell, Haslett, Sweeney, Doan, Allen, and
Huntley</label><?label parnell2016joint?><mixed-citation>
Parnell, A. C., Haslett, J., Sweeney, J., Doan, T. K., Allen, J. R., and
Huntley, B.: Joint palaeoclimate reconstruction from pollen data via forward
models and climate histories, Quaternary Sci. Rev., 151, 111–126,
2016.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Penny and Miyoshi(2016)</label><?label penny2016_LocalParticle?><mixed-citation>Penny, S. G. and Miyoshi, T.: A local particle filter for high-dimensional geophysical systems, Nonlin. Processes Geophys., 23, 391–405, <ext-link xlink:href="https://doi.org/10.5194/npg-23-391-2016" ext-link-type="DOI">10.5194/npg-23-391-2016</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Poterjoy(2016)</label><?label poterjoy2016_LocalizedParticle?><mixed-citation>
Poterjoy, J.: A Localized Particle Filter for High-Dimensional
Nonlinear Systems, Mon. Weather Rev., 144, 59–76, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Prakasa Rao(2001)</label><?label prakasarao2001_StatisticalInference?><mixed-citation>
Prakasa Rao, B. L. S.: Statistical Inference for Stochastic Partial
Differential Equations, in: Institute of Mathematical Statistics Lecture
Notes – Monograph Series, Institute of Mathematical
Statistics, Beachwood, OH, 47–70, 2001.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Rypdal et al.(2015)Rypdal, Rypdal, and Fredriksen</label><?label rypdal2018_sebm?><mixed-citation>
Rypdal, K., Rypdal, M., and Fredriksen, H.-B.: Spatiotemporal Long-Range
Persistence in Earth's Temperature Field: Analysis of Stochastic-Diffusive
Energy Balance Models, J. Climate, 28, 8379–8395, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx45"><?xmltex \def\ref@label{{Sigrist et~al.(2015)Sigrist, K\"{u}nsch, and
Stahel}}?><label>Sigrist et al.(2015)Sigrist, Künsch, and
Stahel</label><?label sigrist2015_StochasticPartial?><mixed-citation>
Sigrist, F., Künsch, H. R., and Stahel, W. A.: Stochastic Partial
Differential Equation Based Modelling of Large Space-Time Data Sets, J. R.
Stat. Soc. B, 77, 3–33, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Snyder(2016)</label><?label snyder2016_longtemp?><mixed-citation>
Snyder, C. W.: Evolution of global temperature over the past two million years,
Nature, 538, 226–228, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Steiger et al.(2014)Steiger, Hakim, Steig, Battisti, and
Roe</label><?label steiger2014_AssimilationTimeAveraged?><mixed-citation>
Steiger, N. J., Hakim, G. J., Steig, E. J., Battisti, D. S., and Roe, G. H.:
Assimilation of Time-Averaged Pseudoproxies for Climate
Reconstruction, J. Climate, 27, 426–441, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Tingley and Huybers(2010)</label><?label tingley2010_BayesianAlgorithm?><mixed-citation>
Tingley, M. P. and Huybers, P.: A Bayesian Algorithm for Reconstructing
Climate Anomalies in Space and Time. Part I: Development
and Applications to Paleoclimate Reconstruction Problems, J. Climate,
23, 2759–2781, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Tingley et al.(2012)Tingley, Craigmile, Haran, Li, Mannshardt, and
Rajaratnam</label><?label tingley2012_BayesianCFR?><mixed-citation>
Tingley, M. P., Craigmile, P. F., Haran, M., Li, B., Mannshardt, E., and
Rajaratnam, B.: Piecing together the past: statistical insights into
paleoclimatic reconstructions, Quaternary Sci. Rev., 35, 1–22, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx50"><label>Tolwinski-Ward et al.(2011)Tolwinski-Ward, Evans, Hughes, and
Anchukaitis</label><?label tolwinski2011efficient?><mixed-citation>
Tolwinski-Ward, S. E., Evans, M. N., Hughes, M. K., and Anchukaitis, K. J.: An
efficient forward model of the climate controls on interannual variation in
tree-ring width, Clim. Dynam., 36, 2419–2439, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx51"><label>Trenberth et al.(2009)Trenberth, Fasullo, and
Kiehl</label><?label trenberth2009_EarthGlobal?><mixed-citation>
Trenberth, K. E., Fasullo, J. T., and Kiehl, J.: Earth's Global Energy
Budget, B. Am. Meteorol. Soc., 90, 311–324, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx52"><label>Van der Vaart(2000)</label><?label van2000asymptotic?><mixed-citation>
Van der Vaart, A. W.: Asymptotic statistics, vol. 3, Cambridge University
Press, 2000.</mixed-citation></ref>
      <ref id="bib1.bibx53"><label>Vetra-Carvalho et al.(2018)Vetra-Carvalho, van Leeuwen, Nerger,
Barth, Altaf, Brasseur, Kirchgessner, and
Beckers</label><?label vetra-carvalho2018_StateoftheartStochastic?><mixed-citation>
Vetra-Carvalho, S., van Leeuwen, P. J., Nerger, L., Barth, A., Altaf,
M. U., Brasseur, P., Kirchgessner, P., and Beckers, J.-M.: State-of-the-Art
Stochastic Data Assimilation Methods for High-Dimensional Non-Gaussian
Problems, Tellus A, 70, 1–43, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx54"><label>Weaver et al.(2001)Weaver, Eby, Wiebe, Bitz, Duffy, Ewen, Fanning,
Holland, MacFadyen, Matthews et al.</label><?label weaver2001uvic?><mixed-citation>
Weaver, A. J., Eby, M., Wiebe, E. C., Bitz, C. M., Duffy, P. B., Ewen, T. L.,
Fanning, A. F., Holland, M. M., MacFadyen, A., Matthews, H. D., Meissner, K. J., Saenko, O., Schmittner, A., Wang, H., and Yoshimori, M.: The
UVic Earth System Climate Model: Model description, climatology, and
applications to past, present and future climates, Atmos. Ocean., 39,
361–428, 2001.</mixed-citation></ref>
      <ref id="bib1.bibx55"><label>Werner et al.(2013)Werner, Luterbacher, and Smerdon</label><?label werner2013_ppe?><mixed-citation>
Werner, J. P., Luterbacher, J., and Smerdon, J. E.: A Pseudoproxy Evaluation of
Bayesian Hierarchical Modeling and Canonical Correlation Analysis for Climate
Field Reconstructions over Europe, J. Climate, 26, 851–867, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx56"><label>Whittle(1954)</label><?label w_54?><mixed-citation>
Whittle, P.: On stationary processes in the plane, Biometrika, 41, 434–449,
1954.</mixed-citation></ref>
      <ref id="bib1.bibx57"><label>Whittle(1963)</label><?label w_63?><mixed-citation>
Whittle, P.: Stochastic processes in several dimensions, B. Int. Statist.
Inst., 40, 974–994, 1963.</mixed-citation></ref>

  </ref-list></back>
</article>
