Using a Variance-based Sensitivity Analysis for Analyzing the Relation between Measurements and Unknown Parameters of a Physical Model

An implementation of uncertainty analysis (UA) and quantitative global sensitivity analysis (SA) is applied to the non-linear inversion of gravity changes and three-dimensional displacement data which were measured in and active volcanic area. A didactic example is included to illustrate the computational procedure. The main emphasis is placed on the problem of extended Fourier amplitude sensitivity test (E-FAST). This method produces the total sensitivity indices (TSIs), so that all interactions between the unknown input parameters are taken into account. The possible correlations between the output an the input parameters can be evaluated by uncertainty analysis. Uncertainty analysis results indicate the general fit between the physical model and the measurements. Results of the sensitivity analysis show quite different sensitivities for the measured changes as they relate to the unknown parameters of a physical model for an elastic-gravitational source. Assuming a fixed number of executions , thirty different seeds are observed to determine the stability of this method.


Introduction
In Moldwin and Rose (2009), the majority of the articles surveyed did not discuss measurement uncertainty or present error bars in observational or statistical analysis figures.We conducted a survey of 31 articles published between 2005 and 2010 having "sensitivity analysis" as a keyword using American Geophysical Union (AGU) Earth and Space Index (EASI) search engine.The majority (21) applied the sensitivity analysis practices to hydrology.The application of SAs can be exiguously found in the modeling of various branches of geodesy, e.g.non-linear geodetic data inversion Correspondence to: J. Zhao (j.zhao@hm.edu)(Tiede et al., 2005), model evaluation in engineering survey (Schwieger, 2004), and model optimization for trajectory estimation (Schwieger, 2006).
The study of uncertainty is usually composed of two related activities referred as uncertainty analysis and sensitivity analysis.Uncertainty analysis aims quantifying the overall uncertainty associated with the response as a result of uncertainties in the model input.Ideally, SA and UA should be computed in tandem, with UA preceding in current practice.SA is a study of how uncertainty in the output of a model (numerical or otherwise) can be apportioned, qualitatively or quantitatively, to different sources of uncertainty in the model input, and of how the given model depends upon the information fed into it, (Saltelli et al., 2000(Saltelli et al., , 2008)).It can be seen as a tool for validating and optimizing a model due to the determination of the sensitivities of the different output values concerning changes in the unknown input parameters.This knowledge results in the quantification as well as the qualification of the unknown input parameters, so it can be derived which parameter has to be known best in order to reduce the variance of a certain output value.In geodesy measurement models, the inputs are the measurement quantities and the outputs are the corrected or reduced measurements, the estimated coordinates or other parameters.SA studies the relationship between input and output quantities of the model (Schwieger, 2006).
Sensitivity analysis of model output examines how a model depends on its input parameters.Two groups of sensitivity analyses are defined: local sensitivity analysis and global sensitivity analysis (Saltelli et al., 2000).One drawback of local SA is that it is not possible to quantify the effects caused by the interactions between the unknown input parameters.Thus, we are using global SA techniques instead of local SA.
Variance-based methods are a class of global SA techniques.The main advantage of these methods is that they are intuitive and quantitative.Within this class of methods, we strongly favor those capable of computing the so-called "Total Sensitivity Indices" (TSI), which measures a parameter's main effect and all the interactions involving that parameter, especially since the analyst cannot know in advance whether his/her model will be additive in all its factors (Saltelli et al., 2000).
Due to the lack of knowledge if the underlying physical based mathematical model is additive or not, and its nonlinear behavior, a variance-based, global sensitivity analysis which can also compute the interactions between the unknown parameters has been chosen.Because the number of unknown input parameters is very small with an amount of five (X = [ξ, ψ, ζ, , m]), (details on the parameters are discussed in Sect.2) the aspect of computation time for the analysis is negligible for the choice of the method.From the two variance-based global techniques which allow the computation of TSIs, as Sobol' has been discussed in Tiede (2005) and Tiede et al. (2005), Extended Fourier Amplitude Sensitivity Test (E-FAST) has been applied.E-FAST -an evolution of FAST -is a variance-based SA technique among the methods most used.It was proposed to combine FAST better efficiency with Sobol' capacity to compute total effects by Saltelli et al. (1999).The current paper gives an assessment of this method applied to a geodetic model.

Case study Merapi volcano
As a case study, gravity changes (dg) and three-dimensional displacements (dx, dy and dz) were measured at different time epochs in a permanently active area around Merapi volcano located at Java, Indonesia.One possible explanation for the measured changes in gravity and three-dimensional displacements is given by a changing status of the magma chamber of the volcano happened within the time period between measurement epoch t and t + 1.Such a status change of the magma chamber can be produced by a change in its position, mass and its energy change of the intrusion which all would cause signals at the surface that can be measured by different sensors like GPS and gravitymeters.
Figure 1 describes the model,whereby the unknown parameters of the model are given by its position (the east coordinate (ξ ), the north coordinate (ψ) and the depth (ζ )), the mass (m) and the energy of the intrusion ( ) described by the product of pressure (p) and cubed radius (r).Table 1 shows the boundaries of the unknown input parameters.The parameters of the magma chamber are estimated by the model in Eq. (2) using gravity changes (dg), height displacements (uz), east displacements (ux) and north displacements (uy) at about 20 observation points arranged in a loop based network structure around the volcano.
The used forward model for estimating the unknown parameters of the source is based on the generalized static Navier equations which couple elastic and gravitational effects in a homogeneous half space, given by Love (1911) and Rundle (1980). − with u = displacement vector, φ = gravitational potential, ρ 0 = undisturbed density, G = gravitational constant, ge z = surface gravitational acceleration, σ = Poisson ratio and µ = shear modulus.Equation ( 2) is solved by Rundle (1980), Rundle (1982) for a layered homogeneous half space.Rundle (1982) and Fernández and Rundle (1994) evaluate the equations that are satisfied by the displacement vector u and perturbation potential φ by obtaining a general solution at the height z = 0.
For the computation of the most probable values of the unknown parameters a global optimization technique such as genetic algorithm (GA) has been used (Tiede, 2005), which maximizes the objective function that is given as the χ 2 value.For the estimation of the unknown parameters of Eq. ( 2), the objective function χ (comp) 2 ("comp" stands for "complete") is computed by taking all kind of changes: gravity changes as well as the three-dimensional displacements into account, Eq. ( 2).
with n as number of all observations (here 80), u as number of unknown model parameters (here 5), v as the vector of residuals between modeled and measured values and Q as covariance matrix holding the variances of the measured values on its diagonal.Within the generated sensitivity analyses, Sect.4, we use additional objective functions χ (dg) 2 , χ (ux) 2 , χ (uy) 2 , and χ (uz) 2 (based on Eq. ( 2) but only computed via gravity changes, east, north and height displacements separately).The objective functions are computed by the kind of observation which is given in brackets.Table 2 presents the observed changes in east ux, north uy, height uz and gravity dg, including their mean standard deviations σ of 20 points.Due to the small standard deviations of the measurements we just take the standard deviations into account as weights in the computation of the objective functions.But generally we do not compute an uncertainty analysis based on observational uncertainties here.It could be discussed in a subsequent paper.

Uncertainty analysis
The uncertainty of measurement is a parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand (GUM, 2008).In geodesy measurement, there are a number of sources of uncertainty, which include: parameter randomness due to geodetic processes; the lack of dense spatial measurements of geodetic parameters; uncertainty due to incomplete historical geodetic data collection, data measurement error, and unpredictability of future geodetic events; and model uncertainty attributed to the limitation of a simulation model to correctly represent the physical processes of the system.UA refers to the determination of the uncertainty in analysis results that derives from uncertainty in analysis input.Important components of uncertainty analysis include qualitative analysis that identifies the uncertainties, quantitative analysis of the effects of the uncertainties on the decision process, and communication of the uncertainty.The analysis of the uncertainty depends on the problem.
The approach for assessing parameter uncertainty involves the following steps, in Smith (2002): 1. Select a distribution to describe possible values of a parameter.
2. Generate data from this distribution.
3. Use the generated data as possible values of the parameter in the model to produce output.

Variance-based Extended Fourier Amplitude Sensitivity Test (E-FAST) SA
The main idea of variance-based global sensitivity analyses is based on the idea that one can determine the nature of the sensitivity through the variance V and then evaluate how the input variance contributes to the output variance.By setting (X 1 , ..., X k ) as the vector of independent unknown input parameters and Y = f (X 1 , ..., X k ) as the output value, with f as model function, an indicator for the importance of an input X i can be set by evaluating the variance of the output Y V (Y |X i ).This is done by fixing X i to a value is studied, whereby it is built into all possible values of x i .The variance of Y is given by where ) represents the first order effect for each factor and describes the importance of X i on the variance Y .The variance of the conditional expectation, ) is sometimes called main effect and used as an indicator of the significance of X i on the variance of Y , which is equivalent to the sensitivity of Y to X i .Normalizing the sensitivity value S i as the ratio between the variance of the expectation value and the variance of the output value leads to and is called the first order sensitivity index, correlation ratio or importance measure and describes the main effect of the unknown parameter X i on the output value Y .Related to Confalonieri et al. (2010), the total sensitivity index (TSI) corresponding to a single factor (index i) and the interaction of more factors that involve the index i and at least one index j = i from 1 to n: Both the main effects S i , the interaction terms S ij and higher-order terms could be computed by straightforward Monte-Carlo integration of multidimensional integrals.The main effect (or first order) sensitivity index measures only the main effect contribution of each input parameter on the output variance.The interactions among the input parameters are not taken into account.If the total effect on the output of input parameters is not equal to the sum of their first order effects is called interact.A model with interactions is said to be non-additive.For non-additive models information from all interactions is searched for, as well as the first order effect.For nonlinear models the sum of all first order indices can be very low.The sum of all the order effects that a parameter accounts for is called the total effect.So, for an input X i , the total sensitivity index ST i (Eq.5) is defined as the sum of all indices relating to X i (first and higher orders).
The classical FAST method, created in the 1970s by Cukier, Schaibly and others, and further developed by Koda and McRae, estimates the first-order effects.Saltelli et al. (1999) propose the E-FAST method, which computes both first-order effects and total effects.The term total means the factor's main effect and all the interaction terms of that factor.The main advantages of the E-FAST is pointed out in Saltelli et al. (1999) is that it allows the simultaneous computation of the first and total effect indices for a given factor.And it is robust at low sample size and computational efficient.
The main idea of the FAST method is to convert the multidimensional integral in X into a one-dimensional integral in s by using the transformation functions X i = G i (sin ω i s) for i = 1, ..., k. s ∈ (−π,π) is a scalar variable and ω i is a set of integer angular frequencies.The basic idea behind the computation of the total indices by the FAST method is to consider the frequencies that do not belong to the set {p 1 ω 1 ,p 2 ω 2 ,...,p k ω k }, for p i = 1,2,...,∞ and ∀i = 1,2,...,k.These frequencies contain information about the residual variance V − k i V i that is not accounted for by the first-order indices, that is, including the interactions between the factors at any order.
We assign a frequency ω i for the factor X i and a set of almost identical frequencies, but different from ω i , to all the remaining factors, denoted by ω ∼i .We use ∼ i to represent "all but i".The chosen frequencies and their harmonics have to be linear independent.Thus, by evaluating the spectrum at the frequencies ω ∼i and related higher harmonics pω i , we can compute the partial variance V ∼i .V ∼i is a measure including all the effects of any orders that do not involve the factor X i .The total index, denoted by TS(i), is computed by using the following equation 4 Results

Uncertainty analysis
In this case, the probability density functions (pdfs) of the unknown input parameters had been anticipated as uniform because it has not been possible to specify any areas or certain value ranges which are more likely than others within the given limits for the unknown input parameters given in Table 1.Furthermore, in cases with only poor prior knowledge of the unknown input parameters pdfs, Saltelli et al. (2000) also suggests a unique distribution.The E-FAST consists of 28 645 Monte Carlo samples.≤ ξ ≤ 41 • 10 3 ) for analysis leads to better fit between measurements and model.As shown in Fig. 3, the values of the χ (comp) 2 follow a homogeneous distribution over the range of mass component m values.The χ (comp) 2 can be seen as the sum of the four objective functions χ (dg) 2 , χ (ux) 2 , χ (uy) 2 and χ (uz) 2 .The effect of the mass component is overlapping, so no clear behavior can be determined out in this plot.Nevertheless, the distribution of χ (dg) 2 (see Fig. 4) against mass component m shows small value of χ (dg) 2 for small m.The dispersion is increasing with enlarging m.The analysis of the relation between χ (dg) 2 and the the mass component m results in a region 0 ≤ m ≤ 0.15 • 10 12 kg.This small dispersion together with small values of the objective function state that small mass values explain the measurements at best.

Variance-based Extended Fourier Amplitude Sensitivity Test (E-FAST) SA
The previous discussed results show first relations between unknown input parameters and the output values and will be analyzed in more detail by applying the E-FAST variancebased sensitivity analysis.Therefore, the first order effects as well as the TSIs are computed after Sect.3.2 for all five  outputs and visualized in a comparison in Fig. 5. Analyzing the differences between the first order effects and the TSIs, it can be seen that the model is clearly dominated by higher order effects.It means the model is a so-called non-additive model, and only taking the first order indices into account would lead to wrong sensitivity anticipations of the influence of a certain unknown to the different kind of output data.Furthermore, all the first order indices are smaller than 0.02.It indicates that all TSIs are driven primarily by interactions between the input parameters.
Figures 6 and 7 display the normalized E-FAST first order indices and TSIs with respect to the unknown input parameters for all kind of output values (χ (dg) 2 ,χ (uz) 2 ,χ(ux) 2 ,χ(uy) 2 ,χ(comp) 2 ) separately.The normalization is investigated according to with S() n,i = normalized first order sensitivity index of the values given in the brackets due to the specified unknown input parameter i, S() i = first order sensitivity index concerning the unknown input parameter i, TSI() n,i = normalized TSI due to the unknown input parameter i and TSI() i = total sensitivity index due to the unknown input parameter i.
The obvious changes in sensitivity indices in Figs. 6 and 7 caused by higher order effects confirm the described results in Fig. 5.
By analyzing the E-FAST TSIs, the influences on the observed values due to the unknown input parameters are obvious.All output values are almost equally sensitive to changes in the three location parameters (ξ , ψ, and ζ ).The three location parameters of the source have similar influences (approximately 25 %) on these output values.The mass m has only few influence (below 5 %) on the output values except χ (dg) 2 values (15 %).The energy effect , by contrast, has similar influences (around 15 %) on the output values except χ (dg) 2 values (6 %).Table 3. Nonnormalized E-FAST TSIs for the five defined output parameters with respect to the height component ζ with 4 different seeds (1, 500, 4000, 10 000) and the mean TSIs of 30 executions (the standard deviation see From the sensitivity analysis concerning all χ 2 values, conclusions about the estimation of the unknown input parameters can be drawn from Fig. 7: -The mass component m can be computed most effectively by χ (dg) 2 due to its large influence on this output value.The observations of gravity changes are most important for the determination of m.
-For the estimation of the energy effect all the output values are appropriate except χ (dg) 2 .
-The three location parameters east component ξ , north component ψ, and height component ζ are good to be estimated by all the measurement.E-FAST method requires two parameters: seed and number of executions.In SIMLAB (Simlab, 2010), a software designed for global uncertainty and sensitivity analysis, the random numbers are generated based on a user defined starting point (seed) (Saltelli et al., 2004).The standard deviations of the nonnormalized first order indices in Table 5 are much smaller than for the nonnormalized TSIs in Table 6.Furthermore, the TSIs of the east, north and height components are much stabler than that of the energy and mass components.The nonsignificant standard deviations of the three position parameters (east, north, and height) due to the use of different seeds become in fact negligible which shows that the sample size is large enough for these parameters.But the analysis of the energy and mass is impossible here.In consequence, we try to fix the east, north, and height components and do the SA again just taking the energy and mass as the input parameters.

Conclusions
The paper evaluates E-FAST variance-based UA and SA applied for geodetic data in order to determine a deeper insight into the behavior between the unknown input parameters of the physical based mathematical model and the modeled output values.The application of UA presents the general fit between model and data.By using E-FAST SA the influences on the observed values due to the unknown input parameters are determined and the computation of the input parameters can be drawn.In particular, the sensitivity concerning the mass and the energy for the objective function concerning the gravity changes are quite different compared to the other objective functions.Furthermore, E-FAST method is stable with varying number of the seed except for the energy and mass components: it has to be pointed out that no concrete analysis of mass and energy sensitivity is possible due to the large variance of the output when choosing different seeds.The next steps are to fix the east, north and height parameters and repeat the uncertainty and sensitivity analysis for the remaining parameters mass and energy.Unlike the local SA, the introduced global SA applied into a geodetic model gives both a quantitative result as well as the computation of the interactions between the unknown input parameters.Our results show that it would lead to large mistakes just applying local sensitivity analyses with no quantitative information.
The paper shows in addition that global sensitivity analysis helps in the analysis and setup of the optimization process of the unknown model parameters: in our case, the sensitivity analysis results in the consequence that we will first fix the three parameters east, north and height before we will get more information about the remaining parameters mass and energy.
Figures 2 and 3 show the output value χ (comp) 2 plotted against the east component ξ and the mass component m of the unknown input parameter separately.From these plots it can be evaluated whether the anticipated physical model explains the measurements at all (by evaluating the values of the objective function).If the value of the objective function is in a reasonable area, the model explains the data.

Fig. 5 .
Fig. 5. First order effects and TSIs computed by E-FAST sensitivity analysis for the Monte Carlo sampling (one run consisting of 28 645 samples).

Fig. 6 .
Fig.6.Normalized E-FAST first order indices for the five defined output parameters (one run consisting of 28 645 samples).

Fig. 7 .
Fig. 7. Normalized E-FAST Total Sensitivity Indices (TSI) for the five defined output parameters (one run consisting of 28 645 samples).

Table 1 .
Range limitations for the input parameters used for the computation.

Table 2 .
Observation changes of the points including the mean standard deviations.

Table 4 .
Nonnormalized E-FAST TSIs for the five defined output parameters with respect to the mass component m with 4 different seeds (1, 500, 4000, 10 000) and the mean TSIs of 30 executions (the standard deviation see Table6 row 5).