Improved moment scaling estimation for multifractal signals

A fundamental problem in the analysis of multifractal processes is to estimate the scaling exponent K(q) of moments of different order q from data. Conventional estimators use the empirical moments μ̂qr =〈|εr(τ )|〉 of wavelet coefficientsεr(τ ), whereτ is location andr is resolution. For stationary measures one usually considers “wavelets of order 0“ (averages), whereas for functions with multifractal increments one must use wavelets of order at least 1. One obtainsK̂(q) as the slope of log (μ̂qr ) against log(r) over a range ofr. Negative moments are sensitive to measurement noise and quantization. For them, one typically uses only the local maxima of|εr(τ )| (modulus maxima methods). For the positive moments, we modify the standard estimator K̂(q) to significantly reduce its variance at the expense of a modest increase in the bias. This is done by separately estimating K(q) from sub-records and averaging the results. For the negative moments, we show that the standard modulus maxima estimator is biased and, in the case of additive noise or quantization, is not applicable with wavelets of order 1 or higher. For these cases we propose alternative estimators. We also consider the fitting of parametric models of K(q) and show how, by splitting the record into sub-records as indicated above, the accuracy of standard methods can be significantly improved.


Introduction
The property of multifractality may be viewed as a scale invariance condition whereby, when the support of a random process X(t) is contracted by a factor r≥1 and its amplitude is multiplied by some random variable A r , the resulting process is equivalent to X(t).Hence X(t) is multifractal if where equality is in the sense of distributions (Gupta and Waymire, 1990;Veneziano, 1999).This definition of multi-Correspondence to: P. Furcolo (p.furcolo@unisa.it)fractality can be extended to random measures, random fields in spaces of higher dimension, and vector-valued processes (e.g.Meneveau et al., 1990;Schertzer and Lovejoy, 1991;Pflug et al., 1993;Veneziano, 1999).For example, for a random measure on the real line, Eq. (1) becomes where ε( ) is the average measure density in ⊂R.
An important extension of Eqs. ( 1) and (2) (Veneziano, 1999) considers generalized random processes X(ϕ), where ϕ(t) is a test or kernel function (for example a wavelet).On generalized random functions, see Yaglom (1986).The process X(ϕ) needs not have point values X(t); when it does, X(ϕ)= ∞ −∞ ϕ(t)X(t)dt.For X(ϕ), the multifractal property is expressed as X(ϕ r ) = A r X(ϕ),r ≥ 1 (3) where ϕ r (t)=rϕ(rt) is a scaled and contracted version of ϕ(t).What makes Eq. ( 3) important is that, by variously constraining the class of functions ϕ under which it holds (for example wavelets of certain orders), Eq. (3) includes a wide variety of multiscaling processes, such as processes with multifractal generalized increments of various orders and so-called multi-affine functions such as those considered by Vicsek and Barabasi (1991) and Benzi et al. (1993).
Multifractality is characterized by the distribution of the scaling factor A r , but for inference it is more convenient to work with the moments of A r or equivalently the moment scaling function K(q)=log r E A q r , which does not depend on r.The reason is that A r is not observable, but for q such that the moments µ q t =E |X(t)| q , µ q =E |ε( )| q , or µ q ϕ =E |X(ϕ)| q exist, one may estimate K(q) using the relationships µ q t/r ∝ r K(q) , µ q /r ∝ r K(q) , µ q ϕ r ∝ r K(q) (4) which hold for any given t, or ϕ.
Published by Copernicus Publications on behalf of the European Geosciences Union and the American Geophysical Union.
D. Veneziano and P. Furcolo: Improved moment scaling estimation for multifractal signals Equation (4) follows directly from Eqs. ( 1)-( 3) and the scaling of the moments of A r , but its use is constrained by three factors: 1.For some values of q, say q≥q * , E A q r may be finite but the moments µ q rt , µ q r or µ q ϕ may not exist.This is a well-known phenomenon (e.g.Mandelbrot, 1974;Schertzer and Lovejoy, 1987).The critical moment order q * depends on the K(q) function (Kahane and Peyriere, 1976).Similar problems of moment divergence are often encountered for q<0; 2. Equation (4) holds for the theoretical moments.The empirical moments, even at very high resolutions r, may differ significantly from the theoretical moments and not satisfy Eq. ( 4).This is associated with the so-called linearization phenomenon, whereby the K(q) function extracted from the empirical moments becomes linear in q above some moment order q + <q * (e.g.Ossiander andWaymire, 2000, 2002;Lashermes et al., 2004).In Sect.2, we describe in some detail this phenomenon and the calculation of q + from K(q); 3. The empirical moments may be heavily distorted by noise or other data artifacts (e.g.Harris et al., 1997).
Our objective is to evaluate and improve current estimators of K(q) for positive or negative q, including cases when the data are exact, corrupted by noise, or degraded by quantization.For numerical illustration and performance assessment, we consider stationary non-negative multifractal measures in the unit interval.These include discrete multiplicative cascades (Mandelbrot's cascades) as well as stationary continuous-scaling measures (conserved universal multifractal processes).In either case, A r is assumed to have lognormal distribution with unit mean.We use ε n to denote the average measure density in a generic sub-interval of length 2 −n where n is the resolution level and r=2 n is the resolution.The non-diverging moments of ε n , µ q n =E ε q n , satisfy µ q n ∝2 nK (q) .We have obtained results also for some processes with multifractal increments, but results are similar to those for the discrete cascades and stationary measures mentioned above and are not presented.It should be noted that, although discrete multiplicative cascades have many limitations as models of physical phenomena, they are pedagogically very attractive due to their simplicity of construction and simulation and the fact that, unlike "continuous scaling" processes, they satisfy the scale invariance condition in Eq. ( 2) exactly (over the diadic partition of the unit interval).
The function K(q) is usually estimated in two steps (Harris et al., 1997;Lashermes et al., 2004).In the first step one calculates the empirical moments μq n = ε q n for a range of n and q and for each q one estimates K(q) as the slope of the linear regression of log 2 ( μq n ) against n over a range [n min ,n max ] of resolution levels.The maximum resolution level n max is generally taken as the resolution level of the data, n dat .This operation is performed without assuming any parametric form of K(q); hence we refer to K(q) as a nonparametric estimator.One may generalize this basic estimator by replacing the above local averages with the absolute wavelet coefficients ε n =2 n | ϕ(2 n t)X(dt)|, where ϕ is a wavelet of order 0 or higher (the order is the lowest integer q such that ϕ q (t)dt =0).Hence a wavelet of order 0 has non-zero integral.Wavelets of order 0 are appropriate for the multifractal measures considered here.The use of higher-order wavelets (generalized differences of different order) poses problems for the negative moments, because the probability density of ε n at zero becomes nonzero and at least the moments of order q≤ − 1 diverge.
In the second step, one may use various criteria to fit a parametric model K par (q) to K(q).For example, one may fit K par (q) by least squares over a range [q min ,q max ] of moment orders.In this case the estimator K(q) depends on [n min ,n dat ] and the estimator Kpar (q) depends additionally on [q min ,q max ].
The estimation of K(q) is generally more challenging for negative than for positive moments.The reason is that the negative moments are sensitive to values of ε n close to zero and therefore are strongly affected by additive noise, data quantization and other measurement artifacts.As mentioned above, for wavelets of order 1 there are problems with the estimation of K(q) for q≤ − 1, also when the data are not corrupted.In all these cases one must use special techniques that avoid low values of ε n (e.g.Muzy et al., 1991;Bacry et al., 1993), but in general the accuracy of estimation remains lower than for the positive moments.This is why parametric models K par (q) are typically fitted to K(q) over a range of positive moments.
The bias and variance of the nonparametric estimator K(q) depend on the true K(q) function, the length and resolution of the record, the accuracy of the data, the wavelet order used, the parameters [n min ,n dat ], and the moment order q.The same is true for the parametric estimator, with q replaced by the moment range [q min ,q max ].We find how the bias and variance of various estimators vary with the above parameters when A r in Eq. ( 2) has lognormal distribution, but our qualitative conclusions hold also for different distributions of A r .In the lognormal case the variance of ln(A r ) depends on r as Var[ln(A r )]=2C 1 ln(r) where 0<C 1 <1 controls the variability of the measure and the function K(q) is given by (Schertzer and Lovejoy, 1987) The paper is organized as follows.Section 2 reviews what is known about the performance of the nonparametric estimator K(q).Section 3 introduces and evaluates a broader class of estimators that aim at reducing the error variance and the RMS error.The idea is to increase the effective number of independent realizations by partitioning the available record into a number of sub-records, then separately estimate K(q) for each sub-record, and finally average the results.As we show in Sect.3, this operation is effective in reducing the mean square error of the estimator and works well also for negative moments, if the data are error-free and one uses zero-order wavelets (e.g.local averages).Section 4 focuses on the estimation of K(q) for negative moments when the data are corrupted by errors or one uses wavelets of order 1.We show that techniques based on the wavelet transform modulus maxima (WTMM) are biased and suggest alternatives.The alternative methods have parameters that can be adjusted to focus on values of ε n at different distances from zero, a feature that is useful to deal with data corrupted by different noise levels or with different degrees of quantization.
Issues of parametric estimation are examined in Sect. 5. We show how by using the nonparametric estimators introduced in Sect. 3 one can significantly improve the accuracy of standard parametric estimators.The main conclusions are summarized in Sect.6.

The standard nonparametric estimator K(q)
An interesting property of the nonparametric estimator K(q) is that with probability 1, as the resolution level of the observations n dat →∞, K(q) becomes linear with some positive slope γ + above some order q + >1 and with some negative slope γ − below some order q − <0 (Ossiander andWaymire, 2000, 2002;Lashermes et al., 2004).The moment orders q ± and slopes γ ± are found geometrically by drawing the tangents to K(q) with Y -intercept equal to −1. γ + and γ − are the slopes of these tangent lines and q + and q − are the values of q at the points of tangency; see Fig. 1.Hence q ± and γ ± can be calculated analytically by solving where K is the derivative of K.For K(q) in Eq. ( 5), one finds Next we elaborate on the "straightening" of K(q) for q>q + , but similar considerations apply to the negative moments for q<q − .The straightening of K(q) above q + is due to the nonobservability of extreme singularities in a sample.As the resolution level n→∞, the ensemble moments µ q n =E ε q n are dominated by values of ε n on the order of 2 nK (q) , but in a finite number s of realizations, as the level n→∞ these values occur with probability one for q<q + and with probability zero for q>q + (Ossiander andWaymire, 2000, 2002;Lashermes et al., 2004;Veneziano et al., 2006).It follows q K(q) -1 q + q - 0 1 + -Fig. 1. Illustration of linearization of the estimator K(q).As n dat →∞, K(q) becomes linear with probability 1 above q + and below q − .that with probaiblity 1 the sample maximum of ε n , ε n,max , satisfies These sample maxima control the empirical moments μq n for q>q + , making them behave like μq n ∼2 n(qγ + −1) and causing the estimator K(q) to become linear with slope γ + for q>q + (with probability 1).As mentioned in the Introduction, this linearization phenomenon occurs at moment orders q + below the order q * of moment divergence.Therefore, what is of concern in practice is linearization due to non-observability of high-order singularities not linearization from divergence of the moments.
The above applies as the maximum resolution level n dat →∞.For finite n dat , K(q) displays gradual straightening as q increases, becoming asymptotically linear as q→∞.The asymptotic slope for finite n dat , say γ + , differs from γ + in Eq. ( 8) and is obtained from the sample maxima ε n,max as the slope of the linear regression of log 2 ε n,max against n in the range [n min ,n dat ].Transition to linearity might be considered to occur at the moment order q+ at which the sample maxima ε n,max contribute some fixed fraction of the empirical moments μq n (for alternative but essentially similar criteria, see Harris et al., 1997).The maxima ε n,max vary significantly from realization to realization (Harris et al., 1997;Lashermes et al., 2004); hence also the transition order q+ and the asymptotic slope γ + are realization-dependent.
For n dat finite, theoretical evaluation of the bias and variance of K(q) is difficult.Wendt andAbry (2006, 2007) and Wendt et al. (2007) estimated these quantities through asymtotic expansion or bootstrap, whereas Harris et al. (1997) used numerical simulation of discrete cascades to show how the bias and variance depend on n dat (in Harris et al., 1997, n min is close to zero and what is effectively varied is n dat ).The results confirm that for q>1 the estimator K(q) is biased towards low values and the bias and variance increase as q increases or n dat decreases.D. Veneziano and P. Furcolo: Improved moment scaling estimation for multifractal signals The bias is contributed by two sources: one is the finiteness of the sample and the other is the dependence among the empirical moments estimated from the same sample at different scales.The first source of bias may be understood as follows.The empirical moments μq n are unbiased with nonzero variance.Since the logarithmic function is concave, the log moments log 2 μq n , which are the quantities used in the regression that produces K(q), have negative bias.The bias tends to increase as the moment order q increases, causing the slope of the regression to be negatively biased.For q<q + , K(q) becomes asymptotically unbiased as the number of realizations s→∞.
To understand the second source of bias of K(q), consider the maximally biased case when a single realization ( s=1) is observed at level n dat =1.In this case the data consist of just two values, ε 1,1 and ε 1,2 , being the average measure densities in the first half and second half of the unit interval, and the estimator becomes where ε 0 =0.5 ε 1,1 +ε 1,2 is the average measure density in the unit interval and the prime sign indicates normalization by ε 0 .Normalization transforms the measure densities ε 1,1 and ε 1,2 into the partition coefficients ε 1,1 and ε 1,2 from level 0 to level 1.These partition coefficients are always between 0 and 2 and therefore have finite moments of all positive orders q.For A r lognormal with parameter C 1 =0.1, Fig. 2 compares the theoretical function K(q) in Eq. ( 5) with the log 2 of the moments of the one-step partition coefficient ε 1 =ε 1 /ε 0 .For the latter, we first calculated the distribution of ε 1 using a numerical procedure of the type in Veneziano and Furcolo (2003) and then obtained the moments E ε 1 q numerically.Figure 2 shows that, for q>1, log 2 E ε 1 q is significantly smaller than K(q).The case n dat =1 produces maximally biased estimators.As the next section shows, the bias decreases as n dat increases.

Improved nonparametric estimation of K(q) from defect-free data
With the main objective of reducing the error variance, we introduce a class of nonparametric estimators of K(q) that includes the conventional estimator K(q) as a special case.
The focus here is on the positive moments, but the estimators apply also to the negative moments if the data are defect-free and one uses order 0 wavelets.Suppose that the available record consists of a single measure realization at some finite resolution level n dat , as in Harris et al. (1997), and that the data are not contaminated by noise or quantized.The variance of K(q) is contributed in part by the strong correlation among the measure densities ε n in different intervals at the same or different resolution 5) for C 1 =0.1 with the logmoments log 2 E ε 1 q of the step-one partition coefficient ε 1 .
levels n.Therefore the variance would be reduced if one could increase the number of independent samples or reduce the correlations.To achieve this objective, we partition the available record into sub-records, each rooted in a tile at a fixed resolution level n o <n dat ; see Fig. 3 for a schematic illustration.We find Kj (q) from each sub-record j =1,...,2 n o using the conventional nonparametric estimator over a range of resolution levels [n min ,n dat ] for some n min ≥n o , and finally average the results to obtain For any given n dat , the "average estimator" K(q) depends on (n o ,n min ) or equivalently (n o , n) where n=n min −n o .For n o =0, K(q) reduces to the standard estimator K(q).A reduction in the error variance is brought about by the fact that the sub-record estimators Kj (q) may be considered as mutually independent.To show the independence property for the case of discrete cascades, note that the standard estimator K(q) is unaffected by multiplication of the record by a positive constant.Hence Kj (q) does not vary if one divides all the average densities ε n in the j -th sub-record by ε n o ,j , the average measure density for that entire sub-record.Since in discrete cascades the normalized sub-records are independent, the estimators Kj (q) are also independent.For continuous multifractal processes, the sub-record estimators Kj (q) are serially correlated.However, we have found through simulation that the correlation is at most 0.04 at j -lag 1 and is otherwise very close to zero.Therefore, in practice the estimators Kj (q) for different j may be taken as independent.
The price one pays for the increase in the number of independent samples from 1 to 2 n o is that the sub-records have a reduced resolution range n dat −n o compared with the range n dat of the original data, and this causes the bias to increase somewhat.This operation of sub-sampling and averaging is formally analogous to a variance-reduction technique used to estimate the power spectrum of stationary signals (Press et  al., 1992).In spectral analysis, one may use other variancereduction methods such as smoothing the empirical spectral density over a range of frequencies, but these alternative methods do not work well with K(q) due to the high correlation of K(q) for different q.
Following the approach of Harris et al. (1997) for K(q), we estimate the bias and error variance of K(q) in Eq. ( 10) by simulating a large number of discrete cascade realizations in the unit interval.For K(q) in Eq. ( 5), the bias, variance and root mean square error (RMSE) of K(q) depend on the multifractal parameter C 1 , the moment order q, the resolution level of the data n dat and the parameters (n o , n) of the estimator.Figure 4 shows results for C 1 =0.1, which for example is a plausible value for rainfall during rainstorms (Langousis and Veneziano, 2007), q=−2, 2, 3 and 4, n dat =14, and all possible combinations of n o =0(2)12 and n=0,...,13−n o .The remaining number of levels, 14−n o − n, is the range used to fit the regressions whose slope defines the estimators Kj (q).Notice that for C 1 =0.1, q=4 exceeds the moment order q + = √ 10=3.162above which K(q) tends to be linear.All results in Fig. 4 are based on 1000 fully dressed discretecascade simulations.Dressing is accomplished by first calculating the distribution F Z of the dressing factor Z by the procedure of Veneziano and Furcolo (2003) and then generating independent samples Z i =F −1 Z (U i ), where the U i are independent variables with uniform distribution in the unit interval.
Each panel of Fig. 4 includes several plots, one for each value of n o , whereas n varies along the horizontal axis from 0 to 13−n o .Therefore, the longer plots are associated with smaller values of n o .The longest plots, for n o =0, correspond to the traditional estimator K(q).In particular, the estimator with n o = n=0, which is the one most frequently used in practice, is marked with a star.
The bias increases when either n o increases or n decreases.The reason is that either of these changes makes the estimator Kj (q) rely more heavily on moments at reso- lution levels near the "root" of the j -th sub-record.As one approaches the root level, the traditional estimator reflects the moment scaling of the partition coefficients and is maximally biased; see Eq. ( 9) and Fig. 2. The estimator variance decreases with increasing n o (due to the larger number 2 n o of independent sub-records) but is insensitive to n, for the following reason.Lowering n increases the range of resolution levels used in the regressions, but the added levels correspond to low resolutions for which the moment estimates are less accurate.Hence it may happen that adding resolution levels increases the variance of the least-squares regression slope.We have investigated the use of weighted rather than ordinary least squares with weights that depend on the accuracy of the log-moment estimates and found that there is some gain in accuracy, but the gain hardly justifies the added complexity of the procedure.These qualitative traits of the bias and variance of K(q) hold for all positive and negative moment orders q.
The behavior of the RMSE is more complicated because the bias and variance dominate the RMSE over different ranges of n o and n (see schematic illustration in one of www.nonlin-processes-geophys.net/16/641/2009/ Nonlin.Processes Geophys., 16, 641-653, 2009 the panels of Fig. 4), which further depend somewhat on q.
In general, the bias dominates for large n o and small n, whereas the variance dominates for small n o and large n.
The importance of the bias increases as the moment order increases.
In Fig. 4, the multifractal parameter C 1 and the resolution level of the observations n dat are kept fixed.We have made similar analyses with different values of these parameters.The results (not shown) are qualitatively similar, although as expected the bias and variance tend to increase as C 1 increases or n dat decreases.The optimum value of n o varies mainly with n dat as n o ≈0.5n dat −1 and the optimum of n is such that n min = n o + n≈n dat −2.We call K(q) with these parameter setting the optimal K(q) estimator.The solid lines in Fig. 5 show the relative RMS error of the optimal estimator (the RMS error divided by the actual value of K(q)) for different q, C 1 =0.05, 0.1, 0.15 and 0.2, and n dat =8, 10, 12 and 14.For comparison, the relative RMS errors of the standard estimator K(q) (this is the estimator K(q) for n min =n o =0) are shown as dashed lines.Increasing the multifractal parameter C 1 and decreasing the resolution of the observations have deleterious effects on both estimators.The same is true for increasing the moment order q.Note that the estimators tend to become linear above q + =1/ √ C 1 , which in Fig. 5 is marked by a vertical dashed line; hence in these high-q regions the RMS error is large.For q=2 and records of length around 4000 (n dat about 12), the relative RMS error of the optimal estimator is 4.7%, 7.2%, 10.9% and 15.1% when C 1 =0.05, 0.1, 0.15 and 0.2, respectively.For the standard estimator, these relative errors are much higher (16.2%, 19.3%, 21.6% and 24.0%).
It is emphasized that we have optimized n o and n for the case when only one multifractal realization is observed.In many practical cases, the available record includes many independent realizations.The higher sample size does not affect the bias of K(q) and K(q), but reduces the error variance and therefore makes small-bias, large-variance estimators like K(q) more appealing.
4 Improved nonparametric estimation of K(q) from noisy or quantized data Next we consider the estimation of K(q) from data affected by imperfections like additive noise, multiplicative noise, or quantization.Additive noise and quantization influence mainly the estimates of K(q) for negative moments; hence the primary focus of this section is on q<0.This problem has been extensively discussed in the literature; see for example (Harris et al., 1997;Muzy et al., 1991;Bacry et al., 1993).The basic strategy is to use values of ε n away from zero, while being careful that the operation of screening out the lower values does not alter the scaling of the moments.
Fig. 5. Comparison of the relative RMS error of the optimal average estimator K(q) (solid lines) and the standard estimator K(q) (dashed lines).Different lines correspond to observations at different resolution levels n dat =8, 10, 12 and 14, increasing from above.
Results are based on 1000 independent simulations.

Standard WTMM estimators
A popular technique (Muzy et al., 1991;Bacry et al., 1993;Struzik, 2000) is to first obtain the continuous wavelet transform where φ is a wavelet of order 0 or higher and then calculate the absolute moments using only the local maxima of |ε n (τ )|.The use of wavelets of order 1 or higher is problematic, because their absolute coefficients |ε n (τ )| have nonzero probability density at zero; hence there is a higher probability of finding local maxima that are very close to zero than for wavelets of order 0. For stationary multifractal measures one can (and should) use 0-order wavelets, whereas for processes X(t) with stationary multifractal increments this is not an option.Here we consider stationary multifractal measures X(dt) in the unit interval.
In practice, X(dt) is sampled at some finite resolution level n dat and the wavelet coefficients in Eq. ( 11) are calculated for 2 n dat values of τ , equally spaced between 0 and 1.We still refer to this as the continuous wavelet transform (CWT) of the signal.The procedure that derives K(q) from the local maxima of CWT is known as wavelet transform modulus maxima (WTMM) method.Several variants of the WTMM method exist, with the general aim to avoid local maxima that are still too close to zero and therefore are affected by data inaccuracies.The most popular method ("WTMM-with-sup") connects the local maxima at different resolution levels n to form modulus maxima lines.where the supremum is along the maxima line that passes through the local maximum point (Muzy et al., 1993(Muzy et al., , 1994;;Arneodo et al., 1995).We consider explicitly only the WTMM method, but the qualitative results should extend to variants of that method, including WTMM-with-sup.Like the ordinary estimator K(q), the WTMM estimator KWTMM (q) depends on the true function K(q) and the range of resolution levels (n min ,n dat ) used in the regression.The WTMM estimator has two bias components.One component is similar to that of the standard estimator K(q) and is due to dependence among the values of ε n (and hence of the local maxima) for different n.The other component is due to a symmetry breaking of the CWT transform: in order not to affect scaling, the relative amount of overlapping of the wavelet functions used for the calculation of two consecutive coefficients should not vary with the scale (similar concerns have been expressed by Lashermes et al., 2005).This means that the number of values of ε n (τ ) from which the local maxima are extracted should vary with n as 2 n , whereas in the WTMM method that number is constant and equal to 2 n dat .This lack of scale invariance in the identification of the local maxima alters the scaling of the moments and biases the estimate of K(q); see below for a numerical illustration.
As noted above, another problem is that the WTMM procedure is ineffective in filtering out ε n (τ ) values that are close to zero.Alternatives like WTMM-with-sup are better in this regard, but suffer from other limitations: the additional symmetry breaking caused by the fact that one takes the supremum over a range of resolution levels [n,n dat ] that narrows as n increases, the fact that the method does not apply when |ε n | along the maxima lines tends to increase with increasing n and is not effective in the opposite case, the more complicated implementation etc.

Unbiased WTMM estimators
In formulating alternatives to the above WTMM methods, one should pursue three objectives: avoid distortions of the K(q) function, retain simplicity, and include tunable parameters to focus on values of |ε n (τ )| that are at different "distances" from zero.This last feature allows one to obtain accurate estimates of K(q) under different levels of data corruption, for example different additive noise levels or degrees of coarseness of the quantization.To avoid distortions, one should subject the wavelet transform to a filtering operation that preserves the scaling of the moments.For example, one can avoid the second source of bias of KWTMM (q) mentioned above by using the discrete wavelet transform whereby one retains only the coefficients ε n (j 2 −n ) for j =1,...,2 n .The estimator of K(q) that uses the local maxima of these coefficients is referred to as the scale-invariant WTMM estimator or simply the "scale-invariant estimator".Since in this case one can extract the local maxima only at resolution levels n≥2, one must set n min ≥2.
To reduce the number of local maxima that are close to zero, one could eliminate a given fraction of lowest local maxima within non-overlapping blocks of j of size 2 2+k , where k is a positive integer.This operation produces the "thinned scale-invariant estimator".
A third possibility is to use the maximum of |ε n j 2 −n | in each non-overlapping block of 2 values of j , where is a positive integer acting as a tuning parameter (larger values of let one focus on increasingly higher values of |ε n |, although the number of maxima retained for moment analysis decreases).More in general, one could retain the largest 1≤m≤2 values of ε n j 2 −n within each 2 block.Preliminary analyses have shown that the (m, ) estimator is rather insensitive to m and and that retaining 1/8 to 1/4 of the original wavelet coefficients is about optimal.We refer to this as the (m, ) WTMM estimator or simply the "(m, ) estimator".In this case it must be n min ≥ .For a schematic illustration of the continuous WTMM estimator, the scaleinvariant estimator, the thinned scale-invariant estimator, and the (m, ) estimator, see Fig. 6.

Performance of various WTMM estimators
Below we assess and compare the performance of these four estimators under perfect and imperfect data conditions.Since WTMM estimators operate on the local maxima of |ε n | in time, it is desirable that the measure X(dt) be stationary.This excludes discrete cascades.The measures we simulate are stationary lognormal multifractal measures on the unit interval, with C 1 =0.1 and one-sided log-spectral density function S ln(X) (ω)=0.1ω−1 over the scaling range.Simulation of ln(X) is performed in Fourier space: for each 2fold increase of ω starting from ω min =π•2 −2 and ending at ω max =π•2 14 , we use 2 4 waves with logarithmically spaced frequencies.We have used the simulations at two resolution levels: n dat =14 and n dat =8.
The generation of well-dressed continuous MF processes at high resolution (e.g.n dat =14) can be computationally very demanding.To dress our simulations for n dat =14 we used the following ad-hoc procedure.The signal was first (over)sampled at intervals of length 2 −(14+3) and then averaged in groups of 8 to achieve the desired resolution of 2 14 .We call this the "bare" process.To generate a sample of "dressing factors", we increased the resolution of the simulation in 10 000 intervals of size 2 −14 by a factor of 2 10 and took the ratios between the averages of the high resolution signal and the "bare" values as randomly simulated dressing factors.Finally, we obtained the dressed simulation by applying randomly selected "dressing factors" from the pool of 10 000 to the "bare" signal.
The parameters of the estimators have been set as follows.For the (m, ) WTMM estimator, in analyses with n dat =14 we have set m=4 and =5 (meaning that we select the 4 largest values from each non-overlapping block of 32 coefficients) and in those with n dat =8, we have set m=1  and =3 (thus retaining the largest value within each block of 8 coefficients).If the increments of |ε n (j 2 −n )| had independent signs, on average one fourth of the wavelet coefficients would be local maxima, whereas the above (m, ) estimators retain only one eight of the values.Due to this higher selectivity, one may anticipate the (m, ) estimator to be less affected by data inaccuracies but have larger variance than the scale-invariant estimator.To compare the (m, ) estimator with an estimator that uses approximately the same number of wavelet coefficients, in the thinned scale-invariant estimator we eliminate 50% of the local maxima within nonoverlapping j blocks of size 32.We set n min , the lower resolution level for the regression, to 2 for the WTMM and scaleinvariant estimators and to 5 for the thinned scale-invariant and (m, ) estimators.Finally, we set the upper resolution level for the regression n max to the upper limit of the moment scaling range.This upper limit coincides in some cases with n dat , the resolution level of the data, but in general depends on the estimator and the type and level of data corruption; see below.The analyzing wavelets are of order 0 (corresponding to local averaging) or order 1 (Haar wavelets).We apply the above estimators to noiseless data and simulated records with various types and levels of corruption.The uncorrupted process is a continuous-scaling stationary lognormal multifractal measure in the unit interval.Data corruption is always applied at the observation level n dat , as follows.In the case of additive noise, we add independent zeromean Gaussian variables with standard deviation equal to ei- ther 5% or 10% of the standard deviation of ε n dat (setting to zero any resulting negative value).In the case of multiplicative noise, we multiply the ε n dat values by independent lognormal random variables with mean value 1 and log-standard deviation equal to either 20% or 50% of the log-standard deviation of the noiseless values.Finally, quantization is introduced by assuming a resolution of the recording instrument equal to either the 5-th or 20-th percentile of the actual ε n dat values.
Figure 7 shows plots of the moments of order q=−1, −2, −3 and −4 for the four estimators when C 1 =0.1, n dat = 14, and the data are noiseless (top row) or variously defective, with defects set at "low levels".The analyzing wavelet is of order 0. The values plotted in Fig. 7 are the averages of the log of the moments obtained in 1000 independent simulations.Therefore the plots are affected by small statistical variability and can be used to assess systematic deviations from moment scaling.The open dots indicate the range (n min , n max ) used in each case to estimate the K(q) function.
In general, there is a progression towards increased linearity of the plots as one moves to the right, from the continuous WTMM estimator to the (m, ) estimator.Of the various forms of data corruption, additive noise and quantization are the most disruptive of moment scaling, especially at the high resolutions (because the high-resolution values can be very close to zero).These qualitative observations hold also for other values of n dat and for the higher levels of data corruption but, as we shall see later, not for analyzing wavelets of order 1.Figures 8 and 9 compare the performance of the same four estimators, applied to a single measure realization with Quantization, 20% q Fig. 8. m±σ intervals for the estimators in Fig. 6, applied to a single measure realization with C 1 =0.1 observed at resolution n dat =14.The panel at the top is for noiseless data.The bottom three rows consider variously corrupted data.The type and level of corruption are indicated in each panel; see text for details.All results are based on 1000 measure simulations.
C 1 =0.1, when observations are at resolution level n dat =14 (Fig. 8) or resolution level n dat =8 (Fig. 9).Also included are results for low and high levels of data corruption (left and right columns in each figure).In each panel, the continuous black line is the actual K(q) function and the other lines are the mean values of the different estimators, obtained from 1000 independent simulations.The vertical bars are m±σ intervals for the estimators.The panel at the top, for noiseless data, shows the maximum achievable performance of the estimators.The plots extend mainly over the negative moment orders.Results for positive moments up to order 2 mainly confirm that for q>0 all the estimators perform well and are insensitive to data artifacts.In general, the continuous WTMM estimator has smaller variance but significantly higher bias than the other estimators.This is especially true when the record covers a narrow range of scales; see Fig. 9.The other estimators have generally comparable performance, with the (m, ) estimator having lower bias but slightly higher variance than the scale-invariant and thinned scale-invariant estimators.Multiplicative noise, even at the higher level, produces minimal deterioration in the performance of all estimators. -1 Quantization, 20% q Fig. 9. Same as Fig. 8 for n dat =8.The reference straight lines have the theoretical slope K(q) from Eq. ( 5) with C 1 =0.1.The measures are observed at resolution level n dat =14 or 8.
To understand the sources of bias of the continuous WTMM estimator especially in the case of low-resolution records, Fig. 10 shows moment plots of the type in Fig. 7 for q=−2 and −4, comparing the theoretical slope K(q) (thin straight lines) with the marginal moments of the ε n values (stars; these are the moments used by the standard estimator K(q)), the moments of the local maxima used by the continuous WTMM estimator (open circles), and the moments of the local maxima used by the scale-invariant estimator (diawww.nonlin-processes-geophys.net/16/641/2009/ Nonlin.Processes Geophys., 16, 641-653, 2009 monds).The resolution level of the observations, n dat , is 14 in Fig. 10a and 8 in Fig. 10b.In all cases, the moments are log-averages over 1000 independent simulations with noiseless data.Therefore the moments for the continuous WTMM and scale-invariant estimators are the same as those in the first row of Fig. 7.It is interesting that, over a range of low resolutions, the continuous WTMM moments are practically identical to the marginal moments.This means that the distribution of the local maxima is virtually the same as the marginal distribution (due to the high degree of overlapping of the averaging windows in the continuous wavelet transform).In this scale range the continuous WTMM method displays scaling but is ineffective and retains the bias of the ordinary estimator.At higher resolutions, the WTMM moments do not scale due to the lack of scale invariance of the maximum-extraction procedure that was noted earlier in this section.By contrast, the moments of the scale-invariant method scale over all resolutions, with an exponent that is much closer to the theoretical slope K(q).The lack of scaling of the WTMM method is especially severe when the data are at low resolution (Fig. 10b), explaining the high negative bias of that method in Fig. 9. Results using Haar wavelets of order 1 are shown in Figs.11 and 12, in both cases limited to n dat =14. Figure 11 is analogous to Fig. 7 and Fig. 12 is analogous to Fig. 8.The main difference between using wavelets of order 0 or 1 is that in the latter case the absolute wavelet coefficients have nonzero probability density at zero.This makes the estimation of K(q) for negative q more difficult, especially for methods that are ineffective at filtering out the low values.This is why in Figs.11 and 12 the continuous WTMM method produces nonsensical results.The performance of the other methods is somewhat deteriorated relative to the order 0 case, although these methods do not break down.
5 Improved parametric estimation of K(q) Harris et al. (1997) discuss at length the selection of the positive moment-order range [q min ,q max ] such that the parametric estimator Kpar (q), fitted by least-squares to K(q) inside that range, is minimally affected by the straightening of K(q) for large q.They set q max =q + , the moment order above which K(q) is essentially linear, and suggest two specific ways to obtain estimates q+ from a realization.Here we consider the same problem in the context of the lognormal model in Eq. ( 5) (hence the problem of estimating C 1 ), with the difference that we replace K(q) with the average estimator K(q) in Eq. ( 10) and compare two different estimators of C 1 , one obtained from LS fitting of K(q) within some range [q min ,q max ] and the other with C 1 estimated as the slope of K(q) at q=1.We refer to these as the LS estimator and Slope estimator of C 1 , respectively.Due to the superior performance of K(q), one may expect significant gains in accuracy from using K(q) instead of K(q).In addition to choosing the range [q min ,q max ] for the LS estimator, one must select the parameters n o and n min of K(q).Three choices are especially For the range of moment orders used in the LS estimator, we set q min =0 and select q max such that the slope of the estimator K(q) at q max is a fixed fraction of the estimated asymptotic slope γ + .The latter slope is obtained by regressing the logs of the moments of the maximum measurements ε n,max at different resolution levels n, as explained in Sect. 2. In the numerical results shown below, the fraction is 0.5.Therefore our q max is about one half the value used in Harris et al. (1997).
Figure 13 compares results for different true values of C 1 (which varies along the horizontal axis), different resolutions of the data (different columns), different parameters of K(q) in Eq. ( 11) (different rows), and different ways of fitting the parametric function in Eq. ( 5) (solid lines for the LS estimator, dotted lines for the Slope estimator).The data are noiseless and the results are based on 1000 independent simulations.The vertical bars are m±σ intervals.As expected, all estimators perform much better for n dat =14, since the record is 2 6 =64 times longer than for n dat =8.This is especially true for the parameter settings (a) and (b) in Eq. ( 12), since these settings take advantage of the higher resolution to split the record into a larger number of independent sub-records and by so doing reduce the estimator variance.The gain in accuracy is not as high in the case of the standard estimator Nonlin.Processes Geophys., 16,[641][642][643][644][645][646][647][648][649][650][651][652][653]2009 www.nonlin-processes-geophys.net/16/641/2009/ D. Veneziano and P. Furcolo: Improved moment scaling estimation for multifractal signals 651 Quantization, 20% q Fig. 12. Same as Fig. 8 for Haar wavelets of order 1.
(Eq. 12c).Also notice that the settings (b) and (c) produce nearly unbiased estimates, whereas there is a large bias in case (a).The LS and Slope estimators have very similar performance levels and in fact are nearly identical (for any given setting in Eq. 12, any C 1 and any n dat , the correlation coefficient between them is about 0.95).Since one of the estimators is local and the other is global, this is an indication that the parametric estimators are highly efficient in using information from K(q).Due to its greater simplicity, the Slope estimator is more attractive, but the LS estimator is more general and can be used in cases when K(q) has several unknown parameters.Overall, in terms of the RMS error, estimator (b) performs the best.However, one can devise an even better estimator by correcting the minimum-variance estimator (a) for bias.In our experiment with lognormal MF processes, this is done by finding the value of C 1 for which the expected estimator E Ĉ1 equals the actual estimate.The top panels in Fig. 13 show that, for estimator (a), Ĉ1 underestimates C 1 by approximately 20%, independently of n dat .Therefore, biascorrected estimators are given by Ĉ1,nobias = 1.20 Ĉ1(a) where Ĉ1(a) is the estimator Ĉ1 under setting (a) in Eq. ( 12).These bias-corrected estimators are far superior to the standard parametric estimators (c).The bottom row (c) is for the standard estimator K(q) ( K(q) with n o =n min =0).Results are based on 1000 independent simulations.

Conclusions
We have revisited the classical problem of estimating the moment-scaling function K(q) of multifractal signals from data.Specifically, we have focused on stationary multifractal measures (discrete-cascades or continuous-scaling), observed at different resolution levels with or without noise or quantization.For these measures and under these data conditions, we have sought to evaluate and improve existing nonparametric estimators of K(q) for positive and negative moment orders q, as well as existing parametric estimators.All the estimators considered are based on the absolute wavelet coefficients ε n at different resolution levels n calculated from the given record.We use wavelets of order 0 (in this case ε n is just a local average) or wavelets of order 1 (of the Haar type).Numerical results have been obtained by simulating "lognormal measures" with K(q) given by Eq. ( 5).
In parametric estimation, we have assumed K(q) to satisfy Eq. ( 5) with C 1 unknown.1.When the data are accurately recorded (no noise or quantization), one can improve the performance of the standard estimator K(q), which is given by the slope of the least squares regression of log 2 ε q n against n over the resolution range of the data.The improvement is obtained by partitioning the record into sub-records, calculating K(q) for each sub-record, and averaging the results.The resulting average estimator K(q) has two parameters, one that determines the number of sub-records and the other that controls the range of moment orders used to estimate K(q) from each sub-record.Based on extensive simulations, we have determined a simple rule to set these parameters to minimize the RMS error of K(q).Different settings minimize the variance and the bias.As Fig. 5 shows, K(q) is much more accurate than K(q).2. For the positive moments, K(q) performs well also when the data are noisy or quantized and with analyzing wavelets of order 0 or 1.By contrast, for q<0, K(q) should be used only with wavelets of order 0 and data that are perfectly recorded or are corrupted by multiplicative noise.The reason for these restrictions is that in all other cases the wavelet coefficients ε n have nonzero probability density at 0 and their absolute moments of order q≤ − 1 diverge.The standard way to deal with this problem is to calculate the scaling of the moments of the local maxima of |ε n | (WTMM method and variants).We have shown that the WTMM estimator has several drawbacks, also in the case of perfect data recording and wavelets of order 0: at high resolution levels n, the moments do not scale because the local maxima are extracted in a way that violates scale invariance.For smaller n, the method is ineffective in filtering out the values of ε n that are close to zero, so that the negative modulus-maxima moments are virtually identical to the marginal wavelet transform moments.In the case of wavelets of order 1, the latter drawback is even more severe.We have suggested various alternatives to the WTMM method that are essentially unbiased and perform well also with noisy or quantized data and wavelets of either order.
3. For parametric estimation, we follow the standard procedure of least-squares fitting a parametric K(q) function to nonparametric estimates over a range of positive moments.In our case we use nonparametric estimates of the K(q) type.For the range of moments, we consider two cases: a range [0,q max ] where q max is approximately one half of the moment order beyond which K(q) becomes linear (see text for details), or a range 1 − ,1 + around 1. The latter fitting procedure is suggested by the fact that, when K(q) has the form in Eq. ( 5), the parameter C 1 is the slope of K(q) at q=1.
We call the resulting estimators of C 1 the LS estimator and the Slope estimator, respectively.These estimators perform very similarly and, when used with the optimal nonparametric estimator K(q) in place of the traditional estimator K(q), significantly reduce the RMS error of Ĉ1 .Additional accuracy can be gained by using the estimator K(q) with minimum error variance (with the parameter settings in Eq. 12a) and then correcting Ĉ1 for bias using Eq.(13).
All the numerical work reported here is for stationary multifractal measures.In many cases, one needs to estimate the moment-scaling properties of processes X(t) with multifractal increments (often referred to as multiaffine processes).
We have done some analyses also with these processes (not reported here).The main difference is that the analyzing wavelet must be of order at least 1.We have found that the qualitative results are similar to those when using the same wavelets with stationary measures.
Another context in which higher-order wavelets are used is the analysis of multifractal measures (or functions with multifractal increments) with additive polynomial trends (Lashermes and Foufoula-Georgiou, 2007).The non-scaling trend can be filtered out by choosing a wavelet of suitable order, but one has to contend with the issues of divergent negative moments and low local maxima mentioned above.
We conclude with a remark on the negative moments.There are many difficulties in accurately estimating K(q) for q<0.Moreover, one is often interested in the large values of ε n , which are controlled by the scaling of the positive moments.Therefore, it is generally recommended that one limits the inference of K(q) to the positive moments.If needed, scaling of the negative moments may be inferred through parametric extension of K(q) from the positive moments.

Fig. 3 .
Fig. 3. Illustration of the resolution levels n o ,n min , and n dat used by the average estimator K(q).

Fig. 4 .
Fig. 4. Absolute bias, standard deviation, and RMS error of K(q) for moment orders q=−2, 2, 3, 4. A single measure realization with C 1 =0.1 is observed at resolution level n dat =14.The parameters of the estimator vary as n o +0(2)12 and n=0(1)13−n o .The different plots in each panel are for different n o .The longer plots, for n o =0, correspond to the standard estimator K(q).Results from 1000 independent simulations.

Fig. 6 .
Fig.6.Schematic illustration of four estimators of K(q).Subsequent figures compare the performance of these estimators in the case of negative moments and either exact observations or variously corrupted data.

Fig. 7 .
Fig. 7. Plots of the moments of order q=−1, −2, −3 and −4 for the estimators in Fig. 6.The plotted values are the averages of the log of the moments over 1000 simulations with C 1 =0.1, observed at resolution level n dat =14.The bottom three rows include "low levels" of additive noise, multiplicative noise or quantization; see text.The open circles indicate the resolution levels n used to estimate K(q).

Fig. 10 .
Fig.10.Sources of bias for the continuous WTMM method.Comparison of the marginal moments of order q=−2 and −4 (star symbols) with the same moments of the modulus maxima used by the WTMM estimator (open circles) and scale-invariant estimator (diamonds).The reference straight lines have the theoretical slope K(q) from Eq. (5) with C 1 =0.1.The measures are observed at resolution level n dat =14 or 8.

Fig. 13 .
Fig. 13.Performance of various estimators of C 1 in Eq. (5), based on a single measure realization observed at resolution level n dat =14 or 8.The continuous lines are for the LS estimator and the dotted lines are for the Slope estimator.The parameters n o and n min vary in different rows.The top (a) and middle (b) rows correspond to the estimators K(q) with minimum-variance (n o =n min =n dat −1) and minimum RMSE (n o =0.5n dat −1;n min =n dat −2), respectively.The bottom row (c) is for the standard estimator K(q) ( K(q) with n o =n min =0).Results are based on 1000 independent simulations.