Nonlinear Processes in Geophysics Haar wavelets , fluctuations and structure functions : convenient choices for geophysics

Geophysical processes are typically variable over huge ranges of space-time scales. This has lead to the development of many techniques for decomposing series and fields into fluctuations1v at well-defined scales. Classically, one defines fluctuations as differences: ( 1v)diff = v(x+1x)−v(x) and this is adequate for many applications (1x is the “lag”). However, if over a range one has scaling 1v ∝1x , these difference fluctuations are only adequate when 0<H < 1. Hence, there is the need for other types of fluctuations. In particular, atmospheric processes in the “macroweather” range ≈ 10 days to 10–30 yr generally have−1<H < 0, so that a definition valid over the range −1<H < 1 would be very useful for atmospheric applications. A general framework for defining fluctuations is wavelets. However, the generality of wavelets often leads to fairly arbitrary choices of “mother wavelet” and the resulting wavelet coefficients may be difficult to interpret. In this paper we argue that a good choice is provided by the (historically) first wavelet, the Haar wavelet (Haar, 1910), which is easy to interpret and – if needed – to generalize, yet has rarely been used in geophysics. It is also easy to implement numerically: the Haar fluctuation ( 1v)Haarat lag1x is simply equal to the difference of the mean fromx to x+1x/2 and fromx+1x/2 to x+1x. Indeed, we shall see that the interest of the Haar wavelet is this relation to the integrated process rather than its wavelet nature per se. Using numerical multifractal simulations, we show that it is quite accurate, and we compare and contrast it with another similar technique, detrended fluctuation analysis. We find that, for estimating scaling exponents, the two methods are very similar, yet Haar-based methods have the advantage of being numerically faster, theoretically simpler and physically easier to interpret.


Introduction
In many branches of science -and especially in the geosciences -one commonly decomposes space-time signals into components with well-defined space-time scales.A widely used method is Fourier analysis which -when combined with statistics -can be used to quantify the contribution of structures of a given frequency and/or wavenumber to the total variance of the process.Often, however, real space analyses are preferable, not only because they are simpler to perform, but more importantly -because they are simpler to interpret.The latter are based on various definitions of fluctuations at a given scale and location, the simplest being the differences: where the position is x, and the scale ("lag") x and δ is the difference operator (the parentheses with the "diff" subscript are only used when it is important to distinguish the difference fluctuation from others).For simplicity, we consider a scalar function (v) of a scalar position and/or of time.We consider processes whose statistics of a given order are translationally invariant (statistically homogeneous in space, stationary in time).The resulting fluctuations can then be statistically characterized, the standard method being via the moments < v q >.When q = 2, these are the classical structure functions; when q = 2, they are the "generalized" counterparts.
The space-time variability of natural systems can often be broken up into various "scaling ranges" over which the fluctuations vary in a power law manner with respect to scale.Over these ranges, the fluctuations follow: Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.S. Lovejoy and D. Schertzer: Haar wavelets, fluctuations and structure functions where x is a random field (or series, or transects thereof at resolution x) and the equality is a statistical one, i.e. in the sense of probability distributions.Below for simplicity, we consider spatial transects with lag x, but the results are identical for temporal lags t.In fluid systems such as the atmosphere, ϕ is a turbulent flux.For example, the Kolmogorov law corresponds to taking v as a velocity component and ϕ = ε 1/3 with ε as the energy flux.The classical turbulence laws assumed fairly uniform fluxes, typically that ϕ is a quasi-Gaussian process.
Taking the q-th power of Eq. ( 2) and ensemble averaging, we see that the statistical characterization of the fluctuations in terms of generalized structure functions is S q ( x) = v ( x) q = ϕ q x x qH ≈ x ξ (q) ϕ q x = L x K(q) ; ξ (q) = qH − K (q) (3) where S q ( x) is the q-th order structure function and ξ(q) is the corresponding exponent, K(q) is the moment scaling function that characterizes the flux and L is that outer scale of the scaling regime (e.g. of a cascade process).In the classical quasi-Gaussian case, K(q) = 0 so that ξ(q) is linear.More generally, if the field is intermittent -for example if it is the result of a multifractal process -then the exponent K(q) is generally non linear and convex and characterizes the intermittency.Since the ensemble mean of the flux spatially averaged at any scale is the same (i.e.< ϕ x > is constant, independent of x), we have K(1) = 0 and ξ(1) = H .The physical significance of H is thus that it determines the rate at which mean fluctuations grow (H > 0) or decrease (H < 0) with scale x.
There is a simple and useful expression for the exponent β of the power spectrum which is a consequence of the fact that the spectrum is a second-order moment (the Fourier transform of the autocorrelation function).K(2) is therefore often regarded as the spectral "intermittency correction".Note however that, even if K(2) is not so large (it is often in the range 0.1-0.2 for turbulent fields), the intermittency is nevertheless important: firstly, because K(q) typically grows rapidly with q so that the high-order moments (characterizing the extremes) can be very large; secondly, because it is an exponent so that even when it is small, whenever the range of scales over which it acts is large enough, its effect can be large; in geo-systems, the scale ratio can be 10 9 or larger.
Over the last 30 years, structure functions based on differences and with concave rather than linear ξ(q) have been successfully applied to many geophysical processes; this is especially true for atmospheric processes at (weather) time scales shorter than ≈ 10 days which usually have 0 > H > 1.In v(x) x Fig. 1.This shows samples of the simulations analysed in Sect. 5. From bottom to top, H = −7/10, −3/10, 3/10, 7/10.All have C 1 = 0.1, α = 1.8; the sections are each 2 13 points long.One can clearly see the change in character of the series when H changes sign.At the bottom, fluctuations tend to cancel out, fluctuations are "stable".At the top, they tend to reinforce each other, and fluctuations are "unstable".addition, in turbulence theory there are many useful theoretical results for these classical structure functions (e.g.Monin and Yaglom, 1975).However, it turns out that the use of differences to define the fluctuations is overly (and needlessly) restrictive.The problem becomes evident when we consider that the mean difference cannot decrease with increasing x: for processes with H < 0, the differences simply converge to a spurious constant depending on the highest frequencies available in the sample.Similarly, differences cannot grow faster than x -they "saturate" at a linear function -whose slope depends on the lowest frequencies present in the sample; the process no longer has stationary increments.Hence, they cannot be used when H > 1.Overall, whenever H is outside the range 0 > H > 1, the exponent ξ(q) is no longer correctly estimated.The problem is that we need a definition of fluctuations such that v( x) is dominated by wavenumbers in the neighbourhood of x −1 .Appendix A gives more details about the relation between real and Fourier space and the ranges of H relevant for various types of fluctuation, and Fig. 1 shows the shape of typical sample series as H varies.
The need to define fluctuations more flexibly motivated the development of wavelets (e.g.Holschneider, 1995;Torrence and Compo, 1998), the classical difference fluctuation being only a special case, the "poor man's wavelet".To change the range of H over which fluctuations are usefully defined, one must change the shape of the defining wavelet.In the usual wavelet framework, this is done by modifying the "mother wavelet" directly, various derivatives of the Gaussian are popular, especially the second derivative "Mexican hat".Following this, the fluctuations are calculated as convolutions with fast Fourier (or equivalent) numerical techniques.
A problem with this usual wavelet implementation is that not only are the convolutions numerically cumbersome, but the physical interpretation of the fluctuations is largely lost.In contrast, when 0 < H < 1, the difference structure function gives direct information on the typical difference (q = 1) and typical variations around this difference (q = 2) and even typical skewness (q = 3) or typical Kurtosis (q = 4) or -if the probability tail is algebraic -of the divergence of highorder moments of differences.Similarly, when −1 < H < 0, one can define the "tendency structure function" (below) which directly quantifies the fluctuation's deviation from zero and whose exponent characterizes the rate at which the deviations decrease when we average to larger and larger scales.These poor man's and tendency fluctuations are also very easy to directly estimate from series with uniformly spaced data and -with straightforward modifications -to irregularly spaced data.
In Lovejoy and Schertzer (2012a, b), it is argued that, with only a few exceptions for atmospheric processes in the "weather" regime ( t 10 days), 0 < H < 1 so that fluctuations increase with scale and the classical use of differences is adequate.However, in the low frequency "macroweather" regime (10 days t 10-30 yr), on the contrary, quite generally (in time, but not in space!) we find −1 < H < 0 so that fluctuations decrease with scale (although for larger t -the "climate" regime -it is again in the range 0 > H > 1).In order to cover the whole range (for the temperature, even millions of years), one must thus use a definition of fluctuations that is valid over the range 1 > H > −1.This range is also appropriate for many solid earth applications, including topography, rock density, magnetic susceptibility, geogravity and geomagnetism; for a review, see Lovejoy and Schertzer (2007).The purpose of this paper is to argue that the Haar wavelet is particularly convenient, combining ease of calculation and of interpretation, and -if needed -it can easily be generalized to accommodate processes with H outside this range.We demonstrate this using multifractal simulations with various H exponents to systematically compare it to conventional spectral analysis and to another common -and quite similar -analysis technique, the detrended fluctuation analysis (DFA) method (Peng et al., 1994), and its extension, the multifractal detrended fluctuation analysis (MFDFA) method (Kantelhardt et al., 2001(Kantelhardt et al., , 2002)).

Defining fluctuations using wavelets
Our goal is to find a convenient definition of fluctuations valid at least over the range −1 < H < 1, but before doing so, it will be useful to first define tendency fluctuations ( v ( x)) tend .The first step is to remove the overall mean (v (x)) of the series: v (x) = v (x) − v (x) and then take averages over the lag x.For this purpose, we introduce the  The bottom is the same simulation but at 16-day resolution (in the low frequency weather regime, H = −0.4).One can clearly see their differing characters corresponding to H > 0, H < 0 respectively (see also Fig. 1).These simulations are used to illustrate the two different types of structure function needed when H > 0 (the usual, difference) and H < 0, the "tendency" structure function.
Whereas the usual structure function yields typical differences in T over an interval t, so that the mean is of no consequence, the tendency structure function uses the average of the field with the mean removed.In both cases, the q = 2 moment was chosen because it is directly related to the spectrum.Taking the square root is useful since the result is then a direct measure of the "typical" fluctuations, e.g. for the T = the temperature, in units of K.
operator T x defining the tendency fluctuation ( v ( x)) tend as or, with the help of the summation operator S (not to be confused with the structure function), equivalently by We can also use the suggestive notation ( v ( x)) tend = v (see the schematic Fig. 2).When −1 < H < 0, ( v ( x)) tend has a straightforward interpretation in terms of the mean tendency of the data to decrease with averaging.The tendency fluctuation is also easy to implement: simply remove the overall mean and then take the mean over intervals x: this is equivalent to taking the mean of the differences of the running sum.

S. Lovejoy and D. Schertzer: Haar wavelets, fluctuations and structure functions
It is now straightforward to define the Haar fluctuation by taking the second differences of the mean: where H x is the Haar operator (not to be confused with the exponent H ). Note the use of the shorthand notation s (x) = Sv.
As discussed in Sect.3, this is still a valid wavelet (see Fig. 3).However, it is almost trivial to calculate and it can be used for −1 < H < 1 (Appendix A; note there is an extra factor of x −1 with respect to the Haar wavelet coefficient; see below).The numerical factor (2) is used so that the Haar fluctuation is close to the usual differences when 0 < H < 1, (below).While this may still look a little complicated, it is actually quite simple: in words, the Haar fluctuation at lag x is simply the difference of the mean from x to x + x/2 and from x + x/2 to x + x i.e. it deals with fluctuations in the integrated process.
From the definitions, it is easy to obtain the relation: This is useful for interpreting the Haar fluctuations in terms of differences and tendencies (Sect.5.2).Since the difference operator removes any constants, the tendency operator can be replaced by an averaging operator and is thus insensitive to the addition of constants; Eq. ( 7) therefore means that the Haar fluctuation is simply the difference of the series that has been degraded in resolution by averaging.Note that, for a series of length L, only a range of scales L/2 is accessible; the definition Eq. ( 6) identifies the largest lag with scale L, the smallest with scale 2 (we could equally well have modified the definition so that this would be L/2 and 1).
If needed, the Haar fluctuations can easily be generalized to higher-order n by using n-th order differences on the sum s: We note that the n-th order Haar fluctuation is valid over the range −1 < H < n; due to the n+1-th order difference, it is insensitive to polynomials of order n in the sum and of order n − 1 in the original series, although, as we see below, for analyzing scaling series, this insensitivity has no particular advantage.Generalizations valid to order H < −1 are x ( ) x Fig. 3.This shows the "poor man's wavelet" by black bars representing the amplitudes of Dirac δ functions, (the basis of the usual difference structure function, valid for 0 < H < 1; this is only symbolic since the true δ function amplitude is of course infinite), the Haar wavelet (the basis of the Haar structure function, uniform blue shading, the second difference of the running sum, valid for −1 < H < 1), and the wavelet used for the "tendency" structure function (valid for −1 < H < 0), stippled shading.also possible by iterating the sum operator.Note further that H (1) x = H x and H (0) x v = T x v ; hence, when applied to a series with zero mean, H x = T x .Consider for a moment the case n = 2 which defines the "quadratic Haar fluctuation": This fluctuation is sensitive to structures of size x −1 -and hence useful -over the range −1 < H < 2, and it "filters" out polynomials of order 1 (lines); this is thus essentially equivalent to the commonly used quadratic (n = 2) MFDFA (multifractal detrended fluctuation analysis, Sect.4) technique that we numerically investigate below.

Fluctuations and wavelets
Wavelet analysis defines fluctuations with the help of a basic "mother wavelet" (x) and a convolution: where we have kept the notation v to indicate "fluctuation".v defined this way is called a "wavelet coefficient".The basic "admissibility" condition on (x) (so that it is a valid wavelet) is that it has zero mean.If we take the fluctuation v as the symmetric (centred where "δ" is the Dirac delta function (Eq. 1, see Fig. 3; due to the statistical translational invariance, the coordinate shift by 1/2 is immaterial).As discussed, this "poor man's" wavelet is adequate for many purposes.As the generality of the definition (Eq.10) suggests, all kinds of special wavelets can be introduced.For example, orthogonal wavelets can be used which are convenient if one wishes to reconstruct the original function from the fluctuations, or to define power spectra locally at a point x rather than globally, (averaged over all x).
The other simple to interpret fluctuation, the "tendency fluctuation" (Eq.4), can also be expressed in terms of wavelets: where I is the indicator function: See Fig. 3 for a schematic.The first term in Eq. ( 12) represents the sum, whereas the second removes the mean where L 1 is the overall length of the data set.The removal of the mean in this way is necessary so that the wavelet satisfies the admissibility condition that its mean is zero.The only essential difference between the wavelet defined by Eq. ( 12) and the "tendency fluctuation" defined in Eq. ( 4) (where the mean is removed beforehand) is the extra normalization by x in Eq. ( 4) which changes the exponent by one.Figure 3 also shows the wavelet corresponding to the Haar fluctuation: To within the division by x, and a factor of 2, the result is the equivalent of using the Haar fluctuations (Eq.6) as indicated in Fig. 3.It can be checked (Appendix A) that the Fourier transform of this wavelet is ∝ k −1 sin 2 (k/4) so that compared to the difference and tendency fluctuations, whose mother wavelet has cutoffs only at high or low wavenumbers, it is better localized in Fourier space with both low and high wavenumber fall-offs (≈ k and ≈ k −1 respectively).The Haar fluctuation is a special case of the Daubechies family of wavelets (see e.g.Holschneider, 1995 and for some applications, see Koscielny-Bunde et al., 1998, 2006;Ashok et al., 2010).It is also worth mentionning the paper by Veneziano and Furcolo (2003) who pointed out that Haar wavelets can profitably be used in theoretical calculations on multifractal cascades.
If we are interested in fluctuations valid up until H = 2, we need merely to take derivatives (or differences).For example, ψ(x) x Fig. 4. The popular Mexican hat wavelet (the second derivative of the Gaussian red, valid for −1 < H < 2) compared with the (negative) of the second finite difference wavelet (black bars representing the relative weights of δ functions, valid for 0 < H < 2), and the second-order "quadratic" Haar wavelet (blue) Eq. ( 9).
the second finite difference wavelet is (corresponding to v = (v(x + x)+v(x − x/2))/2−v(x) using centred differences, Fig. 4).Alternatively, we can compare this with the quadratic Haar wavelet, Eq. ( 9): which is very close to the Mexican hat wavelet (see Fig. 4) but is numerically much easier to implement.As with the Mexican hat, it is valid, for −1 < H < 2. It turns out that for analyzing multifractals, even over their common range of validity (0 < H < 1), the Haar structure function is somewhat better than the poor man's wavelet-based structure function (i.e. the usual structure function).This is demonstrated on numerical examples in Sect. 5. Just as one can use higher and higher-order finite differences to extend the range of H values, one can simply use higher and higher-order derivatives of the Mexican hat.
Wavelet mathematics are seductive -and there certainly exist areas such as speech recognition -where both frequency/wavenumber and temporal or spatial localization of statistics are necessary.However, the use of wavelets in geophysics is often justified by the existence of strong localized structures that are cited as evidence that the underlying process is statistically nonstationary/statistically inhomogeneous (i.e. in time or in space respectively).However, S. Lovejoy and D. Schertzer: Haar wavelets, fluctuations and structure functions multifractal cascades produce exactly such structures -the "singularities" -but their statistics are strictly translationally invariant.In this case, the structures are simply the result of the strong singularities.Nevertheless, the local statistics are usually uninteresting, and we usually average over them in order to improve statistical estimates.
Before continuing to discuss other related methods for defining fluctuations, we should mention that there have been strong claims that wavelets are indispensable for analyzing multifractals (e.g.Arneodo et al.,1999).In as much as the classical definition of fluctuations as first differences is already a wavelet, this is true (see however the DFA discussion below).Nevertheless, for most applications, one does not need the full arsenal of wavelet techniques.At a more fundamental level, however, the claim is at least debatable since, mathematically, wavelet analysis is a species of functional analysis; i.e. its objects are mathematical functions defined at mathematical points.On the contrary, the generic multifractal processes -cascades -are singular measures, they are "delocalized", see Schertzer and Lovejoy (1992), Schertzer et al. (2002Schertzer et al. ( , 2010)), so that strictly speaking, they are outside the scope of wavelet analysis.Wavelets may therefore not always be appropriate.A particular example where wavelets may be misleading is the popular "modulus maximum" technique where one attempts to localize the singularities (Bacry et al., 1989;Mallat and Hwang, 1992).If the method is applied to a cascade process, then, as one increases the resolution ("zooming in"), the singularity localization never converges implying that the significance of the method is not as obvious as is claimed.Similar comments apply to the more recent "wavelet leader" technique (Serrano and Figliola, 2009).

Fluctuations as deviations from polynomial regressions: the DFA, MFDFA techniques
We now discuss a variant method that defines the fluctuations in a different way, but still quantifies the statistics in the manner of the generalized structure function by assuming that the fluctuations are stationary: multifractal detrended fluctuation analysis.The MFDFA technique (Kantelhardt et al., 2001(Kantelhardt et al., , 2002) ) is a straightforward generalization of the original detrended fluctuation analysis (DFA, Peng et al., 1994).The method works for 1-D sections, so consider the transect v(x) of series on a regular grid with resolution = 1 unit, L units long.The MFDFA starts by replacing the original series by the running sum: (cf.Eq. 5).As discussed above, a sum is a finite difference of order −1, so that analysing s rather than v allows the treatment of transects with H down to −1 (we will discuss the upper bound shortly).We now divide the range into L x = int(L/ x) disjoint intervals, each indexed by i = 1, 2, ..., L x ("int" means "integer part").For each interval starting at x = i x, one defines the "n-th order" fluctuation by the standard deviation σ s of the difference of s with respect to a polynomial F n (x, x) = σ s in the running sum of v as follows: x = i x and p n,i (x) is the n-th order polynomial regression of s(x) over the i-th interval of length x; it is the polynomial that "detrends" the running sum s; the fluctuation σ s is the root-mean-square deviation from the regression.F is the standard DFA notation for the "fluctuation function"; as indicated, it is in fact a fluctuation in the integrated series quantified by the standard deviation σ s .Using F , we have So in terms of the usually cited DFA fluctuation exponent α (not the Levy α!), we have The fluctuation ( v) MFDFA is very close to the n-th order finite difference of v.It is also close to a wavelet-based fluctuation where the wavelet is of n-th order, although this is not completely rigorous (see however Kantelhardt et al., 2001;Taqqu et al., 1995).As a practical matter, in the above sum, we start with intervals of length x = n+1 and normalize by the number of degrees of freedom of the regression, i.e. by x − n not by x; Eq. ( 18) is the approximation for large x.
In the usual presentation of the MFDFA, one considers only a single realization of the process and averages powers of the fluctuations over all the disjoint intervals i.However, more generally, we can consider a statistical ensemble and average over all the intervals and realizations.If in addition, we consider moments other than q = 2, then we obtain the multifractal DFA (MFDFA) with the following statistics: where we have used the exponent h(q) as defined in Davis et al. (1996) and Kantelhardt et al. (2002).In what follows, we will refer to the method as the MFDFA technique, even though the only difference with respect to the DFA is the consideration of the q = 2 statistics.From our analysis above, we see that the n-th order MFDFA exponent h(q) is related to the usual exponents by where we have used the relation ( v) q MFDFA ≈ x ξ (q) .Note that, in Eq. ( 20), the 1 appears because of the initial integration of order 1 (the sum) in the MFDFA recipe.The method works up to H = n, because the fluctuations are defined as the residues with respect to n-th order polynomials of the sum scorresponding to n+1-th order differences in s and hence n-th order differences in v.The justification for defining the MFDFA exponent h(q) in this way is that in the absence of intermittency (i.e. if K(q) = constant = K(0) = 0), one obtains h(q) = H = constant.Unfortunately -contrary to what is often claimed -the nonconstancy of h(q) neither implies that that the process is multifractal nor does its constancy imply a monofractal process.To see this, it suffices to consider the (possibly fractionally integrated, order H ) monofractal β model which has K(q) = C 1 (q − 1).Therefore, h(q) = 1+H −C 1 +C 1 /q, which is not only nonconstant but even diverges as q approaches zero (indeed the same q = 0 divergence occurs for any universal multifractal with α < 1; see Eq. ( 21) below: lim q→0 h (q) → ∞).
During the last ten years, the MFDFA technique has been applied frequently to atmospheric data; of special note is the work by A. Bunde and co-workers (e.g.Koscielny- Bunde et al., 1998;Kantelhardt et al., 2001;Bunde et al., 2002Bunde et al., , 2005;;Lennartz and Bunde, 2009) who effectively took advantage of its ability to handle the H < 0, needed for macroweather processes.In addition, several numerical studies have compared the MFDFA performance to various wavelet techniques.For example, for estimating exponents, Oswiecimka et al. (2006) find it slightly superior to the wavelet transform modulus maximum method, and Huang et al. (2011) find it somewhat advantageous when compared to the wavelet leader method.However, the exact status of the MFDFA method is not clear since -at least in the multifractal case -a completely rigorous mathematical analysis has not been done, and the literature suffers from misleading claims to the effect that the MFDFA is necessary because, by its construction, it removes nonstationarities due to trends (linear, quadratic etc. up to polynomials order n − 1) in the data.While it is true that it does remove polynomial trends, these only account for fairly trivial types of nonstationarity: beyond this, the method continues to make the standard (and strong) stationarity assumptions about the statistics of the deviations which are left over after performing the polynomial detrending.In particular, the MFDFA does nothing to remove the most common genuine type of statistical nonlinearity in the geosciences: the diurnal and annual cycles which still strongly break the scaling of the MFDFA statistics, and this for any order n.Furthermore, the corresponding wavelet or finite difference definition of fluctuations can equally well take into account these polynomial trends, and finally, the method removes trends at all scales and locations so that this emphasis on detrending is misleading.In other words, the MFDFA is a variant with respect to some of the wavelet methods, in particular with respect to the Haar wavelet and its generalizations.While it may indeed yield accurate exponent estimates, it has the disadvantage that the interpretation of the fluctuations is not straightforward.In comparison, the usual difference and tendency fluctuations and structure functions have simple physical interpretations in terms of magnitudes of changes (when 1 > H > 0) and magnitudes of average tendencies (when −1 < H < 0).

The simulations
To have a clearer idea of the limitations of the various fluctuations for determining the statistics of scaling functions, we shall numerically compare their performance and their corresponding structure functions when applied to the characterization of multifractals.In order to easily compare them with standard spectral analysis, we will only consider the secondorder (q = 2) structure functions and will use simulations of universal multifractals (Schertzer andLovejoy 1987, 1997).
Recall that universal multifractals are the result of stable, attractive multiplicative cascade processes and have where 0 ≤ C 1 is the codimension of the mean and 0 ≤ α ≤ 2 is the multifractality index (the Lévy index of the generator).The limit α → 1 gives K(q) = C 1 q log q and may be obtained from Eq. ( 21) by using l'Hôpital's rule.Recall (Eq. 3) that the corresponding structure function exponent is ξ(q) = qH − K(q).The following tests were all made on simulations with parameters α = 1.8, C 1 = 0.1 with H in the range −7/10 < H < 7/10 (typical parameters in meteorology and in solid earth geophysics; see the reviews Lovejoy andSchertzer, 2007, 2010b).These parameters yield an intermittency correction K(2) = 0.18; 50 simulations were averaged to estimate the ensemble statistics.The simulations were made using the technique described in Schertzer and Lovejoy (1987) and refined in Lovejoy and Schertzer (2010a) for the flux ϕ (i.e.H = 0).For the second fractional integration (to obtain the field v from ϕ), we used a Fourier space power law filter k −H (see Eq. 2).For this fractional integration, exponential cutoffs were used at high wavenumbers to avoid small-scale numerical instabilities (this is standard for differentiation; i.e. when filtering by k −H with H < 0, it is equivalent to using the "Paul" wavelet; see Torrence and Compo, 1998).The raw series were 2 16 in length and were then degraded in resolution by factors of 4 to a length 2 14 pixels to avoid residual finite size effects in the simulation at the smallest scales.In order to avoid spurious correlations introduced by the periodicity of the simulations, the series were split in half so that the largest scale was 2 13 .
Log 10 E(k) k Fig. 5.The compensated spectra for an ensemble of 50 realizations, 2 14 each, α = 1.8, C 1 = 0.1, intermittency correction = K(2) = 0.18, with H increasing from top to bottom from −7/10 to 7/10, β = 1+2H −K(2).The dashed horizontal line is the theoretical behaviour indicated over the range used to estimate the exponent (i.e. the highest and lowest factor of 10 0.5 in wavenumber have been dropped).Each curve was offset in the vertical for clarity.
In Fig. 1 we already showed samples of the simulations visually showing the effect of increasing H from −7/10 to 7/10.In Fig. 5, we show the compensated spectra (obtained using standard Hann windows to avoid spectral leakage).As expected, we see that they are nearly flat, although for the lowest and highest factors of about 10 0.5 ≈ 3, there were more significant deviations.The slopes of the central portions were thus determined by regression, and the corresponding exponents are shown in Fig. 6.In Fig. 5, the spectra were averaged over logarithmically spaced bins, 10 per order of magnitude (with the exception of the lowest factor of 10 where all the values were used).We also performed the regressions using log-log fits using all the Fourier components (rather than just the averaged values) in the same central range; these gave virtually identical exponent estimates (the mean absolute differences in the estimated spectral exponent β were ≈ ±0.007), and were thus not shown.Similarly, if instead of using log-log linear regression, we use nonlinear regressions over the same range, we obtain identical exponents to within ±0.009, and again these estimates were too similar to be worth showing.From Fig. 6, one can see that there seems to be a residual small bias of about −0.04, the origin of which is not clear.
We next consider the Haar structure function which we compare to the second-order (q = 2), quadratic (n = 2) MFDFA structure function analogue (based on the MFDFA fluctuation ( v) MFDFA = F / x where F is the usual MFDFA scaling function; see Eq. 18).The former is valid over the range −1 < H < 1, the latter over the range −1 < H < 2. We can see (Fig. 7) that the Haar structure

H
(2) Tendency s.f.Fig. 6.This shows the regression estimates of the compensated exponents for the spectra δξ (2) = ξ (2) numerics − ξ (2) theory (red; perfect methods give δξ (2) = 0), Haar structure function (q = 2; green), quadratic, q = 2 MFDFA (dashed), the usual difference (poor man's) structure function (q = 2, blue, for H > 0), and the tendency structure function (q = 2, same line, blue for H < 0).function does an excellent job with overall bias mostly around +0.01 to 0.02, (Fig. 6), and that the MFDFA method is nearly as good (overall bias ≈ −0.02 to −0.04).It is interesting to compare this in the same figure with the results of the tendency structure function (H < 0) and the usual difference (poor man's) structure function (H > 0), (Fig. 8).The tendency structure function turns out to be quite accurate, except near the limiting value H = 0 with the large scales showing the largest deviations.In comparison, for the difference structure function, the scaling is poorest at the smaller scales requiring a range factor of ≈ 100 convergence for H = 1/10 and a factor ≈ 10 for H = 3/10.The overall regression estimates (Fig. 6) show that the biases for both increase near their limiting values H = 0 to accuracies ≈ ±0.1 (note that they remain quite accurate if one only makes regressions around the scaling part).This can be understood since the process can be thought of as a statistical superposition of singularities of various orders, and only those singularities with large enough orders (difference structure functions) or small enough orders (tendency structure functions) will not be affected by the theoretical limit H = 0.The main advantage of the Haar structure function with respect to the usual (tendency or difference) structure functions is precisely that it is valid over the whole range −1 < H < 1, so that it is not sensitive to the H = 0 limit.
Before proceeding, we could make a general practical remark about these real space statistics: most lags x do not divide the length of the series (L) exactly so that there is a "remainder" part.This is not important for the small x L since f .For each realization, there will be many disjoint intervals of length x.However, when L/3 ≈< x < L, the statistics can be sensitive to this since, from each realizathere will only be one or two segments and hence poor statistics.A simple expedient is to repeat the analysis on the reversed series and average the two results.This can indeed improve the statistics at large x when only one or a small number of realizations are available; it has been done in the analyses presented here.

The theoretical relation between poor man's, tendency and Haar fluctuations, hybrid structure functions
We have seen that, although the difference and tendency structure functions have the advantage of having simple interpretations in terms of respectively the average changes in the value of the process and its mean value over an interval, this simple interpretation is only valid over a limited range of H values.In comparison, the Haar and MFDFA fluctuations give structure functions with valid scaling exponents over wider ranges of H .But what about their interpretations?
We now briefly show how to relate the Haar, tendency and poor man's structure functions thus giving it a simple interpretation as well.
In order to see the connection between the fluctuations, we use the "saturation" relations: where C tend and C diff are proportionality constants and d = indicates equality in the random variables in the sense of probability distributions (a where "Pr" means "probability" and ζ is an arbitrary threshold).These relations were easily verified numerically and arise for the reasons stated above; they simply mean that, for series with H < 0, the differences are typically of the same order as the function itself and that for H > 0, the tendencies are of the same order: the fluctuations "saturate".By using Eqs.( 7) and ( 22), we now obtain where C tend and C diff are "calibration" constants (only a little different from the unprimed quantities -they take into account the factor of two and the change from x/2 to x).Taking the q-th moments of both sides of Eq. ( 23), we obtain the results for the various q-th order structure functions: This shows that, at least for scaling processes, the Haar structure functions will be the same as the difference (H > 0) and tendency structure functions (H < 0), as long as these are "calibrated" by determining C diff , and C tend .In other words, by comparing the Haar structure functions with the usual structure functions, we can develop useful correction factors which will enable us to deduce the usual fluctuations given the Haar fluctuations.Although the MFDFA fluctuations are not wavelet coefficients, at least for scaling processes, the same basic argument applies.Since the MFDFA and usual fluctuations scale with the same exponents, they can only differ in their prefactors, the calibration constants.
To see how this works on a scaling process, we confined ourselves to considering the root-mean-square (RMS) S( t) =< v 2 > 1/2 ; see Fig. 9 which show the ratio of the usual S( t) to those of the RMS Haar and MFDFA fluctuations.We see that, at small x, s, the ratio stabilizes after a range of about a factor 10-20 in scale and that it is quite constant up to the extreme factor of 3 or so (depending a little on the H value).The deviations are largely due to the slower convergence of the usual RMS fluctuations to their asymptotic scaling form.From the figure, we can see that the ratios C Haar = S/S Haar (Fig. 9 upper left) and corresponding C MFDFA = S/S MFDFA (Fig. 9, upper right) are well defined in the central region, and are near unity, more precisely in the range 1/2 < C Haar < 2 for the most commonly encountered range of H : −4/10 < H < 4/10.In comparison (Fig. 9, bottom) over the same range of H , we have 0.23 > C MFDFA > 0.025 so that the MFDFA fluctuation is quite far from the usual ones and varies strongly with H .For applications, one may use the semi-empirical formulae C Haar ≈ 1.1e −1.65 H and C MFDFA ≈ 0.075e −2.75 H which are quite accurate over the entire range −7/10 < H < 7/10.Although these factors in principle allow one to deduce the usual RMS structure function statistics from the MFDFA and Haar structure functions, this is only true in the scaling regime.Since the corrections for Haar fluctuations are close to unity, for many purposes they can be used directly.

Hybrid fluctuations and structure functions: an empirical example with both H < 0, and H > 0 regimes
For pure scaling functions, the difference (1 > H > 0) or tendency (−1 < H < 0) structure functions are adequate; the real advantage of the Haar structure function is apparent for functions with two or more scaling regimes: one with H > 0, one with H < 0. Can we still "calibrate" the Haar structure function so that the amplitude of typical fluctuations can still be easily interpreted?For these, consider Eq. ( 23) which motivates the definition, of a "hybrid" fluctuation as the maximum of the difference and tendency fluctuations; the "hybrid structure function" is thus the maximum of the corresponding difference and tendency structure functions and therefore has a straightforward interpretation.The hybrid fluctuation is useful if a unique calibration constant C hybrid can be found: To get an idea of how the different methods can work on real data, we consider the example of the global monthly averaged surface temperature of the earth.This is obviously highly significant for the climate, but the instrumental temperature estimates are only known over a small number of years.For this purpose, we chose three different series whose period of overlap was 1880-2008; 129 yr.These were the NOAA NCDC (National Climatic Data Center) merged land air and sea surface temperature data set (abbreviated NOAA NCDC below, from 1880 on a 5 • × 5 • grid; see Smith et al., 2008 for details), the NASA GISS (Goddard Institute for Space Studies) data set (from 1880 on a 2 • ×2 • Hansen et al.,  2010) and the HadCRUT3 data set (from 1850 to 2010 on a 5 • × 5 • grid).HadCRUT3 is a merged product created out of the HadSST2 (Rayner et al., 2006) Sea Surface Temperature (SST) data set and its companion data set CRUTEM3 of atmospheric temperatures over land, see Brohan et al. (2006).Both the NOAA NCDC and the NASA GISS data were taken from http://www.esrl.noaa.gov/psd/, the others from http://www.cru.uea.ac.uk/cru/data/temperature/. The NOAA NCDC and NASA GISS are both heavily based on the Global Historical Climatology Network (Peterson and Vose, 1997), and have many similarities including the use of sophisticated statistical methods to smooth and reduce noise.In contrast, the HadCRUTM3 data are less processed with corresponding advantages and disadvantages.We could note that detailed analysis has shown that in the H < 0 "macroweather" regime ( 10 yr in Fig. 10), the intermittency is typically low; i.e.C 1 ≈ 0.01 (Lovejoy and Schertzer, 2012b), whereas it is ≈ 0.07 in the lower frequency (climate) regime which is a value comparable to that in the weather regime, i.e. of turbulence.
Figure 10 shows the comparison of the difference, tendency, hybrid and Haar RMS structure functions; the latter increased by a factor C hybrid = 10 0.35 ≈ 2.2.It can be seen that the hybrid structure function does extremely well; the deviation of the calibrated Haar structure function from the hybrid one is ±14 % over the entire range of near a factor 10 3 in time scale.This shows that, to a good approximation, the Haar structure function can preserve the simple interpretation of the difference and tendency structure functions: in regions where the logarithmic slope is between −1 and 0, it approximates the tendency structure function, whereas in regions where the logarithmic slope is between 0 and 1, the calibrated Haar structure function approximates the difference structure function.
Before embracing the Haar structure function, let us consider its behaviour in the presence of non-scaling perturbations.It is common to consider the sensitivity of statistical scaling analyses to the presence of non-scaling external trends superposed on the data which therefore break the overall scaling.Even when there is no reason to suspect such trends, the desire to filter them out is commonly invoked to justify the use of quadratic MFDFA or high-order wavelet techniques that eliminate linear or higher-order polynomial ) for the simulations discussed in the text.Upper left shows the ratio of the RMS usual structure function (i.e.difference when H > 0, tendency when H < 0) to the Haar structure function for H = 1/10, 3/10, ..9/10 (thick, top to bottom), and −9/10, −7/10, ... − 1/10, (thin, top to bottom).The Haar structure function has been multiplied by 2 ξ(2)/2 to account for the difference in effective resolution.The central flat regions where the scaling is accurate indicate a constant ratio C Haar = S/S Haar which is typically less than a factor of two.Upper right is the same but for the ratio of the usual to MFDFA RMS structure functions after the MFDFA was normalized by a factor of 16 and the resolution correction 4 ξ(2)/2 (its smallest scale is 4 pixels).Bottom left shows the ratio of the MFDFA structure function to the Haar structure function both with the normalization and resolution corrections indicated above (note the monotonic ordering with respect to H , which increases from top to bottom).trends.However, for this purpose, these techniques are not obviously appropriate since, on the one hand, they only filter out polynomial trends (and not for example the more geophysically relevant periodic trends), while, on the other hand, even for this, they are "overkill" since the trends they filter are filtered at all scales -not just the largest.The drawback is that, with these higher-order fluctuations, we loose the simplicity of interpretation of the Haar fluctuation while obtaining few advantages.Figure 11 shows the usual (linear) Haar RMS structure function compared to the quadratic Haar and quadratic MFDFA structure functions.It can be seen that, while the latter two are close to each other (after applying different calibration constants; see the figure caption), and that the low and high frequency exponents are roughly the same, the transition time scale has shifted by a factor of about 3 so that overall they are quite different from the Haar structure function.It is therefore not possible to simultaneously calibrate the high and low frequencies.
If there are indeed external trends that perturb the scaling, these will only exist at the largest scales; it is sufficient to remove a straight line (or if needed second-, thirdor higher-order polynomials) from the entire data set (i.e. at the largest scales only).Figure 12 shows the result when the Haar structure function is applied to scaling data that have been detrended in two slightly different ways: by removing a straight line through the first and last points of the series; and by removing a regression line.Since these changes essentially effect the low frequencies only, they mostly affect the extreme factor of two in scale.If we discount this extreme factor 2, then in the figure we see that there are again two scaling regimes; the low frequency one is slightly displaced with respect to the Haar structure function, but not nearly as much as for the quadratic Haar.In other words, removing the linear trends at all scales as in the quadratic Haar or the quadratic MFDFA is too strong to allow a simple interpretation of the result and it is unnecessary if one wishes to eliminate external trends.
In summary, if all that is required are the scaling exponents, the Haar structure function and quadratic MFDFA seem to be a excellent techniques.However, the Haar structure function has the advantage of numerical efficiency, simplicity of implementation, and simplicity of interpretation.annual structure functions (fitted for 129 yr > t > 10 yr), obtaining T ( t) 2 1/2 ≈ 0.08 t 0.33 for the ensemble.In comparison, Lovejoy and Schertzer (1986) found the very similar T ( t) 2 1/2 ≈ 0.077 t 0.4 using Northern Hemisphere data (these correspond to β c = 1.66, 1.8 respectively).Reproduced from Lovejoy and Schertzer (2012a).

Conclusions
Geosystems are commonly scaling over large ranges of scale in both space and time.In order to characterize their properties, one decomposes them into elementary components with well-defined scales and characterizes the variability with the help of various statistics.Their scaling behaviours can then be characterized by exponents that are obtained by systematically changing the scale.The classical analysis techniques are spectral analysis (Fourier decomposition) and structure functions based on fluctuations defined as differences.However, over the last 20 yr, the development of wavelets has provided a mathematical basis for defining fluctuations.As far as geophysical applications are concerned, this generality has not proved to be so helpful.This is because different workers use different types of wavelets and these are often chosen for various mathematical properties which are not necessarily important for applications.In addition, the numerical determination of the wavelet coefficients is often cumbersome.Finally, and most importantly, the interpretation of the corresponding wavelet coefficients is often obscure and attention has been almost exclusively focused on the exponents.These features of wavelet analysis have lead to the popularity of a slightly different wavelet-like technique -the detrended fluctuation analysis (DFA) method and its extension, the multifractal DFA (MFDFA).An unfortunate consequence is that the criterion for the applicability of the different techniques has largely been ignored (the range of H exponents), and the advantages and disadvantages of different approaches have not been critically (and quantitatively) evaluated.In this paper, we argue that the historically first wavelet -the Haar wavelet -has several advantages.Ironically, these advantages are not related to their wavelet nature; the wavelet framework is not especially helpful nor is it needed.These advantages are not so much at the level of the accuracy of the exponents -which turn out to be about the same as for spectral analysis and the DFA techniquebut rather in the ease of implementation and interpretation of the results.This is demonstrated both with the help of multifractal simulations (exponents) and the ease of interpretation with an atmospheric example where we show that, with "calibration", the Haar structure functions can be made very close to both the usual difference structure functions (in regions where 1 > H > 0) and to simple "tendency" structure functions (in regions where −1 < H < 0).This simplicity is already lost when using either the DFA or even higher-order wavelet-based generalizations of Haar fluctuations.The Haar structure functions have recently been applied systematically to atmospheric data where they have contributed to clarifying the fact that there are not two regimes (weather and climate) but rather three: weather ( 10 day); macroweather (between ≈ 10 days and ≈ 10-30 yr, H < 0); and the climate ( 10-30 yr, H > 0) (Lovejoy and Schertzer, 2012a) (see http://www.physics.mcgill.ca/for Haar software (Mathematica and MathLab)).

Relations between real and Fourier space analyses and the limiting H exponents
We repeatedly underlined the limiting H values associated with specific types of fluctuations.In this appendix we clarify the origin of these limits by recalling the classical relation between the fluctuations and the spectra.
Consider the statistically translationally invariant process v(x) in 1-D: the statistics are thus independent of x and this implies that the Fourier components are "δ correlated": If it is also scaling, then the spectrum E(k) is a power law: ≈ k −β (where here and below, we ignore constant terms such as factors of 2π etc.).Let us first consider the difference fluctuation.In terms of its Fourier components, this fluctuation is thus = e ikx v (k) e ik x − 1 dk (A2) so that the F.T. of ( v(x, x)) diff is v (k) e ik x − 1 .We first consider the statistics of quasi-Gaussian processes for which C 1 = 0, ξ(q) = Hq.Assuming statistical translational invariance, we drop the x dependence and obtain the relation of second-order structure function to the spectrum: = 4 e ikx v (k) 2 sin 2 k x 2 dk ≈ e ikx k −β sin 2 k x 2 dk. (A3) As long as the integral on the right converges, then using the transformation of variables, x → λx, k → λ −1 k, we find ≈ e ikx k −β sin 2 k x 2 dk ∝ x β−1 (A4) so that β = ξ(2)+1 = 2H +1(C 1 = K(q) = 0 here, Eq. 21).However, for large k, the integrand ≈ k −β , which has a large wavenumber divergence whenever β ≤ 1.However, since for small k, sin 2 (k x/2) ∝ k 2 , there will be a low wavenumber divergence only when β ≥ 3.Although real world (finite) data will not diverge, in the theoretically divergent cases, the structure functions will no longer characterize the local fluctuations, but rather those on either the highest or lowest wavenumbers present in the data.In the quasi-Gaussian case, or when C 1 is small, we have ξ (2) ≈ 2H and we conclude that the use of first-order differences to define the fluctuations leads to second-order structure functions being meaningful in the sense that they adequately characterize the fluctuations whenever 1 < β < 3 (i.e.0 < H < 1).Since 0 < H < 1 is the usual range of geophysical H values, and the difference fluctuations are very simple, they are commonly used.However, we can see that there are limitations; in order to extend the range of H values, it suffices to define fluctuations using finite differences of different orders.To see how this works, consider using second (centred) differences: = e ikx ṽ(k) 1 − 1 2 (e ik x/2 + e −ik x/2 ) dk = 2 e ikx ṽ(k) sin k x 4 2 dk. (A5) Repeating the above arguments, we can see that the relation β = ξ(2) + 1 holds now for 1 < β < 5, or (with the same approximation) 0 < H < 2. Similarly, by replacing the original series by its running sum (a finite difference of order −1)as done in the MFDFA technique -we can extend the range of H values down to −1.More generally, going beyond Gaussian processes, we can consider intermittent fractionally S. Lovejoy and D. Schertzer: Haar wavelets, fluctuations and structure functions integrated flux processes (Sect.5.1), which have v (k) ≈ ϕ (k) |k| −H .We see that the F.T. of ( v(x, x)) diff is e ik x − 1 |k| −H ϕ (k).This implies that, for low wavenumbers, v (k) ≈ |k| 1−H ϕ (k) (k 1/ x), whereas at high wavenumbers, v (k) ≈ |k| −H ϕ (k) (k 1/ x).Hence, since the mean of ϕ has no scale dependence, we find that, for 0 < H < 1, the fluctuations v( x) are dominated by wavenumbers k ≈ 1/ x.So for this range of H , fluctuations defined as differences capture the variability of x-sized structures, not structures much smaller or much larger than x.More generally, since the F.T. of the n-th derivative d n v/dx n is (ik) n v (k) and the finite derivative is the same for small k but "cut-off" at large k, we find that n-th order fluctuations are dominated by structures with k ≈ 1/ x as long as 0 < H < n.This means that v( x) does indeed reflect the x scale fluctuations.

Fig. 8 .
Fig.8.This compares the compensated Haar structure function (thick), the difference structure function (thin, below the axis, for H = 3/10 (bottom) and 1/10 black, second up), and the tendency structure function (third from bottom, thin red, H = −1/10), and top, thin green, H = −3/10.It can be seen that the standard difference structure function has poor scaling for nearly two orders of magnitude when H = 1/10, and one order of magnitude for H = 3/10 (see Fig.6for quantitative estimates).

Fig. 9 .
Fig. 9.These graphs compare the ratios of RMS structure functions (S ( x) = v 2 1/2) for the simulations discussed in the text.Upper left shows the ratio of the RMS usual structure function (i.e.difference when H > 0, tendency when H < 0) to the Haar structure function for H = 1/10, 3/10, ..9/10 (thick, top to bottom), and −9/10, −7/10, ... − 1/10, (thin, top to bottom).The Haar structure function has been multiplied by 2 ξ(2)/2 to account for the difference in effective resolution.The central flat regions where the scaling is accurate indicate a constant ratio C Haar = S/S Haar which is typically less than a factor of two.Upper right is the same but for the ratio of the usual to MFDFA RMS structure functions after the MFDFA was normalized by a factor of 16 and the resolution correction 4 ξ(2)/2 (its smallest scale is 4 pixels).Bottom left shows the ratio of the MFDFA structure function to the Haar structure function both with the normalization and resolution corrections indicated above (note the monotonic ordering with respect to H , which increases from top to bottom).