Geometric and Topological Approaches to Significance Testing in Wavelet Analysis

Geometric and topological methods are applied to significance testing in the wavelet domain. A geometric test was developed for assigning significance to pointwise significance patches in local wavelet spectra, i.e., contiguous regions of significant wavelet power coefficients with respect to some noise model. This geometric significance test was found to produce results similar to an existing areawise significance test while being more computationally flexible and efficient. The geometric significance test can be readily applied to pointwise significance patches at various pointwise significance levels in wavelet power and coherence spectra. The geometric test determined that features in wavelet power of the North Atlantic Oscillation (NAO) are indistinguishable from a red-noise background, suggesting that the NAO is a stochastic, unpredictable process, which could render difficult the future projections of the NAO under a changing global system. The geometric test did, however, identify features in the wavelet power spectrum of an El Niño index (Niño 3.4) as distinguishable from a red-noise background. A topological analysis of pointwise significance patches determined that holes, deficits in pointwise significance embedded in significance patches, are capable of identifying important structures, some of which are undetected by the geometric and areawise tests. The application of the topological methods to ideal time series and to the time series of the Niño 3.4 and NAO indices showed that the areawise and geometric tests perform similarly in ideal and geophysical settings, while the topological methods showed that the Niño 3.4 time series contains numerous phase-coherent oscillations that could be interacting nonlinearly.


Introduction
Time series are often complex, and are composed of oscillations and trends. The goal of researchers is to decide whether the embedded structures in the time series are stochastic or deterministic. Such decisions can be made using Fourier analysis, with the assumption that the underlying time series is stationary (Jenkins and Watts, 1968). In many cases, however, the stationary assumption is not satisfied, making Fourier analysis an inappropriate tool for feature extraction. For nonstationary time series, wavelet analysis (Meyers, 1993; Torrence and Compo, 1998) can be used for decomposing a time series into both frequency and time components, allowing for the extraction of transient features and dominant modes of variability. Once embedded structures in time series have been identified, a natural question arises: what physical mechanisms are responsible for the detected modes of variability? Linkages between the modes of variability and possible physical mechanisms can be obtained using wavelet coherence (Grinsted et al., 2004), a bivariate tool for detecting common oscillations between two time series. Together, wavelet power and coherence analyses have proven useful in climate science (Velasco and Mendoza, 2008; Müller et al., 2008), hydrology (Zhang et al., 2007; Özger et al., 2009; Labat, 2008, 2010), atmospheric science (Terradellas et al., 2005; Schimanke et al., 2011), and oceanography (Lee and Lwiza, 2008).
The application of wavelet analysis alone is not sufficient for feature extraction of time series; indeed, random fluctuations can produce large values of spectral power or coherence related to the underlying process (e.g., red noise) and not necessarily the time series. In Fourier analysis, one chooses a suitable noise model and assesses the significance of features relative to some analytically or empirically derived threshold. In climate science, for example, one often compares the sample power spectrum of a time series to that of a theoretical red-noise spectrum (Hasselman, 1976; Torrence and Compo, 1998). Statistical significance testing is also necessary in the wavelet domain. Torrence and Compo (1998) were the first to assess the significance of features in wavelet power spectra using discrete red-noise background spectra. Grinsted et al. (2004), using Monte Carlo methods, extended significance testing to wavelet coherence using surrogate red-noise time series. The (pointwise) significance tests developed by Torrence and Compo (1998) and Grinsted et al. (2004), however, have multiple-testing problems, given the large number of wavelet coefficients being tested simultaneously (Maraun and Kurths, 2004). Suppose, for example, that a pointwise significance test was applied to $M$ wavelet power coefficients at the 5 % significance level. Then, on average, there will be $0.05M$ false positive results, which would make the pointwise test permissive for large $M$. Maraun et al. (2007) addressed these problems by developing an areawise test that sorts through contiguous regions of pointwise significance, called significance patches, based on their area and geometry, minimizing spurious results and, thus, giving researchers more insight into the time series in question. According to the areawise test, the larger the pointwise significance patch, the less likely it was generated from a stochastic fluctuation.

Published by Copernicus Publications on behalf of the European Geosciences Union & the American Geophysical Union.
In this study, significance testing in the wavelet domain is improved through the following: (1) the development of a flexible and computationally efficient geometric test capable of minimizing spurious results from the pointwise test by associating p values to individual patches in wavelet-power and wavelet-coherence spectra; and (2) the application of topological methods that can further distinguish spurious patches from true structures that can reveal information about time series undetected by current methods. Given the deficiencies of pointwise significance testing, there is a need to improve current methods of evaluating significance of features in the wavelet domain. The areawise test, though a substantial improvement from the pointwise test, has one drawback: finding the significance level of the areawise test requires a complicated root-finding algorithm, making p values for the areawise test difficult to obtain, as it would require the repeated application of a root-finding algorithm (see Sect. 4.1 for details).
The remainder of the paper is organized as follows. A brief overview of wavelet analysis is presented in Sect. 2. In Sect. 3, the pointwise and areawise tests are discussed briefly. The development of the geometric test is presented in Sect. 4. In Sect. 5, ideas inspired by persistent homology (Edelsbrunner, 2010) are used to show that holes, voids of pointwise significance surrounded by regions of pointwise significance, can distinguish important structures from trivial structures, linking the geometric and topological tests. Using ideas from Sects. 4 and 5, the application of a local geometric test is presented in Sect. 6. The new methods are applied to time series of two idealized cases, which provide important benchmarks for the methods, and to indices of two prominent climate modes, El Niño-Southern Oscillation and the North Atlantic Oscillation (NAO), to illustrate, in a geophysical setting, the insights afforded by the methods.

Definitions
In wavelet analysis, a time series is decomposed into frequency and time components by convolving the time series with a wavelet function satisfying certain conditions. There are many different kinds of wavelet functions, but the most widely used is the Morlet wavelet, a sine wave damped by a Gaussian envelope, expressed as

$$\psi_0(\eta) = \pi^{-1/4} e^{i\omega_0\eta} e^{-\eta^2/2},$$

where $\psi_0$ is the Morlet wavelet, $\omega_0$ is the dimensionless frequency, and $\eta = t/s$ is dimensionless time, where $s$ is the wavelet scale and $t$ is time (Torrence and Compo, 1998; Grinsted et al., 2004). The wavelet transform of a discrete time series $x_n$ ($n = 1, \ldots, N$) is given by

$$W_n^X(s) = \sqrt{\frac{\delta t}{s}} \sum_{n'=1}^{N} x_{n'}\, \psi_0\!\left[(n'-n)\frac{\delta t}{s}\right],$$

where $\delta t$ is a uniform time step determined from the time series and $|W_n^X(s)|^2$ is the wavelet power of a time series at scale $s$ and time index $n$ (Torrence and Compo, 1998; Grinsted et al., 2004). Note that for the Morlet wavelet with $\omega_0 = 6$ the wavelet scale and the Fourier period $\lambda$ are approximately equal ($\lambda \approx 1.03s$).
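As an illustration of the definitions above, the following minimal sketch evaluates the wavelet power at a single scale by direct summation. It assumes the standard Morlet form $\psi_0(\eta) = \pi^{-1/4} e^{i\omega_0\eta} e^{-\eta^2/2}$ of Torrence and Compo (1998), with the wavelet evaluated at $(n'-n)\delta t/s$ and a $\sqrt{\delta t/s}$ normalization; the function names `morlet` and `wavelet_power` are hypothetical, and practical implementations would normally use an FFT-based transform instead of this $O(N^2)$ loop.

```python
import cmath
import math


def morlet(eta, omega0=6.0):
    """Morlet wavelet: pi**(-1/4) * exp(i*omega0*eta) * exp(-eta**2 / 2)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * eta) * math.exp(-0.5 * eta ** 2)


def wavelet_power(x, scale, dt=1.0, omega0=6.0):
    """Wavelet power |W_n(s)|^2 at one scale for every time index n.

    Direct O(N^2) summation; illustrative only (assumed normalization
    sqrt(dt / scale), as in Torrence and Compo, 1998).
    """
    n_pts = len(x)
    norm = math.sqrt(dt / scale)
    power = []
    for n in range(n_pts):
        # Sum the series against the wavelet centred on time index n.
        w = sum(x[k] * morlet((k - n) * dt / scale, omega0) for k in range(n_pts))
        power.append(abs(norm * w) ** 2)
    return power
```

For a sinusoid with a period of 16 time steps, the power near the middle of the record should be far larger at the scale $s \approx 16/1.03$ than at $s = 4$, consistent with the approximate equality of scale and Fourier period noted above.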
3 Existing significance testing methods

Pointwise significance testing
For climatic time series, the significance of wavelet power can be tested against a theoretical red-noise background (Torrence and Compo, 1998). For a first-order autoregressive (Markov) process $X_n = \alpha X_{n-1} + w_n$ with lag-1 autocorrelation coefficient $\alpha$, Gaussian white noise $w_n$, and $X_0 = 0$, the normalized theoretical red-noise power spectrum is given by

$$P_f = \frac{1 - \alpha^2}{1 + \alpha^2 - 2\alpha\cos(2\pi f/N)}, \qquad (4)$$

where $f = 0, \ldots, N/2$ is the frequency index (Gilman et al., 1963). To obtain, for example, the 5 % pointwise significance level one must multiply Eq. (4) by the 95th percentile of a chi-square distribution with two degrees of freedom and divide the result by 2 to remove the degree of freedom factor (Torrence and Compo, 1998). The discrete Fourier red-noise spectrum has been shown by Torrence and Compo (1998) to be adequate in estimating the significance of local wavelet power and is thus used in this paper to estimate pointwise significance. The parameter $\alpha$ can be estimated using standard methods such as Burg's method and the Yule-Walker method (Kay, 1988; Hayes, 1996).
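The threshold calculation described above can be sketched as follows. The sketch assumes the red-noise spectrum takes the form $P_f = (1-\alpha^2)/(1+\alpha^2-2\alpha\cos(2\pi f/N))$ given by Torrence and Compo (1998), and it uses the fact that a chi-square distribution with two degrees of freedom has the closed-form quantile $-2\ln(1-p)$. The function names are hypothetical.

```python
import math


def red_noise_spectrum(alpha, n):
    """Normalized AR(1) power spectrum P_f for f = 0, ..., n // 2."""
    return [
        (1.0 - alpha ** 2)
        / (1.0 + alpha ** 2 - 2.0 * alpha * math.cos(2.0 * math.pi * f / n))
        for f in range(n // 2 + 1)
    ]


def chi2_2dof_quantile(p):
    """Quantile of a chi-square distribution with 2 degrees of freedom.

    The CDF is 1 - exp(-x / 2), so the quantile is -2 * ln(1 - p).
    """
    return -2.0 * math.log(1.0 - p)


def pointwise_threshold(alpha, n, level=0.05):
    """Red-noise spectrum scaled to the pointwise significance level."""
    # Multiply by the (1 - level) percentile and divide by the 2 degrees of freedom.
    factor = chi2_2dof_quantile(1.0 - level) / 2.0
    return [factor * p for p in red_noise_spectrum(alpha, n)]
```

For white noise ($\alpha = 0$) the spectrum is flat, so the 5 % threshold reduces to $-\ln(0.05) \approx 3.0$ times the variance at every frequency.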
Monthly time series and normalized wavelet power spectra for the NAO index (Hurrell et al., 2003, https://climatedataguide.ucar.edu/climate-data/hurrell-north-atlantic-oscillation-nao-index-station-based) and the Niño 3.4 index (Trenberth, 1997; http://www.cgd.ucar.edu/cas/catalog/climind/Nino_3_3.4_indices.html) are shown in Figs. 1 and 2. The Niño 3.4 index data were converted to anomalies by subtracting the mean monthly values for each month from the monthly values. Note that the normalized wavelet power is the wavelet power at every time and period divided by the variance of the time series, which allows different wavelet power spectra to be readily compared. Another important feature of the wavelet power spectrum is the cone of influence, the region in which edge effects become important or, more precisely, the e-folding time of the autocorrelation for wavelet power at each scale, where the e-folding time is defined by Torrence and Compo (1998) as the point at which the wavelet power for a discontinuity at the edge drops by a factor of $e^{-2}$. The wavelet power spectrum of the NAO index reveals numerous time periods of enhanced variance at an array of timescales, though no preferred timescale is evident. For the Niño 3.4 index, the wavelet power spectrum detects statistically significant variance in the 16-64 month period band for the period 1960-2010. Another interesting feature emerges (labeled "H" in Fig. 2b): regions of no pointwise significance surrounded by regions of pointwise significance. These "holes" will turn out to be important structures in wavelet power spectra and are discussed thoroughly in Sect. 5.
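The two preprocessing steps mentioned above, conversion to monthly anomalies and normalization of wavelet power by the variance of the time series, can be sketched as follows. The sketch assumes a monthly series that starts in January and uses the population variance; both choices, and the function names, are illustrative assumptions.

```python
def monthly_anomalies(values):
    """Subtract the long-term mean of each calendar month.

    Assumes one value per month with the first value in January.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(values):
        sums[i % 12] += v
        counts[i % 12] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [v - means[i % 12] for i, v in enumerate(values)]


def normalize_power(power, series):
    """Divide a scale-by-time array of wavelet power by the series variance."""
    mean = sum(series) / len(series)
    variance = sum((v - mean) ** 2 for v in series) / len(series)
    return [[p / variance for p in row] for row in power]
```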

Areawise significance testing
The idea behind the Maraun et al. (2007) areawise test (hereafter simply the "areawise test") is that correlations between adjacent wavelet coefficients arising from the reproducing kernel (see Appendix A) produce contiguous regions of pointwise significance that resemble the reproducing kernel. The reproducing kernel for a given analyzing wavelet represents the time-scale uncertainty, which is related to the scale and time localization properties of the analyzing wavelet. Let $(t, s)$ denote the location of a wavelet coefficient at scale $s$ and time $t$. The correlation, $C(t, s, t', s')$, between any two wavelet coefficients located at $(t, s)$ and $(t', s')$ obtained from the wavelet transformation of a Gaussian white process is given by the reproducing kernel moved to $t$ and stretched to $s$ (Maraun et al., 2007; Maraun and Kurths, 2004). Thus, for significance patches generated from random fluctuations, the typical patch area is the area of the reproducing kernel. The test can be described more formally as follows: let $P_{\mathrm{pw}}$ be the set of all pointwise significant values and define a critical area $P_{\mathrm{crit}}(t, s)$ as the subset of the time-scale domain for which the reproducing kernel $K$ (corresponding to the analyzing wavelet), dilated and translated to time $t$ and scale $s$, exceeds the threshold of a critical level $K_{\mathrm{crit}}$. Mathematically, $P_{\mathrm{crit}}(t, s)$ is given by

$$P_{\mathrm{crit}}(t, s) = \left\{(t', s') : K(t, s; t', s') > K_{\mathrm{crit}}\right\}.$$

It is noted that the critical area of the areawise test is not the area of significance patches but the area of the reproducing kernel at some critical level and at some scale. For a patch of pointwise significant values, a point inside the patch is said to be areawise significant if the reproducing kernel dilated according to the scale in question entirely fits into the patch, i.e.,

$$P_{\mathrm{aw}} = \bigcup_{P_{\mathrm{crit}}(t, s) \subset P_{\mathrm{pw}}} P_{\mathrm{crit}}(t, s),$$

where $P_{\mathrm{aw}}$ is the subset of pointwise significant values consisting of additionally areawise significant wavelet power coefficients. According to the areawise test, entire significance patches need not be areawise significant, just portions or subsets of them. That is, only those points covered by a critical area that fits entirely inside the patch are deemed areawise significant. The critical area is related to the significance level of the areawise test by the following equation:

$$1 - \alpha_{\mathrm{aw}} = 1 - \left\langle \frac{A_{\mathrm{aw}}}{A_{\mathrm{pw}}} \right\rangle,$$

where $1 - \alpha_{\mathrm{aw}}$ is the significance level of the areawise test, $A_{\mathrm{aw}}$ is the area of the areawise significance patch, $A_{\mathrm{pw}}$ is the area of the pointwise significance patch, and $\langle A_{\mathrm{aw}}/A_{\mathrm{pw}} \rangle$ is the average ratio between the areas of areawise significant patches and pointwise significance patches. It turns out that the calculation of $\alpha_{\mathrm{aw}}$ is nontrivial, involving a root-finding algorithm that solves the equation $f(P_{\mathrm{crit}}) - \alpha_{\mathrm{aw}} = 0$ (see Sect. 4).
To illustrate the importance of the areawise significance test, the test was applied to the wavelet power spectra of the NAO and Niño 3.4 index time series (Figs. 3, 4). Numerous 5 % pointwise significance patches in the Niño 3.4 wavelet power spectrum were found to contain areawise significant subsets, suggesting that these patches were less likely to be an artifact of multiple testing. For example, as indicated by the thick red contours, there are three areawise significant regions located at a period of approximately 48 months, one at 1890, one at 1905, and a third one at 1985. Many more areawise significant regions were identified at periods of less than 8 months, especially before 1955. The wavelet power spectrum of the NAO index also contained pointwise significance patches with areawise significant subsets, all at periods of less than 8 months. However, it will be shown in Sect. 4 that they all may be artifacts of multiple testing, resulting from the large number of patches to which the areawise test was applied.

Development
A disadvantage of the areawise test is the complexity of the $\alpha_{\mathrm{aw}}$ calculation, which involves a root-finding algorithm. It is therefore desirable to construct an alternative test whose significance level is easy to calculate, readily allowing the following: (1) the application of the test to patches at various pointwise significance levels, (2) the adjustments of the significance level of the test, (3) the application of the test to wavelet power spectra obtained using other analyzing wavelets, and (4) the implementation of p value adjustment procedures to control the family-wise error rates and false discovery rates.
The development of a geometric significance test will require ideas from basic geometry and set theory. In wavelet analysis, the wavelet power is computed at a discrete set of time coordinates $T$ with elements $t_i$ for $i = 1, \ldots, N$ and at a discrete set of scales $S$ whose elements $s_j$ ($j = 1, \ldots, J$) are given by

$$s_j = s_{\min}\, 2^{j\,\delta j} \qquad (9)$$

and

$$J = \delta j^{-1} \log_2\!\left(\frac{N\,\delta t}{s_{\min}}\right),$$

with $\delta t$ a time step and $s_{\min}$ the smallest resolvable scale (Torrence and Compo, 1998). Note that the maximum value of $\delta j$ for which adequate sampling can be achieved depends on the wavelet function, being approximately equal to 0.5 for the Morlet wavelet. For the geometric test, a patch will be considered to be a polygon with vertices $v_k = (x_k, y_k)$ for $k = 0, \ldots, m - 1$, where $x_k$ and $y_k$ are, respectively, elements from $T$ and $S$ and $m$ is the number of vertices. It is worth noting that not all patches are closed in the sense that some are located near the edges of the wavelet domain. To remedy this problem, semi-enclosed patches are artificially closed by connecting the two vertices located on the boundary of the wavelet domain with a line segment.

[Figure caption fragment, beginning lost: … thick red contours are the 5 % areawise significant subsets (see Sect. 3.2). Light gray shading indicates those 5 % pointwise significance patches that are geometrically significant at the q = 0.05 level and dark gray shading indicates those 1 % pointwise significance patches that are geometrically significant at the q = 0.05 level.]
Perhaps the most fundamental property of a pointwise significance patch is its area, which can be calculated using the following special case of Green's theorem:

$$A = \frac{1}{2} \sum_{k=0}^{m-1} \left(x_k y_{k+1} - x_{k+1} y_k\right),$$

where $y_m = y_0$ and $x_m = x_0$ and the vertices are ordered counterclockwise (Worboys and Duckham, 2004; Appendix C). For significance patches containing holes, the total area of the holes is subtracted from the area the significance patch would have if it did not contain the holes.
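The area calculation can be illustrated with a short sketch of the shoelace formula, the discrete form of Green's theorem for a polygon. The function names are hypothetical, and the absolute value is taken so that the result does not depend on whether the vertices are listed clockwise or counterclockwise.

```python
def signed_polygon_area(vertices):
    """Signed shoelace area; positive for counterclockwise vertex order."""
    m = len(vertices)
    total = 0.0
    for k in range(m):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % m]  # wrap around to close the polygon
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def patch_area(outer, holes=()):
    """Area of a patch: outer polygon area minus the total area of its holes."""
    return abs(signed_polygon_area(outer)) - sum(
        abs(signed_polygon_area(hole)) for hole in holes
    )
```

For example, a 4 by 4 square patch containing a 1 by 1 hole has an area of 15.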
What will be of particular interest is the normalized area of a significance patch, not its absolute area. To compute the normalized area, the centroid of a significance patch will need to be calculated using the following formulas (Worboys and Duckham, 2004):

$$C_t = \frac{1}{6A} \sum_{k=0}^{m-1} \left(x_k + x_{k+1}\right)\left(x_k y_{k+1} - x_{k+1} y_k\right)$$

and

$$C_s = \frac{1}{6A} \sum_{k=0}^{m-1} \left(y_k + y_{k+1}\right)\left(x_k y_{k+1} - x_{k+1} y_k\right),$$

where $C_t$ and $C_s$ are the time and scale coordinates, respectively, of the centroid and $A$ is the signed polygon area. Recall that the centroid is the area-weighted location of a polygon. If $A_R$ is the area of the reproducing kernel dilated or contracted (at a certain critical level) to $(C_t, C_s)$, then the normalized area of a significance patch is given by

$$A_n = \frac{A}{A_R}$$

and allows one to compare sizes of significance patches across all scales simultaneously. Two idealized pointwise significance patches with equal normalized area are shown in Fig. 5a and b.
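A minimal sketch of the centroid and normalized-area calculation follows. It uses the standard polygon centroid formulas and treats the area of the reproducing kernel as a user-supplied function `kernel_area(c_t, c_s)`; that callable, like the other names here, is a hypothetical placeholder, since the kernel area depends on the analyzing wavelet and the chosen critical level.

```python
def signed_area(vertices):
    """Signed shoelace area of a polygon (positive if counterclockwise)."""
    m = len(vertices)
    return 0.5 * sum(
        vertices[k][0] * vertices[(k + 1) % m][1]
        - vertices[(k + 1) % m][0] * vertices[k][1]
        for k in range(m)
    )


def centroid(vertices):
    """Area-weighted centroid (C_t, C_s) of a polygon without holes."""
    m = len(vertices)
    a = signed_area(vertices)
    c_t = 0.0
    c_s = 0.0
    for k in range(m):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % m]
        cross = x0 * y1 - x1 * y0
        c_t += (x0 + x1) * cross
        c_s += (y0 + y1) * cross
    return c_t / (6.0 * a), c_s / (6.0 * a)


def normalized_area(vertices, kernel_area):
    """Patch area divided by the kernel area evaluated at the patch centroid."""
    c_t, c_s = centroid(vertices)
    return abs(signed_area(vertices)) / kernel_area(c_t, c_s)
```

As a usage example, `normalized_area(patch, lambda c_t, c_s: 0.5 * c_s ** 2)` would apply a purely illustrative kernel area that grows with the square of the scale.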
The idea of the geometric significance test is to generate a null distribution of $A_n$ and use the null distribution to compute the significance of patches in the wavelet domain. In climate science, a suitable null hypothesis is red noise so that $A_n$ will be computed for a large ensemble of patches generated from red-noise processes. Using the null distribution of $A_n$, one can assign to each patch in the wavelet domain a p value, i.e., the probability that a patch with a normalized area at least as large would be generated by a random stochastic fluctuation. It is noted that the null distribution of $A_n$ depends on the choice of null hypothesis (not shown), with, for red-noise processes, $A_n$ increasing with increasing lag-1 autocorrelation coefficients.
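Once a null sample of normalized areas is available, the assignment of a p value to a patch can be sketched as below. The sketch assumes the null sample has already been obtained from a Monte Carlo ensemble of red-noise wavelet power spectra (that step is not shown), and it takes the p value to be the fraction of null normalized areas at least as large as the observed one; the function name is hypothetical.

```python
import bisect


def geometric_p_value(observed_area, null_areas):
    """Fraction of null normalized areas that are >= the observed normalized area."""
    ordered = sorted(null_areas)
    # Index of the first null value that is >= observed_area.
    first = bisect.bisect_left(ordered, observed_area)
    return (len(ordered) - first) / len(ordered)
```

Because the whole null sample is retained, p values at any desired geometric significance level can be read from the same ensemble.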
The calculation of the geometric significance level $1 - \alpha_g$, unlike the calculation of $1 - \alpha_{\mathrm{aw}}$, is straightforward: for the areawise test one needs to compute $\alpha_{\mathrm{aw}}$ as a function of $P_{\mathrm{crit}}$, whereas for the geometric test $\alpha_g$ is no longer a function of $P_{\mathrm{crit}}$. Moreover, the estimation of $P_{\mathrm{crit}}$ involves a root-finding algorithm that solves the equation $f(P_{\mathrm{crit}}) - \alpha_{\mathrm{aw}} = 0$, where $f(P_{\mathrm{crit}})$ is estimated using Monte Carlo simulations. Thus, the application of the areawise test to pointwise significance patches for $M$ different values of $\alpha_{\mathrm{aw}}$ would require $M$ Monte Carlo ensembles, making p values for the test difficult to obtain. For the geometric test, only a single Monte Carlo ensemble is needed, as a single choice of $P_{\mathrm{crit}}$ is needed to generate a null distribution, from which any desired value of $\alpha_g$ can be obtained. In fact, while the choice of $P_{\mathrm{crit}}$ impacts the mean value of the null distribution, the geometric significance of a significance patch is left unchanged, as the significance is relative to a distribution of χ under some noise model (Appendix B).
The elimination of the $P_{\mathrm{crit}}$ dependence from the calculation of the geometric significance level allows the geometric test to be readily performed on patches of various pointwise significance levels. For the areawise test, a new $P_{\mathrm{crit}}$ must be estimated for each pointwise significance level since $A_{\mathrm{pw}}$, on average, will change depending on whether the pointwise significance level $1 - \alpha_p$ is increased (patches shrink) or decreased (patches grow). For the geometric test, there is no need to find a new $P_{\mathrm{crit}}$; one simply computes a new null distribution based solely on the information of the pointwise significance patches at some pointwise significance level $1 - \alpha_p$.
Another advantage of eliminating the $P_{\mathrm{crit}}$ dependence is that the geometric test can be readily applied to wavelet coherence, partial wavelet coherence (Ng, 2012), multiple wavelet coherence, and cross-wavelet spectra. The application of the geometric test to significance patches in the aforementioned wavelet spectra only requires a single Monte Carlo ensemble to generate a null distribution, eliminating the calculation of a new $P_{\mathrm{crit}}$ for each wavelet spectrum and for each value of $\alpha_g$. For the areawise test, a new $P_{\mathrm{crit}}$ must be estimated for each value of $\alpha_{\mathrm{aw}}$ and for each wavelet spectrum, making the areawise test difficult to implement in practical applications.
It may happen that a pointwise significance patch is so large that individual oscillations embedded in the patch cannot be detected by the geometric test.However, there are two solutions to this localization problem: the first solution is to increase the significance level of the pointwise test, allowing large patches to separate, and then perform the geometric test on the smaller patches.The second solution is to examine other properties of significance patches that may indicate the presence of multiple periodicities that form large significance patches from the merging of several smaller patches.The second solution will be addressed thoroughly in Sect. 5.
Another situation that may arise in practice is the application of the geometric test to patches located both inside and outside the cone of influence (COI).In the case of the pointwise significance test, the edge effects only influence those wavelet power coefficients that lie inside the COI; however, for the geometric test, the significance of the entire patch will be impacted even if the patch only partially lies inside the COI.The reason is that the COI will act to decrease the size of significance patches through the reduction of wavelet power in the COI and subsequently the total area of the patch.One should thus be cautious when interpreting the results of the geometric test for patches near the COI.

Multiple testing
If the geometric test were performed on $K$ significance patches at the $\alpha_{\mathrm{geo}}$ level, then, on average, one can expect $\alpha_{\mathrm{geo}} K$ false positive results, which would make the geometric test permissive for large $K$. It is therefore necessary to reduce the number of false positive results. There are various ways to reduce the number of false positives, including the Walker test, Bonferroni correction, and other counting procedures (Wilks, 2006). Recently, methods for controlling the false discovery rate (FDR) have been developed, where the FDR is the expected proportion of rejected local null hypotheses that are actually true (Benjamini and Hochberg, 1995). In particular, Benjamini and Hochberg (1995) developed a method for controlling the FDR based on the number of local hypotheses being tested and the degree to which the local hypotheses were rejected, contrasting with other procedures that ignore the confidence with which the local tests reject the local hypotheses (Wilks, 2006). Moreover, the method has proven to have high statistical power, especially when only a small fraction of the $K$ local tests correspond to false null hypotheses (Wilks, 2006). The procedure will therefore be used to control the false discovery rate of the geometric test, which will facilitate the interpretation of results.
Suppose that $K$ local hypotheses were tested, where, in the present case, the local hypotheses refer to the testing of each patch individually under the assumption that the results of the individual tests are independent. A global geometric test can be performed at the $\alpha_{\mathrm{global}}$ level as follows: let $p_{(l)}$ denote the $l$th smallest of $K$ local p values; then, under the assumption that the $K$ local tests are independent, the FDR can be controlled at the $q$ level by rejecting those local tests for which $p_{(l)}$ is no greater than

$$p_{\mathrm{FDR}} = \max_{l=1,\ldots,K}\left[p_{(l)} : p_{(l)} \le q\,(l/K)\right] = \max_{l=1,\ldots,K}\left[p_{(l)} : p_{(l)} \le \alpha_{\mathrm{global}}\,(l/K)\right], \qquad (16)$$

so that the FDR level is equivalent to the global test level. According to the procedure, any local test resulting in a p value less than or equal to the largest p value for which Eq. (16) is satisfied is deemed significant. If no such local p values exist, then none are deemed significant and, therefore, the global test hypothesis cannot be rejected. The global geometric test will thus deem significant only those significance patches whose p values satisfy Eq. (16). Throughout the paper $q = \alpha_{\mathrm{global}}$ will be set to 0.05.
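The FDR procedure of Benjamini and Hochberg (1995) described above can be sketched in a few lines. The sketch assumes a list holding one p value per patch; the function names are hypothetical.

```python
def fdr_threshold(p_values, q=0.05):
    """Largest sorted p value p_(l) with p_(l) <= q * l / K, or None if none exists."""
    k = len(p_values)
    ordered = sorted(p_values)
    passing = [p for l, p in enumerate(ordered, start=1) if p <= q * l / k]
    return max(passing) if passing else None


def fdr_significant(p_values, q=0.05):
    """Flag every local test whose p value does not exceed the FDR threshold."""
    threshold = fdr_threshold(p_values, q)
    return [threshold is not None and p <= threshold for p in p_values]
```

With ten p values of 0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, and 0.216 and $q = 0.05$, only the two smallest are retained, whereas a test of each patch at the 0.05 level without FDR control would have retained five.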

Comparisons with the areawise test
With a formal geometric significance test now developed, it is useful to compare the areawise and geometric significance tests, where comparisons will be made using an empirically derived quantity. Let $N_{\mathrm{sig}}$ be the number of pointwise significance patches in a given wavelet power spectrum, $N_a$ the number of patches containing an areawise significant region, $N_g$ the number of geometrically significant patches, and $N_{ag}$ the number of patches that are both geometrically significant and that contain areawise significant regions. A quantity $I_{\mathrm{sim}}$ defined from these counts then measures the similarity between the two tests. The interpretation of $I_{\mathrm{sim}}$ is as follows: if $I_{\mathrm{sim}} = 1$ then all patches containing areawise significant regions are also geometrically significant and all patches which do not contain areawise significant regions are also not geometrically significant. On the other hand, for values of $I_{\mathrm{sim}} < 1$ some patches containing areawise significant regions may not be geometrically significant, with the converse also being true.
To better compare the similarity between the two tests, distributions of $I_{\mathrm{sim}}$ were constructed by generating 1000 synthetic wavelet power spectra of red-noise processes with fixed autocorrelation coefficients and length $N = 1000$ (arbitrary units) and computing $I_{\mathrm{sim}}$ for each of the synthetic wavelet power spectra. The experiment was performed for red-noise processes with different lag-1 autocorrelation coefficients to determine if $I_{\mathrm{sim}}$ depends on the AR1 model. The results are shown in Fig. 6a. With a mean value of 0.90, a strong agreement was found between the areawise and geometric tests, differences arising from the fact that the areawise test is a local test, finding significant regions within patches, whereas the geometric test assigns a significance value to entire patches (see discussion below). Since $I_{\mathrm{sim}}$ was often less than 1.0, some patches containing areawise significant regions were not found to be geometrically significant and, conversely, some patches were geometrically significant without containing areawise significant regions.

[Figure caption fragment: (c) Same as (a) but for the mean convexity of 5 % pointwise significance patches that are geometrically significant at the 5 % level and for the mean convexity of 5 % pointwise significance patches that are areawise significant at the 5 % level.]
The quantity $r_{\mathrm{neg}} = N_g/N_a$, which measures the ratio of false positive results between the two tests, was also computed for the case when both the geometric and areawise test levels were set to 0.05 (Fig. 6b). In this case, the mean value of $r_{\mathrm{neg}}$ was found to range from 1.0 to 2.0 and the median value was found to be generally greater than 1.0, ranging from 1.0 to 1.8. No dependence on the lag-1 autocorrelation coefficients was identified. The results indicate that the geometric test is generally less conservative than the areawise test for a given wavelet power spectrum. The lack of conservativeness, however, can be remedied by controlling the FDR of the geometric test at the $q = 0.05$ level. Figure 6b shows $r_{\mathrm{adj}}$, the ratio of false positive results between the areawise test and the geometric test but with FDR controlled for the geometric test. As indicated in Fig. 6b, by controlling the FDR the geometric test is much more conservative than the areawise test, resulting in fewer false positive results, with a typical value of $r_{\mathrm{adj}}$ ranging from 0.02 to 0.05.
To explain the differences between the areawise and geometric tests, it will be necessary to consider the convexity of a patch, the degree to which a polygon or point set lacks concavities. The reason for considering convexity is illustrated by considering the two significance patches shown in Fig. 5, which have equal values of $A_n$ but different geometries: one is convex (i.e., has no concavities, Fig. 5a) and the other is not convex (Fig. 5b). Suppose that the areawise test was performed on the two patches at the $\alpha_{\mathrm{aw}}$ level. For the convex patch shown in Fig. 5a, the reproducing kernel is capable of fitting entirely inside the patch but is unable to fit inside the nonconvex patch as a result of the concavity. Thus, although having equal area, the two patches differ in their areawise significance, where the difference in significance is related to their geometry. Consequently, $p_{\mathrm{aw}} = g(C, A; H_0)$ for some function $g$, where $p_{\mathrm{aw}}$ is the areawise test p value associated with a patch calculated under the null hypothesis $H_0$ and $C$ is the convexity of the patch, which is now formally defined.
Rigorously, convexity is defined as follows: let $x$ and $y$ be any two points in a set $Z$; then the set $Z$ is convex if for all $t \in [0, 1]$ the point

$$(1 - t)\,x + t\,y$$

on the line segment joining $x$ and $y$ is in $Z$ (Ziegler, 1995). Equivalently, a set is convex if it contains every line segment joining any pair of its points. Under this definition, for example, patches with thin bridges as described by Maraun et al. (2007) are not convex.
To quantify convexity, another idea from set theory, the convex hull, will be needed, which for a point set $Z$ is defined as the intersection of all convex sets containing $Z$ (Ziegler, 1995). In other words, it is the smallest convex set containing $Z$ constructed from the intersection of all convex sets containing $Z$. Mathematically, the convex hull of a point set $Z$ is expressed as

$$\mathrm{conv}(Z) = \bigcap \left\{ Z' : Z \subseteq Z',\ Z' \text{ convex} \right\}.$$

In practical applications, the convex hull of a set can be easily computed using existing algorithms (Barber et al., 1996). It is noted that all holes are ignored in the computation of the convex hull because the computation of the convex hull assumes that there are no holes in the polygon. A patch containing a hole can never have a smallest convex set containing the set because holes allow line segments to leave the patch regardless of the size of the convex hull.
A metric for convexity will now be defined using the area of a significance patch together with the area of its convex hull as follows: if $A_k$ is the area of the convex hull of a significance patch whose area is $A$, then the convexity is

$$C = \frac{A}{A_k},$$

where $0 \le C \le 1$. High values of $C$ correspond to significance patches with relatively small concavities, whereas small values of $C$ correspond to patches with relatively large concavities, as in the case of significance patches with thin bridges.
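The convexity metric can be sketched as the ratio of the patch area to the area of its convex hull. For brevity the hull is computed here with Andrew's monotone chain algorithm rather than the Quickhull algorithm of Barber et al. (1996); for a planar point set both yield the same hull. The function names are hypothetical, and holes are ignored, as in the text.

```python
def shoelace_area(vertices):
    """Unsigned polygon area from the shoelace formula."""
    m = len(vertices)
    return 0.5 * abs(sum(
        vertices[k][0] * vertices[(k + 1) % m][1]
        - vertices[(k + 1) % m][0] * vertices[k][1]
        for k in range(m)
    ))


def convex_hull(points):
    """Convex hull vertices in counterclockwise order (Andrew's monotone chain)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def convexity(vertices):
    """Ratio of the patch area to the area of its convex hull (0 < C <= 1)."""
    return shoelace_area(vertices) / shoelace_area(convex_hull(vertices))
```

A square patch gives $C = 1$, whereas an L-shaped patch formed by removing one quadrant from a square gives $C = 6/7 \approx 0.86$.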
According to the areawise test, patches with smaller values of $C$ are less likely to be areawise significant so that it is expected that patches deemed significant by the areawise test will be primarily convex. To test this hypothesis, 10 000 patches arising from red-noise processes with different lag-1 autocorrelation coefficients were generated and the convexity of those patches deemed areawise significant at the $\alpha_{\mathrm{aw}} = 0.05$ level was calculated. The results in Fig. 6c show the mean convexity as a function of the lag-1 autocorrelation coefficients, together with the 95 % confidence bound. The mean convexity of the patches was found to be approximately 0.8, regardless of the lag-1 autocorrelation coefficient. An identical experiment was also performed for geometrically significant patches, with the convexity being computed for patches that are geometrically significant at the $\alpha_{\mathrm{geo}} = 0.05$ level. In contrast to areawise significant patches, patches that were found to be geometrically significant had, on average, lower convexity, the reason for which is that the calculation of $\alpha_{\mathrm{geo}}$ makes no assumption about convexity. The p value for the geometric test is thus $p_{\mathrm{geo}} = f(A; H_0)$ for some function $f$, contrasting with $p_{\mathrm{aw}}$, which depends on convexity. The results of the experiments are consistent with Fig. 5a and b, where both the ideal patches have the same geometric significance but the ideal patch in Fig. 5b has a larger $p_{\mathrm{aw}}$ so that $p_{\mathrm{aw}} > p_{\mathrm{geo}}$.
Convexity cannot fully explain the differences between $p_{\mathrm{aw}}$ and $p_{\mathrm{geo}}$ for a given patch. More generally, $p_{\mathrm{aw}} = g(C, A, S_1, \ldots, S_R; H_0)$, where $S_1, \ldots, S_R$ are shape parameters of the patch, such as aspect ratio and symmetry. Consider, for example, a convex patch whose length in the time direction is long with respect to the reproducing kernel (at some critical level) but thin in the scale direction with respect to the reproducing kernel. Such a patch would be deemed insignificant by the areawise test, though it may have an area much larger than the critical area of the areawise test. Asymmetry with respect to the scale axis, as another example, may also result in a patch being deemed insignificant by the areawise test if, for example, the width of the patch in the scale direction decreases with time. If the normalized areas of such patches are larger than the critical level of the geometric test, the patches will be geometrically significant, though they may not be areawise significant if the reproducing kernel is unable to fit inside the narrow portion of the patch. The above arguments suggest that $f(A; H_0) \ne g(C, A, S_1, \ldots, S_R; H_0)$ in general, and thus the significance of patches as determined by the geometric and areawise tests need not be equal.

Geometric significance testing of climatic data
For climatic time series, significance is often tested against a red-noise background and therefore it is reasonable to expect that the areawise and geometric tests behave similarly when applied to climatic time series. As such, the areawise and geometric tests were applied to the NAO and Niño 3.4 time series. For the wavelet power spectrum of the NAO index time series (see Fig. 3), not a single patch was found to be geometrically significant after controlling the FDR at the 0.05 level, suggesting the NAO index time series is composed of stochastic fluctuations. In fact, the NAO has already been shown to be consistent with a first-order Markov process (Feldstein, 2000). Recent work by Hanna et al. (2014) claimed that the NAO variability has increased over the past 30 years; however, the results from this analysis suggest that such changes cannot be distinguished from stochastic fluctuations, which could render difficult projections of future changes of the NAO.
The wavelet power spectrum of the Niño 3.4 index (see Fig. 4) was found to contain numerous geometrically significant patches in the period band of 16-64 months, especially after 1960. The 5 % pointwise significance patch extending from 1980 to 2000, as an example, was found to be significant, as well as the patch centered at 2008. The significance patch centered at 1985 and at a period of 32 months, however, is so large that individual oscillations could not be identified. To remedy the problem, the geometric significance test was applied to 1 % ($\alpha_p = 0.01$) pointwise significance patches with $q = 0.05$, resulting in 1 % pointwise significance patches at 1970, 1995, and 2007 being deemed significant, all of which also contained areawise significant regions. Patches located at a period of less than 8 months were also found to be geometrically significant, though only before 1955.

Topological significance testing of ideal time series
Topology is a branch of mathematics concerned with properties of spaces that remain unchanged after continuous deformations. So far, only geometric aspects of significance patches have been discussed. The area of a significance patch, as an example, is a geometric property in the sense that stretching the patch in both the scale and time directions would increase its area. There are properties, however, that would be unaffected by stretching the significance patch. As a motivating example, consider the significance patches shown in Fig. 4 corresponding to the wavelet power spectrum of the Niño 3.4 index (see Fig. 2), where there is a hole or void of pointwise significance located within a significance patch at 1985. This feature is topological, as the hole would remain under a continuous deformation such as stretching. A more formal definition of a hole will require some notions from topology. Let $I = [0, 1]$ be the closed unit interval. Then a path from a point $a$ to a point $b$ in a significance patch $P$ is a continuous function $f : I \to P$ with $f(0) = a$ and $f(1) = b$, where in the case that $f(0) = f(1) = c$ the path is said to be closed (Hatcher, 2001). Note that a point is a special kind of closed path called the constant path. A patch will be said to contain a hole if there exists a closed path in the significance patch that cannot be continuously deformed into a point, where the feature obstructing the path from such a deformation is a hole. The definition is consistent with notions of simple connectedness in topology (Hatcher, 2001). Figure 4 shows an example of a closed path (blue curve) in a patch that cannot be contracted to a point because it surrounds a hole located in the patch.
Table 1. Fraction of pointwise significance patches containing at least $N_h$ holes as a function of the pointwise significance level calculated from an ensemble of 200 000 significance patches generated from red-noise processes with fixed autocorrelation coefficients equal to 0.5.

For a patch with a hole there will be two boundaries: an external boundary and an internal boundary representing the boundary between the hole and the patch. Thus, if a patch contains an internal boundary or contour it will contain a hole, whereas a patch without a hole will contain no internal contours. In practical applications, the existence of a hole can be determined by orienting external contours in the clockwise direction and internal contours in the counterclockwise direction, a procedure automatically implemented by the Matlab contour routine. The number of counterclockwise oriented contours is thus the number of holes in the wavelet power spectrum at a given pointwise significance level.
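The hole count described above can also be sketched without contour orientation. The example below swaps in a different technique, connected-component labeling of the insignificant background with `scipy.ndimage.label`, and assumes the pointwise test result is available as a boolean mask on the scale-time grid; the function name `count_holes` and the toy mask are hypothetical.

```python
import numpy as np
from scipy import ndimage


def count_holes(mask):
    """Number of holes in a boolean mask of pointwise-significant coefficients."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with insignificant cells so that all background reaching the edge
    # of the grid merges into a single exterior region.
    background = np.pad(~mask, 1, constant_values=True)
    # Default structure: 4-connected background regions.
    _, n_regions = ndimage.label(background)
    # One region is the exterior; every other region is enclosed by a patch.
    return n_regions - 1


# Toy mask: one rectangular patch containing two separate holes.
patch = np.zeros((7, 9), dtype=bool)
patch[1:6, 1:8] = True
patch[3, 2] = False        # first hole
patch[2:5, 5:7] = False    # second hole

n_holes = count_holes(patch)
```

Labeling the background with 4-connectivity pairs naturally with patches that are treated as 8-connected; whether that matches the connectivity implied by a given contouring routine is an assumption worth checking.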
To begin the topological analysis, the topology of time series with known structures will be analyzed. Given the importance of red-noise processes in the spectral analysis of climatic time series, the topology of patches generated from red-noise processes is first considered to determine if pointwise significance patches can be distinguished from those generated from red-noise processes solely based on their topology. To answer this question, 10 000 wavelet power spectra of red-noise processes were generated and the number of holes (denoted by $N_h$ hereafter) at a finite set of pointwise significance levels was computed for each wavelet power spectrum (Fig. 7). It was found that $N_h$ is not a random function of the pointwise significance level, as indicated by the 95 % confidence bounds. Most importantly, for pointwise significance levels of less than 10 %, few patches contained holes, suggesting that holes are an uncommon feature of significance patches generated from red-noise processes (Table 1) and therefore can be used to distinguish spurious patches from important structures. It is also noted that neither the shape nor the amplitude of the curve in Fig. 7 depends on the lag-1 autocorrelation coefficient of the red-noise process. Table 1 also suggests that patches containing more than a single hole are unlikely to be the result of red noise, even for a modest pointwise significance level of 20 %. For pointwise significance levels of 1 and 5 %, no more than a single hole was identified in a given patch.
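As a rough illustration of the surrogate data behind such an ensemble, the sketch below generates one red-noise realization. It assumes red noise is modeled as a first-order autoregressive, AR(1), process with Gaussian innovations; the function name `red_noise`, the series length, and the seed are illustrative choices, not values from the experiments.

```python
import numpy as np


def red_noise(n, lag1, rng):
    """AR(1) series x[t] = lag1 * x[t-1] + e[t] with unit-variance Gaussian innovations."""
    x = np.empty(n)
    # Draw the first value from the stationary distribution to avoid spin-up.
    x[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - lag1**2))
    e = rng.normal(size=n)
    for t in range(1, n):
        x[t] = lag1 * x[t - 1] + e[t]
    return x


rng = np.random.default_rng(0)
series = red_noise(20000, 0.5, rng)

# Sample lag-1 autocorrelation, which should be close to the prescribed 0.5.
lag1_estimate = np.corrcoef(series[:-1], series[1:])[0, 1]
```

Each such realization would then be wavelet transformed and tested pointwise before its holes are counted; those steps are omitted here.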
A simple algorithm for assessing the significance of holes is therefore developed. To find the significance of holes, plot the centroids of holes at a finite set of pointwise significance levels and project the centroids onto the wavelet domain, resulting in a topological wavelet diagram. The number of holes contained in a patch should also be computed, as patches with more holes are less likely to result from red noise. In accordance with Fig. 7 and Table 1, regions in the wavelet domain where holes exist below the 20 % pointwise significance level will be considered regions with significant topological features.
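The steps above can be sketched as follows. The example assumes that the pointwise test yields a field of p values on the scale-time grid, and it locates holes by labeling enclosed background regions rather than by contour orientation; the function name `hole_centroids`, the default levels, and the toy p value field are all hypothetical.

```python
import numpy as np
from scipy import ndimage


def hole_centroids(p_values, levels=(0.01, 0.05, 0.10, 0.15, 0.20)):
    """Map each pointwise significance level to the (scale, time) index centroids of its holes."""
    p_values = np.asarray(p_values, dtype=float)
    diagram = {}
    for level in levels:
        significant = p_values <= level
        # Pad so that background connected to the grid edge forms one exterior region.
        background = np.pad(~significant, 1, constant_values=True)
        labels, n_regions = ndimage.label(background)
        exterior = labels[0, 0]
        hole_ids = [i for i in range(1, n_regions + 1) if i != exterior]
        if hole_ids:
            centroids = ndimage.center_of_mass(
                background.astype(float), labels, hole_ids
            )
        else:
            centroids = []
        # Undo the one-cell padding offset.
        diagram[level] = [(r - 1.0, c - 1.0) for r, c in centroids]
    return diagram


# Toy field: a patch significant at the 5 % level with a deficit at its center.
p = np.full((9, 9), 0.5)
p[2:7, 2:7] = 0.03
p[4, 4] = 0.08

diagram = hole_centroids(p, levels=(0.01, 0.05, 0.10))
```

In this toy field the hole exists only at the 5 % level: nothing is significant at 1 %, and at 10 % the central deficit is itself significant, so the hole closes. Plotting the centroids for each level over the wavelet domain would give the diagram.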
With a method for assessing the significance of holes, it is reasonable to analyze different ideal time series, both linear and nonlinear, to determine what types of time series produce holes in significance patches. Perhaps the simplest case is a single sinusoid with additive white noise (not shown), where the time series power spectrum is tested against a white-noise background spectrum. In this case, no evidence was found that a single sine wave, regardless of amplitude and signal-to-noise ratio, is capable of generating holes in 5 % pointwise significance patches. A similar experiment was repeated but the power spectra of the sine waves were tested against red-noise spectra. The results also indicated that a single sine wave is incapable of producing holes in 5 % pointwise significance patches, implying that holes arise from a richer structure embedded in time series. Thus, two more complex cases are considered.
To derive the Case 1 time series, first consider the nonlinear system

$$X_{\mathrm{out}}(t) = b X_{\mathrm{in}}(t) + \gamma X_{\mathrm{in}}^2(t), \tag{16}$$

where $X_{\mathrm{in}}(t)$ is the input into the system, $X_{\mathrm{out}}(t)$ is the output of the system, $b$ is a linear coefficient, and $\gamma$ is a nonlinear coefficient. The output from this system will be quadratically phase coupled (King, 1996), where quadratic phase coupling indicates that for frequencies $f_1$, $f_2$, and $f_3$ and corresponding phases $\phi_1$, $\phi_2$, and $\phi_3$ the sum rules $f_1 + f_2 = f_3$ and $\phi_1 + \phi_2 = \phi_3$ are satisfied. In Case 1, $X_{\mathrm{in}} = \cos 2\pi f t$ so that

$$X_{\mathrm{out}}(t) = \frac{\gamma}{2} + b \cos(2\pi f t) + \frac{\gamma}{2} \cos(4\pi f t),$$

indicating that the output contains an additional frequency component at the harmonic $2f$ (harmonic generation) and the mean value of the output has shifted (rectification) with respect to the input. Figure 8a and b show the time series of $X_{\mathrm{out}}$ and the significance of the wavelet power for the case when $f = 1/64 = 1/\lambda_1$, $b = 1$, $\phi_1 = \pi/2$, $\phi_2 = \pi/3$, and $\gamma = 0.25$ (arbitrary units) and with Gaussian white noise added to the output. In this case, the significance of the wavelet power was tested against a red-noise background spectrum. Figure 8 shows numerous pointwise significance patches, all of which are spurious except for the one at $\lambda_1 = 64$. The areawise and geometric tests correctly identified the pointwise significance patch at $\lambda_1 = 64$ to be significant but deemed a spurious patch as significant at time 140 and at $\lambda = 3$. It is noted that the geometric test only deemed the 1 % pointwise significance patch at $\lambda_1 = 64$ as significant. Also note that the pointwise significance test was unable to detect the harmonic with period $\lambda_2 = 32$ using a red-noise background spectrum.
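Harmonic generation and rectification can be checked numerically. The sketch below assumes the nonlinear system has the quadratic form $X_{\mathrm{out}} = b X_{\mathrm{in}} + \gamma X_{\mathrm{in}}^2$ and omits the added noise; the series length of 512 samples is an illustrative choice made so that the frequencies $f$ and $2f$ fall exactly on Fourier bins.

```python
import numpy as np

b, gamma, f = 1.0, 0.25, 1.0 / 64.0    # coefficient values quoted for Case 1
t = np.arange(512)                      # eight full periods of the input
x_in = np.cos(2.0 * np.pi * f * t)
x_out = b * x_in + gamma * x_in**2      # assumed quadratic system, noise-free

# One-sided amplitude spectrum; bin k corresponds to frequency k / 512.
amplitude = 2.0 * np.abs(np.fft.rfft(x_out)) / t.size

mean_shift = x_out.mean()               # rectification term
amp_fundamental = amplitude[8]          # f  = 8 / 512  = 1/64
amp_harmonic = amplitude[16]            # 2f = 16 / 512 = 1/32
```

The mean shift and the amplitude at $2f$ both come out as $\gamma/2 = 0.125$, an eighth of the fundamental's amplitude $b = 1$, which may help explain why the harmonic is hard to detect once noise is added.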
It should be noted, however, that if the parameter $\gamma$ were increased to a value greater than 1, the oscillation with period $\lambda_2 = 32$ would become more prominent. In fact, it was found that for $\gamma \ge 1$ the areawise and geometric tests perform better (not shown), correctly identifying the oscillation with period $\lambda_2 = 32$, with the result also depending on the noise level of the white noise. Case 1 thus only serves as an illustrative example of a situation that may arise when a wavelet analysis is applied to a geophysical (often noisy) time series.
To extract more information from the wavelet power spectrum, the centroids of holes were plotted as a function of the pointwise significance level (Fig. 8c). Figure 8c shows that holes only existed at pointwise significance levels of at most 15 and 20 % and therefore not all nonlinear time series can generate holes at the 5 % pointwise significance level, suggesting that the relative difference between the primary frequency components or the resulting frequency combinations is important. The amplitudes of the coefficients $b$ and $\gamma$, as well as the signal-to-noise ratio of the Gaussian white noise, also turn out to be important. Both points are discussed below.
Case 2 is the quadratically phase-coupled time series which consists of three frequency components: $f_1 = 1/20 = 1/\lambda_1$, $f_2 = 1/30 = 1/\lambda_2$, and $f_1 + f_2 = 1/12 = 1/\lambda_3$, and $\gamma$ is assumed to be 0.5. It is noted that Case 1 is a special case of Case 2. As in Case 1, wavelet power was tested against a red-noise background. Unlike the significance patches in Fig. 8c corresponding to Case 1, holes have appeared in 5 % pointwise significance patches between periods $\lambda_1 = 20$ and $\lambda_2 = 30$ (Fig. 9b). Moreover, the 5 % pointwise significance patch containing the hole (labeled $P_1$) was found to be geometrically significant but was not found to contain an areawise significant subset. It is also worth noting that the areawise and geometric tests failed to detect a significant periodicity at $\lambda_1 = 20$ despite the fact that it is known to exist by construction. Figure 9c shows that a few holes existed at low pointwise significance levels ($\le$ 20 %), though only one was found at the 5 % pointwise significance level (light red shading). However, if one applies the pointwise significance test to the wavelet power at the 20 % significance level, a feature emerges that can hardly be produced from red noise (see Table 1), namely a large 20 % significance patch (light blue shading) containing four holes located in the period band 20-30. One can thus have confidence that the feature is significant. Furthermore, by constructing a patch topologically unlike those generated from red noise, significant wavelet power extending from time 20 to 300, undetected by the pointwise, areawise, and geometric tests, has been recovered, whereas only applying the 5 % pointwise test would result in two patches that are seemingly indistinguishable from red noise (labeled $P_2$ and $P_3$), with only the one at $\lambda_2 = 30$ being geometrically significant.
The ability of the pointwise, areawise, and geometric tests to detect significant structures inevitably depends on the parameters $a$, $b$, $\gamma$, $f_1$, and $f_2$. In fact, Maraun et al. (2007) have already determined that the pointwise test and areawise test are sensitive to the signal-to-noise level. It was hypothesized that the results of the topological method also depend on the parameters $a$, $b$, $\gamma$, $f_1$, and $f_2$. To test the hypothesis, several experiments were performed, the first of which investigated the relationship between $f_1$, $f_2$, and the number of holes. The experiment is described below.
Though both ideal time series contain a quadratic nonlinearity, the nonlinear interaction in Case 2 contained oscillations with nearby frequency components, allowing for the formation of holes, whereas for Case 1 no significant holes appeared in significance patches. It appears that the presence of holes depends on the relative location of two oscillations in the frequency domain and, thus, it is reasonable to suspect that there exists a critical frequency difference $f_{\mathrm{crit}}$, measuring the maximum frequency difference for which holes will appear in a wavelet power spectrum. An empirically derived $f_{\mathrm{crit}}$ was determined by generating a large ensemble of time series of the form

$$X(t) = \cos(2\pi f_1 t) + \cos(2\pi f_2 t) + w(t), \tag{24}$$

where $f_2 > f_1 > 0$ were generated at random, $w(t)$ is additive white noise, and all the time series were of a fixed length.
The signal-to-noise ratio was fixed to 20 and each wavelet power spectrum was tested against a red-noise background spectrum. Figure 10 shows the mean value of $N_h$ as a function of $r = (f_2 - f_1)/f_2$, the relative fractional change. For $r = 0.5$, holes never appeared, whereas for $r = 0.3$ holes appeared frequently. There is therefore a preferred frequency combination for which holes are more likely to appear. It was estimated that the upper critical value of $r$ is $r_{\mathrm{crit}} = 0.45$. Using the definition of $r$, one can write $f_{\mathrm{crit}} = 0.45 f_2$ and therefore the critical frequency difference is a function of $f_2$.
It turns out that even if the above experiment was repeated using white-noise rather than red-noise background spectra (not shown), $r_{\mathrm{crit}}$ would still be equal to 0.45, though more holes were found to appear at signal-to-noise ratios of less than 2. It was expected, however, that $r_{\mathrm{crit}}$ also depends on the amplitudes of the cosines in Eq. (24). Thus, a third experiment was conducted in which the amplitudes of the cosines were allowed to vary from 1 to 50 and $f_1$ and $f_2$ were allowed to vary from 0 to 0.5. The experiment was repeated for signal-to-noise ratios from 1 to 20. The results from the experiments (not shown) indicate that, for red-noise background spectra and for a signal-to-noise ratio of 20, $r_{\mathrm{crit}} = 0.53$, contrasting with the case for white-noise background spectra where $r_{\mathrm{crit}}$ was found to be 0.51.
Figure 10. Mean number of holes found in 5 % pointwise significance patches as a function of $r = (f_2 - f_1)/f_2$ for a sum of two sinusoids with amplitudes equal to unity and frequency components $f_1$ and $f_2$ such that $f_2 > f_1 > 0$. Additive white noise with a signal-to-noise ratio of 30 was added to the sum of sinusoids. Pointwise significance was tested against a red-noise background. Dashed line represents the critical value of $r$, the value beyond which holes will rarely occur between oscillations of equal amplitude (set to unity) with frequencies $f_1$ and $f_2$.

The empirical results shown in Fig. 10 have theoretical implications. Suppose that a time series contains two oscillations of equal amplitude whose frequency components satisfy $f_2 = 2f_1$. Furthermore, suppose that the wavelet power of the oscillations is computed and the significance is tested against a red-noise or white-noise background spectrum. In this case, $r = (2f_1 - f_1)/(2f_1) = 0.5$, which exceeds $r_{\mathrm{crit}} = 0.45$, and therefore holes will almost never appear in 5 % pointwise significance patches, making the detection of quadratic phase coupling using topological methods more difficult in the case of self-interactions. More generally, suppose that a single sinusoid $X_{\mathrm{in}}(t) = \cos 2\pi f t$ is passed through the nonlinear system

$$X_{\mathrm{out}}(t) = b X_{\mathrm{in}}(t) + \gamma X_{\mathrm{in}}^{2n}(t),$$

where, after using the power-reduction formula for a cosine (Beyer, 1987), the output is given by

$$X_{\mathrm{out}}(t) = b \cos(2\pi f t) + \frac{\gamma}{2^{2n}} \binom{2n}{n} + \frac{\gamma}{2^{2n-1}} \sum_{q=0}^{n-1} \binom{2n}{q} \cos\left[2(n - q)\, 2\pi f t\right],$$

where $n$ is a positive integer and $\binom{2n}{q}$ is a binomial coefficient. For the cosines in the summation, the relative frequency difference between any two cosines is

$$r = \frac{2m - 2p}{2n - 2p},$$

where $0 \le p < m \le n - 1$. Using the fact that holes can only appear between oscillation pairs with $r \le 0.53$ for a red-noise background spectrum, one can show that for large $n$ more holes are able to appear in wavelet power spectra, with the likelihood of holes appearing depending on $b$ and $\gamma$, with larger values of $b$ and $\gamma$ producing more holes. In this case, holes can form in the wavelet spectrum since, for example, if $m = 6$ and $p = 5$ with $n = 10$ the condition $r \le 0.53$ will be satisfied. The result also holds if the order of the nonlinear interaction is odd and if the cosine function $X_{\mathrm{in}}(t)$ is replaced by a sine function. For an odd-order nonlinear interaction, however, $r = (2m - 2p)/(2n + 1 - 2p)$, where $0 \le p < m \le n$.
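The pairing argument can be explored numerically. The sketch below assumes, from the power-reduction identity, that the oscillatory terms of $\cos^k$ lie at the multiples $k - 2q$ of the input frequency, and it takes $r_{\mathrm{crit}} = 0.53$ from the red-noise experiment; the function names are hypothetical, and the linear term $b X_{\mathrm{in}}$ is deliberately left out of the enumeration.

```python
from fractions import Fraction


def relative_change(f1, f2):
    """r = (f2 - f1) / f2 for frequencies with f2 > f1 > 0."""
    return (f2 - f1) / f2


def harmonic_pairs(order, r_crit=Fraction(53, 100)):
    """Harmonic pairs of cos**order with r <= r_crit, in units of the input frequency."""
    # Oscillatory terms of cos**order sit at the multiples order - 2q (constant term excluded).
    multiples = [order - 2 * q for q in range((order + 1) // 2)]
    pairs = []
    for i, high in enumerate(multiples):
        for low in multiples[i + 1:]:
            r = relative_change(Fraction(low), Fraction(high))
            if r <= r_crit:
                pairs.append((low, high, r))
    return pairs


even_pairs = harmonic_pairs(20)   # order 2n with n = 10
odd_pairs = harmonic_pairs(21)    # order 2n + 1 with n = 10
```

For order 20, the pair with $m = 6$ and $p = 5$ corresponds to the multiples 8 and 10, giving $r = 1/5$; for order 21 the same indices give the multiples 9 and 11 and $r = 2/11$, in agreement with the odd-order expression.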

Topological significance testing of climatic time series
With a better understanding of the origins of holes contained in significance patches, the wavelet power spectra shown in Figs. 1 and 2 are now analyzed more closely. Shown in Fig. 11a is the topological wavelet diagram corresponding to the wavelet power spectrum of the Niño 3.4 index, which shows the existence of numerous holes at low ($\le$ 20 %) pointwise significance levels, indicating that these patches are significant features (see Table 1). For example, the rather large patch extending from 1960 to 2013 in the period band of 16-64 months contains a hole located at 1985 and at a period of 32 months that existed at the 5 % pointwise significance level. In the same patch, three more holes existed at the 10 % pointwise significance level, one located at 1975 and at a period of 48 months, a second one located at 1995 and at a period of 64 months, and a third one located at 2008 and at a period of 24 months. According to Table 1, three holes in a single 10 % pointwise significance patch under the null hypothesis of red noise is extremely unlikely, if not impossible. One can thus conclude with high confidence that the patch was not generated from a random stochastic fluctuation. Moreover, the discussion in Sect. 5.1 suggests that at the very least phase-coherent oscillations were likely present in the Niño 3.4 time series, where phase coherency implies that two oscillations have a stable relative phase relationship but are not necessarily interacting nonlinearly. The topological wavelet diagram (Fig. 11b) corresponding to the wavelet power spectrum of the NAO is less interesting, containing few holes at high pointwise significance levels. At 1875, however, a patch contained holes at the 10 % pointwise significance level, suggesting that the patch is a significant feature.

Summary and discussion
A geometric significance test was developed for more rigorously assessing the significance of features in the wavelet domain. The geometric test, although related to the existing areawise test, was found to be more flexible in the sense that p values could be readily calculated, involving a single Monte Carlo ensemble. Another strength of the geometric test is that the false discovery rate can be controlled at a desired level, minimizing the number of false rejections of the null hypothesis. On the other hand, the geometric test had the disadvantage of being less local than the areawise test.
It is noted that the geometric test was only applied to patches arising from the convolution of the Morlet wavelet with a time series. The results presented in this paper are not valid for wavelet power spectra obtained using other analyzing wavelets, the reason for which is that each wavelet function has different time- and scale-localization properties that inevitably impact the geometry of patches. For example, patches found in the wavelet power spectrum obtained using a Paul wavelet are elongated in the scale direction relative to those obtained using a Morlet wavelet with $\omega_0 = 6$, resulting in nearby patches at different scales merging together. The merging of patches at different scales will alter their geometry with respect to the relatively thin (in scale) patches obtained using the Morlet wavelet.
One disadvantage of the geometric and areawise tests is that they require a binary decision in which pointwise and geometric significance levels must be chosen. The binary decision can be circumvented by applying a p value adjustment procedure to the wavelet power coefficients directly. For example, one could apply the Benjamini and Hochberg (1995) procedure to the wavelet power coefficients or a modified version of the procedure developed by Benjamini and Yekutieli (2001), which is valid for any dependency structure among the local test statistics. The latter procedure would seem most appropriate given the autocorrelation structure of wavelet power coefficients; however, it is noted that the procedure has less statistical power than the original procedure valid for independent local test statistics, though Wilks (2006) found the Benjamini and Hochberg (1995) procedure to remain powerful even when the assumption of independence is violated.
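For reference, the Benjamini and Hochberg (1995) step-up rule can be sketched in a few lines. The example is a generic illustration applied to a short list of made-up p values, not a reproduction of the analysis in this paper; the function name `benjamini_hochberg` is hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values, q=0.05):
    """Boolean array marking the hypotheses rejected with the FDR controlled at level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Step-up rule: find the largest k such that p_(k) <= (k / m) * q.
    below = p[order] <= q * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        # Reject every hypothesis whose p value ranks at or below k.
        rejected[order[: k + 1]] = True
    return rejected


p_patches = [0.001, 0.012, 0.03, 0.04, 0.2, 0.6]   # made-up p values for illustration
significant = benjamini_hochberg(p_patches, q=0.05)
```

With six tests and $q = 0.05$, the ordered thresholds are roughly 0.008, 0.017, 0.025, 0.033, 0.042, and 0.05, so only the two smallest p values are rejected even though four of them fall below 0.05.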
The topology of significant patches was also analyzed. Holes in significant patches, a topological notion, were capable of distinguishing spurious patches from true structures. The holes were identified as arising from phase-coherent oscillations with nearby frequency components and may indicate the existence of a nonlinear interaction. Patches arising from different analyzing wavelets can differ topologically. For the Paul wavelet, the shrinking of patches in time, for example, was found, after a preliminary investigation, to reduce the number of holes in wavelet power spectra. The reduction in the number of holes can be attributed to the tearing of a patch in the time direction. The results, however, require further investigation and are a subject of future work.
The new methods introduced in this paper were applied to the NAO and Niño 3.4 indices, two well-known but contrasting time series. For the Niño 3.4 index, the methods detected geometrically significant structures as well as topological structures unlike those of red noise. These structures provide evidence of some predictability of the El Niño-Southern Oscillation, which has become of increasing importance in climate science given that its future state is uncertain under a changing global climate system (Latif and Keenlyside, 2008). For the NAO index, the new methods were unable to detect features that are distinguishable from background noise, suggesting that the NAO is a stochastic process with little predictability. The methods developed in this paper will give researchers the tools needed for a better understanding of features found in wavelet power spectra.

Figure 1. (a) The NAO index from 1870 to 2013. (b) The normalized wavelet power spectrum of the NAO index. Thick contours enclose regions of 5 % pointwise significance. Light shading corresponds to the cone of influence, the region in which edge effects become important.

Figure 2. (a) The Niño 3.4 index time series from 1870 to 2013. Points labeled "M" indicate where the merging process occurred and points labeled "H" indicate where a hole was formed (see Sect. 5.2 for details). (b) Same as Fig. 1b except for the Niño 3.4 index for the period 1870-2013. "H", together with the arrow, marks the location of a hole.

Figure 3. Significance of wavelet power for the NAO index mean monthly values for the period 1870-2013. Black contours enclose regions of 5 % pointwise significance (see Sect. 3.1) and thick red contours are the 5 % areawise significant subsets (see Sect. 3.2). Light gray shading indicates those 5 % pointwise significance patches that are geometrically significant at the q = 0.05 level and dark gray shading indicates those 1 % pointwise significance patches that are geometrically significant at the q = 0.05 level.

Figure 4. Same as Fig. 3 but for the Niño 3.4 values for the period 1870-2013. The blue curve represents a closed path f that is not contractible to a point because it surrounds a hole (see Sect. 5.1 and Fig. 2).

Figure 5. (a) An idealized convex pointwise significance patch whose boundary is indicated by the black contour and whose centroid is indicated by the black dot. For reference, the reproducing kernel associated with the areawise test is shown, which is indicated by gray shading. In this case, the reproducing kernel lies entirely inside the patch. The convexity, normalized area, and χ are displayed on the bottom left corner. (b) Same as (a) except the area of the convex hull (red curve) is not equal to the area of the patch and the reproducing kernel is unable to fit entirely inside the patch.

Figure 6. (a) Similarity index between the geometric and areawise tests for different lag-1 autocorrelation coefficients for red-noise processes (see text). (b) Same as (a) except for the ratio between the false positive results of the geometric and areawise tests. The dotted black line represents the ratio of false positives between the two tests when the false discovery rate of the geometric test is controlled at the 0.05 level. (c) Same as (a) but for the mean convexity of 5 % pointwise significance patches that are geometrically significant at the 5 % level and for the mean convexity of 5 % pointwise significance patches that are areawise significant at the 5 % level.

Figure 7. Normalized mean number of holes as a function of pointwise significance level. The number of holes was calculated by generating 10 000 synthetic wavelet power spectra of red-noise processes with fixed autocorrelation coefficients of 0.5 and computing the number of holes. Gray shading represents the 95 % confidence interval.

Figure 8. (a) Time series of Case 1, which results from passing a single sinusoidal input with period λ = 64 through Eq. (16). Gaussian additive white noise with a signal-to-noise ratio of 2 was added to the output response. (b) The significance of wavelet power for Case 1 (see Fig. 3 for details). (c) Topological wavelet diagram corresponding to (b). Points are the centroids of the holes at a given pointwise significance level, where both the color and size of the dots indicate the pointwise significance level at which the hole existed. The shading of the patches corresponds to the pointwise significance level at which the wavelet power coefficient existed, with the color of the shading lighter than the dots for clarity.

Figure 9. (a) Time series of Case 2. Gaussian additive white noise with a signal-to-noise ratio of 8 was added to the time series. At the point labeled "A", two oscillations resonate, merging two pointwise significance patches in the wavelet domain. At the point labeled "B" no such resonance occurs and the two significance patches separate. (b) The significance of wavelet power (see Fig. 3 for details). The pointwise significance patch labeled P_1 contains a hole and the pointwise significance patches labeled P_2 and P_3 were falsely deemed insignificant by the geometric and areawise tests. (c) Same as Fig. 8c except for Case 2.