New significance test methods for Fourier analysis of geophysical time series

When one applies the discrete Fourier transform to analyze finite-length time series, discontinuities at the data boundaries will distort its Fourier power spectrum. In this paper, based on a rigid statistics framework, we present a new significance test method which can extract the intrinsic feature of a geophysical time series very well. We show the difference in significance level compared with traditional Fourier tests by analyzing the Arctic Oscillation (AO) and the Nino3.4 time series. In the AO, we find significant peaks at about 2.8, 4.3, and 5.7 yr periods and in Nino3.4 at about 12 yr period in tests against red noise. These peaks are not significant in traditional tests.


Introduction
Continuous and Discrete Fourier transform has been used for numerous studies in geophysics for a long time (Jenkins and Watts, 1968;Chatfield, 1989;Zar, 1999).However, when one applies the discrete Fourier transform to analyze finite-length time series, Fourier power spectrum obtained is distorted in the high frequency domain and the low frequency domain.The reason why Fourier power spectrum is distorted in the high frequency domain is that the time series is discontinuous at the data boundaries.The reason why Fourier power spectrum is distorted in the low frequency domain is that the time series is of finite length.So, when one uses the discrete Fourier transform to compute the Fourier power spectrum, only the middle part of Fourier power spectrum obtained is real.In order to reduce boundary effect, Correspondence to: Z. Zhang (zhangzh@bnu.edu.cn)two algorithms have been tried previously (e.g.Oppenheim and Schafer, 1989).The first algorithm is to pad the time series on each endpoint with zeroes, but it artificially creates discontinuities at the endpoints.The second algorithm uses some window (e.g."cosine damping") to preprocess the time series.Although it overcomes the boundary effect, at the same time, we lose almost all the information at the endpoints.
Statistical significance tests are used to extract intrinsic features from Fourier power spectrum of geophysical time series against a null hypothesis of some noise model.Traditional significance tests on Fourier power spectra consist of the following three steps.
Step 1. Use the discrete Fourier transform to compute the Fourier power spectrum of geophysical time series.
Step 2. Choose a suitable null hypothesis.For many geophysical phenomena, an appropriate noise model is red noise, that is, a first order autoregressive process.However, choice of noise model is crucial to reliable significance testing (Mann and Lees, 1996).
Step 3. Extract intrinsic features.If a peak in the Fourier power spectrum of geophysical time series is above this background noise spectrum at some arbitrary level, say 95 % confidence level, then it is a true feature of geophysical time series with 95 % confidence level.
Since in reality, we only deal with finite-length geophysical time series, discontinuities at the data boundaries will distort the Fourier power spectrum.Therefore, significance tests are unreliable.The purpose of this paper is to present a new and more reliable significance test method for the Fourier power spectra.
Z. Zhang and J. Moore: New significance test methods for Fourier analysis of geophysical time series This paper is organized as follows: in Sect.2, we explain why Fourier power spectrum obtained by discrete Fourier transform is distorted in the high frequency domain, so traditional significance tests are unreliable.In Sect.3, we present a new method to decompose the time series into two parts and introduce the concept of modified Fourier power spectrum as a basis for new significance tests.In Sect.4, since for many geophysical time series, an appropriate background noise or null hypothesis is red noise, we will give the modified Fourier power spectrum of red noise in a rigid statistics framework.Finally, in Sect.5, we will apply our significance tests to analyze Arctic Oscillation and Nino3.4 indices which are typical examples of climatic time series with, respectively, little and considerable autocorrelation, to demonstrate the strikingly different significance test results.

Distortion of Fourier power spectrum in the high frequency domain
The continuous Fourier transform of the continuous time series and | f (ω)| 2 is called its Fourier power spectrum.In practice, time series have finite duration, so we can only deal with the time series from 0 to T .In order to compute the Fourier transform, the time series is usually padded with zeroes.In this way, the continuous Fourier transform of f is approximated by where F is a discontinuous function at t = 0 and t = T Because of the discontinuity at the boundary points t = 0 and t = T , the Fourier transform obtained is distorted in the high frequency domain.
In fact, it is also impossible to obtain the value of f on each point in [0,T ].Now assume that we are given the discretized version of the continuous time series f sampled as From here, we see that { f ( k N δt )} N −1 k=0 can be computed approximately by the discrete Fourier transform discrete time series {f (nδt)} N −1 n=0 .However, by (2.1), we know that the Fourier transform obtained is dis in the high frequency domain.

Our decomposition method
In order to develop a new significance test to extract the intrinsic features from time series, in this se we will decompose the time series and introduce the concept of modified Fourier power spectrum.
From here, we see that { f ( k N δt )} N−1 k=0 can be computed approximately by the discrete Fourier transform of the discrete time series {f (nδt)} N−1 n=0 .However, by Eq. (1), we know that the Fourier transform obtained is distorted in the high frequency domain.

Our decomposition method
In order to develop a new significance test to extract the intrinsic features from time series, in this section, we will decompose the time series and introduce the concept of modified Fourier power spectrum.

Continuous time series
Let f be a continuous time series f (t) on [0,T ] (see Fig. 1, Top), we divide f into two parts Now we make the odd extension of v (see Fig. 1, Bottom): Again we make the 2T −periodic extension, we get v * : By the following proposition, we can see that the above approach can remove the discontinuities at the data boundary very well.The proof of this proposition can be found in Appendix A at the end of this paper.
Finally, we expand v * (t) into the Fourier series T where the Fourier coefficient c n is computed by Comparing f (t) with v * (t), we find that in v * (t), discontinuities on data boundary are removed.We call |c n | 2 the modified Fourier power spectrum of continuous time series f (t).Now we begin to simplify the formula (2): Since v * is odd and cos nπ t T is even, the first term on the righthand side of the above formula is zero.Again since sin nπ t T is odd, we have (3)

Discrete time series
In practice, we are only given the samples of some continuous time series f on [0,T ], i.e.
As for the continuous case, we decompose the time series into and This implies that We make the odd extension of the sequence {v k }: and then make the 2N −periodic extension So {v * k } is a 2N −periodic odd sequence.Moreover, {v * k } are the samples of v * (t) which has been defined in Sect.3.1.In other words, Numerically integrating Eq. ( 3), noticing that T = N δt, v * (0) = 0, and v * (T ) = 0, we obtain the relation between {v * k } and the Fourier coefficients {c n } as follows: So we can define the modified Fourier power spectrum of discrete time series {x n } N 0 as www.nonlin-processes-geophys.net/18/643/2011/ Nonlin.Processes Geophys., 18, 643-652, 2011 For many geophysical phenomena, an appropriate background is red noise (Mann and Lees, 1996;Ghil et al., 2002).
A simple model for red noise is the univariate lag-1 autoregressive [AR(1)] process {x n } N 0 : where λ is a constant and called the AR(1) coefficient, |λ| < 1, and {Z n } are independent Gaussian random variables with mean 0 and variance σ 2 .If {x n } N 1 are known, one can use the following formula (Brockwell and Davis, 1991) to estimate the AR(1) coefficient λ and noise variance σ 2 .
Denote the discrete Fourier transform of an AR(1) process Torrence and Compo (1998) showed that the Fourier power spectrum |f n | 2 is distributed as where σ 2 is the variance of time series and χ 2 2 is the chisquare distribution with two degrees of freedom.In traditional significance tests, one uses this distribution to extract the intrinsic features of geophysical time series.
In order to satisfy the needs of our new significance tests, we will give the distribution for the modified Fourier power spectrum of AR(1) process in a rigid statistics framework.
From the definition of an AR(1) process, we have x 0 = 0 and x 1 = λx 0 + Z 1 = Z 1 .Furthermore, In general, we have Based on our decomposition method for discrete time series in Sect.3.2, we have Therefore, By Eq. ( 5), the modified Fourier power spectrum of AR(1) process is |Y n | 2 , where It is easy to check the following formulas: and Furthermore, we deduce that By sin nN π N = 0, we can rewrite I n as From this and Eq. ( 6), we get where and From this, we deduce that and the mathematical expectation Noticing that {Z l } N 1 are independent Gaussian random variables with mean 0 and variance σ 2 , we get By Eq. ( 9), we have For c n,l , we have Now we compute d n .Since 1−e it − 1 −N e i(N −1)t 4sin 2 t 2 , and so Finally, we have By Eqs. ( 8)-( 12), we know that Y n is a Gaussian random variable with mean 0 and variance where In the other words, Y n is distributed as where Z is a Gaussian random variable with mean 0 and variance 1.Since the modified Fourier power spectrum of AR(1) process is |Y n | 2 , we have Theorem 1.The modified Fourier power spectrum of an AR(1) process is distributed as where Z is a Gaussian random variable with mean 0 and variance 1, and c n,l , d n are stated in Eq. ( 14).
In Appendix B, we will simplify the formula (15) further.

Applications
In this section, based on Theorem 1 (or Theorem 2 in Appendix B), we will use our significance test to extract intrinsic features of time series from the Fourier power spectrum in the high-frequency domain against a red noise Null Hypothesis.First of all, we consider an artificial example.We embed     a pure sinusoid of period 8.8 within an AR(1) process with AR(1) coefficient 0.3 and noise variance 1 (see Fig. 2).We will do significance tests by using traditional Fourier method (Fig. 3, Left) and our method (Fig. 3, Right).From Fig. 3, we see that our method discovers this pure sinusoid while the traditional method does not.
Below we examine the winter Arctic Oscillation(AO) indices (December-February 1851-1997) of Thompson and Wallace (1998).The AO is a key aspect of climate variability in the Northern Hemisphere.The AO index is defined as the leading empirical orthogonal function (EOF) of Northern Hemisphere sea level pressure anomalies pole ward of 20 • N (Thompson and Wallace, 1998), and characterized by an exchange of atmospheric mass between the Arctic and middle latitudes.We first make a significance test for the Fourier power spectrum of the AO index based on the traditional method with a red noise Null Hypothesis.In Fig. 4 (Top), we can see that only two peaks are well above the background spectrum.When one applies the discrete Fourier transform to analyze a finite-length AR(1) process, the data boundary will distort its Fourier power spectrum, especially in the high frequency domain.Now, based on Theorem 1 (or Theorem 2 in Appendix B), we use our significance test.Since discontinuities at the data boundaries are removed, we find more significant peaks at about 2.8, 4.3 and 5.7 yr periods well above the background spectrum (see Fig. 4, Bottom).These peaks are not significant in traditional tests.
Nino3.4 is used as a measure of the amplitude of the El Nino Southern Oscillation (ENSO).The Nino3.4 index is defined as the average monthly sea surface temperature anomaly in the region bounded by 5 • N to 5 • S from 170 • W to 120 • W. Unlike the winter AO annual index, the monthly Nino3.4 time series has a high autocorrelation and hence the significant level shows a strong trend with frequency.Figure 5 shows the Fourier and modified Fourier power spectra, and we find a significant peak at about 12 yr period in a test against red noise.This peak is not significant in traditional tests.The main reason for the increased significance is the removal of boundary effect in our new significance test.Notice also that the modified Fourier power spectra has higher resolution than traditional spectrum due to the odd extension made to the original time series.This high resolution may result in sharper spectral peaks at any period which is especially noticeable around 12 yr in Fig. 5.
The periodicities in the AO time series have been examined using wavelet coherence methods (Jevrejeva et al., 2003), and Monte Carlo Singular Spectrum Analysis (MC-SSA, Jevrejeva et al., 2004).These papers also discuss the relationship between the AO and Nino3 time series (Nino3 is closely related to Nino3.4 being defined as the monthly Sea Surface Temperatures (SST) averaged over the eastern half of the tropical Pacific (5 • S-5 • N, 90 • -150 • W).The MCSSA and wavelet coherence analyses revealed that periodic signals in the AO and Nino3 time series vary in over time, e.g.AO exhibits significant power in the 3.5-5.7 yr band between 1935-1950, and power in the 12-14 yr band in the second half of the record.In contrast the Fourier spectrum analysis shows a simple result representing the whole time series studied.
In general the strongest signals detected with MCSSA and wavelets were at 2.2, 3.5, 5.7 and a broad band centered on 14 yr periods.Jevrejeva et al. (2003) discuss the phase differences between these signals in various climatic time series as a function of their geographic location.They found that the signals in the 2.2-5.7 yr bands have short phase lags of about 3 months between their location of origin (the tropical Pacific, and the polar regions, implying a rapid transportation mechanism likely via the stratosphere as proposed by Baldwin and Dunkerton (2001).In contrast the 13.9 yr signal has a phase lag of about 0.7 yr between the Nino3.4 and AO series, and a further year between the atmospheric record recorded in the Southern Oscillation Index and Nino3.4.These much slower transport patterns can also be tracked in global sea surface temperature fields (Jevrejeva et al., 2004), and points to a oceanic transport mechanism from tropics to both polar regions.
The results of the Fourier significance analysis presented in Figs. 4 and 5 show similar features as described in the wavelet and MCSSA analyses.There are a broad band of signals from 2.2-5.7 yr that are associated with ENSO variability in the Pacific region -with the 5.7 yr signal detected as significant in our new test exactly coinciding with features in the MCSSA (Jevrejeva et al., 2004) and wavelet analysis (Jevrejeva et al., 2003).These signals propagate globally as a forced response to the slow warming of tropical oceans.We find a 12 yr peak in the Nino3.4series but the same period is significant in AO only in the second half of that time series (and only in our new significance test not the traditional one).It is likely that this 12 yr signal corresponds to the 13.9 yr period peak discussed above.The difference in periodicity of the peak is perhaps that the MCSSA approach is a time domain analysis, and does not rely on sinusoidal basis functions.The MCSSA analysis also reveals that the signal is growing over time, especially in the 20th century while it is insignificant in the 19th century.There are also some differences in the periodicity of the high frequency signals in Nino3.4 between methods and probably reflect choice of parameters such as the embedding dimension in the MC-SSA analysis (Ghil et al., 2002) -which may lead to a 2.8 yr signal being resolved as a 2.2 yr and 3.5 yr signals.We also note that in the traditional significance test finds a peak at 5.4 yr in Nino3.4 (Fig. 5), while the new test reveals 2 peaks at 5.5 and 5.7 yr.We present a novel way of modifying the traditional Fourier transform to remove the impact of discontinuity at the data boundaries.We show that when the modified Fourier decomposition is made, we can calculate significance levels against a red noise Null Hypothesis in a rigid statistics framework.Tests on real time series show important differences in significance levels compared with traditional methods.This suggests that the Fourier spectra produced commonly in geophysical or climatic time series are suspect.In many time series that may reflect on-going change, such as climatic series, the method we suggest has clear advantages over traditional Fourier analysis since the information at the most recent times, i.e. the data boundary is preserved.It is worth noticing that in many time series of practical importance such as the AO or Nino3.4,the differences in significance levels include multi-year and decadal frequencies.Such multi-year and decadal variability is the target of considerable current research, and therefore establishing the presence of significant oscillatory power in these bands is highly relevant.

Figure 1 :
Figure 1: (Top) A simple time series: f (t) = sin( πt 3.2 ).(Middle) The adjusted time series obtained by rem the line which connect two endpoints of the original time series.(Bottom) We do the odd extension f adjusted time series the point k N δt is approximated by the following sum:

Fig. 1 .
Fig. 1. (Top) A simple time series: f (t) = sin( π t 3.2 ).(Middle) The adjusted time series obtained by removing the line which connect two endpoints of the original time series.(Bottom) We do the odd extension for the adjusted time series with AR1 coefficient 0.3 and noise variance 1. (Middle) a pure sinusoid of period he sum of top signal and middle signal Fourier power spectrum is computed by discrete Fourier transform.(Right) Modified Fourier is computed by our method.The dashed line is 99% confidence red noise spectrum . 20

Figure 3 :Fig. 3 .
Figure 3: (Left) Fourier power spectrum is computed by discrete Fourier transform.(Right) Modified Fourier power spectrum is computed by our method.The dashed line is 99% confidence red noise spectrum .

Figure 4 :Fig. 4 .
Figure 4: (Top) Fourier power spectrum of AO indices is computed by discrete Fourier transfor Modified Fourier power spectrum of AO indices is computed by our method.The dashed line is 95 red noise spectrum .

Fig. 5 .
Fig. 5. (Top) Fourier power spectrum of Nino3.4 indices is computed by discrete Fourier transform.(Bottom) Modified Fourier power spectrum of Nino3.4 indices is computed by our method.The dashed line is 95 % confidence red noise spectrum .