Complexity signatures in the geomagnetic H component recorded by the Tromsø magnetometer ( 70 ◦ N , 19 ◦ E ) over the last quarter of a century

Abstract. Solar disturbances, depending on the orientation of the interplanetary magnetic field, typically result in perturbations of the geomagnetic field as observed by magnetometers on the ground. Here, the geomagnetic field's horizontal component, as measured by the ground-based observatory-standard magnetometer at Tromso (70° N, 19° E), is examined for signatures of complexity. Twenty-five year-long 10 s resolution data sets are analysed for fluctuations with timescales of less than 1 day. Quantile–quantile plots are employed first, revealing that the fluctuations are better represented by Cauchy rather than Gaussian distributions. Thereafter, both spectral density and detrended fluctuation analysis methods are used to estimate values of the generalized Hurst exponent, α. The results are then compared with independent findings. Inspection and comparison of the spectral and detrended fluctuation analyses reveal that timescales between 1 h and 1 day are characterized by fractional Brownian motion with a generalized Hurst exponent of ~1.4, whereas including timescales as short as 1 min suggests fractional Brownian motion with a generalized Hurst exponent of ~1.6.


Introduction and methodology
Understanding the coupling mechanisms between various processes and phenomena in the solar-terrestrial system remains a considerable challenge.An approach that has gained popularity in recent years involves examining the noise in a signal (or time series), i.e. the stochastic rather than the deterministic component.The underlying idea is that a signature in the noise occurring in a driving mechanism may re-appear in the noise of another observable, thus linking the two.This approach is akin to fingerprinting a crime scene.Examination of time series for such fingerprints was pioneered by, inter alios, Hurst (1951), Mandelbrot (1983), Grassberger and Procaccia (1983), and Koscielny-Bunde et al. (1998).Work by Eichner et al. (2003), Lennartz and Bunde (2009), and Kantelhardt et al. (2006), for example, has refined the approach.Much attention has been given to oceanographic and meteorological observables, including the recent study by Hall (2014), and more recently solar-related observables, e.g.Scafetta and West (2003) and Rypdal and Rypdal (2011).On the other hand, there has been relatively little focus on observables related to the terrestrial ionosphere and measured locally in order to examine the mapping of solar forcing to the Earth's surface.Hall et al. (2011) examined complexity in the ionospheric E region in the auroral zone: the altitude, strength and persistence of the E region are of particular interest for radio communications and for studying the possible overall shrinking of the middle atmosphere due to climatic cooling (Roble and Dickinson, 1989;Rishbeth and Clilverd, 1999).In this particular study, the geomagnetic field characteristics represented by a local time series measured, on average, beneath the auroral oval at 70 • N, 19 • E (geographic) will be examined.Stochastic variations appear as fluctuations driven by ionospheric currents and thus associated magnetic fields perturbing the background geomagnetic field.The time series employed will be described in more detail forthwith.
The approach to analysing the geomagnetic field data here is the same as that used by Hall (2014): first, the data are filtered using a boxcar to remove all periodicities less than 1 day, and the result is then subtracted from the original time series, thus producing a residual excluding all preconceived (deterministic) periodicities.This is equivalent to "deseasonalization", commonly applied to neutral atmosphere data, for example.The reasoning for removing all fluctuations of the period of 1 day or larger will become apparent when the data are described in more detail.
The majority of studies to obtain complexity signatures from time series aims at evaluating the Hurst exponent, H , as invented by Hurst (1951) as a quantification of the scaling nature, or self-affinity, of the stochastic component of the data.H , however, lies in the interval {0, 1} and cannot alone identify the process as fractional Gaussian noise (fGn) or fractional Brownian motion (fBm) as invented by Mandelbrot and van Ness (1968).In the concept of fBm, successive increments are correlated: the time series is non-stationary with temporally varying variance; fGn, on the other hand, is stationary and time-invariant in expectation value and variance.In these processes, positive correlation between successive increments indicates that preceding motion is likely to continue and negative correlation indicates that preceding motion is likely to be followed by a reversal, likelihoods commonly referred to as persistent and anti-persistent, respectively.Rather than derive H , therefore, the approach of Kantelhardt et al. (2006) is adopted here, and the generalized Hurst exponent, α, is derived.The two exponents are related: for fGn, H = α, and for fBm, H = α−1.Thus, α unambiguously characterizes a process as fBm (α > 1), persistent fGn (0.5 < α < 1) and anti-persistent fGn (0 < α < 0.5), with α = 1.5 indicating the special case of Brownian motion.Furthermore, one can define the scaling exponent of the power spectrum of the signal (the negative of the spectral slope in log-log space) by β: White noise is thus characterized by a flat spectrum, and therefore β = 0. "Pink noise" is when β = 1, and the case where β = 2 is referred to as "red noise" and corresponds to Brownian motion (see, e.g., Vasseur and Yodzis (2004)).Importantly, the generalized Hurst exponent, α, and the power spectrum scaling exponent, β, are related by as explained by, e.g., Hartmann et al. (2013) and Delignieres et al. ( 2006) and references therein.The relationship is conveniently summarized in Fig. 1 of Hall (2014).Moreover, the fractal or Hausdorff-Besicovich dimension D = 2 − H , but assuming fBm.One can see, therefore, that calculating D by, for example, the method of Grassberger and Procaccia (1983) can potentially yield H but not unambiguously provide the same information as α.
Following the sequence used by Hall (2014), rather than blindly launching into a determination of α which will almost inevitably yield some result, the data are examined first for indications of non-linearity by inspections of the probability density function (PDF) and quantile-quantile (Q-Q) analyses (Wilk and Gnanadesikan, 1968).A PDF can indicate qualitatively whether the distribution is non-Gaussian.In Q-Q plots, quantiles of the distribution of the noise in the signal are plotted against those derived from a semi-empirical Gaussian distribution having the same mean and standard deviation; a straight line will result if the signal exhibits a Gaussian distribution.With a preconception of the nature of the PDF, a number of approaches may be employed to obtain α.Some of the methods used most are described and compared by Delignieres et al. (2006), Hartmann et al. (2013) and Heneghan and McDarby (2000) (note, however, that these last authors use "α" as the spectral scaling exponent rather than β).Using experience gained from Hall (2014), α is obtained using both spectral analysis (SA) and detrended fluctuation analysis (DFA); while most physicists will feel comfortable with the more intuitive SA, DFA (Peng et al., 1993) is arguably the preferred method in contemporary research when searching for long-term memory in data.For the purposes of SA, since experimental data are under consideration, the time series should be treated as irregularly sampled; data gaps are few but, even so, must be assumed to exist.Lomb-Scargle periodogram analysis (Press and Rybicki, 1989) is more appropriate than a Fourier transform.Additionally, Fougère (1985) and Eke et al. (2000) have proposed preconditioning of the time series by applying a parabolic window, bridge detrending using the first and last points in the series, and, finally, frequency selection prior to attempting to obtain a spectral exponent.Applying frequency selection to the Lomb-Scargle periodogram is somewhat unpredictable, as discovered by Hall (2014), so the entire spectrum is retained.Finally, the spectrum S is plotted vs. f in loglog space to hopefully identify a regime exhibiting a scaling exponent β according to Eq. (1).In DFA, the stochastic component of the original time series is first cumulatively summed (each new point is the sum of the preceding points in the original).This cumulative summation is then divided into subseries of equal length (n).Each of these subseries is then detrended either by subtracting the straight line between end points (bridge detrending) or linear or polynomial fits (referred to as DFA(1), DFA(2), etc.).Variances are calculated for each subseries and are then averaged to obtain a mean F (n).After repeating this for a range of subseries lengths (usually all possible n), the function F (n) is plotted vs. n in log-log space (as was done in the spectral analysis case) to hopefully identify a regime exhibiting a scaling exponent α: wherein α is the generalized Hurst exponent.Here, a simple linear detrending will be used and DFA will be used to refer to DFA(1).

Underlying data and analysis
Analyses of geomagnetic time series are few relative to those for other observables, such as surface air temperature (SAT) and, at the other end of the solar-terrestrial system, sunspot number (SSN).Downloadable data sets, such as the auroral electrojet (AE) index, have been examined, e.g. by Rypdal and Rypdal (2011), but the AE index is really a synthesis of individual magnetometer measurements at a selection of observatories under or near the auroral oval.The index is the width of the envelope of north-south geomagnetic field perturbations obtained from typically > 10 stations (e.g.Davis and Sugiura, 1966).Wanliss and Reynolds (2003) have determined β for a number of low-latitude records although references therein accentuate the preceding focus on global indices.Hamid et al. (2009) similarly examine data from specific sites.The nature of the data used in this study is somewhat different to that of the data used by Wanliss and Reynolds (2003) and Hamid et al. (2009), however.The geomagnetic field is usually (but not exclusively) defined by three components: declination, D ( • ); horizontal, H (nT); and vertical, Z (nT).When constructing geomagnetic indices as indicators of activity, for example the k-index (Bartels et al., 1939), the horizontal component is preferred because an approximately zonally aligned current system induces a maximum perturbation in the horizontal component of the background field immediately below it.The vertical component, on the other hand, induces a zero crossing in the perturbation.The magnetometer at Tromsø (70 • N, 19 • E) is operated as an observatory-standard instrument and, as such, calibrated accurately at regular intervals (the details of which are superfluous here); this means, however, that the time series is long and reliable.This study uses values of H with 10 s time resolution in the interval 1988-2013 inclusive.In contrast to the studies by Wanliss and Reynolds (2003) and Hamid et al. (2009), here entire years are analysed.Another important difference is the geographical location: at high latitude, most geomagnetic disturbances occur in the evening sector, and the fluctuations in H reflect the rotation of the Earth underneath the auroral oval's typically zonally aligned current systems, which move meridionally backwards and forwards in a sporadic fashion over the magnetometer site.Activity can be expected to repeat at timescales of 1 day and, of course, over solar rotation (Carrington rotation) (e.g.Bartels, 1934) and longer periods, such as the 11-year solar cycle.In order to eliminate these deterministic features from the data set to be studied, a boxcar filter is applied to remove fluctuations less than 24 h; the smoothed time series is then subtracted from the original to arrive at a set of residuals representing the stochastic component.A corresponding method was employed by Hall (2014) and discussed and tested by Hall et al. (2011).As will be shown, an advantage of spectral analysis is that individual periodicities remaining in the (supposedly) stochastic residual show up as narrow spikes and, in practice, have an insignificant influence on the determination of the slope of the spectrum.On the contrary, in DFA such periodicities are in general not distinguishable.
An example of the input data -in this case for 2001, in the middle of the overall time interval -is shown in Fig. 1.The top panel shows the original data in black with a 1day smoothing superimposed.The bottom panel zooms in on 1 June 2001 and shows the result of subtracting the smoothed time series from the original to obtain a residual representing the stochastic component.In addition, the smoothed data are shown with the mean subtracted, corresponding to the upper panel.As will be demonstrated forthwith, the removal of the 1-day running mean did not affect the non-stationarity of the signal on shorter timescales.Other years are similar.In Fig. 2., the top panels show the distribution of the stochastic component with linear (left) and logarithmic (right) ordinates, again from 2001.In a somewhat unsophisticated approach, although adequate for the purpose, the maximum of the distribution and its width at half-maximum are determined.The mean is assumed to be 0 as a result of the subtraction of the deterministic component as illustrated in Fig. 1.Corresponding Gaussian and Cauchy distributions are determined, and these are also shown in the figure .A characteristic of the distributions (for all years) is that they are skewed; this is because the current in the overlying ionosphere tends to have a preferred orientation and perturbations of the horizontal geomagnetic field tend to be negative on average.Qualitatively, the Cauchy distribution is a better description of that of the data than the Gaussian.Again, all years are similar, although for less skewed distributions the suitability of the Cauchy model is even more evident.Furthermore, in contrast to the Gaussian model the data exhibit heavy tails.The bottom panels of Fig. 2 show the quantilequantile (Q-Q) portrayals: left -vs.Gaussian; right -vs.Cauchy.The departures from linearity (i.e. in the central regions of the respective plots) are indicative of long tails at both ends of the distribution relative to the model; see Chambers et al. (1963) for diagnostics of Q-Q plots.The Cauchy distribution reproduces the tails in the data distribution rather better than the Gaussian.Recall, however, that in this simple approach, the half-maximum full-width values have been matched and comparisons might have been improved by choosing 1/e or another arbitrary value.Nonetheless, a Cauchy distribution would always represent the tails in the data distribution better than a Gaussian.It is important to note at this point that a Cauchy process can be defined as a Brownian motion subordinated to a process associated with a Lévy distribution (Sato, 1999).Again, it should be noted that analyses for all 25 years exhibit similar characteristics and, thus, that the results hitherto justify the further investigation that follows.
The next step is to determine the power spectral density and its scaling with respect to frequency.As stated earlier, it is incorrect to presuppose there are no data breaks, and, therefore, a Lomb-Scargle periodogram is derived, rather than the more traditional Fourier transform, but with the time series first having been preconditioned by applying a parabolic window and bridge detrending using the first and last points.The result (again for 2001) is shown in the left panel of Fig. 3.At periods greater than 1 day, fluctuations have been effectively removed by the subtraction of the deterministic component, and, as predicted, discrete peaks at 1 day and (approximately) 12 h remain, demonstrating the method not to be perfect.Vertical dotted and dashed lines indicate familiar timescales.Convincing scaling is evident from 1 day down to approximately 5 min; there is the suggestion of a subrange between 5 and 1 min scales and then a tendency to flattening.The scaling exponent β has, therefore, been obtained over two ∼ 10 min.The linearity weakens after about 12 h, when the curve begins to flatten (as could be anticipated from inspection of the spectrum).Departure from the fitted line is easy to identify on longer timescales but not on shorter timescales, which is contrary to what is seen in the SA approach.On the other hand, DFA results in a much "cleaner" plot that in turn is conducive to deceptively reliable linear fitting.Had timescales > 1 h been excluded from the fit in the DFA, α from the overall SA and α from the DFA would have been similar.It can be argued that SA is easier to interpret because a physicist would normally have some preconception of the processes characterized by different timescales, hidden to some degree when using only DFA.The SA method is preferred here because of closer contact with any underlying physics and reliability of identification of different subranges for linear fitting.At this point the method used here departs somewhat from that used by Hall (2014), in which surrogate data were generated and then compared with the original (Theiler et al., 1992).For the purposes of this study, at least, the generation of, for example, 100 surrogate data sets corresponding to 1-year-long 10 s resolution (i.e. over 3 million points) followed by DFA analyses is not practicable for computational reasons.Inspection of the probability distribution functions combined with visual comparison with known distributions and subsequently Q-Q analyses is deemed to confirm the complex nature of the stochastic process in the data.Again, only 1 year's results are shown here; all 25 years exhibit similar values of α for both the DFA and the two SA subranges.
The results of analysing all 25 years from 1988 to 2013 inclusive, as described and illustrated for 2001 above, are shown in Fig. 4. From top to bottom, the panels show the following: DFA(1), SA (1 day-1 min) and SA (12-1 h).The final panel shows yearly mean sunspot numbers from the Solar Influences Data Analysis Center (SIDC) (Clette, 2011).For each year (small) vertical bars indicate the 1-σ uncertainty in the individual linear fits.It is evident (by comparing axes) that by taking all scales between 1 day and 1 min in the SA, α is always > 1.5.For DFA and SA (12-1 h), the αs are similar and always < 1.5.There are considerable yearto-year variations irrespective of method but no obvious periodicity that could be attributable to the two solar cycles the data set spans.On the other hand, linear regressions reveal trends, which are also indicated in the figure together with 95 % confidence limits according to the method of Working and Hotelling (1929).The trends are small but worth mentioning here to give the possibility of comparison with other studies in future.For DFA the trend is 0.04±0.05century −1 ; for SA (1 day-1 min), 0.2 ± 0.2 century −1 ; and for SA (12-1 h), 0.01 ± 0.1 century −1 .Since, in all three cases, the uncertainties are approximately equal to the trends themselves, none of the values can be considered to be significant.The mean values of α over the 25 years are as follows: DFA, 1.46 ± 0.02; SA (12-1 h), 1.39 ± 0.04; SA (1 day-1 min), 1.54 ± 0.07.

Discussion
To summarize the above findings, all analyses, irrespective of scale, indicate fBm.Both SA used in the regime 12-1 h and DFA (in which the determination of the scaling exponent is weighted towards the longest scales) indicate α ≈ 1.42.For day-minute scales, SA yields α = 1.54, but the uncertainty dictates that the result cannot be regarded as significantly different from, for example, that of the DFA.The results weighted towards longer timescale fluctuations, however, exhibit small enough uncertainties that, taken collectively, one can conclude that α is slightly less than 1.5, the significance of which will be discussed forthwith.Other studies of complexity in geomagnetic, or geomagnetically related, time series have either used isolated periods of, for example, months (Wanliss and Reynolds, 2003;Hamid et al., 2009), albeit with a time resolution comparable to that used here, or much longer derived data sets of, for example, the AE index with a different time resolution.Rypdal and Rypdal (2010) studied the AE index on timescales similar to those addressed here.Obtaining a synthesis, even over several relatively local time series (as is done when determining the AE index, but especially globally -e.g. for the KP index), will tend to even out variances over intradiurnal timescales, e.g. the occurrence of local ionospheric current systems.Any nonstationarities may be masked out such that local determinations of α and identification of processes as fGn or fBm may well differ.Wanliss and Reynolds (2003) and Hamid et al. (2009) employed geomagnetic data from specific stations, but for a low latitude.In contrast to the analyses here, Wanliss and Reynolds (2003) examined only a short time interval of 5 days but for six different Southern Hemisphere sites, determining α to increase approximately with increasing latitude (southward) from α = 1.55 to α = 1.69, i.e. fBm but with consistently slightly higher exponents than determined here.Hamid et al. (2009) employed a longer data set (1 month) than Wanliss and Reynolds (2003) but for two sites.However, Hamid et al. (2009) categorized days as active or quiet, finding that active days exhibited α = 1.64 and 1.55 for the two sites and quiet days exhibited α = 1.45 and 1.33.In this study, no attempt has been made to pick out quiet and disturbed days.Over the course of an entire year, however, it could be expected that quiet days predominate, considering that the rapid perturbations extracted from the observation and deemed to be a stochastic component rely on current systems being approximately over the magnetometer.The findings in this study, viz.that α lies slightly under 1.5, can be considered as being in good agreement with those of Wanliss and Reynolds (2003) and Hamid et al. (2009).In order to attempt to discriminate between active and quiet years (e.g.Vaquero et al., 2014), however, annual mean sunspot numbers have been plotted in the bottom panel of Fig. 4.There is no conclusive correlation between solar activity and the values of α from the two methods (and two scaling ranges).If the trends could be considered significant (which they are not), they might be seen to anticorrelate with overall solar activity over the 25 years (quieter sun), which would contradict the suggestion by Hamid et al. (2009) that active days are characterized by α > 1.5.Wanliss and Reynolds (2003) point to several publications employing high-latitude geomagnetic data, but these use the AE index.Takalo et al. (1994) find that the AE index scales with α ≈ 1 for low frequencies (timescales > 100 min) and α ≈ 1.5 for high frequencies (1-100 min, and therefore shorter than typical substorm durations).Note that Takalo et al. (1994) use "α" for power-law dependence whereas this study uses "β", and thereafter Rypdal and Rypdal (2010) convert the value of Takalo et al. (1994) to "H " -the Hurst exponent.In terms of the classification used here, and also by Kantelhardt et al. (2006), the AE appears as a non-stationary process for intra-substorm timescales, with α ≈ 1.5 but becoming stationary (i.e.fGn) when only > substorm timescales are considered.This is compatible with the findings here because, at any given instant, it is unlikely that the same geomagnetic fluctuations will be registered by more than a few geographically grouped observations: the AE and the stochastic component of H should be expected to exhibit similar signatures on short timescales, and this is the case here.
Other studies of geomagnetic signatures include analysis of the disturbance storm time (Dst) index, e.g. by Balasis et al. (2006), who calculate, explicitly, β for periods during 2001 and for the entire year.Scaling was examined in the range of 5 days-2 h and indicated α between ∼ 1.4 and ∼ 1.6.However, inspection of the spectra, particularly for the whole year (as used here, too), revealed that there is the suggestion of a change of slope at ∼ 10 h such that the higherfrequency subrange would yield a slightly higher (presumably > 1.5) result for α.This is not the same as the spectral breakpoint mentioned in this study, but together these two factors illustrate that bicoloured noise (Takalo et al., 1994) may well be present.Recent exploits in comparing complexity signatures are worthy of note: Scafetta and West (2003) proposed terrestrial temperature anomalies to be linked to solar flare intermittency via a Lévy process.Rypdal and Rypdal (2011) find identical multifractal noise signatures in both the AE index and the z component of the interplanetary magnetic field suggestive of the existence of mechanisms linking intermittency in the two.These studies utilized very long data sets, typically of a 1-month time resolution, aimed at facilitating better trend analyses and prediction of future climate.Here, similar techniques are employed on shorter (∼ years) data sets with higher time resolution (∼ seconds) but to help identify differences between geographically local complexity.Kantelhardt et al. (2006), using different hydrological data types, explain how results from shorter-term data could be modelled by an autoregressive moving average (ARMA) (Whittle, 1951), but, given the time series' lengths and similarities with other work, fBm seems a good candidate for modelling the stochastic nature of the geomagnetic field.

Conclusions
To conclude: 25 years of 10 s local measurements of the horizontal component of the geomagnetic field are examined, 1 year at a time.All variability with timescales of 1 day or longer, thus including anticipated and/or known periodicities, are removed.This is analogous to deseasonalization of, for example, monthly temperature data -a common practice for meteorological time series but, apparently, not necessarily the case, nor meaningful, for studies of the AE index.Thereafter, quantile-quantile (Q-Q) followed by spectral and detrended fluctuation analyses (SA and DFA, respectively) are performed, revealing, as a characteristic common to all years, distributions better described as Cauchy rather than Gaussian.The SA, performed here by a Lomb-Scargle periodogram analysis rather than the more usual Fourier analy-sis in order to allow for data gaps, suggests bicoloured spectra, more difficult to discern if using DFA alone.The resulting generalized Hurst exponents, α, generally indicate antipersistence with values 1.42 ± 0.02 for DFA and 1.39 ± 0.04 for SA.There is a possibility that for shorter timescales (down to 1 min) α ≈ 1.54 ± 0.07.The former is indicative of fractional Brownian motion (fBm) but with a degree of likelihood for anti-persistence, while the latter is indicative of persistence.Taking the uncertainties (σ ) into consideration, none of the analyses yield values of α which are significantly (viz.2σ ) different from 1.5 and, therefore, either significantly anti-persistent or persistent.Nevertheless, the results agree qualitatively with independent findings both derived from local data, as used here, and with zonally synthesized data, as comprises the AE index, for example.In particular, Takalo and Timonen (1994) identified two subranges in the power spectrum of AE, attributing the lower-frequency regime to turbulence in the solar wind and the higher frequencies to origins in the magnetosphere.This would, in addition, support the hypothesis that geomagnetic field variations can be universally described by fractional Brownian motion.Matching this signature with those identified in space weather parameterizations adds to a growing arsenal of tools available for understanding and forecasting the impact of solar activity on the terrestrial environment.This study also points the way to further, although computationally demanding, analyses, including the use of geomagnetic data from other locations, hypothesis testing using surrogates and more refined data selection according to disturbed conditions.

Figure 1 .
Figure 1.H component of geomagnetic field from Tromsø (70 • N, 19 • E) for the year 2001.Data are at 10 s time resolution.The top panel shows the original data in black with a 1-day smoothing superimposed in blue.The bottom panel shows a detail of the residual -the result of subtracting the smoothed time series from the original -for 1 June and with the mean-subtracted 1-day smoothing, again superimposed in blue.

Figure 2 .
Figure 2. Portrayals of the distribution of the stochastic component of the H component of the geomagnetic field from 2001 as shown in Fig. 1.Top left: linear ordinate axis; top right: logarithmic ordinate axis.In the top panels Gaussian (red) and Cauchy (blue) distributions are fitted (explained and discussed in the text).Bottom left: Q-Q plot of the observed data versus Gaussian; bottom right: Q-Q plot of the observed data vs.Cauchy.

Figure 3 .
Figure 3. Spectral (left) and detrended fluctuation (right) analyses for data shown in previous figures.Familiar timescales are indicated by vertical dotted/dashed lines.Fitted scaling exponents are shown by coloured lines together with corresponding values.

Figure 4 .
Figure 4. Generalized Hurst exponents (β) from all years 1988-2013 (with uncertainties).Top panel: DFA method; second panel: SA method using the range of 1 day-1 min; third panel: SA using only 12-1 h.Tentative linear trends are shown together with 95 % confidence limits indicated by dotted hyperbolae.Bottom panel: yearly average sunspot numbers.