An enhanced correlation identification algorithm and its application on spread spectrum induced polarization data

In spread spectrum induced polarization (SSIP) data processing, attenuating background noise in the observed data is the essential step for improving the signal-to-noise ratio (SNR). The traditional correlation identification (TCI) algorithm has been proposed to improve the SNR of these data. However, signal processing in the presence of background noise remains a challenging problem. We propose an enhanced correlation identification (ECI) algorithm to attenuate the background noise. In this algorithm, cross-correlation matching is used to extract the useful components of the raw SSIP data and to suppress background noise. The frequency-domain IP (FDIP) method is then used to extract the frequency response of the observation system. Even when the SNR is -37.5 dB, the ECI algorithm keeps the relative error at 3.0%. Experiments on both synthetic and real SSIP data show that the ECI algorithm not only suppresses the background noise but also better preserves the valid information in the raw SSIP data, revealing the actual location and shape of adjacent high-resistivity anomalies, which can improve subsequent steps in SSIP data processing and imaging.


Introduction
Induced polarization (IP) technology, operated in both the time domain and the frequency domain, is useful in groundwater mapping, mineral exploration, and other environmental studies (Revil, 2012, 2019; Høyer et al., 2018). Since the IP phenomenon in the time domain was first discovered by Schlumberger in the 1920s, there have been consistent efforts to exploit it in various research fields. In 1959, the frequency-domain IP (FDIP) approach was proposed by Collett and Seigel, and it became the most classic and widely used mapping technique. For example, the first variable-frequency approach was proposed by Wait et al. in 1959, the spectrum approach of the complex resistivity was developed by Zonge and Wynn in 1975, and the dual-frequency IP approach was presented and developed by He et al. in 1993 and Han et al. in 2013. Recently, spread spectrum induced polarization (SSIP) has become a popular branch of FDIP; it uses pseudo-random current pulses of opposite polarity as an excitation source (Chen et al., 2007; Xi et al., 2013, 2014; He et al., 2015). Owing to the intrinsic broadband characteristics of the source, the spectral response of an observation system can be calculated simultaneously at multiple frequencies in electrical exploration. Thus, SSIP technology has been gaining attention and application in electrical prospecting (Xi et al., 2014; Lu et al., 2019; Wang and He, 2020).

SSIP technology has a certain degree of noise immunity, but the data are still contaminated by unavoidable background noise during IP data acquisition. The background noise falls mainly into two types: Gaussian noise and impulsive interference with different percentages of outliers (Kimiaefar et al., 2018; Li et al., 2019). If the background noise is not effectively reduced, the residual noise can affect the calculation of complex resistivity and may mislead subsequent interpretations of the subsurface structure.

The field of FDIP denoising has achieved good results through sustained research. Many algorithms can be used to suppress FDIP random noise (Mo et al., 2017), such as the smooth filter (Guo, 2017), the mean stack (Liu, 2015), the digital filter (Meng et al., 2015), and robust stacking. The smooth filter can effectively attenuate Gaussian noise, but impulsive interference with intense energy limits its effectiveness. Therefore, effective attenuation of background noise is still a challenging task for traditional noise suppression algorithms (Neelamani et al., 2008; Liu et al., 2017). The SSIP method faces the same issue (Liu et al., 2017). Recently, a new algorithm based on a circular cross-correlation method, the traditional correlation identification (TCI) algorithm, has also been used to suppress SSIP noise (Li et al., 2013; Zhang et al., 2020). Owing to its effective denoising ability, the identification method has gained more attention and development. However, the TCI algorithm is restricted because the excitation signal is sensitive to random noise. To address this problem, we propose an enhanced correlation identification (ECI) algorithm for reducing the noise in SSIP data. The ECI algorithm obtains cross-correlations between the transmitter output signal, the excitation signal, and the response signal. Its performance is demonstrated on both synthetic and field SSIP data. Experimental results show that the ECI algorithm can effectively limit the increase in the root mean square of noise (NRMS), enhance denoising performance in background noise, and improve valid-signal preservation, displaying the actual location and shape of high-resistivity anomalies with higher resolution.

https://doi.org/10.5194/npg-2020-8 Preprint. Discussion started: 26 June 2020. © Author(s) 2020. CC BY 4.0 License.

between the electrodes M and N is measured. To obtain the spectral response of the subsurface at various frequencies simultaneously, an excitation signal $i(t)$ based on a pseudo-random sequence is considered. The spectral response of the subsurface can then be retrieved by the TCI algorithm and expressed as (Li et al., 2013):

$$H(f) = \frac{S_{ui}(f)}{S_{ii}(f)} \qquad (1)$$

where $S_{ui}(f)$ is the cross-power spectral density of $u(t)$ and $i(t)$, $S_{ii}(f)$ is the auto-power spectral density of $i(t)$, and $H(f)$ is the impulse spectral response of the observing system. Given this observation mode using low-power signals, the magnetotelluric system is supposed to be time-invariant, and Eq. (1) can further be expressed as Eq. (2), which involves a geometric factor, the correlation function of $i(t)$, the frequency spectra $U(f)$ and $I(f)$ of $u(t)$ and $i(t)$, respectively, and the time delay $\tau$.
In the practical field environment, this observation mode is contaminated by background noise, as shown in Figure 2. The output of each sensor $A_k$ ($k = 1, 2, 3$) therefore includes the background noise $n_k(t)$. According to Eq. (2), the formula of the TCI algorithm is given as Eq. (6). Eq. (6) demonstrates that the TCI algorithm has a weak denoising effect when $n_3(t)$ is intense. In other words, the TCI algorithm depends on the energy of the noise $n_3(t)$ present in $i(t)$.
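
The dependence of the TCI estimate on noise in $i(t)$ can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes the common estimator $H(f) = S_{ui}(f)/S_{ii}(f)$ with spectra averaged over periods, a hypothetical 7-chip pseudo-random excitation (`CHIPS`), a toy two-tap system (`A`, `B`), and additive Gaussian noise; all names are illustrative. Under those assumptions, noise on $i(t)$ inflates $S_{ii}(f)$ and pulls the magnitude of the estimate below the true response.

```python
import cmath
import random

CHIPS = [1, 1, 1, -1, -1, 1, -1]   # assumed 7-chip pseudo-random period of i(t)
A, B = 2.0, 0.5                    # assumed toy system: u[n] = A*i[n] + B*i[n-1]

def dft_bin(x, k):
    """k-th DFT coefficient of one period x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))

def true_response(k, n=len(CHIPS)):
    """Frequency response of the toy system at bin k."""
    return A + B * cmath.exp(-2j * cmath.pi * k / n)

def simulate(periods, sigma_i, sigma_u, rng):
    """Return per-period noisy records of i(t) and u(t)."""
    n = len(CHIPS)
    clean_u = [A * CHIPS[m] + B * CHIPS[m - 1] for m in range(n)]  # circular delay
    i_rec = [[c + rng.gauss(0, sigma_i) for c in CHIPS] for _ in range(periods)]
    u_rec = [[c + rng.gauss(0, sigma_u) for c in clean_u] for _ in range(periods)]
    return i_rec, u_rec

def tci_estimate(i_rec, u_rec, k):
    """Estimate H at bin k as S_ui / S_ii, both accumulated over periods."""
    s_ui = 0j
    s_ii = 0.0
    for i_p, u_p in zip(i_rec, u_rec):
        i_k, u_k = dft_bin(i_p, k), dft_bin(u_p, k)
        s_ui += u_k * i_k.conjugate()
        s_ii += abs(i_k) ** 2
    return s_ui / s_ii

rng = random.Random(0)
k = 1
clean = tci_estimate(*simulate(200, 0.0, 0.0, rng), k)   # no noise at all
noisy = tci_estimate(*simulate(200, 1.0, 0.0, rng), k)   # noise only on i(t)
print(abs(true_response(k)), abs(clean), abs(noisy))
```

In this toy setting the noise-free estimate reproduces the assumed response, while noise confined to the excitation channel alone is enough to shrink the estimated magnitude.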

The ECI theoretical model
The denoising ability of the TCI algorithm is limited because $i(t)$ is sensitive to $n_3(t)$. To solve this problem, the ECI algorithm is proposed; its derivation is as follows.

First, let us suppose that the telluric system is time-invariant under low-power signals. Then $y_2(t)$ and $y_3(t)$ can further be expressed using $A_2$ and $A_3$, which denote the response coefficients for the excitation signal $i(t)$ and the response signal $u(t)$, respectively.

For the three sensor output signals, the cross-correlation functions are periodic functions of the time delay $\tau$. When the length of the correlation window $NT$ is specified, the cross-correlation functions contain the noise terms $R_{n_1 n_2}(\tau)$ and $R_{n_1 n_3}(\tau)$, the cross-correlations of $n_1(t)$ with $n_2(t)$ and $n_3(t)$, respectively, where $\tau$ is the time delay, lying in the range $-NT$ to $NT$. According to the design features of ZW-CMDSII, $n_1(t)$, caused mainly by the background noise of the instrument, has relatively small energy. $n_2(t)$ and $n_3(t)$ may have much larger energy because they are mainly composed of field background noise. We can therefore assume that the cross-correlations between these background noises are close to zero, that is, $R_{n_1 n_2}(\tau) \approx 0$ and $R_{n_1 n_3}(\tau) \approx 0$. Based on the above analyses, the cross-correlation functions can be simplified further. Finally, according to Eq. (2) and Eq. (9), Eq. (10) can be re-expressed as Eq. (13), which is the formula of the ECI algorithm. Its derivation shows that the ECI algorithm can effectively suppress the background noise and is independent of the level of $n_3(t)$ present in $i(t)$.
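
One way to read this derivation is that correlating both field channels against the low-noise transmitter channel removes the noise terms, because the noise on the transmitter channel is uncorrelated with the field noise. The sketch below illustrates that reading only and should not be taken as the formula of Eq. (13). It assumes a hypothetical 7-chip transmitter sequence (`CHIPS`), a current proportional to it (`GAIN`), a toy two-tap system (`A`, `B`), and independent Gaussian noise on each channel; the response is estimated as a ratio of cross-spectra taken against the transmitter channel, and compared with the direct $S_{ui}/S_{ii}$ estimate.

```python
import cmath
import random

CHIPS = [1, -1, -1, 1, -1, 1, 1]   # assumed period of the transmitter output x(t)
GAIN = 0.1                         # assumed gain from x(t) to the current i(t)
A, B = 2.0, 0.5                    # assumed toy system: u[n] = A*i[n] + B*i[n-1]

def dft_bin(x, k):
    """k-th DFT coefficient of one period x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))

def true_response(k, n=len(CHIPS)):
    """Frequency response of the toy system at bin k."""
    return A + B * cmath.exp(-2j * cmath.pi * k / n)

def compare(periods, sigma_field, rng, k=1, sigma_inst=0.01):
    """Return (direct, referenced) estimates of the response at bin k."""
    n = len(CHIPS)
    i_clean = [GAIN * c for c in CHIPS]
    u_clean = [A * i_clean[m] + B * i_clean[m - 1] for m in range(n)]
    s_ui = s_ux = s_ix = 0j
    s_ii = 0.0
    for _ in range(periods):
        y_x = [c + rng.gauss(0, sigma_inst) for c in CHIPS]     # x(t) + small instrument noise
        y_i = [c + rng.gauss(0, sigma_field) for c in i_clean]  # i(t) + field noise
        y_u = [c + rng.gauss(0, sigma_field) for c in u_clean]  # u(t) + field noise
        x_k, i_k, u_k = dft_bin(y_x, k), dft_bin(y_i, k), dft_bin(y_u, k)
        s_ui += u_k * i_k.conjugate()   # direct cross-spectrum
        s_ii += abs(i_k) ** 2           # auto-spectrum, inflated by noise on i(t)
        s_ux += u_k * x_k.conjugate()   # cross-spectra against the transmitter channel
        s_ix += i_k * x_k.conjugate()
    return s_ui / s_ii, s_ux / s_ix

h = true_response(1)
direct, referenced = compare(2000, 0.1, random.Random(1))
direct_err = abs(direct - h) / abs(h)
referenced_err = abs(referenced - h) / abs(h)
print(direct_err, referenced_err)
```

In this toy setting the referenced estimate stays close to the assumed response even though the noise power on the current channel is comparable to the signal power, whereas the direct estimate is strongly biased.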

Experiment on synthetic SSIP data record
We test the ECI algorithm for attenuating background noise in SSIP data sets in comparison with the FDIP algorithm and the TCI algorithm. For the comparison, the SNR, the root mean square of noise (NRMS), and the relative error ($\varepsilon$) are used as the objective parameters for judging denoising performance.

We use synthetic Gaussian noise with a standard deviation of 0.1 and a mean of 1.1 as a standard template. The excitation signal $i(t)$ is contaminated by synthetic Gaussian noise at different energy levels, as shown in Figure 5. The denoised results obtained by the ECI algorithm and the others are shown in Figures 5a-c, where the relative error and SNR of the excitation signal $i(t)$ are calculated and compared at the three main frequencies as the noise RMS ranges from 0 to 0.9. Figure 5b shows that the denoising ability of the TCI algorithm depends on the energy of the noise present in the excitation signal $i(t)$, consistent with the conclusion of Eq. (7). Meanwhile, Figure 5 indicates that the ECI algorithm has the best denoising performance under the same noise RMS and SNR. In particular, when the SNR is -37.5 dB, the ECI algorithm still keeps the relative error at 3.0% at the primary frequency of 160 Hz.

Previous literature has shown that traditional denoising algorithms are limited when the percentage of outliers in impulsive noise exceeds 50% (Liu et al., 2017). Thus, synthetic impulsive noise is added to the excitation signal $i(t)$ in ten-percent steps. The standard deviations (SDs) and skewnesses (SKs) are shown in Figure 6. As depicted in Figure 7, all three algorithms show a certain degree of denoising performance across the different percentages of synthetic outliers in the raw data. However, the ECI algorithm still has superior denoising performance and smaller volatility of the relative error when the percentage of outliers is greater than 50%.
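
The quantities used in this comparison can be computed along the lines of the following sketch. The definitions are common textbook forms and are assumptions here, not necessarily the exact formulas used in the experiments; the function names are illustrative.

```python
import math

def snr_db(signal, noise):
    """SNR in dB as the ratio of signal energy to noise energy (assumed definition)."""
    p_signal = sum(s * s for s in signal)
    p_noise = sum(v * v for v in noise)
    return 10.0 * math.log10(p_signal / p_noise)

def noise_rms(noise):
    """Root mean square of the noise samples."""
    return math.sqrt(sum(v * v for v in noise) / len(noise))

def relative_error_percent(estimate, reference):
    """Relative error of an estimate against a reference value, in percent."""
    return 100.0 * abs(estimate - reference) / abs(reference)

def noise_to_signal_amplitude(snr):
    """RMS noise amplitude divided by RMS signal amplitude for a given SNR in dB."""
    return 10.0 ** (-snr / 20.0)

print(noise_to_signal_amplitude(-37.5))
```

Under this definition, an SNR of -37.5 dB corresponds to noise whose RMS amplitude is about 75 times that of the signal.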

Experiment on real SSIP data record
To further verify the performance of the ECI algorithm, the Wenner array, the system traditionally applied in the field, was selected for laboratory tests, as shown in Figure 8. SSIP data were acquired with a high-density meter and 20 electrodes at 1 m spacing. A Wenner acquisition sequence was adopted with 55 potential measurements, marked by the green and red points. The figure shows an example with two high-resistance cavities, labeled A and B, whose dimensions were about 1.8 m × 2 m. The measured excitation signal ranged approximately from 0.04 to 0.19 A. The transmitter output signal is a third-order sequence with a frequency of 80 Hz and a voltage of about ±11.8 V. The sampling frequency is 625 kHz. The excitation and response data of 40 periods were recorded at each point.
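
If the third-order sequence is a maximal-length (m-) sequence, as is common for pseudo-random sources, its period is $2^3 - 1 = 7$ chips. The sketch below generates one such period with a linear feedback shift register. The feedback taps (the recurrence a[n+3] = a[n] XOR a[n+1]) are an assumption, since the generator used by the transmitter is not specified here, and the function names are illustrative. It also checks the two-valued circular autocorrelation that makes such sequences suitable for correlation identification.

```python
def m_sequence(order=3, taps=(0, 1)):
    """One period of the binary sequence a[n+order] = XOR of a[n+t] for t in taps."""
    state = [1] + [0] * (order - 1)      # any non-zero start state works
    out = []
    for _ in range(2 ** order - 1):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]   # shift and append the feedback bit
    return out

def circular_autocorr(x):
    """Circular autocorrelation of x for every lag from 0 to len(x) - 1."""
    n = len(x)
    return [sum(x[m] * x[(m + lag) % n] for m in range(n)) for lag in range(n)]

bits = m_sequence()
bipolar = [1 if b else -1 for b in bits]  # map bits to opposite-polarity pulses
print(bits)                               # [1, 0, 0, 1, 0, 1, 1]
print(circular_autocorr(bipolar))         # [7, -1, -1, -1, -1, -1, -1]
```

The single sharp peak at zero lag is what allows a cross-correlation with the source sequence to pick out the system response at each delay.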

Figure 9 shows the experimental SSIP data processed by the three algorithms and inverted with Res2DInv. The location and shape of the two anomalous bodies are distinguished only with the ECI algorithm, whereas the other algorithms resolve them as a single body. We attribute the higher detection precision of the ECI algorithm to its stronger denoising ability.

Figure 9. Inverted resistivity sections of the two high-resistivity anomalies at 80 Hz using (a) the FDIP method, (b) the TCI algorithm, and (c) the ECI algorithm.

To verify the reason for the improved detection precision, the SD at each red point is calculated, as shown in Figure 10. The figure shows that, across the 33 SD values, the SD of the SSIP data processed by the ECI method is the lowest at all points except the 28th. The average SD with ECI processing is 7% and 3.8% lower than with FDIP and TCI, respectively. Also, the maximum SD with the ECI method is 5% and 1.4% lower than with the others, and the minimum SD is 8% and 10% lower, respectively. These results quantitatively indicate that the ECI algorithm improves the accuracy and robustness of the collected data. Therefore, we believe that the ECI algorithm has an advantage in suppressing background noise, which benefits the subsequent steps in SSIP data processing and imaging.

Figure 10. Standard deviation (SD) of the ECI algorithm and the others at the red data points at 80 Hz.

Conclusions
We propose the ECI algorithm, which effectively attenuates the background noise in SSIP data and improves the complex resistivity spectrum. The method uses the correlation function to neutralize the influence of background noise in the SSIP data, and the complex resistivity spectrum can be calculated at multiple frequencies by the formula of the complex resistivity. Simulation results show that the ECI algorithm can effectively attenuate the background noise and improve the SNR. The practicability of the ECI algorithm is further verified by a field test. The results demonstrate that the SD of the SSIP data is reduced, which benefits the accuracy and stability of the collected data. There is good agreement between the complex resistivity and the high-resistance geological target body, which indicates that the ECI algorithm can help improve the quality of interpretation and inversion in the survey area. Furthermore, simulation experiments show that the denoising ability of the ECI method is less effective when the SSIP data are contaminated by weak Gaussian random noise. Therefore, denoising based on pseudo-random sequence correlation identification remains open for further investigation.