Improved singular spectrum analysis for time series with missing data
Abstract. Singular spectrum analysis (SSA) is a powerful technique for time series analysis. Based on the property that the original time series can be reproduced from its principal components, this contribution develops an improved SSA (ISSA) for processing incomplete time series; the modified SSA (SSAM) of Schoellhamer (2001) is a special case of it. The approach is evaluated with synthetic and real incomplete time series of suspended-sediment concentration from San Francisco Bay. Results from the synthetic time series with missing data show that the relative errors of the principal components reconstructed by ISSA are much smaller than those reconstructed by SSAM. When missing data make up 60 % of the time series, the relative errors improve by up to 19.64, 41.34, 23.27 and 50.30 % for the first four principal components, respectively. Both the mean absolute error and the mean root mean squared error of the time series reconstructed by ISSA are also smaller than those obtained with SSAM, with respective improvements of 34.45 and 33.91 % when 60 % of the data are missing. Results from the real incomplete time series also show that the standard deviation (SD) derived by ISSA is 12.27 mg L−1, smaller than the 13.48 mg L−1 derived by SSAM.
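The reproduction property that the abstract relies on can be illustrated with a minimal sketch of basic SSA on a complete series. This is not the ISSA or SSAM algorithm, which the abstract does not specify; it only shows that summing the components reconstructed from all principal components returns the original series. The function name ssa_reconstruct, the window parameter and the synthetic test series are assumptions made for illustration.

```python
import numpy as np


def ssa_reconstruct(x, window):
    """Basic SSA for a complete series (illustrative helper, not ISSA/SSAM).

    Returns an array of shape (window, len(x)); row m is the series
    reconstructed from the m-th principal component alone.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1

    # Trajectory matrix: column j holds the lagged segment x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])

    # Lag-covariance matrix and its eigenvectors, sorted by decreasing eigenvalue.
    cov = traj @ traj.T / k
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]

    # Number of trajectory-matrix entries that map to each time index.
    counts = np.zeros(n)
    for i in range(window):
        counts[i:i + k] += 1

    components = np.zeros((window, n))
    for m in range(window):
        e = eigvecs[:, m]
        pc = e @ traj                # principal component, length k
        elem = np.outer(e, pc)       # rank-one matrix, shape (window, k)
        # Diagonal averaging: entry (i, j) belongs to time index i + j.
        for i in range(window):
            components[m, i:i + k] += elem[i]
        components[m] /= counts
    return components


# Synthetic complete series (assumed for illustration): oscillation plus trend.
t = np.arange(200)
series = np.sin(2 * np.pi * t / 25) + 0.5 * t / 200

components = ssa_reconstruct(series, window=30)
# Summing over all components should reproduce the series up to rounding.
print(np.allclose(components.sum(axis=0), series))
```

Because the eigenvectors form an orthonormal basis, the rank-one matrices sum to the trajectory matrix itself, so the final check is expected to hold up to floating-point rounding. Handling gaps requires changing how the lag covariance and principal components are computed, which this sketch deliberately leaves out.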