24 Sep 2018
Status: this preprint was under review for the journal NPG but the revision was not accepted.

Exploring the effects of missing data on the estimation of fractal and multifractal parameters based on bootstrap method

Xin Gao and Xuan Wang

Abstract. Time series collected in nature are often incomplete or contain missing values, and statistical inference on a population or process with missing values, especially one with multifractal properties, is easily overlooked. In this study, simulated and real data were used to obtain the probability distributions of fractal parameters through a new bootstrap resampling mechanism, with the aim of statistically inferring the estimation accuracy for time series containing missing values and for four kinds of interpolated series. First, the RMS errors showed that, for the single parameter H required for fractional Brownian motion (fBm), direct use of the series with missing values gives higher estimation accuracy than any of the four interpolation methods. For the multifractal parameters C1 and α, however, direct use shows some instability, especially at higher missing levels, and preprocessing with the piecewise linear interpolation (PLI) method can improve the accuracy of the estimated parameters. In addition, α is more sensitive than C1 to the changes caused by these processing steps.
Second, the effects of data loss on the ability to make statistical inferences about a population were explored through confidence interval estimation and hypothesis testing, using a newly proposed bootstrap resampling mechanism. The results showed that, for both the mono-fractal parameter and the multifractal parameters, the estimates from the series with missing values deviate strongly from those of the original series when the losses are severe, while preprocessing with the PLI and PBI methods can compensate for this defect. Similarly, the results for the incomplete series are close to those of the original and PLI series at low missing levels, but at high missing levels the probabilities of Type II errors for neighboring values cannot be ignored; the PLI or PBI method can avoid these erroneous judgments.
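
The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the bootstrap mechanism proposed in the paper, which the abstract does not specify. It assumes ordinary Brownian motion (H = 0.5) as the test signal, a second-order structure-function estimator for H, values missing completely at random, and a generic moving-block bootstrap of the increments. The function names (`fill_linear`, `hurst_structure`, `block_bootstrap_ci`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fill_linear(x):
    """Piecewise linear interpolation (PLI) across NaN gaps.
    Leading or trailing gaps are held at the nearest observed value."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(x.size)
    ok = ~np.isnan(x)
    return np.interp(idx, idx[ok], x[ok])

def hurst_structure(x, lags=(1, 2, 4, 8, 16)):
    """Estimate H as half the log-log slope of the second-order
    structure function; pairs that involve a NaN are skipped."""
    x = np.asarray(x, dtype=float)
    s2 = [np.nanmean((x[lag:] - x[:-lag]) ** 2) for lag in lags]
    slope = np.polyfit(np.log(lags), np.log(s2), 1)[0]
    return slope / 2.0

def block_bootstrap_ci(x, estimator, n_boot=200, block=64, level=0.95, seed=0):
    """Percentile confidence interval from a moving-block bootstrap of the
    increments of a gap-free series (an assumed, generic scheme)."""
    rng = np.random.default_rng(seed)
    dx = np.diff(np.asarray(x, dtype=float))
    n_blocks = int(np.ceil(dx.size / block))
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Draw block start positions with replacement and stitch the blocks.
        starts = rng.integers(0, dx.size - block + 1, size=n_blocks)
        resampled = np.concatenate([dx[s:s + block] for s in starts])[:dx.size]
        # Re-integrate the resampled increments into a surrogate path.
        stats[b] = estimator(np.cumsum(resampled))
    tail = (1.0 - level) / 2.0
    return np.quantile(stats, [tail, 1.0 - tail])

# Synthetic example: Brownian motion with 20% of the values removed at random.
rng = np.random.default_rng(42)
original = np.cumsum(rng.standard_normal(4096))
gappy = original.copy()
gappy[rng.random(original.size) < 0.2] = np.nan
filled = fill_linear(gappy)

h_orig = hurst_structure(original)    # complete series
h_gappy = hurst_structure(gappy)      # series used directly, gaps skipped
h_filled = hurst_structure(filled)    # series after PLI preprocessing
ci_low, ci_high = block_bootstrap_ci(filled, hurst_structure)

print(f"H original={h_orig:.3f}  with gaps={h_gappy:.3f}  PLI={h_filled:.3f}")
print(f"bootstrap 95% interval for the PLI series: [{ci_low:.3f}, {ci_high:.3f}]")
```

Repeating the synthetic example over many realizations and missing levels, and taking the root mean square of the differences from the known H, would give an RMS error comparison of the kind the abstract describes.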

Short summary
An observation series obtained in the natural world is often incomplete and contains many missing values. Many natural phenomena, however, exhibit fractal or multifractal behavior, so fractal modeling of a time series with missing values carries some uncertainty. In this study, simulated and real data were used to statistically infer the estimation accuracy for time series containing missing values and for four kinds of interpolated series.