https://doi.org/10.5194/npg-2018-38
24 Sep 2018
Status: this preprint was under review for the journal NPG but the revision was not accepted.

Exploring the effects of missing data on the estimation of fractal and multifractal parameters based on bootstrap method

Xin Gao and Xuan Wang

Abstract. Time series collected in nature are often incomplete or contain missing values, and the effect of those gaps on statistical inference about the underlying population or process, especially one with multifractal properties, is easily overlooked. In this study, simulated and real data were used to obtain the probability distributions of fractal parameters through a new bootstrap resampling mechanism, with the aim of statistically assessing the estimation accuracy of time series containing missing values and of four kinds of interpolated series. First, the RMS error results showed that, for the single parameter H required for fBm, direct use of the series with missing values gives higher estimation accuracy than any of the four interpolation methods. For the multifractal parameters C1 and α, however, direct use of the incomplete series shows some instability, especially at higher missing levels, and accuracy can be improved by preprocessing with the piecewise linear interpolation (PLI) method. In addition, α is more sensitive than C1 to the changes caused by this processing.
Second, the effects of data loss on the ability to make statistical inferences about a population were explored through confidence interval estimation and hypothesis testing, using a newly proposed bootstrap resampling mechanism. The results showed that, for both the monofractal parameter and the multifractal parameters, estimates from the series with missing values deviate strongly from those of the original series when the losses are severe, and that preprocessing with the PLI and PBI methods can compensate for this defect. Similarly, although the results for the incomplete series at low missing levels are close to those for the original and PLI series, at high missing levels the probability of Type II errors for neighboring values cannot be ignored, whereas the PLI or PBI method avoids such erroneous judgments.
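
The comparison described in the abstract (estimating H directly from a series with gaps versus from a linearly interpolated series, and scoring both by RMS error) can be illustrated with a minimal sketch. It is not the authors' estimator or their bootstrap mechanism, neither of which is specified here. It assumes ordinary Brownian motion (H = 0.5) as the simulated series, values missing completely at random, and a simple second-order increment scaling estimate of H. The function names fill_linear, hurst_from_increments and rmse are hypothetical.

```python
import numpy as np


def fill_linear(x):
    """Piecewise linear interpolation across NaN gaps (edge gaps held constant)."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(x.size)
    ok = ~np.isnan(x)
    return np.interp(idx, idx[ok], x[ok])


def hurst_from_increments(x, lags=(1, 2, 4, 8, 16)):
    """Estimate H from the scaling E[(x(t+lag) - x(t))^2] ~ lag^(2H).

    Assumed estimator for illustration only. Pairs involving a NaN are
    skipped, so a series with gaps can be passed in directly.
    """
    x = np.asarray(x, dtype=float)
    msq = [np.nanmean((x[lag:] - x[:-lag]) ** 2) for lag in lags]
    slope = np.polyfit(np.log(lags), np.log(msq), 1)[0]
    return slope / 2.0


def rmse(estimates, truth):
    """Root-mean-square error of a set of estimates against a known value."""
    e = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((e - truth) ** 2)))


rng = np.random.default_rng(0)
n, n_series, missing_frac, true_h = 4096, 20, 0.3, 0.5  # illustrative settings

h_gappy, h_pli = [], []
for _ in range(n_series):
    series = np.cumsum(rng.standard_normal(n))    # Brownian motion, H = 0.5
    gappy = series.copy()
    gappy[rng.random(n) < missing_frac] = np.nan  # values missing at random
    h_gappy.append(hurst_from_increments(gappy))
    h_pli.append(hurst_from_increments(fill_linear(gappy)))

print("RMSE of H, gaps left in place:  ", rmse(h_gappy, true_h))
print("RMSE of H, linear interpolation:", rmse(h_pli, true_h))
```

Changing missing_frac gives a rough feel for how the two treatments diverge as the missing level rises. The multifractal parameters C1 and α would need a different estimator and are not covered by this sketch.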

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Short summary
Observation series obtained in the natural world are often incomplete and contain many missing values. Because many natural phenomena are fractal or multifractal, fractal modeling of a time series with missing values carries uncertainty. In this study, simulated and real data were used to statistically assess the estimation accuracy of time series containing missing values and of four kinds of interpolated series.