New methods and algorithms in data mining, broadly construed to include computational statistics, signal processing, information theory, machine learning, network science, nonlinear dynamics, and database mining, are motivated in the study of climate change and weather extremes by (a) the massive volume and complexity of the data, (b) the strengths and limitations of our physical understanding and of physics-based computer models, (c) multivariate dependence in space–time, including long-memory processes and long-range spatial dependence, (d) the presence of colored and even 1/f noise, along with chaos and nonlinear dynamics, and (e) the growing importance of extreme values and rare events. However, data mining may lead to spurious insights unless appropriate precautions are taken, and may even generate misleading results when complex dependence predominates and when processes are chaotic. Under non-stationary, or changing, conditions, confidence in data mining approaches alone may be limited even further. Incorporating physics into data mining algorithms and methods can improve the interpretability of results, lead to better generalization, and produce meaningful insights.
This special issue presents recent scientific developments in physics-guided data mining, where the physics is incorporated within the data-driven models through, for example, variable selection, learning of data-driven or network models, effective pre- or post-processing, interpretability, and explainability. Contributions are broadly focused on physics-guided mining of weather and climate data, where the data may be obtained from in situ and remote-sensing observations, paleoclimate reconstructions, reanalysis products, and numerical simulations from physics-based weather and climate models. Such developments are intended to provide new insights into climate change and/or weather extremes, such as heat waves, cold snaps, heavy precipitation, floods, droughts, tropical and extratropical cyclones, tornadoes, and storm surges, as well as climate variability and change on interannual to glacial–interglacial timescales, both historical and projected. In addition, novel studies on statistical downscaling, data assimilation, large-scale optimization, and stochastic differential-equation-based methods in conjunction with physics-guided data mining are presented.