Until the 1980s, scaling notions were restricted to self-similar homogeneous special cases. I review developments over the last decades, especially in multifractals and generalized scale invariance (GSI). The former is necessary for characterizing and modelling strongly intermittent scaling processes, while the GSI formalism extends scaling to strongly anisotropic (especially stratified) systems. Both of these generalizations are necessary for atmospheric applications. The theory and some of the now burgeoning empirical evidence in its favour are reviewed.

Scaling can now be understood as a very general symmetry principle. It is needed to clarify and quantify the notion of dynamical regimes. In addition to the weather and climate, there is an intermediate “macroweather regime”, and at timescales beyond the climate regime (up to Milankovitch scales), there are macroclimate and megaclimate regimes. By objectively distinguishing weather from macroweather, scaling answers the question “how long does weather last?”. Dealing with anisotropic scaling systems – notably atmospheric stratification – requires new (non-Euclidean) definitions of the notion of scale itself. These are needed to answer the question “how big is a cloud?”. In anisotropic scaling systems, morphologies of structures change systematically with scale even though there is no characteristic size. GSI shows that it is unwarranted to infer dynamical processes or mechanisms from morphology.

Two “sticking points” preventing more widespread acceptance of the scaling paradigm are also discussed. The first is an often implicit phenomenological “scalebound” thinking that postulates a priori the existence of new mechanisms and processes every factor of 2 or so in scale. The second obstacle is the reluctance to abandon isotropic theories of turbulence and accept that the atmosphere's scaling is anisotropic. Indeed, there currently appears to be no empirical evidence that the turbulence in any atmospheric field is isotropic.

Most atmospheric scientists rely on general circulation models, and these are scaling – they inherited the symmetry from the (scaling) primitive equations upon which they are built. Therefore, the real consequence of ignoring wide-range scaling is that it blinds us to alternative scaling approaches to macroweather and climate – especially to new models for long-range forecasts and to new scaling approaches to climate projections. Such stochastic alternatives are increasingly needed, notably to reduce uncertainties in climate projections to the year 2100.

Perhaps the most obvious difficulty in understanding the atmosphere is in dealing with its enormous range of scales. The single picture in Fig. 1 shows clouds with horizontal spatial variability ranging from millimetres to the size of the planet, a factor of 10 billion in scale. In the vertical direction the range is more modest but still huge: about 10 million. The range of temporal variability is extreme, spanning a range of 100 billion billion: from milliseconds to the planet's age (Fig. 2).
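A quick order-of-magnitude check of these ranges (the reference values below – the planetary circumference, millimetric cloud features, and the Earth's age – are assumed round numbers for illustration):

```python
# rough arithmetic behind the scale ranges quoted above (assumed round values)
planet_scale_m = 4.0e7        # planetary circumference, ~40 000 km
smallest_feature_m = 4.0e-3   # millimetric cloud features

age_of_earth_s = 4.5e9 * 3.15e7   # ~4.5 Gyr expressed in seconds
dissipation_time_s = 1.0e-3       # millisecond dissipation time

print(planet_scale_m / smallest_feature_m)    # ~1e10: "a factor of 10 billion"
print(age_of_earth_s / dissipation_time_s)    # ~1.4e20: "100 billion billion"
```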

The earliest approach to atmospheric variability was phenomenological: weather as a juxtaposition of various processes with characteristic morphologies, air masses, fronts, and the like. Circumscribed by the poor quality and quantity of the then available data, these were naturally associated with narrow-scale-range, mechanistic processes.

At first, ice ages, “medieval warming”, and other evidence of low-frequency processes were only vaguely discerned. Weather processes were thought to occur with respect to a relatively constant (and unimportant) background: climate was conceived as simply long-term “average” weather. It was not until the 1930s that the International Meteorological Organisation defined “climate normals” in an attempt to quantify the background “climate state”. The duration of the normals – 30 years – was imposed essentially by fiat: it conveniently corresponded to the length of high-quality data then available: 1900–1930. This 30-year duration is still with us today with the implicit consequence that – purely by convention – “climate change” occurs at scales longer than 30 years.

Interestingly, yet another official timescale for defining “anomalies” has been developed. Again, for reasons of convenience (and partly – for temperatures – due to the difficulty in making absolute measurements), anomalies are defined with respect to monthly averages. Ironically, a month wavers between 28 and 31 d: it is not even a well-defined unit of time.

The overall consequence of adopting, by convenience, monthly and 30-year timescales is a poorly theorized, inadequately justified division of atmospheric processes into three regimes: scales less than a month, a month up to 30 years, and a lumping together of all slower processes with timescales longer than 30 years. While the high-frequency regime is clearly “weather” and the slow processes – at least up to ice age scales – are “climate”, until Lovejoy (2013) the intermediate regime lacked even a name. Using scaling – and with somewhat different transition scales – the three regimes were finally put on an objective quantitative basis, with the middle regime baptized “macroweather”. By using scaling to quantitatively define weather, macroweather, and climate, we can finally objectively answer the question: how long does weather last? A bonus, detailed in Sect. 2, is that scaling analyses showed that what had hitherto been considered simply climate is itself composed of three distinct dynamical regimes. Rather than lumping all low frequencies together, we must also distinguish between macroclimate and megaclimate.

To review how scaling defines dynamical regimes, let us define scaling using fluctuations – for example of the temperature or of a component of the wind.
For the moment, consider only one dimension, i.e. time series or spatial
transects. Temporal scaling means that the amplitudes of fluctuations are
proportional to their timescale raised to a scale-invariant exponent. For appropriately nondimensionalized quantities,

Δf(Δt) ∝ Δt^H,    (1)

where Δf is the fluctuation, Δt its timescale, and H the scale-invariant fluctuation exponent.

Later, in Sect. 2.5, fluctuations as differences (sometimes called “poor man's wavelets”) are replaced by (nearly as simple) Haar fluctuations based on Haar wavelets (see also Appendix B) and in Sect. 3, and Eq. (1) is interpreted stochastically. Finally, in Sect. 4, the notion of scale itself is generalized by introducing a scale function that replaces the usual (Euclidean) distance function (metric). These anisotropic-scale functions are needed to handle scale in 2D or higher spaces, especially with regard to stratification.
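As a concrete illustration of fluctuation analysis, the exponent H of Eq. (1) can be estimated from Haar fluctuations. The following is a minimal sketch (not the author's code): the synthetic Brownian-walk input, function names, and parameter choices are all assumptions made for the example.

```python
import numpy as np

def haar_fluctuation(series, dt):
    """Mean absolute Haar fluctuation at scale dt: the difference between
    the mean of the second half and the mean of the first half of each
    interval of length dt."""
    n = (len(series) // dt) * dt
    blocks = series[:n].reshape(-1, dt)
    half = dt // 2
    fluct = blocks[:, half:].mean(axis=1) - blocks[:, :half].mean(axis=1)
    return np.abs(fluct).mean()

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(2**14))   # Brownian walk: H = 1/2
scales = [2**k for k in range(2, 10)]
flucts = [haar_fluctuation(walk, s) for s in scales]
H = np.polyfit(np.log(scales), np.log(flucts), 1)[0]  # log-log slope
print(f"estimated H = {H:.2f}")
```

For a Brownian walk the estimate should come out close to the theoretical H = 0.5; real atmospheric series yield regime-dependent exponents.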

In atmospheric regimes where Eq. (1) holds, average fluctuations over
durations

Over the Phanerozoic eon (the last 540 Myr), the five scaling regimes are weather, macroweather, climate, macroclimate, and megaclimate
(Lovejoy, 2015). Starting at around a millisecond (the dissipation time), this
covers a total range of

If the key statistical characteristics of the atmosphere at any given scale are determined by processes acting over wide ranges of scales – and not by a plethora of narrow-range ones – then we must conclude that the fundamental dynamical processes are in fact dynamical “regimes” – not uninteresting “backgrounds”. While there may also be narrow-range processes, they can only be properly understood in the context of the dynamical regime in which they operate, and in any event, spectral or other analysis shows that they generally contribute only marginally to the overall variability. The first task is therefore to define and understand the dynamical regimes and then – when necessary – the narrow-range processes occurring within them.

Cigarette smoke (left) showing wisps and filaments smaller than 1 mm up to about 1 m in overall size. The upper right shows two clouds, each several kilometres across with resolutions of 1 m or so. The lower right shows the global-scale arrangement of clouds taken from an infrared satellite image of the Earth with a resolution of several kilometres. Taken together, the three images span a range of several billion in spatial scale. Reproduced from Lovejoy (2019).

Left: 1000 points of various time series collectively spanning the
range of scales of 470 Myr to 0.067 s

The first 8196 points of the temperature series measured by a Gulfstream 4 flight over the Pacific Ocean at 196 mb and 1 s resolution (corresponding to 280 m). Because the aircraft speed is much greater than the wind speed, this can be considered a spatial transect. The bottom shows the absolute change in temperature from one measurement to the next normalized by dividing by the typical change (the standard deviation). This differs from the spike plot on the right-hand side of Fig. 2 only in the normalization: here by the standard deviation, not by the mean absolute difference. Reproduced from Lovejoy and Schertzer (2013).

Before answering the quite different scaling question “How big is a
cloud?”, it is first necessary to discuss a complication: that the scaling
is different for every level of activity. It turns out that the wide range
over which the variability occurs is only one of its aspects: even at fixed
scales, the variability is much more extreme than is commonly believed.
Interestingly, the extremeness of the variability at a fixed scale is a

To see this graphically, it is sufficient to produce a “spike plot” (the right-hand-side column of Fig. 2 for time and the corresponding spatial plot in Fig. 3). These spike plots are simply the absolute first differences in the values normalized by their overall means (in Fig. 3, the normalization is slightly different: by the standard deviation). In the right-hand-side column of Fig. 2 and the bottom of Fig. 3, we see – with a single but significant exception, macroweather in time (Fig. 2) – that they all have strong spikes signalling sharp transitions. In turbulence jargon, the series are highly “intermittent”.
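A spike plot is easy to compute. The sketch below (all names and the synthetic Gaussian input are illustrative assumptions) applies the mean-normalized absolute first differences just described to a non-intermittent random walk, for which no strong spikes appear:

```python
import numpy as np

def spike_plot(series):
    """Absolute first differences normalized by their overall mean,
    as in the spike plots of Figs. 2 and 3 (Fig. 3 instead normalizes
    by the standard deviation)."""
    spikes = np.abs(np.diff(series))
    return spikes / spikes.mean()

rng = np.random.default_rng(1)
quiet = np.cumsum(rng.standard_normal(1000))   # non-intermittent Gaussian walk
spikes = spike_plot(quiet)
print(spikes.max())   # for Gaussian increments, rarely much above ~4
```

Applying the same function to turbulent data produces spikes far above the Gaussian range, which is the visual signature of intermittency.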

How strong are the spikes? Using classical (Gaussian) statistics, we may use probability levels to quantify them. For example, Fig. 2 (right) shows solid horizontal lines that indicate the maximum spike that would be
expected from a Gaussian process with the given number of spikes. For the
1000 points in each series in Fig. 2, this line thus corresponds to a
Gaussian probability
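Whatever the exact probability level used in the figure, the height of such a Gaussian reference line can be sketched with the standard extreme-value heuristic that the largest of N independent samples sits roughly where N times the exceedance probability equals 1 (an illustration of the reasoning, not necessarily the figure's exact construction):

```python
import math

def gaussian_max_level(n):
    """Approximate largest value expected among n independent standard
    Gaussian samples, from n * P(X > s) ~ 1, i.e. s ~ sqrt(2 ln n)."""
    return math.sqrt(2.0 * math.log(n))

print(round(gaussian_max_level(1000), 2))  # ~3.7 standard deviations for 1000 points
```

Spikes far above this level therefore signal strongly non-Gaussian, fat-tailed statistics.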

The spikes visually underline the fact that variability is not simply a question of the range of scales that are involved: at any given scale, variability can be strong or weak. In addition, events can be highly clustered, with strong ones embedded inside weak ones and even stronger ones inside strong ones in a fractal pattern repeating to smaller and smaller scales. This fractal sparseness can itself become more and more accentuated for more and more extreme events/regions: the series will generally be multifractal.

Scaling is also needed to answer the question “how big is a cloud?” (here
“cloud” is taken as a catch-all term meaning an atmospheric structure or eddy). Now the problem is what we mean by “scale”. The series and transects in
Figs. 2 and 3 are 1D, so that it is sufficient to define the scale of a fluctuation by the duration (time) or length (space) over which it
occurs (actually, time involves causality, so that the sign of

Consider Fig. 4, which displays a cloud vertical cross section from the CloudSat radar. In the figure, the gravitationally induced stratification is striking, and since each pixel in the figure has a horizontal resolution of
1 km but a vertical resolution of 250 m, the actual stratification is 4 times stronger than it appears. What is this cloud's scale? If we use the usual Euclidean distance to determine the scale, should we measure it in
the horizontal or vertical direction? In this case, is the cloud scale its width (200 km) or its height (only

If the horizontal–vertical aspect ratio were the same for all clouds, the two choices would be identical to within a constant factor, and the anisotropy would be “trivial”. The trouble is that the aspect ratio itself turns out
to be a strong power-law function of (either) horizontal or vertical scale, so that, for any cloud,

To further appreciate the issue, consider the simulation in Fig. 5 that shows a vertical cross section of a multifractal cloud liquid water density field. The left-hand-side column (top to bottom) shows a series of blow-ups in an isotropic (“self-similar”) cloud. Moving from top to bottom, blow-ups of the central regions by successive factors of 2.9 are displayed. In order for the cross sections to maintain a constant 50 % “cloud cover”, the density threshold distinguishing the cloud (white or grey) from the non-cloud (black) must be systematically adjusted to account for this change in resolution. This systematic readjustment of the threshold is required due to the multifractality, and with this adjustment, we see that the cross sections are self-similar; i.e. they look the same at all scales.
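The resolution-dependent threshold can be illustrated with a toy one-dimensional multiplicative cascade (a sketch only: the lognormal weights and parameter values are illustrative assumptions, not those used for the simulations in Fig. 5):

```python
import numpy as np

def cascade_1d(levels, rng):
    """Toy multiplicative cascade: at each level every cell splits in two,
    and each half is multiplied by an independent lognormal weight."""
    field = np.ones(1)
    for _ in range(levels):
        field = np.repeat(field, 2) * rng.lognormal(-0.05, 0.3, size=2 * field.size)
    return field

rng = np.random.default_rng(2)
field = cascade_1d(12, rng)   # 4096-point multifractal-like density

# the threshold giving a constant 50% "cloud cover" is the median,
# and for a multifractal it depends on the resolution of observation
for factor in (1, 4, 16, 64):
    coarse = field.reshape(-1, factor).mean(axis=1)
    print(factor, round(float(np.median(coarse)), 3))
```

Because averaging a highly skewed multifractal field shifts its median, the 50 %-cover threshold must be readjusted at every zoom level, exactly as described for the figure.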

The effect of differential (scale-dependent) stratification is revealed in the right-hand-side column that shows the analogous zoom through an anisotropic multifractal simulation with a stratification exponent

In the isotropic simulations (left-hand side), the only difficulty in defining the size of the cloud is the multifractal problem of deciding, for each resolution, which threshold should be used to distinguish cloud from no cloud. However, in the more realistic anisotropic simulation on the right, there is an additional difficulty in answering the question of “how big is a cloud?” Should we use the horizontal or vertical cloud extent? It turns out (in Sect. 4) that, to ensure that the answer is well defined, we need a new notion of scale itself: generalized scale invariance (GSI).
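To preview Sect. 4, one simple scale function with the required anisotropic-scaling property is sketched below; the unit sphero-scale and the stratification exponent H_z = 5/9 (the 23/9D model value) are assumptions for the illustration. “Squashing” a displacement by (1/λ, 1/λ^H_z) divides its scale by exactly λ, whatever the direction:

```python
import math

H_z = 5.0 / 9.0   # stratification exponent (the 23/9D model value; Sect. 4)
l_s = 1.0         # "sphero-scale" where structures are roughly round (assumed)

def scale(dx, dz):
    """A simple anisotropic scale function ||(dx, dz)||; it reduces to the
    ordinary Euclidean norm in the isotropic case H_z = 1."""
    return l_s * math.sqrt((abs(dx) / l_s) ** 2 + (abs(dz) / l_s) ** (2.0 / H_z))

# defining property: squashing by (1/lam, 1/lam**H_z) divides the scale by lam
lam = 10.0
ratio = scale(3.0, 0.2) / scale(3.0 / lam, 0.2 / lam ** H_z)
print(ratio)  # lam = 10, up to rounding
```

With such a scale function, horizontally and vertically extended structures can be assigned a single well-defined size even though their aspect ratios change with scale.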

A vertical cloud cross section of radar backscatter taken by the radar on the CloudSat satellite with resolutions of 250 m in the vertical and 1 km in the horizontal. The black areas are those whose radar reflectivities are below the radar's minimum detectable signal. The arrows show rough estimates of the horizontal and vertical extents of the cloud. The two differ by a factor of more than 10. Which of them characterizes the size of this cloud? Adapted from Lovejoy et al. (2009b).

Left column: a sequence “zooming” into a vertical cross section of an isotropic multifractal cloud (the density of liquid water was simulated and then displayed using false colours with a grey sky below a low threshold). From top to bottom, we progressively zoom in by a factor of 2.9 (total factor

The presentation and emphasis of this review reflect experience over recent years showing how difficult it is to shake traditional ways of thinking. In particular, traditional mechanistic meteorological approaches are based on a widely internalized but largely unexamined “scalebound” view that prevents scaling from being taken as seriously as it must be. As we will see (Sect. 2), the scalebound view persists in spite of its increasing divorce from the real world. Such a persistent divorce is only possible because practising atmospheric scientists rely almost exclusively on numerical weather prediction (NWP) or general circulation models (GCMs), and these inherit the scaling symmetry from the atmosphere's primitive equations upon which they are built.

The problem with scaleboundedness is not so much that it does not fit the facts, but rather that it blinds us to promising alternative scaling
approaches. New approaches are urgently needed. As argued in Lovejoy (2022a), climate projections based on GCMs are reaching diminishing
returns, with the latest IPCC AR6 (Arias et al., 2021) uncertainty ranges larger than ever before: cf. the latest climate sensitivity range of a 2–5.5 K rise in global temperature following a CO2 doubling.

There are also sticking points whose origin is in the other, statistical,
turbulence strand of atmospheric science. Historically, turbulence theories
have been built around two statistical symmetries: a scale symmetry
(scaling) and a direction symmetry (isotropy). While these two are
conceptually quite distinct, even today, they are almost invariably
considered together in the special case called “self-similarity”, which is a basic assumption of theories and models of isotropic 2D and isotropic 3D turbulence. Formalizing scaling as a
(nonclassical) symmetry principle clarifies the distinct nature of scale and
direction symmetries. In the atmosphere, due to gravity (not to mention
sources of differential rotation), there is no reason to assume that the scale symmetry is an isotropic one: indeed, atmospheric scaling is
fundamentally anisotropic. The main unfortunate consequence of assuming
isotropy is that it implies an otherwise unmotivated (and unobserved) scale
break somewhere near the scale height (about 10 km).

As we show (Sect. 4), scaling accounts for both the stratification that systematically increases with scale and its intermittency. Taking gravity into account in the governing equations provides an anisotropic scaling alternative to quasi-geostrophic turbulence (“fractional vorticity equations”; see Schertzer et al., 2012). The argument in this review is thus that scaling is the primary symmetry: it takes precedence over other symmetries such as isotropy. Indeed, it seems that isotropic turbulence is simply not relevant in the atmosphere (Lovejoy et al., 2007).

This review primarily covers scaling research over the last 4 decades, especially multifractals, generalized scale invariance, and their now extensive empirical validations. This work involved theoretical and
technical advances, revolutions in computing power, the development of new
data analysis techniques, and the systematic exploitation of mushrooming quantities of geodata. The basic work has already been the subject of
several reviews (Lovejoy and Schertzer, 2010c, 2012b), but
especially a monograph (Lovejoy and Schertzer, 2013). Although a book covering some of the subsequent developments was published more recently (Lovejoy, 2019), it was
nontechnical, so that this new review brings its first four chapters up to
date and includes some of the theory and mathematics that were deliberately omitted so as to render the material more accessible. The last three
chapters of Lovejoy (2019) focused on developments in the climate-scale (and lower-frequency) regimes that will be reviewed elsewhere. The present review is thus limited to the (turbulent) weather regime and its transition to
macroweather at scales of

In order to maintain focus on the fundamental physical scaling issues and implications, the mathematical formalism is introduced progressively – as needed – so that it will not be an obstacle to accessing the core scientific ideas.

This review also brings to the fore several advances that have occurred in the last 10 years, especially Haar fluctuation analysis (developed in detail in Appendix B), and a more comprehensive criticism of scalebound approaches made possible by combining Haar analysis with new high-resolution instrumental and paleodata sources (Lovejoy, 2015). On the other hand, it leaves out an emerging body of work on macroweather modelling based on the fractional energy balance equation for both prediction and climate projections (Del Rio Amador and Lovejoy, 2019, 2021a, b; Procyk et al., 2022) as well as their implications for the future of climate modelling (Lovejoy, 2022a).

The presentation is divided into three main sections. Keeping the technical and mathematical aspects to a minimum, Sect. 2 focuses on a foundational atmospheric science issue: what is the appropriate conceptual and theoretical framework for handling the atmosphere's variability over huge ranges of scales? It discusses how the classical scalebound approach is increasingly divorced from real-world data and numerical models. Scaling is discussed but with an emphasis on its role as a symmetry principle. It introduces fluctuation analysis based on Haar fluctuations that allow for a clear quantitative empirical overview of the variability over 17 orders of magnitude in time. Scaling is essential for defining the basic dynamical regimes, underlining the fact that between the weather and the climate sits a new macroweather regime.

Section 3 discusses the general scaling process: multifractals. Multifractals naturally explain and quantify the ubiquitous intermittency of atmospheric processes. The section also discusses an underappreciated consequence, the divergence of high-order statistical moments – equivalently, power-law probability tails – and relates this to “tipping points” and “black swans”. The now large body of evidence for the divergence of moments is discussed, with special attention paid to the velocity field, where the divergence of moments was first empirically shown 40 years ago in the atmosphere, then in wind tunnels, and most recently in large direct numerical simulations of hydrodynamic turbulence.
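The divergence of high-order moments is easy to demonstrate numerically. In the sketch below (the tail exponent α = 2.5 is an arbitrary illustrative choice), variates with power-law tail P(X > x) = x^(−α) have a convergent mean, but their third moment (order 3 > α) never settles down as the sample grows – it is dominated by the single largest value:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha = 2.5   # tail exponent: statistical moments of order >= alpha diverge

def pareto(n):
    """Variates with power-law tail P(X > x) = x**(-alpha), x >= 1,
    via inversion of the uniform distribution."""
    return rng.uniform(size=n) ** (-1.0 / alpha)

for n in (10**3, 10**5, 10**7):
    x = pareto(n)
    # mean (order 1 < alpha) converges; 3rd moment (order 3 > alpha) keeps growing
    print(n, round(float(x.mean()), 3), round(float((x**3).mean()), 1))
```

This is the numerical signature of divergent moments: no matter how large the sample, the empirical high-order moments are dominated by the extremes and fail to converge.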

In Sect. 4 a totally different aspect of scaling is covered: anisotropic scaling, notably scaling stratification. The section outlines the formalism of GSI needed to define the notion of scale in anisotropic scaling systems. By considering buoyancy-driven turbulence, the 23/9D model is derived: it is a consequence of Kolmogorov scaling in the horizontal and Bolgiano–Obukhov scaling in the vertical. This model is “in between” flat 2D isotropic turbulence and “voluminous” isotropic 3D turbulence, and it is strongly supported by now burgeoning quantities of atmospheric data. It allows us not only to answer the question “how big is a cloud?”, but also to understand and model the differentially rotating structures needed to quantify cloud morphologies.
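The arithmetic behind the 23/9D model is compact enough to state exactly: with Kolmogorov scaling (exponent 1/3) in the horizontal and Bolgiano–Obukhov scaling (exponent 3/5) in the vertical, the stratification exponent is H_z = (1/3)/(3/5) = 5/9, and the “elliptical” dimension is 2 horizontal dimensions plus H_z for the vertical:

```python
from fractions import Fraction

H_hor = Fraction(1, 3)   # Kolmogorov: horizontal fluctuations ~ dx**(1/3)
H_ver = Fraction(3, 5)   # Bolgiano-Obukhov: vertical fluctuations ~ dz**(3/5)

H_z = H_hor / H_ver      # stratification exponent relating dz ~ dx**H_z
D_el = 2 + H_z           # "elliptical" dimension: 2 horizontal dims + H_z

print(H_z, D_el)  # 5/9 and 23/9
```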

In the introduction, the conventional paradigm based on (typically deterministic) narrow-range explanations and mechanisms was contrasted with the alternative scaling paradigm that builds statistical models expressing the collective behaviour of high numbers of degrees of freedom and that provides explanations over huge ranges of scales.

Let us consider the narrow-range paradigm in more detail. It follows in the footsteps of van Leeuwenhoek, who – peering through an early microscope – was famously said to have discovered a “new world in a drop of water”: microorganisms (circa 1675). Over time, this evolved into a “powers of ten” view (Boeke, 1957) in which every factor of 10 or so of zooming revealed qualitatively different processes and morphologies. Mandelbrot (1981) termed this view “scalebound” (written as one word), a useful shorthand for the idea that every factor of 10 or so involves something qualitatively new: a new world, new mechanisms, new morphologies, etc.

The first weather maps were at extremely low spatial resolution, so that only a rather narrow range of phenomena could be discerned. Unsurprisingly, the corresponding atmospheric explanations and theories were scalebound. Later, in the 1960s and 1970s under the impact of new data, especially in the mesoscale, the ambient scalebound paradigm was quantitatively made explicit in space–time Stommel diagrams (discussed at length in Sect. 2.6) in which various conventional mechanisms, morphologies, and phenomena were represented by the space scales and timescales over which they operate. For a recent inventory of scalebound mechanisms from seconds to decades, see Williams et al. (2017).

While Stommel diagrams reflected scalebound thinking, the goal was the modest one of organizing and classifying existing empirical phenomenology, and it did this in the light of the prevailing mechanistic analytic dynamical meteorology. It was Mitchell (1976), writing at the dawn of the paleoclimate revolution, who, more than anyone, ambitiously elevated the scalebound paradigm into a general framework spanning a range of scales from (at least) an hour to the age of the planet (a factor of tens of billions; upper left, Fig. 6). Mitchell's data were limited, and he admitted that his spectrum was only an “educated guess”. He imagined that, when the data became available, their spectra would consist of an essentially uninteresting white-noise “background” interspersed with interesting quasi-periodic signals representing the important physical processes. Ironically, Mitchell's scalebound paradigm was proposed at the same time as the first GCMs (Manabe and Wetherald, 1975). Fortunately, the GCMs are scaling, inheriting the symmetry from the governing equations (Schertzer et al., 2012); see Chap. 2 of Lovejoy and Schertzer (2013).

Mitchell's schematic (upper-left panel) was so successful that, more than 4 decades later, his original figure is still faithfully reproduced (e.g. Dijkstra, 2013) or recycled in very similar scalebound schematics with only minor updates. Even though the relevant geodata have since mushroomed, the updates notably have less quantification and weaker empirical support than the original. The 45-year evolution of the scalebound paradigm is shown in the other panels of Fig. 6. Moving to the right of the figure, there is a 25-year update, modestly termed an “artist's rendering” (Ghil, 2002). This figure differs from the original by the excision of the lowest frequencies and the inclusion of several new multimillennial-scale “bumps”. In addition, whereas Mitchell's spectrum was quantitative, the artist's rendering retreated to using “arbitrary units”, making it more difficult to verify empirically. Nearly 20 years later, the same author approvingly reprinted it in a review (Ghil and Lucarini, 2020).

As time passed, the retreat from quantitative empirical assessments continued, so that the scalebound paradigm has become more and more abstract. The bottom left of Fig. 6 shows an update downloaded from the NOAA paleoclimate data site in 2015 claiming to be a “mental model”. Harkening back to Boeke (1957), the site went on to state that the figure is “intended … to provide a general 'powers of ten' overview of climate variability”. Here, the vertical axis is simply “variability”, and the uninteresting background – presumably a white noise – is shown as a perfectly flat line.

At about the same time, Lovejoy (2015) pointed out that Mitchell's original figure was in error by an astronomical factor (Sect. 2.4), so that – in an
effort to partially address the criticism – an update in the form of a
“conceptual landscape” was proposed (Fig. 6, bottom right, von der Heydt et al., 2021). Rather than plotting the log of the spectrum

The evolution of the scalebound paradigms of atmospheric dynamics (1976–2021). The upper-left “educated guess” is from Mitchell (1976), and the upper-right “artist's rendering” is from Ghil (2002) and Ghil and Lucarini (2020). The lower left shows the NOAA's “mental model” (downloaded from the site in 2015), and the lower right shows the “conceptual model” from von der Heydt et al. (2021).

Although scaling in atmospheric science goes back to Richardson in the
1920s, it was the

Although Mandelbrot emphasized fractal geometry, i.e. the scaling of geometrical sets of points, it soon became clear (Schertzer and Lovejoy, 1985c) that the physical basis of scaling (more generally scaling fields and scaling processes) is in fact a scale-symmetry principle – effectively a scale-conservation law that is respected by many nonlinear dynamical systems, including those governing fluids (Schertzer and Lovejoy, 1985a, 1987; Schertzer et al., 2012).

Scaling is seductive because it is a symmetry. Ever since Noether published
her eponymous theorem (Noether, 1918) demonstrating the equivalence between
symmetries and conservation laws, physics has been based on symmetry
principles. Thanks to Noether's theorem, by formulating scaling as a general
symmetry principle, the scaling

In the case of fluids, we can verify this symmetry on the equations as implemented, for example, in GCMs (e.g. Stolle et al., 2009, 2012, and the discussion in Sect. 2.2.3) – but only for scales larger than the (millimetric) dissipation scales, where the symmetry is broken and mechanical energy is converted into heat. This is true for Navier–Stokes turbulence, although the atomic-scale details are not fully clear: Kadau et al. (2010) and Tuck (2008, 2022) argue that scaling can continue to much smaller scales. The scaling is also broken at large scales by the finite size of the planet. In between, boundary conditions such as the ocean surface or topography might have broken the scaling, but in fact they turn out to be scaling themselves and so do not introduce a characteristic scale (e.g. Gagnon et al., 2006).

In the atmosphere one therefore expects scaling. It is expected to hold unless processes can be identified that act preferentially and strongly enough at specific scales to break it. This turns the tables on scalebound thinking: if we can explain the atmosphere's structure in a scaling manner, then this is the simplest explanation and should a priori be adopted. The onus must be on the scalebound approach to demonstrate the inadequacy of scaling and the need to replace the hypothesis of a unique wide scaling-range regime with (potentially numerous) distinct scalebound mechanisms.

Once a scaling regime is identified – either theoretically or empirically (preferably by a combination of both) – it is associated with a single basic dynamical mechanism that repeats scale after scale over a wide range, and hence it provides an objective classification principle.

The atmospheric scaling paradigm is almost as old as numerical weather
prediction, both being proposed by Richardson in the 1920s. Indeed, ever since Richardson's scaling

From the beginning, Richardson argued for a wide-range scaling holding from millimetres to thousands of kilometres (Fig. 7). Richardson himself attempted an empirical verification, notably using data from pilot balloons and volcanic ash (and
later – in the turbulent ocean – with bags of parsnips that he watched
diffusing from a pier on Loch Long; Richardson and Stommel, 1948). However, there remained a dearth of data spanning the key “mesoscale” range

In the 1970s, motivated by Charney's isotropic 2D geostrophic turbulence (Charney, 1971), the ambitious “EOLE” experiment (named for the Greek god of the winds) was undertaken specifically to study large-scale atmospheric turbulence, using a satellite to track the diffusion of hundreds of constant-density balloons (Morel and Larchevêque, 1974). The results turned out to be difficult to interpret. Worse, the initial conclusions – that the mesoscale wind did not follow the Kolmogorov law – turned out to be wrong; they were later re-interpreted (Lacorata et al., 2004) and then further re-re-interpreted (Lovejoy and Schertzer, 2013), finally vindicating Richardson nearly 90 years later.

Therefore, when Lovejoy (1982), benefitting from modern radar and satellite data, discovered scaling right through the mesoscale (Fig. 7, right), it was the most convincing support to date for Richardson's daring 1926 wide-range scaling hypothesis. Although at first it was mostly cited for its empirical verification that clouds were indeed fractals, today, 40 years later, we increasingly appreciate its vindication of Richardson's scaling from 1 to 1000 km, right through the mesoscale. It marks the beginning of modern scaling theories of the atmosphere. This has since been confirmed by massive quantities of remotely sensed and in situ data, both on Earth (Fig. 8) and more recently on Mars (Fig. 9, discussed in detail in Sect. 3.4).

Richardson's pioneering scaling model (Richardson, 1926) of turbulent diffusion (left) with an early update (Lovejoy, 1982) (right) using radar rain data (black) and satellite cloud data (open circles).

Planetary-scale power-law spectra (

Earth (left) and Mars (right). The zonal spectra (top right) of Mars as functions of the nondimensional wavenumbers for pressure (

In Sect. 2.1, we discussed the debate between scaling and mechanistic, generally deterministic, scalebound approaches. However, even in the statistical (turbulence) strand of atmospheric science, there evolved an alternative to Richardson's wide-range scaling: the paradigm of isotropic turbulence.

In the absence of gravity (or another strong source of anisotropy), the basic isotropic scaling property of the fluid equations has been known for a long time (Taylor, 1935; Karman and Howarth, 1938). The scaling symmetry justifies the numerous classical fluid dynamics similarity laws (e.g. Sedov, 1959), and it underpins models of statistically isotropic turbulence, notably the classical turbulence laws of Kolmogorov (Kolmogorov, 1941), Bolgiano and Obukhov (buoyancy-driven, Sect. 4.1) (Bolgiano, 1959; Obukhov, 1959), and Corrsin and Obukhov (passive scalar) (Corrsin, 1951; Obukhov, 1949).

These classical turbulence laws can be expressed in the form

Theories and models of isotropic turbulence were developed to understand the fundamental properties of high Reynolds number turbulence, independently of whether or not they could be applied to the atmosphere. Since
the atmosphere is a convenient very high Reynolds number laboratory (

Figure 10 graphically shows the problem: although the laws of isotropic turbulence are themselves scaling, they imply a break in the middle of the “mesoscale” at around 10 km. To model the larger scales, Fjortoft (1953) and Kraichnan (1967) soon found another isotropic scaling paradigm: 2D isotropic turbulence. Charney in particular adapted Kraichnan's 2D isotropic turbulence to geostrophic turbulence (Charney, 1971), and the result is sometimes called “layerwise” 2D isotropic turbulence. While Kraichnan's 2D model was rigidly flat with strictly no vortex stretching, Charney's extension allowed for some limited vortex stretching. Figure 10 shows the implied difference between the 2D isotropic and 3D isotropic regimes.

Even though isotropy had originally been proposed purely for theoretical convenience, armed with two different isotropic scaling laws, it was now being proposed as the fundamental atmospheric paradigm. If scaling in atmospheric turbulence is always isotropic, then we are forced to accept a scale break. The assumption that isotropy is the primary symmetry implies (at least) two scaling regimes with a break (presumably) near the 10 km scale height, i.e. in the mesoscale. The 2D–3D model with its implied “dimensional transition” (Schertzer and Lovejoy, 1985c) already contradicted the wide-range scaling proposed by Richardson.

An important point is that the implied scale break is neither physically nor empirically motivated: it is purely a theoretical consequence of assuming the predominance of isotropy over scaling. One is forced to choose: which of the fundamental symmetries is primary, isotropy or scaling?

By the time a decade later that the alternative (wide-range) anisotropic scaling paradigm (see Fig. 11 for a schematic) was proposed (Schertzer and Lovejoy, 1985c, a), Charney's beautiful theory along with its 2D–3D scale break had already been widely accepted, and even today it is still taught. More recently (Schertzer et al., 2012), generalized scale invariance was linked directly to the governing equations, so that a clear anisotropic theoretical alternative to Charney's isotropic theory is available.

A schematic showing the geometry of isotropic 2D models (top, for the large scales); the volumes of average structures (disks) increase as the square of the disk diameter. The isotropic 3D model is a schematic of 3D turbulence models for the small scales, with the volumes of the spheres increasing as the cube of the diameter. These geometries are superposed on the Earth's curved surface (the blue spherical segments on the right). We see (bottom right, Earth's surface) that – unless they are strongly restricted in range – the 3D isotropic models quickly imply structures that extend into outer space.

A schematic diagram showing the change in shape of average
structures which are isotropic in the horizontal (slightly curved to
indicate the Earth's surface) but with scaling stratification in the vertical.

The basic signature of scaling is a power-law relation of a statistical characteristic of a system as a function of space scale and/or timescale. In the empirical test of the Richardson

Following Mitchell, we may consider variability in the spectral domain: for example, the power spectrum of the temperature

Alternatively, we can consider scaling in real space. Due to “Tauberian
theorems” (e.g. Feller, 1971), power laws in real space are transformed into power laws in Fourier space (and vice versa). This result holds whenever the scaling range is wide enough – i.e. even if there are high- and/or low-frequency cutoffs (needed if only for the convergence of the transforms). If we consider fluctuations
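
As an illustration of this spectral signature (a synthetic sketch of my own; the series length, the exponent β = 1.8, and the fitting range are assumptions, not values from the text), a scaling series can be generated by power-law filtering of Gaussian white noise, and the spectral exponent then recovered by log–log regression:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 2**16
beta = 1.8  # assumed spectral exponent, E(k) ~ k**(-beta)

# Power-law filter white noise in Fourier space: |F(k)| ~ k**(-beta/2)
k = np.fft.rfftfreq(N)
k[0] = np.inf  # suppress the zero-frequency (mean) mode
F = np.fft.rfft(rng.standard_normal(N)) * k**(-beta / 2)
series = np.fft.irfft(F, n=N)

# Estimate the spectrum of the series and fit the log-log slope
E = np.abs(np.fft.rfft(series))**2
kk = np.fft.rfftfreq(N)[1:]
slope = np.polyfit(np.log(kk), np.log(E[1:]), 1)[0]
beta_est = -slope
print(f"input beta = {beta}, estimated beta = {beta_est:.2f}")
```

The same regression applied to an empirical series only makes sense over its scaling range, i.e. between any high- and low-frequency cutoffs.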

The evolution of the scaling picture for 1986–1999. Top: the rms difference structure functions estimated from local (central
England) temperatures since 1659 (open circles, upper left), Northern Hemisphere temperature (black circles), and paleotemperatures from Vostok (Antarctic, solid triangles), Camp Century (Greenland, open
triangles), and an ocean core (asterisks). For the Northern Hemisphere temperatures, the (power-law, linear in this plot) climate regime starts at about 10 years. The reference line has a slope

In spite of its growing disconnect with modern data, Mitchell's figure and
its scalebound updates continue to be influential. However, within 15 years of Mitchell's famous paper, two scaling composites, over the ranges 1 h to
10

A comparison of Mitchell's educated guess of a log–log spectral plot (grey, bottom, Mitchell, 1976) superposed with modern evidence from spectra of a selection of the series described in Table 1 and
Lovejoy (2015) from which this figure is reproduced. On the far right,
the spectra from the 1871–2008 20CR (at daily resolution) quantify the difference between the globally averaged temperature (bottom right, red line)
and local averages (2

Artist's rendering with data superposed. Adapted from Ghil (2002) and reprinted in Ghil and Lucarini (2020).

Returning to the artist's rendering, Fig. 14 shows that, when compared to the data, it fares no better than Mitchell's educated guess. The next update – NOAA's mental model – only specified that its vertical axis be proportional to “variability”. If we interpret variability as the root-mean-square (rms) fluctuation at a given scale and the flat “background” between the bumps as white noise, then we obtain the comparison in Fig. 15. Although the exact definition of these fluctuations is discussed in Sect. 2.5, they give a directly physically meaningful quantification of the variability at a given timescale. In Fig. 15, we see that the mental model predicts that successive 1-million-year average Earth temperatures would differ by only tens of microkelvin. A closely similar conclusion would hold if we converted Mitchell's spectrum into rms real-space fluctuations.

The most recent scalebound update – the “conceptual landscape” – is
compared with modern data in Fig. 16. Although the various scaling regimes
proposed in Lovejoy (2013) (updated in Fig. 18 and discussed below) are
discreetly indicated in the background, in many instances, there is no obvious relation between the regimes and the landscape. In particular, the
word “macroweather” appears without any obvious connection to the figure,
but even the landscape's highlighted scalebound features are not very close
to the empirical curve (red). Although the vertical axis is only
“relative”, this quantitative empirical comparison was made by exploiting
the equal-area property mentioned above. The overlaid solid red curve was estimated by converting the disjoint spectral power laws shown in the
updated Mitchell graph (Fig. 8). In addition, there is also an attempt to
indicate the amplitudes of the narrow spectral spikes (the green spikes in Fig. 13) at diurnal, annual, and – for the epoch 2.5–0.8 Myr – obliquity spectral peaks at (41 kyr)

Mental model with data. The data spectrum in Fig. 13 is replotted
in terms of fluctuations (grey, top; see Fig. 17). The diagonal axis corresponds to the flat baseline of Fig. 6 (lower left) that now has a
slope of

Conceptual landscape with data. The superposed red curves use the empirical spectra in Fig. 13 and adjust the (linear) vertical scale for a rough match with the landscape. The vertical lines indicate huge periodic signals (the diurnal and annual cycles on the right and on the left, the obliquity signal seen in spectra between 0.8 and 2.5 Myr ago). Adapted from von der Heydt et al. (2021).

The scalebound framework for atmospheric dynamics emphasized the importance of numerous processes occurring at well-defined timescales, the quasi-periodic “foreground” processes illustrated as bumps – the signals – on Mitchell's nearly flat background. The point here is not that these processes and mechanisms are wrong or non-existent: it is rather that they only explain a small fraction of the overall variability, and this implies that they cannot be understood without putting them in the context of their dynamical (scaling) regime. This was also demonstrated quantitatively and explicitly over at least a significant part of the climate range by Wunsch (2003).

One of the lessons to be drawn from the educated guesses, artists'
renderings, and conceptual landscapes is that, although spectra can be calculated for any signal, the interpretations are often not obvious. The
problem is that we have no intuition about the physical meaning of the units
– K

The advantage of fluctuations such as in Fig. 12 (top) is that the numbers – e.g. the rms temperature fluctuations at some scale – have a straightforward physical interpretation. However, the differences used to define fluctuations (see Fig. 17, top) have a non-obvious problem: on
average, differences cannot decrease with increasing time intervals (in
Appendix B, this problem is discussed more precisely in the Fourier domain).
This is true for any series that has correlations that decrease with

However, do regions of negative

It took a surprisingly long time to clarify this issue. To start with, in
classical turbulence,

New clarity was achieved with the help of the (first) Haar wavelet (Haar, 1910). There were two reasons for this: the simplicity of its definition and calculation, and the ease of its interpretation (Lovejoy and Schertzer, 2012a). To determine the Haar fluctuation over a time interval
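
A minimal numerical sketch (my own; the half-overlapping interval sampling and the omission of the conventional calibration factor are implementation choices): the Haar fluctuation over an interval is the difference between the averages of its second and first halves, and the log–log slope of the rms fluctuation versus lag gives the fluctuation exponent H.

```python
import numpy as np

def rms_haar(x, lags):
    """RMS Haar fluctuation: for each interval of length `lag`, take the
    mean of its second half minus the mean of its first half."""
    out = []
    for lag in lags:
        half = lag // 2
        starts = np.arange(0, len(x) - lag + 1, half)  # half-overlapping
        fl = [x[s + half:s + lag].mean() - x[s:s + half].mean() for s in starts]
        out.append(np.sqrt(np.mean(np.square(fl))))
    return np.array(out)

# Sanity check on a random walk, for which H = 0.5
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(2**15))
lags = 2**np.arange(2, 12)
S = rms_haar(walk, lags)
H_est = np.polyfit(np.log(lags), np.log(S), 1)[0]
print(f"estimated H = {H_est:.2f}")
```

Unlike differences, Haar fluctuations can decrease with increasing lag, which is what makes them usable in both the weather and macroweather regimes.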

Schematic illustration of difference (top) and anomaly (middle)
fluctuations for a multifractal simulation of the atmosphere in the weather regime (

Figure 18 shows a modern composite using the rms Haar fluctuation, spanning a range of scales of

Also shown in Fig. 16 are reference lines indicating the typical scale
dependencies. These correspond to typical temperature fluctuations

With the help of the figure, we can now understand the problem with the
usual definition of climate as “long-term” weather. As we average from 10 d to longer durations, temperature fluctuations do indeed tend to
diminish – as expected if they converged to the climate. Consider for
example the thick solid line in Fig. 18 (corresponding to data at
75

The interpretation of the apparent point of convergence as the climate state
is supported by the analysis of global data compared with GCMs in “control runs” (i.e. with fixed external conditions, Fig. 19). When averaged over
long enough times, the control runs do indeed converge, although the convergence is “ultra slow” (at a rate characterized by the exponent

The broad sweep of atmospheric variability with rms Haar fluctuations showing the various (roughly power-law) atmospheric regimes, adapted and updated from the original (Lovejoy, 2013) and the update
in Lovejoy (2015), where the full details of the data sources are given (with the exception of the paleo-analysis marked “Grossman”, which is from
Grossman and Joachimski, 2022). The dashed vertical lines show the rough divisions between
regimes; the macroweather–climate transition is different in the preindustrial epoch. Starting at the left, we have the high-frequency analysis (lower left) from thermistor data taken at McGill at 15 Hz. Then, the thin curve starting at 2 h is from a weather station, the next (thick) curve is from the
20th century reanalysis (20CR), and the next, “S”-shaped curve is from the EPICA core. Finally, the three far-right curves are benthic paleotemperatures (from “stacks”). The quadrillion estimate is for the spectrum: it depends
somewhat on the calibration of the stacks. With the calibration in the
figure, the typical variation of consecutive 50 million year averages is

Top (brown): the globally averaged, rms Haar temperature fluctuations averaged over three data sets (adapted from Lovejoy, 2019, where there are full details: the curve over the corresponding timescale range in Fig. 19 is at
75

Returning to Fig. 18, however, we see that, beyond a critical timescale

Regarding the last 100 kyr, the key point about Fig. 18 is that we have three regimes – not two. Since the intermediate regime is well reproduced by control runs (Fig. 19), it is termed “macroweather”: it is essentially averaged weather.

If the macroweather regime is characterized by slow convergence of averages with scale,
it is logical to define a climate state as an average over durations that
are long enough so that the maximum convergence has occurred – i.e. over
periods

Again from Fig. 18, we see that the climate state itself starts to vary in a
roughly scaling way up until Milankovitch timescales (at about 50 kyr, half the period of the main 100 kyr eccentricity frequency) over which fluctuations are typically of the order

Space–time diagrams are log-time–log-space plots for the ocean (Stommel, 1963, Fig. 20, left) and the atmosphere (Orlanski, 1975, Fig. 20, right). They highlight the conventional morphologies, structures, and processes typically indicated by boxes or ellipses in the space–time regions in which they have been observed. Since the diagrams refer to the lifetimes of structures co-moving with the fluid, these are Lagrangian space–time relations. The Eulerian (fixed-frame) relations are discussed in the next section.

A striking feature of these diagrams – especially in Orlanski's atmospheric
version (Fig. 20, right panel) but also in the updates (Fig. 21) – is the near-linear, i.e. power-law, arrangement of the features. As pointed out in Schertzer et al. (1997a), in the case of Orlanski's diagram, the slope of the line is
very close to the theoretically predicted value

The original space–time diagrams (Stommel, 1963, ocean, left; Orlanski, 1975, the atmosphere, right). The solid red lines are theoretical lines assuming the horizontal Kolmogorov scaling with the measured mean
energy rate densities indicated. The dashed red lines indicate the size of
the planet (half-circumference 20 000 km), where the timescale at which they meet is the lifetime of planetary structures (

The original figures are space–time diagrams for the ocean (left) and atmosphere (right) from Ghil and Lucarini (2020); note that space and time have been swapped as compared to Fig. 20. As in Fig. 20, solid red lines have been added, showing the purely theoretical predictions. On the right, a solid blue line was added showing the planetary scale. The dashed red line (also added) shows the corresponding lifetimes of planetary structures (the same as in Fig. 20). We see once again that wide-range horizontal Kolmogorov scaling is compatible with the phenomenology, especially when taking into account the statistical variability of the space–time relationship itself, as indicated in Fig. 22.

A space–time diagram showing the effects of intermittency and, for the oceans, the deep currents associated with very low

Thinking of the atmosphere as a heat engine that converts solar energy into
mechanical energy (wind) allows us to estimate

On Earth, direct estimates of

Using the value

In space, up to planetary scales, the basic wind statistics are controlled
by

Figure 24 shows atmospheric and oceanic spectra in which the weather–macroweather and ocean weather–ocean macroweather transitions clearly occur at the theoretically calculated timescales. It also shows the only other known weather–macroweather transition, this time on Mars using Viking lander data. The Martian transition time may be theoretically determined by using
the Martian value
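
The transition timescale follows from dimensional analysis of the Kolmogorov law: the lifetime of a structure of size L is τ = ε^(−1/3) L^(2/3). A quick check with round numbers (the value ε ≈ 10⁻³ W kg⁻¹ is a typical tropospheric estimate assumed here for illustration, not a figure quoted above):

```python
# Kolmogorov lifetime of planetary-scale structures: tau = eps**(-1/3) * L**(2/3)
eps = 1.0e-3  # W/kg -- assumed typical mean energy rate density for Earth
L = 2.0e7     # m   -- half the Earth's circumference (20 000 km)

tau_w = eps ** (-1.0 / 3.0) * L ** (2.0 / 3.0)  # seconds
print(f"weather-macroweather transition ~ {tau_w / 86400:.1f} days")
```

This reproduces the roughly 10 d weather–macroweather transition; the same formula, with Martian values of ε and L, gives the corresponding Martian transition time.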

The weather–macroweather transition time

The three known weather–macroweather transitions: air over the Earth (black, and upper left, grey), the sea surface temperature (SST, ocean)
at 5

In the previous section, we discussed the space–time relations of structures of size

The key difference between the Eulerian and Lagrangian statistics is that
the former involves an overall mean advection velocity

In Taylor's laboratory turbulence,

In order to test the space–time scaling on real-world data, the best sources are remotely sensed data such as the space–time lidar data discussed in
Radkevitch et al. (2008) or the global-scale data from geostationary satellites in the infrared (IR), whose spectra are shown in Fig. 25 (Pinel et al., 2014). The figure uses 1440 consecutive hourly images at
5 km resolution over the region 30

There are two remarkable aspects of the figure. The first is the apparently slight curvature (normally a symptom of deviations from perfect scaling), which is in reality largely a “finite-size effect” on otherwise excellent scaling. This can be seen by comparison with the black
curve that shows the consequences of the averaging over the (roughly
rectangular) geometry of the observing region combined with the “trivial”
anisotropy of the spectrum (implied by the matrix

The second remarkable aspect of Fig. 25 is the near-perfect superposition of the 1D spectra

Given the space–time scaling, one can use the real space statistics to define Eulerian space–time diagrams. Using the same data, this is shown in Fig. 26, where we see that the relationship is nearly linear in a linear–linear plot (i.e. with a constant velocity) up to about 10 d, corresponding to near-planetary scales as indicated in the figure. Note some minor differences between the EW and NS directions.

The zonal, meridional, and temporal spectra of 1386 images (

The Eulerian (fixed-frame) space–time diagram obtained from the same satellite pictures analysed in Fig. 25, lower left, reproduced from
Pinel et al. (2014). The slopes of the reference lines correspond to
average winds of 900 km d

Up until now, we have discussed scaling at a fairly general level as an invariance under scale changes, contrasting it with scaleboundedness and emphasizing its indispensable role in understanding the atmosphere, the ocean, and, more generally, the geosphere. There are two basic elements that must be considered: (a) the definition of the notion of scale and scale change and (b) the aspect of the system or process that is invariant under the corresponding change (the invariant).

We have seen that, in general terms, a system is scaling if there exists a power-law relationship (possibly deterministic, but usually statistical) between fast and slow (time) or small and large (space, or both, space–time). If the system is a geometric set of points – such as the set of meteorological measuring stations (Lovejoy et al., 1986) – then the set is a fractal set and the density of its points is scaling: it is a power law whose exponent is its fractal codimension. Geophysically interesting systems are typically not sets of points but rather scaling fields such as the temperature

In such a system, some aspect – most often a suitably defined fluctuation

The simplest case is where the fluctuations in a temporal series

Equation (12) relates the probabilities of small and large fluctuations; it is usually easier to deal with the deterministic equalities that follow by
taking

Equation (13) is the general case where the resolution

Combining Eqs. (13) and (14), we obtain

In the case of “simple scaling” where

The more general “nonlinear scaling” case, where

Note that, in the literature, the notation “

We could mention that, here and in Sect. 3.3, where we discuss the corresponding multiscaling probability distributions, we use the

The

Atmospheric modelling is classically done using the deterministic equations of thermodynamics and continuum mechanics. In principle, one could have used a more fundamental (lower-level) approach – statistical mechanics – but this would have been impossibly difficult. In strongly nonlinear fluid flow, the same hierarchy of theories continues to higher-level turbulent laws. These laws are scaling and may – depending on the application – be simpler and more useful. A concrete example is the macroweather regime, where (strongly nonlinear, deterministic) GCMs are taken past their deterministic predictability limit of about 10 d. Due to their sensitivity to initial conditions, there is an inverse cascade of errors (Lorenz, 1969; Schertzer and Lovejoy, 2004): beyond the predictability limit, small-scale errors grow to dominate the global scales, and the GCMs effectively become stochastic. Since the intermittency is low (the spikiness on the right-hand side of Fig. 2 and at the bottom of Fig. 3), this stochastic behaviour is – to a good degree of approximation – amenable to modelling by linear stochastic processes, in this case the half-order and fractional energy balance equations (HEBE, FEBE; Lovejoy, 2021a, b; Lovejoy et al., 2021; Lovejoy, 2022c). The key issue – whether linear or nonlinear stochastic processes can be used – thus depends on the “spikiness”, i.e. the intermittency (multifractality).

Classically, intermittency was first identified in laboratory flows as “spottiness” (Batchelor and Townsend, 1949) and, in the atmosphere, by the concentration of atmospheric fluxes in tiny, sparse regions. In time series, it is associated
with turbulent flows undergoing transitions from “quiescence” to
“chaos”. Quantitative intermittency definitions developed originally for
fields (space) are of the “on–off” type, the idea being that when the energy or other flux exceeds a threshold, then it is “on”, i.e. in a special state – perhaps of strong/violent activity. At a specific measurement
resolution, the on–off intermittency can be defined as the fraction of space where the field is “on” (where it exceeds the threshold). In a scaling system, for any threshold, the “on” region will be a fractal set and both
the fraction and the threshold will be characterized by exponents (by

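The on–off picture can be made concrete with a toy β-model cascade (my own sketch; the codimension value, the factor-of-2 cascade, and the 64 starting intervals are illustrative assumptions). In a β-model, the fraction of “on” intervals at scale ratio λ is λ^(−c) by construction, and this can be recovered from a simulated realization:

```python
import numpy as np

rng = np.random.default_rng(1)
c = 0.3          # assumed codimension of the "on" set
n_levels = 12    # total scale ratio lam = 2**n_levels
field = np.ones(64)  # 64 independent starting intervals (tames fluctuations)

# beta-model: at each twofold refinement, each sub-interval survives with
# probability 2**(-c); survivors are boosted by 2**c so that the ensemble
# mean flux is conserved.
for _ in range(n_levels):
    field = np.repeat(field, 2)
    alive = rng.random(field.size) < 2.0 ** (-c)
    field = np.where(alive, field * 2.0 ** c, 0.0)

lam = 2.0 ** n_levels
frac_on = np.mean(field > 0)             # ~ lam**(-c)
c_est = -np.log(frac_on) / np.log(lam)   # recovered codimension
print(f"fraction on = {frac_on:.4f}, estimated c = {c_est:.3f}")
```

For a multifractal, a different codimension c(γ) is obtained for each threshold λ^γ, which is why a full exponent function – rather than a single fractal dimension – is needed.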
With the help of multifractals, we can now quantitatively interpret the spike plots. Recall that

As long as

The top row is a reproduction of the intermittent spikes taken from
the gradients in the aircraft data at the bottom of Fig. 3. The original
series is 2294 km long with resolution 280 m, and hence it covers a scale range of a factor of

The same as Fig. 27 but in terms of the corresponding singularities
obtained through the transformation of variables

What happens if we change the resolution of

Examine now the vertical axes. We see that – as expected – the amplitude of the spikes systematically decreases with resolution, and the plots are clearly not scale-invariant. We would like a scale-invariant description of the spikes and of their probability distribution. For this,
each spike is considered to be a singularity of order

To leading order (i.e. setting the prefactor

In the general scaling case, the set of spikes that exceed a given threshold
form a fractal set whose sparseness is quantified by the fractal codimension

Gaussian series are not intermittent since

Returning to Fig. 2, we have

While

These equations imply one-to-one relationships between the spike singularities

At first sight, general (multifractal) scaling involves an entire exponent
function – either
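
For the lognormal (α = 2) universal case, the pair K(q) = C₁(q² − q) and c(γ) = (γ + C₁)²/(4C₁) are Legendre transforms of one another; a quick numerical check (the parameter values are purely illustrative):

```python
import numpy as np

C1 = 0.1                       # illustrative codimension of the mean
q = np.linspace(0.0, 5.0, 501)
K = C1 * (q**2 - q)            # lognormal (alpha = 2) moment-scaling function

# Numerical Legendre transform: c(gamma) = max_q [q*gamma - K(q)]
gammas = np.linspace(-C1, 0.8, 200)
c_num = np.array([np.max(g * q - K) for g in gammas])
c_theory = (gammas + C1) ** 2 / (4 * C1)
err = np.max(np.abs(c_num - c_theory))
print(f"max |numerical - analytic| = {err:.1e}")
```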

Figures 28 and 29 show the universal

Table 1 shows various empirical estimates relevant to atmospheric dynamics.
We see that, generally,

Universal

Universal

We compare various horizontal parameter estimates, attempting to
give summarized categories of values (radiances) or approximate values (

Using spike plots, we can simply demonstrate the unique character of the
macroweather regime: low intermittency in time but high intermittency in space. We introduced the

Consider the data shown in Fig. 31 (macroweather time series and spatial
transects, top and bottom respectively). Figure 32 compares the rms fluctuations (with exponent

For many applications, the exceptional smallness of macroweather
intermittency makes the “monoscaling” approximation (i.e.

A comparison of temporal and spatial macroweather series at 2

In the preceding sections, we gave evidence that diverse atmospheric fields are scaling up to planetary scales. In addition, we argued that they generally
were multifractal, with each statistical moment

Before proceeding to empirical analyses of the fluxes, a few comments are
required. The flux in Eq. (14) is assumed to be normalized, i.e.
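
The trace-moment technique itself is easy to sketch numerically (my own illustration, using an assumed lognormal cascade with C₁ = 0.1; the depth and number of realizations are arbitrary): degrade a normalized flux to successively lower resolutions and read K(q) off the log–log slope of the moments versus the scale ratio λ:

```python
import numpy as np

rng = np.random.default_rng(2)
C1, n, n_real = 0.1, 12, 200   # assumed C1, cascade depth, realizations
# Lognormal cascade weights with <W> = 1 and <W**q> = 2**(C1*(q**2 - q))
m, s = -C1 * np.log(2), np.sqrt(2 * C1 * np.log(2))

flux = np.ones((n_real, 1))
for _ in range(n):
    flux = np.repeat(flux, 2, axis=1)
    flux = flux * rng.lognormal(m, s, flux.shape)

# Trace moments: average the flux down to scale ratio lam, then estimate
# K(q) as the log-log slope of <flux_lam**q> versus lam
q = 2.0
lams = [2 ** (n - j) for j in range(n)]
moments = [np.mean(flux.reshape(n_real, lam, -1).mean(axis=2) ** q)
           for lam in lams]
K_est = np.polyfit(np.log(lams), np.log(moments), 1)[0]
print(f"estimated K(2) = {K_est:.2f}; bare theory C1*(q**2-q) = {C1*(q*q-q):.2f}")
```

Repeating the fit for a range of q values traces out the full convex K(q) curve.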

The first-order and rms Haar fluctuations of the series and transect in Fig. 31. One can see that, in the spiky transect, the fluctuation statistics converge at large lags (

A comparison of the intermittency function

Figure 34 shows the first empirical trace-moment estimate (Schertzer and Lovejoy, 1987). It was applied to data from a land-based radar whose 3 km altitude reflectivity maps were 128 km wide with a 1 km resolution. The vertical axis is

The trace moments characterize a fundamental aspect of the atmosphere's nonlinear dynamics – its intermittency. In fully developed turbulence, intermittency is expected to be a “universal” feature, i.e. found in all high Reynolds number flows. In our case, the closest universality test is to compare Earth with Mars (using the same reanalyses as in Fig. 9). Figure 38 shows the result when this technique
is applied to both terrestrial and Martian reanalyses for pressure, wind, and
temperature (for both planets, the reanalyses were at altitudes corresponding
to about 70 % of surface pressure). One can note that (a) as predicted, the turbulence is universal, i.e. not sensitive to the forcing mechanisms and boundaries, so that the behaviour is nearly identical on the two planets, and (b) there is clear multiscaling (the logarithmic slopes

Table 1 shows typical values of multifractal parameters estimated from trace
moments (Sect. 3.4) of various atmospheric fields. Over the decades, many
multifractal analyses of geofields have been performed, including of
atmospheric boundary conditions, notably the topography on Earth (Lavallée et al., 1993; Gagnon et al., 2006), Mars (Landais et al., 2015), and the
sea surface temperature (Lovejoy and Schertzer, 2013). We can see that the universal multifractal index (

The moments

The same as Fig. 34 except for TRMM reflectivities (4.3 km
resolution). The moments are for

Trace moments from

The cascade structure of lidar aerosol backscatter; see the example in Fig. 45. Moments of normalized fluxes (indicated as

A comparison of the scaling of the normalized fluxes
(

The multifractal process

However, we can already see a problem with this naïve construct. When
we reach the top (corresponding to data at 280 m resolution), we are still
far from the turbulent dissipation scale that is roughly 1 million times
smaller: the top line is better modelled by continuing the cascade down to
very small (dissipation) scales and then – imitating the aircraft sensor –
averaging the result over 280 m. A multifractal process at scale
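
The bare/dressed distinction can be sketched numerically (my own illustration with an assumed lognormal cascade; all parameter values are arbitrary): a “bare” flux stops the cascade at the observation scale, while a “dressed” flux continues it to much smaller scales and then averages back up. The dressed high-order moments come out systematically larger:

```python
import numpy as np

rng = np.random.default_rng(3)
C1 = 0.1  # assumed codimension of the mean
m, s = -C1 * np.log(2), np.sqrt(2 * C1 * np.log(2))

def cascade(levels, n_real):
    """n_real independent lognormal multiplicative cascades with <W> = 1."""
    flux = np.ones((n_real, 1))
    for _ in range(levels):
        flux = np.repeat(flux, 2, axis=1)
        flux = flux * rng.lognormal(m, s, flux.shape)
    return flux

n_obs, n_extra, n_real, q = 6, 6, 2000, 2.0
bare = cascade(n_obs, n_real)                    # stopped at lam = 2**n_obs
fine = cascade(n_obs + n_extra, n_real)          # continued to smaller scales
dressed = fine.reshape(n_real, 2**n_obs, -1).mean(axis=2)  # averaged back up

bare_m, dressed_m = np.mean(bare**q), np.mean(dressed**q)
print(f"bare <flux^{q:g}> = {bare_m:.2f}, dressed <flux^{q:g}> = {dressed_m:.2f}")
```

For large enough q, the dressed moments diverge altogether, which is the origin of the power-law probability tails discussed in the following sections.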

Mathematically, we can represent the dressed process as

A basic result going back to Mandelbrot (1974) and generalized in
Schertzer and Lovejoy (1985c, 1987, 1992, 1994) shows that
the statistical moments are related as

The critical moment for divergence

We can now briefly consider the conditions under which there are nontrivial
solutions to Eq. (28) with finite

It is now convenient to define the strictly increasing “dual” codimension
function

To find the corresponding dressed probability exponent

A schematic illustration of the relation between

To get an idea of how extreme the extremes can be, consider the temperature
fluctuations with

The probability distribution of daily temperature differences in
daily mean temperatures from Macon, France, for the period 1949–1979 (10 957 d). Positive and negative differences are shown as separate curves. A best-fit Gaussian is shown for reference, indicating that the extreme fluctuations correspond to more than 7 standard deviations. For a Gaussian
this has a probability of 10

A relevant example of the importance of the power-law extremes is global warming. Over about a century, there has been 1

There are now numerous atmospheric fields whose extremes have been studied
and power-tail exponents (

A summary of various estimates of the critical order of divergence
of moments (

While the temperature is of fundamental significance for the climate, the wind is the dynamical field, so that it is analogously important at weather scales (as well as in mechanically forced turbulence). For example, numerous statistical models of fully developed turbulence are based on “closure” assumptions that relate high-order statistical moments to lower-order ones, thus allowing the evolution of the statistics in high Reynolds number turbulence to be modelled. Closures thus postulate the finiteness of some (usually all) high-order statistical moments of the velocity field.

In fully developed turbulence, in the inertial (scaling) range,

Before discussing this further, let us consider the evidence for the
divergence of high-order moments in the velocity/wind field. The earliest evidence is shown in Fig. 41 (left): it comes from radiosondes (balloons) measuring the changes in horizontal wind velocity in the vertical direction.
Schertzer and Lovejoy (1985c) found

These early results had only order 10

Results from a much larger sample and from a more controlled laboratory
setting (a wind tunnel), also in the temporal domain, are shown in Fig. 43
(data taken from Mydlarski and Warhaft, 1998, and analysed in Radulescu et al., 2002, and Lovejoy and Schertzer, 2013). In this case, by placing sensors at varying separations, one
can estimate the exponents in both the inertial and dissipation ranges. In
the inertial range, the result (

The previous results from the wind and laboratory turbulence allowed
estimates of the probability tails down to levels of only about 10

We could note that values

The left-hand-side plot shows the probability distribution of the squares of horizontal wind differences in the vertical direction, estimated
from radiosondes. The curves from left to right are for layer thicknesses 50, 100, … 3200 m. The curves' straight reference lines have
slopes corresponding to

The left-hand-side figure shows the probability distribution of changes

Probability distributions from laboratory turbulence from pairs of
anemometers separated by small dissipation range (DR) distances and larger (IR) distances. Slopes corresponding to

Probability distributions of enstrophy (

In these log–log plots of probability densities, we see that most of the distributions show evidence of log–log linearity near the extremes. When judging possible deviations, it should be recalled that, due to inadequate instrumental response times, postprocessing noise-reduction procedures (e.g. smoothing), or outlier-elimination algorithms, extremes can easily be underestimated. Since, physically, the extremes are consequences of variability building up over a wide range of spatial scales,
we expect that numerical model outputs (including reanalyses) will
underestimate the extremes. For example, Lovejoy (2018) argued that the models'
small hyperviscous scale range (truncated at

The power-law fluctuations in Figs. 41–44 are so large that, according to classical assumptions, they would be outliers. In atmospheric science, thanks to the scaling, very few processes are Gaussian and extremes occur much more frequently than expected, a fact that colleagues and I regularly underscored starting in the 1980s (see Table 2 and, for a review, Chap. 5 of Lovejoy and Schertzer, 2013).

At best, Gaussians can be justified for additive processes, with the added
restriction that the variance is finite. However, once this restriction is
dropped, we obtain “Lévy distributions” with power-law extremes but with exponents

To underscore the importance of nonclassical extremes, Taleb introduced the terms “grey and black swans” (Taleb, 2010). Originally, the former designated Lévy extremes, and the latter was reserved for extremes that were so strong that they were outliers with respect to any existing theory. However, the term “grey swan” never stuck, and the better-known expression “black swan” is increasingly used for any power-law extremes.
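The practical difference between Gaussian and power-law tails is easy to quantify analytically. The following minimal sketch compares the exceedance probabilities of a 10-standard-deviation event; the power-law tail exponent used here (5) is purely illustrative, of the order reported for some atmospheric fields:

```python
import math

def gaussian_exceedance(s):
    """P(X > s) for a standard Gaussian, via the complementary error function."""
    return 0.5 * math.erfc(s / math.sqrt(2))

def powerlaw_exceedance(s, q_D):
    """P(X > s) = s**(-q_D) for a power-law (Pareto-type) tail, s >= 1."""
    return s ** (-q_D)

# A 10-standard-deviation event: essentially impossible under a Gaussian,
# merely rare under a power-law tail (the exponent 5 is illustrative).
s = 10.0
print(f"Gaussian : P(X>{s}) = {gaussian_exceedance(s):.1e}")
print(f"power law: P(X>{s}) = {powerlaw_exceedance(s, 5):.1e}")
```

The roughly 18-orders-of-magnitude gap between the two tail probabilities is why, under classical Gaussian assumptions, power-law extremes are misclassified as outliers.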

All of this is important in climate science, where extreme events are often associated with tipping points. The existence of black swan extremes leads to a conundrum: since black swans already lead to exceptionally big extremes, how can we distinguish “mere” black swans from true tipping points?

So far, we have only discussed scaling in 1D (series and transects), so that the notion of scale itself can be taken simply as an interval (space)
or lag (time), and large scales are simply obtained from small ones by
multiplying by their scale ratio

The most obvious problem is stratification in the horizontal (see Figs. 4 and 5). This is graphically illustrated by the airborne lidar backscatter from
aerosols in Fig. 45. At low resolution (bottom), one can see highly stratified layers.
However, zooming in (top) shows that the layers contain small structures that are in fact quite “roundish”, hinting that, at even higher resolutions, the stratification might instead be in the vertical. If we determine the spectra
in the horizontal and compare them with those in the vertical, we obtain Fig. 46; the spectra show power laws in both directions but with markedly different
exponents. As shown below, it turns out that the key ratio is

The difference in horizontal and vertical exponents is a consequence of
scaling stratification: the squashing of structures with scale. In the
simplest case, called “self-affinity”, the squashing is along orthogonal
directions that are the same everywhere in space – for example along the

Bottom: a vertical section of laser backscatter from aerosols (smog particles) taken by an airborne lidar (laser) flying at 4.5 km altitude
(purple line) over British Columbia near Vancouver (the topography is shown in black; the lidar shoots two beams, one up and one down) (Lilley
et al., 2004). The resolution is 3 m in the vertical and 96 m in the horizontal. The bottom panel is at a fairly coarse resolution, and we mostly see a layered structure.
Top: the black box in the lower left is shown blown up at the top of the figure. We are now starting to discern vertically aligned and roundish structures.
The aspect ratio is about

The lower curve is the power spectrum for the fluctuations in the
lidar backscatter ratio, a surrogate for the aerosol density (

The mean absolute difference in the horizontal wind, averaged over
238 dropsondes taken over the Pacific Ocean in 2004. The data were analysed over regions from the surface to higher and higher altitudes (the different lines from bottom to top, separated by a factor of 10 for clarity). Layers
of thickness

To deal with anisotropic scaling, we need an anisotropic definition of the notion of scale itself.

The simplest scaling stratification is called self-affinity: the
squashing is along orthogonal directions that are the same
everywhere in space – for example along the

The problem is to define the notion of scale in a system where there is no
characteristic size. Often, the simplest (but usually unrealistic)
self-similar system is simply assumed without question: the notion of
scale is taken to be isotropic. In this case, it is sufficient to define the
scale of a vector

To generalize this, we introduce a scale function

GSI is exploited in modelling and analysing many atmospheric fields (wind, temperature, humidity, precipitation, cloud density, aerosol concentrations;
see Lovejoy and Schertzer, 2013). To illustrate the idea, we can define the “canonical” scale
function for the simplest stratified system representing a vertical

Figure 48 shows some examples of lines of constant scale function defined by
Eq. (36) with varying
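Contours like those of Fig. 48 can be reproduced numerically. Below is a minimal sketch assuming the canonical scale function ||(Δx, Δz)|| = ℓs[(Δx/ℓs)² + |Δz/ℓs|^(2/Hz)]^(1/2) with Hz = 5/9 (the 23/9D value); the spheroscale ℓs = 100 m is illustrative:

```python
import math

def canonical_scale(dx, dz, ls=100.0, Hz=5.0 / 9.0):
    """Canonical scale function: ls*((dx/ls)**2 + |dz/ls|**(2/Hz))**0.5,
    with spheroscale ls (metres, illustrative) and Hz = 5/9 (the 23/9D value)."""
    return ls * math.sqrt((dx / ls) ** 2 + abs(dz / ls) ** (2.0 / Hz))

# Scaling check: under the anisotropic "zoom" (dx, dz) -> (dx/lam, dz/lam**Hz),
# the scale function contracts by exactly the factor lam:
dx, dz, lam = 5000.0, 800.0, 10.0
ratio = canonical_scale(dx, dz) / canonical_scale(dx / lam, dz / lam ** (5.0 / 9.0))
print(ratio)  # -> 10.0 (to floating-point accuracy)
```

Lines of constant `canonical_scale` trace out the squashed-ellipse families: at the spheroscale the contours are roundish, while above it structures are increasingly flattened in the vertical.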

Equipped with a scale function, the general anisotropic generalization of
the 1D scaling law (Eq. 12) may now be expressed by using the scale

A series of ellipses each separated by a factor of 1.26 in scale,
red indicating the unit scale (here, a circle and thick lines). Upper left to lower right:

Kolmogorov theory was mostly used to understand laboratory hydrodynamic
turbulence, which is mechanically driven and can be made approximately isotropic (unstratified) by the use of either passive or active grids. In
this case, fluctuations in

In addition to the dynamical equations with quadratic invariant

The classical way of dealing with buoyancy is to use the Boussinesq
approximation, i.e. to assume the existence of a scale separation and then
define density (and hence buoyancy) perturbations about an otherwise
perfectly stratified “background” flow. This leads to the classical
isotropic buoyancy subrange turbulence discovered independently by
Bolgiano (1959) and Obukhov (1959). Unfortunately, it was postulated to be an

However, if there is wide-range atmospheric scaling, then there is no scale separation (as outlined in Chap. 6 in Lovejoy and Schertzer, 2013), and so we can make a more physically based argument which is analogous to that used for deriving passive scalar variance cascades in passive scalar advection – the Corrsin–Obukhov law (Corrsin, 1951; Obukhov, 1949) (itself analogous to the energy flux cascades that lead to the Kolmogorov law).
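For reference, the classical laws invoked here have the following standard forms (a hedged reconstruction using ε for the energy flux, χ for the passive scalar variance flux, and φ for the buoyancy variance flux; the review's Eqs. 43–44 may use slightly different notation):

```latex
% Kolmogorov (horizontal wind), Corrsin--Obukhov (passive scalar),
% and Bolgiano--Obukhov (vertical wind, buoyancy-driven):
\Delta v \approx \epsilon^{1/3}\,\Delta x^{1/3}, \qquad
\Delta \rho \approx \chi^{1/2}\epsilon^{-1/6}\,\Delta x^{1/3}, \qquad
\Delta v \approx \phi^{1/5}\,\Delta z^{3/5}.
```

Equating the horizontal and vertical velocity laws at a common scale gives the spheroscale ℓs = ε^(5/4) φ^(−3/4), the (highly variable) scale at which structures are roundish.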

If we neglect dissipation and forcing, then

We can see that the two laws in Eq. (43) are special cases of the more general
anisotropic scaling law Eq. (40) since, for pure horizontal displacements (

In Sects. 2.6 and 2.7, we mentioned that for Lagrangian frame temporal velocity fluctuations we should use the size–lifetime relation that is implicit in
the horizontal Kolmogorov law. If we assume horizontal isotropy, then, for velocity fluctuations, we have

Using the space–time scale function, we may now write the space–time generalization of the Kolmogorov law as

Analogously to the result of the previous subsection, the corresponding
simple (“canonical”) space–time scale function is

We now seek to express the generalized Kolmogorov law in an Eulerian framework. The first step is to consider the effects on the scale function of an overall advection. We then consider statistical averaging over turbulent advection velocities.

Advection can be taken into account using the Galilean transformation

It will be useful to study the statistics in Fourier space; for this purpose
we can use the result (e.g. Chap. 6 of Lovejoy and Schertzer, 2013) of the Fourier generator

The above results are for a deterministic advection velocity, whereas in reality, the advection is turbulent. Even if we consider a flow with zero
imposed mean horizontal velocity (as argued by Tennekes, 1975) in a scaling
turbulent regime with

The statistics of the intensity gradients of real fields are influenced by
random turbulent velocity fields and involve powers of such scale functions
but with appropriate “average” velocities. In this case, considering only
the horizontal and time, we introduce the nondimensional variables (denoted by a circumflex “

As discussed in Lovejoy and Schertzer (2013), the above real-space scale function is needed to interpret “satellite winds” (deduced from time series of satellite cloud images), and in Sect. 2.7 the Fourier equivalent of Eq. (58)
(based on the inverse matrix

The first experimental measurement of the joint (

A contour plot of the mean squared transverse (top) and
longitudinal (bottom) components of the wind as estimated from a year's
(

To illustrate what the 23/9D model implies for the atmosphere, we can make multifractal simulations of passive scalar clouds: these were already discussed in Fig. 5, which showed that, in general, scaling leads to structures whose morphologies change with scale even though there is no characteristic scale involved. Figure 5 compares a zoom into an isotropic (self-similar) multifractal cloud (left) and into a vertical section of a stratified 23/9D cloud. While zooming into the self-similar cloud yields similar-looking cross sections at all scales, zooming into the 23/9D cloud on the right of Fig. 5 displays continuously varying morphologies. We see that, at the largest scale (top), the cloud is in fairly flat strata; however, as we zoom in, we eventually obtain roundish structures (at the spheroscale), and then, at the very bottom, we see vertically oriented filaments forming, indicating stratification in the vertical direction (compare this with the lidar data in Fig. 45).

The anisotropic stratification and elliptical dimension of rain areas (as determined by radar) go back to Lovejoy et al. (1987) and, with much more vertical resolution, to CloudSat, a satellite-borne radar analysed in Fig. 50 (see the sample CloudSat image in Fig. 4). From Fig. 50, we see that the mean relation between horizontal and vertical extents of clouds is very close to the predictions of the 23/9D theory, with a spheroscale (averaged over 16 orbits) of about 100 m. The figure also shows that there is a fair amount of variability (as expected, since the spheroscale is a ratio of powers of highly variable turbulent fluxes; Eq. 44). Figure 51 shows the implications for typical cross sections. The stratification varies considerably as a function of the spheroscale (and hence buoyancy and energy fluxes).
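The mean horizontal–vertical extent relation implied by the 23/9D theory is simple to evaluate. A minimal sketch, assuming a mean vertical extent Δz = ℓs(Δx/ℓs)^(5/9) with the roughly 100 m CloudSat-average spheroscale quoted above:

```python
def vertical_extent(dx, ls=100.0, Hz=5.0 / 9.0):
    """Mean vertical extent of a structure of horizontal extent dx (metres),
    assuming dz = ls*(dx/ls)**Hz with Hz = 5/9 and spheroscale ls."""
    return ls * (dx / ls) ** Hz

for dx in (100.0, 1e3, 1e4, 1e5):  # horizontal extents in metres
    dz = vertical_extent(dx)
    print(f"dx = {dx:8.0f} m -> dz = {dz:6.0f} m (aspect ratio {dx / dz:.1f})")
```

At the spheroscale the aspect ratio is unity (roundish structures); above it, structures are increasingly flattened, consistent with the average cross sections of Fig. 51.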

Finally, we can compare the CloudSat estimates with those of other atmospheric fields (Table 3). The estimates for

A space (horizontal)–space (vertical) diagram estimated from the absolute reflectivity fluctuations (first-order structure functions) from 16 CloudSat orbits. Reproduced from Lovejoy et al. (2009b).

The theoretical shapes of average vertical cross sections using CloudSat-derived mean parameters:

This table uses the estimate of the vertical

What about numerical weather models? We mentioned that in the horizontal
they show excellent scaling (see Fig. 8 for reanalysis spectra, Fig. 9 for the comparison of Mars and Earth spectra, and Fig. 38 for the cascade structures). According to the 23/9D model, the dynamics are dominated by
Kolmogorov scaling in the horizontal (

The choices of horizontal and vertical numbers of degrees of
freedom that were made during the historical development of general
circulation models. According to the 23/9D mode I, the dynamics are dominated
by Kolmogorov scaling in the horizontal (

Due to the larger north–south temperature gradients, large atmospheric structures 10 000 km in the east–west direction are typically “squashed” to a size about

With this possible exception, we conclude that there is little evidence for any overall stratification in the horizontal analogous to that in the
vertical, but there is still plenty of evidence for the existence of
different shapes at different sizes and for shapes that commonly rotate by various amounts at different scales. We thus need to go beyond
self-affinity and (at least) add some rotation. Mathematically, to add
rotation to the blow-up and squashing that we discussed earlier, we only need to add off-diagonal elements to the generator

Figures 53–56 show a few examples of contours at different scales, each representing the shapes of the balls at systematically varying scales. We can see that we have the freedom to vary the unit balls (here circles and rounded triangles) and the amounts of squashing and rotation. In Fig. 53, with unit balls taken to be circles, we show the self-similar case in the upper left, a stratified case in the upper right, a stratified case with a small amount of rotation (lower left), and another case with lots of rotation (lower right). Figure 54 shows the same but with unit balls as rounded triangles, Fig. 55 takes the lower-right example and displays the balls over a factor of 1 billion in scale, and in Fig. 56 we show an example with only a little rotation but over the same factor of 1 billion in scale. We can see that, if these represent average morphologies of clouds at different scales, even though there is a single unique rule or mechanism to go from one scale to another, the average shapes change quite a bit with scale.
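Ball families of this kind can be generated directly from a generator matrix G: the scale-λ ball is the unit ball mapped by T_λ = λ^(−G) = exp(−G ln λ), with stratification coming from the diagonal of G and rotation from its off-diagonal (antisymmetric) part. A minimal sketch; the matrices here are illustrative, not those used in the figures:

```python
import math

def mat_mul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(M, terms=40):
    """2x2 matrix exponential via its Taylor series (adequate for these small
    generators and moderate scale ratios)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def ball_contour(G, lam, npts=360):
    """The scale-lam ball: the unit circle mapped by T_lam = exp(-G*ln(lam))."""
    T = mat_exp([[-g * math.log(lam) for g in row] for row in G])
    return [(T[0][0] * math.cos(a) + T[0][1] * math.sin(a),
             T[1][0] * math.cos(a) + T[1][1] * math.sin(a))
            for a in (2 * math.pi * k / npts for k in range(npts))]

# Pure stratification: G = diag(1, 5/9) squashes circles into ellipses;
# an antisymmetric part makes the balls rotate as they are squashed.
G_strat = [[1.0, 0.0], [0.0, 5.0 / 9.0]]
G_rot = [[1.0, -0.5], [0.5, 1.0]]
for lam in (1.26, 1.26 ** 10):  # the factor-1.26 steps used in the figures
    print(lam, ball_contour(G_rot, lam)[0])
```

For G with an antisymmetric part c, each blow-up by λ rotates the ball by an angle proportional to c ln λ, which is why structures can rotate endlessly with scale while remaining scaling.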

Blow-ups and reductions by factors of 1.26 starting at circles
(red). The upper left shows the isotropic case, the upper right shows the
self-affine (pure stratification) case, the lower-left example is stratified but along oblique directions, and the lower-right example has structures
that rotate continuously with scale while becoming increasingly stratified.
The matrices used are

The same as above except that now the unit ball is the rounded triangle. Reproduced from Lovejoy (2019).

The same blow-up rule as in the lower right of Fig. 53 but showing an overall blow-up by a factor of 1 billion. Starting with the inner thick
grey ball in the upper-left corner, we see a series of 10 blow-ups, each by a factor of 1.26 spanning a total of a factor of 10 (the outer thick, grey ball). Then, that ball is shrunk (as indicated by the dashed lines) so as to
conveniently show the next factor of 10 blow-up (top middle). The overall range of scales in the sequence is thus 10

A different example of balls with squashing but with only a little
rotation: the maximum rotation of structures in this example from very small
to very large scales is 55

We have explored ways in which quite disparate shapes can be generated using
blow-ups, squashings, and rotations. With the help of a unit ball, we generated families of balls, any member of which would have been an equally good starting point. The unit ball has no particular importance, and it does not
have any special physical role to play. If we have a scaling model based on
isotropic balls, then replacing them with these anisotropic balls will also
be scaling when we use the anisotropic rule to change scales: any
morphologies made using such a system of balls will be scale-invariant. Mathematically, anisotropic space–time models (see Schertzer and Lovejoy, 1987; Wilson et al., 1991; Lovejoy and Schertzer, 2010b, a) are produced in the same way as isotropic ones, except that the usual vector norm is replaced by a space–time scale function and the usual dimension of space–time

We already showed a self-similar and stratified example where the balls were used to make a multifractal cloud simulation of a vertical section (Fig. 5). Let us now take a quick look at a few examples of horizontal and 3D multifractal cloud simulations.

The simulation of a cross section of a stratified multifractal cloud in Fig. 5 already shows that the effect of changing the balls can be quite subtle. Let us take a look at this by making multifractal cloud simulations with realistic (observed) multifractal parameters (these determine the fluctuation statistics, not the anisotropy) and systematically varying the families of balls (Fig. 57). In the figure, all the simulations have the same random “seed”, so that the only differences are due to the changing definition of scale. First we can explore the effects of different degrees of stratification combined with different degrees of rotation. We consider two cases: in the first (Fig. 57), the unit ball is roughly circular within the simulated range, and in the second (Fig. 58), all the balls are highly anisotropic. Each figure shows a pair: the cloud simulation (left) and, on the right, the family of balls used to produce it.

From the third column in Fig. 57 with no stratification, we can note that changing the amount of rotation (moving up and down the column) changes nothing; this is simply because the circles are rotated to circles: rotation is only interesting when combined with stratification. The simulations in Fig. 58 might mimic small clouds (for example 1 km across) produced by complex cascade-type dynamics that started rotating and stratifying at scales perhaps 10 000 times larger. In both sets of simulations, the effect of stratification becomes more important up and down away from the centre line, and the effects of rotation vary from the left to the right, becoming more important as we move away from the third column.

Figure 59 shows examples where rotation is strong and the scale-changing rule is the same everywhere; only the unit ball is changed. By making the latter have some long narrow parts, we can obtain quite “wispy”-looking clouds.

Figure 60 shows another aspect of multifractal clouds. In Sect. 3.5.5 we discussed the fact that, in general, the cascades occasionally produce extreme events. If we make a sufficiently large number of realizations of the process, from time to time we will generate rare cloud structures that are almost surely absent on typical realizations. For example, a typical satellite picture of the tropical Atlantic Ocean would not have a hurricane, but from time to time hurricanes do appear there. The multifractality implies that this could happen quite naturally, without the need to invoke any special scalebound “hurricane process”. In the examples in Fig. 60, we use a rotating set of balls (Fig. 61). However, in order to simulate occasional, rare realizations, we have “helped” the process by artificially boosting the values in the vicinity of the central pixel. The two different rows are identical except for the sequence of random numbers used in their generation. For each row, moving from left to right, we boosted only the central region to simulate stronger and stronger vortices that are more and more improbable. As we do this, we see that the shapes of the basic set of balls begin to appear out of the chaos.

Left: multifractal simulations with nearly isotropic unit scales with stratification becoming more important up and down away from the centre
line and the rotation parameter (left to right) becoming more important as
we move away from the third column.
Right: the balls used in the simulations to the left. This is an extract from the multifractal explorer website:

The same as above except that the initial ball is highly anisotropic in an attempt to simulate the effect of stretching due to a wide range of larger scales. Reproduced from Lovejoy and Schertzer (2007b).

Simulations of cloud liquid water density with the scale-changing rule the same throughout: only the unit balls are systematically modified so as to yield more and more “wispy” clouds. Reproduced from Lovejoy et al. (2009).

The cloud simulations above are for the density of cloud liquid water; they used false colours to display the more and less dense cloud regions. Real clouds are of course in 3D space, and the eye sees the light that has been scattered by the drops. Therefore, if we make 3D cloud simulations, instead of simply using false colours, we can obtain more realistic renditions by simulating the way light interacts with the clouds (see Figs. 8, 25, and 26 for various scaling analyses of cloud radiances at various wavelengths). The study of radiative transfer in multifractal clouds is in its infancy; see however Naud et al. (1997), Schertzer et al. (1997b), Lovejoy et al. (2009d), and Watson et al. (2009).

Figures 62 and 63 show the top and side views of a multifractal cloud with the usual false colours; Figs. 64 and 65 show the same cloud rendered by simulating light travelling through the cloud with both top (Fig. 64) and bottom (Fig. 65) views. Finally, in Fig. 66, we show a simulation of thermal infrared radiation emitted by the cloud, similar to what can be observed from infrared weather satellites. We see that quite realistic morphologies are possible.

Up until now, we have only discussed space, but of course clouds and other atmospheric structures evolve in time. Since we have argued that the wind field is scaling and the wind moves clouds around, it effectively couples space and time. We therefore have to consider scaling in space and in time: in space–time. The time domain opens up a whole new realm of possibilities for simulations and morphologies. While the balls in space must be localized – since they represent typical spatial structures, “eddies” – in space–time they can be delocalized and form waves. In this case, it turns out to be easier to describe the system using Fourier methods. Figure 67 shows examples of what can be achieved with various parameters.

Each row has a different random seed but is otherwise identical. Moving from left to right shows a different realization of a random multifractal process, with the central part boosted by factors increasing from left to right in order to simulate very rare events. The balls are shown in Fig. 61. Reproduced from Lovejoy and Schertzer (2013).

The balls used in the simulations above: contours of the (rotation-dominant) scale function used in the simulations in Fig. 60. Reproduced from Lovejoy and Schertzer (2013).

The top layer of cloud liquid water using a grey-shaded rendition. Reproduced from Lovejoy and Schertzer (2013).

A side view of the previous one. Reproduced from Lovejoy and Schertzer (2013).

The top view with light scattering for the Sun (incident at 45

The same as Fig. 64 except viewed from the bottom. Reproduced from Lovejoy and Schertzer (2013).

The same as Fig. 64 except for a grey-shaded rendition of a thermal infrared field as might be viewed by an infrared satellite. Reproduced from Lovejoy and Schertzer (2013).

Examples of simulations in space–time showing wavelike morphologies. The same basic shapes are shown but with the wavelike character increasing clockwise from the upper left. Reproduced from Lovejoy et al. (2008b).

An infrared satellite image from a satellite at 1.1 km resolution,

Estimates of the shapes of the balls in each

A multifractal simulation of a cloud with texture and morphology varying in both location and scale, simulated using nonlinear GSI; the anisotropy depends on both scale and position according to the balls shown in Fig. 71. Reproduced from Lovejoy and Schertzer (2013).

The set of balls displayed according to their relative positions used in the simulation shown in Fig. 70. Reproduced from Lovejoy and Schertzer (2013).

Generalized scale invariance is necessary because zooming into clouds reveals systematic changes in morphology with magnification: to be realistic, the idea of self-similar scaling had to be generalized. The first step was to account for the stratification. When the direction of the stratification is fixed (pure stratification), there is no rotation with scale. We saw that, to model the horizontal plane, we needed to add rotation and that, to a first approximation, we could think of the different cloud morphologies as corresponding to different cloud types – cumulus, stratus, cirrus, etc.

However, there is still a problem. Up until now, we have discussed linear GSI where the generator is a matrix, so that the scale-changing operator

In the more general nonlinear GSI, the notion of scale depends not only on
the scale, but also on the location. In nonlinear GSI we introduce the
generator of the infinitesimal scale change

Locally (in a small enough neighbourhood of a point), linear GSI is defined
by the tangent space; i.e. the elements of the linear generator are

Figures 70 and 71 show an example. The physics behind this is analogous to that in Einstein's theory of general relativity. In the latter, it is the distribution of mass and energy in the universe that determines the
appropriate notion of distance, i.e. the metric. With GSI, it is the
nonlinear turbulent dynamics that determine the appropriate notions of scale
and size. Note however an important difference: the GSI notion of scale is generally

With nonlinear GSI a bewildering variety of phenomena can be described in a scaling framework. The framework turns out to be so general that it is hard to make further progress. It is like saying “the energy of the atmosphere is conserved”. While this is undoubtedly true – and this enables us to reject models that fail to conserve it – this single energy symmetry is hardly adequate for modelling and forecasting the weather. One can imagine that, if one must specify the anisotropy both as a function of scale and as a function of location, many parameters are required. At a purely empirical level, these are difficult to estimate since the process has such strong variability and intermittency. In order to progress much further, we will undoubtedly need new ideas. However, the generality of GSI does make the introduction of scalebound mechanisms unnecessary.

We have given the reader a taste of the enormous diversity of cloud morphologies that are possible within the scaling framework. We discussed morphologies that were increasingly stratified at larger scales, that rotated with scale but only a bit, or that rotated many times. There were filamentary structures, there were structures with waves, and there were structures whose character changed with position. Although all of these morphologies changed with scale, they were all consequences of dynamical mechanisms that were scale-invariant. The scalebound approach is therefore logically wrong and scientifically unjustified. Invoking scalebound mechanisms and models based solely on phenomenological appearances commits a corollary error: the “phenomenological fallacy” (Lovejoy and Schertzer, 2007c). More concisely, the phenomenological fallacy is the inference of mechanisms from phenomenology (appearances).

Starting in the 1970s, deterministic chaos, scaling, and fractals transformed our understanding of many nonlinear dynamical systems, including
the atmosphere: they were the main components of the “nonlinear
revolution”. While deterministic chaos is a paradigm built on small numbers of degrees of freedom, the scaling, fractal, and later multifractal paradigm is a stochastic framework with large numbers of degrees of freedom
that is particularly appropriate to the atmosphere. Ever since Richardson proposed his

Without further developments, neither classical approach is a satisfactory theoretical framework for atmospheric science. Fortunately, by the turn of the millennium, numerical models – based on the scaling governing equations – had matured to the point that they were increasingly – and today, often exclusively – being used to answer atmospheric questions. As a consequence, the deficiencies of the classical approaches are increasingly irrelevant for applied atmospheric science. However, there are consequences: elsewhere (Lovejoy, 2022a), I have argued that the primary casualty of the disconnect between high-level atmospheric theory and empirical science is that it blinds us to potentially promising new approaches. If only to reduce the current large (and increasing) uncertainties in projections, new approaches are indeed urgently needed.

This review therefore focuses on the new developments in scaling that
overcame these restrictions: multifractals to deal with scaling intermittency (Sect. 3) and generalized scale invariance (Sect. 4) to deal with scaling stratification and more generally scaling anisotropy. GSI clarifies the significance of scaling in geoscience since it shows that
scaling is a rather general symmetry principle: it is thus the simplest
relation between scales. Just as the classical symmetries (invariance under
translations in time and space and under rotations) are equivalent (via
Noether's theorem) to conservation laws (energy, momentum, angular momentum), the (nonclassical)
scaling symmetry conserves the scaling exponents

There are now massive data analyses of all kinds – including ones based on new techniques, notably trace moments and Haar fluctuations – that confirm and quantify atmospheric scaling over wide ranges in the horizontal and vertical. Since this includes the wind field, this implies that the dynamics (i.e. in time) are also scaling. Sections 1 and 2 discuss how, over the range of milliseconds to at least hundreds of millions of years, temporal scaling objectively defines five dynamical ranges: weather, macroweather, climate, macroclimate, and megaclimate. The evolution of the scalebound framework from the 1970s (Mitchell) to the 2020s (von der Heydt et al.) shows that it is further and further divorced from empirical science. This is also true of the usual interpretation of space–time (Stommel) diagrams that are re-interpreted in a scaling framework (Sect. 2.6). These scalebound frameworks have survived because practising atmospheric scientists increasingly rely instead on general circulation models that are based on the primitive dynamical equations. Fortunately, the outputs of these models inherit the scaling of the underlying equations and are hence themselves scaling: they can therefore be quite realistic. For decades, this has allowed the contradiction between the scaling reality and the dominant “mental model” to persist.

Similar comments apply to the still dominant isotropic theories of
turbulence that – although based on scaling – illogically place priority on
the directional symmetry (isotropy) ahead of the scaling one – and this in spite of the obvious and strong atmospheric stratification. In order for
these theories to be compatible with the stratification – notably the

The review also emphasizes the impact of the analysis of massive and new sources of atmospheric data. This involves the development of new data analysis techniques, for example trace moments (Sect. 3) that not only directly confirm the cascade nature of the fields, but also give direct estimates of the outer scales, which turn out to be close to planetary scales (horizontal) and the scale height (vertical). However, for scales beyond weather scales (macroweather), fluctuations tend to decrease rather than increase with scale, and this requires new data analysis techniques. Haar fluctuations are arguably optimal – being both simple to implement and simple to interpret (Sect. 2, Appendix B).

There is still much work to be done. While this review was deliberately restricted to the shorter (weather regime) timescales corresponding to highly intermittent atmospheric turbulence, scaling opens up new vistas at
longer timescales too. This has important implications for macroweather forecasting – both monthly and seasonal forecasts – which exploits long-range (scaling) memories (Lovejoy et al., 2015; Del Rio Amador and Lovejoy, 2019, 2021a, b) as well as
for multidecadal climate projections (Hébert et al., 2021b; Procyk et al., 2022). In addition, the growing paleodata archives from the
Quaternary and Pleistocene are clarifying the preindustrial weather–macroweather transition scale (Lovejoy et al., 2013a; Reschke et al., 2019; Lovejoy and Lambert, 2019) and confirming the scaling of paleotemperatures over scale
ranges of millennia through to Milankovitch scales (

In 1994, a new

Over the following decades, several more or less independent strands of scaling analysis evolved, each with its own mathematical
formalism and interpretations. The wavelet community dealt with
fluctuations directly, the DFA community wielded a method that could be conveniently implemented numerically, and the turbulence community focused on intermittency. In the meantime, most geoscientists continued to use spectral analysis, occasionally with singular spectrum analysis (SSA), the multitaper method (MTM), or other refinements.
New clarity was achieved with the first wavelet, the “Haar” wavelet (Haar, 1910). There were two reasons for this: the simplicity of its definition and calculation
and the simplicity of its interpretation (Lovejoy and Schertzer, 2012). To determine the
Haar fluctuation over a time interval

The inadequacy of using differences as fluctuations forces us to use a
different definition. The root of the problem is that “cancelling” series
(

How can we remedy the situation? First, consider the case

With this in mind, consider a series with

From the way it was introduced by a running-sum transformation of the series, we see that the anomaly fluctuation will be dominated by low-frequency details whenever

It turns out that many geophysical phenomena have both

However, what does the Haar fluctuation mean, and how do we interpret it? Consider first the Haar fluctuation for a series with

So what about other ranges of

To get a clearer idea of what is happening, let us briefly put all of this into the framework of wavelets, a very general method for defining
fluctuations. The key quantity is the “mother wavelet”

Difference fluctuations

Anomaly fluctuations

Haar fluctuations
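Under the usual conventions – differences of values, interval averages of the mean-removed series, and twice the difference of half-window means – the three fluctuation types above can be sketched as a few lines of array code (an illustrative sketch; function names are hypothetical):

```python
import numpy as np

def difference_fluct(f, lag):
    """Difference ("poor man's") fluctuation: f(t + lag) - f(t)."""
    return f[lag:] - f[:-lag]

def anomaly_fluct(f, lag):
    """Anomaly fluctuation: the average, over an interval of length
    `lag`, of the series with its overall mean removed."""
    S = np.concatenate(([0.0], np.cumsum(f - f.mean())))
    return (S[lag:] - S[:-lag]) / lag

def haar_fluct(f, lag):
    """Haar fluctuation: twice the difference between the means of the
    second and first half-windows, computed here as a second difference
    of the running sum (lag must be even)."""
    half = lag // 2
    S = np.concatenate(([0.0], np.cumsum(f)))
    i = np.arange(len(f) - lag + 1)
    return (2.0 / half) * (S[i + lag] - 2.0 * S[i + half] + S[i])
```

On a linear ramp f(t) = t, for example, both the difference and the (calibrated) Haar fluctuation equal the lag, illustrating why the factor 2 makes Haar fluctuations directly comparable to differences.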

In order to understand the convergence/divergence of different scaling
processes, it is helpful to consider the Fourier transforms (indicated with a tilde). The general relation between the Fourier transform of the
fluctuation at lag

Taking the modulus squared and ensemble averaging (“

We may now consider the convergence of the fluctuation variance using
Parseval's theorem:

When

The simpler wavelets discussed in the text: see Table A1 for mathematical definitions and properties. The black bars symbolizing Dirac delta functions (actually infinite in height) indicate the difference fluctuation (poor man's wavelet), the stippled red line indicates the anomaly fluctuation, the blue rectangles show the Haar fluctuation (divided by 2), and the solid red line shows the first derivative of the Gaussian.

The higher-order wavelets discussed in the text: the black bars (representing Dirac delta functions) indicate the second difference fluctuation, the solid blue line the quadratic Haar fluctuation, and the red line the “Mexican hat” wavelet, i.e. the second derivative of the Gaussian.

The difference between different fluctuations is the integral on the far
right of Eq. (B14). As long as it converges, the difference between using two different
types of fluctuations is therefore the ratio

In summary, when the wavelet falls off quickly enough at high and low frequencies, the fluctuation variance converges to the expected scaling
form. Conversely, whenever the inequality

A comparison of various wavelets along with their frequency (Fourier) representation and low- and high-frequency behaviours. On the right, the range of

The simple wavelets and fluctuations discussed in the text in the frequency domain. The power spectrum of the wavelet filter

The power-spectrum filters for the higher-order wavelets/fluctuations discussed in the text, along with reference lines indicating the asymptotic power-law behaviours. Note that the Mexican hat (second derivative of the Gaussian) decays exponentially at high
frequencies, equivalent to an exponent

The theoretical calibration constant

To illustrate various issues, we made a multifractal simulation with

What is going on? The first thing to be clear about is that statistical stationarity is not a property of a series or even of a finite number of series, but rather of the stochastic process generating the series.
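A toy example can make this distinction vivid (hypothetical; it uses a stationary AR(1) process with an assumed coefficient of 0.99, not any process from the text): individual realizations of a strictly stationary process can wander so far that, taken alone, they look "nonstationary", even though ensemble statistics are time-independent.

```python
import numpy as np

rng = np.random.default_rng(1)

def ar1(n, phi=0.99, rng=rng):
    """Stationary AR(1) process (|phi| < 1 guarantees stationarity),
    initialized from its stationary distribution."""
    x = np.empty(n)
    x[0] = rng.standard_normal() / np.sqrt(1 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

# Sample means of single realizations scatter widely (long excursions
# that a quasi-Gaussian eye would read as trends)...
single_means = [ar1(1000).mean() for _ in range(20)]
# ...yet the ensemble mean at each time stays near zero, as stationarity
# of the generating process requires.
ensemble = np.array([ar1(1000) for _ in range(200)])
ensemble_mean = ensemble.mean(axis=0)
```

The "nonstationarity" is in the single realization, not in the process; only a hypothesis about the generating process can be tested.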

However, once one assumes that the process comes from a specific theoretical
framework – such as random walks – then the situation is quite different
because this more specific hypothesis can be tested. Let us take a closer look. A theoretical Brownian motion process

We mentioned that the DFA technique was valid for

To understand the DFA, take the running sum

A simulation of a (multifractal) process with

The running sum

The

Comparison of the bias in estimates of the second-order structure function

Finally, the usual DFA approach defines the basic exponent

In applications of the DFA method, much is said about the ability of the
method to remove nonstationarities. Indeed, it is easy to see that an

We conclude that the only difference between analysing a data series with the DFA or with wavelet-based fluctuation definitions is the extra and needless complexity of the DFA – the regression part – that makes its interpretation and mathematical basis unnecessarily obscure. Indeed, Fig. B9 numerically compares spectra, (Haar) wavelets, and DFA exponent estimates, showing that Haar wavelets are at least as accurate as DFA but have the added advantage of simplicity of implementation and simplicity of interpretation.
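For concreteness, here is a minimal DFA-1 sketch (illustrative, not a reference implementation; function and variable names are assumptions). It makes the "regression part" explicit: the series is integrated into a profile, the profile is split into windows, a polynomial is regressed out of each window, and the RMS residual is plotted against window size.

```python
import numpy as np

def dfa(series, window_sizes, order=1):
    """Minimal DFA sketch: integrate the series, split the profile into
    windows, remove a polynomial fit of the given order in each window,
    and return the RMS residual at each window size."""
    profile = np.cumsum(series - np.mean(series))   # running sum ("profile")
    F = []
    for s in window_sizes:
        n_win = len(profile) // s
        t = np.arange(s)
        msq = []
        for k in range(n_win):
            seg = profile[k * s:(k + 1) * s]
            coeffs = np.polyfit(t, seg, order)      # the extra regression step
            msq.append(np.mean((seg - np.polyval(coeffs, t)) ** 2))
        F.append(np.sqrt(np.mean(msq)))
    return np.array(F)

rng = np.random.default_rng(2)
white = rng.standard_normal(8192)
sizes = np.array([8, 16, 32, 64, 128, 256])
F = dfa(white, sizes)
# For white noise the DFA exponent alpha is near 1/2.
alpha = np.polyfit(np.log(sizes), np.log(F), 1)[0]
```

Comparing this with the Haar sketch above the point is clear: the running sum and windowing are common to both, and only the per-window regression is extra.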

If the underlying process is multifractal, one naturally obtains huge fluctuations (in space, huge structures, “singularities”), but these are totally outside the realm of quasi-Gaussian processes, so that when they are inappropriately interpreted in a quasi-Gaussian framework, they will often be mistakenly treated as nonstationarities (in space, mistakenly as inhomogeneities).
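A minimal discrete multiplicative cascade (an illustrative toy with lognormal weights and an assumed spread parameter, not the multifractal simulations used in the figures) shows how such "singularities" arise: repeated multiplication concentrates more and more of the field into fewer and fewer cells.

```python
import numpy as np

def multiplicative_cascade(levels, rng=None):
    """Discrete multiplicative cascade on the unit interval: repeatedly
    halve each cell, multiplying the two halves by independent random
    weights with unit mean (here lognormal, sigma = 0.4 per level)."""
    rng = rng or np.random.default_rng(3)
    sigma = 0.4
    field = np.ones(1)
    for _ in range(levels):
        # mean = -sigma**2/2 makes each lognormal weight have unit mean
        weights = rng.lognormal(mean=-sigma**2 / 2, sigma=sigma,
                                size=2 * len(field))
        field = np.repeat(field, 2) * weights
    return field

eps = multiplicative_cascade(12)   # 2**12 = 4096 cells
# Intermittency: a handful of "singular" spikes carries a large share
# of the total - here the top ~1 % of cells.
spike_share = np.sort(eps)[-41:].sum() / eps.sum()
```

Fed to a quasi-Gaussian analysis, the few dominant spikes would indeed be flagged as "nonstationarities", although here they are generated by a perfectly stationary cascade process.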

Much software is available from the site:

This review contains no original data analyses.

The author is a member of the editorial board of

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the special issue “Interdisciplinary perspectives on climate sciences – highlighting past and current scientific achievements”. It is not associated with a conference.

The author was helped by numerous colleagues and students over the last decades, especially Daniel Schertzer, with whom much of the multifractal and GSI material was developed. Adrian Tuck was involved in multidecadal collaborations on the subject of aircraft turbulence and scaling analyses thereof.

This paper was edited by Tommaso Alberti and reviewed by Adrian Tuck and Christian Franzke.