Idealized models of the joint probability distribution of wind speeds

The joint probability distribution of wind speeds at two separate locations in space or points in time completely characterizes the statistical dependence of these two quantities, providing more information than linear measures such as correlation. In this study, we consider two models of the joint distribution of wind speeds obtained from idealized models of the dependence structure of the horizontal wind velocity components. The bivariate Rice distribution follows from assuming that the wind components have Gaussian and isotropic fluctuations. The bivariate Weibull distribution arises from power law transformations of wind speeds corresponding to vector components with Gaussian, isotropic, mean-zero variability. Maximum likelihood estimates of these distributions are compared using wind speed data from the mid-troposphere, from different altitudes at the Cabauw tower in the Netherlands, and from scatterometer observations over the sea surface. While the bivariate Rice distribution is more flexible and can represent a broader class of dependence structures, the bivariate Weibull distribution is mathematically simpler and may be more convenient in many applications. The complexity of the mathematical expressions obtained for the joint distributions suggests that the development of explicit functional forms for multivariate speed distributions from distributions of the components will not be practical for more complicated dependence structure or more than two speed variables.


Introduction
A fundamental issue in the characterization of atmospheric variability is that of dependence: how the state of one atmospheric variable is related to that of another at a different position in space, or point in time. The simplest measure of statistical dependence, the correlation coefficient, is a natural measure for Gaussian-distributed quantities but does not fully characterize dependence for non-Gaussian variables. The most general representation of dependence between two or more quantities is their joint probability distribution. The joint probability distribution for a multivariate Gaussian is well known, and expressed in terms of the mean and covariance matrix (e.g. Wilks, 2005;von Storch and Zwiers, 1999). No such general expressions for non-Gaussian multivariate distributions exist. Copula theory (e.g. Schlözel and Friederichs, 2008) allows joint distributions to be constructed from specified marginal distributions. However, which copula model to use for a given analysis is not generally known a priori and is usually determined empirically through a statistical model selection exercise.
The present study considers the bivariate joint probability distribution of wind speeds. As these are quantities which are by definition bounded below by zero, the joint distribution and the marginal distributions are non-Gaussian. While the correlation structure (equivalently, the power spectrum) of wind speeds in time (e.g. Brown et al., 1984;Schlax et al., 2001;Gille, 2005;Monahan, 2012b) and in space (e.g. Carlin and Haslett, 1982;Nastrom and Gage, 1985;Wylie et al., 1985;Brown and Swail, 1988;Xu et al., 2011) has been considered, relatively little attention has been paid to developing expressions for the joint distribution. Previous studies have used copula methods to model horizontal spatial dependence of wind speeds for wind power applications (Grothe and Schnieders, 2011;Louie, 2012;Veeramachaneni et al., 2015) and dependence of daily wind speed maxima (Schlözel and Friederichs, 2008). While these earlier analyses have focused on probabilistic modelling of simultaneous wind speed values at different spatial locations in the horizontal, dependence structures in the vertical (e.g. for vertical interpolation of wind speeds) or in time are also of interest. For example, an analysis in which the need for an explicit parametric form for the joint distribution of wind speeds at different altitudes has arisen is the hidden Markov model (HMM) analysis considered in Monahan et al. (2015). In an HMM analysis of continuous variables, it is necessary to specify the parametric form of the joint distribution within each hidden state. In Monahan et al. (2015), the joint distribution of wind speeds at 10 and 200 m and the potential temperature difference between these altitudes was modelled as a multivariate Gaussian distribution, despite the fact that for at least the speeds this distribution cannot be strictly correct. This pragmatic modelling approximation was made because of the absence of a more appropriate parametric distribution for the quantities being considered. The alternative approach of using the wind components at the two levels (for which the multivariate Gaussian model may be a better approximation) instead of the speeds directly has the downside of increasing the dimensionality of the state vector from three to five, dramatically increasing the number of parameters to be estimated (with the covariance matrices in particular increasing from 9 to 25 elements) and reducing the statistical robustness of the results.
A number of previous studies have constructed univariate speed distributions starting from models for the joint distribution of the horizontal components (e.g. Cakmur et al., 2004;Monahan, 2007;Drobinski et al., 2015). One useful benefit of this approach is that it allows the statistics of the speed and the components to be related to each other. The specific goal of the present study is to extend this approach to bivariate distributions, constructing models of the joint probability distribution of wind speeds that are directly connected to the joint distributions of the horizontal components of the wind. As the following results will demonstrate, generalizing this approach to the bivariate speed distribution results in rather complicated mathematical expressions. Expressions for multivariate distributions of more than two speeds will be even more complicated, and may not be analytically tractable. Through the analysis of the bivariate speed distribution, we will probe how far the development of closed-form, analytic expressions for parametric speed distributions based on distributions of components can practically be extended.
Both Weibull and Rice distributions have been used to model the univariate wind speed distribution (cf. Carta et al., 2009;Monahan, 2014;Drobinski et al., 2015), and models of multivariate distributions with Weibull or Ricean marginals have been developed (e.g. Crowder, 1989;Lu and Bhattacharyya, 1990;Kotz et al., 2000;Sagias and Karagiannidis, 2005;Yacoub et al., 2005;Mendes and Yacoub, 2007;Villanueva et al., 2013). Much of this work has been done in the context of wireless communications (Sagias and Karagiannidis, 2005;Yacoub et al., 2005;Mendes and Yacoub, 2007): the present study builds upon the results of these earlier analyses.
The two probability density functions (pdfs) we will consider, the bivariate Rice and Weibull distributions, both start with simple assumptions regarding the distributions of the wind components. The bivariate Rice distribution follows directly from the assumption of Gaussian components with isotropic variance, but nonzero mean. In contrast, the bivariate Weibull distribution is obtained from nonlinear transformations of the magnitudes of Gaussian, isotropic, meanzero components. While the univariate Weibull distribution has been found to generally be a better fit to observed wind speeds than the univariate Rice distribution (particularly over the oceans, e.g. Monahan, 2006Monahan, , 2007, the direct connection of the Rice distribution to the distribution of the components (which the Weibull distribution does not have) is useful from a modelling and theoretical perspective (e.g. Cakmur et al., 2004;Monahan, 2012a;Culver and Monahan, 2013;Sun and Monahan, 2013;Drobinski et al., 2015). The sixparameter bivariate Rice distribution that we will consider is more flexible than the five-parameter bivariate Weibull distribution, and able to model a broader range of dependence structures. Furthermore, it is directly connected to the univariate distributions and dependence structure of the wind components. However, the bivariate Weibull distribution is mathematically much simpler than the bivariate Rice distribution and easier to use in practice. Other flexible bivariate distributions for non-negative random variables exist, such as the α-µ distribution discussed in Yacoub (2007). Because the Weibull and Rice distributions are common models for the univariate wind speed distributions, this study will focus specifically on their bivariate generalizations.
The bivariate Rice and Weibull distributions are developed in Sect. 2, starting from discussion of the bivariate Rayleigh distribution (which is a limiting case of both of the other models). In this section, we repeat some of the formulae obtained by Sagias and Karagiannidis (2005), Yacoub et al. (2005), and Mendes and Yacoub (2007) for completeness and because of notational differences between this study and the earlier ones. In Sect. 3, the ability of these distributions to model wind speed data from the middle troposphere, and from the near-surface flow over land and the ocean, is considered. Examples of dependence structures in both space (horizontally and vertically) and in time are considered. A discussion and conclusions are presented in Sect. 4.

Empirical models of the bivariate wind speed distribution
As a starting point for developing models of the bivariate wind speed distribution, we consider the joint distribution of the horizontal wind vector components u i = (u i , v i ), i = 1, 2 (where the subscripts i denote wind vectors at two different locations, two different points in time, or both). In particular, we assume that 1. the two orthogonal wind components are marginally Gaussian with isotropic and uncorrelated fluctuations: 2. and the cross-correlation matrix of the two vectors is where we have expressed the correlations in both Cartesian and polar coordinates: (µ 1 , µ 2 ) = (ρ cos ψ, ρ sin ψ), with 0 ≤ ρ = µ 2 1 + µ 2 2 1/2 ≤ 1. This assumed correlation structure implies that the correlation matrix becomes diagonal when the vector u 2 is rotated through the angle −ψ: where The joint distribution of the horizontal components resulting from these assumptions is (Mendes and Yacoub, 2007). Note that only considering the horizontal components of the wind vector implicitly restricts the resulting distributions to timescales sufficiently long that the vertical component of the wind contributes negligibly to the speed.

Bivariate Rayleigh distribution
The joint distributions of the speeds w i = u 2 i + v 2 i obtained from the pdf Eq. (5) when both vector wind components are mean-zero is the bivariate Rayleigh distribution (e.g. Battjes, 1969). Transforming variables to wind speed w i and direction θ i = tan −1 (v i /u i ), the joint distribution (Eq. 5) with Integrating over the wind directions to obtain the marginal distribution for the wind speeds, we obtain where we have used the fact that 2π 0 e α cos θ cos kθ dθ = 2π I k (α), where I k (z) is the modified Bessel function of order k. Note that the correlation angle ψ drops out after integration over θ 1 and θ 2 . As a result, for the bivariate Rayleigh distribution, p(w 1 , w 2 ) depends only on the three parameters (σ 1 , σ 2 , ρ). For ρ = 0, p(w 1 , w 2 ) factors as the product of the marginal distributions of w 1 and w 2 : and the wind speeds are independent. As ρ → 1, we can use the asymptotic result to show that where δ(·) is the Dirac delta function. In this limit, w 1 and w 2 are perfectly correlated and Rayleigh distributed.
Because 2 F 1 (−1/2, −1/2, 1; ρ 2 ) is an increasing function of ρ 2 with 2 F 1 (−1/2, −1/2, 1; 0) = 1, corr(w 1 , w 2 ) must be non-negative for the bivariate Rayleigh distribution. Plots of p(w 1 , w 2 ) for the three values ρ = 0, 0.5, and 0.85 are presented in Fig. 1, along with the marginal distributions of w 1 and w 2 (which are the same for all three panels). The marginal distributions are positively skewed and the contours of the joint distributions are more tightly concentrated below and to the left of their peaks than elsewhere. As expected, the distributions become more tightly concentrated around the 1 : 1 line as the dependence parameter ρ increases.

Bivariate Rice distribution
The assumptions leading to the bivariate Rayleigh distribution are too restrictive to model observed wind speeds in most circumstances. A more general distribution results from assuming that the wind components are Gaussian, isotropic, and uncorrelated, but with nonzero mean (Eq. 5).
Changing variables to wind speed w i and direction θ i , the joint distribution becomes where and where we have defined the magnitude and direction of the mean vector wind: The marginal distribution for the wind speeds is obtained by integrating the joint distribution over θ 1 and θ 2 . To evalu-ate this integral, we make use of the result 1 (2π) 2 2π 0 2π 0 exp [α 1 cos θ 1 + α 2 cos θ 2 + β 1 sin θ 1 + β 2 sin θ 2 +γ cos(θ 1 − θ 2 + ψ) dθ 1 dθ 2 where and it is important that tan −1 (b/a) be evaluated as the angle between the vector (a, b) and the vector (1, 0) (that is, as the four-quadrant inverse tangent). Equation (22) follows from the Fourier series along with repeated use of trigonometric identities and the integral Eq. (8). Finally, we obtain the expression for the bivariate Rice distribution (Mendes and Yacoub, 2007) Expressed in terms of the magnitude and direction of the mean wind vectors, where ν = When numerically evaluating the bivariate Rice distribution, it is convenient to transform the infinite series in Eqs. (25) and (26) into an integral which is then approximated numerically (Appendix A). Note that p(w 1 , w 2 ) depends on the relative orientation of the mean wind vectors and the correlation angle ψ only through the combination φ 1 −φ 2 +ψ. Because of this symmetry, the quantities φ 1 − φ 2 and ψ cannot be determined individually from wind speed data alone. As a result, p(w 1 , w 2 ) is determined by six parameters: (U 1 , σ 1 , U 2 , σ 2 , ρ, φ 1 −φ 2 + ψ). For particular applications, it may be appropriate to fix either φ 1 − φ 2 or ψ, allowing the other angle to be estimated from data. For example, when considering the temporal dependence structure of winds assumed to have stationary statistics, it can be assumed that φ 1 − φ 2 = 0. The bivariate Rice distribution also has the discrete symmetry that it is invariant under the transformation φ 1 − φ 2 + ψ → −(φ 1 − φ 2 + ψ). Note that these symmetries are in addition to the invariance of the distribution of components (Eq. 5) to the rotation of the coordinate system θ i → θ i + θ , i = 1, 2, under which the angles φ 1 −φ 2 and ψ are individually invariant.
Integrating over w 2 to obtain the marginal distribution for w 1 we obtain the univariate Rice distribution with mean and variance A. H. Monahan: Joint wind speed distributions (with equivalent expressions for w 2 obtained by integrating over w 1 ), where 1 F 1 (α; β; z) is the confluent hypergeometric function (Gradshteyn and Ryzhik, 2000). Equation (29) follows from Eq. (25) using the integral (Gradshteyn and Ryzhik, 2000) ∞ Neumann's theorem (Watson, 1922) ∞ k=0 k I k (x)I k (y) cos kφ = I 0 x 2 + y 2 + 2xy cos φ , (33) and the fact that Note that each wind speed marginal distribution depends only on the magnitude of the mean wind vector, while the joint distribution also depends on the angle between the two mean wind vectors. Furthermore, as ρ → 0 only the first term contributes to the infinite series in Eq. (25), and the joint distribution reduces to the product of the marginals. The joint moments of the bivariate Rice distribution can be evaluated using the Taylor series expansion: which allows the double integral defining the moments to factorize as the products of individual integrals over w 1 and w 2 that can be evaluated using The resulting expression for the joint moments is (37) (Mendes and Yacoub, 2007). When the mean vector winds are equal to zero, only the k = 0 terms contribute to this expression and Eq. (12) is recovered. Defining the variables V i = U i / √ 2σ i , the correlation coefficient between w 1 and w 2 is given by The correlation coefficient corr(w 1 , w 2 ) depends only on the four quantities (U 1 /σ 1 , U 2 /σ 2 , φ 1 −φ 2 +ψ, ρ). Monahan (2012b) considered the correlation structure of wind speeds using the approximation corr (w 1 , w 2 ) corr w 2 1 , w 2 2 . The assumed covariance structure of the wind components then results in the approximate expression: Plots of the correlation coefficient (Eq. 38) and the approximation (Eq. 40) as functions of ρ are shown in Fig. 2 for ,2,0) and (1,5,0). Agreement between the exact and approximate values of the correlation coefficient is reasonably good in both cases, with the largest discrepancies generally occurring for larger absolute values of ρ. Note that negative wind speed correlation values are permitted by the bivariate Rice distribution. Examples of the joint Rice pdf (and the associated marginals) are presented in Fig. 3 for (U 1 , σ 1 , U 2 , σ 2 ) = (6, 4, 2, 5) and (ρ, φ 1 − φ 2 + ψ) = (0.85, π ), (0, 0), and (0.85, 0). By construction, the marginal distributions are the same in each panel. The distributions of both w 1 and w 2 are positively skewed, and take respective maxima at values of about σ 1 and just less than 2σ 2 . For the different values of the dependence parameter ρ, the joint distributions have considerably different shapes. The joint distribution for (ρ, φ 1 −φ 2 +ψ) = (0.8, π ) describes (weakly) negatively correlated variables with a nonlinear dependence structure evident in ridges of enhanced probability extending to the left and right upward from the probability maximum. For (ρ, φ 1 − φ 2 + ψ) = (0, 0), probability contours are concentrated towards smaller values of w 1 and w 2 (as is the case for the marginal distributions). Finally, w 1 and w 2 are evidently positively correlated for (ρ, φ 1 − φ 2 + ψ) = (0.8, 0), with a slight curvature in the shape of the distribution indicating the existence of some nonlinear dependence.
Although the bivariate Rice distribution differs from the bivariate Rayleigh distribution only by allowing for nonzero mean wind vector components, the resulting expressions for the joint pdf (Eq. 25) and the moments (Eq. 37) are much more complicated for the bivariate Rice than the bivariate Rayleigh. Furthermore, while the univariate Rice distribution is a convenient model for the pdf of wind speed, observed winds show clear deviation from Ricean behaviour (e.g. Monahan, 2006Monahan, , 2007. We will therefore consider another model of the bivariate wind speed distribution with Weibull marginals, which turns out to result in simpler math-ematical expressions (at the cost of a more artificial derivation than that of the bivariate Rice distribution).

Bivariate Weibull distribution
As in Sagias and Karagiannidis (2005) and Yacoub et al. (2005), we obtain the bivariate Weibull distribution from the bivariate Rayleigh distribution through separate power law transformations of w 1 and w 2 . The pdf of a Weibull distributed variable is with moments where a and b are denoted the scale and shape parameters respectively. The Rayleigh distribution is a special case of the Weibull distribution with a = √ 2σ and b = 2. Weibull distributed variables remain Weibull under a power law transformation, with suitably modified scale and shape parameters: if x is Weibull with scale parameter a and shape parameter b, x k will be Weibull with scale parameter a k and shape parameter b/k. A joint wind speed distribution with Weibull marginal distributions can therefore be constructed from a joint Rayleigh distribution using the appropriate power law and scale transformations.
If we start with (x 1 , x 2 ) as bivariate Rayleigh distributed with σ i = 1/ √ 2, i = 1, 2, we obtain marginal Weibull distributions with specified scale and shape parameters through the transformations The joint pdfs transform as  Fig. 1 for the bivariate Rice distribution with (U 1 , σ 1 , U 2 , σ 2 ) = (6, 4, 2, 5) and (ρ, φ 1 − φ 2 + ψ) = (0.8, π), (0, 0), and (0.8, 0). and so we obtain the bivariate Weibull distribution An analogous approach to constructing bivariate Weibull distributions through nonlinear transformations of a bivariate Gaussian was followed in Villanueva et al. (2013); the resulting expressions are considerably more complicated than those considered here. Evidently, p(w 1 , w 2 ) factorizes into the product of the marginal distributions as ρ → 0. As ρ → 1, we can use Eq. (10) to make the approximation where the last equality follows from the fact that As expected, w 1 and w 2 are completely dependent in the limit that ρ → 1 (although they are not perfectly correlated if b 1 = b 2 as the functional relationship between the two variables will be nonlinear).
The relatively simple form of the bivariate Weibull distribution permits a relatively simple expression for the conditional distribution The factor in the second set of braces characterizes how conditioning on the value of w 1 changes the distribution of w 2 from its marginal distribution (and corresponds to a copula density; e.g. Schlözel and Friederichs, 2008). Note that for w 1 sufficiently large we can write w 2 a 2 3b 2 /4−1 w 1 For ρ not too close to zero, the conditional distribution for large w 1 is concentrated around the nonlinear regression curve Computing the moments, we obtain a m 1 a n ( Sagias and Karagiannidis, 2005;Yacoub et al., 2005). The correlation coefficient is then given by corr(w 1 , w 2 ) = (53) For b 1 , b 2 > 1 (the relevant range of shape parameters for wind speeds), 2 F 1 (−1/b 1 , −1/b 2 , 1; ρ 2 ) is an increasing function of ρ 2 with 2 F 1 (−1/b 1 , −1/b 2 , 1; 0) = 1. Therefore, the bivariate Weibull distribution is unable to represent situations in which the wind speeds are negatively correlated. Examples of the bivariate Weibull distribution for (a 1 , b 1 , a 2 , b 2 ) = (4, 1.5, 5, 7) and values of ρ = 0, 0.7, and 0.95 are shown in Fig. 4. Again, the marginal distributions in all three cases are the same by construction. The distribution of w 1 is positively skewed with a maximum near a value of w 1 = 0.5a 1 , while that of w 2 is negatively skewed with a maximum near w 2 = a 2 . For ρ = 0 the joint distribution is simply the product of the marginals. As ρ increases, w 1 and w 2 become positively correlated -although the correlation is weak even for ρ = 0.7 for this set of parameter values. At the value of ρ = 0.95, while the correlation of the two variables is only moderate, a strong nonlinear dependence is evident in the concentration of the distribution around the curve given by Eq. (48).

Fits of bivariate Rice and Weibull distributions to observed wind speeds
Many wind datasets from different locations are available, and it is impracticable to consider joint distributions of wind speeds from even a small fraction of these. In this section, we will consider examples of the joint distribution of wind speeds using data from a representative range of settings. Bivariate distributions of wind speeds at both different locations in space and different points in time will be considered.
The sampling of the wind speeds considered will be temporal (that is, individual samples will correspond to a specific time for spatial joint pdfs and a specific pair of times for temporal joint pdfs). Best-fit values of the parameters of the bivariate Weibull and Rice distributions we present were obtained numerically as maximum likelihood estimates (Table 1), with φ 1 − φ 2 set to zero. Goodness-of-fit of the distributions was assessed using a statistical test described in Appendix B. In order to distinguish how well the parametric joint distributions model the marginal distributions from how well they represent dependence between variables, the goodness-of-fit analyses were repeated for each pair of time series with the values of one of the pair shuffled in time. This shuffling destroys the dependence structure without affecting the distributions of the marginals. Use of a bivariate analysis rather than separate univariate goodness-of-fit tests for the marginals allows direct comparisons of p-values, as exactly the same test is used for the original and shuffled data.

Wind speeds at 500 hPa
We first consider the joint distribution of 00Z December, January, and February 500 hPa wind speeds from 1979 to 2014. These data were taken from the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim Reanalysis (Dee et al., 2011), subsampled to every second day to minimize the effect of serial dependence on the goodness-of-fit test (Monahan, 2012b). The wind speed data were computed as the magnitude of zonal and meridional components.
The joint distributions of wind speeds at four pairs of latitudes along 216 • W are presented in Fig. 5. A moderately strong negative correlation (r = −0.55) is evident between wind speeds at (39 • S, 216 • W) and (54.75 • S, 216 • W) (Fig. 5a). Because it is unable to model a negative correlation between wind speeds, the best-fit bivariate Weibull pdf differs substantially from the distribution of the observed winds (Fig. 5e). The goodness-of-fit test correspondingly rejects the null hypothesis that the observations are drawn from this distribution (p = 0). In contrast, the bivariate Rice distribution provides a reasonable model of the joint distribution of wind speeds at these two locations (Fig. 5i) and the goodness-of-fit test provides no evidence that these data are statistically incompatible with this distribution (p = 0.31). For wind speeds at these two locations, the bivariate Rice distribution is evidently a better model than the bivariate Weibull distribution.
In contrast, for wind speeds at (12 • N, 216 • W) and (15.75 • N, 216 • W), the null hypotheses of being drawn from either the bivariate Rice or Weibull distributions are rejected at the 95 % significance level for both distributions. These wind speeds are weakly correlated (r = 0.37) but show evidence of nonlinear dependence. The joint pdf of the observed speeds is characterized by two ridges of high probability extending to the right, upward and downward away from the  Fig. 1 for the bivariate Weibull distribution with (a 1 , b 1 , a 2 , b 2 ) = (4, 1.5, 5, 7) and ρ = 0, 0.7, and 0.95 and the speeds w 1 , w 2 scaled respectively by the Weibull scale parameters a 1 , a 2 .
region of maximum probability (Fig. 5b, f, j). These ridges are not captured by either of the best-fit bivariate Weibull or Rice distributions, although a hint of this structure is evident in the Rice distribution. While the fit of one or both of these parametric distributions to these observed wind speed data may be sufficiently good for practical applications, nevertheless we can confidently exclude the possibility that these data are drawn from either distribution. The wind speeds at (3 • S, 216 • W) and (0.75 • N, 216 • W) are correlated (r = 0.68) and their scatter clusters around a straight line extending away from the origin (Fig. 5c). Both the best-fit bivariate Weibull and Rice distributions appear to the eye to be good fits to the data (Fig. 5g, k), and in neither case can the null hypothesis be rejected that the data are drawn from these distributions. Only small differences exist between the two best-fit distributions for these data.
Finally, the wind speeds at (15 • S, 216 • W) and (45 • S, 216 • W) are uncorrelated (r = 0.06) and fit sufficiently well by both the bivariate Weibull and Rice distributions (Fig. 5d, h, l) that in neither case is the null hypothesis rejected. As in the previous example, the bivariate best-fit Rice and Weibull distributions are essentially indistinguishable for these data.
Considering the spatial correlation structure of these 500 hPa winds, we find cases in which one distribution (the Rice) is evidently a better fit to the data than the other (the Weibull), in which neither distribution provides a statistically significant fit to the data, and in which both distributions fit the data equally well. p-values from the analyses with temporally shuffled data, quoted in parentheses in Fig. 5, show that for none of the four pairs of wind speeds considered can the null hypotheses of univariate Weibull or univariate Rice distributions for the pair of marginals be rejected. This result indicates that for these data the rejection of the full bivariate distributions occurs because of a failure to adequately represent the dependence structure between the two random variables.
The temporal dependence structure of the wind speed at (39 • S, 216 • W) is illustrated in Fig. 6. As with the previous calculations, the pairs of lagged wind speeds (w(t n ), w(t n+s )) were subsampled to 2-day resolution to minimize the effect of serial dependence on the results of the goodness-of-fit tests. As the lag increases, the value of the dependence parameter ρ decreases as expected for both the Weibull and Rice distributions (Fig. 6h, l). For most lags, the null hypothesis of the Weibull distribution as a model for the joint distribution is rejected (p < 0.05; Fig. 6d). The rejection of the null hypothesis of a bivariate Weibull distribution is most robust for lags shorter than 3 days. In contrast, the null hypothesis of a bivariate Rice distribution is rejected less often than it is not -although p < 0.05 for more than one-third of the lags (Fig. 6d). Inspection of the example distributions shown demonstrates that the bivariate Weibull distributions are broader around their principal axis for small to intermediate wind speeds in a way that is not consistent with observations ( Fig. 6e-g). Such structures are not seen in the best-fit Rice distributions (Fig. 6i-k). Note that for lags of 0.5 and 1.5 days the observed distributions suggest a flaring out of the joint distribution for large wind speeds that is accounted for by neither the bivariate Weibull nor Rice distributions. There is good evidence that these data were not drawn from a bivariate Weibull distribution, and the evidence that they are drawn from a bivariate Rice distribution is not strong. p-values of fits obtained after shuffling of the lagged wind speed time series (in brackets in Fig. 6eg, i-k and dashed lines in Fig. 6d) are generally larger than those of the unshuffled data: with a few exceptions (e.g. a lag of 1.75 days), the rejection of the null hypothesis of the full dataset being drawn from either of the parametric distributions considered is not associated with a failure to fit the marginals.

Wind speeds over land at 10 m and 200 m
We next consider wind speeds at altitudes of 10 and 200 m measured from a 213 m tower in Cabauw, Netherlands (51.971 • N, 4.927 • E), maintained by the Cabauw Experimental Site for Atmospheric Research (CESAR;van Ulden and Wieringa, 1996) with 10 min resolution from 1 January 2001 through 31 December 2012. We will focus on data from Table 1. Maximum likelihood parameter estimates for the wind speed data shown in Figs. 5, 6, 7, and 8.
Maximum likelihood estimates of the bivariate Weibull and Rice distributions for the full nighttime data show evident disagreement between the scatter of the data and the best-fit distributions (Fig. 7e, i). Goodness-of-fit tests for both distributions were rejected (with p = 0 in both cases). When conditioned on being in either regime R 1 or R 2 , both the bivariate Rice and Weibull distributions result in much better representations of the nighttime data (Fig. 7f, g, j, k). In all cases the p-values exceed 0.05, so the fits cannot be rejected at the 95 % significance level. While neither the bivariate Rice nor Weibull distributions is a good probability model for the full joint distribution of wind speeds at these two altitudes, both are reasonable models for the distributions conditioned by regime occupation. Finally, the fits of neither the bivariate Rice nor Weibull distributions to the daytime data are statistically significant at the 95 % significance level. In particular, the joint Rice distribution is too broad (relative to observations) for small values of w 10 and w 200 and neither distribution is broad enough at higher wind speed values. p- (b, f, j) Nighttime data conditioned on being in regime R 2 (very stable boundary layer). (c, g, k) Nighttime data conditioned on being in regime R 1 (weakly stable boundary layer). (d, h, l) All daytime data (08:00-16:00 UTC). Values of the best-fit model parameters are given in Table 1. values of fits with w 200 shuffled in time are all larger than those of fits to the original data; only for the Rice fit to the full nighttime data is the shuffled p-value below 0.05. As seen in the previous data considered, there is no systematic evidence that the failure of the joint Weibull or Rice distributions to model the joint distributions of the Cabauw data results from a failure to model the marginals.

Sea surface wind speeds
Twice-daily December, January, and February level 3.0 gridded SeaWinds scatterometer equivalent neutral 10 m wind speeds between 60 • S and 60 • N at a resolution of 0.25 • × 0.25 • from the National Aeronautics and Space Administration (NASA) Quick Scatterometer (QuickSCAT; Perry, 2001) are available from December 1999 through February 2008. Data flagged as having possibly been corrupted by rain were excluded from the following analysis. Although the data are nominally twice-daily, it is often the case that data for either the ascending or descending pass of the satellite are missing. The maximum likelihood parameter estimates of bivariate wind speed distributions and goodnessof-fit tests were carried out using every third non-missing data point in order to minimize the effect of serial depen-dence. For the goodness-of-fit tests M = 10 quantiles were used because of the relatively small sample sizes. Because of the near-polar orbit of the satellite results, observations of wind speed at different locations are not simultaneous. The joint distributions we consider therefore combine dependence in both space and time.
Joint distributions of wind speed at (6.5 • S, 135 • W) with speeds at points along a zonal transect to 165 • W (in increments of 1 • ) were estimated. As the distance between the two positions increases, there is a decreasing trend in the best-fit bivariate Weibull dependence parameter ρ (Fig. 8h) with some small fluctuations likely due to sampling variability. The same is not true for the best-fit bivariate Rice dependence parameter ρ, which fluctuates wildly (Fig. 8l). Large fluctuations also seen in the best-fit value of cos ψ are clearly correlated with those of ρ: where one parameter is anomalously large (relative to the spatial trend), the other is anomalously small. Of the 30 pairs of points considered, the null hypothesis of a bivariate Rice distribution is rejected (at the 95 % significance level) at only one (Fig. 8d). In contrast, p < 0.05 for the bivariate Weibull distribution at several longitudes, particularly close to the base point at (6.5 • S, 135 • W). Inspection of the best-fit bivariate Weibull pdfs ( Fig. 8e-g) shows that they are broader for smaller wind  Table 1. speeds than for larger values, a feature not evident in the bestfit bivariate Rice distributions  or the scatter of data (Fig. 8a-c). From these results, we find only equivocal evidence that the pairs of wind speed data along this zonal transect are drawn from a bivariate Weibull distribution and no strong evidence to reject the null hypothesis that they are drawn from a bivariate Rice distribution. Again, the p-values of fits to temporally shuffled wind speeds are generally similar to or larger than those of fits to the full distribution. The bivariate Rice distribution with the wind speed at 142 • W (Fig. 8k) illustrates a rare example in which the parametric fit to the original data passed the goodness-of-fit test at the 5 % significance level while the fit to the shuffled data did not; it is evident from Fig. 8d that this situation is not common. The surface wind vector components in the tropics are known to be non-Gaussian (e.g. Monahan, 2007), so we have a priori reasons to believe the joint distribution should not be Ricean. The fact that the data do not generally allow for a rejection of the null hypothesis that the winds are bivariate Rice (for either the original or shuffled datasets) is likely a consequence of the relatively small sample size.
The large variations in best-fit estimates of ρ and cos ψ for the bivariate Rice distribution result from the fact that for some parameter values the distribution is only weakly sensitive to simultaneous changes in these parameters: increases in ρ can be offset by decreases in cos ψ with only small changes to the joint distribution. To demonstrate this weak sensitivity, 50 realizations of bivariate Ricean variables with (U 1 , σ 1 , U 2 , σ 2 , ρ, φ 1 − φ 2 + ψ) = (7.3, 1.5, 6.9, 1.5, 0.5, 0) were generated for each of the sample sizes of N = 250, 1500, and 9000. Maximum likelihood estimates of these parameters obtained from these realizations demonstrate that for the smaller samples ρ and cos ψ show strong and correlated sampling variability, with large increases in ρ combined with large decreases in cos ψ (Fig. 9a). As expected, these sampling fluctuations become smaller as the sample size increases (Fig. 9b, c). Despite the large variation of ρ and cos ψ for small to intermediate sample sizes, there is relatively little variation in the structure of the corresponding bivariate Rice distributions ( Fig. 9d-f). An indication of why increases in ρ should counterbalance decreases in cos ψ with only small effects on the joint distribution is given by the approximate expression for the correlation coefficient, Eq. (40). The value of this approximation is unaffected by changes in ρ and ψ that leave the numerator invariant. The compensation between sampling variations in ρ and cos ψ is evident the fact that corr(w 1 , w 2 ) given by Eq. (40) is an excellent approximation to the sample correlation coefficient even for estimates of ρ and cos φ which are far away from the population values ( Fig. 9g-i). Note that there is no evident relationship between sampling fluctuations in corr(w 1 , w 2 ) and those of ρ and ψ: the range of sample correlation values for (ρ, cos ψ) near the population values of (0.5,1) (open circles) is the same as that for values of (ρ, cos ψ) far from these values (stars). In these parameter ranges, the dependence between w 1 and w 2 constrains ρ and ψ not individually, but together -over large ranges of values for sufficiently small sample sizes.

Conclusions
This study has considered two idealized probability models for the joint distribution of wind speeds, both derived from models for the joint distribution of the horizontal wind components. The first, the bivariate Rice distribution, follows from assuming that the wind vector components are bivariate Gaussian with an idealized covariance structure. The second, the bivariate Weibull distribution, arises from nonlinear transformations of variables with a bivariate Rice distribution in the limit that the mean vector winds vanish (the bivariate Rayleigh distribution). While the bivariate Rice distribution has the advantage of being more flexible and naturally related to a simplified model for the joint distribution of the wind components, the bivariate Weibull distribution is mathematically much simpler and easier to work with. Through consideration of a range of joint distributions of observed wind speeds (over land and over the ocean; at the surface and aloft; in space and in time) the bivariate Rice distribution was shown to generally model the observations better than the bivariate Weibull distribution. However, in many circumstances the differences between the two distributions are small and the convenience of the bivariate Weibull distribution relative to the bivariate Rice distribution is a factor which may motivate its use.
The fact that the bivariate Rice distribution is easier to work with, but less flexible, than the bivariate Weibull distribution is evident from inspection of their analytic forms and the relative number of parameters to fit (five vs. six). If the bivariate Weibull distribution was generically appropriate for modelling the bivariate wind speed distribution, there would be no need to consider more complicated models such as the bivariate Rice. This study provides an empirical assessment of the relative practical utility of the two models, trading off the ability to model more general dependence structures (e.g. negatively correlated speeds) against model simplicity. Neither the univariate nor the bivariate Weibull or Rice distributions are expected to represent the true distributions of wind speeds (e.g. Carta et al., 2009). The results of this analysis characterize the practical utility of these models, rather than making a claim to their "truth". It is noteworthy that for the data considered in this study, the failure of either the bivariate Rice or Weibull distributions to adequately fit the joint distribution of wind speeds (at a significance level of 5 %) is not generally associated with a corresponding failure of the parametric distribution to model the marginals. An interesting direction of future study would be consideration of other parametric models for the joint distribution of nonnegative quantities, such as the α-µ distribution (Yacoub, 2007), copula-based models (e.g. Schlözel and Friederichs, 2008), or distributions obtained through nonlinear transformations of multivariate Gaussians (e.g. Brown et al., 1984).
Many of the assumptions that have been made regarding the distribution of the wind components are known not to hold in various settings. For example, the vector wind components are generally not Gaussian, either aloft or at the surface (e.g. Monahan, 2007;Luxford and Woollings, 2012;Perron and Sura, 2013), and fluctuations will not generally be isotropic (especially over land; cf. Mao and Monahan, 2017). Furthermore, when used to model temporal dependence the assumed correlation structure cannot account for the anisotropy in autocorrelation of orthogonal components in either space (e.g. Buell, 1960) or time (e.g. Monahan, 2012b). Relaxing the assumptions regarding isotropy of correlation structure results in expressions for the joint speed distributions involving integrals over angle which are not analytically tractable.
While it is possible to relax the assumption of Gaussian components for univariate speed distribution (e.g. Monahan, 2007;Drobinski et al., 2015), extending this analysis to the bivariate case would involve specifying a non-Gaussian dependence structure for the components. At present, there is no physically based model for such dependence. Without any such physical justification, the only option is an empirical investigation of the ability of different copula models to represent observed joint wind speed distributions. It is unlikely that a copula-based model for dependence of components will admit analytically tractable expressions for joint speed distributions. A copula-based analysis of either the components or the speeds directly is also likely necessary for modelling extreme wind speeds (either large percentiles, peaks over threshold, or block maxima), as the tails of the bivariate Rice and Weibull distributions may not be adequate for this task. Finally, extending the approach used in this study to obtain explicit closed-form results for the bivariate wind speed distribution to a higher-dimensional multivariate setting -of wind speeds alone, or of a mixture of wind speeds and other meteorological quantities -will be analytically intractable for any except the simplest (and likely unrealistic) covariance structures. It may not be practical to extend the program of obtaining closed-form expressions for joint speed distributions from models of the component distributions much further beyond the bivariate Rice and Weibull speed distributions considered in this study.
Ultimately, it would be best for models of the joint distribution of wind speeds to arise from physically based (if still idealized) models, as has been done for the univariate case in Monahan (2006) and Monahan et al. (2011). The development of such models represents an interesting direction of future study.
Data availability. The ERA-Interim 500 hPa zonal and meridional wind components were obtained from the European Centre for Medium-Range Weather Forecasts at http://www.ecmwf.int/en/ research/climate-reanalysis/era-interim (last access: 17 November 2015). The Cabauw tower data were downloaded from the Cabauw Experimental Site for Atmospheric Research (CESAR) at http: //www.cesar-database.nl/ (last access: 13 June 2014). The Level 3.0 QuickSCAT data were downloaded from the NASA Jet Propulsion Laboratory Physical Oceanography Distributed Active Archive Center (http://podaac.jpl.nasa.gov/dataset/QSCAT_LEVEL_3_V2, last access: 18 February 2010).