Statistical analysis of polychaete population density : dynamics of dominant species and scaling properties in relative abundance fluctuations

We consider here the dynamics of two polychaete populations based on a 20 yr temporal benthic survey of two muddy fine sand communities in the Bay of Morlaix, Western English Channel. These populations display high temporal variability, which is analyzed here using scaling approaches. We find that population densities have heavy tailed probability density functions. We analyze the dynamics of relative species abundance in two different communities of polychaetes by estimating in a novel way a “mean square drift” coefficient which characterizes their fluctuations in relative abundance over time. We show the usefulness of using new tools to approach and model such highly variable population dynamics in marine ecosystems.


Introduction
One of the key features of environmental and geophysical field studies is their high variability at many different time and space scales.Because of these external influences and also because of the stochasticity introduced by the reproduction and predator prey processes, population dynamics are also characterized by high variability over time and space (Pimm and Redfearn, 1988;Blarer and Doebeli, 1999;Ferriere and Cazelles, 1999;Inchausti and Halley, 2001;Schmitt et al., 2008).The dynamics of many natural populations involve intermittent rarity, the alternation over variable periods of time of phases of extremely low abundance and short outbreaks (Ferriere and Cazelles, 1999).While many studies in population ecology have focused on average evolutions over time, several recent studies have been concerned by the modeling of their large events or extremes (Gaines and Denny, 1993;Arino and Pimm, 1995;Katz et al., 2005;Pascual and Guichard, 2005;Reynolds and Freckleton, 2005).Large values are ecologically important for many reasons: large numbers of individuals of a species may lead to higher ecological pressure on preys and competitors, but also higher availability for predators.For example, in the framework of marine ecology, phytoplankton blooms show very large densities, and may induce ecological disturbances of the marine coastal ecosystems (Boyer et al., 2009).
The search for universal scaling laws in biodiversity and ecology often involves considering a form of power-law distribution (Katz et al., 2005).As reviewed by Marquet et al. (2007), power laws can emerge in population dynamics (Keitt and Stanley, 1998;Ferriere and Cazelles, 1999) or in patterns of abundance, distribution, and richness (Frontier, 1985;Banavar et al., 1999;Harte et al., 1999Harte et al., , 2001;;Marquet, 2002;Labra et al, 2005).In this paper, we consider two polychaete sets of data, as they are considered good surrogates to identify the main environmental conditions that control the structure and functioning of benthic communities (Dauvin et al., 1994;Olsgard et al., 2003;Mutlu et al., 2010).Polychaetes colonize a large range of soft and hard marine sediment habitats, from intertidal to hadal (parts of the ocean below 6000 m) zones.This group often dominates benthic macrofauna (Grémare et al., 1998) and has been shown to constitute a good surrogate for describing its distribution (Olsgard et al., 2003); they can also be used as markers of different ecological conditions (Gambi and Giangrande, 1986).Numerous reproductive strategies linked to different life spans can be found in polychaetes; these life spans may range from only weeks to several years; larval planktonic duration varies from some hours to several months (Giangrande, 1997).Based on long-term series changes in diversity of two softbottom communities, we study the dynamics of polychaete populations using different statistical techniques.In a first section, we present the databases.Then, we consider the total abundance and show that they have heavy tail distributions, and we consider the relation between total abundance and species diversity.In a second section, we focus on dynamical properties of some relative abundance time series extracted from the biodiversity databases and find a general scaling relation between the magnitude of the "ecological drift" and the relative abundance.The last section is a discussion and conclusion.

Heavy tails in probability distributions of total abundance data
We first consider total abundance data: the time series for polychaete total abundance in PN and RM stations are shown in Fig. 2a and b, respectively, where we represent the sum of all species.These plots illustrate the high variability of these populations, and the burst-like events associated to sudden population outbreaks.The 10 most abundant species in each dataset are given in Tables 1 and 2 and Tables 3 and 4 give the abundance of species belonging to the Spionid family as they are one of the most representative and abundant families of polychaetes.
Let us note also that both sites were polluted by hydrocarbons from the Amoco Cadiz wreck in April 1978.The spill's impact was more pronounced at PN than in RM site (Dauvin, 1998(Dauvin, , 2000)); however, here we do not focus on the influence of this particular event, and consider the species dynamics as a whole.The primary effects of the Amoco Cadiz oil spill in spring 1978 were only important at the PN site where amphipod Ampelisca dominated before the oil spill while at RM tolerant polychaetes such as Chaetozone gibber dominated.At PN, the spill was abrupt and the amphipod populations disappeared immediately with the introduction of hydrocarbons in the sediment (Dauvin, 1998); the polychaetes were not affected by the oil spill except for a small increase in abundance of a Heterocirrus alatus population (Cirratulidae) at Pierre Noire site during the autumn 1978 and the summer increase of Pseudopolydora pulchra at both sites in 1982 (Dauvin, 2000).We considered the total abundance, for all sampling dates and locations considered together, in order to characterize their probability distribution function.For this we focus on the complementary cumulative distribution function F (x) = Pr(X ≥ x) = ∞ x p(x )dx where p(x) is the probability density function (pdf).This function is shown for both PN and RM databases in Fig. 3a-b.In the same plots, we also represent a fit obtained for a normal random variable with the same mean and variance as the times series (µ PN = 58.3,σ PN = 456.3and µ RM = 92.5, σ RM = 345.2).We also represent lognormal fits with the same mean and variance as the logarithm of experimental data (µ PN = 2.09, σ PN = 1.86 and µ RM = 2.05, σ RM = 2.09).The normal fits are very far from the data, whereas the lognormal fit are much closer, especially for the PN case, however a Kolmogorov-Smirnov test of lognormality of the data gave negative results for both datasets.Since the tail of the continuous curve in Fig. 3a is approximately linear, we have also tested the power-law fit (hyperbolic tail for the probability density function of the form p(x) ≈ x −α for large x values) using recently proposed procedures in Clauset et al. (2009).We have used Matlab codes given by the main author in the companion page of his paper1 .The first step of the proposed method is to estimate the lower threshold x min for which a powerlaw fit of the pdf is optimal, and the power-law tail α.We obtain α PN = 2.18 ± 0.1 and x minPN = 99 ± 52 and α RM = 2.33 ± 0.4 and x minRM = 443 ± 280 for, respectively, the PN and RM datasets, where the error bars are obtained using the procedure described in Clauset et al. (2009).The second step of the method is important and called MLE (maximum likelihood estimator); it uses goodnes-of-fits based on the Kolmogorov-Smirnov test in order to determine if the data are consistent with a power-law hypothesis.The output of the method is a p-value, fraction of synthetic distances that are larger than the empirical distance from a pure power law.
It is a number between 0 and 1 indicating the validity of the power-law fit: when p < 0.1 the power-law hypothesis is rejected.We obtained p PN = 0.03 and p RM = 0.007, showing that, even if visually the tail in Fig. 3a is approximately linear, the distribution is not purely a power-law.These results show that total abundance data are heavytailed, with a distribution which is not lognormal nor powerlaw.We do not push further here the search for the adequate distribution and note that this could belong to generalized Gamma distributions.

Relation between total abundance and species richness
A possible definition of biological diversity is "the variety and abundance of species in a defined unit of study" (Magurran, 2004).Abundance and relative abundance are often considered amongst the many facets of diversity as the importance (relative or absolute) of species is implicitly involved in the estimation of species richness.In this framework, a general pattern of biodiversity is the richness-abundance relation, where species diversity (total number of species) is larger when total abundance of the sample is larger (Marquet et al., 2007).There were many different polychaete species identified for each sampling station: a total of 140 species for Pierre Noire site and 115 species for Riviere de Morlaix site.Figure 4 shows the species richness (total number of polychaete species found) versus the total abundance, for each sampling date.The two databases form clouds of data over which we calculated a conditional average using a local Gaussian kernel smoothing (Wand and Jones, 1995).This method is useful to extract a smooth nonlinear conditional average evolution based on a cloud of points.The resulting curves show that there is an increase of the species richness with total abundance; they also provide an illustration of this species richness-abundance relationship.
The PN site has approximately a species richness which is twice the one of RM site, for the same total abundance.We found that the conditional average relationship between species richness and total abundance is increasing with increasing abundance, in agreement with one of the general patterns in biodiversity (Marquet et al., 2007).These results also show that the RM site is substantially less species-rich than the PN site; the former is under estuarine influence and is also a zone with a natural accumulation of mud fine particles enriched in organic matter.These estuarine benthic communities are generally less species-rich than open marine communities such as the PN site (Dauvin, 1997).
3 Dynamical properties: relative abundance and ecological drift

Relative abundance
In the previous section we considered the statistics of total abundance time series.In this section, we are interested in some dynamical properties of these databases, focusing mainly on species information.We consider the quantification of the relative abundance fluctuations in relation with the mean abundance of one species.We consider here relative abundance data, i.e. species abundance fluctuations relative to the total population abundance.For this, we normalized each species abundance by the total abundance N(t) at time t.For a species i with abundance N i (t) at time t, we introduce the ratio r i (t) according to: This transforms the species dynamics with high burst values between 0 and large positive numbers, to a relative abundance dynamics between 0 and 1.Using this approach, we discuss below a dynamical property which is found for all species.

Ecological drift and its quantification
In the field of genetic population dynamics, the concept of genetic drift of an isolated population is now classical.This drift is the consequence of the intrinsic random nature of the reproduction process.It is characterized by a stochastic variability of some gene's frequency inside a finite population, from one generation to the next.Similarly, a theory of dynamical biodiversity has been developed in the last decade, Hubbell's neutral theory of biodiversity and biogeography (Hubbell, 2001;He and Hu, 2005;Hu et al., 2006), which was inspired by Kimura's Neutral Theory of Molecular Evolution in the field of population genetics (Kimura, 1994).In Hubbell's theory, genes are replaced by species and the concept of ecological drift was introduced (Hubbell, 2001), to characterize the temporal evolution of relative abundance of a species inside a finite isolated community.Here, using marine benthic databases, we aim to quantify such ecological drift.We must emphasize that we borrow here the concept of ecological drift and relative abundance, but this has no relation with the validity of the hypotheses of neutral theory for the database analyzed.In other words, we borrow a concept, but we do not need the underlying hypotheses of the theory.
In order to characterize the fluctuations of a species' dynamics, we consider its mean square displacement (MSD) (which could also be seen as its mean square drift): (2) where r(t) = r(t)−r(t −1) and the relative abundance here is estimated with respect to the total polychaetes population.
For the study of such dynamics, we considered only subsamples of our databases for which the sampling period was regular (from 1978 to 1981 for PN data, monthly samples, and from 1980-1996 for RM, with five samples per year).Relative abundance is a strongly fluctuating quantity.Tables 3 and 4 show the population dynamics MSD for the most abundant species, the average value MSD m estimated over all species, and the ratio MSD/MSD m in order to quantify the difference in the ecological drift of several abundant species.We see that species with high MSD are the most abundant in the communities and display high variability in their abundance values; conversely rare species have low MSD values and similarly low variability in abundance.Hu et al. (2006) have argued that the ecological drift is likely to be larger for rarer species, whereas abundant species should have a low ecological drift.This prediction was based on the analogy between ecological drift and genetic drift, and on the neutral model where ecological drift is obtained from a random birth and death process chosen among randomly chosen species, explaining the prediction of a smaller effect for large community sizes.Here we have tested this prediction using the population dynamics mean square displacement as a measure of the strength of the ecological drift.Figure 5 shows the mean species relative abundance < r > versus MSD, in log-log plot.This relation is the opposite of what could be expected according to Hu et al. (2006) with an increase and not a decrease of the ecological drift of a species with its abundance.Moreover, we find a remarkable scaling relation: with β = 1.5 and C = 0.2.The power law relation found here has an exponent of 1.5, for which we have no interpretation.This relation could be linked to a more general allometric relation.It is also remarkable that in Fig. 5, the two databases perfectly collapse, whereas for other comparisons (i.e.Fig. 4) the curves were quite different for PN and RM series.

Discussion and conclusion
In this paper, we have considered the community dynamics inside the highly variable total abundance.For this, we have normalized the data, at each time step, by the total abundance.Each species was characterized by a relative abundance value, bounded between 0 and 1.We considered the "ecological drift", corresponding to the time evolution of the relative abundance of a species and proposed, for the first time, a quantification of the strength of this drift, using the mean square displacement coefficient from one time step to the other.We found that this coefficient is related to the abundance of the species; Tables 5 and 6 indicate that the strength of the ecological drift is stronger for abundant species, providing a relation which is the opposite of what could be expected according to Hu et al. (2006) in the framework of neutral theory.This result has indicated that the more abundant a species is, the more it fluctuates, this being a possible manifestation of species turnover (Vilenkin, 2006).This relation could be linked to a more general allometric relation.Gaston and McArdle (1994) mention that the abundance of animals in different habitats can have different variabilities and that these are thought to be driven by levels of temporal variability exhibited by the habitats.The more variable the environment the larger temporal variability is likely to be; disturbed or temporally variable habitats tend to be occupied by species with high rates of population growth, like opportunistic species (i.e.Pseudopolydora pulchra and Spio decoratus).The results we found confirm and quantify such observations.Dauvin (1998) has shown similar patterns with periodic alternation between cold and hot winters in the Bay of Morlaix and identified them as the main driver of the longterm changes of the Pierre Noire community after the Amoco Cadiz spill which remains the main event during the 1977-1996 survey.
Nearshore estuarine and marine infaunal communities are often characterized by high variability in abundance and in dominance patterns between years (Peterson, 1975;Mahoney and Livingston, 1982;Flint and Kalke, 1985;Holland, 1985;Nichols, 1985;Posey, 1986), although there are examples of long-term stability in dominance for lower salinity areas (Hines et al., 1987;Holland et al., 1987).Despite this variability, these infaunal communities are generally dominated by relatively few taxa, with less than 10-20 % of species often comprising larger than 99 % of a community by number, and most other taxa represented by only a few individuals.Here we have recalled that both communities are indeed dominated by very few numbers of species (i.e.Pseudopolydora pulchra, Spio decoratus, Chaetozone gibber...) but nevertheless we found common patterns valid for all species, whatever their total abundance.In future works we will apply and develop these methodologies to other biodiversity databases, in order to determine the degree of generality and potential universality of this scaling relationship.

Fig. 3 .
Fig. 3.The complementary cumulative distribution function (loglog plot) of polychaete abundance: (a) in Pierre Noire site (above) and (b) in Riviere de Morlaix site (below).The continuous lines correspond to the data, dotted lines to normal distributions obtained from the same mean and variance as experimental data, and dashdotted lines to a lognormal distribution with the same mean and variance as the logarithm of experimental data.

Fig. 4 .
Fig. 4. Species richness versus total abundance for each sampling date.Conditional averages are shown as continuous lines.

Fig. 5 .
Fig. 5. MSD as a function of polychaete species mean relative abundance.Species with high MSD values are the most abundant in the communities; conversely rare species have low MSD.Equation of the fit: MSD = 0.2 < r > 1.5 .

Table 1 .
List of most abundant species and contribution to total abundance for the PN site.

Table 2 .
List of most abundant species and contribution to total abundance for the RM site.

Table 3 .
List of most abundant Spionid species and contribution to total spionid abundance for the PN site.

Table 4 .
List of most abundant Spionid species and contribution to total spionid abundance for the RM site.

Table 5 .
Mean square drift (MSD) for the most abundant species of polychaetes in PN station.The third column is the normalised coefficient.

Table 6 .
Mean square drift (MSD) for the most abundant species of polychaetes in RM station.The third column is the normalised coefficient.