Influence of thresholding in mass and entropy dimension of 3-D soil images

With the advent of modern non-destructive tomography techniques, there have been many attempts to analyze 3-D pore space features mainly concentrating on soil structure. This analysis opens a challenging opportunity to develop techniques for quantifying and describe pore space properties, one of them being fractal analysis. Undisturbed soil samples were collected from four horizons of Brazilian soil and 3-D images at 45 μm resolution. Four different threshold criteria were used to transform computed tomography (CT) grey-scale imagery into binary imagery (pore/solid) to estimate their mass fractal dimension (Dm) and entropy dimension ( D1). Each threshold criteria had a direct influence on the porosity obtained, varying from 8 to 24% in one of the samples, and on the fractal dimensions. Linear scaling was observed over all the cube sizes, however depending on the range of cube sizes used in the analysis,Dm could vary from 3.00 to 2.20, realizing that the threshold influenced mainly the scaling in the smallest cubes (length of size from 1 to 16 voxels). Dm and D1 showed a logarithmic relation with the apparent porosity in the image, however, the increase of both values respect to porosity defined a characteristic feature for each horizon that can be related to soil texture and depth.


Introduction
Soil structure may be defined as the spatial arrangement of soil particles, aggregates and pores.The geometry of each one of these elements, as well as their spatial arrangement, has a great influence on the transport of fluids and solutes Correspondence to: A. M. Tarquis (anamaria.tarquis@upm.es)through the soil.Fractal geometry has been increasingly applied to quantify soil structure, using fractal parameters, due to the complexity of the soil structure, and thanks to the advances in computer technology (Tarquis et al., 2003, and references therein).The value of fractal parameters can be derived from indirect methods, such as water retention curves or directly through image analysis (Crawford et al., 1995).
Several authors maintain that the exact value of the mass fractal dimension cannot be readily calculated (Crawford et al., 1999;Bird et al., 2006;Perrier et al., 2006).Tel and Vicsek (1987), for instance, proposed practical methods to compute it, indicating that the standard methods for determining fractal dimensions must be applied with some caution.The main difficulty is that the ideal limit cannot be reached in practice (Buczhowski et al., 1998).Moreover, there is an effect of the image manipulation on D m and D 1 values (Babeye et al., 1998).
Computed tomography (CT) has provided an alternative for observing intact soil structure (Anderson et al., 1988;Warner et al., 1989;Grevers and Jong, 1994;Peyton et al., 1994;Perret et al., 1997Perret et al., , 1998Perret et al., , 1999Perret et al., , 2003;;Rasiah and Aylmore, 1998a, b;Rogasik et al., 1999;Gantzer and Anderson, Published by Copernicus Publications on behalf of the European Geosciences Union and the American Geophysical Union.2002; Pierret et al., 2002;Anderson et al., 2003;Rachmant et al., 2005;Gibson et al., 2006).The principal benefits of CT techniques are: reducing the physical impact of sampling, providing three-dimensional (3-D) information and allowing rapid scanning to study sample dynamics in near real-time (Rasiah and Aylmore, 1998b).Because of these benefits, CT scanning has been used to extract fractal dimensions related to soil structure.Peyton et al. (1994) evaluated the fractal dimension of macropore-scale density.Zeng et al. (1996) calculated fractal lacunarity in a silt loam soil.Several authors have dedicated their attention to the appropriate pore-solid CT threshold, before calculating mass fractal and surface fractal dimensions (Rogasik et al., 1999;Gantzer and Anderson, 2002;Perret et al., 2003;Rachmant et al., 2005).Recently, Gibson et al. (2006) compared three fractal analytical methods to quantify the heterogeneity within soil aggregates; in this work, the frequency distribution of pore and solid components was clearly dependent on thresholding, which could not be generalized.
As far as we know, they didn't quantify this effect on D m and D 1 .The aim of the present study is to evaluate the effect of the image thresholding value as well as the cube size on the calculation of mass fractal dimension (D m ) and entropy dimension (D 1 ).To this end, soil images from four horizons, obtained from a Brazilian Argissolo (Melo and dos Santos, 1996) were analyzed to obtain these fractal dimensions applying four different thresholds.

Soil studied
Intact soil samples were collected from four horizons of an Argissolo in the Brazilian Soil Classification (EMBRAPA SOLOS, 2006), or Ultisol (FAO Soil Classification), formed on the Tertiary Barreiras group of formations in Pernambuco state (Itapirema Experimental Station) presenting a hardsetting behavior, found throughout the coastal tablelands of northeast Brazil.The natural vegetation of the region is tropical, coastal rainforest.Macromorphology and micromorphology, mineralogy, as well as key physical and characteristics of this soil, have been studied, from a genetic perspec-tive, by Melo and dos Santos (1996).Physical characteristics, of relevance to the current study, are provided in Table 1.

CT imaging and image pre-treatment
The intact soil samples were imaged using an EVS (now GE Medical.London Canada) MS-8 MicroCT scanner.Though some samples required paring to fit the 64 mm diameter imaging tubes, field orientation was maintained.Imaging parameters were 155 keV and 25 µA.
Proprietary software (GE Medical), was used to reconstruct the 16-bit, 3-D imagery from the sequence of axial views.The resulting voxel size was 45.1 µm.File sizes ranged from 70 to 200 Mb, which made subsequent processing of the entire volume practically impossible.Accordingly, one subvolume was extracted from each of the four original volumes (using GE Medical Microview); care was taken to ensure no edged effect of the subvolume.The subvolumes measured 256×256×256 units, corresponding to about 16.8 million voxels.A 3-D Gaussian filter in MicroView (GE Healthcare, 2006) was also run on each sub-volume to reduce noise, typical of CT imagery.An example of a typical 3-D imagery is provided in Fig. 1.

Binary thresholding of CT imagery
CT imagery of soil, like other digital imagery, typically contains a large proportion of mixed-voxels (voxels whose digital number is the weighted average of more than one constituent -such as a solid/air interface).To facilitate identification of constituent peaks in the grey-scale histogram, a 3-D filter, executed in NIH ImageJ (Rasband, 2006) was run on each sub-volume to mask voxels which differed by more than 0.1% from the surrounding neighborhood of 124 voxels (5×5×5 unit volume).Full details of this technique can be found in Elliot and Heck (2007).Histograms of the unmasked voxels were subsequently ported into OriginPro (Origin Lab Corporation, 2006); after smoothing the histograms (adjacent averaging of 25 levels), peaks were identified in the Peak Fitting Module.The major peak with the lowest mean digital number was taken to be that corresponding to the void space; the next major peak was considered to be solid material.Based on the central tendency and dispersions (assuming Gaussian distributions) of the two peaks,

Mass fractal dimension
The binarized image is considered to represent two basic phases: pores and solid.Fractal analysis (FA) in 3-D im-ages involves partitioning the space into cubes to construct samples and recording the number of cubes which cover the pores phase; this is repeated for different size cubes (Perret et al., 2003).
The cube-counting (CC), similar to the box-counting method in 2-D, combines voxels to form larger, mutually exclusive cubes each containing a different set of voxels.Given an L×L×L-voxel image, partitioned to a cube size of δ×δ×δ, the number of cubes (n(δ)) will follow the proportion of line size δ:  one pixel belonging to the pore space (N(δ)) is recorded.
Being N j (δ) the number of cubes of size δ with j pixels belonging to the pore space (with a value from 1 to δ×δ×δ): For a fractal set a log-log plot of N(δ) vs. δ gives: which yields a line of slope equal to −D m ; being D m the mass fractal dimension.It is expected that, as the object fills the space in 3-D, D m will approach the Euclidean dimension (E) of three.
Given the relation (1) it is now instructive to seek bounds on the values the function N(δ) can take based on Bird et al. (2006) work.We denote the fraction of the image occupied by pore phase at the finest resolution (i.e.porosity) by p.If p=1 the number of cubes required to cover the set is L r 2 and this is trivially an upper bound for N(δ) when p<1.In order to derive a lower bound, we consider the situation in which the cubes cover the set but no part of the complementary set.If this were to occur, then the number of cubes is equal to L r 2 f .In general, the cubes will cover part of the complementary set and consequently the former number of cubes is a lower bound for N(δ).We may now write that N(δ) must satisfy the inequalities: In terms of the log-log plot used to extract a mass fractal dimension, these inequalities result in two parallel reference  lines of slope −3, between which the measured data must lie.
The vertical spacing between these lines is equal to log(p).For all images sharing a common value of p, the cube counting data will lie between these bounding lines.Analytically, we can conclude that the higher the porosity, the closer the boundary lines are and then D m value will be closer to 3.

Entropy dimension
This calculation in 3-D imagery involves partitioning the space into cubes to construct samples with multiple scales.
The cube-counting method (CC), similar to box-counting method in 2-D, combines voxels to form larger, mutually exclusive cubes each containing a different set of voxels.Given an L×L×L-voxel image, partitioned to a cube grid of size δ×δ×δ, the fraction of pore space (µ i ) in each n(δ) cubes (density) is calculated from: where m i is the number of pore class voxels in cube i, and m T is the total number of pore class voxels in an image.In this case, the pore density is the measure and the cube grid the support.When computing the number of cubes of size δ, the possible values of m i range from 0 to δ×δ×δ.So, if N j (δ) is the number of cubes containing j voxels of pore volume in a given grid, Eqs. ( 4) and (5) can then be combined (Barnsley et al., 1988): where: Nonlin  By using the distribution function N j (δ) it simplifies calculations and reduces computational errors (Barnsley et al., 1988).The entropy of the system is estimated through this dimension by the relation (Feder, 1989): The higher the D 1 is, the lower the information (higher uncertainty) we have on the distribution of pore/solid fractions achieving a higher homogeneity.Contrary, the lower the values of D 1 , the more information (lower uncertainty) we have on this distribution and then lower homogeneity.The lower and upper limits of S(δ) were calculated in function of the porosity (p) for 3-D binary images given: the bounding functions when included on the plot of entropy against ln(δ) again yield two parallel lines of slope 3, with separation of ln(p).The higher the porosity, the closer the two boundaries and D 1 will approach 3.

Thresholding methods
As indicated in Table 2, the mean of the modal values for the air and solid distributions consistently resulted in the largest thresholding values, followed by the equi-probability value for the two distributions.Though the thresholding values corresponding to the other two criteria were consistently lower than those obtained from both distributions, there was no consistent trend between the air mean and the lower 3rd standard deviation of the solid distributions.A variation of specific thresholding values, among subvolumes of a given sample, suggests the high variability in the intensity field obtained from the CT scan.As indicated in Fig. 3, apparent porosity was more sensitive to the selected thresholding criteria in the Bt/Bw horizons than for the A2 and AB horizons, which followed a more linear trend.However, Bt/Bw is less sensitive to thresholding criteria than the Bt2 horizon.

Mass fractal dimension
For all horizons, the porosity obtained varied as a function of the threshold method applied (Table 3).As expected, threshold method D gave the higher porosity, from 24% for the A2 horizon to 28% for the AB horizon, and threshold method A gave the lowest, from 8% for the Bt/Bw horizon to 10% for the AB horizon.In all cases, we obtain statistically significant straight-line fits (R 2 >0.98) for the full range of box sizes considered (from 1 to 256 voxels) as it is shown in Table 3 (D m1 ).The mass fractal dimensions derived from these calculations are very close to 3, with the lowest being 2.70 for A2 horizon using threshold method A, and the highest 2.94 for horizons AB, Bt2 and Bt/Bw, when applying method D.
Comparing among horizons (Table 3), the A2 horizon always exhibited the lowest value in D m and Bt2 the highest value, regardless of the thresholding method.In order to closely examine the relationship between porosity and mass fractal dimension, a bi log plot of N(δ) and δ is shown in more detail (Fig. 4).It is now apparent that soil porosity is mainly affecting the smallest voxels from length size δ=2 to 16 that influence the D m value obtained.If we reduce our estimation of D m to this size range (D m2 ) values are lower and show more variation (Table 3).From now on, we will use only D m1 and named it as D m .
At the highest porosity values (Table 3), the differences in D m among the horizons are much smaller than at the lowest values.Moreover, it was observed that, for each horizon, the trend was not linear, but rather showed several increments that can be fitted by a logarithmic curve.To study this observation closer, several plots were done to see the variation of D m versus porosity for each horizon and sub-voxel (Fig. 5).What can easily be seen is that for a certain horizon it didn't matter what the porosity for each sub-voxel and threshold was, all of them follow a pattern.This pattern is different depending on the horizon.In the case of the horizon AB, subvoxel c is different from the other two, highlighting a possible problem in the image due to a beam hardening effect that was verified.For this reason, we didn't include sub/voxel c in the logarithmic regression for this horizon.
Comparing these results to the particle size distribution (PSD) and depth for each horizon (Table 1), it can be observed that the most superficial one (A2) and with higher coarse sand percentage (62%) presents the steepest curve with respect to porosity (Fig. 5).Next horizon, AB, reduces the curve convexity as it shows a reduction to 26% of coarse sand and its depth (lower than 35 cm) protect it from any tillage practice.With the last two horizons, the clay percentage increases to double (35%) and has a low percentage in coarse sand.The curves of D 0 versus porosity are quite close and show the lower coefficients values multiplying Ln(porosity).

Entropy and correlation dimension
After calculating S(δ) for all binary images, it was plotted against log(δ); a clear linear pattern was observed in all plots (data not shown).The same comments that we have made for D m could be applied here too.The higher the threshold value (from A to D), the higher the porosity and D 1 increases approaching the value of 3 and the s.e.decreases.At threshold D all D 1 are higher than 2.90, pointing out a high homogeneity.As we did for mass dimension, we plotted D 1 for each horizon versus the porosity of the soil at that threshold value (Fig. 6) realizing again that each sub-voxel differs in the porosity and D 1 associated, the three of them varies D 1 similarly when the threshold is pushed forward.For example, this trend for horizon A2 is much more linear for D 1 case than for D m values.At the same time, AB shows a big difference in sub-voxel c with respect to sub-voxel a and b.Once these fractal dimensions versus porosity have been analyzed, all the horizons are plotted together for D 1 and D m (Fig. 7).In both it is clear that A2 shows a very distinctive trend more than the other three horizons pointing out the influence of a high percentage of coarse sand and being the most superficial horizon that can be affected by tillage prac-

Conclusions
3-D CT images from undisturbed soil samples were obtained from four horizons and three adjacent positions, having a total set of 12 samples.In order to describe the porosity structure, four threshold values were applied to convert each image into binary and then calculate D m and D 1 as a quantification of soil morphology.
The sensitivity of each horizon to the threshold value on porosity was revealed indicating the differences in the CT unit values histogram among horizons although with these types of analyses, results cannot reflect the space arrangement of pores.
Linear scaling was observed over all the cube sizes for both fractal dimensions, however depending on the range of cube sizes used in the linear regression, the values obtained changed.For example, in one of the images D m could vary from 3.00 to 2.20.Observing the log-log plots to estimate the dimensions, we conclude that threshold influenced mainly the scaling in the smallest cubes (length of size from 1 to 16 voxels).D m and D 1 showed a logarithmic relation with the apparent porosity in the image for the 12 samples studied.Plotting all the data of porosity and its D m or D 1 independently of the threshold applied, the increase of both with respect to porosity defining a characteristic feature for each horizon that indirectly differentiate their structures.These differences can be explained based on the texture and depth of each horizon.These D m variations are in agreement with the functional box-counting concept presented by Lovejoy et al. (1987) to extract the multiple dimensions of multiscale fields.Further research is necessary to present a wider variability in soil texture and applying this type of analysis so a statistical analysis can be made in order to estimate these relationships.However, from these results we can conclude that the higher the porosity, the harder it is to differentiate the differences in D m and D 1 among horizons.
This indicates that fractal/multifractal analysis of the CT unit values could be applied to be able to choose an optimal thresholding as it is already used in many areas of geoscience.

Figure 1 .
Figure 1.Typical grey-scale CT imagery (orthogonal planes) of sub-volumes for each of the

Fig. 1 .
Fig. 1.Typical grey-scale CT imagery (orthogonal planes) of sub-volumes for each of the horizons studied.Dark regions correspond to less attenuating regions (pores), lighter areas to solid components.Z-axis is vertical, with Z+ representing top of sample.Length of edge of cube is 11.54 mm.

Figure 2 .
Figure 2. Graphical representation of the different thresholding criteria, used to binarize the frequency distribution of CT values: A) CT value corresponding to 0.01% distribution of the CT values distribution of solid, B) mean of CT values of air ( CTair µ ), C) equi-probability CT values of air and solid and D) average of mean CT values of air ( CTair µ ) and mean CT values

Fig. 2 .
Fig. 2. Graphical representation of the different thresholding criteria, used to binarize the frequency distribution of CT values: A -CT value corresponding to 0.01% distribution of the CT values distribution of solid, B -mean of CT values of air (µ CTair ), Cequi-probability CT values of air and solid and D -average of mean CT values of air (µ CTair ) and mean CT values of solid (µ CTsolid ).

Figure 3 .
Figure 3. Soil porosity versus CT units of threshold for each horizon and sub-volume.Linear fits for each horizon represents the change of porosity per CT unit.

Fig. 3 .
Fig. 3. Soil porosity versus CT units of threshold for each horizon and sub-volume.Linear fits for each horizon represents the change of porosity per CT unit.

Fig. 4 .
Fig. 4. Log-log plot of N (δ) versus cube size (δ) at different threshold values for each horizon and sub-volume a. Left columns shows all the size range, right column shows from δ=2 till δ=16 voxels.Gray line is the maximum line boundary, blue and red lines are the minimum line boundary respectively for B and D threshold values.

Fig. 5 .
Fig. 5. Mass fractal dimension (D m ) versus porosity for each horizon and sub-voxel.Logarithmic regression lines and their corresponding equation are included in each graphic.

Figure 6 .
Figure 6.Entropy fractal dimension ( 1 D ) versus porosity for each horizon and sub-voxel.Logarithmic regression lines and

Fig. 6 .
Fig. 6.Entropy fractal dimension (D 1 ) versus porosity for each horizon and sub-voxel.Logarithmic regression lines and their corresponding equation are included in each graphic.

Fig. 7 .
Fig. 7. Mass fractal dimension (D m ) and entropy fractal dimension (D 1 ) versus porosity for each horizon.Horizon AB is represented only by sub-voxels a and b.

Table 2 .
Threshold values obtained applying different methods: A) CT value that represent the 0.1% of the CT values distribution corresponding to solid; B) average of CT values corresponding to air; C) intersection point of frequency distribution of CT values corresponding to air and solid and D) mean of average CT values corresponding to air and average CT values corresponding to solid.

Table 3 .
Average (three replicates) porosity and mass dimension (D m ) for each horizon and thresholding criteria based on a range from δ=2 till δ=256 voxels (D m1 ) and from δ=2 till δ=16 voxels (D m2 ).For all linear fits the R 2 obtained was higher than 0.98.
m D ) versus porosity for each horizon and sub-voxel.Logarithmic regression lines and their corresponded equation are included in each graphic.
Table 4 shows the results obtained from function of soil horizon and threshold applied.

Table 4 .
Entropy dimension (D 1 ), average of each horizon based on the three replicates applying four different threshold values.Horizon Bt2 shows the minimum slope in variations of D m and D 1 versus the porosity and therefore, versus threshold, similarly with low content in coarse sand, low content in silt and 35% in clay.AB and Bt/Bw are inside the area marked by the other two horizons.Bt/Bw has the lowest content in coarse sand (18%), a higher content in clay (35%) as Bt2 but higher content in silt compared with the rest of the horizons (10%).