Geoderma
journal homepage: www.elsevier.com/locate/geoderma
Faculty of Agriculture and Environment, The University of Sydney, 1 Central Avenue, Australia Technology Park, Eveleigh, NSW 2015, Australia
EcoSciences Precinct, Department of Science, Information Technology and Innovation, GPO Box 5078, Brisbane, QLD 4001, Australia
Article info
Article history:
Received 16 April 2015
Received in revised form 31 July 2015
Accepted 10 August 2015
Available online xxxx
Keywords:
Geostatistics
Spatial variability
Soil profile
Depth function
Area-to-point kriging
Abstract
Datasets for modelling and mapping soil properties often consist of samples from many spatial locations, collected from several different soil depth intervals. However, interest may lie in the spatial distribution of the property for a particular target depth interval, which may or may not correspond to the sampled intervals. It is the task of the data analyst to put the data together in such a way that useful and reliable conclusions can be drawn for the soil depths of practical interest. Previous studies to tackle this problem include multi-stage approaches and point-data-based 3-dimensional geostatistical approaches. One disadvantage of a multi-stage approach (for example, first fitting splines to the data for sampled profiles, then imputing new data for the target interval, before considering a spatial analysis with the imputed data) is that the imputation generally ignores any uncertainty in the imputed data, which might give misleading conclusions. Point geostatistical methods, on the other hand, assume that the data represent the value of the target variable at a specific point in the profile, rather than its average over a sampling interval; this too could give misleading estimates. In this work, we present a statistical method that properly deals with the sample support of soil profile data so that all data can be considered in a single geostatistical analysis. The approach is based on the area-to-point kriging framework, which can be used to represent the uncertainty from data that are averages over non-negligible sample supports (in our case, the different sampled depth intervals). We combine a covariance model for the increment-averaged data in the vertical domain with another model for the horizontal variation. This enables us to (i) process all data in a single analysis, and (ii) calculate predictions for any target depth and support based on the same statistical model. We test the approach on data from the Murray–Darling basin in eastern Australia, where interest lies in mapping various soil properties that could have an effect on water salinity of the nearby Muttama Creek: we illustrate the methodology for predicting clay content. Finally, we discuss a number of possible extensions of the methodology to broaden its applicability, which should provide the basis of further studies.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Soil properties vary significantly both across the landscape and through the soil profile, and interest lies in characterizing and mapping this variation to provide land users with useful information. Datasets often consist of samples from many spatial locations, at several different depth intervals. Within a particular study, these depth intervals may be fixed (e.g. 0–10 cm, 10–20 cm, and 20–30 cm). Other studies may consider different fixed intervals, or sampling intervals that are defined according to soil horizons and therefore vary between locations within the study. It is then the task of the data analyst to draw useful and reliable conclusions for soil depths of practical interest. For example, the GlobalSoilMap project specifications (Arrouays et al., 2014) dictate that soil properties should be mapped for depth intervals of 0–5 cm, 5–15 cm, 15–30 cm, 30–60 cm, 60–100 cm and 100–200 cm.
Corresponding author.
E-mail address: Thomas.Orton@dsiti.qld.gov.au (T.G. Orton).
http://dx.doi.org/10.1016/j.geoderma.2015.08.013
0016-7061/© 2015 Elsevier B.V. All rights reserved.
a mixed-model approach for estimating depth functions, whilst properly accounting for the interval support of the data; their focus was on the
estimation of depth profiles, whereas our focus is more on the use of
such data for modelling and mapping using spatial datasets of soil
horizon data. Other 3-D approaches (e.g. Poggio and Gimona, 2014;
Veronesi et al., 2012) generally suffer the same drawback; all data are
assumed to have identical vertical support, which ignores their different
uncertainties.
In a geostatistical framework, the sample support of data that are
averages of an attribute over non-negligible areal units can be dealt
with by area-to-point kriging (ATP kriging; Kyriakidis, 2004). This
method allows the sampling units and prediction supports to all have
different sizes and shapes. It has been applied in several case studies
in recent years to analyse areal-averaged data (Kyriakidis and Yoo,
2005; Kerry et al., 2012; Schirrmann et al., 2012; Truong et al., 2014).
Although usually carried out to account for the horizontal support (i.e.
the data are areal averages), there is no reason that the same methodology cannot be carried out to deal with the vertical support of soil profile
data (i.e. for data that are measurements of the average value of a soil
property over depth intervals). This was noted in Heuvelink (2014),
although we are unaware of any studies that have implemented such
an approach.
In this work, we combine the ATP approach for the vertical distribution with standard kriging approaches for the horizontal distribution.
Thus, a statistical model for the complete dataset (all spatial locations
and all depth intervals) is defined, with the support of each datum (a
combination of spatial location and depth interval) properly represented. We refer to this model for increment-averaged data, and the predictions built on the model, as 'increment-averaged kriging' (IAK). We
propose that this all-in-one model should provide a better assessment
of prediction uncertainty compared with a two-stage approach, or an
approach that represents interval data by their mid-points (although
comparison of the different methodologies is not undertaken in the
current study).
We consider the methodology in the framework of a linear mixed
model (LMM; Lark et al., 2006). Thus, part of the variation of the target
variable can be explained by a collection of explanatory variables, with
the remainder being modelled as spatially dependent (i.e. data close
to each other in horizontal space and at similar depths are more likely
to be similar than data far apart in space and at different depths). We
allow interactions between depth and the spatial explanatory variables,
so that different relationships can be modelled at different depths in the
profile. We also allow the variance parameters of residuals to depend on
depth, which provides a mechanism to represent different uncertainties
at different depths in the profile.
Usually in ATP-kriging studies, the average covariances must be
calculated numerically (by a discretization approach), due to the complex nature of the areal data units in 2-D space. However, for our
increment-averaged data, the average covariances can be computed
analytically. We derive an expression for the covariance of the
increment-averaged data, based on an exponential model for the point
covariances. This significantly reduces the computational load of maximum likelihood methods compared with numerical procedures. Nonetheless, for large datasets (when the total number of data is more than a
few thousand), likelihood approximation techniques may have to be
used (e.g. Stein et al., 2004; Eidsvik et al., 2014); we do not consider
these here though.
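As a minimal sketch of the difference between the two routes, the snippet below treats only the simplest special case: a stationary exponential point covariance with unit variance and a distance parameter `a`. The closed form in `avg_exp_cov` was worked out for this special case purely for illustration (it is not the general expression of Appendix A), and both function names are hypothetical.

```python
import numpy as np

def avg_exp_cov(u, l, u2, l2, a):
    """Closed-form average of exp(-|d - d'| / a) for d in [u, l], d' in [u2, l2]."""
    # Length of depth shared by the two intervals (zero if they do not overlap).
    overlap = max(0.0, min(l, l2) - max(u, u2))
    # Exponential terms evaluated at the four pairs of interval bounds.
    expo = (np.exp(-abs(l - l2) / a) + np.exp(-abs(u - u2) / a)
            - np.exp(-abs(l - u2) / a) - np.exp(-abs(u - l2) / a))
    integral = 2.0 * a * overlap - a**2 * expo
    return integral / ((l - u) * (l2 - u2))

def avg_exp_cov_numeric(u, l, u2, l2, a, n=200):
    """Discretization approach: average point covariances over n x n midpoints."""
    d = u + (np.arange(n) + 0.5) * (l - u) / n
    d2 = u2 + (np.arange(n) + 0.5) * (l2 - u2) / n
    return np.exp(-np.abs(d[:, None] - d2[None, :]) / a).mean()

# Two adjacent horizons (depths in metres), with a = 0.25 m.
analytic = avg_exp_cov(0.0, 0.1, 0.1, 0.3, 0.25)
numeric = avg_exp_cov_numeric(0.0, 0.1, 0.1, 0.3, 0.25)
```

The analytical version needs a handful of exponentials per pair of intervals, whereas the discretized version needs `n * n` of them, which matters when the covariance matrix is rebuilt at every likelihood evaluation.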
We test the proposed IAK approach on data from the Murray–Darling basin in eastern Australia, where interest lies in mapping
soil properties that could have an effect on water salinity of the
nearby Muttama Creek. Soil cores (to a depth of 1 m) were collected from 55 spatial locations over the Muttama catchment, and each
core was divided into horizons, giving a total of 192 samples. We
use this case study to illustrate the IAK approach, mapping clay
content and its attendant uncertainty based on the data from
these samples.
2. Theory
Throughout the following, we will assume that the horizontal
support of the data and of the prediction is point support. The method
can be extended to deal with data that are both areal- and depth-wise
averages, if this were to be required in another study. We begin our
presentation of the methodology with a simple stationary model for
the point covariances. We then extend this model to a more realistic
one, allowing variances to depend on depth, before describing how
this relates to the average covariances required to model the variation
of increment-averaged data.
2.1. IAK model: initial stationary model for point covariances
We begin our development towards a statistical model for the analysis of depth interval-averaged data by considering a 3-D model for point data (i.e. with depths, d, taken to be fixed points):
$$Y(x, d) = \mu(x, d) + \varepsilon(x, d), \tag{1}$$
with the mean given by a linear function of the covariates,
$$\mu(x, d) = X(x, d)\,\beta, \tag{2}$$
where X(x, d) contains the known values of the covariates and $\beta$ is the vector of associated parameters (to be estimated). This is known as the fixed-effect function, and X(x, d) constitutes a row of the fixed-effect design matrix. We assume that the residuals, $\varepsilon(x, d)$, follow a multivariate normal distribution with mean zero and covariances depending only on the horizontal and vertical separation distances (this assumption will be relaxed in Section 2.2). As a first approach, we assume a separable (product) covariance model (De Iaco et al., 2011):
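$$\mathrm{Cov}\left[Y(x, d), Y(x', d')\right] = \sigma^2\,\rho_x(h_x; \theta_x)\,\rho_d(h_d; \theta_d), \tag{3}$$
$$\rho_x(h_x; \theta_x) = s_0\,\rho_{x,0}(h_x) + \sum_{i=1}^{N_m} s_i\,\rho_{x,i}(h_x; \theta_{x,i}), \tag{4}$$
where: $\rho_{x,0}(h_x)$, equal to 1 if $h_x = 0$ and 0 otherwise, is the nugget correlation function; $\rho_{x,i}(h_x; \theta_{x,i})$, $i = 1, \ldots, N_m$, are $N_m$ spatial correlation functions, with parameter vectors $\theta_{x,i}$; parameters $s_i$, $i = 1, \ldots, N_m$, are the proportions of variance associated with each of the $N_m$ spatial correlation functions; and the parameter $s_0 = 1 - \sum_{i=1}^{N_m} s_i$ gives the proportion associated with the nugget variation.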
Reparameterizing in terms of $c_i = s_i \sigma^2$, $i = 0, \ldots, N_m$ (i.e. the variances associated with the nugget and each of the $N_m$ spatial correlation functions), we can write the full product covariance model as:
$$\mathrm{Cov}\left[Y(x, d), Y(x', d')\right] = \begin{cases} \left(\sum_{i=0}^{N_m} c_i\right)\rho_d(h_d; \theta_d) & \text{if } h_x = 0 \\ \left(\sum_{i=1}^{N_m} c_i\,\rho_{x,i}(h_x; \theta_{x,i})\right)\rho_d(h_d; \theta_d) & \text{otherwise.} \end{cases} \tag{5}$$
We work with this form herein.
For the vertical correlation, we will assume an exponential function. One property of the exponential function that we will make use of in this work is that it is integrable; this will allow us to derive an analytical function for the average covariances (that represent the correlation between observations of depth-interval averages), rather than requiring a numerical procedure to approximate them.
2.2. IAK model: non-stationary variances
To allow the variances to depend on depth, we replace the constants $c_i$ in Eq. (5) with:
$$c_i(x, d; x', d') = f_i(x, d)\,f_i(x', d'). \tag{6}$$
Although the $f_i(x, d)$ functions can in theory be chosen to model variances that depend on spatially-varying covariates, here we consider only the dependence on depth. In particular, we will consider:
$$f_i(x, d) = f_i(d) = P_{r_i}(d; \phi_i), \tag{7}$$
where $P_{r_i}(d; \phi_i)$ is a polynomial of order $r_i$ in d with coefficients $\phi_i$. The point covariance model is then:
$$\mathrm{Cov}\left[Y(x, d), Y(x', d')\right] = \begin{cases} \left(\sum_{i=0}^{N_m} P_{r_i}(d; \phi_i)\,P_{r_i}(d'; \phi_i)\right)\rho_d(h_d; \theta_d) & \text{if } h_x = 0 \\ \left(\sum_{i=1}^{N_m} P_{r_i}(d; \phi_i)\,P_{r_i}(d'; \phi_i)\,\rho_{x,i}(h_x; \theta_{x,i})\right)\rho_d(h_d; \theta_d) & \text{otherwise.} \end{cases} \tag{8}$$
We write the complete set of covariance parameters as $\theta = \{\theta_x, \theta_d, \phi\}$, where $\phi$ contains all of the $\phi_i$'s and $\theta_x$ contains all of the $\theta_{x,i}$'s.
Eq. (1) presented the statistical model for the point-support variable, with mean function given by Eq. (2) and covariance function Eq. (8). However, our data are given as depth interval averages, and we must use these point-support models to calculate expectations and covariances for the interval support. In the geostatistical literature, this process is known as regularization (e.g. Goovaerts, 2008), with its reverse (the use of interval-support data to infer point-support models) known as deconvolution.
We assume that the measurement of variable Y for an interval I = [u, l] (where l > u) represents the arithmetic mean of point values of Y within this interval. The depth interval-averaged variable is then a linear combination of multivariate normal variables, and is also multivariate normal (see e.g. Kyriakidis and Yoo, 2005). Its mean and covariance matrix are given by interval averages of the respective statistics for the point-support variable.
First, the expectation on interval support is:
$$\mu(x, I) = \bar{X}(x, I)\,\beta, \tag{9}$$
where $\bar{X}(x, I)$ is the average of the point-support design matrix, X(x, d), over the interval I. Second, the covariance on interval support is:
$$\mathrm{Cov}\left[Y(x, I), Y(x', I')\right] = \frac{1}{|I|\,|I'|}\int_{d \in I}\int_{d' \in I'} \mathrm{Cov}\left[Y(x, d), Y(x', d')\right]\,\mathrm{d}d'\,\mathrm{d}d, \tag{10}$$
where |I| and |I′| are the lengths of the intervals I and I′, respectively (Kyriakidis, 2004). We use as shorthand C to denote the full data covariance matrix with elements defined by Eq. (10).
In ATP kriging, these average covariances are usually computed numerically, by discretizing the areal units into a number of points, calculating covariances between these points, and averaging the values. Such a numerical procedure is necessary because the integrals (to compute these averages) of the point covariance function over irregular two-dimensional areal units are analytically intractable. However, in our case, the integrals are in one dimension (depth) and as a result are analytically tractable for certain point-covariance functions. In this work, we consider the exponential model for the depth-wise correlation function, $\rho_d(h_d; \theta_d)$, where $\theta_d = a_d$ is a single distance parameter (approximately one third of the effective range of correlation), which allows analytical results for the 1-D interval-averaged covariance. This significantly reduces the computational load compared with a numerical procedure, particularly when maximum likelihood methods are used for parameter estimation. The interval-averaged exponential covariance function is derived in the Supplementary material and presented in Appendix A. Herein, we refer to the methods described in this section for increment-averaged data, and the predictions built
3. Methods
$$y_i = \beta_0 + \beta_1 d_i + \beta_2 d_i^2 + \beta_3 w_i + \beta_4 d_i w_i + \beta_5 d_i^2 w_i + \varepsilon_i, \quad i = 1, \ldots, N, \tag{14}$$
Parameter estimation for kriging is often carried out by a method-of-moments approach. For ATP-kriging, Goovaerts (2008) presents an iterative method-of-moments approach to perform deconvolution and estimate parameters of a point-support variogram. For point-support data, Lark (2000) demonstrated theoretical advantages of maximum likelihood (compared to method-of-moments) to estimate parameters, and this approach was suggested by Kyriakidis (2004) as an alternative for ATP-kriging parameter estimation. A further improvement over maximum likelihood is residual maximum likelihood (REML), introduced by Patterson and Thompson (1971) to reduce the bias in variance parameters as a result of the unknown fixed-effect parameters (Lark et al., 2006). We fit parameters for the IAK model using REML; the REML formula is exactly the same as in the usual case, but with $\bar{X}(x, I)$ in place of X(x, d) to give the fixed-effect design matrix $\bar{X}$, and Eq. (A6) used to calculate the elements of the covariance matrix, C:
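$$\ln L_R(\theta \mid y) = k - \frac{1}{2}\ln|C| - \frac{1}{2}\ln\left|\bar{X}^T C^{-1} \bar{X}\right| - \frac{1}{2}\,y^T\left(C^{-1} - C^{-1}\bar{X}\left(\bar{X}^T C^{-1} \bar{X}\right)^{-1}\bar{X}^T C^{-1}\right)y, \tag{11}$$
where k is a constant. Given the covariance parameters, the fixed-effect parameters are estimated by generalized least squares:
$$\hat{\beta} = \left(\bar{X}^T C^{-1} \bar{X}\right)^{-1}\bar{X}^T C^{-1} y, \tag{12}$$
with covariance matrix:
$$\mathrm{Cov}\left[\hat{\beta}\right] = \left(\bar{X}^T C^{-1} \bar{X}\right)^{-1}. \tag{13}$$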
Eqs. (12) and (13) can be used in Wald tests to determine the significance of particular covariates.
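The following is a minimal sketch of how these quantities might be computed for a fixed covariance matrix, assuming the standard generalized-least-squares forms for the fixed-effect estimates and their covariance. The names `reml_fit` and `wald_z` are hypothetical, the matrix `C` is taken as given rather than built from Eq. (A6), and the toy data are simulated only to make the example runnable.

```python
import numpy as np

def reml_fit(y, X, C):
    """GLS fixed effects, their covariance, and the REML log-likelihood
    (up to an additive constant) for a given covariance matrix C."""
    Ci_X = np.linalg.solve(C, X)            # C^{-1} X
    Ci_y = np.linalg.solve(C, y)            # C^{-1} y
    XtCiX = X.T @ Ci_X                      # X^T C^{-1} X
    beta_cov = np.linalg.inv(XtCiX)         # covariance of the estimates
    beta = beta_cov @ (X.T @ Ci_y)          # GLS estimate of fixed effects
    resid = y - X @ beta
    _, logdet_C = np.linalg.slogdet(C)
    _, logdet_XtCiX = np.linalg.slogdet(XtCiX)
    # The quadratic form in the REML formula equals resid^T C^{-1} resid.
    quad = resid @ np.linalg.solve(C, resid)
    loglik = -0.5 * (logdet_C + logdet_XtCiX + quad)
    return beta, beta_cov, loglik

def wald_z(beta, beta_cov):
    """Wald statistics: each estimate divided by its standard error."""
    return beta / np.sqrt(np.diag(beta_cov))

# Toy example with independent residuals (C = identity), for illustration only.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=30)
beta, beta_cov, loglik = reml_fit(y, X, np.eye(30))
z = wald_z(beta, beta_cov)
```

In practice the log-likelihood would be maximized over the covariance parameters with a numerical optimizer, rebuilding `C` at each step; that outer loop is omitted here.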
3.2. IAK fixed-effect and covariance model selection algorithm
We model the horizontal and vertical trends by considering interactions between the spatial covariates and depth. By doing this, different spatial trends can be represented at different depths. We assume that our spatial covariates vary in space only; thus, the only mechanism to represent different trends with depth is some kind of interaction between these spatial covariates and depth. Throughout the following, we reserve the term 'predictors' to refer to the columns of the fixed-effect design matrix (which may include interactions with depth), and use 'covariates' or 'input variables' to refer to the original spatial covariates (without interactions with depth). If a model includes predictors based on a categorical input variable of three classes (without depth interactions), then removing this single input variable from the model would reduce the number of predictors by two.
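As an illustrative sketch of such predictors, the snippet below builds one interval-averaged row of a design matrix with interactions between spatial covariates and depth up to a quadratic term (the quadratic order is an assumption made for the example). Because the covariates vary in space only, averaging a product such as d·w over a depth interval reduces to w times the interval average of d. The function names are hypothetical.

```python
import numpy as np

def depth_averages(u, l):
    """Averages of 1, d and d^2 over the depth interval [u, l], with l > u."""
    mean_d = (u + l) / 2.0
    mean_d2 = (u**2 + u * l + l**2) / 3.0   # (l^3 - u^3) / (3 (l - u))
    return np.array([1.0, mean_d, mean_d2])

def design_row(u, l, w):
    """Interval-averaged design row: [1, d, d^2], then each covariate in w
    multiplied by the interval averages of [1, d, d^2]."""
    base = depth_averages(u, l)
    w = np.asarray(w, dtype=float)
    return np.concatenate([base, np.outer(w, base).ravel()])

# One sampled horizon from 0 to 0.3 m with two hypothetical covariate values.
row = design_row(0.0, 0.3, [1.5, -0.2])
```

Note that the interval average of d² over 0 to 0.3 m is 0.03, whereas squaring the midpoint gives 0.0225; representing an interval by its midpoint therefore changes the depth terms of the design matrix as well as the covariances.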
We begin with a large pool of potential spatial covariates, from which we initially remove redundant (highly correlated) covariates. Following Bishop et al. (2015), we identify pairs of
15
and variance:
1
16
4. Case study
Fig. 1. The study area in the Murray–Darling basin, eastern Australia. Coordinates are relative to an origin south west of the study area. The fifty estimation data profiles are shown by crosses and the five numbered validation data profiles by open circles. Note the close proximity of four of the validation locations to data points, so that their symbols overlap.
mobilisation from landscapes, and its spatial variability, are vital for
salinity control and effective management of the land. Many soil variables affect the release of salt into waterways; here we focus on soil
texture, in particular clay content.
Soil cores were collected in 2013 from 55 locations across the study area (Fig. 1). Each soil core was taken to a depth of 1 m (or less where shallower soil did not permit this), and divided into horizons. All locations provided between three and six horizons, giving a total of 192 samples available for laboratory analysis. Amongst a number of soil properties, clay content was measured using the hydrometer method. Fig. 2 shows histograms summarizing these data at three depths in the profile (based on the midpoints of sampling intervals, d, for display purposes only): upper (d ≤ 0.15 m), middle (0.15 < d ≤ 0.5 m), and lower (d > 0.5 m). The data at each depth appear reasonably symmetric, and we proceed with analysis under the assumption that clay content is a Gaussian random variable. The right-hand panel of this figure shows all of the data plotted with the lengths of lines indicating the sampling intervals. This shows the range of sampling intervals in the dataset, and the increasing trend in clay content down the profile. Thicker bars occur where multiple data are very similar.
The aim of this study is to model the clay content data in all samples and map it over the study area for any required depth interval of interest. As detailed previously, we choose the three depths of 0–10 cm, 50–60 cm, and 90–100 cm for mapping, to illustrate differences in the spatial distribution of clay content down the soil profile. For modelling and mapping, we utilize spatial covariate data on 29 covariates, as listed in Table 1. For further details of these covariates, we refer to Bishop et al. (2015). We acknowledge here that this dataset of just 55 spatial locations does not provide the sternest test of the methodology. At this stage, the aim of the case study is more to illustrate the potential of the methodology to deal with a dataset of various sampled depth
Fig. 2. Histogram plots (left) of the clay content data in the upper (sample midpoint d ≤ 0.15 m), middle (0.15 m < d ≤ 0.5 m), and lower (d > 0.5 m) soil profiles. The right-hand plot shows all data plotted with vertical lines representing the sample depth intervals.
Table 1
Summary information of the available covariates.
Category
Spatial
support/scale
Source
Spatial coordinates
Digital terrain attributes
n/a
90-m raster
n/a
NASA
105-m raster
ABARES
Geoscience Australia
Radiometrics
Land use
Geology
(17)
where $\hat{y}_{ij}$ is the prediction of $y_{ij}$ (the validation datum for the jth of the $n_i$ layers of validation profile i), $u_{ij}$ and $l_{ij}$ are the bounds of its sampled depth interval, $u_{i1}$ is the upper bound of the top layer for profile i (0 in each case), and $l_{in_i}$ is the lower bound of the bottom layer for profile i.
5. Results
The full pool of spatial covariates (Table 1) consisted of 23 continuous variables (horizontal coordinates, DTM-derived and radiometrics variables) and two categorical variables (land use and geology, both with three classes). Highly-correlated covariates were removed according to the algorithm detailed in Section 3.2. This resulted in five of the 23 continuous covariates being removed from the predictor pool (total dose, Th and U from the radiometrics variables because of high correlations with dose rate; ratio U:Th because of its high correlation with ratio U²:Th; length-slope factor due to its correlation with slope). A full design matrix was formulated for IAK based on interactions between these spatial covariates and both depth, d, and d². (Recall that interactions between the spatial covariates themselves were not considered.) This matrix contained 174 rows (for the 50 spatial locations with data for three to six horizons at each) and 69 columns: 22 (18 columns for continuous predictors + 4 columns for categorical predictors) multiplied by 3 (no interaction, interaction with d, interaction with d²) plus 3 (terms for the constant, d and d²).
5.2. Spatial covariance model selection
Twenty-one different covariance models were fitted using the full design matrices to give fixed effects. Exponential and Gaussian spatial covariance models were compared through their AICs, with different-ordered polynomials of d used to give the nugget and spatial standard deviations (Table 2). A pure nugget model, representing no spatial correlation, was also compared. The results suggest that the Gaussian model was best, with orders r0 = 2 and r1 = 1 selected for the nugget and spatial standard deviations, respectively (the black and grey dotted lines in Fig. 3). Both components were smallest at the top of the profile. The nugget, represented by a quadratic function of depth, reached a maximum at around 60 cm before decreasing, whilst the spatial standard deviation continued to increase down the profile. The selected Gaussian model gave a smaller AIC than the three pure nugget models, indicating that there is some spatial correlation in the residuals from the fitted trend model. We work with the Gaussian spatial correlation model with r0 = 2 and r1 = 1 for IAK herein.
Table 2
AICs of the 18 tested covariance models; r1 is the order of the polynomial used to model the square root of the spatial variance, r0 is the order for the square root of the nugget. The selected model is marked with an asterisk.
Nugget   Pure      Exponential                   Gaussian
r0       nugget    r1 = 0   r1 = 1   r1 = 2      r1 = 0   r1 = 1   r1 = 2
0        691.9     680.5    664.4    661.4       679.2    664.5    661.2
1        666.1     665.7    658.1    657.9       661.7    657.0    657.8
2        662.6     664.3    656.6    657.8       660.1    654.2*   655.8
5.3. Fixed-effect model selection
Wald tests were used to remove predictors from the full design matrix that did not contain useful information for predicting the target variable. The procedure presented in Section 3.2 resulted in a design matrix with 21 columns (reduced from 69), based on 8 different spatial variables (4 digital terrain model variables, 3 radiometrics variables, and the geological classes, Fig. 4). We report the fitted fixed-effect model, applied for three depth intervals: 0–10 cm, 50–60 cm and 90–100 cm (Table 3a). To apply the model for depth interval [u, l], where the model contains the three terms, β1 elev, β2 d elev and β3 d² elev (where elev is the elevation), for example, we present the coefficient
The residuals from the IAK fixed-effect function were modelled with a Gaussian correlation model (with effective range of correlation 12 km) with nugget effect. The variance parameters of this model depended on depth, as shown in Fig. 3 (the black and grey solid lines, for the nugget and spatial standard deviations, respectively); the functions are very similar to those fitted based on the full fixed-effect design matrix. The increasing standard deviations down the profile reflect larger uncertainty with depth. The vertical correlation model had an effective range of 70 cm.
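As a small sketch of what an effective range implies for the vertical model, the snippet below assumes the exponential correlation function of Eq. (A1) and the rule of thumb stated in Section 2 (the distance parameter is approximately one third of the effective range). The function name is illustrative, and the exact fitted value of the distance parameter is not reported here, so `a_d` is only an approximation.

```python
import math

def exp_corr(h, a):
    """Exponential correlation at separation h for distance parameter a."""
    return math.exp(-h / a)

# Approximate distance parameter implied by a 70 cm effective range.
a_d = 70.0 / 3.0                       # cm

corr_at_range = exp_corr(70.0, a_d)    # correlation remaining at 70 cm
corr_10cm = exp_corr(10.0, a_d)        # correlation between points 10 cm apart
```

At the effective range the correlation has decayed to exp(-3), roughly 5% of its value at zero separation, which is the usual convention for an exponential model.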
5.4. Validation
Fig. 3. Fitted functions for the nugget (f0(d), black lines) and spatial (f1(d), grey lines) standard deviations (see Eqs. (6) and (7)). Dotted lines are the functions fitted with the full fixed-effect design matrix; solid lines are fitted after removal of insignificant fixed effects.
Fig. 4. The eight selected covariates. aacn: altitude above channel network, mrrtf: multi-resolution ridge-top flatness index, wndx: weathering intensity index.
Table 3
Coefficients and standardized coefficients of the fitted fixed-effect function, presented for three illustrative depths. The three largest standardized coefficients are highlighted for each depth.
Depth, m
Int
(a)
00.1
0.50.6
0.91.0
dosef
wndx
(b)
00.1
0.50.6
0.91.0
5.87
5.87
5.87
elev
mrrtf
0.0121
0.0193
0.207
1.81
2.89
31.0
0.608
4.64
8.84
1.08
8.22
15.6
aacn
0.0112
0.365
0.666
slope
g1
g2
46.1
132
275
5.92
30.6
50.3
14.5
20.0
24.3
5.92
30.6
50.3
14.5
20.0
24.3
0.373
12.2
22.3
3.41
9.79
20.4
Int: intercept, mean for reference geological class, 'other'; K: potassium; dosef: dose rate; wndx: weathering intensity index; elev: elevation; mrrtf: multi-resolution ridge-top flatness index; aacn: altitude above channel network; slope: slope; g1: mean for felsic geological class in comparison to reference class, 'other'; g2: mean for mafic geological class in comparison to reference class, 'other'.
which guarantees the agreement between predictions at different supports demonstrated by the IAK approach in this study. This coherence
would seem to be a desirable property of a method that is to be
employed for prediction over various scales (in terms of the widths
of target interval).
5.5. Mapping
Fig. 6 (left) shows maps of the predicted clay contents at depths of 0–10 cm, 50–60 cm and 90–100 cm, and Fig. 6 (right) shows the associated widths of the 95% prediction intervals. For presentation, values less than 0% or greater than 100% were truncated to these limits. The topsoil map is relatively homogeneous for large parts of the study area. It was suggested (Table 3) that geology and weathering intensity index were the most important predictors for defining the spatial distribution in the 0–10-cm depth interval, and the effects of the geology can clearly be seen in the resultant map. Lower in the profile, the predicted clay contents are more variable, with more extreme predictions; geology, potassium and the radiometrics dose rate were suggested (Table 3) as being the most important for the 90–100-cm depth. The maps of prediction interval widths show the smallest uncertainties in the topsoil and the largest uncertainties lower in the profile.
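A minimal sketch of how such interval widths might be computed from a prediction and its variance is given below, assuming Gaussian prediction errors (so that a 95% interval is the prediction plus or minus 1.96 standard deviations). Whether the truncation to 0–100% was applied before or after computing the widths is not stated, so clipping the limits first is an assumption, as are the function name and the example values.

```python
import numpy as np

def prediction_interval(mean, var, z=1.96):
    """Lower limit, upper limit and width of a prediction interval for a
    percentage variable, with limits truncated to the range [0, 100]."""
    sd = np.sqrt(np.asarray(var, dtype=float))
    mean = np.asarray(mean, dtype=float)
    lower = np.clip(mean - z * sd, 0.0, 100.0)
    upper = np.clip(mean + z * sd, 0.0, 100.0)
    return lower, upper, upper - lower

# Hypothetical predicted clay contents (%) and prediction variances (%^2).
lower, upper, width = prediction_interval([5.0, 50.0], [25.0, 100.0])
```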
6. Discussion
We have presented a framework for analysis of soil profile data, in which averages of the soil property over depth intervals (horizons or fixed depth intervals), rather than observations at exact points in the profile, are measured. The increment-averaged kriging (IAK) approach, built on the methodology of area-to-point kriging (Kyriakidis, 2004), accounts properly for the vertical support of the data. This is in contrast to approaches built on assuming samples were collected from interval midpoints that ignore the thicknesses of
Fig. 5. Validation data (shaded bars) and predictions for the five validation locations (left to right). Continuous lines show predictions of 1-cm averages. Solid and dashed vertical lines show predictions and 95% prediction intervals, respectively, at the support of the validation data.
Fig. 6. Predictions (left) and widths of 95% prediction intervals (right) for clay content (%) at depths 0–10 cm (top), 50–60 cm (middle), and 90–100 cm (bottom).
and dene:
0
Si I ; I0 ; i ; d ad Hlu0 H l u
0
P 5 min l ; l ; i i P 5 maxu ; u0 ; i i
"
( 0 )
l l
0
0
2
ad P 2 max l ; l ; i P 2 min l ; l ; i exp
ad
0
ju uj
P 2 maxu ; u0 ; i P 2 minu ; u0 ; i exp
ad
( 0
)
l
u
0
0
P 2 max u ; l ; i P 2 min u ; l ; i exp
ad
#
0
ju lj
0
0
P 2 maxl ; u ; i P 2 minl ; u ; i exp
ad
A2
7. Conclusions
3
i0 i1 ad 2 i2 a2d
5
i1 2 i2 ad
i2
3
i0 i1 ad 2 i2 a2d
5
i 4
i1 2 i2 ad
i2
A3
2
2
6
6
6
i i 6
6
6
4
Acknowledgements
A4
i0
i0 i0
i0
i0 i1 i1 i0 =2 i0 i1 i1 i0 =2
i0 i2 i1 i1 i2 i0 =3 i0 i2 i1 i1 i2 i0 =3
i1 i2 i2 i1 =4 i1 i2 i2 i1 =4
i2 i2 =5 i2 i2 =5
3
7
7
7
7
7
7
5
A5
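$$\mathrm{Cov}\left[Y(x, I), Y(x', I')\right] = \begin{cases} \dfrac{1}{(l - u)(l' - u')}\displaystyle\sum_{i=0}^{N_m} S_i(I, I'; \phi_i, \theta_d) & \text{if } h_x = 0 \\ \dfrac{1}{(l - u)(l' - u')}\displaystyle\sum_{i=1}^{N_m} S_i(I, I'; \phi_i, \theta_d)\,\rho_{x,i}(h_x; \theta_{x,i}) & \text{otherwise.} \end{cases} \tag{A6}$$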
Supplementary material
Appendix A
Here we present an expression for the average covariance for Y(x, I) and Y(x′, I′), where I = [u, l] is the depth interval (of an observation or prediction) at horizontal (point) location x, and I′ = [u′, l′] is the depth interval at x′. We assume that the point covariances (i.e. for Y(x, d) and Y(x′, d′)) are given by Eq. (8), with order-2 polynomial functions (i.e. quadratic equations) of depth for the standard deviations associated with the nugget and spatial variances (see Eqs. (6) and (7)). We also assume that $\rho_d(h_d; \theta_d)$ is the exponential correlation function:
$$\rho_d(h_d; \theta_d) = \rho_d(h_d; a_d) = \exp\left(-\frac{h_d}{a_d}\right). \tag{A1}$$
Breidt, F.J., Hsu, N.-J., Ogle, S., 2007. Semiparametric mixed models for increment-averaged data with application to carbon sequestration in agricultural soils. J. Am. Stat. Assoc. 102, 803–812.
Brus, D.J., Kempen, B., Heuvelink, G.B.M., 2011. Sampling for validation of digital soil maps. Eur. J. Soil Sci. 62, 394–407.
Clifford, D., Dobbie, M.J., Searle, R., 2014. Non-parametric imputation of properties for soil profiles with sparse observations. Geoderma 232, 10–18.
De Iaco, S., Myers, D.E., Posa, D., 2001. Space–time analysis using a general product–sum model. Stat. Probab. Lett. 52, 21–28.
De Iaco, S., Myers, D.E., Posa, D., 2011. Strict positive definiteness of a product of covariance functions. Commun. Stat. 40, 4400–4408.
Department of Environment and Climate Change NSW, 2009. Salinity Audit: Upland catchments of the New South Wales Murray–Darling Basin. Available at: www.environment.nsw.gov.au/resources/salinity/09153SalinityAudit.pdf.
Eidsvik, J., Shaby, B.A., Reich, B.J., Wheeler, M., Niemi, J., 2014. Estimation and prediction in spatial models with block composite likelihoods. J. Comput. Graph. Stat. 23, 295–315.
Gelman, A., 2008. Scaling regression inputs by dividing by two standard deviations. Stat. Med. 27, 2865–2873.
Goovaerts, P., 2008. Kriging and semivariogram deconvolution in the presence of irregular geographical units. Math. Geosci. 40, 101–128.
Haskard, K.A., Lark, R.M., 2009. Modelling non-stationary variance of soil properties by tempering an empirical spectrum. Geoderma 153, 18–28.
Hengl, T., de Jesus, J.M., MacMillan, R.A., Batjes, N.H., Heuvelink, G.B.M., Ribeiro, E., Samuel-Rosa, A., Kempen, B., Leenaars, J.G.B., Walsh, M.G., Gonzalez, M.R., 2014. SoilGrids1km – global soil information based on automated mapping. PLoS ONE 9, e105992.
Heuvelink, G.B.M., 2014. Uncertainty quantification of GlobalSoilMap products. In: Arrouays, D., McKenzie, N.J., Hempel, J.W., Richer de Forges, A.C., McBratney, A.B. (Eds.), GlobalSoilMap: basis of the global spatial soil information system.
Heuvelink, G.B.M., Griffith, D.A., 2010. Space–time geostatistics for geography: a case study of radiation monitoring across parts of Germany. Geogr. Anal. 42, 161–179.
Kempen, B., Brus, D.J., Stoorvogel, J.J., 2011. Three-dimensional mapping of soil organic matter content using soil type-specific depth functions. Geoderma 162, 107–123.
Kerry, R., Goovaerts, P., Rawlins, B.G., Marchant, B.P., 2012. Disaggregation of legacy soil data using area to point kriging for mapping soil organic carbon at the regional scale. Geoderma 170, 347–358.
Kyriakidis, P.C., 2004. A geostatistical framework for area-to-point spatial interpolation. Geogr. Anal. 36, 259–289.
Kyriakidis, P.C., Yoo, E.-H., 2005. Geostatistical prediction and simulation of point values from areal data. Geogr. Anal. 37, 124–151.
Lacarce, E., Saby, N.P.A., Martin, M.P., Marchant, B.P., Boulonne, L., Meersmans, J., Jolivet, C., Bispo, A., Arrouays, D., 2012. Mapping soil Pb stocks and availability in mainland France combining regression trees with robust geostatistics. Geoderma 170, 359–368.
Lark, R.M., 2000. Estimating variograms of soil properties by the method-of-moments and maximum likelihood. Eur. J. Soil Sci. 51, 717–728.
Lark, R.M., 2009. Kriging a soil variable with a simple nonstationary variance model. J. Agric. Biol. Environ. Stat. 14, 301–321.
Lark, R.M., Bishop, T.F.A., 2007. Cokriging particle size fractions of the soil. Eur. J. Soil Sci. 58, 763–774.
Lark, R.M., Cullis, B.R., Welham, S.J., 2006. On spatial prediction of soil properties in the presence of a spatial trend: the empirical best linear unbiased predictor (E-BLUP) with REML. Eur. J. Soil Sci. 57, 787–799.
Malone, B.P., McBratney, A.B., Minasny, B., Laslett, G.M., 2009. Mapping continuous depth functions of soil carbon storage and available water capacity. Geoderma 154, 138–152.
Malone, B.P., McBratney, A.B., Minasny, B., 2011a. Empirical estimates of uncertainty for mapping continuous depth functions of soil attributes. Geoderma 160, 614–626.
Malone, B.P., de Gruijter, J.J., McBratney, A.B., Minasny, B., Brus, D.J., 2011b. Using additional criteria for measuring the quality of predictions and their uncertainties in a digital soil mapping framework. Soil Sci. Soc. Am. J. 75, 1032–1043.
Marchant, B.P., Newman, S., Corstanje, R., Reddy, K.R., Osborne, T.Z., Lark, R.M., 2009. Spatial monitoring of a non-stationary soil property: phosphorus in a Florida water conservation area. Eur. J. Soil Sci. 60, 757–769.
Martin, M.P., Orton, T.G., Lacarce, E., Meersmans, J., Saby, N.P.A., Paroissien, J.B., Jolivet, C., Boulonne, L., Arrouays, D., 2014. Evaluation of modelling approaches for predicting the spatial distribution of soil organic carbon stocks at the national scale. Geoderma 223–225, 97–107.
Minty, B., Franklin, R., Milligan, P., Richardson, M., Wilford, J., 2009. The radiometric map of Australia. Explor. Geophys. 40, 325–333.
Orton, T.G., Pringle, M.J., Page, K.L., Dalal, R.C., Bishop, T.F.A., 2014. Spatial prediction of soil organic carbon stock using a linear model of coregionalisation. Geoderma 230–231, 119–130.
Orton, T.G., Pringle, M.J., Allen, D.E., Dalal, R.C., Bishop, T.F.A., in press. A geostatistical method to account for the number of aliquots in composite samples for normal and lognormal variables. Eur. J. Soil Sci. http://dx.doi.org/10.1111/ejss.12297.
Patterson, H.D., Thompson, R., 1971. Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545–554.
Poggio, L., Gimona, A., 2014. National scale 3D modelling of soil organic carbon stocks with uncertainty propagation – an example from Scotland. Geoderma 232, 284–299.
Schielzeth, H., 2010. Simple means to improve the interpretability of regression coefficients. Methods Ecol. Evol. 1, 103–113.
Schirrmann, M., Herbst, R., Wagner, P., Gebbers, R., 2012. Area-to-point kriging of soil phosphorus composite samples. Commun. Soil Sci. Plant Anal. 43, 1024–1041.
Stein, M.L., 2005. Space–time covariance functions. J. Am. Stat. Assoc. 100, 310–321.
Stein, M.L., Chi, Z., Welty, L.J., 2004. Approximating likelihoods for large spatial datasets. J. R. Stat. Soc. Ser. B 66, 275–296.
Truong, P.N., Heuvelink, B.M., Pebesma, E., 2014. Bayesian area-to-point kriging using expert knowledge as informative priors. Int. J. Appl. Earth Obs. Geoinf. 30, 128–138.
Veronesi, F., Corstanje, R., Mayr, T., 2012. Mapping soil compaction in 3D with depth functions. Soil Tillage Res. 124, 111–118.
Webster, R., Oliver, M.A., 2001. Geostatistics for environmental scientists. John Wiley & Sons, Chichester, UK.