Title: | Geostatistical Modelling of Spatially Referenced Prevalence Data |
---|---|
Description: | Provides functions for both likelihood-based and Bayesian analysis of spatially referenced prevalence data. For a tutorial on the use of the R package, see Giorgi and Diggle (2017) <doi:10.18637/jss.v078.i08>. |
Authors: | Emanuele Giorgi, Peter J. Diggle |
Maintainer: | Emanuele Giorgi <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.5.4 |
Built: | 2025-03-01 03:51:07 UTC |
Source: | https://github.com/giorgilancs/prevmap |
This function computes the multiplicative constant used to adjust the value of sigma2
in the low-rank approximation of a Gaussian process.
adjust.sigma2(knots.dist, phi, kappa)
adjust.sigma2(knots.dist, phi, kappa)
knots.dist |
a matrix of the distances between the observed coordinates and the spatial knots. |
phi |
scale parameter of the Matern covariance function. |
kappa |
shape parameter of the Matern covariance function. |
Let denote the
by
matrix of the distances between the
observed coordinates and
pre-defined spatial knots. This function computes the following quantity
where is the Matern kernel (see
matern.kernel
) and is the distance between the
-th sampled location and the
-th spatial knot.
A value corresponding to the adjustment factor for sigma2
.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Plots the autocorrelogram for the posterior samples of the model parameters and spatial random effects.
autocor.plot(object, param, component.beta = NULL, component.S = NULL)
autocor.plot(object, param, component.beta = NULL, component.S = NULL)
object |
an object of class 'Bayes.PrevMap'. |
param |
a character indicating for which component of the model the autocorrelation plot is required: |
component.beta |
if |
component.S |
if |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function performs Bayesian estimation for a geostatistical binary probit model. It also allows to specify a two-levels model so as to include individual-level and household-level (or any other unit comprising a group of individuals, e.g. village, school, compound, etc...) variables.
binary.probit.Bayes( formula, coords, data, ID.coords, control.prior, control.mcmc, kappa, low.rank = FALSE, knots = NULL, messages = TRUE )
binary.probit.Bayes( formula, coords, data, ID.coords, control.prior, control.mcmc, kappa, low.rank = FALSE, knots = NULL, messages = TRUE )
formula |
an object of class |
coords |
an object of class |
data |
a data frame containing the variables in the model. |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
control.prior |
output from |
control.mcmc |
output from |
kappa |
value for the shape parameter of the Matern covariance function. |
low.rank |
logical; if |
knots |
if |
messages |
logical; if |
This function performs Bayesian estimation for the parameters of the geostatistical binary probit model. Let and
denote the indices of the
-th household and
-th individual within that household. The response variable
is a binary indicator taking value 1 if the individual has been tested positive for the disease of interest and 0 otherwise. Conditionally on a zero-mean stationary Gaussian process
,
are mutually independent Bernoulli variables with probit link function
, i.e.
where is a vector of covariates, both at individual- and household-level, with associated regression coefficients
. The Gaussian process
has isotropic Matern covariance function (see
matern
) with variance sigma2
, scale parameter phi
and shape parameter kappa
.
Priors definition. Priors can be defined through the function control.prior
. The hierarchical structure of the priors is the following. Let be the vector of the covariance parameters
c(sigma2,phi)
; each component of has independent priors that can be freely defined by the user. However, in
control.prior
uniform and log-normal priors are also available as default priors for each of the covariance parameters. The vector of regression coefficients beta
has a multivariate Gaussian prior with mean beta.mean
and covariance matrix beta.covar
.
Updating regression coefficents and random effects using auxiliary variables. To update and
, we use an auxiliary variable technique based on Rue and Held (2005). Let
denote a set of random variables that conditionally on
and
, are mutually independent Gaussian with mean
and unit variance. Then,
if
and
otherwise. Using this representation of the model, we use a Gibbs sampler to simulate from the full conditionals of
,
and
. See Section 4.3 of Rue and Held (2005) for more details.
Updating the covariance parameters with a Metropolis-Hastings algorithm. In the MCMC algorithm implemented in binary.probit.Bayes
, the transformed parameters
are independently updated using a Metropolis Hastings algorithm. At the -th iteration, a new value is proposed for each parameter from a univariate Gaussian distrubion with variance
. This is tuned using the following adaptive scheme
where is the acceptance rate at the
-th iteration, 0.45 is the optimal acceptance rate for a univariate Gaussian distribution, whilst
and
are pre-defined constants. The starting values
for each of the parameters
and
can be set using the function
control.mcmc.Bayes
through the arguments h.theta1
, h.theta2
and h.theta3
. To define values for and
, see the documentation of
control.mcmc.Bayes
.
Low-rank approximation.
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process might be computationally beneficial. Let
and
denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then
is approximated as
, where
are zero-mean mutually independent Gaussian variables with variance
sigma2
and is the isotropic Matern kernel (see
matern.kernel
). Since the resulting approximation is no longer a stationary process (but only approximately), sigma2
may take very different values from the actual variance of the Gaussian process to approximate. The function adjust.sigma2
can then be used to (approximately) explore the range for sigma2
. For example if the variance of the Gaussian process is 0.5
, then an approximate value for sigma2
is 0.5/const.sigma2
, where const.sigma2
is the value obtained with adjust.sigma2
.
An object of class "Bayes.PrevMap".
The function summary.Bayes.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: matrix of the posterior samples of the model parameters.
S
: matrix of the posterior samples for each component of the random effect.
const.sigma2
: vector of the values of the multiplicative factor used to adjust the values of sigma2
in the low-rank approximation.
y
: binary observations.
D
: matrix of covariarates.
coords
: matrix of the observed sampling locations.
kappa
: shape parameter of the Matern function.
ID.coords
: set of ID values defined through the argument ID.coords
.
knots
: matrix of spatial knots used in the low-rank approximation.
h1
: vector of values taken by the tuning parameter h.theta1
at each iteration.
h2
: vector of values taken by the tuning parameter h.theta2
at each iteration.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Rue, H., Held, L. (2005). Gaussian Markov Random Fields: Theory and Applications. Chapman & Hall, London.
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
control.mcmc.Bayes
, control.prior
,summary.Bayes.PrevMap
, matern
, matern.kernel
, create.ID.coords
.
This function performs Bayesian estimation for a geostatistical binomial logistic model.
binomial.logistic.Bayes( formula, units.m, coords, data, ID.coords = NULL, control.prior, control.mcmc, kappa, low.rank = FALSE, knots = NULL, messages = TRUE, mesh = NULL, SPDE = FALSE )
binomial.logistic.Bayes( formula, units.m, coords, data, ID.coords = NULL, control.prior, control.mcmc, kappa, low.rank = FALSE, knots = NULL, messages = TRUE, mesh = NULL, SPDE = FALSE )
formula |
an object of class |
units.m |
an object of class |
coords |
an object of class |
data |
a data frame containing the variables in the model. |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
control.prior |
output from |
control.mcmc |
output from |
kappa |
value for the shape parameter of the Matern covariance function. |
low.rank |
logical; if |
knots |
if |
messages |
logical; if |
mesh |
an object obtained as result of a call to the function |
SPDE |
logical; if |
This function performs Bayesian estimation for the parameters of the geostatistical binomial logistic model. Conditionally on a zero-mean stationary Gaussian process and mutually independent zero-mean Gaussian variables
with variance
tau2
, the linear predictor assumes the form
where is a vector of covariates with associated regression coefficients
. The Gaussian process
has isotropic Matern covariance function (see
matern
) with variance sigma2
, scale parameter phi
and shape parameter kappa
.
Priors definition. Priors can be defined through the function control.prior
. The hierarchical structure of the priors is the following. Let be the vector of the covariance parameters
c(sigma2,phi,tau2)
; then each component of has independent priors freely defined by the user. However, in
control.prior
uniform and log-normal priors are also available as default priors for each of the covariance parameters. To remove the nugget effect , no prior should be defined for
tau2
. Conditionally on sigma2
, the vector of regression coefficients beta
has a multivariate Gaussian prior with mean beta.mean
and covariance matrix sigma2*beta.covar
, while in the low-rank approximation the covariance matrix is simply beta.covar
.
Updating the covariance parameters with a Metropolis-Hastings algorithm. In the MCMC algorithm implemented in binomial.logistic.Bayes
, the transformed parameters
are independently updated using a Metropolis Hastings algorithm. At the -th iteration, a new value is proposed for each from a univariate Gaussian distrubion with variance
that is tuned using the following adaptive scheme
where is the acceptance rate at the
-th iteration, 0.45 is the optimal acceptance rate for a univariate Gaussian distribution, whilst
and
are pre-defined constants. The starting values
for each of the parameters
,
and
can be set using the function
control.mcmc.Bayes
through the arguments h.theta1
, h.theta2
and h.theta3
. To define values for and
, see the documentation of
control.mcmc.Bayes
.
Hamiltonian Monte Carlo. The MCMC algorithm in binomial.logistic.Bayes
uses a Hamiltonian Monte Carlo (HMC) procedure to update the random effect ; see Neal (2011) for an introduction to HMC. HMC makes use of a postion vector, say
, representing the random effect
, and a momentum vector, say
, of the same length of the position vector, say
. Hamiltonian dynamics also have a physical interpretation where the states of the system are described by the position of a puck and its momentum (its mass times its velocity). The Hamiltonian function is then defined as a function of
and
, having the form
, where
is the conditional distribution of
given the data
, the regression parameters
and covariance parameters
. The system of Hamiltonian equations then defines the evolution of the system in time, which can be used to implement an algorithm for simulation from the posterior distribution of
. In order to implement the Hamiltonian dynamic on a computer, the Hamiltonian equations must be discretised. The leapfrog method is then used for this purpose, where two tuning parameters should be defined: the stepsize
and the number of steps
. These respectively correspond to
epsilon.S.lim
and L.S.lim
in the control.mcmc.Bayes
function. However, it is advisable to let and
take different random values at each iteration of the HCM algorithm so as to account for the different variances amongst the components of the posterior of
. This can be done in
control.mcmc.Bayes
by defning epsilon.S.lim
and L.S.lim
as vectors of two elements, each of which represents the lower and upper limit of a uniform distribution used to generate values for epsilon.S.lim
and L.S.lim
, respectively.
Using a two-level model to include household-level and individual-level information.
When analysing data from household sruveys, some of the avilable information information might be at household-level (e.g. material of house, temperature) and some at individual-level (e.g. age, gender). In this case, the Gaussian spatial process and the nugget effect
are defined at hosuehold-level in order to account for extra-binomial variation between and within households, respectively.
Low-rank approximation.
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process might be computationally beneficial. Let
and
denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then
is approximated as
, where
are zero-mean mutually independent Gaussian variables with variance
sigma2
and is the isotropic Matern kernel (see
matern.kernel
). Since the resulting approximation is no longer a stationary process (but only approximately), sigma2
may take very different values from the actual variance of the Gaussian process to approximate. The function adjust.sigma2
can then be used to (approximately) explore the range for sigma2
. For example if the variance of the Gaussian process is 0.5
, then an approximate value for sigma2
is 0.5/const.sigma2
, where const.sigma2
is the value obtained with adjust.sigma2
.
An object of class "Bayes.PrevMap".
The function summary.Bayes.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: matrix of the posterior samples of the model parameters.
S
: matrix of the posterior samples for each component of the random effect.
const.sigma2
: vector of the values of the multiplicative factor used to adjust the values of sigma2
in the low-rank approximation.
y
: binomial observations.
units.m
: binomial denominators.
D
: matrix of covariarates.
coords
: matrix of the observed sampling locations.
kappa
: shape parameter of the Matern function.
ID.coords
: set of ID values defined through the argument ID.coords
.
knots
: matrix of spatial knots used in the low-rank approximation.
h1
: vector of values taken by the tuning parameter h.theta1
at each iteration.
h2
: vector of values taken by the tuning parameter h.theta2
at each iteration.
h3
: vector of values taken by the tuning parameter h.theta3
at each iteration.
acc.beta.S
: empirical acceptance rate for the regression coefficients and random effects (only if SPDE=TRUE
).
mesh
: the mesh used in the SPDE approximation.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Neal, R. M. (2011) MCMC using Hamiltonian Dynamics, In: Handbook of Markov Chain Monte Carlo (Chapter 5), Edited by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng Chapman & Hall / CRC Press.
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
control.mcmc.Bayes
, control.prior
,summary.Bayes.PrevMap
, matern
, matern.kernel
, create.ID.coords
.
This function performs Monte Carlo maximum likelihood (MCML) estimation for the geostatistical binomial logistic model.
binomial.logistic.MCML( formula, units.m, coords, times = NULL, data, ID.coords = NULL, par0, control.mcmc, kappa, kappa.t = NULL, sst.model = NULL, fixed.rel.nugget = NULL, start.cov.pars, method = "BFGS", low.rank = FALSE, SPDE = FALSE, knots = NULL, mesh = NULL, messages = TRUE, plot.correlogram = TRUE )
binomial.logistic.MCML( formula, units.m, coords, times = NULL, data, ID.coords = NULL, par0, control.mcmc, kappa, kappa.t = NULL, sst.model = NULL, fixed.rel.nugget = NULL, start.cov.pars, method = "BFGS", low.rank = FALSE, SPDE = FALSE, knots = NULL, mesh = NULL, messages = TRUE, plot.correlogram = TRUE )
formula |
an object of class |
units.m |
an object of class |
coords |
an object of class |
times |
an object of class |
data |
a data frame containing the variables in the model. |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
par0 |
parameters of the importance sampling distribution: these should be given in the following order |
control.mcmc |
output from |
kappa |
fixed value for the shape parameter of the Matern covariance function. |
kappa.t |
fixed value for the shape parameter of the Matern covariance function in the separable double-Matern spatio-temporal model. |
sst.model |
a character value that specifies the spatio-temporal correlation function.
Deafault is |
fixed.rel.nugget |
fixed value for the relative variance of the nugget effect; |
start.cov.pars |
a vector of length two with elements corresponding to the starting values of |
method |
method of optimization. If |
low.rank |
logical; if |
SPDE |
logical; if |
knots |
if |
mesh |
an object obtained as result of a call to the function |
messages |
logical; if |
plot.correlogram |
logical; if |
This function performs parameter estimation for a geostatistical binomial logistic model. Conditionally on a zero-mean stationary Gaussian process and mutually independent zero-mean Gaussian variables
with variance
tau2
, the observations y
are generated from a binomial distribution with probability and binomial denominators
units.m
. A canonical logistic link is used, thus the linear predictor assumes the form
where is a vector of covariates with associated regression coefficients
. The Gaussian process
has isotropic Matern covariance function (see
matern
) with variance sigma2
, scale parameter phi
and shape parameter kappa
.
In the binomial.logistic.MCML
function, the shape parameter is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2
, can also be fixed through the argument fixed.rel.nugget
; if fixed.rel.nugget=NULL
, then the relative variance of the nugget effect is also included in the estimation.
Monte Carlo Maximum likelihood.
The Monte Carlo maximum likelihood method uses conditional simulation from the distribution of the random effect given the data
y
, in order to approximate the high-dimensiional intractable integral given by the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization algorithm which uses analytic epression for computation of the gradient vector and Hessian matrix. The functions used for numerical optimization are maxBFGS
(method="BFGS"
), from the maxLik package, and nlminb
(method="nlminb"
).
Using a two-level model to include household-level and individual-level information.
When analysing data from household sruveys, some of the avilable information information might be at household-level (e.g. material of house, temperature) and some at individual-level (e.g. age, gender). In this case, the Gaussian spatial process and the nugget effect
are defined at hosuehold-level in order to account for extra-binomial variation between and within households, respectively.
Low-rank approximation.
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process might be computationally beneficial. Let
and
denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then
is approximated as
, where
are zero-mean mutually independent Gaussian variables with variance
sigma2
and is the isotropic Matern kernel (see
matern.kernel
). Since the resulting approximation is no longer a stationary process (but only approximately), the parameter sigma2
is then multiplied by a factor constant.sigma2
so as to obtain a value that is closer to the actual variance of .
An object of class "PrevMap".
The function summary.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: estimates of the model parameters; use the function coef.PrevMap
to obtain estimates of covariance parameters on the original scale.
covariance
: covariance matrix of the MCML estimates.
log.lik
: maximum value of the log-likelihood.
y
: binomial observations.
units.m
: binomial denominators.
D
: matrix of covariates.
coords
: matrix of the observed sampling locations.
method
: method of optimization used.
ID.coords
: set of ID values defined through the argument ID.coords
.
kappa
: fixed value of the shape parameter of the Matern function.
kappa.t
: fixed value for the shape parameter of the Matern covariance function in the separable double-Matern spatio-temporal model.
knots
: matrix of the spatial knots used in the low-rank approximation.
mesh
: the mesh used in the SPDE approximation.
const.sigma2
: adjustment factor for sigma2
in the low-rank approximation.
h
: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm; see Laplace.sampling
, or Laplace.sampling.lr
if a low-rank approximation is used.
samples
: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.
fixed.rel.nugget
: fixed value for the relative variance of the nugget effect.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Christensen, O. F. (2004). Monte carlo maximum likelihood in model-based geostatistics. Journal of Computational and Graphical Statistics 13, 702-718.
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
Laplace.sampling
, Laplace.sampling.lr
, summary.PrevMap
, coef.PrevMap
, matern
, matern.kernel
, control.mcmc.MCML
, create.ID.coords
.
coef
extracts parameters estimates from models fitted with the functions linear.model.MLE
and binomial.logistic.MCML
.
## S3 method for class 'PrevMap' coef(object, ...)
## S3 method for class 'PrevMap' coef(object, ...)
object |
an object of class "PrevMap". |
... |
other arguments. |
coefficients extracted from the model object object
.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
coef
extracts parameters estimates from models fitted with the functions lm.ps.MCML
.
## S3 method for class 'PrevMap.ps' coef(object, ...)
## S3 method for class 'PrevMap.ps' coef(object, ...)
object |
an object of class "PrevMap.ps". |
... |
other arguments. |
a list of coefficients extracted from the model in object
.
Emanuele Giorgi [email protected]
Draws a sample of spatial locations within a spatially continuous polygonal sampling region.
continuous.sample(poly, n, delta, k = 0, rho = NULL)
continuous.sample(poly, n, delta, k = 0, rho = NULL)
poly |
boundary of a polygon. |
n |
number of events. |
delta |
minimum permissible distance between any two events in preliminary sample. |
k |
number of locations in preliminary sample to be replaced by near neighbours of other preliminary sample locations in final sample (must be between 0 and |
rho |
maximum distance between close pairs of locations in final sample. |
To draw a sample of size n
from a spatially continuous region , with the property that the distance between any two sampled locations is at least
delta
, the following algorithm is used.
Step 1. Set and generate a point
uniformly distributed on
.
Step 2. Increase by 1, generate a point
uniformly distributed on
and calculate the minimum,
, of the distances from
to all
.
Step 3. If , increase
by 1 and return to step 2 if
, otherwise stop;
Step 4. If , return to step 2 without increasing
.
Sampling close pairs of points. For some purposes, it is desirable that a spatial sampling scheme include pairs of closely spaced points. In this case, the above algorithm requires the following additional steps to be taken.
Let k
be the required number of close pairs. Choose a value rho
such that a close pair of points will be a pair of points separated by a distance of at most rho
.
Step 5. Set and draw a random sample of size 2 from the integers
, say
;
Step 6. Replace by
, where
is uniformly distributed on the disc with centre
and radius
rho
, increase by 1 and return to step 5 if
, otherwise stop.
A matrix of dimension n
by 2 containing event locations.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
plot.pred.PrevMap
displays contours of predictions obtained from spatial.pred.linear.MLE
, spatial.pred.linear.Bayes
,spatial.pred.binomial.MCML
and spatial.pred.binomial.Bayes
.
## S3 method for class 'pred.PrevMap' contour(x, type = NULL, summary = "predictions", ...)
## S3 method for class 'pred.PrevMap' contour(x, type = NULL, summary = "predictions", ...)
x |
an object of class "pred.PrevMap". |
type |
a character indicating the type of prediction to display: 'prevalence', 'odds', 'logit' or 'probit'. |
summary |
character indicating which summary to display: 'predictions','quantiles', 'standard.errors' or 'exceedance.prob'; default is 'predictions'. If |
... |
further arguments passed to |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function defines the different tuning parameter that are used in the MCMC algorithm for Bayesian inference.
control.mcmc.Bayes( n.sim, burnin, thin, h.theta1 = 0.01, h.theta2 = 0.01, h.theta3 = 0.01, L.S.lim = NULL, epsilon.S.lim = NULL, start.beta = "prior mean", start.sigma2 = "prior mean", start.phi = "prior mean", start.S = "prior mean", start.nugget = "prior mean", c1.h.theta1 = 0.01, c2.h.theta1 = 1e-04, c1.h.theta2 = 0.01, c2.h.theta2 = 1e-04, c1.h.theta3 = 0.01, c2.h.theta3 = 1e-04, linear.model = FALSE, binary = FALSE )
control.mcmc.Bayes( n.sim, burnin, thin, h.theta1 = 0.01, h.theta2 = 0.01, h.theta3 = 0.01, L.S.lim = NULL, epsilon.S.lim = NULL, start.beta = "prior mean", start.sigma2 = "prior mean", start.phi = "prior mean", start.S = "prior mean", start.nugget = "prior mean", c1.h.theta1 = 0.01, c2.h.theta1 = 1e-04, c1.h.theta2 = 0.01, c2.h.theta2 = 1e-04, c1.h.theta3 = 0.01, c2.h.theta3 = 1e-04, linear.model = FALSE, binary = FALSE )
n.sim |
total number of simulations. |
burnin |
initial number of samples to be discarded. |
thin |
value used to retain only evey |
h.theta1 |
starting value of the tuning parameter of the proposal distribution for |
h.theta2 |
starting value of the tuning parameter of the proposal distribution for |
h.theta3 |
starting value of the tuning parameter of the proposal distribution for |
L.S.lim |
an atomic value or a vector of length 2 that is used to define the number of steps used at each iteration in the Hamiltonian Monte Carlo algorithm to update the spatial random effect; if a single value is provided than the number of steps is kept fixed, otherwise if a vector of length 2 is provided the number of steps is simulated at each iteration as |
epsilon.S.lim |
an atomic value or a vector of length 2 that is used to define the stepsize used at each iteration in the Hamiltonian Monte Carlo algorithm to update the spatial random effect; if a single value is provided than the stepsize is kept fixed, otherwise if a vector of length 2 is provided the stepsize is simulated at each iteration as |
start.beta |
starting value for the regression coefficients |
start.sigma2 |
starting value for |
start.phi |
starting value for |
start.S |
starting value for the spatial random effect. |
start.nugget |
starting value for the variance of the nugget effect; default is |
c1.h.theta1 |
value of |
c2.h.theta1 |
value of |
c1.h.theta2 |
value of |
c2.h.theta2 |
value of |
c1.h.theta3 |
value of |
c2.h.theta3 |
value of |
linear.model |
logical; if |
binary |
logical; if |
an object of class "mcmc.Bayes.PrevMap".
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function defines the different tuning parameter that are used in the MCMC algorithm for Bayesian inference using a SPDE approximation for the spatial Gaussian process.
control.mcmc.Bayes.SPDE( n.sim, burnin, thin, h.theta1 = 0.01, h.theta2 = 0.01, start.beta = "prior mean", start.sigma2 = "prior mean", start.phi = "prior mean", start.S = "prior mean", n.iter = 1, h = 1, c1.h.theta1 = 0.01, c2.h.theta1 = 1e-04, c1.h.theta2 = 0.01, c2.h.theta2 = 1e-04 )
control.mcmc.Bayes.SPDE( n.sim, burnin, thin, h.theta1 = 0.01, h.theta2 = 0.01, start.beta = "prior mean", start.sigma2 = "prior mean", start.phi = "prior mean", start.S = "prior mean", n.iter = 1, h = 1, c1.h.theta1 = 0.01, c2.h.theta1 = 1e-04, c1.h.theta2 = 0.01, c2.h.theta2 = 1e-04 )
n.sim |
total number of simulations. |
burnin |
initial number of samples to be discarded. |
thin |
value used to retain only evey |
h.theta1 |
starting value of the tuning parameter of the proposal distribution for |
h.theta2 |
starting value of the tuning parameter of the proposal distribution for |
start.beta |
starting value for the regression coefficients |
start.sigma2 |
starting value for |
start.phi |
starting value for |
start.S |
starting value for the spatial random effect. If not provided the prior mean is used. |
n.iter |
number of iteration of the Newton-Raphson procedure used to compute the mean and coviariance matrix of the Gaussian proposal in the MCMC; defaut is |
h |
tuning parameter for the covariance matrix of the Gaussian proposal. Default is |
c1.h.theta1 |
value of |
c2.h.theta1 |
value of |
c1.h.theta2 |
value of |
c2.h.theta2 |
value of |
an object of class "mcmc.Bayes.PrevMap".
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function defines the options for the MCMC algorithm used in the Monte Carlo maximum likelihood method.
control.mcmc.MCML(n.sim, burnin, thin = 1, h = NULL, c1.h = 0.01, c2.h = 1e-04)
control.mcmc.MCML(n.sim, burnin, thin = 1, h = NULL, c1.h = 0.01, c2.h = 1e-04)
n.sim |
number of simulations. |
burnin |
length of the burn-in period. |
thin |
only every |
h |
tuning parameter of the proposal distribution used in the Langevin-Hastings MCMC algorithm (see |
c1.h |
value of |
c2.h |
value of |
A list with processed arguments to be passed to the main function.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
control.mcmc <- control.mcmc.MCML(n.sim=1000,burnin=100,thin=1,h=0.05) str(control.mcmc)
control.mcmc <- control.mcmc.MCML(n.sim=1000,burnin=100,thin=1,h=0.05) str(control.mcmc)
This function is used to define priors for the model parameters of a Bayesian geostatistical model.
control.prior( beta.mean, beta.covar, log.prior.sigma2 = NULL, log.prior.phi = NULL, log.prior.nugget = NULL, uniform.sigma2 = NULL, log.normal.sigma2 = NULL, uniform.phi = NULL, log.normal.phi = NULL, uniform.nugget = NULL, log.normal.nugget = NULL )
control.prior( beta.mean, beta.covar, log.prior.sigma2 = NULL, log.prior.phi = NULL, log.prior.nugget = NULL, uniform.sigma2 = NULL, log.normal.sigma2 = NULL, uniform.phi = NULL, log.normal.phi = NULL, uniform.nugget = NULL, log.normal.nugget = NULL )
beta.mean |
mean vector of the Gaussian prior for the regression coefficients. |
beta.covar |
covariance matrix of the Gaussian prior for the regression coefficients. |
log.prior.sigma2 |
a function corresponding to the log-density of the prior distribution for the variance |
log.prior.phi |
a function corresponding to the log-density of the prior distribution for the scale parameter of the Matern correlation function; default is |
log.prior.nugget |
optional: a function corresponding to the log-density of the prior distribution for the variance of the nugget effect; default is |
uniform.sigma2 |
a vector of length two, corresponding to the lower and upper limit of the uniform prior on |
log.normal.sigma2 |
a vector of length two, corresponding to the mean and standard deviation of the distribution on the log scale for the log-normal prior on |
uniform.phi |
a vector of length two, corresponding to the lower and upper limit of the uniform prior on |
log.normal.phi |
a vector of length two, corresponding to the mean and standard deviation of the distribution on the log scale for the log-normal prior on |
uniform.nugget |
a vector of length two, corresponding to the lower and upper limit of the uniform prior on |
log.normal.nugget |
a vector of length two, corresponding to the mean and standard deviation of the distribution on the log scale for the log-normal prior on |
a list corresponding the prior distributions for each model parameter.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
See "Priors definition" in the Details section of the binomial.logistic.Bayes
function.
Auxiliary function used by loglik.linear.model
. This function defines whether the profile-loglikelihood should be computed or evaluation of the likelihood is required by keeping the other parameters fixed.
control.profile( phi = NULL, rel.nugget = NULL, fixed.beta = NULL, fixed.sigma2 = NULL, fixed.phi = NULL, fixed.rel.nugget = NULL )
control.profile( phi = NULL, rel.nugget = NULL, fixed.beta = NULL, fixed.sigma2 = NULL, fixed.phi = NULL, fixed.rel.nugget = NULL )
phi |
a vector of the different values that should be used in the likelihood evalutation for the scale parameter |
rel.nugget |
a vector of the different values that should be used in the likelihood evalutation for the relative variance of the nugget effect |
fixed.beta |
a vector for the fixed values of the regression coefficients |
fixed.sigma2 |
value for the fixed variance of the Gaussian process |
fixed.phi |
value for the fixed scale parameter |
fixed.rel.nugget |
value for the fixed relative variance of the nugget effect; |
A list with components named as the arguments.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Creates ID values for the unique set of coordinates.
create.ID.coords(data, coords)
create.ID.coords(data, coords)
data |
a data frame containing the spatial coordinates. |
coords |
an object of class |
a vector of integers indicating the corresponding rows in data
for each distinct coordinate obtained with the unique
function.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
x1 <- runif(5) x2 <- runif(5) data <- data.frame(x1=rep(x1,each=3),x2=rep(x2,each=3)) ID.coords <- create.ID.coords(data,coords=~x1+x2) data[,c("x1","x2")]==unique(data[,c("x1","x2")])[ID.coords,]
x1 <- runif(5) x2 <- runif(5) data <- data.frame(x1=rep(x1,each=3),x2=rep(x2,each=3)) ID.coords <- create.ID.coords(data,coords=~x1+x2) data[,c("x1","x2")]==unique(data[,c("x1","x2")])[ID.coords,]
This binomial data-set was simulated by generating a zero-mean Gaussian process over a 30 by 30 grid covering the unit square. The parameters used in the simulation are sigma2=1
, phi=0.15
and kappa=2
. The nugget effect was not included, hence tau2=0
.
The variables are as follows:
y binomial observations.
units.m binomial denominators.
x1 horizontal coordinates.
x2 vertical coordinates.
S simulated values of the Gaussian process.
data(data_sim)
data(data_sim)
A data frame with 900 rows and 5 variables
Plots the autocorrelogram for the posterior samples of the model parameters and spatial random effects.
dens.plot( object, param, component.beta = NULL, component.S = NULL, hist = TRUE, ... )
dens.plot( object, param, component.beta = NULL, component.S = NULL, hist = TRUE, ... )
object |
an object of class 'Bayes.PrevMap'. |
param |
a character indicating for which component of the model the density plot is required: |
component.beta |
if |
component.S |
if |
hist |
logical; if |
... |
additional parameters to pass to |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Draws a sub-sample from a set of units spatially located irregularly over some defined geographical region by imposing a minimum distance between any two sampled units.
discrete.sample(xy.all, n, delta, k = 0)
discrete.sample(xy.all, n, delta, k = 0)
xy.all |
set of locations from which the sample will be drawn. |
n |
size of required sample. |
delta |
minimum distance between any two locations in preliminary sample. |
k |
number of locations in preliminary sample to be replaced by nearest neighbours of other preliminary sample locations in final sample (must be between 0 and |
To draw a sample of size n
from a population of spatial locations , with the property that the distance between any two sampled locations is at least
delta
, the function implements the following algorithm.
Step 1. Draw an initial sample of size n
completely at random and call this .
Step 2. Set and calculate the minimum,
, of the distances from
to all other
in the initial sample.
Step 3. If , increase
by 1 and return to step 2 if
, otherwise stop.
Step 4. If , draw an integer
at random from
, set
and return to step 3.
Samples generated in this way will exhibit a more regular spatial arrangement than would a random sample of the same size. The degree of regularity achievable will be influenced by the spatial arrangement of the population , the specified value of
delta
and the sample size n
. For any given population, if n
and/or delta
are too large, a sample of the required size with the distance between any two sampled locations at least delta
will not be achievable; the suggested solution is then to run the algorithm with a smaller value of delta
.
Sampling close pairs of points.
For some purposes, it is desirable that a spatial sampling scheme include pairs of closely spaced points. In this case, the above algorithm requires the following additional steps to be taken.
Let k
be the required number of close pairs.
Step 5. Set and draw a random sample of size 2 from the integers
, say
.
Step 6. Find the integer such that the distances from
to
is the minimum of all
distances from
to the
.
Step 7. Replace by
, increase
by 1 and return to step 5 if
, otherwise stop.
A matrix of dimension n
by 2 containing the final sampled locations.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
x<-0.015+0.03*(1:33) xall<-rep(x,33) yall<-c(t(matrix(xall,33,33))) xy<-cbind(xall,yall)+matrix(-0.0075+0.015*runif(33*33*2),33*33,2) par(pty="s",mfrow=c(1,2)) plot(xy[,1],xy[,2],pch=19,cex=0.25,xlab="Easting",ylab="Northing", cex.lab=1,cex.axis=1,cex.main=1) set.seed(15892) # Generate spatially random sample xy.sample<-xy[sample(1:dim(xy)[1],50,replace=FALSE),] points(xy.sample[,1],xy.sample[,2],pch=19,col="red") points(xy[,1],xy[,2],pch=19,cex=0.25) plot(xy[,1],xy[,2],pch=19,cex=0.25,xlab="Easting",ylab="Northing", cex.lab=1,cex.axis=1,cex.main=1) set.seed(15892) # Generate spatially regular sample xy.sample<-discrete.sample(xy,50,0.08) points(xy.sample[,1],xy.sample[,2],pch=19,col="red") points(xy[,1],xy[,2],pch=19,cex=0.25)
x<-0.015+0.03*(1:33) xall<-rep(x,33) yall<-c(t(matrix(xall,33,33))) xy<-cbind(xall,yall)+matrix(-0.0075+0.015*runif(33*33*2),33*33,2) par(pty="s",mfrow=c(1,2)) plot(xy[,1],xy[,2],pch=19,cex=0.25,xlab="Easting",ylab="Northing", cex.lab=1,cex.axis=1,cex.main=1) set.seed(15892) # Generate spatially random sample xy.sample<-xy[sample(1:dim(xy)[1],50,replace=FALSE),] points(xy.sample[,1],xy.sample[,2],pch=19,col="red") points(xy[,1],xy[,2],pch=19,cex=0.25) plot(xy[,1],xy[,2],pch=19,cex=0.25,xlab="Easting",ylab="Northing", cex.lab=1,cex.axis=1,cex.main=1) set.seed(15892) # Generate spatially regular sample xy.sample<-discrete.sample(xy,50,0.08) points(xy.sample[,1],xy.sample[,2],pch=19,col="red") points(xy[,1],xy[,2],pch=19,cex=0.25)
This data-set relates to two studies on lead concentration in moss samples, in micrograms per gram dry weight, collected in Galicia, norther Spain. The data are from two surveys, one conducted in October 1997 and on in July 2000. The variables are as follows:
x x-coordinate of the spatial locations.
y y-coordinate of the spatial locations.
lead lead concentration.
survey year of the survey (either 1997 or 2000).
data(galicia)
data(galicia)
A data frame with 195 rows and 4 variables
Diggle, P.J., Menezes, R. and Su, T.-L. (2010). Geostatistical analysis under preferential sampling (with Discussion). Applied Statistics, 59, 191-232.
This data-set contains the geographical coordinates of the boundary of the Galicia region in northern Spain.
The variables are as follows:
x x-coordinate of the spatial locations.
y y-coordinate of the spatial locations.
data(galicia.boundary)
data(galicia.boundary)
A data frame with 42315 rows and 2 variables
This function performs the Laplace method for maximum likelihood estimation of a generalised linear geostatistical model.
glgm.LA( formula, units.m = NULL, coords, times = NULL, data, ID.coords = NULL, kappa, kappa.t = 0.5, fixed.rel.nugget = NULL, start.cov.pars, method = "nlminb", messages = TRUE, family, return.covariance = TRUE )
glgm.LA( formula, units.m = NULL, coords, times = NULL, data, ID.coords = NULL, kappa, kappa.t = 0.5, fixed.rel.nugget = NULL, start.cov.pars, method = "nlminb", messages = TRUE, family, return.covariance = TRUE )
formula |
an object of class |
units.m |
an object of class |
coords |
an object of class |
times |
an object of class |
data |
a data frame containing the variables in the model. |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
kappa |
fixed value for the shape parameter of the Matern covariance function. |
kappa.t |
fixed value for the shape parameter of the Matern covariance function in the separable double-Matern spatio-temporal model. |
fixed.rel.nugget |
fixed value for the relative variance of the nugget effect; |
start.cov.pars |
a vector of length two with elements corresponding to the starting values of |
method |
method of optimization. If |
messages |
logical; if |
family |
character, indicating the conditional distribution of the outcome. This should be |
return.covariance |
logical; if |
This function performs parameter estimation for a generealized linear geostatistical model. Conditionally on a zero-mean stationary Gaussian process and mutually independent zero-mean Gaussian variables
with variance
tau2
, the observations y
are generated from a GLM
with link function and linear predictor
where is a vector of covariates with associated regression coefficients
. The Gaussian process
has isotropic Matern covariance function (see
matern
) with variance sigma2
, scale parameter phi
and shape parameter kappa
.
The shape parameter is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2
, can also be fixed through the argument fixed.rel.nugget
; if fixed.rel.nugget=NULL
, then the relative variance of the nugget effect is also included in the estimation.
Laplace Approximation
The Laplace approximation (LA) method uses a second-order Taylor expansion of the integrand expressing the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization as defined through the argument method
.
Using a two-level model to include household-level and individual-level information.
When analysing data from household sruveys, some of the avilable information information might be at household-level (e.g. material of house, temperature) and some at individual-level (e.g. age, gender). In this case, the Gaussian spatial process and the nugget effect
are defined at hosuehold-level in order to account for extra-binomial variation between and within households, respectively.
An object of class "PrevMap".
The function summary.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: estimates of the model parameters; use the function coef.PrevMap
to obtain estimates of covariance parameters on the original scale.
covariance
: covariance matrix of the MCML estimates.
log.lik
: maximum value of the log-likelihood.
y
: binomial observations.
units.m
: binomial denominators.
D
: matrix of covariates.
coords
: matrix of the observed sampling locations.
times
: vector of the time points used in a spatio-temporal model.
method
: method of optimization used.
ID.coords
: set of ID values defined through the argument ID.coords
.
kappa
: fixed value of the shape parameter of the Matern function.
kappa.t
: fixed value for the shape parameter of the Matern covariance function in the separable double-Matern spatio-temporal model.
fixed.rel.nugget
: fixed value for the relative variance of the nugget effect.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Christensen, O. F. (2004). Monte carlo maximum likelihood in model-based geostatistics. Journal of Computational and Graphical Statistics 13, 702-718.
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
Laplace.sampling
, Laplace.sampling.lr
, summary.PrevMap
, coef.PrevMap
, matern
, matern.kernel
, control.mcmc.MCML
, create.ID.coords
.
This function simulates from the conditional distribution of a Gaussian random effect, given binomial or Poisson observations y
.
Laplace.sampling( mu, Sigma, y, units.m, control.mcmc, ID.coords = NULL, messages = TRUE, plot.correlogram = TRUE, poisson.llik = FALSE )
Laplace.sampling( mu, Sigma, y, units.m, control.mcmc, ID.coords = NULL, messages = TRUE, plot.correlogram = TRUE, poisson.llik = FALSE )
mu |
mean vector of the marginal distribution of the random effect. |
Sigma |
covariance matrix of the marginal distribution of the random effect. |
y |
vector of binomial/Poisson observations. |
units.m |
vector of binomial denominators, or offset if the Poisson model is used. |
control.mcmc |
output from |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
messages |
logical; if |
plot.correlogram |
logical; if |
poisson.llik |
logical; if |
Binomial model. Conditionally on the random effect , the data
y
follow a binomial distribution with probability and binomial denominators
units.m
. The logistic link function is used for the linear predictor, which assumes the form
Poisson model. Conditionally on the random effect , the data
y
follow a Poisson distribution with mean , where
is an offset set through the argument
units.m
. The log link function is used for the linear predictor, which assumes the form
The random effect has a multivariate Gaussian distribution with mean
mu
and covariance matrix Sigma
.
Laplace sampling. This function generates samples from the distribution of given the data
y
. Specifically a Langevin-Hastings algorithm is used to update where
and
are the inverse of the negative Hessian and the mode of the distribution of
given
y
, respectively. At each iteration a new value for
is proposed from a multivariate Gaussian distribution with mean
where is the current value for
,
is a tuning parameter and
is the the gradient of the log-density of the distribution of
given
y
. The tuning parameter is updated according to the following adaptive scheme: the value of
at the
-th iteration, say
, is given by
where and
are pre-defined constants, and
is the acceptance rate at the
-th iteration (
is the optimal acceptance rate for a multivariate standard Gaussian distribution).
The starting value for
, and the values for
and
can be set through the function
control.mcmc.MCML
.
Random effects at household-level. When the data consist of two nested levels, such as households and individuals within households, the argument ID.coords
must be used to define the household IDs for each individual. Let and
denote the
-th household and the
-th person within that household; the logistic link function then assumes the form
where the random effects are now defined at household level and have mean zero. Warning: this modelling option is available only for the binomial model.
A list with the following components
samples
: a matrix, each row of which corresponds to a sample from the predictive distribution.
h
: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
control.mcmc.MCML
, create.ID.coords
.
This function simulates from the conditional distribution of the random effects of binomial and Poisson models.
Laplace.sampling.lr( mu, sigma2, K, y, units.m, control.mcmc, messages = TRUE, plot.correlogram = TRUE, poisson.llik = FALSE )
Laplace.sampling.lr( mu, sigma2, K, y, units.m, control.mcmc, messages = TRUE, plot.correlogram = TRUE, poisson.llik = FALSE )
mu |
mean vector of the linear predictor. |
sigma2 |
variance of the random effect. |
K |
random effect design matrix, or kernel matrix for the low-rank approximation. |
y |
vector of binomial/Poisson observations. |
units.m |
vector of binomial denominators, or offset if the Poisson model is used. |
control.mcmc |
output from |
messages |
logical; if |
plot.correlogram |
logical; if |
poisson.llik |
logical; if |
Binomial model. Conditionally on , the data
y
follow a binomial distribution with probability and binomial denominators
units.m
. Let denote the random effects design matrix; a logistic link function is used, thus the linear predictor assumes the form
where is the mean vector component defined through
mu
.
Poisson model. Conditionally on , the data
y
follow a Poisson distribution with mean , where
is an offset set through the argument
units.m
. Let denote the random effects design matrix; a log link function is used, thus the linear predictor assumes the form
where is the mean vector component defined through
mu
.
The random effect has iid components distributed as zero-mean Gaussian variables with variance
sigma2
.
Laplace sampling. This function generates samples from the distribution of given the data
y
. Specifically, a Langevin-Hastings algorithm is used to update where
and
are the inverse of the negative Hessian and the mode of the distribution of
given
y
, respectively. At each iteration a new value for
is proposed from a multivariate Gaussian distribution with mean
where is the current value for
,
is a tuning parameter and
is the the gradient of the log-density of the distribution of
given
y
. The tuning parameter is updated according to the following adaptive scheme: the value of
at the
-th iteration, say
, is given by
where and
are pre-defined constants, and
is the acceptance rate at the
-th iteration (
is the optimal acceptance rate for a multivariate standard Gaussian distribution).
The starting value for
, and the values for
and
can be set through the function
control.mcmc.MCML
.
A list with the following components
samples
: a matrix, each row of which corresponds to a sample from the predictive distribution.
h
: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function simulates from the conditional distribution of a Gaussian process given binomial y
.
The Guassian process is also approximated using SPDE.
Laplace.sampling.SPDE( mu, sigma2, phi, kappa, y, units.m, coords, mesh, control.mcmc, messages = TRUE, plot.correlogram = TRUE, poisson.llik )
Laplace.sampling.SPDE( mu, sigma2, phi, kappa, y, units.m, coords, mesh, control.mcmc, messages = TRUE, plot.correlogram = TRUE, poisson.llik )
mu |
mean vector of the Gaussian process to approximate. |
sigma2 |
variance of the Gaussian process to approximate. |
phi |
scale parameter of the Matern function for the Gaussian process to approximate. |
kappa |
smothness parameter of the Matern function for the Gaussian process to approximate. |
y |
vector of binomial observations. |
units.m |
vector of binomial denominators. |
coords |
matrix of two columns corresponding to the spatial coordinates. |
mesh |
mesh object set through |
control.mcmc |
control parameters of the Independence sampler set through |
messages |
logical; if |
plot.correlogram |
logical; if |
poisson.llik |
logical: if |
Binomial model. Conditionally on the random effect , the data
y
follow a binomial distribution with probability and binomial denominators
units.m
. The logistic link function is used for the linear predictor, which assumes the form
The random effect has a multivariate Gaussian distribution with mean
mu
and covariance matrix Sigma
.
A list with the following components
samples
: a matrix, each row of which corresponds to a sample from the predictive distribution.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This data-set contains counts of reported onchocerciasis (or riverblindess) cases from 91 villages sampled across Liberia. The variables are as follows:
lat latitude of the of sampled villages.
long longitude of the sampled villages.
ntest number of tested people for the presence nodules.
npos number of people that tested positive for the presence of nodules.
elevation the elevation in meters of the sampled village.
log_elevation the log-transformed elevation in meters of the sampled village.
data(liberia)
data(liberia)
A data frame with 90 rows and 6 variables
Zouré, H. G. M., Noma, M., Tekle, Afework, H., Amazigo, U. V., Diggle, P. J., Giorgi, E., and Remme, J. H. F. (2014). The Geographic Distribution of Onchocerciasis in the 20 Participating Countries of the African Programme for Onchocerciasis Control: (2) Pre-Control Endemicity Levels and Estimated Number Infected. Parasites & Vectors, 7, 326.
This function performs Bayesian estimation for the geostatistical linear Gaussian model.
linear.model.Bayes( formula, coords, data, kappa, control.mcmc, control.prior, low.rank = FALSE, knots = NULL, messages = TRUE )
linear.model.Bayes( formula, coords, data, kappa, control.mcmc, control.prior, low.rank = FALSE, knots = NULL, messages = TRUE )
formula |
an object of class " |
coords |
an object of class |
data |
a data frame containing the variables in the model. |
kappa |
shape parameter of the Matern covariance function. |
control.mcmc |
output from |
control.prior |
output from |
low.rank |
logical; if |
knots |
if |
messages |
logical; if |
This function performs Bayesian estimation for the geostatistical linear Gaussian model, specified as
where is the measured outcome,
is a vector of coavariates,
is a vector of regression coefficients,
is a stationary Gaussian spatial process and
are independent zero-mean Gaussian variables with variance
tau2
. More specifically, has an isotropic Matern covariance function with variance
sigma2
, scale parameter phi
and shape parameter kappa
. The shape parameter kappa
is treated as fixed.
Priors definition. Priors can be defined through the function control.prior
. The hierarchical structure of the priors is the following. Let be the vector of the covariance parameters
; then each component of
can have independent priors freely defined by the user. However, uniform and log-normal priors are also available as default priors for each of the covariance parameters. To remove the nugget effect
, no prior should be defined for
tau2
. Conditionally on sigma2
, the vector of regression coefficients beta
has a multivariate Gaussian prior with mean beta.mean
and covariance matrix sigma2*beta.covar
, while in the low-rank approximation the covariance matrix is simply beta.covar
.
Updating the covariance parameters using a Metropolis-Hastings algorithm. In the MCMC algorithm implemented in linear.model.Bayes
, the transformed parameters
are independently updated using a Metropolis Hastings algorithm. At the -th iteration, a new value is proposed for each from a univariate Gaussian distrubion with variance, say
, tuned according the following adaptive scheme
where is the acceptance rate at the
-th iteration (0.45 is the optimal acceptance rate for a univariate Gaussian distribution) whilst
and
are pre-defined constants. The starting values
for each of the parameters
,
and
can be set using the function
control.mcmc.Bayes
through the arguments h.theta1
, h.theta2
and h.theta3
. To define values for and
, see the documentation of
control.mcmc.Bayes
.
Low-rank approximation.
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process might be computationally beneficial. Let
and
denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then
is approximated as
, where
are zero-mean mutually independent Gaussian variables with variance
sigma2
and is the isotropic Matern kernel (see
matern.kernel
). Since the resulting approximation is no longer a stationary process (but only approximately), sigma2
may take very different values from the actual variance of the Gaussian process to approximate. The function adjust.sigma2
can then be used to (approximately) explore the range for sigma2
. For example if the variance of the Gaussian process is 0.5
, then an approximate value for sigma2
is 0.5/const.sigma2
, where const.sigma2
is the value obtained with adjust.sigma2
.
An object of class "Bayes.PrevMap".
The function summary.Bayes.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: matrix of the posterior samples for each of the model parameters.
S
: matrix of the posterior samplesfor each component of the random effect. This is only returned for the low-rank approximation.
y
: response variable.
D
: matrix of covariarates.
coords
: matrix of the observed sampling locations.
kappa
: vaues of the shape parameter of the Matern function.
knots
: matrix of spatial knots used in the low-rank approximation.
const.sigma2
: vector of the values of the multiplicative factor used to adjust the sigma2
in the low-rank approximation.
h1
: vector of values taken by the tuning parameter h.theta1
at each iteration.
h2
: vector of values taken by the tuning parameter h.theta2
at each iteration.
h3
: vector of values taken by the tuning parameter h.theta3
at each iteration.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
control.prior
, control.mcmc.Bayes
, shape.matern
, summary.Bayes.PrevMap
, autocor.plot
, trace.plot
, dens.plot
, matern
, matern.kernel
, adjust.sigma2
.
This function performs maximum likelihood estimation for the geostatistical linear Gaussian Model.
linear.model.MLE( formula, coords = NULL, data, ID.coords = NULL, kappa, fixed.rel.nugget = NULL, start.cov.pars, method = "BFGS", low.rank = FALSE, knots = NULL, messages = TRUE, profile.llik = FALSE, SPDE = FALSE, mesh = NULL, SPDE.analytic.hessian = FALSE )
linear.model.MLE( formula, coords = NULL, data, ID.coords = NULL, kappa, fixed.rel.nugget = NULL, start.cov.pars, method = "BFGS", low.rank = FALSE, knots = NULL, messages = TRUE, profile.llik = FALSE, SPDE = FALSE, mesh = NULL, SPDE.analytic.hessian = FALSE )
formula |
an object of class " |
coords |
an object of class |
data |
a data frame containing the variables in the model. |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
kappa |
shape parameter of the Matern covariance function. |
fixed.rel.nugget |
fixed value for the relative variance of the nugget effect; default is |
start.cov.pars |
if |
method |
method of optimization. If |
low.rank |
logical; if |
knots |
if |
messages |
logical; if |
profile.llik |
logical; if |
SPDE |
logical; if |
mesh |
an object obtained as result of a call to the function |
SPDE.analytic.hessian |
logical; if |
This function estimates the parameters of a geostatistical linear Gaussian model, specified as
where is the measured outcome,
is a vector of coavariates,
is a vector of regression coefficients,
is a stationary Gaussian spatial process and
are independent zero-mean Gaussian variables with variance
tau2
. More specifically, has an isotropic Matern covariance function with variance
sigma2
, scale parameter phi
and shape parameter kappa
. In the estimation, the shape parameter kappa
is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2
, can be fixed though the argument fixed.rel.nugget
; if fixed.rel.nugget=NULL
, then the variance of the nugget effect is also included in the estimation.
Locations with multiple observations.
If multiple observations are available at any of the sampled locations the above model is modified as follows. Let denote the random variable associated to the measured outcome for the j-th individual at location
. The linear geostatistical model assumes the form
where and
are specified as mentioned above, and
are i.i.d. zer0-mean Gaussian variable with variance
. his model can be fitted by specifing a vector of ID for the unique set locations thourgh the argument
ID.coords
(see also create.ID.coords
).
Low-rank approximation.
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process can be computationally beneficial. Let
and
denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then
is approximated as
, where
are zero-mean mutually independent Gaussian variables with variance
sigma2
and is the isotropic Matern kernel (see
matern.kernel
). Since the resulting approximation is no longer a stationary process, the parameter sigma2
is adjusted by a factorconstant.sigma2
. See adjust.sigma2
for more details on the the computation of the adjustment factor constant.sigma2
in the low-rank approximation.
An object of class "PrevMap".
The function summary.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: estimates of the model parameters; use the function coef.PrevMap
to obtain estimates of covariance parameters on the original scale.
covariance
: covariance matrix of the ML estimates.
log.lik
: maximum value of the log-likelihood.
y
: response variable.
D
: matrix of covariates.
coords
: matrix of the observed sampling locations.
ID.coords
: set of ID values defined through the argument ID.coords
.
method
: method of optimization used.
kappa
: fixed value of the shape parameter of the Matern function.
knots
: matrix of the spatial knots used in the low-rank approximation.
const.sigma2
: adjustment factor for sigma2
in the low-rank approximation.
fixed.rel.nugget
: fixed value for the relative variance of the nugget effect.
mesh
: the mesh used in the SPDE approximation.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
shape.matern
, summary.PrevMap
, coef.PrevMap
, matern
, matern.kernel
, maxBFGS
, nlminb
.
This function performs Monte Carlo maximum likelihood (MCML) estimation for a geostatistical linear model with preferentially sampled locations. For more details on the model, see below.
lm.ps.MCML( formula.response, formula.log.intensity = ~1, coords, which.is.preferential = NULL, data.response, data.intensity = NULL, par0, control.mcmc, kappa1, kappa2, mesh, grid.intensity, start.par = NULL, method = "nlminb", messages = TRUE, plot.correlogram = TRUE )
lm.ps.MCML( formula.response, formula.log.intensity = ~1, coords, which.is.preferential = NULL, data.response, data.intensity = NULL, par0, control.mcmc, kappa1, kappa2, mesh, grid.intensity, start.par = NULL, method = "nlminb", messages = TRUE, plot.correlogram = TRUE )
formula.response |
an object of class |
formula.log.intensity |
an object of class |
coords |
an object of class |
which.is.preferential |
a vector of 0 and 1, where 1 indicates a location in the data from a prefential sampling scheme and 0 from a non-preferential. This option is used to fit a model with a mix of preferentally and non-preferentiall sampled locations. For more, details on the model structure see the 'Details' section. |
data.response |
a data frame containing the variables in the sub-model of the response variable. |
data.intensity |
a data frame containing the variables in the log-Gaussian Coz process sub-model. This data frame must be provided only when explanatory variables are used in the
log-Gaussian Cox process model. Each row in the data frame must correspond to a point in the grid provided through the argument 'grid.intensity'. Deafult is |
par0 |
an object of class 'coef.PrevMap.ps'. This argument is used to define the parameters of the importance sampling distribution used in the MCML algorithm.
The input of this argument must be defined using the |
control.mcmc |
output from |
kappa1 |
fixed value for the shape parameter of the Matern covariance function of the spatial process of the sampling intensity (currently only |
kappa2 |
fixed value for the shape parameter of the Matern covariance function of the spatial process of the response variable. |
mesh |
an object obtained as result of a call to the function |
grid.intensity |
a regular grid covering the geographical region of interest, used to approximate the density function of the log-Gaussian Cox process. |
start.par |
starting value of the optimization algorithm. This is an object of class 'coef.PrevMap.ps' and must be defined using the function |
method |
method of optimization. If |
messages |
logical; if |
plot.correlogram |
logical; if |
This function performs parameter estimation for a geostatistical linear model with preferentially sampled locations. Let and
denote two independent, stationary and isotropic Gaussian processes.
The overall model consists of two sub-models: the log-Gaussian Cox process model for the preferentially sampled locations, say
;
the model for the response variable, say
. The model assumes that
where denotes 'the distribution of .'.
Each of the two submodels has an associated linear predictor.
Let
denote the intensity of the Poisson process
, continionally on
. Then
,
where is vector of explanatory variables with regression coefficient
. This linear predictor is defined through the argument
formula.log.intensity
.
The density of is given by
,
where is the region of interest. The integral at the denominator is intractable and is then approximated using a quadrature procedure.
The regular grid covering
, used for the quadrature, must be provided through the argument
grid.intensity
.
Conditionally on ,
and
, the response variable model is given by
where is another vector of regression coefficients and
is the preferentiality parameter. If
then we recover the standard geostatistical model.
More details on the fitting procedure can be found in Diggle and Giorgi (2016).
When the data have a mix of preferentially and non-preferentially sampled locations.
In some cases the set of locations may consist of a sub-set which is preferentially sampled, , and a standard
non-prefential sample,
. Let
and
denote the measurments at locations
and
.
In the current implementation, the model has the following form
where and
are two independent Gaussian process but with shared parameters, associated with
and
, respectively.
The linear predictor for
is the same as above. The measurements
, instead, have linear predicotr
where is vector of regression coefficients, different from
. The linear predictor for
and
is specified though
formula.response
.
For example, response ~ x | x + z
defines a linear predictor for with one explanatory variable
x
and a linear predictor for with two explanatory variables
x
and z
. An example on the application of this model is given in Diggle and Giorgi (2016).
An object of class "PrevMap.ps".
The function summary.PrevMap.ps
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: estimates of the model parameters; use the function coef.PrevMap.ps
to obtain estimates of covariance parameters on the original scale.
covariance
: covariance matrix of the MCML estimates.
log.lik
: maximum value of the approximated log-likelihood.
y
: observed values of the response variable. If which.is.preferential
has been provided, then y
is a list with components
y$preferential
, for the data with prefentially sampled locations, and y$non.preferential
, for the remiaining.
D.response
: matrix of covariates used to model the mean component of the response variable. If which.is.preferential
has been provided, then D.response
is a list with components
D.response$preferential
, for the data with prefentially sampled locations, and D.response$non.preferential
, for the remiaining.
D.intensity
: matrix of covariates used to model the mean component of log-intensity of the log-Gaussian Cox process.
grid.intensity
: grid of locations used to approximate the intractable integral of the log-Gaussian Cox process model.
coords
: matrix of the observed sampling locations. If which.is.preferential
has been provided, then coords
is a list with components
y$preferential
, for the data with prefentially sampled locations, and y$non.preferential
, for the remiaining.
method
: method of optimization used.
ID.coords
: set of ID values defined through the argument ID.coords
.
kappa.response
: fixed value of the shape parameter of the Matern covariance function used to model the spatial process associated with the response variable.
mesh
: the mesh used in the SPDE approximation.
samples
: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.
call
: the matched call.
Emanuele Giorgi [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Diggle, P.J., Giorgi, E. (2017). Preferential sampling of exposures levels. In: Handbook of Environmental and Ecological Statistics. Chapman & Hall.
Diggle, P.J., Menezes, R. and Su, T.-L. (2010). Geostatistical analysis under preferential sampling (with Discussion). Applied Statistics, 59, 191-232.
Lindgren, F., Havard, R., Lindstrom, J. (2011). An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach (with discussion). Journal of the Royal Statistical Society, Series B, 73, 423–498.
Pati, D., Reich, B. J., and Dunson, D. B. (2011). Bayesian geostatistical modelling with informative sampling locations. Biometrika, 98, 35-48.
This data-set relates to a study of the prevalence of Loa loa (eyeworm) in a series of surveys undertaken in 197 villages in west Africa (Cameroon and southern Nigeria). The variables are as follows:
ROW row id: 1 to 197.
VILLCODE village id.
LONGITUDE Longitude in degrees.
LATITUDE Latitude in degrees.
NO_EXAM Number of people tested.
NO_INF Number of positive test results.
ELEVATION Height above sea-level in metres.
MEAN9901 Mean of all NDVI values recorded at village location, 1999-2001
MAX9901 Maximum of all NDVI values recorded at village location, 1999-2001
MIN9901 Minimum of all NDVI values recorded at village location, 1999-2001
MIN9901 Minimum of all NDVI values recorded at village location, 1999-2001
STDEV9901 standard deviation of all NDVI values recorded at village location, 1999-2001
data(loaloa)
data(loaloa)
A data frame with 197 rows and 11 variables
Diggle, P.J., Thomson, M.C., Christensen, O.F., Rowlingson, B., Obsomer, V., Gardon, J., Wanji, S., Takougang, I., Enyong, P., Kamgno, J., Remme, H., Boussinesq, M. and Molyneux, D.H. (2007). Spatial modelling and prediction of Loa loa risk: decision making under uncertainty. Annals of Tropical Medicine and Parasitology, 101, 499-509.
Computes confidence intervals based on the interpolated profile likelihood computed for a single covariance parameter.
loglik.ci(object, coverage = 0.95, plot.spline.profile = TRUE)
loglik.ci(object, coverage = 0.95, plot.spline.profile = TRUE)
object |
object of class "profile.PrevMap" obtained from |
coverage |
a value between 0 and 1 indicating the coverage of the confidence interval based on the interpolated profile likelihood. Default is |
plot.spline.profile |
logical; if |
A list with elements lower
and upper
for the upper and lower limits of the confidence interval, respectively.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Computes profile log-likelihood, or evaluatesx likelihood keeping the other paramaters fixed, for the scale parameter phi
of the Matern function and the relative variance of the nugget effect nu2
in the linear Gaussian model.
loglik.linear.model( object, control.profile, plot.profile = TRUE, messages = TRUE )
loglik.linear.model( object, control.profile, plot.profile = TRUE, messages = TRUE )
object |
an object of class 'PrevMap', which is the fitted linear model obtained with the function |
control.profile |
control parameters obtained with |
plot.profile |
logical; if |
messages |
logical; if |
an object of class "profile.PrevMap" which is a list with the following values
eval.points.phi
: vector of the values used for phi
in the evaluation of the likelihood.
eval.points.rel.nugget
: vector of the values used for nu2
in the evaluation of the likelihood.
profile.phi
: vector of the values of the likelihood function evaluated at eval.points.phi
.
profile.rel.nugget
: vector of the values of the likelihood function evaluated at eval.points.rel.nugget
.
profile.phi.rel.nugget
: matrix of the values of the likelihood function evaluated at eval.points.phi
and eval.points.rel.nugget
.
fixed.par
: logical value; TRUE
is the evaluation if the likelihood is carried out by fixing the other parameters, and FALSE
if the computation of the profile-likelihood was performed instead.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function computes values of the Matern kernel for given distances and parameters.
matern.kernel(u, rho, kappa)
matern.kernel(u, rho, kappa)
u |
a vector, matrix or array with values of the distances between pairs of data locations. |
rho |
value of the (re-parametrized) scale parameter; this corresponds to the re-parametrization |
kappa |
value of the shape parameter. |
The Matern kernel is defined as:
where and
are the scale and shape parameters, respectively, and
is the modified Bessel function of the third kind of order
. The family is valid for
and
.
A vector matrix or array, according to the argument u, with the values of the Matern kernel function for the given distances.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
plot.pred.PrevMap
displays predictions obtained from spatial.pred.linear.MLE
, spatial.pred.linear.Bayes
,spatial.pred.binomial.MCML
, spatial.pred.binomial.Bayes
and spatial.pred.poisson.MCML
.
## S3 method for class 'pred.PrevMap' plot(x, type = NULL, summary = "predictions", ...)
## S3 method for class 'pred.PrevMap' plot(x, type = NULL, summary = "predictions", ...)
x |
an object of class "PrevMap". |
type |
a character indicating the type of prediction to display: 'prevalence','odds', 'logit' or 'probit' for binomial models; "log" or "exponential" for Poisson models. Default is |
summary |
character indicating which summary to display: 'predictions','quantiles', 'standard.errors' or 'exceedance.prob'; default is 'predictions'. If |
... |
further arguments passed to |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
plot.pred.PrevMap.ps
displays predictions obtained from lm.ps.MCML
.
## S3 method for class 'pred.PrevMap.ps' plot(x, target = NULL, summary = "predictions", ...)
## S3 method for class 'pred.PrevMap.ps' plot(x, target = NULL, summary = "predictions", ...)
x |
an object of class "PrevMap". |
target |
a integer value indicating the predictive target: |
summary |
character indicating which summary to display: 'predictions','quantiles' or 'standard.errors'. Default is |
... |
further arguments passed to |
Emanuele Giorgi [email protected]
Displays the results from a call to variog.diagnostic.lm
and variog.diagnostic.glgm
.
## S3 method for class 'PrevMap.diagnostic' plot(x, ...)
## S3 method for class 'PrevMap.diagnostic' plot(x, ...)
x |
an object of class "PrevMap.diagnostic". |
... |
further arguments passed to |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
variog.diagnostic.lm
, variog.diagnostic.glgm
This function displays a plot of the profile log-likelihood that is computed by the function loglik.linear.model
.
## S3 method for class 'profile.PrevMap' plot(x, log.scale = FALSE, plot.spline.profile = FALSE, ...)
## S3 method for class 'profile.PrevMap' plot(x, log.scale = FALSE, plot.spline.profile = FALSE, ...)
x |
object of class "profile.PrevMap" obtained as output from |
log.scale |
logical; if |
plot.spline.profile |
logical; if |
... |
further arugments passed to |
A plot is returned. No value is returned.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function plots the profile likelihood for the shape parameter of the Matern covariance function using the output from shape.matern
function.
## S3 method for class 'shape.matern' plot(x, plot.spline = TRUE, ...)
## S3 method for class 'shape.matern' plot(x, plot.spline = TRUE, ...)
x |
an object of class 'shape.matern' obtained as result of a call to |
plot.spline |
logical; if |
... |
further arguments passed to |
The function does not return any value.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function produces a plot with points indicating the data locations. Arguments can control the points sizes, patterns and colors. These can be set to be proportional to data values, ranks or quantiles. Alternatively, points can be added to the current plot.
point.map(data, var.name, coords, ...)
point.map(data, var.name, coords, ...)
data |
an object of class "data.frame" containing the data. |
var.name |
a |
coords |
a |
... |
additional arguments to be passed to |
This function performs Monte Carlo maximum likelihood (MCML) estimation for the geostatistical Poisson model with log link function.
poisson.log.MCML( formula, units.m = NULL, coords, data, ID.coords = NULL, par0, control.mcmc, kappa, fixed.rel.nugget = NULL, start.cov.pars, method = "BFGS", low.rank = FALSE, knots = NULL, messages = TRUE, plot.correlogram = TRUE )
poisson.log.MCML( formula, units.m = NULL, coords, data, ID.coords = NULL, par0, control.mcmc, kappa, fixed.rel.nugget = NULL, start.cov.pars, method = "BFGS", low.rank = FALSE, knots = NULL, messages = TRUE, plot.correlogram = TRUE )
formula |
an object of class |
units.m |
an object of class |
coords |
an object of class |
data |
a data frame containing the variables in the model. |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
par0 |
parameters of the importance sampling distribution: these should be given in the following order |
control.mcmc |
output from |
kappa |
fixed value for the shape parameter of the Matern covariance function. |
fixed.rel.nugget |
fixed value for the relative variance of the nugget effect; |
start.cov.pars |
a vector of length two with elements corresponding to the starting values of |
method |
method of optimization. If |
low.rank |
logical; if |
knots |
if |
messages |
logical; if |
plot.correlogram |
logical; if |
This function performs parameter estimation for a geostatistical Poisson model with log link function. Conditionally on a zero-mean stationary Gaussian process and mutually independent zero-mean Gaussian variables
with variance
tau2
, the observations y
are generated from a Poisson distribution with mean , where
is an offset defined through the argument
units.m
. A canonical log link is used, thus the linear predictor assumes the form
where is a vector of covariates with associated regression coefficients
. The Gaussian process
has isotropic Matern covariance function (see
matern
) with variance sigma2
, scale parameter phi
and shape parameter kappa
.
In the poisson.log.MCML
function, the shape parameter is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2
, can also be fixed through the argument fixed.rel.nugget
; if fixed.rel.nugget=NULL
, then the relative variance of the nugget effect is also included in the estimation.
Monte Carlo Maximum likelihood.
The Monte Carlo maximum likelihood method uses conditional simulation from the distribution of the random effect given the data
y
, in order to approximate the high-dimensiional intractable integral given by the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization algorithm which uses analytic epression for computation of the gradient vector and Hessian matrix. The functions used for numerical optimization are maxBFGS
(method="BFGS"
), from the maxLik package, and nlminb
(method="nlminb"
).
Low-rank approximation.
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process might be computationally beneficial. Let
and
denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then
is approximated as
, where
are zero-mean mutually independent Gaussian variables with variance
sigma2
and is the isotropic Matern kernel (see
matern.kernel
). Since the resulting approximation is no longer a stationary process (but only approximately), the parameter sigma2
is then multiplied by a factor constant.sigma2
so as to obtain a value that is closer to the actual variance of .
An object of class "PrevMap".
The function summary.PrevMap
is used to print a summary of the fitted model.
The object is a list with the following components:
estimate
: estimates of the model parameters; use the function coef.PrevMap
to obtain estimates of covariance parameters on the original scale.
covariance
: covariance matrix of the MCML estimates.
log.lik
: maximum value of the log-likelihood.
y
: observations.
units.m
: offset.
D
: matrix of covariates.
ID.coords
: set of ID values defined through the argument ID.coords
.
coords
: matrix of the observed sampling locations.
method
: method of optimization used.
kappa
: fixed value of the shape parameter of the Matern function.
knots
: matrix of the spatial knots used in the low-rank approximation.
const.sigma2
: adjustment factor for sigma2
in the low-rank approximation.
h
: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm; see Laplace.sampling
, or Laplace.sampling.lr
if a low-rank approximation is used.
samples
: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.
fixed.rel.nugget
: fixed value for the relative variance of the nugget effect.
call
: the matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.
Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08
Christensen, O. F. (2004). Monte carlo maximum likelihood in model-based geostatistics. Journal of Computational and Graphical Statistics 13, 702-718.
Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.
Laplace.sampling
, Laplace.sampling.lr
, summary.PrevMap
, coef.PrevMap
, matern
, matern.kernel
, control.mcmc.MCML
.
Computes the empirical variogram using “bins” of distance provided by the user.
s_variogram(data, variable, bins = NULL, n_permutation = 0)
s_variogram(data, variable, bins = NULL, n_permutation = 0)
data |
an object of class |
variable |
a character indicating the name of variable for which the variogram is to be computed. |
bins |
a vector indicating the 'bins' to be used to define the classes of distance used in the computation of the variogram.
By default |
n_permutation |
a non-negative integer indicating the number of permutation used to compute the 95
level envelope under the assumption of spatial independence. By default |
an object of class 'variogram' which is a list containing the following components
dist_class
the classes of distances based on the provided values for bins
v
the variogram values
n
the number of pairs of observations within each bin
mid_points
the middle points of the classes of distance.
lower_bound
the lower values of the 95
upper_bound
the upper values of the 95
Emanuele Giorgi [email protected]
Claudio Fronterre [email protected]
set.par.ps
defines the model coefficients of a geostatistical linear model with preferentially sampled locations.
The output of this function can be used to: 1) define the parameters of the importance sampling distribution in lm.ps.MCML
; 2) the starting values of the optimization algorithm in lm.ps.MCML
.
set.par.ps(p = 1, q = 1, intensity, response, preferentiality.par)
set.par.ps(p = 1, q = 1, intensity, response, preferentiality.par)
p |
number of covariates used in the response variable model, including the intercept. Default is |
q |
number of covariates used in the log-Guassian Cox process model, including the intercept. Default is |
intensity |
a vector of parameters of the log-Gaussian Cox process model. These must be provided in the following order: regression coefficients of the explanatory variables; variance and scale of the spatial correlation for the isotropic Gaussian process. In the case of a model with a mix of preferentially and non-preferentially sampled locations, the order of the regression coefficients should be the following: regression coefficients for the linear predictor with preferential sampling; regression coefficients for the linear predictor with non-preferential samples. |
response |
a vector of parameters of the response variable model. These must be provided in the following order: regression coefficients of the explanatory variables; variance and scale of the spatial correlation for the isotropic Gaussian process; and variance of the nugget effect. |
preferentiality.par |
value of the preferentiality paramter. |
a list of coefficients of class coef.PrevMap.ps
.
Emanuele Giorgi [email protected]
This function plots the profile likelihood for the shape parameter of the Matern covariance function used in the linear Gaussian model. It also computes confidence intervals of coverage coverage
by interpolating the profile likelihood with a spline and using the asymptotic distribution of a chi-squared with one degree of freedom.
shape.matern( formula, coords, data, set.kappa, fixed.rel.nugget = NULL, start.par, coverage = NULL, plot.profile = TRUE, messages = TRUE )
shape.matern( formula, coords, data, set.kappa, fixed.rel.nugget = NULL, start.par, coverage = NULL, plot.profile = TRUE, messages = TRUE )
formula |
an object of class |
coords |
an object of class |
data |
a data frame containing the variables in the model. |
set.kappa |
a vector indicating the set values for evluation of the profile likelihood. |
fixed.rel.nugget |
a value for the relative variance |
start.par |
starting values for the scale parameter |
coverage |
a value between 0 and 1 indicating the coverage of the confidence interval based on the interpolated profile liklelihood for the shape parameter. Default is |
plot.profile |
logical; if |
messages |
logical; if |
The function returns an object of class 'shape.matern' that is a list with the following components
set.kappa
set of values of the shape parameter used to evaluate the profile-likelihood.
val.kappa
values of the profile likelihood.
If a value for coverage
is specified, the list also contains lower
, upper
and kappa.hat
that correspond to the lower and upper limits of the confidence interval, and the maximum likelihood estimate for the shape parameter, respectively.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function performs two variogram-based tests for residual spatial correlation in real-valued and count (Binomial and Poisson) data.
spat.corr.diagnostic( formula, units.m = NULL, coords, data, likelihood, ID.coords = NULL, n.sim = 200, nAGQ = 1, uvec = NULL, plot.results = TRUE, lse.variogram = FALSE, kappa = 0.5, which.test = "both" )
spat.corr.diagnostic( formula, units.m = NULL, coords, data, likelihood, ID.coords = NULL, n.sim = 200, nAGQ = 1, uvec = NULL, plot.results = TRUE, lse.variogram = FALSE, kappa = 0.5, which.test = "both" )
formula |
an object of class |
units.m |
vector of binomial denominators, or offset if the Poisson model is used. |
coords |
an object of class |
data |
an object of class "data.frame" containing the data. |
likelihood |
a character that can be set to "Gaussian","Binomial" or "Poisson" |
ID.coords |
vector of ID values for the unique set of spatial coordinates obtained from |
n.sim |
number of simulations used to perform the selected test(s) for spatial correlation. |
nAGQ |
integer scalar (passed to |
uvec |
a vector with values used to define the variogram binning. If |
plot.results |
if |
lse.variogram |
if |
kappa |
smothness parameter of the Matern function for the Gaussian process to approximate. The deafault is |
which.test |
a character specifying which test for residual spatial correlation is to be performed: "variogram", "test statistic" or "both". The default is |
The function first fits a generalized linear mixed model using the for an outcome which, conditionally on i.i.d. random effects
, are mutually independent
GLMs with linear predictor
where is a vector of covariates which are specified through
formula
. Finally, the are assumed to be zero-mean Gaussian variables with variance
Variogram-based graphical diagnostic
This graphical diagnostic is performed by setting which.test="both"
or which.test="variogram"
. The output are 95
(see below lower.lim
and upper.lim
) that are generated under the assumption of spatial indepdence through the following steps
1. Fit a generalized linear mixed model as indicated by the equation above.
2. Obtain the mode, say , of the
conditioned on the data
.
3. Compute the empirical variogram using
4. Permute the locations specified in coords
, n.sim
time while holding the fixed.
5. For each of the permuted data-sets compute the empirical variogram based on the .
6. From the n.sim
variograms obtained in the previous step, compute the 95
If the observed variogram (obs.variogram
below), based on the un-permuted , falls within the 95
residual spatial correlation; if, instead, that partly falls outside the 95
Test for spatial independence
This diagnostic test is performed if which.test="both"
or which.test="test statistic"
. Let denote the empirical variogram based on
for the distance bin
.
The test statistic used for testing residual spatial correlation is
where is the number of pairs of data-points falling within the distance bin
(
n.bins
below) and is the estimate of
.
To obtain the distribution of the test statistic under the null hypothesis of spatial independence, we use the simulated empirical variograms as obtained in step 5 of the iterative procedure described in "Variogram-based graphical diagnostic."
The p-value for the test of spatial independence is then computed by taking the proportion of simulated values for
under the null the hypothesis that are larger than the value of
based on the original (un-permuted)
An object of class "PrevMap.diagnostic" which is a list containing the following components:
obs.variogram
: a vector of length length(uvec)-1
containing the values of the variogram for each of
the distance bins defined through uvec
.
distance.bins
: a vector of length length(uvec)-1
containing the average distance within each of the distance bins
defined through uvec
.
n.bins
: a vector of length length(uvec)-1
containing the number of pairs of data-points falling within each distance bin.
lower.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the lower limits of the 95
generated under the assumption of absence of spatial correlation at each fo the distance bins defined through uvec
.
upper.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the upper limits of the 95
generated under the assumption of absence of spatial correlation at each fo the distance bins defined through uvec
.
mode.rand.effects
: the predictive mode of the random effects from the fitted non-spatial generalized linear mixed model.
p.value
: (available only if which.test="both"
or which.test="test statistic"
) p-value of the test for residual spatial correlation.
lse.variogram
: (available only if lse.variogram=TRUE
) a vector of length length(uvec)-1
containing the values of the estimated Matern variogram via a weighted least square fit.
This function performs Bayesian spatial prediction for the binomial logistic and binary probit models.
spatial.pred.binomial.Bayes( object, grid.pred, predictors = NULL, type = "marginal", scale.predictions = "prevalence", quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, messages = TRUE )
spatial.pred.binomial.Bayes( object, grid.pred, predictors = NULL, type = "marginal", scale.predictions = "prevalence", quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, messages = TRUE )
object |
an object of class "Bayes.PrevMap" obtained as result of a call to |
grid.pred |
a matrix of prediction locations. |
predictors |
a data frame of the values of the explanatory variables at each of the locations in |
type |
a character indicating the type of spatial predictions: |
scale.predictions |
a character vector of maximum length 3, indicating the required scale on which spatial prediction is carried out: "logit", "prevalence", "odds" and "probit". Default is |
quantiles |
a vector of quantiles used to summarise the spatial predictions. |
standard.errors |
logical; if |
thresholds |
a vector of exceedance thresholds; default is |
scale.thresholds |
a character value ("logit", "prevalence", "odds" or "probit") indicating the scale on which exceedance thresholds are provided. |
messages |
logical; if |
A "pred.PrevMap" object list with the following components: logit
; prevalence
; odds
; probit
;exceedance.prob
, corresponding to a matrix of the exceedance probabilities where each column corresponds to a specified value in thresholds
; samples
, corresponding to a matrix of the posterior samples at each prediction locations for the linear predictor; grid.pred
prediction locations.
Each of the three components logit
, prevalence
, odds
and probit
is also a list with the following components:
predictions
: a vector of the predictive mean for the associated quantity (logit, odds or prevalence).
standard.errors
: a vector of prediction standard errors (if standard.errors=TRUE
).
quantiles
: a matrix of quantiles of the resulting predictions with each column corresponding to a quantile specified through the argument quantiles
.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function performs spatial prediction, fixing the model parameters at the Monte Carlo maximum likelihood estimates of a geostatistical binomial logistic model.
spatial.pred.binomial.MCML( object, grid.pred, predictors = NULL, control.mcmc, type = "marginal", scale.predictions = c("logit", "prevalence", "odds"), quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, plot.correlogram = FALSE, messages = TRUE )
spatial.pred.binomial.MCML( object, grid.pred, predictors = NULL, control.mcmc, type = "marginal", scale.predictions = c("logit", "prevalence", "odds"), quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, plot.correlogram = FALSE, messages = TRUE )
object |
an object of class "PrevMap" obtained as result of a call to |
grid.pred |
a matrix of prediction locations. |
predictors |
a data frame of the values of the explanatory variables at each of the locations in |
control.mcmc |
output from |
type |
a character indicating the type of spatial predictions: |
scale.predictions |
a character vector of maximum length 3, indicating the required scale on which spatial prediction is carried out: "logit", "prevalence" and "odds". Default is |
quantiles |
a vector of quantiles used to summarise the spatial predictions. |
standard.errors |
logical; if |
thresholds |
a vector of exceedance thresholds; default is |
scale.thresholds |
a character value indicating the scale on which exceedance thresholds are provided; |
plot.correlogram |
logical; if |
messages |
logical; if |
A "pred.PrevMap" object list with the following components: logit
; prevalence
; odds
; exceedance.prob
, corresponding to a matrix of the exceedance probabilities where each column corresponds to a specified value in thresholds
; samples
, corresponding to a matrix of the predictive samples at each prediction locations for the linear predictor of the binomial logistic model (if scale.predictions="logit"
and neither the SPDE nor the low-rank approximations have been used, this component is NULL
); grid.pred
prediction locations.
Each of the three components logit
, prevalence
and odds
is also a list with the following components:
predictions
: a vector of the predictive mean for the associated quantity (logit, odds or prevalence).
standard.errors
: a vector of prediction standard errors (if standard.errors=TRUE
).
quantiles
: a matrix of quantiles of the resulting predictions with each column corresponding to a quantile specified through the argument quantiles
.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function performs Bayesian prediction for a geostatistical linear Gaussian model.
spatial.pred.linear.Bayes( object, grid.pred, predictors = NULL, type = "marginal", scale.predictions = c("logit", "prevalence", "odds"), quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, messages = TRUE )
spatial.pred.linear.Bayes( object, grid.pred, predictors = NULL, type = "marginal", scale.predictions = c("logit", "prevalence", "odds"), quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, messages = TRUE )
object |
an object of class "Bayes.PrevMap" obtained as result of a call to |
grid.pred |
a matrix of prediction locations. |
predictors |
a data frame of the values of the explanatory variables at each of the locations in |
type |
a character indicating the type of spatial predictions: |
scale.predictions |
a character vector of maximum length 3, indicating the required scale on which spatial prediction is carried out: "logit", "prevalence" and "odds". Default is |
quantiles |
a vector of quantiles used to summarise the spatial predictions. |
standard.errors |
logical; if |
thresholds |
a vector of exceedance thresholds; default is |
scale.thresholds |
a character value indicating the scale on which exceedance thresholds are provided: |
messages |
logical; if |
A "pred.PrevMap" object list with the following components: logit
; prevalence
; odds
; exceedance.prob
, corresponding to a matrix of the exceedance probabilities where each column corresponds to a specified value in thresholds
; grid.pred
prediction locations.
Each of the three components logit
, prevalence
and odds
is also a list with the following components:
predictions
: a vector of the predictive mean for the associated quantity (logit, odds or prevalence).
standard.errors
: a vector of prediction standard errors (if standard.errors=TRUE
).
quantiles
: a matrix of quantiles of the resulting predictions with each column corresponding to a quantile specified through the argument quantiles
.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function performs spatial prediction, fixing the model parameters at the maximum likelihood estimates of a linear geostatistical model.
spatial.pred.linear.MLE( object, grid.pred, predictors = NULL, predictors.samples = NULL, type = "marginal", scale.predictions = c("logit", "prevalence", "odds"), quantiles = c(0.025, 0.975), n.sim.prev = 0, standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, messages = TRUE, include.nugget = FALSE )
spatial.pred.linear.MLE( object, grid.pred, predictors = NULL, predictors.samples = NULL, type = "marginal", scale.predictions = c("logit", "prevalence", "odds"), quantiles = c(0.025, 0.975), n.sim.prev = 0, standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, messages = TRUE, include.nugget = FALSE )
object |
an object of class "PrevMap" obtained as result of a call to |
grid.pred |
a matrix of prediction locations. |
predictors |
a data frame of the values of the explanatory variables at each of the locations in |
predictors.samples |
a list of data frame objects. This argument is used to average over repeated simulations of the predictor variables in order to obtain an "average" map over the distribution of the explanatory variables in the model.
Each component of the list is a simulation. The number of simulations passed through |
type |
a character indicating the type of spatial predictions: |
scale.predictions |
a character vector of maximum length 3, indicating the required scale on which spatial prediction is carried out: "logit", "prevalence" and "odds". Default is |
quantiles |
a vector of quantiles used to summarise the spatial predictions. |
n.sim.prev |
number of simulation for non-linear predictive targets. Default is |
standard.errors |
logical; if |
thresholds |
a vector of exceedance thresholds; default is |
scale.thresholds |
a character value indicating the scale on which exceedance thresholds are provided; |
messages |
logical; if |
include.nugget |
logical; if |
A "pred.PrevMap" object list with the following components: logit
; prevalence
; odds
; exceedance.prob
, corresponding to a matrix of the exceedance probabilities where each column corresponds to a specified value in thresholds
; grid.pred
prediction locations; samples
, corresponding to the predictive samples of the linear predictor (only if any(scale.predictions=="prevalence")
).
Each of the three components logit
, prevalence
and odds
is also a list with the following components:
predictions
: a vector of the predictive mean for the associated quantity (logit, odds or prevalence).
standard.errors
: a vector of prediction standard errors (if standard.errors=TRUE
).
quantiles
: a matrix of quantiles of the resulting predictions with each column corresponding to a quantile specified through the argument quantiles
.
samples
: If n.sim.prev > 0
, the function returns n.sim.prev
samples of the linear predictor at each of the prediction locations.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function performs spatial prediction, fixing the model parameters at the maximum likelihood estimates of a linear geostatistical model.
spatial.pred.lm.ps( object, grid.pred = NULL, predictors = NULL, predictors.intensity = NULL, control.mcmc = NULL, target = 3, type = "marginal", quantiles = NULL, standard.errors = FALSE, messages = TRUE, return.samples = FALSE )
spatial.pred.lm.ps( object, grid.pred = NULL, predictors = NULL, predictors.intensity = NULL, control.mcmc = NULL, target = 3, type = "marginal", quantiles = NULL, standard.errors = FALSE, messages = TRUE, return.samples = FALSE )
object |
an object of class "PrevMap" obtained as result of a call to |
grid.pred |
a matrix of prediction locations. Default is |
predictors |
a data frame of the values of the explanatory variables at each of the locations in |
predictors.intensity |
a data frame of the values of the explanatory variables at each of the locations in |
control.mcmc |
output from |
target |
an integeter indicating the predictive target: |
type |
a character indicating the type of spatial predictions for |
quantiles |
a vector of quantiles used to summarise the spatial predictions. |
standard.errors |
logical; if |
messages |
logical; if |
return.samples |
logical; if |
A "pred.PrevMap.ps" object list with the following components: response
(if target=1
or target=3
) and intensity
(if target=2
pr target=3
).
grid.pred
prediction locations.
Each of the components intensity
and response
is a list with the following components:
predictions
: a vector of the predictive mean for the corresponding target.
standard.errors
: a vector of prediction standard errors (if standard.errors=TRUE
).
quantiles
: a matrix of quantiles of the resulting predictions with each column corresponding to a quantile specified through the argument quantiles
.
samples
: a matrix corresponding to the predictive samples of the predictive target (only if return.samples=TRUE
), with each row corresponding to a samples and column to a prediction location.
In the case of a model with a mix of preferential and non-preferential data, if target=1
or target=3
, each of the above components will be a list with two components,
namely preferential
and non.preferential
, associated with response
.
Emanuele Giorgi [email protected]
This function performs spatial prediction, fixing the model parameters at the Monte Carlo maximum likelihood estimates of a geostatistical Poisson model with log link function.
spatial.pred.poisson.MCML( object, grid.pred, predictors = NULL, control.mcmc, type = "marginal", scale.predictions = c("log", "exponential"), quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, plot.correlogram = FALSE, messages = TRUE )
spatial.pred.poisson.MCML( object, grid.pred, predictors = NULL, control.mcmc, type = "marginal", scale.predictions = c("log", "exponential"), quantiles = c(0.025, 0.975), standard.errors = FALSE, thresholds = NULL, scale.thresholds = NULL, plot.correlogram = FALSE, messages = TRUE )
object |
an object of class "PrevMap" obtained as result of a call to |
grid.pred |
a matrix of prediction locations. |
predictors |
a data frame of the values of the explanatory variables at each of the locations in |
control.mcmc |
output from |
type |
a character indicating the type of spatial predictions: |
scale.predictions |
a character vector of maximum length 2, indicating the required scale on which spatial prediction is carried out: "log" and "exponential". Default is |
quantiles |
a vector of quantiles used to summarise the spatial predictions. |
standard.errors |
logical; if |
thresholds |
a vector of exceedance thresholds; default is |
scale.thresholds |
a character value indicating the scale on which exceedance thresholds are provided; |
plot.correlogram |
logical; if |
messages |
logical; if |
A "pred.PrevMap" object list with the following components: log
; exponential
; exceedance.prob
, corresponding to a matrix of the exceedance probabilities where each column corresponds to a specified value in thresholds
; samples
, corresponding to a matrix of the predictive samples at each prediction locations for the linear predictor of the Poisson model (if scale.predictions="log"
this component is NULL
); grid.pred
prediction locations.
Each of the three components log
and exponential
is also a list with the following components:
predictions
: a vector of the predictive mean for the associated quantity (log or exponential).
standard.errors
: a vector of prediction standard errors (if standard.errors=TRUE
).
quantiles
: a matrix of quantiles of the resulting predictions with each column corresponding to a quantile specified through the argument quantiles
.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
summary
method for the class "Bayes.PrevMap" that computes the posterior mean, median, mode and high posterior density intervals using samples from Bayesian fits.
## S3 method for class 'Bayes.PrevMap' summary(object, hpd.coverage = 0.95, ...)
## S3 method for class 'Bayes.PrevMap' summary(object, hpd.coverage = 0.95, ...)
object |
an object of class "Bayes.PrevMap" obatained as result of a call to |
hpd.coverage |
value of the coverage of the high posterior density intervals; default is |
... |
further arguments passed to or from other methods. |
A list with the following values
linear
: logical value that is TRUE
if a linear model was fitted and FALSE
otherwise.
binary
: logical value that is TRUE
if a binary model was fitted and FALSE
otherwise.
probit
: logical value that is TRUE
if a binary model with probit link function was fitted and FALSE
if with logistic link function.
ck
: logical value that is TRUE
if a low-rank approximation was fitted and FALSE
otherwise.
beta
: matrix of the posterior summaries for the regression coefficients.
sigma2
: vector of the posterior summaries for sigma2
.
phi
: vector of the posterior summaries for phi
.
tau2
: vector of the posterior summaries for tau2
.
call
: matched call.
kappa
: fixed value of the shape paramter of the Matern covariance function.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
summary
method for the class "PrevMap" that computes the standard errors and p-values of likelihood-based model fits.
## S3 method for class 'PrevMap' summary(object, log.cov.pars = TRUE, ...)
## S3 method for class 'PrevMap' summary(object, log.cov.pars = TRUE, ...)
object |
an object of class "PrevMap" obatained as result of a call to |
log.cov.pars |
logical; if |
... |
further arguments passed to or from other methods. |
A list with the following components
linear
: logical value; linear=TRUE
if a linear model was fitted and linear=FALSE
otherwise.
poisson
: logical value; poisson=TRUE
if a Poisson model was fitted and poisson=FALSE
otherwise.
ck
: logical value; ck=TRUE
if a low-rank approximation was used and ck=FALSE
otherwise.
spde
: logical value; spde=TRUE
if the SPDE approximation was used and spde=FALSE
otherwise.
coefficients
: matrix of the estimates, standard errors and p-values of the estimates of the regression coefficients.
cov.pars
: matrix of the estimates and standard errors of the covariance parameters.
log.lik
: value of likelihood function at the maximum likelihood estimates.
kappa
: fixed value of the shape paramter of the Matern covariance function.
kappa.t
: fixed value of the shape paramter of the Matern covariance function for the temporal covariance matrix, if a spatio-temporal model has been fitted.
fixed.rel.nugget
: fixed value for the relative variance of the nugget effect.
call
: matched call.
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
summary
method for the class "PrevMap" that computes the standard errors and p-values of likelihood-based model fits.
## S3 method for class 'PrevMap.ps' summary(object, log.cov.pars = TRUE, ...)
## S3 method for class 'PrevMap.ps' summary(object, log.cov.pars = TRUE, ...)
object |
an object of class "PrevMap.ps" obatained as result of a call to |
log.cov.pars |
logical; if |
... |
further arguments passed to or from other methods. |
A list with the following components
coefficients.response
: matrix of the estimates, standard errors and p-values of the estimates of the regression coefficients for the response variable.
coefficients.intensity
: matrix of the estimates, standard errors and p-values of the estimates of the regression coefficients for the sampling intenisty of the log-Gaussian process.
cov.pars.response
: matrix of the estimates and standard errors of the covariance parameters for the Gaussian process associated with the response.
cov.pars.intenisty
: matrix of the estimates and standard errors of the covariance parameters for the Gaussian process associated with the log-Gaussian process.
log.lik
: value of likelihood function at the maximum likelihood estimates.
kappa.response
: fixed value of the shape paramter of the Matern covariance function.
call
: matched call.
Emanuele Giorgi [email protected]
Displays the trace-plots for the posterior samples of the model parameters and spatial random effects.
trace.plot(object, param, component.beta = NULL, component.S = NULL)
trace.plot(object, param, component.beta = NULL, component.S = NULL)
object |
an object of class 'Bayes.PrevMap'. |
param |
a character indicating for which component of the model the density plot is required: |
component.beta |
if |
component.S |
if |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
Trace-plots of the MCMC samples from the importance sampling distribution used in binomial.logistic.MCML
.
trace.plot.MCML(object, component = NULL, ...)
trace.plot.MCML(object, component = NULL, ...)
object |
an object of class "PrevMap" obatained as result of a call to |
component |
a positive integer indicating the number of the random effect component for which a trace-plot is required. If |
... |
further arguments passed to |
Emanuele Giorgi [email protected]
Peter J. Diggle [email protected]
This function produces a plot of the variable of interest against each of the two geographical coordinates.
trend.plot(data, var.name, coords, ...)
trend.plot(data, var.name, coords, ...)
data |
an object of class "data.frame" containing the data. |
var.name |
a |
coords |
a |
... |
additional arguments to be passed to |
This function performs model validation for generalized linear geostatistical models (Binomial and Poisson) using Monte Carlo methods based on the variogram.
variog.diagnostic.glgm( object, n.sim = 200, uvec = NULL, plot.results = TRUE, which.test = "both" )
variog.diagnostic.glgm( object, n.sim = 200, uvec = NULL, plot.results = TRUE, which.test = "both" )
object |
an object of class "PrevMap" obtained as an output from |
n.sim |
integer indicating the number of simulations used for the variogram-based diagnostics.
Defeault is |
uvec |
a vector with values used to define the variogram binning. If |
plot.results |
if |
which.test |
a character specifying which test for residual spatial correlation is to be performed: "variogram", "test statistic" or "both". The default is |
The function takes as an input through the argument object
a fitted
generalized linear geostaistical model for an outcome , with linear predictor
where is a vector of covariates which are specified through
formula
, is a spatial Gaussian process and the
are assumed to be zero-mean Gaussian.
The model validation is performed on the adopted satationary and isotropic Matern covariance function used for
.
More specifically, the function allows the users to select either of the following validation procedures.
Variogram-based graphical validation
This graphical diagnostic is performed by setting which.test="both"
or which.test="variogram"
. The output are 95
(see below lower.lim
and upper.lim
) that are generated under the assumption that the fitted model did generate the analysed data-set.
This validation procedure proceed through the following steps.
1. Obtain the mean, say , of the
conditioned on the data
and by setting
in the equation above.
2. Compute the empirical variogram using
3. Simulate n.sim
data-sets under the fitted geostatistical model.
4. For each of the simulated data-sets and obtain as in Step 1.
Finally, compute the empirical variogram based on the resulting
.
5. From the n.sim
variograms obtained in the previous step, compute the 95
If the observed variogram (obs.variogram
below), based on the from Step 2, falls within the 95
evidence against the fitted spatial correlation model; if, instead, that partly falls outside the 95
correlation in the data.
Test for suitability of the adopted correlation function
This diagnostic test is performed if which.test="both"
or which.test="test statistic"
. Let and
denote the empirical and theoretical variograms based on
for the distance bin
.
The test statistic used for testing residual spatial correlation is
where is the number of pairs of data-points falling within the distance bin
(
n.bins
below).
To obtain the distribution of the test statistic under the null hypothesis that the fitted model did generate the analysed data-set, we use the simulated empirical variograms as obtained in step 5 of the iterative procedure described in "Variogram-based graphical validation."
The p-value for the test of suitability of the fitted spatial correlation function is then computed by taking the proportion of simulated values for
that are larger than the value of
based on the original
in Step 1.
An object of class "PrevMap.diagnostic" which is a list containing the following components:
obs.variogram
: a vector of length length(uvec)-1
containing the values of the variogram for each of
the distance bins defined through uvec
.
distance.bins
: a vector of length length(uvec)-1
containing the average distance within each of the distance bins
defined through uvec
.
n.bins
: a vector of length length(uvec)-1
containing the number of pairs of data-points falling within each distance bin.
lower.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the lower limits of the 95
generated under the assumption of absence of suitability of the fitted model at each fo the distance bins defined through uvec
.
upper.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the upper limits of the 95
generated under the assumption of absence of suitability of the fitted model at each fo the distance bins defined through uvec
.
mode.rand.effects
: the predictive mode of the random effects from the fitted non-spatial generalized linear mixed model.
p.value
: (available only if which.test="both"
or which.test="test statistic"
) p-value of the test for residual spatial correlation.
lse.variogram
: (available only if lse.variogram=TRUE
) a vector of length length(uvec)-1
containing the values of the estimated Matern variogram via a weighted least square fit.
This function performs model validation for linear geostatistical model using Monte Carlo methods based on the variogram.
variog.diagnostic.lm( object, n.sim = 1000, uvec = NULL, plot.results = TRUE, range.fact = 1, which.test = "both", param.uncertainty = FALSE )
variog.diagnostic.lm( object, n.sim = 1000, uvec = NULL, plot.results = TRUE, range.fact = 1, which.test = "both", param.uncertainty = FALSE )
object |
an object of class "PrevMap" obtained as an output from |
n.sim |
integer indicating the number of simulations used for the variogram-based diagnostics.
Defeault is |
uvec |
a vector with values used to define the variogram binning. If |
plot.results |
if |
range.fact |
a value between 0 and 1 used to disregard all distance bins provided through |
which.test |
a character specifying which test for residual spatial correlation is to be performed: "variogram", "test statistic" or "both". The default is |
param.uncertainty |
a logical indicating whether uncertainty in the model parameters should be incorporated in the selected diagnostic tests. Default is |
The function takes as an input through the argument object
a fitted
linear geostaistical model for an outcome , which is expressed as
where is a vector of covariates which are specified through
formula
, is a spatial Gaussian process and the
are assumed to be zero-mean Gaussian.
The model validation is performed on the adopted satationary and isotropic Matern covariance function used for
.
More specifically, the function allows the users to select either of the following validation procedures.
Variogram-based graphical validation
This graphical diagnostic is performed by setting which.test="both"
or which.test="variogram"
. The output are 95
(see below lower.lim
and upper.lim
) that are generated under the assumption that the fitted model did generate the analysed data-set.
This validation procedure proceed through the following steps.
1. Obtain the mean, say , of the
conditioned on the data
.
2. Compute the empirical variogram using
3. Simulate n.sim
data-sets under the fitted geostatistical model.
4. For each of the simulated data-sets and obtain as in Step 1.
Finally, compute the empirical variogram based on the resulting
.
5. From the n.sim
variograms obtained in the previous step, compute the 95
If the observed variogram (obs.variogram
below), based on the from Step 2, falls within the 95
evidence against the fitted spatial correlation model; if, instead, that partly falls outside the 95
correlation in the data.
Test for suitability of the adopted correlation function
This diagnostic test is performed if which.test="both"
or which.test="test statistic"
. Let and
denote the empirical and theoretical variograms based on
for the distance bin
.
The test statistic used for testing residual spatial correlation is
where is the number of pairs of data-points falling within the distance bin
(
n.bins
below).
To obtain the distribution of the test statistic under the null hypothesis that the fitted model did generate the analysed data-set, we use the simulated empirical variograms as obtained in step 5 of the iterative procedure described in "Variogram-based graphical validation."
The p-value for the test of suitability of the fitted spatial correlation function is then computed by taking the proportion of simulated values for
that are larger than the value of
based on the original
in Step 1.
An object of class "PrevMap.diagnostic" which is a list containing the following components:
obs.variogram
: a vector of length length(uvec)-1
containing the values of the variogram for each of
the distance bins defined through uvec
.
distance.bins
: a vector of length length(uvec)-1
containing the average distance within each of the distance bins
defined through uvec
.
n.bins
: a vector of length length(uvec)-1
containing the number of pairs of data-points falling within each distance bin.
lower.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the lower limits of the 95
generated under the assumption of absence of suitability of the fitted model at each fo the distance bins defined through uvec
.
upper.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the upper limits of the 95
generated under the assumption of absence of suitability of the fitted model at each fo the distance bins defined through uvec
.
mode.rand.effects
: the predictive mode of the random effects from the fitted non-spatial generalized linear mixed model.
p.value
: (available only if which.test="both"
or which.test="test statistic"
) p-value of the test for residual spatial correlation.
lse.variogram
: (available only if lse.variogram=TRUE
) a vector of length length(uvec)-1
containing the values of the estimated Matern variogram via a weighted least square fit.
This function computes sample (empirical) variograms with options for the classical or robust estimators. Output can be returned as a binned variogram, a variogram cloud or a smoothed variogram. Data transformation (Box-Cox) is allowed. “Trends” can be specified and are fitted by ordinary least squares in which case the variograms are computed using the residuals.
variogram(data, var.name, coords, ...)
variogram(data, var.name, coords, ...)
data |
an object of class "data.frame" containing the data. |
var.name |
a |
coords |
a |
... |
additional arguments to be passed to |
An object of the class "variogram" which is list containing components as detailed in variog
.