Chapter 4 IPM Implementation
4.1 Efficient implementation using NIMBLE
SPI-IPM is implemented in a Bayesian framework and, more specifically, is fit using NIMBLE (de Valpine et al. 2017). In many ways, NIMBLE is the successor to BUGS and JAGS, using very similar syntax but offering higher efficiency, much more flexibility, and a range of smaller “quality-of-life improvements”. Unlike its predecessors, NIMBLE is not standalone software; it is installed via the R package nimble (de Valpine et al. 2021). Many resources on setting up and working with NIMBLE are available on the NIMBLE website, and this manual will not give a comprehensive overview of NIMBLE or the nimble R package. In the following, I will instead briefly outline some of NIMBLE’s features that SPI-IPM makes use of, and that may be good to be aware of particularly for people switching over from BUGS/JAGS. I will also briefly outline the structure of the basic call to NIMBLE to run a model.
4.1.1 Alternative specification of distributions
First, NIMBLE allows alternative specifications for statistical distributions (see the NIMBLE manual for an overview). This is particularly convenient for some commonly used distributions such as the normal distribution, which had to be specified via a mean and a precision (\(\tau\)) in BUGS/JAGS. The latter was then often related to the (some might say) more useful standard deviation (\(\sigma\)):
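A generic sketch of what this looked like in BUGS/JAGS (node and parameter names here are purely illustrative):

```
# BUGS/JAGS-style: normal distribution parameterised by mean and precision
X ~ dnorm(Mu, tau)
tau <- pow(sigma, -2)   # precision = 1 / variance
sigma ~ dunif(0, 10)    # prior placed on the standard deviation
```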
Looks familiar? NIMBLE allows the distribution parameters to be specified exactly as above, but also offers alternatives such as parameterisation by mean and standard deviation directly, e.g.:
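A minimal sketch of the alternative parameterisation (again with illustrative names):

```
# NIMBLE: same distribution, parameterised by mean and standard deviation
X ~ dnorm(Mu, sd = sigma)
sigma ~ dunif(0, 10)
```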
4.1.2 Vectorized calculations
A second useful feature is vectorized calculation of deterministic nodes. At some point, everyone has heard that when programming in R, vectorization outperforms for-loops when it comes to efficiency and speed. In many cases, this is also true for MCMC, and NIMBLE therefore supports vectorization of calculations for deterministic nodes (i.e. nodes assigned via <- in the code). SPI-IPM uses vectorized calculations instead of for-loops in a variety of cases, for example for defining the models underlying temporal variation in vital rates:
logit(pNS[1:Tmax]) <- logit(Mu.pNS) + beta1.pNS*cov1[1:Tmax] + beta2.pNS*cov2[1:Tmax] + epsilon.pNS[1:Tmax]
instead of
for(t in 1:Tmax){
  logit(pNS[t]) <- logit(Mu.pNS) + beta1.pNS*cov1[t] + beta2.pNS*cov2[t] + epsilon.pNS[t]
}
It is important to note that vectorization works for deterministic nodes, but – as of today – not for stochastic nodes (i.e. nodes assigned via ~).
4.1.3 Custom distributions
The third class of NIMBLE features that SPI-IPM capitalizes on is the ability to define custom distributions. Specifically, SPI-IPM uses a custom distribution in the likelihood for the mark-recapture data.
In most IPMs, the analysis of mark-recapture data to estimate survival represents
the bottleneck for MCMC efficiency: sampling hundreds – if not thousands – of
latent alive or dead states is computationally expensive and results in long
MCMC runtimes (Gimenez et al. 2007).
Summarising individual capture histories into “m-arrays” has long been the only way to reduce runtimes of Bayesian mark-recapture models (Kéry and Schaub 2011), but many find this format less intuitive, and it quickly becomes convoluted and impractical when working with predictors/groups other than age class. With the rise of NIMBLE, Turek, de Valpine, and Paciorek (2016) developed an approach to defining marginalized likelihoods that integrate over latent states and lead to tremendous increases in MCMC efficiency. Basic implementations of the marginalized likelihood for mark-recapture models are included in the nimbleEcology R package (Goldstein et al. 2021).
SPI-IPM uses an extended version of the dCJS_vv distribution contained in nimbleEcology, which not only integrates over latent states, but also runs on only unique – instead of all – capture histories (analogous to the goose example in Turek, de Valpine, and Paciorek 2016). Preliminary tests have shown that mark-recapture models with a likelihood specified using this distribution (dCJS_vv_sum) have very similar runtimes to implementations using m-arrays for small to medium-sized datasets, and outspeed m-array formulations for larger datasets. The specification of the custom distribution is contained in dCJS_CustomDist.R, and its implementation within the model code looks quite minimalistic:
## Likelihood with custom distribution
for (i in 1:n.CH){
  y.sum[i, first.sum[i]:last.sum[i]] ~ dCJS_vv_sum(
    probSurvive = phi.CH[i, first.sum[i]:last.sum[i]],
    probCapture = p.CH[i, first.sum[i]:last.sum[i]],
    len = last.sum[i] - first.sum[i] + 1,
    mult = CHs.count[i])
}
y.sum is a matrix containing all n.CH unique capture histories, and the likelihood is fit for the range of observations between the first capture of an individual (first.sum) and the last possible time this individual could still have been alive5 (last.sum). probSurvive and probCapture are vectors containing the survival and recapture probabilities relevant for the capture history (see IPMSetup.R and Chapter 2.2.4 for details). The argument len quantifies the length of y.sum and is required for successful compilation of the custom function. Finally, mult is an integer specifying how many individual birds shared capture history i.
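To illustrate how unique capture histories and their multiplicities relate to the full data, here is a minimal sketch in R. The object names y.sum and CHs.count follow the model code above, but the collapsing approach itself is a generic illustration, not necessarily how IPMSetup.R constructs these objects:

```r
# Full capture-history matrix: 4 individuals x 3 occasions (0 = not seen, 1 = seen)
CH <- matrix(c(1, 0, 1,
               1, 0, 1,
               0, 1, 1,
               1, 0, 1), nrow = 4, byrow = TRUE)

# Collapse rows to unique histories and count how often each occurs
keys <- apply(CH, 1, paste, collapse = "")
y.sum <- CH[!duplicated(keys), , drop = FALSE]  # 2 unique histories remain
CHs.count <- as.integer(table(keys)[apply(y.sum, 1, paste, collapse = "")])
# CHs.count supplies the 'mult' argument: here 3 and 1
```

Because the likelihood contribution of identical histories is identical, evaluating it once per unique history and weighting by mult gives the same posterior at a fraction of the cost.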
4.1.4 Calling NIMBLE from R
The simplest way to implement and run a Bayesian model in NIMBLE is via the wrapper function nimbleMCMC. The structure of the nimbleMCMC call is very similar to the functions provided with the different R packages for running BUGS/JAGS and – for SPI-IPM – looks like this (code line 49 in IPMRun_PopID.R):
SPI.IPM <- nimbleMCMC(code = SPI.IPMcode, constants = SPI.IPMconstants,
                      data = SPI.IPMdata, inits = Inits, monitors = parameters,
                      niter = ni, nburnin = nb, nchains = nc, thin = nt,
                      setSeed = mySeed, samplesAsCodaMCMC = TRUE)
code is the model code, formatted with nimbleCode. constants and data contain all information relevant for parameterising the model, as well as all the observational data (see Chapter 1.4). inits is a list of initial values that is described further below, and monitors is a character vector containing the names of all parameters that should be monitored. niter, nburnin, nchains, and thin are the number of iterations, the number of samples to discard as burn-in, the number of chains, and the thinning interval for the MCMC. setSeed is absolutely crucial! By setting the seed within the call to NIMBLE, your entire MCMC becomes reproducible, which is invaluable also for trouble-shooting. The final argument, samplesAsCodaMCMC, is for people like me who like to get back raw samples and make their own summaries, instead of having the function return larger objects containing the samples along with a variety of summaries and other stats.
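With samplesAsCodaMCMC = TRUE, the returned samples can be processed directly with functions from the coda package, for example (a sketch, assuming the full-run object SPI.IPM from above):

```r
library(coda)

summary(SPI.IPM)        # posterior means, SDs, and quantiles per monitored parameter
gelman.diag(SPI.IPM)    # Gelman-Rubin convergence diagnostic across chains
effectiveSize(SPI.IPM)  # effective sample sizes
```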
4.2 Simulation of initial values
As a general rule, all parameters that have priors provided for them within Bayesian models also require initial values. For SPI-IPM, this applies to:

- Vital rate averages (\(\mu\) / Mu)
- Environmental effects on vital rates (\(\beta\) / beta)
- Standard deviations for year random effects on vital rates (\(\sigma\) / sigma)
- Initial population sizes (localN[,1])
- Average and standard deviation of immigrant numbers (AvgImm, sigma.Imm)
- Missing covariate values (subset of cov1, cov2, cov3)
- Missing observation probabilities (subset of PropImmDetect)
Provided that initial values for all of these parameters have been supplied, NIMBLE is able to calculate initial values for all of the remaining downstream nodes using the relationships specified in the model code. In practice, however, the more data sources and sub-models are included in an integrated model, the more likely it is that this “automatic” initialization of nodes will result in conflicts, i.e. initial values that are not compatible with each other.
While it may work for a single IPM to re-sample initial values repeatedly until a set without conflicts has been found by chance, this is vastly impractical for a framework like SPI-IPM, which is designed to also work well for comparative analyses, i.e. may be run on several datasets/populations at a time. For that reason, manual initialization of all nodes makes sense for SPI-IPM and is implemented in the function SPI_IPM.inits (in file InitSim.R).
The function runs a simulation of the entire population model, including the reconstruction of missing covariate and detection parameter values in four steps:
- Simulation of missing covariate values
- Simulation of age- and time-specific vital rates
- Simulation of initial population size and population trajectory over the study period, incl. immigration
- Simulation of missing detection parameters
The elements passed to the function are the collated data and constants (IPM.data and IPM.constants, see Chapter 1.4), as well as a logical argument determining whether initial values for year random effects should be sampled from their simulated distribution or set to 0 (sampleRE).
To sample initial values for several chains, the function is called multiple times within a list. For example, to initialize two chains:

Inits <- list(
  SPI_IPM.inits(IPM.data = SPI.IPMdata, IPM.constants = SPI.IPMconstants, sampleRE = FALSE),
  SPI_IPM.inits(IPM.data = SPI.IPMdata, IPM.constants = SPI.IPMconstants, sampleRE = FALSE))
Note that in the example I sample the initial values and store them in an object called Inits, which I then pass to nimbleMCMC afterwards, instead of calling SPI_IPM.inits directly within the call to nimbleMCMC. The reason for “pre-sampling” initial values in this way is that it makes trouble-shooting much easier: if initialization problems occur at specific nodes, I can investigate both the associated data AND the pre-sampled initial values to work out the cause of the discrepancy.
For the purpose of facilitating trouble-shooting, but also to promote reproducibility, it is good practice to set a seed prior to pre-sampling initial values (in IPMRun_PopID.R, for example, the seed is set right at the start of the script).
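In practice this amounts to a single line at the top of the script (the value assigned to mySeed is arbitrary; any fixed integer will do):

```r
mySeed <- 42
set.seed(mySeed)  # makes the pre-sampling of initial values reproducible
```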
4.3 Test runs and full runs: chains, iterations, burn-in, and thinning
If you have ample experience working with Bayesian models and MCMC, you can skip this section. The following is a (perhaps obvious) practical tip for users who are newer to Bayesian modelling and/or IPMs: always do a test run of your implementation to make sure it works before running a long MCMC!
In the original code, SPI-IPM is set up for a full run of 4 chains with 200 000 iterations each (see code lines 26-29 in IPMRun_PopID.R). The first 50 000 iterations of each chain are discarded as burn-in and the remainder thinned by 30, resulting in a combined posterior consisting of 4 \(\times\) 5000 = 20 000 samples. This is rather generous, and it is not unlikely that your analysis will require substantially fewer iterations to reach convergence (see Chapter 5.1 for how to assess convergence).
Nonetheless, running the full MCMC can easily take several hours, and ideally you would discover issues such as bad initialization or mistakes made during code adjustments before having to wait that long. That’s where short test runs become useful. The shortest MCMC you can run is two iterations long, with no burn-in and a thinning interval of 1 (= no thinning). By running that first, you can check that your model builds smoothly and contains no unwanted NAs, and that initialization works as it should and creates no conflicts. With slightly more iterations in a test chain, you can also make sure that all nodes that should be updating are in fact updating. Most importantly, this will allow you to test your model’s setup and trouble-shoot some basic implementation issues without having to wait for hours.
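Such a minimal test run can reuse the exact objects from the full-run call, only with the MCMC settings dialled down (a sketch based on the nimbleMCMC call shown in Section 4.1.4):

```r
# Minimal test run: 2 iterations, no burn-in, no thinning, 1 chain
test.run <- nimbleMCMC(code = SPI.IPMcode, constants = SPI.IPMconstants,
                       data = SPI.IPMdata, inits = Inits, monitors = parameters,
                       niter = 2, nburnin = 0, nchains = 1, thin = 1,
                       setSeed = mySeed, samplesAsCodaMCMC = TRUE)
```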
4.4 Trouble-shooting implementation issues
Specific content will be added later. In the meantime, a lot of helpful information can be found in the materials on the NIMBLE website and via the Google group “nimble-users”.
For issues that are related specifically to SPI-IPM, you can also get in touch via email.
References
de Valpine, Perry, Christopher Paciorek, Daniel Turek, Nick Michaud, Cliff Anderson-Bergman, Fritz Obermeyer, Claudia Wehrhahn Cortes, Abel Rodrìguez, Duncan Temple Lang, and Sally Paganin. 2021. NIMBLE: MCMC, Particle Filtering, and Programmable Hierarchical Modeling (version 0.12.1). https://doi.org/10.5281/zenodo.1211190.
de Valpine, Perry, Daniel Turek, Christopher J Paciorek, Clifford Anderson-Bergman, Duncan Temple Lang, and Rastislav Bodik. 2017. “Programming with Models: Writing Statistical Algorithms for General Model Structures with Nimble.” Journal of Computational and Graphical Statistics 26 (2): 403–13.
Gimenez, Olivier, Vivien Rossi, Rémi Choquet, Camille Dehais, Blaise Doris, Hubert Varella, Jean-Pierre Vila, and Roger Pradel. 2007. “State-Space Modelling of Data on Marked Individuals.” Ecological Modelling 206 (3-4): 431–38.
Goldstein, Benjamin R., Daniel Turek, Lauren Ponisio, and Perry de Valpine. 2021. “nimbleEcology: Distributions for Ecological Models in nimble.” https://cran.r-project.org/package=nimbleEcology.
Kéry, Marc, and Michael Schaub. 2011. Bayesian Population Analysis Using WinBUGS: A Hierarchical Perspective. Academic Press.
Turek, Daniel, Perry de Valpine, and Christopher J Paciorek. 2016. “Efficient Markov Chain Monte Carlo Sampling for Hierarchical Hidden Markov Models.” Environmental and Ecological Statistics 23 (4): 549–64.
In practice this can be set to e.g. the year of the last recorded observation of the individual plus the maximum lifespan, plus some margin. In my analysis of UK-breeding pied flycatchers, for example, I have set it – generously – to 20 years after the last recorded capture.↩︎