Chapter 3 Modelling temporal variation

Vital rates are expected to vary among years due to changes in environmental conditions. SPI-IPM is set up to account for among-year variation in different (age-specific) vital rates \(X_{a,t}\) using fixed effects of supplied environmental covariates (\(cov1_t\), \(cov2_t\),…) and year random effects (\(\epsilon_t^X\)). The resulting generalized linear mixed-models for time- and age-specific vital rates therefore take the following form: \[\begin{equation} link(X_{a,t}) = link(\mu_a^X) + \beta_{cov1}^X\times cov1_t + \beta_{cov2}^X\times cov2_t + ... + \epsilon_t^X \end{equation}\] Here, \(\mu_a^X\) is the age-specific average vital rate (intercept) and \(\beta_{cov1}\) and \(\beta_{cov2}\) are the slopes for the effects of covariates \(cov1\) and \(cov2\) on the link scale, respectively. The link function depends on the vital rate, and is set to logit for breeding probabilities (\(pB_{a,t}\)), nest success probabilities (\(pNS_t\)), and survival probabilities (\(sN_{a,t}\), \(sJ_t\), \(sA_t\)) and log for clutch size (\(CS_{a,t}\)).

3.1 Random year variation

The random year effects included in the basic implementation of SPI-IPM are assumed to be normally distributed such that \[\begin{equation} \epsilon_{t}^X \sim Normal(0, \sigma^X) \end{equation}\] where \(\sigma^X\) is the standard deviation of random year effects on vital rate \(X_{a,t}\). The random effects in the basic implementation are also age-independent, meaning that \(\epsilon_t^X\) is included in the equations for both the yearling vital rate \(X_{Y,t}\) and the adults vital rate \(X_{A,t}\). The exception are the annual survival rates for juveniles (\(sJ_t\)) and (\(sA_t\)), which both have separate random effects because the drivers of survival variation are expected to vary between those two age classes, and the basic implementation includes an additional covariate effect \(sJ_t\) only²:

for(t in 1:Tmax){
    
  ## Age- and time-dependent survival probabilities
  logit(sJ[t]) <- logit(Mu.sJ) + beta3.sJ*cov3[t] + epsilon.sJ[t]
  logit(sA[t]) <- logit(Mu.sA) + epsilon.sA[t]

  ## Temporal random effects
    epsilon.sJ[t] ~ dnorm(0, sd = sigma.sJ)
    epsilon.sA[t] ~ dnorm(0, sd = sigma.sA)
}

All random effects are treated as independent (= not correlated) in the basic implementation of the model, but the inclusion of correlation of random effects across age-classes and/or vital rates is straightforward to implement³ using e.g. multivariate normal distributions or approaches similar to the one used in Nater et al. (2020).

3.2 Temporal covariates

The basic implementation of SPI-IPM features an example covariate model structure that was motivated by an analysis of populations of pied flycatcher (Ficedula hypoleuca)) breeding in the UK. It involves three different covariates (\(cov1\), \(cov2\), and \(cov3\)) that are assumed to affect nest success probability (\(pNS_{t}\)), nestling survival (\(sN_{a,t}\)), and juvenile survival (\(sJ_t\)). However, it is very straightforward to alter the code to fit whatever alternative covariate structure is suitable for your particular analysis since the inclusion different/additional continuous and categorical covariates always works according to the same principles (see also Kéry and Schaub 2011). All covariates are need to be passed to SPI-IPM as vectors or arrays.

3.2.1 Continuous variables

Continuous variables are included as covariate effects using a specific slope parameter (\(\beta\)). Temporal effects then take the form \(\beta_{cov1}\times cov1_t\), where \(\beta_{cov1}\) is an estimated parameter quantifying strength and direction of the effect and \(cov1_t\) is the value of covariate \(cov1\) at time \(t\). In (generalized) linear mixed effects models as they are used in SPI-IPM, effects of several different covariates can just be added up on the link scale. For \(pNS_t\), for example, this codes as⁴:

for(t in 1:Tmax){
  logit(pNS[t]) <- logit(Mu.pNS) + beta1.pNS*cov1[t] + beta2.pNS*cov2[t] + epsilon.pNS[t]
}

The basic implementation of SPI-IPM includes the following covariate models: \[\begin{align*} & logit(pNS_t) = logit(\mu^{pNS}) + \beta_{cov1}^{pNS}\times cov1_t + \beta_{cov2}^{pNS} \times cov2_t + \epsilon_t^{pNS} \\ & logit(sN_{a,t}) = logit(\mu_a^{pNS}) + \beta_{cov1}^{sN}\times cov1_t + \beta_{cov2}^{sN} \times cov2_t + \epsilon_t^{sN} \\ & logit(sJ_t) = logit(\mu^{sJ}) + \beta_{cov3}^{sJ}\times cov3_t + \epsilon_t^{sJ} \end{align*}\] All covariates are continuous annual variables that have been standardized and centered (mean = 0, sd = 1) prior to analysis. \(cov1\) and \(cov2\) represent environmental conditions during the incubation and nestling period and hence influence nest success (\(pNS_t\)) and nestling survival (\(sN_{a,t}\)). In the case of the latter, covariates are further assumed to have the same magnitude of effect on the nests of yearling and adult females (i.e. the \(\beta\) parameters are independent of age). \(cov3\), on the other hand, symbolizes environmental conditions after fledging which impact juvenile annual survival (\(sJ_t\)). No covariate effects are included for the other vital rates.

3.2.2 Categorical variables

The basic SPI-IPM does not include any categorical covariates, but since such covariates may be relevant to a wide range of questions (e.g. some of the points raised in Chapter 8.2), I briefly illustrate how they could be included into vital rate models.

Generally, there are two approaches to modelling categorical covariates in this context.

The first approach works analogous to the approach for continuous covariates, i.e. it uses the form \(\beta_{cov}\times cov_t\). This is most relevant for binary categorical covariates that symbolize some sort of “on-off” process. An example of this would be if you would like to model the effect of an experimental treatment that has been performed in some years (\(cov_t = 1\)) but not others (\(cov_t = 0\)). Your binary covariate then works as a “switch” that determines whether or not the effect of the experimental treatment (\(\beta_{cov}\)) applied in a given year \(t\) or not since \(link(X_t) = link(\mu^X) + \beta_{cov_t}\times cov_t\) becomes \(link(\mu^X) + \beta_{cov_t}\) when \(cov_t = 1\) and \(link(\mu^X)\) when \(cov_t = 0\).

The second approach works via (nested) indexing and is more flexible since it can technically account for any number of levels in your categorical covariate. This can be relevant, for example, for categories of years (“good”, “average”, “bad”), habitat types (“deciduous forest”, “coniferous forest”), or individuals (“male”, “female”). The approach still uses \(\beta\) parameters, but instead of multiplying the \(\beta\) with the covariate value, we index the \(\beta\) by the covariate value such that \(\beta_1\) corresponds to the effect of category 1 (\(cat = 1\)), \(\beta_2\) corresponds to the effect of category 2 (\(cat = 2\)), and so on:

\[\begin{equation} link(X_{cat}) = link(\mu^X) + \beta_{cat} \end{equation}\] Priors then need to be provided for each category-specific \(\beta\).

In practice, SPI-IPM still requires vital rates \(X\) to be indexed by age class and year (at least with the population model described in Chapter 2). That’s where nested indexing becomes relevant. The relationship of a vital rate \(X_{a,t}\) with a categorical year covariate can, for example, be coded as follows:

for(a in 1:A){
  for(t in 1:Tmax){
      log(X[a,t]) <- log(Mu.X[a]) + beta[cov[t]]
  }
  Mu.X ~ dunif(0, 10)
  beta ~ dunif(-5, 5)
}

where cov[t] is a vector of integer numbers that represent the different year categories.

Introducing categorical effects that rely on additional structure beyond year and age (for example effects of sex or location) requires changing the underlying population model. Such extensions are currently not implemented in SPI-IPM, but see Chapter 8 for some perspectives.

3.2.3 Imputation of missing covariate values

Perhaps you have been wondering about how to deal with NAs in your covariate data? The good news is that SPI-IPM (just like any other Bayesian hierarchical model) can accommodate NAs in both continuous and categorical covariates. The (perhaps) less good news is that how well it works really depends on how large a proportion of your covariate data is NA.

There are three practical requirements for working with partially observed covariate data:

Your covariate data containing numbers for your observed covariate values and NAs for your unobserved/unknown covariate values
Initial values with the same dimensions as your covariate data containing numbers in the positions of NA covariate values and NAs in the positions of observed covariate values.
A model describing the distribution of missing covariate values.

Numbers 1. and 2. are pretty self-explanatory (but see Chapter 4.2 for more information on sampling initial values). Number 3. is going to depend on what type of covariate data you are dealing with.

The basic implementation of SPI-IPM is set up to be able to deal with NA values in the continuous temporal covariates \(cov1\), \(cov2\), and \(cov3\). These covariates are assumed to have been standardized and centered, i.e. they should more or less follow a \(Normal(mean= 0, sd = 1)\) distribution. If we assume that the observed and unobserved covariate values follow the same distribution (i.e. the missing values are a random subset of all values), this can be used to specify the process model for the missing covariates in the code:

for(t in 1:Tmax){
  cov1[t] ~ dnorm(0, sd = 1)
  cov2[t] ~ dnorm(0, sd = 1)
  cov3[t] ~ dnorm(0, sd = 1)
}

There are numerous alternatives for specifying distributions of missing values in continous covariates, and they can be accommodated by changing the above section in the code.

The number of candidate distributions are a bit more limited when there are missing values in categorical covariates. Chapter 8.2.1 outlines an example for dealing with partially missing information on individual age. It may also be helpful to remember that models with partially observed categorical variables are essentially “mixture models” including auxiliary data about the underlying distribution.

3.3 Notes on covariate selection

To be added later.

References

Kéry, Marc, and Michael Schaub. 2011. Bayesian Population Analysis Using Winbugs: A Hierarchical Perspective. Academic Press.

Nater, Chloé R, Yngvild Vindenes, Per Aass, Diana Cole, Øystein Langangen, S Jannicke Moe, Atle Rustadbakken, Daniel Turek, Leif Asbjørn Vøllestad, and Torbjørn Ergon. 2020. “Size-and Stage-Dependence in Cause-Specific Mortality of Migratory Brown Trout.” Journal of Animal Ecology 89 (9): 2122–33.

Note that the implementation of this in IPMSetup.R looks slightly different as it uses a vectorized formulation (calculations done over all time-steps simultaneously instead of using a for-loop). Vectorized calculations are a nifty feature available in NIMBLE (but not BUGS and JAGS); more on this in Chapter 4.1.↩︎
Wheter or not formally including random effects correlations is useful or not depends on the biological questions of interest and the amoung of data available. When testing a model with correlated random effects for juvenile and adult survival on seven datasets from breeding populations of pied flycatchers in the UK, I found that estimates did not differ from those obtained from a model with independent random effects, and the posterior distribution for the correlation coefficient was so wide that no inference on strength or direction of the correlation was possible.↩︎
see Footnote 1↩︎