Supplemental Material for ‘Bayesian Clustered Coeffcients Regression with Auxiliary Covariates Assistant Random Effects’

This webpage is created as an online supplementary material for the manuscript Bayesian Clustered Coefficients Regression with Auxiliary Covariates Assistant Random Effects. We present our modeling code using the nimble package (Valpine et al. 2017), as well as code to perform posterior inference.

1. The Main nimble Function

We start this section with introducing two helper functions. Let S be the number of locations. The mineigen function, as its name suggests, takes a square matrix \(\in \mathbf{R}^{S\times S}\), and returns its smallest eigenvalue.

library(nimble)
library(coda)
## locations
S <- 159
mineigen <- nimbleFunction(
  run = function(mat = double(2, c(S, S))) {
    returnType(double(0))
    return(min(eigen(mat, symmetric = TRUE)$values))
  }
)

Secondly comes another helper function. This function is used for calculating \(\pi_i (i=1,\cdots, k)\) in Step 2-Step 4 introduced in Section 2.2. The input \(r\) (representing the \(\eta\)’s) is a scalar of length \(M\), the maximum number of clusters specified before. The \(\pi\) is also a scalar of length \(M\), and the first \(k\) elements are given the same values of \(\eta\), with the remaining \(M-k\) elements being zero.

piFun2 <- nimbleFunction(
  run = function(r = double(1)) {
    rlength <- length(r)
    rsum <- rep(0, rlength)
    pi <- rep(0, rlength)
    rsum[1] <- pi[1] <- r[1]
    for (i in 2:rlength) {
      rsum[i] <- rsum[i - 1] + r[i]
      if (rsum[i] >= 1) {
        pi[i] <- 1 - rsum[i - 1]
      }
      else {pi[i] <- r[i]}
    }
    for (i in 1:rlength) {
      if (pi[i] < 0) {pi[i] <- 0}
    }
    returnType(double(1))
    return(pi)
  }
)

The main function is as follows. The second line specifies the prior distribution of \(\tau_y\) to be \(\text{Gamma}(1,1)\). The loop starting on line 4 corresponds to the first line in Equation (2.12), where \(Y(s_i)\) follows normal distribution with mean \(X(s_i)\beta_{z_i} + w(s_i)\) and standard deviation \(1/\tau_y\). Lines 15 to 20 correspond to the MFM specification for each of the 159 \(z_i\)’s. Lines 22 to 26 specifies the distribution of the \(\beta\) vector for each cluster to be multivariate normal. The prior of the hyperparameter \(\tau_\beta\) to be Gamma(1,1). Lines 28-30 indicate that each of the means for \(beta\) vector has the standard normal distribution prior.

The multivariate normal distribution for the spatial random effect is declared in line 33. The constraint in lines 34 and 35 ensures all samples are guaranteed to produce a valid, positive definite matrix. The parameter alpha in the code denotes \(\sigma^2\) in Equation (2.12), which controls for the overall variation of the spatial random effect. Lines 39-41 correspond to line 3 of Equation (2.12), i.e., the covariance matrix is a linear combination of several matrices. Line 44 corresponds to line 4 of (2.12). Lines 46-48 indicate that we use the exponential kernel in calculating the similarity matrices. Finally, lines 50-52 assign gamma priors to the bandwidths in the exponential kernel.

SLMMCode <- nimbleCode({
  tau_y ~ dgamma(1, 1)
  
  for (i in 1:S) {
    y[i] ~ dnorm(mu_y[i], tau = tau_y)
    mu_y[i] <- b[i, 1] + b[i, 2] * x1[i] + b[i, 3] * x2[i] +
               b[i, 4] * x3[i] + W[i]
    
    b[i, 1:4] <- bm[latent[i], 1:4]
    latent[i] ~ dcat(pi[1:M])
    
    Dev[i] <- log(2 * 3.14159) - log(tau_y) + tau_y * (y[i] - mu_y[i])^2
  }
  ### clustered coefficients: MFM
  for (i in 1:M) {
    r[i] ~ dexp(rate = lambda)
  }
  pi[1:M] <- piFun2(r = r[1:M])
  
  lambda ~ dlnorm(0, varlog = 1)
  
  for (k in 1:M) {
    bm[k, 1:4] ~ dmnorm(mu_bm[1:4], cov = var_bm[1:4, 1:4])
  }
  var_bm[1:4, 1:4] <- 1/tau_bm * diag(rep(1, 4))  
  tau_bm ~ dgamma(1, 1)
  
  for (j in 1:4) {
    mu_bm[j] ~ dnorm(0, 1)
  }
  
  ### spatial random effect: covariance regression
  W[1:S] ~ dmnorm(mu_w[1:S], cov = Sigma[1:S, 1:S])
  eg <- mineigen(Sigma[1:S, 1:S])
  constraint_data ~ dconstraint(eg > 0)
  
  alpha ~ dinvgamma(1, 1)
  
  Sigma[1:S, 1:S] <- alpha * (beta[1] * diag(rep(1, S))[1:S, 1:S] +
                              beta[2] * W1[1:S, 1:S] + beta[3] * W2[1:S, 1:S] +
                              beta[4] * W3[1:S, 1:S])
  
  
  beta[1:4] ~ ddirch(mbeta[1:4])
  
  W1[1:S, 1:S] <- exp(-Q1[1:S, 1:S] / kappa[1])
  W2[1:S, 1:S] <- exp(-Q2[1:S, 1:S] / kappa[2])
  W3[1:S, 1:S] <- exp(-Q3[1:S, 1:S] / kappa[3])

  for (k in 1:3) {
    kappa[k] ~ dgamma(1, 1)
  }
})

2. A Simulated Example

We adopt the first balanced true cluster configuration, and Model 3 from the manuscript (Equation 4.3). After the true cluster memberships are generated, the matrix of \(\beta\) is created with 3 rows corresponding to 3 covariates, and 159 columns corresponding to each of the 159 locations.

set.seed(5)

true_cluster <- sample(1:3, 159, replace = TRUE)
betaMat <- matrix(nrow = 3, ncol = 159)
betaMat[ , true_cluster == 1] <- c(4, 1, -2)
betaMat[ , true_cluster == 2] <- c(1, 1, 0)
betaMat[ , true_cluster == 3] <- c(1, -2, -1)

Next we generate three covariates together with two auxiliary covariates, all from the Uniform(0,1) distribution. Two similarity matrices are constructed using \(\kappa_1 = 5\) and \(\kappa_2 = 3\) (lines 9 and 10). Then \(\boldsymbol{Y}~\) is generated according to Equation (4.3) in the manuscript, where the spatial random effect consists of factors both from the distance and the auxiliary covariates.

gcd <- readRDS("./newdistMat.rds")
set.seed(2)
X1 <- runif(159, 0, 1)
X2 <- runif(159, 0, 1)
X3 <- runif(159, 0, 1)
Z1 <- runif(159, 0, 1)
Z2 <- runif(159, 0, 1)
X <- data.frame(X1, X2, X3)

Q1 <- exp(-abs(outer(Z1, Z1, "-")) * 5)
Q2 <- exp(-abs(outer(Z2, Z2, "-")) * 3)

Y <- as.numeric(diag(as.matrix(X) %*% betaMat) +
                  MASS::mvrnorm(1, rep(0, 159),
                                Sigma = 0.25 * (0.81 * diag(1, 159) +
                                                  0.04 * exp(-gcd / 4) +
                                                  0.05 * Q1 + 0.1 * Q2)) +
                  rnorm(159, mean = 0, sd = 0.1))

To run the model, we need to supply it with the observed data, model constants, and initial values.

SLMMdata <- list(y = Y, x1 = X$X1, x2 = X$X2, x3 = X$X3,
                 mu_w = rep(0, 159), Q1 = Q1, Q2 = Q2, Q3 = gcd,
                 mbeta = c(1, 1, 1, 1))
SLMMConsts <- list(S = 159, M = 10)
SLMMInits <- list(tau_y = 1, beta = rep(0.25, 4),
                  kappa = c(2,2,2),
                  tau_bm = 1, mu_bm = rep(0, 4),
                  alpha = 0.5,
                  latent = rep(1, SLMMConsts$S),
                  pi = rep(0.1, SLMMConsts$M),
                  r = rep(0.1, SLMMConsts$M),
                  lambda = 2,
                  Sigma = diag(SLMMConsts$S))

Next, we combine all aforementioned pieces into a nimbleModel object, add monitors for the parameters, build the model, and invoke sampling. For illustration purposes, we set the number of iteration to 20000 with thinning interval 2, and discard the first 6000 as burn-in.

SLMM <- nimbleModel(code = SLMMCode, name = "SLMM", constants = SLMMConsts,
                    data = SLMMdata, inits = SLMMInits)

## defining model...

## building model...

## setting data and initial values...

## running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... Error in chol.default(model$Sigma[1:159, 1:159]) : 
##   the leading minor of order 14 is not positive definite
## 
## checking model sizes and dimensions... This model is not fully initialized. This is not an error. To see which variables are not initialized, use model$initializeInfo(). For more information on model initialization, see help(modelInitialization).
## model building finished.

cSLMM <- compileNimble(SLMM)

## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## compilation finished.

SLMMconf <- configureMCMC(SLMM, print = FALSE)
SLMMconf$addMonitors(c( "beta",
                        "b","tau_y", "bm"
                        , "latent", "alpha"
                        ,"pi", "lambda", "kappa"
                        ,"Dev", "mu_y"))

## thin = 1: tau_y, lambda, tau_bm, mu_bm, alpha, kappa, beta, b, bm, latent, pi, Dev, mu_y

SLMMmcmc <- buildMCMC(SLMMconf)
cSLMMmcmc <- compileNimble(SLMMmcmc, project = SLMM)

## compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## compilation finished.

cSLMM$setInits(SLMMInits)
mcmc.out <- runMCMC(cSLMMmcmc, niter = 25000, setSeed = 5, thin = 2)

## running chain 1...

## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|

Next we perform posterior inference.

pos_mcmc <- as.mcmc(mcmc.out[-c(1:9500),])

## point estimate for the latent variables
latentPE <- as.numeric(unlist(apply(pos_mcmc[,845:1003], 2, FUN = function(x) {
                            return(DescTools::Mode(x)[1])
                        })))
## check number of clusters, and number of counties in each cluster
table(latentPE)

## latentPE
##  1  3  4 
## 70 21 68

## calculate rand index against the true cluster configuration
fossil::rand.index(true_cluster, latentPE)

## [1] 0.7251015

References

Valpine, Perry de, Daniel Turek, Christopher J Paciorek, Clifford Anderson-Bergman, Duncan Temple Lang, and Rastislav Bodik. 2017. “Programming with Models: Writing Statistical Algorithms for General Model Structures with NIMBLE.” Journal of Computational and Graphical Statistics 26 (2): 403–13.

Supplemental Material for ‘Bayesian Clustered Coeffcients Regression with Auxiliary Covariates Assistant Random Effects’

Guanyu Hu, Yishu Xue, and Zhihua Ma\(^*\)

1. The Main nimble Function

2. A Simulated Example

References