Multilevel models can be specified in either wide or long format. Wide format is possible when the data is balanced (i.e., each group has an equal number of observations). When the data is not balanced, long format is required.
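Before choosing a format, it can help to check whether the data is actually balanced. The following is a minimal sketch in base R, assuming a long-format data frame named Data with a group column id.i (the names match the R data lines in this post; the values are made up for illustration).

```r
# Hypothetical long-format data: 3 groups, where group 3 has one fewer observation
Data <- data.frame(
  id.i = c(1, 1, 1, 2, 2, 2, 3, 3),
  x    = c(0.1, 0.5, 0.9, 0.2, 0.4, 0.8, 0.3, 0.7),
  y    = c(1.0, 1.4, 2.1, 0.8, 1.1, 1.9, 1.2, 1.6)
)

# Number of observations in each group
n.per.group <- table(Data$id.i)

# The data is balanced only if every group has the same count
is.balanced <- length(unique(as.vector(n.per.group))) == 1

print(n.per.group)
print(is.balanced)
```

With this made-up data, is.balanced is FALSE, so the wide-format specification would not apply and long format would be needed.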
Here’s a random intercept JAGS model specified using wide format data:
for (i in 1:N) {                                  # groups (rows)
    for (j in 1:J) {                              # observations within group (columns)
        mu[i,j] <- alpha[i] + beta * (X[i,j] - x.bar);
        Y[i,j] ~ dnorm(mu[i,j], tau.c)
    }
    alpha[i] ~ dnorm(alpha.mu, alpha.tau);        # random intercept for group i
}
With data formatted in R as follows:
jagsdata <- list(X=as.matrix(Data.wide.x), Y=as.matrix(Data.wide.y), N=N, J=J)
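The post does not show how Data.wide.x and Data.wide.y were built. One possible way, sketched here under the assumption that the data starts out in a balanced long-format data frame called Data (hypothetical values) with columns id.i, x and y, is to fill a matrix row by row so that each row holds one group.

```r
# Hypothetical balanced long data: 2 groups with 3 observations each
Data <- data.frame(
  id.i = c(1, 1, 1, 2, 2, 2),
  x    = c(0.1, 0.5, 0.9, 0.2, 0.4, 0.8),
  y    = c(1.0, 1.4, 2.1, 0.8, 1.1, 1.9)
)

# Make sure the rows of each group are contiguous
Data <- Data[order(Data$id.i), ]

N <- length(unique(Data$id.i))  # number of groups
J <- nrow(Data) / N             # observations per group (requires balanced data)

# byrow = TRUE puts the J observations of group i into row i
Data.wide.x <- matrix(Data$x, nrow = N, ncol = J, byrow = TRUE)
Data.wide.y <- matrix(Data$y, nrow = N, ncol = J, byrow = TRUE)

jagsdata <- list(X = Data.wide.x, Y = Data.wide.y, N = N, J = J)
```

This only works when every group has exactly J rows; with unbalanced data the vector would not divide evenly into an N by J matrix.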
And here’s the JAGS model in long format:
for (i in 1:N) {                                  # observations (rows of the long data)
    mu[i] <- alpha[id.i[i]] + beta * (X[i] - x.bar);
    Y[i] ~ dnorm(mu[i], tau.c)
}
for (i in 1:I) {                                  # groups
    alpha[i] ~ dnorm(alpha.mu, alpha.tau);        # random intercept for group i
}
with the following R data line:
jagsdata <- list(X=Data$x, Y=Data$y, id.i=Data$id.i, N=nrow(Data),
                 I=length(unique(Data$id.i)))
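Because the model indexes the intercepts as alpha[id.i[i]], the values of id.i need to be whole numbers running from 1 to I. If the group ids are arbitrary labels, they can be recoded first. This is a small sketch with made-up labels, not something shown in the original data preparation.

```r
# Hypothetical long data whose group ids are arbitrary labels
Data <- data.frame(
  id.i = c("s07", "s07", "s12", "s12", "s12", "s30"),
  x    = c(0.1, 0.5, 0.2, 0.4, 0.8, 0.3),
  y    = c(1.0, 1.4, 0.8, 1.1, 1.9, 1.2)
)

# Recode the labels to consecutive integers 1..I (in sorted label order)
Data$id.i <- as.integer(factor(Data$id.i))

jagsdata <- list(X = Data$x, Y = Data$y, id.i = Data$id.i, N = nrow(Data),
                 I = length(unique(Data$id.i)))
```

After recoding, the largest value of id.i equals I, so every alpha[id.i[i]] refers to an intercept defined in the second loop.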
Note that the specification of priors and so forth was essentially unchanged.
So, what’s changed?
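Wide format uses matrix notation (e.g., mu[i,j]); long format uses vector notation (e.g., mu[i]).
Wide format uses two nested for loops. The outer loop iterates over rows (i.e., group ids); the inner loop iterates over columns (i.e., observations within each group). The random intercept coefficient is placed outside the inner loop because it does not vary within a group.
Long format includes an id variable (id.i), which has the same length as Y but records the group id of each observation. This indicates which intercept coefficient to use for that particular observation. The second loop has length equal to the number of groups (I).
The meaning of N also differs: in wide format it is the number of groups, whereas in long format it is the total number of observations (nrow(Data)).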