# Using a variable both as fixed and random effect

One of the questions to answer when using mixed models is whether to use a variable as a fixed effect or as a random effect. Sometimes it makes sense to use a variable both as a fixed and as a random effect. In this post I will try to make clear in which cases this makes sense and what the benefits are. I will also cover cases in which it doesn't make sense. Much depends on the nature of the variable, so the post is split into three sections: categorical, discrete and continuous variables. I will only display the most relevant parts of the code; the full code is available on GitHub.

# Categorical variable

To make this concrete, we start by creating a dummy dataset with two categorical covariates, A and B, where B is nested within A, and three replicates per level of B. The resulting dataset is displayed in fig. 1.

library(tidyverse)
library(lme4)
library(INLA)
n_a <- 6
n_b <- 2
n_sample <- 3
sd_A <- 2
sd_B <- 1
sd_noise <- 1
dataset <- expand.grid(
  B = paste0("b", seq_len(n_a * n_b)),
  sample = seq_len(n_sample)
) %>%
  mutate(
    A = paste0("a", as.integer(B) %% n_a) %>%
      factor(),
    mu = rnorm(n_a, sd = sd_A)[A] +
      rnorm(n_a * n_b, sd = sd_B)[B],
    Y = mu + rnorm(n(), sd = sd_noise)
  )

The first model is one that doesn't make sense: using a categorical variable both as a fixed and as a random effect. In this case both effects compete for the same information. Below is the resulting fit from lme4 and INLA. Note the warning in the lme4 output: the model failed to converge. Nevertheless, both lme4 and INLA yield the same parameter estimates (fig. 2), albeit with much wider confidence intervals for lme4. The estimates for the random effects are effectively zero in both packages (fig. 3). Again the lme4 estimates carry more uncertainty.

model.1 <- lmer(Y ~ 0 + A + (1|A), data = dataset)
summary(model.1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Y ~ 0 + A + (1 | A)
##    Data: dataset
##
## REML criterion at convergence: 114.5
##
## Scaled residuals:
##     Min      1Q  Median      3Q     Max
## -1.7375 -0.7377  0.0970  0.6162  1.7453
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  A        (Intercept) 5.838    2.416
##  Residual             1.859    1.363
## Number of obs: 36, groups:  A, 6
##
## Fixed effects:
##     Estimate Std. Error t value
## Aa0    1.009      2.479   0.407
## Aa1   -3.244      2.479  -1.308
## Aa2    3.244      2.479   1.308
## Aa3   -1.989      2.479  -0.802
## Aa4   -1.118      2.479  -0.451
## Aa5    1.414      2.479   0.570
##
## Correlation of Fixed Effects:
##     Aa0   Aa1   Aa2   Aa3   Aa4
## Aa1 0.000
## Aa2 0.000 0.000
## Aa3 0.000 0.000 0.000
## Aa4 0.000 0.000 0.000 0.000
## Aa5 0.000 0.000 0.000 0.000 0.000
model.2 <- inla(Y ~ 0 + A + f(A, model = "iid"), data = dataset)
summary(model.2)
##
## Call:
## "inla(formula = Y ~ 0 + A + f(A, model = \"iid\"), data = dataset)"
##
## Time used:
##  Pre-processing    Running inla Post-processing           Total
##          0.6836          0.6377          0.0487          1.3700
##
## Fixed effects:
##        mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## Aa0  1.0082 0.5573    -0.0931   1.0082     2.1081  1.0082   0
## Aa1 -3.2427 0.5573    -4.3439  -3.2427    -2.1426 -3.2428   0
## Aa2  3.2429 0.5573     2.1415   3.2429     4.3428  3.2430   0
## Aa3 -1.9885 0.5573    -3.0897  -1.9885    -0.8885 -1.9885   0
## Aa4 -1.1178 0.5573    -2.2191  -1.1178    -0.0178 -1.1178   0
## Aa5  1.4133 0.5573     0.3120   1.4133     2.5132  1.4133   0
##
## Random effects:
## Name   Model
##  A   IID model
##
## Model hyperparameters:
##                                              mean        sd 0.025quant
## Precision for the Gaussian observations 5.726e-01 1.432e-01      0.334
## Precision for A                         1.861e+04 1.836e+04   1262.197
##                                          0.5quant 0.975quant      mode
## Precision for the Gaussian observations     0.559  8.935e-01    0.5333
## Precision for A                         13187.426  6.716e+04 3448.4855
##
## Expected number of effective parameters(std dev): 5.998(5e-04)
## Number of equivalent replicates : 6.002
##
## Marginal log-Likelihood:  -94.54

What if we want to add variable B as a nested random effect? We already know that adding A to both the fixed and the random effects is nonsense. The correct way is to use A as a fixed effect and B as a random effect. Since each level of B occurs within only one level of A, the nesting is implicit and (1|B) is sufficient.

model.1 <- lmer(Y ~ 0 + A + (1|A/B), data = dataset)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: large eigenvalue ratio
##  - Rescale variables?
summary(model.1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Y ~ 0 + A + (1 | A/B)
##    Data: dataset
##
## REML criterion at convergence: 107.4
##
## Scaled residuals:
##      Min       1Q   Median       3Q      Max
## -1.95966 -0.51229 -0.02562  0.54737  1.59750
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  B:A      (Intercept) 1.295    1.138
##  A        (Intercept) 1.293    1.137
##  Residual             1.082    1.040
## Number of obs: 36, groups:  B:A, 12; A, 6
##
## Fixed effects:
##     Estimate Std. Error t value
## Aa0    1.009      1.456   0.693
## Aa1   -3.244      1.456  -2.227
## Aa2    3.244      1.456   2.228
## Aa3   -1.989      1.456  -1.366
## Aa4   -1.118      1.456  -0.768
## Aa5    1.414      1.456   0.971
##
## Correlation of Fixed Effects:
##     Aa0   Aa1   Aa2   Aa3   Aa4
## Aa1 0.000
## Aa2 0.000 0.000
## Aa3 0.000 0.000 0.000
## Aa4 0.000 0.000 0.000 0.000
## Aa5 0.000 0.000 0.000 0.000 0.000
## convergence code: 0
## Model is nearly unidentifiable: large eigenvalue ratio
##  - Rescale variables?
model.1b <- lmer(Y ~ 0 + A + (1|B), data = dataset)
summary(model.1b)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Y ~ 0 + A + (1 | B)
##    Data: dataset
##
## REML criterion at convergence: 107.4
##
## Scaled residuals:
##      Min       1Q   Median       3Q      Max
## -1.95966 -0.51229 -0.02562  0.54737  1.59750
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  B        (Intercept) 1.295    1.138
##  Residual             1.082    1.040
## Number of obs: 36, groups:  B, 12
##
## Fixed effects:
##     Estimate Std. Error t value
## Aa0   1.0085     0.9099   1.108
## Aa1  -3.2437     0.9099  -3.565
## Aa2   3.2439     0.9099   3.565
## Aa3  -1.9891     0.9099  -2.186
## Aa4  -1.1181     0.9099  -1.229
## Aa5   1.4137     0.9099   1.554
##
## Correlation of Fixed Effects:
##     Aa0   Aa1   Aa2   Aa3   Aa4
## Aa1 0.000
## Aa2 0.000 0.000
## Aa3 0.000 0.000 0.000
## Aa4 0.000 0.000 0.000 0.000
## Aa5 0.000 0.000 0.000 0.000 0.000
model.2 <- inla(
  Y ~ 0 + A + f(A, model = "iid") + f(B, model = "iid"),
  data = dataset
)
summary(model.2)
##
## Call:
## c("inla(formula = Y ~ 0 + A + f(A, model = \"iid\") + f(B, model = \"iid\"), ",  "    data = dataset)")
##
## Time used:
##  Pre-processing    Running inla Post-processing           Total
##          0.1822          0.1401          0.0306          0.3529
##
## Fixed effects:
##        mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## Aa0  1.0079 0.8039    -0.6132   1.0079     2.6290  1.0081   0
## Aa1 -3.2416 0.8039    -4.8616  -3.2419    -1.6193 -3.2423   0
## Aa2  3.2418 0.8039     1.6202   3.2421     4.8623  3.2425   0
## Aa3 -1.9878 0.8039    -3.6081  -1.9880    -0.3659 -1.9882   0
## Aa4 -1.1174 0.8039    -2.7380  -1.1176     0.5043 -1.1177   0
## Aa5  1.4128 0.8039    -0.2083   1.4129     3.0338  1.4131   0
##
## Random effects:
## Name   Model
##  A   IID model
## B   IID model
##
## Model hyperparameters:
##                                              mean        sd 0.025quant
## Precision for the Gaussian observations 9.656e-01 2.847e-01     0.5059
## Precision for A                         1.874e+04 1.842e+04  1261.8028
## Precision for B                         2.955e+00 4.340e+00     0.3363
##                                          0.5quant 0.975quant      mode
## Precision for the Gaussian observations 9.341e-01      1.616    0.8727
## Precision for A                         1.330e+04  67449.492 3448.3991
## Precision for B                         1.685e+00     13.351    0.7651
##
## Expected number of effective parameters(std dev): 9.584(1.458)
## Number of equivalent replicates : 3.756
##
## Marginal log-Likelihood:  -99.59
model.2b <- inla(Y ~ 0 + A + f(B, model = "iid"), data = dataset)
summary(model.2b)
##
## Call:
## "inla(formula = Y ~ 0 + A + f(B, model = \"iid\"), data = dataset)"
##
## Time used:
##  Pre-processing    Running inla Post-processing           Total
##          0.1613          0.1620          0.0326          0.3559
##
## Fixed effects:
##        mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## Aa0  1.0082 0.6005    -0.1751   1.0081     2.1906  1.0082   0
## Aa1 -3.2425 0.6005    -4.4256  -3.2425    -2.0600 -3.2426   0
## Aa2  3.2427 0.6005     2.0595   3.2427     4.4251  3.2428   0
## Aa3 -1.9884 0.6005    -3.1715  -1.9884    -0.8059 -1.9884   0
## Aa4 -1.1177 0.6005    -2.3009  -1.1178     0.0647 -1.1178   0
## Aa5  1.4132 0.6005     0.2300   1.4132     2.5957  1.4132   0
##
## Random effects:
## Name   Model
##  B   IID model
##
## Model hyperparameters:
##                                           mean     sd 0.025quant 0.5quant
## Precision for the Gaussian observations 0.9655 0.2847     0.5064   0.9337
## Precision for B                         2.9571 4.3566     0.3359   1.6837
##                                         0.975quant   mode
## Precision for the Gaussian observations      1.617 0.8719
## Precision for B                             13.388 0.7637
##
## Expected number of effective parameters(std dev): 6.053(0.4377)
## Number of equivalent replicates : 5.948
##
## Marginal log-Likelihood:  -99.50

# Discrete variable

## Intro

A discrete variable is a numerical variable in which the interval between any two values is an integer multiple of a fixed step size. Typical examples are related to time, e.g. years in steps of 1, or months expressed in years (step size 1/12), …
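
To illustrate the definition, here is a small base R sketch (a hypothetical helper of my own, not part of the original code) that treats a variable as discrete when every gap between consecutive unique values is an integer multiple of the smallest gap:

```r
# Hypothetical helper: a numeric variable is "discrete" when all gaps
# between consecutive sorted unique values are integer multiples of the
# smallest gap (the step size), up to floating point tolerance.
is_discrete <- function(x, tol = 1e-8) {
  gaps <- diff(sort(unique(x)))
  step <- min(gaps)
  all(abs(gaps / step - round(gaps / step)) < tol)
}
is_discrete(2000:2010)        # years, step size 1: TRUE
is_discrete(2000 + 0:24 / 12) # months expressed in years, step size 1/12: TRUE
```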

We create a new dummy dataset with a discrete variable. The response variable is a third-order polynomial of the discrete variable. The X variable is rescaled to the interval [-1, 1].

n_x <- 25
n_sample <- 10
sd_noise <- 10
dataset <- expand.grid(
  X = seq_len(n_x),
  sample = seq_len(n_sample)
) %>%
  mutate(
    mu = 0.045 * X ^ 3 - X ^ 2 + 10,
    Y = mu + rnorm(n(), sd = sd_noise),
    X = (X - ceiling(n_x / 2)) / floor(n_x / 2)
  )

## Fit with lme4

Suppose we fit a simple linear model to the data. We know this is inaccurate because the real pattern is a third-order polynomial. Let's also add the variable as a random effect. We first use lme4 to illustrate the principle. Fig. 5 shows that the fit of the fixed part is poor, but the random effect of X compensates for it.

model.1 <- lmer(Y ~ X + (1|X), data = dataset)
summary(model.1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Y ~ X + (1 | X)
##    Data: dataset
##
## REML criterion at convergence: 1990.4
##
## Scaled residuals:
##      Min       1Q   Median       3Q      Max
## -2.57303 -0.65313 -0.01203  0.67045  2.25593
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  X        (Intercept) 1453.6   38.13
##  Residual              109.2   10.45
## Number of obs: 250, groups:  X, 25
##
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -20.576      7.654  -2.688
## X             10.354     12.737   0.813
##
## Correlation of Fixed Effects:
##   (Intr)
## X 0.000

The overall model fit improves when we add second- and third-order polynomial terms, and the variance of the random effect decreases. It even drops to zero once the third-order term is in the model. Fig. 6 illustrates how the fit of the fixed effects improves when adding the higher-order terms. The effect of the random effect on the fitted values is marginal.

model.1b <- lmer(Y ~ X + I(X ^ 2) + (1|X), data = dataset)
model.1c <- lmer(Y ~ X + I(X ^ 2) + I(X ^ 3) + (1|X), data = dataset)
anova(model.1, model.1b, model.1c)
## refitting model(s) with ML (instead of REML)
## Data: dataset
## Models:
## model.1: Y ~ X + (1 | X)
## model.1b: Y ~ X + I(X^2) + (1 | X)
## model.1c: Y ~ X + I(X^2) + I(X^3) + (1 | X)
##          Df    AIC    BIC   logLik deviance  Chisq Chi Df Pr(>Chisq)
## model.1   4 2011.1 2025.2 -1001.57   2003.1
## model.1b  5 1964.7 1982.3  -977.34   1954.7 48.462      1  3.367e-12 ***
## model.1c  6 1888.9 1910.0  -938.44   1876.9 77.789      1  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(model.1b)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Y ~ X + I(X^2) + (1 | X)
##    Data: dataset
##
## REML criterion at convergence: 1939.5
##
## Scaled residuals:
##      Min       1Q   Median       3Q      Max
## -2.61429 -0.64414  0.00578  0.68773  2.09118
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  X        (Intercept) 209.4    14.47
##  Residual             109.2    10.45
## Number of obs: 250, groups:  X, 25
##
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -58.639      4.459 -13.150
## X             10.354      4.941   2.096
## I(X^2)       105.405      9.214  11.440
##
## Correlation of Fixed Effects:
##        (Intr) X
## X       0.000
## I(X^2) -0.746  0.000
summary(model.1c)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Y ~ X + I(X^2) + I(X^3) + (1 | X)
##    Data: dataset
##
## REML criterion at convergence: 1866.1
##
## Scaled residuals:
##      Min       1Q   Median       3Q      Max
## -2.63326 -0.69085  0.02283  0.70746  2.41956
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  X        (Intercept)   0.0     0.00
##  Residual             108.4    10.41
## Number of obs: 250, groups:  X, 25
##
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -58.639      0.989  -59.29
## X            -41.931      2.755  -15.22
## I(X^2)       105.405      2.044   51.58
## I(X^3)        80.612      3.897   20.69
##
## Correlation of Fixed Effects:
##        (Intr) X      I(X^2)
## X       0.000
## I(X^2) -0.746  0.000
## I(X^3)  0.000 -0.917  0.000

## Fit with INLA

INLA requires that we alter the data to get the same output. First we copy X into X.copy, because inla allows a variable to be used only once in the formula. For some reason this wasn't needed with the categorical variables. The lme4 syntax X + (1|X) then translates into the INLA syntax X + f(X.copy, model = "iid"). The next thing is that INLA does the model fitting and the prediction in a single step: getting predictions for new data requires adding the new data to the original data while setting the response to NA. If we want predictions for the fixed effects only, we need to add rows where all random-effect covariates are set to NA. Hence X.copy must be NA while X must be non-NA. Note that this would be impossible without creating X.copy.

Let’s fit the same three models with INLA. The predictions are given in fig. 7. The results are very similar to the lme4 results.

dataset2 <- dataset %>%
  mutate(X.copy = X) %>%
  bind_rows(
    dataset %>%
      distinct(X, mu)
  )
model.2 <- inla(
  Y ~ X + f(X.copy, model = "iid"),
  data = dataset2,
  control.compute = list(waic = TRUE)
)
model.2b <- inla(
  Y ~ X + I(X ^ 2) + f(X.copy, model = "iid"),
  data = dataset2,
  control.compute = list(waic = TRUE)
)
model.2c <- inla(
  Y ~ X + I(X ^ 2) + I(X ^ 3) + f(X.copy, model = "iid"),
  data = dataset2,
  control.compute = list(waic = TRUE)
)
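
Since we requested the WAIC via control.compute, the three INLA models can be compared in the same spirit as the anova() comparison for the lme4 fits. A minimal sketch, assuming the model objects fitted above (the fit object stores the total WAIC in $waic$waic; lower is better):

```r
# Compare the WAIC of the three INLA models fitted above;
# the cubic model should come out lowest, matching the anova() result.
c(
  linear    = model.2$waic$waic,
  quadratic = model.2b$waic$waic,
  cubic     = model.2c$waic$waic
)
```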

# Continuous variable

A continuous variable is a numeric variable where there is no fixed step size between two values. In practice the step size will be several orders of magnitude smaller than the measured values. Again, let's clarify this with an example dataset. For the sake of simplicity we reuse the true model from the example with the discrete variable. Compare fig. 8 with fig. 4 and you'll see that fig. 8 has no fixed step size while fig. 4 does.

n_x <- 25
n_sample <- 10
sd_noise <- 10
dataset <- data.frame(
  X = runif(n_x * n_sample, min = 1, max = n_x)
) %>%
  mutate(
    mu = 0.045 * X ^ 3 - X ^ 2 + 10,
    Y = mu + rnorm(n(), sd = sd_noise),
    X = (X - ceiling(n_x / 2)) / floor(n_x / 2),
    X.copy = X
  )

The lmer model fails because the random effect has as many unique values as observations.

tryCatch(
  lmer(Y ~ X + (1|X), data = dataset),
  error = function(e){e}
)
## <simpleError: number of levels of each grouping factor must be < number of observations>

The INLA model yields output, but the variance of the random effect is very high: a good indication that something is wrong.

model.2 <- inla(
  Y ~ X + f(X.copy, model = "iid"),
  data = dataset,
  control.compute = list(waic = TRUE)
)
inla.tmarginal(
  function(x){x^-1},
  model.2$marginals.hyperpar$`Precision for X.copy`
) %>%
  inla.zmarginal()
## Mean            1229.72
## Stdev           109.831
## Quantile  0.025 1027.31
## Quantile  0.25  1152.92
## Quantile  0.5   1224.96
## Quantile  0.75  1301.12
## Quantile  0.975 1458.24
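
If a random effect of a continuous variable is really wanted, a common workaround (my own suggestion, not from the original analysis) is to discretise the variable first, e.g. with cut(), so the grouping factor has far fewer levels than there are observations:

```r
# Hypothetical workaround: bin the continuous X into 25 classes and use
# the bins as the grouping factor, turning the problem back into the
# discrete-variable case above.
dataset$X.class <- cut(dataset$X, breaks = 25)
model.binned <- lmer(Y ~ X + (1 | X.class), data = dataset)
```

Whether this is sensible depends on the application: the bin width becomes an artificial step size that the model cannot see past.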

# Conclusion

- Using a variable both in the fixed and the random part of the model only makes sense in the case of a discrete variable.