For this exercise you will need the lme4 and ggplot2 packages.
library(lme4,quietly=TRUE)
## Warning: package 'lme4' was built under R version 3.6.2
library(ggplot2,quietly=TRUE)
We will use a dataset, sleepstudy, that comes with the lme4 package. The sleepstudy dataset is from an experiment that examined the reaction times of sleep-deprived individuals over the course of ten days (participants were allowed only 3 hours of sleep per night). You can get an overview of the data using the function str(), look at the first few rows of the data frame with head(), and plot it with ggplot().
str(sleepstudy)
## 'data.frame': 180 obs. of 3 variables:
## $ Reaction: num 250 259 251 321 357 ...
## $ Days : num 0 1 2 3 4 5 6 7 8 9 ...
## $ Subject : Factor w/ 18 levels "308","309","310",..: 1 1 1 1 1 1 1 1 1 1 ...
head(sleepstudy,20)
## Reaction Days Subject
## 1 249.5600 0 308
## 2 258.7047 1 308
## 3 250.8006 2 308
## 4 321.4398 3 308
## 5 356.8519 4 308
## 6 414.6901 5 308
## 7 382.2038 6 308
## 8 290.1486 7 308
## 9 430.5853 8 308
## 10 466.3535 9 308
## 11 222.7339 0 309
## 12 205.2658 1 309
## 13 202.9778 2 309
## 14 204.7070 3 309
## 15 207.7161 4 309
## 16 215.9618 5 309
## 17 213.6303 6 309
## 18 217.7272 7 309
## 19 224.2957 8 309
## 20 237.3142 9 309
ggplot(sleepstudy, aes(Days, Reaction, group=Subject, colour=Subject)) +
geom_point() +
facet_wrap(~Subject, ncol=9) +
scale_x_continuous(limits=c(0, 10),breaks=c(0,10)) +
theme_minimal()
From the look of it, people show slower reaction times as they become increasingly sleep deprived. But how should we quantify the effect of sleep deprivation and possible individual differences?
The relation between reaction times and days looks like it could be captured by a regression model. We will try out different models sequentially to see how well different assumptions capture the link between sleep deprivation and reaction times, as well as any individual differences therein.
The first model below assumes that people are all the same and that there is no relation between Days and Reaction Times. Take a moment to look at the equation below and note that this model includes only an intercept, 1. Consequently, in the figure below, all participants are modeled with the same single value, the overall average reaction time.
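In equation form (with i indexing days and j indexing participants), this first model is simply an overall mean plus noise:
$$\text{Reaction}_{ij} = \beta_0 + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$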
model1 <- lm(Reaction ~ 1, sleepstudy)
summary(model1)
##
## Call:
## lm(formula = Reaction ~ 1, data = sleepstudy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -104.176 -43.132 -9.857 38.244 167.846
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 298.508 4.198 71.1 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 56.33 on 179 degrees of freedom
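Because this model contains only an intercept, its single estimate is simply the grand mean of all reaction times, which we can verify directly:
mean(sleepstudy$Reaction)   # ~298.5, matching the intercept above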
datafit=fitted(model1)
ggplot(sleepstudy, aes(Days, Reaction, group=Subject, colour=Subject)) +
geom_point() +
geom_line(aes(y=datafit), linetype=2) +
facet_wrap(~Subject, ncol=9) +
scale_x_continuous(limits=c(0, 10),breaks=c(0,10)) +
theme_minimal()
The second model below assumes that people differ in their initial reaction times but, like the previous model, still assumes that there is no relation between Days and Reaction Times.
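In equation form, each participant j now gets their own deviation $u_{0j}$ from the overall intercept:
$$\text{Reaction}_{ij} = \beta_0 + u_{0j} + \varepsilon_{ij}, \qquad u_{0j} \sim N(0, \tau_0^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)$$
In the summary below, the estimate of $\tau_0^2$ appears as the Subject (Intercept) variance and $\sigma^2$ as the Residual variance.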
model2 <- lmer(Reaction ~ 1 + (1 | Subject), sleepstudy)
summary(model2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ 1 + (1 | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1904.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.4983 -0.5501 -0.1476 0.5123 3.3446
##
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 1278 35.75
## Residual 1959 44.26
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 298.51 9.05 32.98
One thing you can see in the summary above (this kind of model is sometimes called an unconditional means model) is that the between-person variance, 1278, is about 39% of the total variance (i.e., the sum of the between- and within-person variances, 1278 + 1959). This ratio is called the intraclass correlation coefficient (ICC), and here it indicates that a substantial portion of the variance (the remaining ~61%) is found within participants.
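This ratio can be computed directly from the model's variance components; a minimal sketch:
vc <- as.data.frame(VarCorr(model2))                # between- and within-person variance components
icc <- vc$vcov[vc$grp == "Subject"] / sum(vc$vcov)  # between / (between + within)
icc                                                 # approximately 0.39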
[NOTE: The ICC can help you determine whether or not a mixed model is warranted, as well as the nature of your construct. If you find that the ICC is close to 0, the observations within clusters are no more similar than observations from different clusters, so perhaps you could ignore the clustering altogether. In contrast, if you find that the ICC is close to 1, this suggests that a construct varies between people but not within a person, and you can think of it as a trait (e.g., a personality characteristic).]
Note that, in the figure below, each participant now gets his/her own intercept (but still no effect of Days).
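Mirroring the plotting code used for model 1 above, the fitted values of model 2 can be displayed like this (each dashed line is now that participant's own flat intercept):
datafit=fitted(model2)
ggplot(sleepstudy, aes(Days, Reaction, group=Subject, colour=Subject)) +
geom_point() +
geom_line(aes(y=datafit), linetype=2) +
facet_wrap(~Subject, ncol=9) +
scale_x_continuous(limits=c(0, 10),breaks=c(0,10)) +
theme_minimal()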
An obvious extension of the previous models is to consider the effect of Days. Please note that the model below estimates an overall effect of Days that does NOT differ between participants…
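In equation form, a common slope for Days is added while the intercept still varies by participant:
$$\text{Reaction}_{ij} = \beta_0 + u_{0j} + \beta_1\,\text{Days}_{ij} + \varepsilon_{ij}$$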
model3 <- lmer(Reaction ~ 1 + Days + (1 | Subject), sleepstudy)
summary(model3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ 1 + Days + (1 | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1786.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2257 -0.5529 0.0109 0.5188 4.2506
##
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 1378.2 37.12
## Residual 960.5 30.99
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.4051 9.7467 25.79
## Days 10.4673 0.8042 13.02
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.371
datafit=fitted(model3)
ggplot(sleepstudy, aes(Days, Reaction, group=Subject, colour=Subject)) +
geom_point() +
geom_line(aes(y=datafit), linetype=2) +
facet_wrap(~Subject, ncol=9) +
scale_x_continuous(limits=c(0, 10),breaks=c(0,10)) +
theme_minimal()
Naturally, we may expect the effect of Days on reaction times to differ between participants (we get that feeling from the raw data; check out participant 335). Model 4 below therefore includes the effect of Days as a fixed effect (shared across participants) but also asks the model to estimate participant-specific deviations from it (Days is included both as a fixed AND as a random effect).
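In equation form, both the intercept and the slope now receive participant-specific deviations:
$$\text{Reaction}_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,\text{Days}_{ij} + \varepsilon_{ij}$$
where $(u_{0j}, u_{1j})$ follow a bivariate normal distribution whose variances and correlation appear in the Random effects table of the summary below.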
model4 <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), sleepstudy)
summary(model4)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ 1 + Days + (1 + Days | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1743.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9536 -0.4634 0.0231 0.4634 5.1793
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 612.10 24.741
## Days 35.07 5.922 0.07
## Residual 654.94 25.592
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.405 6.825 36.838
## Days 10.467 1.546 6.771
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.138
It is probably helpful to see what the fixed- and random-effects coefficients look like for this model…
fixef(model4)
## (Intercept) Days
## 251.40510 10.46729
ranef(model4)
## $Subject
## (Intercept) Days
## 308 2.2585509 9.1989758
## 309 -40.3987381 -8.6196806
## 310 -38.9604090 -5.4488565
## 330 23.6906196 -4.8143503
## 331 22.2603126 -3.0699116
## 332 9.0395679 -0.2721770
## 333 16.8405086 -0.2236361
## 334 -7.2326151 1.0745816
## 335 -0.3336684 -10.7521652
## 337 34.8904868 8.6282652
## 349 -25.2102286 1.1734322
## 350 -13.0700342 6.6142178
## 351 4.5778642 -3.0152621
## 352 20.8636782 3.5360011
## 369 3.2754656 0.8722149
## 370 -25.6129993 4.8224850
## 371 0.8070461 -0.9881562
## 372 12.3145921 1.2840221
##
## with conditional variances for "Subject"
Does model 4 (random intercepts and slopes) simply provide estimates that are the same as computing an individual correlation/regression for each person? Not quite… as you can see below, the regression lines fitted to each individual separately (solid lines) differ from those of Model 4 (dashed lines). For example, for subject 335, the mixed-effects model seems less willing to conclude that reaction times drop across days: the upward patterns of the other subjects influence the model’s fit to 335’s data. This statistical phenomenon is called “shrinkage”.
datafit=fitted(model4)
ggplot(sleepstudy, aes(Days, Reaction, group=Subject, colour=Subject)) +
geom_point() +
geom_line(aes(y=datafit), linetype=2) +
facet_wrap(~Subject, ncol=9) +
scale_x_continuous(limits=c(0, 10),breaks=c(0,10)) +
geom_smooth(method=lm,se=FALSE)+
theme_minimal()
To further illustrate the principle of shrinkage (or borrowing strength), one can compare the estimates from a model that fits each participant separately (the solid lines in the figure above; model0 below, fitted with lmList()) with those from the mixed-effects model (the dashed lines in the figure above; model4).
model0=lmList(Reaction ~ Days | Subject, sleepstudy)  # a separate regression for each subject
coefs_m1=coef(model0)                                  # per-subject intercepts and slopes
coefs_m2=ranef(model4)[["Subject"]]                    # per-subject deviations from the fixed effects
coefs_m1
## (Intercept) Days
## 308 244.1927 21.764702
## 309 205.0549 2.261785
## 310 203.4842 6.114899
## 330 289.6851 3.008073
## 331 285.7390 5.266019
## 332 264.2516 9.566768
## 333 275.0191 9.142045
## 334 240.1629 12.253141
## 335 263.0347 -2.881034
## 337 290.1041 19.025974
## 349 215.1118 13.493933
## 350 225.8346 19.504017
## 351 261.1470 6.433498
## 352 276.3721 13.566549
## 369 254.9681 11.348109
## 370 210.4491 18.056151
## 371 253.6360 9.188445
## 372 267.0448 11.298073
coefs_m2
## (Intercept) Days
## 308 2.2585509 9.1989758
## 309 -40.3987381 -8.6196806
## 310 -38.9604090 -5.4488565
## 330 23.6906196 -4.8143503
## 331 22.2603126 -3.0699116
## 332 9.0395679 -0.2721770
## 333 16.8405086 -0.2236361
## 334 -7.2326151 1.0745816
## 335 -0.3336684 -10.7521652
## 337 34.8904868 8.6282652
## 349 -25.2102286 1.1734322
## 350 -13.0700342 6.6142178
## 351 4.5778642 -3.0152621
## 352 20.8636782 3.5360011
## 369 3.2754656 0.8722149
## 370 -25.6129993 4.8224850
## 371 0.8070461 -0.9881562
## 372 12.3145921 1.2840221
As you can see above, the coefficients from the individual regressions (coefs_m1) do not have the same meaning as those from the mixed model (coefs_m2): keep in mind that the latter represent deviations from the fixed effects (the overall intercept and slope), so we’ll have to add the fixed effects to these coefficients before comparing the estimates from the two models…
coefs_m2[,1]=coefs_m2[,1]+fixef(model4)[1]
coefs_m2[,2]=coefs_m2[,2]+fixef(model4)[2]
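Incidentally, coef(model4) returns these subject-specific coefficients directly (the fixed effects plus each subject's random effects), which offers a quick check on the values computed by hand above:
head(coef(model4)$Subject)   # should match the first rows of coefs_m2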
We can now plot these values to see how the estimates differ for each person. In the figure below, each pair of points connected by an arrow represents estimates for the same individual, and the arrowhead points towards the estimate from the mixed-effects model. What can you conclude?
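A minimal sketch of how such an arrow plot could be drawn, assuming the column order (intercept, slope) in both coefficient tables and that rows are in the same subject order:
comp <- data.frame(Subject    = rownames(coefs_m1),
                   int_lm     = coefs_m1[, 1],   # per-subject regressions
                   slope_lm   = coefs_m1[, 2],
                   int_lmer   = coefs_m2[, 1],   # mixed-model estimates
                   slope_lmer = coefs_m2[, 2])
ggplot(comp) +
  geom_point(aes(int_lm, slope_lm, colour = Subject)) +
  geom_segment(aes(x = int_lm, y = slope_lm, xend = int_lmer, yend = slope_lmer,
                   colour = Subject),
               arrow = grid::arrow(length = grid::unit(2, "mm"))) +
  labs(x = "Intercept", y = "Slope (Days)") +
  theme_minimal()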
Overall, Model 4 seems to capture the data well (and better than the simpler ones). However, the models above differ not only in how well they fit the data but also in their complexity (i.e., number of parameters). At every step of model building we may want to test whether the increased complexity (i.e., the addition of parameters) is justified. To do this, we can test whether two models differ in their fit to the data, given their complexity. The anova() command reports a number of fit indices that do just that (e.g., AIC, BIC) as well as a statistical test of the difference between models. For example, as would be expected from looking at the model fits in the figures, model 4 provides a significantly better fit than the simpler models 2 and 3 (it has lower AIC and BIC values). Note that we can’t compare different families of models (lm vs. lmer) this way, which excludes model1.
anova(model2,model3,model4)
## refitting model(s) with ML (instead of REML)
## Data: sleepstudy
## Models:
## model2: Reaction ~ 1 + (1 | Subject)
## model3: Reaction ~ 1 + Days + (1 | Subject)
## model4: Reaction ~ 1 + Days + (1 + Days | Subject)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## model2 3 1916.5 1926.1 -955.27 1910.5
## model3 4 1802.1 1814.8 -897.04 1794.1 116.462 1 < 2.2e-16 ***
## model4 6 1763.9 1783.1 -875.97 1751.9 42.139 2 7.072e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1