We begin by importing the tab-delimited version of the Rat Pup data set into Stata, assuming that the rat_pup.dat file is located in the C:\temp directory. Note that we present the Stata commands including the prompt (.), which is not entered as part of the commands.
. insheet using "C:\temp\rat_pup.dat", tab
We utilize the xtmixed command (first available in Stata Release 9) to fit the models
for this example.
Step 1: Fit a model with a “loaded” mean structure.
Stata by default treats the lowest-valued level of a categorical variable (alphabetically or numerically) as the reference category, so the default reference level for SEX would be Female in this case. To be consistent with the other software procedures, we use the char command to set the reference category for SEX to be Male:
. char sex[omit] Male
Because “Control” is the lowest level of TREATMENT alphabetically, we do not need to recode this variable. The xtmixed command used to fit Model 3.1 is specified as follows:
. * Model 3.1 fit .
. xi: xtmixed weight i.treatment*i.sex litsize, || litter:,covariance(identity) variance
- The xi: (interaction expansion) option is used before invoking the xtmixed command to create dummy variables for the categories of the fixed factors TREATMENT and SEX, and for the corresponding interaction terms. The i. notation (i.treatment*i.sex) automatically specifies that fixed effects associated with the interaction between TREATMENT and SEX should be included in the model, along with the main effects for both of these terms.
- This tells us that Control is the reference level of TREATMENT (Control omitted), i.e., no dummy variable is created for the control treatment, and that Male is the reference level of SEX (Male omitted).
- The xtmixed command syntax has three parts, separated by commas. The first part specifies the fixed effects, the second part specifies the random effects, and the third part specifies the covariance structure for the random effects, in addition to miscellaneous options. We discuss these parts of the syntax in detail later. The first variable listed after the xtmixed command is the continuous dependent variable, WEIGHT. The variables following the dependent variable are the terms that will have associated fixed effects in the model. We include i.treatment*i.sex and litsize.
- The two vertical bars (||) precede the variable that defines clusters of observations (litter:).
- The absence of additional variables after the colon indicates that there will only be a single random effect associated with the intercept for each litter in the model.
- The covariance option after an additional comma specifies the covariance structure for the random effects (or the D matrix). Because Model 3.1 includes only a single random effect associated with the intercept (and therefore a single variance parameter associated with the random effects), it has an identity covariance structure. The covariance option is actually not necessary in this simple case.
- Finally, the variance option requests that the estimated variances of the random effects and the residuals be displayed in the output, rather than their estimated standard deviations, which is the default. The xtmixed procedure uses REML estimation by default.
- The AIC and BIC information criteria can be obtained by using the following command after the xtmixed command has finished running:
. * Information criteria.
. estat ic
- By default, the xtmixed command does not display F-tests for the fixed effects in the model. Instead, omnibus Wald chi-square tests for the fixed effects in the model can be performed using the test command. For example, to test the overall significance of the fixed treatment effects, the following command can be used:
. * Test overall significance of the fixed treatment effects.
. test _Itreatment_2 _Itreatment_3
- The two terms listed after the test command are the dummy variables automatically generated by Stata for the fixed treatment effects (we obtain the names for these dummy variables from the initial output for the model shown above). The fixed effects associated with these variables will be displayed in the output, and the indicator variables for the high and low levels of TREATMENT will automatically be saved in the data set. The test command is testing the null hypothesis that the two fixed effects associated with these terms are equal to zero (i.e., the null hypothesis that the treatment means are all equal).
- Similar omnibus tests may be obtained for the fixed SEX effect, the fixed LITSIZE effect, and the interaction between TREATMENT and SEX:
. * Omnibus tests for SEX, LITSIZE and TREATMENT*SEX interaction.
. test _Isex_1
. test litsize
. test _ItreXsex_2_1 _ItreXsex_3_1
- The dummy variables that Stata generates for the interaction effects (_ItreXsex_2_1 and _ItreXsex_3_1) can be somewhat confusing to identify, but they are located in the Variables window in Stata and are also displayed in the initial model output. Once a model has been fitted using the xtmixed command, EBLUPs of the random effects associated with the levels of the random factor (LITTER) can be saved in a new variable (named EBLUPS) using the following syntax:
. predict eblups, reffects
- The saved EBLUPs can then be used to check for random effects that may be outliers.
Step 2: Select a structure for the random effects (Model 3.1 vs. Model 3.1A).
We test Hypothesis 3.1 to decide whether the random effects associated with the intercept for each litter can be omitted from Model 3.1, using a likelihood ratio test, based on the following output generated by the xtmixed command:
Stata reports “chibar2(01),” indicating that it uses the correct null hypothesis distribution of the test statistic each with equal weight of 0.5 (see Subsection 3.5.1). The likelihood ratio test reported by the xtmixed command is an overall test of the covariance parameters associated with all random effects in the model. In models with a single random effect for each cluster, as in Model 3.1, it is appropriate to use this test to decide if that random effect should be included in the model. The significant result of this test (p < .001) suggests that the random litter effects should be retained in Model 3.1. The current version of the xtmixed command in Stata does not allow one to fit LMMs with residual variances that vary between different levels of a “group” variable in the data set (e.g., Models 3.2A, 3.2B, and 3.3). We therefore do not consider additional models in Stata.