Statistical Services Offered

Mixed Model Analysis

Fixed effects are those factors whose levels (or values) are selected by a nonrandom process or whose levels consist of the entire population of possible levels. All levels of interest are in the data set. Researchers are interested in comparing only the specific levels included in the study. In a drug study, for example, researchers want to compare the effect of three drugs (A, B, and C). They are interested exclusively in the comparison of these three drugs, which they have selected before conducting the experiment. Drug is a fixed effect. A model containing only fixed effects is called a fixed effects model.

Random effects are those factors whose levels consist of a random sample of levels from a population of possible levels. For example, in the same drug study, researchers may select five clinics from the population of clinics in a region (i.e., five levels of the factor clinic). However, they want to make inferences for the drug effects across the population (i.e., all levels) of clinics, not just the five included in the study. Effects like clinic are random effects. Models in which all effects are random are called random effects models.

Mixed models are models in which some factors are fixed effects and others are random effects. In SAS software, the MIXED procedure was developed for mixed model analyses. As of SAS 9.1, a new procedure, Proc GLIMMIX*, became available. It augments and extends the SAS mixed model tools in a number of ways, including the ability to fit models to non-normal or normal data with correlations or nonconstant variability. The SAS GLM procedure is a fixed-effects procedure and is not recommended for mixed model analyses. (*For a number of years, the %GLIMMIX macro has been available through the SAS technical support site.)

In linear model theory, mixed models belong to the class of general linear mixed models. These models extend the general linear model by allowing a more general specification of the covariance matrix of the response variable. The general linear model can be viewed as the special case of the general linear mixed model in which there are no random effects. Normality assumptions limit the general linear model and the general linear mixed model to continuous responses. Different methodology must be used when the responses are discrete and non-normal (see "Longitudinal Data Analysis").
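In the usual matrix notation, the general linear mixed model can be sketched as follows. The symbols are the conventional ones and are not defined elsewhere on this page: X and Z are the design matrices for the fixed and random effects, and G and R are the covariance matrices of the random effects and of the errors.

```latex
\begin{aligned}
y &= X\beta + Zu + \varepsilon, \\
u &\sim N(0, G), \qquad \varepsilon \sim N(0, R), \qquad \operatorname{Cov}(u, \varepsilon) = 0, \\
\operatorname{Var}(y) &= Z G Z^{\top} + R .
\end{aligned}
```

Dropping the random effects term Zu and taking R = \sigma^2 I gives back the general linear model, y = X\beta + \varepsilon with Var(y) = \sigma^2 I.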

In a randomized complete block (RCB) design, treatments are randomly assigned within blocks. Blocks are groups of experimental units selected in such a way that experimental units within blocks are as homogeneous as possible. Blocks are designed to isolate variability due to extraneous or nuisance causes.

RCB designs often involve fixed and random effects, making mixed model analysis appropriate. In these analyses, hypotheses about the fixed effects remain the same as those in the fixed-effect model: whether there are significant treatment effects. Hypotheses about the random effects are whether the variance components associated with the random effects equal zero; in other words, whether there are significant variations due to these random variables. Inferences about fixed effects are often the primary interest, with the role of random effects being to model sources of variation so that the fixed effects can be more accurately estimated and tested.
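The two kinds of hypotheses can be illustrated with a minimal Python sketch for a balanced RCB design with one observation per treatment-block cell. It uses the classical ANOVA (method of moments) estimators rather than the likelihood-based methods of the SAS Mixed procedure, and the function name `rcb_anova` and the data values are hypothetical, chosen only for illustration.

```python
from statistics import mean

def rcb_anova(data):
    """ANOVA quantities for a balanced RCB design.

    data[i][j] is the single response for treatment i (fixed effect)
    in block j (random effect).
    """
    t = len(data)       # number of treatments
    b = len(data[0])    # number of blocks
    grand = mean(y for row in data for y in row)
    trt_means = [mean(row) for row in data]
    blk_means = [mean(data[i][j] for i in range(t)) for j in range(b)]

    # Sums of squares for treatments, blocks, and residual error
    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_blk = t * sum((m - grand) ** 2 for m in blk_means)
    ss_tot = sum((y - grand) ** 2 for row in data for y in row)
    ss_err = ss_tot - ss_trt - ss_blk

    ms_trt = ss_trt / (t - 1)
    ms_blk = ss_blk / (b - 1)
    ms_err = ss_err / ((t - 1) * (b - 1))

    return {
        # F statistic for the fixed-effect hypothesis (no treatment effect)
        "F_treatment": ms_trt / ms_err,
        # Moment estimate of the block variance component, truncated at zero
        "var_block": max(0.0, (ms_blk - ms_err) / t),
        # Residual variance estimate
        "var_error": ms_err,
    }

# Hypothetical responses: rows are drugs A, B, C; columns are four clinics (blocks)
example = [
    [10, 12, 14, 16],
    [11, 14, 15, 18],
    [14, 15, 19, 20],
]
result = rcb_anova(example)
```

The F statistic addresses the fixed-effect question (are there treatment differences?), while the block variance component addresses the random-effect question (how much variation is due to blocks?).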

There are numerous applications of mixed models in designed experiments. One of the simplest applications is a two-way mixed model, with a continuous response variable, one random effect, and one fixed effect.

Nested designs have a hierarchical data structure: there is more than one size of experimental unit, with smaller experimental units nested within larger ones.

A split-plot design is a factorial design in which the experimental unit with respect to one or more factors is a sub-unit with respect to other factors. Split-plot experiments are often used out of necessity. Sometimes a factor must be applied to relatively large experimental units, whereas other factors are more appropriately applied to sub-units. Split-plot experiments are also used for convenience. It is often easier to apply different factors to different sized units. While the split-plot design has an agricultural heritage, with the whole-plots usually being large areas of land and the sub-plots being smaller areas of land within the large area, the design is useful in many scientific and industrial experiments as well.
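One common way to write a split-plot model, assuming whole plots are completely randomized to the levels of the whole-plot factor, is sketched below. The notation is illustrative: \alpha_i is the whole-plot factor, \beta_j the sub-plot factor, and w_{ik} the random effect of the k-th whole plot receiving level i.

```latex
y_{ijk} = \mu + \alpha_i + w_{ik} + \beta_j + (\alpha\beta)_{ij} + e_{ijk},
\qquad w_{ik} \sim N(0, \sigma_w^2), \quad e_{ijk} \sim N(0, \sigma_e^2).
```

The two random terms reflect the two sizes of experimental unit: w_{ik} is the whole-plot error, against which the whole-plot factor is tested, and e_{ijk} is the sub-plot error, against which the sub-plot factor and the interaction are tested.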

Incomplete Block Designs: In some experiments using randomized block designs, it may not be possible to apply all treatment combinations in each block. Situations like this usually occur because of shortages of experimental apparatus or facilities, or because of the physical size of the block. Incomplete block designs are designs in which only a subset of the treatments is applied in each block.
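As a small illustration, the Python sketch below builds one particular kind of incomplete block design: four treatments in blocks of size three, using every possible subset of three treatments as a block. The helper names are hypothetical, and this is only one simple construction, not a general design generator.

```python
from collections import Counter
from itertools import combinations

def incomplete_blocks(treatments, block_size):
    """Use every subset of `block_size` treatments as one block."""
    return [tuple(c) for c in combinations(treatments, block_size)]

def pair_counts(blocks):
    """Count how many blocks each pair of treatments shares."""
    counts = Counter()
    for block in blocks:
        counts.update(combinations(sorted(block), 2))
    return counts

# Four treatments, but each block can hold only three of them
blocks = incomplete_blocks(["A", "B", "C", "D"], 3)
pairs = pair_counts(blocks)
```

In this construction every pair of treatments appears together in the same number of blocks, so all pairwise treatment comparisons are made with equal precision. Designs with this property are called balanced incomplete block designs.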

Crossover Design: In a crossover design, each subject receives the treatments in sequence over successive periods, and the number of replicates (subjects) must be a multiple of the number of treatments. For example, consider a series of three treatments, which can be administered in six different three-period sequences. If each sequence were assigned to four patients, there would be 24 patients in all, which is a multiple of three. Crossover designs address questions such as whether there is a treatment effect and how much variability is due to the random selection of the patients.
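The arithmetic in this example can be checked with a short Python sketch. The function name `crossover_allocation` is hypothetical, and patients are assigned to sequences in a fixed order here only to keep the example deterministic. In an actual trial the assignment of patients to sequences would be randomized.

```python
from itertools import permutations

def crossover_allocation(treatments, patients_per_sequence):
    """List all treatment orderings and give each the same number of patients."""
    sequences = list(permutations(treatments))
    allocation = {}
    patient_id = 1
    for seq in sequences:
        for _ in range(patients_per_sequence):
            allocation[patient_id] = seq   # patient -> ordered treatments by period
            patient_id += 1
    return sequences, allocation

# Three treatments, four patients per sequence
sequences, allocation = crossover_allocation(["A", "B", "C"], 4)
```

With three treatments there are 3! = 6 orderings, so four patients per sequence yields 24 patients, and each treatment appears equally often in each period.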

Analysis of covariance (ANCOVA): In the mixed models experimental designs discussed above, the response variable is continuous and the predictor (or independent) variables are categorical. ANCOVA is an analysis in which the response variable is continuous and the independent variables include one or more continuous variables in addition to classification variables. The goal of the analysis is often to examine treatment effects accounting for the variability associated with the continuous variables, which are called covariates.

Random Coefficient Model: In analysis of covariance, the regression coefficients for the covariates are assumed to be fixed effects, that is, unknown fixed parameters estimated from data. In the random coefficient model, the regression coefficients for one or more covariates are assumed to be a random sample from some population of possible coefficients. Random coefficient models may be appropriate when the data comes from independent subjects or clusters and when the regression model for each subject or cluster can be assumed to be a random deviation from some population regression model.
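For a single covariate x, the contrast between the two models can be sketched as follows. The notation is illustrative: in the first line i indexes treatments, and in the second line i indexes subjects or clusters.

```latex
\begin{aligned}
\text{ANCOVA:} \quad & y_{ij} = \mu + \tau_i + \beta\, x_{ij} + e_{ij}, \\
\text{Random coefficients:} \quad & y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, x_{ij} + e_{ij},
\qquad (b_{0i}, b_{1i})^{\top} \sim N(0, G).
\end{aligned}
```

In the random coefficient model, \beta_0 and \beta_1 describe the population regression line, and b_{0i} and b_{1i} are the random deviations of the i-th subject's intercept and slope from it.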

Repeated Measures Data Analysis: Repeated measures are multiple measurements taken on the same experimental unit (or subject). Repeated measures data require special treatment: because measurements on the same subject tend to be correlated and their variances may differ across occasions, the usual assumptions that the errors are independent and have homogeneous variances are no longer valid. Repeated measures data analysis takes into account the correlation between observations obtained on the same subject and possible nonconstant variances.
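Two commonly used within-subject covariance structures, compound symmetry and first-order autoregressive, can be written out in a few lines of Python. This is a sketch for illustration only. The function names are hypothetical, and the choice of structure for real data depends on the study.

```python
def compound_symmetry(n_times, variance, rho):
    """Equal variances, with the same correlation rho between any two occasions."""
    return [
        [variance if i == j else variance * rho for j in range(n_times)]
        for i in range(n_times)
    ]

def ar1(n_times, variance, rho):
    """Equal variances, with correlation rho**lag that decays with time separation."""
    return [
        [variance * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]

# Four measurement occasions on one subject, variance 2.0, correlation 0.5
cs_matrix = compound_symmetry(4, 2.0, 0.5)
ar1_matrix = ar1(4, 2.0, 0.5)
```

Under compound symmetry the first and last occasions are as strongly correlated as adjacent ones, whereas under the autoregressive structure the covariance shrinks as measurements get further apart in time.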

Generalized Linear Mixed Models: Although SAS Proc Mixed can handle random effects, the models it fits still carry the assumptions of the general linear mixed model: the data are normally distributed; the means (or expected values) of the responses are linearly related to the predictor variables (i.e., linear in the fixed-effects parameters); and the variances and covariances of the data are expressed in terms of covariance parameters and follow a structure available in the Mixed procedure.

Generalized linear mixed models expand the uses of the mixed model: the distribution of the response, conditional on the random effects, can come from the exponential family of distributions (e.g., binary, binomial, Poisson, negative binomial, normal, beta, gamma, and inverse Gaussian) rather than only the normal distribution assumed in the general linear model. Proc GLIMMIX performs estimation and statistical inference for generalized linear mixed models. If the response distribution is normal, Proc Mixed could also be used. (As noted earlier, Proc GLIMMIX became available as of SAS 9.1.)
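In the same conventional notation used for linear mixed models (X and Z as fixed- and random-effects design matrices, which are assumptions of this sketch rather than symbols defined on this page), a generalized linear mixed model relates the conditional mean of the response to a linear predictor through a link function g:

```latex
g\bigl(\operatorname{E}[\,y \mid u\,]\bigr) = X\beta + Zu, \qquad u \sim N(0, G).
```

Typical choices are the logit link, g(\mu) = \log\{\mu / (1 - \mu)\}, for binary or binomial responses and the log link, g(\mu) = \log \mu, for Poisson counts. With a normal response and the identity link, g(\mu) = \mu, the model reduces to the linear mixed model.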

Nonlinear Mixed Models: While generalized linear mixed models expand the range of linear mixed models, not all data can be adequately characterized by linear models. Nonlinear mixed models can be viewed as a further generalization of generalized linear mixed models: in a generalized linear mixed model the only nonlinearity is the link function applied to a linear predictor, whereas in nonlinear mixed models the expected values of the responses can relate to the predictor variables through any type of nonlinear form. Nonlinear mixed models are fit in SAS software with Proc NLMIXED, which generalizes the random coefficient models fit by the MIXED procedure. This generalization allows the random coefficients to enter the model nonlinearly, whereas in Proc Mixed they must enter linearly.
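To make "entering the model nonlinearly" concrete, the Python sketch below evaluates the mean function of a hypothetical logistic growth model in which a subject-specific random effect u shifts the curve's midpoint. The model form, function name, and parameter values are assumptions chosen for illustration, and the code only evaluates the mean function. It does not fit the model.

```python
import math

def logistic_growth(t, asymptote, midpoint, scale, u=0.0):
    """Expected response at time t for a subject with random effect u.

    The random effect sits inside the exponential, so the expected
    response is a nonlinear function of u.
    """
    return asymptote / (1.0 + math.exp(-(t - (midpoint + u)) / scale))

# Same fixed-effect parameters, three subjects with different random effects
means_at_t10 = [
    logistic_growth(10.0, 200.0, 10.0, 3.0, u=u) for u in (0.0, 2.0, 4.0)
]
```

Equal steps in u do not produce equal steps in the expected response, which is what distinguishes this from a random coefficient model in which u enters linearly.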

 
