While the HLM software takes level-specific equations as its syntax, SAS PROC MIXED accepts just one equation,
usually called the combined (or mixed-model) equation, as opposed to the level-specific ones. If you started with Bryk and Raudenbush's
HLM book and are used to their HLM software, the transition feels hard. Compare the two:
Level-specific way (B&R's HLM software uses this syntax):
Level1: Y=b0 + b1*X + Error
Level2: b0=g00 + g01*W + R0
Level2: b1=g10+g11*W + R1
One-equation way (the SAS PROC MIXED way):
Y=g00 + g01*W + g10*X +g11*W*X + R1*X + E + R0
How do we translate the level-specific way (HLM way) into the one-equation version (PROC MIXED way)?
Insert the higher-level equations into the level-1 equation. Start from level 1:
Y=b0 + b1*X + Error
Substitute the level-2 equations for the b's
--> Y=[g00 + g01*W + R0] + [g10+g11*W + R1 ]*X + Error
Multiply out the brackets
--> Y=g00 + g01*W + R0 + g10*X +g11*W*X + R1*X + Error
Reorder the terms so the fixed part comes first and the random part last (writing E for Error)
--> Y=g00 + g01*W + g10*X +g11*W*X + R1*X + E + R0
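Notice which parts are fixed effects and which parts are random components.
--> Y = [g00 + g01*W + g10*X + g11*W*X] + [R1*X + E + R0]
The first bracket is the fixed part; the second bracket is the random part.
Now write the fixed-effects part of the PROC MIXED syntax. Focus on the first bracket of the equation above. The intercept g00 is included by default, so only W, X, and W*X go into the model statement.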
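proc mixed data=mydata;        /* mydata: placeholder data set name */
class groupid;                 /* groupid: placeholder level-2 identifier */
model Y= W X W*X / solution;   /* g01, g10, g11; the intercept g00 is included by default */
run;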
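Next write the random component part. The rule of thumb: look at the random component, "R1*X + E + R0", and see which variable is multiplied by an error term (here, X is multiplied by R1). Throw that variable into the random
statement, and use subject= to say which units the random effects vary across.
proc mixed data=mydata;                 /* mydata: placeholder data set name */
class groupid;                          /* groupid: placeholder level-2 identifier */
model Y= W X W*X / solution;
random intercept X / subject=groupid;   /* R0 and R1*X, varying across groupid */
run;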
One more look at R1*X + E + R0:
- E is the residual (the level-1 error in HLM terminology). PROC MIXED assumes
it exists, so we don't have to do anything about it.
- R0 is the level-2 error for the intercept (the random intercept), which is why we said "random intercept ...".
- R1*X is the random coefficient for X, which is why we said "random ... X".
The most important thing is to notice which variable is multiplied by an error term. In this case,
the error term was R1 and it sat right next to X. This is why we put X in the random statement.
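Putting the pieces together, here is a minimal sketch of the full call. The data set name mydata and the grouping variable groupid are placeholders, not names from any real file. The type=un option is my own assumption: it lets R0 and R1 covary, which, as I understand it, is what the HLM software estimates by default, whereas PROC MIXED's default structure (type=vc) leaves that covariance out.

```sas
/* Sketch only: mydata and groupid are placeholder names. */
proc mixed data=mydata covtest noclprint;
  class groupid;                          /* level-2 unit identifier */
  model Y = W X W*X / solution ddfm=bw;   /* fixed part: g00 (intercept), g01, g10, g11 */
  random intercept X / subject=groupid    /* random part: R0 and R1*X */
                       type=un;           /* assumption: let R0 and R1 covary */
run;                                      /* E is the residual, estimated automatically */
```

With type=un, the covariance parameter table should list UN(1,1), UN(2,1), UN(2,2), and Residual: the variance of R0, the covariance of R0 and R1, the variance of R1, and the variance of E.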
Advantage of the PROC MIXED way of writing one equation
When you start writing it the PROC MIXED way, you may begin to feel that HLM is not so complicated. For example,
what you used to call level-1 variables, level-2 variables, and level-3 variables are now just variables, period.
In HLM terminology, you used to use terms like level-1 error, level-2 error, level-3 error. They are now just residuals
(= level-1 error) and random effects. "Centering" might have felt like a magical concept unique to HLM, but now it feels
like something that is not specific to HLM at all and is awfully useful in any statistical model. I even
use centering for simple OLS regression because it gives the intercept a nice meaning.
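For what it is worth, centering takes one step in SAS. This is only a sketch with placeholder names (mydata, groupid, X); the output tables and the new variables X_grand and X_group are names I made up for illustration.

```sas
/* Sketch only: mydata, groupid, and X are placeholder names. */
proc sql;
  /* Grand-mean centering: subtract the overall mean of X. */
  create table centered_grand as
  select *, X - mean(X) as X_grand
  from mydata;

  /* Group-mean centering: subtract each group's own mean of X. */
  create table centered_group as
  select *, X - mean(X) as X_group
  from mydata
  group by groupid;
quit;
```

With X_grand in the model instead of X, the intercept becomes the predicted Y at the average X (with the other predictors at zero) rather than at X = 0. SAS writes a note to the log about remerging summary statistics for these queries; that is expected.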
Finally, you realize HLM is just another regression. Without a random statement, a mixed model is the same as OLS
regression.
This is the same as OLS regression (note that I deleted a random statement):
proc mixed data=ABC covtest noclprint;
title "Non conditional model";
class state;
model actual =/solution ddfm=bw;
run;
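One way to convince yourself, assuming the same ABC data set: with no predictors and no random statement, the intercept that PROC MIXED reports should match the plain sample mean of actual, and the residual variance should match the sample variance (PROC MIXED's default REML estimate and PROC MEANS both divide by n - 1 here).

```sas
/* Sanity check: plain descriptive statistics for the outcome. */
proc means data=ABC n mean var;
  var actual;
run;
```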
This is an HLM (just added a random statement).
proc mixed data=ABC covtest noclprint;
title "Non conditional model";
class state;
model actual =/solution ddfm=bw;
random intercept / sub=state;
run;
This simple difference is hidden when you use the B&R HLM software, because
a standalone program with lots of features makes it look like a whole different ball game.