The following is a proc glimmix example syntax. I ran it using a fake dataset, so the results are also fake. The outcome is an interval variable and the model is a linear model (not a non-linear model like the logistic regression model). The random statement makes this model "multilevel." Level-1 units are students and level-2 units are schools (IDs are nces_school_name). The intercept is being estimated as the grand average of school-specific intercepts.
This model is a multilevel model and different from the OLS model because it estimates intercept as random effects (school-specific intercepts are derived as random effects and the grand average of those are reported as the intercept).
proc glimmix data=temp1 ;
class nces_school_name ;
model y=x1 x2 x3/
solution ddfm=kr dist=normal link=identity ;
random intercept /subject=nces_school_name;
run;
The main result table you will need to look at is the following. You can interpret this just like you would interpret the OLS regression result.
Solutions for Fixed Effects | |||||
Effect | Estimate | Standard Error |
DF | t Value | Pr > |t| |
Intercept | 0.5062 | 0.01600 | 1040 | 31.64 | <.0001 |
x1 | 0.005776 | 0.01717 | 3483 | 0.34 | 0.7365 |
x2 | 0.000465 | 0.01684 | 3486 | 0.03 | 0.9780 |
x3 | -0.01117 | 0.01708 | 3487 | -0.65 | 0.5133 |
Another table that you want to look at is this. Residual (0.54) is Level-1 variance (student-level variance). Intercept/nces_school_name (0.65) is Level-2 variance (school-level variance). I made up these numbers.
Covariance Parameter Estimates | |||
Cov Parm | Subject | Estimate | Standard Error |
Intercept | nces_school_name | 0.65 | 0.0181 |
Residual | 0.54 | 0.2002 |
I will focus on the first coefficient table when discussing results; however, I would also report information related to the second table. I would report the following information:
ICC (intraclass correlation):
level2 variance / (level1+level2 variance)
That is:
0.65 / (0.65 + 0.54) =0.546218
This shows the degree to which outcome variance is located between groups (schools, clusters) as opposed to within individuals.
I would also report "variance explained."
I need to run the additional model, which is the analysis-of-variance model where I have no covariate in the model:
proc glimmix data=temp1 ;
class nces_school_name ;
model y=/
solution ddfm=kr dist=normal link=identity ;
random intercept /subject=nces_school_name;
run;
Let's take a look at the covariance table (I made up these values).
Covariance Parameter Estimates | |||
Cov Parm | Subject | Estimate | Standard Error |
Intercept | nces_school_name | 0.92 | 0.181 |
Residual | 0.83 | 0.200 |
Let's combine this table with the other one in this way.
Analysis of variance model | Final model | Variance explained | |
Level 1 variance | 0.83 | 0.54 | 0.34939759 |
Level 2 variance | 0.92 | 0.65 | 0.293478261 |
Variance explained was calculated as:
(0.83-0.54)/0.83
and
(0.92-0.65)/0.92
I would make the final table look look like this. I didn't round numbers, but you should.
Estimates | Standard error | p-value | statistical test | |
Intercept | 0.5062 | 0.016 | <.0001 | *** |
x1 | 0.00578 | 0.01717 | 0.7365 | |
x2 | 0.00047 | 0.01684 | 0.978 | |
x3 | -0.0112 | 0.01708 | 0.5133 | |
level-1 variance | 0.83 | |||
level-2 variance | 0.92 | |||
ICC | 0.55 | |||
Level-1 variance explained | 0.35 | |||
Level-2 variance explained | 0.3 | |||
Notes: *** if p < 0.001, ** if p < 0.01, * if p < 0.05. |
How to interpret coefficients in the table
In the table above, x1's coefficient is 0.00578. This means that one unit increase in X1 will lead to an increase of 0.00578 in Y. The p-value associated with this is 0.74. So the coefficient here is not statistically significant at alpha=0.05.
One unit increase in X1 means ... if X is about height in meters (e.g., 1.5 meter, 1.7 meter), then 1 meter is 1 unit increase. If X is a binary variable (0 or 1), then one unit increase means "0 to 1 increase".
For my work, I almost always have a variable called TREATMENT which is 1 if subjects received treatment/intervention and 0 if the subjects did not. The coefficient for this is called "program impact effect." If, for example, the program impact effect is 0.25, I just say that and I also mention that other covariates are in the model and the program impact effect is adjusted for these factors. If the estimated program effect is 0.25, it means that the difference between the two groups (treatment vs. control) is 0.25 in outcome.
I also want to provide a standardized version of the program effect. I would run the same statistical model with the z-score version of the outcome variable. To do this, I usually use proc standard:
data abc2; set abc1;
Z_Y=Y;
run;
proc standard data=abc2 out=abc3 mean=0 std=1;
var Z_Y;
run;
Another approach would be to code this by hand in a datastep. if the mean of Y is -0.42 and SD is 0.5:
data abc2; set abc1;
Z_Y= (Y - 0.42)/0.5 ;
run;