SAS PROC GLIMMIX example – My Statistical tools

The following is an example of proc glimmix syntax. I ran it on a fake dataset, so the results are also fake. The outcome is an interval variable and the model is linear (not a non-linear model such as logistic regression). The random statement makes this model "multilevel": level-1 units are students and level-2 units are schools (identified by nces_school_name). The intercept is estimated as the grand average of the school-specific intercepts.

This multilevel model differs from an OLS model because it treats the intercept as a random effect: school-specific intercepts are derived as random effects, and their grand average is reported as the intercept.

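proc glimmix data=temp1;
  class nces_school_name;
  /* fixed effects; dist=normal with link=identity gives a linear model */
  model y = x1 x2 x3 / solution ddfm=kr dist=normal link=identity;
  /* random intercept for each school (level-2 unit) */
  random intercept / subject=nces_school_name;
run;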

The main result table you will need to look at is the following. You can interpret this just like you would interpret the OLS regression result.

Solutions for Fixed Effects
Effect	Estimate	Standard Error	DF	t Value	Pr > |t|
Intercept	0.5062	0.01600	1040	31.64	<.0001
x1	0.005776	0.01717	3483	0.34	0.7365
x2	0.000465	0.01684	3486	0.03	0.9780
x3	-0.01117	0.01708	3487	-0.65	0.5133

Another table to look at is the covariance parameter table. Residual (0.54) is the level-1 (student-level) variance, and Intercept/nces_school_name (0.65) is the level-2 (school-level) variance. I made up these numbers.

Covariance Parameter Estimates
Cov Parm	Subject	Estimate	Standard Error
Intercept	nces_school_name	0.65	0.0181
Residual		0.54	0.2002
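
If you want these two tables as datasets rather than printed output, one option is an ods output statement placed before the procedure. The sketch below reruns the same model; the dataset names fixed_est and cov_est are names I picked for illustration, not anything required by SAS.

```sas
/* Save the two result tables as datasets for later calculations. */
/* fixed_est and cov_est are illustrative names.                  */
ods output ParameterEstimates=fixed_est   /* Solutions for Fixed Effects     */
           CovParms=cov_est;              /* Covariance Parameter Estimates  */
proc glimmix data=temp1;
  class nces_school_name;
  model y = x1 x2 x3 / solution ddfm=kr dist=normal link=identity;
  random intercept / subject=nces_school_name;
run;

/* Inspect the saved covariance parameters. */
proc print data=cov_est;
run;
```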

I will focus on the first coefficient table when discussing results; however, I would also report information related to the second table. I would report the following information:

ICC (intraclass correlation):

level2 variance / (level1+level2 variance)

That is:

0.65 / (0.65 + 0.54) =0.546218

This shows the proportion of outcome variance that lies between groups (schools, clusters) as opposed to between individuals within groups.
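
The same arithmetic can be checked in a short data step. This is only a sketch with the made-up variance estimates typed in by hand; the variable names are mine.

```sas
/* ICC from the two variance components (values typed in by hand). */
data _null_;
  level2_var = 0.65;   /* Intercept / nces_school_name */
  level1_var = 0.54;   /* Residual */
  icc = level2_var / (level1_var + level2_var);
  put "ICC = " icc 6.4;   /* writes the result to the SAS log */
run;
```

The value written to the log should agree with the 0.546218 computed above, rounded to four decimals.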

I would also report "variance explained."

To get this, I need to run one additional model: the analysis-of-variance model (also called the unconditional or null model), which has no covariates:

proc glimmix data=temp1;
  class nces_school_name;
  model y = / solution ddfm=kr dist=normal link=identity;
  random intercept / subject=nces_school_name;
run;

Let's take a look at the covariance table (I made up these values).

Covariance Parameter Estimates
Cov Parm	Subject	Estimate	Standard Error
Intercept	nces_school_name	0.92	0.181
Residual		0.83	0.200

Let's combine this table with the other one in this way.

	Analysis of variance model	Final model	Variance explained
Level 1 variance	0.83	0.54	0.34939759
Level 2 variance	0.92	0.65	0.293478261

Variance explained was calculated as:

(0.83-0.54)/0.83

and

(0.92-0.65)/0.92
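
Both calculations can be done in one data step. This is a sketch using the made-up estimates from the two covariance tables; the dataset and variable names (var_explained, null_var, final_var) are my own choices.

```sas
/* Variance explained at each level:                          */
/* (analysis-of-variance model - final model) / ANOVA model   */
data var_explained;
  length level $8;
  input level $ null_var final_var;
  explained = (null_var - final_var) / null_var;
  datalines;
Level-1 0.83 0.54
Level-2 0.92 0.65
;
run;

proc print data=var_explained noobs;
  format explained 8.4;
run;
```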

I would make the final table look like this. I didn't round numbers, but you should.

	Estimate	Standard error	p-value	Significance
Intercept	0.5062	0.016	<.0001	***
x1	0.00578	0.01717	0.7365
x2	0.00047	0.01684	0.978
x3	-0.0112	0.01708	0.5133

Level-1 variance (analysis-of-variance model)	0.83
Level-2 variance (analysis-of-variance model)	0.92
ICC (final model)	0.55
Level-1 variance explained	0.35
Level-2 variance explained	0.29
Notes: *** if p < 0.001, ** if p < 0.01, * if p < 0.05.

How to interpret coefficients in the table

In the table above, x1's coefficient is 0.00578. This means that a one-unit increase in x1 is associated with an increase of 0.00578 in y, holding x2 and x3 constant. The p-value for this coefficient is 0.74, so it is not statistically significant at alpha=0.05.

What a one-unit increase means depends on how the variable is measured. If x1 is height in meters (e.g., 1.5 meters, 1.7 meters), then a one-unit increase is 1 meter. If x1 is a binary variable (0 or 1), then a one-unit increase is the change from 0 to 1.

For my work, I almost always have a variable called TREATMENT, which is 1 if subjects received the treatment/intervention and 0 if they did not. The coefficient for this variable is called the "program impact effect." If, for example, the program impact effect is 0.25, I report that value and mention that it is adjusted for the other covariates in the model. An estimated program effect of 0.25 means that the adjusted difference in the outcome between the two groups (treatment vs. control) is 0.25.
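
As a sketch, the model with a treatment indicator could look like the following. It assumes temp1 contains a numeric 0/1 variable named treatment, which is not in the example data above. Because treatment is left out of the class statement, its coefficient is the adjusted treatment-minus-control difference.

```sas
proc glimmix data=temp1;
  class nces_school_name;
  /* treatment: 1 = received the intervention, 0 = did not (assumed variable) */
  model y = treatment x1 x2 x3 / solution ddfm=kr dist=normal link=identity;
  random intercept / subject=nces_school_name;
run;
```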

I also want to provide a standardized version of the program effect. I would run the same statistical model with the z-score version of the outcome variable. To do this, I usually use proc standard:

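/* copy the outcome so the original Y is kept */
data abc2; set abc1;
  Z_Y = Y;
run;

/* rescale Z_Y to mean 0 and standard deviation 1 */
proc standard data=abc2 out=abc3 mean=0 std=1;
  var Z_Y;
run;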

Another approach is to code this by hand in a data step. If the mean of Y is -0.42 and the SD is 0.5:

data abc2; set abc1;
  /* subtract the mean (-0.42), then divide by the SD (0.5) */
  Z_Y = (Y - (-0.42)) / 0.5;
run;
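
Whichever approach you use, it is worth confirming that Z_Y has a mean of about 0 and an SD of about 1 before rerunning the model. The sketch below uses abc3 from the proc standard version and assumes that dataset also holds nces_school_name, x1, x2, x3, and a 0/1 treatment variable; those are assumptions for illustration.

```sas
/* Check the standardized outcome: mean should be near 0, SD near 1. */
proc means data=abc3 n mean std;
  var Z_Y;
run;

/* Same model with the z-scored outcome; the treatment coefficient */
/* is then expressed in standard deviation units of Y.             */
proc glimmix data=abc3;
  class nces_school_name;
  model Z_Y = treatment x1 x2 x3 / solution ddfm=kr dist=normal link=identity;
  random intercept / subject=nces_school_name;
run;
```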
