Doing HLM by SAS® PROC MIXED

Version 2.2 updated 2/4/2005

 

by Kazuaki (Kaz) Uekawa, Ph.D.

Copyright © 2004 By Kazuaki Uekawa All rights reserved.

email: kuekawa@alumni.uchicago.edu

website: www.estat.us

 


Profile:

When I wrote this manual, I was a research fellow of the Japan Society for the Promotion of Science.  I am now a research analyst at an applied research organization in Washington, DC.  The manual remains convenient even for me, since I still refer to it in my own work.

 

Table of Contents

I.     Advantages of SAS and SAS's PROC MIXED

II.    What is HLM?

III.   Description of data (for practice)

IV.    Specifying Models

Description of each statement in PROC MIXED

How to Read Results off the SAS Default Output

Replicating 4 models described by Bryk and Raudenbush (1992), Chapter 2

Cross-level Interaction, Modeling of Slopes

Group and grand mean centering

The use of Weight

Residual analysis using ODS

V.     PROC MIXED used along with other SAS functionalities

·      With ODS (Output Delivery System)

·      With ODS, and GRAPH

Other Useful SAS functionalities for HLM analyses

·      Creating group-level mean variables

Normalizing sample weight right before PROC MIXED

·      The use of Macro (also described in another one of my SAS tutorials)

VI.    APPENDIX ODS: How to find available tables from SAS's procedures

VII.   Appendix: Other resources for running PROC MIXED

VIII.  Frequently Asked Questions for Novice Users

 


             I.      Advantages of SAS and SAS's PROC MIXED

 

     PROC MIXED is part of SAS, not a stand-alone program.  You need only one program file to prep data, run the statistics, and report results as tables or graphs.  Thus, I always know where my files are.  A stand-alone program creates too many files (control files and output files), in addition to the SAS, SPSS, or STATA files used to prep data for it.

Being part of SAS, PROC MIXED comes with ODS, which saves results as SAS data sets.  Just for this I recommend SAS PROC MIXED.  Instead of relying on default outputs, I can create graphics and tables within the same syntax program that I use for PROC MIXED.  In other words, I can control my outputs programmatically.

              And PROC MIXED is easy to program.  If you have a column for an outcome variable, just running the following will give you a result for a non-conditional (intercept-only) model:

 

proc mixed;
model bsmpv01= ;
random intercept /sub=IDSCHOOL;  /* IDSCHOOL should be on a CLASS statement, or the data sorted by it */
run;

 

              Finally, I like SAS's support system through email (support@sas.com).  I email SAS perhaps three times during a stretch of intensive programming, and I usually get answers the same day.

 

 

          II.      What is HLM?

HLM is a model in which we can set coefficients to be random instead of just fixed.  A linear regression model such as OLS is a model whose coefficients are all fixed.  HLM allows the coefficients to vary across group units.  By setting the intercept random, it corrects for the correlated errors that exist in hierarchically structured data (e.g., the data sets that we often use for education research).  By setting other coefficients to vary randomly across group units, we may be more faithful to what is going on in the data.

Let's examine the simplest model, the intercept-only model, in HLM and in OLS.  Imagine that our outcome measure is a math test score.  The goal of an intercept-only model is to derive a mean (b0).  Units of analysis are students who are nested within schools.  We use PROC MIXED to do both OLS and HLM.  Yes, it is possible.  The data set (sashelp.Prdsal3) ships with the SAS installation.


 

 

OLS:

Y_jk = b0 + error_jk

HLM:

Y_jk = b0 + error_jk + error_k

or

Level 1: Y_jk = b0 + error_jk

Level 2: b0 = g0 + error_k

 

/*OLS version: no RANDOM statement*/
proc mixed data=sashelp.Prdsal3 covtest noclprint;

title "Non conditional model";

class state;

model actual =/solution ddfm=kr;

run;

 

/*HLM version: RANDOM statement added*/
proc mixed data=sashelp.Prdsal3 covtest noclprint;

title "Non conditional model";

class state;

model actual =/solution ddfm=kr;

random intercept/sub=state;

run;

 

 

In HLM we estimate errors at different levels, such as individuals and schools.  If I am a seller of furniture in this sample data, my sales average is constructed from b0, which is a grand mean, my own error term, and my state's error term.  In OLS, we get only an individual-level error.  In HLM, because we remove the influence of the state effect from my individual effect, we have more confidence in saying that my error is independent, which in turn gives us more confidence in the statistical evaluation of b0.

 

 

Here I added a predictor as a fixed effect.

Y_jk = b0 + b1*X + error_jk + error_k

or

Level 1: Y_jk = b0 + b1*X + error_jk

Level 2: b0 = g00 + error00_k

Level 2: b1 = g10

 

proc mixed data=sashelp.Prdsal3 covtest noclprint;

title "One predictor as a fixed effect";

class state;

model actual =predict/solution ddfm=bw;

random intercept/sub=state;

run;

 

Here I set the coefficient on the predictor to vary randomly across group units.

Y_jk = b0 + b1*X + error_jk + error_k

or

Level 1: Y_jk = b0 + b1*X + error_jk

Level 2: b0 = g00 + error00_k

Level 2: b1 = g10 + error10_k

 

proc mixed data=sashelp.Prdsal3 covtest noclprint;

title "One predictor as a random effect";

class state;

model actual =predict/solution ddfm=bw;

random intercept predict/sub=state;

run;

 

 

Advanced topic (jumping ahead too much):

Here I use a classification variable (which works as a series of dummy variables) and set its coefficients random.  In SAS we don't have to code character variables into dummy variables.  We request that the variable PRODUCT be treated as a classification variable by listing it on the CLASS statement.  We also write GROUP=PRODUCT on the RANDOM statement to request that the random components be estimated separately for each group.

 

proc mixed data=sashelp.Prdsal3 covtest noclprint;

title "Character variable as a random effect";

class state product;

model actual =predict product/solution ddfm=bw;

random intercept/sub=state;

random product/sub=state group=product;

run;

 


 

       III.      Description of data (for practice)

Data name: esm.sas7bdat

http://www.estat.us/sas/esm.zip

 

ESM study of Student Engagement conducted by Kathryn Borman and Associates

We collected data from high school students in math and science classrooms, using ESM (the Experience Sampling Method).  Our subjects come from four cities.  In each city we went to two high schools.  In each high school we visited one math teacher and one science teacher.  We observed two classes that each teacher taught for one entire week.  We gave beepers to five randomly selected girls and five randomly selected boys.  We beeped at 10-minute intervals of a class, but we arranged it so that each student received a beep (vibration) only once every 20 minutes.  To do this, we divided the ten subjects into two groups.  We beeped the first group at the 10-minute point and the 30-minute point.  The second group was beeped at the 20-minute point and the 40-minute point.  For a 50-minute class, which is a typical length in most cities, we therefore collected two observations from each student.  We visited the class every day for five days, so we have eight observations per student.  There are also 90-minute classes that meet only on alternate days.  Students were beeped four times in such long classes, at 20-minute intervals.  Between beeps, researchers recorded what was going on in the class.

 

FROM:

"Student Engagement in Mathematics and Science" (Chapter 6), in Reclaiming Urban Education: Confronting the Learning Crisis.  Kathryn Borman and Associates.  SUNY Press.  2005.

 

Data Structure

About 2000 beep points (repeated measures of Student Engagement)

Nested within about 300 students

Nested within 33 classes

 

  • IDs
    • IDbeep (level-1)
    • IDstudent (level-2)
    • IDclass (level-3)

 

Data matrix looks like:

Date      IDclass   IDstudent   IDbeep   TEST   ...

Monday    A         A1          1        0

Monday    A         A1          3        1

Monday    A         A2          2        1

Monday    A         A2          4        1

...continues

 

 

Variables selected for exercise

Dependent variable: ENGAGEMENT scale.  This composite is made up of the following 8 survey items.  We used a Rasch model to create the composite.  The eight items are:

When you were signaled the first time today,  (response options: SD  D  A  SA)

•I was paying attention

•I did not feel like listening

•My motivation level was high

•I was bored

•I was enjoying class

•I was focused more on class than anything else

•I wished the class would end soon

•I was completely into class

 

    • Precision_weight: a weight derived as 1/(error*error), where error is the standard error of the engagement scale produced by WINSTEPS.  This weight is adjusted to add up to the number of observations; otherwise, the variance components change in proportion to the sum of the weights, while other results remain the same.

 

 

    • IDBeep: a categorical variable for the beeps (1st beep, 2nd beep, up to the 4th in a typical 50-minute class; up to the 8th in 90-minute classes, which are typically block scheduled).

 

    • DATE: categorical variable: Monday, Tuesday, Wednesday, Thursday, Friday
    • TEST: students report whether what was being taught in class was related to a test.  0 or 1.  Time-varying.
  • Fixed characteristics of individuals
    • Hisp: categorical variable for being Hispanic
    • SUBJECT: mathematics class versus science class

 


 

        IV.      Specifying Models

Intercept-only model (One-way ANOVA with Random Effects)

Any HLM analysis starts from the intercept-only model:

 

Y=intercept + error1 + error2 + error3

where error1 is the within-individual error, error2 is the between-individual error, and error3 is the between-class error.

The intercept is a grand mean that we are of course interested in.  We also want to know the size of the variance at level 1 (within-individual), level 2 (between-individual), and level 3 (between-class), to get a sense of how these errors are distributed.

              We need to be a bit careful about getting too used to calling these simply level-1 variance, level-2 variance, and level-3 variance.  There is a qualitative difference between the level-1 variance, which is a residual variance, and the level-2 and level-3 variances, which are parameter variances.  The level-1 (residual) variance is specific to individual cases.  The level-2 variance is the variance of the level-2 intercepts (i.e., parameters), and the level-3 variance is the variance of the level-3 intercepts.

The point of doing HLM is to let these parameters vary by group units.  Syntax-wise, then, the difference between PROC MIXED and PROC REG (for OLS regression) is PROC MIXED's use of RANDOM statements.  With them, users specify (a) which parameter (e.g., intercept, beta) should vary and (b) by what group units (e.g., individuals, schools) it should vary.  RANDOM statements, thus, are the most important lines in PROC MIXED.

 

Compare 3-level HLM and 2-level HLM

Libname here "C:\";

/*This is three level model*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class IDclass IDstudent;

model engagement= /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

random intercept /sub=IDclass ;        /*level3*/

run;

 

/*This is two level model*/

/*Note that I simply put * in front of the statement to comment out one of the random lines*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class IDclass IDstudent;

model engagement=  /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

*random intercept /sub=IDclass ;       /*level3*/

run;

 

 

 

Description of each statement in PROC MIXED

 

LET'S EXAMINE THIS line by line.

Libname here "C:\";

 

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class IDclass IDstudent;

model engagement= /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

random intercept /sub=IDclass ;        /*level3*/

run;

 

PROC MIXED statement

proc mixed data=here.esm covtest noclprint;

 

"covtest" requests tests of the covariance components (whether the variances are significantly larger than zero).  The reason you have to request such a simple thing is that COVTEST is not based on the chi-square test one would use for a test of a variance.  It instead uses a Z-type test or something that is not really appropriate.  Shockingly, SAS has not corrected this problem for a while.  Anyway, because SAS feels bad about it, it does not want to make it a default option, which is why you have to request it.  Not many people know this, and I myself could not believe it.  So I guess that means we cannot fully trust the result of COVTEST and must use it with caution.

              When there are lots of group units, use NOCLPRINT to suppress the printing of group names.

 

CLASS statement

class IDclass IDstudent;

or also

class IDclass IDstudent Hisp;

 

We throw in the variables that we want SAS to treat as categorical variables.  Variables that contain characters (e.g., city names) must be on this line (the program won't run otherwise).  Group IDs, such as IDclass in my example data, must also be on this line; otherwise, it won't run.  Variables that are numeric but dummy-coded (e.g., black=1 if black, else 0) don't have to be on this line, but the outputs will be easier to read if they are.  One thing that is a pain in the neck with the CLASS statement is that it chooses the reference category by alphabetical order.  Whatever group in a classification variable comes last when alphabetically ordered will be used as the reference group.  We can control this by data manipulation.  For example, if gender=BOY or GIRL, I tend to create a new variable to make it explicit that the girls are the reference group:

If gender="Boy" then gender2="(1) Boy";

If gender="Girl" then gender2="(2) Girl";

I notice people often use dummy-coded variables (1 or 0), but in SAS there is no reason to do that.  I recommend that you use character variables as they are and treat them as class variables by specifying them on a CLASS statement.  But if you want to center dummy variables, then you will have to use 0/1 dummies.  We do that often in an HLM setup.

For people who are not using the CLASS statement and are instead using dummy-coded numeric variables, I recommend getting used to it.  Once you are used to SAS's habit of singling out a reference category based on the alphabetical order of the groups in a classification variable, you will find it a lot easier than dummy coding a character variable each time.  A complete recode sketch follows.
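Here is a minimal, self-contained sketch of that recode.  The data set names (JOHN, JOHN2) are hypothetical, reusing the placeholder name from the centering examples later in this manual:

data JOHN2;
set JOHN;  /* hypothetical input data set with a character variable GENDER */
length gender2 $ 12;
if gender="Boy"  then gender2="(1) Boy";
if gender="Girl" then gender2="(2) Girl";  /* "(2) Girl" sorts last alphabetically, so girls become the reference group */
run;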

 

MODEL statement

model engagement= /solution ddfm=kr;

 

ddfm=kr specifies the way the degrees of freedom are calculated.  This option is relatively new at the time of this writing (2005).  It seems closest to the degrees-of-freedom approach used by Bryk, Raudenbush, and Congdon's HLM program.  Under this option, the degrees of freedom are close to the number of cases minus the number of parameters estimated when evaluating the coefficients that are fixed.  When evaluating the coefficients that are treated as random effects, the KR option uses roughly the number of group units minus the number of parameters estimated.  But the numbers are slightly off, so I think some other things are also considered, and I will have to read some literature on KR.  I took a SAS class at the SAS Institute.  The instructor said KR is the way to go, which contradicts the advice of Judith Singer in her famous PROC MIXED paper to use ddfm=BW, but again, KR was added to PROC MIXED only recently.  And it seems KR makes more sense than BW.  BW uses the same DF for all coefficients.  Also, BW uses different DFs for character variables and numeric variables, which I never understood, and no one has given me an explanation.  Please let me know if you know anything about this (→ kaz_uekawa@yahoo.com).

 

SOLUTION (or just S) means "print the results" for the parameters estimated.

Random statement

 

random intercept /sub=IDstudent(IDclass);           /*level2*/

random intercept /sub=IDclass ;                          /*level3*/

 

This specifies the level at which you want the intercept (or other coefficients) to vary.  "random intercept /sub=IDstudent(IDclass)" means "make the intercept vary across individuals (who are nested within classes)."  SUB (or SUBJECT) means subjects.  The use of ( ) encodes the fact that IDstudent is nested within IDclass, and this is necessary to obtain the correct degrees of freedom when IDstudent is not unique.  If every IDstudent is unique, I believe there is no need to use ( ) and the result would be the same.  My colleagues complain that IDstudent(IDclass) feels counterintuitive and that IDclass(IDstudent) has more intuitive appeal.  I agree, but it has to be IDstudent(IDclass): the hosting unit must be noted inside the parentheses.

Let's make it into a language lesson.  "random intercept /sub=IDstudent(IDclass);" can be translated into English as, "Please estimate the intercept separately for each IDstudent (which, by the way, is nested within IDclass)."  "random intercept hisp /sub=IDclass;" can be translated as, "Please estimate both the intercept and the Hispanic effect separately for each IDclass."

There is one useful feature of the RANDOM statement when dealing with class variables.  Class variables are variables made of characters; a race variable is a good example.  In B, R, and K's HLM software, you need to dummy code character variables like this, which I found tedious:

 

black=0;

If race=1 then black=1;

 

white=0;

If race=2 then white=1;

 

Hispanic=0;

If race=3 then Hispanic=1;

 

In SAS, we don't need to create dummy variables.  For example, we can put a RACE indicator on a CLASS statement.  But how can we request that the character variable be set random, estimating how each race group's slope varies across groups?  This would do it (though a race variable is not in this practice data set for now):

 

random intercept race /sub=IDclass  group=race;                    
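For context, here is a hedged sketch of a full call around that statement.  The data set name and the RACE variable are hypothetical, since RACE is not in the practice data:

proc mixed data=mydata covtest noclprint;
class IDclass race;                              /* race: hypothetical character variable */
model engagement = race / solution ddfm=kr;
random intercept race / sub=IDclass group=race;  /* separate random components per race group */
run;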

 

 

How to Read Results off the SAS Default Output

 

RUN THIS.  Be sure you have my data set (http://www.estat.us/sas/esm.zip) in your C directory.  We are modeling the level of student engagement here.  This is an analysis-of-variance model, or intercept-only model.  No predictor is added.

 

Libname here "C:\";

 

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class IDclass IDstudent;

model engagement= /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

random intercept /sub=IDclass ;        /*level3*/

run;

 

YOU GET:

 

                                       The Mixed Procedure

                                        Model Information

                      Data Set                     HERE.ESM
                      Dependent Variable           engagement
                      Weight Variable              precision_weight
                      Covariance Structure         Variance Components
                      Subject Effects              IDstudent(IDclass),
                                                   IDclass
                      Estimation Method            REML
                      Residual Variance Method     Profile
                      Fixed Effects SE Method      Prasad-Rao-Jeske-
                                                   Kackar-Harville
                      Degrees of Freedom Method    Kenward-Roger


REML (restricted maximum likelihood) is the default estimation method.  It is generally the recommended one.

 

 

 

                                          Dimensions

                               Covariance Parameters             3
                               Columns in X                      1
                               Columns in Z Per Subject         17
                               Subjects                         32
                               Max Obs Per Subject             114
                               Observations Used              2316
                               Observations Not Used             0
                               Total Observations             2316

 

                                       The Mixed Procedure

                                 Covariance Parameter Estimates

                                                            Standard         Z
          Cov Parm      Subject                 Estimate       Error     Value        Pr Z

          Intercept     IDstudent(IDclass)      23.3556      2.5061      9.32      <.0001
          Intercept     IDclass                  5.3168      2.0947      2.54      0.0056
          Residual                              31.9932      1.0282     31.12      <.0001
 

                                        Iteration History

                   Iteration    Evaluations    -2 Res Log Like       Criterion

                           0              1     16058.34056441
                           1              2     15484.50136495      0.00180629
                           2              1     15472.31142352      0.00029511
                           3              1     15470.46506873      0.00001303
                           4              1     15470.38907630      0.00000007
                           5              1     15470.38870052      0.00000000

                                    Convergence criteria met.

Recall that this test of the covariance parameters is not based on a chi-square test; SAS will have to fix this problem sometime soon!

 

Intercept     IDstudent(IDclass)  is the level-2 (between-student) variance.

Intercept     IDclass  is the level-3 (between-class) variance.

Residual  is the level-1 variance.

Therefore:

 

Remember that in this study we went into classrooms and gave beepers to kids.  Kids were beeped about 8 times during our research visit.  So the beep level is level-1; it is the repeated-measure level.  You can create a pie chart like this to understand it better.  The variance in student engagement lies mostly within students, and next between students.  There is little between-class difference in students' engagement levels.

So, looking at this, I say, "Engagement depends a lot on the student."  I also say, "Still, for the same student, the engagement level varies a lot!"  And I say, "Gee, I thought the classes that we visited looked very different, but there is not much variance between the classrooms."

You can also use the percentage at each level to convey the same information.  The intraclass correlation is about 9% at the class level.  Intraclass correlation is calculated like this:

IC = variance of the higher level / total variance

where the total variance is the sum of all the variances estimated.

              Because there are three levels, it is a bit confusing what the intraclass correlation means.  But if there are only two levels, it is more straightforward:

IC = level-2 variance / (level-1 variance + level-2 variance)
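As a quick check, here is a small sketch that computes the class-level share from the variance estimates in the output above:

data _null_;
within  = 31.9932;   /* Residual: level-1 (within-student) variance */
student = 23.3556;   /* IDstudent(IDclass): level-2 variance        */
class   =  5.3168;   /* IDclass: level-3 variance                   */
total = within + student + class;
icc_class = class / total;   /* about 0.09, i.e., 9% at the class level */
put icc_class= 6.3;
run;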

 

 

                                         Fit Statistics

                              -2 Res Log Likelihood         15470.4
                              AIC (smaller is better)       15476.4
                              AICC (smaller is better)      15476.4
                              BIC (smaller is better)       15480.8

You can use these to evaluate goodness of fit across models.  But when you are using restricted maximum likelihood, you can only compare two models that differ solely in their random effects.  That is pretty restrictive!

To compare models that differ in their fixed effects as well, you need to use full maximum likelihood (ML).  You request that by changing the estimation-method option.  For example:

                                 

proc mixed data=here.esm covtest noclprint method= ml;

 

The way to do a goodness-of-fit test is the same as for other statistical methods, such as logistic regression models.  You take the difference in the fit statistic (e.g., -2 Res Log Likelihood) and the difference in the number of parameters estimated.  Check these values against a chi-square table and report the P-value.  Here is a silly hypothetical example.  I use AIC for this:

 

Model 1: AIC 10.5; number of parameters estimated 5

Model 2: AIC 12.5; number of parameters estimated 6

 

Difference in AIC => 12.5 – 10.5 = 2

Difference in number of parameters estimated => 1

 

Then I go to a chi-square table, find the P-value where the chi-square statistic is 2 and the DF is 1, and say whether it is statistically significant at a chosen alpha level (like 0.05).  If you feel lazy about finding a chi-square table, Excel lets you test it.  The function to enter into an Excel cell is:

=CHIDIST(x, deg_freedom)

 

This "returns the one-tailed probability of the chi-squared distribution."  So in the above example, I do this:

=CHIDIST(2, 1)

 

I get a p-value of

0.157299265

 

So the result suggests that the P is not small enough to claim that the difference is statistically significant.
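If you would rather stay inside SAS, the same upper-tail probability can be computed with the PROBCHI function (which returns the chi-square CDF); a minimal sketch:

data _null_;
/* upper-tail p-value for a chi-square statistic of 2 with 1 DF */
p = 1 - probchi(2, 1);
put p=;   /* prints about 0.1573, matching the Excel result */
run;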

Can someone tell me if the above is an okay procedure to use?  I am wondering whether it was okay to use this function that says "one-tailed probability of..."  Does it have to be two-tailed???  Please email me at kaz_uekawa@yahoo.com.

 

 

 

                                   Solution for Fixed Effects

                                          Standard
                Effect        Estimate       Error      DF    t Value    Pr > |t|

                Intercept      -1.2315      0.5081    30.2      -2.42      0.0216

Finally, this is the coefficient table.  This model is an intercept-only model, so we have only one line of results.  If we add predictors, their coefficients will be listed in this table.  The degrees of freedom are very small, roughly the number of classes that we visited minus the number of parameters estimated.  Something else must be considered as well, as the fractional value 30.2 indicates (the Kenward-Roger method can produce non-integer degrees of freedom).

Replicating 4 models described by Bryk and Raudenbush (1992), Chapter 2

Research Q1: What is the effect of race ("being Hispanic") on student engagement, controlling for basic covariates?

Random Intercept model where only the intercept is set random.

/*Hispanic effect FIXED*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  idbeep hisp test /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);/*level2*/

random intercept /sub=IDclass ;/*level3*/

run;

 

 

Research Q2: Given that the Hispanic effect is negative, does that effect vary across classrooms?

Random Coefficients Model where the intercept and Hispanic effect are set random.

/*Hispanic effect random*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  idbeep hisp test/solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);/*level2*/

random intercept hisp /sub=IDclass ;/*level3*/

run;

 

 

Research Q3: Now that we know the effect of being Hispanic varies by classroom, can we explain that variation by subject matter (mathematics versus science)?

Slopes-as-Outcomes Model with Cross-level Interaction

A researcher thinks that even after controlling for subject, variance in the Hispanic slopes remains (note hisp on the second RANDOM line).  (MATH in the code below is assumed to be a 0/1 dummy for a mathematics class, derived from SUBJECT; it is not in the variable list above.)

/*Hispanic effect random and being predicted by class characteristics*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  idbeep test math hisp hisp*math /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);/*level2*/

random intercept hisp/sub=IDclass ;/*level3*/

run;

 

 

A researcher thinks that the variance of the Hispanic slope can be completely explained by the subject-matter difference (note that hisp on the second RANDOM line is commented out).  Actually, this is then just a plain interaction term, as we know it in OLS.

A Model with Nonrandomly Varying Slopes

/*Hispanic effect fixed and being predicted by class characteristics*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  idbeep test math hisp hisp*math /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);/*level2*/

random intercept /*hisp*/ /sub=IDclass ;/*level3*/

run;

Cross-level Interaction, Modeling of Slopes

While the HLM software takes level-specific equations as syntax, SAS PROC MIXED accepts just one combined equation (the reduced form) rather than level-specific equations.  When thinking about cross-level interaction effects, though, it is easier to reason with the level-specific representation of the model.

 

Level-specific way (R & B's HLM software uses this syntax):

Level 1: Y = b0 + b1*X + Error

Level 2: b0 = g00 + g01*W + R0

Level 2: b1 = g10 + g11*W + R1

 

 

But this is not how SAS PROC MIXED wants you to program.  You need to insert all the higher-level equations into the level-1 equation.  The equations above reduce to one equation in this way:

 

One equation way (SAS PROC MIXED way)

Y = g00 + g01*W + g10*X + g11*W*X + R1*X + Error + R0

 

How do we translate the level-specific way (the HLM software way) into the one-equation version (the PROC MIXED way)?  The rule of thumb is this:

 

1. Insert all the higher-level equations into the level-1 equation.

2. Identify the variables that appear with an error term.  For example, X appears with R1 (see R1*X).

3. Put that variable on a RANDOM statement.
Here is an example:

Level 1: Y = b0 + b1*X + Error

Level 2: b0 = g00 + g01*W + R0

Level 2: b1 = g10 + g11*W + R1

 

1. Insert higher level equations into the level-1 equation.

Y=b0 + b1*X + Error

Insert level-2 equations into b's

--> Y=[g00 + g01*W + R0]  + [g10+g11*W + R1 ]*X + Error

 

Take out the brackets

--> Y=g00 + g01*W + R0  + g10*X +g11*W*X + R1*X + Error

 

Shuffle the terms around so it is easier to understand, and identify which part is the structural component of the model and which part is the random component:

--> Y = [g00 + g01*W + g10*X + g11*W*X] + [R1*X + Error + R0]

              structural component            random components

The structural part affects every subject in the data in the same way.  The random part varies by person or by the group he/she belongs to.

 

Now write the fixed-effect part of the PROC MIXED syntax.  Focus on the structural part of the equation above.

 

proc mixed  ;

class ;

model Y= W   X   W*X;

run;

 

Next write the random-component part.  The rule of thumb: look at the random component, "R1*X + Error + R0", and notice which variable is multiplied by R1 or by any higher-level error term.  Throw that variable onto the RANDOM statement.  Also note that "intercept" is on the RANDOM statement to represent R0, which is a level-2 error (or the variance of the random intercepts).

 

proc mixed  ;

class ;

model Y= W   X   W*X;

random intercept X / sub=groupID;  /* groupID stands for your level-2 unit ID (hypothetical name) */

run;

 

Just one more look at R1*X + Error + R0:

Error is the residual (or level-1 error in HLM terminology), and PROC MIXED assumes it exists, so we don't have to do anything about it.

 

Again, R0 is the level-2 error (or level-2 intercept), which is why we said "random intercept ...".

 

R1*X is the random coefficient for X, which is why we said "random ... X."

The most important thing is to notice which variable is multiplied by an error term.  In this case, the error term was R1, and it sat right next to X.  This is why we put X on the RANDOM statement.

 

Advantages of the PROC MIXED way of writing one equation

When you start writing it in the PROC MIXED way, you may begin to feel that HLM is not so complicated.  For example, what you used to call level-1 variables, level-2 variables, and level-3 variables are now just "variables."  The reason we differentiate among variables of different levels in HLM is that with that software you have to prepare separate level-1, level-2, and level-3 data sets (though the latest version of HLM has an option to process just one data set).

In HLM terminology, you used to use terms like level-1 error, level-2 error, and level-3 error.  They are now just residuals (= level-1 error) and random effects.  "Centering" might have felt like a magical concept unique to HLM, but now it feels like something that is not really specific to HLM, just something that is awfully useful in any statistical model.  I even use centering for simple OLS regression because it gives a nice meaning to the intercept.

Finally, you realize HLM is just another regression.  Without a random statement, a mixed model is the same as OLS regression.

             This is the same as OLS regression (note that I deleted a random statement):

proc mixed data=ABC covtest noclprint;

title "Non conditional model";

class state;

model actual =/solution ddfm=bw;

run;

 

This is an HLM (just added a random statement).

proc mixed data=ABC covtest noclprint;

title "Non conditional model";

class state;

model actual =/solution ddfm=bw;

random intercept/sub=state;

run;

 

This simple difference is hidden when you use the B&R HLM software, because using an independent program with lots of features makes it look like a whole different ball game.

 

 

Group and grand mean centering

In SAS, we have to prepare the data so that variables are already centered before being processed by PROC MIXED.  In a way this is an advantage over the HLM software: it reminds you that centering is not really specific to HLM.

 

Grand Mean Centering:

proc standard data=JOHN mean=0;

var XXX YYY;

run;

 

Compare this to the making of Z-scores:

proc standard data=JOHN mean=0 std=1;

var XXX YYY;

run;

 

Group Mean Centering

proc sort data=JOHN;

by groupID;

run;

proc standard data=JOHN mean=0;

by groupID;  /* the data must be sorted by groupID first */

var XXX YYY;

run;

 

So why do we center?  The issue of centering is introduced in HLM classes, but I argue centering is not really specific to HLM.  Why don't we center with OLS?  In fact, I do center even with OLS, so that the intercept has some meaning: it corresponds to the value of a typical person.  When I center all the predictors, the intercept means the value for a subject who has 0 on all predictors, and that person is an average person in the sample, having the mean score on every predictor.  For ease of interpretation, I standardize continuous variables (both dependent and independent) to a mean of zero and a standard deviation of 1 (in other words, Z-scores).  For dummy variables, I just center.

Centering means subtracting the mean of a variable from all observations.  If your height is 172 cm and the mean is 172, your new value after centering will be 0.  In other words, a person of average height receives 0.  A person with 5 on this new variable is a person whose height is the average value + 5 cm.

              Converting values to Z-scores does two things: setting the mean to zero and setting the standard deviation to one.  Here I focus on the centering part (setting the mean to zero).  Centering gives a meaning to the intercept.  Usually when we see results of OLS or other regression models, we don't really care about the intercept, because it is just the arbitrary value where the line crosses the Y-axis on an X-Y plot, as we learned in algebra.

The intercept obtained by centering the predictors is also called the "adjusted mean."  The intercept is now the mean of a group after everything else is adjusted for.

When we do HLM, we want the intercept to have a substantive meaning in this way, so we know what we are modeling.  Why do we care about this especially with HLM, and not OLS?  With HLM, we talk about the variance of group-specific intercepts, which is the distribution of an intercept value for each group unit.  We may even plot that distribution.  When doing so, we feel better if we know the intercept has some substantive meaning.  For example, if we see that group A's intercept is 5, it means the typical person in group A has a value of 5.  Without centering, we would be saying "group A's regression line crosses the Y-axis at 5," which does not mean much.

Centering Dummy Variables

In HLM, we sometimes center dummy variables.  This feels strange at first.  For example, with gender, we code gender=1 if female and 0 if male, and then center the variable around its mean.  The following PROC STANDARD does group-mean centering, for example:

 

Group Mean Centering

proc standard data=JOHN mean=0;  /* mean=0 centers; do not add std=1, which would standardize instead */

by groupID;  /* the data must be sorted by groupID first */

var GENDER;

run;

 

If there are half men and half women, males will receive -0.5 and females will receive 0.5 as values.
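A tiny sketch to verify this claim (the two-row data set is hypothetical):

data tiny;
input gender;   /* 0 = male, 1 = female */
datalines;
0
1
;
run;

/* center around the mean (0.5 here), giving -0.5 and 0.5 */
proc standard data=tiny out=tiny_centered mean=0;
var gender;
run;

proc print data=tiny_centered; run;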

 

 

The reason we do this is that we want to give a meaning to the intercept.  When we center gender around its mean, the intercept will mean:

 

The value that is adjusted for gender composition.

 

What does this mean?

 

Under Construction

The use of Weight

PROC MIXED has a WEIGHT statement.  But people tell me that we cannot use a sampling weight on the WEIGHT statement.  Please email me if you know more about this subject (kaz_uekawa@yahoo.com).

 

 

Residual analysis using ODS

ODS (the Output Delivery System) and other SAS procedures allow a researcher to check residuals through a routine procedure.  In the beginning I wasn't sure what "residual analysis" meant, but I found out that it means investigating the random effects that we estimated.  For example, in our model of student engagement, we estimated a student-level engagement level and a class-level engagement level.  So the actual values for each student and each teacher's class are there somewhere behind PROC MIXED.  Because we estimated them, we can of course print them out, examine them graphically, and understand their statistical properties.

In the study this data came from, we were actually in the classrooms, observing what was going on.  With the procedures above, we were able to compare the distribution of class and student means against our field notes and memories.  Sometimes we were surprised to see that what we thought was a highly engaged classroom was in fact low on engagement.  Triangulation of quantitative and qualitative data is easy, because PROC MIXED can quickly show a researcher the residuals in this way.

Note that I put "solution" on the second RANDOM line (it does not matter which RANDOM line you put it on) and requested that all the data related to that line be produced and reported.  And this time I use ODS (solutionR=) to save this information.

 

ODS listing close;/*suppress printing*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

random intercept /sub=IDclass solution;      /*level3*/

ods output solutionF=mysolutiontable1

CovParms=mycovtable1 Dimensions=mydimtable1 solutionR=mysolRtable1;

run;

ODS listing;/*allow printing for the procedures to follow after this*/

 

 

I put "ODS listing close" to suppress printing (in SAS's output window); otherwise, we get long outputs.  Don't forget to put "ODS listing" at the end, so whatever procedures follow regain the ability to print.

The table produced by solutionR= contains the actual parameters (intercepts in this case) for all students and classes.  Unfortunately this table (here arbitrarily named "mysolRtable1") is messy, because it contains both the individual-level and the class-level intercepts.  I selected part of the table for demonstration purposes:

                                                            StdErr

                Obs     Effect      IDclass    IDstudent    Estimate        Pred      DF     tValue     Probt

 

                  1    Intercept     1_FF       01_1_FF       6.5054      2.3477    1964       2.77    0.0056

                  2    Intercept     1_FF       10_1_FF      -2.7622      2.1226    1964      -1.30    0.1933

                 14    Intercept     .          .                  0      4.8588    1964       0.00    1.0000

                 15    Intercept     .          .                  0      4.8588    1964       0.00    1.0000

                 16    Intercept     .          .                  0      4.8588    1964       0.00    1.0000

                 17    Intercept     1_FF                    -0.2199      1.4898    1964      -0.15    0.8827

              ...and many more lines

 

The estimated level-2 intercepts (the intercepts for each individual, or "individual means" in this case) are the rows where the level-2 unit indicator (IDstudent) has values (e.g., 01_1_FF and 10_1_FF, i.e., observations 1 and 2).  There are obviously lots of them, since there are about 300 students to account for.  There are also level-3 intercepts (the intercepts for each class, or class means), which in the example above is observation 17.  You can tell because the level-2 ID, IDstudent, is empty.  Observations 14, 15, and 16 above have dots for both IDs, which (I think) means they are missing cases and were not used in the run.

Perhaps the first thing we want to do after running the program is to check the distribution of the parameters we decided to set random.  We need to look at it to see whether things make sense.

 

To look at level-3 intercepts (classroom means), select cases when level-2 ID (IDstudent) is empty.

 

proc print data=mysolRtable1;

where  idstudent="";

run;

 

/*for something more understandable*/

proc sort data=mysolRtable1 out=sorteddata;

by estimate;

run;

proc timeplot data=sorteddata;

where  idstudent="";

plot  estimate /overlay hiloc npp ref=.5;

id IDclass estimate;

run;

 

 

 

In order to look at level-2 intercepts (student means) and the distribution of values, you must select cases when level-2 ID (IDstudent) is neither empty nor missing.

proc print data=mysolRtable1;

where  idstudent ne "" and idstudent ne ".";

run;

 

proc univariate data=mysolRtable1 plot;

where  idstudent ne "" and idstudent ne ".";

var estimate;run;

 

 

 

 


 

           V.      PROC MIXED used along with other SAS functionalities

·        With ODS (Output Delivery System)

ODS is the most powerful feature of SAS; I don't think an equivalent exists in other software packages.  Further details of ODS can be found in another manual I wrote (downloadable from the same location).  You can save the results of any SAS procedure, including PROC MIXED, as data sets.

 

/*Intercept-only model*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

random intercept /sub=IDclass ;                   /*level3*/

ods output solutionF=mysolutiontable1  CovParms=mycovtable1 ;

run;

/*ADD predictors FIXED*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  idbeep hisp test /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);/*level2*/

random intercept /sub=IDclass ;/*level3*/

ods output solutionF=mysolutiontable2  CovParms=mycovtable2;

run;

 "ODS output" tells the program to produce data sets containing the statistical results.  We assign names to those data sets:

SolutionF=you_choose_name → produces a data set with the parameter estimates

CovParms=you_choose_name → produces a data set with the variance information

and so on.

Are you curious about what you just created?  Do a PROC PRINT to look at them.

proc print data=type_the_name_of_the_data_here;run;

 

Forget default outputs.  Here is a simple example of how to create a readable table using ODS.  For an example of how this can be elaborated further, see Appendix III, Sample SAS program.

data newtable;set mysolutiontable1 mysolutiontable2;

length sign $ 3;  /* without this, sign would be truncated to 1 character */

if probt < 0.1 then sign='+';

if probt < 0.05 then sign='*';

if probt < 0.01 then sign='**';

if probt < 0.001 then sign='***';

if probt < -999 then sign='   ';  /* missing values sort low in SAS, so this blanks missing p-values */

run;

/*take a look*/

proc print data=newtable;

var effect estimate sign;

run;

/*create an excel sheet and save it in your C drive*/

PROC EXPORT DATA= newtable  OUTFILE= "C:\newtable.xls" DBMS=EXCEL2000 REPLACE;RUN;

 

 

 


·        With ODS, and GRAPH

We use the data set mycovtable1 obtained on the previous page, or from the program below.

 

ODS listing close;/*suppress printing*/

proc mixed data=here.esm covtest noclprint;

weight precision_weight;

class idbeep hisp IDclass IDstudent subject;

model engagement=  /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);    /*level2*/

random intercept /sub=IDclass solution;           /*level3*/

ods output solutionF=mysolutiontable1  CovParms=mycovtable1 Dimensions=mydimtable1 solutionR=mysolRtable1;

run;

ODS listing;/*allow printing for the procedures to follow after this*/

 

 

Now that the table of variance components (which I named mycovtable1) is created, I want to manipulate it so it is ready for graphing.  Check what is in "mycovtable1" by running:

proc print data=mycovtable1;run;

 

It has lots of variables and values that are not really ready for graphing.  In the data step below, I modify the data to look like this, so it is ready to be used by PROC GCHART:

 

Obs          level           Estimate

 

 1     Between-Individual     23.6080

 2     Between-Class           5.2838

 3     Within-Individual      31.6213

 

 

data forgraphdata;set mycovtable1;

/*creating a new variable called LEVEL*/

length level $ 18;

if CovParm="Residual" then level="Within-Individual";

if Subject="IDstudent(IDclass)" then level="Between-Individual";

if Subject="IDclass" then level="Between-Class";

run;

 

 

goptions cback=black htext=2 ftext=duplex;title1 height=3 c=green

'Distribution of Variance';

footnote1 h=2 f=duplex c=yellow 'Engagement study: Uekawa, Borman, and Lee';

 

proc gchart data=forgraphdata;

  pie level /

     sumvar=estimate

     fill=solid

     matchcolor;

run;

/*

Borrowed program code for pie chart from http://www.usc.edu/isd/doc/statistics/sas/graphexamples/piechart1.shtml*/

 

 

Other Useful SAS functionalities for HLM analyses

·        Creating group-level mean variables

One could use PROC MEANS to derive group-level means.  I don't recommend this, since it involves the extra step of merging the mean data back onto the main data set, and extra steps always create room for error.  PROC SQL does it in one pass.

proc sql;

create table newdata as

select *,

mean(variable1) as mean_var1,

mean(variable2) as mean_var2,

mean(variable3) as mean_var3

from here.esm

group by IDCLASS;

quit;

 

Normalizing sample weight right before PROC MIXED

This is a simple example.  F1pnlwt is the name of the weight in this data set.

proc sql;

create table newdata as

select *,

f1pnlwt * (count(f1pnlwt)/Sum(f1pnlwt)) as F1weight

from here.all; /*PROC SQL does not require a RUN statement*/

Sometimes you may want to create more than one weight, each specific to a dependent variable of your choice, because, depending on the outcome, the number of nonmissing cases can be very different.

/*create a 1/0 flag marking present/missing values*/

%macro flag (smoke=);

flag&smoke=1;if &smoke =. then flag&smoke=0;

%mend flag;

data here.all;  /* the flags must be created inside a data step */

set here.all;

%flag (smoke=F1smoke);

%flag (smoke=F1drink);

%flag (smoke=F1cutclass);

%flag (smoke=F1problem);

%flag (smoke=F1marijuana);

run;

/*create a weight specific to each outcome variable.  Note that the flags created above are now instrumental in deriving the weights*/

proc sql;

create table final as

select *,

f1pnlwt * (count(f1pnlwt)/Sum(f1pnlwt)) as F1weight,

f1pnlwt * (count(F1smoke)/Sum(f1pnlwt*flagF1smoke)) as F1smokeweight,

f1pnlwt * (count(F1drink)/Sum(f1pnlwt*flagF1drink)) as F1drinkweight,

f1pnlwt * (count(F1cutclass)/Sum(f1pnlwt*flagF1cutclass)) as F1cutclassweight,

f1pnlwt * (count(F1problem)/Sum(f1pnlwt*flagF1problem)) as F1problemweight,

f1pnlwt * (count(F1marijuana)/Sum(f1pnlwt*flagF1marijuana)) as F1marijuanaweight

from here.all;

quit;

 

·        The use of Macro (also described in another one of my SAS tutorials)

This example executes PROC MIXED three times in one run, each time with a different outcome variable.

%macro john (var1=);

proc mixed data=here.esm covtest noclprint;

title "Modeling &var1";

weight precision_weight;

class IDclass IDstudent;

model &var1= /solution ddfm=kr;

random intercept /sub=IDstudent(IDclass);      /*level2*/

random intercept /sub=IDclass ;                     /*level3*/

run;

%mend john;

%john (var1=engagement);

%john (var1=another_variable2);

%john (var1=another_variable3);


        VI.      APPENDIX ODS: How to find available tables from SAS's procedures

To find out what tables are available from any SAS procedure, do this:

ods trace on;

/* your SAS procedure here */

ods trace off;

The log will tell you what kinds of tables are available.
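For example, with the practice data (assuming here.esm is assigned as earlier in this manual), you could run:

ods trace on;
proc mixed data=here.esm;
model engagement= ;
run;
ods trace off;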

Available tables from PROC MIXED:

 


Output Added:

-------------

Name:       ModelInfo

Label:      Model Information

Template:   Stat.Mixed.ModelInfo

Path:       Mixed.ModelInfo

-------------

WARNING: Length of CLASS variable IDclass truncated to 16.

 

Output Added:

-------------

Name:       Dimensions

Label:      Dimensions

Template:   Stat.Mixed.Dimensions

Path:       Mixed.Dimensions

-------------

NOTE: 16 observations are not included because of missing values.

 

Output Added:

-------------

Name:       IterHistory

Label:      Iteration History

Template:   Stat.Mixed.IterHistory

Path:       Mixed.IterHistory

-------------


Output Added:

-------------

Name:       ConvergenceStatus

Label:      Convergence Status

Template:   Stat.Mixed.ConvergenceStatus

Path:       Mixed.ConvergenceStatus

-------------

NOTE: Convergence criteria met.

 

Output Added:

-------------

Name:       CovParms

Label:      Covariance Parameter Estimates

Template:   Stat.Mixed.CovParms

Path:       Mixed.CovParms

-------------

 

Output Added:

-------------

Name:       FitStatistics

Label:      Fit Statistics

Template:   Stat.Mixed.FitStatistics

Path:       Mixed.FitStatistics

-------------

 

Output Added:

-------------

Name:       SolutionF

Label:      Solution for Fixed Effects

Template:   Stat.Mixed.SolutionF

Path:       Mixed.SolutionF

-------------



     VII.      Appendix: Other resources for running PROC MIXED

General Reading

  • Multilevel stat book (all chapters on-line) by Goldstein http://www.arnoldpublishers.com/support/goldstein.htm
  • Bryk and Raudenbush, Hierarchical Linear Models (Sage Publication 1992)

Gives a quick understanding when going from HLM to PROC MIXED

  • Using SAS PROC MIXED to Fit Multilevel Models, Hierarchical Models, and Individual Growth Models, Judith D. Singer (Harvard University) Journal of Educational and Behavioral Statistics winter 1998, volume 23, number 4. A reprint of the paper downloadable at http://gseweb.harvard.edu/~faculty/singer/
  • Suzuki and Sheu's tutorial (PDF format) http://www.kcasug.org/mwsug1999/pdf/paper23.pdf

Comparison of HLM and PROC MIXED

  • "Comparison of PROC MIXED in SAS and HLM for Hierarchical Linear Models" by Annie Qu says they produce almost identical results. http://www.pop.psu.edu/stat-core/software/hlm.htm

Further details of PROC MIXED

  • On-line SAS CD has a chapter on PROC MIXED.
  • SAS System for Mixed Models, by Ramon C. Littell, George A. Milliken, Walter W. Stroup, and Russell D. Wolfinger. http://www.amazon.com/exec/obidos/ASIN/1555447791/noteonsasbyka-20

Best place to go for technical details

  • email support@sas.com with your SAS license registration information (Copy the head of the log file into the email text)

Mailing list for multilevel models. http://www.jiscmail.ac.uk/lists/multilevel.html

 

Applications done in my neighborhood

  • Bidwell, Charles E., and Jeffrey Y. Yasumoto.  1999.  "The Collegial Focus: Teaching Fields, Collegial Relationships, and Instructional Practice in American High Schools."  Sociology of Education 72:234-56.  (Social influence among high school teachers was modeled.  B&R's HLM.)
  • Uekawa, Kazuaki.  2000.  Making Equality in 40 National Education Systems.  The University of Chicago, Department of Sociology.  Dissertation.  (Examined how the effect of parents' education on students' mathematics scores varies by nation.  Focused on institutional differences among national education systems.  PROC MIXED used.)
  • Yasumoto, Jeffrey Y., Kazuaki Uekawa, and Charles E. Bidwell.  2001.  "The Collegial Focus and High School Students' Achievement."  Sociology of Education 74:181-209.  (Growth of students' mathematics achievement scores was modeled.  B&R's HLM.)
  • McFarland, Daniel A.  2001.  "Student Resistance: How the Formal and Informal Organization of Classrooms Facilitate Everyday Forms of Student Defiance."  American Journal of Sociology 107(3):612-78.  (Modeled the occurrence of student resistance in classrooms.  Multilevel Poisson regression, SAS macro GLIMMIX.)
  • Uekawa, Kazuaki, and Charles E. Bidwell.  2002.  School as a Network Organization and Its Implication for High School Students' Attitudes and Conduct in School: An Exploratory Study of NELS and LSAY.  Presented at the American Sociological Association, 2002.  (Growth of students' problem behaviors was modeled.  Multilevel logistic regression, SAS macro GLIMMIX.)  Manuscript available upon request.
  • Uekawa, Kazuaki, Kathryn Borman, and Reginald Lee.  2002.  Student Engagement in America's Urban High School Mathematics and Science Classrooms: Findings on Effective Pedagogy, Ethnicity and Culture, and the Impact of Educational Policies.  (Modeled the ups and downs of students' engagement level.  Random intercept model, SAS PROC MIXED.)  Manuscript available upon request.

 

Nonlinear Model

 

  • Dr. Wolfinger's paper on the NLMIXED procedure.  http://www.sas.com/rnd/app/papers/nlmixedsugi.pdf
  • GLIMMIX macro download at http://ewe3.sas.com/techsup/download/stat/glmm800.html
  • How to use SAS for logistic regression with correlated data, by Kuss.  Some description of GLIMMIX and NLMIXED.  http://www2.sas.com/proceedings/sugi27/p261-27.pdf

 

 


 

  VIII.      Frequently Asked Questions


Q: What is correlated error, and why is it a problem?

A: The error for each observation has to be independent, not correlated; otherwise, the variance is underestimated, which leads to underestimated standard errors for the coefficients.  In other words, OLS regression, which ignores the dependency of errors within group units, may give us results that look too good.

The paragraph above has several steps.  First, errors being correlated means your error and my error are similar for a systematic reason, such as you and I being in the same group.  Why does correlated error lead to underestimation of variance?  Here is one extreme example that brings quick understanding.  Imagine you thought you took a sample of 100 persons, but wrongly collected all 100 surveys from the same person.  In this case, all observations are the same; the errors are perfectly correlated.  The variance is 0, and that is certainly an underestimate.

Then why does underestimated variance lead to underestimated standard errors?  The formula for the standard errors includes the variance: the smaller the variance, the smaller the standard errors.  And why is underestimating standard errors bad?  Because everything becomes statistically significant when we underestimate standard errors.
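A standard way to quantify this (the Kish design effect, added here as a worked illustration) is: with m observations per group and intraclass correlation ICC, the variance of a group-based sample mean is understated by roughly a factor of

design effect = 1 + (m - 1) * ICC

For example, with 10 students per classroom and an ICC of about 0.09 (the class-level figure from the engagement data above), the design effect is 1 + 9 * 0.09, about 1.8, so naive standard errors would be too small by a factor of about sqrt(1.8), roughly 1.3.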

 

Q: What does HLM do to solve this problem?

A: When we say error, we mean each observation's deviation from the value predicted by the model.  When subjects are drawn from group units, such as schools, the error may contain not just one's individual deviation from the norm but also the group unit's deviation from the norm.  Ignoring this leads to underestimated variance and then underestimated standard errors.  HLM estimates errors both for individuals and for group units.  By doing this, HLM attempts to obtain more realistic standard errors.