# HLM (Hierarchical Linear Modeling)

For visitors who only have one minute:

HLM is just like OLS, but it is more careful about incorporating the sources of uncertainty that come from subjects being nested within groups. Because it handles errors carefully, the results of statistical tests are conservative (meaning results become less impressive if you use HLM). Being conservative is considered better in research studies than being liberal. If the between-group difference in outcome is not large, results (parameter estimates, standard errors, p-values) from HLM and OLS will be similar. Generally speaking, parameter estimates from the two models are similar. Standard errors differ: HLM's standard errors are larger than OLS's.

For other visitors:

Hierarchical Linear Modeling (HLM) is a type of regression model used frequently for education datasets. Education datasets typically sample students from a set of schools, and thus information about students is correlated (which is not great, for the reason I state below). You can say this in a couple of different ways. I will try two:

1. Students' outcomes are related to one another if they are in the same school (John and Mary have similar outcome scores because they sit right next to each other in the same school)
2. Residuals from the regression model are not independent from one another (John's residual and Mary's residual are close)

With this type of data, classical methods (e.g., OLS) would not produce correct standard errors, as they rely on the assumption that residuals are independent from each other. Such models are called fixed effect models; they ignore the source of error that resides between group units, and thus standard errors are underestimated.
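One way to write down the "source of error between group units" is the random-intercept model below. The notation is an assumption borrowed from common HLM textbooks, not something defined on this page: student i is in school j, u_0j is the school-level error that OLS leaves out, and e_ij is the student-level residual.

```latex
\begin{aligned}
Y_{ij} &= \gamma_{00} + \gamma_{10} X_{ij} + u_{0j} + e_{ij}, \\
u_{0j} &\sim N(0, \tau_{00}), \qquad e_{ij} \sim N(0, \sigma^2), \\
\rho &= \frac{\tau_{00}}{\tau_{00} + \sigma^2}.
\end{aligned}
```

Here ρ is the intraclass correlation: the share of the unexplained variance that sits between schools. When ρ is close to zero, the school-level term hardly matters and OLS and HLM give similar results.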

Here "residuals" very roughly means the same thing as outcome values (though outcome values sit on the left side of the equation and residuals are on the right side). If you plot outcome values against residuals from the intercept-only model (a model with no predictors), they will form a straight line.
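The straight-line claim can be checked with a few lines of Python. This is only a small sketch: the five scores are made-up numbers, and the intercept-only model is fitted simply by taking the mean.

```python
from statistics import mean

# Hypothetical outcome scores for five students (made-up numbers).
scores = [520.0, 548.0, 555.0, 561.0, 591.0]

# Intercept-only model: the only parameter is the overall mean.
intercept = mean(scores)
residuals = [y - intercept for y in scores]

# Each residual is the outcome shifted by the same constant, so the
# (outcome, residual) points fall on a straight line with slope 1.
slopes = [
    (residuals[i + 1] - residuals[i]) / (scores[i + 1] - scores[i])
    for i in range(len(scores) - 1)
]

print(intercept)
print(slopes)
```

Every slope between neighboring points comes out as 1, which is why residuals and outcome values carry roughly the same information when there are no predictors.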

If you apply the classical approach (e.g., OLS) to hierarchically structured data, you will underestimate the standard errors of the coefficients. This problem is also referred to as the clustering problem.
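To get a feel for how large the underestimation can be, here is a rough sketch using the design effect, 1 + (m - 1) × ICC. That formula is not from this page; it is a standard approximation for a sample mean when every cluster has the same size m. The function names and the numbers (25 students per school, ICC of 0.125, naive standard error of 2.0) are hypothetical.

```python
import math


def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation for a mean when data come in equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def adjusted_se(naive_se: float, cluster_size: int, icc: float) -> float:
    """Inflate a naive (independence-based) standard error for clustering."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))


# Hypothetical example: 25 students per school, ICC = 0.125.
deff = design_effect(cluster_size=25, icc=0.125)
se = adjusted_se(naive_se=2.0, cluster_size=25, icc=0.125)

print(deff)
print(se)
```

With these made-up inputs the variance is inflated four-fold, so the naive standard error would need to be doubled. With an ICC of zero the design effect is 1 and nothing changes.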

HLM is one approach to adjust for the clustering problem, so that statistical test results are more realistic and conservative. Personally, I understood HLM better when I learned geospatial statistics, which is also an approach to fixing the data dependency problem (an observation from Washington DC and an observation from Arlington VA are similar due to geographical proximity).

HLM is one of many approaches that deal with the data dependency problem. It is only one approach and fixes one type of problem, leaving many other problems unfixed.

Parameter estimates (AKA coefficients, effects), however, are not drastically different between classical methods and HLM. If OLS tells you that US junior high school students scored 555 points on average, HLM would give you almost the same information. However, standard errors would be larger for HLM than for OLS, as HLM considers sources of error more rigorously than OLS.
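The small simulation below illustrates this with made-up data. It is a sketch under strong assumptions: a balanced design (same number of students in every school), no predictors, and invented variance values. In that special case, the intercept of a random-intercept model and its standard error should reduce to the mean of the school means and the spread of those means, so no mixed-model software is needed. The "cluster-aware" numbers here stand in for HLM; they are not output from an actual HLM fit.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical balanced design and made-up parameter values.
N_SCHOOLS, N_STUDENTS = 30, 20
GRAND_MEAN, SD_SCHOOL, SD_STUDENT = 555.0, 30.0, 30.0

schools = []
for _ in range(N_SCHOOLS):
    school_effect = random.gauss(0.0, SD_SCHOOL)  # shared by all students in a school
    schools.append([
        GRAND_MEAN + school_effect + random.gauss(0.0, SD_STUDENT)
        for _ in range(N_STUDENTS)
    ])

all_scores = [y for school in schools for y in school]
school_means = [mean(school) for school in schools]

# OLS-style: treat all 600 students as independent observations.
ols_estimate = mean(all_scores)
ols_se = stdev(all_scores) / math.sqrt(len(all_scores))

# Cluster-aware: average the school means, and let the uncertainty
# depend on how much schools differ from one another.
hlm_estimate = mean(school_means)
hlm_se = stdev(school_means) / math.sqrt(N_SCHOOLS)

print(ols_estimate, ols_se)
print(hlm_estimate, hlm_se)
```

The two point estimates agree (in a balanced design the mean of the school means is the overall mean), while the cluster-aware standard error is noticeably larger because it is based on 30 schools rather than 600 supposedly independent students.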

My manual for SAS PROC MIXED for doing HLM

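```sas
/* Random-intercept model: students (rows) nested within schools. */
proc glimmix data=asdf;
  class School_ID;
  /* s prints the fixed-effect solution; ddfm=kr requests Kenward-Roger degrees of freedom */
  model Y = X1 X2 / dist=normal link=identity s ddfm=kr;
  /* random intercept for each school */
  random int / subject=School_ID;
  /* save residuals to the dataset gmxout */
  output out=gmxout residual=resid;
run;
```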

Special Topics