Statistical joint test of categorical variables when expressed as a series of dummy variables

When I have multiple subgroups represented as a series of dummy variables (e.g., race groups, grade levels), I want to know whether those dummy variables, taken as a set, contribute to the model with statistical significance. This can be called a joint test because I want to know whether, for example, the race groups together (not separately) make a difference to the model.

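The easiest way to do this is to treat those variables as classification variables by listing the original categorical variables in the CLASS statement. You will then get a joint statistical test for each of them in the "Type III Tests of Fixed Effects" table.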

proc glimmix ..;

class race grade_level;

....

run;

In my applications I almost always use numeric versions of variables, i.e., dummy variables (coded as 0 or 1). I like this approach because I can run PROC MEANS on them to create a descriptive statistics table.
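As a minimal sketch of that descriptive step, the call below summarizes the dummy variables. It assumes the dataset name (usethis) and the variable names used in the GLIMMIX model later in this post; the choice of statistics is only an illustration.

```sas
/* Descriptive statistics for the 0/1 dummy variables.          */
/* Dataset and variable names are taken from the GLIMMIX call   */
/* later in this post; the statistics listed are illustrative.  */
proc means data=usethis n mean std min max;
  var treat black hispanic other grade09 grade10 grade11;
run;
```

Because each variable is coded 0 or 1, its mean is the proportion of observations in that group, which is what makes the table useful as a sample description.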

The question is how to get joint statistical tests when all of my predictors are numerically coded and thus I can't rely on the CLASS statement (shown in the syntax example above).

The GLIMMIX syntax further below treats race groups and grade levels as numerically coded dummy variables (1 if yes, else 0).

The parameter estimates table will show a coefficient for each of the numeric variables; however, it does not tell me whether the race groups as a set matter to the model, or whether the grade levels as a set matter. For example, even when the coefficient for being black is statistically significant, that only says how black students differ from white students (the reference group in this example). We still don't know whether the race dummy variables jointly make a statistically significant contribution to the model.
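In symbols, the joint question for race can be written as below. The notation is my own shorthand rather than anything SAS prints: g is the link function, u_j is the random intercept for group unit j, and the beta subscripts simply label the dummy variables used in this post.

```latex
% Notation assumed for illustration only; it does not come from SAS output.
\begin{align*}
g\bigl(E[Y_{ij} \mid u_j]\bigr) &= \beta_0 + \beta_T\,\mathrm{treat}_{ij}
  + \beta_B\,\mathrm{black}_{ij} + \beta_H\,\mathrm{hispanic}_{ij} + \beta_O\,\mathrm{other}_{ij} \\
  &\quad + \beta_{09}\,\mathrm{grade09}_{ij} + \beta_{10}\,\mathrm{grade10}_{ij}
  + \beta_{11}\,\mathrm{grade11}_{ij} + u_j \\[4pt]
H_0 &: \beta_B = \beta_H = \beta_O = 0 \\
H_1 &: \text{at least one of } \beta_B,\ \beta_H,\ \beta_O \text{ is nonzero}
\end{align*}
```

The individual t-tests in the parameter estimates table each examine one of these coefficients on its own; the joint test examines all three restrictions at once. The same form applies to the three grade-level coefficients.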

(Again, this can be done easily by using class variables instead, as shown earlier; however, I like using numeric variables in my models.)

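CONTRAST statements will do the trick. Within a CONTRAST statement, commas separate the rows of the contrast, so listing one row per dummy variable tests those coefficients jointly.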

proc glimmix data=usethis namelen=32;
class groupunit;
model Y= treat black hispanic other grade09 grade10 grade11/
solution ddfm=kr dist=&dist link=&link ;
output out=&outcome.gmxout residual=resid;
random intercept /subject=groupunit;
CONTRAST 'Joint F-Test Race groups ' Black 1, Hispanic 1, other 1;
CONTRAST 'Joint F-Test Grade levels' grade09 1, grade10 1, grade11 1;

ods output
ParameterEstimates=_3_&outcome.result covparms=_3_&outcome.cov
Contrasts=cont&outcome;
run;
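To look at the joint tests afterwards, one option is simply to print the dataset that the ODS OUTPUT statement saved. This is a sketch that assumes the macro variable &outcome is still defined with the same value used in the GLIMMIX call above.

```sas
/* Print the saved contrast results: one row per CONTRAST statement. */
/* Assumes &outcome still resolves to the value used above.          */
proc print data=cont&outcome noobs;
run;
```

Each contrast here has three rows, so, provided the three dummy variables are not collinear, each joint F-test should be reported with three numerator degrees of freedom. A small p-value for 'Joint F-Test Race groups' suggests that the race dummy variables as a set contribute to the model.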

 
