/*******************************************************************\
|Doing Rasch Model Using PROC NLMIXED
|
|GET person measure and item difficulty
|
Updated October 2003

READ
> Kamata, A. (2001). Item analysis by the hierarchical generalized linear
> model. Journal of Educational Measurement, 38, 79-93.

Using a famous Rasch data set, the "Knox Cube Test" (thanks to Doug, who
suggested that I do), I show how PROC NLMIXED can be used to replicate
the results of WINSTEPS.

Why bother, when there is WINSTEPS? Personally, I felt like I understood
the Rasch model when I tried to think of it in a logistic regression
framework. The Rasch model can be thought of as a special case of
multilevel logistic regression (see Aki Kamata's work). For me, thinking
of the Rasch model in terms of logistic regression gave me an "aha" moment.

Data should be transformed to look like this, with one line per person
per item. You have to create lots of dummy variables indicating which
item each line corresponds to.

RespondentID  ItemID  Response  Item1flag  Item2flag  Item3flag  Item4flag
1             1       1         1          0          0          0
1             2       0         0          1          0          0
1             3       1         0          0          1          0
1             4       1         0          0          0          1

The occurrence of 1 in the response, as opposed to 0, will be modeled by
flag variables that indicate the items (e.g., Item1flag is 1 if an
observation comes from item 1).

We use a multilevel logistic regression model (as opposed to a simple
logistic regression model):

log( P_ij / (1 - P_ij) ) = b1*item1flag + b2*item2flag + .... + error_j

where i is an item and j is a person (correct me if I am wrong), so ij
is a specific item of a specific person. P_ij is the probability that
person j gets item i right. There is no separate intercept, because the
flags already cover every item; this matches the eta line in the
NLMIXED code below, where error_j is called u.

Note that error_j is a random effect coming from the multilevel model
framework. In an ordinary logistic regression there is no error term.

So this model translates into English as follows: the probability that a
line of data gets 1 as a response depends on which item is being taken
and on how far the test taker deviates from the average test taker.
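
A small worked example may help here. The numbers are made up for
illustration; they are not estimates from the Knox Cube data. Suppose
the coefficient on item3flag is 2.0 and a person's error term is 0.5.
On that person's item 3 line, item3flag is 1 and every other flag is 0,
so
   eta = 2.0 * 1 + 0.5 = 2.5
   P   = exp(2.5) / (1 + exp(2.5)) = 12.18 / 13.18 = about 0.92
For another person whose error term is -1.0, the same item gives
   eta = 2.0 * 1 - 1.0 = 1.0
   P   = exp(1.0) / (1 + exp(1.0)) = 2.72 / 3.72 = about 0.73
Same item, different person, different probability of seeing a 1.
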
The coefficients of these item flag variables correspond to item
easiness (the opposite of item difficulty). The easier the item, the
more likely we are to see 1 as the response. So the coefficients, with
the sign flipped, correspond to the Rasch model's item difficulty
measures.

Now, the individual-level error corresponds to the person measures. It's
like this: you can predict the response somewhat from which item you are
dealing with, but the part that is still unexplained can be attributed
to the individual's ability level (reflected in the individual-level
error).

This gave me an aha moment. The ordinary Rasch equation and its famous
curve did not give me this quick understanding, partly because the
connection between the equation and the graph is not immediately clear
(as opposed to OLS, which has a visually clear linkage between an
equation and an X-Y plot. Agree?)

Problem 1: on my laptop it takes about three minutes for NLMIXED to do
its run.

Problem 2: the way I transformed the data to make it NLMIXED-ready was
tedious and idiosyncratic. Can anyone suggest an alternative?

/*
&INST
TITLE='Knox Cube Test (Best Test Design p.31)'
;; This comment about KCT.DAT will go into the output file
NI=18
ITEM1=11
NAME1=1
TABLES=11111111111111111111111
PERSON=KID
ITEM=TAP
STBIAS=Y
PFILE=KCT.PF
IFILE=KCT.IF
&END
1= 1-4
2= 2-3
3= 1-2-4
4= 1-3-4
5= 2-1-4
6= 3-4-1
7= 1-4-3-2
8= 1-4-2-3
9= 1-3-2-4
10=2-4-3-1
11=1-3-1-2-4
12=1-3-2-4-3
13=1-4-3-2-4
14=1-4-2-3-4-1
15=1-3-2-4-1-3
16=1-4-2-3-1-4
17=1-4-3-1-2-4
18=4-1-3-4-2-1-4
END NAMES
Richard M 111111100000000000
Tracie  F 111111111100000000
Walter  M 111111111001000000
Blaise  M 111100101000000000
Ron     M 111111111100000000
William M 111111111100000000
Susan   F 111111111111101000
Linda   F 111111111100000000
Kim     F 111111111100000000
Carol   F 111111111110000000
Pete    M 111011111000000000
Brenda  F 111110101100000000
Mike    M 111110011111000000
Zula    F 111111111110000000
Frank   M 111111111111100000
Dorothy F 111111111010000000
Rod     M 111101111100000000
Britton F 111111111100100000
Janet   F 111111111000000000
David   M 111111111100100000
Thomas  M 111111111110100000
Betty   F 111111111111000000
Bert    M 111111111100110000
Rick    M 111111111110100110
Don     M 111011000000000000
Barbara F 111111111100000000
Adam    M 111111100000000000
Audrey  F 111111111010000000
Anne    F 111111001110010000
Lisa    F 111111111000000000
James   M 111111111100000000
Joe     M 111111111110000000
Martha  F 111100100100000000
Elsie   F 111111111101010000
Helen   F 111000000000000000

|by Kazuaki Uekawa, Ph.D.
|
|(my permanent email address kuekawa@alumni.uchicago.edu or kaz@estat.us)
|
|http://www.estat.us/ and follow the link for SAS tutorials
|
********************************************************************/

/* Column input: name in columns 1-8, gender in column 9, and the 18 item
   responses in columns 11-28. The data lines must keep this alignment. */
data troy;
  input Name $ 1-8 gender $ 9
        item1 11 item2 12 item3 13 item4 14 item5 15 item6 16
        item7 17 item8 18 item9 19 item10 20 item11 21 item12 22
        item13 23 item14 24 item15 25 item16 26 item17 27 item18 28;
  datalines;
Richard M 111111100000000000
Tracie  F 111111111100000000
Walter  M 111111111001000000
Blaise  M 111100101000000000
Ron     M 111111111100000000
William M 111111111100000000
Susan   F 111111111111101000
Linda   F 111111111100000000
Kim     F 111111111100000000
Carol   F 111111111110000000
Pete    M 111011111000000000
Brenda  F 111110101100000000
Mike    M 111110011111000000
Zula    F 111111111110000000
Frank   M 111111111111100000
Dorothy F 111111111010000000
Rod     M 111101111100000000
Britton F 111111111100100000
Janet   F 111111111000000000
David   M 111111111100100000
Thomas  M 111111111110100000
Betty   F 111111111111000000
Bert    M 111111111100110000
Rick    M 111111111110100110
Don     M 111011000000000000
Barbara F 111111111100000000
Adam    M 111111100000000000
Audrey  F 111111111010000000
Anne    F 111111001110010000
Lisa    F 111111111000000000
James   M 111111111100000000
Joe     M 111111111110000000
Martha  F 111100100100000000
Elsie   F 111111111101010000
Helen   F 111000000000000000
;
run;

/* I assign numeric IDs based on the order of people's names (just for convenience) */
data troy;
  set troy;
  ID = _N_;
run;

proc print data=troy;
  title "This is original data";
run;

/* After transposing, each item is a row and each person is a column (_1 to _35) */
proc transpose data=troy out=troy2;
  id ID;
run;

proc print data=troy2;
  title "I transposed it here";
run;

/* I am transforming the data so that the data going into NLMIXED has an item-person structure */
/* I hate the way I am doing this. It's completely useless for people to replicate.
I did this a few years ago and now have no energy to redo this in a better way */
/* But it would be a good idea to try a more generic way of transforming the data */
******************************************************
******************************************************;

/* kuri: take one person's column (_1, _2, ...) from the transposed data
   and store it as RESPONSE, one row per item */
%macro kuri (which=);
data d&which;
  set troy2;
  keep _NAME_ response person;
  response = _&which;
  person = &which;
run;
%mend kuri;

%kuri (which=1);  %kuri (which=2);  %kuri (which=3);  %kuri (which=4);  %kuri (which=5);
%kuri (which=6);  %kuri (which=7);  %kuri (which=8);  %kuri (which=9);  %kuri (which=10);
%kuri (which=11); %kuri (which=12); %kuri (which=13); %kuri (which=14); %kuri (which=15);
%kuri (which=16); %kuri (which=17); %kuri (which=18); %kuri (which=19); %kuri (which=20);
%kuri (which=21); %kuri (which=22); %kuri (which=23); %kuri (which=24); %kuri (which=25);
%kuri (which=26); %kuri (which=27); %kuri (which=28); %kuri (which=29); %kuri (which=30);
%kuri (which=31); %kuri (which=32); %kuri (which=33); %kuri (which=34); %kuri (which=35);

/* kuri2: create a 0/1 flag that is 1 when the row belongs to the named item */
%macro kuri2 (kore=);
  &kore = 0;
  if _NAME_ = "&kore" then &kore = 1;
%mend kuri2;

/* Stack the 35 person data sets and add the 18 item flags */
data all;
  set d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 d11 d12 d13 d14 d15 d16 d17 d18
      d19 d20 d21 d22 d23 d24 d25 d26 d27 d28 d29 d30 d31 d32 d33 d34 d35;
  by person;
  %kuri2 (kore=item1);  %kuri2 (kore=item2);  %kuri2 (kore=item3);
  %kuri2 (kore=item4);  %kuri2 (kore=item5);  %kuri2 (kore=item6);
  %kuri2 (kore=item7);  %kuri2 (kore=item8);  %kuri2 (kore=item9);
  %kuri2 (kore=item10); %kuri2 (kore=item11); %kuri2 (kore=item12);
  %kuri2 (kore=item13); %kuri2 (kore=item14); %kuri2 (kore=item15);
  %kuri2 (kore=item16); %kuri2 (kore=item17); %kuri2 (kore=item18);
run;

/* Keep only the rows that belong to the 18 items */
data all;
  set all;
  if _NAME_ in ("item1" "item2" "item3" "item4" "item5" "item6"
                "item7" "item8" "item9" "item10" "item11" "item12"
                "item13" "item14" "item15" "item16" "item17" "item18");
run;
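
/* A possible answer to the "can anyone suggest an alternative?" question
   in the header: the same item-person layout can probably be built in one
   DATA step with arrays. This is only a sketch, assuming TROY still holds
   ID and item1-item18 as created above. The names ALL_ALT, resp, saved,
   k and m are made up for this illustration, and the result has not been
   checked against ALL (PROC COMPARE could be used for that). */
data all_alt;
  set troy(keep=ID item1-item18);
  array resp  {18} item1-item18;   /* wide responses, reused below as item flags */
  array saved {18} _temporary_;    /* holds this person's 18 responses */
  length _NAME_ $ 8;
  person = ID;
  do k = 1 to 18;                  /* save the responses before overwriting them */
    saved{k} = resp{k};
  end;
  do k = 1 to 18;                  /* write one output row per item */
    _NAME_   = cats("item", k);
    response = saved{k};
    do m = 1 to 18;
      resp{m} = (m = k);           /* 1 for the current item, 0 otherwise */
    end;
    output;
  end;
  drop ID k m;
run;
/* ALL_ALT is not used by the steps that follow; it only shows the idea.
   The 18 would need to change if the number of items changes. */
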
*********************************************************************************
*********************************************************************************;
proc print data=all;
  title1 "This is how the data should look before going into NLMIXED";
  title2 "RESPONSE is the outcome and the rest are a bunch of independent variables indicating items";
run;

/* Note: in this data everyone gets items 1-3 right and nobody gets item 18
   right, so those four coefficients have no finite maximum likelihood
   estimate and may come out very large with huge standard errors. */
*ods trace on;
proc nlmixed data=all;
  *where _NAME_="item1" or _NAME_="item2" or _NAME_="item3" or _NAME_="item4" or _NAME_="item5" or _NAME_="item6"or _NAME_="item7" or _NAME_="item8";/*with this you can delete items*/
  title "See parameter estimate table for item difficulty measure";
  /* Linear predictor: one easiness coefficient per item plus the person random effect u */
  eta = b1*item1 + b2*item2 + b3*item3 + b4*item4 + b5*item5 + b6*item6
      + b7*item7 + b8*item8 + b9*item9 + b10*item10 + b11*item11 + b12*item12
      + b13*item13 + b14*item14 + b15*item15 + b16*item16 + b17*item17 + b18*item18
      + u;
  /* Inverse logit turns eta into the probability of a correct response */
  expeta = exp(eta);
  p = expeta/(1+expeta);
  model response ~ binary(p);
  random u ~ normal(0,s2u) subject=person out=PERSON_MEASURE_FILE;
  /* The out= part could be done at the ODS line, but I could not figure out how */
  ods output ParameterEstimates=ITEM_DIFFICULTY_FILE;
run;
*ods trace off;

data items;
  set ITEM_DIFFICULTY_FILE;
  /* Flip the sign so that "item easiness" becomes "item difficulty";
     the variance parameter s2u is left alone */
  if parameter ne "s2u" then estimate = -estimate;
run;

proc print data=items;
  title "ITEM DIFFICULTY MEASURES derived by PROC NLMIXED";
run;

proc print data=PERSON_MEASURE_FILE;
  title "Person Measures (Student Engagement Level) derived by PROC NLMIXED";
run;

proc univariate plot data=PERSON_MEASURE_FILE;
  title1 "Hey, troy, this is how your kids are paying attention to your class!";
  title2 "Take good care of those outlier kids.";
  var Estimate;
run;