Yi-Hsin Chen, University of South Florida, Tampa, FL; Isaac Li, University of South Florida, Tampa, FL; Jeffrey D. Kromrey, University of South Florida, Tampa, FL
Just an experiment with a Winsteps control option:
What happens if you enter random numbers into SAFILE?
The CATEGORY PROBABILITIES table shows strangely shaped curves.
Reliability gets wacky (very low).
Conclusion: you will definitely notice that something has gone wrong.
Rasch model analysis has the following set of advantages:
- Being logit scores, Rasch scores have no theoretical upper and lower boundary values (useful for statistical analysis)
- Rasch scores facilitate pretest and posttest comparison based on different sets of test items (you can avoid administering the identical test at pre and post)
- Rasch model can handle missing values (as long as a subject is not missing all items)
- Rasch model (or Rasch model software programs) comes with an excellent set of diagnostic statistics to evaluate the fit between model and data
#01 CHECK RESPONSE VALUES (IF THEY ARE CODED INTO CORRECT NUMERALS)
Items used for Rasch model analysis are usually ordinal variables based on response values such as “Strongly agree,” “Agree,” “Disagree,” and “Strongly Disagree.” Code these so that higher agreement receives higher numbers:
- Strongly agree 4
- Agree 3
- Disagree 2
- Strongly Disagree 1
If these numbers are flipped by mistake, you will have a catastrophic situation where the results are reversed. Do two things to prevent such a catastrophe:
- Confirm the coding by looking at the actual survey form and at the data (look at it until your eyes bleed).
- People are likely to agree with items because they feel social pressure to report good things when taking a survey. Look at the original data and check whether you see a lot of positive responses.
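A quick sketch of this check in Python. The coding map and the label strings below are hypothetical; match them to your actual survey form:

```python
from collections import Counter

# Hypothetical coding map -- verify it against the actual survey form.
CODES = {"Strongly agree": 4, "Agree": 3, "Disagree": 2, "Strongly Disagree": 1}

def check_coding(raw_responses):
    """Tabulate coded values so you can eyeball the coding direction.

    If most respondents agree (social pressure to report good things),
    the high codes (3 and 4) should dominate the counts.
    """
    coded = [CODES[r] for r in raw_responses]
    return Counter(coded)

# Example: a positively skewed item, as is typical for agreement items.
counts = check_coding(["Agree", "Strongly agree", "Agree", "Disagree", "Agree"])
print(counts)  # most mass on 3 and 4 -> coding direction looks right
```

If the low codes dominate for an item you expect people to agree with, suspect flipped coding before trusting any downstream results.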
#02 CHECK THE N OF SUBJECTS INCLUDED IN THE ANALYSIS
Check the output and confirm that the number of subjects used is correct. Checking the number of subjects is the number one protection against errors.
#03 CHECK THE N OF ITEMS INCLUDED IN THE ANALYSIS
Check the output and confirm that the number of items used is correct. Especially when you are not using all items’ data in your analysis (you might have decided to drop some items), be sure you used the ones you wanted to use. With Winsteps, misspecifying the control file can cause subject IDs to be read in as response data by mistake. Avoid this (such a case will produce an extremely low reliability score).
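A minimal sketch of both dimension checks (#02 and #03) in Python. The expected counts are hypothetical placeholders; substitute your study's actual numbers:

```python
# Hypothetical targets -- use your own study's values.
EXPECTED_SUBJECTS = 250
EXPECTED_ITEMS = 20

def check_dimensions(data):
    """Confirm the N of subjects and N of items before fitting.

    Assumes a rectangular response matrix: one row per subject,
    one column per item. A ragged matrix often signals a control
    file misspecification (e.g., IDs read in as response data).
    """
    n_subjects = len(data)
    n_items = len(data[0]) if data else 0
    assert all(len(row) == n_items for row in data), "ragged rows: check the control file"
    assert n_subjects == EXPECTED_SUBJECTS, f"expected {EXPECTED_SUBJECTS} subjects, got {n_subjects}"
    assert n_items == EXPECTED_ITEMS, f"expected {EXPECTED_ITEMS} items, got {n_items}"
    return n_subjects, n_items
```

Running this before the main analysis turns a silent data error into a loud one.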
#04 CHECK WHAT VALUE WAS USED FOR MISSING SCORES
When a subject does not provide any response, Winsteps assigns a token number (-2, I think) to indicate that the value is missing. This token should be treated as a missing value and should NOT be included in the analysis dataset. If you treat the token value (-2 in the case of Winsteps) as a true value, you will have a catastrophic situation where an arbitrary number is used as a real data point. Replace such a number with “.” (dot) before analysis, as statistical software such as SAS or SPSS will treat a dot as a missing value.
Winsteps reference: definition of the status variable in Winsteps output.
When a subject lacks data, the missing value is indicated by -2.
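A minimal sketch of the cleanup step in Python, assuming the -2 token described above (here represented as `None`, the Python analogue of SAS/SPSS's dot):

```python
# Winsteps' missing-status token, per the reference above.
MISSING_TOKEN = -2

def clean_missing(responses):
    """Replace the -2 token with None so downstream software
    treats those entries as missing rather than as real scores."""
    return [None if v == MISSING_TOKEN else v for v in responses]

print(clean_missing([3, -2, 4, 1, -2]))  # [3, None, 4, 1, None]
```

The same idea applies in SAS or SPSS: recode -2 to the system missing value (dot) before any analysis touches the data.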
Basic QC procedures should catch 99% of errors. Advanced procedures are more intricate.
#05 INVESTIGATE ITEM DIFFICULTY SCORES
If you are using item difficulty parameters provided by the test developer, compare them against the ones you derived from the dataset you collected. They should be more or less comparable. If not, investigate whether the discrepancy is caused by a data error.
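One simple way to operationalize "more or less comparable" is a correlation between the two sets of difficulties. A sketch in Python; the 0.8 floor is an arbitrary illustration, not an official cutoff:

```python
import math

def compare_difficulties(published, derived, r_floor=0.8):
    """Pearson correlation between published and freshly derived
    item difficulties. A low correlation flags a possible data
    error (e.g., reversed coding or misaligned items)."""
    n = len(published)
    mp = sum(published) / n
    md = sum(derived) / n
    cov = sum((p - mp) * (d - md) for p, d in zip(published, derived))
    sp = math.sqrt(sum((p - mp) ** 2 for p in published))
    sd = math.sqrt(sum((d - md) ** 2 for d in derived))
    r = cov / (sp * sd)
    return r, r >= r_floor
```

Note that a uniform shift between the two sets (a scale origin difference) still yields a high correlation, which is fine: Rasch difficulties are identified only up to the choice of origin.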
#06 HISTORICALLY COMPARE RESULTS
If you are repeating the study, compare your results with historical data (e.g., last year’s results).
Mike wrote this for me:
log(P_nij / P_ni(j-1)) = B_n - D_i - F_j, where F_j is the Andrich threshold between categories j-1 and j.
For a Rasch/Winsteps analysis that includes anchored items (item difficulty parameters fixed at previously calibrated values, as opposed to being estimated fresh from the data), check Table 14.1 of the Winsteps output, which provides various diagnostic statistics.
In the educational evaluation field, we often have access to vertically equated scales (scales meaning scores, measures, points). Vertically equated scores in education are comparable across grades: you can pick a score from a 5th grader and a score from an 8th grader and consider them to be measuring the same construct, such as math ability, on the same scale. I can elaborate on this concept in a couple of different ways.
- Vertically equated scales allow you to compare students of different grades on a common scale.
- If a 4th grader got a score of 50 and a 9th grader also got a score of 50, they have the same ability level.
- The 10-point difference among 5th graders (e.g., 50 vs. 60) and the 10-point difference among 8th graders (e.g., 60 vs. 70) are considered equal.
Instead of providing a detailed methodological note, I'd like to use a metaphor to explain why equating is possible across different grades.
Using PROC LOGISTIC to Estimate the Rasch Model
Tianshu Pan, Pearson; Yumin Chen, the University of Texas Health Science Center at San Antonio
ABSTRACT
This paper describes how to use PROC LOGISTIC to estimate the Rasch model and make its estimates consistent with the results of the standard Rasch model software WINSTEPS.
Can this be right? If so, it helps reduce the computational demand of the procedure. Page 4:
"When thousands of persons take a test, the procedure takes a long time to estimate the parameters. It is well known that the Rasch model gives the same parameter estimates for each person who receives the same total score. So, variable ‘person’ is able to be replaced with variable ‘total’ when all examinees answer all items as shown by Nord (2008). After the model is fit, the estimate of the parameter for each person is equal to the estimate of the parameter of the total score corresponding to the person’s total score. The third code example and its output are shown as follows:"
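The quoted speed-up rests on total score being a sufficient statistic for person ability in the Rasch model. A minimal sketch of the data-reduction step it implies (collapsing persons into total-score groups with frequency weights before fitting), independent of any particular estimation software:

```python
from collections import Counter

def collapse_by_total(response_matrix):
    """Collapse a complete dichotomous response matrix into
    {total_score: count}. Under the Rasch model, every person in
    the same total-score group receives the same ability estimate,
    so only one parameter per distinct total needs estimating
    (when all examinees answer all items)."""
    return Counter(sum(row) for row in response_matrix)

# 5 persons, 3 dichotomous items -> only 3 distinct totals to estimate.
matrix = [[1, 1, 0], [1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 1]]
print(collapse_by_total(matrix))  # Counter({2: 3, 3: 1, 1: 1})
```

With thousands of examinees but, say, 20 items, this reduces the person side of the model from thousands of levels to at most 21, which is the source of the computational saving the paper describes.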