## Sample size needed for estimating a proportion with a certain level of precision

http://bold-ed.com/calculator.htm#calculator

http://www.surveysystem.com/sscalc.htm
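Calculators like the two linked above typically rest on the normal-approximation formula n = z^2 * p * (1 - p) / e^2, optionally followed by a finite population correction. The data step below is a minimal sketch of that calculation, not a reproduction of either site's code. The dataset name and the values of `p`, `margin`, and `pop_size` are illustrative assumptions.

```sas
/* Sketch: sample size for estimating a proportion.            */
/* All input values below are hypothetical, chosen for illustration. */
data sample_size;
  z        = probit(0.975);   /* critical z for 95% confidence, about 1.96 */
  p        = 0.5;             /* expected proportion; 0.5 is the most conservative */
  margin   = 0.05;            /* desired margin of error (half-width of the CI) */
  pop_size = 1000;            /* population size for the finite population correction */

  /* Infinite-population sample size */
  n0 = (z**2) * p * (1 - p) / (margin**2);
  n_required = ceil(n0);                           /* 385 with these inputs */

  /* Finite population correction */
  n_adjusted = ceil(n0 / (1 + (n0 - 1) / pop_size));   /* 278 with these inputs */

  put n_required= n_adjusted=;
run;
```

Setting `p` to 0.5 maximizes p * (1 - p), so it gives the largest (safest) sample size when the true proportion is unknown.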

## Rasch data

/* Column input: ID (name and sex) is in columns 1-10, and the 18 item
   responses are single digits in columns 11-28. The data lines must keep
   this fixed-width alignment. */
data raschdata;
input
ID \$ 1-10
Q01 11 Q02 12 Q03 13 Q04 14 Q05 15 Q06 16 Q07 17 Q08 18 Q09 19 Q10 20
Q11 21 Q12 22 Q13 23 Q14 24 Q15 25 Q16 26 Q17 27 Q18 28;
cards ;
Richard M 111111100000000000
Tracie  F 111111111100000000
Walter  M 111111111001000000
Blaise  M 111100101000000000
Ron     M 111111111100000000
William M 111111111100000000
Susan   F 111111111111101000
Linda   F 111111111100000000
Kim     F 111111111100000000
Carol   F 111111111110000000
Pete    M 111011111000000000
Brenda  F 111110101100000000
Mike    M 111110011111000000
Zula    F 111111111110000000
Frank   M 111111111111100000
Dorothy F 111111111010000000
Rod     M 111101111100000000
Britton F 111111111100100000
Janet   F 111111111000000000
David   M 111111111100100000
Thomas  M 111111111110100000
Betty   F 111111111111000000
Bert    M 111111111100110000
Rick    M 111111111110100110
Don     M 111011000000000000
Barbara F 111111111100000000
Audrey  F 111111111010000000
Anne    F 111111001110010000
Lisa    F 111111111000000000
James   M 111111111100000000
Joe     M 111111111110000000
Martha  F 111100100100000000
Elsie   F 111111111101010000
Helen   F 111000000000000000
;
run;

## PROC CALIS for confirmatory factor analysis, or even a Rasch model?

I'd like to investigate whether PROC CALIS can fit a confirmatory factor analysis (CFA) or a Rasch model.

PROC CALIS COVARIANCE CORR RESIDUAL MODIFICATION data=one;
LINEQS
risk_1n= F1 + E1,
risk_2n = F1 + E2,
risk_3n= F1 + E3,
risk_4n= F2 + E4,
risk_5n= F2 + E5,
risk_6n= F2 + E6;
STD
F1 = 1,
F2 = 1,
E1-E6 = VARE1-VARE6;
COV
F1 F2 = CF1F2;
VAR risk_1n risk_2n risk_3n risk_4n risk_5n risk_6n;
RUN;
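In the LINEQS block above, no loading has a parameter name, so each loading is treated as fixed at 1. With the factor variances also fixed at 1 in the STD statement, no loading is estimated. The variant below is a sketch of a CFA with freely estimated loadings. The parameter names `L1` to `L6` are hypothetical, and it uses the older LINEQS style in which the parameter name precedes the factor name, so check it against the documentation for your SAS release.

```sas
/* Sketch: two-factor CFA with freely estimated loadings.        */
/* L1-L6 are hypothetical parameter names for the loadings.      */
PROC CALIS COVARIANCE CORR RESIDUAL MODIFICATION data=one;
  LINEQS
    risk_1n = L1 F1 + E1,
    risk_2n = L2 F1 + E2,
    risk_3n = L3 F1 + E3,
    risk_4n = L4 F2 + E4,
    risk_5n = L5 F2 + E5,
    risk_6n = L6 F2 + E6;
  STD
    F1 = 1,                 /* factor variances fixed at 1 to set the scale */
    F2 = 1,
    E1-E6 = VARE1-VARE6;    /* error variances estimated */
  COV
    F1 F2 = CF1F2;          /* factor covariance estimated */
  VAR risk_1n risk_2n risk_3n risk_4n risk_5n risk_6n;
RUN;
```

Fixing the factor variances at 1 identifies the scale of each factor, which is what allows all six loadings to be estimated.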

## Dummy variables in logistic regression models

Why do switching the values of a dummy variable and using the CLASS statement in PROC LOGISTIC change the coefficients in a logistic regression?

(1) and (2) produce the same odds ratio, as do (3) and (4), but the intercepts and coefficients differ within each pair.

(1)
proc logistic data=here.asdf descending ;
model college= boy ;
run;

(2)
proc logistic data=here.asdf descending ;
class girl;
model college= girl ;
run;

(3)
proc logistic data=here.asdf descending ;
model college= girl;
run;

(4)
proc logistic data=here.asdf descending ;
class boy;
model college= boy ;
run;

| Estimate | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Intercept | 0.5346 | 0.3645 | 0.1945 | 0.3645 |
| Slope | boy -0.3401 | girl -0.1701 | girl 0.3401 | boy 0.1701 |
| Odds ratio | 0.712 | 0.712 | 1.405 | 1.405 |
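The pattern in these estimates matches PROC LOGISTIC's default for CLASS variables, which is effect coding (-1/+1) rather than 0/1 dummy coding. Under effect coding the coefficient is half of the dummy-coded one (0.1701 versus 0.3401), and the intercept is the average of the two group logits: (0.5346 + 0.1945) / 2 is about 0.3645. The sign flips because the reported parameter refers to the first level (0), with the last level (1) as the reference. The sketch below requests reference coding instead, which should reproduce model (3). It assumes `girl` is coded 0/1, as the other models imply.

```sas
/* Sketch: CLASS with reference (dummy) coding instead of the default effect coding. */
/* Assumes girl takes the values 0 and 1; girl=0 is set as the reference level.      */
proc logistic data=here.asdf descending;
  class girl (ref='0') / param=ref;
  model college = girl;
run;
```

The odds ratio is unaffected by the coding because PROC LOGISTIC computes it from the contrast between the two levels, not from the raw coefficient.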

## T-test for proportions of multiple groups using SAS procedures and datasteps

/* This example has only two groups, but the code also runs with more than two. */

data kaz;
set sashelp.class;
/* Ages 12 and 13 get a missing outcome, so those cases drop out of the analysis */
if age < 12 then THIS_IS_OUTCOME=0;
if age > 13 then THIS_IS_OUTCOME=1;
run;

%let group=sex;
%let outcome=THIS_IS_OUTCOME;
%let dataname=kaz;

ods listing;
ods trace on;

proc means data=&dataname;
where &outcome ne .;
class &group;
var &outcome;
ods output summary=kaz_mean;
run;

proc glimmix data=&dataname;
class &group;
model &outcome=&group ;
lsmeans &group /diff ;
ods output Diffs=kaz_t;
run;

data kaz_t2;
set kaz_t;
keep &group _&group estimate;
run;
proc sort;
by &group;run;

data kaz_mean1;
set kaz_mean;
prop1=&outcome._mean;
n1=&outcome._n;
keep &group prop1 n1;
run;
proc sort;by &group;run;

data kaz_mean2;
set kaz_mean;
prop2=&outcome._mean;
n2=&outcome._n;
_&group=&group;
keep _&group prop2 n2;
run;
proc sort;by _&group;run;

data mix1;
merge kaz_t2 kaz_mean1;
by &group;
run;
proc sort;
by _&group;run;

data mix2;
merge mix1 kaz_mean2;
by _&group;
if estimate ne .;
DEG_FD=N1+N2-2;
/*QC’ed
tValue=2.228;
DEG_FD=10;
*/
tValue=(prop1-prop2)/SQRT(prop1*(1-prop1)/n1 + prop2*(1-prop2)/n2);
/* two-tailed test */
P=(1-probt(abs(tValue),DEG_FD))*2;

/* The LENGTH statement must name the same variable that is assigned below */
length _2TAIL_TEST \$ 2;

if P < .05 then _2TAIL_TEST = "*";
group1=&group;
group2=_&group;
classvar="&group";
dif=estimate;
outcome="&outcome";
run;

/* Final table: RETAIN placed before SET fixes the column order */
data mix3;
retain outcome classvar group1 group2 n1 prop1 n2 prop2 dif p _2TAIL_TEST;
set mix2;
keep outcome classvar group1 group2 n1 prop1 n2 prop2 dif p _2TAIL_TEST;
run;
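For reference, the test that the `mix2` data step computes can be written out as below. Here p_1, p_2 and n_1, n_2 stand for `prop1`, `prop2`, `n1`, and `n2`, and F is the cumulative distribution function of Student's t, which is what `probt` returns. The symbols are notation introduced here for illustration.

```latex
t = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}},
\qquad
\mathrm{df} = n_1 + n_2 - 2,
\qquad
P = 2\,\bigl(1 - F_{t,\mathrm{df}}(|t|)\bigr)
```

The standard error uses each group's own proportion (unpooled), so the statistic is not the same as the pooled-proportion z test.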

## SAS function: count the occurrences of a character

data test;
input transfer_matrix \$20.;
cards;
..0000000
11111111..
11000000001
1111111
111111111..
1111111000
0000.110
;
run;

data test;
set test;
count_1=countc(transfer_matrix,'1');
run;

proc print;
run;
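`COUNTC` counts every character that appears in its second argument, so a multi-character list counts any of those characters, not the substring. To count a substring, `COUNT` is the function to use. A small sketch of the difference follows; the dataset and variable names are hypothetical.

```sas
/* Sketch: COUNTC (characters from a list) versus COUNT (substring). */
data count_demo;
  s = '11000000001';
  n_ones    = countc(s, '1');            /* 3: number of '1' characters */
  n_zero_or_one = countc(s, '01');       /* 11: every character is a 0 or a 1 */
  n_dots    = countc('0000.110', '.');   /* 1: number of periods */
  n_pairs   = count('0000.110', '11');   /* 1: occurrences of the substring '11' */
  put n_ones= n_zero_or_one= n_dots= n_pairs=;
run;
```

Counting periods this way is a quick check for how many positions in a string such as `transfer_matrix` are missing.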


## Creating confidence intervals using datastep

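X is a proportion (it ranges from 0 to 1) and N is the number of cases. The macro variable Alpha holds the critical z value rather than the significance level itself; setting it to 1.96 gives 95% confidence intervals (two-tailed).

%let Alpha=1.96;

/* These assignments go inside a data step that contains X and N */
X_STDERR =sqrt(((X*(1-X))/N));
X_CI95_lower =X-(X_STDERR*&alpha);
X_CI95_upper =X+(X_STDERR*&alpha);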
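The assignments above only run inside a data step. A minimal, self-contained sketch is shown below. The dataset name `ci_example` and the two input rows are made up for illustration.

```sas
/* Sketch: 95% confidence intervals for proportions in a data step. */
/* ci_example and the input values are hypothetical.                */
%let Alpha = 1.96;   /* critical z value for a two-tailed 95% interval */

data ci_example;
  input X N;
  X_STDERR     = sqrt(X*(1-X)/N);          /* standard error of the proportion */
  X_CI95_lower = X - X_STDERR*&Alpha;
  X_CI95_upper = X + X_STDERR*&Alpha;
  datalines;
0.40 100
0.25 400
;
run;

proc print data=ci_example;
run;
```

For the first row (X = 0.40, N = 100) the standard error is about 0.049, so the interval is roughly 0.304 to 0.496. This is the normal-approximation (Wald) interval, which can fall outside 0 to 1 when X is near the boundary or N is small.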

REFERENCE

http://davidmlane.com/hyperstat/B9168.html

## HLM equation and PROC MIXED/GLIMMIX syntax

level1: y = b0 + b1*X + error1
level2: b0 = g00 + g01*group + error2
level2: b1 = g10 + g11*group + error3

Substituting the level-2 equations into level 1 gives:

y = (g00 + g01*group + error2) + (g10 + g11*group + error3)*X + error1

y = g00 + g01*group + g10*X + g11*group*X + error2 + error3*X + error1

This translates into PROC MIXED syntax as follows. The fixed terms (X, group, X*group) go into the MODEL statement. The level-2 errors go into the RANDOM statement: error2 stands alone, so it is the random intercept, and error3 is multiplied by X, so X is the random slope. error1 is the residual.

proc mixed ;
class sub group ;
model y=x group x*group;
/* sub is assumed to identify the level-2 units */
random intercept x / subject=sub;
run;
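The same model can be sketched in PROC GLIMMIX, which uses nearly identical MODEL and RANDOM statements. The dataset name `mydata` is hypothetical, and `sub` is assumed to identify the level-2 units. The `dist=` and `link=` options are written out only to show where a non-normal outcome would be specified.

```sas
/* Sketch: the two-level model in PROC GLIMMIX.                   */
/* mydata is a hypothetical dataset; sub is assumed to be the     */
/* level-2 unit identifier.                                       */
proc glimmix data=mydata;
  class sub group;
  model y = x group x*group / solution dist=normal link=identity;
  random intercept x / subject=sub;   /* random intercept and random slope of x */
run;
```

For a binary outcome, replacing the options with `dist=binary link=logit` would give the multilevel logistic counterpart.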

## Saving log and output files into external text files (to avoid stopping of a SAS run)

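/* Route the listing output and the log to external text files */
filename printout 'C:\temp\output.txt';
filename logout 'C:\temp\log.txt';
proc printto print=printout log=logout new;
run;

/* ... the SAS steps whose log and output should be captured go here ... */

/* Restore the default log and output destinations */
proc printto ;
run;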