kuekawa – Page 3 – My Statistical tools

September 28, 2022

Baseline Check using SAS

proc means data=both4 stackodsoutput;
class treat;
var
male minority age
risk_miss
risk_08
risk_09
risk_10
felony_binary
misdem_Sum
felony_Sum
misdem_binary;
ods output summary=temp1a;
run;

proc means data=outgs2 stackodsoutput;
class treat;
var
male minority age
risk_miss
risk_08
risk_09
risk_10
felony_binary
misdem_Sum
felony_Sum
misdem_binary;
ods output summary=temp1b;
run;

data temp1a;
set temp1a;
datatype="(1)raw data";
run;

data temp1b;
set temp1b;
datatype="(2)sample";
run;

data temp1;
set temp1a temp1b;run;

data T;set temp1;
if treat=1;
suji=_n_;
T_N=N;
T_Mean=Mean;
T_SD=StdDev;
T_Min=Min;
T_Max=Max;
keep suji variable T_N T_mean T_SD T_min T_max datatype;
run;
proc sort;by variable datatype;run;

data C;set temp1;
if treat=0;
suji=_n_;
C_N=N;
C_Mean=Mean;
C_SD=StdDev;
C_Min=Min;
C_Max=Max;
keep variable datatype C_N C_mean C_SD C_min C_max;
run;
proc sort;by variable datatype;run;

data TC;
/*merge on datatype too, so raw-data rows never pair with matched-sample rows*/
merge T C;
by variable datatype;

/*create statistics*/
mean_dif=(T_Mean-C_Mean);
/*Standardized effects*/
g1=
((T_N-1)*(T_SD*T_SD))
+((C_N-1)*(C_SD*C_SD));

g2=T_N + C_N -2;
g3=sqrt(g1/g2);
WWC_effect=mean_dif/g3;

outcome_type="interval";
if T_Min=0 and T_Max=1 and C_Min=0 and C_Max=1 then do;
outcome_type="binary";
/*&usethis._Mean_Yes-&usethis._Mean_NO*/
Odds_C=(C_Mean/(1-C_Mean));
Odds_T=(T_Mean/(1-T_Mean));

Odds_ratio=Odds_T/Odds_C;

LN_C=LOG(Odds_C);
LN_T=LOG(Odds_T);
LN_DIF=LN_T-LN_C;

WWC_effect=/*abs*/(round(LN_DIF/1.65,0.001));/*fixed 06 01 2016*/

/*If greater than 0.2, Small Effect
If greater than 0.5, Medium Effect
If greater than 0.8, Large Effect*/
end;

if WWC_effect ne . then do;
if abs(WWC_effect) > 0.2 then cohen="Small ";
if abs(WWC_effect) > 0.5 then cohen="Medium";
if abs(WWC_effect) > 0.8 then cohen="Large";
end;

drop
LN_DIF
LN_C
LN_T
Odds_ratio
Odds_T
Odds_C
g3
g2
g1
;

run;
proc sort;by suji;run;
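
For a quick cross-check outside SAS, here is a minimal Python sketch of the same two effect sizes: the standardized mean difference with a pooled SD, and the log odds ratio divided by 1.65 for 0/1 variables. The function names and the input numbers are hypothetical and only illustrate the arithmetic of the data step above.

```python
import math

def wwc_effect_size(t_n, t_mean, t_sd, c_n, c_mean, c_sd, binary=False):
    """Effect size in the style of the SAS data step above.

    binary=True  -> log odds ratio divided by 1.65
    binary=False -> mean difference divided by the pooled SD
    """
    if binary:
        odds_t = t_mean / (1 - t_mean)
        odds_c = c_mean / (1 - c_mean)
        return math.log(odds_t / odds_c) / 1.65
    pooled_var = ((t_n - 1) * t_sd ** 2 + (c_n - 1) * c_sd ** 2) / (t_n + c_n - 2)
    return (t_mean - c_mean) / math.sqrt(pooled_var)

def cohen_label(effect):
    """Same strict thresholds as the SAS code; blank below 0.2."""
    size = abs(effect)
    if size > 0.8:
        return "Large"
    if size > 0.5:
        return "Medium"
    if size > 0.2:
        return "Small"
    return ""

# Hypothetical baseline rows (not taken from the data above)
age_effect = wwc_effect_size(50, 10.0, 2.0, 50, 8.8, 2.0)
male_effect = wwc_effect_size(50, 0.60, 0.0, 50, 0.50, 0.0, binary=True)

print(round(age_effect, 3), cohen_label(age_effect))
print(round(male_effect, 3), cohen_label(male_effect))
```

A proportion of exactly 0 or 1 makes the odds undefined, so the binary branch fails on such rows; the SAS step has the same limitation.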

September 26, 2022

Proc psmatch

proc psmatch data=asdf1 region=cs;
class FLAG2 econ_status rural size ;
psmodel FLAG2(Treated="Y")=
n_10th_graders
prop_minority_10G;
match method=greedy(K=1) exact=(cate) stat=lps caliper=0.25;
output out(obs=match)=outgs2 lps=_Lps matchid=_matchID;
run;
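
To see what greedy 1:1 matching with a caliper does, here is a rough Python sketch. It is not how PROC PSMATCH is implemented. It assumes the propensity scores are already estimated, that the caliper is 0.25 times a pooled SD of the logit of the propensity score, and that treated units are processed from the highest logit down. The exact= condition is left out, and all unit IDs and scores are made up.

```python
import math
from statistics import variance

def logit(p):
    return math.log(p / (1 - p))

def greedy_match(treated, controls, caliper_mult=0.25):
    """treated / controls: dict of unit id -> propensity score.
    Returns dict of treated id -> matched control id (1:1, without replacement)."""
    lps_t = {k: logit(p) for k, p in treated.items()}
    lps_c = {k: logit(p) for k, p in controls.items()}

    # Assumed caliper width: multiplier times the pooled SD of the logit scores
    pooled_sd = math.sqrt(
        (variance(list(lps_t.values())) + variance(list(lps_c.values()))) / 2
    )
    width = caliper_mult * pooled_sd

    available = dict(lps_c)
    matches = {}
    # Assumed processing order: treated units from highest to lowest logit
    for t_id in sorted(lps_t, key=lps_t.get, reverse=True):
        if not available:
            break
        target = lps_t[t_id]
        c_id = min(available, key=lambda c: abs(available[c] - target))
        if abs(available[c_id] - target) <= width:
            matches[t_id] = c_id
            del available[c_id]  # each control is used at most once
    return matches

# Made-up propensity scores
treated = {"T1": 0.80, "T2": 0.55, "T3": 0.30}
controls = {"C1": 0.78, "C2": 0.50, "C3": 0.52, "C4": 0.10, "C5": 0.33}
matches = greedy_match(treated, controls)
print(matches)
```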

September 19, 2022

Citation for alpha level (reliability) .70 or .80 as thresholds

Kline, P. (1999). Handbook of Psychological Testing (2nd ed.). London: Routledge. p. 13

Despite the dangers of boosting the reliability of a test by making the items highly similar to each other, in which case validity is reduced, reliabilities should ideally be high, around .9, especially for ability tests. Certainly alphas should never drop below .7, a value stressed by both Guilford (1956) and Nunnally (1978). The rationale and proof of these claims are bound up in psychometric theory and are given in Chapter 3.

For 0.7: Nunnally, J. (1978). Psychometric theory. New York: McGraw-Hill.

For 0.8: Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.

Reference:

Abell, N., Springer, D. W., & Kamata, A. (2009). Developing and Validating Rapid Assessment Instruments. Oxford University Press.
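
For reference, a small Python sketch of Cronbach's alpha checked against the .70 and .80 thresholds cited above. The item scores are invented and the function names are hypothetical. The formula is the usual k/(k-1) * (1 - sum of item variances / variance of the total score).

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

def reliability_label(alpha):
    if alpha >= 0.80:
        return "meets the .80 threshold"
    if alpha >= 0.70:
        return "meets the .70 threshold only"
    return "below .70"

# Invented scores: 3 items, 5 respondents
items = [
    [2, 3, 4, 4, 5],
    [3, 3, 4, 5, 5],
    [2, 4, 4, 5, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3), reliability_label(alpha))
```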

September 1, 2022

Excel function ranking

https://www.ablebits.com/office-addins-blog/2017/09/06/excel-rank-functions/

=RANK.EQ(x1, x1:x5)

x1 refers to the value you want to evaluate

x1:x5 specifies the range to rank against (Excel writes a range with a colon, not a hyphen)
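
The same ranking logic as a Python sketch, assuming the default descending order of RANK.EQ, where tied values share the higher rank. The function name rank_eq and the sample values are made up for illustration.

```python
def rank_eq(value, values, descending=True):
    """Rank of value within values; ties share the same (top) rank."""
    if descending:
        return 1 + sum(v > value for v in values)
    return 1 + sum(v < value for v in values)

scores = [10, 30, 20, 30, 50]
ranks = [rank_eq(s, scores) for s in scores]
print(ranks)
```

Unlike Excel, this sketch does not return an error when the value is missing from the range.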

August 29, 2022 (updated August 30, 2022)

line graph that also shows the difference of two lines

https://www.mrexcel.com/board/threads/line-graph-showing-the-difference-between-two-lines.782756/

August 27, 2022

T-test using R

https://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm

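# two-sample test computed from summary statistics
simple_gap=120-110;
FGC_l_StdDev_YES=30;
REG_l_StdDev_YES=30;
FGC_l_N_YES=33;
REG_l_N_YES=33;
# standard error of the mean difference (variances are not pooled, despite the name)
pooled_error=sqrt( ((FGC_l_StdDev_YES^2) / FGC_l_N_YES ) + ( (REG_l_StdDev_YES^2) / REG_l_N_YES) );

T_value=simple_gap/pooled_error
# two-tailed p-value from the normal approximation (pnorm); degrees of freedom are ignored
P_value=2*(1-(pnorm(abs(T_value))))

T_value
P_value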
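
A Python sketch of the same calculation, with one addition: the Welch-Satterthwaite degrees of freedom that the NIST page uses for the unequal-variance case. The p-value here is still the normal approximation, as in the R code, so for small samples it will be somewhat smaller than a t-distribution p-value. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_sample_summary_test(mean_gap, sd1, n1, sd2, n2):
    """Test statistic, Welch degrees of freedom, and normal-approximation p-value."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    std_error = math.sqrt(v1 + v2)
    stat = mean_gap / std_error
    welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    p_normal = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, welch_df, p_normal

# Same inputs as the R code above
stat, welch_df, p_normal = two_sample_summary_test(120 - 110, 30, 33, 30, 33)
print(round(stat, 3), round(welch_df, 1), round(p_normal, 3))
```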

August 27, 2022

QC Comparison of WWC effect size code using R and SAS

I wrote this entry when I wanted to QC my WWC effect size calculation using R and SAS.

WWC standard doc

https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_procedures_v2_1_standards_handbook.pdf

Page 37.

In SAS:

data test;

N_Yes=4300;
Mean_Yes=400;
StdDev_Yes=200;

N_No=4000;
Mean_NO=300;
StdDev_NO=200;

/*create statistics*/
mean_dif=(Mean_Yes-Mean_NO);
/*Standardized effects*/

g1=((N_Yes-1)*(StdDev_Yes*StdDev_Yes)) +((N_No-1)*(StdDev_No*StdDev_No));
g2=N_Yes + N_No -2;
g3=sqrt(g1/g2);
WWC_effect=mean_dif/g3;

run;

In R:

FGC_l_N_YES=4300
FGC_l_Mean_YES=400
FGC_l_StdDev_YES=200

REG_l_N_YES=4000
REG_l_Mean_YES=300
REG_l_StdDev_YES=200

simple_gap=FGC_l_Mean_YES-REG_l_Mean_YES

g1<- ((FGC_l_N_YES-1)*(FGC_l_StdDev_YES*FGC_l_StdDev_YES))+((REG_l_N_YES-1)*(REG_l_StdDev_YES*REG_l_StdDev_YES))
g2= FGC_l_N_YES + REG_l_N_YES -2
g3= sqrt(g1/g2)
simple_gap_std= simple_gap/g3
simple_gap_std

g1
g2
g3
simple_gap_std

My web calculator

https://www.estat.us/file/calc_t_test1b.php

Treatment N:4300
Treatment mean:400
Treatment SD:200

Comparison N:4000
Comparison mean:300
Comparison SD:200
The group mean difference:100

[RESULTS FOR CONTINUOUS OUTCOME]

Probability: Under Development (Still working on this)
T-score is: 22.761
Significant at alpha 0.05 (two tail test;I used a z-test and ignored degree freedom; threshold 1.96)

T-test (the same test as above but with three thresholds): T 1.96, 2.576, 3.291, each for p=0.05, 0.01, 0.001
Sig at p=.001***

Hedges d 0.5
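
As a third check, the calculator output above can be reproduced in a few lines of Python. This is a sketch using the same inputs. The small-sample correction factor 1 - 3/(4N - 9), with N the combined sample size, is the one the calculator's PHP code applies.

```python
import math

t_n, t_mean, t_sd = 4300, 400, 200
c_n, c_mean, c_sd = 4000, 300, 200

gap = t_mean - c_mean

# T-score as in the calculator: difference divided by the unpooled standard error
std_error = math.sqrt(t_sd ** 2 / t_n + c_sd ** 2 / c_n)
t_score = gap / std_error

# Standardized mean difference with the pooled SD
pooled_sd = math.sqrt(
    ((t_n - 1) * t_sd ** 2 + (c_n - 1) * c_sd ** 2) / (t_n + c_n - 2)
)
d = gap / pooled_sd

# Small-sample correction
g = d * (1 - 3 / (4 * (t_n + c_n) - 9))

print(round(t_score, 3), round(d, 3), round(g, 3))
```

With samples this large the correction barely moves the estimate, which is why SAS, R, and the calculator all agree at 0.5 after rounding.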

August 26, 2022

SAS t-test for proportions

data kaz;

N_yes=1162;
N_no=381;

mean_yes=0.129088;
mean_no=0.170604;

DEG_FD=N_Yes + N_No -2;
/*QC’ed
tValue=2.228;
DEG_FD=10;
*/

tValue=(Mean_NO-Mean_Yes)/(SQRT((Mean_NO*(1-Mean_NO) /
N_No )+Mean_Yes*(1-Mean_Yes)/ N_Yes));

/*2 tail test*/
P_value=(1-probt(abs(tValue),DEG_FD))*2;
run;
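
The SAS step above uses an unpooled standard error (each group keeps its own p(1-p)/N term). A common alternative pools the two proportions first. The Python sketch below computes both so the two statistics can be compared. The function names are hypothetical, and with these inputs the two versions land on opposite sides of the 1.96 threshold, so the choice matters here.

```python
import math

def z_unpooled(p1, n1, p2, n2):
    """Statistic with separate variance terms, as in the SAS step above."""
    return (p1 - p2) / math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def z_pooled(p1, n1, p2, n2):
    """Statistic with a single pooled proportion under the null hypothesis."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# Same inputs as the SAS step (the "no" group first, to match its sign)
unpooled = z_unpooled(0.170604, 381, 0.129088, 1162)
pooled = z_pooled(0.170604, 381, 0.129088, 1162)
print(round(unpooled, 3), round(pooled, 3))
```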

August 26, 2022

<?php

function compute()
{

$roundunit=3;

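/*cast form inputs to numbers before doing arithmetic*/
$Tmean = (float) $_POST['Tmean'];
$Cmean = (float) $_POST['Cmean'];
$TSD = (float) $_POST['TSD'];
$CSD = (float) $_POST['CSD'];
$TN = (float) $_POST['TN'];
$CN = (float) $_POST['CN'];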

$mean_dif=$Tmean-$Cmean;
$SE=sqrt(
(($TSD*$TSD) / $TN)+(($CSD*$CSD) / $CN)
);

$T=$mean_dif/$SE;
$DF=$TN+$CN-2;

$P="Under Development (Still working on this)";

/*I will get T for binary variable comparison*/
$N_SUCCESS_T=$TN*$Tmean;
$N_SUCCESS_C=$CN*$Cmean;
$P_=($N_SUCCESS_T+$N_SUCCESS_C)/($TN+$CN);
$Z_numerator=$Tmean-$Cmean-0;
$Z_denom=SQRT(($P_*(1-$P_))*((1/$TN)+(1/$CN)));
$Z_bin=$Z_numerator/$Z_denom;
$Z_bin_abs=abs($Z_bin);

/*$P=stats_dens_normal($T, 0,1);*/
/*$P=stats_dens_gamma(float $X, float $shape, float $scale);*/
/*$P= $T / 100 ;*/

/*Hedges g*/
/*g numerator*/
$g_numerator=($Tmean-$Cmean)*(1-3/((4*($TN+$CN))-9));
/*g denominator*/
$g_denominator=SQRT(((($TN-1)* ($TSD**2) )+(($CN-1)* ($CSD**2) ))/($TN+$CN-2));
$hedges_d=$g_numerator/$g_denominator;
$hedges_d_abs=abs($hedges_d);

/*if binary variables*/
$T_Odds=$Tmean/(1-$Tmean);
$C_Odds=$Cmean/(1-$Cmean);
$Odds_ratio=$T_Odds/$C_Odds;
$Tstep1=log($T_Odds);
$Cstep1=log($C_Odds);
$step2=$Tstep1-$Cstep1;
$WWC_binary_effect=$step2/1.65;

/*
if ($hedges_d >= 0.2) echo "Small Effect (Cohen)";
if ($hedges_d >= 0.5) echo "Medium Effect (Cohen)";
if ($hedges_d >= 0.8) echo "Large Effect (Cohen)";
*/

echo " ";
echo "WWC group comparison of continuous and binary variables";
echo " ";
echo " ";

echo "Treatment N:" .$TN;
echo " ";

echo "Treatment mean:" .$Tmean;
echo " ";
echo "Treatment SD:" .$TSD;

echo " ";
echo " ";

echo "Comparison N:" .$CN;
echo " ";
echo "Comparison mean:" .$Cmean;
echo " ";
echo "Comparison SD:" .$CSD;

echo " ";
echo "The group mean difference:".round($mean_dif,$roundunit);
echo " ";
echo " ";

echo "[RESULTS FOR CONTINUOUS OUTCOME]";
echo " ";
/*echo "Probability " .round($P,2);*/
echo "Probability: " .$P;
echo " ";

$abs_T=abs($T);

echo "T-score is: " .round($T,$roundunit);
echo " ";

if($abs_T < 1.96 ) {
echo "Not significant at alpha 0.05 (two tail test;I used a z-test and ignored degree of freedom; threshold 1.96)";
}elseif($abs_T >=1.96){
echo "Significant at alpha 0.05 (two tail test;I used a z-test and ignored degree freedom; threshold 1.96)";
}
echo " ";
echo " ";
echo "T-test (the same test as above but with three thresholds)";
echo "T 1.96, 2.576, 3.291, each for p=0.05, 0.01, 0.001";
echo " ";
if($abs_T < 1.96) {
echo "Not sig. at alpha 0.05";
}elseif($abs_T>=1.960 and $abs_T < 2.576 ){
echo "Sig at p=.05*";

}elseif($abs_T>=2.576 and $abs_T < 3.291 ){
echo "Sig at p=.01**";

}elseif($abs_T>=3.291 ){
echo "Sig at p=.001***";
}else {
echo "N/A";
}

echo " ";
echo " ";

echo "Hedges d " .round($hedges_d,$roundunit);

echo " ";
/*cohen's rule of thumb*/
echo "Cohen's rule of thumb for effect size interpretation";
echo " ";
if($hedges_d_abs < 0.2) {
echo "Close to zero and Not even small Effect (Cohen)";
}elseif($hedges_d_abs>=0.2 and $hedges_d_abs < 0.5){
echo "Small effect (Cohen)";
}elseif($hedges_d_abs>=0.5 and $hedges_d_abs < 0.8){
echo "Medium effect (Cohen)";
}elseif($hedges_d_abs>=0.8){
echo "Large effect (Cohen)";
}else {
echo "others";
}
echo " ";

echo "Baseline equivalence test";
echo " ";
/*check the implausible case first; otherwise the >0.25 branch always catches it*/
if($hedges_d_abs>=10){
echo "Something Strange happened";
}elseif($hedges_d_abs <= 0.05) {
echo "Satisfies the baseline equivalence requirement";
}elseif($hedges_d_abs>0.05 and $hedges_d_abs <= 0.25){
echo "Requires statistical adjustment to satisfy the baseline equivalence requirement";
}elseif($hedges_d_abs>0.25){
echo "Does not satisfy the baseline equivalence requirement";
}else {
echo "N/A";
}

echo " ";
echo " ";

echo "[RESULTS FOR BINARY OUTCOME]";

echo " ";
echo "If outcomes were binary variables (range 0 to 1), the WWC effect size would be ";
echo "" .round($WWC_binary_effect,$roundunit);

echo " ";
echo "T-score for the binary outcome is: " .round($Z_bin,$roundunit);
echo " ";

echo " ";
echo "T-test for binary outcomes";
echo " ";
if($Z_bin_abs < 1.96) {
echo "Not sig. at alpha 0.05";
}elseif($Z_bin_abs>=1.960 and $Z_bin_abs < 2.576 ){
echo "Sig at p=.05*";

}elseif($Z_bin_abs>=2.576 and $Z_bin_abs < 3.291 ){
echo "Sig at p=.01**";

}elseif($Z_bin_abs>=3.291 ){
echo "Sig at p=.001***";
}else {
echo "N/A";
}

echo " ";
echo " ";

}

/*echo "The result is: " . compute();*/
compute();
?>

REFERENCE

Cohen's rule of thumb about effect sizes:
 
<li>If greater than 0.2, Small Effect
 
<li>If greater than 0.5, Medium Effect
 
<li>If greater than 0.8, Large Effect
 
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
 
<a href="https://wmich.edu/sites/default/files/attachments/u58/2015/Effect_Size_Substantive_Interpretation_Guidelines.pdf">
Effect Size Substantive Interpretation Guidelines: Issues in the Interpretation of Effect Sizes, Jeff Valentine and Harris Cooper, Duke University (see p. 5)</a>
 
 
WWC related info:
 
<a href="https://ies.ed.gov/ncee/wwc/Docs/ReferenceResources/wwc_procedures_handbook_v4_draft.pdf">WWC procedures handbook (see p. 14)</a>
 
<a href="https://ies.ed.gov/ncee/wwc/Docs/OnlineTraining/wwc_training_m3.pdf">WWC standards slides (Definition of small sample size correction, slide 14)</a>
 
WWC considers an effect size greater than .25 substantively important.
<a href="https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_procedures_handbook_v4.pdf">P.22 of WWC standards</a>

T-Table
 
<a href="https://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf">T-Table</a>

<a href="calc_t_test1.php">Back to the calculator</a>

<a href="https://www.estat.us">My website</a>
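
The baseline equivalence rule used in the calculator can be written as a small stand-alone function. This Python sketch uses the same cut-offs as the PHP code above (0.05 and 0.25 on the absolute effect size). The function name and return strings are hypothetical.

```python
def baseline_equivalence(effect_size):
    """Classify a baseline effect size with the calculator's cut-offs."""
    size = abs(effect_size)
    if size <= 0.05:
        return "satisfied"
    if size <= 0.25:
        return "requires statistical adjustment"
    return "not satisfied"

for value in (0.03, -0.12, 0.40):
    print(value, baseline_equivalence(value))
```

The sign is dropped first, so a baseline gap favoring the comparison group is treated the same as one favoring the treatment group.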

August 19, 2022August 19, 2022

How to generate data from a model (using R)

How to generate data from a model – Part 1