Odds ratio (Explanation)

An odds ratio summarizes, in a single value, the result of an intervention that would otherwise take several percentages to explain.  For example, imagine that one group of high school students (Group T) received a mentoring intervention and another group (Group C) did not.  The on-time high school graduation results were:

  • Group T: 85% graduated; 15% did not graduate
  • Group C: 75% graduated; 25% did not graduate

This is a lot of information to communicate.  I could reduce it like this, but it still takes a lot of words:

  • Group T: 85% graduated
  • Group C: 75% graduated

An odds ratio can express this with one value.

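odds ratio = (P1/(1-P1)) / (P2/(1-P2))

Here P1 is the rate for the treatment group (Group T) and P2 is the rate for the comparison group (Group C).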

To plug in numbers from the graduation example:

odds ratio = (.85/(1-.85))  /  (.75/(1-.75)) = 5.6667 / 3 = 1.8889
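
The same arithmetic can be checked in a few lines of Python.  This is only a sketch: the function names odds and odds_ratio are my own choices for illustration, not part of any library.

```python
def odds(p):
    """Convert a proportion (0 <= p < 1) into odds, p / (1 - p)."""
    return p / (1 - p)


def odds_ratio(p1, p2):
    """Odds of group 1 divided by odds of group 2."""
    return odds(p1) / odds(p2)


# Graduation example: Group T = 0.85, Group C = 0.75
result = odds_ratio(0.85, 0.75)
print(round(result, 4))  # 1.8889
```

Note that the inputs are proportions (0.85), not percentages (85), and that a proportion of exactly 1 would cause a division by zero.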

For people who are not used to mathematical notation:

  • /  means divided by (e.g., 30/3 = 10)
  • Also notice that formulas usually use proportions rather than percentages (.85, not 85).

I recommend replicating this result in an Excel sheet.  Enter the following in the top-left corner of the sheet and confirm that the last formula (=A2/B2) returns 1.888...

        A                B
1       0.85             0.75
2       =(A1/(1-A1))     =(B1/(1-B1))
3       =A2/B2

For Excel beginners, A1 means the cell defined by Column A and Row 1 of the Excel sheet.

As you do this replication, try to understand conceptually what the resulting value means.  Change the values in Excel from the original .85 and .75 to other values to see how the formula works and how the result changes.  Confirm the following:

  • The odds ratio can vary from 0 to infinity (arbitrarily large values).
  • If the odds ratio is greater than 1, the treatment group had higher odds of graduating, so the intervention program made a positive difference.
  • If the odds ratio is 1, the program did not make any difference.  Try to understand the formula by entering the same value for P1 and P2.
  • If the odds ratio is smaller than 1, the program made the situation worse.
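
The points in this list can also be confirmed with a short Python sketch.  The function name odds_ratio and the test proportions other than .85 and .75 are made up for illustration.

```python
def odds_ratio(p1, p2):
    """Odds of group 1 divided by odds of group 2 (both proportions below 1)."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


same = odds_ratio(0.75, 0.75)      # equal rates: no difference
better = odds_ratio(0.85, 0.75)    # treatment rate is higher
worse = odds_ratio(0.65, 0.75)     # treatment rate is lower
floor = odds_ratio(0.0, 0.75)      # nobody in the treatment group graduated
huge = odds_ratio(0.999999, 0.75)  # treatment rate approaches 1

print(same, better, worse, floor, huge)
```

Here same is exactly 1, better is above 1, worse falls between 0 and 1, floor is 0, and huge grows without limit as the first proportion gets closer to 1.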

Finally, one of the advantages of the odds ratio is that when you look at the value, you can immediately tell whether the treatment group had a more favorable result than the comparison group.  If computed exactly as above (treatment group in the numerator), an odds ratio greater than 1 means the treatment group performed better.  If it is less than 1, the comparison group did better.

Adding a note to SAS results

data NOTES;
infile datalines truncover;  /* do not read past the end of a short line */
input Notes $ 1-100;
datalines;
This is my note
;
run;

proc print;
run;

*****************;

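/* n_level_info is assumed to exist already and to hold a variable named NLevels */
data _null_;
set n_level_info;
call symputx("NLevels", NLevels);  /* store the value in macro variable NLevels, without extra blanks */
run;

data NOTES;
infile datalines truncover;  /* do not read past the end of a short line */
input Notes $ 1-100;
textResolved=dequote(resolve(quote(Notes)));  /* resolve macro references inside the note */
datalines;
This is the way I add a note in a data step.
This is an example of how I can use a macro --> &NLevels .
;
run;

data notes2;
set notes;
keep textResolved;
run;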

Data editing in SAS

My data looks like this:

VAR1
TITLE A
APPLICATION #1
APPLICATION #2
APPLICATION #3
TITLE B
APPLICATION #4
APPLICATION #5
APPLICATION #6
TITLE C
APPLICATION #4
APPLICATION #5
APPLICATION #6

I’d like the result to look like VAR2 below:

VAR1            VAR2
TITLE A         TITLE A
APPLICATION #1  TITLE A
APPLICATION #2  TITLE A
APPLICATION #3  TITLE A
TITLE B         TITLE B
APPLICATION #4  TITLE B
APPLICATION #5  TITLE B
APPLICATION #6  TITLE B
TITLE C         TITLE C
APPLICATION #4  TITLE C
APPLICATION #5  TITLE C
APPLICATION #6  TITLE C

To be more exact, I’d like it to look like this, but if I can get the result above, I can get to this myself:

VAR1            VAR2
APPLICATION #1  TITLE A
APPLICATION #2  TITLE A
APPLICATION #3  TITLE A
APPLICATION #4  TITLE B
APPLICATION #5  TITLE B
APPLICATION #6  TITLE B
APPLICATION #4  TITLE C
APPLICATION #5  TITLE C
APPLICATION #6  TITLE C

Thanks Charly:

*****;
data a;
input VAR1 &$30.;
cards;
TITLE A
APPLICATION #1
APPLICATION #2
APPLICATION #3
TITLE B
APPLICATION #4
APPLICATION #5
APPLICATION #6
TITLE C
APPLICATION #4
APPLICATION #5
APPLICATION #6
;

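data b;
set a;
/* =: compares only the leading characters, so this matches rows that start with TITLE */
if var1 =: 'TITLE' then var2=var1;
/* only APPLICATION rows are written out, each carrying the most recent title */
else output;
/* keep the value of var2 from one row to the next */
retain var2;
run;

proc print ; run;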

Creating a series of dummy variables

Thanks Russ:

Based on each school's lowest and highest grade, the following creates a series of dummy variables indicating which grade levels the school serves.

data one;
input ID LOWEST_GRADE HIGHEST_GRADE;
cards;
1 4 9
2 9 12
;

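data new;
/* one dummy variable per grade, grade1 through grade12 */
array grades(*) grade1-grade12;
set one;
do i =1 to dim(grades);
/* 1 if grade i falls within the range served, otherwise 0 */
if i ge lowest_grade and i le highest_grade then grades(i)=1;
else grades(i)=0;
end;
run;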

proc print;
run;

SAS Drop variables when all values are missing

/*http://ftp.sas.com/techsup/download/sample/datastep/dropvar.html*/
%let abc=&syslast;
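data _null_;
set &abc end=end;
array test (*) _numeric_;
* array allmiss (8) $ (8*'true');
/* one flag per numeric variable, so this handles up to 3000 numeric variables */
array allmiss (3000) $ (3000*'true');

length list $ 5000;

/* once any non-missing value is seen, the variable is no longer all-missing */
do i=1 to dim(test);
if test(i) ne . then allmiss(i)='false';
end;

/* on the last observation, collect the names of the all-missing variables */
if end=1 then do;
do i= 1 to dim(test);
if allmiss(i) ='true' then list=trim(list)||' '||trim(vname(test(i)));
end;
call symput('mlist',list);
end;
run;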

data &abc ;
set &abc ;
drop &mlist;
run;

SAS Converting string values into numeric values

proc contents data=xxx1 position;
ods output position=kuekawa1;
run;

data kuekawa2;
set kuekawa1;
/* for each character variable, build a statement of the form  var_n=input(var,7.);  */
if type="Char" then do;
x1=compress(variable||"_n=input(");
x2=x1||variable||",7.);";
syntax=compress(x2);
end;
if syntax ne "";
keep syntax;
run;

/* write the generated statements to a text file */
data _null_;
set kuekawa2;
file "C:\temp\proc_contents1.txt";
put syntax $100.;
run;

data xxx2;
set xxx1;
%include "C:\temp\proc_contents1.txt";
run;