I learned SAS programming by looking at examples. The programs below are for people who share my
learning style. All programs use SAS's default data sets, which are in the SASHELP library.
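proc freq data=R2 nlevels;
    where ID ne "" and ID ne ".";   /* keep only non-missing IDs; with OR the condition is always true */
    tables ID / noprint;
    ods output nlevels=new;         /* save the number of distinct ID values in data set NEW */
run;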
- PROC MEANS for getting basic descriptive statistics of interval scales. But don't use this to create mean variables. Use
PROC SQL instead. (Also, PROC CORR's descriptive statistics look better than PROC MEANS's. PROC MEANS output tends
to spread across pages and it looks ugly.)
- PROC SQL to create mean variables. Most of the programs I have seen go through a tedious process of creating mean-value
variables with PROC MEANS. PROC SQL is a lot better for this.
- PROC FACTOR to do factor analysis. I show a simple example, as well as a macro program that gets you a result like this. Result in a text format. Result document in rich text format.
- PROC FACTOR for factor analysis, a macro to deal with more than one set of variables. Just a modified version of the above program. I like this version
better, especially for the use of Notepad to see the result.
- DATA steps to manipulate data (sort data sets, keep/drop variables, merge data sets or pile one on top of another, create new variables,
etc.)
- PROC STANDARD does the following. They are all the same operation, actually.
- create Z-scores
- grand-mean or group mean centering when doing PROC MIXED/HLM
- imputation of missing cases based on either grand mean or group mean.
- Descriptive, exploratory analysis: get basic statistical properties of variables.
- REPORTING of descriptive statistics results: If you are a researcher or research assistant working in a group, you need to report results in a meaningful manner.
It is also useful, of course, for making sense of your own results. Here I show you how ODS (Output Delivery System) can be used to
create a report table that is easy to understand. The results can be saved in an Excel sheet or as an RTF document.
- REPORTING of Regression Results: I have a macro for you to try on another page of my site. But it is also useful to try to figure out this code, which is a simplified version. If you use the full macro, the results will look like this.
- Creating a report that is a text file. I hate ODS output files because they look ugly and I am not patient enough to learn how to manipulate the styles.
- PROC MIXED to do multilevel model/HLM.
- Comparison of PROC MIXED and PROC REG. If I had seen these programs before I took my HLM class in 1994, I would have understood the class a lot better.
The message in this one is that you can get the same results as OLS using PROC MIXED if you just don't specify a
RANDOM statement. That tells us something. By the way, I think PROC MIXED is in fact better than PROC REG for
fitting a simple linear model like OLS, because MIXED takes a CLASS statement, so we don't have to recode character
variables into a series of dummy variables. I think the maximum likelihood estimation method will return the same results
as OLS if it is a simple linear model.
- PROC NLMIXED to do multilevel logistic regression. Also see my attempt to replicate the Rasch model with NLMIXED. The problem with NLMIXED is that it has a hard time converging when the model is poorly specified, which happens
a lot in social science. GLIMMIX, below, is easier, though its estimation method seems less rigorous.
- PROC NLMIXED to do lots of modeling for the sake of learning. Min-Ah says,
"The default DF in PROC NLMIXED is the
number of level-2 units minus 1. So, if one wants to analyze the effects of level-1 variables and level-2 variables (multilevel)
together, he/she has to put an additional option in the syntax to fix the DF for level-1 variables. So, I used the “estimate” command to fix the DF for level-1
variables."
- PROC GLIMMIX to do multilevel logistic regression.
- A GLIMMIX macro glmm800.sas (This used to be used before PROC GLIMMIX)
- Comparison program: Analytical sample versus a full sample. Have you ever been in a situation where, because of a pattern of missing cases in your data, your analytical
sample gets very small? You will have to explain how the reduced data set differs from the full data set.
This program does that check. I had to do this for every paper, so I finally wrote a macro program. The
documentation is poor, so please ask me. kuekawa @ alumni.uchicago.edu
- AddSuffix program: Add suffix to every variable in the data.
- PROC TIMEPLOT
- PROC IML
- Read a file off the internet using filename http URL
- PROC COMPARE: compare two data sets and see if they are different.
- X statement allows you to step out of the SAS environment and execute commands in the MS-DOS environment.
- SAS DDE Dynamic Data Exchange
- Be careful with retain statement. It can mess up the values of variables.
- proc MI
- PROC SORT to deal with duplicate IDs http://analytics.ncsu.edu/sesug/2006/CC14_06.PDF
- SAS PROC MEANS OUTPUT Statement
proc means data=budget;
    class ID;
    var Funding_Budget1 Funding_Budget2;
    output out=budget_info(drop=_type_ _freq_) mean=budget1_average budget2_average;
run;
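The list above recommends PROC SQL over PROC MEANS for creating mean variables. Here is a minimal sketch of what that can look like, assuming the goal is to attach group means to every row. It uses sashelp.class; the output data set name class_means and the new variable names are made up for illustration.

```sas
/* Attach the group mean of HEIGHT and WEIGHT (by SEX) to every row. */
/* class_means, height_group_mean and weight_group_mean are          */
/* illustrative names, not anything SAS requires.                    */
proc sql;
    create table class_means as
    select *,
           mean(height) as height_group_mean,
           mean(weight) as weight_group_mean
    from sashelp.class
    group by sex;
quit;

proc print data=class_means;
    var name sex height height_group_mean weight weight_group_mean;
run;
```

Because the SELECT mixes row-level columns with summary functions, PROC SQL remerges the group means back onto the original rows and writes a note about it in the log. That is the step that otherwise takes a PROC MEANS, a sort, and a merge.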
PROC DATASETS to delete all temporary data sets in the WORK library:
proc datasets lib=work nolist kill;
run;
quit;
To delete specific data sets in the WORK library:
proc datasets library=work nolist; delete name_of_data_here; quit;
How to send the log and the output to external files (so the SAS windows do not fill up and stop a long
program):
filename printout 'C:\xxxxx\output.txt';
filename logout 'C:\xxxxx\log.txt';
proc printto print=printout log=logout new;
run;
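Once PROC PRINTTO is active, everything keeps going to the files until you switch it off. A minimal sketch of the full round trip, assuming hypothetical paths under C:\temp (replace them with a folder you can write to):

```sas
/* Hypothetical file locations, for illustration only */
filename logout   'C:\temp\mylog.txt';
filename printout 'C:\temp\myoutput.txt';

/* Route the log and the procedure output to the files; NEW overwrites old contents */
proc printto log=logout print=printout new;
run;

/* Any long program goes here; this step is just a stand-in */
proc means data=sashelp.class;
    var height weight;
run;

/* PROC PRINTTO with no options restores the default destinations */
proc printto;
run;
```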
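APPEND DATASETS
/* This makes sure there is no previous data set named "alldata" */
proc datasets library=work nolist;
    delete alldata;
quit;

%macro kaz(var1=);
    data new&var1;
        set maindata;
        unitid=&var1;
    run;
    /* PROC APPEND takes BASE= (the data set to add to) and DATA= (the rows to add); it has no OUT= option */
    proc append base=alldata data=new&var1 force;
    run;
%mend kaz;

/* The text file holds the %kaz calls to run */
%include "c:\temp\example.txt";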
SAS RESULT TABLES
What information tables are available in SAS data formats--for each PROC? This is important to know when you want
to use ODS.
MACRO
Routine work tips
- Reading text files / Writing text files
- Use of Excel for the writing of repetitive syntax Click this excel sheet and see. I use COPY function of excel to enter the same repetitive information.
- Creating IDs. X=_n_; would assign sequence IDs, but this one creates sequence IDs within group units. See here.
- ADD asterisks (**) based on p-values.
- This program adds a prefix to the variables in a data set (i.e., every variable name is given the same prefix, like Year1_*).
- Drop variables when the values are all missing.
- Repeat values and fill in blanks: To go from {A, ., ., ., B,.,.} to {A, A, A, A, B, B, B}
- Provide a sequence ID, that is, to go, for example, from {2,2,2,5,5,3} to {1,1,1,2,2,3}
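The last few tips in this list (sequence IDs within groups, filling in blanks, and renumbering groups) all come down to RETAIN, the sum statement, and FIRST. variables. The following is a minimal sketch, not the linked programs themselves. All data set and variable names (have, filled, have2, seq, class_ids, last_x, seq_id, within_id) are made up for illustration.

```sas
/* 1. Repeat values and fill in blanks: {A, ., ., ., B, ., .} -> {A, A, A, A, B, B, B} */
data have;
    input x $ @@;        /* a lone period is read as a missing character value */
    datalines;
A . . . B . .
;
run;

data filled;
    set have;
    length last_x $ 8;
    retain last_x;                       /* keeps its value from row to row */
    if not missing(x) then last_x = x;   /* remember the latest non-missing value */
    else x = last_x;                     /* fill the blank with it */
    drop last_x;
run;

/* 2. Sequence ID per run of equal values: {2,2,2,5,5,3} -> {1,1,1,2,2,3} */
data have2;
    input grp @@;
    datalines;
2 2 2 5 5 3
;
run;

data seq;
    set have2;
    by grp notsorted;                /* data are grouped but not sorted */
    if first.grp then seq_id + 1;    /* sum statement: retained, starts at 0 */
run;

/* 3. Sequence ID within each group unit */
proc sort data=sashelp.class out=class_sorted;
    by sex;
run;

data class_ids;
    set class_sorted;
    by sex;
    if first.sex then within_id = 0; /* restart the counter for each group */
    within_id + 1;
run;
```

The RETAIN in the first step is deliberate and narrow: only the helper variable is retained, and it is dropped at the end, which avoids the surprises RETAIN can cause elsewhere.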
Solving problems in real situations
Case: Imagine you have fifty organizations to report to. You prepared the reports, and just before you
send them you spot an error. Imagine that instead of saying "2003" you needed to say "2005."
You don't want to open fifty documents to make this change.
Solution: Activate an MS Word macro from within SAS
Case: You need to create a table that describes your variables. If your labels are nicely
prepared, you can use them to create this table.
Case: You have a gigantic SAS data set and you need to create a text-file version of the same data set.
You need to create position statements that tell SAS where each variable has to be placed in the text file.
Case: The log and output get too long, and SAS has to stop in the middle. To prevent this, send them
to text files.
filename printout 'C:\Documents and Settings\output.txt';
filename logout 'C:\Documents and Settings\log.txt';
proc printto print=printout log=logout new;
run;
Let SAS write macro calls based on variable values in a dataset:
%let cate=sex;

/* Using a data set and syntax, create a variable that holds a SAS statement. See variable CODE. */
data syntax;
    set sashelp.class;
    code='%kaz (var1='||&cate||');';
run;

proc sort data=syntax nodupkey;
    by &cate;
run;

/* Now write out an external text file that has the macro calls */
data _null_;
    set syntax;
    file "c:\temp\example.txt";   /* you can change this */
    put code $30.;                /* CODE is a character variable, so use a character format */
run;

/* Macro definition. The exact step is up to what you want to do. */
/* This is an example of breaking the data into pieces based on the value of &cate (here, SEX). */
%macro kaz (var1=);
    data &var1;
        set sashelp.class;
        if &cate="&var1";
    run;
%mend kaz;

/* And read the text file you created earlier */
%include "c:\temp\example.txt";
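The file-based approach above writes the macro calls to disk and reads them back with %include. A different technique, CALL EXECUTE, can queue the same calls directly from a data step without the intermediate text file. This is a sketch of that alternative, not the method used above. The data set names levels and class_F / class_M are made up for illustration.

```sas
%let cate = sex;

/* Same idea as the macro above: one subset per value of &cate */
%macro kaz(var1=);
    data class_&var1;
        set sashelp.class;
        if &cate = "&var1";
    run;
%mend kaz;

/* One row per distinct value of &cate */
proc sort data=sashelp.class(keep=&cate) out=levels nodupkey;
    by &cate;
run;

/* Queue one %kaz call per row. %nrstr delays the macro so it runs */
/* after this data step finishes, rather than while it is running. */
data _null_;
    set levels;
    call execute(cats('%nrstr(%kaz(var1=', &cate, '))'));
run;
```

The text-file version has one advantage worth keeping in mind: you can open example.txt and read exactly what is about to run, which helps when debugging.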
Silly but perhaps convenient, at least for me
Statistics
Applying a significance test in a data step (strictly speaking, a z-test for the difference between two proportions):
data &schoollevel.&var1; retain group; merge ueka1b ueka2b; by group;
P1=M2_MET_pre_Mean; P2=M2_MET_post1_Mean;
N1=M2_MET_pre_N; N2=M2_MET_post1_N;
A=(P1*(1-P1))/N1; B=(P2*(1-P2))/N2; STDERR=sqrt(A+B);
Z=abs((P1-P2)/STDERR); /*two tail 5%*/
P=(1-probnorm(Z))*2;
if P < 0.05 then SIG="*";
drop A B P1 P2 N1 N2; run;
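The data step above depends on macro variables and merged data sets that are not shown here. To try the same arithmetic on its own, here is a minimal sketch that assumes made-up proportions and sample sizes. The values 0.40, 0.52, 200 and 180 are purely illustrative and do not come from any real data.

```sas
data ztest;
    /* Hypothetical inputs, for illustration only */
    p1 = 0.40;  n1 = 200;    /* proportion and N, group 1 */
    p2 = 0.52;  n2 = 180;    /* proportion and N, group 2 */

    /* Standard error of the difference between two independent proportions */
    stderr = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2);

    /* Absolute z statistic and two-tailed p-value from the normal distribution */
    z = abs((p1 - p2) / stderr);
    p = 2 * (1 - probnorm(z));

    length sig $ 1;
    if p < 0.05 then sig = "*";
run;

proc print data=ztest noobs;
run;
```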