SPSS Syntax

In September and October of 2015, I had to learn how to program in SPSS.  I have always programmed in SAS, and the SAS-to-SPSS transition wasn't easy.  The best resource I found was Chapter 2 (Data Management) of the following IBM SPSS document.  It reviews the basic commands needed for SPSS data management, and the way it is written is intuitive for SAS programmers.  I reviewed the first 130 pages of this document and took notes on the procedures and techniques that I would use in SAS (e.g., transpose).

https://developer.ibm.com/predictiveanalytics/wp-content/uploads/sites/48/2015/04/Programming-and-Data-Management-for-IBM-SPSS-Statistics-23.pdf

After getting the basics, I relied on the following materials, which kept coming up whenever I googled:

The major SAS-SPSS difference is that in SAS, you FIRST specify the dataset name in this way:

libname here "c:\temp\";
data new; set here.old;
<you start doing data editing here>
run;

In SPSS, you do that LAST, after reading the file:

GET FILE="c:\temp\old.sav".
DATASET NAME new WINDOW=FRONT.
<you start doing data editing here>

Also notice where the data editing goes: inside the data step in SAS, but after the file has been opened and named in SPSS.

When programming, SAS users are used to structured blocks of procedures (e.g., PROC MEANS) and data steps (I like closing each step with a RUN statement so that every data step ends explicitly).  Every part of a SAS program is either a procedure or a data step.

In SPSS, the structure feels less explicit because any statement you make affects whichever dataset happens to be active at the time.  This, however, is just an initial impression.

The rest were all similar.  Here is my SPSS cheat sheet:

Read SPSS File

GET FILE="c:\temp\temp555.sav".
DATASET NAME freq_result WINDOW=FRONT.
dataset activate freq_result.

In SAS:

libname kaz "c:\temp";

data freqresult;
set kaz.temp555;
run;

Combine datasets (1) -- ADD FILES is not limited to two files; add one /FILE subcommand per file (the documented limit is 50)

ADD FILES
/file='C:\Users\basedata_1.sav'
/file='C:\Users\SPSS.sav'
/in=division.
EXECUTE.
DATASET NAME FINAL_INDIVIDUAL WINDOW=FRONT.

In SAS, you can likewise list more than two datasets:

libname kaz "C:\Users";

data final_individual;
set kaz.basedata_1 kaz.SPSS;
run;
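
When many files need stacking, the ADD FILES command can be generated instead of typed. Here is a minimal sketch in Python (the language SPSS also embeds through BEGIN PROGRAM), assuming one /FILE subcommand per input file. The helper name build_add_files and the file names are made up for illustration; inside SPSS the resulting string would be handed to spss.Submit.

```python
def build_add_files(paths, dataset_name):
    """Return ADD FILES syntax that stacks every file in `paths` (hypothetical helper)."""
    if len(paths) < 2:
        raise ValueError("ADD FILES needs at least two inputs")
    lines = ["ADD FILES"]
    # One /FILE subcommand per input file.
    lines += ["  /FILE='{}'".format(p) for p in paths]
    lines[-1] += "."  # the period ends the command
    lines.append("EXECUTE.")
    lines.append("DATASET NAME {} WINDOW=FRONT.".format(dataset_name))
    return "\n".join(lines)

syntax = build_add_files(
    [r"C:\Users\a.sav", r"C:\Users\b.sav", r"C:\Users\c.sav"],  # made-up file names
    "FINAL_INDIVIDUAL",
)
print(syntax)
```

The function only builds text, so it can be checked outside SPSS before anything is submitted.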

Combine datasets (2)

DATASET ACTIVATE olddata.
ADD FILES /FILE=*
/FILE='newdata'.
EXECUTE.

Saving a dataset

SAVE OUTFILE='C:\temp\new.sav'
/COMPRESSED.

Change the position of variables 

MATCH FILES FILE=* /KEEP=label_1 label_2 ALL.
LIST.

In SAS, this would be:

data new;
retain label_1 label_2;
set old;
run;

Recoding

RECODE Var2 ('System'='x2_missing')('Total'='x1_validtotal').

In SAS:

if var2="System" then var2="x2_missing";
if var2="Total" then var2="x1_validtotal";

RECODE var1 var2 var3 var4
(1=1) (2=1) (3=1) (4=2) (5=3) (6=3) (7=3).
EXECUTE.

In SAS (the if-statements could be consolidated and simplified a bit more than this, but just as an example):
array theseguys var1 var2 var3 var4;
do over theseguys;
if theseguys=1 then theseguys=1;
if theseguys=2 then theseguys=1;
if theseguys=3 then theseguys=1;
if theseguys=4 then theseguys=2;
if theseguys=5 then theseguys=3;
if theseguys=6 then theseguys=3;
if theseguys=7 then theseguys=3;
end;
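
The mapping itself may be easier to see written as a lookup table. This is only a sketch in Python, with a made-up row of data; it mirrors the fact that both the in-place RECODE and the SAS if-statements leave values that are not listed unchanged.

```python
# Same mapping as the RECODE above: 1-3 -> 1, 4 -> 2, 5-7 -> 3.
RECODE_MAP = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3, 6: 3, 7: 3}

def recode(value, mapping=RECODE_MAP):
    """Return the recoded value; values not listed in the mapping pass through unchanged."""
    return mapping.get(value, value)

# A made-up row holding var1..var4.
row = {"var1": 2, "var2": 4, "var3": 7, "var4": 9}
recoded = {name: recode(v) for name, v in row.items()}
print(recoded)  # {'var1': 1, 'var2': 2, 'var3': 3, 'var4': 9}
```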

Adding labels

VALUE LABELS
var1 var2 var3
1 "Disagree"
2 "Neither"
3 "Agree".

Format a numeric variable with leading zeros (N3 pads to three digits, so 5 is displayed as 005)

FORMATS id (n3).
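
To see what the N3 display looks like, here is the same zero padding in Python. The id values are made up, and this only imitates the displayed form; FORMATS changes how the number is shown, not the stored value.

```python
ids = [5, 42, 123]  # made-up id values
# "03d" pads an integer with leading zeros to a width of three, like N3.
padded = [format(i, "03d") for i in ids]
print(padded)  # ['005', '042', '123']
```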

Computing

compute new_var=var1/var2.

Delete variables

Delete variables Command_ Subtype_ Percent.

Another example:

DELETE VARIABLES var1 TO var10.

Sequence function (This creates a column of 1, 2, 3, 4, ...)

compute id=$Casenum.
exe.

In SAS:

id=_n_;

Numeric to String, String to Numeric

https://kb.iu.edu/d/aoym

Combining string variables (in this example, the first variable was numeric and had to be converted into a string first)

string label_2 (A400).
compute label_2=CONCAT(SEQUENCE_ID,label_).

Return the position of the first occurrence of a character (in this example, a blank)

compute first_blank = char.index(label_,' ').

How to substring a part of a variable by specifying the location (example)

compute first_blank = char.index(label_,' ').
string label_1(a50).
string label_2(a400).
compute label_1 = char.substr(label_,1,first_blank).
compute label_2 = char.substr(label_,first_blank+1,400).
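
Positions in char.index and char.substr count from 1, and char.index gives 0 when nothing is found. The Python sketch below imitates the two functions to show how the split works. The helper names and the sample label are made up, and the rstrip stands in for the fact that trailing blanks do not matter in an SPSS string.

```python
def char_index(haystack, needle):
    """1-based position of the first occurrence of needle, or 0 when absent."""
    return haystack.find(needle) + 1

def char_substr(text, pos, length):
    """Substring starting at 1-based position pos, at most length characters."""
    return text[pos - 1:pos - 1 + length]

label = "Q12 How satisfied are you?"  # made-up value of label_
first_blank = char_index(label, " ")
label_1 = char_substr(label, 1, first_blank).rstrip()
label_2 = char_substr(label, first_blank + 1, 400)
print(first_blank, repr(label_1), repr(label_2))  # 4 'Q12' 'How satisfied are you?'
```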

Get Frequencies

/FORMAT=LIMIT(15) is useful.  It suppresses the frequency table for any variable with more than 15 categories, which keeps free-text variables from flooding the output.  VARIABLES=ALL includes every variable in the dataset (you can also list variable names one by one).

FREQUENCIES VARIABLES=ALL
/FORMAT=LIMIT(15)
/ORDER=ANALYSIS.
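
The effect of LIMIT can be sketched in Python as a simple rule: a variable gets a table only if its number of distinct values does not exceed the limit. The function name and the data are made up, and this is an illustration of the rule rather than of how SPSS works internally.

```python
from collections import Counter

def tables_to_print(data, limit=15):
    """Return names of variables whose number of distinct values is at most `limit`."""
    return [name for name, values in data.items() if len(Counter(values)) <= limit]

# Made-up dataset: one categorical variable and one free-text variable.
data = {
    "gender": ["F", "M", "F", "M"],
    "comment": ["text %d" % i for i in range(40)],  # 40 distinct values
}
shown = tables_to_print(data)
print(shown)  # ['gender']
```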

Dropping all string variables using a PYTHON script. (By David Marso http://spssx-discussion.1045642.n5.nabble.com/SPSS-syntax-to-delete-string-variables-with-width-more-than-500-td5725961.html)

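begin program.
import spss, spssaux
# Collect the names of all string variables in the active dataset.
dict1 = spssaux.VariableDict(variableType='string')
dropvars = " ".join(dict1.variables)
# Submit DELETE VARIABLES only when at least one string variable exists.
if dropvars:
    spss.Submit("DELETE VARIABLES " + dropvars + ".")
end program.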

Save as an Excel file (but cannot control the sheet/tab behavior)

SAVE TRANSLATE OUTFILE=!pathname+!title+" Part 05.xls"
/TYPE=XLS
/VERSION=12
/MAP
/REPLACE
/FIELDNAMES
/CELLS=VALUES.

Save the results of a FREQUENCIES run as an SPSS data file with OMS (to write an Excel file and name the sheet/tab, see the OUTPUT EXPORT example further down)

oms select tables
/destination format=sav OUTFILE ="c:\temp\temp.sav"
/if commands=['Frequencies'] subtypes=['Frequencies'].
FREQUENCIES VARIABLES=
var1 var2 var3
/ORDER=ANALYSIS.
omsend.

Basic Macro -- the equivalent of SAS's %let name = value;

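The third one does not need quotation marks because a macro body is pasted into the syntax as-is: the first two stand in for quoted strings (a path and a file name), while the third stands in for a list of variable names, which must not be quoted.  It took hours to figure this out.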

define !pathname() 'C:\temp\' !enddefine.
define !title() "spss_data" !enddefine.
define !var_set1() var1 var2 var3 var4 !enddefine.
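
One way to think about these macros is as plain text substitution. The Python sketch below imitates that idea only; it is not how SPSS parses macros. The expand function is made up, and the macro bodies are copied from the three DEFINE lines (with the path written with its closing quote).

```python
import re

# Macro bodies, quotes included where the DEFINE lines have them.
MACROS = {
    "!pathname": "'C:\\temp\\'",
    "!title": '"spss_data"',
    "!var_set1": "var1 var2 var3 var4",
}

def expand(command, macros=MACROS):
    """Replace each known !name token with its macro body, leaving other text alone."""
    return re.sub(r"!\w+", lambda m: macros.get(m.group(0), m.group(0)), command)

print(expand("FREQUENCIES VARIABLES=!var_set1."))
print(expand('SAVE TRANSLATE OUTFILE=!pathname+!title+" Part 05.xls"'))
```

The second line comes out as three quoted pieces joined with +, which is why the path and the title carry their own quotes while the variable list does not.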

Regular Macro -- SAS's %macro sushi (var1=x, var2=y);

!CMDEND assigns all the remaining text of the macro call to the argument, so it can only be used for the last argument in the DEFINE statement.  The other arguments can use !CHAREND('/'), which reads everything up to the next slash.  This is the trickiest part of SPSS macros and the part that differs most from SAS macros.

This is the first example:

DEFINE !kaz (var1=!CHAREND ('/') / var2=!CMDEND)
DATASET ACTIVATE FINAL_INDIVIDUAL_DATA.
oms select tables
/destination format=xls OUTFILE ="c:\temp\temp.xls"
/if commands=['Frequencies'] subtypes=['Frequencies'].
FREQUENCIES VARIABLES= !var1 !var2
/ORDER=ANALYSIS.
omsend.
!ENDDEFINE.
!kaz var1=v1 v2 / var2=v3.

This is the second example:

DEFINE !kaz (var1=!CHAREND ('/') / var2=!CHAREND ('/') / var3=!CHAREND ('/') / var4=!CHAREND ('/') / var5=!CMDEND)

DATASET ACTIVATE DataSet1.
RECODE !var1 (1=1) INTO !var5.
RECODE !var2 (1=2) INTO !var5.
RECODE !var3 (1=3) INTO !var5.
RECODE !var4 (1=4) INTO !var5.

EXECUTE.
!ENDDEFINE.
!kaz var1=Q312/ var2=Q442 / var3= Q311/ var4= Q42/ var5=newvar1.
!kaz var1=Q312b/ var2=Q442b / var3= Q311b/ var4= Q42b/ var5=newvar2.
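
How the arguments of such a call get split can be sketched in Python. This is a simplified model and not the SPSS macro parser: every argument except the last ends at the next slash (like !CHAREND('/')), and the last one takes the rest of the call (like !CMDEND). The function name is made up, and it assumes the arguments appear in the order they were defined.

```python
def split_macro_call(call_text, arg_names):
    """Split the text after the macro name into {argument: value} (simplified model)."""
    text = call_text.rstrip().rstrip(".")        # drop the command terminator
    parts = text.split("/", len(arg_names) - 1)  # the last part keeps all remaining text
    args = {}
    for name, part in zip(arg_names, parts):
        key, _, value = part.partition("=")
        if key.strip() != name:
            raise ValueError("expected %s=, got %r" % (name, part))
        args[name] = value.strip()
    return args

call = "var1=Q312/ var2=Q442 / var3= Q311/ var4= Q42/ var5=newvar1."
names = ["var1", "var2", "var3", "var4", "var5"]
args = split_macro_call(call, names)
print(args)
# The first RECODE line of the macro body after substitution:
print("RECODE %s (1=1) INTO %s." % (args["var1"], args["var5"]))
```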

Copy a dataset (this copies the dataset that is active at the time)

DATASET COPY John.

Produce an Excel file and put data in a sheet

set printback off.

DATASET ACTIVATE original.
output new.
FREQUENCIES VARIABLES=var1 var2.
output export
/contents export=visible
/xls documentfile=!pathname+!title+" extra.xls"
operation=CREATESHEET
sheet="Two_categorical_vars".

Produce an Excel file (but can't specify a sheet)

dataset activate freq_result.

SAVE TRANSLATE OUTFILE=!pathname+!title+" freq.xls"
/TYPE=XLS
/VERSION=12
/MAP
/REPLACE
/FIELDNAMES
/CELLS=VALUES.