Step Processing
Alan C. Elliott
stattutorials.com
Compile Phase
Execution Phase (Read data, Calculate)
Output Phase (Create Data Set)
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32;
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
If errors are discovered, SAS attempts to interpret what you mean.
If SAS can't correct the error, it prints an error message to the log.
INPUT BUFFER (first record)
Column:  1  2  3  4  5  6  7  8  9 10 11 12
Value:   0  0  0  1     2  4     3  7  .  3
Execution Phase
The PROGRAM DATA VECTOR (PDV), created during compilation, contains
information about the variables. Before the first record is read:
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0              .    .      .
Buffer to PDV
Reads 1st record from the buffer into the PDV (TEMPF is initially missing):
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0        0001  24   37.3   .
After the calculation (TEMPF holds the calculated value):
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0        0001  24   37.3   99.14
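One way to watch the PDV change is to write it to the log with PUT _ALL_ before and after each statement. The following is a minimal sketch based on the program above; using DATA _NULL_ (so that no data set is created) and the label text in the PUT statements are choices made here for illustration, not part of the original slides.

```sas
DATA _NULL_;                            /* _NULL_: run the step without creating a data set */
  PUT 'Start of iteration: ' _ALL_;     /* PDV before the record is read */
  INPUT ID $ AGE TEMPC;
  PUT 'After INPUT:        ' _ALL_;     /* TEMPF is still missing here */
  TEMPF=TEMPC*(9/5)+32;
  PUT 'After calculation:  ' _ALL_;     /* TEMPF now holds the calculated value */
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
```

The log should show the "Start of iteration" line once more than the number of records, because the step only stops when INPUT tries to read past the last data line.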
Output Phase
The values in the PDV are written to the output data set (NEW) as the
first observation:
PDV
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0        0001  24   37.3   99.14
Data set NEW (from PDV)
ID    AGE  TEMPC  TEMPF
0001  24   37.3   99.14
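By default SAS writes the PDV to the data set at the end of each iteration. The sketch below makes that write explicit with the OUTPUT statement, which should give the same result as the original program; the data set name NEW2 is hypothetical.

```sas
DATA NEW2;                    /* NEW2 is a hypothetical name for this sketch */
  INPUT ID $ AGE TEMPC;
  TEMPF=TEMPC*(9/5)+32;
  OUTPUT;                     /* write the current PDV values as one observation */
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print data=NEW2; run;
```

Placing OUTPUT before the TEMPF assignment would write the observation while TEMPF is still missing, which is one way to see that the data set receives whatever the PDV holds at that moment.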
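Exceptions to Missing in PDV
Initial values are usually set to missing in the PDV. The exceptions are
the automatic variables:
_N_      1
_ERROR_  0
ID, AGE, TEMPC and TEMPF are set to missing in the PDV before each record is read.
Data set NEW after both records are processed:
ID    AGE  TEMPC  TEMPF
0001  24   37.3   99.14
0002  35   38.2   100.76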
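The automatic variables _N_ and _ERROR_ exist in the PDV but are not written to the output data set. A small sketch that copies them into regular variables so they can be inspected; the names CHECK, ITERNUM and ERRFLAG are hypothetical.

```sas
DATA CHECK;                   /* CHECK, ITERNUM and ERRFLAG are hypothetical names */
  INPUT ID $ AGE TEMPC;
  TEMPF=TEMPC*(9/5)+32;
  ITERNUM=_N_;                /* iteration counter of the DATA step */
  ERRFLAG=_ERROR_;            /* 0 unless an error occurred in this iteration */
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print data=CHECK; run;
```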
Descriptor Information
SAS creates and maintains a description of each SAS data set. It covers
data set attributes and variable attributes, including:
the name of the data set
the member type
the date and time that the data set was created
the number, names and data types (character or numeric) of the variables
Name  Member Type  File Size  Last Modified
NEW   DATA         5120       20Nov13:08:59:32
Description output continued
Data Set Name        WORK.NEW                        Observations          2
Member Type          DATA                            Variables             4
Engine               V9                              Indexes               0
Created              Wed, Nov 20, 2013 08:59:32 AM   Observation Length    32
Last Modified        Wed, Nov 20, 2013 08:59:32 AM   Deleted Observations  0
Protection                                           Compressed            NO
Data Set Type                                        Sorted                NO
Label
Data Representation  WINDOWS_64
Encoding             wlatin1 Western (Windows)
Description output
continued
Alphabetic List of Variables and Attributes
#  Variable  Type  Len
2  AGE       Num   8
1  ID        Char  8
3  TEMPC     Num   8
4  TEMPF     Num   8
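The same descriptor information can also be requested with PROC CONTENTS. A minimal sketch, assuming the data set NEW from the original program is still in the WORK library:

```sas
proc contents data=new;          /* data set attributes plus alphabetic variable list */
run;

proc contents data=new varnum;   /* VARNUM lists variables in creation order instead */
run;
```

The VARNUM listing follows the order in which the variables entered the PDV (ID, AGE, TEMPC, TEMPF), which matches the # column in the alphabetic list.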
Original Program
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32;
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
Program output
Obs  ID    AGE  TEMPC  TEMPF
1    0001  24   37.3   99.14
2    0002  35   38.2   100.76
Example of Error
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
Missing semicolon at the end of the TEMPF=TEMPC*(9/5)+32 statement
proc datasets ;
contents data=new;
run;
Error found during compilation
76   DATA NEW;
77   INPUT ID $ AGE TEMPC;
78   TEMPF=TEMPC*(9/5)+32
79   DATALINES;
     --------
     22
80   0001 24 37.3
     ---
     180
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *,
              **, +, -, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE,
              GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, ^=, |, ||,
              ~=.
ERROR 180-322: Statement is not valid or it is used out of proper order.
81   0002 35 38.2
82   ;
83   run;
Summary - Compilation Phase
During compilation, SAS:
checks syntax
identifies the type and length of each new variable (is a data type conversion needed?)
creates an input buffer if there is an INPUT statement for an external file
creates the Program Data Vector (PDV)
creates descriptor information for data sets and variable attributes
Other options not discussed here: DROP; KEEP; RENAME; RETAIN; WHERE; LABEL; LENGTH; FORMAT; ARRAY; BY; ATTRIB; END=, IN=, FIRST, LAST, POINT=
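To illustrate the "is a data type conversion needed?" check, the sketch below uses the character variable ID in arithmetic. SAS sets up an automatic character-to-numeric conversion for that statement and writes a note about it to the log; the INPUT function does the same conversion explicitly. The names CONV, IDNUM and IDNUM2 are hypothetical and not part of the original slides.

```sas
DATA CONV;                    /* CONV, IDNUM and IDNUM2 are hypothetical names */
  INPUT ID $ AGE TEMPC;
  IDNUM = ID * 1;             /* ID is character: automatic conversion, noted in the log */
  IDNUM2 = INPUT(ID, 8.);     /* explicit conversion with the INPUT function */
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print data=CONV; run;
```

The explicit form is usually preferred because it keeps the log free of conversion notes and makes the intent visible in the code.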
End