Professional Documents
Culture Documents
Powerpoint Templates
What is SAS?
Statistical Application Software
Powerpoint Templates
SAS/STAT
Procedures for various statistical analyses
SAS/GRAPH
Procedures and options to create graphs
SAS/ETS
Economic Time Series Time Series Analysis
SAS/OR
Operations Research Optimization, LP etc
SAS/EIS
Enterprise Information Systems for OLAP models
Enterprise Miner
Mining Package with various techniques
SAS/Intrnet
Web based application and portal development
Powerpoint Templates
Powerpoint Templates
Log Window
Powerpoint Templates
Output Window
Output Window
View the Values of a Dataset in this Window
Powerpoint Templates
PROC (procedure) steps are pre-written routines that enable you to analyze and process the data in a SAS data set
you can use PROC steps to create a report that lists the data produce descriptive statistics create a summary report produce plots and charts.
Powerpoint Templates
Powerpoint Templates
Data Portion
Contains Rows, Column and actual Value
Variable Attributes
Name Type Length Format Informat Label
Powerpoint Templates
SAS Libraries
Every SAS file is stored in a SAS library, which is a collection of SAS files. A SAS data library is the highest level of organization for information within SAS. General form, basic LIBNAME statement:
LIBNAME libref 'SAS-data-library';
where libref is 1 to 8 characters long, begins with a letter or underscore, and contains only letters, numbers, or underscores. SAS-data-library is the name of a SAS data library in which SAS data files are stored. The specification of the physical name of the library differs by operating environment.
Powerpoint Templates
10
Powerpoint Templates
11
To reference temporary SAS files, you can specify the default libref Work, a period, and the filename.
For example, the two-level name Work.Test Alternatively, you can use a one-level name (the filename only) to reference a file in a temporary SAS library. When you specify a one-level name, the default libref Work is assumed.
If the USER library is assigned, SAS uses the User library rather than the Work library for one-level names. User is a permanent library. So referencing a SAS file in any library except Work indicates that the SAS file is stored permanently.
Powerpoint Templates
12
Powerpoint Templates
13
Powerpoint Templates
14
Basic Report
You can easily list the contents of a SAS data set by using a simple program like the one shown below.
libname clinic 'your-SAS-data-library'; proc print data=clinic.admit; run;
You can produce column totals for numeric variables within your report.
libname clinic 'your-SAS-data-library'; proc print data=clinic.admit; sum fee; run;
Powerpoint Templates
15
Powerpoint Templates
16
Meaning
equal to
Example
where name='Jones, C.';
^= or ne
> or gt < or lt >= or ge <= or le
not equal to
greater than less than greater than or equal to less than or equal to
Powerpoint Templates
17
More Operators
Using the CONTAINS Operator
The CONTAINS operator selects observations that include the specified substring. The mnemonic equivalent for the CONTAINS operator is ?
where firstname CONTAINS 'Jon'; where firstname ? 'Jon';
IN operator
where actlevel in ('LOW','MOD'); where fee in (124.80,178.20);
Between And
Where date Between 21Dec2010d And 20Jan2011d;
Like Operator
Where name like _uj%;
Powerpoint Templates 18
LIBNAME statement
FILENAME statement DATA statement INFILE statement INPUT statement RUN statement PROC PRINT statement RUN statement
Powerpoint Templates
20
Powerpoint Templates 21
Powerpoint Templates
22
Powerpoint Templates
23
Powerpoint Templates
24
SAS Expressions
An expression is a sequence of operands and operators that form a set of instructions. The instructions are performed to produce a new value:
Operands are variable names or constants. They can be numeric, character, or both. Operators are special-character operators, grouping parentheses, or functions.
Powerpoint Templates
25
**
* / + -
exponentiation
multiplication division addition subtraction
raise=x**y;
mult=x*y; divide=x/y; sum=x+y; diff=x-y;
I
II II III III
When you use more than one arithmetic operator in an expression, operations of priority I are performed before operations of priority II, and so on consecutive operations that have the same priority are performed
from right to left within priority I from left to right within priorities II and III
Powerpoint Templates
27
Chennai 3
Calcutta 4 ; run ;
Powerpoint Templates
*
28
Compilation phase
A SAS DATA step is processed in two phases:
Powerpoint Templates
31
Execution phase
If the DATA step compiles successfully, then the execution phase begins.
During the execution phase, the DATA step reads and processes the input data. The DATA step executes once for each record in the input file
Compile Program Execution Phase
Initialize variables to missing
End of File ?
No
Yes
Powerpoint Templates
32
Powerpoint Templates
33
Powerpoint Templates
34
Syntax Checking
During the compilation phase, SAS scans each statement looking for following syntax errors.
missing or misspelled keywords invalid variable names missing or invalid punctuation invalid options.
Powerpoint Templates
35
Input Data
The INFILE statement identifies the location of the raw data.
Input Pointer
INPUT statement uses an input pointer to keep track of its position The input pointer starts at column 1 of the first record, unless otherwise directed
Powerpoint Templates 36
Powerpoint Templates
37
data perm.update; infile invent; input Item $ 1-13 IDnum $ 15-19 InStock 21-22 BackOrd 24-25; Total=instock+backord; run;
Powerpoint Templates
38
Powerpoint Templates
39
Column input
Powerpoint Templates
40
Free-Format Data
What is free format data
Data that is not arranged in columns The fields are often separated by blanks or by some other delimiter
Powerpoint Templates
41
Fixed-Field Data
What is Fixed-Field Data
Data is arranged in columns or fixed fields You can specify a beginning and ending column for each field
Powerpoint Templates
42
Calcutta 4
NOTE: The infile "C:\training\sample1.txt" is: RECFM=V, LRECL=256 The minimum record length was 8. The maximum record length was 10.
Powerpoint Templates
*
44
Powerpoint Templates
45
NOTE: SAS went to a new line when INPUT statement reached past the end of a line. NOTE: The data set WORK.RUNNERS has 2 observations and 5 variables. All missing data must be indicated by a period All values are separated by at least one space Character data are eight characters or fewer Should not have embedded spaces
Powerpoint Templates
*
46
name=Jon surname=K age=13 runtime1=25.1 runtime2=. _ERROR_=1 _N_=3 NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
Powerpoint Templates
*
47
Powerpoint Templates
48
Powerpoint Templates
50
Powerpoint Templates
51
Powerpoint Templates
52
No placeholder is required for missing data. A blank field is read as missing and does not cause other fields to be read incorrectly. Fields or parts of fields can be re-read. Fields do not have to be separated by blanks or other delimiters.
Powerpoint Templates
53
Powerpoint Templates
54
Powerpoint Templates
55
Powerpoint Templates
56
Powerpoint Templates
57
Using Informats
An informat is an instruction that tells SAS how to read raw data SAS provides many informats for reading standard and DATEw. NENGOw. nonstandard data values PERCENTw.d $BINARYw. DATETIMEw. PDw.d Note that
$w. COMMAw.d HEXw. JULIANw. MMDDYYw. PERCENTw. TIMEw. w.d
each informat contains a w value to indicate the width of the raw data field each informat also contains a period, which is a required delimiter for some informats, the optional d value specifies the number of implied decimal places informats for reading character data always begin with a dollar sign ($).
Powerpoint Templates 58
Powerpoint Templates
59
blanks commas dashes dollar signs percent signs right parentheses left parentheses, which are converted to minus signs
COMMA w. d
the informat name a value that specifies the width of the field to be read (including dollar signs, decimal places, or other special characters), followed by a period an optional value that specifies the number of implied decimal places for a value (not necessary if the value already contains decimal places).
Powerpoint Templates
60
A SAS datetime is a special value that combines both date and time information. A SAS datetime value is stored as the number of seconds between midnight on January 1, 1960, and a given date and time.
Powerpoint Templates
61
DATEw. Informat
Reads ddmmmyy or ddmmmyyyy
Date Expression 30May00 30May2000 30-May-2000 SAS Date Informat DATE7. DATE9. DATE11.
Powerpoint Templates
62
Five is the minimum acceptable field width for the TIMEw. informat.
Powerpoint Templates 63
Monday
Monday, Apr 5, 99 Monday, April 5, 1999
Powerpoint Templates
64
Powerpoint Templates
65
Note that the raw data file must contain the same number of records for each observation that is being created
Powerpoint Templates
66
Powerpoint Templates
67
Powerpoint Templates
68
Powerpoint Templates
69
Points to Remember
Because the / pointer control can only move forward, the pointer control is specified after the values in the current record are read. The #n pointer control can read records in any order and must be specified before the variable names are defined. A semicolon should be placed at the end of the complete INPUT statement.
Powerpoint Templates 70
Powerpoint Templates
71
Using the Double Trailing At Sign (@@) to Hold the Current Record
Typically, each time a DATA step executes, the INPUT statement reads a new record. When the trailing @@ is used, the INPUT statement holds the current record and reads the next value. The double trailing at sign (@@)
Holds the data line in the input buffer across multiple executions of the DATA step Typically is used to read multiple SAS observations from a single data line Should not be used with the @ pointer control, with column input, nor with the MISSOVER option.
Powerpoint Templates
72
Using the Double Trailing At Sign (@@) to Hold the Current Record
A record that is being held by the double trailing at sign (@@) is not released until one of the following events occurs:
The input pointer moves past the end of the record. Then the input pointer moves down to the next record. An INPUT statement that has no line-hold specifier executes.
Powerpoint Templates
73
Using the Single Trailing At Sign (@) to Hold the Current Record
Like the double trailing @@, the single trailing @
Enables the next INPUT statement to read from the same record Releases the current record when a subsequent INPUT statement executes without a line-hold specifier.
It's easy to distinguish between the trailing @@ and the trailing @ by remembering that
the double trailing at sign (@@) holds a record across multiple iterations of the DATA step until the end of the record is reached. the single trailing at sign (@) releases a record when control returns to the top of the DATA step.
Powerpoint Templates
74