Paper 19-28
Abstract
The SQL Procedure contains many powerful and elegant language features for advanced SQL users. This paper presents SQL topics that will help programmers unlock the many hidden features, options, and other hard-to-find gems found in the SQL universe. Topics include CASE logic; the COALESCE function; SQL statement options _METHOD, _TREE, and other useful options; dictionary tables; automatic macro variables; and performance issues.

MOVIES Table

Title                       Rating   Length
Jurassic Park               PG-13    127
Lethal Weapon               R        110
Michael                     PG-13    106
National Lampoon's Vacat    PG-13     98
Poltergeist                 PG       115
Rocky                       PG       120
Scarface                    R        170
Silence of the Lambs        R        118
Star Wars                   PG       124
The Hunt for Red October    PG       135
The Terminator              R        108
The Wizard of Oz            G        101
Titanic                     PG-13    194

Finding the First Non-Missing Value

The SQL procedure provides a way to find the first non-missing value in a column or list. Specified in a SELECT statement, the COALESCE function inspects a column, or in the case of a list scans the arguments from left to right, and returns the first non-missing or non-NULL value. If all values are missing, the result is missing.

When coding the COALESCE function, all arguments must be of the same data type. The example shows one approach to reporting each movie's length in minutes from the MOVIES table. If the LENGTH column contains a missing value, a zero is assigned to prevent the propagation of missing values.

SQL Code

PROC SQL;
  SELECT TITLE,
         RATING,
         (COALESCE(LENGTH, 0))
           AS Tot_Length
    FROM MOVIES;
QUIT;

Summarizing data

Although the SQL procedure is frequently used to display or extract detailed information from tables in a database, it is also a wonderful tool for summarizing (or aggregating) data. By constructing simple queries, data can be summarized down rows (observations) as well as across columns (variables). This flexibility gives SAS users an incredible range of power, and the ability to take advantage of several SAS-supplied (or built-in) summary functions. For example, it may be more interesting to see the average of some quantities rather than the set of all quantities.

Without the ability to summarize data in SQL, users would be forced to write complicated formulas and/or routines, or even write and test DATA step programs to summarize data. To see how an SQL query can be constructed to summarize data, two examples will be illustrated: 1) Summarizing data down rows and 2) Summarizing data across columns.
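As a minimal sketch of the built-in summary functions mentioned above, the following query summarizes the MOVIES table down rows and returns one result row per rating. It assumes the RATING and LENGTH columns used elsewhere in this paper; the aliases Movie_Count, Shortest, Longest, and Average_Length are illustrative names and do not come from the original examples.

```sas
PROC SQL;
  /* One summary row per distinct RATING value */
  SELECT RATING,
         COUNT(*)    AS Movie_Count,     /* number of rows in the group   */
         MIN(LENGTH) AS Shortest,        /* smallest non-missing LENGTH   */
         MAX(LENGTH) AS Longest,         /* largest non-missing LENGTH    */
         AVG(LENGTH) AS Average_Length   /* mean of non-missing LENGTHs   */
           FORMAT=6.2
    FROM MOVIES
    GROUP BY RATING;
QUIT;
```

The summary functions skip missing values of LENGTH, so a group's average is computed over its non-missing rows only, while COUNT(*) still counts every row in the group.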
1. Summarizing data down rows

SQL Code

PROC SQL;
  SELECT AVG(LENGTH) AS
           Average_Movie_Length
    FROM MOVIES
    WHERE RATING IN
          ('PG', 'PG-13');
QUIT;

The result from executing this query shows that the average movie length, rounded to the hundredths position, is 124.08 minutes.

Results

Average_Movie_Length
            124.0769

2. Summarizing data across columns

Being able to summarize data across columns often comes in handy when a computation is required on two or more columns in each row. Suppose you wanted to know the difference in minutes between each PG and PG-13 movie's running length with trailers (add-on specials for your viewing pleasure) and without trailers.

Case Logic

In the SQL procedure, a case expression provides a way of conditionally selecting result values from each row in a table (or view). Similar to an IF-THEN construct, a case expression uses a WHEN-THEN clause to conditionally process some but not all the rows in a table. An optional ELSE expression can be specified to handle an alternative action should none of the WHEN conditions be satisfied.

A case expression must be a valid SQL expression and conform to syntax rules similar to DATA step SELECT-WHEN statements. Even though this topic is best explained by example, let's take a quick look at the syntax.

CASE <column-name>
  WHEN when-condition THEN result-expression
  <WHEN when-condition THEN result-expression>
  <ELSE result-expression>
END

A column-name can optionally be specified as part of the CASE-expression. If present, it is automatically made available to each when-condition. When it is not specified, the column-name must be coded in each when-condition. Let's examine how a case expression works.
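As a hedged sketch of both forms of the case expression, the query below codes the column name once after CASE for RATING, and codes the column in each when-condition for LENGTH. It assumes the MOVIES table and its TITLE, RATING, and LENGTH columns; the result labels, the cutoffs of 110 and 130 minutes, and the aliases Rating_Group and Length_Group are illustrative choices, not values from the original paper.

```sas
PROC SQL;
  SELECT TITLE,
         RATING,
         /* Column name coded once after CASE; each WHEN compares against it */
         CASE RATING
           WHEN 'G'  THEN 'General'
           WHEN 'PG' THEN 'Parental Guidance'
           ELSE 'Other'
         END AS Rating_Group,
         LENGTH,
         /* No column name after CASE; LENGTH is coded in each when-condition */
         CASE
           WHEN LENGTH IS MISSING THEN 'Unknown'
           WHEN LENGTH < 110      THEN 'Short'
           WHEN LENGTH <= 130     THEN 'Medium'
           ELSE 'Long'
         END AS Length_Group
    FROM MOVIES;
QUIT;
```

The when-conditions are evaluated in order and the first one that is true supplies the result, which is why the missing-value test comes first: in SAS a missing numeric value compares lower than any number and would otherwise be labeled 'Short'.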
... the MOVIES table with the value stored in the macro variable MIN_LENGTH using the INTO clause. The results are displayed on the SAS log.

SQL Code

PROC SQL NOPRINT;
  /* Store the shortest movie length in a macro variable */
  SELECT MIN(LENGTH)
    INTO :MIN_LENGTH
    FROM MOVIES;
QUIT;
%PUT &MIN_LENGTH;

... would produce a cross-reference listing on the user library PATH for the column TITLE in all DATA types.

SQL Code

%MACRO COLUMNS(LIB, COLNAME);
  PROC SQL;
    /* List every data set in &LIB that contains the column &COLNAME */
    SELECT LIBNAME, MEMNAME
      FROM DICTIONARY.COLUMNS
      WHERE UPCASE(LIBNAME)="&LIB" AND
            UPCASE(NAME)="&COLNAME" AND
            UPCASE(MEMTYPE)="DATA";
  QUIT;
%MEND COLUMNS;
%COLUMNS(PATH,TITLE);
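The two techniques above, the INTO clause and a DICTIONARY.COLUMNS lookup, can also be combined. The sketch below is an illustration under stated assumptions rather than code from the original paper: it reuses the PATH libref and TITLE column named in the text, and the macro variable name TABLE_LIST is hypothetical. It also writes the automatic macro variable SQLOBS, which PROC SQL sets to the number of rows processed by the most recent statement.

```sas
/* Initialize so the %PUT below resolves even when no rows match */
%LET TABLE_LIST=;

PROC SQL NOPRINT;
  /* Collect the names of all data sets in PATH that have a TITLE column */
  SELECT DISTINCT MEMNAME
    INTO :TABLE_LIST SEPARATED BY ' '
    FROM DICTIONARY.COLUMNS
    WHERE UPCASE(LIBNAME)="PATH" AND
          UPCASE(NAME)="TITLE" AND
          UPCASE(MEMTYPE)="DATA";
QUIT;

%PUT Tables containing TITLE: &TABLE_LIST;
%PUT Rows returned by the query: &SQLOBS;
```

Storing the member names in a single space-delimited macro variable makes the list easy to reuse later in the program, for example as the driver of a macro loop.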
References

Lafler, Kirk; Ten Great Reasons to Learn the SQL Procedure; SAS Users Group International; 1999.

Lafler, Kirk; Power SAS: A Survival Guide, First Edition; Apress, Berkeley, CA, USA; 2002.

SAS Guide to the SQL Procedure: Usage and Reference, Version 6, First Edition; SAS Institute, Cary, NC, USA; 1990.

SAS SQL Procedure User's Guide, Version 8; SAS Institute Inc., Cary, NC, USA; 2000.

SAS SQL Programming Tips: Version 8; Software Intelligence Corporation, Spring Valley, CA, USA; 2002.

Bio

Kirk Paul Lafler is a SAS Consultant and SAS Certified Professional with 25 years of SAS software experience. He has written four books and over one hundred articles for professional journals and SAS User Group proceedings. Kirk's popular SAS Tips column appears regularly in the BASAS, SANDS, and SESUG Newsletters. His expertise spans application design and development, training, and programming using base-SAS, SQL, ODS, SAS/FSP, SAS/AF, SCL, FRAME, and SAS/EIS software.

Comments and suggestions can be sent to:

Kirk Paul Lafler
Software Intelligence Corporation
P.O. Box 1390
Spring Valley, California 91979-1390
E-mail: KirkLafler@cs.com
http://www.software-intelligence.com
Voice: 619.660.2400