INTRODUCTION
One way to break out of that vicious circle of "can't get a job without experience, can't get experience without a job" is to take
advantage of the many opportunities to use open data to help inform your community. Open data, that is, data freely available to
anyone to use or publish, is available from an enormous range of government and non-profit sources. My personal favorite sources
for open data are the websites of data.gov, the U.S. Census Bureau and the National Center for Education Statistics. These three sites
alone provide several hundred thousand data sets to choose from. Applying your SAS skills to open data gives you experience with
different data set types, statistical techniques and procedures as well as the potential for doing some good for your community.
This is the second of a three-paper series on SAS applied to open data. There is often substantial work involved in preparing a data
set for analysis and an earlier paper dealt with that process (De Mars, 2011a). We're going to assume that work has already been
done. The next step is to produce presentation-quality results.
Not only is this table really ugly, it's not even usable for a presentation. Most middle school students are not going to be able to
interpret numbers in scientific notation. We'd like to get rid of the date, give it a title other than "The SAS System", and have Black and
White show up for race instead of 0 and 1. We'd also like to get rid of the scientific notation and have numbers as human beings
read them, not calculators. According to the SAS Procedures Guide for Version 8 (SAS Institute, 1999), "When scientific notation is
used, only the first few significant digits are shown. If you need more significant digits than PROC FREQ displays, create an output
data set by specifying OUT= in the TABLES statement. Then use PROC PRINT and assign an appropriate format to the variable
COUNT."
The later documentation has been silent on this issue, so if there is a simple way to get rid of scientific notation in PROC FREQ,
I haven't found it. We want to create an output data set anyway. Think about the fact that you are working with three million records.
Charting, sorting and applying IF statements to 3,000,000+ records is inefficient. So, our challenge is both to become
more efficient and to change our output to the improved version below, in one and the same program.
Population Distribution by Race

Race includes   Race includes   2009           Percent of
Black           White           Population     Population
No              Yes             234,175,873    76.3
Yes             No               38,805,561    12.6
No              No               31,876,214    10.4
Yes             Yes               2,148,908     0.7
Here is the code to produce the table above. At first glance, it may seem an unreasonable amount of code for one little table, but
there is a method at work here, trust me.
OPTIONS NODATE NONUMBER ;
PROC FORMAT ;
   VALUE $YN
      "0" = "No"
      "1" = "Yes" ;
RUN ;
PROC FREQ DATA = lib.pums9 NOPRINT ;
   TABLES racblk*racwht / OUT = lib.blkwhitmix ;
   WEIGHT pwgtp ;
RUN ;
PROC SORT DATA = lib.blkwhitmix ;
   BY DESCENDING PERCENT ;
RUN ;
ODS RTF FILE = "C:\Users\AnnMaria\Documents\pc_pus\sasout\RaceDist.rtf" STYLE = OCEAN ;
TITLE "Population Distribution by Race" ;
FOOTNOTE "2009 American Community Survey Data" ;
PROC PRINT DATA = lib.blkwhitmix SPLIT = " " ;
   ID racblk ;
   VAR racwht COUNT PERCENT ;
   FORMAT COUNT COMMA14. PERCENT 8.1 racblk racwht $yn. ;
   LABEL COUNT = "2009 Population"
         PERCENT = "Percent of Population" ;
RUN ;
/* Close the RTF destination so the file is written */
ODS RTF CLOSE ;
WHAT WE'RE DOING WITH THIS PROGRAM AND WHY WE ARE DOING IT
Starting from the top of our program
OPTIONS NODATE NONUMBER ;
This removes the date and page number from the first line of our output.
PROC FORMAT ;
The format procedure begins with a PROC FORMAT statement.
VALUE $YN
"0" = "No"
"1" = "Yes" ;
The VALUE statement will create a format. You can have as many VALUE statements as you like. A few points to note here:
Even if the values are numbers, if the variable to which your format is going to be applied is a character variable, you need to
put those numbers in quotes, just as you would any time you reference a character variable's value.
This format is temporary because I did not store it anywhere. Just like a temporary data set, when the SAS session ends, this
format will be gone. For this reason, you probably want to give it some thought before applying a temporary format to
variables stored in a permanent data set.
With these two statements, I have created a new temporary format. A format has to be defined before it can be used in your SAS
program, which is why it is a good habit to put your FORMAT procedure before any other DATA or PROC steps.
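If you did want the format to outlive the session, one option is to store it in a permanent catalog. The following is only a minimal sketch, not part of the original program. It assumes the libref lib has already been assigned with a LIBNAME statement and uses the LIBRARY= option of PROC FORMAT together with the FMTSEARCH= system option.

```sas
/* Sketch only: store the $YN format permanently instead of in WORK.   */
/* Assumption: the libref LIB was assigned earlier with LIBNAME.       */
PROC FORMAT LIBRARY = lib ;
   VALUE $YN
      "0" = "No"
      "1" = "Yes" ;
RUN ;

/* Tell SAS to look in LIB.FORMATS as well as WORK when it needs a format */
OPTIONS FMTSEARCH = (lib work) ;
```

With the format stored this way, a later session only needs the LIBNAME and OPTIONS statements before using $yn. on a permanent data set.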
PROC FREQ DATA = lib.pums9 NOPRINT;
The NOPRINT option doesn't really matter in this case, but it's a good habit to get into when you don't need the printed output because,
with some variables (income, for example), the procedure could produce thousands of lines of useless output.
TABLES racblk* racwht / OUT = lib.blkwhitmix ;
This TABLES statement will produce a cross-tabulation with the first variable being the row variable and the second one the column
variable. Don't forget the * . If you leave out the asterisk you'll get two tables, one a frequency distribution of the variable racblk and
a second a frequency distribution of the racwht variable. The OUT = option will write the counts and percentages to a data set.
WEIGHT pwgtp ;
Many open data sets come from surveys and will usually require a WEIGHT statement. Don't forget this!!! In the case of the American
Community Survey, leaving off the WEIGHT statement means that your counts will be off by a factor of about 101, because each of the
3,030,728 records in the file represents, on average, about 101 people.
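To see the effect of the weight for yourself, you could run the same table with and without the WEIGHT statement. This is just an illustrative sketch using the data set and variables from the program above. It prints its results, so expect two small tables in the output window.

```sas
/* Sketch only: compare unweighted and weighted counts for one variable. */

/* Unweighted: counts the survey records themselves */
PROC FREQ DATA = lib.pums9 ;
   TABLES racblk ;
RUN ;

/* Weighted: each record counts as many times as its person weight, pwgtp */
PROC FREQ DATA = lib.pums9 ;
   TABLES racblk ;
   WEIGHT pwgtp ;
RUN ;
```

The first table should total the number of records in the file and the second the estimated population.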
To see your output data set, click on the EXPLORER tab, double-click on the LIBRARIES tab, double-click on the name of the library,
in this case, Lib, and then double-click on the data set. The data set created by the FREQ procedure can be seen in the window to
the right.
The FORMAT statement specifies that the count variable will have a width of 14, and include commas. The percent variable will have
a width of 8 with one decimal place. Since there are two variables listed before the $yn format, both racblk and racwht will use the $yn
format we created above. NOTE! I created a temporary format and I am using it in the PROC PRINT step. A format used in a PROC
step does not permanently change the format of the stored variable, it only changes it for that step. As a general rule, avoid using
temporary formats for permanent data sets if you can and you will run into fewer format error problems.
LABEL COUNT = "2009 Population"
PERCENT = "Percent of Population" ;
The LABEL statement assigns a label to each variable and, because the SPLIT = option was used, these labels will be split onto a new
line between words.
WHAT WE'RE DOING WITH THIS PROGRAM AND WHY WE ARE DOING IT
We want to create a new variable, race, based on the survey respondents' checked answers to the two boxes for black and white,
that is, the variables racblk and racwht. Here is the first time we'll be glad we created an output data set. Rather than perform the logic
in the DATA step for the 3,030,728 records in the PUMS data set, we only need to do it for the four records in the output file from the
frequency procedure.
DATA byrace ;
   SET lib.blkwhitmix ;
   IF racblk = 1 AND racwht = 0 THEN Race = "Black" ;
   ELSE IF racblk = 0 AND racwht = 1 THEN Race = "White" ;
   ELSE IF racblk = 0 AND racwht = 0 THEN Race = "Other" ;
   ELSE IF racblk = 1 AND racwht = 1 THEN Race = "Mixed" ;
   PERCENT = PERCENT/ 100 ;
run;
In the data set saved by the frequency procedure, PERCENT is not saved as a decimal, rather, 40.1% is saved as 40.1. For later use,
we want that to be an actual decimal, so divide it by 100.
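One reason the decimal form is handy is that the PERCENTw.d format expects a proportion: it multiplies the value by 100 and adds the percent sign. A quick way to check the new data set, offered only as a sketch and not part of the original program, is to print its four records:

```sas
/* Sketch only: print the four records in BYRACE with readable formats. */
PROC PRINT DATA = byrace NOOBS ;
   VAR race count percent ;
   /* PERCENT8.1 displays a stored value of 0.763 as 76.3% */
   FORMAT count COMMA14. percent PERCENT8.1 ;
RUN ;
```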
Note that the OPTIONS, TITLE and FOOTNOTE statements earlier in our program set the title and footnote and removed the page number
and date. We don't need to do it again. All of these - TITLE, FOOTNOTE, OPTIONS - will remain the same throughout our program
unless we use another TITLE, FOOTNOTE or OPTIONS statement to change them.
These next six statements will also apply to any relevant output throughout our program.
AXIS1 LABEL = ( ANGLE = 90 "Percent") ORDER = (0 to 1 by .1 ) ;
The first part of this statement labels the axis with the text in quotes, in this case "Percent". It also sets the angle for the axis label
rotation to be 90 degrees, in other words, it prints sideways instead of at the end of the axis. The ORDER = option specifies that the
axis minimum will be 0 and the maximum will be 1, with tick marks every .1 . Without the ORDER = , the axis is set based on the data. In this
case, it would have had a maximum of 80%.
AXIS2 ORDER = ("White" "Black" "Mixed" "Other" ) ;
This statement specifies the order in which to display the categories. Because the main point of this chart is for the students to use in
discussing whether the Mixed group should be included in a comparison of black and white survey respondents, we wanted it put
right after the black and white bars. The ORDER = option forces that order. Without this option, the responses would have been in
alphabetical order.
PATTERN1 COLOR = BLACK ;
PATTERN2 COLOR = GRAY ;
PATTERN3 COLOR = BROWN ;
PATTERN4 COLOR = WHITE ;
Without PATTERN statements, SAS will pick the colors by default. It makes a more effective graphic to have the bar representing
Black respondents colored black and the one representing White respondents colored white. NOTE! If you look back at the graph,
this can be confusing, since the first bar is white, the second is black, the third gray and the fourth brown. This doesn't seem to match
the PATTERN statements. It does, however, if you consider the fact that the PATTERN statements are separate from the AXIS
statements. The PATTERN statements are assigned to the values of the chart variable in alphabetical order, regardless of the order
in which the bars are displayed. Since Black is first alphabetically, it is assigned the color from PATTERN1.
PROC GCHART DATA=byrace ;
This statement begins the GCHART procedure, using the data set byrace.
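The remaining statements of this step do not appear here, so the following is only a guess at how the AXIS and PATTERN definitions might be tied to a bar chart. The VBAR statement and all of its options (SUMVAR=, RAXIS=, MAXIS=, PATTERNID=) are assumptions on our part rather than the original code, although each is a standard GCHART option.

```sas
/* Sketch only: one plausible way to chart BYRACE with the AXIS and     */
/* PATTERN statements defined earlier. The VBAR options are assumed.    */
PROC GCHART DATA = byrace ;
   VBAR race / SUMVAR    = percent    /* bar height is the share of the population */
               RAXIS     = AXIS1      /* response axis: 0 to 1 by .1               */
               MAXIS     = AXIS2      /* category axis: White, Black, Mixed, Other */
               PATTERNID = MIDPOINT ; /* give each bar its own PATTERN             */
RUN ;
QUIT ;
```

Which bar picks up which PATTERN statement can depend on the options chosen, so compare the chart you get with the colors you intended before presenting it.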
WHAT WE'RE DOING WITH THIS PROGRAM AND WHY WE ARE DOING IT
First, go back to the PROC FORMAT and create one more format, for sex. The PROC FORMAT is now:
PROC FORMAT ;
   VALUE $YN
      "0" = "No"
      "1" = "Yes" ;
   VALUE $sex
      "1" = "Male"
      "2" = "Female" ;
Having learned from our earlier experience, the first thing we are going to do is create an output data set from the frequency
procedure.
PROC FREQ DATA = lib.pums9 ;
TABLES sex*racblk / OUT = lib.blkwhtsex ;
TABLES st*racblk / OUT = lib.blkwhtst ;
WEIGHT pwgtp ;
WHERE racblk = "1" OR racwht = "1" ;
run ;
Remember the WEIGHT statement!!! Even though I said that above, I'm reminding you here because forgetting the WEIGHT
statement is a common novice mistake and, in this case, it will cause your results to be, literally, 10,000% wrong. That's a lot of
wrong. The second TABLES statement is not needed for the PROC TABULATE output at the end, but since we'll need the data set of
race by state later, we went ahead and created it in this step. In the actual project, there were a lot of TABLES statements in this step.
The only new statement here from our first step is the WHERE statement. Having concluded our discussion above, we have decided
to drop the Other category and include those who had checked both categories. We also decided to consider everyone who
checked Black for their race as black, whether they also checked White or not. If some students disagree with us, that is good
because the point of this whole project with the schools is to get them talking and thinking about statistics. If they think our
designation is wrong or unfair, this is going to be the most passion they've ever had about statistics.
We're going to do the same two things with a lot of data sets, because now we have made two decisions. The first is to use the output
from PROC FREQ for analysis, so we're going to be dividing that percent variable by 100 each time. The second is to categorize
people who selected Black as their race as black. Whenever you find yourself doing the same bit of code over and over, think about
creating a macro. Macro programming is not nearly as scary as some people make it out to be. The trick is to start early in your career
with very simple macros and just get progressively more complex. The example below, with only one macro parameter, is about as
simple as you can get. Although we only use it one time in this paper, in the actual project, we used it over and over.
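For reference, the macro statements that are walked through one at a time in this section can be gathered into a single listing. Nothing here is new: the statements are the ones discussed in the text, simply assembled and indented.

```sas
/* The mkrace macro, assembled from the statements discussed in the text. */
%MACRO mkrace(dsn) ;
   DATA &dsn ;                    /* temporary data set with the same name   */
      SET lib.&dsn ;              /* permanent data set written by PROC FREQ */
      IF racblk = 1 THEN Race = "Black" ;
      ELSE IF racblk = 0 THEN Race = "White" ;
      Percent = percent / 100 ;   /* 40.1 becomes .401 */
   RUN ;
%MEND mkrace ;

/* Call the macro for the sex by race output data set */
%mkrace(blkwhtsex) ;
```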
Let's look at this macro line by line.
%MACRO mkrace(dsn) ;
This statement creates a macro named mkrace and specifies that this macro will require one parameter, which is named dsn.
DATA &dsn ;
This statement creates a data set. When the macro is run &dsn will be replaced with whatever we provided when we called the
macro.
SET lib.&dsn ;
This statement reads in a data set from the library referenced by lib and named whatever value I had supplied for &dsn.
IF racblk = 1 THEN Race = "Black" ;
ELSE IF racblk = 0 THEN Race = "White" ;
Percent = percent/ 100 ;
RUN;
These are just IF, ELSE and assignment statements like every other IF, ELSE and assignment statement you have written in your life.
The fact that they occur in the middle of a macro makes no difference whatsoever.
%MEND mkrace ;
This ends the mkrace macro. Now, to call this macro, all I need to do is:
%mkrace(blkwhtsex) ;
Before moving on to the next procedure, let's recap what we did here, because it's important. We used a PROC FREQ to create a
couple of permanent SAS data sets. The first one, blkwhtsex, included four records. We read this tiny data set and created a new,
temporary data set, also four records, with a new variable, race, and the variable percent now in a decimal format.
Your mileage may vary. There are a couple of choices I made here for reasons of my
own. I mention these choices because part of becoming an experienced programmer is making
decisions and judgments. Even if your decision is to copy an example, you should know why the
example includes the specific choices it does. Here are the choices I made and why.
I did not pass a libref to the macro. I always used lib as the libref in the LIBNAME
statement in this project because it saves me having to specify a library as well as a
data set name when I use a macro. To see how to specify both the library and data set,
see the earlier paper (De Mars, 2011a).
Could I have just created a format using PROC FORMAT for race? Yes. I chose not to do that for two reasons.
First, this is a temporary data set with four records. The time and storage to read every record and create a new variable
is as close to nothing as one could get. Thus, the usual advantage of PROC FORMAT, that it is
faster and takes up less storage space than creating a new variable, is really irrelevant.
Second, I am going to use this race variable a lot. The odds of my forgetting to apply the format at some point and having to
re-run the analysis to produce some output are great. Given this, it's less trouble for me to create the macro.
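If you would rather not hard-code the libref, a keyword parameter with a default value is one way to get both behaviors. This variation is purely hypothetical: the name mkrace2 and the libref= parameter are ours and were not used in the project.

```sas
/* Sketch only: a variation of mkrace with an optional libref parameter. */
/* MKRACE2 and LIBREF= are hypothetical names used for illustration.     */
%MACRO mkrace2(dsn, libref = lib) ;
   DATA &dsn ;
      /* Two periods: the first ends the macro variable name, */
      /* the second separates the libref from the data set    */
      SET &libref..&dsn ;
      IF racblk = 1 THEN Race = "Black" ;
      ELSE IF racblk = 0 THEN Race = "White" ;
      Percent = percent / 100 ;
   RUN ;
%MEND mkrace2 ;

/* Uses the default libref, lib */
%mkrace2(blkwhtsex) ;
/* To read from another library you would add, for example, libref = mylib */
```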
Now, we're going to do a PROC TABULATE using this temporary data set.
TABLE race* sex ALL, count*(SUM= ' '*F=COMMA12.0) percent*(SUM = ' '*F=PERCENT8.1) ;
The first part of our TABLE statement, then, requests statistics for race by sex and for the total population. The second part
specifies the first column variable will be count, with the SUM statistic, and this statistic will not have a label over it, that is, the label
text is a blank space. The format will be a width of 12, 0 decimal places and commas. The second column variable will be percent,
with the SUM statistic, again, no label, and in a percent format with one decimal place.
LABEL Count = "2009 Population"
Percent = "Percent" ;
FORMAT sex $sex. ;
These last two statements should be familiar from above. These just define the labels for the two column variables and specify the
format for sex, which uses the $sex format we created in the PROC FORMAT.
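Only the TABLE, LABEL and FORMAT statements of this step are shown in the text. To run them you also need the PROC TABULATE statement itself plus CLASS and VAR statements, so the sketch below supplies those as assumptions, using the temporary data set blkwhtsex created by the macro call.

```sas
/* Sketch only: the PROC TABULATE, CLASS and VAR statements are assumed; */
/* the TABLE, LABEL and FORMAT statements are the ones discussed here.   */
PROC TABULATE DATA = blkwhtsex ;
   CLASS race sex ;          /* assumed: the row classification variables */
   VAR count percent ;       /* assumed: the analysis variables           */
   TABLE race*sex ALL,
         count*(SUM = ' '*F = COMMA12.0) percent*(SUM = ' '*F = PERCENT8.1) ;
   LABEL count   = "2009 Population"
         percent = "Percent" ;
   FORMAT sex $sex. ;
RUN ;
```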
Population Distribution by Race

                    2009
Race     Sex        Population      Percent
Black    Male        19,565,078       7.1%
         Female      21,389,391       7.8%
White    Male       115,771,666      42.1%
         Female     118,404,207      43.0%
All                 275,130,342     100.0%
To force the percentages to fit specific categories, another VALUE statement was added to the PROC FORMAT at the top of our
program.
VALUE grays
   LOW - .002 = "<2%"
   .003 - .005 = "3-5%"
   .006 - .009 = "6-9%"
   .010 - high = "10 - 12%" ;
The syntax
LOW - some number = formatted value
assigns the formatted value on the right hand side of the equals sign to all of the values from the minimum in the data set to the
specified number. Similarly,
some number - HIGH = formatted value
will assign the values from the specified number to the maximum value.
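A quick way to convince yourself that the ranges behave as described is to push a few values through the format with the PUT function and write the results to the log. This is only a sketch. The four test values are made up, and the format is repeated here so the example runs on its own.

```sas
/* Sketch only: check which label each value receives from GRAYS. */
PROC FORMAT ;
   VALUE grays
      LOW  - .002 = "<2%"
      .003 - .005 = "3-5%"
      .006 - .009 = "6-9%"
      .010 - HIGH = "10 - 12%" ;
RUN ;

DATA _NULL_ ;
   /* Made-up test values, one for each range */
   DO pct = .001, .004, .007, .012 ;
      category = PUT(pct, grays.) ;   /* apply the format to the value  */
      PUT pct= category= ;            /* write both to the SAS log      */
   END ;
RUN ;
```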
The rest of the program is:
%mkrace(blkwhtst) ;
DATA blkwhtst ;
   SET blkwhtst ;
   STATE = INPUT(st,BEST8.) ;
   WHERE racblk = "1" ;
   pct = ROUND(percent,.001) ;
RUN ;
TITLE "African-American Population by State " ;
TITLE2 "By Percent" ;
PATTERN1 COLOR = White ;
PATTERN2 V=M3N45 color=black;
PATTERN3 COLOR= Gray ;
PATTERN4 COLOR= Black ;
PROC GMAP DATA = blkwhtst
   MAP = MAPS.US ;
   ID STATE ;
   CHORO pct / DISCRETE STATISTIC=MEAN ;
   WHERE STATE NE 72 ;
   FORMAT pct grays. ;
   LABEL pct = "Percentage African-American" ;
RUN ;
QUIT ;
WHAT WE'RE DOING WITH THIS PROGRAM AND WHY WE ARE DOING IT
%mkrace(blkwhtst) ;
This calls the macro we created above (remember our macro?), creates a temporary data set named blkwhtst, sets the value of
percent to a decimal and creates a variable named race. It's reading in the permanent data set named blkwhtst that we created in
the PROC FREQ step earlier.
Before I can use the MAPS.US data set I need to have a variable in my data set that matches it. This next step creates a
variable to match the STATE variable in the MAPS.US data set. It also does a little more clean-up of the data set while we're at it.
DATA blkwhtst ;
SET blkwhtst ;
STATE = INPUT(st,BEST8.) ;
After reading in the data from the blkwhtst data set, our assignment statement creates a new, numeric variable, STATE. The
INPUT function reads the st variable using a numeric informat. If there were non-numeric characters in this field, that could cause problems, but
there are none, just the numbers 1 - 72.
where racblk = "1" ;
Because I want to map the percentage of African-American residents in each state, I only need to keep the records where the
respondent checked black as his or her race.
pct = ROUND(percent,.001) ;
The use of the ROUND function will round the variable, percent, to the nearest .001. Without this, SAS would map each value of
percent with a different color. There is another reason for creating a new variable here. PERCENT is a keyword in the GMAP
procedure. It is generally both a bad idea and confusing to use keywords as variable names.
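If the INPUT and ROUND functions are new to you, a throwaway DATA _NULL_ step is an easy way to watch them work. The values here are made up for illustration and are not taken from the PUMS data.

```sas
/* Sketch only: what INPUT and ROUND do to a single made-up record. */
DATA _NULL_ ;
   st = "06" ;                      /* made-up state code stored as character */
   state = INPUT(st, BEST8.) ;      /* character "06" becomes numeric 6       */
   percent = 0.012345 ;             /* made-up proportion                     */
   pct = ROUND(percent, .001) ;     /* rounds to the nearest .001, here 0.012 */
   PUT state= pct= ;                /* writes state=6 pct=0.012 to the log    */
RUN ;
```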
TITLE "African-American Population by State " ;
Title2 "By Percent" ;
Now I need to change the title. This will replace the previous TITLE statement and add a second title line underneath. Notice
that the footnote on the graph stays the same, since there is no new FOOTNOTE statement.
PATTERN1 COLOR = White ;
PATTERN2 V=M3N45 color=black;
PATTERN3 COLOR= Gray ;
PATTERN4 COLOR= Black ;
The previous patterns started with black. The patterns are applied from the lowest value of the variable being mapped, the percentage
of African-American residents, to the highest. It would be confusing to have the states with the lowest percentage of African-Americans
shown in black and those with the highest percentage shown in white. PATTERN1, the states with the lowest percentage, will now
show up in white. We need something between white and gray, though. The V = option on the PATTERN statement gives a value for
When you select Graph-N-Go, a new window will pop up with an icon you are supposed to recognize as a SAS dataset.
Click on that and a new window will pop up with the words SAS dataset at the top, an empty box and, next to it, a button with
three dots, causing you to ask yourself, "What the heck am I supposed to do now?" The answer is to click on the button with the three dots.
Click on that and yet another window will pop up. The next window should look familiar. It has the libraries available to you in the
left pane, including the WORK library, SASUSER, MAPS and any libraries you might have defined with a LIBNAME statement. Select
the library you want to use. Then, in the right pane, select the dataset you want to use. In this case we are going to select the
WORK library and the dataset named competitors.
On the left of the window are several buttons. We want a line plot, so we're going to click on the line plot and drag it to the large
pane in the bottom right. An empty box appears with the title Plot 1. We right-click on the empty box and from the drop-down menu
select PROPERTIES.
In the pop-up window is a drop-down menu with the title DATA MODEL. By this point we are wondering if it might be easier to
learn SAS/GRAPH after all, but we forge ahead, selecting from the drop-down menu the one dataset that we identified previously,
work.competitors.
There are five tabs at the top of the PROPERTIES window: General, Data, Titles/Footnotes, Appearance and
Object Size. We're going to click on the DATA tab and from the drop-down menu next to X, select Males as the variable that we want
to plot and under Y, we'll select Year. We'll also select REGRESSION from the drop-down menu under PLOT STYLES.
We'll click the TITLES tab and give a title for the plot.
We click OK and the chart below is produced. If we didn't like the size, we could right-click on it, select Grow/Shrink and then
drag on the side of the plot to increase or decrease its size, or check the box next to MAXIMIZE, which will make the graph the
maximum size that fits in the window.
We click on MAXIMIZE and are happy with this size, so we simply right-click on the chart, pick EXPORT and from the options
select External File. There will be a pre-filled default directory, name and type, something like:
C:\Users\Yourname\My SAS files\9.2\males.bmp
If you want to change any of that, to the right is the ubiquitous box with the three dots again. Click on that and a new pop-up
window will allow you to change the folder, file name and type.
Here is our plot and it seems pretty clear that there is a downward trend. The middle line is our regression line, showing that the
prediction is a straight line downward. The two dashed lines are the confidence intervals.
The plot for male competitors worked fine but when we do the same steps to get a plot for female competitors it looks decidedly odd.
There appears to be an upward trend to a point, and then a downward trend. In 1988, women's competition was added to the
Olympics for the first time. I speculate that this may have caused women's competition to swing up, counter to the overall downward
trend, but then after the excitement of having qualified as an Olympic sport faded, they, too, would show a downward trend. Also,
elections are held for a new board of the National Governing Body each Olympic year and they take office the following year.
To test the hypothesis that there was an upward trend followed by a downward trend, we can use PROC REG, the SAS
regression procedure.
ODS GRAPHICS ON ;
This statement turns ODS Statistical Graphics output on. If you have not used ODS Graphics yet, you need to try it. Simply put, SAS
tries to guess what you would most likely want as graphics output and produces it. It's as simple as that.
PROC REG ;
MODEL females = year / STB ;
WHERE year < 2002 ;
The PROC REG statement calls the regression procedure. It will use the most recently created data set which is the temporary
file created above. The MODEL statement gives the dependent variable (females) = the independent (year). The option STB is for
standardized regression coefficient. More information about standardized coefficients can be found in the related paper on Statistics
for Hamsters (De Mars, 2011b). The WHERE statement selects only those records where the year is less than 2002.
The next procedure is identical except for the WHERE statement and produces the same analyses for the years after 2001.
PROC REG ;
MODEL females = year / STB ;
WHERE year > 2001 ;
RUN;
ODS GRAPHICS OFF ;
The statement at the end turns ODS graphics off.
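Relying on the most recently created data set works, but it is easy to get burned if another step slips in between. A slightly more defensive version names the data set explicitly. This is a sketch that assumes the data are in the WORK data set competitors, the one selected in Graph-N-Go earlier.

```sas
/* Sketch only: the first regression with the input data set named.      */
/* Assumption: the data are in WORK.COMPETITORS, as used in Graph-N-Go.  */
ODS GRAPHICS ON ;
PROC REG DATA = competitors ;
   MODEL females = year / STB ;   /* STB requests standardized coefficients */
   WHERE year < 2002 ;            /* early period only                      */
RUN ;
QUIT ;                            /* PROC REG is interactive; QUIT ends it  */
ODS GRAPHICS OFF ;
```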
The REG procedure with ODS graphics produces a lot of output. This is the reason you probably want to turn it off if you don't
specifically need the graphics. In with all of the other charts is the one below that addresses our particular question. On the right side
it gives the R-Square value of .1501. The square root of this, that is the R value, is .39. In other words, we can tell the inquiring minds
that want to know that from 1990 to 2001 there was a correlation of about .40 between year and the number of competitors, which
means the number of competitors was increasing each year.
Examining the output for our second PROC REG, we find this next plot. This plot has an R-square of .63. In other words, the
correlation between year and the number of female competitors is -.80. You don't really need the numbers, though, to see that what we had
here was a somewhat modest upward trend followed by a very steep downward trend in the number of competitors. What to do about
it is the decision of the people in the organization, but the facts are very hard to deny when presented in this manner. The number of
competitors is clearly in decline for both males and females, a trend that has been going on for over a decade for women and much
longer for men.
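With a single predictor, the correlation is the square root of R-square, with the sign taken from the slope of the regression line. If you want to check the arithmetic, a short DATA _NULL_ step will do it. This sketch uses only the two R-square values quoted in the text. The variable names are ours.

```sas
/* Sketch only: recover r from the R-square values reported by PROC REG. */
DATA _NULL_ ;
   r_early = SQRT(0.1501) ;    /* slope is positive before 2002, so r is positive */
   r_late  = -SQRT(0.63) ;     /* slope is negative after 2001, so r is negative  */
   PUT r_early= 6.3 r_late= 6.3 ;
RUN ;
```

The values written to the log should come out at roughly .39 and -.79, in line with the approximate correlations quoted in the text.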
CONCLUSION
One advantage open data has over the data sets used with most textbooks is the potential for analysis of big data sets.
These analyses almost force the programmer to learn more efficient techniques for processing data. While the first example seems an
awful lot of effort to produce a single table, most of this work was re-used over and over throughout our example. The output data set
created in the frequency procedure was used repeatedly, and the TITLE, FOOTNOTE and OPTIONS statements applied to several graphs
and tables. The PATTERN and AXIS statements applied to several charts. The formats created in the PROC FORMAT step were
also used in various output produced for this project. Completing a textbook exercise, one might not see the advantages of going
through all of these extra steps for one chart or table. "We have the numbers from the PROC FREQ, you can just make a table in
Word or PowerPoint and insert those numbers," a new programmer is likely to complain. To make one graph or table, that is probably
true, but when there are multiple tables and graphs to be produced, the time put in up front pays off. Similarly, code that
performs a simple task like creating a new variable or formatting a variable can be reused through a simple macro.
A second advantage of the use of open data is that, given the number of data sets available, these can be used for almost any type of
project, procedure or analysis that the programmer wishes to experience.
The third advantage, as can be seen from our last example, is that even small, simple data sets can lend themselves to a moderately
sophisticated statistical analysis.
A further advantage of the use of open data occurs when analysis is done to assist a particular audience. This in itself is a learning
experience. The days of transom engineering are over. The value of the ability to produce accurate numbers is greatly increased
when paired with the ability to convey information based on those numbers. Presenting national demographics to a class of
seventh-graders or presenting regression analyses to an audience of judo coaches are real challenges that cause the programmer to seek
new and better means of presentation.
Creating and implementing an open data project for a community program provides experience not just in trying different SAS
techniques but also in tailoring the output of those to the needs of the intended audience. Not only does the community organization
served benefit from this technique, but it also increases the marketable skills of the programmer and provides him or her a larger
portfolio to point to of statements, options and procedures with which he or she has professional experience.
REFERENCES
Besler, L. (2007). Communication-effective pie charts. Presentation at the annual meeting of the SAS Users Group International.
www2.sas.com/proceedings/forum2007/134-2007.pdf
De Mars, A. (2011a). SAS Functions for a Better Functioning Community. Paper presented at the annual meeting of Western
Users of SAS Software. San Francisco, CA.
De Mars, A. (2011b). SAS Essentials III: Statistics for Hamsters. Paper presented at the annual meeting of Western Users of
SAS Software. San Francisco, CA.
SAS Institute (1999). SAS Procedures Guide. SAS Institute Inc., Cary, NC.
SAS Institute (2011). SAS/GRAPH(R) 9.2: Reference, Second Edition. SAS Institute Inc., Cary, NC.
ACKNOWLEDGMENTS
Thank you to Kirby Posey of the U.S. Census Bureau for invaluable assistance in verifying the variable coding and estimates.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
AnnMaria De Mars
The Julia Group
2111 7th St. #8
Santa Monica, CA 90405
(310) 717-9089
annmaria@thejuliagroup.com
http://www.thejuliagroup.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the
USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.