
Proc Transpose A Simple Tutorial

By Charles Patridge The Hartford 860-547-6644 Charles_S_Patridge@prodigy.net http://www.sconsig.com

A Simple Transposition The Input Data Set 1


OBS #   Tester1   Tester2   Tester3   Tester4
  1        22        25        21        21
  2        15        19        18        17
  3        17        19        19        19
  4        20        19        16        19
  5        14        15        13        13
  6        15        17        18        19
  7        10        11         9        10
  8        22        24        23        21

Columns become Rows & Rows become Columns

The Output Data Set 2


_NAME_    COL1   COL2   COL3   COL4   COL5   COL6   COL7   COL8
Tester1     22     15     17     20     14     15     10     22
Tester2     25     19     19     19     15     17     11     24
Tester3     21     18     19     16     13     18      9     23
Tester4     21     17     19     19     13     19     10     21
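The slides show the input and output of this first transposition but not the code. A minimal sketch that should reproduce it follows; the data set names testers and transposed are assumptions made for illustration, not names from the original slides.

```sas
/* Hypothetical data set name: testers (values taken from the input table above) */
data testers;
   input Tester1 Tester2 Tester3 Tester4;
   datalines;
22 25 21 21
15 19 18 17
17 19 19 19
20 19 16 19
14 15 13 13
15 17 18 19
10 11 9 10
22 24 23 21
;
run;

/* With no ID statement, the transposed variables are named COL1, COL2, ... */
/* and _NAME_ holds the name of each original variable.                    */
proc transpose data=testers out=transposed;
   var Tester1-Tester4;
run;

proc print data=transposed;
run;
```

The VAR statement is optional here, because PROC TRANSPOSE transposes all numeric variables by default; listing them makes the intent explicit.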

PROC TRANSPOSE <DATA=input-data-set> <LABEL=label> <LET> <NAME=name>
               <OUT=output-data-set> <PREFIX=prefix>;
   BY <DESCENDING> variable-1 <...<DESCENDING> variable-n> <NOTSORTED>;
   COPY variable(s);
   ID variable;
   IDLABEL variable;
   VAR variable(s);

Options

DATA= input-data-set
   names the SAS data set to transpose.
   Default: most recently created SAS data set

LABEL= label
   specifies a name for the variable in the output data set that contains the label of the variable that is being transposed to create the current observation.
   Default: _LABEL_

LET
   allows duplicate values of an ID variable. PROC TRANSPOSE transposes the observation containing the last occurrence of a particular ID value within the data set or BY group.

NAME= name
   specifies the name for the variable in the output data set that contains the name of the variable being transposed to create the current observation.
   Default: _NAME_

Options - continued

OUT= output-data-set
   names the output data set. If output-data-set does not exist, PROC TRANSPOSE creates it using the DATAn naming convention.
   Default: DATAn
   Note: If a BY group in the input data set has more observations than other BY groups, PROC TRANSPOSE assigns missing values in the output data set to the variables that have no corresponding input observations.

PREFIX= prefix
   specifies a prefix to use in constructing names for transposed variables in the output data set. For example, if PREFIX=VAR, the names of the variables are VAR1, VAR2, . . . , VARn.

   Interaction: when you use PREFIX= with an ID statement, the prefix is prepended to the ID value to form the variable name.
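To make the LET option and the PREFIX=/ID interaction concrete, here is a small sketch. The data set scores, its variables, and the output name wide are hypothetical and serve only as an illustration.

```sas
/* Hypothetical data: student A has two observations for test 1 */
data scores;
   input student $ test score;
   datalines;
A 1 90
A 1 95
A 2 85
B 1 70
B 2 75
;
run;

/* PREFIX=test plus ID test should give variables test1 and test2.      */
/* LET allows the duplicate ID value; per the option description above, */
/* the last occurrence (95) is the one kept for student A, test 1.      */
proc transpose data=scores out=wide (drop=_name_) prefix=test let;
   by student;
   id test;
   var score;
run;

proc print data=wide;
run;
```

Without LET, the duplicate ID value within a BY group causes PROC TRANSPOSE to stop with an error, so LET is best used only when keeping the last occurrence is really what you want.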

Proc Transpose A simple example


proc format;
   value mymths
      1="JAN"  2="FEB"  3="MAR"  4="APR"
      5="MAY"  6="JUN"  7="JUL"  8="AUG"
      9="SEP" 10="OCT" 11="NOV" 12="DEC";
run;

data rawdata;
   infile cards missover;
   input @1 gender $1. @3 date date9. amount;
   year  = year(date);
   month = month(date);
cards;
F 01Feb2003 123
M 01Mar2003 57
F 01Mar2003 121
M 01Apr2003 63
;;;;
run;

Proc Transpose A simple example - continued

proc sort data=rawdata out=rawdata;
   by gender year;
run;

proc transpose data=rawdata out=tranpose (drop=_name_);
   by gender year;
   id month;
   format month mymths3.;
   var amount;
run;

proc print;
run;

Proc Transpose A simple example


Obs   gender   year   FEB   MAR   APR
  1     F      2003   123   121     .
  2     M      2003     .    57    63

RAWDATA

F 01Feb2003 123
M 01Mar2003 57
F 01Mar2003 121
M 01Apr2003 63

Proc Transpose A simple example - continued


proc transpose data=tranpose out=rawout;
   by gender year;
   var feb mar apr;
run;

data rawout;
   set rawout;
   tmpdate = '01'||trim(_name_)||put(year,4.);
   date = input(tmpdate,date9.);
   amount = col1;
   if amount = . then amount = 0;
   drop tmpdate _name_ col1;
   format date date9.;
run;

proc print data=rawout;
run;

Proc Transpose A simple example


Obs   gender   year   date        amount
  1     F      2003   01FEB2003     123
  2     F      2003   01MAR2003     121
  3     F      2003   01APR2003       0
  4     M      2003   01FEB2003       0
  5     M      2003   01MAR2003      57
  6     M      2003   01APR2003      63

Proc Transpose can create records for missing observations, BUT only within the range of the existing end points!

Proc Transpose More Uses


/*** What if you need to have a full matrix of data? ***/
data template (keep=gender date amount year);
   length gender $ 1.;
   format date date9.;
   tmpdate = '01dec2002'd;
   amount = 0;
   do g = 1 to 2;
      do d = 1 to 12;
         if g = 1 then gender = 'F';
         if g = 2 then gender = 'M';
         date = intnx('month', tmpdate, d);
         output;
      end;
   end;
run;

proc sort data=rawdata out=rawdata;
   by gender date;
run;

proc sort data=template out=template;
   by gender date;
run;

/*** merge template with raw data to create a full matrix of data ***/
data rawdataa;
   merge template (in=t) rawdata (in=r);
   by gender date;
   year  = year(date);
   month = month(date);
run;

/*** now you would like to take vertical data to make horizontal data ***/
proc transpose data=rawdataa out=tranpose;
   by gender year;
   id month;
   format month mymths3.;
   var amount;
run;

/*** print to see what data now looks like ***/
proc print data=tranpose;
run;

Obs   gender   year   _NAME_   JAN   FEB   MAR   APR   MAY   JUN   JUL   AUG   SEP   ...
  1     F      2003   amount     0   123   121     0     0     0     0     0     0   ...
  2     M      2003   amount     0     0    57    63     0     0     0     0     0   ...

You can see we have created buckets for all the months of the year, JAN through DEC. Now, let's do some simple math for percentages, YTD numbers, and YTD percentages.

Continued from prior data step


data rawdatab;
   set tranpose;
   array mth(*)  jan feb mar apr may jun jul aug sep oct nov dec;
   array pct(*)  janpct febpct marpct aprpct maypct junpct
                 julpct augpct seppct octpct novpct decpct;
   array ytd(*)  janytd febytd marytd aprytd mayytd junytd
                 julytd augytd sepytd octytd novytd decytd;
   array ytdp(*) janytp febytp marytp aprytp mayytp junytp
                 julytp augytp sepytp octytp novytp decytp;

   /*** get yearly totals ***/
   total = sum(of jan--dec);

   /*** get monthly percentages to yearly totals ***/
   do i = 1 to 12;
      pct(i) = mth(i) / total;
   end;

   /*** get ytd totals ***/
   do i = 1 to 12;
      if i = 1 then ytd(i) = mth(i);
      else ytd(i) = mth(i) + ytd(i-1);
   end;

   /*** get ytd percentages ***/
   do i = 1 to 12;
      ytdp(i) = ytd(i) / total;
   end;

   drop i;
run;

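/*** convert horizontal data to vertical data ***/
/*** COL1 is renamed to amt to match the printed output and the data step that follow ***/
proc transpose data=rawdatab out=tranposeb (rename=(col1=amt));
   by gender year;
   var _numeric_;
run;

proc print data=tranposeb;
run;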

Output

Obs   gender   year   _NAME_       amt
  1     F      2003   year     2003.00
  2     F      2003   JAN         0.00
  3     F      2003   FEB       123.00
  4     F      2003   MAR       121.00
  5     F      2003   APR         0.00
  6     F      2003   MAY         0.00
  7     F      2003   JUN         0.00
  8     F      2003   JUL         0.00
  9     F      2003   AUG         0.00
 10     F      2003   SEP         0.00
 11     F      2003   OCT         0.00
 12     F      2003   NOV         0.00
 13     F      2003   DEC         0.00
 14     F      2003   janpct      0.00
 15     F      2003   febpct      0.50
 16     F      2003   marpct      0.50
 17     F      2003   aprpct      0.00
 18     F      2003   maypct      0.00
 19     F      2003   junpct      0.00
 20     F      2003   julpct      0.00
 21     F      2003   augpct      0.00
 22     F      2003   seppct      0.00
 23     F      2003   octpct      0.00
 24     F      2003   novpct      0.00
 25     F      2003   decpct      0.00
 26     F      2003   janytd      0.00
 27     F      2003   febytd    123.00
 28     F      2003   marytd    244.00
 29     F      2003   aprytd    244.00
 30     F      2003   mayytd    244.00
 31     F      2003   junytd    244.00
 32     F      2003   julytd    244.00
 33     F      2003   augytd    244.00
 34     F      2003   sepytd    244.00
 35     F      2003   octytd    244.00
 36     F      2003   novytd    244.00
 37     F      2003   decytd    244.00
 38     F      2003   janytp      0.00
 39     F      2003   febytp      0.50
 40     F      2003   marytp      1.00
 41     F      2003   aprytp      1.00
 42     F      2003   mayytp      1.00
 43     F      2003   junytp      1.00
 44     F      2003   julytp      1.00
 45     F      2003   augytp      1.00
 46     F      2003   sepytp      1.00
 47     F      2003   octytp      1.00
 48     F      2003   novytp      1.00
 49     F      2003   decytp      1.00
 50     F      2003   total     244.00

Let's Process This Last Transposed Data Set


data rawdatac;
   set tranposeb;
   length month $ 3.;

   _name_ = upcase(_name_);                          /*** make contents upper case ***/

   if _name_ = "YEAR"  then delete;                  /*** do not need this ***/
   if _name_ = "TOTAL" then delete;                  /*** do not need this ***/

   month = substr(_name_,1,3);                       /*** get name of month ***/

   if '   ' = substr(_name_,4,3) then mthly = amt;   /*** get monthly amts ***/
   if 'YTD' = substr(_name_,4,3) then YTD   = amt;   /*** get YTD amts ***/
   if 'PCT' = substr(_name_,4,3) then PCT   = amt;   /*** get PCT amts ***/
   if 'YTP' = substr(_name_,4,3) then YTDP  = amt;   /*** get YTD PCT amts ***/

   /*** convert date field to sas date field ***/
   tmpdate = '01' || month || put(year,4.);
   date = input(tmpdate, date9.);
run;

/*** summarize over the new variables created ***/
proc summary data=rawdatac nway missing;
   classes gender date;
   var mthly pct ytd ytdp;
   output out=summary (drop=_type_ _freq_) sum=;
run;

proc print;
   format date date7.;
run;

Final Output
Obs   gender   date      mthly   PCT       YTD   YTDP
  1     F      01JAN03       0   0.00000     0   0.00000
  2     F      01FEB03     123   0.50410   123   0.50410
  3     F      01MAR03     121   0.49590   244   1.00000
  4     F      01APR03       0   0.00000   244   1.00000
  5     F      01MAY03       0   0.00000   244   1.00000
  6     F      01JUN03       0   0.00000   244   1.00000
  7     F      01JUL03       0   0.00000   244   1.00000
  8     F      01AUG03       0   0.00000   244   1.00000
  9     F      01SEP03       0   0.00000   244   1.00000
 10     F      01OCT03       0   0.00000   244   1.00000
 11     F      01NOV03       0   0.00000   244   1.00000
 12     F      01DEC03       0   0.00000   244   1.00000
 13     M      01JAN03       0   0.00000     0   0.00000
 14     M      01FEB03       0   0.00000     0   0.00000
 15     M      01MAR03      57   0.47500    57   0.47500

Proc Transpose A Simple Tutorial The End


