PROC TRANSPOSE <DATA=input-data-set> <LABEL=label> <LET> <NAME=name> <OUT=output-data-set> <PREFIX=prefix>;
   BY <DESCENDING> variable-1 <...<DESCENDING> variable-n> <NOTSORTED>;
   COPY variable(s);
   ID variable;
   IDLABEL variable;
   VAR variable(s);
Options
DATA= input-data-set
   names the SAS data set to transpose.
   Default: most recently created SAS data set
LABEL= label
   specifies a name for the variable in the output data set that contains the label of the variable that is being transposed to create the current observation.
   Default: _LABEL_
LET
   allows duplicate values of an ID variable. PROC TRANSPOSE transposes the observation containing the last occurrence of a particular ID value within the data set or BY group.
NAME= name
   specifies the name for the variable in the output data set that contains the name of the variable being transposed to create the current observation.
   Default: _NAME_
Options - continued
OUT= output-data-set
   names the output data set. If output-data-set does not exist, PROC TRANSPOSE creates it using the DATAn naming convention.
   Default: DATAn
   Note: If a BY group in the input data set has more observations than other BY groups, PROC TRANSPOSE assigns missing values in the output data set to the variables that have no corresponding input observations.
PREFIX= prefix
   specifies a prefix to use in constructing names for transposed variables in the output data set. For example, if PREFIX=VAR, the names of the variables are VAR1, VAR2, ..., VARn.
   Interaction: when you use PREFIX= with an ID statement, the prefix is prepended to each ID value to form the variable name.
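The PREFIX= and ID interaction can be illustrated with a minimal sketch. The data set SCORES, its variables, and the prefix T are hypothetical names invented for this example; they do not come from the slides.

```sas
/* Hypothetical input: one row per student per test */
data scores;
   input student $ test score;
   datalines;
Ann 1 90
Ann 2 85
Bob 1 70
Bob 2 75
;
run;

proc sort data=scores;
   by student;
run;

/* PREFIX=T is prepended to each ID value, so test=1 becomes T1 and test=2 becomes T2 */
proc transpose data=scores out=wide (drop=_name_) prefix=T;
   by student;
   id test;
   var score;
run;

proc print data=wide;
run;
```

Without PREFIX=, a numeric ID value such as 1 is not a valid SAS name on its own, so PROC TRANSPOSE would name the variable _1 instead.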
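data rawdata;
   infile cards missover;
   input @1 gender $1. @3 date date9. amount;
   year  = year(date);    /*** BY variable for the transpose step ***/
   month = month(date);   /*** ID variable for the transpose step ***/
run;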
proc transpose data=rawdata out=tranpose (drop=_name_);
   by gender year;
   id month;
   format month mymths3.;
   var amount;
run;
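A self-contained sketch of this step follows. Two parts are assumptions, because the slides do not show them: the definition of the MYMTHS format (taken here to map month numbers to three-letter names) and the data lines (inferred from the output printed on the following slides).

```sas
/* Assumed definition of the user format MYMTHS; not shown in the slides */
proc format;
   value mymths 1='JAN' 2='FEB' 3='MAR' 4='APR'  5='MAY'  6='JUN'
                7='JUL' 8='AUG' 9='SEP' 10='OCT' 11='NOV' 12='DEC';
run;

data rawdata;
   infile cards missover;
   input @1 gender $1. @3 date date9. amount;
   year  = year(date);
   month = month(date);
   format date date9.;
cards;
F 01FEB2003 123
F 01MAR2003 121
M 01MAR2003 57
M 01APR2003 63
;
run;

/* BY processing needs the input sorted by the BY variables */
proc sort data=rawdata;
   by gender year date;
run;

/* The formatted ID value (FEB, MAR, APR) becomes the new variable name */
proc transpose data=rawdata out=tranpose (drop=_name_);
   by gender year;
   id month;
   format month mymths3.;
   var amount;
run;

proc print data=tranpose;
run;
```

Months that a BY group does not contain come out as missing values, which is why F has no APR amount and M has no FEB amount.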
Obs  gender  year  FEB  MAR  APR
  1    F     2003  123  121    .
  2    M     2003    .   57   63
RAWDATA
run;
   date = input(tmpdate, date9.);
   amount = col1;
   if amount = . then amount = 0;
   drop tmpdate _name_ col1;
   format date date9.;
run;
Obs  gender  year       date  amount
  1    F     2003  01FEB2003     123
  2    F     2003  01MAR2003     121
  3    F     2003  01APR2003       0
  4    M     2003  01FEB2003       0
  5    M     2003  01MAR2003      57
  6    M     2003  01APR2003      63
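PROC TRANSPOSE can create records for missing observations, BUT only within the range of the existing end points! Here that means FEB through APR; no records appear for JAN or for MAY through DEC.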
proc sort data=rawdata out=rawdata;
   by gender date;
run;

proc sort data=template out=template;
   by gender date;
run;

/*** merge template with raw data to create a full matrix of data ***/
data rawdataa;
   merge template (in=t) rawdata (in=r);
   by gender date;
   year  = year(date);
   month = month(date);
run;
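The slides do not show how TEMPLATE is built. One possible construction, offered only as a sketch, generates one row per gender for each month of 2003 and then merges it with the raw data. The gender list, the fixed year, and the zero-fill of missing amounts are assumptions chosen to match the output on the later slides.

```sas
/* Raw data as inferred from the output on the earlier slides */
data rawdata;
   infile cards missover;
   input @1 gender $1. @3 date date9. amount;
   format date date9.;
cards;
F 01FEB2003 123
F 01MAR2003 121
M 01MAR2003 57
M 01APR2003 63
;
run;

/* Hypothetical template: every gender x month combination for 2003 */
data template;
   length gender $1;
   do gender = 'F', 'M';
      do m = 1 to 12;
         date = mdy(m, 1, 2003);
         output;
      end;
   end;
   drop m;
   format date date9.;
run;

proc sort data=rawdata;  by gender date; run;
proc sort data=template; by gender date; run;

/* Template rows with no matching raw row get amount = 0 (assumed) */
data rawdataa;
   merge template (in=t) rawdata (in=r);
   by gender date;
   if t;
   if amount = . then amount = 0;
   year  = year(date);
   month = month(date);
run;
```

Because the template supplies the end points (JAN and DEC), the later transpose produces a column for every month, not just the months present in the raw data.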
/*** now you would like to take vertical data to make horizontal data ***/
proc transpose data=rawdataa out=tranpose;
   by gender year;
   id month;
   format month mymths3.;
   var amount;
run;

/*** print to see what data now looks like ***/
proc print data=tranpose;
run;
Obs  gender  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  ...
  1    F       0  123  121    0    0    0    0    0    0  ...
  2    M       0    0   57   63    0    0    0    0    0  ...
You can see that we have created buckets for every month of the year, JAN through DEC. Now let's do some simple math for percentages, YTD amounts, and YTD percentages.
apr oct
   array pct(*)  janpct febpct marpct aprpct maypct junpct julpct augpct seppct octpct novpct decpct;
   array ytd(*)  janytd febytd marytd aprytd mayytd junytd julytd augytd sepytd octytd novytd decytd;
   array ytdp(*) janytp febytp marytp aprytp mayytp junytp julytp augytp sepytp octytp novytp decytp;
   /*** get yearly totals ***/
   total = sum(of jan--dec);
   /*** get monthly percentages to yearly totals ***/
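The loop that fills these arrays is not shown on the slides. A minimal sketch of how it might look follows. The array name MTH, the helper variables RUNNING and I, and the zero-total guard are assumptions; the single input row is a hypothetical stand-in shaped like the F row of TRANPOSE.

```sas
/* One hypothetical input row, shaped like the F row of TRANPOSE */
data tranpose;
   gender = 'F'; year = 2003;
   JAN = 0; FEB = 123; MAR = 121; APR = 0; MAY = 0; JUN = 0;
   JUL = 0; AUG = 0; SEP = 0; OCT = 0; NOV = 0; DEC = 0;
run;

data rawdatab;
   set tranpose;
   array mth(*)  jan feb mar apr may jun jul aug sep oct nov dec;  /* assumed name */
   array pct(*)  janpct febpct marpct aprpct maypct junpct julpct augpct seppct octpct novpct decpct;
   array ytd(*)  janytd febytd marytd aprytd mayytd junytd julytd augytd sepytd octytd novytd decytd;
   array ytdp(*) janytp febytp marytp aprytp mayytp junytp julytp augytp sepytp octytp novytp decytp;

   /* yearly total across the twelve monthly buckets */
   total = sum(of jan--dec);

   running = 0;
   do i = 1 to dim(mth);
      running = sum(running, mth(i));    /* year-to-date amount */
      ytd(i)  = running;
      if total > 0 then do;
         pct(i)  = mth(i) / total;       /* monthly share of the year */
         ytdp(i) = running / total;      /* year-to-date share of the year */
      end;
      else do;                           /* guard against division by zero */
         pct(i)  = 0;
         ytdp(i) = 0;
      end;
   end;
   drop i running;
run;
```

For this row the total is 244, so FEB gets 123/244 as both its monthly and year-to-date percentage, and every month from MAR onward has a year-to-date amount of 244.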
/*** convert horizontal data to vertical data ***/
proc transpose data=rawdatab out=tranposeb;
   by gender year;
   var _numeric_;
run;

proc print data=tranposeb;
run;
Continued output
Obs  gender  year  _NAME_      amt
  1    F     2003  year    2003.00
  2    F     2003  JAN        0.00
  3    F     2003  FEB      123.00
  4    F     2003  MAR      121.00
  5    F     2003  APR        0.00
  6    F     2003  MAY        0.00
  7    F     2003  JUN        0.00
  8    F     2003  JUL        0.00
  9    F     2003  AUG        0.00
 10    F     2003  SEP        0.00
 11    F     2003  OCT        0.00
 12    F     2003  NOV        0.00
 13    F     2003  DEC        0.00
 14    F     2003  janpct     0.00
 15    F     2003  febpct     0.50
 16    F     2003  marpct     0.50
 17    F     2003  aprpct     0.00
 18    F     2003  maypct     0.00
 19    F     2003  junpct     0.00
 20    F     2003  julpct     0.00
 21    F     2003  augpct     0.00
 22    F     2003  seppct     0.00
 23    F     2003  octpct     0.00
 24    F     2003  novpct     0.00
 25    F     2003  decpct     0.00
 26    F     2003  janytd     0.00
 27    F     2003  febytd   123.00
 28    F     2003  marytd   244.00
 29    F     2003  aprytd   244.00
 30    F     2003  mayytd   244.00
 31    F     2003  junytd   244.00
 32    F     2003  julytd   244.00
 33    F     2003  augytd   244.00
 34    F     2003  sepytd   244.00
 35    F     2003  octytd   244.00
 36    F     2003  novytd   244.00
 37    F     2003  decytd   244.00
 38    F     2003  janytp     0.00
 39    F     2003  febytp     0.50
 40    F     2003  marytp     1.00
 41    F     2003  aprytp     1.00
 42    F     2003  mayytp     1.00
 43    F     2003  junytp     1.00
 44    F     2003  julytp     1.00
 45    F     2003  augytp     1.00
 46    F     2003  sepytp     1.00
 47    F     2003  octytp     1.00
 48    F     2003  novytp     1.00
 49    F     2003  decytp     1.00
 50    F     2003  total    244.00
if ' '
   /*** _NAME_ holds the lowercase variable names, so upcase before comparing ***/
   if 'YTD' = upcase(substr(_name_,4,3)) then YTD  = amt; /*** get YTD amts ***/
   if 'PCT' = upcase(substr(_name_,4,3)) then PCT  = amt; /*** get PCT amts ***/
   if 'YTP' = upcase(substr(_name_,4,3)) then YTDP = amt; /*** get YTD PCT amts ***/
/*** summarize over the new variables created ***/
proc summary data=rawdatac nway missing;
run;
run;
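The classification and summary steps are only partly visible on the slides. The following sketch shows one way they could fit together. The SUFFIX helper variable, the date reconstruction from _NAME_, the CLASS and VAR statements, the output data set name FINAL, and the four input rows are all assumptions made for illustration. Rows for the year and total variables are left out, since their names do not start with a month abbreviation and would need to be dropped first.

```sas
/* A few hypothetical rows in the shape of TRANPOSEB (FEB only) */
data tranposeb;
   length gender $1 _name_ $8;
   input gender year _name_ amt;
   datalines;
F 2003 FEB 123
F 2003 febpct 0.5041
F 2003 febytd 123
F 2003 febytp 0.5041
;
run;

data rawdatac;
   set tranposeb;
   /* characters 1-3 of _NAME_ are the month, characters 4-6 the measure */
   suffix = upcase(substr(_name_, 4, 3));
   date   = input(cats('01', substr(_name_, 1, 3), put(year, 4.)), date9.);
   if suffix = ' '   then mthly = amt;   /* plain month name: monthly amount */
   if suffix = 'PCT' then pct   = amt;
   if suffix = 'YTD' then ytd   = amt;
   if suffix = 'YTP' then ytdp  = amt;
   format date date7.;
   keep gender date mthly pct ytd ytdp;
run;

/* Collapse the four rows per gender and date into one */
proc summary data=rawdatac nway missing;
   class gender date;
   var mthly pct ytd ytdp;
   output out=final (drop=_type_ _freq_) sum=;
run;

proc print data=final;
run;
```

Each measure is non-missing on exactly one row per gender and date, so the SUM statistic simply carries that value onto the collapsed row.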
Final Output
Obs  gender     date  mthly      PCT  YTD     YTDP
  1    F     01JAN03      0  0.00000    0  0.00000
  2    F     01FEB03    123  0.50410  123  0.50410
  3    F     01MAR03    121  0.49590  244  1.00000
  4    F     01APR03      0  0.00000  244  1.00000
  5    F     01MAY03      0  0.00000  244  1.00000
  6    F     01JUN03      0  0.00000  244  1.00000
  7    F     01JUL03      0  0.00000  244  1.00000
  8    F     01AUG03      0  0.00000  244  1.00000
  9    F     01SEP03      0  0.00000  244  1.00000
 10    F     01OCT03      0  0.00000  244  1.00000
 11    F     01NOV03      0  0.00000  244  1.00000
 12    F     01DEC03      0  0.00000  244  1.00000
 13    M     01JAN03      0  0.00000    0  0.00000
 14    M     01FEB03      0  0.00000    0  0.00000
 15    M     01MAR03     57  0.47500   57  0.47500