SAS FAQ
How can I create tables using proc tabulate?
Proc tabulate is predominantly used to make nice-looking tables. Unlike proc freq, this
procedure can handle multiple variables in the row and column expressions. It can also handle
multiple levels in both rows and columns, whereas proc freq will only create two-variable
contingency tables. This procedure is often used to create tables for publications
because it allows a great deal of manipulation and control over almost every aspect of the
table.
Inputting the dataset ex1 to be used in all the following code
data ex1;
input treat visit ptn score1 score2;
cards;
1 1 1  6.8496  1.3007
1 2 1 14.7009 14.4018
1 3 1  8.9982  2.9965
1 1 2  7.5940  0.1880
1 2 2 14.2160 13.4321
1 3 2 14.6928 14.3855
2 1 3 10.4298  5.8596
2 2 3 10.3169  5.6338
2 3 3  5.4979  4.0041
2 1 4  5.6657  3.6687
2 2 4 13.1932 11.3864
2 3 4 10.2387  5.4774
2 1 1 13.5339 12.0679
2 2 1  5.6718  3.6563
2 3 1 14.5702 14.1405
2 1 2  7.9719  0.9439
2 2 2  7.7261  0.4522
2 3 2 11.8993  8.7986
1 1 3 14.7676 14.5353
1 2 3  7.2651  0.4698
1 3 3 11.8824  8.7647
1 1 4  9.1276  3.2553
1 2 4 10.5855  6.1711
1 3 4  7.8723  0.7445
;
run;
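Before tabulating, it can help to confirm that the data were read as intended. The following is a minimal sketch and not part of the original FAQ; it assumes ex1 contains the five variables treat, visit, ptn, score1 and score2, with one record per patient, treatment and visit.

```sas
/* Print the first six observations to confirm the columns were read in order. */
proc print data=ex1(obs=6);
run;

/* Count the records in each treatment-by-patient cell.            */
/* With three visits per patient and treatment, every cell should  */
/* show a frequency of 3.                                           */
proc freq data=ex1;
  tables treat*ptn / norow nocol nopercent;
run;
```

If any cell count differs from 3, the input lines are probably misaligned.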
Creating a basic table of patients by treatment showing their score of drug A for each treatment
averaged over all visits.
proc tabulate data=ex1;
class treat ptn;
var score1;
table ptn='Patient id',
mean=' '*score1='Drug A, average score over all visits'*treat='Treatment'*F=10.
/ RTS=13.;
run;
-----------------------------------
|           |Drug A, average score|
|           |   over all visits   |
|           |---------------------|
|           |      Treatment      |
|           |---------------------|
|           |    1     |    2     |
|-----------+----------+----------|
|Patient id |          |          |
|-----------|          |          |
|1          |        10|        11|
|-----------+----------+----------|
|2          |        12|         9|
|-----------+----------+----------|
|3          |        11|         9|
|-----------+----------+----------|
|4          |         9|        10|
-----------------------------------
|Patient Id |      |      |      |      |      |      |
|-----------|      |      |      |      |      |      |
|1          |     7|    15|     9|    14|     6|    15|
|-----------+------+------+------+------+------+------|
|2          |     8|    14|    15|     8|     8|    12|
|-----------+------+------+------+------+------+------|
|3          |    15|     7|    12|    10|    10|     5|
|-----------+------+------+------+------+------+------|
|4          |     9|    11|     8|     6|    13|    10|
-------------------------------------------------------
Here is the same table, but using treat=' ' and visit=' ' to eliminate the extra lines. This leaves
only the values of the categories of treat and visit, which can be confusing; hence the use of
formats in the final version of the table.
proc tabulate data=ex1;
class ptn treat visit;
var score1;
table ptn='Patient Id',
sum=' '*score1='Drug A'*treat=' '*visit=' '*F=6. / RTS=13.;
run;
-------------------------------------------------------
|           |                 Drug A                  |
|           |-----------------------------------------|
|           |         1          |         2          |
|           |--------------------+--------------------|
|           |  1   |  2   |  3   |  1   |  2   |  3   |
|-----------+------+------+------+------+------+------|
|Patient Id |      |      |      |      |      |      |
|-----------|      |      |      |      |      |      |
|1          |     7|    15|     9|    14|     6|    15|
|-----------+------+------+------+------+------+------|
|2          |     8|    14|    15|     8|     8|    12|
|-----------+------+------+------+------+------+------|
|3          |    15|     7|    12|    10|    10|     5|
|-----------+------+------+------+------+------+------|
|4          |     9|    11|     8|     6|    13|    10|
-------------------------------------------------------
The final version of this table uses a blank label for the statistic (mean=' ') and for the class
variables (variable_name=' ') in the table statement, together with value labels defined using
proc format.
proc format;
value visit 1='Visit 1' 2='Visit 2' 3='Visit 3';
value tr 1='Therapy 1' 2='Therapy 2';
run;
proc tabulate data=ex1;
class ptn treat visit;
var score1 score2;
table ptn='Patient id',
mean=' '*score1='Drug A'*treat=''*visit=''*F=10./ RTS=13.;
format treat tr. visit visit.;
run;
-------------------------------------------------------------------------------
|           |                             Drug A                              |
|           |-----------------------------------------------------------------|
|           |           Therapy 1            |           Therapy 2            |
|           |--------------------------------+--------------------------------|
|           | Visit 1  | Visit 2  | Visit 3  | Visit 1  | Visit 2  | Visit 3  |
|-----------+----------+----------+----------+----------+----------+----------|
|Patient id |          |          |          |          |          |          |
|-----------|          |          |          |          |          |          |
|1          |         7|        15|         9|        14|         6|        15|
|-----------+----------+----------+----------+----------+----------+----------|
|2          |         8|        14|        15|         8|         8|        12|
|-----------+----------+----------+----------+----------+----------+----------|
|3          |        15|         7|        12|        10|        10|         5|
|-----------+----------+----------+----------+----------+----------+----------|
|4          |         9|        11|         8|         6|        13|        10|
-------------------------------------------------------------------------------
Eliminating the lines separating the rows by using the noseps option in the proc
tabulate statement.
proc tabulate data=ex1 noseps;
class treat ptn;
var score1;
table ptn='Patient id',
mean=''*score1='Drug A, average score over all visits'*treat=''*F=10. / RTS=13.;
format treat tr.;
run;
-----------------------------------
|           |Drug A, average score|
|           |   over all visits   |
|           |---------------------|
|           |Therapy 1 |Therapy 2 |
|-----------+----------+----------|
|Patient id |          |          |
|1          |        10|        11|
|2          |        12|         9|
|3          |        11|         9|
|4          |         9|        10|
-----------------------------------
Creating a table with multiple levels of rows, i.e., separating patients by treatment.
proc tabulate data=ex1;
class ptn treat visit;
var score1;
table treat='Treatment'*ptn='Patient id',
mean='Average Score from all visits'*score1='Drug A'*F=10. / RTS=25.;
format treat tr.;
run;
------------------------------------
|                       | Average  |
|                       |Score from|
|                       |all visits|
|                       |----------|
|                       |  Drug A  |
|-----------------------+----------|
|Treatment  |Patient id |          |
|-----------+-----------|          |
|Therapy 1  |1          |        10|
|           |-----------+----------|
|           |2          |        12|
|           |-----------+----------|
|           |3          |        11|
|           |-----------+----------|
|           |4          |         9|
|-----------+-----------+----------|
|Therapy 2  |1          |        11|
|           |-----------+----------|
|           |2          |         9|
|           |-----------+----------|
|           |3          |         9|
|           |-----------+----------|
|           |4          |        10|
------------------------------------
|-----------|   |   |   |   |   |   |
|1          |  7| 15|  9| 14|  6| 15|
|-----------+---+---+---+---+---+---|
|2          |  8| 14| 15|  8|  8| 12|
|-----------+---+---+---+---+---+---|
|3          | 15|  7| 12| 10| 10|  5|
|-----------+---+---+---+---+---+---|
|4          |  9| 11|  8|  6| 13| 10|
-------------------------------------
The width of the cells for the patient id is controlled by the RTS option in the table statement;
the width of the cells inside the table is controlled by the *F=d. for each column expression (all
the variables listed after the comma in the table statement). In the following table we add
several columns with multiple levels and format the visits as dates.
Note: Since there is only one observation in each cell, the mean is the same as the raw score, and
it doesn't matter which function you specify (mean, sum, etc.). Unless you want a separate line
in the table containing the function name, it is advisable to give the function a blank label,
i.e., by using mean=' ' or sum=' ' as in the program shown.
The available functions include sum, mean, n (which calculates the frequency),
colpctsum, rowpctsum and reppctsum.
proc format;
value vi 1='3/20' 2='8/30' 3='11/03';
run;
proc tabulate data=ex1;
class ptn treat visit;
var score1 score2;
table ptn='Id #',
mean=' '*score1='Drug A'*treat=''*visit=''*F=6.
sum=' '*score2='Drug B'*treat=''*visit=''*F=6. / RTS=6.;
format treat tr. visit vi.;
run;
------------------------------------------------------------------------------------------
|    |                 Drug A                  |                 Drug B                  |
|    |-----------------------------------------+-----------------------------------------|
|    |     Therapy 1      |     Therapy 2      |     Therapy 1      |     Therapy 2      |
|    |--------------------+--------------------+--------------------+--------------------|
|    | 3/20 | 8/30 |11/03 | 3/20 | 8/30 |11/03 | 3/20 | 8/30 |11/03 | 3/20 | 8/30 |11/03 |
|----+------+------+------+------+------+------+------+------+------+------+------+------|
|Id #|      |      |      |      |      |      |      |      |      |      |      |      |
|----|      |      |      |      |      |      |      |      |      |      |      |      |
|1   |     7|    15|     9|    14|     6|    15|     1|    14|     3|    12|     4|    14|
|----+------+------+------+------+------+------+------+------+------+------+------+------|
|2   |     8|    14|    15|     8|     8|    12|     0|    13|    14|     1|     0|     9|
|----+------+------+------+------+------+------+------+------+------+------+------+------|
|3   |    15|     7|    12|    10|    10|     5|    15|     0|     9|     6|     6|     4|
|----+------+------+------+------+------+------+------+------+------+------+------+------|
|4   |     9|    11|     8|     6|    13|    10|     3|     6|     1|     4|    11|     5|
------------------------------------------------------------------------------------------
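The note above mentions statistics other than mean and sum. The following is a minimal sketch, not taken from the original FAQ, of requesting several of them for the same analysis variable. It assumes the ex1 dataset defined earlier; the labels 'N', 'Total' and '% of column total' are illustrative choices.

```sas
/* Sketch: three statistics for score1 within each treatment column.    */
/*   n         - number of nonmissing score1 values in the cell         */
/*   sum       - total of score1 in the cell                            */
/*   colpctsum - the cell's sum as a percentage of its column's total   */
proc tabulate data=ex1;
  class ptn treat;
  var score1;
  table ptn='Patient id',
        treat='Treatment'*score1='Drug A'*
          (n='N'*F=6. sum='Total'*F=8.1 colpctsum='% of column total'*F=8.1)
        / RTS=13;
run;
```

Because each patient has three visits under each treatment, n should be 3 in every cell, and the colpctsum values within each treatment should add up to 100.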
Creating a table with multiple variables in the row expression (variables listed before the comma
in the table statement).
proc tabulate data=ex1;
class ptn treat visit;
var score1 score2;
table treat='Treatment' visit='Visit',
mean=' '*score1='Drug A'*ptn='Patient Id';
format treat tr. visit vi.;
run;
----------------------------------------------------------------------------
|                      |                      Drug A                       |
|                      |---------------------------------------------------|
|                      |                    Patient Id                     |
|                      |---------------------------------------------------|
|                      |     1      |     2      |     3      |     4      |
|----------------------+------------+------------+------------+------------|
|Treatment             |            |            |            |            |
|----------------------|            |            |            |            |
|Therapy 1             |       10.18|       12.17|       11.31|        9.20|
|----------------------+------------+------------+------------+------------|
|Therapy 2             |       11.26|        9.20|        8.75|        9.70|
|----------------------+------------+------------+------------+------------|
|Visit                 |            |            |            |            |
|----------------------|            |            |            |            |
|3/20                  |       10.19|        7.78|       12.60|        7.40|
|----------------------+------------+------------+------------+------------|
|8/30                  |       10.19|       10.97|        8.79|       11.89|
|----------------------+------------+------------+------------+------------|
|11/03                 |       11.78|       13.30|        8.69|        9.06|
----------------------------------------------------------------------------
Creating a table with missing values labeled 'Missing' by using the misstext option in
the table statement.
data miss;
set ex1;
if ptn=4 then score1=. ;
run;
proc tabulate data=miss;
class ptn visit;
var score1;
table ptn='Patient Id',