
Introduction to SAS

Learning SAS:
1. SAS is the most widely used statistical software package in the world. Therefore,
it is a good idea for you to learn it.
2. There is a version of SAS on the PCs in each of Carnegie Mellon's clusters.
3. The best way to learn to use SAS is to do the following:
   a. Go to a computer which has SAS for Windows, start the program, look under
   the "Help" menu for "Online Training" and go carefully through the tutorial.
   This will take several hours, so you should budget your time accordingly.
   b. Look carefully at all of the examples we do in class; I include SAS
   programs with them.
   c. There is a help feature on this version of SAS. Select "Extended Help" from
   the Help menu. Then select the tab labeled Contents. Select the book labeled
   "SAS system help main menu." From there, you can get help on a wide
   variety of things. You can find descriptions of the procedures we will use in
   this course under "modeling and analysis tools" and then "data analysis" or
   under "modeling and analysis tools" and then "econometrics and time series."
Using SAS:
1. SAS is a programming language, like FORTRAN or C.
2. The way you will generally use SAS is:
   i. Write a SAS program
   ii. Submit the SAS program
   iii. Look at the results SAS generates, fix any "bugs", and submit again
3. On the PCs in the computer clusters, you write the program in the "Program
Editor" window. You submit the program by pushing the submit button (looks like
a guy running), on the toolbar just below the menu bar near the top of the window.
You examine the results in the LOG window (where SAS tells you about any
errors which occurred) and in the OUTPUT window (where SAS tells you the
results of your program).
Your data file:
1. Go to the web site for the course. Navigate to the Datasets page. There you will
see some datasets listed. Each dataset has a link for "zip file." (A "zip" file is one that
was compressed using one of the programs WinZip or PkZip.) This is where the
data are. Click on the macro link under data file. The internet browser should ask
you whether you want to save or open the file. Open it. Now you should be in the
program WinZip. Tell WinZip to extract (uncompress) the data in macro by
clicking on extract and then navigating to c:\temp, the temp folder on the hard
drive. WinZip will place the file macro.txt in the temp folder on the hard drive. This file
contains macroeconomic data for the United States from the first quarter of 1958
through the second quarter of 1992. Each line in the data is one quarter's worth of
data. The columns in the dataset are (in order left to right) year, quarter, GDP in
billions of $, consumption expenditures in billions of $, investment expenditure in
billions of $, net exports (exports - imports) in billions of $, government spending
in billions of $, and the GNP deflator (a price index) equal to 100 in 1987. Please
take a look at the file, either with Microsoft Word or a text editor (just
double-click on the file). (Be sure to use Courier font or Courier New to look at the data.)
Notice the dots for GDP in the early observations. "." is SAS's way of saying
"missing"; missing means that we don't know the value of the variable for that
particular observation.
2. Now go and take a look under the Codebook and Description links at the web site.
You will see the same descriptive information as in 1. Typically, you will look on
the website for descriptive information about the data.
3. In the discussion to follow, we will go along through the program attached to this
handout labeled "SAS EXAMPLE #1."
Parts of a SAS program:
1. Initial few lines
In the first few lines, we tell SAS where to find the files which contain the data we
are going to use for our analysis. For example, the first line of a program might
be:
    filename macroraw 'c:\temp\macro.txt';
This tells SAS that there is a file called macro.txt in the temp directory on the c
drive of the computer and that we are assigning a "nickname" macroraw to it, so
that, later on, we can just call the file macroraw instead of 'c:\temp\macro.txt'.
Notice that the command ends with a ";". All SAS commands end with a ";".
Leaving the ";" off is the most common error in programming SAS, and SAS is
very stupid about catching this mistake. When you get goofy error messages (in
the LOG window) always check for a missing ";" first.
2. Data step
The data step begins with a line like:
    data macro;
This tells SAS to create a dataset called macro. The next few lines will tell SAS
where to go to get the data that goes in the data step. Often the next line will be:
    infile macroraw;
This line tells SAS that it is going to get the data from a file called macroraw
(which we already told it where to find in 1. above). Now SAS knows the data
are in macroraw, but we still must tell it what the variables are and how they are
arranged in the file:
    input year 1-4 quarter 10
          Y 14-20 C 24-30 I 34-40
          XM 44-50 G 54-60 P 64-68;
This line tells SAS that the variables in the data file are named year, quarter, Y, C,
I, XM, G, P. It also tells SAS that the values for the variable year are stored in
columns 1-4 in the file macroraw, that the values for the variable quarter are
stored in column 10 in the file called macroraw, . . . Notice, this command takes
several lines; that is OK. SAS does not look for carriage returns at the end of
commands, it looks for ";". You can stretch a command over as many lines as you
want, as long as it ends eventually with a ";".
After these two lines, SAS reads in all the data from the file (assuming that we
have accurately told it which columns contain which variable . . . ).
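The column layout can be tried without the course data file. The following is a minimal sketch and not part of the handout's programs: the dataset name colinput_demo and the three data rows are made up for illustration, and only the first three variables are read. It uses a datalines block in place of the infile statement, and the first row has a "." in the Y columns so you can see it read as a missing value.

```sas
/* Hypothetical demo: made-up rows laid out in the same fixed columns. */
data colinput_demo;
  input year 1-4 quarter 10
        Y 14-20;              /* one statement spread over two lines */
datalines;
1958     2         .
1958     3     490.1
1958     4     502.3
;
run;

/* Print the three observations to check how each column was read. */
proc print data=colinput_demo;
run;
```

If a number shows up in the wrong variable or is cut short, the column ranges in the input statement are the first thing to check.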
Often, we want to modify the variables in a dataset or create new variables. The
data we start out with here are "nominal" GDP, consumption, etc. and we would
like to use "real" GDP, consumption, etc. (That is, we want to put everything
into 1987 $.) To create new variables in a data step, just write the new variable
name, an equals sign, and an expression to make the new variable. To make real
GDP out of nominal GDP, we want to multiply by 100 and divide by the GDP
deflator (why?). The line that creates the variable real GDP is:
    RealY = 100*Y/P;
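One way to see the reason (a sketch; the subscript t for the quarter is notation introduced here, not the handout's): the deflator P equals 100 in the base year 1987, so P/100 is the price level relative to 1987. Dividing nominal GDP by that ratio restates it in 1987 dollars:

```latex
\[
\text{RealY}_t \;=\; \frac{Y_t}{P_t/100} \;=\; \frac{100\,Y_t}{P_t}
\]
```

In 1987 itself P = 100, so real and nominal GDP coincide.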
There are several more similar lines in the program. The next line of the data step
is:
    inf = 100*(P-lag4(P))/lag4(P);
This line calculates the inflation rate. P, recall, is the price index. Remember that
inflation is defined to be the % change in a price index in a year. So,
inflation = 100*(price this quarter - price 1 year ago)/(price one year ago). Since
our data are in quarters, price one year ago is price 4 quarters ago (or price 4
observations ago). SAS interprets the term lag4(P) as "P 4 observations ago"
(similarly lag1(P) would mean "P one observation ago" and lag8(P) would mean
"P 8 observations (2 years) ago"). Make sure you understand this. Similarly, if
we wanted to calculate the growth rate of GDP over the past year:
    growth = 100*(RealY-lag4(RealY))/lag4(RealY);
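To watch lag4 at work, here is a minimal sketch on six made-up index values. The dataset name lag_demo, the variable P_year_ago, and the numbers are hypothetical and are not taken from macro.txt.

```sas
/* Hypothetical demo: six made-up quarterly values of a price index. */
data lag_demo;
  input P;
  P_year_ago = lag4(P);                /* P four observations earlier */
  inf = 100*(P - lag4(P))/lag4(P);     /* same formula as in the handout */
datalines;
100
101
102
103
104
106
;
run;

/* The first four rows have missing P_year_ago and inf. */
proc print data=lag_demo;
run;
```

The first four rows have nothing four observations back, so P_year_ago and inf are missing there; the fifth row gives inf = 4. This is why, in the attached output, INF has N = 133 against 137 for P, and GROWTH has N = 130 against 134 for REALY.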
This is the end of the data step in this program.
3. Procedures
The procedures tell SAS what to do with the dataset you created in the data step.
Often, the first two procedures you will run will be proc contents and proc means:
    proc contents;
    proc means;
Proc contents causes SAS to print a summary of what variables are in your
dataset. The output from this procedure appears in the attached output labeled
"SAS EXAMPLE #1, OUTPUT." Proc means causes
SAS to calculate the mean and several other statistics for each variable in your
dataset. The output from this procedure appears in the same attached output.
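Neither statement names a dataset. A procedure with no data= option works on the most recently created dataset, which here is macro. A minimal sketch of the more explicit form, which can be easier to follow once a program contains several data steps:

```sas
/* Same two procedures, naming the dataset explicitly. */
proc contents data=macro;
run;

proc means data=macro;
run;
```

Both forms should produce the same output for the handout's program, since macro is the only dataset it creates.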
Since one of the main topics of this course is regression, we will often use proc
reg, which is a regression procedure. For proc reg, you need two lines (at least):
    proc reg;
    model RealI = RealY;
The first line, proc reg, tells SAS "I want you to run a regression." The second
line tells SAS exactly what regression to run. In this case, you are telling SAS to
run a regression based on the following model:
\[ \text{RealI}_i = \beta_1 + \beta_2\,\text{RealY}_i + u_i \]
This says that real investment is a function of real GDP. The output from this
procedure appears in the attached output labeled "SAS EXAMPLE #1, OUTPUT."
4. Etc.
After you are done with the procedures, you can go on to make new datasets and
to run more procedures by starting a new data step, then going on to do more
procedures, then starting a new data step, then going on to do more
procedures, . . .
5. This is the end
To ensure that your final procedure runs correctly, it is a good idea for the very
last line of your program to be
    run;
My first SAS program:
Please type, in the program editor, the program labeled "SAS EXAMPLE #1"
(attached). If you are feeling lazy, I included this program in the zip file; it should now
be in your temp folder, named Example1.sas. To get it, click in the program editor
window, then go to the File menu and choose Open. Now hit the submit button (it is
the guy running button, under the menu bar near the top of the window);
alternatively, you can select "submit" from the Local menu right after you are done
typing in the program. SAS will "eat" the program from the program editor window;
there will be a brief delay, and then (if everything has gone well) the output window
will pop up with all of the results of all of your analyses in it.

Look over your analyses and make sure they are the same as on the attached output
labeled "SAS EXAMPLE #1, OUTPUT." Now, to clear out all that output (we want
to do some more analyses) choose "clear text" from the Edit menu. All of the output
will disappear. Shrink the output window by clicking on the button in the upper right
hand corner of the output window.

The LOG window, which is now visible, contains information on what SAS did;
when you make errors programming, the LOG window is where SAS will tell you
about them.

Now, let's say we want to add some more analyses to our program. Click in the
program editor to select it. Notice, your program is gone, and it would be a pain to
type it in again. Happily, you don't have to: just go to the Local menu and select
"Recall text." Now your program has reappeared. Add more lines to your program so
that it looks like "SAS EXAMPLE #2". Submit it again, and look at the output; it
should look like "SAS EXAMPLE #2, OUTPUT."

Suppose we are now happy with our analyses and output. While we are looking at the
output window, we can save our output by going to the File menu, Save As, and
selecting a file name. Usually, SAS output files are saved with the suffix .lst (.lst for
"list"). So, a good name for our output would be first.lst. When you are finished
saving the output, clear text, shrink the output window, and activate the program
editor by clicking in it. Recall your program and save it also. Usually, SAS program
files are saved with the suffix .sas. So, a good name for our program file would be
first.sas.

Finally, quit SAS and fire up Microsoft Word, or your favorite word processor. Open
the two files first.lst and first.sas. You will want to change the font of these files to
SAS Monospace font (9 point) and you will want to reformat these files so that the
lines do not "wrap around"; often this will involve changing margins and using
Landscape rather than Portrait orientation. Also, you will need to put in page breaks
to prevent the analyses from being split across pages.

A feature of SAS you may find useful is in the options menu. While in SAS, and
BEFORE you submit a program, go to the Globals menu and select Options; a
pop-up menu will appear; select Global Options from this menu. Using the dialogue box
that comes up, you can control how many characters per line SAS uses and how many
lines per page.
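The same two settings can also be written into the program itself with an options statement placed at the top. This is a minimal sketch; the values 80 and 60 are only illustrative and are not the handout's recommendation.

```sas
/* Illustrative values: 80 characters per line, 60 lines per page. */
options linesize=80 pagesize=60;
```

Putting the statement in the program means the page layout travels with the .sas file instead of depending on what was last chosen in the dialogue box.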
Class auto example:
We did (or will do) an example in class involving the automobile dataset, available
on the web site. You can download it (auto1.txt) along with an attached SAS program,
the one I used in class. Notice, in order to get the covariance matrix I used in class,
I modified the proc reg slightly to tell SAS that I wanted the covariance matrix:
    Proc reg;
    model price = dom weight rel mpg / COVB;
The / COVB is how you ask SAS for the covariance matrix.
If you recall, our model was:
\[ \text{price} = \beta_1 + \beta_2\,\text{dom} + \beta_3\,\text{weight} + \beta_4\,\text{rel} + \beta_5\,\text{mpg} + u \]
And we were calculating the standard error of:
\[ 500\,\beta_3 - 2\,\beta_5 \]
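As a reminder of what that calculation involves (standard variance algebra; the hats marking estimated coefficients are notation assumed for this sketch), the two variances and the covariance are read off the COVB matrix:

```latex
\[
\operatorname{Var}\!\left(500\,\hat\beta_3 - 2\,\hat\beta_5\right)
  = 500^2\,\operatorname{Var}(\hat\beta_3)
  + (-2)^2\,\operatorname{Var}(\hat\beta_5)
  + 2\,(500)(-2)\,\operatorname{Cov}(\hat\beta_3,\hat\beta_5)
\]
\[
\operatorname{se}\!\left(500\,\hat\beta_3 - 2\,\hat\beta_5\right)
  = \sqrt{\operatorname{Var}\!\left(500\,\hat\beta_3 - 2\,\hat\beta_5\right)}
\]
```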
The calculation was a big pain. Happily, SAS will automate these sorts of calculations.
The next command in that SAS program is also a regression command:
    Proc glm;
    model price = dom weight rel mpg;
    estimate 'w up 500 mpg dn 2' weight 500 mpg -2;
Proc glm is just another way of saying: "SAS, I want you to run a regression." The
model statement tells SAS what regression to run. The estimate statement tells SAS
that you want to calculate an estimate and standard error for some equation involving
the βs. You must then name your estimate; I named it "w up 500 mpg dn 2". You
also must tell SAS what to calculate. Weight 500 says "multiply the coefficient on
weight by 500"; mpg -2 says "multiply the coefficient on mpg by -2". Then SAS adds
them up and reports an estimate and a standard error. Run this program through SAS
and then look at the last page of output. You will see 2 parameter tables. The lower
one contains all the same parameters as before. The upper one contains a parameter
named "w up 500 mpg dn 2" (the name we gave it above). There is an estimate, 1728,
and a standard error, 234, identical to what we calculated in class, and much easier.
SAS does internally exactly the calculation which we did in class.
EXAMPLE #1
filename macroraw 'c:\temp\macro.txt';   /* nickname for the raw data file */
data macro;
  infile macroraw;
  input year 1-4 quarter 10
        Y 14-20 C 24-30 I 34-40
        XM 44-50 G 54-60 P 64-68;
  /* convert the nominal series to real (1987 $) */
  RealY = 100*Y/P;
  RealC = 100*C/P;
  RealI = 100*I/P;
  RealXM = 100*XM/P;
  RealG = 100*G/P;
  /* % change over the previous four quarters */
  inf = 100*(P-lag4(P))/lag4(P);
  growth = 100*(RealY-lag4(RealY))/lag4(RealY);
proc contents;
proc means;
proc reg;
  model RealI = RealY;
run;
EXAMPLE #1, OUTPUT
The SAS System
11:44 Wednesday, February 11, 1998

CONTENTS PROCEDURE

Data Set Name: WORK.MACRO                           Observations:         138
Member Type:   DATA                                 Variables:            15
Engine:        V612                                 Indexes:              0
Created:       11:44 Wednesday, February 11, 1998   Observation Length:   120
Last Modified: 11:44 Wednesday, February 11, 1998   Deleted Observations: 0
Protection:                                         Compressed:           NO
Data Set Type:                                      Sorted:               NO
Label:

-----Engine/Host Dependent Information-----

Data Set Page Size:         8192
Number of Data Set Pages:   3
File Format:                607
First Data Page:            1
Max Obs per Page:           68
Obs in First Data Page:     49

-----Alphabetic List of Variables and Attributes-----

 #    Variable    Type    Len    Pos
------------------------------------
 4    C           Num       8     24
 7    G           Num       8     48
15    GROWTH      Num       8    112
 5    I           Num       8     32
14    INF         Num       8    104
 8    P           Num       8     56
 2    QUARTER     Num       8      8
10    REALC       Num       8     72
13    REALG       Num       8     96
11    REALI       Num       8     80
12    REALXM      Num       8     88
 9    REALY       Num       8     64
 6    XM          Num       8     40
 3    Y           Num       8     16
 1    YEAR        Num       8      0

The SAS System
11:44 Wednesday, February 11, 1998

Variable    N          Mean       Std Dev        Minimum        Maximum
-----------------------------------------------------------------------
YEAR      137       1974.88     9.9302311        1958.00        1992.00
QUARTER   137     2.4963504     1.1188498      1.0000000      4.0000000
Y         134       2294.74       1702.58    483.5000000        5898.60
C         137       1483.40       1160.40    291.9000000        4053.90
I         137   366.1321168   265.7183399     58.2000000    843.9000000
XM        137   -27.3613139    44.3499789   -145.1000000     16.6000000
G         137   432.4153285   319.2768598     94.6000000        1109.40
P         137    59.0751825    31.2636436     25.4000000    120.6000000
REALY     134       3387.45   906.0945600        1903.54        4904.62
REALC     137       2177.12   656.7777486    986.1486486        3361.44
REALI     137   547.2006329   164.3535043    196.6216216    806.8702290
REALXM    137   -27.7973542    47.1397267   -145.6827309     34.0862423
REALG     137   650.6916350   165.2842456    319.5945946    934.2783505
INF       133     4.4622075     3.7133089    -13.7123746     10.8545035
GROWTH    130     2.9154580     2.5101801     -3.2253644      8.0108601
-----------------------------------------------------------------------

The SAS System
11:44 Wednesday, February 11, 1998

Model: MODEL1
Dependent Variable: REALI

Analysis of Variance

                        Sum of          Mean
Source      DF         Squares        Square     F Value    Prob>F
Model        1    2808934.0001  2808934.0001     698.223    0.0001
Error      132    531032.80856    4022.97582
C Total    133    3339966.8087

Root MSE     63.42693    R-square    0.8410
Dep Mean    554.57390    Adj R-sq    0.8398
C.V.         11.43706

Parameter Estimates

                 Parameter       Standard     T for H0:
Variable   DF     Estimate          Error   Parameter=0   Prob > |T|
INTERCEP    1    11.267923    21.27870182         0.530       0.5973
REALY       1     0.160388     0.00606980        26.424       0.0001
EXAMPLE #2
filename macroraw 'c:\temp\macro.txt';   /* nickname for the raw data file */
data macro;
  infile macroraw;
  input year 1-4 quarter 10
        Y 14-20 C 24-30 I 34-40
        XM 44-50 G 54-60 P 64-68;
  /* convert the nominal series to real (1987 $) */
  RealY = 100*Y/P;
  RealC = 100*C/P;
  RealI = 100*I/P;
  RealXM = 100*XM/P;
  RealG = 100*G/P;
  /* % change over the previous four quarters */
  inf = 100*(P-lag4(P))/lag4(P);
  growth = 100*(RealY-lag4(RealY))/lag4(RealY);
proc contents;
proc means;
proc reg;
  model RealI = RealY;
/* second proc reg: two models fit in one procedure */
proc reg;
  model RealI = RealY growth;
  model RealI = RealY growth inf;
run;
EXAMPLE #2, OUTPUT
The SAS System
11:44 Wednesday, February 11, 1998

CONTENTS PROCEDURE

Data Set Name: WORK.MACRO                           Observations:         138
Member Type:   DATA                                 Variables:            15
Engine:        V612                                 Indexes:              0
Created:       11:45 Wednesday, February 11, 1998   Observation Length:   120
Last Modified: 11:45 Wednesday, February 11, 1998   Deleted Observations: 0
Protection:                                         Compressed:           NO
Data Set Type:                                      Sorted:               NO
Label:

-----Engine/Host Dependent Information-----

Data Set Page Size:         8192
Number of Data Set Pages:   3
File Format:                607
First Data Page:            1
Max Obs per Page:           68
Obs in First Data Page:     49

-----Alphabetic List of Variables and Attributes-----

 #    Variable    Type    Len    Pos
------------------------------------
 4    C           Num       8     24
 7    G           Num       8     48
15    GROWTH      Num       8    112
 5    I           Num       8     32
14    INF         Num       8    104
 8    P           Num       8     56
 2    QUARTER     Num       8      8
10    REALC       Num       8     72
13    REALG       Num       8     96
11    REALI       Num       8     80
12    REALXM      Num       8     88
 9    REALY       Num       8     64
 6    XM          Num       8     40
 3    Y           Num       8     16
 1    YEAR        Num       8      0

The SAS System
11:44 Wednesday, February 11, 1998

Variable    N          Mean       Std Dev        Minimum        Maximum
-----------------------------------------------------------------------
YEAR      137       1974.88     9.9302311        1958.00        1992.00
QUARTER   137     2.4963504     1.1188498      1.0000000      4.0000000
Y         134       2294.74       1702.58    483.5000000        5898.60
C         137       1483.40       1160.40    291.9000000        4053.90
I         137   366.1321168   265.7183399     58.2000000    843.9000000
XM        137   -27.3613139    44.3499789   -145.1000000     16.6000000
G         137   432.4153285   319.2768598     94.6000000        1109.40
P         137    59.0751825    31.2636436     25.4000000    120.6000000
REALY     134       3387.45   906.0945600        1903.54        4904.62
REALC     137       2177.12   656.7777486    986.1486486        3361.44
REALI     137   547.2006329   164.3535043    196.6216216    806.8702290
REALXM    137   -27.7973542    47.1397267   -145.6827309     34.0862423
REALG     137   650.6916350   165.2842456    319.5945946    934.2783505
INF       133     4.4622075     3.7133089    -13.7123746     10.8545035
GROWTH    130     2.9154580     2.5101801     -3.2253644      8.0108601
-----------------------------------------------------------------------

The SAS System
11:44 Wednesday, February 11, 1998

Model: MODEL1
Dependent Variable: REALI

Analysis of Variance

                        Sum of          Mean
Source      DF         Squares        Square     F Value    Prob>F
Model        1    2808934.0001  2808934.0001     698.223    0.0001
Error      132    531032.80856    4022.97582
C Total    133    3339966.8087

Root MSE     63.42693    R-square    0.8410
Dep Mean    554.57390    Adj R-sq    0.8398
C.V.         11.43706

Parameter Estimates

                 Parameter       Standard     T for H0:
Variable   DF     Estimate          Error   Parameter=0   Prob > |T|
INTERCEP    1    11.267923    21.27870182         0.530       0.5973
REALY       1     0.160388     0.00606980        26.424       0.0001

The SAS System
11:44 Wednesday, February 11, 1998

Model: MODEL1
Dependent Variable: REALI

Analysis of Variance

                        Sum of          Mean
Source      DF         Squares        Square     F Value    Prob>F
Model        2    2664556.6567  1332278.3283     399.853    0.0001
Error      127    423153.81032    3331.91977
C Total    129     3087710.467

Root MSE     57.72278    R-square    0.8630
Dep Mean    562.17884    Adj R-sq    0.8608
C.V.         10.26769

Parameter Estimates

                 Parameter       Standard     T for H0:
Variable   DF     Estimate          Error   Parameter=0   Prob > |T|
INTERCEP    1   -48.113929    23.20330140        -2.074       0.0401
REALY       1     0.167770     0.00593837        28.252       0.0001
GROWTH      1    11.815951     2.08721188         5.661       0.0001

The SAS System
11:44 Wednesday, February 11, 1998

Model: MODEL2
Dependent Variable: REALI

Analysis of Variance

                        Sum of          Mean
Source      DF         Squares        Square     F Value    Prob>F
Model        3    2830400.0013  943466.66711     461.998    0.0001
Error      126    257310.46568    2042.14655
C Total    129     3087710.467

Root MSE     45.19012    R-square    0.9167
Dep Mean    562.17884    Adj R-sq    0.9147
C.V.          8.03839

Parameter Estimates

                 Parameter       Standard     T for H0:
Variable   DF     Estimate          Error   Parameter=0   Prob > |T|
INTERCEP    1  -109.646725    19.40635174        -5.650       0.0001
REALY       1     0.159433     0.00474019        33.634       0.0001
GROWTH      1    16.754458     1.72348611         9.721       0.0001
INF         1    15.528066     1.72310521         9.012       0.0001