
SAS Proficiency Test

Judy Loren
August 24, 2007

SAS Proficiency Test

 Used as a hiring tool years ago
 Publisher’s Clearinghouse
 Sensitive but not specific
– If you pass, it doesn’t mean you’re good
– But if you’re good, you should pass

Copyright © Health Dialog Services Corporation 2007. All rights reserved. 2


Part I
Question 1_1

 What happens if no VAR statement is used in PROC PRINT?
– All variables are printed

 In what order?
– In the order the vars exist on the dataset

 What determines that?
– The order in which SAS encounters the variable names during the syntax scan

 Need to understand how SAS processes a data step
 Demonstration spt_1_1.sas
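
The ordering rule above can be sketched as follows. This is not the original spt_1_1.sas; the dataset name `demo` and the variables `a`, `b`, `c` are made up for illustration.

```sas
/* Hypothetical dataset and variable names, for illustration only. */
data demo;
   length c $3;   /* c is encountered first, so it becomes variable 1 */
   a = 1;         /* a becomes variable 2 */
   b = 2;         /* b becomes variable 3 */
   c = 'xyz';     /* c already has its position; this only assigns a value */
run;

proc print data=demo;   /* no VAR statement: columns appear as c, a, b */
run;
```

Moving the LENGTH statement below the assignments would put c last instead.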



Part I
Question 1_2

 Explain what a (data step) SELECT statement does.
– Sets up a series of IF…THEN…ELSE logic based on the value of 1 variable.
 Demonstration spt_1_2.sas
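
A minimal sketch of a SELECT group keyed on one variable. The dataset `regions`, the variable `code`, and the data values are assumptions made for this example, not taken from spt_1_2.sas.

```sas
/* Hypothetical data: one-character region codes. */
data regions;
   input code $1.;
   length region $7;
   select (code);                        /* compare code to each WHEN list */
      when ('N')      region = 'North';
      when ('S')      region = 'South';
      when ('E', 'W') region = 'Coastal'; /* several values in one WHEN */
      otherwise       region = 'Unknown'; /* catches everything else */
   end;
   datalines;
N
W
X
;
run;
```

If OTHERWISE is left out and no WHEN matches, SAS stops the step with an error, which is one practical difference from a plain IF…THEN…ELSE chain.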



Part I
Question 1_3

 What is the default length of a numeric variable?
– 8 (bytes)

 All numbers are floating point binary, NOT a sequence of base 10 digits.
 Binary joke: There are 10 kinds of people in the world: those who understand binary, and those who don’t.
 Each byte has 8 bits. With one bit used for the sign, one byte can store an integer as high as 2^7 - 1, or 127. Two bytes, 16 bits: 2^15 - 1, or 32,767.
 Since all numbers are floating point, SAS uses some bits for the power and the sign, so it takes a minimum of 3 bytes to store any number at all.
Part I
Question 1_3

 Largest integer SAS can store accurately by length of numeric variable:

[Table: Length (2, 3, …) versus largest integer stored exactly]

 Demonstration spt_1_3.sas
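
The limits can be computed rather than memorized. The sketch below assumes an IEEE platform (Windows or UNIX), where lengths 3 through 8 are allowed and 12 of the stored bits go to the sign and exponent; mainframe values differ. The names `max_exact`, `nbytes`, and `largest` are made up.

```sas
/* Assumes IEEE doubles (Windows/UNIX); mainframe limits are different. */
data max_exact;
   do nbytes = 3 to 8;
      /* 8*nbytes stored bits, minus 12 for sign and exponent,      */
      /* plus 1 implied mantissa bit: 2**(8*nbytes - 11)            */
      largest = 2 ** (8 * nbytes - 11);
      output;
   end;
   format largest comma25.;
run;

proc print data=max_exact noobs;
run;
```

For nbytes = 3 this gives 8,192, and for nbytes = 8 it gives 9,007,199,254,740,992.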
Part I
Question 1_4

 What is the difference between DO WHILE and DO UNTIL?
– WHILE stops when the condition is false; UNTIL stops when the condition is true
– WHILE evaluates at the TOP of the loop
– UNTIL evaluates at the BOTTOM of the loop, so the body always runs at least once

 Demonstration spt_1_4.sas
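
A small sketch of the top-versus-bottom difference; the variable names are invented for this example.

```sas
data _null_;
   i = 10;

   n_while = 0;
   do while (i < 5);          /* tested at the top: body never runs */
      n_while = n_while + 1;
   end;

   n_until = 0;
   do until (i >= 5);         /* tested at the bottom: body runs once */
      n_until = n_until + 1;
   end;

   put n_while= n_until=;     /* writes n_while=0 n_until=1 to the log */
run;
```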



Part I
Question 1_5

 What does BY after SET A; do?
– Creates first. and last. variables for each var in the BY statement
– Raises an ERROR if the dataset is not sorted by the vars in the BY statement
 What does BY after SET A B; do?
– In addition to above, interleaves the observations from A and B so that the resulting dataset remains sorted by the BY vars
– Can result in problems if A and B have different variables.

 Demonstration spt_1_5.sas



Part I
Question 1_6

 What is a subsetting IF statement?
– Condition for keeping observations
– Ends the current execution of the data loop, without output, if the condition is false
– Can cause problems
 When used in combo with IF FIRST. or IF LAST.
 When used in combo with DO / SET / END logic

 Demonstration spt_1_6.sas



Part I
Question 1_7

 What’s the difference between INFILE and INPUT?
– INFILE declares which file will supply records
– INPUT translates portions of each record into variables in an observation
– Both are executable
 INPUT reads from the most recently executed INFILE
 Can change INFILEs during processing
 An attempt to read an additional record from any INFILE that is already at end of file will cause the DATA step to end
 How to concatenate INFILEs

 Demonstration spt_1_7.sas



Part I
Question 1_8

 What information is given by a Proc Contents?
– Header information only
 Cannot give the observation count of a view. Why?
 How do you modify the header?
– Proc datasets
 Which elements can you change?
– Names, labels, formats
 Which elements can you NOT change without re-creating the dataset?
– Lengths
 Demonstration spt_1_8.sas



Part I
Question 1_9

 How would you write an input statement to read a character variable X of length 1 from column 2?
– Input @2 X $1.;
– Input X $ 2-2;
– Input +1 X $1.;
– Other?

 Demonstration spt_1_9.sas



Part I
Question 1_10

 What is the purpose of the single trailing @ sign?


– Hold the current line in the input buffer for additional inputs
– Used to check variables to decide:
 Whether to read the rest of the line
 What format to use on the rest of the line

 Demonstration spt_1_10.sas
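
One way to picture the "read a little, then decide" pattern. The record layout (a type code in column 1, an id in columns 3-6, an amount in columns 8-13) and all names are assumptions made for this sketch.

```sas
/* Hypothetical layout: H = header record, D = detail record. */
data detail;
   input rectype $1. @;          /* read the type and hold the line */
   if rectype = 'D' then
      input @3 id 4. @8 amount 6.;   /* finish reading the held line */
   else delete;                  /* skip the rest of a non-detail line */
   datalines;
H header line
D 1001 250.75
D 1002  99.00
;
run;
```

The single trailing @ is released automatically when the step returns to the top, so the next iteration starts on a new line.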



Part I
Question 1_11

 What statements are used to include or exclude specific variables in a data step?
– Statements
 drop, keep
– Dataset options
 (keep= …), (drop= …)
 On incoming dataset or output dataset
– Interactions with rename
– Shortcuts

 Demonstration spt_1_11.sas



Part I
Question 1_12

 How can you generate a 2-dimensional crosstabulation?


– Proc Freq
– Proc Tabulate
– Proc Report

 Demonstration spt_1_12.sas



Part I
Question 1_13

 In the macro language, all macro commands begin with?
– %
– Macro variables requiring substitution start with &

 Demonstration spt_1_13.sas



Halftime



Part II
Question 2_1

 What occurs if multiple datasets are included in a SET statement?
– Concatenation

 Already addressed in 1_5



Part II
Question 2_2

 What symbol is used to concatenate two character values?
– || as in a = trim(b) || left(z);
– New function CATX will do the trim, left, and insert a separator
 a = CATX(' ', b, z);

Function  What it does
CAT       Concatenates character strings without removing leading or trailing blanks
CATS      Concatenates character strings and removes leading and trailing blanks
CATT      Concatenates character strings and removes trailing blanks
CATX      Concatenates character strings, removes leading and trailing blanks, and inserts separators

– Demonstration spt_2_2.sas
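
A side-by-side sketch of the four functions on the same inputs. The values of b and z, and the '-' separator, are chosen for illustration only.

```sas
data _null_;
   length b z $8;
   b = 'ab';                   /* stored with 6 trailing blanks */
   z = '  cd';                 /* 2 leading blanks, then trailing blanks */

   r_cat  = cat(b, z);         /* keeps every blank from both values */
   r_cats = cats(b, z);        /* abcd */
   r_catt = catt(b, z);        /* ab  cd: leading blanks of z survive */
   r_catx = catx('-', b, z);   /* ab-cd */

   put r_cat= / r_cats= / r_catt= / r_catx=;
run;
```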



Part II
Question 2_3

 Explain the substr function.


– Ray Cloutier’s talk on 2/13
– Left or right of =
– Isolates certain positions within a character variable

 New functions
– Char(name,3) takes just 1 character
 Third = char(name,3);
– First(name) takes the first character
 Initial = first(name);

 Demonstration spt_2_3.sas



Part II
Question 2_4

 What are the differences among these functions?


– Round()
– Ceil()
– Floor()

 Demonstration spt_2_4.sas
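
A quick way to see the three functions side by side, including negative values, where CEIL and FLOOR are easiest to mix up. The test values are arbitrary.

```sas
data _null_;
   do x = -2.5, -2.1, 2.1, 2.5;
      r = round(x);   /* nearest integer; round(x, 0.1) would round to tenths */
      c = ceil(x);    /* smallest integer >= x */
      f = floor(x);   /* largest integer <= x */
      put x= r= c= f=;
   end;
run;
```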



Part II
Question 2_5

 In the data step, records are usually added to a SAS dataset automatically. Under what circumstances is there no automatic output?
– IKWID

 Demonstration spt_2_5.sas



Part II
Question 2_6

 How might the result of these two statements differ?
– C = A + B;
– C = SUM ( A , B ) ;
 Missing values

 What is the result of this statement?
– C = SUM ( A1 - A10 );

 Demonstration spt_2_6.sas
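
Both questions can be checked in one step. The array values 1 through 10 are made up so the results are easy to verify by hand.

```sas
data _null_;
   array a{10} a1-a10 (1 2 3 4 5 6 7 8 9 10);   /* illustrative values */
   b = .;

   plus_result = a1 + b;           /* missing: + propagates missing values */
   sum_result  = sum(a1, b);       /* 1: SUM ignores missing arguments */
   one_arg     = sum(a1 - a10);    /* -9: a single argument, a1 minus a10 */
   of_list     = sum(of a1-a10);   /* 55: OF turns a1-a10 into a variable list */

   put plus_result= sum_result= one_arg= of_list=;
run;
```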



Part II
Question 2_7

 In a data step, how can you identify which of 2 input datasets an observation came from?
– Set one (in=a) two (in=b);
– If a and b;

 What happens if you use the same in= variable for 2 datasets?
 Demonstration spt_2_7.sas



Part II
Question 2_8

 How do you begin a data step if you do not want to create a SAS dataset?
– Data _null_;

 When would you use this?
– Create a flat file (usually use export)
– Create a report (usually use export)
– Load a macro variable (often use SQL)
 New CALL SYMPUTX routine handles left justification, trimming, and numeric-to-character conversion
 Demonstration spt_2_8.sas
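
A minimal comparison of the two routines with a numeric value. The macro variable names `raw` and `clean` are invented for this sketch.

```sas
data _null_;
   n = 42;
   call symput('raw', n);      /* numeric converted with BEST12.: leading blanks kept */
   call symputx('clean', n);   /* converted, left-aligned, and trimmed */
run;

%put [&raw] [&clean];          /* the brackets make the leading blanks visible */
```

CALL SYMPUT also writes a numeric-to-character conversion note to the log, which CALL SYMPUTX avoids.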



Part II
Question 2_8

 Call symputx 3rd argument (NOT case sensitive)
– G specifies that the macro variable is stored in the global symbol table, even if the local symbol table exists.
– L specifies that the macro variable is stored in the most local symbol table that exists, which might be the global symbol table.
– F specifies that if the macro variable exists in any symbol table, CALL SYMPUTX uses the version in the most local symbol table in which it exists. If the macro variable does not exist, CALL SYMPUTX stores the variable in the most local symbol table.



Part II
Question 2_9

 What are the automatic variables first. and last. ?


– Created whenever BY is used in data step
– First.patkey is true when this observation is the first one with this
value of patkey, false otherwise
– Last.patkey is true when this observation is the last one with this
value of patkey, false otherwise
– When multiple variables appear in the BY statement, these
variables operate within levels of the variables above them.
 Useful for finding duplicates
 Demonstration spt_2_9.sas
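
The duplicate-finding use can be sketched like this. The dataset `claims` and its values are hypothetical; `patkey` is the key name used on the slide.

```sas
/* Hypothetical claims data with one duplicated key (102). */
data claims;
   input patkey amount;
   datalines;
101 50
102 75
102 20
103 10
;
run;

proc sort data=claims;
   by patkey;
run;

data dups;
   set claims;
   by patkey;
   /* A key that occurs once is both first and last in its group. */
   if not (first.patkey and last.patkey);
run;
```

Here `dups` keeps only the two observations for patkey 102.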



Part II
Question 2_10

 What does a 2 level input dataset name such as OLD.TWO indicate?
– TWO is a dataset in the library with the libname OLD

 What is the default libname, where one-level dataset names go?
– work
– %savework allows you to save your work space

 Demonstration spt_2_10.sas



Part II
Question 2_11

 What statement allows you to keep track of previous values of variables or keep a running total?
– Retain a 0; initializes a to 0 the first time through the data step
– a+1; this form (the sum statement) will automatically retain a
– What about
 a+b;
 Treats missing values of b as zero
 a+(-b); decrements a by b. Writing a - b; is NOT a sum statement.

 What happens if you retain the value of a variable in the incoming dataset?
 Demonstration spt_2_11.sas
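
A short sketch contrasting the sum statement with an explicit RETAIN plus assignment when a missing value arrives. The names `total` and `plain` and the three input values are made up.

```sas
data running;
   input b;
   retain plain 0;
   total + b;           /* sum statement: retained, starts at 0, skips missing b */
   plain = plain + b;   /* turns missing at the first missing b and stays missing */
   datalines;
1
.
2
;
run;

proc print data=running;
run;
```

With these values `total` is 1, 1, 3 while `plain` is 1, then missing on both remaining rows.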



Part II
Question 2_12

 When an invalid data field is encountered when inputting a numeric variable, what happens?
– Variable set to missing (or whatever value is specified in the INVALIDDATA system option)
– Message written to log
– Message written to log
– Message written to log
– _error_ set to 1 (dump buffer and all variable values)
 What are some ways to avoid this?
– ? or ?? modifier
 Both suppress the invalid data message
 ?? also keeps _error_ from being set to 1
 Demonstration spt_2_12.sas
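
A minimal sketch of the ?? modifier on a numeric field; the dataset name `clean` and the data lines are invented.

```sas
data clean;
   input amount ?? 8.;   /* ?? : no log message, _error_ stays 0 */
   datalines;
12.5
abc
7
;
run;
```

The second observation gets a missing `amount` and the log stays quiet. Removing ?? brings back the invalid-data note and the record dump.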
Part II
Question 2_13

 What’s the difference between $w. and $CHARw. informats?
– $w. will cause the characters found to be left justified within the variable’s value
– $CHARw. preserves leading blanks

 Demonstration spt_2_13.sas



Part III
Question 3_1

 Explain the _type_ variable generated by proc summary.
– Another application of binary numbers

Proc summary;
class sex agecat plan ;

          sex  agecat  plan
          2^2  2^1     2^0
_type_     0    0       0    = 0 (grand total)
           0    0       1    = 1 (plan only)
           0    1       0    = 2 (agecat only)
           0    1       1    = 3 (plan & agecat)
           1    0       0    = 4 (sex only)
           1    0       1    = 5 (sex and plan)
           1    1       0    = ?
           1    1       1    = ?

 Demonstration spt_3_1.sas
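
To see every _type_ value in practice, a small run like the following can help. The dataset `members`, its four rows, and the `cost` variable are hypothetical; only the CLASS statement comes from the slide.

```sas
/* Hypothetical member-level data. */
data members;
   input sex $ agecat $ plan $ cost;
   datalines;
F young HMO 100
F old   PPO 250
M young PPO 175
M old   HMO 300
;
run;

proc summary data=members;
   class sex agecat plan;             /* rightmost class variable is the 2^0 bit */
   var cost;
   output out=stats sum=total_cost;   /* no NWAY: all _type_ values 0-7 are kept */
run;

proc print data=stats;
run;
```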
Part III
Question 3_3

 What is the purpose of a %GLOBAL statement?


– To make a macro variable common to all macros

 Demonstration spt_3_3.sas



Part III
Question 3_4

 What do CALL SYMPUT and SYMGET do?
– Store and retrieve values of macro variables
 A value stored by CALL SYMPUT cannot be resolved with & until the data step has completed
 I’ve never had occasion to use SYMGET
 Demonstration spt_3_4.sas



Part III
Question 3_5

 If you want to print a hyphenated phone number with an area code in parentheses, how would you do it?
– Bunch of substr
 OR
– PICTURE

 Demonstration spt_3_5.sas



Part III
Question 3_8

 What date is the reference date for calculating the value of a SAS date variable?
– January 1, 1960
– What are the values of A, B, C, and D in the following? (write these down!)
 A = '01jan1960'd;
 B = mdy (1, 2, 1960);
 C = intnx('month', A, -1 );
 D = intck('year', A, C);

 Demonstration spt_3_8.sas



Part III
Question 3_9

 This is the question for all the money.


 This is the question that separates the pros from the
amateurs.
 This is the question that makes you an ubergeek or just a
wannabe.
 What is _n_?
– NOT an observation counter
– Number of times you’ve executed the data loop

 Demonstration spt_3_9.sas
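
One way to show that _n_ counts passes through the data loop rather than observations is to read a whole dataset inside a single pass. This sketch assumes the SASHELP.CLASS sample dataset is available; `count` and `done` are made-up names.

```sas
data _null_;
   do until (done);
      set sashelp.class end=done;   /* every observation is read in one pass */
      count = sum(count, 1);        /* counts the observations actually read */
   end;
   put _n_= count=;                 /* _n_ is still 1 here */
run;
```

The log shows `_n_=1` next to a `count` equal to the number of observations, because the DO loop, not the implicit data loop, did the reading.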



Game Over

 How did you do??


 Did you learn anything??
 I did!
