
SQL Chapter Two

Overview
Basic Structure
Verifying Statements
Specifying Columns
Specifying Rows

Introduction

SQL is a modular language that uses statements and clauses.

Basic structure of PROC SQL:

PROC SQL;
statement (select)
clauses (from, where, group by,
having, order by);
QUIT;

Note: place a semicolon at the end of the last clause only.

Statements

select - specifies the columns to be selected

The select statement has the following features:
-selects data that meets certain conditions
-groups data
-specifies an order for the data
-formats data
-calculates new variables

Clauses

from - specifies the tables to be queried
where - subsets the data based on a condition (optional)
group by - classifies the data into groups (optional)
having - subsets groups of data based on a group condition (optional)
order by - sorts rows by the values of specific columns (optional)

Note: the order of the clauses is significant.
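
To illustrate the clause order, here is a minimal sketch that uses every clause once. It assumes the sample table sashelp.class that ships with SAS (columns sex, age, height); the table and the thresholds are chosen only for illustration and do not come from the slides.

```sas
/* Sketch: all five clauses in the order PROC SQL expects them. */
/* sashelp.class is a SAS sample table, used here as an assumption. */
proc sql;
   select sex, avg(height) as avg_height   /* columns to display        */
      from sashelp.class                   /* table to query            */
      where age ge 12                      /* row condition             */
      group by sex                         /* one group per value of sex */
      having avg(height) gt 60             /* group condition           */
      order by avg_height;                 /* sort the result; one semicolon ends the statement */
quit;
```

Moving a clause out of this order (for example, placing order by before where) should produce a syntax error rather than a reordered result.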


Verifying Statements

Two features can be used to verify your statement syntax:

validate - a statement used to check the syntax of a select statement
noexec - a PROC SQL option that checks for invalid syntax in all types of SQL statements

Validate

Valid query:

proc sql;
validate
select timemile, restpulse, maxpulse
from project.fitness
where timemile gt 7;

NOTE: PROC SQL statement has valid syntax.

Invalid query (extra comma after maxpulse):

proc sql;
validate
select timemile, restpulse, maxpulse,
from project.fitness
where timemile gt 7;

Syntax error, expecting one of the following: a quoted string, !, !!, &...

Noexec

proc sql noexec;
select timemile, restpulse, maxpulse
from project.fitness
where timemile gt 7;

NOTE: Statement not executed due to NOEXEC option.


Contrasting

Features of validate:
-tests the syntax of a query without executing the query
-checks the validity of column names
-prints error messages for invalid queries
-is only used for select statements

Features of noexec:
-checks for invalid syntax in all types of SQL statements
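
Because validate only applies to select statements, noexec is the tool to reach for when checking other statement types. The sketch below is an illustration, not taken from the slides: it assumes the SAS sample table sashelp.class and a hypothetical output table work.tall.

```sas
/* Sketch: NOEXEC syntax-checks a CREATE TABLE statement without running it. */
/* sashelp.class and work.tall are assumptions made for this example. */
proc sql noexec;
   create table work.tall as
      select name, height
         from sashelp.class
         where height gt 60;
quit;
```

If the syntax is valid, the log should show the same "Statement not executed due to NOEXEC option" note as above, and work.tall is not created.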

Specifying Columns

Objectives
-Displaying columns directly from a table
-Displaying columns calculated from other columns
-Calculating columns using a CASE expression

Displaying data from a table

To print all of a table's columns in the order that they were stored, use an asterisk in the SELECT statement:

PROC SQL;
SELECT *
FROM VITALS;
QUIT;

PATIENT  PULSE  TEMP  BPS  BPD
101      72     98.5  130  88
101      75     98.6  133  92
101      74     98.5  136  90
102      81     99    141  93
102      77     98.7  144  97
102      78     98.7  142  93
103      77     98.3  137  79
103      77     98.5  133  74
103      78     98.6  140  80
103      75     99.2  147  89
104      72     98.8  128  83
104      69     99.1  131  86

Printing Specific Columns

If you do not want to print all of a table's columns in the order that they were stored, list the columns you want, in the order you want them, in the SELECT statement. A CASE expression in the SELECT statement can also supply a derived column.

PROC SQL;
CREATE TABLE TESTMED AS
SELECT PATIENT,
       CASE ((PATIENT/2 = INT(PATIENT/2)) + (PATIENT = .))
          WHEN 1 THEN 'Med A'
          WHEN 0 THEN 'Med B'
          ELSE 'Error'
       END AS DOSEGRP LENGTH=5
FROM VITALS
ORDER BY PATIENT;
QUIT;

PATIENT  DOSEGRP
101      Med B
101      Med B
101      Med B
102      Med A
102      Med A
102      Med A
103      Med B
103      Med B
103      Med B
103      Med B
104      Med A
104      Med A
104      Med A

Calculating Columns

We can calculate a new column from the data in an existing column and then name the new column with the AS keyword.

Calculate the proportion of Units from each country.
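
The slide's own code is not preserved here, so the following is only a sketch of how such a proportion might be computed. The table work.sales, its columns country and units, and the data values are all hypothetical, and the sketch assumes one row per country.

```sas
/* Hypothetical data: one row per country (an assumption for this sketch). */
data work.sales;
   input country $ units;
   datalines;
USA 40
Canada 10
Mexico 50
;
run;

proc sql;
   /* sum(units) is the grand total; PROC SQL remerges it onto every row, */
   /* so each row is divided by the same total (100 for the data above).  */
   select country,
          units,
          units / sum(units) as proportion format=percent8.1
      from work.sales;
quit;
```

SAS writes a note to the log that the query requires remerging summary statistics with the original data; that note is expected with this approach.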
Calculated columns using SAS Dates

Recall from previous chapters in our SAS book that SAS stores dates as numeric values (the number of days since January 1, 1960) rather than in the form in which they are displayed.

Because they are numbers, these dates can be used to calculate new columns.

Example: Calculate the range of dates in a Dailyprices dataset.
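
A minimal sketch of a date-range calculation follows. The slides do not show the structure of the Dailyprices dataset, so the table work.dailyprices, its columns date and close, and the three rows below are invented purely for illustration.

```sas
/* Hypothetical stand-in for the Dailyprices dataset. */
data work.dailyprices;
   input date :date9. close;
   format date date9.;
   datalines;
02JAN2008 50.10
15JAN2008 51.25
31JAN2008 49.80
;
run;

proc sql;
   /* Dates are day counts, so subtracting them gives a number of days. */
   select min(date) as first_date format=date9.,
          max(date) as last_date  format=date9.,
          max(date) - min(date) as range_days
      from work.dailyprices;
quit;
```

For the made-up rows above, range_days is 29, the number of days between January 2 and January 31.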
Creating new columns

A CASE expression can be used to create a new column.
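
As one possible illustration, the sketch below derives a text column from a numeric one with a CASE expression. It assumes the SAS sample table sashelp.class; the age cutoffs and the labels are arbitrary choices, not values from the slides.

```sas
/* Sketch: CASE with WHEN conditions builds the new column age_group. */
/* sashelp.class and the cutoffs are assumptions for this example.    */
proc sql;
   select name,
          age,
          case
             when age lt 13 then 'Child'
             when age le 15 then 'Young teen'
             else 'Older teen'
          end as age_group
      from sashelp.class;
quit;
```

The WHEN conditions are tested in order, so a row is labeled by the first condition it satisfies.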
Creating a table

To create and populate a table with the rows from an SQL query, use create table.

proc sql;
create table states as
select state_code,
       state_name
from d2data.state;
quit;

Obs  State_Code  State_Name
99   UT          Utah
100  VT          Vermont
101  VA          Virginia
102  WA          Washington
103  WV          West Virginia
104  WI          Wisconsin
105  WY          Wyoming
106  N/A

Specifying Rows in a table

Objectives
-Selecting a subset of rows
-Removing duplicate rows
-Subsetting using where clauses, escape clauses, and calculated values

Selecting a subset of rows

proc sql;
title 'large orders';
select Product_ID,
       total_retail_price
from d2data.order_item
where total_retail_price > 1000;
quit;

large orders

Product ID    Total Retail Price For This Product
240200100076  $1,796.00
240400200097  $1,250.40
240100400043  $1,064.00
240200200013  $1,266.00
240300100032  $1,200.20
240300300070  $1,514.40
230100700009  $1,687.50
230100700008  $1,542.60
240300300090  $1,561.80
230100700009  $1,136.20
230100200025  $1,103.60
240200100173  $1,937.20

Where clause

Use a where clause to specify a condition that data must fulfill before being selected.

Where clauses use common comparison operators (lt, gt, eq, etc.) and logical operators (OR, NOT, AND, IN, IS NULL, ...).
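
The operators listed above can be combined in one condition. This is a small sketch assuming the SAS sample table sashelp.class; the particular conditions are arbitrary and only show the syntax.

```sas
/* Sketch: comparison and logical operators in a single WHERE clause. */
/* sashelp.class is an assumption made for this example.              */
proc sql;
   select name, sex, age, height
      from sashelp.class
      where age in (13, 14)          /* IN: matches any value in the list */
        and sex eq 'F'               /* eq: equality comparison           */
        and height is not null;      /* excludes missing heights          */
quit;
```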
Removing duplications

Use the distinct keyword to eliminate duplicate rows from the output.
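
A side-by-side sketch of the same query with and without the keyword may help. It assumes the SAS sample table sashelp.class, which is not part of the original slides.

```sas
/* Sketch: the same column selected without and with DISTINCT. */
/* sashelp.class is an assumption made for this example.       */
proc sql;
   /* Without DISTINCT: one output row per row in the table, repeats included. */
   select age
      from sashelp.class;

   /* With DISTINCT: each age value appears only once. */
   select distinct age
      from sashelp.class;
quit;
```

When several columns are listed, distinct applies to the combination of all selected columns, not to the first column alone.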
Escape Clause

The escape clause lets you designate a single escape character. In a LIKE pattern, a wildcard (% or _) that follows the escape character is treated as a literal character rather than as a wildcard when SAS searches within a character string.

Example: Select observations from a string variable containing an underscore ('_').
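
The sketch below shows one way to write such a query. The table work.codes, its column code, the data values, and the choice of '/' as the escape character are all hypothetical.

```sas
/* Hypothetical data: some values contain an underscore, some do not. */
data work.codes;
   input code $12.;
   datalines;
AB_100
AB100
CD_200
EF300
;
run;

proc sql;
   /* '/_' means a literal underscore; the % signs are still wildcards. */
   select code
      from work.codes
      where code like '%/_%' escape '/';
quit;
```

With this data the query should return only AB_100 and CD_200. Without the escape clause, the pattern '%_%' would match every non-empty value, because an unescaped underscore stands for any single character.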
Subsetting calculated values

Because the where clause is evaluated before the select clause, referring to a newly calculated column in the where clause can produce an error: the columns used in the where clause must exist in the table or be derived from existing columns.

There are two fixes. The first is to repeat the calculation in the where clause. The second is to use the CALCULATED keyword to refer to a column already calculated in the select clause.

proc sql;
title 'Lack of profit';
select Product_ID,
       ((total_retail_price/quantity) - costprice_per_Unit) as profit
from d2data.order_item
where calculated profit < 3;
quit;
title;

Lack of profit

Product ID    profit
230100500045  0.7
230100500068  0.9
240100100433  1.85
240700200004  2
240200100021  1.5
240100100031  2.4
240700200007  2.9
240100100232  1.9
230100500004  1.85
230100500004  1.85
240700100017  -1.41

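
The profit example uses the CALCULATED keyword. For comparison, the other fix, repeating the calculation in the where clause, might look like the sketch below. It reuses the table and column names from that example and should select the same rows.

```sas
/* Sketch: the first fix, repeating the expression instead of using CALCULATED. */
proc sql;
   title 'Lack of profit';
   select Product_ID,
          ((total_retail_price/quantity) - costprice_per_Unit) as profit
      from d2data.order_item
      /* The where clause cannot see the alias profit, so the expression is written out again. */
      where ((total_retail_price/quantity) - costprice_per_Unit) < 3;
quit;
title;
```

Repeating the expression works in any SQL dialect, while CALCULATED is shorter and keeps the formula in one place.
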
Summary

Basic Structure
PROC SQL;
statement (select)
clauses (from, where, group by, having, order by);
QUIT;

Verifying Statements
validate - used to check the select statement syntax
noexec - checks for invalid syntax in all types of SQL statements

Specifying Columns
-Displaying columns directly from a table
-Displaying columns calculated from other columns
-Calculating columns using a CASE expression

Specifying Rows
-Selecting a subset of rows
-Removing duplicate rows
-Subsetting using where clauses, escape clauses, and calculated values
