
Lesson 2

Descriptive Statistics:
Tabular and Graphical Methods

Summarizing Quantitative Data
Summarizing Qualitative Data
Exploratory Data Analysis
Crosstabulations and Scatter Diagrams

2003 Thomson/South-Western

Exploratory Data Analysis

The techniques of exploratory data analysis consist of simple arithmetic
and easy-to-draw pictures that can be used to summarize data quickly.


Summarizing Quantitative Data

Frequency Distribution
Relative Frequency and Percent Frequency
Distributions
Dot Plot
Histogram
Cumulative Distributions


Example: Hudson Auto Repair


The manager of Hudson Auto would like to get a better picture of the
distribution of costs for engine tune-up parts. A sample of 50 customer
invoices has been taken and the costs of parts, rounded to the nearest
dollar, are listed below.

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73


Frequency Distribution

Guidelines for Selecting Number of Classes


Use between 5 and 20 classes.
Data sets with a larger number of elements
usually require a larger number of classes.
Smaller data sets usually require fewer
classes.

IBM UET Lahore

Business Statistics

Fall 2010


Frequency Distribution

Guidelines for Selecting Width of Classes

Use classes of equal width.
Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes


Example: Hudson Auto Repair

Frequency Distribution
If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5, rounded up to 10

Cost ($)    Frequency
50-59            2
60-69           13
70-79           16
80-89            7
90-99            7
100-109          5
Total           50
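
The class-width rule and the tally above can be checked with a short script. This is a minimal sketch, not part of the original slides: the names `costs`, `num_classes`, and `first_lower` are illustrative, and starting the first class at 50 is an assumption taken from the table.

```python
import math

# Parts costs ($) from the 50 sampled Hudson Auto invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6
# (largest value - smallest value) / number of classes = (109 - 52) / 6
approx_width = (max(costs) - min(costs)) / num_classes
width = math.ceil(approx_width)  # round up to a convenient whole number

# Assumption: the first class starts at 50, as in the table.
first_lower = 50
frequency = {}
for i in range(num_classes):
    lower = first_lower + i * width
    upper = lower + width - 1
    # Count the costs that fall inside this class (limits inclusive).
    frequency[f"{lower}-{upper}"] = sum(lower <= c <= upper for c in costs)

for label, count in frequency.items():
    print(f"{label:>8}  {count:>3}")
print(f"{'Total':>8}  {sum(frequency.values()):>3}")
```

Because the classes are built from whole-dollar limits, every cost falls in exactly one class and the frequencies sum to the sample size.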


Example: Hudson Auto Repair

Relative Frequency and Percent Frequency Distributions

Cost ($)    Relative Frequency    Percent Frequency
50-59              .04                    4
60-69              .26                   26
70-79              .32                   32
80-89              .14                   14
90-99              .14                   14
100-109            .10                   10
Total             1.00                  100

Example: Hudson Auto Repair

Insights Gained from the Percent Frequency Distribution

Only 4% of the parts costs are in the $50-59 class.
30% of the parts costs are under $70.
The greatest percentage (32%, or almost one-third) of the parts costs
are in the $70-79 class.
10% of the parts costs are $100 or more.


Dot Plot

One of the simplest graphical summaries of data is a dot plot.
A horizontal axis shows the range of data values.
Then each data value is represented by a dot placed above the axis.
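
The construction just described can be sketched as a text-only dot plot. This is an illustrative sketch rather than the slide's own figure: it uses the Hudson Auto Repair parts costs from this lesson, one character column per dollar value, and the variable names are assumptions.

```python
from collections import Counter

# Parts costs ($) from the 50 sampled Hudson Auto invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

counts = Counter(costs)            # how many dots stack above each value
lo, hi = min(costs), max(costs)
tallest = max(counts.values())     # height of the tallest stack of dots

# Print from the top row down; a dot appears where the stack reaches that row.
for level in range(tallest, 0, -1):
    print("".join("." if counts[v] >= level else " " for v in range(lo, hi + 1)))

# Horizontal axis with the smallest and largest values as end labels.
axis_width = hi - lo + 1
print("-" * axis_width)
print(f"{lo}{' ' * (axis_width - len(str(lo)) - len(str(hi)))}{hi}")
```

Repeated values stack vertically, so clusters and gaps in the data are visible at a glance.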


Example: Hudson Auto Repair

Dot Plot

[Figure: dot plot of the 50 parts costs, one dot per invoice, above a
horizontal axis labeled Cost ($) with ticks from 50 to 100.]


Histogram

Another common graphical presentation of quantitative data is a histogram.
The variable of interest is placed on the horizontal axis.
A rectangle is drawn above each class interval with its height
corresponding to the interval's frequency, relative frequency, or
percent frequency.
Unlike a bar graph, a histogram has no natural separation between
rectangles of adjacent classes.


Example: Hudson Auto Repair

Histogram
[Figure: histogram of the parts costs. Vertical axis: Frequency (2 to 18).
Horizontal axis: Parts Cost ($), 50 to 110.]

Cumulative Distributions

Cumulative frequency distribution: shows the number of items with values
less than or equal to the upper limit of each class.
Cumulative relative frequency distribution: shows the proportion of items
with values less than or equal to the upper limit of each class.
Cumulative percent frequency distribution: shows the percentage of items
with values less than or equal to the upper limit of each class.
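
The three definitions above amount to running totals of the class frequencies. The following is a minimal sketch using the Hudson Auto Repair class frequencies from this lesson; the variable names are illustrative, not from the slides.

```python
from itertools import accumulate

# Upper class limits and class frequencies from the Hudson Auto Repair example.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]
n = sum(frequency)  # sample size

# Running total of frequencies: items at or below each upper class limit.
cum_freq = list(accumulate(frequency))
# Same totals expressed as a proportion and as a percentage of the sample.
cum_rel = [cf / n for cf in cum_freq]
cum_pct = [100 * cf / n for cf in cum_freq]

for limit, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit:>3}  {cf:>3}  {cr:.2f}  {cp:>5.0f}")
```

The last class always has a cumulative frequency equal to the sample size, a cumulative relative frequency of 1.00, and a cumulative percent frequency of 100.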


Example: Hudson Auto Repair

Cumulative Distributions
            Cumulative    Cumulative Relative    Cumulative Percent
Cost ($)    Frequency     Frequency              Frequency
≤ 59             2              .04                     4
≤ 69            15              .30                    30
≤ 79            31              .62                    62
≤ 89            38              .76                    76
≤ 99            45              .90                    90
≤ 109           50             1.00                   100


Scatter Diagram

A scatter diagram is a graphical presentation of the relationship
between two quantitative variables.
One variable is shown on the horizontal axis and the other variable is
shown on the vertical axis.
The general pattern of the plotted points suggests the overall
relationship between the variables.


Scatter Diagram

A Positive Relationship


Scatter Diagram

A Negative Relationship


Scatter Diagram

No Apparent Relationship


Example: Panthers Football Team

Scatter Diagram
The Panthers football team is interested in
investigating the relationship, if any, between
interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              27

Example: Panthers Football Team


Scatter Diagram
[Figure: scatter diagram. Vertical axis (y): Number of Points Scored,
0 to 30. Horizontal axis: Number of Interceptions, 1 to 3.]

Example: Panthers Football Team

The preceding scatter diagram indicates a positive relationship between
the number of interceptions and the number of points scored.
Higher points scored are associated with a higher number of interceptions.
The relationship is not perfect; not all plotted points in the scatter
diagram lie on a straight line.


Summarizing Qualitative Data

Frequency Distribution
Relative Frequency
Percent Frequency Distribution
Bar Graph
Pie Chart


Frequency Distribution

A frequency distribution is a tabular summary of data showing the
frequency (or number) of items in each of several nonoverlapping classes.
The objective is to provide insights about the data that cannot be
quickly obtained by looking only at the original data.


Example: Marada Inn


Guests staying at Marada Inn were asked to rate the quality of their
accommodations as being excellent, above average, average, below average,
or poor. The ratings provided by a sample of 20 guests are shown below.

Below Average    Average          Above Average    Above Average
Above Average    Above Average    Above Average    Below Average
Below Average    Average          Poor             Poor
Above Average    Excellent        Above Average    Average
Above Average    Average          Above Average    Average

Example: Marada Inn

Frequency Distribution
Rating           Frequency
Poor                  2
Below Average         3
Average               5
Above Average         9
Excellent             1
Total                20


Relative Frequency Distribution

The relative frequency of a class is the fraction or proportion of the
total number of data items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data
showing the relative frequency for each class.


Percent Frequency Distribution

The percent frequency of a class is the relative frequency multiplied
by 100.
A percent frequency distribution is a tabular summary of a set of data
showing the percent frequency for each class.
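
The definitions of relative and percent frequency can be illustrated with the 20 Marada Inn ratings from this lesson. This is a minimal sketch: the variable names are illustrative, and the order of the ratings in the list does not affect the result.

```python
from collections import Counter

# The 20 quality ratings from the Marada Inn sample.
ratings = [
    "Below Average", "Average", "Above Average", "Above Average",
    "Above Average", "Above Average", "Above Average", "Below Average",
    "Below Average", "Average", "Poor", "Poor",
    "Above Average", "Excellent", "Above Average", "Average",
    "Above Average", "Average", "Above Average", "Average",
]

classes = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]
counts = Counter(ratings)  # frequency of each rating
n = len(ratings)           # total number of data items

# Relative frequency = class frequency / n; percent frequency = 100 times that.
relative = {c: counts[c] / n for c in classes}
percent = {c: 100 * counts[c] / n for c in classes}

for c in classes:
    print(f"{c:<14} {counts[c]:>2}  {relative[c]:.2f}  {percent[c]:>3.0f}")
```

The relative frequencies sum to 1.00 and the percent frequencies sum to 100, which is a quick check that no item was missed or counted twice.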


Example: Marada Inn

Relative Frequency and Percent Frequency Distributions

Rating           Relative Frequency    Percent Frequency
Poor                    .10                   10
Below Average           .15                   15
Average                 .25                   25
Above Average           .45                   45
Excellent               .05                    5
Total                  1.00                  100

Bar Graph

A bar graph is a graphical device for depicting qualitative data.
On the horizontal axis we specify the labels that are used for each of
the classes.
A frequency, relative frequency, or percent frequency scale can be used
for the vertical axis.
Using a bar of fixed width drawn above each class label, we extend the
height appropriately.
The bars are separated to emphasize the fact that each class is a
separate category.


Example: Marada Inn

Bar Graph
[Figure: bar graph of the quality ratings. Vertical axis: Frequency
(1 to 9). Horizontal axis: Rating (Poor, Below Average, Average,
Above Average, Excellent).]

Pie Chart

The pie chart is a commonly used graphical device for presenting
relative frequency distributions for qualitative data.
First draw a circle; then use the relative frequencies to subdivide the
circle into sectors that correspond to the relative frequency for each
class.
Since there are 360 degrees in a circle, a class with a relative
frequency of .25 would consume .25(360) = 90 degrees of the circle.
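
The sector arithmetic above can be applied to every class at once. A minimal sketch, assuming the Marada Inn relative frequencies from this lesson; the dictionary names are illustrative.

```python
# Relative frequencies of the Marada Inn quality ratings.
relative_frequency = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each sector takes its relative frequency's share of the 360-degree circle.
degrees = {label: rf * 360 for label, rf in relative_frequency.items()}

for label, angle in degrees.items():
    print(f"{label:<14} {angle:6.1f} degrees")
```

Because the relative frequencies sum to 1, the sector angles sum to 360 degrees and fill the circle exactly.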


Example: Marada Inn

Pie Chart
[Figure: pie chart titled Quality Ratings. Sectors: Excellent 5%,
Poor 10%, Below Average 15%, Average 25%, Above Average 45%.]

Example: Marada Inn

Insights Gained from the Preceding Pie Chart

One-half of the customers surveyed gave Marada a quality rating of
above average or excellent (looking at the left side of the pie).
This might please the manager.
For each customer who gave an excellent rating, there were two customers
who gave a poor rating (looking at the top of the pie).
This should displease the manager.


Crosstabulation

Crosstabulation is a tabular method for summarizing the data for two
variables simultaneously.
Crosstabulation can be used when:
    One variable is qualitative and the other is quantitative
    Both variables are qualitative
    Both variables are quantitative
The left and top margin labels define the classes for the two variables.


Example: Finger Lakes Homes

Crosstabulation
The number of Finger Lakes homes sold for each
style and price for the past two years is shown below.

                         Home Style
Price Range    Colonial    Ranch    Split    A-Frame    Total
≤ $99,000         18          6       19        12        55
> $99,000         12         14       16         3        45
Total             30         20       35        15       100

Example: Finger Lakes Homes

Insights Gained from the Preceding Crosstabulation

The greatest number of homes in the sample (19) are a split-level style
and priced at less than or equal to $99,000.
Only three homes in the sample are an A-Frame style and priced at more
than $99,000.


Crosstabulation: Row or Column Percentages

Converting the entries in the table into row percentages or column
percentages can provide additional insight about the relationship
between the two variables.
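
The conversion to row and column percentages can be sketched as follows. This is an illustrative sketch using the Finger Lakes Homes counts from the preceding crosstabulation; the names `counts`, `row_pct`, and `col_pct` are assumptions, and values are rounded to two decimals.

```python
styles = ["Colonial", "Ranch", "Split", "A-Frame"]

# Homes sold by price range (rows) and home style (columns).
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentages: each cell as a percentage of its row (price range) total.
row_pct = {
    price: [round(100 * c / sum(row), 2) for c in row]
    for price, row in counts.items()
}

# Column percentages: each cell as a percentage of its column (style) total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {
    price: [round(100 * c / t, 2) for c, t in zip(row, col_totals)]
    for price, row in counts.items()
}

print("Row percentages:   ", row_pct)
print("Column percentages:", col_pct)
```

Row percentages answer "of the homes in this price range, what share is each style?", while column percentages answer "of the homes of this style, what share falls in each price range?".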


Example: Finger Lakes Homes

Row Percentages

                         Home Style
Price Range    Colonial    Ranch    Split    A-Frame    Total
≤ $99,000       32.73      10.91    34.55     21.82      100
> $99,000       26.67      31.11    35.56      6.67      100

Note: row totals are actually 100.01 due to rounding.


Example: Finger Lakes Homes

Column Percentages

                         Home Style
Price Range    Colonial    Ranch    Split    A-Frame
≤ $99,000       60.00      30.00    54.29     80.00
> $99,000       40.00      70.00    45.71     20.00
Total             100        100      100       100
