
Statistical Methods:

INTRODUCTION TO
STATISTICS
Von Christopher G. Chua, MST

Important Course Concerns


Contact me directly through email:
von_christopher_chua@dlsu.edu.ph
vaughnchua@gmail.com

All learning resources, additional readings, assignment and paper
details, and important announcements will be relayed through
MATHbyCHUA: mathbychua.weebly.com

This slideshow presentation will be made available through the class's official website,
mathbychua.weebly.com. The site will also provide access to download this file in printable
format.

Session Objectives
For this three-hour period, graduate students in education are
expected to develop the following learning competencies:
1. Describe basic terms in statistics such as population, sample,
parameter, and statistic.
2. Classify data as quantitative or qualitative, discrete or continuous,
and according to scales of measure.
3. Differentiate methods of data presentation.
4. Construct Frequency Distribution Tables.
5. Represent frequency distribution tables through histograms and
frequency polygons.


Basic Terms in
Statistics
Developing an understanding of statistical jargon

What is STATISTICS?
Statistics is derived from the Latin word status meaning
state.
Triola, 1998

Statistics is concerned with scientific methods for
collecting, organizing, summarizing, presenting, analyzing,
and interpreting data, and for drawing conclusions based on that
data.
Schaum, 2008

Population and Sample


Sample
Target
population
Population

A population is the complete collection of all elements to be
studied.

A target population is a specific subgroup of the population whose
elements have some common defining characteristic.

A sample is a subcollection of elements drawn from the population.

Parameter and Statistic


A parameter is a numerical measurement describing
some characteristic of a population.
A statistic is a numerical measurement describing some
characteristic of a sample.

Sampling and its
Techniques
Recognizing options for selection of samples

How much sample is enough?


Given the population size, N, the sample size, n, may be
obtained through the formula:

$$ n = \frac{N}{1 + Ne^{2}} $$

where e is the margin of error.

How much sample is enough?

$$ n = \frac{N}{1 + Ne^{2}} $$

Compute the sample size for a population of 1350 with a
margin of error of 5%.
What happens to the sample size as the margin of error is
increased? Explain what this means.
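
The sample-size formula on this slide, n = N / (1 + N e^2), is commonly known as Slovin's formula. A minimal Python sketch of it follows. The function name `slovin_sample_size` is hypothetical, and rounding up to the next whole respondent is an assumption, since the slide does not state a rounding rule.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Sample size n = N / (1 + N * e^2), rounded up (rounding rule is an assumption)."""
    exact = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(exact)

# Population of 1350 at several margins of error, to see how n responds to e.
for e in (0.01, 0.05, 0.10):
    print(f"e = {e:.2f} -> n = {slovin_sample_size(1350, e)}")
```

Running the loop with different margins of error is one way to explore the second question: a larger e makes the denominator larger, so the required sample shrinks.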

Think this through


As the principal of a very large school with a population of
2000, you want to know the level of reading
comprehension of the students in the school. Since you
cannot obtain data on all 2000 students in a short period of time,
you decide to assess the reading comprehension of
students on the honor roll only. The results show that 95%
of these students are at the independent reading level.
You therefore conclude that 95% of the school population
has good reading comprehension skills.
Is the conclusion valid?

The Representative Sample


A more advanced research process is to select individuals
or schools that are representative of the population.
Representative refers to the selection of individuals as a
sample of a population such that the sample is typical of
the population under study, enabling you to draw
conclusions from the sample about the population as a
whole.

Sampling Techniques
Probability Sampling
Simple Random
Sampling
Systematic Sampling
Cluster Sampling

Systematic Sampling
Consider the sample size of 309 from the population of 1350.

Compute the value of k as $k = \frac{N}{n}$.

Take every kth element in the population as a sample.
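
A minimal Python sketch of systematic sampling under a few stated assumptions: the interval k is taken as N divided by n rounded down, the starting position is chosen at random within the first k elements (a common practice that the slide does not specify), and the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Take every k-th element, where k = N // n, from a random start (assumed)."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    k = len(population) // sample_size      # sampling interval
    start = random.Random(seed).randrange(k)  # random start among the first k elements
    # Slicing may yield a few extra elements when N / n is not whole; keep the first n.
    return population[start::k][:sample_size]

students = list(range(1, 1351))             # stand-in IDs for a population of 1350
chosen = systematic_sample(students, 309, seed=1)
print(len(chosen), chosen[:5])
```

Because 1350 / 309 is not a whole number, rounding k down to 4 means the last part of the list is never reached once 309 elements are taken. This is a known limitation of the simple version sketched here.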

Sampling Techniques
Grade level   Distribution of Population   Distribution of Sample
7             250                          87
8             225                          12
9             212                          56
10            178                          119
TOTAL         865                          274

If the samples are selected through random sampling, are
the selected samples representative of the population?
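
One way to approach the question is to compare each grade level's share of the population with its share of the sample. The sketch below is only an illustration. It uses the counts from the table in the order they appear, and the helper name `percentage_shares` is hypothetical.

```python
population = [250, 225, 212, 178]   # population count per grade level, in table order
sample = [87, 12, 56, 119]          # sample count per grade level, in table order

def percentage_shares(counts):
    """Return each count as a percentage of the total, rounded to 2 decimals."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]

population_shares = percentage_shares(population)
sample_shares = percentage_shares(sample)

for pop_share, samp_share in zip(population_shares, sample_shares):
    print(f"{pop_share:6.2f}% of population vs {samp_share:6.2f}% of sample")
```

Large gaps between the two columns suggest that some grade levels are over- or under-represented in the sample.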

Sampling Techniques
Probability Sampling
Simple Random
Sampling
Systematic Sampling
Cluster Sampling
Stratified Sampling

Stratified Sampling
Grade level   Distribution of Population   Percentage   Distribution of Sample
7             250                          28.90%       79
8             225                          26.01%       71
9             212                          24.51%       67
10            178                          20.58%       57
TOTAL         865                          100%         274
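
The sample column of the stratified table appears to follow proportional allocation: each stratum receives the same share of the sample that it has of the population. A minimal sketch follows. The slide does not say how fractions are rounded, so the largest-remainder rule used here is an assumption, and the name `proportional_allocation` is hypothetical.

```python
from math import floor

def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to their population sizes."""
    total = sum(stratum_sizes)
    exact = [size * sample_size / total for size in stratum_sizes]
    allocation = [floor(x) for x in exact]
    leftover = sample_size - sum(allocation)
    # Assumed rounding rule: give remaining units to the largest fractional parts.
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - allocation[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        allocation[i] += 1
    return allocation

print(proportional_allocation([250, 225, 212, 178], 274))
```

With the population counts from the table and a total sample of 274, this rule reproduces the sample sizes shown in the table.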

Sampling Techniques
Probability Sampling
Simple Random
Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling

Non-probability Sampling
Convenience Sampling
Purposive Sampling
Snowball Sampling

Is the sampling sound?


A teacher wants to conduct action research to
determine the effectiveness of home-based family
counseling on the attendance of students. Of her 56
students, she has selected 20 whose residences are within
a kilometer's radius of the school.
What sampling technique did the teacher use?
Do you agree with the strategy she employed?

Descriptive vs. Inferential
Statistics
If a sample is representative of a population, important
conclusions about the population can be inferred from the
analysis of the sample. The phase of statistics that deals with
the conditions under which such inference is valid is called
inferential statistics or inductive statistics.
The phase of statistics that seeks only to describe and
analyze a given group without drawing any conclusion or
inference about a larger group is called descriptive
statistics.

The Nature of Data


Classifying data as a means of understanding their nature

Quantitative or Qualitative?
Quantitative data consist of numbers representing counts
or measurements.
Qualitative data can be separated into different
categories that are distinguished by some nonnumeric
characteristic.

Qualitative data can be artificially quantified.

Discrete or Continuous?
Discrete data result from either a finite number of possible
values or a countable number of possible values.
Continuous data result from infinitely many possible
values that can be associated with points on a continuous
scale in such a way that there are no gaps or interruptions.

Scales of Measure
Nominal scale is characterized by data that consist of names, labels, or
categories only.

Ordinal scale involves data that may be arranged in some order, but
differences between data values either cannot be determined or are
meaningless.

Interval scale involves data for which we can determine meaningful
differences between values. However, there is no inherent zero starting
point.

Ratio scale is the interval scale extended to include an inherent zero
starting point. For these values, differences and ratios are both meaningful.

Sort them out


In her research, a teacher wanted to examine several
variables as factors of academic performance.
As part of her statement of the problem, she indicated:
What is the demographic profile of the student respondents in
terms of: Age, Sex, Year of Birth, Family's Monthly Income,
Order of Birth in the Family, Parents' Educational Attainment,

Classify each of these variables as quantitative or qualitative and
discrete or continuous, and identify its scale of measure.

Methods of Data
Presentation
Understanding ways by which data may be presented. Developing
the skill of constructing a Frequency Distribution Table.

Data Presentation
Data can be presented as text, in tables, or pictorially as graphs
and charts. Figures should not normally be put into text unless
there are just two or three numbers. Tables and graphs are much
clearer. Tables are usually the best way of showing structured
numeric information, whereas graphs and charts are better for
showing relationships, making comparisons and indicating trends.
Even where a graph or chart is used, it is usual to include a table
to show the data from which it was drawn.

Textual
According to the National Statistics Office (NSO), the Philippines has a population of
92,337,852, based on the census that the agency conducted in May 2010. The same census
found that the National Capital Region is home to 11,855,975 people, while the
Cordillera Administrative Region has a population of 1,616,867. In Luzon, the regional
populations are as follows: Region I, 4,748,372; Region II, 3,229,163; Region III, 10,137,737;
Region IVA, 12,609,803; Region IVB, 2,744,671; and Region V, 5,420,411.
In the Visayas, Region VI has a total population of 7,102,438, Region VII has 6,800,180,
and Region VIII has 4,101,322.
For Mindanao, the populations per region are as follows: Region IX, 3,407,353;
Region X, 4,297,323; Region XI, 4,468,563; Region XII, 4,109,571; the Autonomous Region of
Muslim Mindanao, 3,256,140; and CARAGA, 2,429,224.
SOURCE: National Statistics Office Website

Tabular
REGION                                         POPULATION
National Capital Region (NCR)                  11,855,975
Cordillera Administrative Region (CAR)         1,616,867
Region I - Ilocos Region                       4,748,372
Region II - Cagayan Valley                     3,229,163
Region III - Central Luzon                     10,137,737
Region IVA - CALABARZON                        12,609,803
Region IVB - MIMAROPA                          2,744,671
Region V - Bicol Region                        5,420,411
Region VI - Western Visayas                    7,102,438
Region VII - Central Visayas                   6,800,180
Region VIII - Eastern Visayas                  4,101,322
Region IX - Zamboanga Peninsula                3,407,353
Region X - Northern Mindanao                   4,297,323
Region XI - Davao Region                       4,468,563
Region XII - SOCCSKSARGEN                      4,109,571
Autonomous Region of Muslim Mindanao (ARMM)    3,256,140
CARAGA                                         2,429,224
Others* (special cases, e.g., homeless)        2,739

Graphical
[Pie chart: Population of the Philippines by Region (percentage share per region)]

[Bar chart: Population of the Philippines by Region (vertical axis: Population in Millions)]

Frequency Distribution
Tables

A frequency table lists categories or classes of scores along with
counts (or frequencies) of the number of scores that fall into each
category.
These tables may present ungrouped data, which means that each
category is tabulated individually with its corresponding
frequency. Data are grouped into class intervals when there are too
many distinct scores to tabulate and the difference between the
highest and lowest scores is relatively large.

Ungrouped Data
A chef wants to build his own restaurant in a certain area. He
decides to base his menu on the preferred cuisine of the
immediate residents of the area, so he conducts a survey.
Of the 200 residents interviewed, 93 stated a preference for
home-cooked Filipino food. Thirty-nine like Chinese food, while 45 go
for classic American fast food. Sixteen would go for Japanese,
while the rest were undecided.

Ungrouped Data
Cuisine      Number of Residents   Relative Frequency (%)
Filipino     93                    46.50
Chinese      39                    19.50
American     45                    22.50
Japanese     16                    8.00
Undecided    7                     3.50
             N=200
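
The relative frequencies in this table are each category's count divided by N, expressed as a percentage. A minimal sketch using the survey figures from the problem; the variable names are illustrative, and the "Undecided" count is derived as "the rest" of the 200 respondents.

```python
total_respondents = 200
counts = {"Filipino": 93, "Chinese": 39, "American": 45, "Japanese": 16}

# "The rest were undecided": whatever remains of the 200 respondents.
counts["Undecided"] = total_respondents - sum(counts.values())

# Relative frequency (%) = 100 * count / N
relative_frequency = {cuisine: 100 * count / total_respondents
                      for cuisine, count in counts.items()}

for cuisine, count in counts.items():
    print(f"{cuisine:<10} {count:>4} {relative_frequency[cuisine]:>7.2f}")
```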

Ungrouped Data
[Bar chart: Preferred Cuisine by 200 Residents in an Area; bars for Filipino, Chinese, American, Japanese, and Undecided]

Ungrouped Data

[Pie chart: Preferred Cuisine by 200 Residents in an Area; Filipino 46%, American 23%, Chinese 19%, Japanese 8%, Undecided 4%]

Ungrouped Data
A survey was taken on 5th Ave. In each of 20 homes, people were
asked how many cars were registered to their households. The
results were recorded as follows:
1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0
Construct a frequency distribution table for the given data.

Ungrouped Data
Number of Cars Owned   Number of Residents   Relative Frequency (%)
0                      4                     20
1                      6                     30
2                      5                     25
3                      3                     15
4                      2                     10
                       N=20
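
An ungrouped frequency table like this one can be tallied directly from the raw responses. The sketch below uses Python's `collections.Counter`. The function name `frequency_table` is hypothetical, and the data are the 20 survey responses listed in the problem.

```python
from collections import Counter

cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]

def frequency_table(data):
    """Return (value, frequency, relative frequency in %) rows, sorted by value."""
    counts = Counter(data)
    n = len(data)
    return [(value, counts[value], 100 * counts[value] / n)
            for value in sorted(counts)]

for value, f, rf in frequency_table(cars):
    print(f"{value:>3} {f:>3} {rf:>6.1f}")
```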

Grouped Data
The following are the heights (in cm) of 30 students in a school:
98   120  135  107  143  125  120  94   138  99
149  107  160  138  141  161  105  112  121  108
109  119  119  136  153  140  140  115  142  116

Represent the data through a frequency distribution table.

Grouped Data
One. Solve for the RANGE and CLASS INTERVALS
Two. Construct CLASSES starting with the lowest score.
Three. Determine the frequency in each interval.
Height (in cm)   Tally     f
94-105           IIII      4
106-117          IIII-II   7
118-129          IIII-I    6
130-141          IIII-II   7
142-153          IIII      4
154-165          II        2
                           n=30

Grouped Data
Four. Compute for the CLASS MARK of each interval.
Five. Calculate the relative and cumulative frequencies.
Height (in cm)   Tally     f      Class Mark (x)   rf      Cf>   Cf<
94-105           IIII      4      99.5             13.33   30    4
106-117          IIII-II   7      111.5            23.33   26    11
118-129          IIII-I    6      123.5            20.00   19    17
130-141          IIII-II   7      135.5            23.33   13    24
142-153          IIII      4      147.5            13.33   6     28
154-165          II        2      159.5            6.67    2     30
                           n=30                    100
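
The five steps for grouped data can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive procedure: the number of classes (6) is taken from the table, the class width is the smallest whole number that lets the classes cover the full range, and the function name `grouped_frequency_table` is hypothetical.

```python
import math

heights = [98, 120, 135, 107, 143, 125, 120, 94, 138, 99,
           149, 107, 160, 138, 141, 161, 105, 112, 121, 108,
           109, 119, 119, 136, 153, 140, 140, 115, 142, 116]

def grouped_frequency_table(data, num_classes):
    """Build a grouped frequency table for whole-number scores."""
    n = len(data)
    low, high = min(data), max(data)
    # Smallest whole-number width whose classes cover low..high inclusive (assumed rule).
    width = math.ceil((high - low + 1) / num_classes)
    rows = []
    cf_less = 0
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width - 1
        f = sum(lower <= x <= upper for x in data)
        cf_greater = n - cf_less        # scores at or above this class's lower limit
        cf_less += f                    # scores at or below this class's upper limit
        rows.append({
            "class": f"{lower}-{upper}",
            "f": f,
            "class_mark": (lower + upper) / 2,
            "rf": round(100 * f / n, 2),
            "cf_greater": cf_greater,
            "cf_less": cf_less,
        })
    return rows

table = grouped_frequency_table(heights, num_classes=6)
for row in table:
    print(row)
```

The same function could be tried on other whole-number data sets by changing the data list and the number of classes.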

Your Turn
A researcher did a survey on the number of minutes it takes 30
commuters to reach their workplace during rush hour. The data
gathered, in minutes, is given below.
28  26  25  32  10  26  30  10  35  37
40  45  20  26  25  42  39  35  25  13
10  23  33  30  18  30  50  27  55  30

Construct a Frequency Distribution Table for the given data.

Assignment

Download the modules which will be made available on the course
website, mathbychua.weebly.com, as early as December 1, 2016.
Submit the completed modules during the next regular class on
December 10, 2016.