INTRODUCTION TO

STATISTICS

Von Christopher G. Chua, MST

Contact me directly through email:

von_christopher_chua@dlsu.edu.ph

vaughnchua@gmail.com

Readings, assignment and paper details, and important announcements will be relayed through MATHbyCHUA: mathbychua.weebly.com

This slideshow presentation will be made available through the class's official website,

mathbychua.weebly.com. The site will also provide access to download this file in printable

format.

Session Objectives

For this three-hour period, graduate students in education are

expected to develop the following learning competencies:

1. Describe basic terms in statistics

such as population, sample,

parameter, and statistic.

2. Classify data as quantitative or

qualitative, discrete or continuous,

and according to scales of

measure.

presentation.

4. Construct Frequency Distribution

Tables.

5. Represent frequency distribution

tables through histograms and

frequency polygons.


Basic Terms in

Statistics

Developing an understanding of statistical jargon

What is STATISTICS?

Statistics is derived from the Latin word status meaning

state.

Triola, 1998

collecting, organizing, summarizing, presenting, analyzing,

interpreting data and drawing conclusions based on that

data.

Schaum, 2008

Population: the collection of all elements to be studied.

Target population: a subgroup of the population whose elements have some common defining characteristic.

Sample: a subcollection of elements drawn from the population.

A parameter is a numerical measurement describing

some characteristic of a population.

A statistic is a numerical measurement describing some

characteristic of a sample.

Sampling Techniques

Recognizing options for selection of samples

Given the population size, N, the sample size, n, may be obtained through the formula:

$n = \frac{N}{1 + Ne^2}$

where e is the margin of error.

Compute for the sample size from a population size of

1350 with a margin of error of 5%.

What happens to the sample size as the margin of error is

increased? Explain what this means.
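The sample-size formula can be checked with a short script. This is a minimal sketch: the formula is commonly referred to as Slovin's formula, while the function name `slovin_sample_size` and the choice to round up to the next whole respondent are assumptions made for illustration.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Return n = N / (1 + N * e^2), rounded up to a whole respondent."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    exact = population_size / (1 + population_size * margin_of_error ** 2)
    # Assumption: round up so the sample never falls short of the computed size.
    return math.ceil(exact)


print(slovin_sample_size(1350, 0.05))  # 309
print(slovin_sample_size(1350, 0.10))  # 94
```

Doubling the margin of error from 5% to 10% shrinks the required sample from 309 to 94: accepting less precision means fewer respondents are needed.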

As a principal in a very large school with a population of 2000, you want to know the level of reading comprehension of the students in the school. Since you cannot obtain data on all 2000 students in a short period of time, you decide to assess the reading comprehension of only the students on the honor roll. The result shows that 95% of these students are at the independent reading level. You therefore conclude that 95% of the school population has good reading comprehension skills.

Is the conclusion valid?

A more advanced research process is to select individuals or schools who are representative of the population. Representative refers to the selection of individuals as a sample of a population such that the sample is typical of the population under study, enabling you to draw conclusions from the sample about the population as a whole.

Sampling Techniques

Probability Sampling

Simple Random Sampling

Systematic Sampling

Cluster Sampling

Systematic Sampling

Consider the sample size of 309 from the population of 1350.

Compute the value of k as k = N/n.

Select every kth element of the population as a sample.
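The systematic sampling procedure can be sketched in a few lines. This is an illustrative sketch only: the function name `systematic_sample`, the random starting point within the first k elements, and rounding k down to a whole number are assumptions, not steps stated on the slide.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Take every k-th element, where k = N // n, from a random start."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    k = len(population) // sample_size      # sampling interval, rounded down
    rng = random.Random(seed)
    start = rng.randrange(k)                # random start among the first k elements
    # Step through the list by k and stop once n elements are collected.
    return population[start::k][:sample_size]


students = list(range(1, 1351))             # hypothetical ID numbers 1..1350
sample = systematic_sample(students, 309, seed=1)
print(len(sample))                          # 309
```

With N = 1350 and n = 309, the interval is k = 4 after rounding down, so every fourth student on the list is selected until 309 have been taken.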

Sampling Techniques

Grade level   Distribution of Population   Distribution of Sample
(missing)     250                          87
(missing)     225                          12
(missing)     212                          56
10            178                          119
TOTAL         865                          274

Are the selected samples representative of the population?


Stratified Sampling

Grade level   Distribution of Population   Percentage   Distribution of Sample
(missing)     250                          28.90%       79
(missing)     225                          26.01%       71
(missing)     212                          24.51%       67
10            178                          20.58%       57
TOTAL         865                          100%         274
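The proportional allocation behind the stratified sample can be reproduced as follows. This is a sketch under stated assumptions: the stratum labels "A" to "D" are hypothetical stand-ins for the grade levels, and the largest-remainder rule used to hand out the leftover slot is one possible rounding convention, chosen here because it reproduces the sample sizes in the table.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size."""
    total = sum(strata_sizes.values())
    # Integer division gives the whole part and the remainder of each share.
    shares = {name: divmod(size * sample_size, total)
              for name, size in strata_sizes.items()}
    allocation = {name: whole for name, (whole, _) in shares.items()}
    leftover = sample_size - sum(allocation.values())
    # Assumption: leftover slots go to the strata with the largest remainders.
    by_remainder = sorted(shares, key=lambda name: shares[name][1], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


strata = {"A": 250, "B": 225, "C": 212, "D": 178}   # hypothetical labels
print(proportional_allocation(strata, 274))
# {'A': 79, 'B': 71, 'C': 67, 'D': 57}
```

Each stratum receives the same share of the sample as it holds in the population, so the sample mirrors the grade-level distribution of all 865 students.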

Sampling Techniques

Probability Sampling

Simple Random Sampling

Systematic Sampling

Stratified Sampling

Cluster Sampling

Non-probability Sampling

Convenience Sampling

Purposive Sampling

Snowball Sampling

A teacher wants to conduct action research to determine the effectiveness of home-based family counseling on the attendance of students. Of her 56 students, she has selected 20 whose residences are within a one-kilometer radius of the school.

What sampling technique did the teacher use?

Do you agree with the strategy she has employed?

Statistics

If a sample is representative of a population, important conclusions about the population can be inferred from the analysis of the sample. The phase of statistics that deals with the conditions under which such inference is valid is called inferential statistics or inductive statistics.

The phase of statistics that seeks only to describe and

analyze a given group without drawing any conclusion or

inference about a larger group is called descriptive

statistics.

Classifying data as a means of understanding their nature

Quantitative or Qualitative?

Quantitative data consist of numbers representing counts

or measurements.

Qualitative data can be separated into different

categories that are distinguished by some nonnumeric

characteristics.

Discrete or Continuous?

Discrete data result from either a finite number of possible

values or a countable number of possible values.

Continuous data result from infinitely many possible

values that can be associated with points on a continuous

scale in such a way that there are no gaps or interruptions.

Scales of Measure

Nominal scale is characterized by data that consist of names, labels, or

categories only.

Ordinal scale involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless.

Interval scale involves data for which we can determine meaningful differences between values. However, there is no inherent zero starting point.

Ratio scale is the interval scale extended to include an inherent zero starting point. For these values, differences and ratios are both meaningful.

In her research, a teacher wanted to examine several

variables as factors of academic performance.

As part of her statement of the problem, she indicated:

What is the demographic profile of the student respondents in

terms of: Age, Sex, Year of Birth, Family's Monthly Income,

Order of birth in the family, Parents' Educational Attainment,

Classify each variable as quantitative or qualitative, discrete or continuous, and identify its scale of measure.

Methods of Data

Presentation

Understanding ways by which data may be presented. Developing

the skill of constructing a Frequency Distribution Table.

Data Presentation

Data can be presented as text, in tables, or pictorially as graphs

and charts. Figures should not normally be put into text unless

there are just two or three numbers. Tables and graphs are much

clearer. Tables are usually the best way of showing structured

numeric information, whereas graphs and charts are better for

showing relationships, making comparisons and indicating trends.

Even where a graph or chart is used, it is usual to include a table

to show the data from which it was drawn.

Textual

According to the National Statistics Office (NSO), the Philippines has a population of 92,337,852. This is based on the census that the agency conducted in May 2010. In the same census, it was found that the National Capital Region is home to 11,855,975 people, while the Cordillera Administrative Region has a population of 1,616,867. In Luzon, the regional populations are as follows: Region I, 4,748,372; Region II, 3,229,163; Region III, 10,137,737; Region IVA, 12,609,803; Region IVB, 2,744,671; and Region V, 5,420,411.

In the Visayas, Region VI has a total population of 7,102,438, Region VII has 6,800,180, and Region VIII has 4,101,322.

For Mindanao, the populations per region are as follows: Region IX, 3,407,353; Region X, 4,297,323; Region XI, 4,468,563; Region XII, 4,109,571; the Autonomous Region of Muslim Mindanao, 3,256,140; and CARAGA, 2,429,224.

SOURCE: National Statistics Office Website

Tabular

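REGION                                         POPULATION
National Capital Region (NCR)                  11,855,975
Cordillera Administrative Region (CAR)         1,616,867
Region I                                       4,748,372
Region II                                      3,229,163
Region III                                     10,137,737
Region IVA                                     12,609,803
Region IVB                                     2,744,671
Region V                                       5,420,411
Region VI                                      7,102,438
Region VII                                     6,800,180
Region VIII                                    4,101,322
Region IX                                      3,407,353
Region X                                       4,297,323
Region XI                                      4,468,563
Region XII                                     4,109,571
Autonomous Region of Muslim Mindanao (ARMM)    3,256,140
CARAGA                                         2,429,224
Others*, Special Cases (e.g., homeless)        2,739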

Graphical

[Pie chart: Population of the Philippines by Region, showing each region's percentage share of the total population]

[Bar chart: Population in Millions, by region]

Frequency Distribution

Tables

A frequency distribution table lists categories of scores together with counts (or frequencies) of the number of scores that fall into each category.

These tables may present ungrouped data, which means that each category is tabulated individually with its corresponding frequency. Data are grouped when there are too many scores to tabulate individually and the difference between the highest and lowest scores is relatively large.

Ungrouped Data

A chef wants to build his own restaurant in a certain area. He decides to base his menu on the preferred cuisine of the immediate residents of the area, so he conducts a survey.

Of the 200 residents interviewed, 93 stated a preference for home-cooked Filipino food. Thirty-nine like Chinese food, while 45 go for classic American fast food. On the other hand, 16 would go for Japanese food, while the rest were undecided.

Ungrouped Data

Cuisine      Number of Residents   Relative Frequency (%)
Filipino     93                    46.50
Chinese      39                    19.50
American     45                    22.50
Japanese     16                    8.00
Undecided    7                     3.50
Total        N=200                 100.00

Ungrouped Data

[Bar chart: Preferred Cuisine by 200 Residents in an Area, with the number of residents (0 to 100) on the vertical axis and cuisine on the horizontal axis]

Ungrouped Data

[Pie chart of preferred cuisine among residents in an area: Filipino 46%, American 23%, Chinese 19%, Japanese 8%, Undecided 4%]

Ungrouped Data

A survey was taken on 5th Ave. In each of 20 homes, people were

asked how many cars were registered to their households. The

results were recorded as follows:

1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0

Construct a frequency distribution table for the given data.

Ungrouped Data

Number of Cars Owned   Number of Residents   Relative Frequency (%)
0                      4                     20
1                      6                     30
2                      5                     25
3                      3                     15
4                      2                     10
Total                  N=20                  100
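The ungrouped table for the car survey can be built by counting how often each value occurs. The sketch below uses Python's standard `collections.Counter`; the function name `frequency_table` is an assumption for illustration.

```python
from collections import Counter

# Number of cars registered to each of the 20 surveyed households.
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]


def frequency_table(data):
    """Return (value, frequency, relative frequency in %) rows, sorted by value."""
    counts = Counter(data)
    n = len(data)
    return [(value, counts[value], 100 * counts[value] / n)
            for value in sorted(counts)]


for value, freq, rel in frequency_table(cars):
    print(value, freq, rel)
# 0 4 20.0
# 1 6 30.0
# 2 5 25.0
# 3 3 15.0
# 4 2 10.0
```

Because each distinct value gets its own row, this approach suits ungrouped data with only a few possible values.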

Grouped Data

The following are the heights (in cm) of 30 students in a school:

98 120 135 107 143 125 120 94 138 99
149 107 160 138 141 161 105 112 121 108
109 119 119 136 153 140 140 115 142 116

Grouped Data

One. Solve for the RANGE and CLASS INTERVALS

Two. Construct CLASSES starting with the lowest score.

Three. Determine the frequency in each interval.

Height (in cm)   Tally      Frequency
94-105           IIII       4
106-117          IIII-II    7
118-129          IIII-I     6
130-141          IIII-II    7
142-153          IIII       4
154-165          II         2
Total                       n=30
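Steps One to Three can be sketched in code. This is an illustration under assumptions: the class width of 12 is read off the class limits in the table (the range is 161 - 94 = 67), and the function name `grouped_frequency` is hypothetical. It assumes whole-number data.

```python
heights = [98, 120, 135, 107, 143, 125, 120, 94, 138, 99,
           149, 107, 160, 138, 141, 161, 105, 112, 121, 108,
           109, 119, 119, 136, 153, 140, 140, 115, 142, 116]


def grouped_frequency(data, class_width):
    """Build classes from the lowest score and count the scores in each one."""
    lower, highest = min(data), max(data)
    table = []
    while lower <= highest:
        upper = lower + class_width - 1          # inclusive upper class limit
        count = sum(1 for value in data if lower <= value <= upper)
        table.append((lower, upper, count))
        lower = upper + 1                        # next class starts right after
    return table


print(max(heights) - min(heights))               # range: 67
for lower, upper, count in grouped_frequency(heights, 12):
    print(f"{lower}-{upper}: {count}")
# 94-105: 4
# 106-117: 7
# 118-129: 6
# 130-141: 7
# 142-153: 4
# 154-165: 2
```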

Grouped Data

Four. Compute for the CLASS MARK of each interval.

Five. Calculate the relative and cumulative frequencies.

Height (in cm)   Tally      Class Mark (x)   f      rf (%)   Cf>   Cf<
94-105           IIII       99.5             4      13.33    30    4
106-117          IIII-II    111.5            7      23.33    26    11
118-129          IIII-I     123.5            6      20.00    19    17
130-141          IIII-II    135.5            7      23.33    13    24
142-153          IIII       147.5            4      13.33    6     28
154-165          II         159.5            2      6.67     2     30
Total                                        n=30   100
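Steps Four and Five can be sketched the same way. The sketch assumes the class mark is the midpoint of the class limits, that Cf< accumulates frequencies from the lowest class upward, and that Cf> accumulates from the highest class downward, which is consistent with the values shown in the table. The function name and dictionary keys are hypothetical.

```python
from itertools import accumulate

# (lower limit, upper limit, frequency) for each class of the height data.
classes = [(94, 105, 4), (106, 117, 7), (118, 129, 6),
           (130, 141, 7), (142, 153, 4), (154, 165, 2)]


def summarize(classes):
    """Add class mark, relative frequency (%), and cumulative frequencies."""
    freqs = [f for _, _, f in classes]
    n = sum(freqs)
    cf_less = list(accumulate(freqs))                       # running total, low to high
    cf_greater = list(accumulate(reversed(freqs)))[::-1]    # running total, high to low
    rows = []
    for (lower, upper, f), less, greater in zip(classes, cf_less, cf_greater):
        rows.append({
            "class": f"{lower}-{upper}",
            "class_mark": (lower + upper) / 2,
            "f": f,
            "rf": round(100 * f / n, 2),
            "cf_greater": greater,
            "cf_less": less,
        })
    return rows


for row in summarize(classes):
    print(row)
```

The last Cf< value and the first Cf> value both equal n = 30, which is a quick check that no score was missed.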

Your Turn

A researcher did a survey on the number of minutes it takes 30

commuters to reach their workplace during rush hour. The data

gathered, in minutes, is given below.

28 26 25 32 10 26 30 10 35 37
40 45 20 26 25 42 39 35 25 13
10 23 33 30 18 30 50 27 55 30

Assignment

website, mathbychua.weebly.com, as early as December 1, 2016.

Submit the completed modules during the next regular class on

December 10, 2016.

