STATISTICS
Statistical thinking
Doing exercises
Group presentation
Self-study
GRADING BREAKDOWN
MARK (%) FORM OF ASSESSMENT
In more detail, statistics covers several major tasks:
- Collect
- Describe
- Summarize
- Present
- Analyze
Areas where STATISTICS are used
- Business: Economics, Engineering, Marketing, Computer Science
- Physical Sciences: Astronomy, Chemistry, Physics
- Health & Medicine: Genetics, Clinical Trials, Epidemiology, Pharmacology
- Environment: Agriculture, Ecology, Forestry, Animal Populations
- Government: Census, Law, National Defense
Source: American Statistical Association
Applications in
Business and Economics
Accounting
Public accounting firms use statistical
sampling procedures when conducting audits
for their clients.
Economics
Economists use statistical
information in making forecasts
about the future of the economy
or some aspects of it.
Marketing
Electronic point-of-sale scanners at retail
checkout counters are used to collect data
for a variety of marketing research
applications.
Production
A variety of statistical quality
control charts are used to monitor
the output of a production
process.
Finance
Financial advisors use price-earnings ratios and dividend
yields to guide their investment recommendations.
II/ Definitions
1/ Population is the WHOLE set of all items or
individuals of interest
2/ Sample is an observed subset of population values
3/ Variable is a characteristic that changes or varies over
time and/or for different individuals or objects under
consideration
Population vs. Sample
Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y
III/ Descriptive statistics and Inferential statistics
Statistics
- Descriptive Statistics
- Inferential Statistics
1/ Descriptive statistics
Descriptive statistics: methods used to summarize
and describe the main features of the whole population
in quantitative terms.
Tabular, graphical, and numerical methods (mean,
median, variance, standard deviation, …)
Used when we can enumerate the whole population
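As an illustration of the numerical methods listed above, here is a minimal Python sketch using the standard `statistics` module. The data set `hours` is hypothetical (weekly part-time working hours for a small, fully enumerated population of 8 students); it is not taken from the course material.

```python
import statistics

# Hypothetical data: weekly part-time hours for a whole population of 8 students.
hours = [10, 12, 8, 15, 20, 12, 9, 18]

mean = statistics.mean(hours)            # arithmetic average
median = statistics.median(hours)        # middle value of the sorted data
variance = statistics.pvariance(hours)   # population variance (divides by N)
std_dev = statistics.pstdev(hours)       # population standard deviation

print("mean:", mean)
print("median:", median)
print("variance:", variance)
print("standard deviation:", round(std_dev, 3))
```

The population versions (`pvariance`, `pstdev`) are used because the slide assumes the whole population can be enumerated; for a sample, `statistics.variance` and `statistics.stdev` would be the usual choice.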
Descriptive Statistics
- Collect data
e.g., Survey, Observation,
Experiments
- Present data
e.g., Charts and graphs
- Characterize data
e.g., Sample mean = (Σ x_i) / n
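Scales of Measurement
(Figure: levels of measurement. The ratio/interval scale is the highest level of measurement and allows complete analysis; the ordinal scale ranks below it.)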
1. Firewood
2. Coal
3. Oil
4. Gas
Scales of Measurement
Ordinal
( ) Firewood
( ) Coal
( ) Oil
( ) Gas
Scales of Measurement
Interval
-3 -2 -1 +1 +2 +3
1. Firewood.................VND
2. Coal.........................VND
3. Oil............................VND
4. Gas..........................VND
Example: below is a survey of FTU students. Classify the variable
behind each question as quantitative or qualitative, and identify
its scale of measurement
1. Full name:..........................................
2. Sex: Male Female
3. Age :
4. Year of study:
1st 2nd 3rd 4th
5. a/ Have you got a part-time job?
Yes No
b/ If yes, how many hours per week?...........
c/ How well do you think your part-time job fits
your field of study?
Very suitable Not at all
5 4 3 2 1
DATA COLLECTION
Methods of Data Collection:
Census
Sample survey
Experiment
Observational study
Census. A census is a study that obtains data from every member of a
population. In most studies, a census is not practical, because of the cost
and/or time required.
Sample survey. A sample survey is a study that obtains data from a
subset of a population, in order to estimate population attributes.
Experiment. An experiment is a controlled study in which the
researcher attempts to understand cause-and-effect relationships. The
study is "controlled" in the sense that the researcher controls (1) how
subjects are assigned to groups and (2) which treatments each group
receives.
Observational study. Like experiments, observational studies
attempt to understand cause-and-effect relationships. However, unlike
experiments, the researcher is not able to control (1) how subjects are
assigned to groups and/or (2) which treatments each group receives.
Survey Design Steps
Define the issue
What are the purpose and objectives of the survey?
Demographic Questions
◦ Questions about the respondents’ personal characteristics
Example: Gender: __Female __ Male
Populations and Samples
A Population is the set of all items or individuals of interest
◦ Examples: All likely voters in the next election
All parts produced today
All sales receipts for November
Convenience sample
• A convenience sample is made up of
people who are easy to reach
Statistical Sampling
Items of the sample are chosen based on known or calculable
probabilities
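A minimal sketch of the simplest probability sample, a simple random sample without replacement, is shown below in Python. It reuses the 26 lettered items from the Population vs. Sample slide; the sample size `n = 9` and the seed are assumptions chosen for illustration only.

```python
import random

# The 26 items from the Population vs. Sample slide.
population = list("abcdefghijklmnopqrstuvwxyz")
n = 9  # assumed sample size

rng = random.Random(42)              # fixed seed so the draw is reproducible
sample = rng.sample(population, n)   # sampling without replacement

# Under simple random sampling every item has the same known
# chance of being included in the sample: n / N.
inclusion_probability = n / len(population)

print("sample:", sorted(sample))
print("inclusion probability:", round(inclusion_probability, 3))
```

Because the selection probability of each item is known in advance, results from such a sample can be generalized to the population in a way that a convenience sample does not support.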
Probability Samples
(Figure: a population divided into 4 strata, with the sample drawn from the strata.)
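The stratified design pictured above (a population divided into 4 strata) could be sketched in Python as follows. The strata (students grouped by year of study), the stratum sizes, and the helper name `stratified_sample` are hypothetical and serve only as an illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of equal size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical population: 40 students in each of 4 strata (year of study).
strata = {
    "1st year": [f"Y1-{i}" for i in range(1, 41)],
    "2nd year": [f"Y2-{i}" for i in range(1, 41)],
    "3rd year": [f"Y3-{i}" for i in range(1, 41)],
    "4th year": [f"Y4-{i}" for i in range(1, 41)],
}

chosen = stratified_sample(strata, n_per_stratum=5, seed=1)
for name, members in chosen.items():
    print(name, members)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the same total size does not guarantee.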
Systematic Samples
Decide on sample size: n
Divide frame of N individuals into groups of k individuals:
k=N/n
Randomly select one individual from the 1st group
Select every kth individual thereafter
Example: N = 64, n = 8, so k = 8
(Figure: the first group of 8 individuals is highlighted.)
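The steps above can be sketched in Python using the slide's values N = 64 and n = 8. The function name `systematic_sample`, the numbering of individuals from 1 to 64, and the seed are assumptions; the sketch also assumes N is an exact multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick one random individual from the first group, then every k-th one."""
    N = len(frame)
    k = N // n                   # group size; assumes N is a multiple of n
    rng = random.Random(seed)
    start = rng.randrange(k)     # random position inside the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 65))       # individuals numbered 1..64, so N = 64
sample = systematic_sample(frame, n=8, seed=0)
print(sample)
```

Only the starting point is random; once it is chosen, the remaining 7 individuals are fixed at intervals of k = 8.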
Cluster Samples
Population is divided into several “clusters,” each
representative of the population
A simple random sample of clusters is selected
All items in the selected clusters can be used, or items can be
chosen from a cluster using another probability sampling
technique
(Figure: a population divided into 16 clusters, with randomly selected clusters forming the sample.)
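A minimal Python sketch of the cluster design pictured above, with the population divided into 16 clusters. The number of clusters drawn (4), the cluster contents, and the helper name `cluster_sample` are hypothetical. Here all items in the selected clusters are used, which is one of the two options described on the slide.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every item inside them."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), n_clusters)
    items = [item for cid in chosen_ids for item in clusters[cid]]
    return chosen_ids, items

# Hypothetical population: 16 clusters with 5 items each.
clusters = {
    c: [f"cluster{c}-item{i}" for i in range(1, 6)]
    for c in range(1, 17)
}

chosen_ids, items = cluster_sample(clusters, n_clusters=4, seed=7)
print("selected clusters:", chosen_ids)
print("number of sampled items:", len(items))
```

Unlike stratified sampling, only some groups are visited, so the approach relies on each cluster being representative of the population.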
BIAS IN SURVEY SAMPLING
Bias often occurs when the survey sample does not accurately
represent the population
Two causes of bias:
Selection bias
Response bias
Selection bias
Results from an unrepresentative sample
3 types of selection bias
Undercoverage
Non-response
Voluntary response
To improve survey quality: use random sampling
Response bias
Results from problems in the measurement process
Two common causes:
Leading question
Social desirability
Learn to View Statistics with a
Critical Eye
There are three kinds of lies:
Lies
Damn Lies
Statistics
You need to make statistics work for you, not lie for
you!
Alert
“Statistics don’t lie, statisticians do.”
Exercise 1
Classify the variable implied by each of these 10 items as quantitative or
qualitative, and identify its scale of measurement
1. Age of household head
2. Sex of household head
3. Number of people in household
4. Use of electric heating (yes/no)
5. Number of large appliances used daily
6. Average number of hours heating is on
7. Average number of heating days
8. Household income
9. Average monthly electric bill
10. Ranking of this electric company among 4 electricity suppliers
Problem
An auto analyst is conducting a satisfaction survey, sampling from a list of
10,000 new car buyers. The list includes 2,500 Ford buyers, 2,500 GM
buyers, 2,500 Honda buyers, and 2,500 Toyota buyers. The analyst selects a
sample of 400 car buyers, by randomly sampling 100 buyers of each brand.
Is this an example of a simple random sample?
(A) Yes, because each buyer in the sample was randomly sampled.
(B) Yes, because each buyer in the sample had an equal chance of being
sampled.
(C) Yes, because car buyers of every brand were equally represented in
the sample.
(D) No, because not every possible 400-buyer sample had an equal
chance of being chosen.
(E) No, because the population consisted of purchasers of four different
brands of car.
Problem
Which of the following statements are true?
I. Random sampling is a good way to reduce response bias.
II. To guard against bias from undercoverage, use a convenience
sample.
III. Increasing the sample size tends to reduce survey bias.
IV. To guard against nonresponse bias, use a mail-in survey.
(A) I only
(B) II only
(C) III only
(D) IV only
(E) None of the above.