Statistics

Prelims

I. Research and Concepts

Statistic – a fact or piece of data from a study of large numerical data.

Index – an indicator, sign, or measure of something.

Why study statistics?

1. It helps us process data/information, data being the raw material of knowledge; and
2. To learn things and inform our lives.

Statistics – the study of how best to collect, analyze, and draw conclusions from data.

The Scientific Method:

1. Ask a question or address a problem
2. Research
3. Hypothesis
4. Experiment
5. Analysis
6. Conclusion

Social Science Research and Terminologies

A population (N) is an individual or group that represents all the members of a certain group or category of interest.

A value generated from, or applied to, a population is called a parameter.

A sample (n) is a subset drawn from a larger population.

A value generated from, or applied to, a sample is called a statistic.

Descriptive and Inferential Statistics

Descriptive statistics apply only to the members of a sample or population from which data have been collected. They exist to describe and simplify information.

Inferential statistics: using sample data to infer/conclude about the characteristics of the larger population. They bridge the known (sample) to the unknown (population).

Sampling Issues

Random sampling: every member of a population has an equal chance of being selected into a sample.

Representative sampling: selecting cases so that they will match the larger population on specific characteristics.

Convenience sampling: selects participants on the basis of proximity, ease of access, and willingness to participate.

II. Variables and Scales of Measurement

Variables and Constants

A variable is pretty much anything that can be codified and has more than a single value.

For example: sex, age, height, attitudes about school, score on a test.

A constant, in contrast, has only a single score.

Quantitative and Qualitative Variables

A quantitative (continuous/discrete) variable is one that is scored in such a way that the numbers, or values, indicate some sort of amount.

E.g.: Height, Age, Number of Children, Year Level

Qualitative variables are those for which the assigned values do not indicate more or less of a certain quality.

E.g.: Class, Sex (dichotomous), Ethnicity, Religion, Year Level

Scales of Measurement

Qualitative Scales:

1. Nominal: Categories or labels.

E.g., Sex (male/female), civil status (single/married/divorced), jersey #

2. Ordinal: Categories or labels indicating rank or "order" (hence, ordinal), but no meaningful distance between scores. Order matters, distance does not.

E.g., Eldest/Middle/Youngest, 1st/2nd/3rd, Champion/Runner-up/Last

Quantitative Scales:
Interval and Ratio Measures

Note: Continuous – infinite numbers between measures.

Discrete – absolute values without values in between.

Interval Variables

Numbers have order (like ordinal), PLUS clear, equal, and meaningful intervals between values.

E.g., Grade; the difference between 100 and 99 (1 interval between 2 values) is the same as the difference between 82 and 81 (1 interval between 2 values).

Rating scales (1–5): 1-Strongly Disagree, 2-Disagree, 3-Neutral, 4-Agree, 5-Strongly Agree

Ratio Variables

Like interval (numerical and with clear, equal, and meaningful intervals), PLUS ratios are meaningful (twice as much) and there is a true zero point (zero means the absence of what you are measuring).

E.g., Weight; 10 lbs. is twice as much as 5 lbs. (ratio); 0 lbs. means no weight or absence of weight (true, meaningful zero point).

No. of Children; 4 children is twice as many as 2 children (ratio); 0 children means absence of children (true zero point).

Comparing Scales

Ordinal vs. Interval

Interval: Grade; a one (1) point difference is the same at all points in the scale.

Ordinal: Place in a race; 1st, 2nd, 3rd. The difference between 1st and 2nd places may not be as close/far as the difference between 3rd and 4th places.

III. Measures of Central Tendency

Distribution – any collection of scores on a variable.

Mean, Median, and Mode

Mean = Average; central point in a set of data; balance point.

Median = "Middle value"; cuts the distribution into upper and lower halves.

Mode = Most frequent value; "peak(s)" of the distribution.

Ordinal data has a median and a mode only*, and nominal data has only a mode.

* A consensus has not been reached among statisticians about whether the mean can be used with ordinal data.

Strengths and Weaknesses

The HOUSE OF MEAN

Hero (Central Measure): Mean
Sidekicks (Variability Measures): Standard Deviation and Variance
Strength: Precise/Exact
Weakness: Prone to being influenced by outliers (not Robust)

The HOUSE OF MEDIAN

Hero (Central Measure): Median
Sidekicks (Variability Measures): Minimum, Maximum, Inter-quartile Range
Strength: Not influenced by outliers (Robust)
Weakness: Not as precise/exact as the mean

Outliers – data that are very much bigger or smaller than the next nearest data point.

IV. Measures of Variability

Range

The difference between the largest and smallest values in a distribution.

Range = Max - Min

Range defines the broadness of the base of a distribution.

Variance

The variance measures how far each number in the set is from the mean.

Standard Deviation

Standard = average; Deviation = distance of values from the mean.

Standard deviation is the average distance of values from the mean.
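A minimal sketch of these measures in Python (standard library only); the scores are hypothetical:

```python
import statistics

scores = [70, 75, 75, 80, 85, 90, 98]  # hypothetical test scores

# Central tendency
print(statistics.mean(scores))    # mean: the balance point
print(statistics.median(scores))  # median: the middle value (80)
print(statistics.mode(scores))    # mode: the most frequent value (75)

# Variability
print(max(scores) - min(scores))    # Range = Max - Min (28)
print(statistics.variance(scores))  # variance: squared distances from the mean
print(statistics.stdev(scores))     # SD: average distance of values from the mean
```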
V. Mean, Median, and Outliers

5-Number Summary

1. Median
2. Minimum
3. Maximum
4. Inter-quartile Range (IQR) - divides the distribution into 4 equal quarters

IQR = Q3 - Q1, where:

Q1 - median of the lower half (25% of the data is below Q1)
Q3 - median of the upper half (75% of the data is below Q3)

Boxplots

Boxplots are standardized graphical representations of the 5-number summary.

Median vs. Mean

1. We still want the mean because of its exactness (for n) and precision (to the mean of N); and
2. The median helps us track and trim outliers.
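A short Python sketch of the 5-number summary and the IQR above (standard library; the data are made up):

```python
import statistics

data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 20]  # hypothetical distribution

q1, q2, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
print(min(data), q1, q2, q3, max(data))       # 5-number summary: Min, Q1, Median, Q3, Max
print(q3 - q1)                                # IQR = Q3 - Q1
```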
VI. The Normal Distribution

The normal distribution refers to a family of continuous probability distributions described by the normal equation. The value of the random variable Y is:

Y = [1 / (σ√(2π))] · e^(−(x − μ)² / 2σ²)

where μ is the mean and σ is the standard deviation.

The normal curve is a mere representation of probabilities.

Normal Distribution (Bell Curve; Gaussian Distribution)

All normal distributions are symmetric and have bell-shaped density curves with a single peak (unimodal).

Two specific measures:

1. The MEAN (center and peak of the density)
2. The STANDARD DEVIATION (spread or girth of the bell curve)

The Mean and Standard Deviation

The Mean Defines the Center

In a normal distribution...

1. Mean = median = mode
2. Symmetry about the center

50% of values are less than the mean and 50% are greater than the mean.

The Standard Deviation Defines the Spread

- 68% of values are within 1 standard deviation of the mean
- 95% of values are within 2 standard deviations of the mean
- 99.7% of values are within 3 standard deviations of the mean

A.k.a. the empirical rule, the three-sigma rule, or the 68-95-99.7 rule.
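A quick Python simulation to check the rule (hypothetical IQ-like scores with mean 100 and SD 15):

```python
import random

# Simulate 100,000 draws from a normal distribution
values = [random.gauss(100, 15) for _ in range(100_000)]

for k in (1, 2, 3):
    share = sum(abs(v - 100) <= k * 15 for v in values) / len(values)
    print(f"within {k} SD of the mean: {share:.1%}")  # ~68%, ~95%, ~99.7%
```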
Normal Distributions and Your World

1. It is the most common distribution in nature (as distributions go).

- Normal distributions are also called "natural distributions"
- Height, weight, intelligence, etc.

2. Statistical relationships become clear if one assumes the normal distribution.

- Does IQ lead to success?
- Will material things lead you to more happiness?
- Is studying important to career growth?

VII. Normal Distribution Concepts

The distribution curve is just a model that seeks to reflect reality.

Describing Distributions: Skewness and Kurtosis

Skewness

Skew: degree of deviation from the normal in terms of asymmetrical extension of the tails.
Normal distributions have a skew of 0.

Kurtosis

The shape of a distribution of scores in terms of its flatness or peakedness.

Normal distributions have a standard kurtosis of 3.

Platykurtic = flat; k < 3

Leptokurtic = thin; k > 3

Locating Values via Standardization: Percentiles and Z-scores

Standardization: the process of converting a raw score into a standard score.

Raw scores: individual observed scores on measured variables.

Percentile (%ile): the percentage of scores below a certain value in the distribution.

Standard score (z-score): a raw score expressed in standard deviation units from the mean: z = (raw score − mean) / standard deviation.
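A minimal sketch of standardization in Python (standard library; the mean and SD are hypothetical):

```python
from statistics import NormalDist

mu, sigma = 100, 15   # hypothetical population mean and standard deviation
raw = 130             # a raw score

z = (raw - mu) / sigma      # z-score: distance from the mean in SD units
pct = NormalDist().cdf(z)   # proportion of scores below this value

print(z)             # 2.0 -> two standard deviations above the mean
print(f"{pct:.1%}")  # ~97.7th percentile
```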

VIII. Central Limit Theorem

1. Theorems are found in math.
2. They are statements that have been proved using proved statements (theorems).
3. Unlike theories, theorems are deductive.

The CLT (in simple terms)

1) Sample means will be distributed roughly as a normal distribution around the population mean.

All of this will be true no matter what the distribution of the underlying population looks like.

2) The distribution of means will approach a normal shape faster if...
a) the population from which the sample is taken is a normal distribution; OR

b) the sample size (n) is relatively large (30 or more).
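A small simulation (Python, standard library) of the first CLT point, using an assumed, deliberately non-normal population:

```python
import random
import statistics

# A skewed, non-normal population (exponential-like values, mean ~10)
population = [random.expovariate(1 / 10) for _ in range(100_000)]

# Draw many samples of n = 30 and record each sample's mean
sample_means = [statistics.mean(random.sample(population, 30))
                for _ in range(5_000)]

print(statistics.mean(population))    # population mean (~10)
print(statistics.mean(sample_means))  # sample means center on it
# Plotted as a histogram, sample_means would look roughly normal
# even though the population itself is skewed.
```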

Making Inferences

If we have detailed information about some population…

Then we can make powerful inferences about any properly drawn sample from that population.

If we have detailed information about a properly drawn sample (mean and standard deviation)...

We can make strikingly accurate inferences about the population from which that sample was drawn.

If we have data describing a particular sample, and data on a particular population…

We can infer whether or not that sample is consistent with a sample that is likely to be drawn from that population.

Last, if we know the underlying characteristics of two samples…

We can infer whether or not both samples were likely drawn from the same population.

Midterms

IX. Inferential Statistics

1. Samples allow you to make good inferences about the sample itself; and
2. Samples allow you to make good inferences about the population as a whole.

#1 is called internal validity (the state of being factual or logically sound).

#2 is called external validity.

Inferential statistics are techniques that allow us to use samples in making generalizations about the populations from which the samples were drawn.
The Standard Error

The standard error is the standard deviation of sample means. It is a measure of how representative a sample is likely to be of the population.

Interpreting Standard Errors

Large standard errors (relative to the sample mean): high variability between the means of different samples… Some samples might not actually represent the population.

Small standard errors: most sample means are similar to the population mean… Our samples are likely to be accurate reflections of the population.
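In practice the standard error is usually estimated from a single sample as the sample SD divided by √n (a standard result, not stated in the notes above); a sketch with hypothetical data:

```python
import math
import statistics

sample = [12, 9, 11, 14, 10, 8, 13, 12, 9, 11]  # hypothetical sample

# Estimated standard error: SE = s / sqrt(n)
se = statistics.stdev(sample) / math.sqrt(len(sample))
print(se)
```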


Confidence Intervals

A confidence interval for the mean is a range of scores within which the population mean will fall in 95% of samples.

In 100 samples, the confidence intervals of 95 samples would contain the true value of the mean in the population.
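A sketch of a 95% confidence interval for the mean (hypothetical data, normal approximation):

```python
import math
import statistics

sample = [12, 9, 11, 14, 10, 8, 13, 12, 9, 11]  # hypothetical sample
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

# 95% CI via the normal approximation (z = 1.96); for small samples,
# a t critical value would give a slightly wider, more precise interval.
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"95% CI: ({low:.2f}, {high:.2f})")
```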
X. Hypothesis

Definition: A hypothesis is an educated guess that can be tested.

An educated guess is a guess that is based on theories.

That it can be tested means that it is not a truism (e.g., I hypothesize that the sun will shine and set tomorrow).

Two Types

1. Alternative Hypotheses (H1) denote the presence of an effect. A.k.a. the Experimental Hypothesis if the methodology is experimental.

2. Null Hypotheses (H0) denote the absence of an effect.

The null hypothesis was made to be rejected. =(

We need H0 because we cannot prove the alternative hypothesis using statistics... but we can reject the null hypothesis.

If our data give us confidence to reject the null hypothesis, then this provides support (not proof) for our alternative/experimental hypothesis.
Science is not about proving an effect (accepting H1), but disproving the absence of an effect (rejecting H0).

Directional and Non-Directional Hypotheses

A directional hypothesis states that an effect will occur, and it also states the direction of the effect.

"Students will know more about research methods after taking EDP 211."

A non-directional hypothesis states that an effect will occur, but it doesn't state the direction of the effect.

"Students' knowledge of research methods will change after EDP 211."

Hypothesis Testing

Steps in Hypothesis Testing

1. Determine H0 and H1;
2. Collect data and calculate the test statistic;
3. Check the p-value to determine if the effect just happened by chance; and
4. Given the p-value, make a decision:
   a. If the effect is likely to have happened by chance, do not reject H0; OR
   b. If the effect is unlikely to have happened by chance, reject H0.

P-Value

Developed by Ronald Fisher, the p-value is the probability that an effect happened by chance (a false alarm): the probability that the value observed (or a more extreme one) would happen by chance if the null hypothesis were true.
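A sketch of these steps as a one-sample t-test in Python (this assumes SciPy is available; the scores and the hypothesized mean of 75 are made up for illustration):

```python
from scipy import stats

scores = [78, 82, 74, 90, 85, 77, 88, 81]  # hypothetical sample

# Step 1: H0: the true mean is 75; H1: it is not.
# Step 2: calculate the test statistic and p-value.
t_stat, p_value = stats.ttest_1samp(scores, popmean=75)
print(t_stat, p_value)

# Steps 3-4: compare the p-value against the significance level.
alpha = 0.05
print("reject H0" if p_value <= alpha else "do not reject H0")
```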

P-values and Directionality

One-Tailed and Two-Tailed Tests

One-tailed tests are statistical tests that look out for an effect on one tail of the distribution (directional). P-value significance level for a one-tailed test: 0.05.

Two-tailed tests are statistical tests that look out for an effect on both tails of the distribution (non-directional). P-value significance level for a two-tailed test: 0.025 per tail.

Decision Errors: Type I and Type II

Reality 1: There is, in reality, an effect in the population; or

Reality 2: There is, in reality, no effect in the population.

Statistics will NOT tell us which reality is TRUE. But statistics can show us the probability of which reality is MORE LIKELY (and whether the effect is strong or not).

One of four things can happen:

1. You can say it ain't (null) and it don't (no effect in reality)
2. You can say it be (alternative) and it do (effect is present in reality)
3. You can say it ain't (null) and it do (effect is present in reality)
4. You can say it be (alternative) and it don't (no effect in reality)

#3 and #4 are "false alarms". False alarms are allowable up to a probability set by the p-value.

Type I and Type II Errors

False Alarms:

1. You can say it ain't (null) and it do (effect is present in reality)

#1 is called a Type II error: when we believe that there is no effect in the population when, in reality, there is (denial).

2. You can say it be (alternative) and it don't (no effect in reality)

#2 is called a Type I error: when we believe that there is a genuine effect in our population when, in fact, there isn't (assumption).

Sampling issues and other statistical mistakes lead you to wrong generalizations (Type I or Type II errors).
EFFECT SIZE

p-values are never enough...

"Just because a test statistic is significant doesn't mean that the effect it measures is meaningful or important."

An effect size is a standardized measure (0 to 1) of the magnitude of an observed effect in a sample.
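One common effect-size measure is Cohen's d, sketched below with hypothetical groups (note that d itself can exceed 1, unlike correlation-based effect sizes):

```python
import math
import statistics

group_a = [82, 85, 88, 90, 84, 86]  # hypothetical treatment group
group_b = [78, 80, 83, 79, 81, 82]  # hypothetical control group
n1, n2 = len(group_a), len(group_b)

# Pooled standard deviation, then Cohen's d:
# the difference between means expressed in pooled-SD units.
s_pooled = math.sqrt(((n1 - 1) * statistics.variance(group_a) +
                      (n2 - 1) * statistics.variance(group_b)) /
                     (n1 + n2 - 2))
d = (statistics.mean(group_a) - statistics.mean(group_b)) / s_pooled
print(d)
```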

Statistical Power

Statistical power is the measure of a statistical test's ability to find effects in a population (assuming that the effect is present).

Low statistical power will under-report significance and effect size.

High statistical power will correctly report significance and effect size.

0.8 statistical power is generally acceptable.

Two Uses

Statistical power helps us…

1. See how powerful your test statistics are; and
2. If you have a desired effect size, calculate the sample size necessary to achieve a given level of power of a test (using software such as G*Power or Cohen's tables).

XI. Correlation

Correlation: When is a relationship really a "relationship"?

The logic of statistical correlations

Suppose X and Y are two correlated variables…

Changes in X cause Y to change... AND changes in Y cause X to change.

"Covariance": comparing how variables are in sync when they change.

Correlation: How do we describe relationships?

Two fundamental characteristics:

1. Direction

Positive (+) correlations: variables move in the same direction.

Negative (-) correlations: variables move in opposite directions.

2. Magnitude

The strength of correlations ranges from weak to perfect.

A perfect correlation indicates that the correlation is taking place in EVERY member of the sample or population.

A weak correlation indicates that the correlation is taking place in a "FEW" members of the sample or population.

Measures of Correlation

The Correlation Coefficient (r)

The correlation coefficient (r) is a standardized measure of how much variables are in sync when they change (covariance).

In short: r is a standardized measure of covariance.

The r can be anywhere between -1.00 and +1.00.

The sign of r denotes direction.

A positive (+) r means that the direction of the correlation is positive: both variables covary in the same direction.

A negative (-) r means that the direction of the correlation is negative: the variables covary in opposite directions.

The p-value in correlations

The p-value is the probability that the correlation/covariance happened by chance (random or accidental) if the null hypothesis was true.

If p > 0.05, the correlation is not significant (the effect may have happened by chance).

If p <= 0.05, the correlation is significant (the effect is unlikely to have happened by chance).
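A sketch of computing r and its p-value (assuming SciPy is available; the data are hypothetical):

```python
from scipy import stats

hours_studied = [2, 4, 5, 7, 8, 10, 11]     # hypothetical X
exam_scores = [60, 65, 70, 78, 80, 88, 90]  # hypothetical Y

r, p = stats.pearsonr(hours_studied, exam_scores)
print(f"r = {r:.2f}, p = {p:.4f}")  # direction and magnitude, plus significance
```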
Reporting Correlations

"There was a significant correlation between <Variable X> and <Variable Y>, r = ____, p (one/two-tailed) = _____."

The direction and strength of the correlation are then explained and analyzed in the discussion of research results.

"There was no significant correlation between <Variable X> and <Variable Y>, r = ___, p (one/two-tailed) = ____."

Parametric and Non-Parametric Tests for Correlation

Chi-Square Test of Independence

H0 = Variables are independent of each other (no correlation).

Example research questions that use the Chi-Square:

1. Medicine - Are children more likely to get infected with virus A than adults?
2. Psychology - Are males likely to do better in exams than females?

Spearman Ranks (Rho)

Kendall's Tau is used if there are many tied ranks.

Example research questions that use the Spearman Rho:

1. Sociology - Do people with a higher level of education have a stronger opinion of whether or not tax reforms are needed?
2. Psychology - Does one's general cognitive ability correlate to success in college?

Point Biserial Correlation

Point biserial correlations are used to correlate a BINARY (two-value) nominal variable and a scale variable.

Example research questions that use the Point Biserial Correlation:

1. Sociology - Are males more likely to earn than females?
2. Social psychology - Is satisfaction with life higher the older (elderly vs. not elderly) you are?
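SciPy offers each of the tests above; a sketch with made-up data:

```python
from scipy import stats

# Chi-square test of independence on a contingency table
table = [[30, 10],   # children: infected / not infected (hypothetical counts)
         [15, 25]]   # adults:   infected / not infected
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p)

# Spearman rho on ranked data (use Kendall's tau if there are many ties)
education = [1, 2, 2, 3, 4, 5]  # hypothetical ranks
opinion = [1, 3, 2, 4, 4, 5]
print(stats.spearmanr(education, opinion))
print(stats.kendalltau(education, opinion))

# Point biserial: a binary nominal variable against a scale variable
elderly = [0, 0, 0, 1, 1, 1]        # 0 = not elderly, 1 = elderly
satisfaction = [6, 7, 5, 8, 9, 8]   # hypothetical life-satisfaction scores
print(stats.pointbiserialr(elderly, satisfaction))
```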

Pearson Correlation

Example research questions that use the Pearson Correlation:

1. Medicine - Will increased water intake significantly bring down a fever?
2. Psychology - Will performance in a previous reading test affect the next reading test?

Partial Correlations and the "Third Variable Problem"

Spurious correlations: two effects that can be statistically linked by correlation despite having no clear causal relationship.

E.g.: Do storks bring babies? Statistics have shown that an increased number of storks is related to an increased number of births.

Spurious correlations are caused by a "third variable", and partial correlations control for these third variables.

XII. Regression: The Power to Predict

A correlation in statistics becomes a causation when there is a theory that accurately describes a link between the two.

Independent and Dependent Variables

A variable that we think is a cause is known as an independent variable (because its value does not depend on any other variables).

A variable that we think is an effect is called a dependent variable because the value of this variable depends on the cause (the independent variable).

Cause and Effect; IV -> DV

In experimental psych, we manipulate the IV (cause) to trigger a change in the DV (effect).

Predictions

Regression Analysis

Regression analysis is a way of…

predicting an outcome variable from one predictor variable (simple regression),
or several predictor variables (multiple regression).

Logistic Regression

Logistic regression is multiple regression but…

... with an outcome variable that is a categorical variable, and

... predictor variables that are continuous or categorical.

Binary and Multinomial (Polychotomous)

Binary Logistic Regression: when we are trying to predict membership of only two categorical outcomes.

Multinomial Logistic Regression: when we want to predict membership of more than two categories.

Interpreting Regression Analysis

R, R-Squared (R2), and Goodness-of-Fit

R-squared is a statistical measure of how close the data are to the fitted regression line.

A.k.a. the coefficient of determination, or the coefficient of multiple determination (for multiple regression).

R-squared is always between 0% and 100%.

100% indicates that the model explains all the variability of the response data around its mean.

R (the correlation coefficient) is the square root of R2.
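A sketch of simple regression (assuming SciPy; hypothetical data), showing the fitted line, R-squared, and a prediction:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]               # hypothetical predictor
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]  # hypothetical outcome

fit = stats.linregress(x, y)          # simple (one-predictor) regression
print(fit.slope, fit.intercept)       # prediction line: y = slope*x + intercept
print(fit.rvalue ** 2)                # R-squared: share of variability explained
print(fit.slope * 7 + fit.intercept)  # predicted outcome for a new x = 7
```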
different points in time (repeated-
Methods in Regression

Regression Methods

1. Forced Entry - All variables are entered simultaneously.
2. Hierarchical - Variables are entered one by one in order of importance.
3. Stepwise Method - The order of predictors is decided by a computer, depending on the t-statistic.

Forward stepwise means the strongest predictors are added first; backward stepwise starts with all predictors and removes the weakest first.

Logistic Regression

In logistic regression, only forced entry and stepwise methods are used.

Prefinals

XIII. Comparing Two Means

Differences

In experimental research, we try to manipulate what happens to people so that we can make causal inferences.

1. Manipulation of a variable (IV)
2. Measuring the effect (DV)
3. Looking at differences between groups to see if the effect was significant (or not) across the groups studied.

E.g., diet pills.

Groups: Control and Experimental

1. Control Group - the control group does not receive the treatment or intervention.
2. Experimental Group - the experimental group(s) receive the treatment or intervention.

Designing Tests of Difference

1. We can either expose different people to different experimental manipulations (between-group or independent design)...
2. or take a single group of people and expose them to different experimental manipulations at different points in time (repeated-measures design).

Between-Group or Independent Design

Pre-test Post-test Design

Control: Pretest > Post-test (OO)
Experimental: Pretest > Treatment > Post-test (OXO)

Post-test Only Design

Control: Post-test (O)
Experimental: Treatment > Post-test (XO)

Repeated Measures Design

One group with two or more treatments.

Experimental Group: Pretest (optional) > Treatment > Post-test > Treatment > Post-test

Repeated measures designs are relatively more powerful than independent designs.

Independent Samples (Between Groups) T-Test

Conditions:

1. Scales: Interval/Ratio
2. Distribution: Normal
3. Equality of Variance: Not equal
4. Two different groups/samples

Interpreting the Independent Samples (Between Groups) T-Test

LEVENE'S TEST

F - ratio of variances (variance1/variance2)

Generally, an F-value far greater than 1 means that the difference between the variances of the two groups is great.

T-Test for Equality of Means

T-statistic - ratio of the means (mean group 1/mean group 2)

- T = 1 means no difference (null hypothesis)
- A t-stat far greater than one means one group's mean is significantly greater/lower than the other group's mean

Sig. - significance of the ratio of means.

- Is the difference between the means of the groups happening by chance?
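A sketch of these tests (assuming SciPy; the data are hypothetical), including the repeated-measures version discussed next:

```python
from scipy import stats

control = [64, 70, 68, 72, 66, 69]       # hypothetical control group
experimental = [60, 63, 65, 61, 64, 62]  # hypothetical diet-pill group

print(stats.levene(control, experimental))     # Levene's test: equality of variances
print(stats.ttest_ind(control, experimental))  # independent samples t-test

# Repeated measures version: one group measured before and after treatment
pretest = [64, 70, 68, 72, 66, 69]
posttest = [60, 63, 65, 61, 64, 62]
print(stats.ttest_rel(pretest, posttest))      # dependent (paired) t-test
```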
Repeated Measures (One-Sample or Dependent Samples) T-Test

Conditions:

1. Scales: Interval/Ratio
2. Distribution: Normal
3. Equality of Variance: Not equal
4. One sample, repeated measures

XIV. Comparing Several Means

If we have one control group and two experimental groups to study, how many t-tests do we need to compare the means?

ANOVA still compares means, but why does it "analyze" variances?

The Idea Behind Analysis of Variance

Variance:

- At the sample level: the square of the average distance of the data points from the mean (the square of the standard deviation)
- At the population level: the squared distance of the average distance of all samples from the population mean.

Simply put: analysis of variance is the analysis of how far samples are from the population mean.

Recap: Central Limit Theorem

CLT: Population characteristics (central tendency and variability) are generally reflected by its samples.

Example: DNA testing & the (little) variance we share in the genome. (Biology)

Membership of a certain culture. (Social Sciences)

Writing ANOVA Hypotheses

H0: There is no significant difference between the groups/measures.

x̄1 = x̄2 = x̄3

H1: There is a significant difference between the groups/measures.

- Three possible scenarios

One-Way Analysis of Variance (ANOVA)

Analysis of the variation within each group vs. the amount of variation between each group.

- If there's a lot of variation within each group and only a little bit of variation between each group, then it's harder to say that the result is "significant" -- it might be due to chance alone.
"significant" -- it might be due to
chance alone.

Example: Height of students from Manresa,


La Storta, and Loyola.

Repeated Measures ANOVA

Analysis of the variation within measure,


v.s. the amount of variation between
each measure.

- If there's little variation within each


measure and a lot of variation
between each measure, then it's
easier to say that the result is
"significant" -- it might NOT be due
to chance alone.

Example: Prelim, Midterms, and Prefinals


Exams.

ANOVA: Analysis of Variance is a variability


Ratio

Variance between/variance within

LARGE/SMALL = reject h0 = at least one


mean is an outlier and each distribution is
narrow; distinct from each other

SIMILAR/SIMILAR =fail to reject h0 = means


are fairly close to overall mean and/or
distributions overlap a bit; hard to
distinguish.

SMALL/LARGE = fail to reject h0 = the


means are very close to overall mean and/or
distributions “melt” together
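A sketch of a one-way ANOVA on the height example (assuming SciPy; the heights are made up):

```python
from scipy import stats

# Hypothetical heights (cm) of students from three campuses
manresa = [160, 162, 165, 158, 161]
la_storta = [170, 172, 168, 171, 169]
loyola = [166, 164, 167, 165, 168]

f_stat, p = stats.f_oneway(manresa, la_storta, loyola)
print(f_stat, p)  # a large F (between/within) with a small p => reject H0
```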

Some additional points:

1. ANOVA proceeds with the condition that the variances are equal. Generally, the variances will equalize given a good sample size and representativeness.
2. We can still do ANOVA for two samples/measures, but t-tests will generally be enough.
3. ANOVA is an omnibus test. It can only say if there is a significant difference between the groups/measures, but not HOW the groups/measures are different.
