Lecture notes from course in economic research method

Apr 21, 2017


IGS

1

Statistical Conclusion Validity

The validity of inferences about the correlation (covariation) between treatment X and outcome Y

2

Glossary

Parameter: A fixed feature of the population (e.g., the population mean, the population standard deviation, ...); we conventionally indicate parameters using Greek letters (e.g., $\mu$, $\sigma$, ...)

Sample statistic: A feature that varies from one sample to another

(e.g., the sample average, the sample standard deviation, . . .)

Estimator: Any function of sample data used to estimate

parameters

Expectation: The mathematical expectation of a variable $Y_i$, indicated as $E[Y_i]$, is the population average of this variable

When a sample statistic has expectation equal to the corresponding population parameter, it is said to be an unbiased estimator of that parameter

3

Formal Statistical Inference

Drawing conclusions about a population based on sample data

Practical questions

How much uncertainty is associated with sample

data?

Do my results constitute strong evidence or just a

lucky draw/chance finding?

4

Formal Statistical Inference (contd)

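Simple random sampling: in a sample of $n$ units drawn from a population of $N$ units, every possible sample of size $n$ has the same chance of selection

$\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$ possible samples are equally likely

Example with $N = 8$ and $n = 3$: 56 samples are equally likely

$\frac{N!}{n!\,(N-n)!} = \frac{8!}{3!\,5!} = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} = 56$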
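As a quick check of the count on this slide, the following Python sketch computes the number of equally likely samples for a population of 8 units and a sample of 3, once from the factorial formula and once by listing every sample. The variable names are illustrative and do not come from the lecture.

```python
from itertools import combinations
from math import comb, factorial

N, n = 8, 3  # population size and sample size from the slide's example

# Number of distinct samples from the factorial formula N! / (n! (N - n)!)
by_formula = factorial(N) // (factorial(n) * factorial(N - n))

# The same count from the standard library's binomial coefficient
n_samples = comb(N, n)

# Explicit enumeration of every possible sample of units labelled 1..8
all_samples = list(combinations(range(1, N + 1), n))

print(by_formula, n_samples, len(all_samples))
```

Under simple random sampling each of the enumerated samples has probability 1 / 56.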

5

The Mean

Only one population mean $E[Y_i]$ (parameter)

Many sample averages $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$ that depend on

What units are drawn

6

Unbiasedness of the Sample Mean

If we drew infinitely many samples, the average of the resulting sample means would be the population mean

$E[\bar{Y}] = E[Y_i]$
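Unbiasedness can be checked exactly on a tiny population. The sketch below assumes a made-up population of 8 values (not from the lecture), lists all samples of size 3, and averages the resulting sample means. Exact fractions are used so the comparison does not depend on rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of N = 8 values (an assumption for illustration)
population = [2, 4, 4, 5, 7, 9, 10, 15]
n = 3

population_mean = Fraction(sum(population), len(population))

# One sample mean per possible sample; all samples are equally likely
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Averaging over every possible sample plays the role of the expectation
average_of_sample_means = sum(sample_means) / len(sample_means)

print(population_mean, average_of_sample_means)
print(min(sample_means), max(sample_means))
```

Individual sample means differ from the population mean, but their average over all possible samples equals it.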

7

Variability of the Sample Mean

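Sampling variance (independent draws)

$V(\bar{Y}) = V\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma_Y^2 = \frac{\sigma_Y^2}{n}$

Variance $= \sigma_Y^2 = E[(Y_i - E[Y_i])^2]$

Std. Dev. $= \sigma_Y$

Standard error: $SE(\bar{Y}) = \frac{\sigma_Y}{\sqrt{n}}$

SE summarizes the variability in an estimate due to random sampling

8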
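The sampling variance of the mean can also be verified exactly on a small example. This sketch assumes independent draws (sampling with replacement) from a made-up population of 8 values, and checks that the variance of the sample mean equals the population variance divided by the sample size. The population and names are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population (an assumption for illustration)
population = [2, 4, 4, 5, 7, 9, 10, 15]
N, n = len(population), 3

mu = Fraction(sum(population), N)
# Population variance: average squared deviation from the population mean
sigma2 = sum((Fraction(y) - mu) ** 2 for y in population) / N

# Every ordered sample of n independent draws is equally likely
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
variance_of_mean = sum((m - mu) ** 2 for m in means) / len(means)

standard_error = float(variance_of_mean) ** 0.5
print(variance_of_mean, sigma2 / n, standard_error)
```

With sampling without replacement from a small population the variance would be somewhat smaller, which is why independence is stated as an assumption here.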

Estimated standard error

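In practice, the population standard deviation $\sigma_Y$ is unknown and must be estimated by replacing $\sigma_Y$ with the sample standard deviation $S(Y_i)$: $\widehat{SE}(\bar{Y}) = \frac{S(Y_i)}{\sqrt{n}}$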

9

T-statistic for the sample mean

$t(\mu) = \frac{\bar{Y} - \mu}{\widehat{SE}(\bar{Y})} = \frac{\bar{Y} - \mu}{S(Y_i)/\sqrt{n}}$

10

T-statistic for the sample mean (contd)

Under the null hypothesis $H_0: \mu = \mu_0 = 0$: $t(0) = \frac{\bar{Y}}{\widehat{SE}(\bar{Y})}$
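A minimal Python sketch of the t-statistic for a sample mean follows. The function name `t_statistic` and the example data are hypothetical. Note that `statistics.stdev` divides by n - 1, which may differ slightly from the sample standard deviation used in the lecture.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0=0.0):
    """Sample mean minus the hypothesized mean, divided by the estimated SE."""
    n = len(sample)
    estimated_se = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    return (mean(sample) - mu0) / estimated_se

sample = [1, 2, 3, 4, 5]  # made-up data
print(t_statistic(sample, mu0=0.0))  # tests H0: mu = 0
print(t_statistic(sample, mu0=3.0))  # tests H0: mu = 3, the sample mean itself
```

The statistic is zero when the hypothesized value equals the sample mean and grows in absolute value as the two move apart relative to the standard error.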

11

Central limit theorem

The t-statistic $t(\mu)$, when the sample is large enough, has a sampling distribution that is close to a standard normal distribution (mean of 0 and standard deviation of 1), irrespective of the population distribution of $Y_i$

In other words, for large samples, the distribution

of a t-statistic is independent of the distribution of

the underlying data
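The claim can be explored by simulation. The sketch below draws samples from a strongly skewed (exponential) population, computes the t-statistic against the true mean, and records how often it exceeds 2 in absolute value. The population, sample size, number of replications and seed are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

def t_stat(sample, mu):
    """t-statistic of the sample mean against a hypothesized mean mu."""
    return (mean(sample) - mu) / (stdev(sample) / sqrt(len(sample)))

def share_beyond_two(n, reps, seed=0):
    """Share of simulated t-statistics larger than 2 in absolute value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Exponential population with rate 1, so the true mean is 1
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if abs(t_stat(sample, 1.0)) > 2:
            hits += 1
    return hits / reps

share = share_beyond_two(n=100, reps=2000)
print(f"share of |t| > 2: {share:.3f}")
```

For a sample this large the share should land near 5%, as it would for a standard normal variable, even though the underlying data are far from normal.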

12

Distribution of a t-statistic

13

Hypothesis testing

In a standard normal distribution, the relative frequency of values larger than 2 in absolute value is about 5%

Any t-statistic larger than 2 in absolute value is too unlikely to be consistent with the null hypothesis → We reject the null

14

Confidence interval

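A range of values that contains the population mean in most repeated samples from the same population

$[\bar{Y} - 2 \times \widehat{SE}(\bar{Y}),\ \bar{Y} + 2 \times \widehat{SE}(\bar{Y})]$ includes $E[Y_i]$ about 95% of the time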
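The interval of plus or minus two estimated standard errors can be sketched in Python as below. The helper names, the normal population used for the coverage check, and the seed are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def confidence_interval(sample, multiplier=2.0):
    """Sample mean plus or minus `multiplier` estimated standard errors."""
    estimated_se = stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - multiplier * estimated_se, centre + multiplier * estimated_se

def coverage(true_mean=50.0, sd=10.0, n=50, reps=2000, seed=1):
    """Share of simulated intervals that contain the true population mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        low, high = confidence_interval(sample)
        covered += low <= true_mean <= high
    return covered / reps

low, high = confidence_interval([1, 2, 3, 4, 5])  # made-up data
rate = coverage()
print(low, high, rate)
```

The coverage rate describes the procedure over repeated samples; any single interval either contains the population mean or does not.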

15

Confidence Level

The percentage of confidence intervals, computed from repeated samples from the same population, that can be expected to include the true population parameter

If we repeatedly drew infinitely many independent samples

from the same population and we calculated a confidence

interval for each sample, then a certain percentage

(confidence level) of the intervals would include the

unknown population parameter

Confidence intervals are usually calculated so that this

percentage is 95%, but we can produce 90%, 99%, 99.9% (or

whatever) confidence intervals for the unknown parameter

16

Comparison of Two Group Averages

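$\mu_1 = E[Y_i \mid D_i = 1]$

$\mu_0 = E[Y_i \mid D_i = 0]$

$H_0: \mu_1 - \mu_0 = \mu = 0$

$t(\mu) = \frac{\bar{Y}_1 - \bar{Y}_0 - \mu}{\widehat{SE}(\bar{Y}_1 - \bar{Y}_0)} = \frac{\bar{Y}_1 - \bar{Y}_0}{S(Y_i)\sqrt{\frac{1}{n_1} + \frac{1}{n_0}}}$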
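A hedged sketch of the two-group comparison follows. It assumes both groups share the same variance and uses a pooled standard deviation; the function name and data are hypothetical, and the lecture may define the pooled standard deviation slightly differently.

```python
from math import sqrt
from statistics import mean, variance

def two_group_t(treated, control):
    """t-statistic for H0: no difference between the two group means."""
    n1, n0 = len(treated), len(control)
    # Pooled variance under the assumption of equal variances in both groups
    pooled_variance = (
        (n1 - 1) * variance(treated) + (n0 - 1) * variance(control)
    ) / (n1 + n0 - 2)
    estimated_se = sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n0)
    return (mean(treated) - mean(control)) / estimated_se

treated = [3, 4, 5, 6, 7]  # made-up outcomes for the treated group
control = [1, 2, 3, 4, 5]  # made-up outcomes for the control group
print(two_group_t(treated, control))
```

Swapping the two groups flips the sign of the statistic but not its absolute value.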

17

Significance vs. Effect magnitude

18

Null Hypothesis Significance Testing

(NHST)

The null hypothesis ($H_0$) is a claim to be tested, usually a hypothesis of no difference (e.g., no difference between test scores in group A and group B)

The alternative hypothesis ($H_1$) is the one we would believe if the null hypothesis is rejected

Rejecting $H_0$ does not prove $H_0$ to be false nor $H_1$ to be true

The only way $H_0$ can be proven false (or true) is to know the value of the population parameter(s) specified in the null hypothesis; sample data do not provide that kind of information

19

p-value

The probability of obtaining the observed results or more extreme results if the null hypothesis were true

Following Fisher (1926), we usually say that

results are statistically significant if p < .05

(arbitrary)
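For a large sample, a two-sided p-value can be approximated from the t-statistic using the standard normal distribution. This is a sketch under that large-sample assumption; the function name is hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(t):
    """Probability, under H0, of a statistic at least as extreme as t."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return 2 * (1 - standard_normal.cdf(abs(t)))

for t in (0.5, 1.0, 2.0, 3.0):
    print(t, round(two_sided_p_value(t), 4))
```

A t-statistic of 2 gives a p-value just below .05, which is where the "larger than 2 in absolute value" rule of thumb comes from.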

20

More on NHST

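DECISION: Do not reject H0 | Reject H0

H0 is true: Correct decision, Prob = $1 - \alpha$ | Type I error (FALSE POSITIVE), Prob = $\alpha$ (significance)

H0 is false: Type II error (FALSE NEGATIVE), Prob = $\beta$ | Correct decision, Prob = $1 - \beta$ (power)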

21

More on NHST (contd)

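$\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$

Power $= 1 - \beta = 1 - P(\text{Type II error})$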

22

Statistical Conclusion Validity

1. Do X and Y covary?

Type I error (false positive): We may incorrectly

conclude that X and Y covary when they do not

Type II error (false negative): We may incorrectly

conclude that X and Y do not covary when they do

2. How strongly do X and Y covary?

We can over/underestimate

The magnitude of covariation

The degree of confidence that magnitude estimate

warrants

23

Threats to Statistical

Conclusion Validity

24

1. Low Statistical Power

An insufficiently powered experiment may incorrectly conclude that the relationship between treatment and outcome is not significant (Shadish et al. 2002, 55)

Power $= 1 - \beta = 1 - P(\text{Type II error})$

The ability of a test to detect relationships that exist in

the population

The probability that a statistical test will reject the null

hypothesis when it is false

25

1. Low Statistical Power (contd)

Low power also goes with wide confidence intervals

Common practice to set $\beta = .20$ → Power $= .80$

Important to increase power when missing a real effect

would have negative consequences, e.g. when testing for

harmful effects of a new drug

Low power is a problem when effect sizes are small

Remedy: Meta-analysis

Comprehensive list of remedies: Table 2.3 SKC

26

1. Low Statistical Power (contd)

A. Sample size: The larger the sample size, the higher the power (see figure 1)

Remedy: Increasing sample size (sometimes expensive/difficult)

27

Figure 1: The relationship between sample size ($n$) and power for $H_0$: $\mu = 75$, real $\mu = 80$, one-tailed $\alpha = 0.05$, for $\sigma$'s of 10 and 15.

Source: Lane 2015

28

1. Low Statistical Power (contd)

B. Standard deviation (SD): The smaller the SD, the higher the power (see figure 1)

Remedies:

Sampling from a homogeneous population

Reducing random measurement error

29


30

1. Low Statistical Power (contd)

C. Effect size (i.e. difference between

hypothesized and true parameter):

Easier to detect larger effects (see

figure 2)

31

Figure 2. The relationship between $\mu$ and power for $H_0$: $\mu = 75$, one-tailed $\alpha = 0.05$, for $\sigma$'s of 10 and 15

32

1. Low Statistical Power (contd)

D. Significance level ($\alpha$): The lower the $\alpha$ (i.e. the probability of Type I error/false positive), the lower the power (see figure 3)

33

Figure 3. The relationship between significance level and power with one-tailed test: $H_0$: $\mu = 75$, real $\mu = 80$, and $\sigma = 10$.

Source: Lane 2015
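The four determinants of power discussed on these slides (sample size, standard deviation, effect size, significance level) can be explored with a short Python sketch. It assumes a one-tailed z-test with a known population standard deviation and a true mean above the hypothesized one, using the figures' setting of a hypothesized mean of 75 and a real mean of 80. The function name and the chosen sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_tailed(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test of H0: mu = mu0 when the mean is mu_true."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1 - alpha)  # rejection threshold
    # Distance between true and hypothesized mean, in standard-error units
    shift = (mu_true - mu0) * sqrt(n) / sigma
    return 1 - standard_normal.cdf(critical_z - shift)

for sigma in (10, 15):
    for n in (10, 25, 50):
        p = power_one_tailed(mu0=75, mu_true=80, sigma=sigma, n=n)
        print(f"sigma={sigma:2d} n={n:2d} power={p:.3f}")
```

Calling the function with a larger `n`, a smaller `sigma`, a larger gap between `mu_true` and `mu0`, or a larger `alpha` raises the returned power, matching points A to D.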

34

2. Violated Assumptions of the Test Statistics

Violations of statistical test assumptions can lead to either overestimating or underestimating the size and significance of an effect (Shadish et al. 2002, 55)

Example: If we ignore the hierarchical/multilevel

structure of the data (e.g., soccer players nested

within teams, students nested within classes), we

may severely underestimate standard errors and

conclude that effects that might be ascribed to

chance are real (i.e. a higher risk of Type I error)
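The nested-data example can be illustrated with a small simulation. The sketch assumes 20 clusters of 10 units, where each cluster shares a common random effect, and compares the naive standard error (which treats all 200 observations as independent) with the actual variability of the sample mean across simulated datasets. All settings, names and the seed are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_clustered_sample(rng, n_clusters=20, cluster_size=10):
    """Outcomes made of a shared cluster effect plus individual noise."""
    data = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0, 1)  # shared by every unit in the cluster
        data.extend(cluster_effect + rng.gauss(0, 1) for _ in range(cluster_size))
    return data

rng = random.Random(42)
sample_means, naive_ses = [], []
for _ in range(500):
    sample = simulate_clustered_sample(rng)
    sample_means.append(mean(sample))
    # Naive SE: ignores the clustering and treats units as independent
    naive_ses.append(stdev(sample) / sqrt(len(sample)))

actual_sd_of_mean = stdev(sample_means)
average_naive_se = mean(naive_ses)
print(round(actual_sd_of_mean, 3), round(average_naive_se, 3))
```

Because the naive standard error is much smaller than the true sampling variability, t-statistics built on it are too large and chance findings look significant.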

35

3. Fishing and the Error Rate Problem

Repeated tests for significant relationships, if uncorrected for the number of tests, can artifactually inflate statistical significance (Shadish et al. 2002, 55)

If the nominal $\alpha = .05$, the actual $\alpha = .923$ when the test is repeated fifty times (Maxwell & Delaney 1990)

Examples:

Fishing until we find a significant effect

Multiple researchers analyzing the same data

Remedy:

Bonferroni correction: divides the target $\alpha$ by the number of tests and uses the Bonferroni-corrected $\alpha$ in all individual tests

Bonferroni and other corrections may be too conservative in low-powered studies (high risk of Type II error)
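The .923 figure and the effect of the Bonferroni correction can be reproduced with a few lines of Python. The sketch assumes the fifty tests are independent; the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after the Bonferroni correction."""
    return alpha / n_tests

nominal_alpha, n_tests = 0.05, 50
uncorrected = familywise_error_rate(nominal_alpha, n_tests)
corrected_alpha = bonferroni_alpha(nominal_alpha, n_tests)
corrected = familywise_error_rate(corrected_alpha, n_tests)

print(round(uncorrected, 3))  # actual alpha without correction
print(corrected_alpha)        # alpha used in each individual test
print(round(corrected, 4))    # actual alpha with the correction
```

The corrected per-test threshold is fifty times stricter, which keeps the overall error rate below the nominal level but also lowers the power of each individual test.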

36

4. Unreliability of Measures

Measurement error weakens the relationship between two variables and strengthens or weakens the relationship between three or more variables (Shadish et al. 2002, 55)

With three or more variables, unreliability of measures can lead to either false positives or false negatives. What does that mean?

Particularly problematic in longitudinal studies that assess change

over time

Remedies:

Increasing the number of measurements

More items to measure the same concept

Multiple raters

Improving the quality of measures

Using validated scale items

Training for raters

Techniques like latent variable modelling

37

4. Unreliability of Measures (contd)

Structural equation modeling (SEM): a family of statistical modeling techniques (e.g., confirmatory factor analysis, path analysis) to test theoretical models

Two main components

Measurement model: uses observed variables (e.g.,

survey items) to define latent constructs (e.g., happiness,

self-efficacy, intelligence)

Structural regression model: system of simultaneous

regression equations to estimate paths linking the latent

constructs

38

5. Restriction of Range

Reduced range on a variable usually weakens the relationship between it and another variable (Shadish et al. 2002, 55)

Small range → Lower power

This problem can affect either the

Independent variable (IV). Example: Comparing two similar treatments → Remedy: using different treatment doses and even full-dose vs. no treatment

Dependent variable (DV). Examples:

Dummies

Floor effects (respondents cluster near the bottom)

Ceiling effects (respondents cluster near the top)

Remedy: Using models that are appropriate for limited dependent variables (e.g., Tobit, truncated regression, Heckman)

39

6. Unreliability of Treatment Implementation

If a treatment that is intended to be implemented in a standardized manner is implemented only partially for some respondents, effects may be underestimated compared with full implementation (Shadish et al. 2002, 55)

Common in field experiments

It usually decreases effect size, but it can also increase the effect size when implementation is tailored to the recipients

Important to measure all components of the treatment

package

40

7. Extraneous Variance in the Experimental

Setting

Some features of an experimental setting may inflate error,

making detection of an effect more difficult (Shadish et al.

2002, 55)

Example: Fire drill or concert downstairs during lab

experiment

Particularly frequent in field experiments

When sources of extraneous variance cannot be controlled,

we should measure them and include them in the statistical

analysis

41

8. Heterogeneity of Units

Increased variability on the outcome variable within conditions increases error variance, making detection of a relationship more difficult (Shadish et al. 2002, 55)

Heterogeneity of respondents on an outcome variable increases standard deviations on that variable and on any other correlated with it → Weaker treatment effect

Remedies

Sample units that are similar on characteristics correlated with outcome

Potential risks:

Lower external validity

Limited range on DV

Measure respondent characteristics that interact with a cause-effect

relationship and use them for blocking or as covariates

Within-participants designs comparing pre- and post-test scores for each

participant

42

9. Inaccurate Effect Size Estimation

Some statistics systematically overestimate or underestimate the size of an effect (Shadish et al. 2002, 55)

Examples

Outliers (departing from normal distribution) can

dramatically decrease effect sizes

Analyzing binary outcomes with effect size measures intended for continuous variables (correlation coefficient or standardized mean difference statistic) → Underestimation of effect size

43

Internal Validity

44

Internal Validity

The validity of inferences about whether observed covariation between X (the presumed treatment) and Y (the presumed outcome) reflects a causal relationship from X to Y as those variables were manipulated or measured

45

Internal Validity (contd)

Local: Causal conclusions are limited to the context

of the particular treatments, outcomes, times,

settings, and persons studied

Molar: Treatments are a complex package

consisting of many components, all of which are

tested as a whole

46

Threats to Internal Validity

47

Threats to Internal Validity

Inferring a causal relationship requires eliminating other possible causes (Mackie 1974, p. 67)

Threats to internal validity are those other possible

causes

Different threats are not necessarily independent

48

1. Ambiguous Temporal Precedence

Lack of clarity about which variable occurred first may yield confusion about which variable is the cause and which is the effect (Shadish et al. 2002, 55)

Correlational studies are often unable to answer the question: Which came first, the chicken or the egg?

Not always: e.g., it is unlikely that an increase in the sales of air conditioners increases outside temperature

Particularly tricky because some causation is bidirectional (reciprocal)

High performance → Self-efficacy → Higher performance

49

2. Selection

Systematic differences over conditions in respondent characteristics that could also cause the observed effect (Shadish et al. 2002, 55)

Example

A new drug is given only to patients who volunteer to take the new

treatment

The volunteering patients might differ from nonvolunteers in ways

(e.g., sicker, older, etc.) that might affect the outcome

Random assignment eliminates selection bias because

randomly formed groups differ only by chance

50

3. History

Events occurring concurrently with treatment could cause the observed effect (Shadish et al. 2002, 55)

Example: A study of psychotherapy with depressed

patients at the time a new antidepressant went on

the market

51

4. Maturation

Naturally occurring changes over time could be confused with a treatment effect (Shadish et al. 2002, 55)

While maturation is internal, a natural course of

things having to do with some quality of the

participants in the study, history has to do with an

external event of some kind

Example: We may think that an ineffective medicine

works because patients get better by themselves

52

5. Regression Artifacts

When units are selected for their extreme scores, they will often have less extreme scores on other variables, an occurrence that can be confused with a treatment effect (Shadish et al. 2002, 55)

Test theory → Every measure has

A true score component reflecting a true ability

Plus a random error component that is normally and

randomly distributed around the mean of the

measure

53

5. Regression Artifacts (contd)

High scores will tend to have more positive random error pushing

them up, low scores will tend to have more negative random error

pulling them down

On the same measure at a later time, or on other measures at the

same time, the random error is less likely to be so extreme

Examples

A compensatory tutoring program for kids in the lowest 10 percent on a

pretest will seem more effective than it actually is because those kids will

tend to improve anyway in the post-test

People tend to go to psychotherapy after a shock and organizations tend to hire consultants after a downturn; clients' measured progress is partly a movement back toward their stable mean as the temporary shock grows less acute
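The tutoring example can be simulated. In this sketch each score is a stable true ability plus independent random error, no treatment is applied at all, and the lowest 10 percent on the pretest are followed to the posttest. The distributions, sample size and seed are assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(7)
n_students = 5000

# Each observed score = true ability + random measurement error
true_ability = [rng.gauss(100, 10) for _ in range(n_students)]
pretest = [a + rng.gauss(0, 10) for a in true_ability]
posttest = [a + rng.gauss(0, 10) for a in true_ability]  # no treatment given

# Select the students in the lowest 10 percent on the pretest
cutoff = sorted(pretest)[n_students // 10]
lowest = [i for i in range(n_students) if pretest[i] < cutoff]

pre_mean = mean(pretest[i] for i in lowest)
post_mean = mean(posttest[i] for i in lowest)
gain = post_mean - pre_mean
print(round(pre_mean, 1), round(post_mean, 1), round(gain, 1))
```

The selected group improves noticeably on the posttest although nothing was done to it, which is the gain a tutoring program would wrongly be credited with.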

54

6. Attrition/Mortality

Loss of respondents to treatment or to measurement can produce artifactual effects if that loss is systematically correlated with conditions (Shadish et al. 2002, 55)

A special subset of selection bias occurring after the treatment is in place

Unlike selection bias, attrition is not controlled by random assignment

Example

If those dropping out of a compensatory tutoring course are the low pretest

test scorers, by the end of the course the participants who remain will be the

ones with higher academic skills

By comparing the average pretest to posttest scores we would overestimate

the effect of the course
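A small deterministic example makes the attrition artifact concrete. The scores and the dropout rule below are made up; the course is assumed to have no effect at all, so every remaining participant scores the same on the posttest as on the pretest.

```python
from statistics import mean

# Hypothetical pretest scores for ten participants
pretest = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
posttest = list(pretest)  # assumption: the course changes nobody's score

# Hypothetical dropout rule: the three lowest pretest scorers leave the course
stayed = [score >= 55 for score in pretest]

# Naive comparison: posttest average of completers vs. pretest average of all
naive_gain = mean(p for p, s in zip(posttest, stayed) if s) - mean(pretest)

# Comparison restricted to the same people at both times
completer_gain = mean(
    post - pre for pre, post, s in zip(pretest, posttest, stayed) if s
)
print(naive_gain, completer_gain)
```

The naive comparison shows an apparent gain that comes entirely from who dropped out, while the within-person comparison shows none.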

55

7. Testing

Exposure to a test can affect scores on subsequent exposures to that test, an occurrence that can be confused with a treatment effect (Shadish et al. 2002, 55)

Only in pretest-posttest designs

Example: People commonly improve on

standardized tests such as intelligence tests, SATs, or

GREs, due to practice, familiarity or other forms of

reactivity

56

8. Instrumentation

The nature of a measure may change over time or conditions in a way that could be confused with a treatment effect (Shadish et al. 2002, 55)

Only in pretest-posttest designs

Whereas testing involves a change in the participant,

instrumentation involves a change in the instrument

Examples

The spring on a bar press might become weaker and easier to push

over time

Schools often use two different types of tests before and after a

compensatory tutoring course (to reduce the testing threat); if the

level of difficulty is not the same between the two tests, part or all of

any pre-post difference is due to the change in instrument, not to the

course

57

9. Additive and Interactive Effects of Threats to

Internal Validity

The impact of a threat can be added to that of another threat or may depend on the level of another threat (Shadish et al. 2002, 55).

58
