
Working with relationships between two variables

[Figure: scatterplot, "Size of Teaching Tip & Stats Test Score". x-axis: size of teaching tip, $0 to $80; y-axis: stats test score, 0 to 100.]
Correlation & Regression
Univariate & Bivariate Statistics
U: frequency distribution, mean, mode, range, standard deviation
B: correlation between two variables
Correlation
linear pattern of relationship between one variable (x) and
another variable (y): an association between two variables
relative position of one variable correlates with relative
distribution of another variable
graphical representation of the relationship between two
variables
Warning:
No proof of causality
Cannot assume x causes y
Scatterplot!
No Correlation
Random or circular
assortment of dots
Positive Correlation
ellipse leaning to right
GPA and SAT
Smoking and Lung Damage

Negative Correlation
ellipse leaning to left
Depression & Self-esteem
Studying & test errors
Pearson's Correlation Coefficient
r indicates
strength of relationship (strong, weak, or none)
direction of relationship
positive (direct): variables move in same direction
negative (inverse): variables move in opposite directions
r ranges in value from -1.0 to +1.0

-1.0 0.0 +1.0


Strong Negative No Rel. Strong Positive
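
To make the meaning of r concrete, here is a minimal sketch of the usual deviation-score formula for Pearson's r. The data are made up for illustration and the function name `pearson_r` is an assumption, not something taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data, made up for illustration only
hours_studied = [1, 2, 3, 4, 5]
test_errors = [10, 8, 6, 4, 2]      # falls as studying rises (inverse)
test_score = [52, 60, 71, 80, 88]   # rises as studying rises (direct)

r_negative = pearson_r(hours_studied, test_errors)
r_positive = pearson_r(hours_studied, test_score)
print("studying & test errors: r =", round(r_negative, 3))
print("studying & test score:  r =", round(r_positive, 3))
```

The sign of the result gives the direction and its distance from zero gives the strength.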

Go to website!
playing with scatterplots
Practice with Scatterplots

r = .__ __ r = .__ __

r = .__ __ r = .__ __
Correlation Guesstimation
Correlations

                                           Miles walked
                                           per day       Weight        Depression    Anxiety
Miles walked per day  Pearson Correlation  1             -.797**       -.800**       -.774**
                      Sig. (2-tailed)                    .002          .002          .003
                      N                    12            12            12            12
Weight                Pearson Correlation  -.797**       1             .648*         .780**
                      Sig. (2-tailed)      .002                        .023          .003
                      N                    12            12            12            12
Depression            Pearson Correlation  -.800**       .648*         1             .753**
                      Sig. (2-tailed)      .002          .023                        .005
                      N                    12            12            12            12
Anxiety               Pearson Correlation  -.774**       .780**        .753**        1
                      Sig. (2-tailed)      .003          .003          .005
                      N                    12            12            12            12
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
Samples vs. Populations
Sample statistics estimate population parameters
M tries to estimate μ
r tries to estimate ρ (rho, the Greek letter, not p)
r: correlation for a sample
based on the limited observations we have
ρ: actual correlation in the population
the true correlation
Beware Sampling Error!!
even if ρ = 0 (there's no actual correlation), you might get r = .08
or r = -.26 just by chance.
We look at r, but we want to know about ρ
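
Sampling error can be seen in a small simulation. The sketch below is an illustration, not part of the original slides: it draws x and y independently (so the population correlation is zero) and looks at how far sample r values wander with n = 12. The seed, the number of repetitions, and the helper name `pearson_r` are all assumptions chosen for the demo.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

random.seed(1)  # fixed seed so the run is repeatable
n = 12          # a small sample size, chosen for the demo
sample_rs = []
for _ in range(1000):
    # x and y are generated independently: no actual correlation exists
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [random.gauss(0, 1) for _ in range(n)]
    sample_rs.append(pearson_r(xs, ys))

# Share of samples whose r is at least as far from zero as .26
share_big = sum(abs(r) > 0.26 for r in sample_rs) / len(sample_rs)
print("smallest sample r:", round(min(sample_rs), 2))
print("largest sample r: ", round(max(sample_rs), 2))
print("share with |r| > .26:", share_big)
```

Every one of these r values comes from a population where the true correlation is zero, which is why a single sample r cannot be taken at face value.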
Hypothesis testing with Correlations
Two possibilities
Ho: ρ = 0 (no actual correlation; the Null Hypothesis)
Ha: ρ ≠ 0 (there is some correlation; the Alternative Hypothesis)
Case #1 (see correlation worksheet)
Correlation between distance and points, r = -.904
Sample small (n=6), but r is very large
We guess ρ < 0 (we guess there is some correlation in the population)
Case #2
Correlation between aiming and points, r = .628
Sample small (n=6), and r is only moderate in size
We guess ρ = 0 (we guess there is NO correlation in the population)
Bottom line
We can only guess about ρ
We can be wrong in two ways
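
One standard way to make the two guesses above, not shown on the slides themselves, is to convert r to a t statistic with n - 2 degrees of freedom and compare it with a tabled critical value. The sketch below assumes a two-tailed test at the .05 level; the critical value 2.776 for df = 4 comes from a standard t table, and the function name `t_for_r` is hypothetical.

```python
import math

def t_for_r(r, n):
    """t statistic for testing Ho: rho = 0, with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-tailed critical t for df = 4 at the .05 level (standard t table)
T_CRITICAL_DF4 = 2.776

# The two cases from the worksheet, both with n = 6
cases = {"distance and points": -0.904, "aiming and points": 0.628}
decisions = {}
for label, r in cases.items():
    t = t_for_r(r, n=6)
    decisions[label] = abs(t) > T_CRITICAL_DF4  # True means reject Ho
    verdict = "reject Ho" if decisions[label] else "retain Ho"
    print(f"{label}: r = {r}, t(4) = {t:.2f}, {verdict}")
```

With only six observations, r has to be very large before Ho can be rejected, which matches the two guesses in the cases above.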
Reading Correlation Matrix
Correlations(a)

                                              Total ball    Distance      Time spun     Aiming        Manual        College grade Confidence
                                              toss points   from target   before        accuracy      dexterity     point avg     for task
                                                                          throwing
Total ball toss points    Pearson Correlation 1             -.904*        -.582         .628          .821*         -.037         -.502
                          Sig. (2-tailed)     .             .013          .226          .181          .045          .945          .310
                          N                   6             6             6             6             6             6             6
Distance from target      Pearson Correlation -.904*        1             .279          -.653         -.883*        .228          .522
                          Sig. (2-tailed)     .013          .             .592          .159          .020          .664          .288
                          N                   6             6             6             6             6             6             6
Time spun before throwing Pearson Correlation -.582         .279          1             -.390         -.248         -.087         .267
                          Sig. (2-tailed)     .226          .592          .             .445          .635          .869          .609
                          N                   6             6             6             6             6             6             6
Aiming accuracy           Pearson Correlation .628          -.653         -.390         1             .758          -.546         -.250
                          Sig. (2-tailed)     .181          .159          .445          .             .081          .262          .633
                          N                   6             6             6             6             6             6             6
Manual dexterity          Pearson Correlation .821*         -.883*        -.248         .758          1             -.553         -.101
                          Sig. (2-tailed)     .045          .020          .635          .081          .             .255          .848
                          N                   6             6             6             6             6             6             6
College grade point avg   Pearson Correlation -.037         .228          -.087         -.546         -.553         1             -.524
                          Sig. (2-tailed)     .945          .664          .869          .262          .255          .             .286
                          N                   6             6             6             6             6             6             6
Confidence for task       Pearson Correlation -.502         .522          .267          -.250         -.101         -.524         1
                          Sig. (2-tailed)     .310          .288          .609          .633          .848          .286          .
                          N                   6             6             6             6             6             6             6
*. Correlation is significant at the 0.05 level (2-tailed).
a. Day sample collected = Tuesday

Reading one cell (Total ball toss points and Distance from target):
r = -.904
p = .013: probability of getting a correlation this size by sheer chance. Reject Ho if p ≤ .05.
N = 6: sample size
Report as: r(4) = -.904, p < .05
Predictive Potential
Coefficient of Determination
r²
Amount of variance accounted for in y by x
Percentage increase in accuracy you gain by using the regression
line to make predictions
Without correlation, you can only guess the mean of y
[Used with regression]

0% 20% 40% 60% 80% 100%
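
As a quick worked example of the coefficient of determination, the sketch below squares the two sample correlations from the ball toss cases earlier (r = -.904 and r = .628). The function name is hypothetical.

```python
def coefficient_of_determination(r):
    """Proportion of variance in y accounted for by x."""
    return r ** 2

# Sample correlations taken from the two ball toss cases
for label, r in [("distance and points", -0.904), ("aiming and points", 0.628)]:
    r_squared = coefficient_of_determination(r)
    print(f"{label}: r = {r}, r squared = {r_squared:.3f} "
          f"({r_squared:.0%} of variance accounted for)")
```

Note that the sign of r disappears when it is squared: a strong negative correlation accounts for just as much variance as an equally strong positive one.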


Limitations of Correlation
linearity:
can't describe non-linear relationships
e.g., relation between anxiety & performance
truncation of range:
underestimates strength of relationship if you can't see the full range
of x values
no proof of causation
third variable problem:
could be a 3rd variable causing change in both variables
directionality: can't be sure which way causality flows
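
The first two limitations can be demonstrated with small made-up data sets. This is a sketch for illustration only: the numbers are hypothetical, chosen so the effects are easy to see, and `pearson_r` is a helper written for the demo.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Linearity: hypothetical inverted-U data, performance peaks at moderate anxiety
anxiety = [1, 2, 3, 4, 5, 6, 7, 8, 9]
performance = [25 - (a - 5) ** 2 for a in anxiety]
r_curved = pearson_r(anxiety, performance)

# Truncation of range: hypothetical data, full range vs. only x from 4 to 7
x_full = list(range(1, 11))
y_full = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_full = pearson_r(x_full, y_full)
kept = [(x, y) for x, y in zip(x_full, y_full) if 4 <= x <= 7]
r_truncated = pearson_r([x for x, _ in kept], [y for _, y in kept])

print("curved relationship:  r =", round(r_curved, 3))
print("full range of x:      r =", round(r_full, 3))
print("truncated range of x: r =", round(r_truncated, 3))
```

The curved data are perfectly predictable from x, yet r comes out at essentially zero, and cutting off the ends of the x range lowers r even though the underlying relationship is unchanged.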
Regression
Regression: Correlation + Prediction
predicting y based on x
e.g., predicting
throwing points (y)
based on distance from target (x)
Regression equation
formula that specifies a line
y' = bx + a
plug in an x value (distance from target) and predict y (points)
note
y = actual value of a score
y' = predicted value
Go to website!
Regression Playground
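
Where the slope and intercept come from: a minimal least-squares sketch, assuming the usual formulas b = SP / SSx and a = My - b(Mx). The six data points are made up for illustration (they are not the worksheet data), and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (b, a) for the least-squares line: predicted y = b*x + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares for x
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x              # slope
    a = mean_y - b * mean_x    # y-intercept
    return b, a

# Hypothetical data: distance from target (x) and throwing points (y)
distances = [10, 12, 14, 16, 18, 20]
points = [80, 75, 60, 55, 45, 30]

b, a = fit_line(distances, points)
print("slope b =", round(b, 3))
print("intercept a =", round(a, 3))
print("predicted points at x = 15:", round(b * 15 + a, 3))
```

The fitted line always passes through the point (mean of x, mean of y), which is a handy check on hand calculations.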
Regression Graphic
[Figure: scatterplot with fitted regression line. x-axis: Distance from target, 8 to 26; y-axis: 0 to 120; Rsq = 0.6031. Read off the line: if x = 18, then y' = 47; if x = 24, then y' = 20.]
See correlation & regression worksheet
Regression Equation
See correlation & regression worksheet
y' = bx + a
y' = predicted value of y
b = slope of the line
x = value of x that you plug in
a = y-intercept (where the line crosses the y-axis)
In this case:
y' = -4.263(x) + 125.401

So if the distance is 20 feet:

y' = -4.263(20) + 125.401
y' = -85.26 + 125.401
y' = 40.141
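
The same plug-in calculation can be written as a small function. The slope and intercept are the ones given above; the function and constant names are hypothetical.

```python
SLOPE_B = -4.263        # b, from the regression equation above
INTERCEPT_A = 125.401   # a, from the regression equation above

def predict_points(distance_feet):
    """Predicted throwing points for a given distance from target."""
    return SLOPE_B * distance_feet + INTERCEPT_A

# Distance of 20 feet, as in the worked example
predicted_at_20 = predict_points(20)
print("predicted points at 20 feet:", round(predicted_at_20, 3))
```

The rounded result should agree with the hand-worked value of 40.141.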
SPSS Regression Set-up
Criterion,
y-axis variable,
what you're trying
to predict

Predictor,
x-axis variable,
what you're basing
the prediction on

Note: Never refer to the IV or DV when doing regression


Getting Regression Info from SPSS
See correlation & regression worksheet

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .777a   .603       .581                18.476
a. Predictors: (Constant), Distance from target

Coefficients(a)

                          Unstandardized Coefficients   Standardized Coefficients
Model                     B          Std. Error         Beta                        t        Sig.
1  (Constant)             125.401    14.265                                         8.791    .000
   Distance from target   -4.263     .815               -.777                       -5.230   .000
a. Dependent Variable: Total ball toss points

Reading the equation off the B column:
y' = b(x) + a
a = 125.401 (Constant)
b = -4.263 (Distance from target)
y' = -4.263(20) + 125.401
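
A few of the numbers in this output can be cross-checked against each other. The sketch below assumes two standard facts about simple (one-predictor) regression: each t value is B divided by its standard error, and the standardized Beta equals r, so Beta squared should match R Square. Small differences are expected because the printed values are rounded.

```python
# Values copied from the SPSS Coefficients and Model Summary output
coefficients = {
    "(Constant)": {"B": 125.401, "std_error": 14.265},
    "Distance from target": {"B": -4.263, "std_error": 0.815},
}
beta = -0.777
r_square_reported = 0.603

# t = B / Std. Error for each row of the Coefficients table
t_values = {name: row["B"] / row["std_error"] for name, row in coefficients.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")

# With a single predictor, Beta is r, so Beta squared is R Square
r_square_from_beta = beta ** 2
print("Beta squared:", round(r_square_from_beta, 4), "reported R Square:", r_square_reported)
```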
Predictive Ability
Mantra!!
As variability decreases, prediction accuracy ___
if we can account for variance, we can make better predictions
As r increases:
r² increases
variance accounted for increases
the prediction accuracy increases
prediction error decreases (distance between y and y')
Sy' decreases
the standard error of the residual/predictor
measures overall amount of prediction error
We like big r's!!!
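
To show how prediction error is measured, here is a sketch that fits a line to made-up data and computes the standard error of the estimate as the square root of the summed squared residuals divided by n - 2, the usual formula for one predictor. The data and helper names are hypothetical, not taken from the worksheet.

```python
import math

def fit_line(xs, ys):
    """Return (b, a) for the least-squares line: predicted y = b*x + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x
    return b, mean_y - b * mean_x

# Hypothetical data: distance from target (x) and throwing points (y)
distances = [10, 12, 14, 16, 18, 20]
points = [80, 75, 60, 55, 45, 30]
n = len(points)

b, a = fit_line(distances, points)
predicted = [b * x + a for x in distances]
# Residual = actual y minus predicted y
residuals = [y - y_hat for y, y_hat in zip(points, predicted)]

mean_y = sum(points) / n
ss_residual = sum(e ** 2 for e in residuals)
ss_total = sum((y - mean_y) ** 2 for y in points)

r_squared = 1 - ss_residual / ss_total
std_error_estimate = math.sqrt(ss_residual / (n - 2))
print("r squared:", round(r_squared, 3))
print("standard error of the estimate:", round(std_error_estimate, 3))
```

The two quantities move in opposite directions: the more variance the line accounts for, the smaller the leftover residuals and the smaller the standard error.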
Drawing a Regression Line by Hand
Three steps

1. Plug zero in for x to get a y value, and then plot this value
Note: It will be the y-intercept

2. Plug in a large value for x (just so it falls on the right end of the graph), then plot the resulting point

3. Connect the two points with a straight line!
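
The three steps can be sketched with the ball toss equation (slope -4.263, intercept 125.401). The choice of x = 26 for the second point is an assumption, picked because it sits at the right end of the distance axis on the regression graphic.

```python
SLOPE_B = -4.263
INTERCEPT_A = 125.401

def predicted_y(x):
    """Predicted points for a given distance from target."""
    return SLOPE_B * x + INTERCEPT_A

# Step 1: plug in zero for x; this point is the y-intercept
left_point = (0, predicted_y(0))

# Step 2: plug in a large x near the right end of the graph (26 assumed here)
right_point = (26, predicted_y(26))

# Step 3: the regression line is the straight line through these two points
rise = right_point[1] - left_point[1]
run = right_point[0] - left_point[0]
print("left point: ", left_point)
print("right point:", (right_point[0], round(right_point[1], 3)))
print("slope between the points:", round(rise / run, 3))
```

The slope between the two plotted points comes back as b, confirming that the hand-drawn line is the regression line.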