
ADVANCED CORRELATIONAL STRATEGIES

Chapter 8


The Pearson r correlation examines the linear association between two variables. Related techniques examine how and why sets of variables are related.
Strategies we will cover:

regression analysis
cross-lagged panel design
structural equation modeling
multilevel modeling
factor analysis

PREDICTING BEHAVIOR: REGRESSION STRATEGIES

Regression analysis involves writing a regression equation that:

Provides a mathematical description of how variables are related. Allows us to predict scores on one variable based on one or more other variables.

Simple linear regression involves one predictor variable. Multiple regression involves 2+ predictor variables.

LINEAR REGRESSION

A linear regression equation defines the straight line that best represents the linear relationship between two variables. The regression line goes through the center of the data (on a scatterplot).

LINEAR REGRESSION
[Figure: example regression lines with a positive slope and a negative slope]

LINEAR REGRESSION
When variables are linearly related, we can describe their relationship with the equation for a straight line: y = β0 + β1x

y = the variable we would like to predict (the dependent variable, criterion variable, or outcome variable)
x = the variable we are using to predict y (the predictor variable)

LINEAR REGRESSION
y = β0 + β1x

β0 and β1 are fixed constants that define the line

β0 is the regression constant: the y-intercept of the line that best fits the data in the scatterplot. The value of y when x = 0.

β1 is the regression coefficient: the slope of the line that best fits the data in the scatterplot. How much will y change if x increases by one unit? Rise over run.

So, we estimate an intercept and slope for the line
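As a concrete sketch, the least-squares intercept and slope can be computed directly with numpy. All numbers here are hypothetical, chosen only to illustrate the formulas:

```python
import numpy as np

# Hypothetical data: hours studied (x) and exam score (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 60.0, 63.0, 71.0, 74.0])

# Least-squares estimates: slope = rise over run for the best-fitting line,
# intercept = predicted y when x = 0
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# The fitted line is then: y-hat = intercept + slope * x
```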

LINEAR REGRESSION

b vs. β

b is an unstandardized regression coefficient:

You can interpret b as the predicted change in y given a one-unit change in x
Uses the original scales of each variable

β (beta) is a standardized regression coefficient:

Predicted change in y (in standard deviations) for a one-standard-deviation change in x
Independent of the original scales of the variables
Allows you to compare the size of different slopes
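A quick numpy check with simulated data: with a single predictor, the standardized coefficient β equals the Pearson r, while b stays on the raw scales.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)   # simulated predictor and outcome

# Unstandardized b: predicted change in y per one-unit change in x
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Standardized beta: predicted change in y (in SDs) per one-SD change in x.
# With a single predictor, beta is identical to the Pearson correlation r.
beta = b * x.std(ddof=1) / y.std(ddof=1)
r = np.corrcoef(x, y)[0, 1]
```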

REGRESSION AND ERROR


What does the best-fitting line actually mean? ŷ is the predicted y: the value of y predicted by the regression equation for each value of x. (y − ŷ) is the distance of each data point from the regression line:

A.k.a., a residual
Residuals represent unexplained error

WHAT DOES THE BEST FITTING LINE MEAN?


ŷ is the predicted y: the value of y predicted by the regression equation for each value of x, where

ŷ = β0 + β1x

(y − ŷ) is the distance of each data point from the regression line:

A residual
Residuals represent unexplained error

[Figure: a perfectly fitting regression line (all points on the line) vs. an imperfectly fitting regression line (points scattered around the line)]

WHAT DOES THE BEST FITTING LINE MEAN?

Regression minimizes total squared error:

Total squared error = sum of squared residuals
Least squares criterion: minimizes the distance between all data points and the line (minimizes residual error)

The hypothesis test is a type of F-test:

F = MS_regression / MS_residual
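A numpy sketch (hypothetical data) of how the squared-error pieces fit together and produce the F ratio:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])  # hypothetical data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
residuals = y - y_hat

ss_regression = np.sum((y_hat - y.mean()) ** 2)  # explained variability
ss_residual = np.sum(residuals ** 2)             # total squared error (what least squares minimizes)

# F = MS_regression / MS_residual, with df = 1 and n - 2 for simple regression
F = (ss_regression / 1) / (ss_residual / (len(x) - 2))
```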

MULTIPLE REGRESSION ANALYSIS

Multiple regression analyses use more than one predictor variable:

y = β0 + β1x + β2z + …

Three types:

Standard (or simultaneous)
Stepwise
Hierarchical

STANDARD MULTIPLE REGRESSION

Standard (or simultaneous) multiple regression: all of the predictor variables are entered into the regression analysis at the same time. The resulting equation provides a regression constant (β0, the intercept) and separate regression coefficients for each predictor (e.g., β1, β2, β3, …)
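For instance, a simultaneous fit of two predictors can be sketched with numpy; all data and variable names here are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)                     # predictor 1
z = rng.normal(size=n)                     # predictor 2
y = 1.0 + 0.5 * x - 0.3 * z + rng.normal(scale=0.1, size=n)

# Both predictors are entered at the same time; the design matrix gets a
# column of 1s so the fit returns the regression constant (intercept)
X = np.column_stack([np.ones(n), x, z])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
```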

STEPWISE MULTIPLE REGRESSION

Stepwise multiple regression builds the regression equation by entering predictor variables one at a time, based on their ability to predict the outcome variable. At each step:

Do any variables predict significant unique variance (above and beyond the other predictors)? If so, which is the strongest predictor? (Add strongest predictor and repeat until no predictors account for significant unique variance)

Each step looks at unique associations Not the same as a normal zero-order Pearson correlation!

HIERARCHICAL MULTIPLE REGRESSION

Hierarchical multiple regression: the predictor variables are entered into the equation in an order that is predetermined by the researcher.
As each new variable is entered into the equation, the researcher tests whether the new variable significantly predicts unique variance in the criterion variable. Can be used:

To control for confounding variables To test interactions with continuous variables (moderation) To test for mediation

MULTIPLE CORRELATION COEFFICIENT (R)

The multiple correlation coefficient (R) describes the degree of relationship between the criterion variable (y) and the set of all predictor variables. R can range from 0 to 1.00. The larger the value of R, the better job the regression equation does of predicting the criterion variable from the predictor variables.

R2 shows the proportion of variance in the criterion variable that can be accounted for by the set of all predictor variables.
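A sketch of computing R and R² from a two-predictor fit (simulated data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x, z = rng.normal(size=n), rng.normal(size=n)
y = 0.6 * x + 0.4 * z + rng.normal(size=n)  # simulated criterion

X = np.column_stack([np.ones(n), x, z])
y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

# R: the correlation between the criterion and the predicted scores
R = np.corrcoef(y, y_hat)[0, 1]
# R^2: proportion of variance in y accounted for by the set of predictors
R2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```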

INTERPRETING REGRESSION EQUATIONS


Just like a correlation, a significant linear regression analysis does not equal causation! Simple linear regression only explains linear relationships. Not all regression equations are useful or meaningful. Ideally, they should be:

Parsimonious Likely to replicate

STATISTICALLY CONTROLLING FOR CONFOUNDING VARIABLES

Confounded variables tend to co-occur, making it difficult to look at their separate, independent effects. Hierarchical Multiple Regression allows you to statistically control for confounding variables (confounds)
Step 1: Enter the confound or control variable Step 2: Enter the predictor you are interested in to test its unique effects, over and above the control variable

Example: Does education predict well-being, controlling for income?


Step 1: income → well-being
Step 2: education → well-being, controlling for income

THIS is what I mean by control variables!
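A rough simulation of the two-step logic. Everything here is made-up data; in practice you would use a stats package that also reports the significance test for the change in R²:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on X (X includes the intercept column)."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

# Simulated data: income is a confound; education adds unique variance
rng = np.random.default_rng(3)
n = 300
income = rng.normal(size=n)
education = 0.5 * income + rng.normal(size=n)            # confounded with income
wellbeing = 0.4 * income + 0.3 * education + rng.normal(size=n)

ones = np.ones(n)
r2_step1 = r_squared(np.column_stack([ones, income]), wellbeing)
r2_step2 = r_squared(np.column_stack([ones, income, education]), wellbeing)

delta_r2 = r2_step2 - r2_step1  # unique variance explained by education
```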

EXAMPLE: DO CHILDREN MAKE YOU HAPPY (OR MISERABLE)?

Nelson et al. (2013): Parents reported higher levels of life satisfaction; β = 0.22, p < .001
Bhargava et al. (2013): Need to control for confounding factors! Parents are more likely to be married! They reanalyzed the data from Nelson et al. (2013) by statistically controlling for marital status:

Step 1: Marital status (0, 1) → Life satisfaction: β = 0.65, p < .001
Step 2: Parental status (0, 1) → Life satisfaction: β = −0.05, p = .36. Result is not significant!

INTERACTIONS WITH CONTINUOUS VARIABLES


Factorial ANOVA lets you test interactions with discrete variables Hierarchical multiple regression lets you test interactions with continuous variables

MAIN EFFECT: Individual effect due to one variable


does exercise intensity influence sleep? does time of day of exercise influence sleep?

INTERACTION: Combined effect of two variables

does the effect of exercise intensity on sleep differ depending on time of day?

INTERPRETING INTERACTIONS
An interaction means that the effect of one variable depends on the value of the other variable Non-parallel lines (that cross or converge) in a graph indicate an interaction

[Figure: sleep (hours, 5–8) for mild vs. intense workouts, with separate lines for morning and night exercise]

GRAPH OF GROUP MEANS WITH AND WITHOUT


INTERACTION

NO interaction (lines are parallel):

Interaction (lines are NOT parallel):

This represents two main effects

INTERPRETING MAIN EFFECTS AND INTERACTIONS FROM GRAPHS AND TABLES

Example: Effects of practice and reward (high/low) on performance Interaction = Lines are NOT parallel Main effect = One condition is higher than the other, after averaging across the levels of the other factor

NO MAIN EFFECTS, NO INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              4             4          4
HIGH             4             4          4
MARGINAL MEANS   4             4

MAIN EFFECT OF PRACTICE, NO INTERACTION

[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              4             6          5
HIGH             4             6          5
MARGINAL MEANS   4             6

MAIN EFFECT OF REWARD, NO INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             2          2
HIGH             4             4          4
MARGINAL MEANS   3             3

TWO MAIN EFFECTS, NO INTERACTION

[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             4          3
HIGH             4             6          5
MARGINAL MEANS   3             5

NO MAIN EFFECTS, INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             4          3
HIGH             4             2          3
MARGINAL MEANS   3             3

MAIN EFFECT OF PRACTICE, INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              6             2          4
HIGH             4             4          4
MARGINAL MEANS   5             3

MAIN EFFECT OF REWARD, INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             4          3
HIGH              6            4          5
MARGINAL MEANS   4             4

TWO MAIN EFFECTS AND AN INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             1          1.5
HIGH             4             7          5.5
MARGINAL MEANS   3             4

TESTING INTERACTIONS WITH CONTINUOUS VARIABLES USING HIERARCHICAL MULTIPLE REGRESSION

First, you need to center your continuous predictor variables


Subtract the mean from each individual's score. The new mean will be zero for both variables. If you have a discrete variable, code it as 0 or 1.

Second, calculate an interaction term by multiplying the two centered predictor variables Conduct a hierarchical multiple regression:

Step 1: Enter the two centered predictor variables Step 2: Enter the interaction term

Interpret the results at Step 2:


Two main effects (one for each predictor variable) Interaction
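The centering and interaction-term recipe above can be sketched in numpy with simulated exercise-and-sleep data (all values invented):

```python
import numpy as np

# Simulated exercise-and-sleep data
rng = np.random.default_rng(4)
n = 500
intensity = rng.normal(loc=5, scale=2, size=n)        # continuous predictor
morning = rng.integers(0, 2, size=n).astype(float)    # discrete predictor coded 0/1

# Step 1 prep: center the continuous predictor (subtract its mean)
intensity_c = intensity - intensity.mean()

# Step 2 prep: the interaction term is the product of the predictors
interaction = intensity_c * morning

# Simulate sleep where intensity matters more in the morning
sleep = (6 + 0.2 * intensity_c + 0.3 * morning + 0.4 * interaction
         + rng.normal(scale=0.5, size=n))

# Entering both main effects and the interaction term together
X = np.column_stack([np.ones(n), intensity_c, morning, interaction])
b0, b_intensity, b_morning, b_interaction = np.linalg.lstsq(X, sleep, rcond=None)[0]
```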

EXAMPLE OF INTERPRETING AN INTERACTION

There was a main effect of exercise intensity, such that more intense exercise predicted better sleep quality. This main effect was qualified by a significant interaction between exercise intensity and time of exercise such that exercise intensity was more strongly positively associated with sleep quality if the exercise occurred in the morning rather than at night.
[Figure: sleep quality (0–8) by exercise intensity (low vs. high), with separate lines for night and morning exercise; the morning line rises more steeply]

MEDIATION AND MODERATION

An interaction can also be called moderation (one variable moderates the effect of the other). Mediation means that the association between a predictor and an outcome variable can be accounted for, or explained, by another variable.

Step 1: Show that predictor predicts outcome Step 2: Add the mediating variable and see if it helps explain the association
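A numpy sketch of those two steps, using a simulated fully mediated chain (all data invented; real mediation analyses also test the indirect effect, e.g., with bootstrapping):

```python
import numpy as np

def first_slope(X, y):
    """Coefficient on the first predictor column (after the intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Simulated, fully mediated chain: lonely -> interaction quality -> sleep
rng = np.random.default_rng(5)
n = 1000
lonely = rng.normal(size=n)
quality = -0.5 * lonely + rng.normal(size=n)   # mediator
sleep = 0.6 * quality + rng.normal(size=n)     # outcome depends only on the mediator

ones = np.ones(n)

# Step 1: the predictor predicts the outcome on its own
c = first_slope(np.column_stack([ones, lonely]), sleep)

# Step 2: adding the mediator should shrink the predictor's effect toward zero
c_prime = first_slope(np.column_stack([ones, lonely, quality]), sleep)
```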

MEDIATION EXAMPLE

Feeling Lonely → Social Interaction Quality → Sleep Quality

MEDIATION EXAMPLE

[Path diagram: Feeling Lonely → Social Interaction Quality → Sleep Quality, with path coefficients −.31***, −.05*, .03*, and .01 (.03*) shown on the paths]

CROSS-LAGGED PANEL DESIGN

In a cross-lagged panel design, the correlation between two variables, x and y, is calculated at two different points in time:

Correlate the scores on x at Time 1 with the scores on y at Time 2 Correlate the scores on y at Time 1 with the scores on x at Time 2 If x causes y, then the correlation between x at Time 1 and y at Time 2 should be larger than the correlation between y at Time 1 and x at Time 2.
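Sketch of the comparison (simulated panel data in which x really does drive later y; all numbers invented):

```python
import numpy as np

# Simulated panel: x measured at Times 1 and 2 drives later y
rng = np.random.default_rng(6)
n = 400
x_t1 = rng.normal(size=n)
y_t1 = rng.normal(size=n)
x_t2 = 0.5 * x_t1 + rng.normal(scale=0.8, size=n)               # x is stable over time
y_t2 = 0.4 * y_t1 + 0.5 * x_t1 + rng.normal(scale=0.8, size=n)  # x causes later y

r_x1_y2 = np.corrcoef(x_t1, y_t2)[0, 1]  # cross-lag consistent with x -> y
r_y1_x2 = np.corrcoef(y_t1, x_t2)[0, 1]  # cross-lag consistent with y -> x
# If x causes y, r_x1_y2 should exceed r_y1_x2
```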

EXAMPLE OF A CROSS-LAGGED PANEL DESIGN


Because the correlation between TV violence at Time 1 and aggressiveness at Time 2 (.31) is greater than the correlation between aggressiveness at Time 1 and TV violence at Time 2 (.01), these results support the hypothesis that watching violent TV increases later aggression.

STRUCTURAL EQUATION MODELING

In structural equation modeling, the researcher makes a prediction regarding how a set of variables is causally related. This prediction implies that the variables ought to be correlated in a particular pattern. This predicted pattern is then compared to the actual pattern of correlations.

STRUCTURAL EQUATION MODELING (SEM)


A fit index indicates how well the hypothesized model fits the observed data. If the hypothesized model does not adequately fit the data, we can conclude that the model is not likely to be correct. By comparing fit indices for various models, the researcher can determine which model fits the data best.

EXAMPLE OF SEM (ALSO, OF DYADIC DATA)

Perceived and Actual Similarity in Humor Appreciation

EXAMPLE OF SEM WITH LATENT VARIABLES

A latent variable is measured indirectly using several manifest (measured) variables

(From Hoyle, 1995)

ANOTHER EXAMPLE OF SEM

MULTILEVEL MODELING (MLM)

Used to analyze data sets with a nested structure. For example, students may be nested within classrooms, which are nested within schools. Because the responses of students within a classroom are not independent of one another, such data cannot be analyzed using traditional statistical techniques that require independence of observations.

Also: Hierarchical Linear Modeling (HLM)

ANALYZING NESTED DATA: MULTILEVEL MODELING

Multilevel modeling separates the various influences that are operating at various levels of the nested data structure

For example, it would allow us to examine the separate influences of students' personal capabilities, features of the classroom, and aspects of the school.

EXAMPLE OF MULTILEVEL MODELING


Examining fluctuations in loneliness across days Daily diary study:

In an initial session, measured general trait loneliness. Then, measured daily loneliness every day for 2 weeks: "Today, to what extent did you feel …"

in tune with the people around you?
isolated from others?
there are people who care about you?
your social relationships were superficial?

These data are nested / hierarchical:


Level 1: Variability within individuals across days (state) Level 2: Variability between individuals (trait)

Looking at the total variance in loneliness:


49% exists at Level 1 (within-person variability) 51% exists at Level 2 (between-person variability)
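That within/between split can be sketched like this (simulated diary data, 50 people × 14 days; all numbers invented):

```python
import numpy as np

# Simulated diary data: each person's daily loneliness scores equal a
# stable trait level (between-person) plus day-to-day noise (within-person)
rng = np.random.default_rng(7)
n_people, n_days = 50, 14
trait = rng.normal(size=(n_people, 1))               # Level 2: between-person
daily = trait + rng.normal(size=(n_people, n_days))  # adds Level 1: within-person

person_means = daily.mean(axis=1, keepdims=True)
between_var = np.var(person_means)           # variability between individuals (trait)
within_var = np.var(daily - person_means)    # variability within individuals across days (state)

pct_within = within_var / (between_var + within_var)
```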

Dependent (outcome) measure: Sleep quality

Level 2 (trait loneliness):

b = 0.32*** People who are generally lonely have poorer-quality sleep

Level 1 (state loneliness):

b = −.05* People tend to have poorer-quality sleep after a relatively lonely day

FACTOR ANALYSIS

Factor analysis is a class of statistical techniques that are used to analyze the interrelationships among a large number of variables. The presence of correlations among several variables suggests that the variables may all be related to some underlying factors. Factor analysis does not tell you what each factor means.

FACTOR ANALYSIS

In this matrix, Variables A, B, and C correlate highly with each other, whereas Variables D and E correlate highly with each other. This pattern suggests that there may be two factors underlying this pattern of correlations.

     A      B      C      D      E
A    1.00
B    .78   1.00
C    .85    .70   1.00
D    .01    .09   -.02   1.00
E   -.07    .00    .04    .86   1.00

FACTOR ANALYSIS

Factor analysis is used to identify the minimum number of factors needed to account for the relationships among the variables. The factor matrix is used to interpret the nature of the underlying factors. Factor loadings are correlations between the variables and the factors. Variables that correlate highly with a factor are said to load on that factor.
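Using the correlation matrix from the earlier slide, a quick eigenvalue check (the core of principal-components-style factoring; a sketch, not a full factor analysis) suggests two factors:

```python
import numpy as np

# Correlation matrix from the slide: A, B, C hang together; so do D and E
R = np.array([
    [1.00,  .78,  .85,  .01, -.07],
    [ .78, 1.00,  .70,  .09,  .00],
    [ .85,  .70, 1.00, -.02,  .04],
    [ .01,  .09, -.02, 1.00,  .86],
    [-.07,  .00,  .04,  .86, 1.00],
])

# Eigenvalues of the correlation matrix; large ones mark factors worth keeping
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
n_factors = int(np.sum(eigenvalues > 1.0))  # Kaiser criterion: eigenvalue > 1
```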

USES OF FACTOR ANALYSIS


1. To study the underlying structure of psychological constructs
2. To reduce a large number of variables to a smaller, more manageable set of data
3. To confirm that self-report measures of attitude and personality are unidimensional (measure only one thing)

EXAMPLE: AFFECT

To what extent do you feel the following way right now?

happy, distressed, ashamed, enthusiastic, upset, afraid, joyful, delighted, cheerful, irritable

How should we score this measure of affect?

CORRELATIONS AMONG ITEMS

              happy   joyful  cheerful  delighted  enthusiastic  upset   afraid  irritable  ashamed  distressed
happy          1
joyful         .66**    1
cheerful       .64**    .69**    1
delighted      .73**    .78**    .69**     1
enthusiastic   .59**    .72**    .65**     .70**      1
upset         -.08      .07     -.02      -.09        .01          1
afraid        -.20*    -.03     -.21*     -.16       -.15         .37**    1
irritable     -.10      .02      .03      -.04       -.06         .56**   .18      1
ashamed       -.15     -.03     -.09      -.14       -.10         .47**   .38**   .63**      1
distressed    -.16      .02     -.01      -.02       -.06         .43**   .32**   .48**     .39**      1

FACTOR ANALYSIS

Two factors:
Factor 1 explains 39% of the variance Factor 2 explains 27% of the variance

These two factors explain 65% of the total variance in the items

EXAMPLE: SATISFACTION WITH LIFE SCALE

Use the following 1-7 scale to indicate your agreement with each item.
1 = Strongly Disagree, 2 = Disagree, 3 = Slightly Disagree, 4 = Neither Agree nor Disagree, 5 = Slightly Agree, 6 = Agree, 7 = Strongly Agree

______1. In most ways my life is close to my ideal. ______2. The conditions of my life are excellent. ______3. I am satisfied with life. ______4. So far I have gotten the important things I want in life. ______5. If I could live my life over, I would change almost nothing.

Is this measure really unidimensional?

FACTOR ANALYSIS

One factor explains 64% of the variance

That seems pretty unidimensional! Now, we can just calculate an average for those 5 items.

WHAT ABOUT LIFE SATISFACTION AND AFFECT TOGETHER?

WHAT ABOUT LIFE SATISFACTION AND AFFECT TOGETHER?

It looks like we can calculate three averages:


Positive affect (α = .92), Negative affect (α = .76), Life satisfaction (α = .85)

                   LifeSatisfaction   PositiveAffect   NegativeAffect
LifeSatisfaction   1
PositiveAffect     .495**             1
NegativeAffect    -.230*             -.101              1
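The α values above are Cronbach's alpha; a minimal implementation, checked against simulated unidimensional items, looks like this:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated check: 5 items all driven by one underlying factor hang together
rng = np.random.default_rng(8)
factor = rng.normal(size=(500, 1))
items = factor + 0.5 * rng.normal(size=(500, 5))  # shared signal + item noise

alpha = cronbach_alpha(items)
```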

EXAMPLE: HOW COUPLES USE HUMOR


VAR     ITEM TEXT                                          FUN/BOND   EMO SUPT   DESTRESS
Fun1    To have fun with your partner (P)                  .870
Fun2    To have a good time with your P                    .826
Fun3    To see your P smile                                .832
Fun4    To laugh with your P                               .825
Fun5    To bond with your P                                .802
EmoS1   To let your P know that you care                              .697
EmoS2   To give your P emotional support                              .684
EmoS3   To let your P know that you are on his/her side               .675
EmoS4   To show your P you understand                                 .657
EmoS5   To open up with your P                                        .644
Strss1  To take the edge off of a tough day                                      .783
Strss2  To lighten your worries                                                  .780
Strss3  To get through tough days                                                .770
Strss4  To take your minds off of your problems                                  .764
Strss5  To relieve stress                                                        .737

EXAMPLE: HOW COUPLES USE HUMOR


VAR     ITEM TEXT                                SHOW ANGER   HURT   AVOID ISSUE
Ang1    To let your P know how annoyed you are   .797
Ang2    To let your P know you're angry          .825
Ang3    To make it clear you're irritated        .819
Ang4    To show your P you're mad                .792
Ang5    To tell your P you're upset              .737
Hurt1   To hurt your P                                        .820
Hurt2   To insult your P                                      .841
Hurt3   To be mean to your P                                  .854
Hurt4   To put your P down                                    .858
Hurt5   To make your P feel stupid                            .840
Avoid1  To dodge an issue                                            .862
Avoid2  To sidestep a problem                                        .846
Avoid3  To avoid a subject                                           .834
Avoid4  To avoid talking about serious issues                        .832
Avoid5  To get out of a serious discussion                           .813

HIGHER ORDER FACTOR ANALYSIS (HUMOR USES)


Subscale                 Positive Uses   Negative Uses
To give support          .829            .197
To relieve stress        .796            .261
To have fun / bond       .733            .009
To express anger         .239            .831
To hurt your partner    -.025            .745
To avoid an issue        .339            .731

QUESTIONS?
