
ADVANCED CORRELATIONAL STRATEGIES

Chapter 8


The Pearson r correlation examines the linear association between two variables. Related techniques examine how and why sets of variables are related.
Strategies we will cover:

regression analysis
cross-lagged panel design
structural equation modeling
multilevel modeling
factor analysis

PREDICTING BEHAVIOR: REGRESSION STRATEGIES

Regression analysis involves writing a regression equation that:

Provides a mathematical description of how variables are related. Allows us to predict scores on one variable based on one or more other variables.

Simple linear regression involves one predictor variable. Multiple regression involves 2+ predictor variables.

LINEAR REGRESSION

A linear regression equation defines the straight line that best represents the linear relationship between two variables. The regression line goes through the center of the data (on a scatterplot).

LINEAR REGRESSION
[Figure: example regression lines with a positive slope and a negative slope]

LINEAR REGRESSION
When variables are linearly related, we can describe their relationship with the equation for a straight line: y = β0 + β1x

y = the variable we would like to predict (the dependent variable, criterion variable, or outcome variable)
x = the variable we are using to predict y (the predictor variable)

LINEAR REGRESSION
y = β0 + β1x

β0 and β1 are fixed constants that define the line

β0 is the regression constant: the y-intercept of the line that best fits the data in the scatterplot. The value of y when x = 0.

β1 is the regression coefficient: the slope of the line that best fits the data in the scatterplot. How much will y change if x increases by one unit? Rise over run.

So, we estimate an intercept and slope for the line
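As a concrete sketch, the least-squares intercept and slope can be computed directly with numpy. All numbers here are hypothetical, chosen only to illustrate the formulas:

```python
import numpy as np

# Hypothetical data: hours studied (x) and exam score (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 60.0, 63.0, 71.0, 74.0])

# Least-squares estimates: slope = rise over run for the best-fitting line,
# intercept = predicted y when x = 0
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# The fitted line is then: y-hat = intercept + slope * x
```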

LINEAR REGRESSION

b vs. β

b is an unstandardized regression coefficient:

You can interpret b as the predicted change in y given a one-unit change in x
Uses the original scales of each variable

β (beta) is a standardized regression coefficient:

Predicted change in y (in standard deviations) for a one-standard-deviation change in x
Independent of the original scales of the variables
Allows you to compare the size of different slopes
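A quick numpy check with simulated data: with a single predictor, the standardized coefficient β equals the Pearson r, while b stays on the raw scales.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)   # simulated predictor and outcome

# Unstandardized b: predicted change in y per one-unit change in x
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Standardized beta: predicted change in y (in SDs) per one-SD change in x.
# With a single predictor, beta is identical to the Pearson correlation r.
beta = b * x.std(ddof=1) / y.std(ddof=1)
r = np.corrcoef(x, y)[0, 1]
```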

REGRESSION AND ERROR


What does the best-fitting line actually mean? ŷ is the predicted y: the value of y predicted by the regression equation for each value of x. (y − ŷ) is the distance of each data point from the regression line:

A.k.a., a residual
Residuals represent unexplained error

WHAT DOES THE BEST FITTING LINE MEAN?


ŷ is the predicted y: the value of y predicted by the regression equation for each value of x, where

ŷ = β0 + β1x

(y − ŷ) is the distance of each data point from the regression line:

A residual
Residuals represent unexplained error

[Figure: a perfectly fitting regression line (all points on the line) vs. an imperfectly fitting regression line (points scattered around the line)]

WHAT DOES THE BEST FITTING LINE MEAN?

Regression minimizes total squared error:

Total squared error = sum of squared residuals
Least squares criterion: minimizes the distance between all data points and the line (minimizes residual error)

The hypothesis test is a type of F-test:

F = MS_regression / MS_residual
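A numpy sketch (hypothetical data) of how the squared-error pieces fit together and produce the F ratio:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])  # hypothetical data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
residuals = y - y_hat

ss_regression = np.sum((y_hat - y.mean()) ** 2)  # explained variability
ss_residual = np.sum(residuals ** 2)             # total squared error (what least squares minimizes)

# F = MS_regression / MS_residual, with df = 1 and n - 2 for simple regression
F = (ss_regression / 1) / (ss_residual / (len(x) - 2))
```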

MULTIPLE REGRESSION ANALYSIS

Multiple regression analyses use more than one predictor variable:

y = β0 + β1x + β2z + …

Three types:

Standard (or simultaneous)
Stepwise
Hierarchical

STANDARD MULTIPLE REGRESSION

Standard (or simultaneous) multiple regression: all of the predictor variables are entered into the regression analysis at the same time. The resulting equation provides a regression constant (β0, the intercept) and separate regression coefficients for each predictor (e.g., β1, β2, β3, …)
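For instance, a simultaneous fit of two predictors can be sketched with numpy; all data and variable names here are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)                     # predictor 1
z = rng.normal(size=n)                     # predictor 2
y = 1.0 + 0.5 * x - 0.3 * z + rng.normal(scale=0.1, size=n)

# Both predictors are entered at the same time; the design matrix gets a
# column of 1s so the fit returns the regression constant (intercept)
X = np.column_stack([np.ones(n), x, z])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
```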

STEPWISE MULTIPLE REGRESSION

Stepwise multiple regression builds the regression equation by entering predictor variables one at a time, based on their ability to predict the outcome variable. At each step:

Do any variables predict significant unique variance (above and beyond the other predictors)? If so, which is the strongest predictor? (Add strongest predictor and repeat until no predictors account for significant unique variance)

Each step looks at unique associations Not the same as a normal zero-order Pearson correlation!

HIERARCHICAL MULTIPLE REGRESSION

Hierarchical multiple regression: the predictor variables are entered into the equation in an order that is predetermined by the researcher.
As each new variable is entered into the equation, the researcher tests whether the new variable significantly predicts unique variance in the criterion variable. Can be used:

To control for confounding variables To test interactions with continuous variables (moderation) To test for mediation

MULTIPLE CORRELATION COEFFICIENT (R)

The multiple correlation coefficient (R) describes the degree of relationship between the criterion variable (y) and the set of all predictor variables. R can range from 0 to 1.00. The larger the value of R, the better job the regression equation does of predicting the criterion variable from the predictor variables.

R2 shows the proportion of variance in the criterion variable that can be accounted for by the set of all predictor variables.
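A sketch of computing R and R² from a two-predictor fit (simulated data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x, z = rng.normal(size=n), rng.normal(size=n)
y = 0.6 * x + 0.4 * z + rng.normal(size=n)  # simulated criterion

X = np.column_stack([np.ones(n), x, z])
y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

# R: the correlation between the criterion and the predicted scores
R = np.corrcoef(y, y_hat)[0, 1]
# R^2: proportion of variance in y accounted for by the set of predictors
R2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```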

INTERPRETING REGRESSION EQUATIONS


Just like a correlation, a significant linear regression analysis does not equal causation! Simple linear regression only explains linear relationships. Not all regression equations are useful or meaningful. Ideally, they should be:

Parsimonious Likely to replicate

STATISTICALLY CONTROLLING FOR CONFOUNDING VARIABLES

Confounded variables tend to co-occur, making it difficult to look at their separate, independent effects. Hierarchical Multiple Regression allows you to statistically control for confounding variables (confounds)
Step 1: Enter the confound or control variable Step 2: Enter the predictor you are interested in to test its unique effects, over and above the control variable

Example: Does education predict well-being, controlling for income?


Step 1: income → well-being
Step 2: education → well-being, controlling for income

THIS is what I mean by control variables!
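A rough simulation of the two-step logic. Everything here is made-up data; in practice you would use a stats package that also reports the significance test for the change in R²:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit of y on X (X includes the intercept column)."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

# Simulated data: income is a confound; education adds unique variance
rng = np.random.default_rng(3)
n = 300
income = rng.normal(size=n)
education = 0.5 * income + rng.normal(size=n)            # confounded with income
wellbeing = 0.4 * income + 0.3 * education + rng.normal(size=n)

ones = np.ones(n)
r2_step1 = r_squared(np.column_stack([ones, income]), wellbeing)
r2_step2 = r_squared(np.column_stack([ones, income, education]), wellbeing)

delta_r2 = r2_step2 - r2_step1  # unique variance explained by education
```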

EXAMPLE: DO CHILDREN MAKE YOU HAPPY (OR MISERABLE)?

Nelson et al. (2013): Parents reported higher levels of life satisfaction; β = 0.22, p < .001
Bhargava et al. (2013): Need to control for confounding factors! Parents are more likely to be married! They reanalyzed the data from Nelson et al. (2013) by statistically controlling for marital status:

Step 1: Marital status (0, 1) → Life satisfaction: β = 0.65, p < .001
Step 2: Parental status (0, 1) → Life satisfaction: β = −0.05, p = .36. Result is not significant!

INTERACTIONS WITH CONTINUOUS VARIABLES


Factorial ANOVA lets you test interactions with discrete variables Hierarchical multiple regression lets you test interactions with continuous variables

MAIN EFFECT: Individual effect due to one variable


does exercise intensity influence sleep? does time of day of exercise influence sleep?

INTERACTION: Combined effect of two variables

does the effect of exercise intensity on sleep differ depending on time of day?

INTERPRETING INTERACTIONS
An interaction means that the effect of one variable depends on the value of the other variable Non-parallel lines (that cross or converge) in a graph indicate an interaction

[Figure: sleep (hours, 5–8) for mild vs. intense workouts, with separate lines for morning and night exercise]

GRAPH OF GROUP MEANS WITH AND WITHOUT


INTERACTION

NO interaction (lines are parallel):

Interaction (lines are NOT parallel):

This represents two main effects

INTERPRETING MAIN EFFECTS AND INTERACTIONS FROM GRAPHS AND TABLES

Example: Effects of practice and reward (high/low) on performance Interaction = Lines are NOT parallel Main effect = One condition is higher than the other, after averaging across the levels of the other factor

NO MAIN EFFECTS, NO INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              4             4          4
HIGH             4             4          4
MARGINAL MEANS   4             4

MAIN EFFECT OF PRACTICE, NO INTERACTION

[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              4             6          5
HIGH             4             6          5
MARGINAL MEANS   4             6

MAIN EFFECT OF REWARD, NO INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             2          2
HIGH             4             4          4
MARGINAL MEANS   3             3

TWO MAIN EFFECTS, NO INTERACTION

[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             4          3
HIGH             4             6          5
MARGINAL MEANS   3             5

NO MAIN EFFECTS, INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             4          3
HIGH             4             2          3
MARGINAL MEANS   3             3

MAIN EFFECT OF PRACTICE, INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              6             2          4
HIGH             4             4          4
MARGINAL MEANS   5             3

MAIN EFFECT OF REWARD, INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             4          3
HIGH              6            4          5
MARGINAL MEANS   4             4

TWO MAIN EFFECTS AND AN INTERACTION


[Graph of group means]

REWARD           NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW              2             1          1.5
HIGH             4             7          5.5
MARGINAL MEANS   3             4

TESTING INTERACTIONS WITH CONTINUOUS VARIABLES USING HIERARCHICAL MULTIPLE REGRESSION

First, you need to center your continuous predictor variables


Subtract the mean from each individual's score. The new mean will be zero for both variables. If you have a discrete variable, code it as 0 or 1.

Second, calculate an interaction term by multiplying the two centered predictor variables Conduct a hierarchical multiple regression:

Step 1: Enter the two centered predictor variables Step 2: Enter the interaction term

Interpret the results at Step 2:


Two main effects (one for each predictor variable) Interaction
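The centering and interaction-term recipe above can be sketched in numpy with simulated exercise-and-sleep data (all values invented):

```python
import numpy as np

# Simulated exercise-and-sleep data
rng = np.random.default_rng(4)
n = 500
intensity = rng.normal(loc=5, scale=2, size=n)        # continuous predictor
morning = rng.integers(0, 2, size=n).astype(float)    # discrete predictor coded 0/1

# Step 1 prep: center the continuous predictor (subtract its mean)
intensity_c = intensity - intensity.mean()

# Step 2 prep: the interaction term is the product of the predictors
interaction = intensity_c * morning

# Simulate sleep where intensity matters more in the morning
sleep = (6 + 0.2 * intensity_c + 0.3 * morning + 0.4 * interaction
         + rng.normal(scale=0.5, size=n))

# Entering both main effects and the interaction term together
X = np.column_stack([np.ones(n), intensity_c, morning, interaction])
b0, b_intensity, b_morning, b_interaction = np.linalg.lstsq(X, sleep, rcond=None)[0]
```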

EXAMPLE OF INTERPRETING AN INTERACTION

There was a main effect of exercise intensity, such that more intense exercise predicted better sleep quality. This main effect was qualified by a significant interaction between exercise intensity and time of exercise such that exercise intensity was more strongly positively associated with sleep quality if the exercise occurred in the morning rather than at night.
[Figure: sleep quality (0–8) by exercise intensity (low vs. high), with separate lines for night and morning exercise; the morning line rises more steeply]

MEDIATION AND MODERATION

An interaction can also be called moderation (one variable moderates the effect of the other). Mediation means that the association between a predictor and an outcome variable can be accounted for, or explained, by another variable.

Step 1: Show that predictor predicts outcome Step 2: Add the mediating variable and see if it helps explain the association
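A numpy sketch of those two steps, using a simulated fully mediated chain (all data invented; real mediation analyses also test the indirect effect, e.g., with bootstrapping):

```python
import numpy as np

def first_slope(X, y):
    """Coefficient on the first predictor column (after the intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Simulated, fully mediated chain: lonely -> interaction quality -> sleep
rng = np.random.default_rng(5)
n = 1000
lonely = rng.normal(size=n)
quality = -0.5 * lonely + rng.normal(size=n)   # mediator
sleep = 0.6 * quality + rng.normal(size=n)     # outcome depends only on the mediator

ones = np.ones(n)

# Step 1: the predictor predicts the outcome on its own
c = first_slope(np.column_stack([ones, lonely]), sleep)

# Step 2: adding the mediator should shrink the predictor's effect toward zero
c_prime = first_slope(np.column_stack([ones, lonely, quality]), sleep)
```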

MEDIATION EXAMPLE

Feeling Lonely → Social Interaction Quality → Sleep Quality

MEDIATION EXAMPLE

[Path diagram: Feeling Lonely → Social Interaction Quality → Sleep Quality, with path coefficients −.31***, −.05*, .03*, and .01 (.03*) shown on the paths]

CROSS-LAGGED PANEL DESIGN

In a cross-lagged panel design, the correlation between two variables, x and y, is calculated at two different points in time:

Correlate the scores on x at Time 1 with the scores on y at Time 2 Correlate the scores on y at Time 1 with the scores on x at Time 2 If x causes y, then the correlation between x at Time 1 and y at Time 2 should be larger than the correlation between y at Time 1 and x at Time 2.
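Sketch of the comparison (simulated panel data in which x really does drive later y; all numbers invented):

```python
import numpy as np

# Simulated panel: x measured at Times 1 and 2 drives later y
rng = np.random.default_rng(6)
n = 400
x_t1 = rng.normal(size=n)
y_t1 = rng.normal(size=n)
x_t2 = 0.5 * x_t1 + rng.normal(scale=0.8, size=n)               # x is stable over time
y_t2 = 0.4 * y_t1 + 0.5 * x_t1 + rng.normal(scale=0.8, size=n)  # x causes later y

r_x1_y2 = np.corrcoef(x_t1, y_t2)[0, 1]  # cross-lag consistent with x -> y
r_y1_x2 = np.corrcoef(y_t1, x_t2)[0, 1]  # cross-lag consistent with y -> x
# If x causes y, r_x1_y2 should exceed r_y1_x2
```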

EXAMPLE OF A CROSS-LAGGED PANEL DESIGN


Because the correlation between TV violence at Time 1 and aggressiveness at Time 2 (.31) is greater than the correlation between aggressiveness at Time 1 and TV violence at Time 2 (.01), these results support the hypothesis that watching violent TV increases later aggression.

STRUCTURAL EQUATION MODELING

In structural equation modeling, the researcher makes a prediction regarding how a set of variables is causally related. This prediction implies that the variables ought to be correlated in a particular pattern. This predicted pattern is then compared to the actual pattern of correlations.

STRUCTURAL EQUATION MODELING (SEM)


A fit index indicates how well the hypothesized model fits the observed data. If the hypothesized model does not adequately fit the data, we can conclude that the model is not likely to be correct. By comparing fit indices for various models, the researcher can determine which model fits the data best.

EXAMPLE OF SEM (ALSO, OF DYADIC DATA)

Perceived and Actual Similarity in Humor Appreciation

EXAMPLE OF SEM WITH LATENT VARIABLES

A latent variable is measured indirectly using several manifest (measured) variables

(From Hoyle, 1995)

ANOTHER EXAMPLE OF SEM

MULTILEVEL MODELING (MLM)

Used to analyze data sets with a nested structure. For example, students may be nested within classrooms, which are nested within schools. Because the responses of students within a classroom are not independent of one another, such data cannot be analyzed using traditional statistical techniques that require independence of observations.

Also: Hierarchical Linear Modeling (HLM)

ANALYZING NESTED DATA: MULTILEVEL MODELING

Multilevel modeling separates the various influences that are operating at various levels of the nested data structure

For example, it would allow us to examine the separate influences of students' personal capabilities, features of the classroom, and aspects of the school.

EXAMPLE OF MULTILEVEL MODELING


Examining fluctuations in loneliness across days Daily diary study:

In an initial session, measured general trait loneliness. Then, measured daily loneliness every day for 2 weeks: "Today, to what extent did you feel …"

in tune with the people around you?
isolated from others?
there are people who care about you?
your social relationships were superficial?

These data are nested / hierarchical:


Level 1: Variability within individuals across days (state) Level 2: Variability between individuals (trait)

Looking at the total variance in loneliness:


49% exists at Level 1 (within-person variability) 51% exists at Level 2 (between-person variability)
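That within/between split can be sketched like this (simulated diary data, 50 people × 14 days; all numbers invented):

```python
import numpy as np

# Simulated diary data: each person's daily loneliness scores equal a
# stable trait level (between-person) plus day-to-day noise (within-person)
rng = np.random.default_rng(7)
n_people, n_days = 50, 14
trait = rng.normal(size=(n_people, 1))               # Level 2: between-person
daily = trait + rng.normal(size=(n_people, n_days))  # adds Level 1: within-person

person_means = daily.mean(axis=1, keepdims=True)
between_var = np.var(person_means)           # variability between individuals (trait)
within_var = np.var(daily - person_means)    # variability within individuals across days (state)

pct_within = within_var / (between_var + within_var)
```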

Dependent (outcome) measure: Sleep quality

Level 2 (trait loneliness):

b = 0.32*** People who are generally lonely have poorer-quality sleep

Level 1 (state loneliness):

b = −.05* People tend to have poorer-quality sleep after a relatively lonely day

FACTOR ANALYSIS

Factor analysis is a class of statistical techniques that are used to analyze the interrelationships among a large number of variables. The presence of correlations among several variables suggests that the variables may all be related to some underlying factors. Factor analysis does not tell you what each factor means.

FACTOR ANALYSIS

In this matrix, Variables A, B, and C correlate highly with each other, whereas Variables D and E correlate highly with each other. This pattern suggests that there may be two factors underlying this pattern of correlations.

     A      B      C      D      E
A    1.00
B    .78   1.00
C    .85    .70   1.00
D    .01    .09   -.02   1.00
E   -.07    .00    .04    .86   1.00

FACTOR ANALYSIS

Factor analysis is used to identify the minimum number of factors needed to account for the relationships among the variables. The factor matrix is used to interpret the nature of the underlying factors. Factor loadings are correlations between the variables and the factors. Variables that correlate highly with a factor are said to load on that factor.
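Using the correlation matrix from the earlier slide, a quick eigenvalue check (the core of principal-components-style factoring; a sketch, not a full factor analysis) suggests two factors:

```python
import numpy as np

# Correlation matrix from the slide: A, B, C hang together; so do D and E
R = np.array([
    [1.00,  .78,  .85,  .01, -.07],
    [ .78, 1.00,  .70,  .09,  .00],
    [ .85,  .70, 1.00, -.02,  .04],
    [ .01,  .09, -.02, 1.00,  .86],
    [-.07,  .00,  .04,  .86, 1.00],
])

# Eigenvalues of the correlation matrix; large ones mark factors worth keeping
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
n_factors = int(np.sum(eigenvalues > 1.0))  # Kaiser criterion: eigenvalue > 1
```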

USES OF FACTOR ANALYSIS


1. To study the underlying structure of psychological constructs
2. To reduce a large number of variables to a smaller, more manageable set of data
3. To confirm that self-report measures of attitude and personality are unidimensional (measure only one thing)

EXAMPLE: AFFECT

To what extent do you feel the following way right now?

happy, distressed, ashamed, enthusiastic, upset, afraid, joyful, delighted, cheerful, irritable

How should we score this measure of affect?

CORRELATIONS AMONG ITEMS

              happy   joyful  cheerful  delighted  enthusiastic  upset   afraid  irritable  ashamed  distressed
happy          1
joyful         .66**    1
cheerful       .64**    .69**    1
delighted      .73**    .78**    .69**     1
enthusiastic   .59**    .72**    .65**     .70**      1
upset         -.08      .07     -.02      -.09        .01          1
afraid        -.20*    -.03     -.21*     -.16       -.15         .37**    1
irritable     -.10      .02      .03      -.04       -.06         .56**   .18      1
ashamed       -.15     -.03     -.09      -.14       -.10         .47**   .38**   .63**      1
distressed    -.16      .02     -.01      -.02       -.06         .43**   .32**   .48**     .39**      1

FACTOR ANALYSIS

Two factors:
Factor 1 explains 39% of the variance Factor 2 explains 27% of the variance

These two factors explain 65% of the total variance in the items

EXAMPLE: SATISFACTION WITH LIFE SCALE

Use the following 1-7 scale to indicate your agreement with each item.
1 = Strongly Disagree, 2 = Disagree, 3 = Slightly Disagree, 4 = Neither Agree nor Disagree, 5 = Slightly Agree, 6 = Agree, 7 = Strongly Agree

______1. In most ways my life is close to my ideal. ______2. The conditions of my life are excellent. ______3. I am satisfied with life. ______4. So far I have gotten the important things I want in life. ______5. If I could live my life over, I would change almost nothing.

Is this measure really unidimensional?

FACTOR ANALYSIS

One factor explains 64% of the variance

That seems pretty unidimensional! Now, we can just calculate an average for those 5 items.

WHAT ABOUT LIFE SATISFACTION AND AFFECT TOGETHER?

WHAT ABOUT LIFE SATISFACTION AND AFFECT TOGETHER?

It looks like we can calculate three averages:


Positive affect (α = .92), Negative affect (α = .76), Life satisfaction (α = .85)

                   LifeSatisfaction   PositiveAffect   NegativeAffect
LifeSatisfaction   1
PositiveAffect     .495**             1
NegativeAffect    -.230*             -.101              1
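The α values above are Cronbach's alpha; a minimal implementation, checked against simulated unidimensional items, looks like this:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated check: 5 items all driven by one underlying factor hang together
rng = np.random.default_rng(8)
factor = rng.normal(size=(500, 1))
items = factor + 0.5 * rng.normal(size=(500, 5))  # shared signal + item noise

alpha = cronbach_alpha(items)
```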

EXAMPLE: HOW COUPLES USE HUMOR


VAR     ITEM TEXT                                          FUN/BOND   EMO SUPT   DESTRESS
Fun1    To have fun with your partner (P)                  .870
Fun2    To have a good time with your P                    .826
Fun3    To see your P smile                                .832
Fun4    To laugh with your P                               .825
Fun5    To bond with your P                                .802
EmoS1   To let your P know that you care                              .697
EmoS2   To give your P emotional support                              .684
EmoS3   To let your P know that you are on his/her side               .675
EmoS4   To show your P you understand                                 .657
EmoS5   To open up with your P                                        .644
Strss1  To take the edge off of a tough day                                      .783
Strss2  To lighten your worries                                                  .780
Strss3  To get through tough days                                                .770
Strss4  To take your minds off of your problems                                  .764
Strss5  To relieve stress                                                        .737

EXAMPLE: HOW COUPLES USE HUMOR


VAR     ITEM TEXT                                SHOW ANGER   HURT   AVOID ISSUE
Ang1    To let your P know how annoyed you are   .797
Ang2    To let your P know you're angry          .825
Ang3    To make it clear you're irritated        .819
Ang4    To show your P you're mad                .792
Ang5    To tell your P you're upset              .737
Hurt1   To hurt your P                                        .820
Hurt2   To insult your P                                      .841
Hurt3   To be mean to your P                                  .854
Hurt4   To put your P down                                    .858
Hurt5   To make your P feel stupid                            .840
Avoid1  To dodge an issue                                            .862
Avoid2  To sidestep a problem                                        .846
Avoid3  To avoid a subject                                           .834
Avoid4  To avoid talking about serious issues                        .832
Avoid5  To get out of a serious discussion                           .813

HIGHER ORDER FACTOR ANALYSIS (HUMOR USES)


Subscale                 Positive Uses   Negative Uses
To give support          .829            .197
To relieve stress        .796            .261
To have fun / bond       .733            .009
To express anger         .239            .831
To hurt your partner    -.025            .745
To avoid an issue        .339            .731

QUESTIONS?
