Chapter 8
The Pearson r correlation examines the linear association between two variables. Related techniques examine how and why sets of variables are related.
Strategies we will cover:
regression analysis
cross-lagged panel design
structural equation modeling
multilevel modeling
factor analysis
Regression analysis provides a mathematical description of how variables are related. It allows us to predict scores on one variable based on one or more other variables.
Simple linear regression involves one predictor variable. Multiple regression involves 2+ predictor variables.
LINEAR REGRESSION
A linear regression equation defines the straight line that best represents the linear relationship between two variables. The regression line goes through the center of the data (on a scatterplot).
[Figures: scatterplots with regression lines, one with a positive slope and one with a negative slope]
When variables are linearly related, we can describe their relationship with the equation for a straight line:
y = β₀ + β₁x
y = the variable we would like to predict (the dependent variable, criterion variable, or outcome variable)
x = the variable we are using to predict y (the predictor variable)
y = β₀ + β₁x
β₁ is a regression coefficient: the slope of the line that best fits the data in the scatterplot.
How much will y change if x increases by 1 unit (or by 1 standard deviation, when the coefficient is standardized)? Rise over run.
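The slope and intercept described above can be computed directly from data. The following is a minimal sketch using ordinary least squares; the variable names and the data (`hours`, `score`) are made up for illustration and do not come from the lecture.

```python
from statistics import mean, pstdev

def fit_line(x, y):
    """Least-squares intercept (b0) and slope (b1) for y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx        # slope: change in y per 1-unit change in x (rise over run)
    b0 = my - b1 * mx     # intercept: the line passes through (mean x, mean y)
    return b0, b1

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

b0, b1 = fit_line(hours, score)
predicted = b0 + b1 * 6   # predicted score for 6 hours of study

# The same slope expressed in standard-deviation units:
# how many SDs y changes when x increases by 1 SD
standardized_slope = b1 * pstdev(hours) / pstdev(score)

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"predicted score at 6 hours = {predicted:.2f}")
print(f"standardized slope = {standardized_slope:.3f}")
```

With a single predictor, the standardized slope equals the Pearson r between the two variables, which is why it always falls between -1 and 1 in simple regression.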
b vs. b
ŷ = β₀ + β₁x
[Figure: perfectly fit regression line]
MS regression / MS residual
In standard (or simultaneous) multiple regression, all of the predictor variables are entered into the regression analysis at the same time. The resulting equation provides a regression constant (β₀, or intercept) and separate regression coefficients for each predictor (e.g., β₁, β₂, β₃, …).
Stepwise multiple regression builds the regression equation by entering predictor variables one at a time based on their ability to predict the outcome variable. At each step:
Do any variables predict significant unique variance (above and beyond the other predictors)?
If so, which is the strongest predictor?
(Add the strongest predictor and repeat until no predictors account for significant unique variance.)
Each step looks at unique associations. This is not the same as a normal zero-order Pearson correlation!
In hierarchical multiple regression, the predictor variables are entered into the equation in an order that is predetermined by the researcher.
As each new variable is entered into the equation, the researcher tests whether the new variable significantly predicts unique variance in the criterion variable. It can be used:
To control for confounding variables
To test interactions with continuous variables (moderation)
To test for mediation
The multiple correlation coefficient (R) describes the degree of relationship between the criterion variable (y) and the set of all predictor variables. R can range from 0 to 1.00. The larger the value of R, the better job the regression equation does of predicting the criterion variable from the predictor variables.
R² shows the proportion of variance in the criterion variable that can be accounted for by the set of all predictor variables.
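To make R and R² concrete, here is a small sketch of a standard (simultaneous) multiple regression written without any external libraries. The helper names `solve` and `fit` and the data are assumptions for illustration; in practice a statistics package would do this work.

```python
def solve(a, b):
    """Solve the linear system a * coefs = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(predictors, y):
    """Return ([b0, b1, ...], R squared) for y regressed on all predictors at once."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]   # leading 1.0 is the constant
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1 - ss_residual / ss_total

# Hypothetical data: two predictors and one criterion variable
study_hours = [1, 2, 3, 4, 5, 6]
sleep_hours = [6, 8, 5, 7, 8, 6]
exam_score = [55, 63, 60, 70, 76, 74]

coefs, r_squared = fit([study_hours, sleep_hours], exam_score)
multiple_r = r_squared ** 0.5

print("b0, b1, b2 =", [round(c, 3) for c in coefs])
print(f"R = {multiple_r:.3f}, R squared = {r_squared:.3f}")
```

R² here is computed as 1 minus the ratio of residual to total sum of squares, so it is the share of variance in the criterion that the full set of predictors accounts for.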
Confounded variables tend to co-occur, making it difficult to look at their separate, independent effects. Hierarchical multiple regression allows you to statistically control for confounding variables (confounds).
Step 1: Enter the confound or control variable.
Step 2: Enter the predictor you are interested in to test its unique effects, over and above the control variable.
Nelson et al. (2013): Parents reported higher levels of life satisfaction; β = 0.22, p < .001
Bhargava et al. (2013): Need to control for confounding factors! Parents are more likely to be married! They reanalyzed the data from Nelson et al. (2013) by statistically controlling for marital status.
Step 1: Marital status (0,1) → Life satisfaction: β = 0.65, p < .001
Step 2: Parental status (0,1) → Life satisfaction: β = -0.05, p = .36
Result is not significant!
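The logic of controlling for a confound can be sketched with made-up data. The eight cases below are NOT the Nelson et al. (2013) data; they are constructed so that satisfaction depends only on marital status while parents are more likely to be married. The helpers `solve` and `coefficients` are illustrative assumptions.

```python
def solve(a, b):
    """Solve the linear system a * coefs = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def coefficients(predictors, y):
    """Least-squares [b0, b1, ...] for y regressed on the given predictors."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical cases: married (0/1), parent (0/1), life satisfaction
married = [0, 0, 0, 0, 1, 1, 1, 1]
parent = [0, 0, 0, 1, 0, 1, 1, 1]          # parents are mostly married
satisfaction = [1 + 2 * m for m in married]  # depends on marital status only

# Parental status on its own looks like a predictor...
parent_alone = coefficients([parent], satisfaction)
# Step 1: enter the control variable
step1 = coefficients([married], satisfaction)
# Step 2: add the predictor of interest
step2 = coefficients([married, parent], satisfaction)

print(f"parent alone:            slope = {parent_alone[1]:.2f}")
print(f"parent, married control: slope = {step2[2]:.2f}")
```

In this constructed example the apparent effect of parental status disappears once marital status is in the equation, which mirrors the pattern in the reanalysis summarized above.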
Does the effect of exercise intensity on sleep differ depending on time of day?
INTERPRETING INTERACTIONS
An interaction means that the effect of one variable depends on the value of the other variable. Non-parallel lines (that cross or converge) in a graph indicate an interaction.
[Figure: Sleep (hours), plotted separately for Morning and Night]
Example: Effects of practice and reward (high/low) on performance
Interaction = Lines are NOT parallel
Main effect = One condition is higher than the other, after averaging across the levels of the other factor
                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               4             4          4
HIGH              4             4          4
MARGINAL MEANS    4             4

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               4             6          5
HIGH              4             6          5
MARGINAL MEANS    4             6

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               2             2          2
HIGH              4             4          4
MARGINAL MEANS    3             3

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               2             4          3
HIGH              4             6          5
MARGINAL MEANS    3             5

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               2             4          3
HIGH              4             2          3
MARGINAL MEANS    3             3

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               6             2          4
HIGH              4             4          4
MARGINAL MEANS    5             3

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               2             4          3
HIGH              6             4          5
MARGINAL MEANS    4             4

                  PRACTICE
REWARD            NO PRACTICE   PRACTICE   MARGINAL MEANS
LOW               2             1          1.5
HIGH              4             7          5.5
MARGINAL MEANS    3             4
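Reading main effects and interactions off a 2 x 2 table of cell means is simple arithmetic, sketched below. The function name `describe_2x2` is an assumption, and the sketch treats the marginal means as simple averages of the cell means (equal cell sizes). It describes the pattern in the means only; it does not test significance.

```python
def describe_2x2(cells):
    """Summarize a 2 x 2 table of cell means.

    cells[0] = low reward row, cells[1] = high reward row;
    within each row: [no practice, practice].
    """
    (low_no, low_yes), (high_no, high_yes) = cells
    reward_means = [(low_no + low_yes) / 2, (high_no + high_yes) / 2]
    practice_means = [(low_no + high_no) / 2, (low_yes + high_yes) / 2]
    return {
        # Main effects: difference between marginal means
        "reward_effect": reward_means[1] - reward_means[0],
        "practice_effect": practice_means[1] - practice_means[0],
        # Interaction: effect of practice under high reward
        # minus effect of practice under low reward (0 = parallel lines)
        "interaction": (high_yes - high_no) - (low_yes - low_no),
    }

# Three of the tables shown above
parallel = describe_2x2([[2, 4], [4, 6]])
crossover = describe_2x2([[2, 4], [4, 2]])
both = describe_2x2([[2, 1], [4, 7]])

for name, result in [("parallel", parallel), ("crossover", crossover), ("both", both)]:
    print(name, result)
```

A nonzero interaction value corresponds to non-parallel lines in the graph; a nonzero main effect corresponds to a difference between the marginal means.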
First, center each predictor variable by subtracting its mean. Second, calculate an interaction term by multiplying the two centered predictor variables. Then conduct a hierarchical multiple regression:
Step 1: Enter the two centered predictor variables.
Step 2: Enter the interaction term.
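The centering and interaction-term steps can be sketched as follows. The data are generated from a made-up formula (not from any study) in which the effect of exercise intensity is stronger for morning exercise, and the helpers `solve` and `fit` are illustrative assumptions.

```python
def solve(a, b):
    """Solve the linear system a * coefs = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(predictors, y):
    """Return ([b0, b1, ...], R squared) for y regressed on the predictors."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1 - ss_residual / ss_total

def center(values):
    """Subtract the mean from every score."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical data: intensity (1-4), morning (1) vs. night (0)
intensity = [1, 2, 3, 4, 1, 2, 3, 4]
morning = [0, 0, 0, 0, 1, 1, 1, 1]
sleep = [4 + 0.5 * x + 1.0 * x * z for x, z in zip(intensity, morning)]

intensity_c = center(intensity)
morning_c = center(morning)
interaction = [a * b for a, b in zip(intensity_c, morning_c)]

step1_coefs, step1_r2 = fit([intensity_c, morning_c], sleep)
step2_coefs, step2_r2 = fit([intensity_c, morning_c, interaction], sleep)
r2_change = step2_r2 - step1_r2   # variance uniquely explained by the interaction

print(f"Step 1 R squared = {step1_r2:.3f}")
print(f"Step 2 R squared = {step2_r2:.3f} (change = {r2_change:.3f})")
print(f"interaction coefficient = {step2_coefs[3]:.2f}")
```

The increase in R² from Step 1 to Step 2 is the part a researcher would test for significance; that test is omitted here to keep the sketch short.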
There was a main effect of exercise intensity, such that more intense exercise predicted better sleep quality. This main effect was qualified by a significant interaction between exercise intensity and time of exercise such that exercise intensity was more strongly positively associated with sleep quality if the exercise occurred in the morning rather than at night.
[Figure: Exercise Intensity (Low, High) on the horizontal axis, with separate lines for Night and Morning]
An interaction can also be called moderation (one variable moderates the effect of the other).
Mediation means that the association between a predictor and outcome variable can be accounted for or explained by another variable.
Step 1: Show that the predictor predicts the outcome.
Step 2: Add the mediating variable and see if it helps explain the association.
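The two mediation steps can be illustrated with constructed data. In the sketch below the outcome is generated from the mediator alone, so the predictor's coefficient should drop to zero once the mediator is added. The variable names, the data, and the helpers `solve` and `coefficients` are hypothetical, and no significance test is included.

```python
def solve(a, b):
    """Solve the linear system a * coefs = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def coefficients(predictors, y):
    """Least-squares [b0, b1, ...] for y regressed on the given predictors."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical scores: the mediator tracks the predictor imperfectly,
# and the outcome is determined by the mediator only.
predictor = [1, 2, 3, 4, 5, 6]
mediator = [2, 1, 2, 5, 6, 5]
outcome = [20 - 2 * m for m in mediator]

# Step 1: the predictor on its own predicts the outcome
step1 = coefficients([predictor], outcome)
# Step 2: add the mediator and see what happens to the predictor
step2 = coefficients([predictor, mediator], outcome)

print(f"Step 1 predictor coefficient: {step1[1]:.3f}")
print(f"Step 2 predictor coefficient: {step2[1]:.3f}")
print(f"Step 2 mediator coefficient:  {step2[2]:.3f}")
```

Real data rarely show such a clean drop to zero; a predictor coefficient that shrinks but stays above zero would suggest the mediator explains only part of the association.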
MEDIATION EXAMPLE
[Path diagram: Feeling Lonely → Sleep Quality, with path coefficients -.31*** and -.05*]
In a cross-lagged panel design, the correlation between two variables, x and y, is calculated at two different points in time.
Correlate the scores on x at Time 1 with the scores on y at Time 2.
Correlate the scores on y at Time 1 with the scores on x at Time 2.
If x causes y, then the correlation between x at Time 1 and y at Time 2 should be larger than the correlation between y at Time 1 and x at Time 2.
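The comparison of the two cross-lagged correlations can be sketched in a few lines. The scores below are made up, and `pearson_r` is a plain implementation of the Pearson correlation written for this illustration.

```python
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation between two equally long lists of scores."""
    ma, mb = mean(a), mean(b)
    cross = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    ss_a = sum((ai - ma) ** 2 for ai in a)
    ss_b = sum((bi - mb) ** 2 for bi in b)
    return cross / (ss_a * ss_b) ** 0.5

# Hypothetical scores for six participants at two time points
x_time1 = [1, 2, 3, 4, 5, 6]
y_time1 = [3, 1, 4, 2, 6, 5]
x_time2 = [2, 3, 2, 4, 3, 5]
y_time2 = [2, 3, 3, 5, 6, 6]

r_x1_y2 = pearson_r(x_time1, y_time2)   # x at Time 1 with y at Time 2
r_y1_x2 = pearson_r(y_time1, x_time2)   # y at Time 1 with x at Time 2

print(f"r(x at T1, y at T2) = {r_x1_y2:.2f}")
print(f"r(y at T1, x at T2) = {r_y1_x2:.2f}")
if r_x1_y2 > r_y1_x2:
    print("Pattern is more consistent with x influencing y than the reverse.")
```

The sketch only compares the two coefficients descriptively; deciding whether the difference is reliable would need a significance test that is not shown here.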
In structural equation modeling, the researcher makes a prediction regarding how a set of variables is causally related. This prediction implies that the variables ought to be correlated in a particular pattern. This predicted pattern is then compared to the actual pattern of correlations.
A fit index indicates how well the hypothesized model fits the observed data. If the hypothesized model does not adequately fit the data, we can conclude that the model is not likely to be correct. By comparing fit indices for various models, the researcher can determine which model fits the data the best.
Multilevel modeling is used to analyze data sets with a nested structure. For example, students may be nested within classrooms, which are nested within schools. Because the responses of students within a classroom are not independent of one another, such data cannot be analyzed using traditional statistical techniques that require independence of observations.
Multilevel modeling separates the various influences that are operating at various levels of the nested data structure.
For example, it would allow us to examine the separate influences of students' personal capabilities, features of the classroom, and aspects of the school.
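The idea of separating influences by level can be illustrated with a much simpler calculation than a full multilevel model. The sketch below is not multilevel modeling itself: it only splits the total variability in made-up test scores into a between-classroom part and a within-classroom part, which is the kind of nesting that multilevel models are built to handle. The classroom labels and scores are hypothetical.

```python
from statistics import mean

# Hypothetical test scores for students nested within three classrooms
scores_by_classroom = {
    "A": [70, 72, 74],
    "B": [80, 82, 84],
    "C": [60, 62, 64],
}

all_scores = [s for scores in scores_by_classroom.values() for s in scores]
grand_mean = mean(all_scores)

# Between-classroom variability: how far each classroom mean is from the grand mean
ss_between = sum(
    len(scores) * (mean(scores) - grand_mean) ** 2
    for scores in scores_by_classroom.values()
)
# Within-classroom variability: how far each student is from their classroom mean
ss_within = sum(
    (s - mean(scores)) ** 2
    for scores in scores_by_classroom.values()
    for s in scores
)
ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
share_between = ss_between / ss_total

print(f"between = {ss_between}, within = {ss_within}, total = {ss_total}")
print(f"share of variability between classrooms = {share_between:.3f}")
```

When most of the variability lies between classrooms, as in this constructed example, students in the same classroom resemble one another, and treating them as independent observations would be misleading.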
In an initial session, general trait loneliness was measured. Then daily loneliness was measured every day for 2 weeks.
Today, to what extent did you feel...
in tune with the people around you?
isolated from others?
there are people who care about you?
your social relationships were superficial?
FACTOR ANALYSIS
Factor analysis is a class of statistical techniques that are used to analyze the interrelationships among a large number of variables. The presence of correlations among several variables suggests that the variables may all be related to some underlying factors. Factor analysis does not tell you what each factor means.
In this matrix, Variables A, B, and C correlate highly with each other, whereas Variables D and E correlate highly with each other. This pattern suggests that there may be two factors underlying this pattern of correlations.
A B C D E
A 1.00
B .78 1.00
Factor analysis is used to identify the minimum number of factors needed to account for the relationships among the variables. The factor matrix is used to interpret the nature of the underlying factors. Factor loadings are correlations between the variables and the factors. Variables that correlate highly with a factor are said to load on that factor.
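As a rough illustration of loadings, the sketch below extracts a single factor from a small correlation matrix. It uses principal-components extraction via power iteration, which is one of several extraction methods and is not necessarily what produced the results in this lecture. Only the .78 correlation comes from the matrix shown earlier; the other two correlations are made up, and `first_component` is an illustrative helper.

```python
def first_component(corr, iterations=200):
    """Largest eigenvalue of a correlation matrix and the matching loadings."""
    n = len(corr)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    # Loading = correlation between each variable and the component
    loadings = [eigenvalue ** 0.5 * x for x in v]
    return eigenvalue, loadings

# Correlations among Variables A, B, and C (A-B = .78; the rest are hypothetical)
corr = [
    [1.00, 0.78, 0.70],
    [0.78, 1.00, 0.74],
    [0.70, 0.74, 1.00],
]

eigenvalue, loadings = first_component(corr)
variance_explained = eigenvalue / len(corr)   # share of total variance

for name, loading in zip("ABC", loadings):
    print(f"Variable {name} loads {loading:.3f}")
print(f"Proportion of variance explained = {variance_explained:.2f}")
```

Because all three variables correlate highly with one another, each one loads strongly on the single component, which is what "loading on a factor" means in the text above.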
1. To study the underlying structure of psychological constructs
2. To reduce a large number of variables to a smaller, more manageable set of data
3. To confirm that self-report measures of attitude and personality are unidimensional (measure only one thing)
EXAMPLE: AFFECT
happy, joyful, cheerful, delighted, enthusiastic
upset, afraid, irritable, ashamed, distressed
Two factors:
Factor 1 explains 39% of the variance
Factor 2 explains 27% of the variance
These two factors explain 65% of the total variance in the items
Use the following 1-7 scale to indicate your agreement with each item.
1 = Strongly Disagree
2 = Disagree
3 = Slightly Disagree
4 = Neither Agree nor Disagree
5 = Slightly Agree
6 = Agree
7 = Strongly Agree
______1. In most ways my life is close to my ideal.
______2. The conditions of my life are excellent.
______3. I am satisfied with life.
______4. So far I have gotten the important things I want in life.
______5. If I could live my life over, I would change almost nothing.
That seems pretty unidimensional! Now, we can just calculate an average for those 5 items.
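Averaging the five items into a single score is straightforward. The sketch below assumes ratings on the 1-7 scale and uses made-up responses; `scale_score` is an illustrative helper name.

```python
from statistics import mean

def scale_score(ratings, low=1, high=7):
    """Average a respondent's item ratings after checking they are on the scale."""
    for rating in ratings:
        if not low <= rating <= high:
            raise ValueError(f"rating {rating} is outside the {low}-{high} scale")
    return mean(ratings)

# Hypothetical responses to the five items
responses = [6, 5, 6, 5, 3]
life_satisfaction = scale_score(responses)

print(f"Life satisfaction score = {life_satisfaction}")
```

Averaging only makes sense because the items appear to measure one thing; with a multidimensional scale, each dimension would need its own score.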
[Correlation matrix with entries .495**, -.230*, and -.101]
ITEM TEXT                            FUN / BOND   EMO SUPT   DESTRESS
To have fun with your partner (P)    .870
To have a good time with your P      .826
To see your P smile                  .832
Fun4                                 .825
Fun5                                 .802
EmoS1                                             .697
EmoS2                                             .684
EmoS3                                             .675
EmoS4                                             .657
EmoS5                                             .644
Strss1                                                       .783
Strss2                                                       .780
Strss3                                                       .770
Strss4                                                       .764
Strss5                                                       .737
ITEM TEXT                                 SHOW ANGER   HURT   AVOID ISSUE
To let your P know how annoyed you are    .797
To let your P know you're angry           .825
To make it clear you're irritated         .819
Ang4                                      .792
Ang5                                      .737
Hurt1                                                  .820
Hurt2                                                  .841
Hurt3                                                  .854
Hurt4                                                  .858
Hurt5                                                  .840
Avoid1                                                        .862
Avoid2                                                        .846
Avoid3                                                        .834
Avoid4                                                        .832
Avoid5                                                        .813
QUESTIONS?