
Assignment 2: Correlation & Regression

Due: Wednesday 10/5 @ 11:59PM


100 Points Total
DIRECTIONS: Answer all parts of the following questions. Please type all answers in
blue font. It will be easier on my eyes and on yours. Also, YOU MUST save
your file as: Assignment2LastnameFirstname.doc. Files will be uploaded as attachments to
Blackboard.
Example: Assignment2MitchellJoni.doc
PART 1: Conceptual Questions & Calculations By Hand (55pts)
1. Correlations are incredibly useful and common across majors and fields. I'd like you to find a
few examples. For your current or planned field (major), please list three different pairs of
variables that are likely to follow the relationships listed below. Please try to come up with
original examples that we have not discussed in class. You can use outside resources (internet,
textbooks) to generate your lists.
A. (6pts) Positive Linear Correlation:
Example: SAT scores and College GPA (this counts as 1 pair of variables that
follows this type of relationship; you need to list 3 pairs of variables for each
type of relationship)
Pair 1:
Pair 2:
Pair 3:

B. (6pts) Negative Linear Correlation:


Pair 1:
Pair 2:
Pair 3:

C. (6pts) No Correlation (variables that some might think are related but turn out not to
be):

Pair 1:
Pair 2:
Pair 3:

D. (6pts) Curvilinear Correlation:

Pair 1:
Pair 2:
Pair 3:

2. (10pts) Using the fake data provided below, calculate the correlation between people's
Depression Scores (higher numbers = more severe depression) and the number of hours per week
they spend engaging in social interactions with others. Please do all calculations by hand, and
show all your work. You can round all of your statistics (mean, SD, z-scores, r) to 2 decimal
places.
Participant    Depression Score    Hours Spent with Others
1              45                  12
2              51                  13
3              31                  20
4              60                  7
5              40                  14
6              54                  13
7              36                  22
8              35                  19

*Hint*: Start by calculating the mean and standard deviation for each variable.
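As a rough illustration of the z-score approach the hint points to, the sketch below computes r for a small made-up data set. The numbers, variable names, and the function name `pearson_r_from_z` are illustrative assumptions, not the assignment data. It assumes the sample standard deviation (n - 1 denominator) and r = sum(zx * zy) / (n - 1); if your class uses a different convention, follow that instead.

```python
from statistics import mean, stdev

def pearson_r_from_z(xs, ys):
    """Correlation as the average product of z-scores (n - 1 denominator)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)          # sample standard deviations
    zx = [(x - mx) / sx for x in xs]       # z-scores for X
    zy = [(y - my) / sy for y in ys]       # z-scores for Y
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]

r = pearson_r_from_z(x, y)
print(round(r, 2))
```

The hand calculation follows the same steps: means, standard deviations, z-scores for each participant, the product of each pair of z-scores, and then the sum of those products divided by n - 1.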

3. (5pts) We all know that correlation is not evidence for causation. Explain the three possible
causal relationships that could exist between Depression and Hours Spent with Others.

4. (10pts) Using the same data from question 2 above, use linear regression to determine what
depression score we would predict someone to have if we knew they spent 17 hours per week
interacting with others. Show your work.
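One way to sketch a regression prediction is shown below, again with made-up numbers rather than the assignment data. The function name `fit_line` and the data are assumptions for illustration. It uses the least-squares slope b1 = Sxy / Sxx (equivalent to r * sy / sx) and the intercept b0 = mean(y) - b1 * mean(x).

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept (b0) and slope (b1) for predicting Y from X."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx           # slope: change in predicted Y per 1-unit change in X
    b0 = my - b1 * mx        # intercept: predicted Y when X = 0
    return b0, b1

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]

b0, b1 = fit_line(x, y)
predicted = b0 + b1 * 2.5    # predicted Y for a hypothetical X of 2.5
print(round(b0, 2), round(b1, 2), round(predicted, 2))
```

The prediction step is just plugging the known X into Y' = b0 + b1 * X once the slope and intercept are in hand.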

5. (6pts) In a few sentences, please verbally explain what b0 (the intercept) and b1 (the slope)
mean in this example. In other words, how would you explain what their value means or
represents, if you had to explain it to a friend?

PART 2: Minitab (31pts)

6. Use the Minitab data set BrainSize_IQ to answer the questions below. The data set contains
real data from a study conducted by Willerman, Schultz, Rutledge, and Bigler (1991) called "In
Vivo Brain Size and Intelligence." The study attempted to determine if there is any relationship
between one's physical features, including the size of their brain, and their IQ scores. They
collected data from 40 psychology students on 4 variables: 1) participant IQ score using the
WAIS test, 2) participant body weight in pounds, 3) participant height in inches, and 4)
participant brain size based on the number of pixels used in an fMRI image of the brain. They
also looked at Family Income.

A. (2pts) Copy and paste a scatterplot with the best fitting line to graph the relationship
between IQ and Brain Size and paste it below. Please put Brain Size on the x-axis and IQ
on the y-axis.

B. (4pts) What is the value of r for the relationship between IQ and Brain Size? Is this
positive or negative? Is this a strong, moderate, or weak relationship? In a sentence or
two, explain what this r value tells us about the relationship between people's IQ and
their Brain Size.

C. (4pts) What is the value of r for the relationship between participants' Heights and
participants' Weights? Is this positive or negative? Is this a strong, moderate, or weak
relationship? In a sentence or two, explain what this r value tells us about the
relationship between people's Height and their Weight.

D. (4pts) What is the correlation between participants' Family Income and participants' IQ?
Is this positive or negative? Strong, moderate or weak? In a sentence or two, explain what
this r value tells us about the relationship between Family Income and IQ. Also, please
copy and paste the scatterplot with the best fit line for Income and IQ.

E. (5pts) Using Regression, what would we predict someone's IQ score to be if we knew
that an fMRI image of their Brain Size required 960,000 pixels?

F. (4pts) For the prediction you calculated in question E, what is the standard error of the
estimate? In a sentence or two, explain what this statistic tells us.
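For reference, the standard error of the estimate can be sketched as the typical size of the residuals around the regression line. The example below uses made-up numbers, not the BrainSize_IQ data, and the function name is an assumption. It assumes the common definition sqrt(SSE / (n - 2)); some introductory texts divide by n instead, so check which form your course uses. Minitab reports this value as S in its regression output.

```python
from math import sqrt
from statistics import mean

def standard_error_of_estimate(xs, ys):
    """Typical distance between observed Y values and the regression line."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx
    # Sum of squared residuals (observed minus predicted)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (n - 2))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]

see = standard_error_of_estimate(x, y)
print(round(see, 2))
```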

G. (4pts) What is the value of r² for the relationship between IQ and Brain Size? In a
sentence or two, explain what this statistic tells us.

H. (4pts) What is the value of r² for the relationship between Height and Weight? In a
sentence or two, explain what this statistic tells us.
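As a small sketch of what r-squared measures, the code below computes it as the proportion of variability in Y accounted for by the regression line, 1 - SSE / SST. The data and the function name `r_squared` are made up for illustration and are not taken from the BrainSize_IQ data set. For simple linear regression this equals the square of the correlation coefficient r.

```python
from statistics import mean

def r_squared(xs, ys):
    """Proportion of variance in Y explained by the least-squares line on X."""
    mx, my = mean(xs), mean(ys)
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - my) ** 2 for y in ys)                         # total
    return 1 - sse / sst

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]

r2 = r_squared(x, y)
print(round(r2, 3))
```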

PART 3: Multiple Choice & True/False (14pts)


7. (2pts) True or False: If two variables are correlated, it means there can never be a causal
relationship between them.

8. (2pts) True or False: The sign of a correlation (+ or -) has nothing to do with the strength
of the relationship.

9. (2pts) If we use regression to predict the value of Y given a certain known X, then this
statistic tells us, on average, how far off our prediction will be.
a. Standard Deviation
b. Standard error of the estimate
c. R-squared (r²)
d. Correlation coefficient
Answer: _____

Questions 10 through 13 correspond to the 3 Scatterplots below (Scatterplots A, B and C):


8
12

10

12

10

6
5

4
6

0
40

50

60

70

80

DOPA MINE

90

100

V AR00004

SCHIZO

SCHIZO

3
4

0
50

60

70

80

100

0
2

10

VAR00005

DOPAMINE

90

10. (2pts) Which of the scatterplots shows the weakest relationship between two variables?
11. (2pts) If we call our two variables X and Y, in which of the scatterplots will a high z-score
on X correspond to a low z-score on Y?

12. (2pts) In Scatterplot A, if we were to do a regression calculation using these two
variables, and if we knew that Zx = 1.3, which would be more likely:
a. That Zy is fairly close to 1.3
b. That Zy is fairly close to -1.3
13. (2pts) In which of the scatterplots will the Standard Error of the Estimate be lowest? In
other words, if we were to do regression, in which scatterplot would there be the least amount
of error in our prediction?
