# Assignment 3

## Introduction to Research Analysis (SOCI 3040)

## Overview
This assignment will focus on basic hypothesis testing using both hand calculations and R.
The breakdown of points is as follows:

| Topic | Number of Points |
|---|---|
| Hypothesis Testing and Inference | 50 |
| Hypothesis Testing in R | 50 |
| Total | 100 |

This assignment is due November 1st, 2018 at the start of class. Since some of the
questions require hand calculation, you do not need to type all of your answers using word
processing software. For questions requiring hand calculations, you may submit your
handwritten results (write neatly, please). Be sure to show all of your work, since partial
credit is possible. For the R component of this assignment, please copy and paste the results
directly into word processing software (e.g., Microsoft Word or LibreOffice Writer). This
assignment is worth 100 points, or 10% of your final grade.

## Basic Hypothesis Tests

Answer the questions below using the appropriate statistical technique. For questions
involving the use of hypothesis testing, you must:

2. Provide the Z(critical), T(critical), or χ²(critical) score corresponding to the α threshold for your test

4. Provide your decision about statistical significance

You must also substantively interpret the results of your test (but keep it short). That is,
don’t just focus on whether or not a test is statistically significant, but also briefly comment
on what the result actually means. You may find it helpful to proceed using the five-step
model, but this is optional. Be sure to select a test that is appropriate given the
question and decide whether the question calls for a one-tailed or two-tailed test.
As well, do not round until the final step of your calculations.
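
The choice between a one-tailed and a two-tailed test changes the critical score, because α is either placed entirely in one tail or split across both. The sketch below is only an illustration of that idea and a way to check a table lookup; the α level and the degrees of freedom are arbitrary values chosen for the example, not values taken from any question.

```r
# Illustrative alpha level (an assumption for this example only).
alpha <- 0.05

# Two-tailed Z test: alpha is split evenly between the two tails.
z_two_tailed <- qnorm(1 - alpha / 2)

# One-tailed Z test: all of alpha sits in a single tail.
z_one_tailed <- qnorm(1 - alpha)

# Same logic for a t distribution; df = 20 is an arbitrary illustration.
t_two_tailed <- qt(1 - alpha / 2, df = 20)

# Chi-square tests place the critical region in the upper tail only;
# df = 2 is again an arbitrary illustration.
chi_critical <- qchisq(1 - alpha, df = 2)

# Show the four critical scores rounded to three decimals.
round(c(z_two_tailed, z_one_tailed, t_two_tailed, chi_critical), 3)
```

For a one-tailed test in the lower tail, the critical score is the negative of the one-tailed value shown here.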

1. A random sample of 350 persons yields a sample mean of 105 and a sample standard
deviation of 10. Construct three different confidence intervals to estimate the popu-
lation mean, using 95%, 99%, and 99.9% levels of confidence. What happens to the
interval width as the confidence level increases? Why? (10 points)

2. An advantage that often comes with a basic knowledge of statistics is a change in
salary. To see whether this was the case for Tulane University graduates, you took a
random sample of 57 students who completed a statistics class and asked about their
starting salaries (in thousands) after graduation. The sample had a mean of 53.3 with
a standard deviation of 3.72 (i.e., x̄ = 53.3 and s = 3.72). A call to the Office of
the Registrar indicates that the average starting salary value for all Tulane students is
47.1. Do students who take statistics courses earn an equal salary compared to Tulane
students generally? Use α = 0.001. (10 points)

3. An insurance company wants to know whether perceptions of personal health are
independent of gender. The company has hired you as a statistical consultant. You then
collected a random sample of people and asked about their perceived health status.
Based on the results of this survey, is the sex of the respondent independent of their
perceived health status? Use 95% confidence. (10 points)

| Perceived Health Status | Women | Men |
|---|---|---|
| Very Good | 89 | 131 |
| Good | 82 | 65 |
| Not so good | 42 | 30 |

4. At St. Algebra College, the sociology and economics departments have been feuding
for years about the respective quality of their programs. In an attempt to resolve
the dispute, you have gathered data about the graduate school experience for random
samples of both groups of majors. One indicator of program quality that is frequently
cited by departments is the proportion of students who completed their degree. The
results from the random samples are presented below. Is there a significant
difference in quality between the two programs? Use α = 0.05. (10 points)

| | Sociology | Economics |
|---|---|---|
| Proportion Completing | p1 = 0.77 | p2 = 0.72 |
| Sample Size | N1 = 125 | N2 = 125 |

5. Are married women more likely than divorced women to maintain contact with their
families? The average number of visits per year with close family members and the
standard deviation for two samples are listed below. You may assume equal variances
in the population. Use a 0.01 α-value. (10 points)

| | Married Women | Divorced Women |
|---|---|---|
| Mean | x̄1 = 8.3 | x̄2 = 7.2 |
| Standard Deviation | s1 = 0.50 | s2 = 0.31 |
| Sample Size | N1 = 13 | N2 = 10 |
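
As a notation reminder, the block below sketches the standard textbook forms of a confidence interval for a mean and of the common one-sample, two-sample, and chi-square test statistics. It is a generic summary and is not taken from the course materials. In particular, some introductory texts divide by N − 1 rather than N when the sample standard deviation is used to estimate the standard error, so follow whichever convention was presented in class. The symbols f_o, f_e, r, c, p_u, and s_p are introduced here for illustration.

$$
\begin{aligned}
\text{Confidence interval for a mean:}\quad & \bar{x} \pm Z_{\alpha/2}\,\frac{s}{\sqrt{N}} \\[4pt]
\text{One-sample test of a mean:}\quad & t = \frac{\bar{x} - \mu}{s/\sqrt{N}}, \qquad df = N - 1 \\[4pt]
\text{Chi-square test of independence:}\quad & \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}, \qquad df = (r - 1)(c - 1) \\[4pt]
\text{Two-sample test of proportions:}\quad & Z = \frac{p_1 - p_2}{\sqrt{p_u(1 - p_u)\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}}, \qquad p_u = \frac{N_1 p_1 + N_2 p_2}{N_1 + N_2} \\[4pt]
\text{Two-sample test of means (equal variances):}\quad & t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{N_1} + \frac{1}{N_2}}}, \qquad s_p^2 = \frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}
\end{aligned}
$$

Here f_o and f_e are the observed and expected cell frequencies, r and c are the numbers of rows and columns in the table, p_u is the pooled proportion, and s_p is the pooled standard deviation, with df = N1 + N2 − 2 for the pooled test of means.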

## Working with R
Use your personalized GSS dataset to answer the questions below. All analysis must be
completed using your personalized data set that is available on Canvas. Failure to
use your personalized data set will result in a zero for the question. As above, you
will need to read and interpret the question to determine which statistical test is appropriate,
as well as whether it is a one-tailed or two-tailed test. Similar to the questions above, you
must:
1. State the null and research hypotheses
2. Provide the Z(critical), T(critical), or χ²(critical) score corresponding to the α threshold for your test
3. Embed the relevant R output directly into your document (and only the relevant
output)
4. Explicitly discuss the test statistic reported in your R output. Simply stating "see table"
or something equivalent is not sufficient. Provide the specific number for each question.
5. Provide your decision about statistical significance
6. Avoid hand calculations; let R do the work
You should also substantively interpret the results of your test (again, keep it short). You
must also submit a complete and fully functional syntax file that will replicate
your results as an appendix to your assignment. You can do this by copying and
pasting directly into your word processor.
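
Since the write-up has to cite the specific test statistic rather than point at a table, it can help to know where that number lives in R's output. The sketch below is one possible approach, using simulated stand-in data and a one-sample t-test purely as an illustration; the variable `scores`, the hypothesized mean of 100, and the seed are all invented for this example and have nothing to do with the GSS.

```r
# Simulated stand-in data (an assumption for illustration only);
# in the assignment this would be a variable from your own data set.
set.seed(3040)
scores <- rnorm(200, mean = 102, sd = 15)

# One-sample t-test against a hypothesized mean of 100 (illustrative value).
result <- t.test(scores, mu = 100)

# Printing the object produces the output you would paste into the write-up.
print(result)

# The individual pieces can also be pulled out by name to cite exact numbers.
t_obtained <- unname(result$statistic)  # the obtained t score
deg_free   <- unname(result$parameter)  # degrees of freedom
p_value    <- result$p.value            # two-tailed p-value by default
```

Objects returned by `chisq.test()` and `prop.test()` store their results under the same `statistic`, `parameter`, and `p.value` names.
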
1. Are there statistically significant differences in family income between women and men
in the GSS? Use the variables income16 and sex to answer this question. You can
assume that income is measured at the interval-ratio level. Use a 95% confidence level.
(10 points)
2. The population mean for a full-time worker is 40 hours per week. On average, do
the respondents in the GSS work either significantly more or significantly less than 40
hours a week? Use an α value of 0.01 and the variable hrs1 to answer this question.
(10 points)
3. Is educational attainment independent of gender in the GSS? To answer this question,
first recode the variable educ into a new variable with categories for
respondents who have completed the 12th grade or less, and respondents who have at
least some college education. You will also need to use the variable sex. Can
you conclude that the two variables are independent at 99.9% confidence? (10 points)

4. During the 2012 presidential election, Barack Obama received 51.1% of the popular
vote. Looking at the GSS, we see that the sample appears to have higher levels of
support for Obama. Does the GSS have a significantly different proportion of Obama
supporters, or is this normal sampling variation that we might expect? Use a 95%
level of confidence. To answer this question, you will need to recode the variable
pres12 into a variable where values of 1 denote votes for Obama, and values of 0
denote votes for Romney, with all other missing/DK/etc. values coded to missing. (10
points)

5. Using the variables contained in the GSS, begin by briefly stating a research question,
then determine the appropriate statistical test to answer your question, implement
the test in R, and interpret the results. You may use one-sample tests of means or
proportions, two-sample tests of means, or a Chi-square test. You may also select
whatever alpha-level seems appropriate, though it must be consistent with minimal
standards in the social sciences (i.e., 95% confidence or better). (10 points)
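
Two of the questions above call for recoding a variable before testing. The sketch below shows one way such recodes might be written and checked in base R. The data frame `gss` here is a tiny invented stand-in, and the numeric codes for pres12 (1 for Obama, 2 for Romney) are an assumption that should be confirmed against the codebook for your personalized data set. If your variables are stored as factors or labelled values rather than plain numbers, the comparisons will need adjusting.

```r
# Hypothetical stand-in for a GSS extract; values are invented for illustration.
gss <- data.frame(
  educ   = c(8, 12, 12, 14, 16, 11, 18, 12, 13, 20),
  sex    = c(1, 2, 2, 1, 2, 1, 1, 2, 2, 1),
  pres12 = c(1, 2, 1, 3, 1, NA, 2, 1, 4, 2)
)

# Collapse years of schooling into two categories.
# Missing values of educ stay missing because ifelse() propagates NA.
gss$educ2 <- ifelse(gss$educ <= 12,
                    "12th grade or less",
                    "Some college or more")

# Recode vote choice: code 1 becomes 1, code 2 becomes 0, all else is missing.
# ASSUMPTION: 1 = Obama and 2 = Romney; verify this in your codebook.
gss$obama <- ifelse(gss$pres12 == 1, 1,
                    ifelse(gss$pres12 == 2, 0, NA))

# Cross-tabulate each new variable against its source to verify the recode
# before running any test.
table(gss$educ, gss$educ2, useNA = "ifany")
table(gss$pres12, gss$obama, useNA = "ifany")
```

Checking the cross-tabulations first makes it easy to spot a category that landed in the wrong group or a missing-value code that was accidentally kept.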