Assignment 2

Due Date: 21 December, 2015
Weighting: 20%

Answering the questions in this assignment should not be your first attempt at these types of
questions. It is essential that you work through practice exercises from the tutorial sheets
and/or text book first.

This assignment is important in providing feedback and helping to establish competency in
essential skills.

Answer all the questions. The questions are not of equal weight, and some questions are
worth much more than others. Part marks for each question have been given.

The questions relate to material in Modules 1 to 6.

Before starting this assignment read Notes Concerning Assignments under the Introductory
Material link on the StudyDesk.

When you are asked to comment on a finding, usually a short paragraph is all that is
required.

Do not copy/paste SPSS output into your assignment unless specifically asked to do so. In
many cases the SPSS output contains much more information than is required for a correct
and complete answer. In those cases just reproducing the output may not attract any marks.
Make sure you report only the information from the SPSS output relevant to your answer.

In order to obtain full marks for any question you must show all working.

This assessment item consists of 6 questions.

Question 1

(14 marks)

Suppose the weekly grocery bill by Springfield households is known to follow a normal model with a
mean of $248 and a standard deviation of $16.
(a) (4 marks) What proportion of households spend $283 or more on the weekly grocery bill?
(b) (4 marks) The bottom 15% of households spend what amount on the weekly grocery bill?
(c) (6 marks) What is the interquartile range for the weekly grocery bill by Springfield
households?
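
The three parts of Question 1 all reduce to evaluating the stated normal model, either its cumulative distribution or its inverse. The following is a minimal sketch of those calculations, assuming Python's standard-library `statistics.NormalDist` is an acceptable stand-in for normal tables; the variable names are illustrative only, and written working is still required for full marks.

```python
from statistics import NormalDist

# Normal model stated in Question 1: mean $248, standard deviation $16.
grocery = NormalDist(mu=248, sigma=16)

# (a) Upper-tail proportion: P(X >= 283) = 1 - P(X <= 283).
prop_283_or_more = 1 - grocery.cdf(283)

# (b) Amount below which the bottom 15% of households fall (15th percentile).
bottom_15_cutoff = grocery.inv_cdf(0.15)

# (c) Interquartile range: 75th percentile minus 25th percentile.
iqr = grocery.inv_cdf(0.75) - grocery.inv_cdf(0.25)

print(f"P(X >= 283)     = {prop_283_or_more:.4f}")
print(f"15th percentile = ${bottom_15_cutoff:.2f}")
print(f"IQR             = ${iqr:.2f}")
```

The same quantities can be checked by hand by standardising with z = (x - 248) / 16 and reading the standard normal table.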

Question 2

(14 marks)

This question uses information from the data file body.sav found under the Assignments and
Datasets link on the StudyDesk (also see body.txt for more details about the study and the variables
measured).
Make sure the variable view in SPSS is set up correctly with all labels correctly defined (with units),
all values assigned correctly for categorical variables and the correct measure selected for all
variables.
This question will examine the relationship between age categories and gender to see if there is a
relationship between the two variables.
(a) (2 marks) Using SPSS produce a contingency table to display the relationship between age
category and gender for the participants involved in the body study.
Include an appropriate title.
(b) (2 marks) What proportion of participants are female and are in the 21-30 years age group?
(c) (2 marks) What proportion of males are in the 21-30 years age group?
(d) (8 marks) Do the gender and age of the participants seem to be associated? Explain in fewer
than 50 words and include:
a conditional distribution table, conditional on gender;
percentages from this table to support your explanation.

Question 3

(18 marks)

Consider the data in the file body.sav again. Use SPSS to find the answers to the following questions,
but do not copy and paste SPSS output into your answer for parts (c) and (d) (make sure you always
include units where appropriate).
(a) (4 marks) Display the distribution of the ages (NOT age category) of the participants in the
body study using an appropriate graph. Label the axes correctly, include units of measure
and provide an appropriate title. Include your name in the title of your graph.

(b) (4 marks) Using the graph in (a) only (do not refer to SPSS summary statistics), describe in no
more than 60 words, the distribution of the ages of the participants in the study. Include
comments on shape, centre and spread of the distribution and the existence of outliers, if
any.
Do not include information from any calculations, use the graph only.
(c) (4 marks) Find the following statistics for the distribution of the age of participants in the
study (be sure to include units in your answers):
mean
standard deviation
median
IQR
(Do not copy/paste SPSS output).
(d) (2 marks) For the distribution of the age of participants in part (a), which statistics are
appropriate to measure the centre and spread? Give a reasonable explanation for your
choice.
(e) (4 marks) Produce a graph to compare the distribution of age for male and female
participants. Comment on any similarities or differences between the genders that are
displayed by this graph.

Question 4

(18 marks)

Facebook Inc. has 1.39 billion active monthly users (those people who use the social network at least
once a month). About 64% of these users log into the site on a daily basis (that is around 890 million daily
users). Amongst those who use Facebook daily, 85% check the site using a mobile device.
(a) (4 marks) State the variable of interest and suggest an appropriate model to estimate the
probabilities of the possible number of daily Facebook users who check the site using a
mobile device.
Explain why this model is appropriate by discussing the important properties of the
model (also assuming information from part (b) of this question).
Define the population for this situation.
(b) (2 marks) Using the model from part (a), estimate the probability that, in a sample of 20
randomly selected daily Facebook users, at least 18 will check the site using a mobile device.
(c) (2 marks) Using the model from part (a), estimate the probability that, in a sample of 20
randomly selected daily Facebook users, fewer than 13 will check the site using a mobile
device.
(d) (4 marks) A random sample of 300 daily Facebook users is selected. Estimate the mean and
standard deviation of the number of users who will check Facebook using a mobile device.
(e) (6 marks) What is the probability that fewer than 250 daily Facebook users in the random
sample of 300 will check the site using a mobile device?
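
The calculations behind parts (b) to (e) can be sketched as follows, assuming the model chosen in part (a) is a binomial with success probability 0.85, and that part (e) is handled with a normal approximation. Both are assumptions about the intended approach rather than something the question states, and the helper `binom_pmf` is a hypothetical name introduced only for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

P_MOBILE = 0.85  # proportion of daily users who check the site on a mobile device


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# (b) Sample of 20: P(X >= 18) = P(18) + P(19) + P(20).
p_at_least_18 = sum(binom_pmf(k, 20, P_MOBILE) for k in range(18, 21))

# (c) Sample of 20: P(X < 13) = P(X <= 12).
p_fewer_than_13 = sum(binom_pmf(k, 20, P_MOBILE) for k in range(0, 13))

# (d) Sample of 300: binomial mean n*p and standard deviation sqrt(n*p*(1-p)).
n = 300
mean_users = n * P_MOBILE
sd_users = sqrt(n * P_MOBILE * (1 - P_MOBILE))

# (e) Normal approximation to the binomial, shown without and with a
# continuity correction; which one is expected is not stated in the question.
approx = NormalDist(mu=mean_users, sigma=sd_users)
p_fewer_250_plain = approx.cdf(250)
p_fewer_250_corrected = approx.cdf(249.5)

print(f"(b) P(X >= 18) = {p_at_least_18:.4f}")
print(f"(c) P(X < 13)  = {p_fewer_than_13:.4f}")
print(f"(d) mean = {mean_users:.1f}, sd = {sd_users:.3f}")
print(f"(e) P(X < 250) = {p_fewer_250_plain:.4f} (no correction), "
      f"{p_fewer_250_corrected:.4f} (continuity correction)")
```

The normal approximation is reasonable here because both n*p = 255 and n*(1-p) = 45 are well above 10.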

Question 5

(12 marks)

The following extracts were obtained from newspaper articles (The Sunday Mail, Sunday 22nd July,
2007 & Sunday 15th March 2008) in relation to the trial of the drug adrenalin. The articles include the
following statements:
Heart-attack patients will be used as guinea pigs in a controversial medical trial proposed by
the Queensland Ambulance Service. Paramedics attending to cardiac arrest cases will inject
either a life-saving drug, adrenalin, or a placebo into the patient. Neither paramedic nor
patient will know, only the trial operators.
Adrenalin is used to make the heart beat if it has stopped. A placebo, such as a saline
solution, will produce no response in a patient suffering a heart attack. Medical experts said
the idea of the trial was to evaluate the value of adrenalin in a cardiac arrest and potential
side-effects, and was vital to achieving advances in medicine.

The trial, overseen by an Australian University, required 4500 subjects, with about 1500 to
come from Queensland over 18 months.
(a) (3 marks) Is this study observational or experimental? Justify your answer in fewer than 50
words.
(b) (3 marks) Does there appear to be control, replication and randomisation? Explain each of
these in the context of the above statements.
(c) (3 marks) What is the explanatory variable and response variable?
(d) (3 marks) What do you think the objective of the above study was?

Question 6

(24 marks)

Consider the data in the file body.sav again. Researchers want to know if, for female participants,
there is a relationship between hip girth and waist girth.
(a) (2 marks) What are the two variables you will need to include in your analysis? Which
subgroup is being examined in this question?
(b) (4 marks) Select the participants who are female. Use an appropriate graph to display the
relationship between the two variables identified in part (a) for these participants. Label the
axes correctly, include units of measure and provide an appropriate title. Include your name
in the title of your graph.
(c) (4 marks) From the graph in part (b), describe (in no more than 30 words) the form, direction
and scatter of this relationship, and identify any outliers.

(d) (4 marks) Calculate an appropriate statistic to measure only the strength of the relationship
between the two variables for these participants.
Interpret this statistic.
(e) (4 marks) Use SPSS to find the equation of the regression line which could be used to predict
the waist girth from the participant's hip girth for females only.
State the equation and include this line on the graph produced in part (b).
(f) (4 marks) Using the information found in part (e), interpret the following:
The intercept of the regression line
The slope of the regression line
(g) (2 marks) Using the regression equation from part (e), predict the waist girth of a female who
has a hip girth of 70 cm.
Would you expect this to be an accurate prediction? Why?