
STATS1900 Business Statistics

Major Assignment
Semester 2, 2013
Date Due: Please refer to course description
Total Marks: 40 marks Worth: 20% of final assessment
This assignment requires a substantial amount of computer work and written
comment. You may need to seek guidance from your tutor along the way. Do not
leave things until too late!!
The questions give a careful statement of what is required and information about
the presentation of your answers. Please follow these carefully! Marks may be
deducted for poor presentation.
In this assignment you will examine data used by a Real Estate investment advisor.
She wants you to answer some specific questions put by clients about house prices
in the neighbourhood encompassed by 4 suburbs around the city of Melbourne. The
data is contained in the file Real_Estate.xls and contains the following columns
(variables):
Variable Name   Description
ID              House Identity number
Price           Selling Price of the house (in 000's)
Bedrooms        Number of bedrooms
Size            House Size (m²)
Pool            0 = House without a Pool, 1 = House with a Pool
Distance        Distance from city centre (km)
Suburb          Suburb number
Garage          0 = House without a Garage, 1 = House with a Garage
1. Random Sample: Before you begin your analysis you are required to
take a random sample of size 110 from the 170 cases in the file. Use the
file Random_Sample_Generator-13-2.xls to do this. Your tutor will show
you how this can be done in EXCEL. Answers to the questions below are
to be based on your sample of 110 cases. Make sure to keep a safe copy
of the sample you use since you cannot use
Random_Sample_Generator-13-2.xls to reproduce it. Provide a printout of
the data in your sample, with ID numbers in ascending order.
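The assignment requires the Excel generator for the actual draw. Purely to illustrate the idea of sampling 110 of the 170 cases without replacement and listing the IDs in ascending order, a minimal Python sketch could look like the following. It assumes the house IDs run from 1 to 170, and the seed value is hypothetical; neither detail comes from the assignment.

```python
import random

POPULATION_SIZE = 170  # cases in Real_Estate.xls, per the assignment
SAMPLE_SIZE = 110      # required sample size, per the assignment


def draw_sample(seed=None):
    """Return SAMPLE_SIZE distinct house IDs, sorted in ascending order.

    Assumes IDs run from 1 to POPULATION_SIZE; the real file may differ.
    """
    rng = random.Random(seed)
    ids = range(1, POPULATION_SIZE + 1)
    return sorted(rng.sample(ids, SAMPLE_SIZE))


# Fixing a (hypothetical) seed makes the draw repeatable, unlike the Excel generator.
sample_ids = draw_sample(seed=2013)
print(len(sample_ids), sample_ids[:5])
```

Recording the seed plays the same role as keeping a safe copy of the sample: the same seed gives the same 110 IDs again.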
2 marks
2. Summary table:
a)
i) Prepare a summary table that shows the mean, standard deviation
and 95% confidence interval for the mean of the following
variables:
Selling Price, Number of bedrooms, Size of house, Distance from
city centre
The following table shows descriptive statistics (mean, standard
deviation and 95% confidence interval) for the variables selling price (in
000's), number of bedrooms, size of the house and distance of the house
from the city centre.
Descriptive Statistics:
                                                           95% CI for Mean
Variable                               Mean      SD        Lower Limit  Upper Limit
Selling Price of the house (in 000s)   428.727   92.5491   411.238      446.216
Number of Bedrooms                     2.945     1.100     2.737        3.153
Size of house                          164.073   33.465    157.749      170.397
Distance from city centre              9.101     4.711     8.211        9.991
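As a check on the table, the confidence limits can be recomputed from the mean, SD and n = 110 using mean ± t × SD / √n. The sketch below is illustrative only: the critical value 1.982 is an approximate figure for 109 degrees of freedom taken from standard t tables, not from the assignment, and the function name is hypothetical.

```python
import math

T_CRIT_DF109 = 1.982  # approx. two-sided 95% t critical value, df = 109 (assumed, from t tables)


def mean_ci(mean, sd, n, t_crit):
    """Return (lower, upper) limits of the confidence interval for a mean."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Selling Price row of the table (in 000s)
price_lower, price_upper = mean_ci(428.727, 92.5491, 110, T_CRIT_DF109)
print(round(price_lower, 3), round(price_upper, 3))
```

The same call with the other rows' mean and SD checks the remaining intervals.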
ii) Use some of the information in (i) to describe a typical house in
these suburbs.
In the four suburbs, the mean selling price of a house is $428,727 (SD =
$92,549.1). The average number of bedrooms is 2.945 (SD = 1.100), so a
typical house has about three bedrooms. The mean size of the houses in
these suburbs is 164.073 m² (SD = 33.465 m²).
2 marks
1 mark
b)
i) Prepare a summary table that shows the mean and standard
deviation of Price for houses in the 4 Suburbs according (subject)
to the variable Bedrooms. Think carefully about the layout of the
rows and columns of your table. As well as means and standard
deviations you should also include the number of houses in each
group. So each cell in your final table should contain the mean, the
standard deviation and n, the number of houses in that group.
Selling Price of the house (in 000s)
Number of Bedrooms   Suburb Number   N    M         SD
1                    2               4    332.500   5.000
2                    1               15   370.667   37.696
2                    2               11   457.273   33.494
2                    3               9    331.111   31.798
2                    4               5    292.000   13.038
3                    1               6    463.333   44.572
3                    2               14   544.286   51.696
3                    3               11   393.636   52.397
3                    4               8    325.000   40.000
4                    1               5    554.000   55.045
4                    3               6    456.667   36.697
4                    4               3    416.667   5.774
5                    3               4    532.500   47.871
5                    4               7    488.571   53.675
6                    3               1    590.000   .
6                    4               1    570.000   .
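The cells of this table can be pooled into suburb-level figures by weighting each cell mean by its N. The sketch below is a cross-check only, with hypothetical names; it uses just the N and M columns copied from the table.

```python
# (bedrooms, suburb, n, mean price in 000s) copied from the table
CELLS = [
    (1, 2, 4, 332.500),
    (2, 1, 15, 370.667), (2, 2, 11, 457.273), (2, 3, 9, 331.111), (2, 4, 5, 292.000),
    (3, 1, 6, 463.333), (3, 2, 14, 544.286), (3, 3, 11, 393.636), (3, 4, 8, 325.000),
    (4, 1, 5, 554.000), (4, 3, 6, 456.667), (4, 4, 3, 416.667),
    (5, 3, 4, 532.500), (5, 4, 7, 488.571),
    (6, 3, 1, 590.000), (6, 4, 1, 570.000),
]


def suburb_summary(cells):
    """Pool cells into {suburb: (n, weighted mean price in 000s)}."""
    totals = {}
    for _bedrooms, suburb, n, mean in cells:
        count, price_sum = totals.get(suburb, (0, 0.0))
        totals[suburb] = (count + n, price_sum + n * mean)
    return {s: (count, price_sum / count) for s, (count, price_sum) in totals.items()}


summary = suburb_summary(CELLS)
for suburb in sorted(summary):
    n, mean = summary[suburb]
    print(suburb, n, round(mean, 3))
```

The cell counts add up to the full sample of 110, and the pooled row for Suburb 2 can be compared with the suburb-level statistics used in question 3.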
ii) Comment, in bullet point form, on the Price of any combinations for
Suburb and Bedrooms variables (i.e. cells in the table).
- The average price of 1 bedroom houses in suburb 2 is $332,500 (SD =
  $5,000).
- The highest mean price for a 2 bedroom house is in suburb 2 (M = $457,273,
  SD = $33,494).
- The lowest mean price for a 2 bedroom house is in suburb 4 (M = $292,000,
  SD = $13,038).
- The highest mean price for a 3 bedroom house is in suburb 2 (M = $544,286,
  SD = $51,696).
- The lowest mean price for a 3 bedroom house is in suburb 4 (M = $325,000,
  SD = $40,000).
1 mark
2 marks
- The highest mean price for a 4 bedroom house is in suburb 1 (M = $554,000,
  SD = $55,045).
- The lowest mean price for a 4 bedroom house is in suburb 4 (M = $416,667,
  SD = $5,774).
- The highest mean price for a 5 bedroom house is in suburb 3 (M = $532,500,
  SD = $47,871).
- The lowest mean price for a 5 bedroom house is in suburb 4 (M = $488,571,
  SD = $53,675).
- The price of a 6 bedroom house in suburbs 3 and 4 is $590,000 and
  $570,000 respectively (one house in each suburb).
3. A local real estate firm has told a client that the average Price of a
house in Suburb 2 is $420,000. You have been asked to evaluate this
claim. Use a One Sample t Test for the Mean to evaluate the claim that
the average price is $420,000.
The following table shows descriptive statistics for Suburb 2. In this
sample a total of 29 houses were sampled from Suburb 2. The mean house
price in this suburb is $482,069 (SD = $83,768).
Descriptive Statistics for Suburb 2
                                       N    M         SD       SE Mean
Selling Price of the house (in 000s)   29   482.069   83.768   15.555
To test:
H0: The average price of a house in Suburb 2 is 420000
H1: The average price of a house in Suburb 2 is not equal to 420000
Here the test value is 420000.
The following table shows the results of the t-test analysis. From this result
we conclude that the mean price in Suburb 2 is significantly different
from the test value 420000 (t = 3.990, df = 28, p < 0.001). The mean
difference between the price of houses in Suburb 2 and the test value is
62068.97.
Hence we have sufficient evidence to reject the null hypothesis, and
we conclude that the average price of a house in Suburb 2 is not equal to
420000.
One-Sample Test (Test Value = 420 (000's))
                                       t       df   p         Mean Difference
Selling Price of the house (in 000s)   3.990   28   < 0.001   62.06897
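The t statistic in this table follows from the descriptive statistics as t = (M − test value) / (SD / √n). A minimal sketch of that arithmetic is given below. The critical value 2.048 is an approximate figure for 28 degrees of freedom from standard t tables and is an assumption, as are the function and variable names.

```python
import math

T_CRIT_DF28 = 2.048  # approx. two-sided 5% critical value, df = 28 (assumed, from t tables)


def one_sample_t(sample_mean, sd, n, test_value):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - test_value) / standard_error, n - 1


# Values in 000s, taken from the descriptive statistics table
t_stat, df = one_sample_t(482.069, 83.768, 29, 420.0)
reject_h0 = abs(t_stat) > T_CRIT_DF28
print(round(t_stat, 3), df, reject_h0)
```

Comparing |t| with the critical value gives the same decision as comparing the p-value with 0.05.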
3 marks
4. Size and Price: One of the clients wants information on Size of houses
as it relates to price.
a) First create a new variable (column) labelled Size Group which
divides Size up into two size groups as follows:
Under 200 square meters        Small
200 square meters and over     Large
The new variable is created in the Excel sheet (as Size Group).
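The grouping rule is a single threshold test. The variable itself was created in Excel; the following Python sketch only illustrates the rule, with a hypothetical function name.

```python
SIZE_THRESHOLD_M2 = 200  # boundary between the two groups, per the question


def size_group(size_m2):
    """Label a house 'Small' (under 200 m²) or 'Large' (200 m² and over)."""
    return "Small" if size_m2 < SIZE_THRESHOLD_M2 else "Large"


print([size_group(s) for s in (150, 199.9, 200, 260)])
```

Note that a house of exactly 200 m² falls in the Large group, since the question defines Large as "200 square meters and over".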
b) Produce suitable graphs or charts to help in providing the
information requested on the Size of the house as it relates to
Price.
Selling Price of the house (in 000s) according to size group
Size Group   N     M         SD
Small        93    410.538   86.356
Large        17    528.235   55.027
Total        110   428.727   92.549
c) Find 95% CI intervals for the small and large houses Price. Is there
any interaction (overlap) between the two Confidence Intervals?
What does this tell you about the Prices for the two Sizes?
1 mark
2 marks
2 marks
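The following table shows descriptive statistics for the selling price
of houses (in 000's) according to the size group (small or large).
From these results, we can conclude that the mean price of large
houses (M = 528,235, SD = 55,027) is higher than the mean price of
small houses (M = 410,538, SD = 86,356). The two 95% confidence
intervals, 392.753 to 428.322 for small houses and 499.943 to 556.527
for large houses (in 000's), do not overlap, which indicates that large
houses sell for significantly more on average than small houses.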
Selling Price of the house (in 000s) according to size group
                                                   95% CI for Mean
Size Group   N     M         SD       SE Mean   Lower Limit   Upper Limit
Small        93    410.538   86.356   8.955     392.753       428.322
Large        17    528.235   55.027   13.346    499.943       556.527
Total        110   428.727   92.549   8.824     411.238       446.217
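Whether two confidence intervals overlap can be checked mechanically: they overlap only if each one starts before the other ends. The sketch below applies that test to the Small and Large intervals from the table; the function name is hypothetical.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


small_ci = (392.753, 428.322)  # 95% CI for mean price, Small houses (000s)
large_ci = (499.943, 556.527)  # 95% CI for mean price, Large houses (000s)

overlap = intervals_overlap(small_ci, large_ci)
gap = large_ci[0] - small_ci[1]  # distance between the intervals when they do not overlap
print(overlap, round(gap, 3))
```

Non-overlapping intervals are a conservative indication that the two group means differ; a formal comparison would use a two-sample t test.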
5. Produce a scatter plot of Price vs. Size (Size should be on the
horizontal axis). Make sure you label your axes properly and that your
graph has an appropriate title. Briefly describe the nature of the
relationship between these two variables.
The above scatter diagram shows a positive correlation between the
size of the house and the price of the house. The coefficient of determination is
0.4328, which indicates that around 43.28% of the variation in the dependent
variable (price) is explained by the independent variable (size of the house).
Use Excel to carry out a regression analysis on these two variables. Copy the
output into your assignment and use it to respond to the following:
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.6579
R Square            0.4328
Adjusted R Square   0.4276
Standard Error      70.0216
Observations        110

ANOVA
             df    SS         MS         F          Significance F
Regression   1     404095.3   404095.3   82.41759   5.789E-15
Residual     108   529526.5   4903.023
Total        109   933621.8
2 marks
4 marks

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   130.2087       33.5532          3.8807   0.0002    63.7005     196.7169    63.7005       196.7169
Size        1.8194         0.2004           9.0784   0.0000    1.4222      2.2167      1.4222        2.2167
a) Write down the regression equation.
The regression equation can be written as below.
Selling Price of House (in 000's) = 130.2087 + 1.8194 × House Size (in m²)
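The fitted equation can be turned into a small prediction function. This is an illustrative sketch using the two coefficients from the regression output; the function name is hypothetical.

```python
INTERCEPT = 130.2087  # from the regression output (000s)
SLOPE = 1.8194        # change in price (000s) per square metre


def predict_price(size_m2):
    """Predicted selling price in 000s for a house of the given size in m²."""
    return INTERCEPT + SLOPE * size_m2


print(round(predict_price(200), 4))
```

At the sample mean size of 164.073 m² the function returns about 428.72, close to the sample mean price of 428.727, as expected for a least-squares line (the small difference comes from the rounded coefficients).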
b) State the R-Square value and the Standard Error and explain what they
mean with respect to this data.
The R-square value is 0.4328, which indicates that 43.28% of the
variation in the dependent variable (house price) is explained by the
independent variable (size of the house). The standard error, 70.0216,
is the estimated standard deviation of the residuals: actual prices
typically differ from the fitted line by about 70 (in 000's).
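Both figures can be recovered from the ANOVA part of the regression output: R-square is the regression sum of squares divided by the total sum of squares, and the standard error is the square root of the residual mean square. The sketch below redoes that arithmetic with hypothetical variable names.

```python
import math

# Sums of squares and residual degrees of freedom from the ANOVA table
SS_REGRESSION = 404095.3
SS_RESIDUAL = 529526.5
SS_TOTAL = 933621.8
DF_RESIDUAL = 108

r_square = SS_REGRESSION / SS_TOTAL                    # share of variation explained
standard_error = math.sqrt(SS_RESIDUAL / DF_RESIDUAL)  # sqrt of residual mean square
print(round(r_square, 4), round(standard_error, 4))
```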
c) Write down the value of the gradient of the regression line and explain
what it means in this case.
In the above regression equation, the gradient of the regression line is
1.8194. It indicates that each additional square metre of house size is
associated with an increase of 1.8194 (in 000's), about $1,819, in the
predicted selling price.
d) Is the constant or intercept value significant in this case? How do you
know this?
The constant in this equation is the value of the dependent variable when
the value of the independent variable is zero. In this case the constant is
significant since the associated p-value (0.0002) is less than 0.05.
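The t statistics behind these p-values are each coefficient divided by its standard error. A short sketch of that check follows, using the figures from the coefficients table; the names are hypothetical, and the Size result differs from the printed 9.0784 in the fourth decimal place only because the table values are rounded.

```python
COEFFICIENTS = {
    # name: (coefficient, standard error) from the regression output
    "Intercept": (130.2087, 33.5532),
    "Size": (1.8194, 0.2004),
}


def t_statistic(coefficient, standard_error):
    """t statistic for testing whether a coefficient differs from zero."""
    return coefficient / standard_error


t_stats = {name: t_statistic(*values) for name, values in COEFFICIENTS.items()}
print({name: round(t, 2) for name, t in t_stats.items()})
```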
e) Briefly explain why you think this regression model is, or is not, a good
model.
The above regression model is a good model, since the regression
coefficient is significant (p < 0.001) and the R-square is also sufficiently
large. Hence the independent variable in this regression analysis is a
good predictor of the dependent variable.
6. Price and Suburb indexes:
a) Determine the Suburb index for each suburb after regressing Price
with Size of the house. Use the multiplicative model in calculating
suburb indices: Improved Predicted Price = Predicted Price (as a function
of Size) × Suburb Index.
Hint: Use a similar technique to the time series technique that
calculates seasonal indices.
b) Interpret the suburb indices in the context of the problem.
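One plausible reading of the hint, offered here as an assumption rather than the required method, is to treat each suburb like a season: divide every actual price by its size-based predicted price, then average those ratios within each suburb. The sketch below uses the fitted line from question 5 and a handful of hypothetical houses that are not taken from the sample; all function names are illustrative.

```python
from collections import defaultdict

INTERCEPT, SLOPE = 130.2087, 1.8194  # fitted Price-on-Size line (000s, m²)


def predict_price(size_m2):
    """Size-only predicted price in 000s."""
    return INTERCEPT + SLOPE * size_m2


def suburb_indices(houses):
    """Average actual/predicted price ratio per suburb.

    houses: iterable of (suburb, size_m2, price_000s) tuples.
    """
    ratios = defaultdict(list)
    for suburb, size_m2, price in houses:
        ratios[suburb].append(price / predict_price(size_m2))
    return {s: sum(r) / len(r) for s, r in ratios.items()}


# Hypothetical houses for illustration only, NOT taken from the sample.
demo = [(1, 150, 420.0), (1, 180, 470.0), (2, 150, 380.0), (2, 180, 430.0)]
indices = suburb_indices(demo)
improved = predict_price(160) * indices[1]  # improved prediction for a suburb 1 house
print({s: round(v, 3) for s, v in indices.items()}, round(improved, 1))
```

An index above 1 would mean houses in that suburb sell for more than their size alone predicts, and an index below 1 for less. Some seasonal-index procedures also rescale the indices so that they average to 1; that step is omitted here.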
4 marks
2 marks
2 marks
2 marks
2 marks
2 marks
2 marks
7. Using information from your analyses write a short concluding paragraph
about house prices and sizes for different suburbs.
From the above analysis, we can conclude that on average the prices of houses in
Suburb 2 are higher than in the other suburbs. Across the four suburbs taken
together, there is a positive correlation between house price and size, and the
size of the house is a good predictor of the house price. On average, large
houses sell for more than small houses.
2 marks