
BUSINESS STATISTICS: TERM PROJECT

House Price Data

Group No : 03 Section : G
Group members :
SR. NO NAMES ROLL NO
1 Akash Basa 2017232011
2 Aman Srivastava 2017232015
3 Anindhya Sharma 2017232023
4 Anuankit Panda 2017232028
5 Chitwan Singh 2017232039
6 Tejas Bhosale 2017231103

Submitted to : Dr. S Maheshwaran


Business Statistics: House price data analysis 1
Outline of presentation

• Measures of central tendency & variability for house price data and built up area
• Correlation and Simple Regression analysis for House price data and built up area data
• Correlation and Simple Regression analysis for No. of offers received and House price data
• Analysis of given data using Multiple Regression model
• Observations and Conclusion



Measures of Central Tendency & Variability for House Price data :

Mean 132207.12
Standard Error 4014.57
Median 124500
Mode 124600
Standard Deviation 39944.4
Sample Variance 1595558516.37
Kurtosis 0.56
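Skewness 0.85

• Here Mean ≠ Median ≠ Mode (Mean > Median) and Skewness is 0.85, hence house price data is positively (right) skewed and not normally distributed
• Majority of observations i.e. house prices would fall in Mean ± Standard deviation i.e. ($132207.12 ± $39944.4), or about $92262.72 to $172151.52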
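The summary measures above can be reproduced with a short script. This is a minimal sketch, not the group's actual workflow: the `describe` helper and the `prices` list are hypothetical, and it assumes the skewness and kurtosis figures use the sample-adjusted formulas found in common spreadsheet descriptive-statistics tools.

```python
import math
import statistics


def describe(values):
    """Return the summary measures listed on the slide for a list of numbers.

    Assumption: skewness and kurtosis use the sample-adjusted formulas
    (the ones behind spreadsheet SKEW and KURT). Needs at least 4 values.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Standardised deviations, reused by the skewness and kurtosis formulas.
    z = [(v - mean) / sd for v in values]
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": mean,
        "standard_error": sd / math.sqrt(n),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "standard_deviation": sd,
        "sample_variance": statistics.variance(values),
        "kurtosis": kurt,
        "skewness": skew,
    }


# Hypothetical prices for illustration only, NOT the project data set.
prices = [90000, 105000, 124600, 124600, 131000, 150000, 210000]
for name, value in describe(prices).items():
    print(f"{name:20s}{value:15.2f}")
```

A positive skewness together with Mean > Median, as in the hypothetical list, is the same pattern the slide reports for house prices.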
Measures of Central Tendency & Variability for built up area

Mean 1628.1
Standard Error 31.33
Median 1630
Mode 1950
Standard Deviation 313.33
Sample Variance 98181.2
Kurtosis -0.97
Skewness 0.06

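• Here Mean ≈ Median (1628.1 vs. 1630) and Skewness is 0.06, so built up area data is nearly symmetric; however the Mode (1950) differs and Kurtosis is -0.97 (flatter than normal), hence built up area data is not exactly normally distributed
• Majority of observations i.e. built up area in Sq. feet would fall in Mean ± Standard deviation i.e. (1628.1 ± 313.33) Sq. feet, or about 1314.77 to 1941.43 Sq. feet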
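The Mean ± Standard deviation bands, and the link between Standard Error and Standard Deviation (SE = SD / √n), can be checked with simple arithmetic on the figures reported in the two tables. The helper names below are hypothetical, and the sample size is not stated on the slides, so the "implied n" is only what the reported figures suggest.

```python
def implied_sample_size(sd, se):
    """Back out n from the identity SE = SD / sqrt(n)."""
    return (sd / se) ** 2


def one_sd_interval(mean, sd):
    """Return the band (mean - sd, mean + sd)."""
    return mean - sd, mean + sd


# Figures copied from the two summary tables on the slides.
summaries = {
    "house price": {"mean": 132207.12, "sd": 39944.4, "se": 4014.57},
    "built up area": {"mean": 1628.1, "sd": 313.33, "se": 31.33},
}

for name, s in summaries.items():
    low, high = one_sd_interval(s["mean"], s["sd"])
    n = implied_sample_size(s["sd"], s["se"])
    print(f"{name}: mean ± SD = {low:.2f} to {high:.2f}, implied n ≈ {n:.1f}")
```

The reported figures point to roughly 100 observations; the small difference between the two implied values may simply reflect rounding in the tables.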



Scatter plot for House price vs. built up area data

[Figure: scatter plot; x-axis: House price (0 to 300,000), y-axis: Built up area in Sq. feet (0 to 2,500)]



House price vs. built up area data analysis:

CONCEPT USED: Correlation (Strength of relationship)


Considering X variable : House prices
Considering Y variable : Built up area in Sq. Feet

FINDING:
Coefficient of Correlation (r) : 0.668

INTERPRETATION:
1) Moderate Positive Correlation exists between House Prices and
Built up area : X and Y variables tend to move in the same direction

2) Even if we interchange the X and Y variables, the value of the
Coefficient of Correlation doesn't change, because r is symmetric in X and Y
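Both points can be illustrated by computing the coefficient of correlation directly from its definition. This is a minimal sketch: `pearson_r` is a hypothetical helper and the two lists are made-up values, not the project data.

```python
import math


def pearson_r(x, y):
    """Pearson coefficient of correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only.
house_price = [95000, 110000, 124500, 131000, 150000, 175000, 210000]
built_up_area = [1300, 1650, 1500, 1950, 1600, 1900, 2050]

r_xy = pearson_r(house_price, built_up_area)
r_yx = pearson_r(built_up_area, house_price)
print(f"r(X, Y) = {r_xy:.3f}, r(Y, X) = {r_yx:.3f}")
```

Swapping the arguments only swaps the roles of `sxx` and `syy` under the square root, which is why the two printed values agree.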



House Price vs. Sq. Feet data analysis (Contd.):
CONCEPT USED: Simple Regression.
Considering X variable : Built up area in Sq. Feet
Considering Y variable : House prices

ASSUMPTION: α (Level of Significance) = 5%

FINDING:
Coefficient of determination (r²) : 0.45

INTERPRETATION:
1) Model built through sample data for estimation explains only 45%
of the variation in House prices; the remaining 55% is left to error or other
factors that are not considered (i.e. no. of offers, no. of bedrooms, no. of
bathrooms etc.)
2) 45% of the variation in House prices is explained by Built up area (an association, not proof of cause).
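For a simple regression, r² is the square of the coefficient of correlation, which is consistent with the reported figures (0.668² ≈ 0.446, i.e. about 0.45). The sketch below fits a least-squares line and computes r² from the residuals; `fit_line` is a hypothetical helper and the data are made up, not the project data.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r2 = 1 - (unexplained variation / total variation)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical values for illustration only.
built_up_area = [1300, 1650, 1500, 1950, 1600, 1900, 2050]
house_price = [95000, 110000, 124500, 131000, 150000, 175000, 210000]

slope, intercept, r2 = fit_line(built_up_area, house_price)
print(f"price ≈ {intercept:.0f} + {slope:.2f} × area, r² = {r2:.3f}")
```

The quantity `1 - r2` is the share of variation the line leaves unexplained, which corresponds to the 55% mentioned in the interpretation.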
Scatter plot for No. of offers vs. House Prices

[Figure: scatter plot; x-axis: House price (0 to 300,000), y-axis: No. of offers (0 to 6)]



No. of offers vs. House Price data analysis :

CONCEPT USED: Correlation (Strength of relationship)


Considering X variable : House Prices
Considering Y variable : No. of Offers

FINDING:
Coefficient of Correlation (r) : 0.125

INTERPRETATION:
1) Very weak (near zero) Positive Correlation exists between House Prices and
No. of Offers: i.e. practically no Linear relationship exists between
X and Y variables.

2) Even if we interchange the X and Y variables, the value of the
Coefficient of Correlation doesn't change



No. of offers vs. House Prices analysis (contd.):
CONCEPT USED: Simple Regression.
Considering X variable : House Prices
Considering Y variable : No. of Offers

ASSUMPTION: α (Level of Significance) = 5 %

FINDING:
Coefficient of determination (r²) : 0.0157

INTERPRETATION:
1) Model built through sample data for estimation explains only 1.57%
of the variation in No. of Offers; the remaining 98.43% is left to error or
other factors that are not considered.
2) ONLY 1.57% of the variation in No. of Offers is explained by House Prices.
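The slides state α = 5% but do not show the test itself. One common way to use it is a t-test on the coefficient of correlation, t = r√(n − 2) / √(1 − r²), which in simple regression is equivalent to the t-test on the slope. The sketch below is an illustration under assumptions: the sample size of 100 is NOT stated on the slides, and the critical value is the approximate two-tailed table value for about 98 degrees of freedom.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing H0: no linear relationship, given r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


ASSUMED_N = 100      # assumption: sample size is not given on the slides
T_CRITICAL = 1.984   # approximate two-tailed 5% critical value for about 98 d.f.

for label, r in [("House price vs. built up area", 0.668),
                 ("No. of offers vs. house price", 0.125)]:
    t = correlation_t_statistic(r, ASSUMED_N)
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{label}: r = {r}, t = {t:.2f} -> {verdict} at the 5% level")
```

Under these assumptions the relationship with built up area would be statistically significant, while the one between No. of Offers and House Prices would not, which agrees with the interpretation above.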



CONCEPT USED: Multiple Regression.
Considering X variables : Built up area, no. of bedrooms, no. of
bathrooms, no. of offers received, brick/non brick house
and location of house.
Considering Y variable : House Prices

ASSUMPTION: α (Level of Significance) = 5 %

FINDING:
Coefficient of determination (R²) : 0.494

INTERPRETATION:
1) Model built through sample data for estimation explains only 49.4%
of the variation in House prices; the remaining 50.6% is left to error or other
factors that are not considered
2) 49.4% of the variation in House prices is explained jointly by Built up area, no. of
bedrooms, no. of bathrooms, no. of offers received, brick/non brick house
and location of house.
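A multiple regression of this kind can be sketched with the normal equations (XᵀX)b = Xᵀy. The example below is a minimal illustration, not the group's method: `solve` and `fit_multiple` are hypothetical helpers, the data are made up, only three of the six predictors are shown, and coding brick/non brick as 1/0 is an assumption.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_multiple(rows, y):
    """Least-squares fit with intercept; returns (coefficients, r2).

    coefficients[0] is the intercept, the rest follow the column order of rows.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return beta, 1 - ss_res / ss_tot


# Hypothetical rows: [built up area, no. of bedrooms, brick (1) / non brick (0)].
houses = [[1500, 3, 0], [1800, 3, 1], [1200, 2, 0], [2000, 4, 1],
          [1650, 3, 0], [1400, 2, 1], [1900, 4, 0], [1750, 3, 1]]
prices = [118000, 152000, 96000, 171000, 125000, 119000, 149000, 146000]

coefficients, r2 = fit_multiple(houses, prices)
print("coefficients:", [round(c, 2) for c in coefficients])
print(f"R² = {r2:.3f}")
```

In practice a statistics package would also report a p-value for each coefficient, which is where the 5% level of significance would be applied.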
Observations:

• The goal is NOT simply to get the highest R-square value. Instead, the goal
is to develop a model that is statistically sound and fits the existing data well.
• Other things being equal, a higher R-square value means the Model explains
more of the Variance, but R-square never decreases when independent variables
are added, so it should not be the only criterion.
• In case of simple regression: when only built up area (Sq. feet) is considered as
our independent variable (i.e. X), the model explains 45% of the variation in our
dependent variable (i.e. Y), Price of houses.
• In case of Multiple regression: when built up area, no. of bedrooms,
no. of bathrooms, no. of offers received, brick/non brick house and
location of house are considered as our independent variables, the model
explains 49.4% of the variation in our dependent variable i.e. Price of
houses (a gain of 4.4 percentage points over simple regression).
• Conclusion: Limiting the no. of independent variables also limits
the usefulness of the Model. Hence, to develop a model that fits the
data better, we have to go for the Multiple Regression Model.
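Because R-square cannot fall when independent variables are added, adjusted R-square, 1 − (1 − R²)(n − 1)/(n − k − 1), is often used to compare models of different size. The sketch below applies it to the two reported R-square values. It rests on assumptions not stated on the slides: a sample size of 100, and six predictors in the multiple model (location may need more than one column if it has several categories).

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-square for a model with k predictors fitted on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


ASSUMED_N = 100  # assumption: sample size is not given on the slides

models = [
    ("Simple regression (built up area only)", 0.45, 1),
    ("Multiple regression (six predictors assumed)", 0.494, 6),
]

for label, r2, k in models:
    print(f"{label}: R² = {r2}, adjusted R² = {adjusted_r2(r2, ASSUMED_N, k):.3f}")
```

Under these assumptions the adjusted values are about 0.44 and 0.46, so the multiple model still fits better, though by a smaller margin than the raw R-square values suggest.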

