
Analysis of House Sales Data for Gainesville

Ramin Shamshiri UFID: 90213353

Figure 2. Distribution of homes types, 2006. (n=100)

Because the sample data contain outliers, a normal curve cannot be used to describe the population distribution. The hypotheses to be tested can be stated as follows:
Figure 1. Histogram of home prices 2006. (n=100)

The Gainesville Realtors Association (GRA) claims that the average price of a house sold in 2009 was $100 per square foot and that this is below the average for 2006. A random sample of houses sold in 2006 was selected to test this claim. The sample data included the price per square foot for one hundred houses sold in 2006. The five-number summary, together with the mean and the standard deviation of the data, is given in Table 1. The distribution of house prices is shown as a histogram (Figure 1), with the corresponding values given in Table 2. The coefficient of variation of the data is 33.69%. The distribution of the status of houses sold in 2006 is shown by the pie chart in Figure 2. According to the sample data, 11% of the houses sold were new and 89% were old. The box plot in Figure 4 shows the lower and upper quartiles of the price per square foot for the sample data.

To test the GRA claim with a random sample of 100 houses sold in 2006, we can use either a z-test or a t-test, since the sample size is greater than 30. We do not know the population standard deviation, but because the sample size is greater than 30 we can use the sample standard deviation in its place. We also use the sample mean as an estimate of the 2006 population mean and then apply a t-test for the population mean.
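The coefficient of variation quoted above follows directly from the mean and standard deviation reported in Table 1. A minimal sketch, assuming only those summary values (the variable names are illustrative and not part of the original analysis):

```python
# Summary statistics as reported in Table 1 (price per square foot, dollars).
n = 100
mean = 92.33
stdev = 31.105

# Coefficient of variation: standard deviation as a percentage of the mean.
cv_percent = stdev / mean * 100

# Standard error of the mean: standard deviation divided by sqrt(n).
se_mean = stdev / n ** 0.5

print(f"CV = {cv_percent:.2f}%")
print(f"SE of mean = {se_mean:.2f}")
```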

\[ H_0: \mu = 100 \qquad H_a: \mu > 100 \]

From the exploratory analysis we have \(n = 100\) (df = 99), \(s = 31.105\), and \(\bar{x} = 92.33\). From the t-table, with \(\alpha = 0.05\) and df = 99, we have \(t_\alpha = 1.65\). We want to check whether the claim that the mean price of houses sold in 2006 is greater than 100 is valid. We use the t-test to standardize the sample mean against the hypothesized mean \(\mu_0 = 100\). The test statistic is then:

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{92.33 - 100}{31.105/\sqrt{100}} = -2.4658 \]

The p-value corresponding to this test statistic is greater than our significance level (p-value > \(\alpha = 0.05\)); hence we fail to reject the null hypothesis and conclude that there is not enough evidence that the mean price of houses sold in 2006 is greater than 100. This procedure is shown in Figure 3.
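The calculation above can be reproduced from the summary statistics alone. The following is a sketch, not the original analysis: it uses only the Python standard library and, as an assumption, approximates the p-value with the standard normal distribution (reasonable for df = 99) rather than the exact t distribution. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_upper_test(sample_mean, mu0, sample_sd, n):
    """Return the test statistic and an approximate p-value for Ha: mu > mu0.

    The p-value uses the standard normal distribution as a large-sample
    approximation to the t distribution (an assumption of this sketch).
    """
    standard_error = sample_sd / sqrt(n)
    statistic = (sample_mean - mu0) / standard_error
    # Upper-tail probability P(Z >= statistic).
    p_value = 1.0 - NormalDist().cdf(statistic)
    return statistic, p_value


# Values taken from Table 1 and the hypothesized mean of 100.
statistic, p_value = one_sided_upper_test(92.33, 100, 31.105, 100)
alpha = 0.05
print(f"test statistic = {statistic:.4f}")
print(f"approximate p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Because the sample mean lies below the hypothesized value, the statistic is negative and the upper-tail p-value is close to 1, which is why the null hypothesis is not rejected.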

Figure 3. Hypothesis testing

Variable    N    SE Mean  Mean   StDev    Minimum  Q1     Median  Q3      Maximum  Range
SqftPrice   100  3.11     92.33  31.1052  21.00    70.50  90.50   108.00  196.00   175

Table 1. Summary of selling price of homes in Gainesville, Florida, Fall 2006
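The quartiles in Table 1 also give a quick check on the outliers mentioned earlier. The sketch below assumes the common 1.5 x IQR fence rule, which the report itself does not state; the variable names are illustrative.

```python
# Five-number summary as reported in Table 1 (dollars per square foot).
minimum, q1, median, q3, maximum = 21.00, 70.50, 90.50, 108.00, 196.00

# Interquartile range and the 1.5 * IQR fences (assumed convention).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence})")
# A minimum or maximum beyond a fence means at least one outlier on that side.
print("low-side outliers present:", minimum < lower_fence)
print("high-side outliers present:", maximum > upper_fence)
```

Under this rule the maximum of 196 lies above the upper fence of 164.25, so the sample has at least one high-priced outlier, while the minimum of 21 stays inside the lower fence of 14.25.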

Price per square foot (dollars)   Number of homes
10 up to 30                       3
30 up to 50                       7
50 up to 70                       23
70 up to 90                       31
90 up to 110                      18
110 up to 130                     5
130 up to 150                     9
150 up to 170                     3
170 up to 190                     0
190 up to 210                     1
Total                             100

Table 2. Distribution of Homes in Gainesville by Price, Fall 2006 (n=100)

Type of Home   Frequency
New            11
Old            89

Table 3. Distribution of Homes in Gainesville

Figure 4. Box plot of the home prices, 2006. (n=100)

The assumptions of this test were:
Random variable: the average price of a house sold in 2006.
Distribution of the population: not normal.
Parameter of the distribution: \(\mu\) = mean of the population.
Data collection method and type of data: assume SRS; quantitative variable.
Sample size: n = 100 is greater than 30, so we can use either a t-test or a z-test.
