
Brief Introduction: AJ Davis is a department store chain that has many credit customers and wants to find out more information about these customers. AJ Davis has compiled a sample of 50 credit customers with data on the following variables: Location, Income (in $1,000s), Size (number of people living in the household), Years (number of years the customer has lived in the current location), and Credit Balance (the customer's current credit card balance on the store's credit card, in $).

The manager at AJ Davis has speculated the following:
a. The average (mean) annual income was less than $50,000.
b. The true population proportion of customers who live in an urban area exceeds 40%.
c. The average (mean) number of years lived in the current home is less than 13 years.
d. The average (mean) credit balance for suburban customers is more than $4300.

I will evaluate each of these claims by performing a hypothesis test (using the seven elements of a Test of Hypothesis with α = 0.05) to see whether there is evidence to support my manager's belief in each case (a-d). For each test I explain my conclusion in simple terms and compute and interpret the p-value. I then follow up with 95% confidence intervals for each of the variables described in a. to d., along with an interpretation of these intervals. This paper also includes an Appendix with all the steps in hypothesis testing, as well as the confidence intervals and Minitab output.

In order to understand how hypothesis testing is done, it is important to know the elements of the Test of Hypothesis and what each step means. The seven elements of a Test of Hypothesis are:
1. Null hypothesis (H0) - A theory about the specific values of one or more population parameters. The theory generally represents the status quo, and we accept it until proven false.
2. Alternative (research) hypothesis (Ha) - A theory that contradicts the null hypothesis. It generally represents the claim we will accept only when the data provide sufficient evidence of its truth.
3. Test statistic - A sample statistic used to decide whether to reject the null hypothesis.
4. Rejection region - The numerical values of the test statistic for which the null hypothesis will be rejected.
5. Assumptions - Clear statements of any assumptions made about the populations being sampled.
6. Experiment and calculation of test statistic - Performance of the sampling experiment and determination of the numerical value of the test statistic.
7. Conclusion -
a. If the numerical value of the test statistic falls in the rejection region, then we reject the null hypothesis and conclude that the alternative hypothesis is true.
b. If the test statistic does not fall in the rejection region, then we do not reject H0, as we have insufficient evidence to do so.
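Elements 4 and 7 can be illustrated with a short Python sketch. It is only an illustration and is not part of the Minitab analysis; the function names `critical_value` and `in_rejection_region` are hypothetical, and the sketch assumes a one-tailed test on a standard normal (z) test statistic.

```python
from statistics import NormalDist


def critical_value(alpha, tail):
    """Return the z critical value for a one-tailed test.

    tail is "lower" when Ha uses < and "upper" when Ha uses >.
    (Hypothetical helper, assuming a standard normal test statistic.)
    """
    if tail not in ("lower", "upper"):
        raise ValueError("tail must be 'lower' or 'upper'")
    z = NormalDist().inv_cdf(1 - alpha)  # about 1.645 when alpha = 0.05
    return -z if tail == "lower" else z


def in_rejection_region(z_stat, alpha, tail):
    """Return True when the test statistic falls in the rejection region."""
    crit = critical_value(alpha, tail)
    return z_stat < crit if tail == "lower" else z_stat > crit


print(round(critical_value(0.05, "lower"), 3))
print(round(critical_value(0.05, "upper"), 3))
```

With alpha = 0.05, the two calls print the cutoffs for a lower-tailed and an upper-tailed test, which are the values used for the rejection regions in this paper.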

a. The average (mean) annual income was less than $50,000.

I found the average annual income to be 43.74, or $43,740, and the standard deviation to be 14.64, or $14,640.

Set up the hypothesis test:
o H0: μ = 50
o Ha: μ < 50

For α = 0.05 and < in the Ha, I found that z = -1.645, so the rejection region is z < -1.645.

Next I calculated the test statistic z using the formula below:

\[ z = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} \]

where \(\mu_0\) is the mean in the null hypothesis and \(s_{\bar{x}} = s/\sqrt{n}\).

\[ z = \frac{43.74 - 50}{2.0704} = -3.02, \quad \text{because } s_{\bar{x}} = \frac{14.64}{\sqrt{50}} = \frac{14.64}{7.07107} = 2.0704 \]

The p-value = 0.001. The p-value offers a complementary and equally valid way to evaluate the null and alternative hypotheses: compare the p-value to alpha. If the p-value is less than alpha, reject the null hypothesis and accept the alternative hypothesis at the given alpha. The test statistic method and the p-value method give the same reject or do-not-reject result. Because the p-value = 0.001 is less than alpha = 0.05, we reject the null hypothesis H0: μ = 50 and accept the alternative hypothesis Ha: μ < 50 at α = 0.05. My calculated test statistic of -3.02 falls in the rejection region of z < -1.645; therefore, I reject the null hypothesis and say there is sufficient evidence to indicate μ < 50, or $50,000.
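As a check on the arithmetic for part a, the calculation can be sketched in Python from the summary statistics reported above (n = 50, mean 43.74, standard deviation 14.64). This is a minimal sketch using only the standard library, not the Minitab procedure itself; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics for Income (in $1,000s), as reported in the text
n, xbar, s = 50, 43.74, 14.64
mu0, alpha = 50.0, 0.05

se = s / sqrt(n)                 # standard error of the mean
z = (xbar - mu0) / se            # test statistic
p_value = NormalDist().cdf(z)    # lower-tail p-value, because Ha: mu < 50
reject = p_value < alpha

print(f"SE = {se:.4f}, z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

The lower-tail area is used because the alternative hypothesis has the < sign.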

b. The true population proportion of customers who live in an urban area exceeds 40%.

22 of the 50 customers surveyed live in an urban area, which is 44% or 0.44; this is the point estimate for p. Therefore my hypotheses are:
o H0: p = 0.40 vs. Ha: p > 0.40

In order to conduct the large-sample z-test, we first need to verify that the sample size is large enough.
o n p0 = 50(0.40) = 20 and n(1 - p0) = 50(1 - 0.40) = 30; both are larger than 15, so we can conclude that the sample size is large enough to apply the large-sample z-test.

\[ z = \frac{\hat{p} - p_0}{s_{\hat{p}}} = \frac{0.44 - 0.40}{0.069282} = 0.58, \quad \text{where } s_{\hat{p}} = \sqrt{\frac{(0.40)(0.60)}{50}} = 0.069282 \]

This is a one-tailed test (upper or right tail, since Ha has >). Our rejection region is z > 1.645. Since 0.58 is not greater than 1.645, it is not in the rejection region, so we do not reject H0.

The p-value = 0.282. The p-value offers a complementary and equally valid way to evaluate the null and alternative hypotheses: compare the p-value to alpha. If the p-value is less than alpha, reject the null hypothesis and accept the alternative hypothesis at the given alpha. The test statistic method and the p-value method give the same reject or do-not-reject result. Because the p-value = 0.282 is more than alpha = 0.05, we do not reject the null hypothesis H0: p = 0.40 and we do not accept the alternative hypothesis Ha: p > 0.40 at α = 0.05. Since we are not rejecting H0, we are saying there is insufficient evidence to conclude that the true population proportion of customers who live in an urban location is greater than 40%.
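The large-sample proportion test in part b can be sketched the same way. The sketch below assumes the counts given above (22 urban customers out of 50) and uses the sample-size rule from the text (both n·p0 and n·(1 - p0) at least 15); the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

x, n = 22, 50          # urban customers, sample size
p0, alpha = 0.40, 0.05

# Sample-size check used in the text: both counts should be at least 15
large_enough = n * p0 >= 15 and n * (1 - p0) >= 15

p_hat = x / n
se0 = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se0
p_value = 1 - NormalDist().cdf(z)    # upper-tail p-value, because Ha: p > 0.40
reject = p_value < alpha

print(f"large enough: {large_enough}, z = {z:.2f}, "
      f"p-value = {p_value:.3f}, reject H0: {reject}")
```

Note that the standard error in the test uses the hypothesized value p0 rather than the sample proportion.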

c. The average (mean) number of years lived in the current home is less than 13 years.

o From the survey data, I found the average number of years in the current home to be 12.260 and the standard deviation to be 5.086.
o Set up the hypothesis test:
H0: μ = 13
Ha: μ < 13

For α = 0.05 and < in the Ha, I found that z = -1.645, so the rejection region is z < -1.645.

Now I calculate the test statistic:

\[ z = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} \]

where \(\mu_0\) is the mean in the null hypothesis and \(s_{\bar{x}} = s/\sqrt{n}\).

\[ z = \frac{12.26 - 13}{0.7193} = -1.03, \quad \text{because } s_{\bar{x}} = \frac{5.086}{\sqrt{50}} = 0.7193 \]

The p-value = 0.152. Because the p-value = 0.152 is more than alpha = 0.05, we do not reject the null hypothesis H0: μ = 13 and we do not accept the alternative hypothesis Ha: μ < 13 at α = 0.05. My calculated test statistic of -1.03 does not fall in the rejection region of z < -1.645; therefore, we do not reject the null hypothesis and say there is insufficient evidence to indicate μ < 13.

d. The average (mean) credit balance for suburban customers is more than $4300.

o I found the average credit balance for those surveyed to be $3970 and the standard deviation to be $932.
o Set up the hypothesis test:
H0: μ = 4300
Ha: μ > 4300

For α = 0.05 and > in the Ha, I found z = 1.645, so the rejection region is z > 1.645.

Now I calculate the test statistic:

\[ z = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} \]

where \(\mu_0\) is the mean in the null hypothesis and \(s_{\bar{x}} = s/\sqrt{n}\).

\[ z = \frac{3970 - 4300}{131.8} = -2.50, \quad \text{because } s_{\bar{x}} = \frac{932}{\sqrt{50}} = 131.8 \]

The p-value = 0.994. The p-value offers a complementary and equally valid way to evaluate the null and alternative hypotheses: compare the p-value to alpha. If the p-value is less than alpha, reject the null hypothesis and accept the alternative hypothesis at the given alpha. The test statistic method and the p-value method give the same reject or do-not-reject result. Because the p-value = 0.994 is not less than alpha = 0.05, we do not reject the null hypothesis H0: μ = 4300 and we do not accept the alternative hypothesis Ha: μ > 4300 at α = 0.05. My calculated test statistic of -2.50 does not fall in the rejection region of z > 1.645; therefore, I do NOT reject the null hypothesis and say there is insufficient evidence to indicate μ > 4300.
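The p-value in part d is close to 1 because the test is upper-tailed while the sample mean falls below the hypothesized value. The Python sketch below, which assumes the summary statistics reported above (n = 50, mean 3970, standard deviation 932), computes the upper-tail area and, for contrast only, the lower-tail area that a < alternative would have used. The variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 50, 3970.0, 932.0
mu0, alpha = 4300.0, 0.05

se = s / sqrt(n)
z = (xbar - mu0) / se

std_normal = NormalDist()
p_upper = 1 - std_normal.cdf(z)   # p-value for Ha: mu > 4300 (the test in the text)
p_lower = std_normal.cdf(z)       # for contrast only: area a < alternative would use
reject = p_upper < alpha

print(f"SE = {se:.1f}, z = {z:.2f}, upper-tail p = {p_upper:.3f}, "
      f"lower-tail p = {p_lower:.3f}, reject H0: {reject}")
```

The two tail areas sum to 1, so a negative test statistic in an upper-tailed test always gives a p-value above 0.5.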

Appendix

2) Follow this up with computing 95% confidence intervals for each of the variables described in a. - d., again interpreting these intervals.

a. The average (mean) annual income was less than $50,000

One-Sample Z: Income ($1000)
The assumed standard deviation = 14.64

Variable        N   Mean   StDev  SE Mean  95% CI
Income ($1000)  50  43.74  14.64  2.07     (39.68, 47.80)

Conclusion: According to the confidence interval, we are 95% confident that the true mean income lies between $39,680 and $47,800.

b. The true population proportion of customers who live in an urban area exceeds 40%

Sample  X   N   Sample p  95% CI                Z-Value  P-Value
1       22  50  0.440000  (0.302411, 0.577589)  0.58     0.564

Conclusion: According to the confidence interval, we are 95% confident that the true population proportion of customers who live in an urban area lies between 0.302 and 0.578.

c. The average (mean) number of years lived in the current home is less than 13 years

Using the Years statistics (N = 50, mean = 12.260, assumed standard deviation = 5.086, SE mean = 0.719), the 95% confidence interval is 12.260 ± 1.96(0.719) = (10.85, 13.67).

Conclusion: According to the confidence interval, we are 95% confident that the true mean number of years customers have lived in their current homes lies between 10.85 and 13.67 years.

d. The average (mean) credit balance for suburban customers is more than $4300

One-Sample Z: Credit Balance($)
The assumed standard deviation = 932

Variable           N   Mean  StDev  SE Mean  95% CI
Credit Balance($)  50  3970  932    132      (3712, 4229)

Conclusion: We are 95% confident that the true mean credit balance lies between $3,712 and $4,229.
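The two-sided intervals for income and for the urban proportion can be reproduced with a short Python sketch. It assumes the summary statistics reported in this paper; `z_interval` is a hypothetical helper name, not a Minitab or library function. For the proportion, the standard error is based on the sample proportion (0.44), not on the hypothesized value 0.40 used in the test.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(estimate, std_error, confidence=0.95):
    """Two-sided z confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * std_error
    return estimate - margin, estimate + margin


# Mean income (in $1,000s): n = 50, mean 43.74, standard deviation 14.64
income_ci = z_interval(43.74, 14.64 / sqrt(50))

# Urban proportion: 22 of 50 customers
p_hat = 22 / 50
urban_ci = z_interval(p_hat, sqrt(p_hat * (1 - p_hat) / 50))

print("Income CI:", tuple(round(v, 2) for v in income_ci))
print("Urban proportion CI:", tuple(round(v, 6) for v in urban_ci))
```

The same helper could be applied to the other variables by passing their mean and standard error.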

Minitab calculations for first part of Part B Project

a. The average (mean) annual income was less than $50,000

Descriptive Statistics: Income ($1000)

Variable        Mean   StDev  Minimum  Maximum
Income ($1000)  43.74  14.64  21.00    67.00

One-Sample Z
Test of mu = 50 vs < 50
The assumed standard deviation = 14.64

N   Mean   SE Mean  95% Upper Bound  Z      P
50  43.74  2.07     47.15            -3.02  0.001

[Distribution plot: standard normal density (Mean = 0, StDev = 1) plotted against X, with a tail area of 0.05 marked at X = -1.645.]

b. The true population proportion of customers who live in an urban area exceeds 40%.

Location  Count  Percent
Rural     13     26.00
Suburban  15     30.00
Urban     22     44.00
N=        50

Test and CI for One Proportion
Test of p = 0.4 vs p > 0.4

Sample  X   N   Sample p  95% Lower Bound  Z-Value  P-Value
1       22  50  0.440000  0.324532         0.58     0.282

Test and CI for One Proportion

Sample  X   N   Sample p  95% CI
1       22  50  0.440000  (0.302411, 0.577589)

c. The average (mean) number of years lived in the current home is less than 13 years

Descriptive Statistics: Years

Variable  Mean    StDev  Minimum  Maximum
Years     12.260  5.086  1.000    20.000

One-Sample Z: Years
Test of mu = 13 vs < 13
The assumed standard deviation = 5.086

Variable  N   Mean    StDev  SE Mean  95% Upper Bound  Z      P
Years     50  12.260  5.086  0.719    13.443           -1.03  0.152

[Distribution plot: standard normal density (Mean = 0, StDev = 1) plotted against X, with a tail area of 0.05 marked at X = -1.645.]
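The "95% Upper Bound" that Minitab reports for the lower-tailed tests is a one-sided confidence bound. A minimal Python sketch, assuming the bound is the sample mean plus the 95th-percentile z value times the standard error, reproduces the values shown for Years and Income; `upper_bound` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def upper_bound(xbar, s, n, confidence=0.95):
    """One-sided upper confidence bound for a mean, using the z distribution."""
    z = NormalDist().inv_cdf(confidence)  # about 1.645 for 95%
    return xbar + z * s / sqrt(n)


years_bound = upper_bound(12.260, 5.086, 50)
income_bound = upper_bound(43.74, 14.64, 50)

print(f"Years upper bound:  {years_bound:.3f}")
print(f"Income upper bound: {income_bound:.2f}")
```

The bound gives the same decision as the test: the hypothesized 13 years lies below the Years upper bound, so H0 is not rejected, while the hypothesized income of 50 lies above the Income upper bound, so H0 is rejected.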

d. The average (mean) credit balance for suburban customers is more than $4300

Descriptive Statistics: Credit Balance($)

Variable           Mean  StDev  Minimum  Maximum
Credit Balance($)  3970  932    1864     5678

One-Sample Z: Credit Balance($)
Test of mu = 4300 vs > 4300
The assumed standard deviation = 932

Variable           N   Mean  StDev  SE Mean  95% Lower Bound  Z      P
Credit Balance($)  50  3970  932    132      3754             -2.50  0.994

[Distribution plot: standard normal density (Mean = 0, StDev = 1) plotted against X, with a tail area of 0.05 marked at X = 1.645.]
