Parameters and statistics are defined for Population 1 (likewise for Population 2); the parameter of interest is the difference $\mu_1 - \mu_2$.
[Formulas and degrees-of-freedom expressions from these slides were not recoverable.]
F Distribution
The F distribution is similar to the $\chi^2$ distribution in that
it starts at zero (is non-negative) and is not symmetrical.
Two parameters define this distribution and, as we've
already seen, these are again degrees of freedom:
$\nu_1$ is the numerator degrees of freedom and
$\nu_2$ is the denominator degrees of freedom.
Determining Values of F
For example, what is the value of F for 5% of the area under
the right hand tail of the curve, with a numerator degree
of freedom of 3 and a denominator degree of freedom of 7?
Solution: use the F look-up (Table 6).
There are different tables for different values of A.
Make sure you start with the correct table!
$F_{.05,3,7} = 4.35$
Determining Values of F
For areas under the curve on the left hand side of the
curve, we can leverage the following relationship:
$F_{1-A,\nu_1,\nu_2} = \dfrac{1}{F_{A,\nu_2,\nu_1}}$
(note that the two degrees of freedom swap places on the right-hand side)
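As a rough cross-check of the table lookup, the area under the F density can be integrated numerically. The sketch below uses only the Python standard library and Simpson's rule; the helper names `f_pdf` and `right_tail_area` are illustrative assumptions, not part of Table 6 or of any package. It checks that roughly 5% of the area lies to the right of 4.35 for 3 and 7 degrees of freedom, and that swapping the degrees of freedom and taking the reciprocal of the table value gives the matching left-tail point, $F_{1-A,\nu_1,\nu_2} = 1/F_{A,\nu_2,\nu_1}$.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution; d1 = numerator df, d2 = denominator df."""
    if x <= 0:
        return 0.0  # assumption: d1 > 2, so the density is 0 at the origin
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = 0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                     - (d1 + d2) * math.log(d1 * x + d2)) - math.log(x) - log_beta
    return math.exp(log_pdf)

def right_tail_area(x, d1, d2, steps=20000):
    """Area to the right of x, via Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

# Table value quoted above: F(.05; 3, 7) = 4.35
area_right = right_tail_area(4.35, 3, 7)

# Left-tail counterpart with the degrees of freedom swapped: F(.95; 7, 3) = 1 / 4.35
area_left = 1.0 - right_tail_area(1 / 4.35, 7, 3)

print(round(area_right, 3), round(area_left, 3))
```

Both areas should come out close to 0.05; small deviations reflect the two-decimal rounding of the table value and the numerical integration.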
Example 13.1
Millions of investors buy mutual funds, choosing from
thousands of possibilities.
Some funds can be purchased directly from banks or
other financial institutions while others must be
purchased through brokers, who charge a fee for this
service.
This raises the question: can investors do better by
buying mutual funds directly than by purchasing them
through brokers?
Example 13.1
To help answer this question, a group of researchers
randomly sampled the annual returns from mutual funds
that can be acquired directly and mutual funds that are
bought through brokers, and recorded the net annual
returns, which are the returns on investment after
deducting all relevant fees (Data File: Xm13-01).
Example 13.1
To answer the question we need to compare the
population of returns from directly purchased mutual funds
with the population of returns from broker-bought mutual funds.
Example 13.1
The hypothesis to be tested is that the mean net annual
return from directly-purchased mutual funds ($\mu_1$) is
larger than the mean of broker-purchased funds ($\mu_2$).
Hence the alternative hypothesis is
$H_1: \mu_1 - \mu_2 > 0$
and
$H_0: \mu_1 - \mu_2 = 0$
To decide which of the t-tests of $\mu_1 - \mu_2$ to apply, we
conduct the F-test of $\sigma_1^2/\sigma_2^2$.
Example 13.1
From the data we calculated the following statistics:
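$s_1^2 = 37.49$ and $s_2^2 = 43.34$
Test statistic: $F = s_1^2/s_2^2 = 37.49/43.34 = 0.86$
Rejection region: $F > F_{\alpha/2,\nu_1,\nu_2}$ or $F < F_{1-\alpha/2,\nu_1,\nu_2}$, with $\nu_1 = \nu_2 = 49$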
Example 13.1
Click Data, Data Analysis, and F-Test Two-Sample for
Variances
Example 13.1
F-Test Two-Sample for Variances

                         Direct     Broker
Mean                     6.63       3.72
Variance                 37.49      43.34
Observations             50         50
df                       49         49
F                        0.86
P(F<=f) one-tail         0.3068
F Critical one-tail      0.6222
Example 13.1
There is not enough evidence to infer that the population
variances differ. It follows that we must apply the equal-variances t-test of $\mu_1 - \mu_2$.
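The F statistic in this example is just the ratio of the two sample variances, so it can be reproduced from the summary figures. The sketch below is a minimal illustration using the values quoted in the example; the variable names are ours, and the critical value is simply copied from the printout rather than computed.

```python
# Sample variances and sizes reported in the example
s1_sq, s2_sq = 37.49, 43.34
n1, n2 = 50, 50

f_stat = s1_sq / s2_sq             # ratio of sample variances
df_num, df_den = n1 - 1, n2 - 1    # numerator and denominator degrees of freedom

# One-tail critical value copied from the printout (lower tail, since F < 1)
f_crit_lower = 0.6222
reject_lower_tail = f_stat < f_crit_lower

print(round(f_stat, 3), df_num, df_den, reject_lower_tail)
```

From the rounded variances the ratio is about 0.865, which agrees with the reported 0.86 up to rounding (the printout presumably works from unrounded variances). The statistic is not below the lower critical value, which is consistent with the conclusion that the variances do not appear to differ.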
Example 13.1
Click Data, Data Analysis, t-Test: Two-Sample Assuming Equal Variances
Example 13.1
t-Test: Two-Sample Assuming Equal Variances

                               Direct     Broker
Mean                           6.63       3.72
Variance                       37.49      43.34
Observations                   50         50
Pooled Variance                40.41
Hypothesized Mean Difference   0
df                             98
t Stat                         2.29
P(T<=t) one-tail               0.0122
t Critical one-tail            1.6606
P(T<=t) two-tail               0.0243
t Critical two-tail            1.9845
Example 13.1
The value of the test statistic is 2.29. The one-tail p-value is .0122.
We observe that the p-value of the test is small (and the
test statistic falls into the rejection region).
As a result we conclude that there is sufficient evidence
to infer that on average directly-purchased mutual funds
outperform broker-purchased mutual funds.
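The pooled variance and t statistic in the printout can be rebuilt from the summary statistics alone. The following is a minimal sketch, assuming the usual equal-variances formulas; `pooled_t` is a hypothetical helper name, and the critical value is copied from the printout rather than computed.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Equal-variances t statistic for mu1 - mu2 from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_err
    return t_stat, pooled_var, df

# Summary statistics from the printout (Direct vs Broker)
t_stat, pooled_var, df = pooled_t(6.63, 37.49, 50, 3.72, 43.34, 50)

t_crit_one_tail = 1.6606  # copied from the printout
print(round(t_stat, 2), round(pooled_var, 3), df, t_stat > t_crit_one_tail)
```

The results agree with the printout up to rounding of the inputs: a pooled variance near 40.41, 98 degrees of freedom, and a t statistic near 2.29, which exceeds the one-tail critical value.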
Summary: Case 1
Factors that identify the equal-variances t-test and estimator
of $\mu_1 - \mu_2$ (refer to the equal-variances case for the appropriate
d.f.):
Summary: Case 2
Factors that identify the unequal-variances t-test and estimator
of $\mu_1 - \mu_2$ (refer to the unequal-variances case for the appropriate
d.f.):
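For the unequal-variances case, the degrees of freedom are usually approximated rather than taken as $n_1 + n_2 - 2$. The sketch below assumes the standard Welch-Satterthwaite approximation; `welch_t` is a hypothetical helper name. It reuses Example 13.1's summary statistics purely for illustration, even though that example was analyzed with the equal-variances test.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Unequal-variances t statistic and approximate degrees of freedom."""
    a, b = var1 / n1, var2 / n2
    t_stat = ((mean1 - mean2) - hypothesized_diff) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_stat, df

# Illustration only: Example 13.1's summary statistics
t_stat, df = welch_t(6.63, 37.49, 50, 3.72, 43.34, 50)
print(round(t_stat, 2), round(df, 1))
```

With equal sample sizes and similar variances, the two cases give nearly the same statistic and degrees of freedom, so the choice of case matters most when the variances or the sample sizes differ markedly.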
Example 13.4
In the last few years, a number of web-based companies
that offer job placement services have been created.
The manager of one such company wanted to investigate
the job offers recent MBAs were obtaining.
In particular, she wanted to know whether finance
majors were being offered higher salaries than marketing
majors.
Example 13.4
In a preliminary study she randomly sampled 50 recently
graduated MBAs, half of whom majored in finance and
half in marketing.
Example 13.4
The parameter is the difference between two means
(where $\mu_1$ = mean highest salary offer to finance majors
and $\mu_2$ = mean highest salary offer to marketing majors).
Example 13.4
The hypotheses are
$H_0: (\mu_1 - \mu_2) = 0$
vs
$H_1: (\mu_1 - \mu_2) > 0$
Example 13.4
t-Test: Two-Sample Assuming Equal Variances

                               Finance        Marketing
Mean                           65,624         60,423
Variance                       360,433,294    262,228,559
Observations                   25             25
Pooled Variance                311,330,926
Hypothesized Mean Difference   0
df                             48
t Stat                         1.04
P(T<=t) one-tail               0.1513
t Critical one-tail            1.6772
P(T<=t) two-tail               0.3026
t Critical two-tail            2.0106
Example 13.4
The value of the test statistic (t = 1.04) and its one-tail p-value
(.1513) indicate that there is very little evidence to
support the hypothesis that finance majors attract higher
salary offers than marketing majors.
Example 13.5
Suppose now that we redo the experiment in the
following way.
We examine the transcripts of finance and marketing
MBA majors.
We randomly sample a finance and a marketing major
whose grade point average (GPA) falls between 3.92
and 4 (based on a maximum of 4).
We then randomly sample a finance and a marketing
major whose GPA is between 3.84 and 3.92.
Example 13.5
We continue this process until the 25th pair of finance
and marketing majors is selected, whose GPAs fall
between 2.0 and 2.08.
(The minimum GPA required for graduation is 2.0.)
As we did in Example 13.4, we recorded the highest
salary offer (Data File: Xm13-05).
Can we conclude from these data that finance majors
draw larger salary offers than do marketing majors?
Example 13.5
The experiment described in Example 13.4 is one in which
the samples are independent.
That is, there is no relationship between the observations in
one sample and the observations in the second sample.
However, in this example the experiment was designed in
such a way that each observation in one sample is matched
with an observation in the other sample.
The matching is conducted by selecting finance and
marketing majors with similar GPAs.
Thus, it is logical to compare the salary offers for finance and
marketing majors in each group.
This type of experiment is called matched pairs.
Example 13.5
For each GPA group, we calculate the matched pair
difference between the salary offers for finance and
marketing majors.
Example 13.5
The numbers in black are the original starting salary data
(Xm13-05); the numbers in blue were calculated.
Although each student is either in Finance or in
Marketing (i.e. the two groups are independent), grouping
the data in this fashion makes it a matched
pairs experiment (i.e. the two students in
group #1 are matched by their GPA range).
The difference of the means is equal to the mean of the differences, hence we will consider
the mean of the paired differences as our parameter of interest: $\mu_D$.
Example 13.5
Do Finance majors have higher salary offers than
Marketing majors?
Since $\mu_D = \mu_1 - \mu_2$,
we want to research this hypothesis: $H_1: \mu_D > 0$
(and our null hypothesis becomes $H_0: \mu_D = 0$).
Example 13.5
Click Data, Data Analysis, t-Test: Paired Two Sample for
Means
Example 13.5
t-Test: Paired Two Sample for Means

                               Finance        Marketing
Mean                           65,438         60,374
Variance                       444,981,810    469,441,785
Observations                   25             25
Pearson Correlation            0.9520
Hypothesized Mean Difference   0
df                             24
t Stat                         3.81
P(T<=t) one-tail               0.0004
t Critical one-tail            1.7109
P(T<=t) two-tail               0.0009
t Critical two-tail            2.0639
Example 13.5
The p-value is .0004. There is overwhelming evidence
that Finance majors do obtain higher starting salary
offers than their peers in Marketing.
t-Estimate: Mean

                         Difference
Mean                     5065
Standard Deviation       6647
LCL                      2321
UCL                      7808
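Both the paired t statistic and the interval estimate depend only on the mean and standard deviation of the 25 differences. The sketch below is a minimal illustration under that assumption; the variable names are ours, the critical value 2.0639 is copied from the paired-test printout, and the 95% confidence level is inferred from that value rather than stated in the source.

```python
import math

# Summary of the 25 paired differences (Finance minus Marketing) from the printout
n_d = 25
mean_d = 5065
std_d = 6647

std_err = std_d / math.sqrt(n_d)
t_stat = mean_d / std_err            # tests H0: mu_D = 0
df = n_d - 1

t_crit_two_tail = 2.0639             # two-tail critical value for 24 df, from the printout
lcl = mean_d - t_crit_two_tail * std_err
ucl = mean_d + t_crit_two_tail * std_err

print(round(t_stat, 2), df, round(lcl, 1), round(ucl, 1))
```

The statistic comes out near 3.81 and the limits near 2321 and 7808, matching the printouts up to rounding of the summary figures.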
Sampling Distribution
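The statistic $\hat{p}_1 - \hat{p}_2$ is approximately normally distributed
if the sample sizes are large enough so that $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, and $n_2(1 - p_2)$ are all greater than or equal to 5.
Since it is approximately normal, we can describe the
normal distribution in terms of mean and variance:
$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$ and $V(\hat{p}_1 - \hat{p}_2) = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}$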
Example 13.9
The General Products Company produces and sells a bath
soap, which is not selling well.
Hoping to improve sales, General Products decided to
introduce more attractive packaging.
Example 13.9
The first design features several bright colors to distinguish
it from other brands.
The second design is light green in color with just the
company's logo on it.
As a test to determine which design is better, the marketing
manager selected two supermarkets.
In one supermarket the soap was packaged in a box using
the first design, and in the second supermarket the second
design was used.
Example 13.9
The product scanner at each supermarket tracked every
buyer of soap over a one-week period.
The supermarkets recorded the last four digits of the
scanner code for each of the five brands of soap the
supermarket sold (Data File: Xm13-09).
Example 13.9
After the trial period the scanner data were transferred to
a computer file.
Because the first design is more expensive, management
has decided to use this design only if there is sufficient
evidence to allow them to conclude that it is better.
Example 13.9
The problem objective is to compare two populations. The
first is the population of soap sales in supermarket 1 and the
second is the population of soap sales in supermarket 2.
The data are nominal because the values are "buy General
Products soap" and "buy other companies' soap."
These two factors tell us that the parameter to be tested is the
difference between two population proportions $p_1 - p_2$ (where
$p_1$ and $p_2$ are the proportions of soap sales that are a General
Products brand in supermarkets 1 and 2, respectively).
Example 13.9
Because we want to know whether there is enough evidence
to adopt the brightly-colored design, the alternative
hypothesis is
$H_1: (p_1 - p_2) > 0$
The null hypothesis must be
$H_0: (p_1 - p_2) = 0$
which tells us that this is an application of Case 1. Thus, the test
statistic is
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Example 13.9
z-Test: Two Proportions

                         Supermarket 1   Supermarket 2
Sample Proportions       0.1991          0.1493
Observations             904             1038
Hypothesized Difference  0
z Stat                   2.90
P(Z<=z) one tail         0.0019
z Critical one-tail      1.6449
P(Z<=z) two-tail         0.0038
z Critical two-tail      1.96
Example 13.9
The value of the test statistic is z = 2.90; its p-value is
.0019. There is enough evidence to infer that the brightly-colored design is more popular than the simple design.
As a result, it is recommended that management switch to
the first design.
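The Case 1 statistic can be reproduced from the sample proportions and sample sizes in the printout. The sketch below is a minimal illustration using only the Python standard library; the variable names are ours, and the pooled proportion is rebuilt from the rounded sample proportions because the raw counts are not given here.

```python
import math
from statistics import NormalDist

p1_hat, n1 = 0.1991, 904    # supermarket 1 (brightly colored design)
p2_hat, n2 = 0.1493, 1038   # supermarket 2 (simple design)

# Pooled proportion, rebuilt from the rounded sample proportions
p_pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z_stat = (p1_hat - p2_hat) / std_err
p_value = 1 - NormalDist().cdf(z_stat)   # one-tail, H1: p1 - p2 > 0

print(round(z_stat, 2), round(p_value, 4))
```

The statistic and one-tail p-value agree with the printout (about 2.90 and .0019) up to rounding.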
Example 13.10
Suppose in our test marketing of soap packages scenario
that instead of just a difference between the two package
versions, the brightly colored design had to outsell the
simple design by at least 3%.
Example 13.10
Our research hypothesis now becomes:
$H_1: (p_1 - p_2) > .03$
And so our null hypothesis is: $H_0: (p_1 - p_2) = .03$
Example 13.10
z-Test: Two Proportions

                         Supermarket 1   Supermarket 2
Sample Proportions       0.1991          0.1493
Observations             904             1038
Hypothesized Difference  0.03
z Stat                   1.14
P(Z<=z) one tail         0.1261
z Critical one-tail      1.6449
P(Z<=z) two-tail         0.2522
z Critical two-tail      1.96
Example 13.10
There is not enough evidence to infer that the brightly
colored design outsells the other design by 3% or more.
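When the hypothesized difference is not zero, the null hypothesis no longer says the two proportions are equal, so the pooled proportion of Example 13.9 does not apply. The sketch below assumes the usual unpooled standard error for this situation; the variable names are ours and the inputs are the rounded values from the printout.

```python
import math
from statistics import NormalDist

p1_hat, n1 = 0.1991, 904
p2_hat, n2 = 0.1493, 1038
hypothesized_diff = 0.03

# No pooling here, because H0 does not state that p1 equals p2
std_err = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_stat = ((p1_hat - p2_hat) - hypothesized_diff) / std_err
p_value = 1 - NormalDist().cdf(z_stat)   # one-tail, H1: p1 - p2 > .03

print(round(z_stat, 3), round(p_value, 4))
```

From the rounded proportions the statistic is about 1.145 with a one-tail p-value near .126, in line with the printout's 1.14 and .1261.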
Example 13.11
To help estimate the difference in profitability, the
Marketing manager in the previous two examples
would like to estimate the difference between the two
proportions. A confidence level of 95% is suggested.
Example 13.11
z-Estimate: Two Proportions

                         Supermarket 1   Supermarket 2
Sample Proportions       0.1991          0.1493
Observations             904             1038

LCL                      0.0159
UCL                      0.0837
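The confidence limits can be rebuilt from the same sample proportions. The sketch below assumes the standard large-sample interval estimator of $p_1 - p_2$, which uses the unpooled standard error; the variable names are ours.

```python
import math
from statistics import NormalDist

p1_hat, n1 = 0.1991, 904
p2_hat, n2 = 0.1493, 1038
confidence = 0.95

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
std_err = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

diff = p1_hat - p2_hat
lcl, ucl = diff - z_crit * std_err, diff + z_crit * std_err
print(round(lcl, 4), round(ucl, 4))
```

The limits come out close to the printout's .0159 and .0837, so the brightly colored design's share is estimated to exceed the simple design's by roughly 1.6 to 8.4 percentage points.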