TO:
SIR SHAHID MAHMOOD BY: ABUZAR TABASSUM M.SC ZOOLOGY 3RD SEMESTER 10040814-017
Chi-Square Test
A fundamental problem in genetics is determining whether experimentally obtained data fit the results expected from theory (i.e., Mendel's laws as expressed in the Punnett square). The chi-square test is a statistical method used to determine goodness of fit. Goodness of fit refers to how close the observed data are to those predicted from a hypothesis.
Goodness of Fit
Mendel had no way of solving this problem. Shortly after the rediscovery of his work in 1900, Karl Pearson and R.A. Fisher developed the chi-square test for this purpose. The chi-square test is a goodness of fit test: it answers the question of how well experimental data fit expectations. We start with a theory for how the offspring will be distributed: the null hypothesis. As an example, we will discuss the offspring of a self-pollinated heterozygote. The null hypothesis is that the offspring will appear in a ratio of 3/4 dominant to 1/4 recessive.
Formula
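To calculate the chi-square statistic, the following formula is used: $\chi^2 = \sum \frac{(obs - exp)^2}{exp}$, where obs is the observed count in each class of offspring, exp is the expected count, and the sum runs over all classes.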
How can you tell if an observed set of offspring counts is legitimately the result of a given underlying simple ratio? For example, you do a cross and see 290 purple flowers and 110 white flowers in the offspring. This is pretty close to a 3/4 : 1/4 ratio, but how do you formally define "pretty close"? What about 250:150?
Example
As an example, you count F2 offspring and get 290 purple and 110 white flowers. This is a total of 400 (290 + 110) offspring. We expect a 3/4 : 1/4 ratio. We need to calculate the expected numbers; this is done by multiplying the total offspring by the expected proportions. Thus we expect 400 * 3/4 = 300 purple, and 400 * 1/4 = 100 white. For purple, obs = 290 and exp = 300. For white, obs = 110 and exp = 100. Now it's just a matter of plugging into the formula: $\chi^2 = (290 - 300)^2 / 300 + (110 - 100)^2 / 100 = (-10)^2 / 300 + (10)^2 / 100 = 100 / 300 + 100 / 100 = 0.333 + 1.000 = 1.333$. This is our chi-square value; now we need to see what it means and how to use it.
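The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name `chi_square` is an illustrative choice and does not come from the lecture.

```python
def chi_square(observed, expected):
    """Sum of (obs - exp)^2 / exp over all classes of offspring."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

total = 290 + 110                           # 400 F2 offspring
expected = [total * 3 / 4, total * 1 / 4]   # 300 purple, 100 white
chi2 = chi_square([290, 110], expected)
print(round(chi2, 3))                       # 1.333
```

Calling `chi_square([250, 150], expected)` for the other cross mentioned earlier gives about 33.3, a far larger deviation from the 3/4 : 1/4 expectation.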
Reasonable
What counts as a reasonable result is subjective and arbitrary. For most work, a result is said to not differ significantly from expectations if it could happen at least 1 time in 20. That is, if the difference between the observed results and the expected results is small enough that a difference at least that large would be seen at least 1 time in 20 over thousands of experiments, we fail to reject the null hypothesis. For technical reasons, we say "fail to reject" instead of "accept". 1 time in 20 can be written as a probability value p = 0.05, because 1/20 = 0.05. Another way of putting this: if your experimental results deviate from expectation more than 95% of all similar results would, the null hypothesis is rejected, because you may have used an incorrect null hypothesis.
The test statistic is compared to a theoretical probability distribution. In order to use this distribution properly, you need to determine the degrees of freedom. If the probability read from the table is greater than 0.05 (5%), the hypothesis is not rejected and the data are consistent with it. This hypothesis is termed the null hypothesis, which states that there is no statistically significant deviation between the observed and expected data.
Degrees of Freedom
A critical factor in using the chi-square test is the degrees of freedom. Degrees of freedom is the number of phenotypic possibilities (classes of offspring) in your cross minus one. For our example, there are 2 classes of offspring: purple and white. Thus, degrees of freedom (d.f.) = 2 - 1 = 1.
Critical Chi-Square
Critical values for chi-square are found in tables, sorted by degrees of freedom and probability levels. Be sure to use p = 0.05. If your calculated chi-square value is greater than the critical value from the table, you reject the null hypothesis. If your chi-square value is less than the critical value, you fail to reject the null hypothesis (that is, your data are consistent with your genetic theory about the expected ratio).
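The decision rule can be sketched in Python as follows. The dictionary holds the p = 0.05 critical values for 1 to 4 degrees of freedom as they appear in the chi-square table of this presentation; the names `CRITICAL_AT_P05` and `decide` are hypothetical helpers introduced only for illustration.

```python
# Critical chi-square values at p = 0.05, copied from the table in this
# presentation (degrees of freedom 1 through 4).
CRITICAL_AT_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49}

def decide(chi2, n_classes):
    """Return the decision for a calculated chi-square value."""
    df = n_classes - 1                  # degrees of freedom
    critical = CRITICAL_AT_P05[df]
    if chi2 > critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(1.333, 2))   # fail to reject the null hypothesis
```

For the purple/white example, 1.333 is below 3.84, so the 3/4 : 1/4 hypothesis is not rejected.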
Chi-Square Table
Expected ratio 9:3:3:1 for the phenotypes round yellow : round green : wrinkled yellow : wrinkled green.
Note: The wild-type allele is designated with a + sign. Recessive mutant alleles are designated with lowercase letters.
The Cross:
A cross is made between two true-breeding flies (c+c+e+e+ and ccee). The flies of the F1 generation are then allowed to mate with each other to produce an F2 generation.
The outcome
F1 generation: all offspring have straight wings and gray bodies.
F2 generation:
193 straight wings, gray bodies
69 straight wings, ebony bodies
64 curved wings, gray bodies
26 curved wings, ebony bodies
352 total flies
Step 2: Calculate the expected values of the four phenotypes, based on the hypothesis
According to our hypothesis, there should be a 9:3:3:1 ratio in the F2 generation.
Phenotype                      Expected probability   Expected number      Observed number
straight wings, gray bodies    9/16                   9/16 x 352 = 198     193
straight wings, ebony bodies   3/16                   3/16 x 352 = 66      69
curved wings, gray bodies      3/16                   3/16 x 352 = 66      64
curved wings, ebony bodies     1/16                   1/16 x 352 = 22      26
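Step 3: Apply the chi square formula
$\chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \frac{(O_4 - E_4)^2}{E_4}$
$\chi^2 = \frac{(193 - 198)^2}{198} + \frac{(69 - 66)^2}{66} + \frac{(64 - 66)^2}{66} + \frac{(26 - 22)^2}{22}$
$\chi^2 = 0.13 + 0.14 + 0.06 + 0.73 = 1.06$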
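As a cross-check of Steps 2 and 3, the expected counts and the chi square value for the fly cross can be computed directly. This is a minimal sketch; `expected_counts` is a hypothetical helper name. Because the code sums the unrounded terms, it gives 1.05 rather than the 1.06 obtained by rounding each term to two decimals first.

```python
def expected_counts(ratio, total):
    """Split `total` offspring according to a ratio such as 9:3:3:1."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

observed = [193, 69, 64, 26]            # F2 counts from the cross
expected = expected_counts([9, 3, 3, 1], sum(observed))   # 198, 66, 66, 22

terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(terms)
print([round(t, 2) for t in terms])     # [0.13, 0.14, 0.06, 0.73]
print(round(chi2, 2))                   # 1.05
```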
Step 4: Interpret the chi square value
The calculated chi square value can be used to obtain probabilities, or P values, from a chi square table.
These probabilities allow us to determine the likelihood that the observed deviations are due to random chance alone.
Low chi square values indicate a high probability that the observed deviations could be due to random chance alone. High chi square values indicate a low probability that the observed deviations are due to random chance alone. If the chi square value results in a probability that is less than 0.05 (i.e., less than 5%), it is considered statistically significant and the hypothesis is rejected.
Before we can use the chi square table, we have to determine the degrees of freedom (df). The df is a measure of the number of categories that are independent of each other. If you know 3 of the 4 categories, you can deduce the 4th. df = n - 1, where n = total number of categories. In our experiment, there are four phenotypes/categories. Therefore, df = 4 - 1 = 3. Refer to the chi square table and locate the calculated chi square value of 1.06.
With df = 3, the chi square value of 1.06 is slightly greater than 1.005 (which corresponds to P-value = 0.80), so its P value is slightly below 0.80. A P-value of 0.80 means that chi-square values equal to or greater than 1.005 are expected to occur 80% of the time due to random chance alone; that is, when the null hypothesis is true. Therefore, it is quite probable that the deviations between the observed and expected values in this experiment can be explained by random sampling error, and the null hypothesis is not rejected. What was the null hypothesis?
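Instead of bracketing the P value from a printed table, it can also be computed. The sketch below uses only Python's standard library and the closed-form expressions for the upper tail of the chi-square distribution with whole-number degrees of freedom. The function name `chi2_p_value` is an assumption made for illustration, and this approach is an alternative to the table lookup used in the lecture, not part of it.

```python
import math

def chi2_p_value(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        return math.exp(-half) * sum(half ** k / math.factorial(k)
                                     for k in range(df // 2))
    # Odd df: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    tail = math.erfc(math.sqrt(half))
    series = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                 for k in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * series

p = chi2_p_value(1.06, 3)
print(round(p, 2))   # about 0.79, i.e. between 0.70 and 0.80
```

As a sanity check, `chi2_p_value(7.815, 3)` comes out very close to 0.05, matching the critical value in the table.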
If your hypothesis is supported by the data, you are claiming that mating is random and that segregation and independent assortment are occurring as expected. If your hypothesis is not supported by the data, the deviation between observed and expected values is too large to attribute to chance: something non-random must be occurring.
F1 x F1: observed offspring counts are 5610, 1881, 1896, and 622.
Using the chi square formula, $\chi^2 = \sum \frac{(obs - exp)^2}{exp}$, compute the chi square total for this cross:
$(5610 - 5630)^2 / 5630 = 0.07$
$(1881 - 1877)^2 / 1877 = 0.01$
$(1896 - 1877)^2 / 1877 = 0.19$
$(622 - 626)^2 / 626 = 0.03$
$\chi^2 = 0.30$
How many degrees of freedom? 3
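The four terms for this cross can be recomputed with a short script. It is a sketch that takes the rounded expected counts (5630, 1877, 1877, 626) as given in the slide and prints each term to two decimals.

```python
observed = [5610, 1881, 1896, 622]
expected = [5630, 1877, 1877, 626]      # rounded expected counts from the slide

terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(terms)
df = len(observed) - 1                  # four classes, so 3 degrees of freedom

for o, e, t in zip(observed, expected, terms):
    print(f"({o} - {e})^2 / {e} = {t:.2f}")
print(f"chi-square = {chi2:.2f}, df = {df}")
```

The total comes out at about 0.30 with 3 degrees of freedom.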
Chi-square values by probability (P, rows) and degrees of freedom (df, columns)
P       df=1    df=2    df=3    df=4    df=5    df=6    df=7    df=8    df=9    df=10
Accept Hypothesis
0.95    0.004   0.10    0.35    0.71    1.14    1.63    2.17    2.73    3.32    3.94
0.90    0.02    0.21    0.58    1.06    1.61    2.20    2.83    3.49    4.17    4.86
0.80    0.06    0.45    1.01    1.65    2.34    3.07    3.82    4.59    5.38    6.18
0.70    0.15    0.71    1.42    2.20    3.00    3.83    4.67    5.53    6.39    7.27
0.50    0.46    1.39    2.37    3.36    4.35    5.35    6.35    7.34    8.34    9.34
0.30    1.07    2.41    3.66    4.88    6.06    7.23    8.38    9.52    10.66   11.78
0.20    1.64    3.22    4.64    5.99    7.29    8.56    9.80    11.03   12.24   13.44
0.10    2.71    4.60    6.25    7.78    9.24    10.64   12.02   13.36   14.68   15.99
Reject Hypothesis
0.05    3.84    5.99    7.82    9.49    11.07   12.59   14.07   15.51   16.92   18.31
0.01    6.64    9.21    11.34   13.28   15.09   16.81   18.48   20.09   21.67   23.21
0.001   10.83   13.82   16.27   18.47   20.52   22.46   24.32   26.12   27.88   29.59
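Reading the table amounts to finding the two neighbouring entries that bracket the calculated chi-square value. A minimal sketch of that lookup follows, with the values for 1 to 3 degrees of freedom copied from the table above; `P_LEVELS`, `TABLE`, and `bracket_p` are hypothetical names used only here.

```python
P_LEVELS = [0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]

# Chi-square values for each P level, copied from the table above (df 1-3 only).
TABLE = {
    1: [0.004, 0.02, 0.06, 0.15, 0.46, 1.07, 1.64, 2.71, 3.84, 6.64, 10.83],
    2: [0.10, 0.21, 0.45, 0.71, 1.39, 2.41, 3.22, 4.60, 5.99, 9.21, 13.82],
    3: [0.35, 0.58, 1.01, 1.42, 2.37, 3.66, 4.64, 6.25, 7.82, 11.34, 16.27],
}

def bracket_p(chi2, df):
    """Return (lower, upper) bounds on P; None means that bound is off the table."""
    upper = None                        # P level of the largest table value <= chi2
    for p, value in zip(P_LEVELS, TABLE[df]):
        if chi2 >= value:
            upper = p
        else:
            return (p, upper)           # first table value larger than chi2
    return (None, upper)                # chi2 exceeds every value in the row

print(bracket_p(1.06, 3))   # (0.7, 0.8)
print(bracket_p(0.30, 3))   # (0.95, None)
```

A result of `(0.95, None)` means the P value is above 0.95, the largest probability listed in the table.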
When reporting chi square data, use the following formula sentence.
With ___ degrees of freedom, my chi square value is ___, which gives me a p value between ___% and ___%. I therefore ___ my null hypothesis.
This sentence would go in the results section of your formal lab. Your explanation of the significance of this data would go in the discussion section of the formal lab.
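For illustration, the reporting sentence can be filled in programmatically. This sketch assumes the blanks are, in order, the degrees of freedom, the chi square value, the lower and upper P bounds as percentages, and the decision; the function name `report` is hypothetical.

```python
def report(df, chi2, p_low, p_high, reject):
    """Fill in the reporting sentence; p_low and p_high are percentages."""
    decision = "reject" if reject else "fail to reject"
    return (f"With {df} degrees of freedom, my chi square value is {chi2:.2f}, "
            f"which gives me a p value between {p_low}% and {p_high}%. "
            f"I therefore {decision} my null hypothesis.")

# Fly cross from earlier: df = 3, chi square = 1.06, P between 70% and 80%.
print(report(3, 1.06, 70, 80, reject=False))
```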
Looking this statistic up in the chi square distribution table tells us the following: with 3 degrees of freedom, our chi square value of .30 falls close to the entry for P = .95 (95%). This means that, if the null hypothesis is true, deviations at least this large would arise from random chance about 95% of the time. We therefore accept (fail to reject) our null hypothesis.
What is the critical value at which we would reject the null hypothesis? For three degrees of freedom, we reject when our chi square value is > 7.815. What if our chi square value was 8.0 with 4 degrees of freedom: do we accept or reject the null hypothesis? Accept, since 8.0 is below the critical value of 9.49 for 4 degrees of freedom.