Why is hypothesis testing so
important?
• The research hypothesis is a specific version of
the research question that summarizes the
main elements of the study (sample, predictor
and outcome variables).
• It is the basis of the tests of statistical significance.
• A good hypothesis must be based on a good
research question. It should also be simple,
specific, and stated in advance.
Case I
• Age at menarche (age of starting menstrual
periods) is an important risk factor for breast
cancer and possibly ovarian cancer. In general,
women with earlier age at menarche have a
higher incidence of breast cancer. The long-
term trend in developed countries is that age
at menarche has been declining over the past
50 years. One hypothesis is that women with
higher childhood socioeconomic status (SES)
have an earlier age at menarche.
Questions for case I:
• What are the dependent and independent variables
in the study? What is the appropriate scale of
measurement for each variable?
• Are the observations paired or unpaired?
• What are the null hypothesis and the alternative
hypothesis? Is the test one-sided or two-sided?
• What is the appropriate hypothesis test
(parametric or nonparametric) that we can use to test
the null hypothesis?
• Describe the data using descriptive statistics.
Answer
• Dependent: age at menarche (years,
numerical)
Independent: childhood socioeconomic status
(lower and higher, nominal)
• Unpaired observations (no matching procedure)
• H0: mean age at menarche, higher SES = mean age at
menarche, lower SES
H1: mean age at menarche, higher SES < mean age at
menarche, lower SES (one-sided)
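The one-sided comparison above can be sketched in code. This is a minimal illustration, assuming a parametric unpaired t-test with pooled variance is appropriate (the slides do not state which test was chosen). The function name `pooled_t_statistic` and the data values are hypothetical, made up for illustration only. The p-value lookup is left out, since it needs a t-distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for an unpaired two-sample t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


# Hypothetical ages at menarche in years (not from the study)
higher_ses = [12.1, 12.5, 11.8, 12.9, 12.3, 12.0]
lower_ses = [13.0, 12.8, 13.4, 12.6, 13.1, 13.3]

t, df = pooled_t_statistic(higher_ses, lower_ses)
print(f"t = {t:.3f}, df = {df}")
```

Because H1 states that the higher-SES mean is smaller, only a sufficiently negative t (compared with the lower-tail critical value for the given df) would lead to rejecting H0.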
Test of distribution
Variable               Descriptive statistic, mean (SD)
Temperature, before    102.2 (0.7989)
Temperature, after     100.45 (0.5649)
Case III
• Researchers want to compare protein intake
(mg) among three groups of postmenopausal
women:
– Women eating a standard American diet (SAD)
– Women eating a lacto-ovo-vegetarian diet (LAC)
– Women eating a strict vegetarian diet (VEG)
Questions for case III:
• What are the dependent and independent
variables in the study? What is the
appropriate scale of measurement for each
variable?
• Are the observations paired or unpaired?
• What are the null hypothesis and the alternative
hypothesis?
• What is the appropriate parametric test that
we can use to test the null hypothesis?
Answer
• Dependent: protein intake (mg, numerical)
Independent: diet type (SAD, LAC, and VEG,
nominal)
• Unpaired observations (no matching procedure)
• H0: mean protein intake is equal in all three groups
(SAD = LAC = VEG)
H1: at least one pair of groups differs in mean
protein intake
Test of distribution
Diet group    Protein intake, mean (SD)
SAD           74.7 (5.056)
LAC           56.7 (5.559)
VEG           46.7 (5.559)
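The three-group comparison in Case III can be sketched as a one-way ANOVA. This is a minimal illustration under the usual ANOVA assumptions (independent groups, roughly normal data, similar variances). The function name `one_way_anova_f` and the data values are hypothetical and are not the study's raw data. Only the F statistic and its degrees of freedom are computed; the p-value would come from an F-distribution table or a statistics library.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical protein intakes, in the units used on the slide
sad = [75.0, 70.2, 80.1, 72.5, 76.3]
lac = [57.1, 52.4, 60.8, 55.0, 58.2]
veg = [46.0, 41.9, 50.3, 44.7, 48.1]

f, df_b, df_w = one_way_anova_f(sad, lac, veg)
print(f"F = {f:.2f}, df = ({df_b}, {df_w})")
```

A large F only indicates that at least one group mean differs; identifying which pairs differ requires a follow-up (post hoc) comparison.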
If I have one dependent variable, which statistical test do I use?
[Flowchart: a series of yes/no decisions leading to either the t-test or ANOVA]