
The ToothGrowth data analysis in R

Domingos Savio Apolonio Santos


Thursday, November 20, 2014

Synopsis
This is part of the project for the Statistical Inference class in the Johns Hopkins Data Science Specialization by Coursera.
This report analyzes the ToothGrowth data in the R datasets package. The goals of this analysis are:
Perform some basic exploratory data analyses and provide a basic summary of the data.
Compare tooth growth by supp and dose using confidence intervals and/or hypothesis tests.
State conclusions about the data and the assumptions needed to reach them.

The ToothGrowth data


The ToothGrowth dataset (https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/ToothGrowth.html) contains data for analyzing the effect of
vitamin C on tooth growth in guinea pigs. The response is the length of odontoblasts (the cells responsible for tooth growth), measured in 10 guinea pigs at each of three dose levels of vitamin
C (0.5, 1, and 2 mg) with each of two delivery methods (orange juice or ascorbic acid).
This data frame has 60 observations and 3 variables:
len - Tooth length (numeric).
supp - Supplement type: VC (vitamin C, given as ascorbic acid) or OJ (orange juice) (factor).
dose - Dose in milligrams (numeric).
The following R code compactly displays the internal structure of the ToothGrowth dataset:
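A minimal sketch of that call, assuming the base R function `str` was used (it prints the dimensions, variable names, and variable types of a data frame):

```r
# Load the dataset shipped with the datasets package
data(ToothGrowth)

# Compactly display the structure of the data frame
str(ToothGrowth)
```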

Basic exploratory data analyses


The following scatterplot shows approximately how the variable len varies with the variable dose for each type of supp:
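A plot of this kind could be produced along the following lines. This is a sketch using base R graphics. The colors, labels, and the name `suppCols` are illustrative choices, not taken from the original report:

```r
data(ToothGrowth)

# Hypothetical color mapping, one color per supplement type
suppCols <- c(OJ = "darkorange", VC = "steelblue")

# Tooth length against dose, colored by supplement
plot(len ~ dose, data = ToothGrowth,
     col = suppCols[as.character(ToothGrowth$supp)], pch = 19,
     xlab = "Dose (mg)", ylab = "Tooth length",
     main = "Tooth length by dose and supplement")
legend("bottomright", legend = names(suppCols),
       col = suppCols, pch = 19, title = "supp")
```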

Basic summary of the data


The following is a summary of the len variable by factor:
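One way to obtain such a summary is sketched below. It assumes the grouping is by supp, with the mean for each supp and dose combination added for reference. The names `lenBySupp` and `lenMeans` are hypothetical:

```r
data(ToothGrowth)

# Five-number summary and mean of len within each supplement type
lenBySupp <- tapply(ToothGrowth$len, ToothGrowth$supp, summary)
lenBySupp

# Mean len for each supplement/dose combination (six groups)
lenMeans <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
lenMeans
```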

The following plot summarizes the data for each supplement:
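A box plot is one common way to draw such a summary. The sketch below assumes base R graphics and one box per supplement. The plot type, labels, and the name `bp` are assumptions:

```r
data(ToothGrowth)

# One box per supplement type; boxplot() also returns the plotted statistics
bp <- boxplot(len ~ supp, data = ToothGrowth,
              xlab = "Supplement", ylab = "Tooth length",
              main = "Tooth length by supplement")
```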

Comparing tooth growth by supp and dose


The hypothesis tests are performed with the R function t.test (http://127.0.0.1:23755/library/stats/html/t.test.html).
The analysis makes the following assumptions:
The guinea pigs were chosen at random, and the groups defined by dose level and delivery method are independent of each other.
The confidence level is the t.test default of 95%.
The t-tests use var.equal = FALSE (Welch's t-test), so the variances of the compared populations are not assumed to be equal.

dimnames=list(combinationNames,c("P-value","Conf low", "Conf high")))
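The fragment above appears to be the tail of a call that collects the test results into a matrix. The sketch below shows how such a table could be built, assuming every pair of supp and dose groups is compared with an unpaired Welch t-test. Only `combinationNames` and the three column labels mirror the fragment. The names `group`, `pairs`, `testPair`, and `results` are hypothetical:

```r
data(ToothGrowth)

# One group per supplement/dose pair, e.g. "OJ.0.5" or "VC.2"
group <- interaction(ToothGrowth$supp, ToothGrowth$dose)

# All pairs of groups: a 2 x 15 character matrix
pairs <- combn(levels(group), 2)
combinationNames <- paste(pairs[1, ], pairs[2, ], sep = "~")

# Welch t-test for one pair: p-value and 95% confidence interval bounds
testPair <- function(a, b) {
  tt <- t.test(ToothGrowth$len[group == a], ToothGrowth$len[group == b],
               paired = FALSE, var.equal = FALSE)
  c(tt$p.value, tt$conf.int[1], tt$conf.int[2])
}

# One row per comparison, one column per reported quantity
results <- matrix(unlist(Map(testPair, pairs[1, ], pairs[2, ])),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(combinationNames,
                                  c("P-value", "Conf low", "Conf high")))
round(results, 4)
```

Each row name has the form used in the discussion, such as OJ.0.5~OJ.1. A comparison is significant at the 5% level when its p-value is below 0.05, which for these tests coincides with the confidence interval excluding zero.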

Almost all P-values are less than 0.05, and the confidence intervals do not contain zero for most of the comparisons, so the null hypothesis of equal means can be rejected in those cases.
This indicates that the difference in mean values between the compared groups is significant for most of the comparisons performed. There are two exceptions:
the comparison of orange juice and vitamin C at dose = 2 mg, and the comparison of orange juice at dose = 1 mg with vitamin C at dose = 2 mg.
P-values decrease when the dose increases for the same supplement (OJ.0.5~OJ.1 and OJ.0.5~OJ.2, for example). This suggests that increasing the
dose has a positive effect on tooth growth.

Conclusions
The main conclusions are:
Both vitamin C and orange juice have an effect on tooth growth.
Increasing the supplement dose level leads to increased tooth growth.
