
Dr. T.A. Carlo

Points to remember when doing analyses and interpreting statistical output:


1. First, always check the distribution of the response variable you want to examine. If
you are using a t-test, ANOVA (Analysis of Variance), or linear regression, the most
common approaches assume that the data are normally distributed. If the data are not
normally distributed you can:
a. transform the response variable (only for y = a continuous variable) using:
log (y)
log (y+1)
b. if the data are continuous and not normal, you can perform a non-parametric test that
does not assume the data follow any particular distribution (also called distribution-free
tests: the Wilcoxon Test and Kruskal-Wallis ANOVA are most commonly used for
comparing two or more averages).
c. if the data are not normal you can also use the GLM platform (see note section below),
or you may ignore this violation and proceed with the analyses anyway (most traditional
tests are very robust to violations of the normality assumption).
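The checks in step 1 can be sketched outside JMP as well. This is a minimal illustration, assuming SciPy is available; the data values are made up:

```python
# Sketch of step 1: check normality of the response, then try a
# log(y+1) transform. Data values here are purely illustrative.
import math
from scipy import stats

y = [1.2, 3.4, 2.2, 8.9, 0.5, 15.1, 2.7, 4.4, 1.1, 6.3]

stat, p = stats.shapiro(y)          # Shapiro-Wilk normality test
print(f"Shapiro-Wilk p = {p:.3f}")  # p < 0.05 suggests non-normality

# log(y+1) handles zeros; plain log(y) requires all y > 0
y_log = [math.log(v + 1) for v in y]
stat2, p2 = stats.shapiro(y_log)
print(f"After log(y+1): p = {p2:.3f}")
```

If the transformed response still fails the check, fall back on the non-parametric tests or the GLM platform described in points b and c.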
2. The next thing to do is to proceed with the “one-way” approach to data exploration of
each factor separately – this allows you to see general patterns and trends, and to see
which figures are worth including in the report (even if you do a factorial analysis, you
show results for the important factors separately in one-way fashion):
If your study or experiment has only one factor, then you can only use a “one-way”
approach. Tests include the t-test (for comparing two averages), simple regressions (one
continuous factor and a continuous response), non-parametric tests (for non-normal
data), logistic regressions (x factor is continuous, y factor is categorical [e.g., predated,
non-predated]), and Contingency Table/Chi-square analyses (for a categorical factor and
a categorical response).
Most one-way analyses can be done in the “Fit Y by X” platform in the “Analyze” menu
of JMP. Note that for paired data the Fit Y by X platform allows you to use the
“Blocking” tab to include the pairing (pairing only works when all pairs are complete –
there is no missing data for any pair).
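Two of the one-way tests listed above can be sketched as follows, assuming SciPy; the groups are invented for illustration:

```python
# Sketch of step 2: a two-sample t-test (for normal data) and its
# distribution-free counterpart, the Mann-Whitney U test (the
# two-sample form of the Wilcoxon rank-sum test). Data are made up.
from scipy import stats

group_a = [5.1, 4.8, 6.2, 5.5, 5.9, 4.7]
group_b = [6.8, 7.1, 6.4, 7.5, 6.9, 7.2]

t, p_t = stats.ttest_ind(group_a, group_b)     # compares two averages
u, p_u = stats.mannwhitneyu(group_a, group_b)  # non-parametric version

print(f"t-test: t = {t:.2f}, p = {p_t:.4f}")
print(f"Mann-Whitney: U = {u}, p = {p_u:.4f}")
```

When the normality check from step 1 fails, report the Mann-Whitney result instead of the t-test.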
d. For ANOVAs (comparing three or more averages), report the F test value, the degrees
of freedom, and the p-value. If you want to know which averages are different from each
other you need to perform a “post-hoc” test such as the Tukey-Kramer HSD. You can
do this in the red triangle menu next to the “Oneway” analysis title.
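Point d can be sketched in code as well. This assumes SciPy and statsmodels are available, and the group data are invented:

```python
# Sketch of point d: a one-way ANOVA (report F, df, p), then a
# Tukey-Kramer HSD post-hoc test to find WHICH averages differ.
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

a = [5.1, 4.8, 6.2, 5.5, 5.9]
b = [6.8, 7.1, 6.4, 7.5, 6.9]
c = [5.0, 5.3, 4.9, 5.6, 5.2]

f, p = stats.f_oneway(a, b, c)
df_between = 3 - 1                        # groups - 1
df_within = len(a) + len(b) + len(c) - 3  # observations - groups
print(f"F({df_between},{df_within}) = {f:.2f}, p = {p:.4f}")

# Post-hoc comparisons of all pairs of group averages
values = a + b + c
groups = ["a"] * 5 + ["b"] * 5 + ["c"] * 5
print(pairwise_tukeyhsd(values, groups))
```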
e. For regressions, report the r-square value, the sample size (n), and the p-value.
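A minimal sketch of point e, assuming SciPy; x and y are illustrative:

```python
# Sketch of point e: a simple linear regression, reporting the three
# values the notes ask for: r-square, n, and the p-value.
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

res = stats.linregress(x, y)
n = len(x)
print(f"r-square = {res.rvalue**2:.3f}, n = {n}, p = {res.pvalue:.4g}")
```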

3. If you have two or more factors and want to do a more elegant factorial analysis, you
need to use the “Fit Model” platform of JMP. Select the response variable from the
“Select Columns” list and assign it to the Y role, and place the columns of each factor in
the appropriate box (the “Construct Model Effects” box). You can create interactions by
selecting “Cross” – for crossing, both factors need to be inside the “Construct Model
Effects” box, and one needs to be selected (highlighted) inside that box, while the other
needs to be selected (highlighted) inside the “Select Columns” box – then hit the “Cross”
button.
Note: use the Generalized Linear Model platform (in Fit Model under the Analyze
menu, select “Generalized Linear Model” under the “Personality” tab, then specify the
distribution), which allows you to fit other types of error distributions such as:
binomial – for proportion data (data values ranging between 0 and 1)
Poisson – for count data (0, 1, 2, 3, etc.)
exponential – for continuous data
normal (Gaussian) – this platform also handles normally-distributed
continuous data
f. When performing a factorial analysis, you get a lot of output – learn to seek the
following information:
Whole Model Fit = seek the R-Square value – r2 ranges between 0 and 1. It can be
interpreted as the amount of the variance that is explained by the model as you
constructed it. A value of 0.91 can be interpreted as “the model explains 91% of the
variation in the response”. Next, seek the p-value of the Whole Model – if it is above the
0.05 cutoff, your model is not good and you need to revise it. You can revise the model
by reducing it, which is done by removing factors (start with non-significant
interaction terms, then with non-significant factors if it comes to that).
Effect Tests table = this is what you want to see once you have a significant “whole
model” p-value. This table is usually what you need to report in your papers because it
shows which factors are significant, and the F value of each factor can be used to
calculate its “effect size”, which is a measure of its relative importance in the model. The
larger the F value of a significant factor is in relation to other significant factors, the
more important it is!
Normalization of F values: you can add all the F values (significant and non-significant)
and then divide the F value of each factor by the sum of all F values in the Effect
Tests table. This is called “normalizing the F value”. It tells you the proportion of the
explained variance in the model that each factor contributes. Multiply the “normalized”
F value by the R-Square value of the whole model, and you know how much variance
each factor explains in your data set.
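The normalization above is plain arithmetic. A small sketch, using made-up F values from a hypothetical Effect Tests table:

```python
# Sketch of the F-value normalization: divide each F by the sum of
# all F values, then scale by the whole-model R-square.
f_values = {"light": 12.4, "water": 6.2, "light*water": 1.4}
r_square = 0.80                      # whole-model R-square (illustrative)

total_f = sum(f_values.values())     # 12.4 + 6.2 + 1.4 = 20.0
for factor, f in f_values.items():
    normalized = f / total_f                     # share of explained variance
    variance_explained = normalized * r_square   # share of total variance
    print(f"{factor}: normalized F = {normalized:.2f}, "
          f"explains {variance_explained:.1%} of the data's variance")
```

For example, “light” has normalized F = 12.4/20.0 = 0.62, so it accounts for 0.62 × 0.80 ≈ 50% of the variance in the data set.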
