
Correlation

2.3.1 The correlation coefficient

Correlation may be defined as a measure of the strength of association between two variables
measured on a number of individuals, and is quantified using the Pearson product-moment
coefficient of linear correlation, usually known as the correlation coefficient. Thus the calculation of
the correlation coefficients between CaO and Al2O3 and K2O and Na2O can provide an answer to the
questions asked above.
When, as is normal in geochemistry, only a sample of the total population is measured, the
sample correlation coefficient (r) may be calculated from the expression

$$ r = \frac{\text{covariance}(x, y)}{\sqrt{\text{variance}(x) \times \text{variance}(y)}} \qquad [2.1] $$

where there are n values of variable x ($x_1 \ldots x_n$) and of variable y ($y_1 \ldots y_n$).
An easier form for computation is

$$ r = \frac{\text{CSCP}}{\sqrt{\text{CSSX} \cdot \text{CSSY}}} \qquad [2.2] $$

where
CSCP (corrected sum of cross products) = $\sum(xy) - \sum(x) \cdot \sum(y)/n$
CSSX (corrected sum of squares for x) = $\sum(x^2) - \sum(x) \cdot \sum(x)/n$
CSSY (corrected sum of squares for y) = $\sum(y^2) - \sum(y) \cdot \sum(y)/n$
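
As a rough illustration, the computational form in terms of CSCP, CSSX and CSSY can be sketched in a few lines of Python and checked against the covariance definition. The function names and the oxide values are hypothetical, invented only for this example, and the n - 1 divisor used for the sample covariance and variances is an assumption (it cancels in the ratio, so r is unaffected).

```python
import math


def pearson_r(x, y):
    """Correlation coefficient from the corrected sums CSCP, CSSX and CSSY."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    cscp = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    cssx = sum(a * a for a in x) - sum_x * sum_x / n
    cssy = sum(b * b for b in y) - sum_y * sum_y / n
    return cscp / math.sqrt(cssx * cssy)


def pearson_r_from_definition(x, y):
    """Same coefficient as covariance / sqrt(variance(x) * variance(y))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Assumed sample (n - 1) divisors; they cancel when the ratio is taken.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical oxide concentrations (wt %), not taken from any data set.
cao = [1.0, 2.0, 3.0, 4.0, 5.0]
al2o3 = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(cao, al2o3)
print(round(r, 4))                                     # 0.7746
print(round(pearson_r_from_definition(cao, al2o3), 4)) # 0.7746
```

For these values CSCP = 6, CSSX = 10 and CSSY = 6, so r = 6/√60. A variable with no spread gives a zero corrected sum of squares, and the division then fails; r is undefined in that case.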
Values of r vary from -1 to +1. When r = +1 there is perfect sympathy between x and y and
there is a perfect linear relationship. When r = -1 there is perfect antipathy between x and y. If r = 0
then there is no linear relationship between x and y. The value of r² is also useful, for it is a measure
of the fraction of the total variance of x and y that is explained by the linear relationship. For
instance, if the correlation coefficient r = 0.90, then r² = 0.81; that is, 81 % of the total variance is
explained by the linear relationship.
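
One way to make the "variance explained" reading concrete is to fit a least-squares line and compare the residual scatter with the total scatter. The sketch below frames this as the variance of y explained by a straight-line fit of y on x, which is an assumption about how to read the statement above. The function name and the data are hypothetical.

```python
def fraction_explained(x, y):
    """1 - (residual sum of squares / total sum of squares) for a line fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1.0 - ss_resid / ss_total


# Hypothetical values, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

explained = fraction_explained(x, y)
print(round(explained, 4))  # 0.6
```

For these values the correlation coefficient is 6/√60, about 0.775, and its square is 0.6, which matches the fraction returned by the fit: 60 % of the scatter in y is accounted for by the line.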

2.3.2 The significance of the correlation coefficient (r)

The sample correlation coefficient (r) is an estimate of the population correlation coefficient (ρ), i.e.
the correlation that exists in the total population of which only a sample has been measured.
It is important to know whether a calculated value for r represents a statistically significant
relationship between x and y. That is, does the relationship observed in the sample hold for the
population? The probability that this is the case may be estimated for different levels of significance,
usually at the 5 % (or 0.05) level or the 1 % (0.01) level. (These values may also be expressed as
confidence limits, in this case 95 % or 99 % respectively.) Estimates of this sort are normally made by
reference to a table of values for r (Table 2.1). For a given number of degrees of freedom (number of
samples minus 2, in this case), values for r are tabulated for different significance levels. The values
represent the minimum values for rejecting the null hypothesis that the correlation coefficient of the
population is zero (ρ = 0) at the given level of significance. Two sets of tables are given depending
upon whether the sign of the correlation coefficient is important. The one-sided test may be used
when the alternative to the null hypothesis (ρ = 0) is either ρ > 0 or ρ < 0.
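
Tabulated critical values of r of this kind are commonly derived from Student's t distribution through the relation r = t / √(t² + df). The sketch below assumes that relation and takes the critical t value as an input. The two t values used (2.228 and 1.812 for 10 degrees of freedom) are standard t-table entries, not figures read from Table 2.1, and the function names and the sample r of 0.55 are hypothetical.

```python
import math


def critical_r(t_crit, df):
    """Minimum |r| needed to reject a zero population correlation."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


def rejects_null(r, n, t_crit, one_sided=False):
    """True if r from n samples reaches the critical value for n - 2 degrees of freedom."""
    threshold = critical_r(t_crit, n - 2)
    # One-sided here means a positive alternative; pass -r to test a negative one.
    statistic = r if one_sided else abs(r)
    return statistic >= threshold


n = 12                    # hypothetical number of samples, so df = 10
T_TWO_SIDED_5PCT = 2.228  # standard t-table value, df = 10, 5 % two-sided
T_ONE_SIDED_5PCT = 1.812  # standard t-table value, df = 10, 5 % one-sided

print(round(critical_r(T_TWO_SIDED_5PCT, n - 2), 3))            # 0.576
print(round(critical_r(T_ONE_SIDED_5PCT, n - 2), 3))            # 0.497
print(rejects_null(0.55, n, T_TWO_SIDED_5PCT))                  # False
print(rejects_null(0.55, n, T_ONE_SIDED_5PCT, one_sided=True))  # True
```

The same sample value of r can therefore be significant under a one-sided test and not under a two-sided one, which is why the choice of table matters.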
