Psych 494

Fall 2001
Solution 4

1.
Variable=X1
W:Normal 0.827419 Pr<W 0.0007

[Normal Probability Plot for X1; vertical axis from 2.5 to 27.5, horizontal axis (normal quantiles) from -2 to +2]

Variable=X2
W:Normal 0.924988 Pr<W 0.0697

[Normal Probability Plot for X2; vertical axis from 1 to 17, horizontal axis (normal quantiles) from -2 to +2]

Variable=X3
W:Normal 0.967717 Pr<W 0.6204

[Normal Probability Plot for X3; vertical axis from 3 to 17, horizontal axis (normal quantiles) from -2 to +2]

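The Q-Q plots show that only X1 departs grossly from normality. The Shapiro-Wilk tests confirm this: normality is rejected at the .05 level for X1 (W = 0.827, p = .0007) but not for X2 (W = 0.925, p = .0697) or X3 (W = 0.968, p = .6204).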
2)
[Chi-square Plot: squared statistical distance ("Distance", vertical axis, 0 to 16) against chi-square quantile ("ChiSq", horizontal axis, 0 to 10)]


Most of the observations (except for observation 9) lie along the line. This suggests that the assumption of multivariate
normality is tenable.
TOPLOT (column 1: chi-square quantile; column 2: squared statistical distance; rows in observation order)
1.4236522 0.8807247
4.0135936 3.8113869
3.0763677 2.8907255
3.6648708 3.7549258
1.0878828 0.6533776
1.9635806 1.143333
0.7558302 0.5598385
4.4149844 3.8723804
9.8374093 15.262769
0.1848318 0.0355722
2.3659739 1.3338642
1.7767839 1.019959
0.5843744 0.4707491
1.2543524 0.6645781
3.3554449 3.7399316
2.5857116 2.3258664
0.922479 0.6167535
2.8213339 2.3763783
0.4011734 0.1499929
6.2513886 6.8316784
7.40688 8.3442448
1.5973096 0.9699873
4.8904101 3.9779303
2.1593473 1.2612803
5.4773439 5.0517725
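
The first column of TOPLOT can be checked by hand. The SAS program computes it as `cinv((rank-.5)/n, p)` with n = 25 and p = 3. The sketch below is a rough cross-check rather than the author's method: it relies on the closed-form CDF of the chi-square distribution with 3 degrees of freedom and inverts it by bisection. The function names `chi2_cdf_3df` and `chi2_quantile_3df` are hypothetical helpers introduced here.

```python
import math

def chi2_cdf_3df(x):
    """CDF of the chi-square distribution with 3 degrees of freedom (closed form)."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_quantile_3df(prob, lo=0.0, hi=100.0, tol=1e-10):
    """Invert the CDF by bisection; prob must lie strictly between 0 and 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_cdf_3df(mid) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 25  # number of observations in the full data set
smallest = chi2_quantile_3df((1 - 0.5) / n)   # plotting position for rank 1
largest = chi2_quantile_3df((n - 0.5) / n)    # plotting position for rank 25
chi80 = chi2_quantile_3df(0.80)               # 80th percentile, 3 df
print(round(smallest, 4), round(largest, 4), round(chi80, 4))
```

The three values should agree closely with the TOPLOT entries for the smallest and largest distances (observations 10 and 9) and with CHI80.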

CHI80
4.6416277

Under trivariate normality, about 80% of the 25 observations (that is, 20) are expected to have a squared distance below 4.64, the 80th percentile of the chi-square distribution with 3 degrees of freedom. The actual count is 21, which is very close to what is expected under normality. Based on the chi-square plot and this 80% probability contour count, it is reasonable to assume trivariate normality.
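
The count can be reproduced directly from the second column of TOPLOT. This is a minimal sketch: the list `distances` is typed in from the output above, and the variable names are chosen here for illustration.

```python
# Squared statistical distances (second column of TOPLOT), in observation order.
distances = [
    0.8807247, 3.8113869, 2.8907255, 3.7549258, 0.6533776,
    1.143333, 0.5598385, 3.8723804, 15.262769, 0.0355722,
    1.3338642, 1.019959, 0.4707491, 0.6645781, 3.7399316,
    2.3258664, 0.6167535, 2.3763783, 0.1499929, 6.8316784,
    8.3442448, 0.9699873, 3.9779303, 1.2612803, 5.0517725,
]
chi80 = 4.6416277                  # CHI80 from the SAS output

n = len(distances)
expected = 0.80 * n                # count expected inside the 80% contour
below = sum(d < chi80 for d in distances)
print(n, expected, below)
```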

3)
[Scatter plots of X1 vs. X2, X1 vs. X3, and X2 vs. X3; plotting symbols: A = 1 observation, B = 2 observations, etc.]

It can be argued that at least one observation (9) is an outlier: it stands apart in the scatter plots and has by far the largest squared statistical distance (15.26). Observation 21, with the second-largest distance (8.34), is also a likely candidate.
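
One way to see which observations stand out is to rank the squared distances from TOPLOT. The sketch below only sorts the reported values. It is an illustration, not a formal outlier test, and the variable names are assumptions made here.

```python
# Squared statistical distances (second column of TOPLOT), in observation order.
distances = [
    0.8807247, 3.8113869, 2.8907255, 3.7549258, 0.6533776,
    1.143333, 0.5598385, 3.8723804, 15.262769, 0.0355722,
    1.3338642, 1.019959, 0.4707491, 0.6645781, 3.7399316,
    2.3258664, 0.6167535, 2.3763783, 0.1499929, 6.8316784,
    8.3442448, 0.9699873, 3.9779303, 1.2612803, 5.0517725,
]

# Pair each distance with its 1-based observation number, largest first.
ranked = sorted(enumerate(distances, start=1), key=lambda pair: pair[1], reverse=True)
top_two = [obs for obs, _ in ranked[:2]]
print(top_two)
```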

4. a) H0: µ' = [10, 10, 10]
H1: µ' ≠ [10, 10, 10]

T_SQ T_SQCRIT PVALUE


30.373968 10.224691 0.0004974

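T² = 30.37 exceeds the critical value of 10.22 (p = .0005), so at an alpha level of .05 we reject the null hypothesis that the means are all equal to 10. The test uses the n = 23 observations that remain after observations 9 and 21 are dropped.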
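
For reference, the printed values follow the one-sample Hotelling T² test as coded in the SAS program at the end of this solution. The symbols $\bar{x}$ (sample mean vector), $S$ (sample covariance matrix), and $\mu_0$ (hypothesized mean vector) are notation introduced here; n = 23 and p = 3 are taken from that program, which drops observations 9 and 21 before testing.

```latex
T^2 = n(\bar{x}-\mu_0)' S^{-1} (\bar{x}-\mu_0), \qquad
\text{reject } H_0 \text{ if } T^2 > \frac{(n-1)p}{n-p}\, F_{p,\,n-p}(\alpha)

\frac{(n-1)p}{n-p} = \frac{22 \cdot 3}{20} = 3.3, \qquad
F_{3,20}(.05) = \frac{10.2247}{3.3} \approx 3.098, \qquad
F = \frac{(n-p)\,T^2}{(n-1)p} = \frac{30.374}{3.3} \approx 9.20
```

The p-value of .0005 is the upper-tail probability of F ≈ 9.20 on (3, 20) degrees of freedom.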

b)
VAL
24.898543
6.757645
4.6820876

VEC
0.458362 0.0873568 0.884462
0.7521083 -0.568348 -0.333637
0.473537 0.8181376 -0.326211

The longest axis has length $2\sqrt{24.90}\,\sqrt{10.22/23} = 6.65$, the next longest axis has length $2\sqrt{6.76}\,\sqrt{10.22/23} = 3.47$, and the shortest axis has length $2\sqrt{4.68}\,\sqrt{10.22/23} = 2.89$. The directions of the axes are given by the corresponding eigenvectors (the columns of VEC).
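
The axis lengths can be recomputed from the printed eigenvalues and the T² critical value. This sketch uses the unrounded values from the output; the variable names are chosen here for illustration.

```python
import math

eigenvalues = [24.898543, 6.757645, 4.6820876]  # VAL from the SAS output
t_sq_crit = 10.224691                           # T_SQCRIT from part (a)
n = 23                                          # observations after dropping 9 and 21

# Full axis length: 2 * sqrt(eigenvalue) * sqrt(critical value / n)
axis_lengths = [2 * math.sqrt(lam) * math.sqrt(t_sq_crit / n) for lam in eigenvalues]
print([round(length, 2) for length in axis_lengths])
```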
c) Simultaneous confidence intervals for µ1, µ2, µ3, and µ1 + µ2 + µ3:

T-Square Simultaneous Confidence Intervals

SCLM
9.2549748 13.243286
4.7246222 10.188421
8.4131009 12.755595
23.647662 34.932338

Bonferroni Simultaneous Confidence Intervals

SCLM
9.552743 12.945518
5.1325507 9.7804928
8.7373125 12.431383
24.490179 34.089821
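
The two sets of limits correspond to the following formulas, transcribed from the `sclm_t2` and `sclm_b` modules in the SAS program at the end of this solution. Here $a$ is one row of the contrast matrix, $m = 4$ is the number of intervals, and $\bar{x}$ and $S$ are the sample mean vector and covariance matrix; this notation is introduced here for illustration.

```latex
\text{T}^2:\quad a'\bar{x} \;\pm\; \sqrt{\frac{p(n-1)}{n(n-p)}\, F_{p,\,n-p}(\alpha)\; a'Sa}

\text{Bonferroni}:\quad a'\bar{x} \;\pm\; t_{n-1}\!\left(\frac{\alpha}{2m}\right) \sqrt{\frac{a'Sa}{n}}
```

Both intervals are centered at $a'\bar{x}$ and scale with $\sqrt{a'Sa/n}$, so they differ only in the multiplier.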

All four Bonferroni intervals are shorter than the corresponding T² intervals. In most situations, for the same confidence level, Bonferroni intervals are more precise. Although one may be inclined to always use the Bonferroni method, some situations dictate that simultaneous confidence intervals be based on T². When no particular contrast is of interest before the study is carried out, the T² method is ideal because it maintains the same confidence level even if data snooping is involved. When the number of contrasts is relatively large, one may also be better off with the T² method, regardless of when the contrasts are formulated.
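
The gain in precision can be quantified from the reported limits. The sketch below compares interval widths. Because both methods use the same center and the same standard error, the width ratio should be the same for every interval. The variable names are assumptions made here.

```python
# (lower, upper) limits for mu1, mu2, mu3, and mu1 + mu2 + mu3 from the SAS output.
t2_intervals = [
    (9.2549748, 13.243286), (4.7246222, 10.188421),
    (8.4131009, 12.755595), (23.647662, 34.932338),
]
bonferroni_intervals = [
    (9.552743, 12.945518), (5.1325507, 9.7804928),
    (8.7373125, 12.431383), (24.490179, 34.089821),
]

# Ratio of Bonferroni width to T-square width for each interval.
ratios = [
    (b_hi - b_lo) / (t_hi - t_lo)
    for (t_lo, t_hi), (b_lo, b_hi) in zip(t2_intervals, bonferroni_intervals)
]
print([round(r, 3) for r in ratios])
```

With these four intervals the Bonferroni widths come out at roughly 85% of the T² widths.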

SAS Program for Homework 4:

options ls=78;

/* Read the three variables from milk.dat */
data milk;
infile 'milk.dat';
input x1-x3;
run;

* Univariate Test For Normality;

proc univariate data=milk normal plot;
var x1-x3;
run;

*Multivariate Normality and Chi-square Plot;

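proc iml;

/* Squared statistical distances and chi-square plot coordinates */
START distance(X);
   n=nrow(x);
   p=ncol(x);
   one=J(n,1);                          /* n x 1 vector of ones */
   xbar=t(X[:,]);                       /* p x 1 mean vector */
   ident=I(n);
   SSCP=X`*(ident-one*t(one)/n)*X;      /* corrected sums of squares and cross-products */
   s=SSCP/(n-1);                        /* sample covariance matrix */

   d=j(n,1);
   means=one*t(xbar);
   do i=1 to n;
      /* d[i] = (x_i - xbar)` * inv(S) * (x_i - xbar) */
      d[i]=(x[i,]-means[i,])*inv(s)*t(x[i,]-means[i,]);
   end;
   y=ranktie(d);                        /* ranks of the distances, ties averaged */
   chisq=cinv((y-.5)/n,p);              /* chi-square quantiles at (rank - .5)/n */
   toplot=chisq||d;

   call pgraf(toplot,'*','ChiSq','Distance','Chi-square Plot');
   print toplot;
FINISH distance;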

use milk;
read all var _num_ into X;
run distance(x);

chi80=cinv(.8,3);
print chi80;

proc plot data=milk hpercent=50 vpercent=50;
plot x1*x2;
plot x1*x3;
plot x2*x3;
run;

/* Drop the suspected outliers (observations 9 and 21) */
data milk1;
set milk;
if _n_ ne 9 and _n_ ne 21;
run;

proc iml;
use milk1;
read all var{x1 x2 x3} into x;

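/* Sample mean vector (Xbar) and covariance matrix (S) of X */
START stat(X,Xbar,S);
   n=nrow(X);
   one=J(n,1);
   Xbar=t(X[:,]);
   ident=I(n);
   SSCP=X`*(ident-one*t(one)/n)*X;
   S=SSCP/(n-1);
FINISH stat;

/* Hotelling T-square test of H0: mean vector = mu */
START test(x,xbar,s,mu,alpha);
   n=nrow(x);
   p=ncol(x);
   t_sq=n*(xbar-mu)`*inv(s)*(xbar-mu);
   t_sqcrit=(n-1)*p/(n-p)*finv(1-alpha,p,n-p);     /* critical value for T-square */
   pvalue=1-probf((n-p)*t_sq/((n-1)*p),p,n-p);     /* p-value from the F transformation */
   print t_sq t_sqcrit pvalue;
FINISH test;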

mu={10,10,10};
alpha=.05;

run stat(x,xbar,s);
run test(x,xbar,s,mu,alpha);

call eigen(val,vec,s);
print val, vec;

/* T-square simultaneous confidence intervals for the rows of a */
start sclm_t2(xbar,s,n,a,alpha);
   p=nrow(s);
   nclm=nrow(a);
   sclm=j(nclm,2);
   crit=p*(n-1)*finv(1-alpha,p,n-p)/(n*(n-p));
   do i=1 to nclm;
      me=sqrt(crit*a[i,]*s*t(a[i,]));     /* margin of error for a[i,]*mu */
      sclm[i,1]=a[i,]*xbar-me;
      sclm[i,2]=a[i,]*xbar+me;
   end;
   print ,'T-Square Simultaneous Confidence Intervals', sclm;
finish sclm_t2;

a={1 0 0,
0 1 0,
0 0 1,
1 1 1};
n=nrow(x);   /* 23 observations remain after dropping 9 and 21 */
alpha=.05;

run stat(x,xbar,s);
run sclm_t2(xbar,s,n,a,alpha);

/* Bonferroni simultaneous confidence intervals for the rows of a */
start sclm_b(xbar,s,n,a,alpha);
   p=nrow(s);
   nclm=nrow(a);
   sclm=j(nclm,2);
   crit=tinv(1-alpha/(2*nclm),n-1);       /* alpha split over nclm two-sided intervals */
   do i=1 to nclm;
      me=crit*sqrt(a[i,]*s*t(a[i,])/n);   /* margin of error for a[i,]*mu */
      sclm[i,1]=a[i,]*xbar-me;
      sclm[i,2]=a[i,]*xbar+me;
   end;
   print , 'Bonferroni Simultaneous Confidence Intervals', sclm;
finish sclm_b;

run sclm_b(xbar,s,n,a,alpha);
quit;