
SIMPLE LINEAR REGRESSION ANALYSIS

EXAMPLE 1
An engineer wants to analyze the relationship between the monthly amount
of steam (in pounds) used in a factory and the average ambient temperature.
The following data were obtained:
Month   Temperature (°F)   Monthly amount of steam used (pounds)
Jan     15                 18579
Feb     22                 21447
Mar     32                 28803
Apr     47                 42484
May     50                 45468
Jun     55                 53903
Jul     66                 62155
Aug     74                 67506
Sep     62                 56203
Oct     50                 45293
Nov     41                 36995
Dec     25                 27398

To analyze the data, answer the following questions:

1. Identify the dependent variable and the independent variable.
The dependent variable Y is also called the response variable.
The independent variable X is also called the explanatory or predictor
variable.
2. Build the scatter plot (dispersion diagram) of the data.
What is the behavior of the data? Comment.
3. Graphically find the straight line that best fits the data.
4. What is the equation of the straight line that best fits the data?
The straight line that best fits the data has several names:
Best-Fit Line
Regression Line
Least Squares Line
Its equation is called the regression equation and it is given by
Ŷ = Y estimated = b0 + b1 X
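The regression equation in question 4 can also be obtained numerically. The following is a minimal sketch in plain Python, assuming the usual least-squares formulas b1 = Sxy / Sxx and b0 = mean(Y) - b1 * mean(X). The function name fit_line and the variable names are illustrative and are not part of the worksheet.

```python
# Example 1 data, copied from the table above.
temperature = [15, 22, 32, 47, 50, 55, 66, 74, 62, 50, 41, 25]   # X, degrees F
steam = [18579, 21447, 28803, 42484, 45468, 53903,
         62155, 67506, 56203, 45293, 36995, 27398]               # Y, pounds


def fit_line(x, y):
    """Return (b0, b1) of the least-squares line Y^ = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1


b0, b1 = fit_line(temperature, steam)
print(f"Y^ = {b0:.2f} + {b1:.2f} X")
```

By construction the fitted line passes through the point (mean of X, mean of Y), which is a quick check on a line drawn by eye in question 3.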
5. a) What are the meanings of b0 and b1?
b) Get the confidence intervals (bi ± t(α/2, n-2) · SE coef) for the
expected values of b0 and b1, and write down their detailed explanations.
c) Are the values of b0 and b1 meaningful (or significant)?
d) Perform a hypothesis test for b0 and for b1. What are your
conclusions?

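The interval formula in question 5 b) and the tests in 5 d) can be sketched as follows. This is only an illustration under stated assumptions: a 95% confidence level (the worksheet does not fix one), the table value t(0.025, 10) = 2.228 for n - 2 = 10 degrees of freedom, and the standard simple-regression standard errors SE(b1) = S / sqrt(Sxx) and SE(b0) = S * sqrt(1/n + mean(X)^2 / Sxx). The name coefficient_summary is hypothetical.

```python
import math

temperature = [15, 22, 32, 47, 50, 55, 66, 74, 62, 50, 41, 25]   # X, degrees F
steam = [18579, 21447, 28803, 42484, 45468, 53903,
         62155, 67506, 56203, 45293, 36995, 27398]               # Y, pounds

T_CRIT = 2.228  # assumed: t(0.025, 10) from a t table (95% level, n - 2 = 10)


def coefficient_summary(x, y, t_crit):
    """Estimate, standard error, t statistic and CI for b0 and b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Standard error of the regression from the residual sum of squares.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    summary = {}
    for name, coef, se in (("b0", b0, se_b0), ("b1", b1, se_b1)):
        summary[name] = {
            "coef": coef,
            "se": se,
            "t": coef / se,   # statistic for H0: coefficient = 0
            "ci": (coef - t_crit * se, coef + t_crit * se),
        }
    return summary


for name, row in coefficient_summary(temperature, steam, T_CRIT).items():
    low, high = row["ci"]
    print(f"{name}: {row['coef']:.2f}  SE = {row['se']:.2f}  "
          f"t = {row['t']:.2f}  95% CI = ({low:.2f}, {high:.2f})")
```

A coefficient is judged significant at this level when its |t| exceeds T_CRIT, which is equivalent to its confidence interval not containing zero.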

6. What is the dispersion of the real data (Y) about the regression
line (Ŷ)?
This dispersion is equal to the Standard Error of the Regression (Syx = S =
SE), which measures the typical size of the differences between the real
Y values and the fitted Ŷ values.
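The Standard Error of the Regression can be computed directly from the residuals. The sketch below assumes the usual definition S = sqrt(SSE / (n - 2)), where SSE is the sum of squared differences between Y and Ŷ; the function names are illustrative only.

```python
import math

temperature = [15, 22, 32, 47, 50, 55, 66, 74, 62, 50, 41, 25]   # X, degrees F
steam = [18579, 21447, 28803, 42484, 45468, 53903,
         62155, 67506, 56203, 45293, 36995, 27398]               # Y, pounds


def fit_line(x, y):
    """Return (b0, b1) of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def regression_standard_error(x, y):
    """S = sqrt(SSE / (n - 2)), in the same units as Y."""
    b0, b1 = fit_line(x, y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))


s = regression_standard_error(temperature, steam)
print(f"S = {s:.1f} pounds")
```

The divisor is n - 2 rather than n because two parameters, b0 and b1, are estimated from the same data.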
7. a) What is the value of the total variance?
b) What is the value of the amount of variance that is explained by
the regression model?
c) What is the value of the amount of variance that is not explained
by the regression model?
d) What is the value of the percentage of variance that is explained
by the regression model?
e) What is the value of the percentage of variance that is not
explained by the regression model?
The percentage of variance that is explained by the regression model is
known as the Coefficient of Determination R².
It can also be said that R² measures the degree of linear association between
Y and X, or the degree to which X explains the variation of Y.
The square root of R², taken with the same sign as the slope b1, is called
the Correlation Coefficient R.
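The quantities in question 7 can be sketched numerically. The code below reads "variance" as sums of squares, which is an interpretation rather than something the worksheet states: total SST = Σ(Y - mean(Y))², explained SSR = Σ(Ŷ - mean(Y))², unexplained SSE = Σ(Y - Ŷ)², with SST = SSR + SSE and R² = SSR / SST. The function name is hypothetical.

```python
import math

temperature = [15, 22, 32, 47, 50, 55, 66, 74, 62, 50, 41, 25]   # X, degrees F
steam = [18579, 21447, 28803, 42484, 45468, 53903,
         62155, 67506, 56203, 45293, 36995, 27398]               # Y, pounds


def variance_decomposition(x, y):
    """Split the total variation of Y into explained and unexplained parts."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained
    r2 = ssr / sst
    r = math.copysign(math.sqrt(r2), b1)   # R carries the sign of the slope
    return {"SST": sst, "SSR": ssr, "SSE": sse, "R2": r2, "R": r}


parts = variance_decomposition(temperature, steam)
print(f"explained:   {100 * parts['R2']:.2f} %")
print(f"unexplained: {100 * (1 - parts['R2']):.2f} %")
print(f"R = {parts['R']:.4f}")
```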
8. What can you say about the Model's Significance Test?
9. a) Get the point estimates for Y when X = 5, 20, 75
b) Which of these predictions are valid?
c) If X = 20, get the confidence interval for the expected or mean
value of Y
d) If X = 20, get the prediction interval for the future value of Y
Note: The prediction intervals (PI) are always wider than the confidence
intervals (CI). A PI is used for the interval associated with a new,
individual Y value, while a CI is used for the mean or expected value of Y
at a given X value.
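The difference between the two intervals in the note can be illustrated with a short sketch. It assumes a 95% level with t(0.025, 10) = 2.228, the standard formulas CI: Ŷ ± t · S · sqrt(1/n + (x0 - mean(X))² / Sxx) and PI: Ŷ ± t · S · sqrt(1 + 1/n + (x0 - mean(X))² / Sxx), and it treats a prediction as "valid" when x0 lies inside the observed range of X. That last reading of question 9 b) is an assumption, and the function name is hypothetical.

```python
import math

temperature = [15, 22, 32, 47, 50, 55, 66, 74, 62, 50, 41, 25]   # X, degrees F
steam = [18579, 21447, 28803, 42484, 45468, 53903,
         62155, 67506, 56203, 45293, 36995, 27398]               # Y, pounds

T_CRIT = 2.228  # assumed: t(0.025, 10) from a t table (95% level, n - 2 = 10)


def intervals_at(x0, x, y, t_crit):
    """Point estimate, CI for the mean of Y and PI for a new Y at X = x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    y_hat = b0 + b1 * x0
    spread = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(spread)        # mean response
    pi_half = t_crit * s * math.sqrt(1 + spread)    # single new observation
    return {
        "y_hat": y_hat,
        "ci": (y_hat - ci_half, y_hat + ci_half),
        "pi": (y_hat - pi_half, y_hat + pi_half),
        "in_range": min(x) <= x0 <= max(x),
    }


for x0 in (5, 20, 75):
    res = intervals_at(x0, temperature, steam, T_CRIT)
    note = "inside observed X range" if res["in_range"] else "extrapolation"
    print(f"X = {x0}: Y^ = {res['y_hat']:.1f} ({note})")
    print(f"  CI = ({res['ci'][0]:.1f}, {res['ci'][1]:.1f})")
    print(f"  PI = ({res['pi'][0]:.1f}, {res['pi'][1]:.1f})")
```

The extra 1 under the square root of the PI accounts for the scatter of an individual observation around the line, which is why the PI is wider than the CI at every X.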

EXAMPLE 2
A rocket engine is built by gluing together an ignition component and a support
component inside a metal box. It is believed that the break resistance of the engine is
related to the age of the support component. See the data below:
Break resistance (psi)   Support component's age (weeks)
2158.70                  15.50
1678.15                  23.75
2316.00                  8.00
2061.30                  17.00
2207.50                  5.50
1708.30                  19.00
1784.70                  24.00
2575.00                  2.50
2357.90                  7.50
2256.70                  11.00
2165.20                  13.00
2399.55                  3.75
1779.80                  25.00
2336.75                  9.75
1765.30                  22.00
2053.50                  18.00
2414.40                  6.00
2200.50                  12.50
2654.20                  2.00
1753.70                  21.50
To analyze the data, answer the following questions:
1. Identify the dependent variable and the independent variable.
2. Comment on the behavior of the data according to its scatter plot (dispersion diagram).
3. What is the equation of the regression line?
4. a) What are the meanings of b0 and b1?
b) Get their confidence intervals. Write down their detailed explanations.
c) Do they have significant values? Explain.
d) Perform a hypothesis test for β0 and for β1. What are your conclusions?
5. What is the dispersion of the real data about the regression line?
6. What is the value of the a) total variance? b) explained variance? c)
unexplained variance?
7. What is the value and meaning of the a) Coefficient of Determination R²?
b) Correlation Coefficient R?
8. What can you say about the Model's Significance Test?
9. Using X value = mean of all X values, get the
a) point estimate for Y
b) confidence interval for the expected value of Y
c) prediction interval for the future value of Y
d) Write down the detailed conclusion you draw from these two intervals.
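Question 9 evaluates the model at the mean of X, which is a special case worth seeing in code. The sketch below assumes break resistance as Y and age as X, a 95% level, and the table value t(0.025, 18) = 2.101 for n - 2 = 18 degrees of freedom; the variable names are illustrative. At X = mean(X) the (x0 - mean(X))² term vanishes, so the point estimate equals mean(Y) and both intervals are at their narrowest.

```python
import math

# Example 2 data: age in weeks (X) and break resistance in psi (Y).
age = [15.50, 23.75, 8.00, 17.00, 5.50, 19.00, 24.00, 2.50, 7.50, 11.00,
       13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50]
resistance = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50,
              1708.30, 1784.70, 2575.00, 2357.90, 2256.70,
              2165.20, 2399.55, 1779.80, 2336.75, 1765.30,
              2053.50, 2414.40, 2200.50, 2654.20, 1753.70]

T_CRIT = 2.101  # assumed: t(0.025, 18) from a t table (95% level, n - 2 = 18)

n = len(age)
x_bar = sum(age) / n
y_bar = sum(resistance) / n
sxx = sum((x - x_bar) ** 2 for x in age)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, resistance))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(age, resistance))
s = math.sqrt(sse / (n - 2))

# At X = x_bar the distance term is zero, leaving only 1/n (and 1 + 1/n).
y_hat = b0 + b1 * x_bar
ci_half = T_CRIT * s * math.sqrt(1 / n)
pi_half = T_CRIT * s * math.sqrt(1 + 1 / n)

print(f"Y^ at mean X = {y_hat:.2f} psi")
print(f"CI for mean Y: ({y_hat - ci_half:.2f}, {y_hat + ci_half:.2f})")
print(f"PI for new Y:  ({y_hat - pi_half:.2f}, {y_hat + pi_half:.2f})")
```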
INDIVIDUAL HOMEWORK: Page 72, Regresión Lineal Textbook. Problems: 4, 5, 7

EXAMPLE 3
In the article "Update on Ozone Trends in California's South Coast Air Basin"
(Air and Waste, 43, 226, 1993), the ozone (O3) concentrations in the air basin
of the Southern California coast were analyzed for the 1976-1991 period. It is
believed that the number of days on which the O3 level exceeded 0.20 ppm
depends on the seasonal meteorological index, which in turn is a way to
measure the average seasonal temperature.
Your task is to answer the same 9 questions listed in Example 2 for
the following set of data:
Year   Number of days   Index
1976   91               16.7
1977   105              17.1
1978   106              18.2
1979   108              18.1
1980   88               17.2
1981   91               18.2
1982   58               16
1983   82               17.2
1984   81               18
1985   65               17.2
1986   61               16.9
1987   48               17.1
1988   61               18.2
1989   43               17.3
1990   33               17.5
1991   36               16.6
