Regression
The act of returning or going back (a term introduced by Francis Galton).
Regression
A statistical technique that attempts to determine the strength of the relationship between one dependent variable (usually denoted by Y) and one or more independent variables.
Regression - Significance
- Estimation of the value of the dependent variable from the independent variable.
- To obtain a measure of the error involved in using the regression line for estimation.
- With the help of the regression coefficients we can calculate the correlation coefficient.
Regression - Types
Linear regression
Uses one independent variable to explain and/or predict the outcome of Y
Multiple regression
Uses two or more independent variables to predict the outcome
Regression Lines
Regression equation of Y on X: Y = a + bX
where Y is the dependent variable, X is an independent variable, a is the Y-intercept, and b is the slope of the line.
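The equation above can be fitted by least squares: b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² and a = Ȳ − bX̄. A minimal sketch in Python, using made-up illustrative data:

```python
# Fit the line Y = a + bX by least squares (illustrative data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Slope: sum of products of deviations over sum of squared X-deviations
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
    sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar  # Y-intercept

print(f"Y = {a:.3f} + {b:.3f} X")
```

The fitted line can then be used to estimate Y for any new X, which is the first purpose of regression listed earlier.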
Random Component
Y = β₀ + β₁X + ε
where Y is the dependent variable, the variable we wish to explain or predict; X is the independent variable, also called the predictor variable; and ε is the error term, the only random component in the model, and thus the only source of randomness in Y. β₀ is the intercept of the systematic component of the regression relationship, and β₁ is its slope. The conditional mean of Y:
E[Y | X] = β₀ + β₁X
Regression Plot
The simple linear regression model posits an exact linear relationship between the expected (average) value of the dependent variable Y and the independent or predictor variable X:
E[Yi] = β₀ + β₁Xi
Actual observed values of Y differ from the expected value by an unexplained or random error:
Yi = E[Yi] + εi = β₀ + β₁Xi + εi
[Figure: plot of the regression line E[Y] = β₀ + β₁X, with intercept β₀, slope β₁, and the error εi separating an observed point (Xi, Yi) from the line.]
Assumptions of the model:
- The relationship between X and Y is a straight-line relationship.
- The values of the independent variable X are assumed fixed (not random); the only randomness in the values of Y comes from the error term εi.
- The errors εi are normally distributed with mean 0 and variance σ², i.e. εi ~ N(0, σ²), and are uncorrelated (not related) in successive observations.
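The model and its assumptions can be illustrated by simulation: fix the X values, draw Normal(0, σ²) errors, and build Y. The parameter values below are arbitrary assumptions for the sketch:

```python
import random

# Simulate Y = β0 + β1*X + ε with fixed X and Normal(0, σ²) errors.
random.seed(42)
beta0, beta1, sigma = 5.0, 2.0, 1.0

xs = [i / 10 for i in range(100)]              # X is fixed, not random
errors = [random.gauss(0.0, sigma) for _ in xs]  # the only random component
ys = [beta0 + beta1 * x + e for x, e in zip(xs, errors)]

# The errors have mean 0, so the sample mean of ε should be near zero.
mean_error = sum(errors) / len(errors)
print(f"mean of errors = {mean_error:.3f}")
```

Since all the randomness in Y enters through ε, averaging many Y values at a given X would approach the systematic component β₀ + β₁X.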
Regression equation of X on Y: (X − X̄) = bxy (Y − Ȳ), where bxy = Σxy / Σy²
Regression equation of Y on X: (Y − Ȳ) = byx (X − X̄), where byx = Σxy / Σx²
(Here x = X − X̄ and y = Y − Ȳ denote deviations from the respective means.)
Properties of the regression coefficients:
1. Both regression coefficients have the same sign.
2. The product of the two regression coefficients cannot exceed one (since bxy · byx = r²).
3. The coefficient of correlation has the same sign as the regression coefficients.
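These properties can be checked numerically. A short Python sketch with illustrative data, computing both coefficients from deviations and recovering r = ±√(bxy · byx):

```python
import math

# Verify the properties of the regression coefficients (illustrative data).
X = [2.0, 4.0, 6.0, 8.0, 10.0]
Y = [3.0, 7.0, 5.0, 10.0, 12.0]

x_bar, y_bar = sum(X) / len(X), sum(Y) / len(Y)
x = [xi - x_bar for xi in X]       # deviations of X from its mean
y = [yi - y_bar for yi in Y]       # deviations of Y from its mean

sxy = sum(a * b for a, b in zip(x, y))
byx = sxy / sum(a * a for a in x)  # regression coefficient of Y on X
bxy = sxy / sum(b * b for b in y)  # regression coefficient of X on Y

# r takes the common sign of the regression coefficients
r = math.copysign(math.sqrt(byx * bxy), byx)
print(f"byx={byx:.3f}  bxy={bxy:.3f}  r={r:.3f}")
```

For these data byx and bxy share a sign, their product is at most one, and r carries the same sign, as stated above.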
Multiple Regression
Two or more independent variables are used to estimate the value of the dependent variable.
- To derive an equation which provides estimates of the dependent variable from the values of two or more independent variables.
- To obtain a measure of the error involved in using this regression equation.
Let X1 be the dependent variable and X2, X3 the independent variables; then X1 = f(X2, X3). In the case of three variables, the regression equation of X1 on X2 and X3 is:
(X1 − X̄1) = b12.3 (X2 − X̄2) + b13.2 (X3 − X̄3)
Here,
b12.3 = [(r12 − r13 r23) / (1 − r23²)] · (S1 / S2)
b13.2 = [(r13 − r12 r23) / (1 − r23²)] · (S1 / S3)
where S1, S2, S3 are the standard deviations of X1, X2, X3 and rij is the correlation coefficient between Xi and Xj.
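The partial regression coefficients can be computed directly from the pairwise correlations and standard deviations. A sketch assuming the standard textbook formulas above, with arbitrary illustrative values for the correlations and standard deviations:

```python
# Partial regression coefficients for X1 on X2, X3 (illustrative values).
r12, r13, r23 = 0.8, 0.6, 0.5   # pairwise correlation coefficients
s1, s2, s3 = 4.0, 2.0, 3.0      # standard deviations of X1, X2, X3

b12_3 = (r12 - r13 * r23) / (1 - r23 ** 2) * (s1 / s2)
b13_2 = (r13 - r12 * r23) / (1 - r23 ** 2) * (s1 / s3)

# Regression equation: (X1 - X1_bar) = b12.3*(X2 - X2_bar) + b13.2*(X3 - X3_bar)
print(f"b12.3 = {b12_3:.4f}, b13.2 = {b13_2:.4f}")
```

Each coefficient measures the change in X1 per unit change in one independent variable while holding the other constant, which is what the "dot" subscript notation denotes.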