Logistic regression
Simple linear regression
[1] https://onlinecourses.science.psu.edu/stat501/node/250
[2] https://netfiles.umn.edu/users/nacht001/www/nachtsheim/kut86916_ch01.pdf
[3] https://netfiles.umn.edu/users/nacht001/www/nachtsheim/kut86916_ch02.pdf
[4] https://netfiles.umn.edu/users/nacht001/www/nachtsheim/kut86916_ch02.pdf
[5] https://netfiles.umn.edu/users/nacht001/www/nachtsheim/kut86916_ch13.pdf
[6] https://onlinecourses.science.psu.edu/stat501/node/418
[7] http://www.biostathandbook.com/HandbookBioStatThird.pdf
How do we obtain the best-fit line (the values of the intercept b0 and the slope b1)?

Ordinary least squares is a non-iterative method.

Goal: fit a model such that the sum of squared differences between observed and predicted values is minimized (minimizing the sum of the squared vertical deviations from each data point to the line, i.e. the sum of the squared prediction errors). Because the deviations are squared before being added, positive and negative deviations cannot cancel each other out.

Q = Σ (yi − ŷi)² = Σ (yi − (b0 + b1·xi))²
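The least-squares estimates have a well-known closed-form solution: b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)², b0 = ȳ − b1·x̄. A minimal sketch in plain Python (the function name is illustrative, not from any of the cited sources):

```python
def ols_fit(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator and denominator of the slope estimate.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    return b0, b1

# Points lying exactly on y = 1 + 2x recover b0 = 1, b1 = 2.
b0, b1 = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
```

Because the solution is closed-form, no iteration (and no learning rate) is needed, unlike the gradient descent approach described later.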
The terms deviation and residual will be used in the following ways [8]: a residual is the difference between an observed value and the value predicted by the model, while a deviation is the difference between a value and the mean of the response variable.
To analyse how well the model fits the observed data, the following quantities are considered:

Sum of squares of the residuals (SSR) / residual sum of squares (RSS) / sum of squared errors of prediction (SSE): SSE = Σ (yi − ŷi)². This quantity measures the variation of the observed values around the estimated values of the response variable, i.e. the variation of the observed values that is not explained by the model.

Sum of squares of the deviations of the estimated values of the response variable: SSReg = Σ (ŷi − ȳ)². This quantity measures the variation of the estimated values of the response variable around their mean, i.e. the variation of the response variable that is explained by the model.

Total sum of squares of the deviations of the observed values: SST = Σ (yi − ȳ)². This quantity measures the total variation of the observed values around the mean, and SST = SSReg + SSE.
r² = SSReg / SST = 1 − SSE / SST

where SSE is the residual sum of squares and SST is the total sum of squares, so that:

r² (the coefficient of determination) is the proportion of the total variation that is explained by the model, and
1 − r² is the proportion of the total variation that is not explained by the model.
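These quantities translate directly into code. A short sketch (illustrative names, plain Python) computing SSE, SST and r² from observed and fitted values:

```python
def r_squared(y, y_hat):
    """Coefficient of determination: r2 = 1 - SSE/SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total variation
    return 1 - sse / sst

y = [2, 4, 6, 8]
y_hat = [3, 3, 7, 7]
# SSE = 4, SST = 20, so r2 = 1 - 4/20 = 0.8
r2 = r_squared(y, y_hat)
```

Here 80% of the total variation of y around its mean is explained by the fitted values, and the remaining 20% is residual variation.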
[8] http://www.fao.org/docrep/006/X8498E/x8498e0e.htm
Simple Linear Regression using Gradient Descent
Gradient Descent
Gradient descent is an optimization algorithm for minimizing functions. For a given function J defined by a set of parameters, gradient descent finds a local (or global) minimum by assigning an initial set of values to the parameters and then iteratively updating those values in the direction of the negative of the gradient of the function.
If we minimize the function J, we will get the best line for our data, since lines that fit the data better result in a lower overall error. To run gradient descent on this error function, we first need to compute its gradient. The gradient will act like a compass and always point us downhill. To compute it, we need to differentiate the error function. Since our function is defined by two parameters (the intercept b0 and the slope b1), we need a partial derivative with respect to each. For the mean squared error

J(b0, b1) = (1/N) · Σ (yi − (b0 + b1·xi))²

these derivatives work out to be:

∂J/∂b0 = −(2/N) · Σ (yi − (b0 + b1·xi))
∂J/∂b1 = −(2/N) · Σ xi · (yi − (b0 + b1·xi))
The learning rate α controls how large a step we take downhill during each iteration. If the step is too large, we may overshoot the minimum; if the steps are too small, many iterations are required to reach the minimum.
Finally, each iteration applies the updates

b0 := b0 − α · ∂J/∂b0
b1 := b1 − α · ∂J/∂b1

and these are the final equations for iteratively updating the intercept and the slope (α is the learning rate).
Now we have what is needed to run gradient descent. First we initialize the search at any pair of values (b0, b1) and then let the gradient descent algorithm march downhill on the error function towards the best line. Each iteration updates the parameters to a line that yields slightly lower error than the previous iteration. The direction to move in each iteration is calculated using the two partial derivatives above. A good way to check that gradient descent is working correctly is to verify that the error decreases at every iteration.
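The steps above can be sketched in plain Python. This is a minimal illustrative implementation (names and hyperparameter values are assumptions, not from the original source): initialize b0 and b1, then repeatedly step in the direction of the negative gradient of the mean squared error.

```python
def gradient_descent(x, y, lr=0.05, n_iters=2000):
    """Fit y ~ b0 + b1*x by gradient descent on the mean squared error."""
    n = len(x)
    b0 = b1 = 0.0                       # initial guess for intercept and slope
    for _ in range(n_iters):
        # Residuals under the current line.
        errs = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
        # Partial derivatives of the MSE with respect to b0 and b1.
        grad_b0 = -2.0 / n * sum(errs)
        grad_b1 = -2.0 / n * sum(xi * e for xi, e in zip(x, errs))
        # Step proportional to the negative gradient.
        b0 -= lr * grad_b0
        b1 -= lr * grad_b1
    return b0, b1

# Data generated from y = 1 + 2x; the search converges near b0 = 1, b1 = 2.
b0, b1 = gradient_descent([0, 1, 2, 3], [1, 3, 5, 7])
```

With this small dataset a learning rate of 0.05 is stable; a much larger rate would overshoot the minimum and diverge, which is exactly the behaviour described above.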
Convexity
Not all problems have a single (global) minimum. A problem may also have local minima, in which a gradient search can get stuck.
Multiple Linear Regression
Multiple linear regression (MLR) is a method used to model the linear relationship between a dependent variable (target) and two or more independent variables (predictors).
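As with simple regression, the MLR coefficients can be obtained non-iteratively by solving the normal equations (AᵀA)·b = Aᵀy, where A is the design matrix with a leading column of ones for the intercept. A minimal pure-Python sketch (illustrative names; a real analysis would use a linear algebra library):

```python
def mlr_fit(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bp*xp by solving the normal equations
    (A^T A) b = A^T y with plain Gaussian elimination."""
    n = len(X)
    A = [[1.0] + list(row) for row in X]        # prepend intercept column
    p = len(A[0])
    # Build A^T A and A^T y.
    ata = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    aty = [sum(A[k][i] * y[k] for k in range(n)) for i in range(p)]
    # Augmented matrix, then Gaussian elimination with partial pivoting.
    M = [ata[i][:] + [aty[i]] for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, p):
            f = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    b = [0.0] * p
    for i in range(p - 1, -1, -1):
        b[i] = (M[i][p] - sum(M[i][j] * b[j] for j in range(i + 1, p))) / M[i][i]
    return b

# Data generated from y = 1 + 2*x1 + 3*x2 recovers b = [1, 2, 3].
b = mlr_fit([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]], [1, 3, 4, 6, 8])
```

Gradient descent applies here too, with one partial derivative per coefficient; the normal equations are simply the non-iterative counterpart.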
NOTES [9]

[9] https://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/
[10] https://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/
[11] https://www.google.ro/search?q=BODY+FAT+Bray+1998&biw=1600&bih=770&tbm=isch&tbo=u&source=univ&sa=X&ved=0ahUKEwjNhauL0LfSAhWLWhoKHfVJAyIQsAQIGg#imgrc=rL1yOHvHBsKigM:
[12] https://en.wikipedia.org/wiki/Simple_linear_regression
[13] https://en.wikipedia.org/wiki/Logistic_regression
SOCR DATA - SOCR Body Density Data [14]

This is a comprehensive dataset that lists estimates of the percentage of body fat, determined by underwater weighing, together with various body circumference measurements (12 variables), for 252 men.

[14] Selection from the SOCR data, from the address:
http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_BMI_Regression#References
LINEAR REGRESSION (SIMPLE)

Correlations

               WAIST     BMI
WAIST   r      1         ,627**
        Sig.             ,000
        N      252       252
BMI     r      ,627**    1
        Sig.   ,000
        N      252       252

** Correlation is significant at the 0.01 level (2-tailed).
http://academica.udcantemir.ro/wp-content/uploads/article/studia/s2/S2A4.pdf
http://www.amaniu.ase.ro/studenti/management/regresie.pdf
http://www.unibuc.ro/prof/druica_e_n/docs/res/2011aprEconometrie_-_suport_de_curs.pdf
https://profs.info.uaic.ro/~val/statistica/StatWork_7.pdf
https://profs.info.uaic.ro/~val/statistica/StatWork_10.pdf
http://math.ucv.ro/~gorunescu/courses/curs/7.pdf
Regression application [15]

[15] http://www.saedsayad.com/flash/SLR.html

Weka application [16]

[16] http://www.saedsayad.com/mlr_exercise.htm
Tutorial SPSS MODELER
https://www.youtube.com/watch?v=UKqV51pRNjQ