
# FELIPE REGO

23 Oct 2015

## QUICK GUIDE: INTERPRETING SIMPLE LINEAR MODEL OUTPUT IN R

Linear regression models are a key part of the family of supervised learning models.

In particular, linear

I’ve written on Simple Linear Regression - An example using R. In general, statistical

analyst who is starting with linear regression in R to

cars dataset found in the datasets package in R

package you can call: library(help = "datasets").

summary(cars)
## speed
## Min. : 4.0 Min.
## 1st Qu.:12.0 1st Q
## Median :15.0 Medi
## Mean :15.4 Mean
## 3rd Qu.:19.0 3rd Q
## Max. :25.0 Max.

speed variable varies from cars with speed of 4 mph to

stop.

Below is a scatterplot of the variables:

plot(cars, col='blue', p
xlab="Speed in m

From the plot above, we can

distil and interpret the key components of the R linear

model but we are more interested in interpreting

would then allow us to potentially define next steps in the model building process.


Let’s get started by running one example:

set.seed(122)
speed.c = scale(cars$spe
mod1 = lm(formula = dist
summary(mod1)

##
## Call:
## lm(formula = dist ~
##
## Residuals:
## Min 1Q Med
## -29.069 -9.525 -2.
##
## Coefficients:
## Estimate
## (Intercept) 42.9800
## speed.c 3.9324
## ---
## Signif. codes: 0 '*
##
## Residual standard er
## Multiple R-squared:
## F-statistic: 89.57 on
The model above is achieved

model.
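The calls above are cut off in this copy, so here is a minimal sketch of one way to arrive at the same coefficients. It assumes that speed.c is speed centred on its mean but not rescaled, which is inferred from the printed intercept (42.98, the mean of dist) and is not confirmed by the truncated code.

```r
# Assumption: speed.c is speed minus its mean (centred, not rescaled).
# This is inferred from the printed output, not read from the original call.
speed.c = as.numeric(scale(cars$speed, center = TRUE, scale = FALSE))

# Simple linear model: stopping distance as a function of centred speed
mod1 = lm(dist ~ speed.c, data = cars)

# Intercept and slope, to compare with the Coefficients table
print(coef(mod1))
```

If the printed coefficients agree with the Estimate column, the centring assumption is at least consistent with the output shown.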


## Formula Call

Note the simplicity in the syntax: the formula just

## Residuals

model output breaks it down into 5 summary points. When
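The five summary points can be recomputed directly from the residuals. This is a small sketch that assumes the centred-speed model described above (speed.c as speed minus its mean).

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)

# Residuals are observed minus fitted stopping distances
res = residuals(mod1)

# Min, 1Q, Median, 3Q, Max, as printed in the Residuals block
five = quantile(res, probs = c(0, 0.25, 0.5, 0.75, 1))
print(round(five, 3))
```

A roughly symmetric spread around a median close to zero is what we would hope to see here.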

## Coefficients

and slope terms in the linear model. If we wanted to

formula. Ultimately, the analyst wants to find an intercept and a slope such

a stop. The second row in the Coefficients is the slope, or in

3.9324088 feet.
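To make the intercept and slope concrete, the sketch below checks two things under the assumption that speed.c is mean-centred speed: the intercept then equals the average stopping distance, and a car 1 mph faster than average is predicted to need one slope's worth of extra feet.

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)
b = coef(mod1)

# With a centred predictor, the intercept is the mean of the response
intercept_is_mean = isTRUE(all.equal(b[["(Intercept)"]], mean(cars$dist)))

# Predicted stopping distance for a car 1 mph above the average speed
pred = predict(mod1, newdata = data.frame(speed.c = 1))

print(intercept_is_mean)
print(pred)
```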

ideally want a lower number relative to its coefficients. In

hypothesis of the existence of a relationship between speed

from zero and are large relative to the standard error, which could indicate a relationship exists. In
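The t value is simply the estimate divided by its standard error, which can be verified from the coefficient table. The sketch assumes the same centred-speed model as before.

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
ctab = coef(summary(mod1))

# t value = estimate / standard error, row by row
t_manual = ctab[, "Estimate"] / ctab[, "Std. Error"]
print(cbind(reported = ctab[, "t value"], manual = t_manual))
```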

that it is unlikely we will observe a relationship

value of 5% or less is a good cut-off point. In our model

significant p-value. Consequently, a small p-
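As a rough check on the Pr(>|t|) column, the two-sided p-value can be recomputed from the t value and the residual degrees of freedom. This sketch again assumes the centred-speed model and uses the 5% cut-off mentioned above.

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)
ctab = coef(summary(mod1))

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_manual = 2 * pt(abs(ctab[, "t value"]), df = df.residual(mod1),
                  lower.tail = FALSE)

# TRUE where the coefficient clears the 5% cut-off
print(p_manual < 0.05)
```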

measure of the quality of a linear regression fit.

deviate from the true regression line by approximately 15.3795867 feet, on average. In other words, given that the mean

with 48 degrees of freedom. Simplistically, degrees of

these parameters

had 50 data points and two parameters (intercept and slope).
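The residual standard error and its degrees of freedom follow directly from the residuals: 50 data points minus 2 estimated parameters leaves 48. The sketch below assumes the centred-speed model; pct_error is a name introduced here for the error relative to the mean stopping distance.

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)

# Residual sum of squares and degrees of freedom (n - parameters)
rss = sum(residuals(mod1)^2)
df = nrow(cars) - length(coef(mod1))

# Residual standard error, in feet
rse = sqrt(rss / df)

# Error as a percentage of the mean stopping distance
pct_error = rse / mean(cars$dist) * 100

print(c(df = df, rse = rse, pct_error = pct_error))
```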

## Multiple R-squared,

actual data. It takes the form of a proportion of variance.

$R^2$ is a measure of the linear

does not explain the variance in the response variable well

that the answer would almost certainly be a yes. That’s why

$R^2$.

Nevertheless, it’s hard to define what level of $R^2$ is

adjusted $R^2$ is the preferred measure as it adjusts for the number of variables considered.
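Both measures can be recomputed by hand as a proportion of variance. This is a sketch under the centred-speed assumption, with p as the number of predictors (one here).

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)

y = cars$dist
n = length(y)   # number of data points
p = 1           # number of predictors

rss = sum(residuals(mod1)^2)   # variation left unexplained
tss = sum((y - mean(y))^2)     # total variation in the response

# Proportion of variance explained, and its adjusted version
r2 = 1 - rss / tss
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(c(r2 = r2, adj_r2 = adj_r2))
```

With a single predictor the adjustment is small; it matters more as variables are added.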

## F-Statistic

F-statistic is a good indicator of whether there is a

of data points and the number of predictors.

hypothesis (H0: There is no relationship between speed

to ascertain that there may be a relationship between

variables. In our example the F-statistic is 89.5671065
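The F-statistic can be rebuilt from the same sums of squares, which shows how the number of data points and predictors enter. The sketch assumes the centred-speed model; in a simple regression with one predictor, F is also the square of the slope's t value.

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)
mod1 = lm(dist ~ speed.c, data = cars)

y = cars$dist
n = length(y)   # number of data points
p = 1           # number of predictors

rss = sum(residuals(mod1)^2)
tss = sum((y - mean(y))^2)

# Explained variation per predictor over unexplained variation per df
f_manual = ((tss - rss) / p) / (rss / (n - p - 1))

# p-value of the F test on p and n - p - 1 degrees of freedom
f_pvalue = pf(f_manual, p, n - p - 1, lower.tail = FALSE)

print(c(F = f_manual, p_value = f_pvalue))
```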

above was just an example to illustrate how a linear model

way we could start to improve is by transforming our

response variable log-transformed mod2 = lm(formula = log(dist) ~

bringing in new variables, new transformation of variables and then subsequent variable
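A possible version of the log-transformed model is sketched below. The right-hand side of the formula is cut off in this copy, so using speed.c as the predictor is an assumption, as is the centring of speed itself.

```r
# Assumption: speed.c is speed centred on its mean
speed.c = cars$speed - mean(cars$speed)

# Original model, for reference
mod1 = lm(dist ~ speed.c, data = cars)

# Assumption: the truncated formula continues with speed.c as the predictor
mod2 = lm(log(dist) ~ speed.c, data = cars)

# Coefficients are now on the log scale of stopping distance
print(coef(summary(mod2)))
```

Because the response is on a different scale, the residual standard error and $R^2$ of mod2 are not directly comparable with those of mod1; residual plots are a safer way to judge whether the transformation helps.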
