[Diagram: choosing between mean, median, and mode by data type. Numeric data: mean or median (branch labels: fewer outliers, large dataset). Categorical data: mode.]
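As a small illustration of the mean / median / mode choice above, the following Python sketch computes all three using the standard `statistics` module. The two columns (`ages`, `colours`) are hypothetical values invented for this example. It is meant only to show why the median is usually preferred over the mean when a numeric column contains outliers, and why the mode is the natural summary for categorical data.

```python
from statistics import mean, median, mode

# Hypothetical numeric column with one extreme value (an outlier).
ages = [22, 25, 27, 30, 31, 250]

# Hypothetical categorical column.
colours = ["red", "blue", "red", "green", "red"]

age_mean = mean(ages)        # pulled upward by the outlier 250
age_median = median(ages)    # middle of the sorted values, robust to the outlier
colour_mode = mode(colours)  # most frequent category

print(age_mean, age_median, colour_mode)
```

Here the single outlier drags the mean well above every typical value, while the median stays among them.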
Data: Categorical. Associated variable: Categorical.
Type of technique: Decision tree; Naive Bayesian.
Remarks / assumptions: A decision tree needs no assumptions. Naive Bayes assumes independent variables.

Data: Categorical. Associated variable: Numeric.
Type of technique: Logistic regression; K-NN classifier.
Remarks / assumptions: A K-NN classifier needs no assumptions. Regression assumes normality, homoscedasticity, etc.

Data: Numeric. Associated variable: Numeric.
Type of technique: Regression model.
Remarks / assumptions: Regression assumes normality, homoscedasticity, etc.

Data: Numeric, categorical, or both.
Type of technique: Clustering.
Remarks / assumptions: Clustering needs no assumptions.
K-NN Classifier
3-NN classifier
Attributes may have to be scaled so that the distance measure is not dominated by the attribute with the largest range.
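The effect of scaling on a K-NN classifier can be sketched in plain Python. Everything below is an assumption made for illustration: the training points (age in years, income in currency units), the labels "A" and "B", the helper names, and the choice of min-max scaling (other scalings, such as standardisation, would serve the same purpose). The sketch runs a 3-NN majority vote once on raw attributes and once on attributes rescaled to [0, 1].

```python
import math
from collections import Counter

def apply_scale(row, bounds):
    """Min-max scale one row using precomputed (min, max) per column."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]

def fit_bounds(rows):
    """Return the (min, max) of every column in the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: (age, income). Income has a far larger numeric range.
train_x = [(25, 40000), (27, 41000), (30, 39000),
           (60, 40500), (62, 39500), (65, 41500)]
train_y = ["A", "A", "A", "B", "B", "B"]
query = (61, 39000)

# Raw attributes: the distance is driven almost entirely by income.
raw_pred = knn_predict(train_x, train_y, query, k=3)

# Scaled attributes: age and income contribute on the same footing.
bounds = fit_bounds(train_x)
scaled_x = [apply_scale(row, bounds) for row in train_x]
scaled_pred = knn_predict(scaled_x, train_y, apply_scale(query, bounds), k=3)

print(raw_pred, scaled_pred)
```

In this constructed example the query is close in age to the "B" group, but the raw distances are dominated by income, so the unscaled vote goes to "A". After scaling, the vote goes to "B". Note that the query is scaled with the bounds learned from the training data.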
Naive Bayesian Classifier
Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
P(spam|free) = P(free|spam) * P(spam) / P(free)
Since P(spam|free) > P(ham|free), a message containing this word is classified as spam.
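The spam calculation can be worked through numerically. The message counts below are hypothetical, invented purely to make the arithmetic concrete (they do not come from the slides). The sketch applies Bayes' theorem to the single word "free" and compares the two posteriors. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only.
n_spam, n_ham = 30, 70            # messages of each class
free_in_spam, free_in_ham = 18, 7  # messages of each class containing "free"
total = n_spam + n_ham

# Priors and likelihoods estimated from the counts.
p_spam = Fraction(n_spam, total)
p_ham = Fraction(n_ham, total)
p_free_given_spam = Fraction(free_in_spam, n_spam)
p_free_given_ham = Fraction(free_in_ham, n_ham)
p_free = Fraction(free_in_spam + free_in_ham, total)

# Bayes' theorem: P(class | free) = P(free | class) * P(class) / P(free)
p_spam_given_free = p_free_given_spam * p_spam / p_free
p_ham_given_free = p_free_given_ham * p_ham / p_free

label = "spam" if p_spam_given_free > p_ham_given_free else "ham"
print(p_spam_given_free, p_ham_given_free, label)
```

With these assumed counts, P(spam|free) = 18/25 and P(ham|free) = 7/25, so the message is labelled spam. With several words, a naive Bayes classifier would multiply the per-word likelihoods under the independence assumption.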
How it works
1