Application of Data Mining Technique for Prediction of Academic Performance of Student: A Literature Survey

International Journal on Recent and Innovation Trends in Computing and Communication
Volume: 2 Issue: 12
ISSN: 2321-8169
3962 - 3965
_______________________________________________________________________________________________
Mr. Bhushan S. Olokar
ME 3rd Sem, Information Technology
Prof. Ram Meghe Institute of Technology & Research
Badnera-Amravati, India
bhushanolokae@outlook.com

Prof. Ms. V. M. Deshmukh
Associate Professor & Head, Information Technology
Prof. Ram Meghe Institute of Technology & Research
Badnera-Amravati, India
msvmdeshmukh@rediffmail.com
Abstract: The application of data mining in educational systems can be directed to support the specific needs of each participant in the
education system and process. Students need recommendations for additional activities, teaching material, and tasks that would favor and
improve their learning process. Professors would receive feedback and gain the ability to classify students into groups based on their need
for guidance and monitoring, to find mistakes, and to identify effective actions. Researchers have reported many prediction models for student
performance, with different approaches and techniques, but it remains unclear whether any predictors can accurately determine whether a
student will be a genius, a dropout, or an average performer. The target of this study is to apply the kernel method for mining data to analyze
the relationships between students' success and their behavior, and to develop a model for the prediction of students' academic performance.
This would be done using Support Vector Machine (SVM) classification and the kernel k-means clustering mechanism. Predicting students'
performance can help identify the students who are at risk of failure, so that management can provide timely help and take essential steps
to coach them to improve performance.
Keywords: Data mining, SVM, kernel-k-means, SOM, Student Performance.
__________________________________________________*****_________________________________________________
IJRITCC | December 2014, Available @ http://www.ijritcc.org

I. INTRODUCTION
Data mining is a powerful technology for analyzing important information in a data warehouse. It is a data analysis methodology used to identify hidden patterns in a large data set. Data mining is part of the knowledge discovery in databases (KDD) process, which aims at the discovery of useful information from large collections of data [2]. The main goal of data mining within the KDD process concerns the algorithmic means by which patterns or structures are enumerated from the data under acceptable computational efficiency limitations. Data mining has a wide range of applications, including the educational environment. In this environment, data mining is an interesting research area that extracts useful, previously unknown patterns from the database for better understanding, which in turn improves educational performance and the assessment of the student learning process [3].
There is increasing research interest in applying data mining to education. Data mining techniques are concerned with developing methods that discover knowledge from data and are used to uncover hidden or unknown information that is not readily visible but is highly useful [7]. The data can be personal or academic and can be used to understand students' behavior, to improve coursework, to improve teaching, and for many other benefits.
The prediction of academic performance is widely researched. The prediction of student success is still one of the most topical debates in higher education institutions. In previous studies, the model of Tinto [15] is the predominant theoretical framework for considering the factors that contribute to academic success. Tinto's model considers the process of student attrition as a psychological interplay between the characteristics of the student entering the university and the experience at the institute. The use of data mining techniques in this field is relatively advanced. Many data mining techniques have been used in this field, such as decision trees, Bayesian networks, and neural networks [1].
This study investigates the educational domain of data mining using a detailed case study based on data that mostly comes
from the behavior of students. It shows what type of data can be collected, how the data can be preprocessed, how to apply the kernel method for data mining to the collected data, and finally how we can benefit from the knowledge discovered in the collected data. In this case study, university students' final grades were predicted using SVM classification, and the students were grouped according to their similar characteristics by clustering. The clustering process was carried out using the kernel k-means algorithm.

1.1 Support Vector Machine (SVM) for Classification

Classification is a data mining task that predicts group membership for data instances [8]. In an educational application of classification, given the work of a student, one may predict his/her final grade. The Smooth Support Vector Machine (SSVM) is a further development of the Support Vector Machine (SVM) [10][14]. The SSVM generates and solves an unconstrained smooth reformulation of the SVM for pattern classification using a completely arbitrary kernel [10]. The SSVM is solved by a very fast Newton-Armijo algorithm and has been extended to nonlinear separation surfaces by using nonlinear kernel techniques. Numerical results show that the SSVM is faster than other methods and has better generalization ability [8].

1.2 An Effective Kernel K-Means for Clustering

Clustering groups objects such that the objects in one group are similar to one another and different from the objects in other groups [4]. In the educational field, clustering can be used to group students according to their behavior and performance. In this study we used the kernel K-means algorithm to cluster the given data. A drawback of the original K-means is that it cannot separate clusters that are not linearly separable in input space. Kernel K-means is one approach that has emerged for handling this problem. Before clustering, kernel K-means maps the points to a higher-dimensional feature space using a nonlinear function, and then partitions the points by linear separators in the new space [12]. Kernel K-means has been extended to efficient and effective large-scale clustering [4], since the original kernel K-means had serious problems, such as the high clustering cost due to the repeated calculation of kernel values, or insufficient memory to store the kernel matrix, which make it unsuitable for large corpora. The new clustering scheme is a large-scale clustering scheme for the kernel K-means algorithm [4].

II. LITERATURE REVIEW/SURVEY

The most cited literature survey in educational data mining is by Romero and Ventura [1], which identifies performance prediction as one of the emerging fields of educational data mining. Various Bayesian classification techniques have been used, and comparative studies suggest that ensemble methods give the best overall accuracy.
Cheewaprakobkit [13] considered 1600 student records from between 2001 and 2011 at a university in Thailand and applied decision tree and neural network classifiers to find the most important factors affecting students' academic achievement. The decision tree proved to be a better classifier than the neural network, with 1.311% higher accuracy. The number of hours worked per semester, an additional English course, the number of credits enrolled per semester, and the marital status of the students were the major factors affecting performance.
Bharadwaj and Pal [5] based their experiment only on previous semester marks, seminar performance, assignments, class test marks, attendance, and lab work to predict end-semester marks. Records of 50 MCA students of Purvanchal University from the 2007 to 2010 session were considered. The paper calculates the split information and gain ratio of each predictor and produces prediction rules.
Student dropout due to failure at the Open Polytechnic of New Zealand has been explored by Kovačić [9]. Enrollment data consisting of socio-demographic variables (age, gender, class, work status, education, and disability) and study environment variables (course program and course block) were collected for about 435 polytechnic students of the Information Systems course. The final label consisted of two categories: PASS (those who completed the course) and FAIL (those who did not complete it). Feature selection indicated that the most important attributes for prediction are ethnicity, course program, and course block.
The research had been motivated by a number of
practical data mining projects in which the Self-Organizing Map (SOM) has been a central data analysis tool [6]. It became apparent that while the SOM can be used to quickly create a qualitative overview of the data, turning this qualitative information into quantitative characterizations requires a great deal of expertise and manual work. There is no general consensus on, or understanding of, the methods needed for post-processing in SOM-based data analysis. The subsequent research has concentrated on devising such methods and on gaining a better understanding of the strengths, possibilities, and weaknesses of the SOM in data exploration.
The work in [8] applied classification as a data mining technique to estimate student performance, using the decision tree method. This study helps in the early identification of dropouts and students who need special attention, and allows teachers to provide appropriate guidance. The work in [9] also applied classification as a data mining technique to estimate student performance, using the decision tree method. This study allows university management to prepare the necessary resources for newly enrolled students to achieve the desired results, and indicates at an early stage which types of students will potentially be enrolled and which areas to concentrate on in the higher education system for support. The work in [10] applied association rule mining to students' failed courses to identify students' failure patterns. The main goal of that study is to identify hidden relationships between the failed courses and to suggest relevant causes of failure, in order to improve the performance of low-capacity students.
The work in [11] used the k-means clustering algorithm to predict students' learning activities. The information generated after applying the data mining technique may be helpful for instructors as well as for students. Another study, using the Bayesian classification method as a data mining technique, found that students' grades in the senior secondary exam, location, medium, mother's qualification, other habits, family annual income, and the status of the student's family were highly correlated with academic performance. Another study used simple linear regression analysis and found that factors such as mother's education and family income were highly correlated with student academic performance. A further study of student performance used the association rule technique and examined how interested students were in choosing the class teaching language.

III. PROPOSED WORK

Educational data mining is an emerging field concerned with the prediction of future performance. The objective of the proposed methodology is to build a classification model that classifies a student's performance. The model follows the standard process for data mining, which includes business and data understanding, data preparation, modeling, and finally the application of data mining techniques, which is classification in the present study. In particular, we will implement the rules in the SVM algorithm to predict the students' final grades. We also cluster the students into groups using kernel k-means clustering. This study expresses the strong correlation between the mental condition of students and their final academic performance. Data mining techniques (DMT) have potential in performance monitoring at universities and other levels of education, offering historical perspectives on students' performance. The results may both supplement and complement education performance monitoring and assessment implementations.

CONCLUSION
As we have seen, the classification task has been used on a student database to predict students' performance on the basis of previous database records. Many approaches are used for data classification. Information such as class test, attendance, seminar, innovative activities, and assignment marks was collected from the students' previous database records to predict performance at the end of each semester. This study will help students and teachers improve student performance. It will also help to identify those students who need special attention, so as to reduce the failure ratio and take appropriate action before the next academic examination.
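
To make the classification step concrete, the following Python sketch trains a plain linear soft-margin SVM by batch sub-gradient descent on the hinge loss and predicts a pass / at-risk label from class test, attendance, seminar, and assignment marks. It is only an illustration under stated assumptions: it is not the SSVM with the Newton-Armijo solver described in Section 1.1, the eight student records are made up, the marks are assumed to be scaled to [0, 1] and centred at a hypothetical midpoint of 0.5 so that no bias term is learned, and all function names are our own.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch sub-gradient descent on lam/2*||w||^2 + mean hinge loss.

    X: list of feature vectors, y: labels in {+1, -1}. No bias term is
    learned; the features are assumed to be centred beforehand.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]            # gradient of the regulariser
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:                     # hinge loss is active
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return +1 (likely to pass) or -1 (at risk) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def centre(row, midpoint=0.5):
    """Shift marks scaled to [0, 1] so that the assumed midpoint maps to 0."""
    return [v - midpoint for v in row]


# Made-up records: class test, attendance, seminar, assignment (scaled to [0, 1]).
raw_records = [
    [0.90, 0.95, 0.80, 0.90],
    [0.80, 0.90, 0.70, 0.80],
    [0.70, 0.85, 0.90, 0.70],
    [0.85, 0.80, 0.75, 0.95],
    [0.30, 0.40, 0.20, 0.30],
    [0.40, 0.45, 0.30, 0.20],
    [0.20, 0.30, 0.40, 0.40],
    [0.35, 0.45, 0.25, 0.35],
]
labels = [1, 1, 1, 1, -1, -1, -1, -1]            # +1 = pass, -1 = at risk

X = [centre(r) for r in raw_records]
w = train_linear_svm(X, labels)
train_predictions = [predict(w, x) for x in X]
new_student = predict(w, centre([0.75, 0.90, 0.80, 0.70]))
```

With such clearly separated toy data every training record is classified correctly. Real student records would need proper feature scaling, a held-out test set, and most likely a nonlinear kernel.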
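
The kernel K-means procedure of Section 1.2 can be sketched in a similar way. The code below is a minimal illustration under stated assumptions rather than the large-scale scheme of [4]: it builds the full RBF kernel matrix, starts from a simple deterministic assignment, and repeatedly reassigns each point to the cluster whose feature-space mean is nearest, using only kernel values. The six two-dimensional points, the value gamma = 1.0, and the function names are hypothetical.

```python
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(points)
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(points[i], points[j])))
             for j in range(n)] for i in range(n)]


def kernel_kmeans(K, k, max_iter=100):
    """Cluster n points given only their n x n kernel matrix K."""
    n = len(K)
    labels = [i % k for i in range(n)]           # simple deterministic start
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster.
        within = [sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue                     # skip empty clusters
                # ||phi(x_i) - mean_c||^2 written with kernel values only
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:                 # assignments are stable
            break
        labels = new_labels
    return labels


# Made-up two-dimensional feature vectors forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
cluster_labels = kernel_kmeans(rbf_kernel_matrix(points, gamma=1.0), k=2)
print(cluster_labels)                            # prints [0, 0, 0, 1, 1, 1]
```

Because the sketch stores the whole n x n kernel matrix, it has exactly the memory cost that the text identifies as a problem of the basic algorithm for large corpora.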
REFERENCES
[1] C. Romero and S. Ventura, "Educational data mining: a survey from 1995 to 2005," Expert Systems with Applications, no. 33, pp. 135-146, 2007.
[2] Heikki Mannila, "Data mining: machine learning, statistics, and databases," IEEE, 1996.
[3] Moucary et al., "Improving student performance using data clustering and neural networks in foreign language based higher education," The Research Bulletin of Jordan ACM, vol. II (III).
[4] Rong Zhang and Alexander I. Rudnicky, "A Large Scale Clustering Scheme for Kernel K-Means," School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA, 2006.
[5] B. K. Bhardwaj and S. Pal, "Mining Educational Data to Analyze Students' Performance," International Journal of Advanced Computer Science and Applications, Vol. 2, No. 6, 2011.
[6] T. Kohonen, Self-Organizing Maps, Series in Information Sciences, second edn., Springer, Heidelberg, 1997.
[7] Pavel Berkhin, "Survey of Clustering Data Mining Techniques," Accrue Software, Inc.
[8] Y. J. Lee and O. L. Mangasarian, "A Smooth Support Vector Machine for Classification," Journal of Computational Optimization and Applications, 20, 2001, pp. 5-22.
[9] Z. Kovačić, "Early prediction of student success: mining student enrolment data," paper presented at Proceedings of Informing Science & IT Education Conference (InSITE), Cassino, Italy, June 19-24, 2010.
[10] Santi W. P. and A. Embong, "Smooth Support Vector Machine for Breast Cancer Classification," IMT-GT Conference on Mathematics, Statistics and Applications (ICMSA), 2008.
[11] Christopher Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 2(2), 1998.
[12] Mark Girolami, "Mercer Kernel Based Clustering in Feature Space," IEEE Trans. on Neural Networks.
[13] P. Cheewaprakobkit, "Study of Factor Analysis Affecting Achievements of Undergraduate," paper presented at International MultiConference of Engineers and Computer Scientists, IMECS, Hong Kong, HK, March 13-15, 2013.
[14] Furqan, M., A. Embong, Suryanti, A., Santi W. P., and Sajadin, S., "Smooth Support Vector Machine for Face Recognition Using Principal Component Analysis," Proceeding 2nd International Conference on Green Technology and Engineering (ICGTE), 2009, Faculty of Engineering, Malahayati University, Bandar Lampung, Indonesia.
[15] V. Tinto, "Limits of theory and practice in student attrition," Journal of Higher Education, no. 53, pp. 687-700, 1982.