Essays
ABSTRACT:
Automatic essay-grading systems conventionally check linguistic structure and
grammar and, for descriptive essays, how closely the essay relates to the
topic. They rely primarily on Singular Value Decomposition (SVD) in Latent
Semantic Indexing (LSI), aided by NLP methods, to grade essays consistently and
uniformly. However, these methods can be supplemented, to bring accuracy closer
to human grading, by a better choice of the surface and complex parameters in
Content Vector Analysis (CVA), based on established rubrics and paradigms. This
project extends automated grading from descriptive essays to argumentative and
issue essays by also examining their logical structure. It focuses on
constructing the vectors and the document matrix with character n-grams as
features instead of the usual word features.
INTRODUCTION:
With the advent of online examinations such as GRE, GMAT and CET4, there has
been an increasing call for automation in the scoring process. Scoring objective
questions is considerably simpler and has been automated for years. Essays,
however, are still evaluated manually in most cases, because of the high
complexity involved in programming a system that performs as well as a human in
its cognition. With the evolution of advanced text database practices and
Natural Language Processing (NLP) techniques, this has become possible of late.
Any e-rater should offer several salient features, most importantly, the
following:
1. Speed: Score generated in a matter of seconds as against the time-consuming
manual correction.
2. Ease/Less fatigue: Process made easy by automation as against the laborious
manual task.
3. Equity: No place for unjust favoring, partiality or preference; all scores
are generated without bias.
4. Uniformity: Overcomes the problem of different mindsets or attitudes among
evaluators; ensures all essays are graded with the same outlook.
LITERATURE SURVEY:
[1] AUTOMATED ESSAY SCORING USING KNN ALGORITHM
Lin Bin, Lu Jun, Yao Jian-Min, Zhu Qiao-Ming
International Conference on Computer Science and Software Engineering
IEEE-2008
EXPLANATION:
Transformation of Essays into Vectors:
KNN
1. Memory based: Large storage space is required to hold the entire dataset.
2. Unreliable neighborhood (lack of overlap): Since the dataset usually gives
a sparse matrix, few values overlap between vectors, but similarity
measures need high overlap to be reliable.
3. Unsuitable for corporate datasets (due to sparseness): By the above
argument, since corporate datasets are usually sparse, KNN is less
suitable for them.
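The KNN approach and its memory-based drawback can be illustrated with a minimal sketch. It assumes word-count vectors, cosine similarity as the neighbor measure, and a score that is the mean of the k nearest graded essays; these choices, the function names and the toy data are illustrative assumptions, not details taken from the cited paper.

```python
import math
from collections import Counter


def to_vector(text):
    """Turn an essay into a sparse word-count vector (a simplifying assumption)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_score(essay, graded_essays, k=3):
    """Predict a score as the mean score of the k most similar graded essays.

    graded_essays is a list of (text, score) pairs. The whole list must stay
    in memory, which is the "memory based" drawback listed above.
    """
    if not graded_essays:
        raise ValueError("at least one graded essay is required")
    query = to_vector(essay)
    ranked = sorted(
        graded_essays,
        key=lambda pair: cosine(query, to_vector(pair[0])),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Hypothetical toy training set, for illustration only.
training = [
    ("the economy grows when trade is free", 5),
    ("free trade helps the economy grow", 4),
    ("my cat sleeps all day", 1),
    ("dogs and cats are pets", 2),
]
predicted = knn_score("trade makes the economy grow", training, k=2)
```

When two essays share no terms the cosine is zero, so in a sparse matrix many neighbors are indistinguishable; this is the unreliable-neighborhood problem described in item 2.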
[2] AUTOMATED ESSAY SCORING SYSTEM FOR CET4
Yali Li, Yonghong Yan
Second International Conference on Education Technology and Computer
Science
IEEE-2010
EXPLANATION:
Score-Determining Components:
1. Surface Features:
The number of characters in the document (Chars)
The number of words in the document (Words)
The number of different words (Diffwds)
The fourth root of the number of words in the document, as suggested by
Page (Rootwds)
The number of sentences in the document (Sents)
Average word length (Wordlen = Chars/Words)
Average sentence length (Sentlen = Words/Sents)
Sensitive to outliers
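The surface features listed above can be computed with a short sketch. The tokenisation rules are assumptions made for illustration: words are runs of letters or apostrophes, sentences end at '.', '!' or '?', and Chars counts only the characters inside words so that Chars/Words is a true average word length.

```python
import re


def surface_features(text):
    """Compute the surface features for one essay under the stated assumptions."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    n_sents = len(sentences)
    chars = sum(len(w) for w in words)
    return {
        "Chars": chars,
        "Words": n_words,
        "Diffwds": len({w.lower() for w in words}),  # case-insensitive
        "Rootwds": n_words ** 0.25,                  # fourth root of Words
        "Sents": n_sents,
        "Wordlen": chars / n_words if n_words else 0.0,
        "Sentlen": n_words / n_sents if n_sents else 0.0,
    }


features = surface_features("Essays are hard. Grading essays is harder!")
```

A single very long essay shifts count features such as Chars and Words sharply, which may be why Rootwds is used to damp the effect of length.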
ISSUES:
SVD causes the following performance issues:
Order of complexity of the algorithm: O(n^2k^3), which is very high
Requires normal distribution of terms: Words must be normally distributed
across the documents, but corporate datasets have a sparse distribution.
[4] AN EFFECTIVE AUTOMATED ESSAY SCORING SYSTEM
USING SUPPORT VECTOR REGRESSION
Yali Li, Yonghong Yan
Fifth International Conference on Intelligent Computation Technology
and Automation
IEEE-2012
PROPOSED SYSTEM:
Dataset Construction Using Character n-grams over words
Content Vector Analysis (CVA) over Latent Semantic Analysis (LSA)
Uses SVM
o Model Based
o Popular in text classification problems where very high-dimensional
spaces are the norm
Support Vector Regression
Evaluation of rhetorical arguments
Each argument = mini-document
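The dataset construction with character n-grams can be sketched as follows. The n-gram length of 3, the lowercasing and whitespace normalisation, and the function names are assumptions chosen for illustration; each row of the matrix stands for one document, which in the proposed system could be a single argument treated as a mini-document.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a text."""
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def build_document_matrix(documents, n=3):
    """Build a dense document-by-n-gram count matrix.

    Returns the sorted n-gram vocabulary (the columns) and one row of
    counts per document.
    """
    counts = [Counter(char_ngrams(doc, n)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[gram] for gram in vocabulary] for c in counts]
    return vocabulary, matrix


vocabulary, matrix = build_document_matrix(["abab", "bab"], n=2)
```

Because character n-grams overlap across word boundaries and inflected forms, two documents tend to share more features than with word features, which makes the matrix less sparse.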
I. Methodology:
II. Testing:
Cosine similarity is computed between the test vector and each document/class
vector
The class with the highest correlation is selected
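The testing step can be sketched as below. It assumes that each score class is represented by the sum of the character trigram vectors of its training essays; the class labels, function names and toy essays are hypothetical and serve only to illustrate the selection rule.

```python
import math
from collections import Counter, defaultdict


def ngram_vector(text, n=3):
    """Sparse character n-gram count vector of a text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[g] * v[g] for g in u if g in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_class_vectors(graded_essays, n=3):
    """Sum the n-gram vectors of all training essays that share a class."""
    class_vectors = defaultdict(Counter)
    for text, label in graded_essays:
        class_vectors[label].update(ngram_vector(text, n))
    return dict(class_vectors)


def classify(essay, class_vectors, n=3):
    """Return the class whose vector has the highest cosine with the essay."""
    test_vector = ngram_vector(essay, n)
    return max(class_vectors, key=lambda c: cosine(test_vector, class_vectors[c]))


# Hypothetical graded essays, for illustration only.
training = [
    ("free trade helps the economy", "high"),
    ("open markets help the economy grow", "high"),
    ("my cat sleeps", "low"),
    ("cats sleep all day", "low"),
]
class_vectors = build_class_vectors(training)
label = classify("the economy and trade", class_vectors)
```

Cosine similarity normalises for vector length, so a class with many training essays does not win simply because its summed vector is larger.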
CONCLUSION:
The grades assigned by this software in place of a human rater will be as
efficient as those obtained when a second rater is used. Any deviation that
arises can be adjusted by suitable mathematical models.