Bijoy Chhetri
Sr. Lecturer
Department of Computer Science and Engineering
Centre for Computers and Communication Technology
bijoychhetri@gmail.com
ABSTRACT
Keywords
BIG DATA, Hadoop, MapReduce, Cloud, NoSQL.
1. INTRODUCTION
BIG DATA is much like small data, but larger in scale, in
complexity and in the variety of ways it is generated.
Handling BIG DATA, however, requires newer technologies and a
different approach to managing larger datasets, with the aim
of solving new problems and of solving old problems in a
better way.
Data have become a torrent flowing into every area of the
global economy, and the science behind it is Data Science,
which deals with the collection, preparation, analysis,
visualization, management and preservation of large collections
of information. In other words, data science is the integration
of methods from statistics, computer science, and other fields
for gaining insights from data. In practice, data science
encompasses an iterative process of data harvesting, cleaning,
analysis and visualization, and implementation. Ultimately,
this interdisciplinary and cross-functional field leads to
decisions that move an organization forward, whether the
2. METHODOLOGIES USED TO
ANALYSE BIG DATA
Fig-2 represents the different phases of harnessing,
cleaning, analyzing and interpreting a huge set of data in
an efficient manner.
logs for example. These web logs are turned into browsing
sessions by running MapReduce programs on the cluster and
generating aggregated results from those sessions.
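The sessionization step described above can be illustrated with a minimal, single-machine sketch written in the MapReduce style. It is not the implementation used on any particular cluster: the record layout (user, timestamp, URL), the 30-minute inactivity threshold and all function names are assumptions made only for illustration. The map phase keys each hit by user, the shuffle groups hits per user, and the reduce phase splits each user's hits into sessions wherever the gap between consecutive hits exceeds the threshold.

```python
from collections import defaultdict

# Assumed inactivity threshold (seconds) that ends a browsing session.
SESSION_GAP = 30 * 60

# Hypothetical web-log records: (user_id, unix_timestamp, url).
LOGS = [
    ("u1", 0, "/home"),
    ("u1", 600, "/products"),
    ("u1", 5000, "/home"),
    ("u2", 100, "/home"),
]


def map_phase(records):
    """Emit (key, value) pairs keyed by user, as a mapper would."""
    for user, timestamp, url in records:
        yield user, (timestamp, url)


def shuffle(pairs):
    """Group values by key; a real framework does this between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(user, hits, gap=SESSION_GAP):
    """Split one user's hits into sessions separated by more than `gap` seconds."""
    sessions = []
    for timestamp, url in sorted(hits):
        if sessions and timestamp - sessions[-1][-1][0] <= gap:
            sessions[-1].append((timestamp, url))
        else:
            sessions.append([(timestamp, url)])
    return user, sessions


def sessionize(records, gap=SESSION_GAP):
    """Run map, shuffle and reduce in sequence on an in-memory log."""
    grouped = shuffle(map_phase(records))
    return dict(reduce_phase(user, hits, gap) for user, hits in grouped.items())


sessions_by_user = sessionize(LOGS)
# Aggregated result: number of browsing sessions per user.
session_counts = {user: len(s) for user, s in sessions_by_user.items()}
```

On a real Hadoop cluster the shuffle is performed by the framework and the mapper and reducer run in parallel over partitions of the log; the sketch only mirrors the data flow.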
Web Mining
Machine Learning Techniques
Crowd Sourcing
Genetic Algorithm
NLP
Sentiment Analysis
Visualization
Time Series Analysis
Hadoop
MapReduce
Bigtable
Cassandra
Mash-up
BigQuery
4. EXPERIMENTAL RESULTS
5. CONCLUSION
There is no question that enough data is now available to
overwhelm and overload traditional database management
systems, so new systems built for big data will extend, and
possibly replace, traditional DBMSs. The rate of data
acquisition is also accelerating quickly enough that we may
eventually coin a new term based on BIG DATA.
6. REFERENCES
A/B/N Testing