International Journal of Electronics & Communication Technology (IJECT), Vol. 5, Issue 3, Spl-1, July-Sept 2014
www.iject.org | ISSN: 2230-7109 (Online) | ISSN: 2230-9543 (Print)

A Comprehensive Review on k-Means Clustering Algorithm in Neural Networks

Neha, Neelam Chaudhary, Tanvir Singh
Centre for Development of Advanced Computing, Mohali, Punjab, India

Abstract
Neural networks are known for their ability to process complicated data and extract complex information in the form of patterns that cannot easily be noticed by human beings or by other computing techniques. There are various approaches to pattern recognition in neural networks, which are listed in this paper. Of these, the k-Means clustering algorithm is discussed along with simulation results, pros and cons, and applications.

Keywords
k-Means Clustering, Neural Networks, Pattern Recognition, MATLAB.

I. Introduction
Neural networks are inspired by the human brain, which performs certain computations much faster than digital computers. The basic component behind this fast computation is the neuron. The brain is a parallel computer that organizes its computing components, the neurons, to perform computation much faster than digital computers. One of the main advantages of a neural network is its ability to adapt its weights to changes in the surrounding environment. Fault tolerance is another main advantage.

II. Various Pattern Recognition Techniques
1. k-Means clustering
2. Kohonen maps or self-organizing maps
3. Back-propagation algorithm
4. Hopfield model
5. Perceptron model

k-Means clustering is among the oldest and most useful of these pattern recognition techniques and is the one discussed in this paper.

III. k-Means Clustering Algorithm
Clustering is a way of grouping a given set of patterns into clusters such that patterns in the same cluster are similar and patterns in different clusters are dissimilar. k-Means clustering is an algorithm that clusters objects based on some of their characteristics. As shown in Fig. 1 and Fig. 2, objects are grouped based on color. The K in the name of the algorithm is the number of clusters. K is a positive integer. The value of K for Fig. 1 is 6 and for Fig. 2 is 3. K is the input value.

Basically there are two types of learning: (A) supervised and (B) unsupervised. The k-Means algorithm comes under the category of unsupervised learning. That is, the procedure does not use any prior knowledge about the data, such as class labels. It just follows the logic described below [1-2].

Fig. 1: Clusters for k-Means Algorithm
Fig. 2: k-Means Cluster

Algorithm:
1. Initialize the cluster centers. Any number of clusters can be initialized.
2. Calculate the Euclidean distance of each object from each cluster center. For an object x and a cluster center c, each with n features, the Euclidean distance is given by $d(x, c) = \sqrt{\sum_{i=1}^{n} (x_i - c_i)^2}$.
3. Place the object in the cluster with the minimum distance d.
4. Calculate the mean of the newly grouped data points; this mean becomes the updated center of the respective cluster [14].
5. Repeat steps 2-4 until convergence [3-5].

These steps are illustrated in Fig. 4-7, whereas Fig. 3 shows the flow chart of the steps involved in k-Means clustering.

Fig. 3: Flow Chart for Algorithm
Fig. 4: Initial Representation
Fig. 5: Grouping Under Clusters
Fig. 6: Re-Calculate the Cluster
Fig. 7: Final Representation

IV. Hierarchical vs Non-Hierarchical Approach
Hierarchical clustering is a method of clustering that consists of overlapping groups, i.e., a cluster is a subset of another cluster [13]. It therefore produces a nesting of clusters. Non-hierarchical clustering is a method that consists of non-overlapping groups with no hierarchy. It involves much less computation than the hierarchical method. k-Means clustering is categorised as non-hierarchical clustering. The non-hierarchical method is used on large, high-dimensional datasets.

V. Simulation and Result
We generated data divided into two groups and into four groups. k-Means determines the group to which each data point belongs. The k-Means algorithm works on two inputs: the raw data we provide and the number of cluster groups into which the data has to be divided.

    close all, clear all, clc
    % Consider some random data with two groups
    n = 40;                                  % sample size per group
    x = [randn(n,1)+2; randn(n,1)+5];
    y = [randn(n,1)+2; randn(n,1)+5];
    % The group identity of each point
    groups = [ones(n,1); ones(n,1)+1];
    % Plot the data, colored by group
    scatter(x, y, 70, groups, 'filled')

Fig. 8 shows the MATLAB plot of the random data with two groups.

Fig. 8: MATLAB Plot Showing Random Data With Two Groups

k-Means divides the given data into different cluster groups by calculating their centroids. Here we divide our data into two clusters, as shown in Fig. 9.

    data = [x, y];
    IDX = kmeans(data, 2);   % divide the data into two clusters
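The five steps listed in Section III (initialize, compute distances, assign, update the means, repeat) can also be written out without the built-in kmeans function. The following is only a minimal sketch under stated assumptions, not the implementation used for the figures in this paper: K = 2, the initial centers are picked as random data points, the iteration cap maxIter is a hypothetical safety limit, and squared Euclidean distance is used because it selects the same nearest center as the Euclidean distance itself.

```matlab
% Minimal k-means sketch following steps 1-5 of Section III (base MATLAB only).
% Assumptions: K = 2, centers initialized from randomly chosen data points,
% and maxIter is a hypothetical cap on the number of iterations.
rng(0);                                   % fix the seed so runs are repeatable
n = 40;                                   % points per group, as in Section V
data = [randn(n,2) + 2; randn(n,2) + 5];  % two groups near (2,2) and (5,5)
K = 2;                                    % number of clusters (input value)
maxIter = 100;

% Step 1: initialize the cluster centers with K distinct data points
perm = randperm(size(data,1));
centers = data(perm(1:K), :);

idx = zeros(size(data,1), 1);             % current cluster label of each point
for iter = 1:maxIter
    % Step 2: squared Euclidean distance of every point to every center
    dist = zeros(size(data,1), K);
    for k = 1:K
        diffs = data - repmat(centers(k,:), size(data,1), 1);
        dist(:,k) = sum(diffs.^2, 2);
    end

    % Step 3: assign each point to the nearest center
    [~, newIdx] = min(dist, [], 2);

    % Step 5: stop when no assignment changes
    if isequal(newIdx, idx)
        break;
    end
    idx = newIdx;

    % Step 4: move each center to the mean of its assigned points
    for k = 1:K
        members = data(idx == k, :);
        if ~isempty(members)              % keep the old center if a cluster is empty
            centers(k,:) = mean(members, 1);
        end
    end
end

% Plot the points colored by cluster, with the final centers marked
scatter(data(:,1), data(:,2), 50, idx, 'filled');
hold on;
plot(centers(:,1), centers(:,2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
hold off;
```

Changing the seed passed to rng changes the initial centers, which is a simple way to see how the result can depend on initialization.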
Fig. 9: Plot Showing Data Divided Into Two Clusters

k-Means always divides the data distribution into k groups, even when the data contain fewer natural groups than k.

    % Take some random data generated around two overlapping centers
    n = 80;                                  % sample size per group
    x = [randn(n,1)+3; randn(n,1)+5];
    y = [randn(n,1)+3; randn(n,1)+5];
    % Plot the original data
    subplot(1,2,1)
    plot(x, y, 'ok', 'MarkerFaceColor', 'k')
    % Divide the data into four clusters using k-Means
    data2 = [x, y];
    IDX = kmeans(data2, 4);
    % Plot the k-Means results
    subplot(1,2,2)
    scatter(x, y, 50, IDX, 'filled')

Fig. 10 shows the original data and the data clustered into four groups.

Fig. 10: Plot Showing Original Data and Clustered Data Into Four Groups

VI. Pros and Cons
Pros:
1. It is inexpensive compared to other clustering techniques.
2. It is fast and easy to understand.
3. The algorithm is more responsive when the data patterns are well separated [11, 12].
4. The technique produces tighter clusters, especially when the clusters are globular. Globular clusters have well-defined centers and are circular or elliptical in shape.
5. With a large number of variables, it runs faster when the value of K is small.

Cons:
1. Noisy patterns are difficult to handle.
2. It is difficult to predict a suitable value of K, because the number of clusters is fixed in advance.
3. It does not work well for non-globular clusters. Non-globular clusters are those which do not have well-defined centers and have a chain-like shape, as shown in Fig. 11.
4. Different initial representations of the data give different results.
5. The number of cluster centers must be specified in advance.
6. A random choice of cluster centers does not produce good results [5].

Fig. 11: Non-Globular Clusters

VII. Application of k-Means
1. The k-Means clustering algorithm is in use by Covenant University for prediction of the academic performance of students [8].
2. Identification of fraudulent credit card transactions and risky loan applications, i.e., detection of abnormal data [9].
3. It is used for pattern and image recognition.
4. It is used in search engines, identification of cancerous data, wireless sensors, and drug activity prediction [10, 15].

VIII. Conclusion
Automatic (machine) recognition, description, classification, and grouping of patterns are important problems in a variety of engineering and scientific disciplines such as biology, psychology, medicine, marketing, computer vision, artificial intelligence, and remote sensing. Pattern recognition is the study of how machines can observe the environment, learn to distinguish the patterns of interest, and make decisions based on them. The k-Means clustering algorithm is an efficient algorithm for large data sets and, as noted in Section IV, involves much less computation than hierarchical methods. Moreover, this method is fast, easy to understand, and inexpensive.

References
[1] k-Means Clustering Tutorial, [Online] Available: http://sigitwidiyanto.staff.gunadarma.ac.id/Downloads/files/38034/M8-Note-kMeans.pdf
[2] Suman Tatiraju, Avi Mehta, "Image Segmentation using k-Means clustering, EM and Normalized Cuts", [Online] Available: http://www.ics.uci.edu/~dramanan/teaching/ics273a_winter08/projects/avim_report.pdf
[3] OnMyPhD, K-Means Clustering, [Online] Available: http://www.onmyphd.com/?p=k-Means.clustering
[4] Andrew W. Moore, "K-Means and Hierarchical Clustering", [Online] Available: http://www.cs.cmu.edu/~cga/ai-course/kmeans.pdf
[5] Ios, K-Means Clustering, [Online] Available: http://www.improvedoutcomes.com/docs/WebSiteDocs/Clustering/K-Means_Clustering_Overview.htm
[6] Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, Angela Y. Wu, "An Efficient K-Means Clustering Algorithm: Analysis and Implementation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 7, July 2002.
[7] A Tutorial on Clustering Algorithms, [Online] Available: http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/
[8] [Online] Available: http://covenantuniversity.edu.ng/
[9] James McCaffrey, Detecting Abnormal Data Using k-Means Clustering, [Online] Available: http://msdn.microsoft.com/en-us/magazine/jj891054.aspx
[10] Clustering Algorithm Applications, [Online] Available: https://sites.google.com/site/dataclusteringalgorithms/clustering-algorithm-applications
[11] Simon Haykin, Neural Networks, Second Edition, Pearson Education.
[12] Kardi Teknomo, [Online] Available: http://people.revoledu.com/kardi/tutorial/kMean/index.html
[13] Schikuta, "Grid Clustering: An Efficient Hierarchical Clustering Method for Very Large Data Sets", Proc. 13th Intl. Conference on Pattern Recognition, 2, 1996.
[14] R. C. Dubes, A. K. Jain, "Algorithms for Clustering Data", Prentice Hall, 1988.
[15] K. Mehrotra, C. Mohan, S. Ranka, "Elements of Artificial Neural Networks", MIT Press, 1996.

Neha is pursuing her Master's degree in Embedded Systems at the Centre for Development of Advanced Computing, Mohali, Punjab. She received her B.Tech (Electronics and Communication) from Punjab Technical University. She has one year of experience as a Quality Engineer at Cenzer Industries Limited, Baddi. Her areas of interest are computer architecture and embedded systems.

Neelam Chaudhary is pursuing her Master's degree in Embedded Systems at the Centre for Development of Advanced Computing, Mohali, Punjab. She received her Bachelor's degree (B.Tech, Electronics & Communication Engineering) from Meerut Institute of Engineering and Technology, Meerut. She has worked as a Lecturer in an electronics and communication department for one year. She has published review/research papers in international journals and conferences. Her areas of interest include environmental sustainability and embedded system design.

Tanvir Singh is pursuing his Master's degree in Embedded Systems at the Centre for Development of Advanced Computing, Mohali, Punjab. He received his Bachelor's degree (Electronics and Communication Engineering) from IET Bhaddal Technical Campus, Punjab. His areas of interest include environmental sustainability in wireless communication networks and electromagnetic radiation, with a dream to create a technically advanced and eco-friendly world. He has published 50+ review/research papers in international journals and conferences. He has started a group named Green Thinkerz to promote environmental sustainability (facebook.com/greenthinkerz).