ABSTRACT:
In this project we have attempted to design software that can classify a particular cancer into its subtype faster and more accurately than the conventional method, with minimal human intervention, knowledge or experience, leading to better prognosis and higher chances of survival for cancer patients.
WHAT IS CANCER?
CAUSES, DIAGNOSIS, PROGNOSIS
Oncogenes: genes which promote cell growth and reproduction.
Tumour suppressor genes: genes which inhibit cell division and survival.
Over-expression or under-expression of these genes leads to cancer.
The prognosis is directly related to both the type and stage of the cancer.
Medical tests: false positives and false negatives. Subclassification accuracy depends largely on the doctor's knowledge and experience.
Faster sub-classification of cancer. Minimal human intervention required for sub-classification. More accurate than traditional techniques.
High accuracy in classification leads to better prognosis and treatment of patients. Future scope: building techniques for the prevention of cancer by studying genetic information.
PROJECT OVERVIEW:
1. DATA COLLECTION
2. PREPROCESSING
3. TRAINING ANN
4. TESTING
5. CANCER SUBCLASSIFICATION
6. ACCURACY MET?
WHAT IS MGE?
DNA microarray: a collection of microscopic DNA spots attached to a solid surface. DNA microarrays are used to measure the expression levels of large numbers of genes simultaneously.
Core principle: hybridisation
Microarrays are a new technology that can automate the diagnostic process and improve on the accuracy of traditional diagnostic processes.
Examine the expression of thousands of genes at once => test for genes with elevated expression => predict cancer
LUNG CANCER
ADENOCARCINOMA
BREAST CANCER
BASAL
APOCRINE
2. DATA PREPROCESSING
MGE data is huge. Large input size -> large ANN -> long training time -> poor generalization. It also requires a large amount of memory.
DATA COMPRESSION
DCT
DWT
The DCT (discrete cosine transform) is used to separate the image into parts (or spectral sub-bands) of differing intensity.
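A minimal sketch of how DCT-based compression of one expression profile could look, assuming the compression simply keeps the first k low-frequency coefficients. The DCT-II matrix is built by hand so that no toolbox is needed. The synthetic profile, the variable names and the choice k = 778 (the DCT figure quoted for 7480 inputs) are illustrative assumptions, not the project's actual code.

```matlab
% Synthetic stand-in for one microarray profile (assumption: real data differs)
N = 7480;                                  % number of gene-expression inputs
t = (0:N-1)';
x = sin(2*pi*t/N) + 0.5*cos(6*pi*t/N);     % smooth test signal

k = 778;                                   % DCT coefficients to keep

% Build only the first k rows of the orthonormal DCT-II matrix
n = 0:N-1;
m = (0:k-1)';
C = sqrt(2/N) * cos(pi * m * (2*n + 1) / (2*N));   % k-by-N
C(1,:) = C(1,:) / sqrt(2);                 % scale the DC row

y    = C * x;                              % k compressed features for the ANN
xhat = C' * y;                             % approximate reconstruction

compression = 100 * (1 - k/N);             % percentage of inputs removed
relErr = norm(x - xhat) / norm(x);
fprintf('kept %d of %d inputs (%.1f%% compression), relative error %.2e\n', ...
        k, N, compression, relErr);
```

The vector y would then replace x as the network input, which is what shrinks the input layer.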
Wavelet analysis reveals aspects of data that other signal analysis techniques miss, such as trends, discontinuities and self-similarity. It can compress or de-noise a signal without appreciable degradation. The wavelet transform is defined as the sum over all time of the signal multiplied by scaled, shifted versions of the mother wavelet function.
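A rough sketch of multilevel wavelet compression, assuming the Haar wavelet (the simplest mother wavelet) and keeping only the approximation coefficients at each level. The deck does not say which wavelet was used, so the Haar choice, the synthetic signal and the padding rule are assumptions. With longer wavelet filters the coefficient count comes out slightly higher than the Haar count shown here.

```matlab
% Level-4 Haar approximation of a length-7480 profile (Haar is an assumption)
x = cumsum(randn(7480, 1));     % synthetic stand-in for an expression profile
levels = 4;                     % e.g. "Dwt-4"

a = x;
for lev = 1:levels
    if mod(numel(a), 2) == 1
        a(end+1) = a(end);      % pad odd-length signals by repeating the last sample
    end
    % Haar scaling filter: scaled sums of neighbouring pairs (approximation part)
    a = (a(1:2:end) + a(2:2:end)) / sqrt(2);
end

fprintf('%d inputs -> %d approximation coefficients (%.1f%% compression)\n', ...
        numel(x), numel(a), 100 * (1 - numel(a)/numel(x)));
```

Each level halves the number of coefficients, so level n reduces the input size by roughly a factor of 2^n.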
COMPRESSION LEVEL
COMPRESSION TECHNIQUE   LUNG CANCER (22283 inputs)   SKIN CANCER (7480 inputs)                           BREAST CANCER (7480 inputs)
DCT                     90% (2283 inputs)            89% (778 inputs)                                    89% (778 inputs)
DWT                     Dwt-5, 96% (707 inputs)      Dwt-4, 93% (472 inputs); Dwt-3, 87% (939 inputs)    Dwt-4, 93% (472 inputs)
3. NEURAL NETWORK: BUILDING, TRAINING AND TESTING
The motivation for the development of neural network technology stemmed from the desire to mimic the human brain.
A neural network is a powerful data-modeling tool that is able to capture and represent complex input/output relationships.
Self-learning
Real Fault
TRANSFER FUNCTIONS:
LEARNING METHODS:
I.
II.
NEURAL NETWORK
[Figure: network diagram, layer sizes shown as 472, 230 and 2]
TRAINING ALGORITHMS:
1. ERROR BACKPROPAGATION:
Choose initial weights for the network.
Feed in the input and obtain the result for each neuron.
Feed forward the output of each layer to the next layer.
Calculate the error for each neuron, and backpropagate it.
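The steps above can be sketched on a toy problem. This is a minimal illustration, assuming a single hidden layer, sigmoid transfer functions and plain batch gradient descent on the XOR data. The layer size, learning rate and epoch count are arbitrary choices, not the settings used in the project.

```matlab
rng(1);                                  % fixed seed so the run is repeatable
X = [0 0 1 1; 0 1 0 1];                  % inputs, one column per sample
T = [0 1 1 0];                           % targets (XOR)
nHidden = 3;  lr = 0.5;  epochs = 5000;

% Step 1: choose initial (random) weights; the last column holds the bias
W1 = randn(nHidden, 3);
W2 = randn(1, nHidden + 1);
sigmoid = @(z) 1 ./ (1 + exp(-z));
nS = size(X, 2);

mse = zeros(1, epochs);
for ep = 1:epochs
    % Steps 2-3: feed the input forward, layer by layer
    A0 = [X; ones(1, nS)];
    A1 = [sigmoid(W1 * A0); ones(1, nS)];
    Y  = sigmoid(W2 * A1);

    % Step 4: output error, then backpropagate it to the hidden layer
    E  = Y - T;
    d2 = E .* Y .* (1 - Y);
    d1 = (W2(:, 1:nHidden)' * d2) .* A1(1:nHidden, :) .* (1 - A1(1:nHidden, :));

    % Gradient-descent weight update
    W2 = W2 - lr * d2 * A1';
    W1 = W1 - lr * d1 * A0';
    mse(ep) = mean(E .^ 2);
end
fprintf('MSE: first epoch %.4f, last epoch %.4f\n', mse(1), mse(end));
```

Depending on the initial weights, a run like this can settle in a local minimum rather than the global one, which is the motivation for the variants that follow.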
[Figure: curve plotted against Epoch, with a LOCAL MINIMUM and the GLOBAL MINIMUM marked]
Algorithms:
1. Gradient Descent With Momentum Backpropagation
2. Gradient Descent With Adaptive Learning Rate Backpropagation
3.
4.
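The first two variants named above can be sketched on a toy quadratic loss that stands in for the network error. The loss, the constants (lr, mc, lrInc, lrDec) and the accept/reject rule are illustrative assumptions chosen to show the idea, not the project's training settings.

```matlab
% Toy loss f(w) = 0.5 * w' * A * w with gradient A * w (stand-in for the ANN error)
A = [10 0; 0 1];
f = @(w) 0.5 * (w' * A * w);
g = @(w) A * w;

% 1) Gradient descent with momentum
w = [1; 1];  v = [0; 0];
lr = 0.05;  mc = 0.9;                 % learning rate and momentum constant
for it = 1:200
    v = mc * v - lr * g(w);           % blend the previous step with the new gradient step
    w = w + v;
end
lossMomentum = f(w);

% 2) Gradient descent with adaptive learning rate
w = [1; 1];  lr = 0.05;
lrInc = 1.05;  lrDec = 0.7;           % grow / shrink factors
for it = 1:200
    wNew = w - lr * g(w);
    if f(wNew) < f(w)
        w = wNew;  lr = lr * lrInc;   % step helped: accept it and speed up
    else
        lr = lr * lrDec;              % step hurt: reject it and slow down
    end
end
lossAdaptive = f(w);

fprintf('final loss: momentum %.3e, adaptive lr %.3e\n', lossMomentum, lossAdaptive);
```

Momentum carries the search through shallow dips in the error surface, while the adaptive rate removes the need to hand-tune a single step size.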
RESILIENT PROPAGATION:
3. K-MEANS CLUSTERING
kmeans partitions the observations in your data into K mutually exclusive clusters.
It returns a vector of indices indicating to which of the K clusters it has assigned each observation.
The result is a set of clusters that are as compact and well-separated as possible.
CODE: kmeans
X = [p1; p2; p3; p15];           % input data, one profile per row
k = 3;                           % number of clusters
[cidx, ctrs] = kmeans(X, k);     % cidx: cluster index of each sample; ctrs: cluster centroids

figure
for c = 1:k
    subplot(k, 1, c);
    plot(X(cidx == c, :)');      % profiles assigned to cluster c
end
suptitle('K-Means Clustering of Profiles');

figure
for c = 1:k
    subplot(k, 1, c);
    plot(ctrs(c, :)');           % centroid of cluster c
end
suptitle('K-Means Clustering of Profiles showing only centroids');

figure
[silh3, h] = silhouette(X, cidx, 'cityblock');
xlabel('Silhouette Value')
ylabel('Cluster')
46
47
48
Silhouette plot:
A silhouette plot gives an idea of how well-separated the resulting clusters are.
It is a measure of how close each point in one cluster is to points in the neighboring clusters. The silhouette plot aids in discovering new subtypes of cancer.
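For reference, the silhouette value of a point is conventionally defined as below. The symbols are introduced here for illustration: a(i) is the mean distance from point i to the other points in its own cluster, and b(i) is the smallest mean distance from point i to the points of any other cluster.

```latex
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1
```

Values near 1 indicate a point well inside its cluster, values near 0 a point on the border between two clusters, and negative values a point that is probably assigned to the wrong cluster.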
4. RESULTS
BREAST CANCER:
ALGORITHM                           DCT      DWT-4    DWT-3
BACKPROPAGATION                     93.61%   97.58%   94.65%
BP WITH MOMENTUM                    93.66%   97.76%   97.13%
BP WITH VARIABLE LEARNING RATE      97.26%   97.78%   97.26%
BP WITH VARIABLE LR AND MOMENTUM    98.03%   98.70%   98.61%
RESILIENT PROPAGATION               99.00%   99.75%   99.70%
SKIN CANCER:
LUNG CANCER:
ALGORITHM              DCT       DWT-5
1. BP                  89.66     89.77
2. BP WITH MOMENTUM    86.695    91.04
91.97
91.51
98.52
99.52
2) Each time the power is switched off, the network loses its training.
3) Each time the network is run, it randomly initialises its weights, so the accuracy differs between runs.
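One possible way to soften both limitations is sketched here, under the assumption that the weights can be handled as ordinary MATLAB variables: fix the random seed so that initialisation is repeatable, and save the trained weights to disk so that they survive a restart. The matrix W_first is only a stand-in for real network weights, and the file name is hypothetical.

```matlab
rng(42);                         % fix the seed so the "random" initial weights repeat
W_first = randn(3, 4);           % stand-in for a network's initial weights
rng(42);
W_again = randn(3, 4);           % same seed -> identical initial weights

modelFile = fullfile(tempdir, 'trained_weights.mat');   % hypothetical file name
save(modelFile, 'W_first');      % persist the (trained) weights to disk
S = load(modelFile);             % later session: restore instead of retraining
fprintf('repeatable init: %d, restored from disk: %d\n', ...
        isequal(W_first, W_again), isequal(S.W_first, W_first));
```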
To
Extending
Discovery
Study
THANK YOU.