
An empirical framework for genome-wide single nucleotide polymorphism-based predictive modeling

Charalampos S. Floudas, MD, PhD, MS; Jeya Balaji Balasubramanian, MS; Marjorie Romkes, PhD; Vanathi Gopalakrishnan, PhD

Department of Biomedical Informatics

TBI 2013

A workflow for prediction in cancer


Predicting risk of early recurrence in early stage non-small cell lung cancer (NSCLC)
SNPR workflow
Genome-wide single nucleotide polymorphisms (SNPs)
Bayesian rule learning (BRL) system


Department of Biomedical Informatics PRoBE Lab


Translational Bioinformatics
Includes prediction of clinical outcomes from available genomic data
Genomic data:
High-dimensional
Many modalities
Different aspects of disease


Translational Bioinformatics
Multiple clinical outcomes
Combinations of datasets and outcomes
Collaborative effort
Many tools available
Workflows
Flexibility of design
Reproducibility of research


Core elements
Subjects: 86 early stage NSCLC patients
University of Pittsburgh Cancer Institute Lung SPORE cohort

Importance
Predictors dataset: Affymetrix SNP Array 6.0, 1 million SNPs
Outcome: categorical disease-free survival (DFS), good vs. poor, 1952 days
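The categorical outcome can be illustrated with a small sketch. It assumes that 1952 days is the cut-off separating the two classes and that a DFS at or above the cut-off counts as good; the slide states neither the direction nor how censored subjects were handled. The function name and the sample values are hypothetical.

```python
# Hypothetical sketch: dichotomize disease-free survival (DFS) given in days.
DFS_CUTOFF_DAYS = 1952  # value from the slide; its use as a class boundary is an assumption


def categorize_dfs(dfs_days, cutoff=DFS_CUTOFF_DAYS):
    """Return 'good' when DFS reaches the cutoff, otherwise 'poor' (assumed direction)."""
    if dfs_days < 0:
        raise ValueError("DFS must be non-negative")
    return "good" if dfs_days >= cutoff else "poor"


# Made-up DFS values, for illustration only (not study data)
example_dfs = {"S1": 310, "S2": 1952, "S3": 2400}
labels = {sid: categorize_dfs(days) for sid, days in example_dfs.items()}
```

A time-to-event analysis would keep the raw DFS values and censoring status instead of collapsing them into two classes.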

Workflow tools
Affymetrix Genotyping Console
Quality control (QC), genotype calling

After QC: 67 samples (50 poor DFS, 17 good)

PLINK
QC of genotypes (MAF, etc.)
Feature selection (χ²) for BRL and export of features

BRL system
Predictive rules (sets of SNPs) and metrics
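The minor allele frequency (MAF) part of genotype QC can be sketched in a few lines. This is not PLINK's implementation. It assumes genotypes coded as the number of copies of one allele (0, 1 or 2) with `None` for a missing call, and the 0.05 threshold is an assumption, since the slides do not give the value used.

```python
def minor_allele_frequency(genotypes):
    """MAF from allele counts (0, 1, 2 copies of allele B); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq_b = sum(called) / (2 * len(called))
    return min(freq_b, 1.0 - freq_b)


def filter_by_maf(snp_genotypes, min_maf=0.05):
    """Keep SNPs whose MAF is at least min_maf (threshold is an assumption)."""
    return {snp: g for snp, g in snp_genotypes.items()
            if minor_allele_frequency(g) >= min_maf}


# Toy genotypes, for illustration only
toy = {
    "snp_a": [0, 1, 2, 1, None, 0],  # allele B frequency 4/10
    "snp_b": [0, 0, 0, 0, 0, 0],     # monomorphic, removed by the filter
    "snp_c": [2, 2, 2, 2, 2, 1],     # allele B frequency 11/12, MAF 1/12
}
kept = filter_by_maf(toy)
```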

BRL system elements


Rule learner (RL)
Bayesian Rule Learner (BRL)
Bayesian scoring induces Bayesian networks
Rule models
Global (GBRL): full
Local (LBRL): decision tree representation


Workflow tools
SQLite
Fine selection of datasets and clinical parameters

Unix command line tools


Operations on datasets (Affymetrix genotypes to PLINK, PLINK selected features to BRL)
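Selecting subjects by clinical parameters with SQLite might look like the following sketch, written with Python's standard `sqlite3` module. The table name, column names and rows are hypothetical; the slides do not describe the actual schema.

```python
import sqlite3

# In-memory database with a hypothetical clinical table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subjects ("
    "subject_id TEXT PRIMARY KEY, stage TEXT, passed_qc INTEGER, dfs_class TEXT)"
)
conn.executemany(
    "INSERT INTO subjects VALUES (?, ?, ?, ?)",
    [
        ("S1", "I", 1, "poor"),
        ("S2", "II", 1, "good"),
        ("S3", "I", 0, "poor"),   # failed genotype QC, excluded below
        ("S4", "II", 1, "poor"),
    ],
)

# Keep only subjects that passed QC, together with their outcome class
rows = conn.execute(
    "SELECT subject_id, dfs_class FROM subjects "
    "WHERE passed_qc = 1 ORDER BY subject_id"
).fetchall()
conn.close()
```

The resulting subject list could then be written to a file and passed to the command-line steps that move genotypes between tools.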


Results - feature selection


100 SNPs from PLINK χ²
44 intragenic -> 33 genes
Functional analysis (Ingenuity IPA)
Most significantly associated disease is cancer (9 of 33 genes)
Most significantly associated biological function is cell-to-cell signaling and interaction (8 of 33 genes)
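Ranking SNPs by an association statistic and keeping the top ones can be sketched as below. The sketch assumes a 1-df allelic chi-square test on a 2x2 table of allele counts by outcome class, with genotypes coded as 0, 1 or 2 copies of one allele and no missing calls. It is an illustration only, not PLINK's code, and the toy data are made up.

```python
def allelic_chi2(class1_genotypes, class2_genotypes):
    """1-df chi-square on a 2x2 table of allele counts (B vs. A) by outcome class."""
    a = sum(class1_genotypes)                # B alleles in class 1
    b = 2 * len(class1_genotypes) - a        # A alleles in class 1
    c = sum(class2_genotypes)                # B alleles in class 2
    d = 2 * len(class2_genotypes) - c        # A alleles in class 2
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # monomorphic SNP or empty class: no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def top_k_snps(data, k):
    """Return the k SNP names with the largest chi-square statistic."""
    scores = {snp: allelic_chi2(poor, good) for snp, (poor, good) in data.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy genotypes as (poor DFS group, good DFS group), for illustration only
toy = {
    "snp_x": ([2, 2, 1, 2], [0, 0, 1, 0]),
    "snp_y": ([1, 1, 1, 1], [1, 1, 1, 1]),
    "snp_z": ([2, 1, 1, 1], [1, 1, 0, 1]),
}
selected = top_k_snps(toy, 2)
```

With `k=100` and real genotypes, the same ranking would yield a feature set of the size reported here.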

Results - feature selection


CHODL (chondrolectin) gene
associated with shorter survival in NSCLC

CDH13 (cadherin 13) gene
hypermethylated in NSCLC

CHST11 (carbohydrate (chondroitin 4) sulfotransferase 11) gene
associated with lung colonization in breast cancer

Results - BRL prediction


5-fold cross-validation
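A 5-fold split of the 67 post-QC samples (50 poor DFS, 17 good) can be sketched as follows. Stratifying the folds by class is an assumption, as the slides do not say how the folds were drawn; the function name and the seed are hypothetical.

```python
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds, dealing each class round-robin after shuffling."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


# Class sizes taken from the post-QC counts on the workflow slide
labels = ["poor"] * 50 + ["good"] * 17
folds = stratified_folds(labels)

splits = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A learner would be trained on train_idx and evaluated on test_idx here.
    splits.append((train_idx, test_idx))
```

Each fold then holds 10 poor-DFS samples and 3 or 4 good-DFS samples, so every test fold contains both classes.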


Conclusions
Our empirical workflow (SNPR)
Efficiently overcomes challenges of prediction using high-dimensional datasets
Achieves biological relevance and good predictive performance

Can be generalized and adapted


Other experimental platforms, data mining tasks

Limitations
Small sample size
No independent testing cohort
Categorization of survival instead of time-to-event analysis


Acknowledgments
Cancer Biomarkers Facility of the University of Pittsburgh Cancer Institute, award P30CA047904
Grant support:
National Cancer Institute Award Number P50CA090440
National Library of Medicine Award Number R01LM010950
National Institute of General Medical Sciences Award Number R01GM100387

Thank you
chf35@pitt.edu
