Charalampos S. Floudas, MD, PhD, MS; Jeya Balaji Balasubramanian, MS; Marjorie Romkes, PhD; Vanathi Gopalakrishnan, PhD
TBI 2013
Translational Bioinformatics
Includes prediction of clinical outcomes from available genomic data
Genomic data:
High-dimensional
Many modalities
Different aspects of disease
Translational Bioinformatics
Multiple clinical outcomes
Combinations of datasets and outcomes
Collaborative effort
Many tools available
Workflows
Flexibility of design
Reproducibility of research
Core elements
Subjects: 86 early-stage NSCLC patients
University of Pittsburgh Cancer Institute Lung SPORE cohort
Importance
Predictors dataset: Affymetrix SNP Array 6.0, 1 million SNPs
Outcome: categorical disease-free survival (DFS), good vs. poor, 1952 days
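The categorical outcome can be illustrated with a minimal sketch. It assumes that 1952 days is the cutoff separating good from poor DFS and that subjects at or above the cutoff count as good; the slide states neither point explicitly. The function name, the subject IDs, and the example values are hypothetical, and censoring is ignored.

```python
# Assumption: 1952 days is the good/poor threshold; the slide does not say how it is applied.
DFS_CUTOFF_DAYS = 1952


def dfs_category(dfs_days: float, cutoff: float = DFS_CUTOFF_DAYS) -> str:
    """Label a subject 'good' if DFS reaches the cutoff, else 'poor' (assumed rule)."""
    if dfs_days < 0:
        raise ValueError("DFS must be non-negative")
    return "good" if dfs_days >= cutoff else "poor"


# Hypothetical subjects, not taken from the cohort.
example_dfs = {"S1": 2400, "S2": 310, "S3": 1952}
labels = {sid: dfs_category(days) for sid, days in example_dfs.items()}
print(labels)
```

A binary label like this is what a rule learner consumes directly, at the cost of discarding the time-to-event information.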
Department of Biomedical Informatics PRoBE Lab
Workflow tools
Affymetrix Genotyping Console
Quality control (QC), genotype calling
PLINK
QC of genotypes (MAF, etc.)
Feature selection (2) for BRL and export of features
BRL system
Predictive rules (sets of SNPs) and metrics
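The idea behind a minor allele frequency (MAF) filter in the genotype QC step can be sketched as follows. This is not PLINK's implementation, only an illustration of the concept. The 0/1/2 genotype coding, the 0.05 threshold, the function names, and the toy SNPs are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Assumed coding: copies of the alternate allele (0, 1, 2), or None if the call is missing.
Genotype = Optional[int]


def minor_allele_frequency(genotypes: List[Genotype]) -> float:
    """Return the frequency of the rarer allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snps: Dict[str, List[Genotype]], threshold: float = 0.05) -> List[str]:
    """Keep SNP names whose MAF is at least the (hypothetical) threshold."""
    return [name for name, calls in snps.items()
            if minor_allele_frequency(calls) >= threshold]


# Toy data, not from the study.
toy_snps = {
    "snp_a": [0, 1, 2, 1, None],  # alternate allele frequency 4/8
    "snp_b": [0, 0, 0, 0, 0],     # monomorphic, so MAF is 0
    "snp_c": [2, 2, 2, 1, 2],     # alternate allele frequency 9/10
}
kept = filter_by_maf(toy_snps)
print(kept)
```

Monomorphic or very rare SNPs carry little information in a cohort of this size, which is why they are typically dropped before feature selection.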
Workflow tools
SQLite
Fine selection of datasets and clinical parameters
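Selecting a subset of subjects and features with SQLite might look like the sketch below. The table names, column names, and rows are hypothetical; the slides do not describe the actual schema. The example uses Python's standard sqlite3 module with an in-memory database.

```python
import sqlite3

# Hypothetical schema: one table of clinical parameters, one of genotype calls.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subjects (subject_id TEXT PRIMARY KEY, stage TEXT, dfs_class TEXT);
CREATE TABLE genotypes (subject_id TEXT, snp_id TEXT, genotype INTEGER);
""")
conn.executemany(
    "INSERT INTO subjects VALUES (?, ?, ?)",
    [("S1", "I", "good"), ("S2", "II", "poor"), ("S3", "I", "poor")],
)
conn.executemany(
    "INSERT INTO genotypes VALUES (?, ?, ?)",
    [("S1", "snp_a", 0), ("S1", "snp_c", 2),
     ("S2", "snp_a", 1), ("S3", "snp_a", 2), ("S3", "snp_b", 0)],
)

# Select stage I subjects and only the SNPs chosen by feature selection.
rows = conn.execute(
    """
    SELECT s.subject_id, s.dfs_class, g.snp_id, g.genotype
    FROM subjects AS s
    JOIN genotypes AS g ON g.subject_id = s.subject_id
    WHERE s.stage = ? AND g.snp_id IN (?, ?)
    ORDER BY s.subject_id, g.snp_id
    """,
    ("I", "snp_a", "snp_c"),
).fetchall()
conn.close()
print(rows)
```

Keeping clinical parameters and genotypes in separate tables lets a different combination of dataset and outcome be assembled with a changed query rather than a new file.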
Most significantly associated biological function: cell-to-cell signaling and interaction (8 of 33 genes)
Conclusions
Our empirical workflow (SNPR)
Efficiently overcomes challenges of prediction using high-dimensional datasets
Achieves biological relevance and good predictive performance
Limitations
Small sample size
No independent testing cohort
Categorization of survival instead of time-to-event analysis
Acknowledgments
Cancer Biomarkers Facility of the University of Pittsburgh Cancer Institute, award P30CA047904
Grant support: National Cancer Institute Award Number P50CA090440
Thank you
chf35@pitt.edu