Member Institutions: 52
26 agencies of government
26 private not-for-profit health-related institutions
Annual Patient Visits: 7.1 million
Annual International Patient Visits: 16,000
Employees: 92,500
Full-time Students: 34,000*
Volunteers Daily: 12,000
Residents and Fellows: 4,000
MILLIONS OF SAMPLES
Sample
WGS on DNA
Enrich bacteria, viruses, eukaryotes
WGS on cDNA
Community structure
Bacterial pangenome
DNA viruses
Bacterial transcriptome
Extract DNA/RNA
WGS on cDNA from virus prep
Host genome wide association studies (GWAS) combined with microbiome analysis
Skin
BCM
MD Anderson
Texas Children's
Blood
BCM
MD Anderson
Oral
UT Health Science Center
UT School of Dentistry
Multi-site
U. Texas Health Science Center
U. Florida
Miami U
U. Texas School of Public Health
Sam Houston State U
U. Colorado
U. South Florida
Reproduction/Urogenital
BCM
Michigan State
North Shore University Hospital
Stony Brook University
Texas Children's
University of South Carolina
Lung/Airways
BCM
Gastrointestinal
Harvard Med. School
BCM
Mass. General Hospital
Texas Children's
U. Michigan
Harvard Med. School
UT Health Science Center
National Institutes of Health
Texas A&M HSC
Panola College
U. Javeriana
City of Hope
Lexicon Pharmaceuticals
Michigan State
Nazarbayev U. (Kazakhstan)
U. del Norte (Colombia)
U. Louisville
Peruvian University Cayetano Heredia
Sam Houston State University
Nottingham U. (UK)
U. Texas School of Public Health
UT Medical Branch
MD Anderson
Genetic predisposition
Overt immunologic abnormalities
Normal insulin release
Progressive loss of insulin release
Glucose normal
Overt diabetes
C-peptide present
No C-peptide
Age (years)
Islet Autoimmunity
T1D
Islet Autoimmunity
T1D
Islet Autoimmunity
x
T1D
Clinical Centers
Principal Investigators
Åke Lernmark, Ph.D., Lund U.
Jeffrey Krischer, Ph.D., U. South Florida
William Hagopian, M.D., Ph.D., U. Washington
Olli Simell, M.D., Ph.D., U. of Turku
Jorma Toppari, M.D., Ph.D., U. of Turku
Anette Ziegler, Ph.D., Technische Universität München
Marian Rewers, M.D., Ph.D., U. Col. Denver
Jin-Xiong She, Ph.D., Jinfiniti Biosciences
Beena Akolkar, Ph.D., NIH/NIDDK
NIDDK
NIAID
NICHD
NIEHS
CDC
JDRF
Project description:
Newborns from the general population with genetic risk (~90% of cohort) and newborns who are first-degree relatives of probands with T1DM (~10% of cohort)
Stool collected monthly through age 4, then quarterly through age 10, then biannually through age 15
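As a rough sketch of what this schedule implies per subject, the arithmetic below counts scheduled stool collections. It assumes collection starts at 1 month of age and that each interval includes its upper age boundary; neither detail is stated on the slide, and the function name is illustrative.

```python
def scheduled_stool_collections(max_age_years: float = 15) -> int:
    """Count scheduled stool collections up to max_age_years.

    Schedule from the slide, with boundary handling assumed:
      monthly     from month 1 through month 48    (age 4)
      quarterly   after month 48 through month 120 (age 10)
      biannually  after month 120 through month 180 (age 15)
    """
    max_month = int(max_age_years * 12)
    months = list(range(1, 49))          # monthly through age 4
    months += list(range(51, 121, 3))    # quarterly through age 10
    months += list(range(126, 181, 6))   # twice a year through age 15
    return sum(1 for m in months if m <= max_month)


print(scheduled_stool_collections(4))    # 48
print(scheduled_stool_collections(10))   # 72
print(scheduled_stool_collections(15))   # 82
```

Under these assumptions a subject followed to age 15 would contribute at most 82 scheduled stool samples; actual counts depend on enrollment age and missed collections.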
Stool
Cultured
Total Extracted 16S 18S/ITS WGS Virome Virome
12,67
13,403 13,403*
2
0
11,882 8,589
4,055
Plasma
1,469
* Some samples on re-order for re-extraction.
6,380
N/A
N/A
1,469 1,469
316
Highlights
All stool samples have had nucleic acids extracted using
novel robotic pipelines with >95% success rate
Extraction and sequencing controls with every run
Over 6.8 Tb of WGS data thus far (HMP 3.6 Tb)
> 10,000 reads per 16S rDNA sample
> 1.1 Tb of direct virome data
Attempting to culture viruses from every sample
Microbial eukaryote (18S) arm about to begin
Ajami, Ayvaz, Bauch, Kusic, Railey, Tamegnon
The TEDDY 50
In 3/2014, TEDDY leadership requested an analysis teaser to present to the consortium; they provided limited metadata for 50 subjects and 4 days
What is the best way to think about using the data in conjunction with other TEDDY data on genetics and environmental exposures?
TEDDY 50
50 subjects
567 samples
HLA genotypes
1='DR4*030X/0302*DR3*0501/0201 (very high risk; 1/15 get T1D by 15 years)**
2='DR4*030X/0302*DR4*030X/0302 (high risk; 1/20-1/30 get T1D by 15 years)**
4='DR4*030X/0302*DR8*0401/0402 (high risk, more genotypes from Finland)**
5='DR4*030X/0302*DR1*0101/0501 (FDR high risk, Gen. pop. with this genotype not enrolled)
6='DR4*030X/0302*DR13*0102/0604 (FDR high risk, Gen. pop. with this genotype not enrolled)
9='DR3*0501/0201*DR3*0501/0201 (moderate risk; maybe 1/35-1/50 get T1D by 15 years; also highest risk for Celiac disease)
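For analysis, the category codes above can be kept as a small lookup table so that plots and tables are labeled consistently. This is a minimal sketch: the genotype strings and risk labels are transcribed from the slide, while the dictionary and function names are hypothetical and not part of any TEDDY tooling.

```python
# HLA category code -> (genotype string, risk label), transcribed from the slide.
# Codes 3, 7, and 8 do not appear on the slide and are deliberately absent.
HLA_CATEGORIES = {
    1: ("DR4*030X/0302*DR3*0501/0201", "very high risk"),
    2: ("DR4*030X/0302*DR4*030X/0302", "high risk"),
    4: ("DR4*030X/0302*DR8*0401/0402", "high risk"),
    5: ("DR4*030X/0302*DR1*0101/0501", "FDR high risk"),
    6: ("DR4*030X/0302*DR13*0102/0604", "FDR high risk"),
    9: ("DR3*0501/0201*DR3*0501/0201", "moderate risk"),
}


def describe_hla(code: int) -> str:
    """Return a readable label for an HLA category code (KeyError if unknown)."""
    genotype, risk = HLA_CATEGORIES[code]
    return f"{genotype} ({risk})"


print(describe_hla(9))
```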
Bacterial Associations
16S rRNA gene
and
WGS data
2 0.03400
Germany 0.00068
4 4.8E-08
1.1E-10
Sweden 0.00845
0.09162
US 0.01568
0.01657
0.22634
0.00024
0.00309
9 0.43198
Finland Germany
Gender
HLA Category
Subject ID
Age in Years
Sampling Date
Sweden
Country
Birth month
32
59
PCoA
With the full data set, TEDDY may provide the best view of the maturation of the infant GI microbiome,
with data from multiple countries
Weighted UniFrac analysis suggests multiple starting community structures projecting towards a single structure
Do these patterns hold true in all children, or only those progressing to T1D?
Subject
RNA
anti-VP1
T1D
Healthy
Herpesvirus 4
Herpesvirus 5
Alphapapillomavirus
Adenovirus
AdV3
AdV6
AdV31
AdV41
AdV61
AdV Group A
AdV Group B
AdV Group C
AdV Group F
Ref. genome size   Mapped reads #   Genome coverage
34169              396568           97%
203057             108              1.7%
34188              462778           99.6%
34214              9402             13%
34214              9048             14%
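The genome coverage column presumably reports breadth of coverage, the fraction of reference positions covered by at least one mapped read; the slide does not define it, so that reading is an assumption. A minimal sketch of the calculation, assuming alignments are available as 0-based, end-exclusive (start, end) intervals and using a hypothetical function name:

```python
def breadth_of_coverage(intervals, genome_size):
    """Fraction of reference positions covered by at least one alignment.

    intervals: iterable of (start, end) pairs, 0-based, end-exclusive.
    Overlapping or touching intervals are merged before counting.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clip alignments to the reference boundaries.
        start, end = max(start, 0), min(end, genome_size)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_size


# Toy intervals (made up for illustration) on a 34,214-base reference.
reads = [(0, 100), (50, 150), (1000, 1100)]
print(f"{breadth_of_coverage(reads, 34214):.4%}")
```

Breadth is independent of read count, which is consistent with the table: a few hundred thousand reads can cover nearly a whole ~34 kb genome, while about a hundred reads cover only a small part of a ~200 kb one.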
Tian
Random amplification, barcoding, and pooling
Illumina Sequencing
Analysis Pipeline
50X Sequencing
Illumina HiSeq 2500
3 samples/lane
Trimming, custom de-multiplexing, and low complexity filtering
Remove host, human, bacterial, and vector sequences
Translated nucleotide query (USEARCH)
Nucleotide query (Bowtie2)
Determine quality and coverage of positive hits
Constraints
At least 1 mapped paired-end read per sample
At least 50 bases aligned to reference genome
Both ends of the pair must hit the same genome(s)
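These constraints can be expressed as a filter over per-pair alignment summaries. The sketch below reads the two numeric thresholds as minimums and applies the 50-base threshold per read pair; both readings are assumptions, and the `PairHit` record layout and function name are hypothetical rather than the pipeline's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PairHit:
    """Alignment summary for one read pair (hypothetical record layout)."""
    sample: str
    genomes_r1: frozenset   # genomes hit by read 1
    genomes_r2: frozenset   # genomes hit by read 2
    aligned_bases: int      # bases aligned to the reference


MIN_ALIGNED_BASES = 50  # assumed minimum, from "50 bases aligned"
MIN_PAIRS = 1           # assumed minimum, from "1 mapped paired-end read"


def positive_hits(pairs):
    """Return {sample: {genome: n_pairs}} for genomes passing all constraints."""
    counts = defaultdict(lambda: defaultdict(int))
    for p in pairs:
        if p.aligned_bases < MIN_ALIGNED_BASES:
            continue
        # Both ends of the pair must hit the same genome(s).
        for genome in p.genomes_r1 & p.genomes_r2:
            counts[p.sample][genome] += 1
    return {
        sample: {g: n for g, n in per_genome.items() if n >= MIN_PAIRS}
        for sample, per_genome in counts.items()
    }


# Made-up records: only the first pair satisfies every constraint.
pairs = [
    PairHit("S1", frozenset({"virusA"}), frozenset({"virusA"}), 120),
    PairHit("S1", frozenset({"virusA"}), frozenset({"virusB"}), 120),
    PairHit("S2", frozenset({"virusC"}), frozenset({"virusC"}), 30),
]
print(positive_hits(pairs))
```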
Body Site
Comparisons
Phage
3 Gb
300 Mb
30 Mb
Sequencing Depth--Illumina
Mock community of Enteroviruses in PBMC
http://www.qcmd.org/
SAMPLE     BARCODE
QCMD-1     TCCCTTGTCTCC
QCMD-2     ACGAGACTGATT
QCMD-3     ATCACCAGGTGT
QCMD-4     ATCGCACAGTAA
QCMD-5     AGCGGAGGTTAG
QCMD-6     TACAGCGCATAC
QCMD-7     ACCGGTATGTAC
QCMD-8     AATTGTGTCGGA
QCMD-9     AGTCGAACGAGG
QCMD-10    ACCAGTGACTCA
QCMD-11    CAGCTCATCAGC
QCMD-12    GCAACACCATCC
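A minimal demultiplexing sketch using the barcodes listed above, paired with samples in the order shown. It assumes the 12-base barcode sits at the very start of each read and must match exactly; the slides do not state the barcode position or any mismatch tolerance, so both are assumptions, and the function name is illustrative.

```python
# Barcode -> sample, paired in the order the two lists appear on the slide.
BARCODES = {
    "TCCCTTGTCTCC": "QCMD-1",
    "ACGAGACTGATT": "QCMD-2",
    "ATCACCAGGTGT": "QCMD-3",
    "ATCGCACAGTAA": "QCMD-4",
    "AGCGGAGGTTAG": "QCMD-5",
    "TACAGCGCATAC": "QCMD-6",
    "ACCGGTATGTAC": "QCMD-7",
    "AATTGTGTCGGA": "QCMD-8",
    "AGTCGAACGAGG": "QCMD-9",
    "ACCAGTGACTCA": "QCMD-10",
    "CAGCTCATCAGC": "QCMD-11",
    "GCAACACCATCC": "QCMD-12",
}
BARCODE_LEN = 12


def demultiplex(reads):
    """Group read sequences by sample; unmatched reads are binned under None.

    Assumes the barcode occupies the first 12 bases (exact match only).
    Matched reads are returned with the barcode trimmed off.
    """
    bins = {}
    for seq in reads:
        sample = BARCODES.get(seq[:BARCODE_LEN].upper())
        insert = seq[BARCODE_LEN:] if sample else seq
        bins.setdefault(sample, []).append(insert)
    return bins


print(demultiplex(["TCCCTTGTCTCCACGTACGT", "NNNNNNNNNNNNACGTACGT"]))
```

Exact matching is the simplest choice; a production demultiplexer would usually tolerate one or two barcode mismatches, which these barcodes may or may not have been designed to support.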
Sample ID   Yield (Gbases)   Result
QCMD1       0.25             Human Enterovirus B3
QCMD2       1.22             Echovirus 30
QCMD3       1.43             Coxsackie A9
QCMD4       4.0              Echovirus 11
QCMD5       4.7              Echovirus E11
QCMD6       3.46             Enterovirus 68
QCMD7       0.51             Enterovirus 83
QCMD8       0.06             No enteroviral hits
QCMD9       3.83             No enteroviral hits
QCMD10      4.45             Coxsackievirus B3
QCMD11      1.19             No enteroviral hits
QCMD12      2.1              Coxsackievirus A24
Sensitivity directly related to sequencing depth and virus abundance (see next slide)
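One simple way to see this dependence: if a fraction f of read pairs derives from the virus and N pairs are sequenced, then under a Poisson sampling assumption the chance of seeing at least one viral pair is 1 - exp(-N*f). This is a textbook model offered as a sketch, not the analysis behind the slide, and the viral fraction used below is purely illustrative.

```python
import math


def detection_probability(n_pairs: int, viral_fraction: float, min_pairs: int = 1) -> float:
    """P(observing at least min_pairs viral read pairs) under Poisson sampling."""
    lam = n_pairs * viral_fraction  # expected number of viral pairs
    p_below = sum(
        math.exp(-lam) * lam ** k / math.factorial(k) for k in range(min_pairs)
    )
    return 1.0 - p_below


# Same (illustrative) viral fraction at three sequencing depths.
for depth in (100_000, 1_000_000, 10_000_000):
    print(depth, round(detection_probability(depth, 1e-6), 3))
```

Detection probability rises with either depth or abundance because only their product, the expected number of viral pairs, enters the formula.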
[Figure: viral copies/ml by qPCR (log scale, 10^0 to 10^7) for samples 13-01 through 13-12; panel labels: 4/636 primers, qPCR positive only]
Sanger Sequencing of qPCR product (R. Lloyd)
Illumina Sequencing (70% or 95% seq. id)
13-01
13-02
13-03
Not sequenced
Echovirus 30
Not sequenced
Not sequenced
Enterovirus B87, CxB1, CxB4, and Echovirus 30
Not sequenced
Human Enterovirus B
3 paired reads
Echovirus 30 (95% seq. id)
64 paired reads
Echovirus 24 and CxA9 (80% seq. id)
Echovirus 11, Enterovirus 79, and CxA9 (95% seq. id)
Echovirus E11
1 paired read
Human Enterovirus 68 (95% seq. id)
403 paired reads
13-08
13-09
13-10
13-11
13-12
Not sequenced
Not sequenced
No hit
Enterovirus 71 and CxA6
CxB4 and Enterovirus B
Not sequenced
Echovirus 3 (80% seq. id)
No hit
No hit
CxB3*
No hit
Enterovirus C104 (70% seq. id)
Sequencing reads were mapped against the CMMR custom viral database at high and low stringency levels (80%-95% sequence identity).
Constraints
At least 1 mapped paired-end read per sample
At least 50 bases aligned to reference genome
Both ends of the pair must hit the same genome(s)
nPOD-V Mission
nPOD is a JDRF-sponsored repository for pristine human tissues collected post-mortem from individuals with T1D
Identify viral nucleic acids associated with nPOD tissues using next-generation sequencing and qPCR-based approaches
Improve the technologies that will make this possible
Can we find a candidate viral trigger(s)?
Mark Atkinson (UF)--nPOD
Alberto Pugliese (U Miami)--nPOD-V
And help of many others!
Sample
And/Or
Virus enrichment
Random amplification, barcoding, and pooling
Illumina Sequencing
Analysis Pipeline
Semi-random primer with an integrated barcode
Even pilot analyses are providing unique insights into the maturation of the microbiome from birth in infants from the US and Europe
Associations with HLA are provocative and may underlie autoimmunity states
Deepest hunt for viral triggers for T1D is just beginning in two projects
Only DNA data for TEDDY thus far, RNA data pending (e.g. Enteroviruses)
Provocative hits in nPOD project
Acknowledgements
Nadim Ajami
Russ Carmical
Elicia Grace
BCM MVM
Xiangjun Tian
Richard Lloyd
Daniel Smith
Michael Holder
Lauren Railey
Lenka Kusic
Matthew Ross
Tulin Ayvaz
Tonya Bauch
Lisa Atkins
Matthew Wong
Auriole Tamegnon
Tatiana Fofanova
Tyler McCue
Megan Coombs
Lorenzo D'Amico
Diane Smith Hutchinson
Nguyen Truong
Richard Gibbs
Donna Muzny
Ginger Metcalf
Harsha Doddapaneni