
Pathway Analysis: Google for Proteomics?..

Roman Zubarev Roman.Zubarev@ki.se

Physiological Chemistry I, Department for Medical Biochemistry & Biophysics, Karolinska Institutet, Stockholm

Eukaryotic Cell

Cellular complexity

J. David Sweatt, Artist and Scientist

Experiment and Analysis in Proteomics


Experiment: large-scale
Analysis: few individual molecules?
Reductionist Molecular Biology: Pathway Biology:

Top-down vs Bottom-up for protein ID and characterization


Top-down: Intact protein → MS (MW) → Dissociation → MS2/MSn (Fragments). Low throughput.

Bottom-up: Enzymatic digest → MS (MW of peptides) → Dissociation → MS2 (Fragments). High throughput, but information lost!

Combined Top-down/Bottom-up: Intact protein → MS (MW) → Dissociation → MSn (Fragments), together with Enzymatic digest → MS (MW of peptides) → MS2. Not yet high throughput.

Protein Identification by Tandem Mass Spectrometry


Protein sequence:
ILNKPEDETHLEAQPTDASAQFIRNLQISNEDLSKEPSISREDLISKEQIVIRSSRQPQSQNPKLPLSILKEKHLRNATLGSEETTEHTPSDASTTEGKLMELGHKIMRNLENTVKETIKYLKSLFSHAFEVVKT

Enzymatic digest

Tryptic peptides:
EDLISK, EQIVIR, LPLSILK, NLENTVK, LMELGHK, QPQSQNPK, NLQISNEDLSK, SLFSHAFEVVK, NATLGSEETTEHTPSDASTTEGK, ILNKPEDETHLEAQPTDASAQFIR
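The digest step can be illustrated with a minimal in-silico sketch. It assumes the common trypsin rule (cleave C-terminal to K or R, except when the next residue is P); the function name `tryptic_digest` is hypothetical and the slide does not state which rule or software was used.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless the next residue is P.

    This is the commonly used trypsin rule, assumed here for illustration.
    """
    if not sequence:
        return []
    peptides = []
    start = 0
    # Stop one residue early: the C-terminal residue never creates a cleavage.
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


# Protein sequence as printed on the slide (line breaks removed).
protein = (
    "ILNKPEDETHLEAQPTDASAQFIRNLQISNE"
    "DLSKEPSISREDLISKEQIVIRSSRQPQSQNPK"
    "LPLSILKEKHLRNATLGSEETTEHTPSDASTT"
    "EGKLMELGHKIMRNLENTVKETIKYLKSLF"
    "SHAFEVVKT"
)

# Peptides listed on the slide.
slide_peptides = [
    "EDLISK", "EQIVIR", "LPLSILK", "NLENTVK", "LMELGHK", "QPQSQNPK",
    "NLQISNEDLSK", "SLFSHAFEVVK", "NATLGSEETTEHTPSDASTTEGK",
    "ILNKPEDETHLEAQPTDASAQFIR",
]

peptides = tryptic_digest(protein)
missing = [p for p in slide_peptides if p not in peptides]
print(peptides)
print("slide peptides not produced:", missing)
```

Under this rule the K in ILNKPEDE... is followed by P and is not cleaved, which is consistent with the long peptide ILNKPEDETHLEAQPTDASAQFIR on the slide. The sketch also yields short peptides (for example EK or SSR) that the slide does not list.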

Tryptic peptide: NLENTVK

MS/MS fragmentation: N L E N T V K

Molecular mass: 817.44

Fragment masses: 232.17 346.22 388.20 444.28 484.33 511.37 555.40 623.45 666.44 712.52
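As a worked check of the peptide mass, the sketch below sums standard monoisotopic residue masses for NLENTVK. The helper names are hypothetical. With these standard values, the quoted 817.44 matches the singly protonated ion [M+H]+, while the neutral monoisotopic mass comes out near 816.43. The fragment masses listed on the slide are not reproduced here.

```python
# Standard monoisotopic residue masses in Da (unmodified residues).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of a proton


def neutral_mass(peptide):
    """Monoisotopic mass of the neutral, unmodified peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER


def mz(peptide, charge=1):
    """m/z of the peptide carrying `charge` protons."""
    return (neutral_mass(peptide) + charge * PROTON) / charge


print(f"neutral mass: {neutral_mass('NLENTVK'):.2f}")
print(f"[M+H]+     : {mz('NLENTVK', 1):.2f}")
print(f"[M+2H]2+   : {mz('NLENTVK', 2):.2f}")
```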

Your Peptide/ protein is this:

Score = 77

Mass Accuracy 1 ppm = 1 part per million
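To make the ppm unit concrete, here is a small sketch that converts a mass deviation to ppm and a ppm tolerance to an absolute window. The function names are hypothetical; the value 817.44 is reused from the peptide example only as an illustration.

```python
def ppm_error(measured, theoretical):
    """Relative mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical, tol_ppm):
    """Lower and upper mass bounds for a given ppm tolerance."""
    delta = theoretical * tol_ppm * 1e-6
    return theoretical - delta, theoretical + delta


low, high = mass_window(817.44, 1.0)
# At m/z 817.44, 1 ppm corresponds to roughly 0.0008 units on either side.
print(f"1 ppm window at 817.44: {low:.5f} to {high:.5f}")
print(f"error of 1000.001 vs 1000.000: {ppm_error(1000.001, 1000.0):.2f} ppm")
```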

Deep vs Top Proteomics

[Figure: % of proteome coverage (axis 25-75%) by year, 2002-2009]

Protein Identification, Quantification

Top proteome: 1500-3000 proteins, 5000-9000 peptides
- No protein separation
- No peptide separation (on-line reverse-phase LC only)
- Single LC/MS experiment, 0.5-2.0 h long

What is Pathway Analysis?


Interpretation A:

Deciphering signaling pathways


MS, Interactions, Database
Database filling
Fundamental studies

Pathway

Functional Pathway Proteomics

What is Pathway Analysis?


Interpretation B:

Use of known signaling mechanisms to identify activated pathways


MS, Pathway Database
Using database to identify events
Application research

Analytical Pathway Proteomics


Molecular Diagnostic of Cells

GOOGLE for Proteomics

Google for Proteomics*

Load your proteomics data here

Click here

*with apologies to Google Corp.

Analytical Pathway Biology

Full Proteomics Data (Sample, Control) → Pathway Search Engine → Up- and Down-Regulated (Activated) Pathways/Key Nodes, Weight Factors

Zubarev, R. A.; Nielsen, M. L.; Savitski, M. M.; Kel-Margoulis, O.; Wingender, E.; Kel, A. Identification of dominant signaling pathways from proteomics expression data, J. Proteomics, 2008, 1, 89-96.

Pathway Analysis Workflow


Sample Cells, Control Cells

Protein ID and abundance


[Screenshot: peptide-level output table with columns SEQUENCE, MASCOT RT, RT APEX, INTENSITY, MASCOT MZ, MZ QUANTI, IPI, MASCOT SCORE, PROTEIN SCORE, FULLINT. Selected columns:]

SEQUENCE             MASCOT RT  RT APEX  MASCOT MZ   MZ QUANTI   IPI          MASCOT SCORE  PROTEIN SCORE
K.LVTDLTK.V          28.845     28.748   395.239288  395.238556  IPI00022434  49            357.35
K.YLYEIAR.R          31.538     31.409   464.250610  464.251129  IPI00022434  51            357.35
K.CCTESLVNR.R        27.209     26.979   569.752808  569.753174  IPI00022434  66            357.35
K.YICENQDSISSK.L     27.505     27.003   722.324402  722.323059  IPI00022434  40            357.35
K.VPQVSTPTLVEVSR.N   32.767     32.629   756.424805  756.427673  IPI00022434  74            357.35
K.KVPQVSTPTLVEVSR.N  31.126     30.978   547.317688  547.317871  IPI00022434  78            357.35

[Table: Protein Abundance matrix, IPI # (rows 1, 2, 3, ...) by Sample (columns 1-13)]
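One simple way to go from peptide rows to a protein-by-sample abundance matrix is to sum peptide intensities per protein accession. The slides do not say which roll-up was used, so the sketch below is an assumption; the function name, dictionary keys, and intensity values are all hypothetical.

```python
from collections import defaultdict


def protein_abundance(peptide_rows):
    """Sum peptide intensities per (protein accession, sample).

    Summation is an assumed roll-up rule, chosen for simplicity.
    """
    totals = defaultdict(float)
    for row in peptide_rows:
        totals[(row["ipi"], row["sample"])] += row["intensity"]
    return dict(totals)


# Hypothetical intensities; only the accession and sequences echo the slide.
rows = [
    {"ipi": "IPI00022434", "sample": 1, "sequence": "LVTDLTK", "intensity": 2.0e6},
    {"ipi": "IPI00022434", "sample": 1, "sequence": "YLYEIAR", "intensity": 3.0e6},
    {"ipi": "IPI00022434", "sample": 2, "sequence": "LVTDLTK", "intensity": 1.5e6},
]

abundance = protein_abundance(rows)
for (ipi, sample), value in sorted(abundance.items()):
    print(ipi, "sample", sample, value)
```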

[Diagram (BioBase, Germany): Signal molecule → Receptor → Adaptors → Kinases → Transcription factors → mRNA → Proteome]

KeyNode-Mediated Analysis: Upstream


Stimulus

Score
KeyNode1 3050
KeyNode2 2987
KeyNode3 2073
...
KeyNodeN 25

Pathway score: (keynode score)


Proteins Observed

Quantitative Pathway Analysis


Pathway Search Engine proof of principle
[Figure: histograms of the number of pathways (0-200) versus Score = S(sample) - S(control), arb. units (-0.5 to 0.5), in three panels: Sample - Control1, Sample - Control2, Control1 - Control2. The Stress and EGF pathways are marked.]
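The differential score on the axis, Score = S(sample) - S(control), can be sketched as a per-pathway subtraction followed by ranking. The pathway scores below are invented for illustration and the function name is hypothetical; how S itself is computed from key-node scores is not specified here.

```python
def differential_scores(sample_scores, control_scores):
    """Return (pathway, S(sample) - S(control)) pairs, largest difference first."""
    shared = sample_scores.keys() & control_scores.keys()
    diffs = [(name, sample_scores[name] - control_scores[name]) for name in shared]
    return sorted(diffs, key=lambda item: item[1], reverse=True)


# Hypothetical pathway scores in arbitrary units.
sample = {"EGF": 0.62, "Stress": 0.40, "Other": 0.31}
control = {"EGF": 0.20, "Stress": 0.35, "Other": 0.30}

ranked = differential_scores(sample, control)
for name, diff in ranked:
    print(f"{name}: {diff:+.2f}")
```

Pathways with a difference far from zero are the candidates for up- or down-regulation; a control-versus-control comparison gives a sense of how wide the null distribution is.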

Zubarev, R. A.; Nielsen, M. L.; Savitski, M. M.; Kel-Margoulis, O.; Wingender, E.; Kel, A. Identification of dominant signaling pathways from proteomics expression data, J. Proteomics, 2008, 1, 89-96.

Quantitative Pathway Analysis


Pathway Search Engine first application

Ståhl, S.; Fung, Y.M.E.; Adams, C. M.; Lengqvist, J.; Mörk, B.; Stenerlöw, B.; Lewensohn, R.; Lehtiö, J.; Zubarev, R. A.; Viktorsson, K. Proteomics and Pathway Analysis Identifies JNK-signaling as Critical for High-LET Radiation-induced Apoptosis in Non-Small Lung Cancer Cells, Mol. Cell Proteomics, 2009, 8, 1117-1129.

Pathway Analysis: Validation of JNK++

Tumor marker

Kinase upstream of JNK

Ståhl S et al., Mol. Cell Proteomics, 2009, 8, 1117-1129.

DYNAMIC PROTEOMICS APPROACH for drug target identification:
- by the speed of change (1 h), 10% selection
- by the total change in 48 h, 10% selection
Overall: top 3% (35 proteins)
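A minimal sketch of the two selections, assuming each protein has one "speed" value (change within the first hour) and one "magnitude" value (total change over 48 h). Reading "Overall" as the intersection of the two top lists is an assumption, as are the function name and the toy numbers.

```python
def top_fraction(values, fraction):
    """Return the proteins with the largest absolute values (top `fraction`)."""
    n_keep = max(1, round(len(values) * fraction))
    ranked = sorted(values, key=lambda name: abs(values[name]), reverse=True)
    return set(ranked[:n_keep])


# Hypothetical per-protein changes (arbitrary units) for ten proteins.
speed = {"P1": 0.9, "P2": 0.1, "P3": -0.8, "P4": 0.2, "P5": 0.05,
         "P6": 0.3, "P7": 0.0, "P8": 0.15, "P9": 0.25, "P10": 0.12}
magnitude = {"P1": 2.0, "P2": 0.3, "P3": 0.1, "P4": -3.0, "P5": 0.2,
             "P6": 0.4, "P7": 0.5, "P8": 0.6, "P9": 0.7, "P10": 0.8}

# A 20% cut is used so that the ten-protein toy list keeps two entries.
fast = top_fraction(speed, 0.2)
large = top_fraction(magnitude, 0.2)
both = fast & large  # assumed meaning of the "overall" selection

print("fast:", sorted(fast))
print("large:", sorted(large))
print("both:", sorted(both))
```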

Pathway Analysis of Dynamic Proteomics Data


I) Protein mapping on Pathways

Proteins from input list

Pathway Analysis of Dynamic Proteomics Data


Upstream Search for Key Nodes:
- for Speed, 0-60 min
- for Magnitude, 0-2800 min

KN Scoring: S = (SA SB)*log2(SA/SB)

Top KN is selected: one for Speed, one for Magnitude

Pathway Analysis of Dynamic Proteomics Data


Downstream KN search Two top KNs

Overlapping Molecules = Drug Target Candidates
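The overlap step can be sketched as a bounded breadth-first search downstream of each of the two top key nodes, followed by a set intersection. The toy network, the node names, and the step limit are hypothetical; the actual pathway database and search depth are not given on the slide.

```python
from collections import deque


def downstream(graph, start, max_steps):
    """Molecules reachable from `start` within `max_steps` directed edges."""
    seen = {start}
    reached = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_steps:
            continue  # do not expand beyond the step limit
        for target in graph.get(node, ()):
            if target not in seen:
                seen.add(target)
                reached.add(target)
                queue.append((target, depth + 1))
    return reached


# Hypothetical directed signaling network: node -> downstream molecules.
network = {
    "KN_speed": ["A", "B"],
    "KN_magnitude": ["B", "C"],
    "A": ["T"],
    "B": ["D"],
    "C": ["T"],
}

from_speed = downstream(network, "KN_speed", max_steps=2)
from_magnitude = downstream(network, "KN_magnitude", max_steps=2)
candidates = from_speed & from_magnitude  # overlapping molecules

input_list = {"T", "X"}  # hypothetical proteins with measured dynamics
print("overlap:", sorted(candidates))
print("overlap also in input list:", sorted(candidates & input_list))
```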

Identification of TOPI as the drug target from 812 proteins in the input list
Rank, magnitude

Rank, speed
Overlap of downstream lists from Fgamma, c-FLIP(h): 9 proteins, of which 2 from input list (known dynamics):
- TOPI, (speed + magnitude)-rank 228
- 26S proteasome, (speed + magnitude)-rank 787

What if TOPI is removed from Input list?..

Rank, magnitude

Rank, speed
Overlap of downstream lists from Fgamma, c-FLIP(h): 4 proteins, none from the input list:
- TOPI
- CKII
- Two NR-related proteins

What if other proteins (besides TOPI) are removed from Input list?
- 20% removed at random: TOPI + 8 other
- 50% removed at random: TOPI + 11 other
- 182 top-scoring removed: TOPI + 342 other, of which 52 from Input List; TOPI is #12

Thus, Pathway Analysis is a powerful method for Drug Target discovery by Dynamic Proteomics
D.M. Good and R.A. Zubarev, submitted

Myeloid-derived suppressor cells (MDSC): accumulate in patients and animals with cancer where they mediate systemic immune suppression and obstruct immune-based cancer therapies.

MDSC suppress antitumor immunity through a variety of diverse mechanisms.

Suzanne Ostrand-Rosenberg and Pratima Sinha

Proteomic Pathway Analysis Reveals Inflammation Increases Myeloid-Derived Suppressor Cell Resistance to Apoptosis

Chornoguz et al., Mol. Cell. Proteomics, in press.

4T1 - spontaneously metastatic mammary carcinoma 4T1/IL-1 - transfected with the IL-1 gene (high levels of IL-1 heighten inflammation in the tumor microenvironment)

Caspase

Fas

Fas agonist Jo2 mAb

Conclusion: inflammation enhances MDSC accumulation by increasing MDSC resistance to Fas-mediated apoptosis.

Activated keynodes in Chronic Pain Mouse model: 4 + 4 mice


[Figure: histogram of the no. of key molecules (0-400) versus log(evalues), Sample and Random; 1192 key molecules]

log(evalues): -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13
Sample:       0 5 0 5 22 67 92 113 364 227 84 91 87 21 12 0 1 0 0 0 0 1
Random:       0 0 1 1 8 25 73 198 297 285 173 100 27 2 2 0 0 0 0 0 0 0

Sample ranks and log(evalues): MO000018008 KOR; MO000022259 RSK1; 1 143 1 26 1 4; 12.3 1 1 3 7.59

Random ranks and log(evalues): -4.9 767 1190 1081 1107 5; -1.4 1051 706 323 1044 5

Downregulated keynodes in SNL versus Control Tissue samples


Key node                   Sample ranks          log(evalues)
MO000017741 LAP            1091 1165 1188 1185   -7.19
MO000035620 LAP1           1083 1154 1189 1188   -7.31
MO000000208 RIP            1047 1173 1191 1186   -7.63
MO000017811 proCaspase-9   1178 1189 1186 1105   -7.70
MO000018276 proCaspase-10  1180 1186 1181 1163   -7.77

Random ranks: 114 605 264 578 209 262 889 510 775 951 571 202 285 1185 210 1064 1142 734 886 227
Random log(evalues): 1.06, -0.71, -0.43, -1.57, 0.64

With G. Bakalkin, Uppsala

Preliminary quantitative molecular model of chronic pain

Predictive model: P = f(K1, K2, ..., KN) = F(proteome)

P = disease progression

VFth = Hap1 - KOR +0.33*(insulin:InsR) - 0.60*cyclin_E;


VFth - paw withdrawal threshold in g, measured with von Frey filaments

Hap1 - Huntingtin-associated protein-1, whose degradation is stimulated by insulin through ubiquitylation; involved in trafficking of the GABA-A receptor

KOR - kappa opioid receptor
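The linear model can be evaluated directly once the four inputs are available. The sketch below assumes the symbols stand for key-node activation levels on a common scale; the function names and example numbers are hypothetical. A plain Pearson correlation is included because the slides report agreement between predicted and measured values as R.

```python
import math


def predict_vf_threshold(hap1, kor, insulin_insr, cyclin_e):
    """VFth = Hap1 - KOR + 0.33*(insulin:InsR) - 0.60*cyclin_E (slide model)."""
    return hap1 - kor + 0.33 * insulin_insr - 0.60 * cyclin_e


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical key-node levels for one animal (arbitrary units).
example = predict_vf_threshold(hap1=10.0, kor=4.0, insulin_insr=3.0, cyclin_e=5.0)
print(f"predicted value: {example:.2f}")

# Hypothetical predicted vs. measured values for a few animals.
predicted = [1.0, 2.0, 3.0, 4.0]
measured = [10.5, 12.0, 14.2, 15.9]
print(f"R = {pearson_r(predicted, measured):.3f}")
```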
[Figure: Predicted VF test versus measured VF test, day 10; R = 0.962]

Take-home messages
Pathway Analysis provides activation levels of key nodes and signaling pathways starting from expression proteomics data.
The final goal is:
- drug target discovery;
- disease mechanism discovery;
- patient stratification;
- predictive quantitative molecular model of a disease.
Pathway Analysis findings need to be validated!
