
Mass Spectrometry I

Basic Data Processing


Mass spectrometry
• A mass spectrometer measures molecular
masses.
• The mass unit is called the dalton (Da), which is 1/12 of
the mass of a carbon-12 atom, and is about the
mass of one hydrogen atom.
• If there is a mixture of different molecules in a
sample, all the masses are measured
simultaneously. So you get a spectrum.
Some Pictures
MALDI-R Q-Tof Micro

FT-ICR LTQ-Orbitrap
Each peak corresponds to a different
type of molecule in your sample
[Figure: mass spectrum, relative intensity (%, 0 to 100) versus m/z (1000 to 3000); the tallest peak is at m/z 2790.22.]

Peak list (excerpt):
m/z        intensity
2789.22    3597.0
2790.22    5018.0
2791.23    4406.0
2792.23    2868.0
2793.23    1234.0
...
Three Components of an MS
• A typical mass spectrometer contains
– Ion source (ionizer)
– Mass analyzer
– Detector
• The ion source charges the molecules to be measured.
– The charge can be negative but is often positive.
– Two common types: MALDI and ESI.
– John B. Fenn (electrospray) and Koichi Tanaka (MALDI):
2002 Nobel Prize in Chemistry.
• Mass analyzer separates ions according to the mass to
charge ratio (m/z) of the ions.
– Ion trap, TOF, quadrupole, FT-ICR.
• Detector detects the ions.
Ionization (1): MALDI
Matrix Assisted Laser Desorption/Ionization

Sample is co-crystallized with matrix (solid)


Formation of singly charged ions

Koichi Tanaka, Nobel Prize 2002
Other ionization methods exist.


Mass Analyzer (1) – TOF
• Time of Flight.
Detector
+ -

Other mass analyzers exist.
Time of flight is proportional to sqrt(m/z).
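Why the square root appears can be sketched with a short derivation. It assumes an idealized linear TOF: an ion of mass m and charge number z starts at rest, is accelerated through a voltage U, and then drifts a length D at constant speed. The symbols U, e (elementary charge), and v (drift speed) are introduced here for illustration; the time spent in the acceleration region is neglected.

```latex
\begin{aligned}
z e U &= \tfrac{1}{2} m v^{2} \\
v &= \sqrt{\frac{2 z e U}{m}} \\
t &= \frac{D}{v} = D \sqrt{\frac{m}{2 z e U}} = \frac{D}{\sqrt{2 e U}} \sqrt{\frac{m}{z}}
\end{aligned}
```

For fixed D and U, the flight time therefore depends on the ion only through sqrt(m/z).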
Putting Them Together
MALDI Time-of-flight
MALDI TOF

Drift region (D)

Average time in TOF: 10^-7 sec; average speed: 1-2 x 10^5 km/h
MALDI-TOF Linear

Mass range = 800-200,000 Da

Sensitivity and accuracy decrease rapidly with size!


MALDI-TOF Linear vs Reflectron Mode

• Linear = poor resolution due to velocity variation of ions with the same m/z
• Reflectron = contact lens for a nearsighted machine!

Reflectron gives much better resolution for mass < 6,000 Da


Protein “identification” with intact mass

• We measure the intact mass of the protein.


• Then search in the protein database to find a
protein with the same mass.
• Good idea, but too many proteins have
nearly the same mass.
• In the rest of the lecture we study more
sophisticated methods and why protein ID is
important.
Complications

isotopes

widened peaks

profile
Centroiding
Another example with lower resolution
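Centroiding can be sketched in a few lines. This is a minimal illustration, not the method of any particular instrument software: it assumes a profile spectrum given as two parallel lists, treats every contiguous run of points above a threshold as one peak, and reports the intensity-weighted mean m/z together with the summed intensity. The function names and the `threshold` parameter are hypothetical.

```python
def _collapse(run):
    """Reduce a run of (m/z, intensity) points to one (centroid m/z, summed intensity)."""
    total = sum(y for _, y in run)
    center = sum(x * y for x, y in run) / total
    return center, total


def centroid(mz, intensity, threshold=0.0):
    """Convert a profile spectrum into a list of centroided peaks."""
    peaks = []
    run = []  # points of the peak currently being collected
    for point in zip(mz, intensity):
        if point[1] > threshold:
            run.append(point)
            continue
        if run:  # the signal dropped to the threshold: close the current peak
            peaks.append(_collapse(run))
            run = []
    if run:  # spectrum ended while still inside a peak
        peaks.append(_collapse(run))
    return peaks


profile_mz = [99.8, 99.9, 100.0, 100.1, 100.2, 100.5, 100.9, 101.0, 101.1]
profile_int = [0, 10, 20, 10, 0, 0, 5, 10, 5]
for center, total in centroid(profile_mz, profile_int):
    print(round(center, 3), total)
```

With lower resolution, neighboring peaks overlap and a simple threshold no longer separates them, which is why real centroiding needs more care than this sketch.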
Isotopes
Back to Basics…
Chemical Composition of Living Matter
27 of 92 natural elements are essential.
Elements in biomolecules (organic matter):
H, C, N, O, P, S
These elements represent approximately 92% of
dry weight.

Organic Matter
Organized in "building blocks":
amino acids -> polypeptides (proteins)

monosaccharides -> starch, glycogen

nucleic acids -> DNA, RNA


Mass (Weights) of Atoms and Molecules
element  nominal  exact      percent    average
         mass     mass       abundance  mass
C        12       12.00000   98.9%
         13       13.00335    1.1%      12.0107
H         1        1.00783   99.98%
          2        2.0141     0.02%      1.00794
O        16       15.99491   99.8%
         18       17.9992     0.2%      15.9994
N        14       14.00307   99.63%
         15       15.00011    0.37%     14.0067
S        32       31.97207   94.93%
         33       32.97146    0.76%
         34       33.96787    4.29%     (exercise)
Mass or Molecular Weight of molecules

Ethyl acetate C4H8O2

4 C12: 4 x 12.00000 = 48.0000
8 H1: 8 x 1.00783 = 8.0626
2 O16: 2 x 15.99491 = 31.9898

Nominal Mass: 48 + 8 + 32 = 88

Monoisotopic Mass: 48.0000 + 8.0626 + 31.9898 = 88.0524

Average Mass: 4 x 12.0107 + 8 x 1.00794 + 2 x 15.9994 = 48.0428 + 8.0635 + 31.9988 = 88.1051
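The three kinds of mass can be computed the same way for any formula. The sketch below is an illustration under stated assumptions: the isotope masses and average atomic masses are rounded reference values typed in by hand, and the dictionary and function names are hypothetical.

```python
# Rounded reference values (assumptions for this sketch), in daltons.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16, "S": 32}
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915, "S": 31.972071}
AVERAGE = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994, "S": 32.065}


def formula_mass(composition, table):
    """Sum count * atomic mass over the elements of a composition dict."""
    return sum(count * table[element] for element, count in composition.items())


ethyl_acetate = {"C": 4, "H": 8, "O": 2}  # C4H8O2
print("nominal     ", formula_mass(ethyl_acetate, NOMINAL))
print("monoisotopic", round(formula_mass(ethyl_acetate, MONOISOTOPIC), 4))
print("average     ", round(formula_mass(ethyl_acetate, AVERAGE), 4))
```

The monoisotopic mass uses only the lightest isotope of each element, while the average mass uses abundance-weighted atomic masses, so the two differ by about 0.05 Da even for this small molecule.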


Amino Acids
• There are 20 standard amino acids. All have the
same basic structure but with different side
chains:

• Examples: side chain group

H
Glycine, or Gly, or G
Arginine, or Arg, or R
All the 20 Structures

* Picture copied from Dr. R.J. Huskey's website:
http://www.people.virginia.edu/~rjh9u/aminacid.html
Peptides and Proteins

GR

Glycine, or Gly, or G

Arginine, or Arg, or R

N-terminal C-terminal

peptide bonds
Mass of Amino Acid Residues

Exact Mass of Amino Acid Residues in Proteins

Gly G 57.02150
Ala A 71.03720
Gln Q 128.05860
Lys K 128.09500
Glu E 129.04270

Note: Leu (L) = Ile (I) = 113.08410


Amino Acid Table (monoisotopic residue masses; source: IONSOURCE.COM)
AA   Code  Mono.        AA   Code  Mono.
Gly  G     57.021464    Asp  D     115.02694
Ala  A     71.037114    Gln  Q     128.05858
Ser  S     87.032029    Lys  K     128.09496
Pro  P     97.052764    Glu  E     129.04259
Val  V     99.068414    Met  M     131.04048
Thr  T     101.04768    His  H     137.05891
Cys  C     103.00919    Phe  F     147.06841
Leu  L     113.08406    Arg  R     156.10111
Ile  I     113.08406    CMC  -     161.01467
Asn  N     114.04293    Tyr  Y     163.06333
-    -     -            Trp  W     186.07931
Cysteine

Proteins are often treated so that cysteine becomes
carboxyamidomethyl cysteine (CamC) or carboxymethyl
cysteine (CmC), in order to break the disulphide bonds.
CamC = 160.03, CmC = 161.01
Mass of Peptides and Proteins

Ala-Ser-Phe (ASF)

Tripeptide MW = 71.04 + 87.03 + 147.07 + 18.01 = 323.15
(three residue masses plus one water, 18.01, for the free N- and C-terminal ends)


More precisely: monoisotopic mass 323.1481
average mass 323.3490
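The same sum works for any peptide: add the residue masses and one water. The sketch below uses the monoisotopic residue masses from the amino acid table; the water mass 18.010565 and the names `RESIDUE_MONO`, `WATER_MONO`, and `peptide_mono_mass` are assumptions introduced for illustration, and modified residues such as CamC are not handled.

```python
# Monoisotopic residue masses in daltons, copied from the amino acid table.
RESIDUE_MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032029, "P": 97.052764,
    "V": 99.068414, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04048, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MONO = 18.010565  # H2O: the H on the N-terminus plus the OH on the C-terminus


def peptide_mono_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide given in one-letter codes."""
    return sum(RESIDUE_MONO[aa] for aa in sequence) + WATER_MONO


print(round(peptide_mono_mass("ASF"), 4))  # Ala-Ser-Phe
print(round(peptide_mono_mass("GR"), 4))   # Gly-Arg
```

Because Leu and Ile have the same residue mass, two peptides that differ only by an L/I swap get the same result, so mass alone cannot tell them apart.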
In a mass spectrum

Deconvolution adds all the isotopic peaks to the monoisotopic peak,
so the later processing does not need to worry about the isotopes.

[Figure: monoisotopic peak at 323.15, followed by isotope peaks at 324.15 and 325.15. Check the difference between neighboring peaks.]
ESI and Multiply Charged Ions
Electrospray
Ionization (2) – ESI

Electrospray Ionization: Formation of Charged Droplets

Formation of multiply charged ions


Multiply Charged Ions
• The same molecules may be charged
differently, and therefore form a few peaks in
the spectrum.
162.08 323.15

162.58 324.15

163.08 325.15

(M+3)/3 (M+2)/2 (M+1)/1 m/z


For proteins and peptides with positive charges, the charge comes from added
protons (each with a mass of approximately 1 dalton). As a result, a molecule with
mass M and charge Z gives a peak at m/z = (M+Z)/Z.
How to determine charge states?
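• Use the isotope peaks when the resolution is high enough: for charge Z,
neighboring isotope peaks are about 1/Z apart in m/z.
• Compare different charge states of the same molecule when the resolution
is not enough.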
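The isotope-spacing approach can be sketched as follows. It assumes the peaks passed in are consecutive isotope peaks of one ion and that the isotope step is about 1.00335 Da (the mass difference between carbon-13 and carbon-12); the function name and constant name are hypothetical.

```python
ISOTOPE_STEP = 1.00335  # 13C minus 12C, in daltons


def charge_from_isotope_spacing(isotope_peaks_mz):
    """Estimate the charge from the mean m/z spacing of consecutive isotope peaks."""
    if len(isotope_peaks_mz) < 2:
        raise ValueError("need at least two isotope peaks")
    peaks = sorted(isotope_peaks_mz)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    # Spacing is about ISOTOPE_STEP / z, so z is about ISOTOPE_STEP / spacing.
    return round(ISOTOPE_STEP / mean_spacing)


print(charge_from_isotope_spacing([323.15, 324.15, 325.15]))
print(charge_from_isotope_spacing([162.08, 162.58, 163.08]))
```

The estimate is only as good as the resolution: when peaks 1/Z apart can no longer be separated, the spacing cannot be measured and the second approach is needed.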
Exercise

395.73

396.22

397.24
Exercise
Exercise
1211.9

1304.7
1413.2

1541.9

(A) “Multi-charge envelope” (B) After “Charge-deconvolution algorithm”
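A multi-charge envelope can be deconvoluted with a little algebra. The sketch below is one simple approach, not necessarily the algorithm used to produce panel (B): it assumes neighboring peaks in the envelope come from the same molecule and differ by exactly one charge. If the peak at higher m/z has charge z and the one at lower m/z has charge z+1, then z = (low - p) / (high - low), where p is the proton mass. The names and the proton mass value are assumptions.

```python
PROTON = 1.007276  # mass of a proton in daltons (reference value)


def deconvolute_envelope(peaks_mz, proton=PROTON):
    """Estimate the neutral mass from neighboring peaks of one charge envelope.

    Returns (mean mass estimate, list of charges assigned to the higher-m/z
    peak of each neighboring pair).
    """
    peaks = sorted(peaks_mz)
    charges = []
    masses = []
    for low, high in zip(peaks, peaks[1:]):
        z = round((low - proton) / (high - low))  # charge of the `high` peak
        charges.append(z)
        masses.append(z * (high - proton))
    return sum(masses) / len(masses), charges


mass, charges = deconvolute_envelope([1211.9, 1304.7, 1413.2, 1541.9])
print(round(mass, 1), charges)
```

Each neighboring pair gives an independent mass estimate, so averaging them reduces the effect of the limited precision of the individual peak positions.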


Baseline
Baseline correction
Convex Hull Method

convex

not convex
Convex Hull

• For baseline correction, the (lower) convex hull is a chain of line
segments through data points such that all the data points
are on or above the lines and their extensions.
How to calculate convex hull?
• Stack S contains all the data points that
form the convex hull so far.
• Data points D[0], ..., D[n-1], with D[i] = (D[i].x, D[i].y), sorted by increasing x.

Algorithm:

1. S.push(D[0]); S.push(D[1])
2. for i from 2 to n-1
2.1 while S has at least two points and
    S.secondtop(), S.top(), D[i] are concave
2.1.1 S.pop();
2.2 S.push(D[i]);
3. return S

S.top()

S.secondtop() D[i]
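The stack algorithm above can be sketched in Python together with the baseline subtraction it is meant for. This is an illustration under assumptions: points are (x, y) tuples with strictly increasing x, "concave" is tested with a cross product (an implementation choice not stated on the slide), and the baseline between hull points is linearly interpolated. All function names are hypothetical.

```python
def cross(o, a, b):
    """Cross product of (a - o) and (b - o); positive means a lies below the line o-b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_hull(points):
    """Lower convex hull of points sorted by strictly increasing x."""
    stack = []
    for p in points:
        # Pop while secondtop, top, p are concave (top is on or above the line).
        while len(stack) >= 2 and cross(stack[-2], stack[-1], p) <= 0:
            stack.pop()
        stack.append(p)
    return stack


def baseline_correct(points):
    """Subtract the linearly interpolated lower hull from every point."""
    hull = lower_hull(points)
    if len(hull) < 2:
        return [(x, 0.0) for x, _ in points]
    corrected = []
    k = 0  # index of the hull segment hull[k] -> hull[k + 1] covering x
    for x, y in points:
        while k < len(hull) - 2 and x > hull[k + 1][0]:
            k += 1
        (x0, y0), (x1, y1) = hull[k], hull[k + 1]
        base = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        corrected.append((x, y - base))
    return corrected


spectrum = [(0, 1.0), (1, 3.0), (2, 2.0), (3, 6.0), (4, 2.0), (5, 4.0), (6, 3.0)]
print(lower_hull(spectrum))
print(baseline_correct(spectrum))
```

Points on the hull end up at zero after correction and every other point keeps a non-negative height, which is the property the baseline is chosen for.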
Analyze the convex hull algorithm
• Correctness
– The algorithm finishes.
– The output is a convex hull.
– The proof will be included in an assignment.
• Time complexity
– O(n) time.
– Proof: the points are already sorted by x; each point is pushed
onto the stack exactly once and popped from the stack at most
once.
Summary of spectrum preprocessing
• Baseline correction
• Centroiding
• Charge recognition and deconvolution
• Noise removal
