1
Bacterial
based
informaAon
storage
device
2
This
year,
The
CUHK…
3
In
AddiAon…
Coding System
EncrypAon DecrypAon
6
Coding
System
Encryp2on
Decryp2on
Coding System
MP
Storage
System
Encoding table
DNA sequence
Compression
7
Original message input
9
Coding
System
Coding
-‐
Compression
EncrypAon
DecrypAon
MP Storage System
10
Homopolymer
12
Coding
System
EncrypAon
EncrypAon
DecrypAon
MP Storage System
13
Simulation Analysis
8074
Characters!!!
14
FragmentaAon
of
message
• Larger
than
the
maximum
vector
inser2on
size
• Limita2on
of
current
DNA
synthesis
technology
→
Split
the
message
into
different
parts
How
do
you
deal
with
the
problem
of
posi2oning?
Postal Code
15
Storage
–
Massively
parallel
High Precision
Low
Precision
16
Header
of
AGAT
AGAC
AGTA
AGAG
2nd
fragment
Header
of
1st
fragment
Header
of
3rd
fragment
AGAT
AGAC
AGAG
AGCT
Header
of
2
1st
fragment
Header
of
3rd
3
fragment
0301
0302
0303
0321
Only
18
cells!!!
19
Capacity
20
Coding
System
DecrypAon
EncrypAon
DecrypAon
MP
Storage
System
sequencing
Checksum system
21
Checksum
Mechanism
F1
F2
F3
Checksum
CRC64
F2
F3
F1
Checksum
1
CRC64
F1
F2
F3
Checksum
2
22
23
To
prove
our
concept…
Coding
(extended
ASCII
code)
EncrypAon
DecrypAon
(message
+
rci
system)
(sequencing)
Storage
(in
E.coli
DH5
α)
24
25
Message
• we
must
learn
to
live
together
as
brothers
or
perish
together
as
ftools
ools
<<code
form
Dr.
Mar2n
Luther
King,
Jr.,
a
prominent
leader
in
the
African
American
civil
rights
movement
>>
eg. “tools”
Our message (70 characters)
DNA encoding:
TGTATCGGTC
GGTCGATGAG
(20bp)
DNA Encoding(280bp)
26
Structure
of
message
27
Parts
designed
Message
gene
template
(438bp)
Synthesized
DNA
29
ExpectaAon
30
Results
• Inverted
and
original
message
were
found
• No
loss
of
DNA
Checksum
and
high
throughput
sequencing!!!
31
High
throughput
sequencing
32
33
To
summarize…
Coding System
EncrypAon
DecrypAon
MP
Storage
System
• Experimental
proof
34
Bio-‐hard
disk
Storage
Hard
disk
2000GB
1
gram
E.coli
900,000GB
= 450
35
Rapid
&
Specific
access
• Parallel
storage
system
Insert
Header
&
Footer
in
every
message
fragment
Targeted sequencing
36
Future
ApplicaAon
ABC
37
Acknowledgement
38
Further
InformaAon
hqp://2010.igem.org/Team:Hong_Kong-‐CUHK
39
40
Q&A
41
Inversion
frequency
1. types
of
19-‐bp
repeat
sequences
(repeat-‐a>
repeat-‐d>
repeat-‐b
or
repeat-‐c)
2.
distance
between
repeat
sequences
(distance
increases,
frequency
increases)
3.
DNA
sequences
surrounding
the
repeat
sequences(symmetric
repeat
sequence
increase
frequency)
Inversion
frequency
4.
presence
of
HU
protein
(binding
of
HU
protein
to
DNA
might
facilitate
assembly
and/or
stabiliza2on
of
the
Rci-‐DNA
complex
at
the
recombina2on
sites,
increases
frequency)
5.
extent
of
DNA
supercoiling
(Inhibi2on
of
DNA
supercoilingà
decrease
Rci
ac2vityà
decrease
inversion
frequency)
To
avoid
muta2on
-‐
Reduce
reproduc2ve
cycle
-‐
Provide
favorable
condi2on
-‐
Move
on
to
eukaryotes,
make
use
of
eukaryotes’
proofreading
system(more
sophis2cated
DNA
repair
system)
Pacific
Bioscience
• Real
2me
• Read
Length
:
1000
-‐
10000bp
• Single
Molecule
Sequencing
• 30
minutes
sequencing
process