David Tam
Patrick Pang
Presentation Outline
Why DHTs?
Applications
Alternatives to DHTs
Performance Routing
Conclusion
What is DHT?
[Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table, which is spread across many nodes; get returns the data]
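The put/get interface above can be illustrated with a toy, single-process sketch. Everything here is an assumption made for illustration: the class name `ToyDHT`, the 16-bit identifier space, the use of SHA-1 to map keys to identifiers, and the rule that a key lives on the first node at or after its identifier. A real DHT would place the nodes on separate machines and route between them.

```python
import hashlib


class ToyDHT:
    """In-process stand-in for a DHT: each 'node' is just a local dict."""

    def __init__(self, node_ids, bits=16):
        self.ring = 2 ** bits
        self.nodes = {nid % self.ring: {} for nid in node_ids}

    def _key_id(self, key):
        # Hash the key onto the identifier ring.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.ring

    def _owner(self, key):
        # Assumed placement rule: first node id at or after the key id,
        # wrapping around to the smallest id.
        kid = self._key_id(key)
        ids = sorted(self.nodes)
        for nid in ids:
            if nid >= kid:
                return nid
        return ids[0]

    def put(self, key, data):
        self.nodes[self._owner(key)][key] = data

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)


dht = ToyDHT(node_ids=[100, 20000, 45000, 60000])
dht.put("song.mp3", b"...bytes...")
print(dht.get("song.mp3"))
```

The application only sees `put` and `get`; which node stores the data is decided inside the table.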
Why DHTs?
Why Middleware?
Simplifies the development of large-scale distributed applications
Better security and robustness
Simple API
Why Do We Need DHTs?
Simplifies the development of large-scale distributed applications
Better security and robustness
Simple API
Exploits P2P resources
Applications
Anything that requires a hash table
Databases, FSes, storage, archival
Web serving, caching
Content distribution
Query & indexing
Naming systems
Communication primitives
Chat services
Application-layer multi-casting
Event notification services
Publish/subscribe systems?
Finger table of Node 2 (identifier ring 0 to 15)
start  interval  succ.
3      [3,4)     5
4      [4,6)     5
6      [6,10)    7
10     [10,2)    10
Finger table of Node 10 (identifier ring 0 to 15)
start  interval  succ.
11     [11,12)   12
12     [12,14)   12
14     [14,2)    14
2      [2,10)    2
Finger table of Node 14 (identifier ring 0 to 15)
start  interval  succ.
15     [15,0)    15
0      [0,2)     1
2      [2,6)     2
6      [6,14)    7
Now Node 2 can retrieve the information for key 0 from Node 1.
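The lookup in this example (Node 2 asks for key 0 and ends at Node 1) can be sketched in Python. This is a simplified illustration, not the full Chord protocol. The node set is an assumption chosen to be consistent with the finger tables on the slides, and `successor`, `finger_table`, `in_interval`, and `lookup` are hypothetical helper names. The sketch also uses a global view of the ring to detect the owning node, which a real node would not have.

```python
M = 4                      # identifier bits, so the ring holds 2**4 = 16 ids
RING = 2 ** M
# Assumed node set, consistent with the finger tables on the slides.
NODES = [1, 2, 5, 7, 10, 12, 14, 15]


def successor(ident):
    """First node at or clockwise after ident."""
    for n in sorted(NODES):
        if n >= ident:
            return n
    return min(NODES)      # wrap around past 15


def finger_table(n):
    """Rows of (start, end, succ) for node n; each interval is [start, end)."""
    rows = []
    for k in range(M):
        start = (n + 2 ** k) % RING
        end = (n + 2 ** (k + 1)) % RING
        rows.append((start, end, successor(start)))
    return rows


def in_interval(x, start, end):
    """True if x lies in the half-open ring interval [start, end)."""
    if start < end:
        return start <= x < end
    return x >= start or x < end   # interval wraps past 15 back to 0


def lookup(node, key):
    """Return the list of nodes visited while resolving key from node."""
    path = [node]
    while node != successor(key):  # global check, a simplification
        for start, end, succ in finger_table(node):
            if in_interval(key, start, end):
                node = succ        # hop to the finger covering the key
                break
        path.append(node)
    return path


print(finger_table(2))   # [(3, 4, 5), (4, 6, 5), (6, 10, 7), (10, 2, 10)]
print(lookup(2, 0))      # [2, 10, 14, 1]
```

Each hop picks the finger whose interval contains the key, so the remaining distance to the key shrinks at every step.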
Alternatives to DHTs
Distributed file system
Centralized lookup
P2P flooding queries
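For contrast with DHT lookup, the flooding alternative can be sketched as a breadth-first broadcast with a hop limit. The overlay graph, the node names, and the `flood` function are all hypothetical and chosen only for illustration. The point is the message count, which grows with the number of links rather than with the logarithm of the network size.

```python
from collections import deque

# Hypothetical overlay: each node lists the neighbours it forwards queries to.
OVERLAY = {
    "N1": ["N2", "N3"],
    "N2": ["N1", "N4"],
    "N3": ["N1", "N4"],
    "N4": ["N2", "N3", "N5"],
    "N5": ["N4"],
}


def flood(start, target, ttl):
    """Flood a query from start for up to ttl hops.

    Returns (found, messages_sent).
    """
    seen = {start}
    frontier = deque([(start, ttl)])
    messages = 0
    found = start == target
    while frontier:
        node, hops_left = frontier.popleft()
        if hops_left == 0:
            continue                      # query expires here
        for neighbour in OVERLAY[node]:
            messages += 1                 # every forward costs one message
            if neighbour == target:
                found = True
            if neighbour not in seen:     # duplicates are dropped, not re-forwarded
                seen.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return found, messages


print(flood("N1", "N5", ttl=3))   # (True, 9)
print(flood("N1", "N5", ttl=2))   # (False, 6)
```

With too small a hop limit the target is missed even though it exists, and raising the limit costs more messages.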
[Figure: diagrams of the alternatives, with labels Client, Server, DB, Internet, Start, and Target, and overlay nodes labelled N1 to N10]
Performance -- Lookup
Purpose -- to locate a target node
At each step, try to get closer to the target node
Ask a closer neighbour
Performance & scalability tied directly to lookup algorithm
2 Aspects to Performance
Path latency
Lookup path length (# hops)
2 Aspects to Scalability
size of routing table O(log N)
lookup path length O(log N)
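To give a feel for what O(log N) means for routing-table size and lookup path length, here is a small back-of-the-envelope sketch. The function name `log_n_estimate` is hypothetical, and the figure ignores constant factors, so it shows only the order of magnitude.

```python
import math


def log_n_estimate(num_nodes):
    """Order-of-magnitude figure for O(log N): ceil(log2(N))."""
    return math.ceil(math.log2(num_nodes))


for n in (1_000, 1_000_000):
    print(n, log_n_estimate(n))
```

Growing the network from a thousand to a million nodes only doubles the estimate, from 10 to 20.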
3 Techniques
proximity lookup
proximity neighbour selection
[Figure: the identifier ring (0 to 15) shown three times, each with a finger table (interval, succ.); the tables begin at [3,4), [11,12), and [15,0)]
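[Figure: two Byzantine Generals scenarios. In the first, the orders are "attack" and "attack", and Lieutenant 2 reports "he said retreat". In the second, the orders are "attack" and "retreat", and Lieutenant 2 again reports "he said retreat".]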
Load-Balance
DHT: high, high, high, yes
Centralized Lookup: medium, medium, low, no
medium, high, low, no
Distributed FS: low, medium, medium, no
Facility: Fault-Tolerance, Self-Org, Admin
DHT: high, yes, low
Centralized Lookup: low, no, medium
yes, low
Distributed FS: no, high, medium
Research Projects
Iris: security & fault-tolerance, US Govt
Chord: circular key space
Pastry: circular key space
Tapestry: hypercube space
CAN: n-dimensional key space
Kelips: n-dimensional key space
DDS: middleware platform for Internet service construction; cluster-based; incremental scalability
Summary
Good middleware platform
Exploits P2P networks
An exciting new research area
References
Lamport, Leslie et al. The Byzantine Generals Problem
Sit, Emil; Morris, Robert. Security Considerations for Peer-to-Peer Distributed Hash Tables
Kaashoek, Frans. Distributed Hash Tables: Building large-scale, robust distributed applications
Stoica, Ion et al. Chord: A scalable peer-to-peer lookup service for Internet applications
Parnas, D. L. Connecting Theory to Practice: Software Engineering Programme