
Distributed Hash Tables

David Tam
Patrick Pang

Presentation Outline

What is DHT (Distributed Hash Table)?

Why DHTs?

Applications

How lookup works?

Alternatives to DHTs

Performance -- Routing

Performance -- Load Balancing

Security -- Routing Attack

Security -- Inconsistent Behaviour

Comparison to Other Facilities

Current Research Projects

Conclusion

What is DHT?
(Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table, which holds the data on its nodes)
A DHT provides the information lookup service for P2P applications.
Nodes uniformly distributed across key space
Nodes form an overlay network
Nodes maintain list of neighbours in routing table
Decoupled from physical network topology
(Figure adapted from Frans Kaashoek)
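
The put/get interface above can be sketched in a few lines. This is a single-process toy, not a real DHT: the class name ToyDHT, the use of SHA-1, and the rule "a key lives on the first node at or after its hash on the ring" are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left


class ToyDHT:
    """In-memory stand-in for a DHT: every 'node' is just a dict."""

    def __init__(self, node_names, bits=32):
        self.ring = 2 ** bits
        # Hypothetical placement: each node sits at the hash of its name.
        self.stores = {self._hash(name): {} for name in node_names}
        self.ids = sorted(self.stores)

    def _hash(self, text):
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.ring

    def _owner(self, key):
        # First node at or clockwise after the key's hash (wraps around).
        i = bisect_left(self.ids, self._hash(key))
        return self.ids[i % len(self.ids)]

    def put(self, key, data):
        self.stores[self._owner(key)][key] = data

    def get(self, key):
        return self.stores[self._owner(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("song.mp3", b"some bytes")
print(dht.get("song.mp3"))
```

The application only ever sees put and get; which node ends up holding the data is decided by the hash, not by the caller.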

Why DHTs?
Why Middleware?
Simplifies the development of large-scale distributed apps
Better security and robustness
Simple API
Why Do We Need DHTs?
Simplifies the development of large-scale distributed apps
Better security and robustness
Simple API
Exploits P2P resources

Applications
Anything that requires a hash table
Databases, FSes, storage, archival
Web serving, caching
Content distribution
Query & indexing
Naming systems
Communication primitives
Chat services
Application-layer multi-casting
Event notification services
Publish/subscribe systems ?

How lookup works?


Example: Chord [Stoica et al.]

Finger Table for Node 2

start  interval  succ.
3      [3,4)     5
4      [4,6)     5
6      [6,10)    7
10     [10,2)    10

(Figure: Chord identifier ring, positions 0 to 15)
How lookup works?


Example: Chord

Finger Table for Node 10

start  interval  succ.
11     [11,12)   12
12     [12,14)   12
14     [14,2)    14
2      [2,10)    2

(Figure: Chord identifier ring, positions 0 to 15)

How lookup works?


Example: Chord

Finger Table for Node 14

start  interval  succ.
15     [15,0)    15
0      [0,2)     1
2      [2,6)     2
6      [6,14)    7

(Figure: Chord identifier ring, positions 0 to 15)

How lookup works?


Example: Chord

Now Node 2 can retrieve information for key 0 from Node 1.

(Figure: Chord identifier ring, positions 0 to 15)
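
The walkthrough above can be replayed in code. This is a simplified sketch that follows the slides (pick the finger whose interval contains the key, then ask that finger's successor), not the full Chord protocol. The node set NODES is an assumption inferred from the three finger tables; the original ring may contain more nodes.

```python
M = 4                                   # identifier bits: IDs run from 0 to 15
RING = 2 ** M
NODES = [1, 2, 5, 7, 10, 12, 14, 15]    # assumed node set, kept sorted


def successor(ident):
    """First node at or clockwise after ident."""
    for node in NODES:
        if node >= ident % RING:
            return node
    return NODES[0]                     # wrap around past 15


def finger_table(n):
    """Rows of (start, interval_end, successor) for node n."""
    rows = []
    for k in range(M):
        start = (n + 2 ** k) % RING
        end = (n + 2 ** (k + 1)) % RING
        rows.append((start, end, successor(start)))
    return rows


def lookup(node, key):
    """Return (owner, path), where path lists the nodes that were asked."""
    path = [node]
    while key != node:
        for start, end, succ in finger_table(node):
            if (key - start) % RING < (end - start) % RING:
                break                   # key lies in [start, end)
        if (key - start) % RING <= (succ - start) % RING:
            return succ, path           # no node sits between start and succ
        node = succ
        path.append(node)
    return node, path


print(finger_table(2))   # [(3, 4, 5), (4, 6, 5), (6, 10, 7), (10, 2, 10)]
print(lookup(2, 0))      # (1, [2, 10, 14])
```

Node 2 asks Node 10, Node 10 points to Node 14, and Node 14 knows that key 0 is held by Node 1, matching the sequence of finger tables shown on the slides.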

Alternatives to DHTs
Distributed file system
Centralized lookup
P2P flooding queries

(Figures: clients reaching a lookup server and its DB over the Internet; peer nodes N1 to N10 with a query travelling from Start to Target)
(Figures adapted from Frans Kaashoek)

Performance -- Lookup
Purpose -- to locate a target node
Each step, try to get closer to locating target node
Ask a closer neighbour
Performance & scalability tied directly to lookup algorithm
2 Aspects to Performance
Path latency
Lookup path length (# hops)

2 Aspects to Scalability
size of routing table O(log N)
lookup path length O(log N)

3 Techniques
proximity lookup
proximity neighbour selection
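
The O(log N) path length can be checked empirically with a small simulation. This is a sketch under stated assumptions: nodes get random IDs on a 32-bit ring, routing is greedy over Chord-style fingers, and the function name average_hops and its parameters are made up for this example.

```python
import bisect
import math
import random


def average_hops(num_nodes, bits=32, trials=200, seed=0):
    """Average hop count for greedy finger routing on a random ring."""
    if num_nodes < 2:
        return 0.0
    rng = random.Random(seed)
    ring = 2 ** bits
    nodes = sorted(rng.sample(range(ring), num_nodes))

    def successor(ident):
        # First node at or clockwise after ident.
        i = bisect.bisect_left(nodes, ident % ring)
        return nodes[i % num_nodes]

    def hops_for(node, key):
        hops = 0
        while True:
            to_key = (key - node) % ring
            if to_key == 0:
                return hops                  # this node holds the key
            succ = successor(node + 1)
            if to_key <= (succ - node) % ring:
                return hops + 1              # the next node holds the key
            # Otherwise jump to the farthest finger that stays before the key.
            for k in reversed(range(bits)):
                finger = successor(node + 2 ** k)
                if 0 < (finger - node) % ring < to_key:
                    node = finger
                    hops += 1
                    break

    total = sum(hops_for(rng.choice(nodes), rng.randrange(ring))
                for _ in range(trials))
    return total / trials


for n in (16, 256, 4096):
    print(n, average_hops(n), math.log2(n))
```

Each jump at least halves the remaining distance to the key, which is why the hop count grows with log N rather than with N.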

Performance -- Load Balancing


Issues
Hot-spots
Content
Lookup
Heterogeneous nodes & paths
System flux
Solution
Replication is the key
Also good for fault-tolerance
Cache lookup answers backwards along path
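
Caching answers backwards along the lookup path can be sketched as follows. To keep the example short, routing is assumed to walk a plain ring one node at a time; the Node class, make_ring, and lookup are hypothetical names, and a real DHT would route through its finger or routing tables instead.

```python
class Node:
    def __init__(self, ident):
        self.ident = ident
        self.store = {}    # keys this node is responsible for
        self.cache = {}    # answers remembered from earlier lookups
        self.next = None   # next hop (assumption: plain ring, one hop at a time)


def make_ring(count):
    nodes = [Node(i) for i in range(count)]
    for i, node in enumerate(nodes):
        node.next = nodes[(i + 1) % count]
    return nodes


def lookup(start, key):
    """Return (value, hops); value is None if no node knows the key."""
    path = []
    node = start
    while key not in node.store and key not in node.cache:
        path.append(node)
        node = node.next
        if node is start:
            return None, len(path)
    value = node.store[key] if key in node.store else node.cache[key]
    for visited in reversed(path):     # hand the answer back along the path
        visited.cache[key] = value
    return value, len(path)


nodes = make_ring(5)
nodes[4].store["hot"] = "payload"
print(lookup(nodes[0], "hot"))   # ('payload', 4)
print(lookup(nodes[2], "hot"))   # ('payload', 0), served from Node 2's cache
```

After the first lookup every node on the path can answer for the hot key, so later requests stop before they reach the node that stores it.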

Security -- Incorrect Lookup (1)

When asked for the next hop, give a wrong answer

Finger Table for Node 2

start  interval  succ.
3      [3,4)     5
4      [4,6)     5
6      [6,10)    7
10     [10,2)    10

Node 2 to Node 10: Please tell me how to reach key 0.

(Figure: Chord identifier ring, positions 0 to 15)

Security -- Incorrect Lookup (2)

When asked for the next hop, give a wrong answer

Finger Table for Node 10

start  interval  succ.
11     [11,12)   12
12     [12,14)   12
14     [14,2)    14
2      [2,10)    2

Node 2 to Node 10: Please tell me how to reach key 0.
Node 10 answers: ask Node 14

(Figure: Chord identifier ring, positions 0 to 15)

Security -- Incorrect Lookup (3)

When asked for the next hop, give a wrong answer

Finger Table for Node 14

start  interval  succ.
15     [15,0)    15
0      [0,2)     1
2      [2,6)     2
6      [6,14)    7

Node 2 to Node 14: Please tell me how to reach key 0.
Node 14 answers: ask Node 10

(Figure: Chord identifier ring, positions 0 to 15)

Security -- Incorrect Lookup (4)

Solution [Sit and Morris]:
Define verifiable system invariant
Allow the querier to observe lookup progress
Our idea of how this can be implemented:
Use an integer quantity that decreases monotonically at every hop as the measure of progress.
Monotonically decreasing quantities are already used in program construction to guarantee total correctness. [Parnas]
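
One way to picture the progress check is shown below. The choice of quantity is an assumption of this sketch: it uses the clockwise distance from each forwarding node to the key on the 16-position ring from the earlier example, and the names distance and verify_progress are hypothetical. The check covers forwarding hops only, since the final owner sits at or past the key.

```python
RING = 16   # identifier space of the example ring, IDs 0 to 15


def distance(node, key):
    """Clockwise distance from node to key on the identifier ring."""
    return (key - node) % RING


def verify_progress(hops, key):
    """Return the index of the first hop that fails to get closer, else None."""
    best = None
    for i, node in enumerate(hops):
        d = distance(node, key)
        if best is not None and d >= best:
            return i               # no progress: reject this answer
        best = d
    return None


print(verify_progress([2, 10, 14], 0))       # None: distances 14, 6, 2
print(verify_progress([2, 10, 14, 10], 0))   # 3: Node 14 sent us back to Node 10
```

Because the distance is a non-negative integer that must shrink at every hop, a lookup that passes the check cannot loop forever, and the querier catches the wrong answer from Node 14 as soon as it arrives.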

Security -- Inconsistent Behaviour

Inconsistent Behaviour, i.e., lie intelligibly
Sybil attack [Kaashoek]
Solution 1: public key solution
Solution 2: Byzantine Protocol
Byzantine Generals Problem:
How to find out the traitors among the Generals? [Lamport]
(Figure: the Commander sends "attack" to Lieutenant 1 and "attack" to Lieutenant 2; the message "he said retreat" passes between the two lieutenants)

Security -- Inconsistent Behaviour

Byzantine Generals Problem:
(Figure: the Commander sends "attack" to one lieutenant and "retreat" to the other; the message "he said retreat" again passes between the two lieutenants)

Comparison to Other Facilities


Facility              Abstraction  Easy Use/Prg  Scalability  Load-Balance
DHT                   high         high          high         yes
Centralized Lookup    medium       medium        low          no
P2P flooding queries  medium       high          low          no
Distributed FS        low          medium        medium       no

Facility              Fault-Tolerance  Self-Org  Admin
DHT                   high             yes       low
Centralized Lookup    low              no        medium
P2P flooding queries  depends          yes       low
Distributed FS        no               high      medium

Research Projects
Iris -- security & fault-tolerance -- US Govt
Chord -- circular key space
Pastry -- circular key space
Tapestry -- hypercube space
CAN -- n-dimensional key space
Kelips -- n-dimensional key space
DDS -- middleware platform for internet service construction
    -- cluster-based
    -- incremental scalability

Summary
Good middleware platform
Exploits P2P networks
An exciting new research area

References
Lamport, Leslie et al. The Byzantine Generals Problem
Sit, Emil, Morris, Robert. Security Considerations for Peer-to-Peer Distributed Hash Tables
Kaashoek, Frans. Distributed Hash Tables: Building large-scale, robust distributed applications
Stoica, Ion et al. Chord: A scalable peer-to-peer lookup service for Internet applications
Parnas, D. L. Connecting Theory to Practice: Software Engineering Programme
