Name Enti N Locatn

Distributed Systems
Principles and Paradigms
Chapter 04
(version 15th May 2006)
Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science
Room R4.20. Tel: (020) 444 7784
E-mail: steen@cs.vu.nl, URL: www.cs.vu.nl/~steen/
01 Introduction
02 Communication
03 Processes
04 Naming
05 Synchronization
06 Consistency and Replication
07 Fault Tolerance
08 Security
09 Distributed Object-Based Systems
10 Distributed File Systems
11 Distributed Document-Based Systems
12 Distributed Coordination-Based Systems
Naming Entities
Names, identifiers, and addresses

Name resolution
Name space implementation
04 – 1 Naming/4.1 Naming Entities

Naming
Essence: Names are used to denote entities in a distributed system. To operate on an entity, we need to access it at an access point. Access points are entities that are named by means of an address.

Note: A location-independent name for an entity E is independent of the addresses of the access points offered by E.

Identifiers
Pure name: A name that has no meaning at all; it is just a random string. Pure names can be used for comparison only.

Identifier: A name having the following properties:
P1 Each identifier refers to at most one entity
P2 Each entity is referred to by at most one identifier
P3 An identifier always refers to the same entity (prohibits reusing an identifier)

Observation: An identifier need not necessarily be a pure name, i.e., it may have content.

Question: Can the content of an identifier ever change?

Name Space (1/2)
Essence: a graph in which a leaf node represents a (named) entity. A directory node is an entity that refers to other nodes.

[Figure: a naming graph rooted at n0, with directory nodes and leaf nodes. The data stored in directory node n1 is the table n2: "elke", n3: "max", n4: "steen". Path names shown: "/keys", "/home/steen/keys", "/home/steen/mbox". Other edge labels: home, keys, .twmrc, mbox.]

Note: A directory node contains a (directory) table of (edge label, node identifier) pairs.
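A minimal sketch of path-name resolution over directory tables may help here. The node identifiers n0 to n5 follow the figure, but the exact edges, the leaf names (leaf_twmrc, leaf_mbox), and the function name resolve are assumptions made for illustration only.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
# n0..n5 follow the figure; the edges and leaf names are assumed.
# Nodes without a table are treated as leaf nodes.
DIRECTORY_TABLES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"elke": "n2", "max": "n3", "steen": "n4"},
    "n4": {".twmrc": "leaf_twmrc", "mbox": "leaf_mbox", "keys": "n5"},
}


def resolve(path, root="n0"):
    """Resolve an absolute path name by following one edge label per step."""
    node = root
    for label in (part for part in path.split("/") if part):
        table = DIRECTORY_TABLES.get(node)
        if table is None:
            raise NotADirectoryError(f"{node} is not a directory node")
        if label not in table:
            raise KeyError(f"no edge labelled {label!r} leaves {node}")
        node = table[label]
    return node


print(resolve("/home/steen/keys"))
print(resolve("/keys"))
```

In this sketch both path names lead to the same node, which illustrates that a node can have more than one name.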

Name Space (2/2)
Observation: We can easily store all kinds of attributes in a node, describing aspects of the entity the node represents:
Type of the entity
An identifier for that entity
Address of the entity’s location
Nicknames
...

Observation: Directory nodes can also have attributes, besides just storing a directory table with (edge label, node identifier) pairs.

Name Resolution
Problem: To resolve a name we need a directory node. How do we actually find that (initial) node?

Closure mechanism: The mechanism to select the implicit context from which to start name resolution:
www.cs.vu.nl: start at a DNS name server
/home/steen/mbox: start at the local NFS file server (possible recursive search)
0031204447784: dial a phone number
130.37.24.8: route to the VU’s Web server

Question: Why are closure mechanisms always implicit?

Observation: A closure mechanism may also determine how name resolution should proceed

Name Linking (1/2)
Hard link: What we have described so far as a path name: a name that is resolved by following a specific path in a naming graph from one node to another.

Soft link: Allow a node O to contain a name of another node:
First resolve O’s name (leading to O)
Read the content of O, yielding name
Name resolution continues with name

Observations:
The name resolution process determines that we read the content of a node, in particular, the name in the other node that we need to go to.
One way or the other, we know where and how to start name resolution given name
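The soft-link steps can be sketched as follows. This is only an illustration under assumptions: the graph layout, the SOFT_LINKS table, the max_links guard, and the choice to restart at the root for an absolute name are not taken from the slides.

```python
# Hypothetical naming graph: n6 is a leaf node whose content is a name.
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},
}
SOFT_LINKS = {"n6": "/keys"}  # node identifier -> stored (absolute) name


def resolve(path, root="n0", max_links=8):
    """Resolve a path name, continuing with the stored name at a soft link."""
    node = root
    labels = [part for part in path.split("/") if part]
    while labels:
        label = labels.pop(0)
        node = GRAPH[node][label]  # follow one hard-link edge
        if node in SOFT_LINKS:
            if max_links == 0:
                raise RuntimeError("too many soft links")
            max_links -= 1
            stored = SOFT_LINKS[node]  # read the content of the node
            # The stored name is absolute, so resolution restarts at the root
            # and then continues with whatever labels were still pending.
            labels = [part for part in stored.split("/") if part] + labels
            node = root
    return node


print(resolve("/home/steen/keys"))
```

The max_links guard is one simple way to stop a cycle of soft links; real systems use comparable limits, but the value 8 here is arbitrary.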

Name Linking (2/2)
[Figure: the naming graph rooted at n0 extended with a soft link. Data stored in n1: n2: "elke", n3: "max", n4: "steen". The name "/keys" leads to n5. The leaf node n6, reached as "/home/steen/keys", stores the name "/keys" as its data.]

Observation: Node n5 has only one name

Merging Name Spaces (1/3)
Problem: We have different name spaces that we wish to access from any given name space.

Solution 1: Introduce a naming scheme by which pathnames of different name spaces are simply concatenated (URLs).

ftp://ftp.cs.vu.nl/pub/steen/

ftp            Name of protocol used to talk with server
://            Name space delimiter
ftp.cs.vu.nl   Name of a node representing an FTP server, and containing the IP address of that server
/              Name space delimiter
pub/steen/     Name of a (context) node in the name space rooted at the context node mapped to the FTP server
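As a small illustration of how such a concatenated name decomposes, the sketch below splits a URL with Python's standard urllib.parse.urlsplit. It assumes the example name is the URL ftp://ftp.cs.vu.nl/pub/steen/; the helper name split_name is hypothetical.

```python
from urllib.parse import urlsplit


def split_name(url):
    """Split a URL into (protocol, server name, path in the server's name space)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


protocol, server, path = split_name("ftp://ftp.cs.vu.nl/pub/steen/")
print(protocol)  # name of the protocol used to talk with the server
print(server)    # name resolved in the DNS name space
print(path)      # name resolved in the name space rooted at the server
```

Each component is resolved in a different name space: the protocol selects how to talk to the server, the server name goes to DNS, and the path is interpreted by the FTP server itself.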

Solution 2: Introduce nodes that contain the name of a node in a “foreign” name space, along with the information on how to select the initial context in that foreign name space (Jade).

[Figure: Machine A runs a name server whose naming graph contains a node storing the reference "nfs://flits.cs.vu.nl//home/steen" to a foreign name space. Machine B runs the name server for the foreign name space. The machines are connected through the network. Edge labels in the figure include keys, remote, vu, home, steen, and mbox.]

Mount point: (Directory) node in naming graph that refers to other naming graph

Mounting point: (Directory) node in other naming graph that is referred to.

Solution 3: Use only full pathnames, in which the starting context is explicitly identified, and merge by adding a new root node (DEC’s Global Name Space).

[Figure: name spaces NS1 (rooted at n0) and NS2 (rooted at m0) merged under a new root node. Example full pathnames: "n0:/home/steen/keys" and "m0:/mbox".]

Note: In principle, you always have to start in the new root

Name Space Implementation (1/2)
Basic issue: Distribute the name resolution process as well as name space management across multiple machines, by distributing nodes of the naming graph.

Consider a hierarchical naming graph and distinguish three levels:

Global level: Consists of the high-level directory nodes. Main aspect is that these directory nodes have to be jointly managed by different administrations

Administrational level: Contains mid-level directory nodes that can be grouped in such a way that each group can be assigned to a separate administration.

Managerial level: Consists of low-level directory nodes within a single administration. Main issue is effectively mapping directory nodes to local name servers.

Name Space Implementation (2/2)
[Figure: an example partitioning of a name space into a global layer, an administrational layer, and a managerial layer, with one zone marked. Node labels in the figure include com, edu, gov, mil, org, net, jp, us, nl, sun, yale, acm, ieee, ac, co, oce, vu, eng, cs, jack, jill, keio, nec, ai, linda, csl, ftp, www, pc24, robot, pub, globe, and index.txt.]

Item                   Global      Administrational   Managerial
Geographical scale     Worldwide   Organization       Department
Number of nodes        Few         Many               Vast numbers
Responsiveness         Seconds     Milliseconds       Immediate
Update propagation     Lazy        Immediate          Immediate
Number of replicas     Many        None or few        None
Client-side caching?   Yes         Yes                Sometimes

Iterative Name Resolution
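resolve(dir, [name1, ..., nameK]) is sent to Server0 responsible for dir
Server0 resolves resolve(dir, name1) → dir1, returning the identification (address) of Server1, which stores dir1.
Client sends resolve(dir1, [name2, ..., nameK]) to Server1
etc.

[Figure: iterative resolution of <nl,vu,cs,ftp> by the client's name resolver; #<...> denotes the address of the named node]
1. Resolver to root name server: <nl,vu,cs,ftp>
2. Root name server to resolver: #<nl>, <vu,cs,ftp>
3. Resolver to name server of the nl node: <vu,cs,ftp>
4. Name server of the nl node to resolver: #<vu>, <cs,ftp>
5. Resolver to name server of the vu node: <cs,ftp>
6. Name server of the vu node to resolver: #<cs>, <ftp>
7. Resolver to name server of the cs node: <ftp>
8. Name server of the cs node to resolver: #<ftp>
The client hands <nl,vu,cs,ftp> to its resolver and receives #<nl,vu,cs,ftp>. The cs and ftp nodes are managed by the same server.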
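The iterative scheme can be sketched in a few lines. The server names, the SERVERS table, and the address strings below are hypothetical stand-ins; the point is only that each server resolves one label and the client's resolver drives every step.

```python
# Hypothetical servers: label -> (address of the node, server to ask next).
SERVERS = {
    "root-server": {"nl": ("#<nl>", "nl-server")},
    "nl-server": {"vu": ("#<vu>", "vu-server")},
    "vu-server": {"cs": ("#<cs>", "cs-server")},
    "cs-server": {"ftp": ("#<ftp>", None)},  # cs and ftp share a server
}


def server_resolve_one(server, names):
    """A server resolves only the first label and hands back the rest."""
    address, next_server = SERVERS[server][names[0]]
    return address, next_server, names[1:]


def iterative_resolve(names, server="root-server"):
    """The client contacts each server in turn; returns (address, messages)."""
    messages = 0
    address = None
    while names:
        address, server, names = server_resolve_one(server, names)
        messages += 2  # one request from the client plus one reply
    return address, messages


print(iterative_resolve(["nl", "vu", "cs", "ftp"]))
```

Counting a request and a reply per step gives eight messages for a four-label name, all of them crossing the path between the client and a name server.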

Recursive Name Resolution
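resolve(dir, [name1, ..., nameK]) is sent to Server0 responsible for dir
Server0 resolves resolve(dir, name1) → dir1, and sends resolve(dir1, [name2, ..., nameK]) to Server1, which stores dir1.
Server0 waits for the result from Server1, and returns it to the client.

[Figure: recursive resolution of <nl,vu,cs,ftp>; #<...> denotes the address of the named node]
1. Client's name resolver to root name server: <nl,vu,cs,ftp>
2. Root name server to name server of the nl node: <vu,cs,ftp>
3. Name server of the nl node to name server of the vu node: <cs,ftp>
4. Name server of the vu node to name server of the cs node: <ftp>
5. Name server of the cs node to name server of the vu node: #<ftp>
6. Name server of the vu node to name server of the nl node: #<cs,ftp>
7. Name server of the nl node to root name server: #<vu,cs,ftp>
8. Root name server to client's name resolver: #<nl,vu,cs,ftp>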
Caching in Recursive Name Resolution

Server for node   Should resolve    Looks up   Passes to child   Receives and caches              Returns to requester
cs                <ftp>             #<ftp>     -                 -                                #<ftp>
vu                <cs,ftp>          #<cs>      <ftp>             #<ftp>                           #<cs>, #<cs,ftp>
nl                <vu,cs,ftp>       #<vu>      <cs,ftp>          #<cs>, #<cs,ftp>                 #<vu>, #<vu,cs>, #<vu,cs,ftp>
root              <nl,vu,cs,ftp>    #<nl>      <vu,cs,ftp>       #<vu>, #<vu,cs>, #<vu,cs,ftp>    #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>
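The table can be reproduced with a short sketch of recursive resolution in which every server caches what its child returns. The CHILD and CACHES structures and the address strings are assumptions for illustration; the address of a compound name is taken to be the address of its last node.

```python
# Hypothetical servers: label -> (address of the node, server for that node).
CHILD = {
    "root": {"nl": ("#<nl>", "nl")},
    "nl": {"vu": ("#<vu>", "vu")},
    "vu": {"cs": ("#<cs>", "cs")},
    "cs": {"ftp": ("#<ftp>", None)},
}
CACHES = {server: {} for server in CHILD}  # per-server cache of learned names


def recursive_resolve(server, names):
    """Return {name tuple: address} for every prefix this server can report."""
    head, rest = names[0], tuple(names[1:])
    address, child = CHILD[server][head]      # "looks up"
    results = {(head,): address}
    if rest:
        learned = recursive_resolve(child, rest)  # "passes to child"
        CACHES[server].update(learned)            # "receives and caches"
        for suffix, suffix_address in learned.items():
            results[(head,) + suffix] = suffix_address
    return results                                # "returns to requester"


answer = recursive_resolve("root", ("nl", "vu", "cs", "ftp"))
print(sorted(answer))
print(CACHES["nl"])
```

Because intermediate results accumulate at higher-level servers, later requests for names in the same subtree could be answered from the cache; the sketch only fills the caches and does not yet consult them.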

Scalability Issues (1/2)
Size scalability: We need to ensure that servers can handle a large number of requests per time unit; high-level servers are in big trouble.

Solution: Assume (at least at global and administrational level) that content of nodes hardly ever changes. In that case, we can apply extensive replication by mapping nodes to multiple servers, and start name resolution at the nearest server.

Observation: An important attribute of many nodes is the address where the represented entity can be contacted. Replicating nodes makes large-scale traditional name servers unsuitable for locating mobile entities, because each address change would have to be propagated to all replicas.

Scalability Issues (2/2)
Geographical scalability: We need to ensure that the name resolution process scales across large geographical distances.

Problem: By mapping nodes to servers that may, in principle, be located anywhere, we introduce an implicit location dependency in our naming scheme.

Solution: No general one available yet.

Locating Mobile Entities
Naming versus locating objects

Simple solutions
Home-based approaches
Hierarchical approaches
04 – 19 Naming/4.2 Locating Mobile Entities

Naming & Locating Objects (1/2)
Location service: Solely aimed at providing the addresses of the current locations of entities.
Assumption: Entities are mobile, so that their current address may change frequently.

Naming service: Aimed at providing the content of nodes in a name space, given a (compound) name. Content consists of different (attribute, value) pairs.
Assumption: Node contents at global and administrational level are relatively stable for scalability reasons.

Observation: If a traditional naming service is used to locate entities, we also have to assume that node contents at the managerial level are stable, as we can use only names as identifiers (think of Web pages).

Naming & Locating Objects (2/2)
Problem: It is not realistic to assume stable node contents down to the local naming level

Solution: Decouple naming from locating entities

[Figure: (a) direct mapping in which names map to addresses; (b) two-level mapping in which a naming service maps names to an entity ID and a location service maps the entity ID to addresses.]

Name: Any name in a traditional naming space
Entity ID: A true identifier
Address: Provides all information necessary to contact an entity

Observation: An entity’s name is now completely independent of its location.

Question: What may be a typical address?

Simple Solutions for Locating Entities
Broadcasting: Simply broadcast the ID, requesting the entity to return its current address.
Can never scale beyond local-area networks (think of ARP/RARP)
Requires all processes to listen to incoming location requests

Forwarding pointers: Each time an entity moves, it leaves behind a pointer telling where it has gone to.
Dereferencing can be made entirely transparent to clients by simply following the chain of pointers
Update a client’s reference as soon as present location has been found
Geographical scalability problems:
– Long chains are not fault tolerant
– Increased network latency at dereferencing
Essential to have separate chain reduction mechanisms
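A minimal sketch of forwarding pointers, assuming a hypothetical table that maps each location to the next one (or to None where the entity currently resides). The class ClientReference and the location names are illustrative only.

```python
def follow_chain(pointers, start):
    """Follow forwarding pointers from start; return (current location, hops)."""
    location, hops = start, 0
    while pointers[location] is not None:
        location = pointers[location]
        hops += 1
    return location, hops


class ClientReference:
    """A client's reference that is updated once the present location is found."""

    def __init__(self, location):
        self.location = location

    def dereference(self, pointers):
        current, hops = follow_chain(pointers, self.location)
        self.location = current  # shortcut: skip the chain next time
        return current, hops


# The entity moved from A to B to C, leaving a pointer behind each time.
pointers = {"A": "B", "B": "C", "C": None}
ref = ClientReference("A")
print(ref.dereference(pointers))  # follows the whole chain
print(ref.dereference(pointers))  # uses the updated reference
```

Updating the client's reference shortens later lookups for that client only; the pointers left at A and B still exist, which is why separate chain reduction remains necessary.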

Home-Based Approaches (1/2)
Single-tiered scheme: Let a home keep track of where the entity is:
An entity’s home address is registered at a naming service
The home registers the foreign address of the entity
Clients always contact the home first, and then continue with the foreign location

[Figure: a client contacting a mobile host through the host's home location]
1. Send packet to host at its home
2. Return address of current location
3. Tunnel packet to current location
4. Send successive packets to current location

Home-Based Approaches (2/2)
Two-tiered scheme: Keep track of visiting entities:
Check local visitor register first
Fall back to home location if local lookup fails

Problems with home-based approaches:
The home address has to be supported as long as the entity lives.
The home address is fixed, which means an unnecessary burden when the entity permanently moves to another location
Poor geographical scalability (the entity may be next to the client)

Question: How can we solve the “permanent move” problem?

Hierarchical Location Services (HLS)
Basic idea: Build a large-scale search tree for which the underlying network is divided into hierarchical domains. Each domain is represented by a separate directory node.

[Figure: the root directory node dir(T) represents the top-level domain T; directory node dir(S) represents a subdomain S of T (S is contained in T); leaf domains are contained in S.]

HLS: Tree Organization
The address of an entity is stored in a leaf node, or in an intermediate node
Intermediate nodes contain a pointer to a child if and only if the subtree rooted at the child stores an address of the entity
The root knows about all entities

[Figure: the location record for E at node M has one field per child domain: the field for domain dom(N) holds a pointer to N, and other fields hold no data. At a leaf node the location record has only one field, containing an address. Domains D1 and D2 are shown.]

HLS: Lookup Operation
Basic principles:
Start lookup at local leaf node
If node knows about the entity, follow downward pointer, otherwise go one level up
Upward lookup always stops at root

[Figure: a look-up request issued in domain D. A node that has no record for E forwards the request to its parent; node M knows about E, so the request is forwarded to a child.]
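The lookup principles can be sketched on a small tree. Everything named here (the PARENT tree, the RECORDS layout, the node names, and the simplified insert that assumes a single address per entity) is hypothetical and only meant to show the upward and downward phases.

```python
# Hypothetical tree of directory nodes: node -> parent (None for the root).
PARENT = {"root": None, "D": "root", "D_other": "root",
          "leaf1": "D", "leaf2": "D", "leaf3": "D_other"}
# Location records: node -> {entity: ("pointer", child) or ("address", addr)}.
RECORDS = {node: {} for node in PARENT}


def insert(entity, leaf, address):
    """Store the address at a leaf and install pointers up to a node that knows E."""
    RECORDS[leaf][entity] = ("address", address)
    child, node = leaf, PARENT[leaf]
    while node is not None and entity not in RECORDS[node]:
        RECORDS[node][entity] = ("pointer", child)
        child, node = node, PARENT[node]


def lookup(entity, start_leaf):
    """Go up until a node knows the entity, then follow pointers downward."""
    node = start_leaf
    visited = [node]
    while entity not in RECORDS[node]:
        node = PARENT[node]
        if node is None:
            raise KeyError(f"{entity} is not registered")
        visited.append(node)
    kind, value = RECORDS[node][entity]
    while kind == "pointer":
        node = value
        visited.append(node)
        kind, value = RECORDS[node][entity]
    return value, visited


insert("E", "leaf1", "addr-1")
print(lookup("E", "leaf2"))  # found one level up, inside domain D
print(lookup("E", "leaf3"))  # has to climb to the root first
```

A lookup that starts near the entity stays local, while one from a distant leaf domain climbs to the root, which is the locality property the tree is meant to provide.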

HLS: Insert Operation

HLS: Record Placement
Observation: If an entity E moves regularly between leaf domains D1 and D2, it may be more efficient to store E’s contact record at the least common ancestor of dir(D1) and dir(D2).
Lookup operations from either D1 or D2 are on average cheaper
Update operations (i.e., changing the current address) can be done directly at that ancestor node
Note: assuming that E generally stays in the domain of that ancestor, it does make sense to cache a pointer to that node

[Figure: E moves regularly between two subdomains of domain D; cached pointers refer to node dir(D).]

HLS: Scalability Issues
Size scalability: Again, we have a problem of overloading higher-level nodes:
Only solution is to partition a node into a number of subnodes and evenly assign entities to subnodes
Naive partitioning may introduce a node management problem, as a subnode may have to know how its parent and children are partitioned.

Geographical scalability: We have to ensure that lookup operations generally proceed monotonically in the direction of where we’ll find an address:
If entity E generally resides in California, we should not let a root subnode located in France store E’s contact record.
Unfortunately, subnode placement is not that easy, and only a few tentative solutions are known

Reclaiming References
Reference counting
Reference listing
Scalability issues
04 – 31 Naming/4.3 Reclaiming References

Unreferenced Objects: Problem
Assumption: Objects may exist only if it is known that they can be contacted:
Each object should be named
Each object can be located
A reference can be resolved to client–object communication

Problem: Removing unreferenced objects:
How do we know when an object is no longer referenced (think of cyclic references)?
Who is responsible for (deciding on) removing an object?

Reference Counting (1/2)
Principle: Each time a client creates (removes) a reference to an object O, a reference counter local to O is incremented (decremented)

Problem 1: Dealing with lost (and duplicated) messages:
An increment is lost so that the object may be prematurely removed
A decrement is lost so that the object is never removed
An ACK is lost, so that the increment/decrement is resent.

Solution: Keep track of duplicate requests.

[Figure: proxy p in process P and the skeleton of object O, which maintains the reference counter]
1. Proxy p sends +1 to the skeleton
2. The skeleton's ACK is lost
3. Proxy p resends +1
4. The skeleton sends ACK; proxy p is now counted twice
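One possible way to keep track of duplicate requests is to tag each increment or decrement with a request identifier and apply it at most once. The sketch below assumes such identifiers exist; the class name Skeleton matches the figure, but the handle method and the identifier format are hypothetical.

```python
class Skeleton:
    """Maintains the reference counter and remembers requests already applied."""

    def __init__(self):
        self.count = 0
        self.seen = set()  # request identifiers that were already processed

    def handle(self, request_id, delta):
        """Apply +1 or -1 once per request identifier; always acknowledge."""
        if request_id not in self.seen:
            self.seen.add(request_id)
            self.count += delta
        return "ACK"


skeleton = Skeleton()
skeleton.handle("p:1", +1)  # the ACK for this request is lost
skeleton.handle("p:1", +1)  # the proxy resends the same request
print(skeleton.count)       # proxy p is counted once, not twice
```

The set of seen identifiers grows without bound in this sketch; a real implementation would need some way to discard old entries.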

Reference Counting (2/2)
Problem 2: Dealing with duplicated references – client P1 tells client P2 about object O:
Client P2 creates a reference to O, but dereferencing (communicating with O) may take a long time
If the last reference known to O is removed before P2 talks to O, the object is removed prematurely

Solution 1: Ensure that P2 talks to O on time:
Let P1 tell O it will pass a reference to P2
Let O contact P2 immediately
A reference may never be removed before O has acked that reference to the holder

[Figure (a): P1 sends a reference to P2; P1 deletes its reference to O (-1); O has been removed by the time P2 informs O that it has a reference (+1).]
[Figure (b): P1 tells O that it will pass a reference to P2 (+1); O acks that it knows about P2's reference (ACK); P1 sends the reference to P2; P1 deletes its reference to O (-1).]

Weighted Reference Counting
Solution 2: Avoid increment and decrement messages:
Let O allow a maximum M of references
Client P1 creates reference: grant it M/2 credit
Client P1 tells P2 about O: it passes half of its credit grant to P2
Pass current credit grant back to O upon reference deletion

[Figure (a): the skeleton of object O starts with total weight 128 and partial weight 128.]
[Figure (b): when process P creates a proxy, the partial weight at the skeleton is halved: the skeleton keeps partial weight 64 and the proxy gets partial weight 64.]
[Figure (c): when P1 passes the reference to P2, P2 gets half of the weight of the proxy at P1 (32 each); total and partial weight at the skeleton remain the same.]

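The credit scheme can be sketched as follows. The bookkeeping is an assumption: credit returned on deletion is added back to the skeleton's partial weight, and the object counts as unreferenced once the partial weight equals the total weight again. Class and method names are hypothetical.

```python
class Proxy:
    """A remote reference holding part of the object's weight (credit)."""

    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def duplicate(self):
        """Pass the reference on: give half of the credit to the new proxy."""
        if self.weight < 2:
            raise RuntimeError("credit exhausted, cannot split further")
        half = self.weight // 2
        self.weight -= half
        return Proxy(self.skeleton, half)  # no message to the object needed

    def delete(self):
        """Hand the current credit back to the object."""
        self.skeleton.return_credit(self.weight)
        self.weight = 0


class Skeleton:
    def __init__(self, total=128):
        self.total = total    # fixed total weight
        self.partial = total  # weight not handed out to proxies

    def create_reference(self):
        if self.partial < 2:
            raise RuntimeError("credit exhausted, cannot split further")
        half = self.partial // 2
        self.partial -= half
        return Proxy(self, half)

    def return_credit(self, weight):
        self.partial += weight

    def unreferenced(self):
        return self.partial == self.total


obj = Skeleton(128)
p1 = obj.create_reference()  # skeleton keeps 64, p1 gets 64
p2 = p1.duplicate()          # p1 keeps 32, p2 gets 32
print(obj.partial, p1.weight, p2.weight)
p1.delete()
p2.delete()
print(obj.unreferenced())
```

Only deletion sends a message to the object; creating a copy of a reference is purely local to the two clients, which is the point of the scheme. The limit on how often a weight can be halved is what bounds the number of references.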
Reference Listing
Observation: We can avoid many problems if we can tolerate message loss and duplication

Reference listing: Let an object keep a list of its clients:
Increment operation is replaced by an (idempotent) insert
Decrement operation is replaced by an (idempotent) remove

There are still some problems to be solved:
Passing references: client B has to be listed at O before last reference at O is removed (or keep a chain of references)
Client crashes: we need to remove outdated registrations (e.g., by combining reference listing with leases)

Leases
Observation: If we cannot be exact in the presence of communication failures, we will have to tolerate some mistakes

Essential issue: We must make sure that every object reference is eventually reclaimed

Solution: Hand out a lease on each new reference:
The object promises not to decrement the reference count for a specified time
Leases need to be refreshed (by object or client)

Observations:
Refreshing may fail in the face of message loss
Refreshing can tolerate message duplication
Does not solve problems related to cyclic references
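Reference listing combined with leases can be sketched as a table of clients with expiry times. This is an illustration under assumptions: the class ReferenceList, its methods, and the explicit now argument (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
class ReferenceList:
    """An object's list of clients, each entry protected by a lease."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}  # client -> time at which its lease runs out

    def insert(self, client, now):
        """Idempotent: repeating it only refreshes the client's lease."""
        self.expiry[client] = now + self.lease_seconds

    def remove(self, client):
        """Idempotent: removing an unknown client has no effect."""
        self.expiry.pop(client, None)

    def unreferenced(self, now):
        """Drop expired leases, then report whether any client is left."""
        for client in [c for c, t in self.expiry.items() if t <= now]:
            del self.expiry[client]
        return not self.expiry


refs = ReferenceList(lease_seconds=10)
refs.insert("A", now=0)
refs.insert("A", now=0)   # duplicated message: still one entry
refs.insert("A", now=8)   # refresh extends the lease to time 18
print(refs.unreferenced(now=15))
print(refs.unreferenced(now=18))  # client A stopped refreshing
```

A crashed client simply stops refreshing, so its entry disappears when the lease expires; the price is that a live client whose refresh messages are lost can also be dropped.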
