

l  Physical clocks
l  Logical clocks
Events, process states and clocks
A distributed system – a collection P of N single-
threaded processes without shared memory
–  Each process pi has a state si
–  Each executes a series of actions – send, receive, transform
–  Event – The execution of a single action
–  All events in a process can be placed in a total ordering →i
–  e →i e’ iff e is an event that occurs before e’ in pi
–  History of a process pi: history(pi) = hi = <ei0, ei1, ei2, ...>

How do we order the history of multiple processes?

–  … but first

Events, process states and clocks
Within a single process, we can order its events, but
can we timestamp them?

Computers have their own hardware-based clock, Ci,
which can be used to assign timestamps to events
Clock is based on counting the oscillations of a crystal
at a given frequency – the count is stored in some register, say Hi
The OS reads this value, scales it and adds an offset
to compute a software clock (Hi(t) is the value of Hi at real time t)
Ci(t) = α Hi(t) + β

Events, process states and clocks
Clocks tend to drift and do so at different rates
Skew – Instantaneous difference between the
readings of two clocks
Clock drift – The clocks’ underlying oscillators are
subject to physical variations that make them drift
from each other
Drift rate – Change in the offset between a clock and a
nominal perfect reference clock per unit of time
–  For common quartz-crystal based clocks, ~10^-6 sec/sec or 1
second every 11.6 days
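The 11.6-day figure follows directly from the drift rate. A minimal sketch of the arithmetic, where the helper name `seconds_to_accumulate` is made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def seconds_to_accumulate(offset_s: float, drift_rate: float) -> float:
    """Real time (in seconds) a clock with this drift rate needs to be off by offset_s."""
    return offset_s / drift_rate

# A quartz clock drifting at about 10^-6 sec/sec is off by 1 second after 10^6 seconds.
days = seconds_to_accumulate(1.0, 1e-6) / SECONDS_PER_DAY
print(f"{days:.1f} days")  # 11.6 days
```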

[Figure: clock time (C) plotted against real time (t)]

Physical clocks
From solar to atomic time
–  Before, time was measured astronomically (solar time)
–  Since 1967, atomic clocks based on # of transitions/sec of
cesium 133 (Cs133) atom
•  Drift rate of one part in 10^13

Coordinated Universal Time (UTC)

–  Currently, real time is avg of ~50 cesium-clocks
–  Broadcast through short wave radio (WWV in the US) &
satellite (GPS)
We want to distribute this to a bunch of machines
–  Each runs its own timer, keeping a clock Cp(t) (t being UTC)
–  Given a maximum drift rate r (1− r ≤ dC/dt ≤ 1+ r)
–  To never let two clocks differ by more than d ⇒ synchronize
at least every d/(2r) seconds
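Why d/(2r)? In the worst case one clock runs fast at rate 1 + r and the other slow at rate 1 − r, so they drift apart at 2r seconds per second. A small sketch of that bound; the function names and the sample values (d = 1 ms, r = 10^-6) are assumptions chosen for illustration:

```python
def worst_case_skew(elapsed_s: float, r: float) -> float:
    """Largest possible difference between two clocks after elapsed_s seconds,
    if one runs at rate 1 + r and the other at rate 1 - r."""
    return 2 * r * elapsed_s

def max_resync_interval(d: float, r: float) -> float:
    """Longest time between synchronizations that keeps the skew within d."""
    return d / (2 * r)

interval = max_resync_interval(d=1e-3, r=1e-6)
print(f"resynchronize at least every {interval:.0f} s")  # 500 s
```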

Clock synchronization
To order distributed events, keep clocks synchronized
Two synchronization modes
–  Internal – Clocks must agree within a bound d
–  External – Clocks must be accurate with respect to a UTC source
within a bound d
Internally synchronized ⇏ externally synchronized
–  But if the system is externally synchronized with bound d, then
it is internally synchronized with bound 2d

Correctness of clocks
A clock is correct if its drift rate falls below a known bound p
–  (1 – p)(t’ – t) ≤ H(t’) – H(t) ≤ (1+p)(t’ – t)
Or, more weakly, if it satisfies monotonicity – t’ > t => C(t’) > C(t)
–  We can still adjust clocks by changing α and β in Ci(t) = αHi(t) + β
Or a hybrid condition
–  Monotonicity
–  Drift rate is bounded in between synchronization points
(but the clock can jump ahead at those points)
A clock’s crash failure – clock stops ticking altogether
–  Any other failure is an arbitrary failure (e.g., Y2K bug, from
1999 to 1900!)

Clock synchronization – External
Cristian’s approach (1989)
A time server that gets a signal from a UTC source
Others ask server for accurate time at least once every
d/(2r) seconds (d is the bound)
While the system is asynchronous, round-trip times (rtts) are typically short
–  Must estimate rtt, including interrupt handling, msg processing
–  Cristian describes the algorithm as probabilistic
–  Single server, single point of failure
•  Cristian’s suggestion: a group of synchronized servers
–  Faulty server, not Cristian’s problem
•  If f is the number of faulty clocks out of N, you need N >3f clocks
for the others to achieve agreement
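A minimal sketch of the client side of Cristian's approach, assuming the reply takes about half the measured rtt to arrive. `request_server_time` is a hypothetical callable standing in for the real network request, and the fake timer values below are made up so the demo is deterministic:

```python
import time
from typing import Callable, Tuple

def cristian_estimate(
    request_server_time: Callable[[], float],
    now: Callable[[], float] = time.monotonic,
) -> Tuple[float, float]:
    """Return (estimated current server time, measured rtt).

    request_server_time is a placeholder for the actual request to the
    time server; now is the local timer used to measure the round trip.
    """
    t0 = now()                           # just before sending the request
    server_time = request_server_time()  # timestamp carried in the reply
    t1 = now()                           # just after receiving the reply
    rtt = t1 - t0
    # Assume the reply spent about rtt/2 in transit.
    return server_time + rtt / 2.0, rtt

# Demo: the fake local timer advances 0.2 s during the exchange.
fake_ticks = iter([100.0, 100.2])
estimate, rtt = cristian_estimate(lambda: 5000.0, now=lambda: next(fake_ticks))
print(f"estimate={estimate:.1f} s, rtt={rtt:.1f} s")  # estimate=5000.1 s, rtt=0.2 s
```

The shorter the measured rtt, the tighter the estimate, so a client can repeat the request and keep the reading with the smallest rtt.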

Clock synchronization – Internal
Berkeley’s algorithm by Gusella and Zatti (1989)
A coordinator computer, master, periodically polls
other machines

Master calculates a fault-tolerant
avg after adjusting for transfer time
–  Average is computed among clocks
that don’t differ from the others by
more than some given amount
Tells all how to adjust their clocks (+/-)
[Figure: machines reading 2:50 and 3:25 are told to adjust by +15 and -20]

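A sketch of the master's computation, under two assumptions that are not in the slides: readings are already corrected for transfer time, and "doesn't differ from the others" is approximated as "within `max_deviation` of the median". The master reading of 3:00 is inferred from the 2:50 / 3:25 readings and the +15 / -20 adjustments in the figure; times are in minutes since midnight.

```python
from statistics import median
from typing import Dict

def berkeley_adjustments(readings: Dict[str, float], max_deviation: float) -> Dict[str, float]:
    """Return the adjustment (in minutes) each machine should apply to its clock.

    Outlier rule (an assumption): only clocks within max_deviation of the
    median take part in the average, but every clock gets an adjustment.
    """
    center = median(readings.values())
    trusted = [t for t in readings.values() if abs(t - center) <= max_deviation]
    if not trusted:
        raise ValueError("no clock is close enough to the median")
    average = sum(trusted) / len(trusted)
    return {name: average - t for name, t in readings.items()}

# 3:00 (master), 2:50 and 3:25 expressed as minutes since midnight.
readings = {"master": 180, "p1": 170, "p2": 205}
adjustments = berkeley_adjustments(readings, max_deviation=30)
print(adjustments)  # {'master': 5.0, 'p1': 15.0, 'p2': -20.0}
```

Sending adjustments rather than absolute times means the transfer delay of the reply does not add further error.
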
Clock synchronization – External
For Internet synchronization - Network Time Protocol
(NTP, Mills 1995)
Primary servers directly connected to time sources
Secondary servers synchronized with other servers
Servers synchronize with others in one of three modes
–  Multicast – For fast LANs
–  Procedure-call – Similar to Cristian’s, for higher accuracy or when there’s
no multicast
–  Symmetric – Pair of machines exchange msgs with
timing info
[Figure: synchronization subnet with primary servers at stratum 1 and secondary servers at strata 2 and 3]

Logical clocks
We typically assume clock synchronization is related
to real time, but this is not necessarily so
We have seen (Berkeley algorithm) that clocks can agree
on a current time without this having to be the real time
–  In many situations all that matters is that two nodes agree on
the order of events
–  If two nodes do not share events, i.e. they don’t interact, they
don’t have to be in synch ⇒ Logical clocks

Happened-before relationship
[Figure: events a, b, c, d, e, f on three process timelines, with messages m1 and m2; b→c and a→c]

The happened-before (or [potential] causal
precedence) relation on the set of events in a
distributed system:
–  HB1: If a and b are two events in the same process, and a
comes before b, then a→b
–  HB2: If a is the sending of a message, and b is the event of
receiving that message, then a→b
–  HB3: If a→b and b→c, then a→c
Happened-before relationship – notes
This introduces a partial ordering of events in a system
with concurrently operating processes
–  If x and y happen in two processes that do not exchange
messages, then neither x→y nor y→x
–  x and y are concurrent
What happens with communication through other
channels? e.g., phone
If x→y, does it mean x causes y?

Lamport clock
How to maintain a global view of the system’s behavior
that is consistent with the happened-before relation?
Attach a timestamp C(e) to each event e, satisfying the
following properties:
–  P1: If a and b are two events in the same process, and a→b,
then C(a) < C(b)
–  P2: If a corresponds to sending a message m, and b to the
receipt of that message, then also C(a) < C(b)
a→b ⇒ C(a) < C(b) : clock consistency condition

How to attach a timestamp to an event when there’s
no global clock ⇒ maintain a consistent set of logical
clocks, one per process

Lamport clock
Each process pi maintains a local counter Ci and
adjusts this counter according to the following rules:
1.  For any two successive events that take place within pi, Ci is
incremented by d (let’s say d = 1)
2.  When pi sends a message m, it includes a timestamp
ts(m) = Ci
3.  Whenever pj receives m, pj adjusts its local counter Cj to
max(Cj, ts(m)); then executes step 1 before passing m to the
application
Property 1 is satisfied by (1)
Property 2 by (2) and (3)
Note: To impose total ordering (instead of partial),
attach process ID
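The three rules can be sketched as a small class. The names `LamportClock`, `tick`, `send` and `receive` are chosen here for illustration, and the demo assumes the usual three-process scenario in which m1 goes from p1 to p2 and m2 from p2 to p3:

```python
class LamportClock:
    """Sketch of a per-process logical clock with increment d = 1."""

    def __init__(self, pid: int) -> None:
        self.pid = pid
        self.time = 0

    def tick(self) -> int:
        """Rule 1: advance the counter for each event in this process."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Rule 2: sending is an event; the returned value is ts(m)."""
        return self.tick()

    def receive(self, ts: int) -> int:
        """Rule 3: take max(Cj, ts(m)), then apply rule 1 for the receive event."""
        self.time = max(self.time, ts)
        return self.tick()

p1, p2, p3 = LamportClock(1), LamportClock(2), LamportClock(3)
a = p1.tick()        # local event at p1
b = p1.send()        # p1 sends m1
c = p2.receive(b)    # p2 receives m1
d = p2.send()        # p2 sends m2
e = p3.tick()        # local event at p3
f = p3.receive(d)    # p3 receives m2
print(a, b, c, d, e, f)  # 1 2 3 4 1 5

# Total order: break ties between equal timestamps with the process ID.
print(sorted([(a, p1.pid), (e, p3.pid)]))  # [(1, 1), (1, 3)]
```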

Lamport timestamps – an example

[Figure: events along physical time with Lamport timestamps: a = 1, b = 2; c = 3, d = 4; e = 1, f = 5]

From Lamport to vector clocks
With Lamport’s clocks – if x→y, then C(x) < C(y), but
if C(x) < C(y), we can’t infer that x causally preceded y
–  Why? Local and global logical clocks are all squashed into one,
losing all causal dependency info among events at different
processes

[Figure: processes p1, p2 and p3 exchanging messages m1, m2 and m3, with Lamport timestamps on the events]
Is the sending of m2 (C(b) = 20)
causally related to the receiving of m1
(C(c) = 16)?

Vector clocks
Vector clock for a system with N processes
–  An array of N integers
–  Processes piggyback vector timestamps on each message
Rules for updating clocks
–  Just before pi sends a message m,
1.  It adds 1 to Vi[i], and
2.  Sends Vi along with m as vector timestamp vt(m)
–  When pj receives a message m from pi with
vector timestamp vt(m), it
1.  updates each Vj[k] to max{Vj[k], vt(m)[k]} for k = 1 … N
2.  increments Vj[j] by 1
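A sketch of these update rules. The class and method names are made up for illustration, and `tick` for purely local events is an assumption added so that every event gets its own timestamp; the message pattern in the demo (m1 from p1 to p2, m2 from p2 to p3) is likewise assumed:

```python
from typing import List, Tuple

class VectorClock:
    """Sketch of a vector clock for process `index` in a system of n processes."""

    def __init__(self, index: int, n: int) -> None:
        self.index = index            # 0-based position of this process
        self.v: List[int] = [0] * n

    def tick(self) -> Tuple[int, ...]:
        """Count one more event at this process and return its timestamp."""
        self.v[self.index] += 1
        return tuple(self.v)

    def send(self) -> Tuple[int, ...]:
        """Add 1 to Vi[i]; the returned vector is piggybacked as vt(m)."""
        return self.tick()

    def receive(self, vt: Tuple[int, ...]) -> Tuple[int, ...]:
        """Component-wise max with vt(m), then increment the own entry."""
        self.v = [max(own, other) for own, other in zip(self.v, vt)]
        return self.tick()

p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a = p1.tick()        # local event at p1
b = p1.send()        # p1 sends m1
c = p2.receive(b)    # p2 receives m1
d = p2.send()        # p2 sends m2
e = p3.tick()        # local event at p3
f = p3.receive(d)    # p3 receives m2
print(a, b, c, d, e, f)
# (1, 0, 0) (2, 0, 0) (2, 1, 0) (2, 2, 0) (0, 0, 1) (2, 2, 2)
```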

Vector clocks – an example

[Figure: events along physical time with vector timestamps: a = (1,0,0), b = (2,0,0); p2: c = (2,1,0), d = (2,2,0); p3: e = (0,0,1), f = (2,2,2)]

Vector clocks
For process pi with vector Vi[1..N],
–  Vi[i] is the number of events that have taken place at process pi
–  Vi[j] is the number of events that pi knows have taken place at
process pj (i.e., that have potentially affected pi)
Comparing vector timestamps
–  V = V’ iff V[j] = V’[j] for j = 1 .. N
–  V ≤ V’ iff V[j] ≤ V’[j] for j = 1 .. N
–  V < V’ iff V ≤ V’ and V ≠ V’
–  If neither V ≤ V’ nor V’ ≤ V (i.e., sometimes V[j] > V’[j] and
sometimes smaller) – then V || V’
If events x and y occurred at pi and pj with
vectors V and V’
–  x→y ⇔ V < V’ (for distinct events it is enough to check V[i] ≤ V’[i])
–  If neither x→y nor y→x, then x || y
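The comparison rules translate almost directly into code. A minimal sketch with illustrative function names, checked against timestamps from the vector clock example, a = (1,0,0), c = (2,1,0), e = (0,0,1), f = (2,2,2):

```python
from typing import Sequence

def leq(v: Sequence[int], w: Sequence[int]) -> bool:
    """V <= V' iff every component of V is <= the matching component of V'."""
    return all(x <= y for x, y in zip(v, w))

def happened_before(v: Sequence[int], w: Sequence[int]) -> bool:
    """V < V' iff V <= V' and the two vectors differ."""
    return leq(v, w) and tuple(v) != tuple(w)

def concurrent(v: Sequence[int], w: Sequence[int]) -> bool:
    """V || V' iff neither vector is <= the other."""
    return not leq(v, w) and not leq(w, v)

a, c, e, f = (1, 0, 0), (2, 1, 0), (0, 0, 1), (2, 2, 2)
print(happened_before(a, f))  # True
print(concurrent(c, e))       # True
print(happened_before(f, a))  # False
```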

Vector clocks – an example

[Figure: events along physical time with vector timestamps: a = (1,0,0), b = (2,0,0); p2: c = (2,1,0), d = (2,2,0); p3: e = (0,0,1), f = (2,2,2)]
a → f, since Va < Vf
c || e, since neither Vc ≤ Ve nor Ve ≤ Vc

Synchronization is about doing the right thing at the
right time …
What’s the right time?
–  An issue when you don’t share clocks