
Logical time

[scalar time, vector time, and matrix time]

Dr. Bibhudatta Sahoo


Communication & Computing Group
Department of CSE, NIT Rourkela
Email: bdsahu@nitrkl.ac.in, 9937324437, 2462358
Moodle server : http://moodle.nitrkl.ac.in

Overview: Causality (or the causal precedence relation)

Causality (or the causal precedence relation) among events in a distributed system is a powerful concept for reasoning about, analyzing, and drawing inferences about a computation.
Traditionally, causality is tracked using physical time. In a distributed system, however, it is not possible to have global physical time; it is possible to realize only an approximation of it.
As asynchronous distributed computations make progress in spurts, it turns out that logical time, which advances in jumps, is sufficient to capture the fundamental monotonicity property associated with causality in distributed systems.

Types of Logical Clocks

Systems of logical clocks differ in their representation of logical time and also in the protocol to update the logical clocks.
Three kinds of logical clocks:
Scalar
Vector
Matrix

Logical clock

In a system of logical clocks, every process has a logical clock that is advanced using a set of rules.
Every event is assigned a timestamp, and the causality relation between events can generally be inferred from their timestamps.
The timestamps assigned to events obey the fundamental monotonicity property; that is, if an event a causally affects an event b, then the timestamp of a is smaller than the timestamp of b.

A framework for a system of logical clocks

A system of logical clocks consists of a time domain T and a logical clock C.
Elements of T form a partially ordered set over a relation <.
This relation is usually called the happened-before or causal precedence relation. Intuitively, it is analogous to the earlier-than relation provided by physical time.

A system of logical clocks

The logical clock C is a function that maps an event e in a distributed system to an element in the time domain T, denoted as C(e) and called the timestamp of e. It is defined so as to satisfy the consistency condition: for two events ei and ej, ei → ej ⟹ C(ei) < C(ej).

3.2.2 Implementing logical clocks

Implementing logical clocks requires addressing two issues:
1. Data structures local to every process to represent logical time.
2. A protocol (set of rules) to update the data structures to ensure the consistency condition.

Data structures local to every process
Each process pi maintains:
a local logical clock, lci, that helps pi measure its own progress; and
a logical global clock, gci, that represents pi's local view of the logical global time (typically, lci is a part of gci).

A protocol to update the data structures

The protocol ensures that a process's logical clock, and thus its view of the global time, is managed consistently.
The protocol consists of two rules:
R1: This rule governs how the local logical clock is updated by a process when it executes an event (send, receive, or internal).
R2: This rule governs how a process updates its global logical clock to update its view of the global time and global progress. It dictates what information about the logical time is piggybacked in a message and how this information is used by the receiving process to update its view of the global time.

Scalar time

The scalar time representation was proposed by Lamport in 1978 as an attempt to totally order events in a distributed system.
The time domain in this representation is the set of non-negative integers.
The logical local clock of a process pi and its local view of the global time are squashed into one integer variable Ci.
Rules R1 and R2 to update the clocks are as follows:

Rules R1 and R2 to update the clocks
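The rule figures from the original slides are not reproduced in this extract. Lamport's standard rules are: R1, before executing any event (send, receive, or internal), a process increments its clock, Ci := Ci + d (d > 0, typically d = 1); R2, when it receives a message with timestamp Cmsg, a process sets Ci := max(Ci, Cmsg), then applies R1 before delivering the message. A minimal Python sketch of these rules follows; the class and method names are illustrative, not from the slides.

```python
class ScalarClock:
    """Lamport scalar clock: one integer per process (illustrative sketch)."""

    def __init__(self, d=1):
        self.c = 0      # Ci: local clock and local view of global time, squashed
        self.d = d      # increment value (typically 1)

    def tick(self):
        """R1: executed before any event (send, receive, or internal)."""
        self.c += self.d
        return self.c

    def send(self):
        """Timestamp a send event; the returned value is piggybacked on the message."""
        return self.tick()

    def receive(self, c_msg):
        """R2: take the max with the piggybacked timestamp, then apply R1."""
        self.c = max(self.c, c_msg)
        return self.tick()


# Example: a process whose clock is 2 receives a message stamped 5.
p = ScalarClock()
p.c = 2
print(p.receive(5))   # 6, i.e. max(2, 5) + 1
```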

Basic properties: Consistency property

Scalar clocks satisfy the monotonicity and hence the consistency property:
for two events ei and ej, ei → ej ⟹ C(ei) < C(ej).

Basic properties: Total Ordering

The main problem in totally ordering events is that two or more events at different processes may have an identical timestamp. (Note that for two events e1 and e2, C(e1) = C(e2) ⟹ e1 ∥ e2, i.e., the events are concurrent.)

[Figure: the third event of process P1 and the second event of process P2 have identical scalar timestamps.]

Tie breaking
The third event of process P1 and the second event of process P2 have identical scalar timestamps. Thus, a tie-breaking mechanism is needed to order such events.
A tie is broken as follows:
Process identifiers are linearly ordered, and a tie among events with identical scalar timestamps is broken on the basis of their process identifiers.
The lower the process identifier in the ranking, the higher the priority.
The timestamp of an event is denoted by a tuple (t, i), where t is its time of occurrence and i is the identity of the process where it occurred.
The total order relation ≺ on two events x and y with timestamps (h, i) and (k, j), respectively, is defined as:
x ≺ y  ⟺  (h < k) or (h = k and i < j).
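A minimal sketch of this tie-breaking comparison in Python (the function name and tuple layout are illustrative assumptions):

```python
def totally_ordered_before(x, y):
    """Lamport total order on (scalar_timestamp, process_id) tuples.

    x precedes y iff x has a smaller timestamp, or the timestamps tie
    and x's process identifier is smaller (lower id = higher priority).
    """
    (h, i), (k, j) = x, y
    return h < k or (h == k and i < j)


# The third event of P1 and the second event of P2 both have timestamp 3:
print(totally_ordered_before((3, 1), (3, 2)))   # True: tie broken by process id
```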

Notes on scalar timestamps

Since events that occur at the same logical scalar time are independent (i.e., they are not causally related), they can be ordered using any arbitrary criterion without violating the causality relation →.
Therefore, a total order is consistent with the causality relation →.
Note that x ≺ y ⟹ x → y or x ∥ y.
A total order is generally used to ensure liveness properties in distributed algorithms: requests are timestamped and served according to the total order based on these timestamps.

Event counting
If the increment value d is always 1, scalar time has the following interesting property:
if event e has a timestamp h, then h - 1 represents the minimum logical duration, counted in units of events, required before producing the event e.
This is called the height of the event e. In other words, h - 1 events have been produced sequentially before the event e, regardless of the processes that produced these events.
For example, five events precede event b on the longest causal path ending at b.

No strong consistency
The system of scalar clocks is not strongly consistent; that is, for two events ei and ej, C(ei) < C(ej) does not imply ei → ej.
The third event of process P1 has a smaller scalar timestamp than the third event of process P2; however, the former did not happen before the latter.

No strong consistency
The reason that scalar clocks are not strongly consistent is that the logical local clock and the logical global clock of a process are squashed into one, resulting in the loss of causal dependency information among events at different processes.

No strong consistency: example

When process P2 receives the first message from process P1, it updates its clock to 3, forgetting that the timestamp of the latest event at P1 on which it depends is 2.

Problems with Total Ordering

A linearly ordered structure of time is not always adequate for distributed systems:
it captures the dependence of events, but
it loses the independence of events: it artificially enforces an ordering for events that need not be ordered.
Mapping partially ordered events onto a linearly ordered set of integers loses information: events that may happen simultaneously may get different timestamps, as if they happened in some definite order.
A partially ordered system of vectors forming a lattice structure is a natural representation of time in a distributed system.

Vector clocks
The system of vector clocks was developed independently by Fidge, Mattern, and Schmuck.

Vector Times
The system of vector clocks was developed independently by Fidge, Mattern, and Schmuck.
In the system of vector clocks, the time domain is represented by a set of n-dimensional non-negative integer vectors.
Each process has a clock Ci consisting of a vector vt[1..n] of length n, where n is the total number of processes; vt[j] is the local logical clock of Pj and describes the logical time progress at process Pj.

Vector Clocks: How They Work

Each process keeps a vector of values, instead of a single value.
vti is the clock at process pi; it has a component for each process in the system.
vti[i] corresponds to pi's local time.
vti[j] represents pi's knowledge of the time at pj (the number of events that pi knows have occurred at pj).
Each process knows its own time exactly, and updates the values of the other processes' clocks based on the timestamps received in messages.

Vector clocks
The time domain is represented by a set of n-dimensional non-negative integer vectors.
Each process pi maintains a vector vti[1..n], where vti[i] is the local logical clock of pi and describes the logical time progress at process pi.
vti[j] represents process pi's latest knowledge of process pj's local time.
If vti[j] = x, then process pi knows that the local time at process pj has progressed up to x.
The entire vector vti[1..n] constitutes pi's view of the global logical time and is used to timestamp events.

Two rules R1 and R2 to update its clock:

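The rule figure from the original slide is not reproduced in this extract. The standard vector clock rules are: R1, before executing an event, pi increments its own entry, vti[i] := vti[i] + d (d > 0, typically 1); R2, on receiving a message piggybacked with vector vt, pi sets vti[k] := max(vti[k], vt[k]) for every k, then applies R1 and delivers the message. A minimal Python sketch follows; the class and method names are illustrative.

```python
class VectorClock:
    """Vector clock for process `pid` in a system of n processes (sketch)."""

    def __init__(self, pid, n, d=1):
        self.pid = pid
        self.d = d
        self.vt = [0] * n          # vt[j]: latest known local time of process j

    def tick(self):
        """R1: increment own component before executing any event."""
        self.vt[self.pid] += self.d
        return list(self.vt)

    def send(self):
        """Timestamp a send event; the returned vector is piggybacked on the message."""
        return self.tick()

    def receive(self, vt_msg):
        """R2: component-wise max with the piggybacked vector, then R1."""
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        return self.tick()


# Example with three processes: p0 sends a message to p1.
p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
m = p0.send()            # p0's clock becomes [1, 0, 0]
print(p1.receive(m))     # p1's clock becomes [1, 1, 0]
```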

Evolution of vector time


Evolution of vector time

The timestamp associated with an event is the value of the vector clock of its process when the event is executed.
The figure on the previous slide (not reproduced in this extract) shows an example of vector clock progress with the increment value d = 1.
Initially, a vector clock is [0, 0, ..., 0].

Example: Vector Clocks

[Figure: events occurring at three processes, with vector timestamps such as (1,0,0), (2,0,0), (2,1,0), (2,2,0), (0,0,1), (2,2,2), (2,2,3), and (3,2,3).]

Basic properties: Isomorphism

Recall that the relation → induces a partial order on the set of events that are produced by a distributed execution.
If events in a distributed system are timestamped using a system of vector clocks, we have the following property.
If two events x and y have timestamps vh and vk, respectively, then:
x → y ⟺ vh < vk, and
x ∥ y ⟺ neither vh < vk nor vk < vh,
where vh < vk means vh[i] ≤ vk[i] for every i, with strict inequality for at least one i.

Basic properties: Isomorphism

There is an isomorphism between the set of partially ordered events produced by a distributed computation and their vector timestamps.
This is a very powerful, useful, and interesting property of vector clocks.
If the process at which an event occurred is known, the test to compare two timestamps can be simplified as follows: if events x and y occurred at processes pi and pj and are assigned timestamps vh and vk, respectively, then:
x → y ⟺ vh[i] ≤ vk[i], and
x ∥ y ⟺ vh[i] > vk[i] and vk[j] > vh[j].
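A sketch of both comparison tests in Python (function names are illustrative): the general component-wise test and the simplified test when the originating processes are known.

```python
def happened_before(vh, vk):
    """General test: vh < vk component-wise, with strict inequality somewhere."""
    return all(a <= b for a, b in zip(vh, vk)) and vh != vk

def concurrent(vh, vk):
    """Two events are concurrent iff neither vector dominates the other."""
    return not happened_before(vh, vk) and not happened_before(vk, vh)

def happened_before_at(vh, i, vk):
    """Simplified test: event x at process i (timestamp vh) causally precedes
    event y (timestamp vk) iff vh[i] <= vk[i]."""
    return vh[i] <= vk[i]


# Using timestamps from the earlier example (process indices start at 0):
print(happened_before([2, 0, 0], [2, 2, 0]))        # True
print(concurrent([2, 2, 0], [0, 0, 1]))             # True: causally unrelated
print(happened_before_at([2, 0, 0], 0, [2, 2, 0]))  # True via the simplified test
```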

Basic properties: Strong consistency

The system of vector clocks is strongly consistent; thus, by examining the vector timestamps of two events, we can determine whether the events are causally related.
However, Charron-Bost showed that the dimension of vector clocks cannot be less than n, the total number of processes in the distributed computation, for this property to hold.

Basic properties: Event counting

If d is always 1 in rule R1, then the i-th component of the vector clock at process pi, vti[i], denotes the number of events that have occurred at pi up to that instant.
So, if an event e has timestamp vh, then vh[j] denotes the number of events executed by process pj that causally precede e. Clearly, the sum of all components of vh minus 1 represents the total number of events that causally precede e in the distributed computation.
Basic properties: Applications

Since vector time tracks causal dependencies exactly, it finds a wide variety of applications.
For example, vector clocks are used in distributed debugging, implementations of causal ordering communication and causal distributed shared memory, establishment of global breakpoints, and in determining the consistency of checkpoints in optimistic recovery.

Singhal-Kshemkalyani's differential technique

If the number of processes in a distributed computation is large, vector clocks require the piggybacking of a huge amount of information in messages (the message overhead grows linearly with the number of processes).
Singhal-Kshemkalyani's differential technique enables efficient vector clocks.
It is based on the observation that, between successive message sends to the same process, only a few entries of the vector clock at the sender process are likely to change.
When a process pi sends a message to a process pj, it piggybacks only those entries of its vector clock that have changed since the last message sent to pj.
This cuts down the message size, communication bandwidth, and buffer (to store messages) requirements.

Singhal-Kshemkalyani's differential technique

Singhal-Kshemkalyani's differential technique is based on the observation that, between successive message sends to the same process, only a few entries of the vector clock at the sender process are likely to change.
This is more likely when the number of processes is large, because only a few of them will interact frequently by passing messages.
In this technique, when a process pi sends a message to a process pj, it piggybacks only those entries of its vector clock that have changed since the last message sent to pj.

Working of Singhal-Kshemkalyani's differential technique

[Figure: an example run of the differential technique (not reproduced in this extract).]
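A minimal Python sketch of the idea, under simplifying assumptions (reliable FIFO channels between each pair of processes, which the technique relies on): the sender remembers, per destination, the vector it last sent there and piggybacks only the entries that have changed since then; the receiver merges those entries with a component-wise max. Class and method names are illustrative.

```python
class DiffVectorClock:
    """Vector clock with Singhal-Kshemkalyani differential piggybacking (sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vt = [0] * n
        self.last_sent = {}            # dest pid -> copy of vt at last send to dest

    def send(self, dest):
        """R1, then piggyback only the entries changed since the last send to dest."""
        self.vt[self.pid] += 1
        prev = self.last_sent.get(dest, [0] * len(self.vt))
        diff = {k: v for k, v in enumerate(self.vt) if v != prev[k]}
        self.last_sent[dest] = list(self.vt)
        return diff                    # e.g. {0: 3, 2: 5}

    def receive(self, diff):
        """Merge the piggybacked entries (component-wise max), then R1."""
        for k, v in diff.items():
            self.vt[k] = max(self.vt[k], v)
        self.vt[self.pid] += 1


# Example: p0 sends twice to p1; each message carries only the changed entries.
p0, p1 = DiffVectorClock(0, 4), DiffVectorClock(1, 4)
print(p0.send(1))     # {0: 1} - only p0's own entry has changed
p1.receive({0: 1})
print(p0.send(1))     # {0: 2} - again only the changed entry is piggybacked
```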

Application: Causally-Ordered Multicasting

Used for ordered delivery of related messages.
Vi[i] is only incremented when sending.
When Pk gets a message from Pj with timestamp ts, the message is buffered until both of the following hold:
1. ts[j] = Vk[j] + 1 (the timestamp indicates that this is the next message that Pk is expecting from Pj).
2. ts[i] ≤ Vk[i] for all i ≠ j (Pk has seen all messages that were seen by Pj when Pj sent the message).
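A minimal sketch of this delivery check in Python (the function name is illustrative; processes are 0-indexed, so P1 corresponds to index 0):

```python
def can_deliver(ts, sender, V):
    """True iff a multicast from `sender` with vector timestamp `ts` can be
    causally delivered at a process whose current vector clock is `V`."""
    next_expected = ts[sender] == V[sender] + 1
    seen_all_deps = all(ts[i] <= V[i] for i in range(len(V)) if i != sender)
    return next_expected and seen_all_deps


# P2 has V = [0, 0, 0]. P1's post a has ts = [1, 0, 0]; P3's reply r has ts = [1, 0, 1].
print(can_deliver([1, 0, 1], 2, [0, 0, 0]))  # False: the reply must be buffered
print(can_deliver([1, 0, 0], 0, [0, 0, 0]))  # True: message a can be delivered first
```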

Causally-Ordered Multicasting

[Figure: P1, P2, and P3 all start with vector clock [0,0,0]. P1 posts message a with timestamp [1,0,0]; P3 replies with message r, timestamped [1,0,1].]
Two messages:
message a from P1;
message r from P3, replying to message a.
Scenario 1: at P2, message a arrives before the reply r.

Causally-Ordered Multicasting (cont.)

[Figure: at P2, the reply r ([1,0,1]) arrives before message a ([1,0,0]); r is buffered and delivered only after a has arrived.]
Scenario 2: at P2, message a arrives after the reply r.
The reply is not delivered right away.

Ordered Communication
Totally ordered multicast: use Lamport timestamps.
Causally ordered multicast: use vector timestamps.

Summary: vector clocks

In vector clocks, the clock at a process is represented by a vector of integers.
Thus, the message and the computation overheads are likely to be high.
Vector clocks possess a powerful property: there is an isomorphism between the set of partially ordered events in a distributed computation and their vector timestamps.

Using vector clocks

Vector clocks and matrix clocks are widely used in asynchronous distributed message-passing systems.
Some example areas using vector clocks are checkpointing, causal memory, maintaining consistency of replicated files, global snapshots, global time approximation, termination detection, bounded multiwriter construction of shared variables, mutual exclusion, and debugging (predicate detection).

What is the total ordering of the events in these two processes?

[Figure 5.3, Singhal and Shivaratri: two processes, P1 with events e11 to e17 and P2 with events e21 to e25.]

Matrix time


Matrix Time
Vector time contains information about the latest direct dependencies: what does Pi know about Pk?
Matrix time also contains information about the latest direct dependencies of those dependencies: what does Pi know about what Pk knows about Pj?
Its message and computation overheads are high.
It is powerful and useful for applications such as distributed garbage collection.

Matrix Time
A system of matrix clocks was first informally proposed by Michael and Fischer [5] and has been used by Wuu and Bernstein [28] and by Sarin and Lynch [22] to discard obsolete information in replicated databases.

Using matrix clocks

Some example areas that use matrix clocks are designing fault-tolerant protocols and distributed database protocols, including protocols to discard obsolete information in distributed databases, and protocols to solve the replicated log and replicated dictionary problems.

A system of matrix clocks

In a system of matrix clocks, time is represented by a set of n × n matrices of non-negative integers.
A process pi maintains a matrix mti[1..n, 1..n] where:
mti[i, i] denotes the local logical clock of pi and tracks the progress of the computation at process pi;
mti[i, j] denotes the latest knowledge that process pi has about the local logical clock of process pj (the row mti[i, ·] is nothing but the vector clock vti and exhibits all the properties of vector clocks);
mti[j, k] represents the knowledge that process pi has about the latest knowledge that pj has about the local logical clock of pk.

Evolution of matrix time


Rules to be used by a process to update its clock

The entire matrix mti denotes Pi's local view of the global logical time.
The matrix timestamp of an event is the value of the matrix clock of the process when the event is executed.

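The rule figure from the original slides is not reproduced in this extract. The standard matrix clock rules are: R1, before executing an event, pi increments mti[i, i]; R2, on receiving a message piggybacked with matrix mt from pj, pi first sets mti[i, k] := max(mti[i, k], mt[j, k]) for all k (updating its own vector row), then sets mti[k, l] := max(mti[k, l], mt[k, l]) for all k, l, and finally applies R1 and delivers the message. A minimal Python sketch, with illustrative names:

```python
class MatrixClock:
    """Matrix clock for process `pid` among n processes (illustrative sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.mt = [[0] * n for _ in range(n)]   # mt[j][k]: what pid knows pj knows of pk

    def tick(self):
        """R1: increment own local clock mt[i][i] before any event."""
        self.mt[self.pid][self.pid] += 1

    def send(self):
        """Timestamp a send event and piggyback a copy of the whole matrix."""
        self.tick()
        return [row[:] for row in self.mt]

    def receive(self, mt_msg, sender):
        """R2: absorb the sender's view, then R1."""
        n, i = len(self.mt), self.pid
        for k in range(n):                       # update own vector row from sender's row
            self.mt[i][k] = max(self.mt[i][k], mt_msg[sender][k])
        for k in range(n):                       # absorb the sender's whole matrix
            for l in range(n):
                self.mt[k][l] = max(self.mt[k][l], mt_msg[k][l])
        self.tick()


# Example with three processes: p0 sends a message to p1.
p0, p1 = MatrixClock(0, 3), MatrixClock(1, 3)
m = p0.send()            # p0's matrix now has mt[0][0] = 1
p1.receive(m, sender=0)
print(p1.mt[1])          # p1's own vector row: [1, 1, 0]
print(p1.mt[0])          # p1's view of what p0 knows: [1, 0, 0]
```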

Matrix clocks progress in a distributed computation

We assume d = 1.
Let us consider the following events: e, which is the xi-th event at process pi; e1k and e2k, which are the x1k-th and x2k-th events at process pk; and e1j and e2j, which are the x1j-th and x2j-th events at pj.

[Figure: matrix clocks progress in a distributed computation (not reproduced in this extract).]

Matrix clocks progress in a distributed computation

The last event of pk known by pj, to the knowledge of pi when it executed event e, is e1k; therefore, mte[j, k] = x1k.
Likewise, we have mte[k, j] = x1j.

Summary: matrix clocks

In matrix clocks, the clock at a process is represented by a matrix of integers.
The message and the computation overheads are high; however, matrix clocks are very powerful: besides containing information about the direct dependencies, a matrix clock contains information about the latest direct dependencies of those dependencies.
This information can be very useful in applications such as distributed garbage collection.
The power of systems of clocks increases in the order of scalar, vector, and matrix, but so do the complexity and the overheads.

Virtual time and its implementation

The virtual time system is a paradigm for organizing and synchronizing distributed systems using virtual time.

Introduction
In Lamport's scheme, clocks are advanced so that timestamps satisfy the clock condition. In virtual time, the reverse is done: every event is assumed to be labeled with a clock value from a totally ordered virtual time scale satisfying Lamport's clock conditions, and the system must then execute events accordingly. Thus, the time warp mechanism is an inverse of Lamport's scheme.

Virtual time definition

Virtual time is a global, one-dimensional, temporal coordinate system on a distributed computation, used to measure computational progress and to define synchronization.
A virtual time system is a distributed system executing in coordination with an imaginary virtual clock that uses virtual time.
Virtual times are real values that are totally ordered by the less-than relation, <.

Local virtual clocks

The local virtual clocks move forward to higher virtual times; however, occasionally they move backwards.
In a distributed system, processes run concurrently and communicate with each other by exchanging messages.
Every message is characterized by four values:
(i) name of the sender;
(ii) virtual send time;
(iii) name of the receiver;
(iv) virtual receive time.
Virtual send time is the virtual time at the sender when the message is sent, whereas virtual receive time specifies the virtual time when the message must be received (and processed) by the receiver.

Virtual time definition

A problem arises when a message arrives at a process late, that is, when the virtual receive time of the message is less than the local virtual time at the receiver process when the message arrives.
Virtual time systems are subject to two semantic rules similar to Lamport's clock conditions:
Rule 1: virtual send time of each message < virtual receive time of that message.
Rule 2: virtual time of each event in a process < virtual time of the next event in that process.
The above two rules imply that a process sends all messages in increasing order of virtual send time and a process receives (and processes) all messages in increasing order of virtual receive time.
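A small sketch of these message fields and rule checks (the dataclass and function names are illustrative assumptions, not part of the slides):

```python
from dataclasses import dataclass

@dataclass
class VirtualMessage:
    sender: str
    virtual_send_time: float     # virtual time at the sender when sent
    receiver: str
    virtual_receive_time: float  # virtual time at which it must be processed

def satisfies_rule1(m: VirtualMessage) -> bool:
    """Rule 1: virtual send time < virtual receive time."""
    return m.virtual_send_time < m.virtual_receive_time

def is_late(m: VirtualMessage, local_virtual_time: float) -> bool:
    """A message is late if its virtual receive time is already in the receiver's past."""
    return m.virtual_receive_time < local_virtual_time


m = VirtualMessage("P1", 10.0, "P2", 15.0)
print(satisfies_rule1(m))      # True
print(is_late(m, 20.0))        # True: the receiver has already advanced past 15.0
```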

Virtual time definition

Causality of events is an important concept in distributed systems and is also a major constraint in the implementation of virtual time.
It is important that an event that causes another be completely executed before the caused event can be processed.
The constraint in the implementation of virtual time can be stated as follows: if an event A causes event B, then the execution of A and B must be scheduled in real time so that A is completed before B starts.

If event A has an earlier virtual time than event B, we need not execute A before B provided there is no causal chain from A to B.
Better performance can be achieved by scheduling A concurrently with B, or scheduling A after B.
If A and B have exactly the same virtual time coordinate, then there is no restriction on the order of their scheduling.
If A and B are distinct events, they will have different virtual space coordinates (since they occur at different processes) and neither will be a cause of the other.
To sum it up, events with virtual time < t complete before the start of events at time t, and events with virtual time > t start only after events at time t are complete.

5 Characteristics of virtual time

1. Virtual time systems are not all isomorphic; they may be either discrete or continuous.
2. Virtual time may be only partially ordered (in this implementation, total order is assumed).
3. Virtual time may be related to real time or may be independent of it.

Characteristics of virtual time

4. Virtual time systems may be visible to programmers and manipulated explicitly as values, or hidden and manipulated implicitly according to some system-defined discipline.
5. Virtual times associated with events may be explicitly calculated by user programs, or they may be assigned by fixed rules.
Comparison with Lamport's logical clocks

Lamport showed that the real-time temporal relationships "happens before" and "happens after", operationally definable within a distributed system, form only a partial order, not a total order, and concurrent events are incomparable under that partial order.

Thanks for Your Attention!


3.11 Exercises: Page 84

