
Distributed Systems: Introduction

Working definition of distributed systems:


 A distributed system is a collection of independent computers that appears
to its users as a single coherent system.
 Heterogeneous computers and networks are presented as a single system
view.
 Hardware aspect: the computers are autonomous.
 Software aspect: users think they are dealing with a single system.
 Differences between the various computers, their internal organization and
the ways in which they communicate are hidden from the users.

 Users and applications can interact with a distributed system in a consistent
and uniform way, regardless of when and where the interaction takes place.
 A DS should be relatively easy to expand or scale.
 A distributed system will normally be continuously available, although
perhaps some parts are out of order, or being replaced or fixed.
 Distributed systems are often organized by means of a layer of software that
is logically placed between a higher-level layer consisting of users and
applications, and a layer underneath consisting of operating systems.
Accordingly, such a distributed system is sometimes called middleware.

Examples:
 A network of workstations. Some are assigned to users (static allocation) and
some are in a pool (dynamic allocation). There is a single file system. When a
user types a command, it may be executed on the user’s workstation, on another
idle workstation, or on an unassigned workstation. If the system as a whole
looks and acts like a single-processor time-sharing system, it qualifies as a DS.
 Workflow system.
 WWW: a simple, consistent, and uniform model of distributed documents.
In its current state, however, users are aware that documents are located at
different places and are handled by different servers.
Goals

A DS should
 easily connect users to resources
 hide the fact that resources are distributed across a network
 be open
 be scalable

 Connect users and resources
 Easy access to remote resources such as printers, computers, storage, data, and files. This
helps collaboration and the exchange of information, for example computer-supported
cooperative work (CSCW) and groupware.
 Transparency
 Hide the fact that its processes and resources are physically distributed across multiple
computers.
 A DS that is able to present itself to users and applications as if it were only a single computer
system is said to be transparent.

 Different forms of transparency


 Access transparency: hides differences in data representation and in the way resources are accessed by
users. Examples: machines may represent integers differently, and the computers in a DS may run different
operating systems, each with its own file naming conventions and file operations.
 Location transparency: hides where a resource is physically located. Naming plays an important role in
achieving location transparency, which can be achieved by assigning only logical names to resources, that is,
names that do not encode the location of the resource. Example: a URL.
 Migration transparency: Resources can be moved without affecting how that resource can be accessed.
 Relocation transparency: Resources can be relocated while they are being accessed without the user and the
application noticing it. Example: Mobile user can continue to use wireless laptop while moving from place to
place without ever being temporarily disconnected.
 Replication transparency: hides the fact that several copies of a resource exist to increase
availability and performance. To hide replication, all replicas must have the same name; consequently, a system
that supports replication transparency should generally support location transparency as well.
 Concurrency transparency: Hide that a resource may be shared by several competitive users.
 Failure transparency: Hide the failure and recovery of a resource. Masking failure is one of the hardest issues
in DS.
 Persistence transparency: hides whether a resource is in volatile memory or on disk.
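
The integer-representation example under access transparency can be sketched in Python. This is only an illustration: the helper names `encode_int32` and `decode_int32` and the choice of big-endian 32-bit integers as the common wire format are assumptions of the sketch, not something the slides prescribe. The idea is that both sides convert to and from one agreed format, so neither has to know the other's native byte order.

```python
import struct

# Assumption for this sketch: every machine agrees to exchange 32-bit signed
# integers in big-endian ("network") byte order.
WIRE_FORMAT = "!i"


def encode_int32(value: int) -> bytes:
    """Convert a local integer to the agreed wire representation."""
    return struct.pack(WIRE_FORMAT, value)


def decode_int32(payload: bytes) -> int:
    """Convert wire bytes back to a local integer, whatever the local byte order."""
    return struct.unpack(WIRE_FORMAT, payload)[0]


# The same number as laid out in memory by two different kinds of machine.
little = struct.pack("<i", 1)  # b'\x01\x00\x00\x00'
big = struct.pack(">i", 1)     # b'\x00\x00\x00\x01'

# Reading little-endian bytes as if they were big-endian gives a wrong value.
misread = struct.unpack(">i", little)[0]

# With the agreed wire format the value survives the round trip.
wire = encode_int32(1)
```

Middleware that performs this conversion on behalf of the application is what hides the difference from users.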

 Openness
 An open DS is a system that offers services according to standard rules that describe the syntax and semantics
of those services.
 In DS services are generally specified through interfaces. For example, in DS interfaces may be described in an
Interface Description Language (IDL).
 An interface precisely specifies the names of the functions, the types of the parameters, return values,
exceptions, etc. Example: the IDL of CORBA.
 The hard part is specifying what a service does, that is, the semantics of its interfaces. In practice, semantics
are specified informally in natural language. Example: WSDL.
 Interoperability characterizes the extent to which two implementations of systems or components from
different manufacturers can co-exist and work together by relying only on each other’s services as specified by
their interfaces. Example: CORBA/DCOM gateways.
 Portability characterizes the extent to which an application developed for a DS A can be executed, without
modification, on a different DS B that implements the same interfaces as A.
 An open DS should be flexible meaning that it should be easy to configure the system out of different
components possibly from different developers. It should be easy to add or replace components without
affecting the system.
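
The role of an interface in openness can be sketched as follows. Python has no IDL, so `typing.Protocol` stands in for one here; the names `FileService`, `VendorAStore`, `VendorBStore`, and `copy_file` are hypothetical and exist only for this illustration. The interface fixes names, parameter types, return types, and exceptions, while the semantics live only in the docstrings, which mirrors the point that semantics are usually given in natural language.

```python
from typing import Dict, List, Protocol, Tuple


class FileService(Protocol):
    """Hypothetical interface, standing in for an IDL description."""

    def read(self, name: str) -> bytes:
        """Return the contents of `name`; raise KeyError if it does not exist."""
        ...

    def write(self, name: str, data: bytes) -> None:
        """Store `data` under `name`, replacing any previous contents."""
        ...


class VendorAStore:
    """One implementation: keeps files in a dictionary."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        return self._files[name]

    def write(self, name: str, data: bytes) -> None:
        self._files[name] = data


class VendorBStore:
    """An independent implementation: keeps an append-only list of versions."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, bytes]] = []

    def read(self, name: str) -> bytes:
        # The most recent write for this name wins.
        for entry_name, data in reversed(self._log):
            if entry_name == name:
                return data
        raise KeyError(name)

    def write(self, name: str, data: bytes) -> None:
        self._log.append((name, data))


def copy_file(src: FileService, dst: FileService, name: str) -> None:
    """Application code written only against the interface."""
    dst.write(name, src.read(name))
```

Because `copy_file` depends only on the interface, it runs unchanged against either store (portability), and the two stores can exchange data through it (interoperability).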

 Scaling
 Scalability is one of the most important design goals for a DS. The scalability of a
system can be measured along at least three dimensions:
 A system can be scalable with respect to its size, meaning that we can easily add
more users and resources to the system.
 A geographically scalable system is one in which the users and resources may lie
far apart.
 A system can be administratively scalable meaning that it can still be easy to
manage even if it spans many independent administrative organizations.
 A system that is scalable in one or more of these dimensions often exhibits some
loss of performance as it scales up.

 Problems related to scalability


 If more users or resources need to be supported, we are often confronted with the
limitations of centralized services, data, and algorithms.
 Many services are centralized in the sense that they run on a specific server, for
reasons of security and manageability. Unfortunately, such servers become
bottlenecks as the number of users grows.
 Replicating the server at several locations may make the system more vulnerable
to security attacks.
 The same holds for centralized data. Imagine how the Internet would work with a
centralized name table.
 Centralized algorithms are also a bad idea. In fact, any algorithm that operates by
collecting information from all sites, sending it to a single machine for processing,
and then distributing the results should be avoided.

 Only decentralized or distributed algorithms should be used. Distributed
algorithms generally have the following characteristics:
 No machine has complete information about the system state
 Machines make decisions based only on the local information
 Failure of one machine does not ruin the algorithm
 There is no implicit assumption that a global clock exists.
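
As a rough illustration of the first two characteristics, the sketch below lets every node on a ring estimate the global average using only its own value and those of its two neighbours. The function names `gossip_round` and `run_gossip` are made up for this example. The lock-step rounds are a simulation convenience only: a real deployment could not assume such a global clock, and the sketch does not model machine failures.

```python
from typing import List


def gossip_round(values: List[float]) -> List[float]:
    """One simulated round on a ring of nodes.

    Node i only reads its own value and those of nodes i-1 and i+1;
    no node ever sees the whole list.
    """
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def run_gossip(values: List[float], rounds: int) -> List[float]:
    """Repeat the local averaging step; the sum of all values is preserved."""
    for _ in range(rounds):
        values = gossip_round(values)
    return values


# Five nodes with different local measurements (made-up numbers).
start = [10.0, 20.0, 30.0, 40.0, 50.0]
result = run_gossip(start, rounds=60)
```

After enough rounds every node holds a value close to the global mean, although no single machine ever collected all the measurements.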

 Geographical scalability has its own problems. A DS designed for a LAN is hard to
scale to a WAN. Communication in a LAN is typically synchronous (the client blocks
until a reply is sent back), and broadcasting is available.
 In a WAN, response times are generally high, and the network is unreliable and
generally point-to-point.

 Scaling a DS across multiple administrative domains is problematic. A major
problem is conflicting policies with respect to resource usage, management, and security.
 Many components of a DS residing within a single domain may be trusted by
the users belonging to the domain. System administration may have tested
and certified the applications.
 However, this trust does not naturally extend across domain boundaries.
 For example, a Java applet or other foreign code downloaded into a browser from a
different domain cannot be trusted, so the new domain naturally severely limits
the access rights of such code.

Scaling techniques
 Scalability problems in DS appear as performance problems.
 There are three techniques for scaling
 Hiding communication latencies: essentially, use asynchronous communication.
The client does not block waiting for a reply; when the reply arrives, an interrupt is
raised and a special handler is called to complete the request.
 Distribution: Take a component, split into smaller parts, and distribute those parts
across the system. Example DNS – name space is organized into a tree, divided into
non-overlapping zones, each zone is handled by a name server.
 Replication: Replicate components across a DS. Replication enhances availability,
performance and helps in load balancing. But replication leads to consistency
problems. Caching is a special type of replication.
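
The distribution technique can be sketched with a toy name space modelled loosely on the DNS example. This is not how real DNS servers are implemented: the `ZoneServer` class, the `lookup` helper, and the address `192.0.2.10` (taken from a range reserved for documentation) are all assumptions of the sketch. Each server knows only its own zone and which child zones it has delegated.

```python
from typing import Dict, List, Tuple


class ZoneServer:
    """Hypothetical name server responsible for exactly one zone."""

    def __init__(self, zone: str) -> None:
        self.zone = zone
        self.addresses: Dict[str, str] = {}            # names inside this zone
        self.children: Dict[str, "ZoneServer"] = {}    # delegated sub-zones

    def delegate(self, label: str) -> "ZoneServer":
        """Create a non-overlapping child zone handled by its own server."""
        zone = label if self.zone == "." else f"{label}.{self.zone}"
        child = ZoneServer(zone)
        self.children[label] = child
        return child

    def resolve(self, labels: List[str]) -> Tuple[str, List[str]]:
        """Resolve the remaining labels; the rightmost label is handled here.

        Returns the address and the list of zones that were consulted.
        """
        head = labels[-1]
        if len(labels) == 1 and head in self.addresses:
            return self.addresses[head], [self.zone]
        if len(labels) > 1 and head in self.children:
            address, visited = self.children[head].resolve(labels[:-1])
            return address, [self.zone] + visited
        raise KeyError(".".join(labels))


def lookup(root: ZoneServer, name: str) -> Tuple[str, List[str]]:
    """Start resolution at the root zone."""
    return root.resolve(name.split("."))


# Build a small tree: root -> org -> example.org, with one host record.
root = ZoneServer(".")
org = root.delegate("org")
example = org.delegate("example")
example.addresses["www"] = "192.0.2.10"

address, zones_visited = lookup(root, "www.example.org")
```

No server holds the whole name table; each lookup touches only the zones on the path from the root to the name, which is why the load can be spread over many machines.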
