
Wachira Kinyua Harold BICT-001-10/04873 DIS

a) With examples, describe Access, Location and Migration transparency in a distributed system.

Access transparency enables local and remote information objects to be accessed using identical operations; that is, the interface to a service request is the same whether the communicating components are on the same host or on different hosts. Example: file system operations in the Unix Network File System (NFS). A component whose access is not transparent cannot easily be moved from one host to another, because all other components that request its services would first have to be changed to use a different interface.

Location transparency enables information objects to be accessed without knowledge of their physical location. Example: pages on the Web. Example: when an NFS administrator moves a partition, for instance because a disk is full, application programs accessing files in that partition would have to be changed if file location were not transparent to them.

Migration transparency allows the movement of information objects within a system without affecting the operations of users or application programs. It is useful because it sometimes becomes necessary to move a component from one host to another (e.g., due to an overload of the host or a replacement of the host hardware). Without migration transparency a distributed system becomes very inflexible: components are tied to particular machines, and moving them requires changes in other components.

b) With the aid of a diagram, define middleware. (2 Marks)

Middleware is software that mediates between an application program and a network. It manages the interaction between disparate applications across heterogeneous computing platforms. The Object Request Broker (ORB), software that manages communication between objects, is an example of a middleware program. Middleware is computer software that connects software components or applications, and is used most often to support complex, distributed applications.
It includes web servers, application servers, content management systems, and similar tools that support application development and delivery. Middleware is especially integral to modern information technology based on XML, SOAP, Web services, and service-oriented architecture. In short, middleware is a piece of software that connects two or more software applications so that they can exchange data.

Middleware is an intermediary. This type of software is used in complex environments, and it does not necessarily represent one-to-one interactions between software agents. Middleware is essential to Enterprise Application Integration (EAI).
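As an illustration of this mediating role (a sketch only, not any real middleware product's API — all names below are invented), a minimal in-process broker lets two components exchange data without ever referencing each other directly:

```python
# Minimal "middleware" sketch: a broker that routes messages between
# components that never hold references to one another.

class Broker:
    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # A component registers interest in a topic.
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # The broker mediates: publishers know topics, not receivers.
        for callback in self._subscribers.get(topic, []):
            callback(message)

received = []
broker = Broker()
broker.subscribe("orders", received.append)  # application B registers
broker.publish("orders", {"id": 1})          # application A sends, unaware of B
```

The point of the sketch is that neither "application" imports or names the other; the broker is the only shared dependency, which is the essence of the EAI role described above.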

c) What is redundancy and why is it used in a distributed system? (2 Marks)

Redundancy is the duplication of critical components or functions of a system with the intention of increasing the reliability of the system, usually in the form of a backup or fail-safe.

Forms of redundancy: the two forms of redundancy are passive redundancy and active redundancy. Both prevent performance decline from exceeding specification limits, without human intervention, by using extra capacity.

Passive redundancy uses excess capacity to reduce the impact of component failures. One common form of passive redundancy is the extra strength of cabling and struts used in bridges; this extra strength allows some structural components to fail without the bridge collapsing. The extra strength used in the design is called the margin of safety. Some performance decline is commonly associated with passive redundancy when a limited number of failures occur.

Active redundancy eliminates performance decline by monitoring the performance of individual devices, and this monitoring is used in voting logic. The voting logic is linked to switching that automatically reconfigures components. Error detection and correction and the Global Positioning System (GPS) are two examples of active redundancy.

d) What are client and server stubs and how are they used in remote procedure calls? (2 Marks)

A stub in distributed computing is a piece of code used for converting parameters passed during a Remote Procedure Call (RPC). The main idea of an RPC is to allow a local computer (the client) to remotely call procedures on a remote computer (the server). The client and server use different address spaces, so parameters used in a function call have to be converted; otherwise the values of those parameters could not be used, because pointers into one computer's memory would point to different data on the other machine.
The client and server may also use different data representations, even for simple parameters (e.g., big-endian versus little-endian for integers). Stubs perform the conversion of the parameters, so that a remote function call looks like a local function call to the caller. Stub libraries must be installed on both the client and the server side.

A client stub is responsible for the conversion of parameters used in a function call and the deconversion of results passed back from the server after execution of the function. A server skeleton, the stub on the server side, is responsible for the deconversion of parameters passed by the client and the conversion of the results after the execution of the function.

Stubs can be generated in one of two ways:
1. Manually: the RPC implementer provides a set of translation functions from which a user can construct his or her own stubs. This method is simple to implement and can handle very complex parameter types.
2. Automatically: this is the more commonly used method. It uses an interface definition language (IDL) to define the interface between client and server. For example, an interface definition indicates whether each argument is input, output or both; only input arguments need to be copied from client to server, and only output arguments need to be copied from server to client.

A remote procedure call (RPC) is an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details of this remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program or remote. When the software in question uses object-oriented principles, RPC is called remote invocation or remote method invocation.
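The marshalling work a generated stub performs can be sketched with Python's standard struct module. The wire format and function names below are invented for illustration, and the network hop is elided so the "client stub" calls the "server skeleton" directly:

```python
import struct

# Sketch of what a generated stub pair might do for an add(a, b) call:
# marshal parameters into bytes in an agreed-on byte order, then
# unmarshal the reply. "!" means network (big-endian) byte order,
# which sidesteps the big-endian/little-endian mismatch noted above.

def marshal_add_request(a, b):
    # Two 32-bit signed integers in network byte order.
    return struct.pack("!ii", a, b)

def unmarshal_add_request(message):
    return struct.unpack("!ii", message)

def server_skeleton(message):
    # Server-side stub: deconvert arguments, call the real procedure,
    # convert the result back into the wire format.
    a, b = unmarshal_add_request(message)
    return struct.pack("!i", a + b)

def client_stub_add(a, b):
    # Looks like a local call to the caller; the conversion is hidden.
    reply = server_skeleton(marshal_add_request(a, b))  # network hop elided
    (result,) = struct.unpack("!i", reply)
    return result
```

A call such as `client_stub_add(2, 3)` returns 5, with both machines agreeing on the byte order regardless of their native representations.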
Message passing. An RPC is initiated by the client, which sends a request message to a known remote server to execute a specified procedure with supplied parameters. The remote server sends a response to the client, and the application continues its process. There are many variations and subtleties in various implementations, resulting in a variety of different (incompatible) RPC protocols. While the server is processing the call, the client is blocked (it waits until the server has finished processing before resuming execution), unless the client sends an asynchronous request to the server, such as an XMLHttpRequest call.

An important difference between remote procedure calls and local calls is that remote calls can fail because of unpredictable network problems. Also, callers generally must deal with such failures without knowing whether the remote procedure was actually invoked. Idempotent procedures (those that have no additional effects if called more than once) are easily handled, but enough difficulties remain that code to call remote procedures is often confined to carefully written low-level subsystems.

Sequence of events during an RPC:
1. The client calls the client stub. The call is a local procedure call, with parameters pushed onto the stack in the normal way.
2. The client stub packs the parameters into a message and makes a system call to send the message. Packing the parameters is called marshalling.
3. The kernel sends the message from the client machine to the server machine.
4. The kernel on the server machine passes the incoming packets to the server stub.
5. Finally, the server stub calls the server procedure. The reply traces the same steps in the reverse direction.

e) Highlight any TWO problems associated with passing data values between different machines with different operating systems, and suggest relevant solutions. (2 Marks)

f) What is atomic multicast and why is it needed in replication? (2 Marks)

Atomic multicast is a form of multicasting in which a message sent by a source computer is either received by all of its recipients or by none at all. This form of multicasting is often used in fault-tolerant systems which use data replication.

g) Discuss any THREE architectural models of distributed systems. (6 Marks)

Client-server architecture. The system is structured as a set of processes, called servers, that offer services to the users, called clients.
The client-server model is usually based on a simple request/reply protocol, implemented with send/receive primitives or using remote procedure calls (RPC) or remote method invocation (RMI): the client sends a request (invocation) message to the server asking for some service; the server does the work and returns a result (e.g. the data requested) or an error code if the work could not be performed.

Peer-to-peer architecture. All processes (objects) play similar roles, interacting without any particular distinction between clients and servers. The pattern of communication depends on the particular application. A large number of data objects are shared; any individual computer holds only a small part of the application database. Processing and communication loads for access to objects are distributed across many computers and access links. This is the most general and flexible model.

Three-tier architecture. Three-tier is a client-server architecture in which the user interface, functional process logic ("business rules"), computer data storage and data access are developed and maintained as independent modules, most often on separate platforms. It was developed by John J. Donovan at Open Environment Corporation (OEC), a tools company he founded in Cambridge, Massachusetts. The three-tier model is a software architecture and a software design pattern. Apart from the usual advantages of modular software with well-defined interfaces, the three-tier architecture is intended to allow any of the three tiers to be upgraded or replaced independently in response to changes in requirements or technology. For example, a change of operating system in the presentation tier would affect only the user interface code. Typically, the user interface runs on a desktop PC or workstation and uses a standard graphical user interface; the functional process logic may consist of one or more separate modules running on a workstation or application server; and an RDBMS on a database server or mainframe contains the computer data storage logic.
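The request/reply exchange underlying the client-server model above can be sketched with plain TCP sockets (no RPC/RMI layer); the uppercasing "service" and the use of a loopback address are invented for illustration:

```python
import socket
import threading

# Minimal request/reply sketch of the client-server model: the server
# waits for a request, does the "work" (here, uppercasing the bytes),
# and returns the result to the client.

def server(listener):
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(request.upper())  # the "service" the server offers

def request_reply(message):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
    listener.listen(1)
    t = threading.Thread(target=server, args=(listener,))
    t.start()
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(message)        # request (invocation) message
        reply = client.recv(1024)      # result returned by the server
    t.join()
    listener.close()
    return reply
```

Calling `request_reply(b"hello")` returns `b"HELLO"`: the client blocks on `recv` until the server has done the work, exactly the synchronous request/reply behaviour described above.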

h) Discuss the differences between static and dynamic invocation. (2 Marks)

When using static invocation, the client application invokes operations directly on the client stubs. Static invocation is the easiest, most common type of invocation; the stubs are generated by the IDL compiler. Static invocation is recommended for applications that know at compile time the particulars of the operations they need to invoke and can work within the synchronous nature of the invocation. When using dynamic invocation, the client application can dynamically build operation requests for an object interface that has been stored in the Interface Repository. The server applications do not require any special design to be able to receive and handle dynamic invocation requests. Dynamic invocation is generally used when the client application requires deferred synchronous communication, or by dynamic client applications when the nature of the interaction is undefined.

i) Distributed systems are usually considered inherently insecure. Discuss this phrase in the context of networked systems. (2 Marks)

j) Highlight any FOUR security threats that distributed systems can be exposed to, and any THREE principal methods used in attacking distributed systems. (7 Marks)

Man-in-the-middle attack. A malicious peer can intercept the messages from a benevolent service-provider peer to the requestor and rewrite them with bad services, thereby making the reputation of the benevolent peer decrease. That participant could even maliciously modify the recommendations given by an honest peer in order to benefit his or her own interests. Again, this is a threat which has not traditionally been associated with trust and reputation systems; most authors assume the authenticity of the peer providing either a service or a recommendation. Nevertheless, as explained before, this attack can cause great damage to a system where it is possible.
A simple way of avoiding this risk is the use of cryptographic schemes to authenticate each user in the system (perhaps with a digital signature or a similar mechanism). Unfortunately, however, it is not always feasible to apply such a solution, above all in highly distributed environments like wireless sensor networks.

Malicious collectives. Malicious peers always provide bad services when selected as service providers, and form a malicious collective by assigning the maximum trust value to other malicious peers in the network. Not many trust and reputation models treat the problem arising from collusion among malicious peers, and thus they have an important security deficiency. The first thing needed to overcome this threat is to manage not only the goodness of every user when supplying services, but also their reliability when giving recommendations about other peers. Thus, a user who provides unfair ratings will also be discarded as a service provider.

Malicious collectives with camouflage. Malicious peers provide bad services in p% of all cases when selected as service providers, and form a malicious collective by assigning the maximum trust value to other malicious peers in the network. This is, in many cases, a threat which is not easy to tackle, since its resilience mostly depends on the behavioral pattern followed by the malicious peers; battling an oscillating pattern (fully benevolent for a period of time, fully fraudulent for the next, and so on) is a different matter. The first issue to address is to distinguish the confidence placed in a peer as a recommender from the trust placed in the same peer as a service provider. This mechanism can be very helpful when trying to avoid unfair ratings from malicious entities. Additionally, the variable behavior of a peer, when detected, can be punished and avoided.

QUESTION TWO a) Discuss the following in relation to distributed systems. (6 Marks)

i) Blocking vs non-blocking primitives

Blocking primitives are sometimes called synchronous primitives. When a process calls send, it specifies a destination and a buffer to send to that destination. While the message is being sent, the sending process is blocked (i.e., suspended); the instruction following the call to send is not executed until the message has been completely sent. An alternative to blocking primitives are non-blocking primitives (sometimes called asynchronous primitives). If send is non-blocking, it returns control to the caller immediately, before the message is sent. The advantage of this scheme is that the sending process can continue computing in parallel with the message transmission, instead of having the CPU go idle (assuming no other process is runnable). The choice between blocking and non-blocking primitives is normally made by the system designers (i.e., either one primitive is available or the other), although in a few systems both are available and users can choose their favorite.

ii) Buffered vs non-buffered primitives

iii) Reliable vs unreliable primitives

Reliable: messages are always delivered unless the recipient does not exist; on failure, the sender is notified. In order to provide reliability, the message-passing system ensures that messages have not been altered or corrupted between transmission and reception. Out-of-order message blocks are reordered and duplicate message blocks are removed. If an error occurs, the system will attempt to retransmit the message automatically. Reliable messaging, however, usually comes with noticeable overhead.

Unreliable: messages may or may not be delivered to the recipient. If a message is delivered, its contents may be corrupted, out of order, or duplicated. Also, the sender may not receive any acknowledgement of a successful delivery. This may sound entirely useless, but unreliable messaging is typically simpler and faster than reliable messaging.
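The relationship between the two kinds of primitive can be illustrated by layering a reliable send on top of a simulated lossy channel; the retransmit-until-acknowledged loop below is a sketch with invented names, and the "channel" is just a list standing in for the receiver's mailbox:

```python
import random

# Sketch: building a reliable send out of an unreliable one by
# retransmitting until an acknowledgement arrives, with a bounded
# number of attempts (the overhead mentioned above).

def unreliable_deliver(message, mailbox, loss_rate, rng):
    # Unreliable primitive: the message may simply be dropped.
    if rng.random() >= loss_rate:
        mailbox.append(message)
        return True   # stands in for an acknowledgement
    return False

def reliable_send(message, mailbox, loss_rate=0.5, max_attempts=50, rng=None):
    rng = rng or random.Random(0)  # seeded for a repeatable illustration
    for attempt in range(1, max_attempts + 1):
        if unreliable_deliver(message, mailbox, loss_rate, rng):
            return attempt          # how many tries were needed
    raise TimeoutError("no acknowledgement after %d attempts" % max_attempts)
```

Each retransmission is the "overhead" of reliability: with a 50% loss rate the sender averages two attempts per message, whereas the unreliable primitive costs one attempt but may silently lose the message.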
b) Distributed systems differ from centralized computer systems in three essential respects; discuss. (6 Marks)

Three basic distributions. To better illustrate this point, examine three system architectures: centralized, decentralized, and distributed. In this examination, consider three structural aspects: organization, connection, and control. Organization describes a system's physical arrangement characteristics; connection covers the communication pathways among nodes; and control manages the operation of the earlier two considerations.

Organization. A centralized system has one level of structure, where all constituent elements directly depend upon a single control element. A decentralized system is hierarchical: the bottom level unites subsets of a system's entities, and these entity subsets in turn combine at higher levels, ultimately culminating at a central master element. A distributed system is a collection of autonomous elements with no concept of levels.

Connection. Centralized systems connect constituents directly to a central master entity in a hub-and-spoke fashion. A decentralized system (aka network system) incorporates direct and indirect paths between constituent elements and the central entity; typically this is configured as a hierarchy with only one shortest path between any two elements. Finally, the distributed system requires no particular pattern; direct and indirect connections are possible between any two elements.

Control. Centralized and decentralized systems have directed flows of connection to and from the central entity, while distributed systems communicate along arbitrary paths; this is the pivotal notion of the third consideration. Control involves allocating tasks and data to system elements while balancing efficiency, responsiveness and complexity. Centralized and decentralized systems offer more control, potentially easing administration by limiting options. Distributed systems are more difficult to control explicitly, but scale better horizontally and offer fewer single points of system failure. The associations conform to the needs imposed by the design rather than by organizational limitations.

c) Describe any FOUR goals of distributed file systems. (8 Marks)

Access transparency: clients are unaware that files are distributed and can access them in the same way as local files are accessed.

Location transparency: a consistent name space exists, encompassing local as well as remote files. The name of a file does not reveal its location.

Concurrency transparency: all clients have the same view of the state of the file system. This means that if one process is modifying a file, any other processes on the same system or on remote systems that are accessing the file will see the modifications in a coherent manner.

Failure transparency: the client and client programs should operate correctly after a server failure.

Heterogeneity: file service should be provided across different hardware and operating system platforms.

Scalability: the file system should work well in small environments (one machine, a dozen machines) and also scale gracefully to huge ones (hundreds through tens of thousands of systems).

Replication transparency: to support scalability, we may wish to replicate files across multiple servers; clients should be unaware of this.

Migration transparency: files should be able to move around without the client's knowledge.
QUESTION THREE a) Threads are considered one of the fundamental processing strategies of distributed systems. Suggest and briefly discuss how threading can be used in message queues to implement asynchronous requests. (5 Marks)

The queued-asynchronous approach uses a queue to decouple the flow's receiver from the other steps in the flow. This means that once the receiver places a message into a queue, it can immediately return and accept a new incoming message. Furthermore, each message waiting in the queue can be assigned a different thread from the pool of threads. This is managed by a work manager, which manages multiple threads and dynamically chooses a thread to process the message taken from the queue. In this way, all assigned threads can execute simultaneously. Such parallel processing is ideal for situations where the receiver can, at peak times, accept messages significantly faster than the rest of the flow can process them.

Under the queued-asynchronous processing strategy, the receiver does not have to wait before accepting the next message, and the processing speed for the rest of the steps in the flow is effectively multiplied, because multiple messages are being processed at the same time. However, the increased throughput facilitated by the asynchronous approach comes at the cost of transactional reliability. Also, the queued-asynchronous approach, which uses two threads to process each message, is not suitable for request-response exchange patterns, which need to be performed entirely on a single thread.
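The queued-asynchronous approach just described can be sketched with Python's standard queue and threading modules; the work manager is reduced here to a fixed pool of worker threads, and the "slow processing step" (doubling a number) is invented for illustration:

```python
import queue
import threading

# Sketch of the queued-asynchronous strategy: the receiver only
# enqueues messages and returns immediately; a pool of worker threads
# drains the queue and processes messages in parallel.

work_queue = queue.Queue()
results = []
results_lock = threading.Lock()

def receive(message):
    # Receiver: enqueue and return at once, ready for the next message.
    work_queue.put(message)

def worker():
    while True:
        message = work_queue.get()
        if message is None:            # shutdown sentinel
            work_queue.task_done()
            return
        with results_lock:
            results.append(message * 2)  # the "slow" processing step
        work_queue.task_done()

pool = [threading.Thread(target=worker) for _ in range(4)]
for t in pool:
    t.start()
for n in range(10):
    receive(n)                         # receiver never blocks on processing
for _ in pool:
    work_queue.put(None)               # one sentinel per worker
for t in pool:
    t.join()
```

Note that the workers may finish messages out of order, which is exactly why this style suits one-way flows better than request-response exchanges that must stay on a single thread.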

b) Write short notes on the following trends in distributed systems.

i) Multimedia. (3 Marks)

Multimedia is media and content that uses a combination of different content forms. The term can be used as a noun (a medium with multiple content forms) or as an adjective describing a medium as having multiple content forms. The term is used in contrast to media which use only rudimentary computer display such as text-only, or traditional forms of printed or hand-produced material. Multimedia includes a combination of text, audio, still images, animation, video, or interactive content forms. Multimedia is usually recorded and played, displayed or accessed by information content processing devices, such as computerized and electronic devices, but can also be part of a live performance. Multimedia (as an adjective) also describes electronic media devices used to store and experience multimedia content. Multimedia is distinguished from mixed media in fine art; by including audio, for example, it has a broader scope. The term "rich media" is synonymous with interactive multimedia. Hypermedia can be considered one particular multimedia application.
ii) Mobile computing. (3 Marks)

Mobile computing is taking a computer, and all necessary files and software, out into the field: being able to use a computing device even while mobile and therefore changing location. Portability is one aspect of mobile computing. Mobile computing is the ability to use computing capability without a pre-defined location and/or a connection to a network to publish and/or subscribe to information. Mobile computing relies on a variety of wireless devices that allow people to connect to the Internet, providing wireless transmission to access data and information from wherever they may be.

iii) Intranet. (3 Marks)

An intranet is a computer network that uses Internet Protocol technology to securely share any part of an organization's information or network operating system within that organization; it is the connection of computer networks in a local area. The term is used in contrast to extranet, a network between organizations, and instead refers to a network within an organization. Sometimes the term refers only to the organization's internal website, but it may be a more extensive part of the organization's information technology infrastructure. It may host multiple private websites and constitute an important component and focal point of internal communication and collaboration. Any of the well-known Internet protocols may be found in an intranet, such as HTTP (web services), SMTP (e-mail), and FTP (file transfer). Internet technologies are often deployed to provide modern interfaces to legacy information systems hosting corporate data.

iv) Extranet. (3 Marks)

An extranet is a computer network that allows controlled access from the outside for specific business or educational purposes.
In a business-to-business context, an extranet can be viewed as an extension of an organization's intranet that is extended to users outside the organization, usually partners, vendors, and suppliers, in isolation from all other Internet users. In contrast, business-to-consumer (B2C) models involve known servers of one or more companies communicating with previously unknown consumer users. An extranet is similar to a DMZ in that it provides access to needed services for channel partners without granting access to an organization's entire network.

v) E-commerce. (3 Marks)

E-commerce refers to the buying and selling of products or services over electronic systems such as the Internet and other computer networks. Electronic commerce draws on such technologies as electronic funds transfer, supply chain management, Internet marketing, online transaction processing, electronic data interchange (EDI), inventory management systems, and automated data collection systems. Modern electronic commerce typically uses the World Wide Web at least at one point in the transaction's lifecycle, although it may encompass a wider range of technologies such as e-mail, mobile devices and telephones as well.

QUESTION FOUR a) Group communication allows multicasting of a message to a group of processes as a single action; this supports replication and efficient dissemination of data. To achieve this, there should be guarantees. You are required to discuss the reliability guarantee and the ordering guarantee. (10 Marks)

Reliability guarantee

Many distributed systems support reliable message delivery, either point-to-point (one consumer and one provider) or group-based (many consumers and one provider). Typically the semantics imposed on reliability are that the message will be delivered, or the sender will be able to know with certainty that it did not get to the receiver, even in the presence of failures. Systems employing reliable messaging implementations frequently distinguish between a message being delivered to the recipient and its being processed by the recipient: simply getting the message to a service does not mean much if a subsequent crash of the service occurs before it has had time to work on the contents of the message.

Within such systems, a transport that gives these failure semantics on message delivery and processing is JMS: with transacted sessions (an optional part of the specification), it is possible to guarantee that messages are received and processed in the presence of failures. If a failure occurs during processing by the service, the message will be placed back onto the JMS queue for later re-processing. However, this has some important performance implications: transacted sessions can be significantly slower than non-transacted sessions, so they should be used with caution.

Because other transports do not come with transactional or reliable delivery guarantees, it is possible for messages to be lost. However, in most situations the likelihood of this occurring is small. Unless there is a simultaneous failure of both sender and receiver (possible but not probable), the sender will be informed about any failure to deliver the message. If a failure of the receiver occurs whilst processing and a response was expected, the caller will eventually time out and can retry.

b) Discuss any THREE styles used in middleware. (6 Marks)

Broker architecture of a middleware and basic remoting patterns. The pattern BROKER, first described in POSA1 [BMR+96], is, from our perspective, a compound pattern that is typically implemented using a number of patterns from the Remoting pattern language. The BROKER pattern addresses the problem that developers of distributed software systems face many challenges that do not arise in single-process software. One main challenge is unreliable communication across networks; other challenges are the integration of heterogeneous components into coherent applications, as well as the efficient usage of networking resources. If developers of distributed systems must master all these challenges within their application code, they might lose their primary focus: developing distributed applications that solve the application problems well.

Identification patterns. Clients of distributed applications need to find the correct remote object within the server application, so means for identification, addressing, and lookup of remote objects are required. The identification of remote objects in a server application is usually done by assigning logical OBJECT IDs to remote objects. These OBJECT IDs are embedded in remote invocations so that the INVOKER can find the correct remote object. However, OBJECT IDs only identify the remote object in the context of one particular server application; in two different server applications, two different objects with the same OBJECT ID might exist. For a remote invocation we additionally need to deliver the message to the correct server application. An ABSOLUTE OBJECT REFERENCE extends the concept of OBJECT IDs with location information, such as the hostname, the port, and the OBJECT ID of a remote object.

Lifecycle management patterns. Different remote objects require different lifecycles. Some remote objects need to exist from server application startup to termination; others need to be available only for a limited period of time. In addition to differences in lifecycles, a number of additional tasks might be coupled with the activation and deactivation of remote objects. An important aspect is that the activation and deactivation of remote objects have a strong influence on the overall resource consumption of the distributed application.

Extension patterns. When developing distributed applications, developers often need to deal with extension concerns in the context of remote invocations at various layers of the distributed object middleware. Such extension concerns are, for instance, support for security, support for transactions, or the exchange of communication protocols. To handle such extension concerns, remote invocations need to contain more information than just the operation name and its parameters: for instance, for transaction support a transaction ID needs to be transported between client and server. For that purpose INVOCATION CONTEXTS are used: they are added to the remote invocation on the client side and read out on the server side.

c) System failure is associated with systems not being able to achieve their goals. Describe FOUR types of failure in distributed systems.
(4 Marks)

Split Brain
This is a classic problem in distributed systems. It arises when machines providing certain kinds of fault-tolerant services lose communication with each other but decide to continue operating independently. When different parts of the system diverge, each group of machines may decide that the other groups have failed, even though those groups are actually still running and servicing requests. When this happens, both groups of machines may start to operate authoritatively for the service they provide, causing contradictory changes to occur.

Inconsistent Failure Detection
A fundamental aspect of distributed applications is that failures in different components of the system must be accurately detected so that countermeasures or reconfigurations can be made to adjust to changes in the system. Problems arise when the various components of the system detect different sets of failures, or detect the failures in different orders.

Site Failures
A site failure occurs when all machines at a given site stop operating. A common cause of site failures is the loss of power to all machines. Note that even with uninterruptible power supplies (UPS), a power loss can last longer than the UPS can keep the computers running. It is not uncommon for distributed systems to be designed on the assumption that site failures do not occur. This is often done because a site failure is classified as a "double fault" (more than one machine has failed), and many distributed systems are built to handle only single component failures. However, site failures do occur, and distributed systems must be able to address them.

Delayed Messages
This problem arises in scenarios where the delivery of messages is not synchronized with the detection of failures in the system. Because messages are sent and received in particular configurations, problems occur when messages sent from one configuration are received by a machine that is in another configuration. Often, major components of applications are implemented under the assumption that the configuration of the system will not change. Serious problems arise when it does change and messages happen to cross configuration boundaries.
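Inconsistent failure detection is easy to demonstrate with a toy heartbeat check (function and parameter names are assumed for illustration): two observers with different local timeouts classify the same silent node differently.

```python
# Toy illustration of inconsistent failure detection: each observer
# suspects a node has failed once that node's last heartbeat is older
# than the observer's *local* timeout, so observers can disagree.

def suspects_failure(last_heartbeat_age: float, timeout: float) -> bool:
    """Return True if this observer would suspect the node has failed."""
    return last_heartbeat_age > timeout

age = 3.0  # seconds since node X's last heartbeat was received

print(suspects_failure(age, timeout=2.0))  # observer A suspects X: True
print(suspects_failure(age, timeout=5.0))  # observer B still trusts X: False
```

This is why practical systems need an agreement step (e.g. a membership protocol) on top of raw timeouts, so all components act on the same set of detected failures.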

QUESTION FIVE
a) Describe any FOUR properties of distributed algorithms used in synchronization. (4 Marks)

b) What is the difference between vertical distribution and horizontal distribution? (2 Marks)
Vertical distribution: Distributed processing equivalent to organizing a client-server application as a multitiered architecture: place logically different components on different machines.
Horizontal distribution: Distribution of the clients and servers themselves, more common in modern architectures. A client or server may be physically split into logically equivalent parts, with each part operating on its own share of the complete data set, thus balancing the load.

c) Discuss the THREE main steps of fault tolerance. (6 Marks)

Multi-Agent System Overview
A multi-agent system (MAS) is a "loosely coupled network of software agents that interact to solve problems that are beyond the individual capacities or knowledge of each problem solver".
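Returning to (b), the horizontal-distribution idea of logically equivalent parts each owning a share of the data can be sketched as simple key-based sharding (the hash function and names are illustrative, not from a real framework):

```python
# Minimal sketch of horizontal distribution: every replica runs the
# same code, but each serves only its own share of the key space,
# balancing the load across replicas.

def shard_for(key: str, n_shards: int) -> int:
    """Pick the shard responsible for a key (toy hash for illustration)."""
    return sum(ord(c) for c in key) % n_shards

records = ["alice", "bob", "carol", "dave"]
shards = {i: [] for i in range(3)}
for r in records:
    shards[shard_for(r, 3)].append(r)

# Each shard now holds its own subset of the complete data set.
print(shards)
```

Vertical distribution, by contrast, would place logically different tiers (user interface, application logic, database) on different machines rather than splitting one tier's data.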

Self-healing and self-organization study

When talking about fault-tolerant systems in a multi-agent environment, we are talking about a reliable system capable of treating and recovering from faults. Such a system is defined in the current literature as a self-managed system. The spread of capabilities and actions that a system can encounter and manage has led to a classification of self-* systems according to the features each class exhibits. So we can have: self-healing, self-organization, self-optimizing, self-protection, etc. For the moment we focus on self-healing at the agent level and the self-organization features of the whole architecture of a system.

Self-Healing
Self-healing is a modern approach to the problem of detecting failure, diagnosing it, and recovering from it in order to make the system more reliable. Our analysis is focused on system repair and recovery. When we think of a self-healing mechanism, what we first need to differentiate are the states of the system. Normally, as shown in the paper, there are three main states in which a system can be: a normal state, a degraded state, and a broken state. It is usually tricky to differentiate these states; the line between them is sometimes gray. For this purpose it is very important to define, for every system individually, as many characteristics as needed for the proposed classification mechanism, making it clear for each situation encountered which state the system is in. There are many different approaches to how a self-healing system should look, but all of them revolve around the same main mechanisms: detect, diagnose, analyze, plan, knowledge, recover. Fault detection in systems with a self-healing mechanism is usually achieved through mechanisms able to recognize degradation. Different papers show two main approaches for detecting and reporting suspicious behavior: searching for inconsistencies in the sensed data, or being triggered by inappropriate behavior.
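The three states and the detect-then-recover cycle described above can be sketched as a tiny state classifier (the error-rate thresholds are assumed for illustration; choosing them is exactly the per-system characterization the text calls for):

```python
# Illustrative sketch of the three self-healing states and a
# detect -> recover cycle. Thresholds are hypothetical: defining
# them per system is what makes the gray line between states crisp.

NORMAL, DEGRADED, BROKEN = "normal", "degraded", "broken"

def classify(error_rate: float) -> str:
    """Detect: map an observed error rate to a system state."""
    if error_rate < 0.01:
        return NORMAL
    if error_rate < 0.20:
        return DEGRADED
    return BROKEN

def heal(state: str) -> str:
    """Recover: apply a technique such as replacement or isolation
    whenever degradation or breakage is detected."""
    if state in (DEGRADED, BROKEN):
        return NORMAL  # e.g. restart/replace the faulty component
    return state

print(classify(0.05))        # degraded
print(heal(classify(0.05)))  # normal
```

A real self-healing loop would also include the diagnose, analyze, plan, and knowledge mechanisms between detection and recovery.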
Very important for a self-healing system are its policies. In [4], three groups of policies are defined: Action Policies, Goal Policies, and Utility Function Policies. Last but not least are the recovery techniques, which specify what behavior should be applied: replacement, balancing, isolation, persistence, redirection, relocation, diversity.

Self-Organization
There are many definitions of self-organization in almost every technical and social field. From the point of view of computer science, the trend is usually to try to understand how nature does things, because nature usually knows how to do them best. According to [3], "Self-organization is defined as the mechanism or the process enabling a system to change its organization without explicit external command during its execution time." A typical self-organizing system needs to be capable of adaptation, evolution, and emergence.

d) Explain the differences between naming and trading, and suggest when you would use naming and when you would use trading for locating objects. (8 Marks)

