
FILE DOWNLOADING DISTRIBUTION AND SCHEDULING ON GRID/CLUSTER/CLOUD COMPUTING
Aim of the Project:

The aim of the project is to build a generic distributed system that
supplies a user-friendly API for remote execution. With the system, a
developer runs a task producer on one computer, and the system takes
care of distributing the tasks, according to their priority, over all
the computers connected to it so that they are executed in parallel.
The tasks’ results are returned to the same computer the developer
uses.
Problem Statement:

The system checks whether the System Version (which increases every
time the user updates the system) and the Executer Version (determined
by the Task/Result classes) are equal. If they are not, the system moves
the Executer to the Black List, and the Executer runs again only after
the user runs Auto-Update. Because the system is distributed and runs
over more than one computer on the web, a firewall problem appears: some
of the computers run behind a firewall and others run without one. If
the Task buffers fill up with a mass of Tasks (possible, for example,
when the user adds 10000 small tasks to the same client), the buffer
becomes a bottleneck and performance drops.
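The version check described above can be sketched as follows. This is only a minimal illustration: the class name VersionGate, its method names and the use of integer version numbers are assumptions made for the example and are not taken from the project code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: admits an Executer only when its version equals the System Version. */
final class VersionGate {
    private final AtomicInteger systemVersion;
    private final Set<String> blackList = ConcurrentHashMap.newKeySet();

    VersionGate(int initialSystemVersion) {
        this.systemVersion = new AtomicInteger(initialSystemVersion);
    }

    /** Called whenever the user updates the system; returns the new System Version. */
    int systemUpdated() {
        return systemVersion.incrementAndGet();
    }

    /** Returns true if the Executer may run; otherwise it is put on the Black List. */
    boolean admit(String executerId, int executerVersion) {
        if (executerVersion == systemVersion.get()) {
            blackList.remove(executerId); // versions match again, e.g. after Auto-Update
            return true;
        }
        blackList.add(executerId);
        return false;
    }

    boolean isBlackListed(String executerId) {
        return blackList.contains(executerId);
    }
}
```

In this sketch an Executer leaves the Black List the next time it reports a version equal to the System Version, which is what would be expected after a successful Auto-Update.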
Proposed System:

In the proposed system, the system distributes tasks with different priorities
to “worker” machines and gets back results. The system provides a generic API,
which is extended to implement a concrete system. To create a concrete system,
the following interfaces are implemented:

• List of operations supported by the servers
• Operation
• Task
• Result
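As an illustration of how these interfaces might fit together, here is a minimal sketch. The method names (priority, execute, operationNames), the generic parameters and the Square* example classes are all assumptions made for the example; the project's actual signatures are not given in this document.

```java
import java.io.Serializable;
import java.util.List;

/** Hypothetical Task: serializable so that it can be sent to a remote machine. */
interface Task extends Serializable {
    int priority(); // assumption: a larger value means a more urgent task
}

/** Hypothetical Result: serializable so that it can travel back to the client. */
interface Result extends Serializable {
}

/** Hypothetical Operation: turns one Task into one Result on a worker machine. */
interface Operation<T extends Task, R extends Result> {
    R execute(T task) throws Exception;
}

/** Hypothetical list of the operations a server supports, identified by name. */
interface OperationList {
    List<String> operationNames();
}

/** Example of a concrete system: squaring a number. */
final class SquareTask implements Task {
    private static final long serialVersionUID = 1L;
    final int value;
    private final int priority;

    SquareTask(int value, int priority) {
        this.value = value;
        this.priority = priority;
    }

    @Override
    public int priority() {
        return priority;
    }
}

final class SquareResult implements Result {
    private static final long serialVersionUID = 1L;
    final long squared;

    SquareResult(long squared) {
        this.squared = squared;
    }
}

final class SquareOperation implements Operation<SquareTask, SquareResult> {
    @Override
    public SquareResult execute(SquareTask task) {
        return new SquareResult((long) task.value * task.value);
    }
}
```

A concrete system would then consist of one Task/Result pair and one Operation per kind of work, while the generic part of the system only deals with the interfaces.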
The system supports more than one client, and operation requests are
executed in parallel, which means that a client can send another task request
before it has received the results of previous tasks. Each task has its own
priority, and the system tries to complete higher-priority tasks as fast as
possible. The system supports an Auto-Update option with version handling.
This option allows the user to update, add or remove any of the parts provided
(operations list, implementations, task and result implementations) on
remote machines dynamically from the client’s computer. The system can
run on firewalled networks, when the computers in charge of the Clients and
the System Manager are behind the firewall and the computers in charge of
the Executers are not. The system has a UI that controls the important
features, i.e. Auto-Update, and shows the log. The system supports output
stream management: when a task operation writes to the output stream in the
Executer, the output is shown in the Client’s output stream. The system also
supports a Remote Exception Handler option, in which an exception thrown
on any server while executing an operation is handled on the computer of the
client that originated the operation.
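The priority rule described above (higher-priority tasks are completed first) could be sketched with a priority queue. This is an assumption about one possible implementation, not the project's scheduler: the class names, the convention that a larger number is more urgent, and the tie-break by submission order are all invented for the example, which needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical queue entry: a task name plus its priority and submission order. */
final class PrioritizedTask {
    final String name;
    final int priority;  // assumption: larger number = more urgent
    final long sequence; // submission order, used to break ties

    PrioritizedTask(String name, int priority, long sequence) {
        this.name = name;
        this.priority = priority;
        this.sequence = sequence;
    }
}

/** Hypothetical thread-safe queue that hands out the most urgent task first. */
final class PriorityTaskQueue {
    // Highest priority first; equal priorities keep their submission order.
    private static final Comparator<PrioritizedTask> ORDER =
            Comparator.comparingInt((PrioritizedTask t) -> t.priority)
                      .reversed()
                      .thenComparingLong(t -> t.sequence);

    private final AtomicLong nextSequence = new AtomicLong();
    private final PriorityBlockingQueue<PrioritizedTask> queue =
            new PriorityBlockingQueue<>(16, ORDER);

    void submit(String name, int priority) {
        queue.add(new PrioritizedTask(name, priority, nextSequence.getAndIncrement()));
    }

    /** Removes and returns the most urgent task name, or null if the queue is empty. */
    String next() {
        PrioritizedTask task = queue.poll();
        return task == null ? null : task.name;
    }

    /** Removes all tasks and returns their names in execution order. */
    List<String> drain() {
        List<String> names = new ArrayList<>();
        for (String name = next(); name != null; name = next()) {
            names.add(name);
        }
        return names;
    }
}
```

Because clients can keep submitting while earlier tasks are still running, a thread-safe queue is used here; the sequence number keeps tasks of equal priority in first-come, first-served order.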
Project Module Description:
• System Manager Module: Manages communication among system
components, including networking, Auto-Update, failure detection and
output stream redirection. In this module the end user can see all
Executers and Clients (running ones, suspended ones and those in the
Black List), manage them (add, remove and suspend), see their logs and
Auto-Update the system.

• Client Module: This module provides the user interface, resides on the
user’s side and is responsible for starting and stopping the system,
submitting Tasks and processing Results. It includes all the workers, threads
and constants, such as the Chunk Creator, Chunk Scheduler, Result Collector
and ports. The Chunk Creator creates a Chunk of Items to send via RMI. The
Chunk Scheduler is responsible for scheduling a Chunk to an Executer, and the
Result Collector collects the ready Results for a Client.
• Executer Module: This module is the remote component, controlled by the
System Manager, that implements the Task Operation. It includes all the
workers, threads and constants, such as the directories in which Jar files
are saved, Versions, the Result Organizer and the Task Executor. The Result
Organizer organizes the Results by Client, so that the Result of each Task is
sent to the relevant Client, i.e. the one that added the Task to the system.
The Task Executor executes a Task after polling it from the Tasks Queue and
puts the Result in the Results Queue.
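The interplay of the Chunk Creator and the Task Executor described above can be sketched as a small in-process pipeline. Everything here is an assumption made for illustration: the real system sends Chunks over RMI, whereas this sketch uses in-memory queues, integer items, a squaring operation as a stand-in for the Task Operation, and invented class and method names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical Chunk Creator: groups items so that one remote call can carry many of them. */
final class ChunkCreator {
    static <T> List<List<T>> createChunks(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end))); // independent copy per chunk
        }
        return chunks;
    }
}

/** Hypothetical Task Executor: takes chunks from the Tasks Queue, puts results in the Results Queue. */
final class TaskExecutorWorker implements Runnable {
    private final BlockingQueue<List<Integer>> tasksQueue;
    private final BlockingQueue<Integer> resultsQueue;

    TaskExecutorWorker(BlockingQueue<List<Integer>> tasksQueue, BlockingQueue<Integer> resultsQueue) {
        this.tasksQueue = tasksQueue;
        this.resultsQueue = resultsQueue;
    }

    @Override
    public void run() {
        try {
            while (true) {
                List<Integer> chunk = tasksQueue.take(); // blocks until a chunk is available
                for (int item : chunk) {
                    resultsQueue.put(item * item);       // placeholder for the real operation
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();          // stop when the worker is interrupted
        }
    }
}

final class ExecuterPipelineDemo {
    /** Sends all items through a single worker and returns the results in arrival order. */
    static List<Integer> squareAll(List<Integer> items, int chunkSize) {
        BlockingQueue<List<Integer>> tasks = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();
        Thread worker = new Thread(new TaskExecutorWorker(tasks, results));
        worker.setDaemon(true);
        worker.start();
        tasks.addAll(ChunkCreator.createChunks(items, chunkSize));
        List<Integer> collected = new ArrayList<>();
        try {
            for (int i = 0; i < items.size(); i++) {
                Integer result = results.poll(5, TimeUnit.SECONDS);
                if (result == null) {
                    throw new IllegalStateException("timed out waiting for a result");
                }
                collected.add(result);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            worker.interrupt();
        }
        return collected;
    }
}
```

For example, ExecuterPipelineDemo.squareAll(Arrays.asList(1, 2, 3, 4, 5), 2) sends three chunks through the worker. Grouping many small tasks into a few chunks is one plausible way to relieve the buffer bottleneck mentioned in the Problem Statement.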
Project Architecture:

[Figure: Client | Stub | Network | Skeleton | Server]

A typical implementation model of Java-RMI using Stub and Skeleton objects.


[Figure labels: Client, System Manager, Executer 1, Executer 2, Executer 3; “Give me Executor” request; TASK and RESULT flows]

Tasks and Results Scheduling and Communication Diagram
Hardware Requirements

The minimum configuration is as follows:


• Pentium IV machine
• 512MB RAM
• 20 GB HDD
• Monitor Color/Monochrome
• Keyboard and Mouse

Software Requirements

• The language chosen for this project is Java, and the software used is JDK 1.6.
• Operating system used: Microsoft Windows XP.
• IDE used: MyEclipse 7.1
Future Enhancements:
• By default, the RMI connection is kept for 10 minutes after it is unbound,
which forces the system to kill the System Manager each time Auto-Update
is called in order to remove the RMI connection. It would be more efficient
to solve this problem by configuring RMI not to keep the unbound
connection.

• Since the system is used in the Technion network, where a firewall is in
place, we faced a problem with the Executer’s connection to the Client and
the System Manager. This problem is solved (see Problems & Solutions
section). The solution assumes that the System Manager, Client and UI reside
on the same side of the firewall, behind it, while the Executers can reside
on either side. Two extensions are proposed for this issue:

  ◦ It would be more appropriate to support firewall connections in both
  directions, so that the Client, System Manager and UI can be outside the
  firewall while the Executers are behind it. This could be done by testing the
  connection before the system starts and configuring it to work in either
  direction.

  ◦ The system still assumes that the Client, System Manager and UI are all
  behind the firewall. Another feature that can be added is support for each of
  them to reside anywhere.
• The failure detector restores the connection between the System Manager
and the Executer if the machine is disconnected from the network and
connects again (while the Executer itself is still running). However, this
feature does not detect loss of data, which might happen as a result of
packet loss when the Executer becomes disconnected. A solution to this
problem could be a new component that ensures that all the needed data is
received and resends the Tasks lost because of the disconnection of an
Executer.
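One way such a component could work is to remember which Tasks are currently out at each Executer and to put them back in the pending queue when that Executer disconnects. The sketch below only illustrates the idea under that assumption: the class InFlightTracker, its methods and the use of string task identifiers are invented for the example and are not part of the existing system.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical component that re-queues Tasks lost when an Executer disconnects. */
final class InFlightTracker {
    private final Queue<String> pending = new ArrayDeque<>();
    private final Map<String, List<String>> inFlight = new HashMap<>();

    /** A client adds a Task to the system. */
    synchronized void submit(String taskId) {
        pending.add(taskId);
    }

    /** Hands the next pending Task to an Executer and remembers who has it; null if none is pending. */
    synchronized String dispatch(String executerId) {
        String taskId = pending.poll();
        if (taskId != null) {
            inFlight.computeIfAbsent(executerId, id -> new ArrayList<>()).add(taskId);
        }
        return taskId;
    }

    /** The Result of a Task arrived, so the Task no longer needs protection. */
    synchronized void completed(String executerId, String taskId) {
        List<String> tasks = inFlight.get(executerId);
        if (tasks != null) {
            tasks.remove(taskId);
        }
    }

    /** Puts every unfinished Task of a disconnected Executer back in the queue; returns how many. */
    synchronized int executerDisconnected(String executerId) {
        List<String> lost = inFlight.remove(executerId);
        if (lost == null) {
            return 0;
        }
        pending.addAll(lost);
        return lost.size();
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

In this sketch the failure detector would call executerDisconnected when it notices the lost connection. A Task whose Result was computed but never delivered would then be executed a second time, so the approach assumes that Tasks are safe to repeat.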
