www.1000projects.com
www.fullinterview.com
www.chetanasprojects.com

A full paper report on Grid Computing

Abstract

“Grid” computing has emerged as an important new field. In this article, we define it and review the “Grid problem,” which we define as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources, which we refer to as virtual organizations.

In such settings, we encounter unique authentication, authorization, resource access, resource discovery, and other challenges. It is this class of problem that is addressed by Grid technologies. We present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing.

Grid computing, most simply stated, is distributed computing taken to the next evolutionary level. The goal is to create the illusion of a simple yet large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources.

We describe requirements that we believe any such mechanisms must satisfy and we discuss the importance of defining a compact set of intergrid protocols to enable interoperability among different Grid systems.

We mainly emphasize:

What Grid computing can do
Grid construction
Grid architecture
Using a Grid: a user’s perspective
Using a Grid: an administrator’s perspective
Applications of Grid

Finally, we discuss how Grid technologies relate to other contemporary technologies, including
enterprise integration, application service provider, storage service provider, and peer-to-peer computing. We maintain that Grid concepts and technologies complement and have much to contribute to these other approaches.

Introduction:

Grid computing, most simply stated, is distributed computing taken to the next evolutionary level. The goal is to create the illusion of a simple yet large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources.

The standardization of communications between heterogeneous systems created the Internet explosion. The emerging standardization for sharing resources, along with the availability of higher bandwidth, is driving a possibly equally large evolutionary step in grid computing.

The following major topics will be introduced:

What grid computing can do

When you deploy a grid, it will be to meet a set of customer requirements. To better match grid computing capabilities to those requirements, it is useful to keep in mind the reasons for using grid computing.

Applications

There are many factors to consider in grid-enabling an application. Not all applications can be transformed to run in parallel on a grid and achieve scalability. There are some practical tools that skilled application designers can use to write a parallel grid application. However, automatic transformation of applications is a science in its infancy. This can be a difficult job and often requires top mathematics and programming talents, if it is even possible in a given situation. New computation-intensive applications written today are being designed for parallel execution, and these will be easily grid-enabled if they do not already follow emerging grid protocols and standards.

Virtual resources and virtual organizations for collaboration

Another important grid computing contribution is to enable and simplify collaboration among a wider audience. The users of the grid can be organized dynamically into a number of virtual organizations, each with different policy requirements. These virtual organizations can share their resources collectively as a larger grid.

Sharing starts with data in the form of files or databases. A “data grid” can expand data capabilities in several ways. First, files or databases can seamlessly span many systems and thus have larger capacities than on any single system. Such spanning can improve data transfer rates through the use of striping techniques. Data can be duplicated
throughout the grid to serve as a backup and can be hosted on or near the machines most likely to need the data, in conjunction with advanced scheduling techniques.

Sharing is not limited to files, but also includes many other resources, such as equipment, software, services, licenses, and others. These resources are “virtualized” to give them a more uniform interoperability among heterogeneous grid participants.

Figure 1

Resource balancing

A grid federates a large number of resources contributed by individual machines into a greater total virtual resource. This feature can prove invaluable for handling occasional peak loads of activity in parts of a larger organization. This can happen in two ways:
An unexpected peak can be routed to relatively idle machines in the grid.
If the grid is already fully utilized, the lowest priority work being performed on the grid can be temporarily suspended or even cancelled and performed again later to make room for the higher priority work.
Without a grid infrastructure, such balancing decisions are difficult to prioritize and execute.

While the Resource layer is focused on interactions with a single resource, the next layer in the architecture contains protocols and services (and APIs and SDKs) that are not associated with any one specific resource but rather are global in nature and capture interactions across collections of resources.

Reliability

High-end conventional computing systems use expensive hardware to increase reliability. They are built using chips with redundant circuits that vote on results, and contain much logic to achieve graceful recovery from an assortment of hardware failures. The machines also use duplicate processors with hot pluggability so that when they fail, one can be replaced without turning the other off. Power supplies and cooling systems are duplicated. The systems are operated on special power sources that can start
generators if utility power is interrupted. All of this builds a reliable system, but at a great cost, due to the duplication of high-reliability components.

In principle, most of the reliability attributes achieved using hardware in today’s high availability systems can be achieved using software in a grid setting in the future.

Management

The goal to virtualize the resources on the grid and more uniformly handle heterogeneous systems will create new opportunities to better manage a larger, more dispersed IT infrastructure. It will be easier to visualize capacity and utilization, making it easier for IT departments to control expenditures for computing resources over a larger organization.

The grid offers management of priorities among different projects. In the past, each project may have been responsible for its own IT resource hardware and the expenses associated with it. Often this hardware might be underutilized while another project finds itself in trouble, needing more resources due to unexpected events. With the larger view a grid can offer, it becomes easier to control and manage such situations.

Aggregating utilization data over a larger set of projects can enhance an organization’s ability to project future upgrade needs. When maintenance is required, grid work can be rerouted to other machines without crippling the projects involved.

Grid construction

An ad hoc grid may be installed by a few programmers in their spare time, but as the grid grows, and as users become more dependent on it for mission-critical work, a degree of planning is essential. It is best to understand the organization’s requirements and choose grid technologies that best fit these requirements. This section discusses some of the planning considerations and grid components that address the requirements.

Grid Architecture

Our goal in describing our Grid architecture is not to provide a complete enumeration of all required protocols (and services, APIs, and SDKs) but rather to identify requirements for general classes of component. The result is an extensible, open architectural structure within which can be placed solutions to key VO requirements.

The architecture follows an hourglass model. By definition, the number of protocols defined at the neck must be small. In our architecture, the neck of the hourglass consists of Resource and Connectivity protocols, which facilitate the sharing of individual resources. Protocols at these layers are designed so that they can be implemented on top of a diverse range of resource types, defined at the Fabric layer, and can in turn be used to construct a wide range of global services and application-specific behaviors at the Collective layer, so called because they involve the coordinated (“collective”) use of multiple resources.
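
The layering described above can be pictured with a small sketch. This is only an illustration, not part of any Grid toolkit: the class and function names are hypothetical, the placement of Connectivity below Resource follows the usual presentation of this architecture, and the top Application layer is an assumption taken from that same presentation rather than from the text above.

```python
from enum import IntEnum


class GridLayer(IntEnum):
    """Layers ordered from the bottom of the hourglass to the top (assumed order)."""
    FABRIC = 1        # diverse resource types
    CONNECTIVITY = 2  # neck: communication and authentication protocols
    RESOURCE = 3      # neck: sharing of individual resources
    COLLECTIVE = 4    # coordinated use of multiple resources
    APPLICATION = 5   # assumption: not named in the text above


# The narrow neck of the hourglass, as stated in the text.
NECK = frozenset({GridLayer.CONNECTIVITY, GridLayer.RESOURCE})


def is_neck(layer: GridLayer) -> bool:
    """Return True if the layer belongs to the small set of neck protocols."""
    return layer in NECK


def builds_on(layer: GridLayer) -> list:
    """Return the layers beneath the given one, bottom first."""
    return [lower for lower in GridLayer if lower < layer]


if __name__ == "__main__":
    for layer in GridLayer:
        marker = "neck" if is_neck(layer) else ""
        print(f"{layer.name:<12} {marker}")
```

Keeping the neck small means that many Collective services above and many Fabric resource types below only need to agree on the few protocols in the middle.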
Figure 2: The layered Grid architecture and its relationship to the Internet protocol architecture. Because the Internet protocol architecture extends from network to application, there is a mapping from Grid layers into Internet layers.

Deployment planning

The use of a grid is often born from a need for increased resources of some type. One of the first considerations is the hardware available and how it is connected via a LAN or WAN. An organization may also want to add additional hardware to augment the capabilities of the grid.

Security

Security is a much more important factor in planning and maintaining a grid than in conventional distributed computing, where data sharing comprises the bulk of the activity. In a grid, the member machines are configured to execute programs rather than just move data. This makes an unsecured grid potentially fertile ground for viruses and Trojan horse programs. For this reason, it is important to understand exactly which components of the grid must be rigorously secured to deter any kind of attack.

Organization

The technology considerations are important in deploying a grid. However, organizational and business issues can be equally important. It is important to understand how the departments in an organization interact, operate, and contribute to the whole. Often, there are barriers built between departments and projects to protect their resources in an effort to increase the probability of timely success.

Grid software components

This section presents some of the key components that must be discussed before designing a grid computing architecture.

Management components

Any grid system has some management components. First, there is a component that keeps track of the resources available to the grid and which users are members of the grid.

Second, there are measurement components that determine both the capacities of the nodes on the grid and their current utilization rate at any given time. This information is used to schedule jobs in the grid.

Third, advanced grid management software can automatically manage many aspects of the grid. This is known as autonomic computing, or “recovery oriented computing.” This software would automatically recover from various kinds of grid failures and

outages, finding alternative ways to get the workload processed.

Donor software

Each machine contributing resources typically needs to enroll as a member of the grid and install some software that manages the grid’s use of its resources. Some grid systems provide their own login to the grid while others depend on the native operating systems for user authentication. In the latter case, a user ID mapping system may be needed to match the user’s rights properly on different machines.

Most importantly, the software installed on a given machine can accept an executable job from the grid management system and execute it. More advanced implementations can dynamically adjust the priority of a running job, suspend it and resume running it later, or checkpoint it with the possibility of resuming its execution on a different machine. These kinds of actions may be necessary to respond to load balancing problems or priority or policy changes in the grid.

Submission software

Usually any member machine of a grid can be used to submit jobs to the grid and initiate grid queries. In some grid systems, this function is implemented as a separate component installed on “submission nodes” or “submission clients.” When a grid is built using dedicated resources rather than scavenged resources, separate submission software is usually installed on the user’s desktop or workstation.

Distributed grid management

Larger grids may have a hierarchical or other type of organizational topology, usually matching the connectivity topology. That is, machines locally connected together with a LAN form a “cluster” of machines. The grid may be organized in a hierarchy consisting of clusters of clusters. The work involved in managing the grid is distributed to increase the scalability of the grid. The collection of grid operation and resource data, as well as job scheduling, is distributed to match the topology of the grid.

Schedulers

Most grid systems include some sort of job scheduling software. This software locates a machine on which to run a grid job that has been submitted by a user. In the simplest cases, it may just blindly assign jobs in a round-robin fashion to the next machine matching the resource requirements.

Some schedulers implement a job priority system. This is sometimes done by using several job queues, each with a different priority. As grid machines become available to execute jobs, the jobs are taken from the highest priority queues first.

Policies of various kinds are also implemented using schedulers. More advanced schedulers will monitor the progress of scheduled jobs, managing the overall workflow.

Communications

A grid system may include software to help jobs communicate with each other. The open standard Message Passing Interface (MPI) and any of several
variations is often included as part of the grid system for just this kind of communication.

Using a grid: A user’s perspective

This section describes the typical usage activities in using the grid from a user’s perspective.

Logging onto the grid

To use the grid, most grid systems require the user to log on to a system using a user ID that is enrolled in the grid. Other grid systems may have their own grid login ID separate from the one on the operating system. A grid login is usually more convenient for grid users. It eliminates the ID matching problems among different machines. To the user, it makes the grid look more like one large virtual computer rather than a collection of individual machines. Some grid implementations permit some query functions if the user is not logged into the grid or even if the user is not enrolled in the grid.

Queries and submitting jobs

The user will usually perform some queries to check how busy the grid is, to see how his submitted jobs are progressing, and to look for resources on the grid. Grid systems usually provide command line tools as well as graphical user interfaces (GUIs) for queries. Command line tools are especially useful when the user wants to write a script that automates a sequence of actions.

Submitting a job involves three steps. First, some input data and possibly the executable program or execution script file are sent to the machine to execute the job. Sending the input is called “staging the input data.” Alternatively, the data and program files may be pre-installed on the grid machines or accessible via a mountable networked file system. A nice feature provided by some grid systems is the ability to register multiple versions of the program so that the grid system can automatically choose the version that matches the grid machine that will run the program. Some grid technologies require that the program and input data be first processed or “wrappered” in some way by the grid system.

Second, the job is executed on the grid machine. The grid software running on the donating machine executes the program in a process on the user’s behalf. It may use a common user ID on the machine or it may use the user’s own user ID, depending on which grid technology is used. Some grid systems implement a protective “sandbox” around the program so that it cannot cause any disruption to the donating machine if it encounters a problem during execution.

Third, the results of the job are sent back to the submitter. In some implementations, intermediate results can be viewed by the user who submitted the job.

Data configuration

The data accessed by the grid jobs may simply be staged in and out by the grid
system. However, depending on its size and the number of jobs, this can potentially add up to a large amount of data traffic. For this reason, some thought is usually given to minimizing such data movement on the grid.

There are many considerations in efficiently planning the distribution and sharing of data on a grid. This type of analysis is necessary for large jobs to better utilize the grid and not create unnecessary bottlenecks.

Monitoring progress and recovery

The user can query the grid system to see how his application and its subjobs are progressing. When the number of subjobs becomes large, it becomes too difficult to list them all in a graphical window. Instead, there may simply be one large bar graph showing some averaged progress metric. It then becomes more difficult for the user to tell if any particular subjob is not running properly.

A grid system, in conjunction with its job scheduler, often provides some degree of recovery for subjobs that fail. A job may fail due to:
Programming error: The job stops part way with some program fault.
Hardware or power failure: The machine or devices being used stop working in some way.
Communications interruption: A communication path to the machine has failed or is overloaded with other data traffic.
Excessive slowness: The job might be in an infinite loop, or normal job progress may be limited by another process running at a higher priority or some other form of contention.

Using a grid: An administrator’s perspective

This section describes the typical usage activities in using the grid from an administrator’s perspective.

Planning

The administrator should understand the organization’s requirements for the grid to better choose the grid technologies that satisfy those requirements. The following sections briefly describe the steps the administrator may take to manage the grid. It is suggested that one start by deploying a small grid first, to learn about its installation and management, before having to confront the more complicated issues involved with a large grid.

Installation

First, the selected grid system must be installed on an appropriately configured set of machines. These machines should be connected using networks with sufficient bandwidth to other machines on the grid. Of prime importance is understanding the fail-over scenarios for the given grid system so that the grid can continue operating even if any of the management machines fails in some way. Machines should be configured and connected to facilitate recovery scenarios. Any critical databases or other data essential for keeping track of the
jobs in the grid, members of the grid, and machines on the grid should have suitable backups. The software to be installed on the donor machines may need to be customized so that it can find the grid management machines automatically and include pre-installed public keys for the grid. This software may be provided to potential donors on an FTP or equivalent server or be made available on physical media.

Once the grid is operational, there may be application software and data that should be installed on donor machines as well. This software may have specific licensing restrictions that should be understood and adhered to. Some grid systems include tools to assist with grid-wide license management. This can both help in following the rules of the licenses and exploit those licenses most efficiently.

Managing enrollment of donors and users

An ongoing task for the grid administrator is to manage the members of the grid, both the machines donating resources and the users. Users may be further organized as project groups. The administrator is responsible for controlling the rights of the users in the grid. Donor machines may have access rights that require management as well. Grid jobs running on donor machines may be executed under a special grid user ID on behalf of the users submitting the jobs. The rights of these grid user IDs must be properly set so that grid jobs do not allow access to parts of the donor machine to which the users are not entitled.

As users join the grid, their identity must be positively established and entered in the Certificate Authority. The user and his certificate credentials must be added to the user list using the software appropriate for the grid system deployed. In some cases, the administrator must propagate the user information to several or all grid machines.

The administrator must enter the machine’s identification credentials, addresses, and resource characteristics using the appropriate software for enrolling the donor machine into the grid. In some cases, the administrator may need to manually propagate this information to other machines in the grid.

Corresponding procedures for removing users and machines must be executed by the administrator.

Certificate authority

It is critical to ensure the highest levels of security in a grid because the grid is designed to execute code and not just share data. Thus, it can be fertile ground for viruses, Trojan horses, and other attacks if the grid system is compromised in any way. The Certificate Authority is one of the most important aspects of maintaining strong grid security. An organization may choose to use an external Certificate Authority or operate one itself. You must be able to trust the Certificate Authority to strictly adhere to its responsibilities.

The primary responsibilities of a Certificate Authority are:

• Positively identifying entities requesting certificates
• Issuing, removing, and archiving certificates
• Protecting the Certificate Authority server
• Maintaining a namespace of unique names for certificate owners
• Serving signed certificates to those needing to authenticate entities
• Logging activity

Briefly, a Certificate Authority is based on the public key encryption system. In this system, keys are generated in pairs, a public key and a private key. Either one can be used to encrypt some data such that the other is needed to decrypt it. The private key is guarded by the owner and never revealed to anyone. The public one is given to anyone needing it. A Certificate Authority is used to hold these public keys and to guarantee who they belong to.

When a user uses his private key to encrypt something, the receiver uses the corresponding public key to decrypt it. The receiver knows that only that user’s public key can decrypt the message correctly. However, anyone could intercept this message and decrypt it because anyone can get the originator’s public key. If the originator instead doubly encrypts the message with his private key and the intended recipient’s public key, a secure communication link is formed. The receiver uses his private key to decrypt the message and then uses the sender’s public key for the second decryption. Now the recipient knows that if the message decrypts properly, then only the sender could have sent it and, furthermore, the sender knows that only the intended receiver can decrypt it. The beauty of all of this is that nobody had to securely carry an encryption key from the sender to the receiver, as must be done for conventional encryption systems, and any tampering with the communication is revealed.

Resource management

Another responsibility of the administrator is to manage the resources of the grid. This includes setting permissions for grid users to use the resources as well as tracking resource usage and implementing a corresponding accounting or billing system. Usage statistics are useful in identifying trends in an organization that may require the acquisition of additional hardware, reduction in excess hardware to reduce costs, and adjustments in priorities and policies to achieve utilization that is fairer or better achieves the overall goals of an organization.

Software license managers can be used in a grid setting to control the proper utilization of licenses. These may be configured to work with job schedulers to prioritize the use of the limited licenses.

Data sharing

For small grids, the sharing of data can be fairly easy, using existing networked file systems, databases, or standard data transfer protocols. As a grid grows and the users become dependent on any of the data storage repositories, the administrator should consider procedures to maintain backup copies and replicas to improve performance. All of the resource management concerns apply to data on the grid.
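
The double-encryption exchange described under Certificate authority above can be illustrated with a toy sketch. It assumes textbook RSA with tiny primes, which is not secure and is not how a real grid toolkit or Certificate Authority works in practice; the function names, key sizes, and message value are all hypothetical and chosen only to show that either key of a pair undoes the other.

```python
"""Toy double encryption with textbook RSA (tiny primes, no padding: NOT secure)."""


def make_keypair(p: int, q: int, e: int):
    """Return (public_key, private_key), each as an (exponent, modulus) pair."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; requires Python 3.8+
    return (e, n), (d, n)


def apply_key(value: int, key) -> int:
    """Encrypt or decrypt: in textbook RSA both are a modular exponentiation."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# The sender's modulus is the smaller one, so the output of the first
# step always fits under the recipient's modulus in the second step.
sender_public, sender_private = make_keypair(61, 53, 17)       # n = 3233
recipient_public, recipient_private = make_keypair(89, 97, 5)  # n = 8633

message = 1234  # hypothetical payload; must be smaller than the sender's modulus

# Sender: encrypt with own private key, then with the recipient's public key.
signed = apply_key(message, sender_private)
sealed = apply_key(signed, recipient_public)

# Recipient: decrypt with own private key, then with the sender's public key.
unsealed = apply_key(sealed, recipient_private)
recovered = apply_key(unsealed, sender_public)

print("message recovered:", recovered == message)
```

Only the holder of the recipient's private key can undo the outer step, and only the sender's private key could have produced an inner value that the sender's public key opens correctly, which is the two-way guarantee the text describes.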

Applications Of Grid:

Grid applications include

• Distributed Supercomputing
Distributed supercomputing applications couple multiple computational resources (supercomputers and/or workstations). Examples include SFExpress (large-scale modeling of battle entities with complex interactive behavior for distributed interactive simulation) and Climate Modeling (modeling of climate behavior using complex models and long time-scales).

• High-Throughput Applications
The grid is used to schedule large numbers of independent or loosely coupled tasks with the goal of putting unused cycles to work. High-throughput applications include RSA key cracking.

• Data-Intensive Applications

Sharing relationships can vary dynamically over time, in terms of the resources involved, the nature of the access permitted, and the participants to whom access is permitted. The dynamic nature of sharing relationships means that we require mechanisms for discovering and characterizing the nature of the relationships that exist at a particular point in time.

Sharing relationships are often not simply client-server, but peer to peer: providers can be consumers, and sharing relationships can exist among any subset of participants. Sharing relationships may be combined to coordinate use across many resources, each owned by different organizations.

These characteristics and requirements define what we term a virtual organization, a concept that we believe is becoming fundamental to much of modern computing. VOs enable disparate groups of organizations and/or individuals to share resources in a controlled fashion, so that members may collaborate to achieve a shared goal.

Conclusion:

We have provided in this article a concise statement of the “Grid problem,” which we define as controlled and coordinated resource sharing and resource use in dynamic, scalable virtual organizations.

Finally, we have discussed in some detail how Grid technologies relate to other important technologies. We also hope that our analysis will help establish connections among Grid developers and proponents of related technologies.
