
Grid MasterClass

Carl Kesselman
Information Sciences Institute
University of Southern California
Univa Corporation
2

Credits
● Globus Toolkit v4 is the work of many talented
Globus Alliance members, at
◆ Argonne Natl. Lab & U.Chicago
◆ USC Information Sciences Institute
◆ National Center for Supercomputing Applns
◆ U. Edinburgh
◆ Swedish PDC
◆ Univa Corporation
◆ Other contributors at other institutions
● Supported by DOE, NSF, UK EPSRC, and other
sources
3

Acknowledgements
● Ian Foster with whom I developed many of these
slides
● Bill Allcock, Lisa Childers,
Kate Keahey, Jennifer Schopf,
Frank Siebenlist, Mike Wilde @ ANL/UC
● Ann Chervenak, Ewa Deelman, Laura Pearlman, Mike
D’Arcy, Rob Schuler @ USC/ISI
● Karl Czajkowski, Steve Tuecke @ Univa
● Numerous other fine colleagues
● NSF, DOE, IBM for research support
4

Context:
System-Level Science

Problems too large &/or complex to tackle alone …


5

Seismic Hazard Analysis


(T. Jordan & SCEC)
[Figure: inputs to the Seismic Hazard Model: seismicity, paleoseismology, local site effects, geologic structure, faults, stress transfer, rupture dynamics, crustal motion, crustal deformation, seismic velocity structure]

InSAR Image of the Hector Mine Earthquake
● A satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of the 1999 Hector Mine earthquake
● Shows the displacement field in the direction of radar imaging
● Each fringe (e.g., from red to red) corresponds to a few centimeters of displacement
6

SCEC Community Model

1 Standardized Seismic Hazard Analysis
2 Ground motion simulation
3 Physics-based earthquake forecasting
4 Ground-motion inverse problem
5 Structural Simulation

[Figure labels: Other Data, Geology, Geodesy; Unified Structural Representation (Faults, Motions, Stresses, Anelastic model); FSM, RDM, AWM, SRM; Ground Motions; Earthquake Forecast Model, Attenuation Relationship, Intensity Measures; Invert; pathway numbers 1 to 5]

FSM = Fault System Model        AWP = Anelastic Wave Propagation
RDM = Rupture Dynamics Model    SRM = Site Response Model
7

Science Takes a Village …


● Teams organized around common goals
◆ People, resources, software, data, instruments…
● With diverse membership & capabilities
◆ Expertise in multiple areas required
● And geographic and political distribution
◆ No location/organization possesses all required skills
and resources
● Must adapt as a function of the situation
◆ Adjust membership, reallocate responsibilities,
renegotiate resources
8

Virtual Organizations
● From organizational behavior/management:
◆ "a group of people who interact through
interdependent tasks guided by common purpose
[that] works across space, time, and organizational
boundaries with links strengthened by webs of
communication technologies" (Lipnack & Stamps,
1997)
● The impact of cyberinfrastructure
◆ People → computational agents & services
◆ Communication technologies → IT infrastructure, i.e. Grid

“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001


9

Beyond Science Silos:
Service-Oriented Architecture

● Decompose across network
● Clients integrate dynamically
◆ Select & compose services
◆ Select “best of breed” providers
◆ Publish result as a new service
● Decouple resource & service providers

[Figure labels: Function, Resource, Users, Discovery tools, Analysis tools, Data Archives. Fig: S. G. Djorgovski]
10

Decomposition Enables
Separation of Concerns & Roles

● User: “Provide access to data D at S1, S2, S3 with performance P”
● Service Provider: Replica catalog, User-level multicast, …
◆ “Provide storage with performance P1, network with P2, …”
● Resource Provider

[Figure: data D and sites S1, S2, S3 shown at each of the three levels]
11

Forming & Operating


(Scientific) Communities
● Define VO membership and roles, & enforce laws and
community standards
◆ I.e., policy
● Build, buy, operate, & share community
infrastructure
◆ Data, programs, services, computing, storage,
instruments
◆ Service-oriented architecture
● Define and perform collaborative work
◆ Use shared infrastructure, roles, & policy
◆ Manage community workflow
13

Defining Community:
Membership and Laws
● Identify VO participants and roles
◆ For people and services
● Specify and control actions of members
◆ Empower members → delegation
◆ Enforce restrictions → federate policy
14

Security Services Objectives


● It’s all about “policy”
◆ Define a VO’s operating rules
◆ Security services facilitate the enforcement
● Policy facilitates “business objectives”
◆ Related to goals/purpose of the VO
● Security policy often delicate balance
◆ Legislation may mandate minimum security
◆ More security → Higher costs
◆ Less security → Higher exposure to loss
◆ Risk versus Rewards
15

Policy Challenges in VOs

● Restrict VO operations based on characteristics of requestor
◆ VO dynamics create challenges
● Intra-VO
◆ VO specific roles
◆ Mechanisms to specify/enforce policy at VO level
● Inter-VO
◆ Entities/roles in one VO not necessarily defined in another VO

[Figure labels: Effective Access; Access granted by community to user; Policy of site to community; Site admission-control policies]


16

Core Security Mechanisms


● Attribute Assertions
◆ C asserts that S has attribute A with value V
● Authentication and digital signature
◆ Allows signer to assert attributes
● Delegation
◆ C asserts that S can perform O on behalf of C
● Attribute mapping
◆ {A1, A2, …, An} in VO1 → {A’1, A’2, …, A’m} in VO2
● Policy
◆ Entity with attributes A asserted by C may perform
operation O on resource R
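
The mechanisms above can be sketched in a few lines of Python. This is a minimal illustration only, not Globus Toolkit code: the class and function names (`Assertion`, `Delegation`, `PolicyRule`, `map_attributes`, `is_permitted`, `acts_on_behalf_of`) are hypothetical, and digital signatures are left out, so the `issuer` field is simply trusted as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    issuer: str      # C, the party making the assertion
    subject: str     # S
    attribute: str   # A
    value: str       # V


@dataclass(frozen=True)
class Delegation:
    issuer: str      # C, who delegates
    delegate: str    # S, who may act on behalf of C
    operation: str   # O


@dataclass(frozen=True)
class PolicyRule:
    trusted_issuer: str  # C whose assertions the rule accepts
    attribute: str
    value: str
    operation: str       # O
    resource: str        # R


def map_attributes(assertions, mapping, mapper):
    """Translate VO1 attributes into VO2 attributes; `mapper` issues the new assertions."""
    mapped = []
    for a in assertions:
        key = (a.attribute, a.value)
        if key in mapping:
            attribute, value = mapping[key]
            mapped.append(Assertion(mapper, a.subject, attribute, value))
    return mapped


def is_permitted(subject, operation, resource, assertions, rules):
    """Policy: an entity with attribute A asserted by C may perform O on R."""
    for rule in rules:
        if rule.operation != operation or rule.resource != resource:
            continue
        for a in assertions:
            if (a.subject == subject and a.issuer == rule.trusted_issuer
                    and a.attribute == rule.attribute and a.value == rule.value):
                return True
    return False  # default deny (a choice made for this sketch)


def acts_on_behalf_of(delegate, principal, operation, delegations):
    """Delegation: has `principal` asserted that `delegate` may perform `operation`?"""
    return any(d.issuer == principal and d.delegate == delegate
               and d.operation == operation for d in delegations)
```

For example, an assertion issued in one VO only satisfies a rule in another VO after it has been mapped by an authority that the second VO's policy trusts.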
17

Trust in VOs
● Do I “believe” an attribute assertion?
◆ Used to evaluate cost vs. benefit of
performing an operation
● E.g., perform untrusted operation with extra auditing
● Look at attributes of assertion signer
● Rooting trust
◆ Externally recognized source, e.g., CA
◆ Dynamically via VO structure → delegation
◆ Dynamically via alternative sources, e.g.,
reputation
18

Security Services for VO Policy


● Attribute Authority (ATA)
◆ Issue signed attribute assertions (incl. identity, delegation &
mapping)
● Authorization Authority (AZA)
◆ Decisions based on assertions & policy
● Use with message/transport level security

[Figure: a Resource Admin, VO attribute authorities (ATA), and a VO authorization authority (AZA). Labels: VO Member Attribute for User A and User B; Delegation Assertion “User B can use Service A”; Mapping of VO-A Attr → VO-B Attr; VO A Service; VO B Service]
19

Forming & Operating


Scientific Communities
● Define VO membership and roles, & enforce laws and
community standards
◆ I.e., policy
● Build, buy, operate, & share community
infrastructure
◆ Data, programs, services, computing, storage,
instruments
◆ Service-oriented architecture
● Define and perform collaborative work
◆ Use shared infrastructure, roles, & policy
◆ Manage community workflow
20

Bootstrapping a VO
by Assembling Services
1) Integrate services from other sources
   Virtualize external services as VO services
2) Coordinate & compose
   Create new services from existing ones

[Figure labels: Community Services; Content; Services Provider; Capacity; Capacity Provider]

“Service-Oriented Science”, Foster, 2005


21

Providing VO Services:
(1) Integration from Other Sources
● Negotiate service level agreements
● Delegate and deploy capabilities/services
● Provision to deliver defined capability
● Configure environment
● Host layered functions

[Figure labels: Community A … Community Z]
22

Virtualizing Existing Services into a VO

● Establish service agreement with service
◆ E.g., WS-Agreement
● Delegate use to VO user

[Figure labels: VO Admin; VO User A; VO User B; Existing Services]
23

Deploying New Services

● Allocate/provision
● Configure
● Initiate activity
● Monitor activity
● Control activity

[Figure labels: Client; Policy; Activity; Environment; Interface; Resource provider]


24

Activities Can Be Nested

[Figure labels: Client (×3); Policy; Environment; Interface; Resource provider]


25

Providing VO Services:
(2) Coordination & Composition
● Take a set of provisioned services …
… & compose to synthesize new behaviors

● This is traditional service composition


◆ But must also be concerned with emergent
behaviors, autonomous interactions
◆ See the work of the agent & PlanetLab
communities

“Brain vs. Brawn: Why Grids and Agents Need Each Other,"
Foster, Kesselman, Jennings, 2004.
26

The Globus-Based LIGO Data Grid

LIGO Gravitational Wave Observatory

[Map with sites labeled: Birmingham, Cardiff, AEI/Golm]

● Replicating >1 Terabyte/day to 8 sites
● >40 million replicas so far
● MTBF = 1 month

www.globus.org/solutions
27

Data Replication Service

● Pull “missing” files to a storage system

[Figure: a list of required files is given to the Data Replication Service. Data Location components: Replica Location Index, Local Replica Catalog. Data Movement components: Reliable File Transfer Service, GridFTP. The catalog, index, and GridFTP components appear at both ends of the transfer]

“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
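
The "pull missing files" idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the DRS implementation: the local replica catalog is modeled as a Python set, the replica location index as a dictionary from logical file name to a list of sites, and `transfer` is a hypothetical callable standing in for a reliable file transfer.

```python
def plan_replication(required, local_catalog, replica_index):
    """Decide which required files are missing locally and where to fetch them from.

    Returns (transfers, unavailable): transfers is a list of (name, source_site),
    unavailable lists names with no known replica.
    """
    transfers, unavailable = [], []
    for name in required:
        if name in local_catalog:
            continue                      # already present, nothing to pull
        sources = replica_index.get(name, [])
        if sources:
            transfers.append((name, sources[0]))  # naive choice: first known replica
        else:
            unavailable.append(name)
    return transfers, unavailable


def replicate(required, local_catalog, replica_index, transfer, local_site):
    """Pull missing files, then register the new replicas locally and in the index."""
    transfers, unavailable = plan_replication(required, local_catalog, replica_index)
    for name, source in transfers:
        transfer(source, local_site, name)         # stand-in for a reliable transfer
        local_catalog.add(name)                    # update the local replica catalog
        replica_index.setdefault(name, []).append(local_site)
    return unavailable
```

Registering the new replica only after the transfer returns keeps the catalog from advertising files that never arrived.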
28

Composing Resources …
Composing Services

Step                      Result
Deploy service            GridFTP, LRC, GridFTP, DRS (VO Services)
Deploy container          JVM
Deploy virtual machine    VM, VM
Deploy hypervisor/OS      Hypervisor/OS
Procure hardware          Physical machine

Provisioning, management, and monitoring at all levels


29

Community Commons
● What capabilities are available to VO?
◆ Membership changes, state changes
● Require mechanisms to aggregate and update VO information
● Monitoring and discovery

[Figure: aggregators (A) build VO-specific indexes from information sources (S); annotated with the age of information, with labels MORE and FRESH]
30

Forming & Operating


Scientific Communities
● Define VO membership and roles, & enforce laws and
community standards
◆ I.e., policy
● Build, buy, operate, & share community infrastructure
◆ Data, programs, services, computing, storage,
instruments
◆ Service-oriented architecture
● Define and perform collaborative work
◆ Use shared infrastructure, roles, & policy
◆ Manage community workflow
31

Collaborative Work

[Figure: work laid out over time as “What I Did” (executed), “What I Am Doing” (executing), and “What I Want to Do” (executable, not yet executable); operations: Query, Edit, Schedule; Execution environment]
32

Managing Collaborative Work


● Process as “workflow,” at different scales, e.g.:
◆ Run 3-stage pipeline
◆ Process data flowing from expt over a year
◆ Engage in interactive analysis
● Need to keep track of:
◆ What I want to do (will evolve with new knowledge)
◆ What I am doing now (evolve with system config.)
◆ What I did (persistent; a source of information)

[Figure: Workflow Template → Workflow Generation → Abstract Workflow → Workflow Refinement → Workflow with executable nodes → Execution Environment]
33

Trident: The GriPhyN Virtual Data System

[Figure in three stages.
 Workflow spec: VDL Program, Virtual Data catalog, Virtual Data Workflow Generator, Abstract workflow.
 Create Execution Plan: Job Planner, Job Cleanup, Local planner, Statically Partitioned DAG, Dynamically Planned DAG.
 Grid Workflow Execution: DAGman DAG, DAGman & Condor-G]
34

Workflow Generation
● Given: desired result and constraints
◆ desired result (high-level, metadata description)
◆ application components
◆ resources in the Grid (dynamic, distributed)
◆ constraints & preferences on solution quality
● Find: an executable job workflow
◆ A configuration that generates the desired result
◆ A specification of resources to be used
◆ Sequence of operations: create agreement, move
data, request operation
● May create workflow incrementally as information
becomes available

"Mapping Abstract Complex Workflows onto Grid Environments," Deelman, Blythe, Gil,
Kesselman, Mehta, Vahi, Arbree, Cavanaugh, Blackburn, Lazzarini, Koranda,
2003.
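
A toy version of the "find an executable job workflow" step is sketched below. It is an illustration only, not the planner described in the cited paper: the abstract workflow format (a dictionary of tasks with `inputs`, `outputs`, and `after` lists) and the `choose_site` callback are assumptions made for this example, and agreement creation is omitted.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def make_executable(abstract, site_of_data, choose_site):
    """Turn an abstract workflow into an ordered list of concrete operations.

    abstract:     task -> {"inputs": [...], "outputs": [...], "after": [...]}
    site_of_data: file -> site where the file initially resides
    choose_site:  task -> site selected to run the task (resource selection policy)
    """
    # Order tasks so that every task runs after the tasks it depends on.
    order = TopologicalSorter(
        {task: spec["after"] for task, spec in abstract.items()}
    ).static_order()

    location = dict(site_of_data)
    plan = []
    for task in order:
        spec = abstract[task]
        site = choose_site(task)
        for f in spec["inputs"]:
            if location[f] != site:
                # Insert a data movement step before the computation.
                plan.append(("move", f, location[f], site))
                location[f] = site
        plan.append(("run", task, site))
        for f in spec["outputs"]:
            location[f] = site
    return plan
```

Because `choose_site` is called per task, the same function could be invoked incrementally as resource information becomes available, in the spirit of the last bullet above.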
35

Seismic Hazard Curve

[Figure: hazard curve.
 Y axis: annual frequency of exceedance (exceeded every year; exceeded 1 time in 10, 100, 1000, and 10,000 years).
 X axis: Ground Motion – Peak Ground Acceleration (0.1 to 0.6).
 Annotations: ground motion that will be exceeded every year; ground motion that a person can expect to be exceeded during their lifetime; typical design for buildings (10% probability of exceedance in 50 years); typical design for hospitals; typical design for nuclear power plant; Carl’s house during Northridge; minor damage; moderate damage]
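
The link between a design level such as "10% probability of exceedance in 50 years" and the annual frequency axis can be worked out with a short calculation. The sketch below assumes exceedances follow a Poisson process with a constant annual rate, which is a common simplification and an assumption of this example rather than something stated on the slide; the function names are hypothetical.

```python
import math


def annual_rate(prob, years):
    """Annual exceedance rate giving probability `prob` of at least one exceedance in `years`."""
    return -math.log(1.0 - prob) / years


def prob_of_exceedance(rate, years):
    """Probability of at least one exceedance in `years` for a constant annual rate."""
    return 1.0 - math.exp(-rate * years)


building_rate = annual_rate(0.10, 50)          # about 0.0021 per year
building_return_period = 1.0 / building_rate   # about 475 years

# Under the same assumption, a level "exceeded 1 time in 100 years" has
# roughly a 39% chance of being exceeded at least once in 50 years.
p_100yr_in_50 = prob_of_exceedance(1.0 / 100.0, 50)
```

Under this assumption the typical building design level sits between the "1 time in 100 years" and "1 time in 1000 years" marks on the curve.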
36

SCEC Cybershake
● Calculate hazard curves by generating synthetic seismograms from estimated rupture forecast

[Figure labels: Strain Green Tensor; Rupture Forecast; Synthetic Seismogram; Spectral Acceleration; Hazard Curve; Hazard Map]
37

Cybershake on the SCEC VO

[Figure labels: Provenance Catalog; Data Catalog; Workflow Scheduler/Engine; VO Service Catalog; VO Scheduler; SCEC Storage; TeraGrid Storage; TeraGrid Compute]
38
Some uses of Pegasus

[Chart: number of jobs per day (23 days), 261,823 jobs total; number of CPU hours per day, 15,706 hours total (1.8 years). Series: JOBS, HRS. Logarithmic Y axis from 1 to 100,000; X axis dates from 10/19 to 11/10. Runs on the TeraGrid in 2005]

SCEC is using Pegasus among other tools to generate hazard maps for the LA area
39

Summary (1):
Community Services
● Community roll, city hall, permits, licensing & police force
◆ Assertions, policy, attribute & authorization services
● Directories, maps
◆ Information services
● City services: power, water, sewer
◆ Deployed services
● Shops, businesses
◆ Composed services
● Day-to-day activities
◆ Workflows, visualization
● Tax board, fees, economic considerations
◆ Barter, planned economy, eventually markets
40

Summary (2)
● Community-based science will be the norm
◆ Requires collaborations across sciences, including computer science
● Many different types of communities
◆ Differ in coupling, membership, lifetime, size
● Must think beyond science stovepipes
◆ Increasingly the community infrastructure will become the
scientific observatory
● Scaling requires a separation of concerns
◆ Providers of resources, services, content
● Small set of fundamental mechanisms required to build
communities
41

The Globus Toolkit


● Background
● Globus Toolkit
● Future directions
● Related tools
● Opportunities for collaboration
42

On April 29, 2005 the


Globus Alliance released
the finest version of the
Globus Toolkit to date!

Don’t take our word for it!


Read the UK eScience Evaluation of GT4
www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf
(Reachable from www.globus.org, under “News”)
43

The Role of the Globus Toolkit


● A collection of solutions to problems that come
up frequently when building collaborative
distributed applications
● Heterogeneity
◆ A focus, in particular, on overcoming heterogeneity
for application developers
● Standards
◆ We capitalize on and encourage use of existing
standards (IETF, W3C, OASIS, GGF)
◆ GT also includes reference implementations of
new/proposed standards in these organizations
44

The Application-Infrastructure Gap

Dynamic
and/or
Distributed
Applications

Shared Distributed Infrastructure


45

Bridging the Gap:
Grid Infrastructure
● Service-oriented applications
◆ Wrap applications as services
◆ Compose applications into workflows
● Service-oriented Grid infrastructure
◆ Provision physical resources to support application workloads

[Figure labels: Users; Composition; Workflows; Invocation; Appln Service (×2); Provisioning]
46

Layers in the Grid


47
A Typical eScience Use of Globus:
Network for Earthquake Eng. Simulation

Links instruments, data, computers, people
48

Without the Globus Toolkit

[Figure: component diagram in four columns:
 Users work with client applications | Application services organize VOs & enable access to other services | Collective services aggregate &/or virtualize resources | Resources implement standard access & management interfaces.
 Components: Web Browser, Simulation Tool, Data Viewer Tool, Chat Tool, Web Portal, Registration Service, Telepresence Monitor, Credential Repository, Certificate authority, Data Catalog, Compute Server A, Compute Server B, Camera (×2), Database service C, Database service D, Database service E]

Component source         Count
Application Developer    10
Off the Shelf            12
Globus Toolkit           0
Grid Community           0
49

With the Globus Toolkit

[Figure: component diagram in four columns:
 Users work with client applications | Application services organize VOs & enable access to other services | Collective services aggregate &/or virtualize resources | Resources implement standard access & management interfaces.
 Components: Web Browser, Simulation Tool, Data Viewer Tool, CHEF, CHEF Chat Teamlet, Telepresence Monitor, Globus Index Service, Globus MCS/RLS, MyProxy, Certificate Authority, Globus GRAM (×2), Compute Server (×2), Camera (×2), Globus DAI (×3), Database service (×3)]

Component source         Count
Application Developer    2
Off the Shelf            9
Globus Toolkit           4
Grid Community           4
50

The Globus Toolkit:


“Standard Plumbing” for the Grid
● Not turnkey solutions, but building blocks & tools
for application developers & system integrators
◆ Some components (e.g., file transfer) go farther than
others (e.g., remote job submission) toward end-user
relevance
● Easier to reuse than to reinvent
◆ Compatibility with other Grid systems comes for free
● Today the majority of the GT public interfaces are
usable by application developers and system
integrators
◆ Relatively few end-user interfaces
◆ In general, not intended for direct use by end users
(scientists, engineers, marketing specialists)
51

Globus is Grid Infrastructure


● Software for Grid infrastructure
◆ Service-enable new & existing resources
◆ E.g., GRAM on computer, GridFTP on storage system,
custom application service
◆ Uniform abstractions & mechanisms
● Tools to build applications that exploit Grid
infrastructure
◆ Registries, security, data management, …
● Open source & open standards
◆ Each empowers the other
● Enabler of a rich tool & service ecosystem
52

An eBusiness Use of Globus:
SAP Demonstration @ GlobusWorld
● 3 Globus-enabled applns:
◆ CRM: Internet Pricing Configurator (IPC)
◆ CRM: Workforce Management (WFM)
◆ SCM: Advanced Planner & Optimizer (APO)
● Applications modified to:
◆ Adjust to varying demand & resources
◆ Use Globus to discover & provision resources

[Figure: SAP AG R/3 Internet Pricing & Configurator (IPC). Web browsers / batch processes (typically several thousand requests) send a price query request (1) to the IPC Dispatcher; the request is delegated (2) to IPC Servers; the response (3) is a pricelist depending on time, discount, number of items, …]
53

The Globus Toolkit is


a Collection of Components
● A set of loosely-coupled components, with:
◆ Services and clients
◆ Libraries
◆ Development tools
● GT components are used to build Grid-based
applications and services
◆ GT can be viewed as a Grid SDK
● GT components can be categorized across two different
dimensions
◆ By broad domain area
◆ By protocol support
54

Our Goals for GT4


● Usability, reliability, scalability, …
◆ Web service components have quality equal or
superior to pre-WS components
◆ Documentation at acceptable quality level
● Consistency with latest standards (WS-*, WSRF,
WS-N, etc.) and Apache platform
◆ WS-I Basic Profile compliant
◆ WS-I Basic Security Profile compliant
● New components, platforms, languages
◆ And links to larger Globus ecosystem
55

GT Domain Areas
● Core runtime
◆ Infrastructure for building new services
● Security
◆ Apply uniform policy across distinct systems
● Execution management
◆ Provision, deploy, & manage services
● Data management
◆ Discover, transfer, & access large data
● Monitoring
◆ Discover & monitor dynamic services
56

Globus Toolkit:
Open Source Grid Infrastructure
Globus Toolkit v4 (www.globus.org)

● Security: Authentication Authorization, Community Authorization, Delegation, Credential Mgmt
● Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Replica Location, Data Replication
● Execution Mgmt: Grid Resource Allocation & Management, Workspace Management, Community Scheduling Framework, Grid Telecontrol Protocol
● Info Services: Index, Trigger, WebMDS
● Common Runtime: Java Runtime, C Runtime, Python Runtime
57

GT Protocols
● Web service protocols
◆ WSDL, SOAP

◆ WS Addressing, WSRF, WSN

◆ WS Security, SAML, XACML

◆ WS-Interoperability profile

● Non Web service protocols


◆ Standards-based, such as GridFTP

◆ Custom
58

“Stateless” vs. “Stateful” Services


[Figure: Client calls move (A to B) on FileTransferService]

● Without state, how does client:


◆ Determine what happened (success/failure)?
◆ Find out how many files completed?
◆ Receive updates when interesting events arise?
◆ Terminate a request?
● Few useful services are truly “stateless”, but WS
interfaces alone do not provide built-in support for state
59

FileTransferService
(without WSRF)

[Figure: Client calls move (A to B) : transferID, whatHappen, tellMeWhen, and cancel on FileTransferService, which holds the state]

● Developer reinvents wheel for each new service
◆ Custom management and identification of state: transferID
◆ Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen)
◆ Custom lifetime operation (cancel)
60

WSRF in a Nutshell
● Service
● State representation
◆ Resource
◆ Resource Property
● State identification
◆ Endpoint Reference
● State Interfaces
◆ GetRP, QueryRPs, GetMultipleRPs, SetRP
● Lifetime Interfaces
◆ SetTerminationTime
◆ ImmediateDestruction
● Notification Interfaces
◆ Subscribe
◆ Notify
● ServiceGroups

[Figure: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy over Resources with RPs, each addressed by an EPR]
61

FileTransferService (w/ WSRF)

[Figure: Client calls createResource (A to B) : EPR on FileTransferService, then getRP, queryRPs, and destroy on the Transfer resource and its RPs]

● Developer specifies custom method to createResource and leaves the rest to WSRF standards:
◆ State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR)
◆ State inspected by standard interfaces (GetRP, QueryRPs)
◆ Lifetime management by standard interfaces (Destroy)
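
The division of labor described above can be illustrated with a small in-process model. This is a conceptual sketch only: it is not the GT4 or WSRF API, it involves no SOAP or WSDL, and the names `ResourceHome` (borrowed loosely from the diagrams in this deck), `create`, `get_rp`, `query_rps`, `set_termination_time`, and `destroy` are illustrative choices. The point is that only `create_resource` is written by the service developer, while inspection and lifetime operations are generic.

```python
import time
import uuid


class ResourceHome:
    """Generic state management shared by all services in this sketch."""

    def __init__(self):
        self._resources = {}

    def create(self, properties, lifetime_s=None):
        epr = str(uuid.uuid4())  # stands in for an Endpoint Reference
        expires = None if lifetime_s is None else time.time() + lifetime_s
        self._resources[epr] = {"rps": dict(properties), "expires": expires}
        return epr

    def _live(self, epr):
        res = self._resources[epr]  # KeyError if unknown or destroyed
        if res["expires"] is not None and time.time() >= res["expires"]:
            del self._resources[epr]  # lifetime elapsed: treat as destroyed
            raise KeyError(epr)
        return res

    def get_rp(self, epr, name):
        return self._live(epr)["rps"][name]

    def set_rp(self, epr, name, value):
        self._live(epr)["rps"][name] = value

    def query_rps(self, epr, predicate):
        """Return the resource properties whose (name, value) satisfy predicate."""
        return {k: v for k, v in self._live(epr)["rps"].items() if predicate(k, v)}

    def set_termination_time(self, epr, when):
        self._live(epr)["expires"] = when

    def destroy(self, epr):
        del self._resources[epr]


class FileTransferService(ResourceHome):
    """The only service-specific operation is resource creation."""

    def create_resource(self, source, destination):
        return self.create({"source": source, "destination": destination,
                            "status": "pending", "filesCompleted": 0})
```

A client holding the returned identifier can ask "what happened" with `get_rp(epr, "status")` and cancel with `destroy(epr)`, without the service defining custom operations for either.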
62

Modeling State in Web Services

[Figure labels: Service requestor (e.g., user application); Factory Service (Create, State Address); Resource allocation; Stateful Entity; State inspection, Lifetime mgmt, Notifications; Register; Registry; Discovery; Authentication & Authorization are applied to all requests]

Interactions standardized using WSDL and SOAP


64

GT4 Web Services Runtime


● Supports both GT (GRAM, RFT, Delegation, etc.) & user-
developed services
● Redesign to enhance scalability, modularity,
performance, usability
● Leverages existing WS standards
◆ WS-I Basic Profile: WSDL, SOAP, etc.
◆ WS-Security, WS-Addressing
● Adds support for emerging WS standards
◆ WS-Resource Framework, WS-Notification
● Java, Python, & C hosting environments
◆ Java is standard Apache
65

GT4 WS Core in a Nutshell

● Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
● Operation Providers: prebuilt implementations of WSRF operations
● Notification implementation: Topics, TopicSet, Embedded Notification Consumer service
● Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)

[Figure: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy over Resources with RPs, each addressed by an EPR]
66

GT4 WS Core in a Nutshell

● Service Container: host multiple services in container; one JVM process
● …more details: based on AXIS service container, processes SOAP messages, ResourceContext extension

[Figure: a Service Container holding several Services, each with a ResourceHome, Resources with RPs addressed by EPRs, and the operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
67

GT4 WS Core in a Nutshell

● Secure Communication: Transport, Message, Conversation (Transport demonstrates best performance)
● Configurable Security Policies: Policy Information Points (PIPs), Policy Decision Points (PDP) -- chained
● Example authorization PDPs: GridMap, SAML implementations, XACML policies

[Figure: the Service Container from the previous slide with a PIP and a PDP placed in front of the Services]
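
A chained PIP/PDP evaluation can be sketched as below. This is an illustration of the general pattern, not the GT4 authorization framework: the function names, the request format, and the combination rule (default deny, any explicit deny wins, at least one permit required) are assumptions made for this example.

```python
PERMIT, DENY, NOT_APPLICABLE = "permit", "deny", "not_applicable"


def authorize(request, pips, pdps):
    """Run information points to collect attributes, then decision points in order."""
    attributes = dict(request)
    for pip in pips:
        attributes.update(pip(attributes))   # each PIP may add attributes

    permitted = False
    for pdp in pdps:
        decision = pdp(attributes)
        if decision == DENY:
            return False                     # any explicit deny ends the chain
        if decision == PERMIT:
            permitted = True
    return permitted                         # default deny if nobody permits


def gridmap_pip(gridmap):
    """PIP sketch: look up the local account mapped to the caller's subject name."""
    def pip(attributes):
        account = gridmap.get(attributes["subject"])
        return {"local_account": account} if account else {}
    return pip


def gridmap_pdp(attributes):
    """PDP sketch: permit only callers that were mapped to a local account."""
    return PERMIT if "local_account" in attributes else DENY


def read_only_pdp(attributes):
    """PDP sketch: deny anything but reads; stay silent otherwise."""
    return DENY if attributes["operation"] != "read" else NOT_APPLICABLE
```

Keeping information gathering (PIPs) separate from decisions (PDPs) lets a deployment swap in a different decision source without changing how attributes are collected.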
68

GT4 WS Core in a Nutshell

● Deploy Service Container “standalone” or within Apache Tomcat

[Figure: the Service Container (Services, PIP, PDP, ResourceHomes, Resources, EPRs) hosted inside Apache Tomcat, with WorkManager, DB Conn Pool, and JNDI Directory]


69

GT4 Web Services Runtime

[Figure: layered stack, top to bottom:
 User Applications
 Custom Web Services | Custom WSRF Web Services | GT4 WSRF Web Services | Registry | Administration
 GT4 Container
 WS-Addressing, WSRF, WS-Notification
 WSDL, SOAP, WS-Security]


70

GetRP Test
Distributed client and service on same LAN (times in milliseconds)

[Chart: GetRP times for GT4-Java, GT4-C, pyGridWare, WSRF::Lite, and WSRF.NET under No Security, X509 Signing, and HTTPS. Bar values as extracted (assignment to bars not recoverable): 149.67, 25.57, 181.96, 17.1, 140.5, 55.6, 10.05, 81.39, 8.23, N/A, 2.34, 14.8, 11.46, 12.91, 2.85]
71

GT4 WS Core Performance

(1) Message-level security (times in milliseconds)

Operation   GT4 Java   GT4 C   GT4 Python   WSRF.NET
GetRP       181.96     14.77   140.50       81.39
SetRP       182.04     14.99   142.21       82.48
CreateR     188.46     14.98   132.26       96.22
DestroyR    182.03     15.76   136.12       86.89
Notify      219.51     N/A     244.93       101.57

(2) Transport-level security (times in milliseconds)

Operation   GT4 Java   GT4 C   GT4 Python   WSRF.NET
GetRP       11.46      2.85    149.67       12.91
SetRP       11.47      2.86    150.79       12.3
CreateR     18.00      2.82    132.60       20.84
DestroyR    14.92      2.71    149.21       16.05
Notify      29.26      9.67    169.07       45.0

“WSRF/WSNs Compared,” HPDC 2005.
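
To make the gap between the two security modes concrete, the GetRP rows of the two tables can be divided directly. The snippet below only restates the slide's numbers; the variable names are arbitrary.

```python
# GetRP times in milliseconds, copied from the two tables above.
message_ms = {"GT4 Java": 181.96, "GT4 C": 14.77, "GT4 Python": 140.50, "WSRF.NET": 81.39}
transport_ms = {"GT4 Java": 11.46, "GT4 C": 2.85, "GT4 Python": 149.67, "WSRF.NET": 12.91}

# Ratio > 1 means transport-level security was faster than message-level security.
speedup = {name: message_ms[name] / transport_ms[name] for name in message_ms}

for name, ratio in speedup.items():
    print(f"{name}: message-level / transport-level = {ratio:.1f}x")
```

By these figures the ratio is largest for GT4 Java (roughly 16x), while GT4 Python shows a ratio slightly below 1, meaning transport-level security did not help in that measurement.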
73

Globus Security
● Control access to shared services
◆ Address autonomous management, e.g.,
different policy in different work-groups
● Support multi-user collaborations
◆ Federate through mutually trusted services
◆ Local policy authorities rule
● Allow users and application communities to set
up dynamic trust domains
◆ Personal/VO collection of resources working
together based on trust of user/VO
74

Virtual Organization (VO) Concept

[Figure: Virtual Community C drawn across Organization A and Organization B.
 Virtual Community C: Person A (Principal Investigator), Person B (Administrator), Person D (Researcher), Person E (Researcher), Compute Server C1', File server F1 (disk A).
 Organization A: Person A (Faculty), Person B (Staff), Person C (Student), Compute Server C1, Compute Server C2.
 Organization B: Person D (Staff), Person E (Faculty), Person F (Faculty), File server F1 (disks A and B), Compute Server C3]

● VO for each application or workload
● Carve out and configure resources for a particular use and set of users
75

GT4 Security

[Figure labels: Users; MyProxy; KCA; VO; CAS or VOMS issuing SAML or X.509 ACs; Rights; Rights’; Compute Center; Access; Services (running on user’s behalf); SSL/WS-Security with Proxy Certificates; Authz Callout: SAML, XACML; Local policy on VO identity or attribute authority]
76

GT4 Security
● Public-key-based authentication
● Extensible authorization framework based on Web
services standards
◆ SAML-based authorization callout
● As specified in GGF OGSA-Authz WG
◆ Integrated policy decision engine
● XACML policy language, per-operation policies, pluggable

● Credential management service


◆ MyProxy (One time password support)
● Community Authorization Service
● Standalone delegation service
77

GT4’s Use of Security Standards

[Figure: table of security standards per mechanism, with column annotations “Supported, but slow”, “Supported, but insecure”, and “Fastest, so default”]
78

GT-XACML Integration
● eXtensible Access Control Markup Language
◆ OASIS standard, open source implementations
● XACML: sophisticated policy language
● Globus Toolkit ships with XACML runtime
◆ Included in every client and server built on GT
◆ Turned on through configuration
● … that can be called transparently from runtime and/or explicitly from application …
● … and we use the XACML “model” for our Authz Processing Framework
79

GT Authorization Framework
80

Other Security Services Include …


● MyProxy
◆ Simplified credential management
◆ Web portal integration
◆ Single-sign-on support
● KCA & kx.509
◆ Bridging into/out-of Kerberos domains
● SimpleCA
◆ Online credential generation
● PERMIS
◆ Authorization service callout
82

GT4 Data Management


● Stage/move large data to/from nodes
◆ GridFTP, Reliable File Transfer (RFT)
◆ Alone, and integrated with GRAM
● Locate data of interest
◆ Replica Location Service (RLS)
● Replicate data for performance/reliability
◆ Distributed Replication Service (DRS)
● Provide access to diverse data sources
◆ File systems, parallel file systems, hierarchical storage:
GridFTP
◆ Databases: OGSA DAI
83

GridFTP in GT4

● 100% Globus code
◆ No licensing issues
◆ Stable, extensible
● IPv6 Support
● XIO for different transports
● Striping → multi-Gb/sec wide area transport
◆ 27 Gbit/s on 30 Gbit/s link
● Pluggable
◆ Front-end: e.g., future WS control channel
◆ Back-end: e.g., HPSS, cluster file systems
◆ Transfer: e.g., UDP, NetBLT transport

[Chart: Bandwidth vs. Striping, disk-to-disk on TeraGrid. Y axis: Bandwidth (Mbps), 0 to 20000. X axis: Degree of Striping, 0 to 70. Series: # Stream = 1, 2, 4, 8, 16, 32]
84
Reliable File Transfer:
Third Party Transfer
● Fire-and-forget transfer
● Web services interface
● Many files & directories
● Integrated failure recovery
● Has transferred 900K files

[Figure: an RFT Client sends SOAP messages to the RFT Service and optionally receives notifications. The RFT Service controls two GridFTP Servers; each has a Protocol Interpreter and Master DSI connected over an IPC Link to a Slave DSI and IPC Receiver, with Data Channels between the two servers]
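
"Fire-and-forget" with "integrated failure recovery" can be sketched as a loop that retries failed copies and records progress so that a restart skips completed work. This is an illustration of the idea only, not the RFT implementation: the JSON state file, the `copy` callable, the use of `OSError` to signal a failed transfer, and the retry limit are all assumptions of this example.

```python
import json
import os


def run_transfers(pairs, copy, state_path, max_attempts=3):
    """Copy each (source, destination) pair, retrying failures and persisting progress.

    Returns the list of pairs that still failed after max_attempts tries.
    """
    done = set()
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = set(json.load(f))      # resume: skip transfers already completed

    failed = []
    for src, dst in pairs:
        key = f"{src}->{dst}"
        if key in done:
            continue
        for _ in range(max_attempts):
            try:
                copy(src, dst)            # stand-in for a third-party transfer
            except OSError:
                continue                  # transient failure: try again
            done.add(key)
            with open(state_path, "w") as f:
                json.dump(sorted(done), f)  # checkpoint after every success
            break
        else:
            failed.append((src, dst))     # all attempts exhausted
    return failed
```

Because progress is written after each successful copy, the client that submitted the request does not need to stay connected for the work to finish or to be resumed.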
85

Replica Location Service

● Identify location of files via logical to physical name map
● Distributed indexing of names, fault tolerant update protocols
● GT4 version scalable & stable
● Managing ~40 million files across ~10 sites

[Figure: Local catalogs sending updates to Index nodes]

Local DB   Update send (secs)   Bloom filter (secs)   Bloom filter (bits)
10K        <1                   2                     1 M
1M         2                    24                    10 M
5M         7                    175                   50 M
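
The table refers to Bloom filters, which let a local catalog summarize its logical names compactly for an index. The sketch below is a generic Bloom filter, not the RLS implementation: the class name, the choice of SHA-256, and the default of three hash functions are assumptions of this example. Reading the table as about 10 M bits for 1 M names suggests roughly 10 bits per name, which is the sizing used in the usage lines.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive several bit positions per name by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name):
        for p in self._positions(name):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, name):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(name))


# A local catalog with 1,000 logical names, sized at about 10 bits per name.
names = [f"lfn://example/file{i}" for i in range(1000)]
summary = BloomFilter(num_bits=10 * len(names))
for n in names:
    summary.add(n)
```

An index holding such a summary can answer "might this site have the file?" without storing every name; a positive answer still has to be confirmed with the local catalog.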
87

Execution Management (GRAM)


● Common WS interface to schedulers
◆ Unix, Condor, LSF, PBS, SGE, …
● More generally: interface for process
execution management
◆ Lay down execution environment
◆ Stage data
◆ Monitor & manage lifecycle
◆ Kill it, clean up
● A basis for application-driven provisioning
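The lifecycle described above (lay down the environment, stage data, run, monitor, kill, clean up) is naturally modeled as a small state machine. The sketch below is illustrative only: the state names and transition table are assumptions chosen to mirror the bullets, not the authoritative GRAM job state model.

```python
# Hypothetical job states and the transitions allowed between them.
ALLOWED = {
    "Unsubmitted": {"StageIn"},
    "StageIn": {"Pending"},
    "Pending": {"Active"},
    "Active": {"StageOut"},
    "StageOut": {"CleanUp"},
    "CleanUp": {"Done"},
}
TERMINAL = {"Done", "Failed"}


class Job:
    """Tracks one job through staging, execution, and cleanup."""

    def __init__(self, executable):
        self.executable = executable
        self.state = "Unsubmitted"
        self.history = ["Unsubmitted"]

    def _enter(self, new_state):
        self.state = new_state
        self.history.append(new_state)

    def advance(self, new_state):
        """Move to the next state, rejecting transitions that skip a step."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self._enter(new_state)

    def kill(self):
        """Kill the job: clean up first, then mark it as failed."""
        if self.state in TERMINAL:
            return
        if self.state != "CleanUp":
            self._enter("CleanUp")
        self._enter("Failed")


if __name__ == "__main__":
    job = Job("/bin/hostname")
    for step in ["StageIn", "Pending", "Active"]:
        job.advance(step)
    job.kill()
    print(job.history)
```

A client that monitors the job only needs to observe `state` (or subscribe to its changes); the scheduler-specific details stay behind the common interface.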
88

GT4 WS GRAM
● 2nd-generation WS implementation optimized
for performance, flexibility, stability, scalability
● Streamlined critical path
◆ Use only what you need
● Flexible credential management
◆ Credential cache & delegation service
● GridFTP & RFT used for data operations
◆ Data staging & streaming output
◆ Eliminates redundant GASS code
89

GT4 WS GRAM Architecture


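[Figure: Service host(s) and compute element(s). A client invokes job functions on the GRAM services hosted in the GT4 Java container and delegates a credential to the Delegation service. The GRAM services perform local job control through sudo and a GRAM adapter to the local scheduler, which runs the user job on the compute element; the SEG reports job events back. For staging, GRAM sends a transfer request to RFT, which uses FTP control to drive GridFTP servers; FTP data flows between the compute element and remote storage element(s).]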
90

WS GRAM Performance
● Time to submit a basic GRAM job
◆ Pre-WS GRAM: < 1 second
◆ WS GRAM: 2 seconds
● Concurrent jobs
◆ Pre-WS GRAM: 300 jobs
◆ WS GRAM: 32,000 jobs
● Various studies are underway to test latest
software
91

Open Science Grid

● 50 sites (15,000 CPUs) & growing
● 400 to >1000 concurrent jobs
● Many applications + CS experiments; includes long-running production operations
● Up since October 2003; few FTEs central ops
[Figure: Jobs (2004)]

www.opensciencegrid.org
92

Embedded
Client-side
Resource Management
[Figure: VO admin and VO users, delegation (Deleg) and GRAM interfaces, a headnode resource manager hosting the VO scheduler and other services, cluster resource managers running VO jobs, and monitoring and control paths.]

• VO admin delegates credentials to be used by downstream VO services.
• VO admin starts the required services.
• VO jobs come in directly from the upstream VO users.
• VO job gets forwarded to the appropriate resource using the VO credentials.
• Computational job started for VO.
93

Globus Toolkit:
Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 component overview, grouped into Security, Data Mgmt, Execution Mgmt, Info Services, and Common Runtime]
94

Monitoring and Discovery


● “Every service should be monitorable and
discoverable using common mechanisms”
◆ WSRF/WSN provides those mechanisms
● A common aggregator framework for collecting
information from services, thus:
◆ MDS-Index: XPath queries, with caching
◆ MDS-Trigger: perform action on condition
◆ (MDS-Archiver: XPath on historical data)
● Deep integration with Globus containers & services:
every GT4 service is discoverable
◆ GRAM, RFT, GridFTP, CAS, …
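A toy version of the aggregator idea: services register entries in an index document, clients query it with XPath, and a trigger checks a condition over the same data. This sketch uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath; the element names (`Index`, `Entry`, `FreeCPUs`), host names, and helper functions are invented for illustration and are not the MDS schema or API.

```python
import xml.etree.ElementTree as ET


def register(index, service, host, properties):
    """Add one service entry, with its properties as child elements."""
    entry = ET.SubElement(index, "Entry", service=service, host=host)
    for name, value in properties.items():
        ET.SubElement(entry, name).text = str(value)
    return entry


def hosts_for_service(index, service):
    """Index-style query: XPath attribute predicate over the aggregate."""
    return [entry.get("host") for entry in index.findall(f"./Entry[@service='{service}']")]


def hosts_below(index, property_name, threshold):
    """Trigger-style check: hosts whose numeric property is under a threshold.

    ElementTree's XPath subset has no numeric comparison, so the
    comparison itself is done in Python.
    """
    matches = []
    for entry in index.findall(f"./Entry[{property_name}]"):
        if float(entry.findtext(property_name)) < threshold:
            matches.append(entry.get("host"))
    return matches


if __name__ == "__main__":
    index = ET.Element("Index")
    register(index, "GRAM", "node1.example.org", {"FreeCPUs": 12})
    register(index, "GRAM", "node2.example.org", {"FreeCPUs": 0})
    register(index, "GridFTP", "data1.example.org", {"OpenConnections": 4})
    print(hosts_for_service(index, "GRAM"))
    print(hosts_below(index, "FreeCPUs", 1))
```

Caching, which the slide attributes to MDS-Index, is left out to keep the sketch short.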
95
GT4 Monitoring & Discovery
[Figure: Clients (e.g., WebMDS) query an MDS-Index in a GT4 container through WS-ServiceGroup registration and WSRF/WSN access. That index aggregates MDS-Index services in other GT4 containers, where in-container services (GridFTP, RFT, GRAM, user services) are registered automatically. An adapter with custom protocols brings in non-WSRF entities.]
96

Information Providers
● GT4 information providers collect
information from some system and make it
accessible as WSRF resource properties
● Growing number of information providers
◆ Nagios, SGE, LSF, PBS
● Many opportunities to build additional ones
◆ E.g., network monitoring, storage systems,
various sensors
97

Putting it Together: Data Replication
Service

● New capabilities can be built by:
◆ Creating new services using GT4 containers
◆ Composing/combining existing services
98
Reliable Wide Area Data
Replication
LIGO Gravitational Wave Observatory

[Figure: Map of replication sites, including Birmingham, Cardiff, and AEI/Golm]

● Replicating >1 Terabyte/day to 8 sites
● >30 million replicas so far
● MTBF = 1 month

www.globus.org/solutions
99

The Data Replication Service


● Tech Preview in GT4.0
● Based on the publication component of the
Lightweight Data Replicator system
◆ Developed by Scott Koranda from U. Wisconsin at
Milwaukee
● Function to locally replicate a set of files
◆ User identifies a set of desired files
◆ DRS uses RLS to discover file locations
◆ Uses RFT to create local replicas
◆ Registers new replicas in RLS
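The four steps above compose into a simple pipeline. In the sketch below the three services are stand-ins passed in as plain functions (`find_replicas` for RLS discovery, `transfer` for RFT, `register` for RLS registration); these names, the result strings, and the URL layout are assumptions for illustration, not the DRS interfaces.

```python
def replicate(logical_names, find_replicas, transfer, register, local_base):
    """Create local replicas of the requested files.

    Returns the list of stages entered and a per-file result.
    """
    stages = []
    results = {}

    # Stage 1: discover where each requested file currently lives.
    stages.append("discover")
    sources = {name: find_replicas(name) for name in logical_names}

    # Stage 2: transfer one existing replica of each file to the local site.
    stages.append("transfer")
    local_copies = {}
    for name, urls in sources.items():
        if not urls:
            results[name] = "no replica found"
            continue
        destination = f"{local_base}/{name}"
        try:
            transfer(urls[0], destination)
            local_copies[name] = destination
        except OSError as error:
            results[name] = f"transfer failed: {error}"

    # Stage 3: register the new local replicas so others can find them.
    stages.append("register")
    for name, destination in local_copies.items():
        register(name, destination)
        results[name] = "replicated"

    return stages, results


if __name__ == "__main__":
    catalog = {"f1": ["gsiftp://remote.example.org/data/f1"], "f2": []}
    registered = {}
    stages, results = replicate(
        ["f1", "f2"],
        find_replicas=lambda name: catalog.get(name, []),
        transfer=lambda source, destination: None,  # pretend the copy succeeded
        register=registered.__setitem__,
        local_base="file:///local",
    )
    print(stages, results, registered)
```

Keeping the services behind function parameters mirrors the composition argument of the deck: the replication logic only coordinates, while discovery, transfer, and registration are delegated to existing services.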
100

Motivation for DRS


● Need for higher-level data management services that
integrate lower-level Grid functionality
◆ Efficient data transfer (GridFTP, RFT)
◆ Replica registration and discovery (RLS)
◆ Eventually validation of replicas, etc.
● Goal is to generalize the custom data management
systems developed by several application communities
● Eventually to provide a suite of configurable high-level
data management services
● DRS is the first of these services
101

Relationship to
Other Globus Services
At requesting site, deploy:
● WS-RF Services
◆ Data Replication Service
◆ Delegation Service
◆ Reliable File Transfer Service
● Pre WS-RF Components
◆ Replica Location Service (Local Replica Catalog and Replica Location Index)
◆ GridFTP Server
[Figure: Local site. A Web service container hosts the Data Replication Service (Replicator resource), the Delegation Service (delegated credential), and the Reliable File Transfer Service (RFT resource), alongside the Local Replica Catalog, Replica Location Index, and GridFTP server.]
102

Service Composition Overview


[Figure: A Data Rep. client and a service container hosting the Data Replication, RFT, Delegation, and MDS services, together with the Replica Index, Replica Catalogs, and GridFTP Servers.]

WSRF Services:
• Data Replication
• Reliable File Transfer
• Delegation
• Information Services
Pre-WS Services:
• Replica Index
• Replica Catalog
• GridFTP Server
103

Data Replication Service


[Figure: Service composition diagram with the Replicator resource and its resource properties (RP)]

Data Replication: Coordinates replication by discovery, transfer, registration of replicas.
Replicator Resource: Represents state of replication request. Detailed state info for per-replica status.
Resource Properties:
• Status
• Stage
• Result
• Error Msg
• Count
104

Reliable File Transfer Service


[Figure: Service composition diagram with the RFT Transfer resource and its resource properties (RP)]

Reliable File Transfer: Coordinates transfer of files using GridFTP servers.
Transfer Resource: Represents state of transfer request. Detailed state info for per-file-transfer status.
Resource Properties:
• RequestStatus
• OverallStatus
• TotalBytes
• TotalTime
105

Delegation Service
[Figure: Service composition diagram with the Delegation service's Credential resource and its resource properties (RP)]

Delegation: Manages delegated credentials from client/user.
Credential: Represents the user's delegated credential as a stateful entity.
Resource Properties:
• None except standard lifetime RPs (CurrentTime, TerminationTime)
106

Information Services (MDS)


[Figure: Service composition diagram with the MDS Index resource and its resource properties (RP)]

MDS Information Services: Collects information from other Resources, may trigger events based on conditions.
Index: an index of monitored resources.
Trigger: an event / stimulus trigger (not pictured).
Resource Properties:
• ResourceCount
• Entry: EPR + data from other Resources
107

Pre-WS Services
[Figure: Service composition diagram with the pre-WS components outside the service container]

Replica Index: Index of replica information across catalogs.
Replica Catalog: Catalog of replica information, per-SE.
GridFTP Server: Transfer service based on GridFTP protocol.
108

Create Delegated Credential


[Figure: Service composition diagram; the client holds a proxy and receives the Credential EPR from the Delegation service]

• Create delegated credential resource
• Set termination time
• Credential EPR returned
• Initialize user proxy
109

Create Replicator Resource

[Figure: Service composition diagram; the client passes the Credential EPR and receives the Replicator EPR]

• Create Replicator resource
• Pass delegated credential EPR
• Set termination time
• Replicator EPR returned
• Access delegated credential resource
110

Monitor Replicator Resource


[Figure: Service composition diagram; the client subscribes to the Replicator resource, which is also registered in the MDS Index]

• Subscribe to ResourceProperty changes for "Status" RP and "Stage" RP
• Add Replicator resource to MDS Information service Index
• Periodically polls Replicator RP via GetRP or GetMultRP
• Conditions may trigger alerts or other actions (Trigger service not pictured)
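The two monitoring styles on this slide, subscription to property changes and periodic polling, can be contrasted in a short sketch. The class below is a stand-in for a resource with resource properties: the method names loosely echo the WSRF operations named in the deck, while the initial values and the callback signature are assumptions.

```python
class ReplicatorResource:
    """Toy resource whose properties can be subscribed to or polled."""

    def __init__(self):
        # Initial values are hypothetical placeholders.
        self._properties = {"Status": "Pending", "Stage": "discover"}
        self._subscribers = {}

    def subscribe(self, name, callback):
        """Push model: call callback(name, value) whenever the property changes."""
        self._subscribers.setdefault(name, []).append(callback)

    def set_property(self, name, value):
        previous = self._properties.get(name)
        self._properties[name] = value
        if value != previous:
            for callback in self._subscribers.get(name, []):
                callback(name, value)

    def get_resource_property(self, name):
        """Pull model: read one property."""
        return self._properties[name]

    def get_multiple_resource_properties(self, names):
        """Pull model: read several properties in one call."""
        return {name: self._properties[name] for name in names}


if __name__ == "__main__":
    resource = ReplicatorResource()
    seen = []
    resource.subscribe("Stage", lambda name, value: seen.append((name, value)))
    resource.set_property("Stage", "transfer")
    resource.set_property("Stage", "register")
    print(seen)
    print(resource.get_multiple_resource_properties(["Status", "Stage"]))
```

A client that needs prompt reaction would use the subscription path, while an index that aggregates many resources can poll at its own pace.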
111

Query Replica Information


[Figure: Service composition diagram; the Replicator queries the Replica Index and Replica Catalogs]

• Notification of "Stage" RP value changed to "discover"
• Replicator queries RLS Replica Index to find catalogs that contain desired replica information
• Replicator queries RLS Replica Catalog(s) to retrieve mappings from logical name to target name (URL)
112

Transfer Data
[Figure: Service composition diagram; the Replicator creates an RFT Transfer resource, which drives the GridFTP Servers]

• Notification of "Stage" RP value changed to "transfer"
• Create Transfer resource
• Pass credential EPR
• Set Termination Time
• Transfer resource EPR returned
• Access delegated credential resource
• Setup GridFTP Server transfer of file(s)
• Data transfer between GridFTP Server sites
• Periodically poll "ResultStatus" RP via GetRP
• When "Done", get state information for each file transfer
113

Register Replica Information


[Figure: Service composition diagram; the Replicator updates the local Replica Catalog, which updates the Replica Index]

• Notification of "Stage" RP value changed to "register"
• Replicator registers new file mappings in RLS Replica Catalog
• RLS Replica Catalog sends update of new replica mappings to the Replica Index
114

Client Inspection of State


[Figure: Service composition diagram; the client reads state from the Replicator resource]

• Notification of "Status" RP value changed to "Finished"
• Client inspects Replicator state information for each replication in the request
115

Resource Termination
[Figure: Service composition diagram with a time axis; the Credential, Transfer, and Replicator resources are removed]

• Termination time (set by client) expires eventually
• Resources destroyed (Credential, Transfer, Replicator)
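Lifetime management of this kind can be sketched as a table of resources with termination times plus a periodic sweep. The `ResourceHome` class below is a simplified stand-in written for this illustration; its methods and the use of plain numbers for time are assumptions, not the GT4 WS Core API.

```python
class ResourceHome:
    """Holds resources and destroys them once their termination time passes."""

    def __init__(self):
        self._resources = {}  # key -> (resource, termination_time)

    def create(self, key, resource, termination_time):
        self._resources[key] = (resource, termination_time)

    def set_termination_time(self, key, termination_time):
        """Clients can extend (or shorten) a resource's lifetime."""
        resource, _ = self._resources[key]
        self._resources[key] = (resource, termination_time)

    def find(self, key):
        return self._resources[key][0]

    def sweep(self, now):
        """Destroy every resource whose termination time is not after now.

        Returns the keys of the destroyed resources, sorted.
        """
        expired = sorted(
            key for key, (_, termination_time) in self._resources.items()
            if termination_time <= now
        )
        for key in expired:
            del self._resources[key]
        return expired


if __name__ == "__main__":
    home = ResourceHome()
    home.create("credential", object(), termination_time=100)
    home.create("transfer", object(), termination_time=100)
    home.create("replicator", object(), termination_time=100)
    home.set_termination_time("replicator", 200)  # client extends the lifetime
    print(home.sweep(now=150))
    print(home.sweep(now=250))
```

The design point is that cleanup does not depend on the client remembering to call Destroy: if the client disappears, its resources still expire.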
116

DRS: WSDL (PortType)


<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions name="Replication" …>

  <wsdl:portType name="ReplicatorPortType"
      wsrp:ResourceProperties="ReplicatorResourceProperties">
    <wsdl:operation name="createReplicator"> …
    <wsdl:operation name="start"> …
    <wsdl:operation name="stop"> …
    <wsdl:operation name="suspend"> …
    <wsdl:operation name="resume"> …
    <wsdl:operation name="findItems"> …
    <wsdl:operation name="SetTerminationTime"> …
    <wsdl:operation name="Destroy"> …
    <wsdl:operation name="QueryResourceProperties"> …
    <wsdl:operation name="GetMultipleResourceProperties"> …
    <wsdl:operation name="GetResourceProperty"> …
    <wsdl:operation name="Subscribe"> …
    <wsdl:operation name="GetCurrentMessage"> …
  </wsdl:portType>
</wsdl:definitions>
117

DRS: WSDL (RPs)


<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions name="Replication" …>

  <wsdl:portType name="ReplicatorPortType"
      wsrp:ResourceProperties="ReplicatorResourceProperties">
    <wsdl:operation name="createReplicator"> …
    <wsdl:operation name="start"> …
    <wsdl:operation name="stop"> …
    <wsdl:operation name="suspend"> …
    <wsdl:operation name="resume"> …
    <wsdl:operation name="findItems"> …
    <wsdl:operation name="SetTerminationTime"> …
    <wsdl:operation name="Destroy"> …
    <wsdl:operation name="QueryResourceProperties"> …
    <wsdl:operation name="GetMultipleResourceProperties"> …
    <wsdl:operation name="GetResourceProperty"> …
    <wsdl:operation name="Subscribe"> …
    <wsdl:operation name="GetCurrentMessage"> …
  </wsdl:portType>
  …
  <xsd:element name="ReplicatorResourceProperties">
    …
    <xsd:element name="status" …/>
    <xsd:element name="stage" …/>
    <xsd:element name="result" …/>
    <xsd:element name="errorMessage" …/>
    <xsd:element name="count" …/>
    <xsd:element name="Topic" …/>
    <xsd:element name="TopicExprDialect" …/>
    <xsd:element name="TerminationTime" …/>
    <xsd:element name="CurrentTime" …/>
    <xsd:element name="FixedTopicSet" …/>
    …
  </xsd:element>
</wsdl:definitions>
118

DRS: Stubs
[Figure: UML class diagram of the generated stubs, in three columns]

PortType
◆ java.rmi.Remote
◆ ReplicatorPortType: +createReplicator(), +start(), +stop(), +suspend(), +resume(), +findItems(), +setTerminationTime(), +destroy(), +queryRPs(), +getMultipleRPs(), +getRP(), +subscribe(), +getCurrentMsg()

Messages
◆ BaseRequestType
◆ CreateReplicator: -credentialEPR, -initialTerminationTime, -autostart
◆ CreateReplicatorResponse: -endpointReference
◆ RequestFileReplicationRequestType: -requestFileURL, -format
◆ BaseFaultType
◆ ReplicatorFaultType
◆ CreateReplicatorFaultType

Types
◆ ReplicationOptionsType
◆ TransferOptionsType: -concurrency, -binary, -blockSize, -tcpBufferSize, -...


119

DRS: Implementation
[Figure: UML class diagram relating WS Core classes to DRS classes, in three columns]

Service
◆ ReplicationServiceImpl: +createReplicator(), +otherCustomOps()
◆ GetRPProvider: +getRP()
◆ GetMultipleRPsProvider: +getMultipleRPs()
◆ QueryRPsProvider: +queryRPs()
◆ DestroyProvider: +destroy()
◆ SetTerminationTimeProvider: +setTerminationTime()
◆ SubscriptionProvider: +subscribe()

Resource
◆ ResourceHomeImpl: +find()
◆ ReplicatorHome: +create()
◆ ReplicatorResource: +getTopicSet(), +remove(), +create(), +otherCustomMethods()
◆ ReflectionResource, SimpleRPSet, RemoveCallback, TopicListAccessor, SimpleTopicSet

Backend
◆ DiscoverWork
◆ TransferWork
◆ RegisterWork


120

Development and Deployment Process

1. Define interface (WSDL, XSD)
2. Insert standard operations (WSDLPP)
3. Generate stubs
4. Develop service, resource, custom logic
5. Describe deployment (WSDD)
6. Configure JNDI properties
7. Build and package as GAR
8. Deploy to service container
121

The Globus Ecosystem


● Globus components address core issues relating to
resource access, monitoring, discovery, security,
data movement, etc.
◆ GT4 being the latest version
● A larger Globus ecosystem of open source and
proprietary components provides complementary
functionality
◆ A growing list of components
● These components can be combined to produce
solutions to Grid problems
◆ We’re building a list of such solutions
122

Many Tools Build on, or Can
Contribute to, GT4-Based Grids
● Condor-G, DAGman
● MPICH-G2
● GRMS
● Nimrod-G
● Ninf-G
● Open Grid Computing Env.
● Commodity Grid Toolkit
● GriPhyN Virtual Data System
● Virtual Data Toolkit
● GridXpert Synergy
● Platform Globus Toolkit
● VOMS
● PERMIS
● GT4IDE
● Sun Grid Engine
● PBS scheduler
● LSF scheduler
● GridBus
● TeraGrid CTSS
● NEES
● IBM Grid Toolbox
● …
123

Example Solutions
● Portal-based User Reg. System (PURSE)
● VO Management Registration Service
● Service Monitoring Service
● TeraGrid TGCP Tool
● Lightweight Data Replicator
● GriPhyN Virtual Data System
124

The Globus Commitment
to Open Source
● Globus was first established as an open
source project in 1996
● The Globus Toolkit is open source to:
◆ allow for inspection
● for consideration in standardization processes
◆ encourage adoption
● in pursuit of ubiquity and interoperability
◆ encourage contributions
● harness the expertise of the community

● The Globus Toolkit is distributed under the
(BSD-style) Apache License version 2
125

The Future:
Structure
● NSF Community Driven Improvement of Globus
Software (CDIGS) project
◆ 5 years of funding for GT enhancement
◆ Regular Globus roadmaps outlining plans
● GlobDev http://dev.globus.org
◆ Apache-like community development site
◆ Community governance of components
◆ “Globus Toolkit” & other related software
◆ Open for business early 2006
◆ “Globus Alliance” = “GlobDev committers”
126

The Future:
Content
● We now have a solid and extremely powerful Web
services base
● Next, we will build an expanded open source Grid
infrastructure
◆ Virtualization
◆ New services for provisioning, data management,
security, VO management
◆ End-user tools for application development
◆ Etc., etc.
● And of course responding to user requests for other
short-term needs
127

What to Expect from the
Globus Alliance in the Coming Months
● Support for users of GT4
◆ Working to make sure the toolkit meets
user needs
◆ Answering questions on the mailing lists
◆ Further improving documentation
● Normal evolution of performance,
scalability and feature enhancements
● Further development of tools and services
in support of VOs
● Expanding contributions to Globus
128

GlobDev
● The current set of Globus components will be
organized into several “Globus Projects”
◆ Projects release products
● Each project will have its own group of
“Committers”
◆ committers are responsible for governance on
matters relating to their products
● The “Globus Management Committee” will
◆ provide overall guidance and conflict resolution
◆ approve the creation of new Globus Projects
129

Opportunities for Collaboration


● Use of Globus software
◆ Feedback & involvement in design
● Development of new Globus components
◆ E.g., new information providers to enable use of GT to
manage an entire Grid
◆ Examples and documentation
● Globalization and localization of software
● New applications and tools
◆ E.g., Grid operations, emergency response, ecogrid,
bioinformatics, …
130

For More Information


● Globus Alliance
◆ www.globus.org
● NMI and GRIDS Center
◆ www.nsf-middleware.org
◆ www.grids-center.org
● Infrastructure
◆ www.opensciencegrid.org
◆ www.teragrid.org
2nd Edition
www.mkp.com/grid2
131

For More Information


● GT4 Programming
