Carl Kesselman
Information Sciences Institute
University of Southern California
Univa Corporation
Credits
● Globus Toolkit v4 is the work of many talented
Globus Alliance members, at
◆ Argonne Natl. Lab & U.Chicago
◆ USC Information Sciences Institute
◆ National Center for Supercomputing Applications
◆ U. Edinburgh
◆ Swedish PDC
◆ Univa Corporation
◆ Other contributors at other institutions
● Supported by DOE, NSF, UK EPSRC, and other
sources
Acknowledgements
● Ian Foster with whom I developed many of these
slides
● Bill Allcock, Lisa Childers,
Kate Keahey, Jennifer Schopf,
Frank Siebenlist, Mike Wilde @ ANL/UC
● Ann Chervenak, Ewa Deelman, Laura Pearlman, Mike
D’Arcy, Rob Schuler @ USC/ISI
● Karl Czajkowski, Steve Tuecke @ Univa
● Numerous other fine colleagues
● NSF, DOE, IBM for research support
Context:
System-Level Science
Faults
Seismic
Hazard
Model
InSAR Image of the
Hector Mine Earthquake
A satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of
the 1999 Hector Mine earthquake. It shows the displacement field in the
direction of radar imaging.
[Figure: seismic hazard model pathways. Labels: Earthquake Forecast Model; Attenuation Relationship; Intensity Measures; FSM, RDM, AWM, SRM; Ground Motions]
FSM = Fault System Model; RDM = Rupture Dynamics Model; AWP = Anelastic Wave Propagation; SRM = Site Response Model
Virtual Organizations
● From organizational behavior/management:
◆ "a group of people who interact through
interdependent tasks guided by common purpose
[that] works across space, time, and organizational
boundaries with links strengthened by webs of
communication technologies" (Lipnack & Stamps,
1997)
● The impact of cyberinfrastructure
◆ People → computational agents & services
◆ Communication technologies → IT
infrastructure, i.e. Grid
Function
Resource
Fig: S. G. Djorgovski
Decomposition Enables
Separation of Concerns & Roles
User → Service Provider: “Provide access to data D at S1, S2, S3 with performance P”
Service Provider (replica catalog, user-level multicast, …) → Resource Provider: “Provide storage with performance P1, network with P2, …”
[Figure: data D and sites S1, S2, S3 shown at the User, Service Provider, and Resource Provider levels]
Defining Community:
Membership and Laws
● Identify VO participants and roles
◆ For people and services
● Specify and control actions of members
◆ Empower members → delegation
◆ Enforce restrictions → federate policy
Site
● Inter-VO admission-
control
◆ Entities/roles in one VO not policies
Trust in VOs
● Do I “believe” an attribute assertion?
◆ Used to evaluate cost vs. benefit of
performing an operation
● E.g., perform untrusted operation with extra auditing
● Look at attributes of assertion signer
● Rooting trust
◆ Externally recognized source, e.g., CA
◆ Dynamically via VO structure delegation
◆ Dynamically via alternative sources, e.g.,
reputation
Bootstrapping a VO
by Assembling Services
1) Integrate services from other sources
◆ Virtualize external services as VO services
2) Coordinate & compose
◆ Create new services from existing ones
[Figure labels: Community Services; Content; Services Provider; Capacity Provider]
Providing VO Services:
(1) Integration from Other Sources
● Negotiate service level agreements
● Delegate and deploy capabilities/services
● Provision to deliver defined capability
● Configure environment
● Host layered functions
[Figure: Community A … Community Z]
[Figure labels: VO Admin; Existing Services; Policy; Allocate/provision; Configure; Environment; Activity. Client: Initiate activity, Monitor activity, Control activity]
Providing VO Services:
(2) Coordination & Composition
● Take a set of provisioned services …
… & compose to synthesize new behaviors
“Brain vs. Brawn: Why Grids and Agents Need Each Other,"
Foster, Kesselman, Jennings, 2004.
The Globus-Based
[Figure: sites including Birmingham, Cardiff, AEI/Golm]
www.globus.org/solutions
Composing Resources …
Composing Services
Deploy hypervisor/OS
Hypervisor/OS
Procure hardware
Physical machine
Community Commons
● What capabilities are available to VO?
◆ Membership changes, state changes
● Require mechanisms to aggregate and update VO information
● Monitoring and discovery
[Figure: VO-specific indexes aggregating information from services (S); axis: the age of information, from FRESH to MORE]
Collaborative Work
[Figure: timeline (Time) of What I Did (Executed), What I Am Doing (Executing), What I Want to Do (Executable / Not yet executable); operations: Query, Edit]
Abstract
workflow
Workflow Generation
● Given: desired result and constraints
◆ desired result (high-level, metadata description)
◆ application components
◆ resources in the Grid (dynamic, distributed)
◆ constraints & preferences on solution quality
● Find: an executable job workflow
◆ A configuration that generates the desired result
◆ A specification of resources to be used
◆ Sequence of operations: create agreement, move
data, request operation
● May create workflow incrementally as information
becomes available
"Mapping Abstract Complex Workflows onto Grid Environments," Deelman, Blythe, Gil,
Kesselman, Mehta, Vahi, Arbree, Cavanaugh, Blackburn, Lazzarini, Koranda,
2003.
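The "Given / Find" formulation above can be illustrated with a minimal sketch. It is not the Pegasus algorithm: the `Task` type, the `plan` function, the greedy site choice, and the dictionary-based replica and site catalogs are all assumptions made for illustration, and agreement creation is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One abstract step: a transformation applied to logical files."""
    name: str
    inputs: tuple
    outputs: tuple


def plan(tasks, replicas, sites):
    """Map abstract tasks onto sites, inserting data-movement jobs.

    tasks    -- Task objects, already in dependency order (assumption)
    replicas -- dict: logical file name -> set of sites holding a copy
    sites    -- dict: site name -> set of task names the site can run
    Returns a list of ("move", file, src, dst) and ("run", task, site) jobs.
    """
    # Private copy of the replica catalog, updated as the plan grows.
    location = {f: set(s) for f, s in replicas.items()}
    jobs = []
    for task in tasks:
        candidates = sorted(s for s, runnable in sites.items()
                            if task.name in runnable)
        if not candidates:
            raise ValueError(f"no site can run {task.name}")
        # Preference: the site that already holds the most inputs.
        site = max(candidates,
                   key=lambda s: sum(s in location.get(f, ())
                                     for f in task.inputs))
        for f in task.inputs:
            holders = location.get(f)
            if not holders:
                raise ValueError(f"no replica of {f}")
            if site not in holders:
                jobs.append(("move", f, sorted(holders)[0], site))
                holders.add(site)
        jobs.append(("run", task.name, site))
        for f in task.outputs:
            location.setdefault(f, set()).add(site)
    return jobs
```

Because `location` is updated after every job, the same function could be called again with a longer task list as more information becomes available, which loosely mirrors the incremental workflow creation mentioned above.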
Minor damage
Moderate damage
SCEC Cybershake
● Calculate hazard curves by generating synthetic
seismograms from estimated rupture forecast
[Figure panels: Hazard Map; Hazard Curve; Spectral Acceleration; Synthetic Seismogram]
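How synthetic seismograms might turn into a hazard curve can be sketched as follows. This is a simplified illustration, not the CyberShake code: the input layout (an annual rate plus a list of peak spectral accelerations per rupture), the function names, and the Poisson assumption are all assumptions.

```python
import math


def exceedance_rate(ruptures, threshold):
    """Annual rate at which spectral acceleration exceeds `threshold`.

    ruptures -- list of (annual_rate, [peak SA of each synthetic seismogram])
    Each rupture contributes its rate times the fraction of its
    synthetic seismograms whose peak exceeds the threshold.
    """
    total = 0.0
    for annual_rate, peaks in ruptures:
        fraction = sum(p > threshold for p in peaks) / len(peaks)
        total += annual_rate * fraction
    return total


def hazard_curve(ruptures, thresholds, years=50.0):
    """Probability of at least one exceedance in `years` (Poisson assumption)."""
    return [(x, 1.0 - math.exp(-exceedance_rate(ruptures, x) * years))
            for x in thresholds]
```

Each point of the returned curve pairs a spectral-acceleration level with its exceedance probability; repeating the calculation over many sites would give the input for a hazard map.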
[Figure labels: Data Catalog; Workflow Scheduler/Engine; VO Service Catalog; VO Scheduler; SCEC Storage; TeraGrid Storage; TeraGrid Compute]
Some uses of Pegasus
Number of jobs per day (23 days), 261,823 jobs total; number of CPU hours per day, 15,706 hours total (1.8 years)
[Chart: JOBS and HRS per day on a log scale (1 to 100,000), 10/19 to 11/10; runs on the TeraGrid in 2005]
Summary (1):
Community Services
● Community roll, city hall, permits, licensing & police force
◆ Assertions, policy, attribute & authorization services
● Directories, maps
◆ Information services
● City services: power, water, sewer
◆ Deployed services
● Shops, businesses
◆ Composed services
● Day-to-day activities
◆ Workflows, visualization
● Tax board, fees, economic considerations
◆ Barter, planned economy, eventually markets
Summary (2)
● Community-based science will be the norm
◆ Requires collaborations across sciences, including computer
science
● Many different types of communities
◆ Differ in coupling, membership, lifetime, size
● Must think beyond science stovepipes
◆ Increasingly the community infrastructure will become the
scientific observatory
● Scaling requires a separation of concerns
◆ Providers of resources, services, content
● Small set of fundamental mechanisms required to build
communities
Dynamic
and/or
Distributed
Applications
● Applications modified to:
◆ Adjust to varying demand & resources
◆ Use Globus to discover & provision resources
[Figure labels: IPC Dispatcher; Delegation of Request; IPC PricelistServer; Response depending on: time, discount, number of items, …]
GT Domain Areas
● Core runtime
◆ Infrastructure for building new services
● Security
◆ Apply uniform policy across distinct systems
● Execution management
◆ Provision, deploy, & manage services
● Data management
◆ Discover, transfer, & access large data
● Monitoring
◆ Discover & monitor dynamic services
Globus Toolkit:
Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 components (www.globus.org)]
Data Replication; Grid Telecontrol Protocol; Credential Mgmt; Replica Location; Community Scheduling Framework; Data Access & Integration; Delegation; WebMDS; Python Runtime; Community Authorization; Reliable File Transfer; Workspace Management; Trigger; C Runtime; Authentication Authorization; GridFTP; Grid Resource Allocation & Management; Index; Java Runtime
GT Protocols
● Web service protocols
◆ WSDL, SOAP
◆ WS-Interoperability profile
◆ Custom
FileTransferService
(without WSRF)
FileTransfer
Service
move (A to B) : transferID Client
move
whatHappen
state tellMeWhen
cancel
WSRF in a Nutshell
● Service
● State representation
◆ Resource
◆ Resource Property
● State identification
◆ Endpoint Reference
● State Interfaces
◆ GetRP, QueryRPs, GetMultipleRPs, SetRP
● Lifetime Interfaces
◆ SetTerminationTime
◆ ImmediateDestruction
● Notification Interfaces
◆ Subscribe
◆ Notify
● ServiceGroups
[Figure: a Service fronting Resources and their RPs, each addressed by an EPR, with operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
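The pattern listed above (a service, stateful resources, endpoint references, and state, lifetime, and notification operations) can be modelled in memory as a rough sketch. This is not the GT4 or WSRF API: the class names, method names, and the tuple used as an endpoint reference are assumptions, and no SOAP messaging is involved.

```python
import itertools


class Resource:
    """Stateful entity: named resource properties plus a termination time."""

    def __init__(self, properties, termination_time=None):
        self.properties = dict(properties)
        self.termination_time = termination_time  # epoch seconds, or None
        self.subscribers = []                     # callbacks used for Notify


class Service:
    """Stateless front end; every call names its resource through an EPR."""

    def __init__(self, address):
        self.address = address
        self._resources = {}
        self._keys = itertools.count(1)

    def create(self, properties, termination_time=None):
        key = next(self._keys)
        self._resources[key] = Resource(properties, termination_time)
        return (self.address, key)  # stand-in for an endpoint reference

    def _find(self, epr):
        address, key = epr
        if address != self.address or key not in self._resources:
            raise KeyError(f"unknown resource {epr}")
        return self._resources[key]

    def get_rp(self, epr, name):
        return self._find(epr).properties[name]

    def get_multiple_rps(self, epr, names):
        props = self._find(epr).properties
        return {n: props[n] for n in names}

    def set_rp(self, epr, name, value):
        resource = self._find(epr)
        resource.properties[name] = value
        for callback in resource.subscribers:  # Notify each subscriber
            callback(epr, name, value)

    def subscribe(self, epr, callback):
        self._find(epr).subscribers.append(callback)

    def set_termination_time(self, epr, when):
        self._find(epr).termination_time = when

    def destroy(self, epr):
        self._find(epr)  # raises KeyError if the EPR is unknown
        del self._resources[epr[1]]
```

The contrast with the non-WSRF FileTransferService is that operations such as `whatHappen`, `tellMeWhen`, and `cancel` no longer need service-specific names: a transfer becomes a resource whose state is read, watched, and ended through the same generic calls as any other resource.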
[Figure: a Transfer resource with RPs, accessed via getRP, queryRPs, destroy; stateful entities register with a Registry]
Service Container
Service Container: host multiple services in container; one JVM process
…more details: based on AXIS service container, processes SOAP messages, ResourceContext extension.
[Figure: a container hosting several Services, each with a ResourceHome, EPR-addressed Resources and RPs, and operations GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy]
[Figure labels: User Applications; Custom WSRF Web Services; GT4 WSRF Web Services; Custom Web Services; Registry; Administration; GT4 Container; WS-Addressing, WSRF, WS-Notification]
GetRP Test
Distributed client and service on same LAN
(times in milliseconds)
[Chart: GetRP times for GT4-Java, GT4-C, pyGridWare, WSRF::Lite, and WSRF.NET, in three groups. Bar values in extraction order (assignment to bars not recoverable): 149.67, 17.1, 140.5, 55.6, 10.05, 81.39, 8.23, N/A, 2.34, 14.8, 11.46, 12.91, 2.85]
GT4 WS Core Performance
Globus Security
● Control access to shared services
◆ Address autonomous management, e.g.,
different policy in different work-groups
● Support multi-user collaborations
◆ Federate through mutually trusted services
◆ Local policy authorities rule
● Allow users and application communities to set
up dynamic trust domains
◆ Personal/VO collection of resources working
together based on trust of user/VO
[Figure: Organization A and Organization B with local people, roles, and servers (Persons A to F; Faculty, Staff, Student; Compute Servers C1, C2, C3; File server F1, disks A and B), and a second view with collaboration roles and resources (Principal Investigator, Researcher, Administrator; Compute Server C1'; File server F1, disk A)]
GT4 Security
[Figure labels: Users; Rights; Access; Services (running on user’s behalf); Authz Callout: SAML, XACML; SSL/WS-Security with Proxy Certificates]
GT4 Security
● Public-key-based authentication
● Extensible authorization framework based on Web
services standards
◆ SAML-based authorization callout
● As specified in GGF OGSA-Authz WG
◆ Integrated policy decision engine
● XACML policy language, per-operation policies, pluggable
GT-XACML Integration
● eXtensible Access Control Markup Language
◆ OASIS standard, open source implementations
● XACML: sophisticated policy language
● Globus Toolkit ships with XACML runtime
◆ Included in every client and server built on GT
◆ Turned-on through configuration
● … that can be called transparently from runtime
and/or explicitly from application …
● … and we use the XACML “model” for
our Authz Processing Framework
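The XACML "model" of rules that yield Permit, Deny, or NotApplicable and are then combined into a single decision can be sketched in a few lines. This is not the GT-XACML runtime or any XACML library: the rule layout (an effect plus a dictionary of required attribute values), the attribute names, and the function names are assumptions, and the Indeterminate result is left out. Deny-overrides is one of the standard XACML combining algorithms.

```python
PERMIT, DENY, NOT_APPLICABLE = "Permit", "Deny", "NotApplicable"


def evaluate(rule, request):
    """A rule applies when every attribute it names matches the request."""
    effect, target = rule
    if all(request.get(k) == v for k, v in target.items()):
        return effect
    return NOT_APPLICABLE


def decide(rules, request):
    """Deny-overrides: any Deny wins, else any Permit, else NotApplicable."""
    results = [evaluate(rule, request) for rule in rules]
    if DENY in results:
        return DENY
    if PERMIT in results:
        return PERMIT
    return NOT_APPLICABLE


# Hypothetical per-operation policy for a service.
RULES = [
    (PERMIT, {"role": "researcher", "operation": "getRP"}),
    (PERMIT, {"role": "student"}),
    (DENY, {"role": "student", "operation": "destroy"}),
]
```

With these rules a student may call most operations but not `destroy`, because the Deny rule overrides the broader Permit.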
GT Authorization Framework
GridFTP in GT4
[Chart: bandwidth (Mbps); axis ticks at 10000, 12000, 14000]
GT4 WS GRAM
● 2nd-generation WS implementation optimized
for performance, flexibility, stability, scalability
● Streamlined critical path
◆ Use only what you need
● Flexible credential management
◆ Credential cache & delegation service
● GridFTP & RFT used for data operations
◆ Data staging & streaming output
◆ Eliminates redundant GASS code
[Figure: WS GRAM architecture. Labels: Client; Job functions; Delegate; GT4 Java Container (GRAM services, Delegation, RFT File Transfer); Compute element (sudo, GRAM adapter, local job control, local scheduler, user job); SEG (job events); transfer request; GridFTP (FTP control, FTP data); remote storage element(s)]
WS GRAM Performance
● Time to submit a basic GRAM job
◆ Pre-WS GRAM: < 1 second
◆ WS GRAM: 2 seconds
● Concurrent jobs
◆ Pre-WS GRAM: 300 jobs
◆ WS GRAM: 32,000 jobs
● Various studies are underway to test latest
software
Open Science Grid
Jobs (2004)
www.opensciencegrid.org
Embedded
Client-side
Resource Management
[Figure labels: VO Admin; VO User; Deleg; GRAM; Headnode Resource Manager; Cluster Resource Manager; VO Scheduler; VO Job; Monitoring and control; Other Services]
[Figure: GT4 Containers, each with an MDS-Index; automated registration in container for GridFTP, RFT, GRAM; custom protocols for non-WSRF entities; User]
Information Providers
● GT4 information providers collect
information from some system and make it
accessible as WSRF resource properties
● Growing number of information providers
◆ Nagios, SGE, LSF, PBS
● Many opportunities to build additional ones
◆ E.g., network monitoring, storage systems,
various sensors
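A provider in this sense can be pictured as a function that samples some system and returns named values, plus a cache that bounds how stale those values may get. The sketch below is an assumption throughout: it is not the GT4 information-provider interface, the property names are made up, and the "system" is simply the local host as seen through the Python standard library.

```python
import os
import shutil
import time


def collect_local_properties(path="."):
    """Hypothetical provider: sample the local host, return property values."""
    usage = shutil.disk_usage(path)
    return {
        "CpuCount": os.cpu_count(),
        "DiskFreeBytes": usage.free,
        "DiskTotalBytes": usage.total,
        "SampledAt": time.time(),
    }


class ProviderCache:
    """Re-run a provider only when its last sample is max_age seconds old."""

    def __init__(self, provider, max_age=60.0, clock=time.monotonic):
        self.provider = provider
        self.max_age = max_age
        self.clock = clock
        self._sample = None
        self._taken = None

    def properties(self):
        now = self.clock()
        if self._taken is None or now - self._taken >= self.max_age:
            self._sample = self.provider()
            self._taken = now
        return self._sample
```

A caller would wrap a provider once, for example `ProviderCache(collect_local_properties, max_age=30)`, and read `properties()` whenever the values are requested; a provider for a batch scheduler or a network monitor would have the same shape.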
Relationship to
Other Globus Services
At requesting site, deploy:
● WS-RF Services
◆ Data Replication Service
◆ Delegation Service
◆ Reliable File Transfer Service
[Figure: Local Site with Data Replication Service (Replicator Resource), Delegation Service (Delegated Credential), Reliable File Transfer Service (RFT Resource)]
WSRF Services:
•Data Replication
•Delegation
•Reliable File Transfer
•Information Services
Pre-WS Services:
•Replica Index
•Replica Catalog
•GridFTP Server
[Figure: Service Container with MDS; Replica Index, Replica Catalogs, GridFTP Servers]
Delegation Service
Delegation: Manages delegated credentials from client/user.
Credential: Represents the user’s delegated credential as a stateful entity.
Resource Properties:
•None except standard lifetime RPs (CurrentTime, TerminationTime)
[Figure: Service Container with Data Rep., RFT, Delegation (Credential RP), and MDS; Client; Replica Index, Replica Catalogs, GridFTP Servers]
MDS Information Services: Collects information from other Resources, may trigger events based on conditions.
Index: an index of monitored resources.
Trigger: an event / stimulus trigger (not pictured)
Resource Properties:
•ResourceCount
•Entry: EPR + data from other Resources
Pre-WS Services
Replica Index: Index of replica information across catalogs
Replica Catalog: Catalog of replica information, per-SE
GridFTP Server: Transfer service based on GridFTP protocol
•Initialize user proxy
[Figure: Service Container with Delegation (Credential RP) and MDS; GridFTP Servers]
•Add Replicator resource to MDS Information service Index
•Periodically polls Replicator RP via GetRP or GetMultRP
•Conditions may trigger alerts or other actions (Trigger service not pictured)
Transfer Data
•Periodically poll “ResultStatus” RP via GetRP
•When “Done”, get state information for each file transfer
•Notification of “Stage” RP value changed to “transfer”
•Data transfer between GridFTP Server sites
•Create Transfer resource
•Pass credential EPR
•Set Termination Time
•Transfer resource EPR returned
•Access delegated credential resource
•Setup GridFTP Server transfer of file(s)
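The client-side step of polling the “ResultStatus” resource property until it reads “Done” can be sketched as a generic loop. The `get_rp` callable stands in for a real GetRP call on the Replicator resource; it, the function name, and the interval and timeout defaults are assumptions rather than part of any Globus client API.

```python
import time


def wait_for_status(get_rp, name="ResultStatus", done="Done",
                    interval=5.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll a resource property until it equals `done` or the timeout passes.

    get_rp -- callable taking a property name and returning its current
              value (a stand-in for GetRP on the Replicator resource)
    Returns the number of polls made; raises TimeoutError on timeout.
    """
    deadline = clock() + timeout
    polls = 0
    while True:
        polls += 1
        if get_rp(name) == done:
            return polls
        if clock() >= deadline:
            raise TimeoutError(f"{name} never reached {done!r}")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays. Where the service supports Subscribe, a notification callback would avoid polling altogether.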
Resource Termination
•Termination time (set by client) expires eventually
•Resources destroyed (Credential, Transfer, Replicator)
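Termination-time cleanup amounts to comparing each resource's termination time with the current time and destroying those that have expired. A minimal sketch, assuming resources are tracked in a plain dictionary (the function name and data layout are hypothetical, not the container's actual bookkeeping):

```python
def expire(resources, now):
    """Split resources into (survivors, destroyed names) at time `now`.

    resources -- dict: resource name -> termination time (epoch seconds),
                 or None for a resource with no scheduled termination
    """
    survivors = {name: t for name, t in resources.items()
                 if t is None or t > now}
    destroyed = sorted(set(resources) - set(survivors))
    return survivors, destroyed
```

A client that wants its Replicator, Transfer, and Credential resources to outlive a long transfer would extend their termination times before they pass.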
DRS: Stubs
[Class diagram; the grouping of fields under types is not recoverable from the extraction]
Types: BaseRequestType, CreateReplicator, CreateReplicatorResponse, RequestFileReplicationRequestType, ReplicationOptionsType, TransferOptionsType, BaseFaultType, ReplicatorFaultType, CreateReplicatorFaultType
ReplicatorPortType (java.rmi.Remote): createReplicator(), start(), stop(), suspend(), resume(), findItems(), setTerminationTime(), destroy(), queryRPs(), getMultipleRPs(), getRP(), subscribe(), getCurrentMsg()
Fields shown: credentialEPR, initialTerminationTime, autostart, endpointReference, requestFileURL, format, concurrency, binary, blockSize, tcpBufferSize, ...
DRS: Implementation
[Class diagram; the assignment of methods to classes is not fully recoverable from the extraction]
WS Core classes: ResourceHomeImpl, ReflectionResource, RemoveCallback, TopicListAccessor, SimpleTopicSet, GetRPProvider, GetMultipleRPsProvider, DestroyProvider, SetTerminationTimeProvider, SubscriptionProvider
DRS classes: ReplicationServiceImpl, ReplicatorHome, ReplicatorResource, DiscoverWork, RegisterWork
Methods shown: find(), createReplicator(), otherCustomOps(), create(), getRP(), getMultipleRPs(), destroy(), getTopicSet(), remove(), setTerminationTime(), otherCustomMethods(), subscribe()
Example Solutions
● Portal-based User Reg. System (PURSE)
● VO Management Registration Service
● Service Monitoring Service
● TeraGrid TGCP Tool
● Lightweight Data Replicator
● GriPhyN Virtual Data System
The Future:
Structure
● NSF Community Driven Improvement of Globus
Software (CDIGS) project
◆ 5 years of funding for GT enhancement
◆ Regular Globus roadmaps outlining plans
● GlobDev http://dev.globus.org
◆ Apache-like community development site
◆ Community governance of components
◆ “Globus Toolkit” & other related software
◆ Open for business early 2006
◆ “Globus Alliance” = “GlobDev committers”
The Future:
Content
● We now have a solid and extremely powerful Web
services base
● Next, we will build an expanded open source Grid
infrastructure
◆ Virtualization
◆ New services for provisioning, data management,
security, VO management
◆ End-user tools for application development
◆ Etc., etc.
● And of course responding to user requests for other
short-term needs
GlobDev
● The current set of Globus components will be
organized into several “Globus Projects”
◆ Projects release products
● Each project will have its own group of
“Committers”
◆ committers are responsible for governance on
matters relating to their products
● The “Globus Management Committee” will
◆ provide overall guidance and conflict resolution
◆ approve the creation of new Globus Projects