Distributed Databases and Applications

John Wieczorek
Museum of Vertebrate Zoology, UC Berkeley

DiGIR

Distributed Databases

Multiple sources of data
under local control,
with concepts in common
and a desire to deliver data as part of a community.

Distributed Databases

The Species Analyst (TSA)
The Integrated Taxonomic Information System (ITIS)
FishNet
The Mammal Networked Information System (MaNIS)
HerpNET
The Ornithological Information System (ORNIS)

Distributed Databases

European Natural History Science Information Network (ENHSIN)
Biological Collection Access for Europe (BioCASE)
Australia Virtual Herbarium (AVH)
Red Mundial de Información Sobre Biodiversidad, Comisión Nacional para el Conocimiento y Uso de la Biodiversidad (REMIB, CONABIO)

Distributed Databases

Mountain and Plains Spatio-Temporal Database Informatics (MaPSTeDI)
Ocean Biogeographic Information System (OBIS)
Pacific Basin Information Node, National Biological Information Infrastructure (PBIN, NBII)
Species Link, Centro de Referência em Informação Ambiental (Species Link, CRIA)
A Virtual Herbarium of the Chicago Region (vPlants)
Spatial Analysis of Local Vegetation Inventories Across Scales (SALVIAS)

Distributed Databases

Berkeley Natural History Museums (BNHM)
Association of Biological Collections, UC Davis

Distributed Databases

LifeMapper
Global Biodiversity Information Facility (GBIF)

Distributed vs. centralized

Multiple sources of data
under local control,
with concepts in common
and a desire to deliver data as part of a community

Distributed vs. centralized

In other words, distribute the headache rather than have one central migraine.

DiGIR
Distributed Generic Information Retrieval

John Wieczorek, Stan Blum, Dave Vieglais, P.J. Schwartz

Project Rationale

To avoid multiple incongruous development efforts
To pool resources and create a community of experts
To solve the problem of scalability

Project Goals

To define a protocol for retrieving structured data from multiple, heterogeneous databases across the Internet
To build a reference implementation of both provider and portal software using said protocol

Design Goals

To use open protocols and standards, such as HTTP and XML
To decouple the protocol, software and semantics
To make new data provider installations as easy as possible
To have open source development and GNU General Public Licensing

DiGIR Architecture

[Diagram shown over ten slides. Component labels per slide, in order:
1. User Interface, Protocol, Portal Engine, Provider
2. Provider
3. Provider, Registry
4. Portal Engine
5. Portal Engine, Registry
6. User Interface
7. User Interface, Protocol, Portal Engine
8. User Interface, Protocol, Portal Engine, Protocol, Provider
9. User Interface, Protocol, Portal Engine, Protocol, Provider
10. User Interface, Protocol, Portal Engine]

DiGIR Component Summary

DiGIR Protocol

Defines request and response message formats for communication between provider, portal engine, and user interfaces

Metadata requests
Search requests
Inventory requests

Remains unfettered by the structure of the data it transfers
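
As a rough illustration of the three request types, the sketch below builds a small XML request document with Python's standard library. The element names (`request`, `header`, `type`, `destination`, `filter`, `equals`), the `build_request` helper, and the example URL are placeholders chosen for this example; they are not taken from the DiGIR schema.

```python
import xml.etree.ElementTree as ET

# Request types named on the slide.
REQUEST_TYPES = ("metadata", "search", "inventory")


def build_request(request_type, destination, concept=None, value=None):
    """Build an illustrative request document (hypothetical element names)."""
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {request_type}")
    request = ET.Element("request")
    header = ET.SubElement(request, "header")
    ET.SubElement(header, "type").text = request_type
    ET.SubElement(header, "destination").text = destination
    # The message only wraps the concept name; it does not interpret it,
    # which keeps the protocol independent of the data being transferred.
    if concept is not None:
        flt = ET.SubElement(request, "filter")
        equals = ET.SubElement(flt, "equals", {"concept": concept})
        equals.text = value
    return ET.tostring(request, encoding="unicode")


print(build_request("search", "http://example.org/provider", "Genus", "Peromyscus"))
```

Swapping `"Genus"` for any other concept name requires no change to the message-building code, which is the sense in which the message format stays decoupled from the data it carries.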

Portal Engine

The entry point for a user
Can query a registry for potential providers
Can determine, based on provider metadata, whether a provider should be queried
Can send requests to multiple providers
Communicates via protocol-compliant messaging only
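
One way to picture the metadata-based decision is the sketch below, which keeps only the providers whose advertised concepts cover a request. The `ProviderInfo` record, the `select_providers` function, and the sample entries are assumptions made for illustration, not part of the DiGIR software.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderInfo:
    """Hypothetical summary of what a provider advertises in its metadata."""
    name: str
    url: str
    concepts: set = field(default_factory=set)


def select_providers(providers, requested_concepts):
    """Keep only providers whose metadata covers every requested concept."""
    wanted = set(requested_concepts)
    return [p for p in providers if wanted <= p.concepts]


# Made-up listing, as it might come back from a registry lookup.
registry_listing = [
    ProviderInfo("mammals", "http://example.org/a", {"Genus", "Species", "Country"}),
    ProviderInfo("herps", "http://example.org/b", {"Genus", "Species"}),
]

chosen = select_providers(registry_listing, ["Genus", "Country"])
print([p.name for p in chosen])
```

Filtering before sending anything spares providers that cannot answer a query from receiving it at all.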

Portal Engine, continued

Assembles responses from providers
Returns packaged results to the user
Logs activity
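
The fan-out, assembly, and logging steps can be sketched as follows. This is a minimal sketch in which plain Python callables stand in for network calls to providers; `query_all`, the provider names, and the record fields are all hypothetical.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("portal")


def query_all(providers, request):
    """Send one request to every provider and merge the answers.

    `providers` maps a provider name to a callable that stands in for a
    network call; a real portal would send a protocol message instead.
    """
    def ask(item):
        name, send = item
        try:
            return name, send(request), None
        except Exception as exc:  # one failing provider must not sink the rest
            return name, [], exc

    with ThreadPoolExecutor(max_workers=4) as pool:
        answers = list(pool.map(ask, providers.items()))

    records, failures = [], []
    for name, rows, error in answers:
        if error is not None:
            failures.append(name)
            log.warning("provider %s failed: %s", name, error)
        else:
            log.info("provider %s returned %d records", name, len(rows))
            records.extend({"provider": name, **row} for row in rows)
    return {"records": records, "failed": failures}


def broken_provider(request):
    raise TimeoutError("no response")


providers = {
    "mvz": lambda req: [{"Genus": req["Genus"], "CatalogNumber": "1"}],
    "lacm": lambda req: [],
    "ttu": broken_provider,
}
result = query_all(providers, {"Genus": "Peromyscus"})
print(len(result["records"]), result["failed"])
```

Tagging each record with its provider keeps the source visible in the packaged result, and reporting failures separately lets the user interface show partial results.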

Provider

Receives requests
Retrieves data from database
Sends results to requestor
Supplies metadata to describe data classification and availability
Logs requests
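
The provider duties listed above can be sketched with an in-memory SQLite database standing in for an institution's online database. The table layout, the `METADATA` dictionary, and the `handle_request` function are assumptions for this example; the actual provider software is written in PHP and is not reproduced here.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("provider")

# In-memory stand-in for a provider's online database (made-up rows).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE specimens (genus TEXT, species TEXT, country TEXT)")
db.executemany(
    "INSERT INTO specimens VALUES (?, ?, ?)",
    [("Peromyscus", "maniculatus", "USA"), ("Neotoma", "lepida", "Mexico")],
)

# Metadata the provider can hand to a portal on request.
METADATA = {"name": "example-provider", "concepts": ["genus", "species", "country"]}


def handle_request(request):
    """Log the request, then dispatch it to the metadata or search branch."""
    log.info("request received: %s", request)
    if request["type"] == "metadata":
        return METADATA
    if request["type"] == "search":
        column = request["concept"]
        if column not in METADATA["concepts"]:
            raise ValueError(f"unsupported concept: {column}")
        # The column name is checked against the list above; the value is
        # passed as a bound parameter rather than pasted into the SQL.
        rows = db.execute(
            f"SELECT genus, species, country FROM specimens WHERE {column} = ?",
            (request["value"],),
        ).fetchall()
        return {"records": rows}
    raise ValueError(f"unknown request type: {request['type']}")


print(handle_request({"type": "search", "concept": "genus", "value": "Neotoma"}))
```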

Registry

Supports provider advertising
May be global and open
May be private
Need not be used at all
Example: Universal Description, Discovery and Integration (UDDI)
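
A registry in this sense can be as small as a lookup table of provider access points. The sketch below uses a hypothetical in-memory `Registry` class and made-up URLs; it does not model UDDI itself.

```python
class Registry:
    """Minimal in-memory registry; a stand-in for a service such as UDDI."""

    def __init__(self):
        self._providers = {}

    def advertise(self, name, url):
        """Register (or update) a provider's access point."""
        self._providers[name] = url

    def withdraw(self, name):
        """Remove a provider; unknown names are ignored."""
        self._providers.pop(name, None)

    def providers(self):
        """Return a copy so callers cannot mutate the registry."""
        return dict(self._providers)


registry = Registry()
registry.advertise("mvz", "http://example.org/mvz/provider")
registry.advertise("lacm", "http://example.org/lacm/provider")
registry.withdraw("lacm")
print(registry.providers())

# A portal that does not use a registry can simply carry a fixed list.
STATIC_PROVIDERS = {"mvz": "http://example.org/mvz/provider"}
```

Whether the table is shared publicly, kept private, or replaced by a fixed list is a deployment choice; the portal only needs a name-to-address mapping.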

User Interfaces

Must be able to assemble and send a request document to a portal
Must be able to receive and interpret a response document from the portal
This is where the real fun is!
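
For the receiving half, a user interface mainly needs to turn a response document into something it can display. The sketch below parses a made-up response layout (the `response` and `record` elements and the `provider` attribute are assumptions, not the DiGIR schema) and renders it as tab-separated text.

```python
import xml.etree.ElementTree as ET

# Hypothetical response layout; the real protocol defines its own schema.
SAMPLE_RESPONSE = """
<response>
  <record provider="mvz"><Genus>Peromyscus</Genus><Country>USA</Country></record>
  <record provider="lacm"><Genus>Peromyscus</Genus><Country>Mexico</Country></record>
</response>
"""


def parse_response(document):
    """Turn a response document into a list of flat dictionaries."""
    root = ET.fromstring(document)
    rows = []
    for record in root.iter("record"):
        row = {"provider": record.get("provider")}
        row.update({child.tag: child.text for child in record})
        rows.append(row)
    return rows


def render_table(rows, columns):
    """Render rows as tab-separated text, one line per record."""
    lines = ["\t".join(columns)]
    lines += ["\t".join(row.get(col) or "" for col in columns) for row in rows]
    return "\n".join(lines)


rows = parse_response(SAMPLE_RESPONSE)
print(render_table(rows, ["provider", "Genus", "Country"]))
```

Because the interface works only from the response document, the same parsed rows could just as well feed a map, a chart, or a download.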

Example Network Configurations

BNHM Network Configuration

[Diagram. Boxes shown: working databases at Essig, PHMA, UCBG, UCJEPS, and UCMP (4 databases); five online databases; a DiGIR provider; the BNHM DiGIR portal; the BNHM presentation layer.]

MaNIS Network Configuration

[Diagram. Boxes shown: five working databases; five online databases; five DiGIR providers; three MaNIS DiGIR portals; three MaNIS presentation layers.]

MaNIS Network Configuration

[Diagram. Boxes shown: working databases at CAS (SQL Server), LACM (MS Access), MVZ (Sybase), TTU (FoxPro), and UWBM (4D-Mac); five online databases (one SQL Server, four MS Access); five DiGIR providers; three MaNIS DiGIR portals; presentation layers LACM-MaNIS, MVZ-MaNIS, and UWBM-MaNIS.]

Other Network Configurations

[Diagram. Boxes shown, in two groups. First group: two working databases, two online databases, two DiGIR providers, one DiGIR portal. Second group: three working databases, two online databases, two DiGIR providers, two DiGIR portals.]

DiGing a little deeper

Provider Installation

Web server (Apache, IIS, etc.)
PHP: Hypertext Preprocessor (PHP)
Provider software (DiGIR)

Configuration tool
Testing scripts
Provider scripts

Provider manual (DiGIR)

Provider Configuration Tool

Provider metadata
Resources
Database connection
Establishing table relationships
Concept to column (i.e., field, attribute) mapping
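
Concept-to-column mapping is what lets a shared concept name reach a differently named local column. The sketch below shows the idea with a hypothetical mapping, table layout, and `build_query` helper; it does not reproduce the configuration tool's actual file format.

```python
# Hypothetical mapping from shared concepts to one institution's local schema.
CONCEPT_MAP = {
    "Genus": ("taxon", "genus_name"),
    "Species": ("taxon", "species_epithet"),
    "Country": ("locality", "country"),
}

# Table relationships: how the local tables join (also an assumption).
FROM_CLAUSE = (
    "specimen "
    "JOIN taxon ON specimen.taxon_id = taxon.id "
    "JOIN locality ON specimen.locality_id = locality.id"
)


def build_query(return_concepts, filters):
    """Translate concept names into SQL over the local tables.

    Returns the SQL text and the parameter tuple; filter values are never
    interpolated into the SQL string.
    """
    def column(concept):
        table, col = CONCEPT_MAP[concept]  # KeyError for unmapped concepts
        return f"{table}.{col}"

    select = ", ".join(f"{column(c)} AS {c}" for c in return_concepts)
    sql = f"SELECT {select} FROM {FROM_CLAUSE}"
    if filters:
        where = " AND ".join(f"{column(c)} = ?" for c in filters)
        sql += f" WHERE {where}"
    return sql, tuple(filters.values())


sql, params = build_query(["Genus", "Species"], {"Country": "Mexico"})
print(sql)
print(params)
```

Two institutions with different schemas would each supply their own `CONCEPT_MAP` and `FROM_CLAUSE` while answering the same concept-level request.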

Portal Configuration

Web server (Apache, IIS, etc.)
Sun Java 2 (JDK 1.4)
Tomcat (Apache)
Portal software (DiGIR)
Portal installation documentation (DiGIR)

Portal Installation

Engine configuration file (finding providers)
Presentation configuration file (defining the Information Domain)
Presentation customization
Engine start and stop scripts
Presentation start and stop scripts

Portal Demonstrations

DiGIR Project Information

The DiGIR project is a collaborative effort
DiGIR is currently established as an open source development project on SourceForge (https://sourceforge.net/projects/digir).
Further documentation is available on the DiGIR web site (http://digir.net).