DAILY NEWS AND INFORMATION
FOR THE GLOBAL GRID COMMUNITY / JUNE 30, 2003: VOL. 2 NO. 26
Breaking News - Operating Systems & Middleware:
SDSC Releases New Storage Resource Broker Middleware
The San Diego Supercomputer Center (SDSC) at UCSD has released version 2.1 of
the popular SDSC Storage Resource Broker (SRB) middleware package, which
enables scientists to create, manage, and collaborate with flexible, unified
"virtual data collections" that may be stored on heterogeneous data resources
distributed across a network.
"In addition to a number of bug fixes, we've made version 2.1 more 'grid-
friendly' by including Web Services Description Language (WSDL) features, a
pure Java programming interface, and encrypted data transfers," said Arcot
Rajasekar, director of the Data Grids Technologies group in the Data and
Knowledge Systems (DAKS) program at SDSC.
SRB version 2.1, along with the user manual and release notes, is available
online at www.npaci.edu/DICE/SRB/. SDSC SRB Version 2.1 is supported on the
following platforms: UNIX variants, including Red Hat Linux 7.3, Solaris, AIX,
SGI, and Mac OS X, as well as Microsoft Windows 2000.
"There is growing interest in the research community in the SRB software
because of the need to integrate, manage, and access explosively growing data
collections," said Reagan Moore, co-director of SDSC's DAKS program.
Developed by Moore, Rajasekar, Michael Wan, and the SRB team in SDSC's Data
and Knowledge Systems (DAKS) program, the SDSC SRB is being used in projects
as diverse as helping astronomers integrate multi-terabyte image collections
in the NSF's National Virtual Observatory, enabling NIH-funded neuroscientists
to share brain data across the country in the Biomedical Informatics Research
Network, and developing persistent archives for the National Archives and
Records Administration.
Other SRB adopters include NASA, which is using the SRB to manage
massive collections of satellite data; the Science Environment for Ecological
Knowledge, a large NSF Information Technology Research project, which will use
the SRB to integrate ecological data collections; and the NSF ROADNet project,
which is employing the SRB in conjunction with object ring buffers to bring
together diverse types of sensor data in real time.
New features in SRB Version 2.1 include better support for Grid Security
Infrastructure (GSI); optional data encryption and compression; and SDSC
Matrix, a Web service-oriented interface. Matrix uses W3C standards to provide
services including data movement, replication, access control, data set
ingestion, retrieval, and container support. Other new features include
JARGON, a pure Java Application Program Interface (API) for developing
portable programs with a grid interface; the ability to bulk load data without
requiring the use of a container; the ability to list host-specific resources;
configurable parameters for determining the number of threads for parallel
transfer; and an SRB Python binding.
The SRB Data Management Middleware
The SDSC SRB is client-server middleware that solves many problems associated
with traditional file systems. What appears as a single collection to the user
is in fact a virtual collection consisting of digital entities scattered
across distributed, heterogeneous storage resources, including file systems,
archives, and databases. The SRB makes all these differences transparent to
users, negotiating all protocols, access permissions, etc. across the multiple
sites, so that users can access data based on familiar, user-defined global
file names. Users are freed from having to keep track of such complexities as
local file names, physical locations, protocols, and security arrangements.
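The idea of a global, user-defined name standing in for several scattered
physical copies can be illustrated with a minimal Python sketch. This is not
the SRB API: the LogicalNamespace class, the replica fields, the host names,
and the paths are all hypothetical and exist only to show the mapping.

```python
class LogicalNamespace:
    """Toy catalog (hypothetical, not SRB code) that maps one user-defined
    logical name to any number of physical copies."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name, host, protocol, physical_path):
        # Record one physical copy under the logical name.
        replica = {"host": host, "protocol": protocol, "path": physical_path}
        self._entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name):
        # Return every known physical copy; the caller never needs to
        # remember hosts, protocols, or local file names.
        if logical_name not in self._entries:
            raise KeyError(f"no such logical name: {logical_name}")
        return list(self._entries[logical_name])


ns = LogicalNamespace()
# Illustrative hosts and paths only.
ns.register("/home/survey/image001.fits",
            "archive.example.org", "tape", "/hpss/a1/f93k2")
ns.register("/home/survey/image001.fits",
            "disk.example.org", "unixfs", "/data/vol7/img_0001")

replicas = ns.resolve("/home/survey/image001.fits")
hosts = [r["host"] for r in replicas]
```

In this sketch the user works only with /home/survey/image001.fits; which
copy is actually read, and over which protocol, is left to the broker.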
As scientific disciplines become more integrated, data sharing becomes more
important. The SRB supports collaborative science because it can fine-tune
data sharing and access according to the needs of individual
researchers and groups in complex collaborations. Users can also quickly and
flexibly "repurpose" or restructure collections through customized "views"
that they shape by searching with rich descriptive metadata that is expressed
in familiar, user-defined terms.
The SRB organizes metadata about the data and files in the MCAT metadata
catalog to help researchers assemble, search, access, and manage collections
of data. The MCAT provides a global name space that spans all the separate
resources. Because the MCAT is built on relational database technology, it
can be extended with capabilities beyond those of traditional file systems,
including more complex access control systems, proxy
operations for such things as delivering subsets of a collection, and
knowledge discovery based on system- and application-level metadata.
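As a rough illustration of why a relational catalog makes metadata-based
discovery straightforward, the following Python sketch uses the standard
sqlite3 module. The two-table schema, the attribute names, and the ingest and
find helpers are assumptions made for this example; they do not reflect the
actual MCAT schema or any SRB interface.

```python
import sqlite3

# In-memory database standing in for a metadata catalog (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (
        id INTEGER PRIMARY KEY,
        logical_name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE metadata (
        object_id INTEGER NOT NULL REFERENCES objects(id),
        attribute TEXT NOT NULL,
        value TEXT NOT NULL
    );
""")


def ingest(logical_name, **attributes):
    """Register a logical name with user-defined attribute/value pairs."""
    cur = conn.execute(
        "INSERT INTO objects (logical_name) VALUES (?)", (logical_name,))
    object_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO metadata (object_id, attribute, value) VALUES (?, ?, ?)",
        [(object_id, key, str(val)) for key, val in attributes.items()],
    )


def find(attribute, value):
    """Return logical names whose metadata matches attribute = value."""
    rows = conn.execute(
        "SELECT o.logical_name FROM objects o "
        "JOIN metadata m ON m.object_id = o.id "
        "WHERE m.attribute = ? AND m.value = ? "
        "ORDER BY o.logical_name",
        (attribute, value),
    )
    return [name for (name,) in rows]


# Illustrative entries only.
ingest("/home/survey/image001.fits", instrument="camera-a", band="infrared")
ingest("/home/survey/image002.fits", instrument="camera-b", band="infrared")
ingest("/home/brain/scan17.img", subject="s17", modality="mri")

infrared = find("band", "infrared")
```

Because the attributes live in ordinary tables, a "view" over a collection is
simply a query, and new kinds of metadata can be added without changing how
the files themselves are stored.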
SRB collections are highly scalable, both in size and in distribution across
remote sites. SRB collections at SDSC support nearly 9 million files and 51
terabytes of data. Once a collection is created, it can be transparently
replicated, managed, and controlled across geographically distributed
locations through any of several interactive interfaces: a command-line
interface and new graphical user interfaces, including a Windows Explorer-like
interface called inQ (short for inQuisitor) and a Web interface, MySRB.
The SRB is proven, production software, with more than 200 registered users at
more than 50 sites. DAKS researchers on the SDSC SRB project, led by Arcot
Rajasekar, include Sheau-Yen Chen, Charles Cowart, Lucas Gilbert, Arun
Jagatheesan, George Kremenek, Roman Olschanowsky, Vicky Rowley, Wayne
Schroeder, Michael Wan, and Bing Zhu. -- Paul Tooby