Applications:
SDSC ENHANCES ZONE SRB TO SUPPORT LARGE-SCALE COLLABORATIONS
The San Diego Supercomputer Center (SDSC) at UC-San Diego has released version
3.2 of Zone SRB, the SDSC Storage Resource Broker (SRB) scientific data
management system. Version 3.2 offers faster command line performance, support
for the Informix database in the SRB Metadata Catalog (MCAT), faster file
transfers for users inside firewalls, and numerous improvements in
installation, administration, and the SRB server. The software, user manuals,
and release notes for version 3.2 are online for download by the research
community as a source distribution at
www.sdsc.edu/DICE/SRB/tarfiles/main.html .
"The SRB is one of the most advanced and comprehensive production tools
available for scientific data management," said Reagan Moore, SDSC
distinguished scientist and co-director of the Data and Knowledge Systems
(DAKS) program at SDSC. "The SRB system is used in scientific disciplines from
astronomy and environmental sciences to neurosciences, physics, and chemistry,
and in projects across agencies including NARA, NSF, NIH, DOE and NASA, as
well as many international efforts."
As an end-to-end data management solution, the SRB stores data created in
sensor networks or simulations, supports data management and collaboration in
data Grids, publication in digital libraries, and long-term preservation in
digital archives. The flexible Zone SRB system can manage data from simple
collections for a single researcher to complex multi-terabyte collections. By
supporting the federation of distributed data collections, Zone SRB allows
scientific users to flexibly share rapidly changing data collections across
multiple institutions that may be spread around the globe, each running their
own SRB MCAT, or zone. This provides speedy access to "any data, anywhere,
anytime."
As they use the SRB, researchers realize that they are doing more than simply
learning a new software application -- they are mastering the basic principles
of sound scientific data management, which is essential knowledge for
scientists in today's technology-driven world.
"At first we were just planning a minor SRB release," said Arcot Rajasekar,
director of SDSC's DAKS Data Grids Technologies group. "But the team worked
very hard and so many improvements were ready that we are able to release
version 3.2 with many important new features."
What's New in SRB Version 3.2
The SRB is a widely used production tool, and its evolution is guided by input
from its large user community. The new capabilities in version 3.2 respond to
specific requirements from that community. Based on the requirements of
the Compact Muon Solenoid (CMS) high-energy physics project, which sometimes
needs to move very large numbers of small files, a new Shell has been added
that executes command-line Scommands more quickly. To
meet the need of the Biomedical Informatics Research Network (BIRN) project
for efficient parallel file transfers to and from environments within
firewalls, client-initiated connections for parallel I/O have been added. In
addition, developer Mike Smorul of Joseph Jaja's University of Maryland NASA
Earth Science Information Partners (ESIP) project completed a port of the
Metadata Catalog (MCAT) component of the SRB to the Informix database, adding
another database option for SRB users. Improvements have been made to the core
SRB Server component, which can more intelligently select an appropriate
resource based on the availability of sufficient space and other criteria. The
SRB Server can also be put into a maintenance mode for graceful shutdown. In
addition to bug fixes, new features have also been added for easier
installation and administration.
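
As a rough illustration of the space-based resource selection described above, the following minimal Python sketch picks a storage resource with enough free space for an incoming file. It is not the SRB Server's actual algorithm: the `Resource` class, its fields, and the "most free space wins" tie-breaking rule are all hypothetical assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Resource:
    """Hypothetical storage resource record (not an SRB data structure)."""
    name: str
    free_bytes: int
    online: bool = True


def select_resource(resources: Sequence[Resource], file_size: int) -> Optional[Resource]:
    """Return an online resource that can hold file_size bytes, or None.

    Assumed policy: among resources with sufficient space, prefer the one
    with the most free space. A real broker may weigh other criteria too.
    """
    candidates = [r for r in resources if r.online and r.free_bytes >= file_size]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_bytes)


if __name__ == "__main__":
    pool = [
        Resource("disk-a", free_bytes=10),
        Resource("disk-b", free_bytes=500),
        Resource("tape-c", free_bytes=10_000, online=False),
    ]
    chosen = select_resource(pool, file_size=100)
    print(chosen.name if chosen else "no resource available")
```

In this sketch an offline resource is never chosen, even if it has the most free space, which loosely mirrors the idea of taking a server out of rotation during maintenance.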
In conjunction with the release of SRB version 3.2, new versions of inQ, the
popular Windows graphical user interface to SRB, and MySRB, the web-based
access tool, are being released that support SRB 3.2. In addition, new
versions of Jargon (Java API for Real Grids On Networks), the SRB Java API,
and SDSC Matrix have been released. Matrix can be used either as a Grid
workflow management system or for SRB Web Services, and uses the Data Grid
Language (DGL) to communicate between Matrix Web Service clients and the
server.
New features in Matrix 3.2 include the ability to create dataflows over a
very large number of files/data sets; the ability to initiate in parallel a
new dataflow based on the number of files, which increases execution speed
even for non-parallel applications; provenance tracking of Gridflows for all
data and processes during and after execution; a developer-friendly Java
client API; declaration of scoped variables within the workflow to help in
dynamic computational steering based on previous results; and a DGL developer
guide.
SDSC SRB version 3.2 is supported on a wide variety of systems. The MCAT
Metadata Catalog runs on Oracle, IBM DB2, Sybase, Informix, MySQL, and
Postgres. The SRB Server runs on Microsoft Windows NT, 2000, and XP, as well as
most UNIX platforms including Linux, IRIX, AIX, HP Tru64, and Mac OS X, and
supports data in file systems, tape stores such as the High Performance
Storage System (HPSS), and databases. Beyond the UNIX clients, supported
APIs include C and C++ library calls, Shell commands, Perl and Python load
libraries, dynamic load libraries for Windows, Open Archives Interface, WSDL,
and Java classes. Interactive browser interfaces include the Windows graphical
user interface, inQ, and the Web interface, MySRB.
The SRB team, led by Reagan Moore and Arcot Rajasekar, includes chief
architect Mike Wan, senior developer Wayne Schroeder, data Grid application
specialist George Kremenek, SRB administrator Sheau-Yen Chen, and data Grid
developers Charles Cowart, Lucas Gilbert, Arun Jagatheesan, Roman
Olschanowsky, Antoine de Torcy, Tim Warnock and Bing Zhu.