Special Features:
TeraGrid ENTERS FULL PRODUCTION PHASE
The TeraGrid, the National Science Foundation's multi-year effort to build a
distributed national cyberinfrastructure, has now entered full production
mode, providing a coordinated set of services for the nation's science and
engineering community. TeraGrid's unified user support infrastructure and
software environment allow users to access storage and information resources
as well as over a dozen major computing systems via a single allocation,
either as stand-alone resources or as components of a distributed application
using Grid software capabilities.
"The Extensible Terascale Facility is a key milestone for the
cyberinfrastructure of tomorrow," said Sangtae Kim, director of the NSF's
Division of Shared Cyberinfrastructure. "NSF salutes the tremendous effort on
the part of the dozens of staff at the nine ETF institutions to successfully
complete construction and enter the project's operational phase."
"Through the TeraGrid partnership, we have built a distributed system of
unprecedented scale," said Charlie Catlett, TeraGrid project executive
director and a senior fellow at the Computation Institute at Argonne National
Laboratory. "This milestone is a testament to the expertise, innovation, hard
work, and dedication of all the TeraGrid partners. The partnership among these
sites is itself an extremely valuable resource, and one that will continue to
yield benefits as the TeraGrid moves into its operational phase."
Through its nine resource partner sites, the TeraGrid offers advanced
computational, visualization, instrumentation, and data resources:
- Argonne National Laboratory provides users with high-resolution rendering
and remote visualization capabilities via a 1-teraflop IBM Linux cluster with
parallel visualization hardware.
- The Center for Advanced Computing Research (CACR) at the California
Institute of Technology (Caltech) focuses on providing online access to very
large scientific data collections in astronomy and high energy physics, and
application expertise in these fields, geophysics, and neutron science.
- Indiana University and Purdue University together contribute more than 6
teraflops of computing capability, 400TB of data storage capacity,
visualization resources, access to life science data sets deriving from
Indiana University's Indiana Genomics Initiative, and a connection to the
Purdue Terrestrial Observatory.
- The National Center for Supercomputing Applications (NCSA) offers 10
teraflops of capability computing through its Mercury IBM Linux cluster, which
consists of 1,776 Itanium 2 processors. Mercury is the largest computational
resource of the TeraGrid. The system at NCSA also includes 600TB of secondary
storage and 2 petabytes of archival storage capacity. In addition, the new SGI
Altix SMP system with 1,024 Itanium 2 processors will become part of the
TeraGrid.
- With the completion of the new Atlanta TeraGrid hub and a
10-gigabit-per-second TeraGrid connection to the Oak Ridge National Laboratory
(ORNL), users of ORNL's neutron science facilities, such as the High Flux
Isotope Reactor (HFIR, http://neutrons.ornl.gov/) and the Spallation Neutron
Source (SNS, http://www.sns.gov/), will be able to use TeraGrid resources and
services for the storage, distribution, analysis, and simulation of their
experiments and data.
- The Pittsburgh Supercomputing Center (PSC), a lead computing site, provides
computational power to researchers via its 3,000-processor HP AlphaServer
system, TCS-1, which offers 6 teraflops of capability coupled uniquely to a
21-node visualization system. PSC also provides a 128-processor, 512GB
shared-memory HP Marvel system, a 150TB disk cache, and a mass-store system
with a capacity of 2.4 petabytes.
- The San Diego Supercomputer Center (SDSC) leads the TeraGrid data and
knowledge management effort by deploying a data-intensive IBM Linux cluster
based on Intel Itanium family processors, with a peak performance of just over
4 teraflops and 540 terabytes of network disk storage. In addition, a portion
of SDSC's DataStar IBM 10-teraflops supercomputer is assigned to the TeraGrid.
An IBM HPSS archive currently stores a petabyte of data. A next-generation Sun
Microsystems high-end server helps provide data services.
- The Texas Advanced Computing Center (TACC) offers users high-end computers
capable of 6.2 teraflops, a terascale visualization system, a 2.8-petabyte
mass storage system, and access to geoscience and biological morphology data
collections.
Through these nine sites, the TeraGrid provides 40 teraflops of computing
power with petabyte-scale data storage and operates over a
40-gigabit-per-second network.
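As a rough cross-check of the 40-teraflop figure, the short sketch below
tallies only the per-site numbers that the list above states explicitly. It
rests on assumptions: "more than 6" and "just over 4" are treated as lower
bounds, and the TeraGrid portion of SDSC's DataStar, the NCSA Altix system,
and the sites listed without a teraflop figure are left out because the text
gives no number for them. The dictionary labels are illustrative, not official
system names. The itemized sum is therefore expected to fall short of the
quoted total.

```python
# Peak teraflops as stated per site in the list above (lower bounds where the
# text says "more than" or "just over"). Labels are illustrative only.
stated_teraflops = {
    "Argonne visualization cluster": 1.0,
    "Indiana + Purdue": 6.0,       # "more than 6"
    "NCSA Mercury": 10.0,
    "PSC TCS-1": 6.0,
    "SDSC Itanium cluster": 4.0,   # "just over 4"
    "TACC": 6.2,
}

# Aggregate figure quoted for the TeraGrid as a whole.
QUOTED_TOTAL = 40.0

stated_sum = sum(stated_teraflops.values())
unitemized = QUOTED_TOTAL - stated_sum

print(f"Sum of itemized figures: {stated_sum:.1f} teraflops")
print(f"Not itemized in the list: {unitemized:.1f} teraflops")
```

The difference between the two numbers corresponds to capacity that the list
mentions without a specific figure.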
Scientists in a wide range of fields have already begun using the TeraGrid:
- The Center for Imaging Science (CIS) at Johns Hopkins University has
deployed its shape-based morphometric tools on the TeraGrid to support the
Biomedical Informatics Research Network, a National Institutes of Health
initiative involving 15 universities and 22 research groups whose work centers
on brain imaging of human neurological disorders and associated animal models.
Initial studies have mapped hippocampal data from Alzheimer's, semantic
dementia, and control subjects using these tools.
- Harvey Newman, a particle physicist from the California Institute of
Technology in Pasadena, was granted the single largest TeraGrid allocation to
investigate the discovery potential of CERN's CMS experiment at the Large
Hadron Collider, in particular the efficiency of detecting the decay of the
Higgs boson into two energetic photons. The work involves generating,
simulating, reconstructing, and analyzing tens of millions of proton-proton
collisions, and deriving limits on the efficiency for discoveries by the CMS
collaboration in the early years of running at the LHC, which starts operating
in 2007.
- Michael Norman, an astrophysicist at the University of California, San
Diego, is conducting detailed simulations of the evolution of the universe. He
has ported his "Enzo" code to the TeraGrid and will follow the evolution of
the cosmos from shortly after the Big Bang, through the formation of gas
clouds and galaxies, all the way to the present era.
- Klaus Schulten, a biophysicist at the University of Illinois, used
terascale massive parallelism on the TeraGrid for major advances in the
understanding of membrane proteins. He is also harnessing the TeraGrid to
attack problems in the mechanisms of bioenergetic proteins, the recognition
and regulation of DNA by proteins, the molecular basis of lipid metabolism,
and the mechanical properties of cells.
The Coordinated TeraGrid Software and Services (CTSS) suite provides a common
user environment across the heterogeneous resources in TeraGrid and supports
Grid-based capabilities such as certificate-based single sign-on and
distributed applications management via the Globus Toolkit. A distributed
accounting infrastructure, developed at NCSA, supports
general allocations that can be redeemed at any TeraGrid resource, and a
software and services verification and validation system, developed at SDSC,
provides continuous monitoring of the software infrastructure across all
sites. By integrating the TCS-1 system, PSC spearheaded TeraGrid's expansion
toward interoperability, that is, a Grid environment that integrates
heterogeneous system architectures. TeraGrid now encompasses a flexible array
of systems.
Over the next several years, the collaborative TeraGrid team will enhance and
expand the services offered to scientific users. Future features the team
plans to add include improved meta-scheduling and co-scheduling services, a
global file system to facilitate the use of data at distributed sites, and
"Science Gateways," including Web-based portals that provide a user-friendly
interface to the TeraGrid's services and meet the unique needs of specific
research communities.
For more information on the TeraGrid, go to
www.teragrid.org.