Applications:
PYTHON/GLOBUS TOOLS SPEED UP DEVELOPMENT OF DATA GRID FOR LIGO
Programming tools developed at the U.S. Department of Energy's Lawrence
Berkeley National Laboratory by Keith Jackson and his colleagues in the
Computational Research Division's Secure Grid Technologies Group have been
used to set up an efficient system to distribute new data that will put the
predictions of Einstein's General Theory of Relativity to the test. To date,
more than 50TB of data from LIGO has been replicated to nine sites on two
continents, quickly and robustly.
LIGO, the Laser Interferometer Gravitational-Wave Observatory, is a facility
dedicated to detecting cosmic gravitational waves -- ripples in the fabric of
space and time -- and interpreting these waves to provide a more complete
picture of the universe. Funded by the National Science Foundation, LIGO
consists of two widely separated installations -- one in Hanford, Wash., and
the other in Livingston, La. -- operated in unison as a single observatory.
Data from LIGO will be used to test the predictions of General Relativity --
for example, whether gravitational waves propagate at the same speed as light,
and whether the graviton particle has zero rest mass.
Because gravitational waves have never been directly detected (although their
influence on distant objects has been measured), LIGO is conducting blind
searches of large sections of the sky and producing an enormous quantity of
data -- almost 1TB a day -- which requires large-scale computational resources
for analysis.
Scientists in the LIGO Scientific Collaboration (LSC) at 41 institutions
worldwide need fast, reliable and secure access to the data. To optimize
access, the data sets are replicated to computer and data storage hardware at
nine sites: the two observatory sites plus Caltech, MIT, Penn State, the
University of Wisconsin-Milwaukee (UWM), the Max Planck Institute for
Gravitational Physics/Albert Einstein Institute in Potsdam, Germany, and Cardiff
University and the University of Birmingham in the United Kingdom. The LSC
DataGrid uses the DOEGrids Certificate Authority operated by ESnet to issue
identity certificates and service certificates.
The data distribution tool used by the LSC DataGrid is the Lightweight Data
Replicator (LDR), which was developed at UWM as part of the Grid Physics
Network (GriPhyN) project. LDR is built on a foundation that includes the
Globus Toolkit, Python, and pyGlobus, an interface that enables Python access
to the entire Globus Toolkit. LSC DataGrid engineer Scott Koranda describes
Python as the "glue to hold it all together and make it robust."
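The "glue" role Koranda describes can be illustrated with a minimal sketch. This is not LDR's code and it makes no pyGlobus or Globus Toolkit calls; the function names, the checksum catalog, and the use of a local file copy as a stand-in for a Grid transfer are all assumptions made for illustration. It shows only the general idea of replication: find what a site is missing, transfer it, and verify it.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(catalog: dict, source_dir: Path, replica_dir: Path,
              transfer=shutil.copyfile) -> list:
    """Copy every catalog entry that is missing or corrupt at the replica.

    `catalog` maps file name -> expected checksum (hypothetical layout).
    `transfer` is a stand-in for a real Grid transfer; here it is a local copy.
    """
    replica_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name, expected in sorted(catalog.items()):
        target = replica_dir / name
        if target.exists() and sha256_of(target) == expected:
            continue  # already replicated and intact
        transfer(source_dir / name, target)
        if sha256_of(target) != expected:
            target.unlink()  # do not leave a corrupt copy behind
            raise IOError(f"checksum mismatch after transfer: {name}")
        copied.append(name)
    return copied


def demo() -> tuple:
    """Replicate two dummy files twice; the second pass should copy nothing."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "observatory"
        replica = Path(tmp) / "replica"
        source.mkdir()
        for name in ("frame_a.dat", "frame_b.dat"):
            (source / name).write_bytes(name.encode() * 1000)
        catalog = {p.name: sha256_of(p) for p in source.iterdir()}
        first = replicate(catalog, source, replica)
        second = replicate(catalog, source, replica)
        return first, second
```

Passing the transfer step in as a parameter is what lets a short script like this stay unchanged while the underlying transport is swapped, which is one plausible reading of why a scripting language suits the glue role.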
pyGlobus is one of two Python tools developed by Jackson's group for the
Globus Toolkit, the basic software used to create computational and data
Grids. The pyGlobus interface, or "wrapper," allows the use of the entire Globus
Toolkit from Python, a high-level, interpreted programming language that is
widely used in the scientific and Web communities. pyGlobus is included in the
current Globus Toolkit 3.2 release.
"What's great about using pyGlobus and Python is the speed and ease of
development for setting up a new production Grid application," Jackson said.
"The scientists spend less time programming and move on to their real work --
analyzing data -- faster."
Another Python tool just released in the Globus Toolkit 3.9.2 Development
Release (alpha test version for next year's GT 4.0) is the Python WS Core, a
Python implementation of the Web Services Resource Framework (WS-RF)
specifications. When GT 4.0 is released, the Grid community will be moving
from homegrown protocols and specifications to industry-standard Web Service
protocols for client and server support and secure messaging. Moving to the
new standards will simplify the creation of Web services that can interface
efficiently with many resources.