Special Features:
TUECKE HAS SEEN THE FUTURE OF DISTRIBUTED COMPUTING ON THE GRID
By Robert McMillan, Editor-at-large, Linux Magazine
Robert McMillan is a freelance journalist and editor at large at Linux
Magazine.
Although scientists have been using Grid technologies since the Condor
project began scavenging up idle computer cycles at the University of
Wisconsin in the mid-1980s, the really exciting vision of Grid computing -- a
set of open and ubiquitous standards that real world developers will use for
distributed computing -- remains a vision of the future. One of the people
most actively involved in making that vision a reality is Steve Tuecke, lead
software architect in the Distributed Systems Laboratory at Argonne National
Laboratory and lead architect of the Globus Toolkit, the popular
implementation of the OGSA (Open Grid Services Architecture) middleware
standards that are the basis of Grid computing.
Since he first began work on the I-Way supercomputing network seven years
ago, software architect Steve Tuecke has been one of the driving forces behind
Grid computing. He is lead software architect in the Distributed Systems
Laboratory at Argonne National Laboratory and lead architect of the Globus
Toolkit, the de facto standard open source Grid middleware solution and the
inspiration for the emerging standard OGSA (Open Grid Services Architecture),
which it implements in the recent Globus Toolkit 3.0 alpha release. Steve is
well positioned to offer insight on this cutting-edge technology area. And he
did just that in this interview with developerWorks correspondent Robert
McMillan.
dW: Why is there so much interest in the Grid right
now?
Tuecke: I think it's the confluence of a few events. One is that during
the mid- to late 1990s, there was a huge explosion of connectivity. This
explosion really put the infrastructure in place that let companies
realistically talk about outsourcing their capabilities, or setting up
networks with their partners.
A second thing is that during this same period -- certainly in scientific
computing -- we observed larger and larger collaborations coming together to
work. So, increasingly, scientific teams were not made up of just a single
principal investigator and a few graduate students; they would be large
collaborations, on an international scale. You'd get things like these
high-energy physics detectors, where it would take a collaboration of a
thousand physicists to build these things.
And so that push toward collaboration spurred the need to start working
across the traditional organizational boundaries. Groups of people began
coming together for a single research purpose, not because they worked for the
same laboratory.
dW: Java technology has been used in the Grid world for some time
now, but in the last year J2EE has become more important to Grid computing.
What do you see as the Java platform's role on the Grid?
Tuecke: Ignoring Grid computing for a moment, let's just think about
where Java itself has caught hold. The first big place Java got a grip was
within the portal server-side community. As businesses started adopting these
products and good tool sets came out, this started getting uptake within
the scientific community as well. So several years ago, you started seeing
people trying to build Web-based user interfaces to Grid environments using
Java as a translation medium.
The other thing that's happened over the last couple of years is the
emergence of J2EE as one of the two new dominant platforms (along with .Net)
for building next-generation business infrastructure. That platform, as a
basis for doing a whole lot of interesting business server-side applications,
has really exploded as well.
The third comment I'd make is that Java has seen a very big adoption for
educational purposes. I think the educational environment really picked up on
Java as an instructional tool -- even quicker than the business environment --
and as a basic language they could use for prototyping and
experimentation.
And so all of these things were happening at the same time. Gregor [von
Laszewski] was right in the mix of this with the CoG [Commodity Grid Toolkit].
The CoG originally started as tools to allow some of these portal-based
interfaces to Grids, and some Java GUIs to interface to Grid environments, but
really it's a client-side thing. And then we continued to do various
experimentation with pushing more and more services capability into Java
components as well. And so, a year ago, once we started really working closely
with IBM on the OGSA [the architecture that will be built upon the Open Grid
Services Infrastructure, or OGSI] path, we then brought Web services into this
mix. And of course Java is one of the leading platforms for programming Web
services.
So all of these things were kind of swirling and coming together to get us
to where we are today -- where Java is just a good platform for building Web
service-enabled server-side components. And that's what OGSA is about: to
define standards that talk about how I as a client interact with these Grid
services.
dW: So for you, it was the desire to use Web services standards that
got you to J2EE?
Tuecke: It was the combination of that and the overall broad
adoption of Java. To be clear, we're not a Java-only shop by any means. We
still believe very strongly in the protocols, standards, and multiple tool
kits for programming. So we have an active effort going on in C-based hosting
environments for Web services; we're working closely with a group at Lawrence
Berkeley Laboratory doing Python-based environments. We think all of those
things matter a lot. But a year ago when we were launching down the path of
doing this Grid services thing, there were really only two decent toolkits out
there for doing Web services: there was Apache Axis and Java, and then there
was .Net. On the open source side, Apache Axis was about it, so we decided to
follow that route.
dW: So are people in the scientific community interested in J2EE, or
is this just something that you're developing to make Grid computing
applicable to a broader community?
Tuecke: There's certainly a lot of work in the scientific community using
Java. I think a lot of people in the scientific community are still trying to
get their heads around J2EE. This also partly depends on which part of J2EE
you're talking about. Some people, when they think J2EE, they think servlet
engines. Others, they think entity and session beans. Certainly the scientific
community is very heavily invested in, for example, servlet-based toolkits for
interfacing to their scientific applications. I think we've seen less adoption
of the heavier-weight container-based models, such as you see in entity beans
and session beans. Entity beans really are designed for business relational
database sort of applications. You don't see that nature of application quite
as much in the scientific community.
Another factor in the scientific space -- more so than in the commercial
space -- is open source. There are a lot of groups who believe strongly in
open source software as the basis of doing their scientific work. That's
partly because it's free, and partly because that community has a history of
needing really special stuff, so they want the ability to go in and make the
tweaks necessary to really make it work for their application. All of these
things are good for them.
dW: How far away are these Grid standards from being adopted in the
commercial space? When you look at real applications of Grid computing right
now, they all seem to be scientific.
Tuecke: There are a few commercial applications starting to pop up out
there, but most of them are still in the genre of the cycle scavenging type
problem. I think there are two questions in there: one is the question of
adoption, the other is the question of standards.
On the standards side, Globus Toolkit has really, over the last year or
two, become the de facto standard software for doing a lot of the Grid-based
applications, especially in science. But they're not protocol standards, which
is where we want to be. OGSI is the first level of the infrastructure: it
takes Web services, puts a series of patterns on them, uses standard
interfaces for using server state management, lifetime management, monitoring,
and discovery.
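
To make the patterns Tuecke lists here more concrete, the following is a minimal Java sketch of a service that carries state, exposes that state for monitoring and discovery, and has a lifetime a client can extend. Every name in it (StatefulService, CounterService, queryState, extendLifetime, isExpired) is a hypothetical illustration chosen for this example; it is not the OGSI specification or the Globus Toolkit API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: these names are illustrative assumptions,
// not interfaces defined by OGSI or shipped with the Globus Toolkit.
interface StatefulService {
    // Monitoring and discovery: look up a named piece of service state.
    String queryState(String name);

    // Lifetime management: a client asks to keep the service alive until a time.
    void extendLifetime(long untilMillis);

    // True once the agreed termination time has been reached.
    boolean isExpired(long nowMillis);
}

class CounterService implements StatefulService {
    private final Map<String, String> state = new HashMap<>();
    private long terminationTime;
    private int count = 0;

    CounterService(long terminationTime) {
        this.terminationTime = terminationTime;
        state.put("count", "0");
    }

    // The application-specific operation; it also updates the published state.
    int increment() {
        count++;
        state.put("count", Integer.toString(count));
        return count;
    }

    @Override
    public String queryState(String name) {
        return state.get(name);
    }

    @Override
    public void extendLifetime(long untilMillis) {
        // Only ever push the termination time later, never earlier.
        terminationTime = Math.max(terminationTime, untilMillis);
    }

    @Override
    public boolean isExpired(long nowMillis) {
        return nowMillis >= terminationTime;
    }
}
```

In this sketch the clock value is passed in explicitly rather than read from the system, which keeps the example deterministic; a real hosting environment would presumably supply the time itself.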
What I think you're going to see over the next year is a huge thrust in the
community, those of us in Globus included, to start fleshing out what we call
the set of common services. We will use that OGSI base to start filling in
standard interfaces for talking about data access and resource management,
logging and monitoring services, discovery services, and things like that. For
really widespread adoption, I think that sort of work is needed. We're working
our way up to the point where we can start just deploying Grid components and
tying them together.
You're going to see in a small number of months, delivery of the base
capability -- from a variety of vendors -- of OGSI. We've been shipping the
technology freely for almost a year already. So that gives a very strong basis
for doing either experimentation, or if the nature of the application you're
trying to build is much more customized, where having all this stock
componentry available isn't going to matter as much, great, go for it. You can
go with the tools that are there now.
dW: Now on the other side, is Grid development going to start
driving J2EE standards?
Tuecke: It's perhaps a little too early to know, but I think it's
quite likely that that will happen in some areas.
Let me back up. What does it mean to ship a J2EE-based Grid environment,
something that can deliver OGSI-compliant services? It means that you provide
a server programming environment that makes it very easy for service writers
to implement services that conform to the set of standards that are OGSI. In
other words, you're defining containers. You're starting to define standard
containers that handle a lot of the lifetime issues -- all of the things that
OGSI handles: the lifetime management, the monitoring, all that sort of stuff.
So if you look at what we're shipping with Globus as our implementation of
OGSI, what we've done is defined a set of containers in Java, in J2EE, that
make it very easy to write services. If you follow the J2EE mantra of
standardizing the Java interfaces and having multiple implementations of that
by multiple vendors, that's exactly what we think needs to happen on the Grid
side.
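
The container idea described above can be sketched in a few lines of Java: one standard interface that any vendor could implement, and a container that takes over lifetime handling so the service writer supplies only the application logic. This is a sketch under stated assumptions, not Globus code: ServiceLogic, ServiceContainer, InMemoryContainer, and their methods are hypothetical names invented for this example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; none of this is the Globus or J2EE API.

// What the service writer implements: application logic only.
interface ServiceLogic {
    String handle(String request);
}

// The piece that would be standardized, with multiple vendor implementations.
interface ServiceContainer {
    void deploy(String name, ServiceLogic logic, long expiresAtMillis);
    String invoke(String name, String request, long nowMillis);
    int sweep(long nowMillis); // removes expired services, returns how many
}

// One possible implementation of the standard container interface.
class InMemoryContainer implements ServiceContainer {
    private static final class Entry {
        final ServiceLogic logic;
        final long expiresAt;

        Entry(ServiceLogic logic, long expiresAt) {
            this.logic = logic;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> services = new LinkedHashMap<>();

    @Override
    public void deploy(String name, ServiceLogic logic, long expiresAtMillis) {
        services.put(name, new Entry(logic, expiresAtMillis));
    }

    @Override
    public String invoke(String name, String request, long nowMillis) {
        Entry entry = services.get(name);
        // The container, not the service, enforces the lifetime.
        if (entry == null || nowMillis >= entry.expiresAt) {
            throw new IllegalStateException("no live service named " + name);
        }
        return entry.logic.handle(request);
    }

    @Override
    public int sweep(long nowMillis) {
        int removed = 0;
        Iterator<Entry> it = services.values().iterator();
        while (it.hasNext()) {
            if (nowMillis >= it.next().expiresAt) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Because callers depend only on ServiceContainer, a different implementation could be swapped in without changing the service logic, which is the property the J2EE comparison is pointing at.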
The same is true of these interfaces for programming a Grid service: today they're
not standard. They're something that Globus, with some help from a few
friends, puts together. Over the next year or so, we're going to have to start
taking that out of the Globus universe and moving those into Java interface
standards, which really pushes us into the JCP process and starts to drive
some of the container models.
dW: It's interesting that you have the open source licensing
approach, but some people seem to feel that the whole Grid architecture is
being dictated from above, which is somewhat anathema to the open source
approach.
Tuecke: I know people have that criticism of us, and there's worry
that Globus/IBM are driving this path. Maybe the problem is that the open
source community really isn't one community. It's a whole bunch of them. So it
may be better to speak of particular open source projects as communities.
There are certainly a lot of people in the broader open source community who
will always chafe at conforming to anybody's anything, and will start again
from scratch. With Globus, we're clearly one of the bigger open source
communities in place, certainly in the Grid space, and it's critical that we
have things like OGSI there as a basis that we can just assume to keep moving
up.
One of the problems that we've seen over the last number of years, and one
of the things that really motivated us toward OGSI as a basis for all these
higher level services is having enough infrastructure in place that we don't
have to keep re-inventing from scratch: building up from the IP layer for
example, or redoing the entire monitoring architecture for your entire
project.
There are always people that chafe at having to conform in some
environments, but history shows that over time what causes a particular
technology to explode is standardization. This is true whether it came along
out of single open source projects like Perl, or Web-type things. Perl is one
where the standardization is de facto by a small set of big gurus, as opposed
to the Web where it was more of a consortium-based, W3C-based standardization.
Our feeling was, in order to get the ubiquity with Grid, that standardization
has got to happen, and somebody's got to lead it.
The last year was interesting, picking OGSI as an example: I had started
writing the first Grid service spec a year and a half ago as a completely
internal skunkworks kind of thing. I shared it with IBM when we chose to
develop it with them. We went public with it last February. And at that time
there was great excitement, but there were also a lot of people really
blasting us, saying, "This is a fait accompli. These guys have done
everything, they're just saying, 'This is it. Accept it or go away.'" Our only
response at that time was, "Watch us." The whole point was to take it to the
GGF [Global Grid Forum, the standards body of the Grid], have a good
defensible position that made sense so that we could take it as a starting
point, and then have an open process to A) vet it, to see if people liked it
and thought that the direction made sense, and B) take it and make it their
own and refine it and push it forward.
And that's really what's happened in GGF and the OGSI working group within
the last ten months. It's gone from this spec that a few of us in Globus and
IBM wrote, to this thing where there's extremely active participation from
half a dozen different vendors and projects. There is a core team of probably
10 people in the working group that are from all over the place -- including
Avaki and Fujitsu Labs, Sun, Globus, and IBM. So I think that process
demonstrated that one can be a leader in a community and still have it be an
open community. That's what's going to have to keep happening.
Resources
Interested in learning more about Grid computing? Check out the Grid
Application Framework for Java (GAF4J) from alphaWorks. GAF4J is a
lightweight Java framework based on Globus Java CoG Toolkit v0.9 that aids
building Grid client applications in the Java language.
IBM researchers Liang-Jie Zhang and Jen-Yao Chung and software engineer Qun
Zhou take a look at developing Grid computing applications in a two-part
series featured on the developerWorks Web services zone: Part 1 (November
2002) introduces the basic idea of Grid computing and the Open Grid Services
Architecture (OGSA); Part 2 (December 2002) continues with an introduction to
a Grid solution architecture that includes both a logical and physical
Grid-based Grid solution sphere.
Download the IBM Grid Toolbox for AIX and Linux from alphaWorks, which is
based on the Globus Toolkit Version 2.2.3. This full-featured toolbox allows
for setting up and implementing a Grid, for Grid application development, and
for better utilization of networked resources with Grid technologies.
IBM Research is up to some very interesting work in the field of computer
sciences.
Find hundreds more Java technology resources on the developerWorks Java
technology zone.
About The Author
Robert McMillan is a freelance journalist and editor at large at Linux
Magazine. Robert can be contacted at bob@linux-mag.com.