DAILY NEWS AND INFORMATION FOR THE GLOBAL GRID COMMUNITY / MARCH 3, 2003: VOL. 2 NO. 9

Special Features:

TUECKE HAS SEEN THE FUTURE OF DISTRIBUTED COMPUTING ON THE GRID
By Robert McMillan, Editor-at-large, Linux Magazine

Robert McMillan is a freelance journalist and editor at large at Linux Magazine.

Although scientists have been using Grid technologies since the Condor project began scavenging up idle computer cycles at the University of Wisconsin in the mid-1980s, the really exciting vision of Grid computing -- a set of open and ubiquitous standards that real world developers will use for distributed computing -- remains a vision of the future. One of the people most actively involved in making that vision a reality is Steve Tuecke, lead software architect in the Distributed Systems Laboratory at Argonne National Laboratory and lead architect of the Globus Toolkit, the popular implementation of the OGSA (Open Grid Services Architecture) middleware standards that are the basis of Grid computing.

Since he first began work on the I-Way supercomputing network seven years ago, software architect Steve Tuecke has been one of the driving forces behind Grid computing. He is lead software architect in the Distributed Systems Laboratory at Argonne National Laboratory and lead architect of the Globus Toolkit, the de facto standard open source Grid middleware solution and the inspiration for the emerging standard OGSA (Open Grid Services Architecture), which it implements in the recent Globus Toolkit 3.0 alpha release. Steve is well positioned to offer insight on this cutting-edge technology area. And he did just that in this interview with developerWorks correspondent Robert McMillan.

dW: Why is there so much interest in the Grid right now?

Tuecke: I think it's the confluence of a few events. One is during the mid- to late-1990s, there was a huge explosion of connectivity. This explosion really put the infrastructure in place that let companies realistically talk about outsourcing their capabilities, or setting up networks with their partners.

A second thing is that during this same period -- certainly in scientific computing -- we observed larger and larger collaborations coming together to work. So, increasingly, scientific teams were not made up of just a single principal investigator and a few graduate students; they would be large collaborations, on an international scale. You'd get things like these high-energy physics detectors, where it would take a collaboration of a thousand physicists to build these things.

And so that push toward collaboration spurred the need to start working across the traditional organizational boundaries. Groups of people began coming together for a single research purpose, not because they worked for the same laboratory.

dW: Java technology has been used in the Grid world for some time now, but in the last year J2EE has become more important to Grid computing. What do you see as the Java platform's role on the Grid?

Tuecke: Ignoring Grid computing for a moment, let's just think about where Java itself has caught hold. The first big place Java got a grip was within the portal server-side community. As businesses started adopting these products and good tool sets came out, this started getting uptake within the scientific community as well. So several years ago, you started seeing people trying to build Web-based user interfaces to Grid environments using Java as a translation medium.

The other thing that's happened over the last couple of years is the emergence of J2EE as one of the two new dominant platforms (along with .Net) for building next-generation business infrastructure. That platform, as a basis for doing a whole lot of interesting business server-side applications, has really exploded as well.

The third comment I'd make is that Java has seen a very big adoption for educational purposes. I think the educational environment really picked up on Java as an instructional tool -- even quicker than the business environment -- and as a basic language they could use for prototyping and experimentation.

And so all of these things were happening at the same time. Gregor [von Laszewski] was right in the mix of this with the CoG [Commodity Grid Toolkit]. The CoG originally started as tools to allow some of these portal-based interfaces to Grids, and some Java GUIs to interface to Grid environments, but really it's a client-side thing. And then we continued to do various experimentation with pushing more and more services capability into Java components as well. And so, a year ago, once we started really working closely with IBM on the OGSA [the architecture that will be built upon the Open Grid Services Infrastructure, or OGSI] path, we then brought Web services into this mix. And of course Java is one of the leading platforms for programming Web services.

So all of these things were kind of swirling and coming together to get us to where we are today -- where Java is just a good platform for building Web service-enabled server-side components. And that's what OGSA is about: to define standards that talk about how I as a client interact with these Grid services.

dW: So for you, it was the desire to use Web services standards that got you to J2EE?

Tuecke: It was the combination of that and the overall broad adoption of Java. To be clear, we're not a Java-only shop by any means. We still believe very strongly in the protocols, standards, and multiple tool kits for programming. So we have an active effort going on in C-based hosting environments for Web services; we're working closely with a group at Lawrence Berkeley Laboratory doing Python-based environments. We think all of those things matter a lot. But a year ago when we were launching down the path of doing this Grid services thing, there were really only two decent toolkits out there for doing Web services: there was Apache Axis and Java, and then there was .Net. On the open source side, Apache Axis was about it, so we decided to follow that route.

dW: So are people in the scientific community interested in J2EE, or is this just something that you're developing to make Grid computing applicable to a broader community?

Tuecke: There's certainly a lot of work in the scientific community using Java. I think a lot of people in the scientific community are still trying to get their heads around J2EE. This also partly depends on which part of J2EE you're talking about. Some people, when they think J2EE, they think servlet engines. Others, they think entity and session beans. Certainly the scientific community is very heavily invested in, for example, servlet-based toolkits for interfacing to their scientific applications. I think we've seen less adoption of the heavier-weight container-based models, such as you see in entity beans and session beans. Entity beans really are designed for business relational database sort of applications. You don't see that nature of application quite as much in the scientific community.

Another factor in the scientific space -- more so than in the commercial space -- is open source. There are a lot of groups who believe strongly in open source software as the basis of doing their scientific work. Partly because it's free, and partly because that community has a history of needing really special stuff, so they value the ability to go in and make the tweaks necessary to really make it work for their application. All of these things are good for them.

dW: How far away are these Grid standards from being adopted in the commercial space? When you look at real applications of Grid computing right now, they all seem to be scientific.

Tuecke: There are a few commercial applications starting to pop up out there, but most of them are still in the genre of the cycle scavenging type problem. I think there are two questions in there: one is the question of adoption, the other is the question of standards.

On the standards side, Globus Toolkit has really, over the last year or two, become the de facto standard software for doing a lot of the Grid-based applications, especially in science. But they're not protocol standards, which is where we want to be. OGSI is the first level of the infrastructure: it takes Web services, puts a series of patterns on them, and uses standard interfaces for server state management, lifetime management, monitoring, and discovery.

What I think you're going to see over the next year is a huge thrust in the community, us in Globus included, to start fleshing out what we call the set of common services. We will use that OGSI base to start filling in standard interfaces for talking about data access and resource management, logging, monitoring services, discovery services, and things like that. For really widespread adoption, I think that sort of work is needed. We're working our way up to the point where we can start just deploying Grid components and tying them together.

You're going to see in a small number of months, delivery of the base capability -- from a variety of vendors -- of OGSI. We've been shipping the technology freely for almost a year already. So that gives a very strong basis for doing either experimentation, or if the nature of the application you're trying to build is much more customized, where having all this stock componentry available isn't going to matter as much, great, go for it. You can go with the tools that are there now.
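
Editor's illustration: the pattern Tuecke describes for OGSI, a service that exposes its state and has its lifetime managed through common interfaces, can be sketched in a few lines of Java. This is only a toy under stated assumptions. The names ManagedService, JobCounter, queryState, and keepAliveUntil are hypothetical and are not taken from the OGSI specification or the Globus Toolkit, and times are plain numbers passed in by the caller rather than real clock readings.

```java
import java.util.Map;
import java.util.TreeMap;

public class GridServiceSketch {

    /** Hypothetical interface: behaviour every service in this sketch exposes. */
    interface ManagedService {
        /** State query: look up one named piece of service state. */
        String queryState(String name);

        /** Lifetime management: a client asks to keep the service alive until the given time. */
        void keepAliveUntil(long timeMillis);

        /** True once the agreed termination time has passed. */
        boolean isExpired(long nowMillis);
    }

    /** A toy service that counts jobs and exposes the count as queryable state. */
    static class JobCounter implements ManagedService {
        private final Map<String, String> state = new TreeMap<>();
        private long terminationTime;
        private int jobs;

        JobCounter(long initialTerminationTime) {
            this.terminationTime = initialTerminationTime;
            state.put("jobCount", "0");
        }

        void submitJob() {
            jobs++;
            state.put("jobCount", Integer.toString(jobs));
        }

        @Override
        public String queryState(String name) {
            return state.get(name);
        }

        @Override
        public void keepAliveUntil(long timeMillis) {
            // This sketch only ever extends the lifetime; it never shortens it.
            terminationTime = Math.max(terminationTime, timeMillis);
        }

        @Override
        public boolean isExpired(long nowMillis) {
            return nowMillis > terminationTime;
        }
    }

    public static void main(String[] args) {
        JobCounter service = new JobCounter(1_000L);
        service.submitJob();
        service.submitJob();
        System.out.println("jobCount = " + service.queryState("jobCount"));
        System.out.println("expired at t=1500? " + service.isExpired(1_500L));
        service.keepAliveUntil(2_000L);
        System.out.println("expired at t=1500 after keep-alive? " + service.isExpired(1_500L));
    }
}
```

The point of the sketch is that a client can query state and extend lifetime through the same interface for any service, without knowing what the service actually computes.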

dW: Now on the other side, is Grid development going to start driving J2EE standards?

Tuecke: It's perhaps a little too early to know, but I think it's quite likely that that will happen in some areas.

Let me back up. What does it mean to ship a J2EE-based Grid environment, something that can deliver OGSI-compliant services? It means that you provide a server programming environment that makes it very easy for service writers to implement services that conform to the set of standards that are OGSI. In other words, you're defining containers. You're starting to define standard containers that handle a lot of the lifetime issues -- all of the things that OGSI handles: the lifetime management, the monitoring, all that sort of stuff. So if you look at what we're shipping with Globus as our implementation of OGSI, what we've done is defined a set of containers in Java, in J2EE, that make it very easy to write services. If you follow the J2EE mantra of standardizing the Java interfaces and having multiple implementations of that by multiple vendors, that's exactly what we think needs to happen on the Grid side.

Just as with these interfaces for programming a Grid service; today they're not standard. They're something that Globus, with some help from a few friends, puts together. Over the next year or so, we're going to have to start taking that out of the Globus universe and moving those into Java interface standards, which really pushes us into the JCP process and starts to drive some of the container models.
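
Editor's illustration: the container idea Tuecke outlines, where a standard Java interface is implemented by multiple vendors and the container rather than the service writer handles lifetime and discovery, might look roughly like the sketch below. All names here (ServiceLogic, ServiceContainer, InMemoryContainer) are hypothetical and do not come from the Globus Toolkit, J2EE, or any JCP specification; times are plain numbers supplied by the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContainerSketch {

    /** What the service writer supplies: business logic only. */
    interface ServiceLogic {
        String handle(String request);
    }

    /** Hypothetical standard container interface that several vendors could implement. */
    interface ServiceContainer {
        void deploy(String name, ServiceLogic logic, long expiresAtMillis);
        String invoke(String name, String request, long nowMillis);
        List<String> discover(long nowMillis);
    }

    /** One possible implementation; another vendor could supply a different one. */
    static class InMemoryContainer implements ServiceContainer {

        /** Book-keeping the container holds on behalf of each deployed service. */
        private static class Deployment {
            final ServiceLogic logic;
            final long expiresAtMillis;

            Deployment(ServiceLogic logic, long expiresAtMillis) {
                this.logic = logic;
                this.expiresAtMillis = expiresAtMillis;
            }
        }

        private final Map<String, Deployment> services = new LinkedHashMap<>();

        @Override
        public void deploy(String name, ServiceLogic logic, long expiresAtMillis) {
            services.put(name, new Deployment(logic, expiresAtMillis));
        }

        @Override
        public String invoke(String name, String request, long nowMillis) {
            Deployment d = services.get(name);
            // The container, not the service logic, enforces the lifetime.
            if (d == null || nowMillis > d.expiresAtMillis) {
                throw new IllegalStateException("no live service named " + name);
            }
            return d.logic.handle(request);
        }

        @Override
        public List<String> discover(long nowMillis) {
            // Discovery lists only services whose lifetime has not yet run out.
            List<String> live = new ArrayList<>();
            for (Map.Entry<String, Deployment> e : services.entrySet()) {
                if (nowMillis <= e.getValue().expiresAtMillis) {
                    live.add(e.getKey());
                }
            }
            return live;
        }
    }

    public static void main(String[] args) {
        ServiceContainer container = new InMemoryContainer();
        container.deploy("echo", request -> "echo: " + request, 1_000L);
        container.deploy("upper", String::toUpperCase, 5_000L);
        System.out.println(container.invoke("echo", "hello", 500L));
        System.out.println(container.discover(2_000L));
    }
}
```

Because callers depend only on the ServiceContainer interface, swapping InMemoryContainer for another implementation would not change the service logic, which is the property Tuecke attributes to the J2EE approach.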

dW: It's interesting that you have the open source licensing approach, but some people seem to feel that the whole Grid architecture is being dictated from above, which is somewhat anathema to the open source approach.

Tuecke: I know people have that criticism of us, and there's worry that Globus/IBM are driving this path. Maybe the problem is that the open source community really isn't one community. It's a whole bunch of them. So it may be better to speak of particular open source projects as communities. There are certainly a lot of people in the broader open source community who will always chafe at conforming to anybody's anything, and will start again from scratch. With Globus, we're clearly one of the bigger open source communities in place, certainly in the Grid space, and it's critical that we have things like OGSI there as a basis that we can just assume to keep moving up.

One of the problems that we've seen over the last number of years, and one of the things that really motivated us toward OGSI as a basis for all these higher level services is having enough infrastructure in place that we don't have to keep re-inventing from scratch: building up from the IP layer for example, or redoing the entire monitoring architecture for your entire project.

There are always people that chafe at having to conform in some environments, but history shows that over time what causes a particular technology to explode is standardization. This is true whether it came along out of single open source projects like Perl, or Web-type things. Perl is one where the standardization is de facto by a small set of big gurus, as opposed to the Web where it was more of a consortium-based, W3C-based standardization. Our feeling was, in order to get the ubiquity with Grid, that standardization has got to happen, and somebody's got to lead it.

The last year was interesting, picking OGSI as an example: I had started writing the first Grid service spec a year and a half ago as a completely internal skunkworks kind of thing. I shared it with IBM when we chose to develop it with them. We went public with it last February. And at that time there was great excitement, but there were also a lot of people really blasting us saying, "This is a fait accompli. These guys have done everything, they're just saying, 'This is it. Accept it or go away.'" Our only response at that time was, "Watch us." The whole point was to take it to the GGF [Global Grid Forum, the standards body of the Grid], have a good defensible position that made sense so that we could take it as a starting point, and then have an open process to A) vet it, to see if people liked it and thought that the direction made sense, and B) take it and make it their own and refine it and push it forward.

And that's really what's happened in GGF and the OGSI working group within the last ten months. It's gone from this spec that a few of us in Globus and IBM wrote, to this thing where there's extremely active participation from half a dozen different vendors and projects. There is a core team of probably 10 people in the working group that are from all over the place -- including Avaki and Fujitsu Labs, Sun, Globus, and IBM. So I think that process demonstrated that one can be a leader in a community and still have it be an open community. That's what's going to have to keep happening.

Resources

Interested in learning more about Grid computing? Check out the Grid Application Framework for Java (GAF4J) from alphaWorks. GAF4J is a lightweight Java framework based on Globus Java CoG Toolkit v0.9 that aids building Grid client applications in the Java language.

IBM researchers Liang-Jie Zhang and Jen-Yao Chung and software engineer Qun Zhou take a look at developing Grid computing applications in a two-part series featured on the developerWorks Web services zone: Part 1 (November 2002) introduces the basic idea of Grid computing and the Open Grid Services Architecture (OGSA); Part 2 (December 2002) continues with an introduction to a Grid solution architecture that includes both a logical and physical Grid-based Grid solution sphere.

Download the IBM Grid Toolbox for AIX and Linux from alphaWorks, which is based on the Globus Toolkit Version 2.2.3. This full-featured toolbox allows for setting up and implementing a Grid, for Grid application development, and for better utilization of networked resources with Grid technologies.

IBM Research is up to some very interesting work in the field of computer sciences.

Find hundreds more Java technology resources on the developerWorks Java technology zone.

About The Author

Robert McMillan is a freelance journalist and editor at large at Linux Magazine. Robert can be contacted at bob@linux-mag.com.
