GRIDtoday

DAILY NEWS AND INFORMATION FOR THE GLOBAL GRID COMMUNITY / MAY 19, 2003: VOL. 2 NO. 20

Breaking News - Operating Systems & Middleware:

Grid Computing: Conceptual Flyover For Developers

First published by IBM developerWorks at http://www.ibm.com/developerWorks.

Grid computing is the "next big thing," and this article's goal is to provide a "10,000-foot view" of key concepts. This article relates many Grid computing concepts to known quantities for developers, such as object-oriented programming, XML, and Web services. The author offers a reading list of white papers, articles, and books where you can find out more about Grid computing.

First came the mainframes: huge hulking computational devices that lived in the rarefied atmospheres of big corporate and university labs, attended to by a secluded priesthood of engineers. Later came the desktop machines, mini- and microcomputers that gave computing power to an ever-expanding group of people at work and home.

Then came the client-server and networking technologies and protocols to hook all these machines together and allow them to communicate. Fast on the heels of all that came the Internet, which expanded our ability to communicate and share files and data with any networked machine on the planet.

Now we're turning the corner on the next big thing: Grid computing, which has as much potential to change the way we do business as the Internet did. You're probably already familiar with technologies such as Web services, XML, and object-oriented programming. Grid computing is a lot like these, if only conceptually.

This article shows you how this emerging technology borrows from past technical concepts -- it won't take much for you to see the parallels between the development of Grid computing and that of Web services, XML, and other technical arenas. You'll also see how Grid services, and the very framework they rest on, are very much like object-oriented programming.

What's Grid computing?

Sometimes it's easier to start defining Grid computing by telling you what it isn't. For instance, it's not artificial intelligence, and it's not some kind of advanced networking technology. It's also not some kind of science-fictional panacea to cure all of our technology ailments.

If you can think of the Internet as a network of communication, then Grid computing is a network of computation: tools and protocols for coordinated resource sharing and problem solving among pooled assets. These pooled assets are known as virtual organizations. They can be distributed across the globe, heterogeneous (some PCs, some servers, maybe mainframes and supercomputers), somewhat autonomous (a Grid can potentially access resources in different organizations), and temporary.

Although Grid computing is firmly ensconced in the realm of academic and research activities (and has been for the past decade), more and more companies are starting to turn to it for solving hard-nosed, real-world problems.

Consider this: most IT departments are being forced to do more with less. Budgets are tight, resources are thin, and skilled human resources can be scarce or expensive. To top it off, most corporate managers know that they have a super-abundance of idle computing power. It's well known in industry circles that most desktop machines only use 5% to 10% of their capacity, and most servers barely peak out at 20%. No surprise then that many of the big money people in corporate America balk at the thought of purchasing more equipment to get the job done.

What these companies need is not more horsepower, but more efficient use of existing horsepower. They need a way to tie all of these idle machines together into a pool of potential labor, manage those resources, and provide secure and reliable access to the number-crunching muscle. Imagine if a corporation or organization could use all of its idle desktop PCs at night to run memory- and processor-intensive tasks. It would get more work done faster, possibly get to market faster, and at the same time cut down its IT expenses.
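
To put rough numbers on that idle horsepower, here is a back-of-the-envelope sketch. The fleet sizes (1,000 desktops, 200 servers) and the function name are hypothetical, chosen only for illustration; the utilization figures are the ones quoted above. It treats utilization as a flat average, which is a simplification.

```python
def idle_machine_equivalents(machine_count: int, utilization_percent: float) -> float:
    """Return idle capacity expressed as a number of fully idle machines."""
    if not 0 <= utilization_percent <= 100:
        raise ValueError("utilization_percent must be between 0 and 100")
    return machine_count * (100 - utilization_percent) / 100


# Hypothetical fleet sizes; the utilization figures come from the text above.
desktops_idle = idle_machine_equivalents(1000, 10)  # desktops at 10% utilization
servers_idle = idle_machine_equivalents(200, 20)    # servers peaking at 20%

print(f"Idle capacity: about {desktops_idle + servers_idle:.0f} machines' worth")
```

Under those assumptions, the unused capacity is on the order of a thousand machines, which is the pool a Grid would try to tap.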

Grid computing is emerging as a viable technology that businesses can use to wring more profits and productivity out of IT resources -- and it's going to be up to you developers and administrators to understand Grid computing and put it to work.

A Closer Look

From a developer's perspective, Grids are composed of virtual organizations using a common suite of protocols. These protocols allow Grid users and applications to run services in a secure, controlled manner.

Virtual organizations, as explained above, can be a handful of servers or desktop PCs in a single room, or a heterogeneous hodge-podge of systems scattered around the world connected via the Internet. All of these systems are able to work together because of certain protocols, which control connectivity, resource allocation and management, and coordination of those resources.

An effort is underway by the Global Grid Forum to organize these protocols under the Open Grid Services Architecture, or OGSA, which has grown out of the open standards-based Web Services world. It's called an architecture because it is mainly about describing and building a well-defined set of interfaces from which systems can be built, all based on open standards like the Web Services Description Language (WSDL). Furthermore, OGSA builds on the experience gained from building the Globus Toolkit, a valuable reference implementation.

Open standards and protocols lead to the building of services, and services are at the heart of the Grid. To put it simply, services allow users to do things on the Grid. Services can include:

  • Information queries
  • Network bandwidth allocation
  • Data management/extraction
  • Processor requests
  • Data session management
  • Workload balancing

When Grid experts talk about an individual service being run (for example, an information query), they call it a service instance (this usage is analogous to a class instance in object-oriented programming). Services and service instances can be lightweight and transient, or long-term tasks that require deep and wide support from the Grid. Services and service instances can be dynamic or interactive, or they can be batch-processed. They might run at scheduled times, or at arbitrary times.
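
If the class-instance analogy helps, here is a minimal sketch of it in Python. The class name, its fields, and the sample queries are all hypothetical; this is not OGSA or Globus code, just the object-oriented idea that one service definition can give rise to many separately identified, short-lived service instances.

```python
import itertools
from dataclasses import dataclass, field

# Hands out a fresh identifier to every service instance that is created.
_instance_ids = itertools.count(1)


@dataclass
class InformationQueryService:
    """Hypothetical service definition; each object is one service instance."""
    query: str
    instance_id: int = field(default_factory=lambda: next(_instance_ids))

    def run(self) -> str:
        # A real Grid service would consult Grid resources here.
        return f"instance {self.instance_id} answered: {self.query}"


# Two transient instances of the same service, each with its own identity.
first = InformationQueryService("How many CPUs are free?")
second = InformationQueryService("How much disk is free?")
print(first.run())
print(second.run())
```

The service is the definition; each request gets its own instance, which can be created, addressed, and discarded independently of the others.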

Good services matter not only for what they do for a Grid user, but also for what they enable: virtualization. A good set of services based on solid protocols can hide the complexity of certain requests -- solid virtualization can transform computing into a ubiquitous Grid that is more akin to our current electric and water utilities.

Think about it: When you plug an appliance into the wall, you don't know or care how the electricity flows into that appliance, nor do you know or care where the electricity comes from. It just works, and you access that electricity Grid to perform a task (such as toasting your bread or ironing your clothes). On the computing front, imagine one day being able to lease just a query tool from a Grid only when you need it, and not have to worry about databases, browsers, and operating systems.

Proponents of Grid technology say the same thing will happen with computing power -- but it's all about how we build protocols and services that allow computers in a Grid to interact.

Ian Foster, the "Father of Grid computing," uses the term semantics to describe the power of OGSA to define service instances: how each one is created and controlled, how to communicate with it, and so on. The use of the term "semantics," borrowed from linguistics and psychology, is a big clue that Grid computing isn't just about data and processors and tasks -- it's about context and meaning. Semantics in a computer programming environment is about more than just applying computing power to process data and push out a result set. It's really more about bringing a problem to the computer (or Grid) and getting a solution to that problem.

Challenges

If you're thinking that anything as complex as Grid computing is bound to have challenges associated with it, then you're right. For one thing, a Grid must be able to quickly ascertain what resources are available on any computer that joins it -- and more to the point, the Grid shouldn't be bogged down by a slow or outdated system. Avoiding the "least common denominator" problem is somewhat high on the technical "to-do" list.
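
One way to picture that resource check is a simple filter over what each joining machine reports about itself. The sketch below is an assumption-laden illustration: the `NodeReport` fields, the node names, and the thresholds are invented for this example and do not come from any Grid toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeReport:
    """Hypothetical self-description a machine sends when it joins the Grid."""
    name: str
    cpu_mhz: int
    free_memory_mb: int


def eligible_nodes(reports, min_cpu_mhz, min_free_memory_mb):
    """Keep only nodes that meet the minimum requirements for a job."""
    return [
        report for report in reports
        if report.cpu_mhz >= min_cpu_mhz
        and report.free_memory_mb >= min_free_memory_mb
    ]


reports = [
    NodeReport("lab-pc-01", cpu_mhz=2400, free_memory_mb=512),
    NodeReport("old-desktop", cpu_mhz=300, free_memory_mb=64),
    NodeReport("server-a", cpu_mhz=2000, free_memory_mb=2048),
]

# The slow, outdated machine is left out so it cannot hold the job back.
for node in eligible_nodes(reports, min_cpu_mhz=1000, min_free_memory_mb=256):
    print(node.name)
```

A real Grid would also have to keep these reports fresh, since a machine that was idle a minute ago may be busy now.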

Another huge issue has nothing to do with Grid technology specifically, but will still impact successful Grid deployments: making applications work in a Grid environment. Right now, most applications work in server or desktop environments. One set of processors does the work. On a Grid, the work can be parceled out to as many systems as are needed to do the work, and each system contributes to the task. The results are then assembled and sent back to the requesting system.
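
The parcel-out-and-assemble pattern described above can be sketched on a single machine. In this illustration a local thread pool stands in for the systems on a Grid, and the task (a sum of squares) and all function names are hypothetical; a real Grid would ship each chunk to a different machine and deal with failures, data movement, and security along the way.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_count):
    """Split items into at most chunk_count roughly equal pieces."""
    if not items:
        return []
    size = -(-len(items) // chunk_count)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def partial_sum_of_squares(chunk):
    """The piece of work each (simulated) system contributes."""
    return sum(x * x for x in chunk)


def run_on_simulated_grid(items, worker_count=4):
    """Parcel the work out, run the pieces concurrently, assemble the result."""
    chunks = split_into_chunks(items, worker_count)
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partial_results = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partial_results)


print(run_on_simulated_grid(list(range(10))))
```

The requesting code only sees the assembled answer; how many workers took part, and which chunk each one handled, stays hidden behind `run_on_simulated_grid`.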

Once those applications are ported over to a Grid environment, you have to start worrying about how the data is shared, chopped, sifted, moved around, secured, and managed. The user or application that requested the data needs to be the only entity that gets the data back, and it has to be intelligible.

Security is definitely an enormous requirement -- after all, you don't want just anyone getting access to Grid resources. And those who add their systems to a Grid will want to control who has access to use their resources, and when.
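
As a small illustration of "who has access, and when," here is a sketch of an owner-defined policy check. The `AccessPolicy` class, the user names, and the overnight window are hypothetical; real Grid security involves authentication, credentials, and delegation that this example does not attempt to model.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical rule a resource owner attaches to a machine on the Grid."""
    allowed_users: frozenset
    window_start: time
    window_end: time

    def permits(self, user: str, at: time) -> bool:
        if user not in self.allowed_users:
            return False
        if self.window_start <= self.window_end:
            return self.window_start <= at < self.window_end
        # The window wraps past midnight, for example 20:00 to 06:00.
        return at >= self.window_start or at < self.window_end


# The owner lends the machine out overnight, and only to two named users.
policy = AccessPolicy(
    allowed_users=frozenset({"alice", "render-farm"}),
    window_start=time(20, 0),
    window_end=time(6, 0),
)

print(policy.permits("alice", time(23, 30)))    # inside the overnight window
print(policy.permits("alice", time(12, 0)))     # the owner needs the machine
print(policy.permits("mallory", time(23, 30)))  # not on the owner's list
```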

Reliability and performance also remain important -- if the Grid doesn't perform the job well and fast, then the business case for it certainly diminishes.

Where To Go From Here

If you get the impression that Grid computing is where the Web was in 1993, then you're right. A tremendous amount of territory is being mapped right now. Some solid implementation foundations are being laid down. However, much of Grid computing is undiscovered country, and many groups are turning their attention to the emerging open standards. In many ways, the discussions about Grid services parallel those around Internet and XML standards in the mid-1990s.

Resources

Here is a starter list of white papers, books, and articles you can use to learn more about Grid computing. Do you have your own list of favorites? Send me an e-mail.

Anatomy of the Grid. This white paper by Ian Foster, Carl Kesselman, and Steven Tuecke defines the field of Grid computing. As the title suggests, the authors spend some time naming all of a Grid's constituent parts and defining what they do. Their focus is on Grid architecture.

Physiology of the Grid. This white paper by Ian Foster, Carl Kesselman, Jeffrey Nick, and Steven Tuecke explains how Grid computing can be put to work in a Web services environment. This is the white paper that presents more details about OGSA and Grid semantics (i.e., services). Together with "Anatomy of the Grid," it provides a fairly detailed (albeit a tad academic) overview of the world of Grid computing.

The Global Grid Forum (www.ggf.org) is a community-initiated forum of researchers and practitioners working on Grid computing. Its working groups produce technical specs, document user experiences, and write implementation guidelines. See GGF@WORK at www.ggf.org/home.php?div=atwork for a list of the working groups, including the Open Grid Services Architecture (OGSA) Working Group at www.ggf.org/ogsa-wg/.

The Globus Toolkit: Globus is an open-architecture, open standards tool for building computational Grids. It is widely cited as a solid reference implementation that will get your hands dirty in the world of building, deploying, and managing Grids. Also, look at the Globus FAQ.

"The Grid: Computing without Bounds", by Ian Foster (April 2003 Scientific American). This article is an excellent read for laymen, scientists, and techies alike. Ian does a terrific job of making most of the abstract parts of Grid computing tangible, making them come alive. He also gives you a peek at the future of Grid computing and how much change it can bring to the way we do business.

IBM VP Wladawsky-Berger explains Grid Computing. A great overview of Grid computing -- and IBM's role in it. He talks about how emerging standards and access to greater bandwidth make the dream of commercially viable Grid computing closer to reality than most people think.

Fundamentals of Grid Computing. A brief IBM Redpaper that offers a concise technical overview of Grid computing.

Grid Computing: Making the Global Infrastructure a Reality, edited by Fran Berman, Geoffrey Fox and Tony Hey. Published March 2003 by Wiley. This 1000+ page tome is filled with articles and essays that examine Grid computing from a variety of scientific and technical angles, including: history of the Grid, the semantic Grid, an overview of Grid architecture, Grid deployment models, OGSA, peer-to-peer Grid databases, and a lot more. Find it on Amazon at www.amazon.com/exec/obidos/asin/0470853190/.

10 Emerging Technologies that will Change the World. Technology Review named Grid computing as one of the ten technologies that will change the world. Article includes photos of Ian Foster and Carl Kesselman.

Grid-dy Determination: When it comes to companies that need a lot of CPU cycles to get their work done, Grid computing is definitely the way to go. Provides some real-world cases in which Grid computing is making all the difference: online gaming, financial number crunching, genome research, aerospace, and more.

About the author

Thomas Myer is the co-founder of Triple Dog Dare Media, an Austin-based Web applications development group. Triple Dog Dare Media specializes in content management solutions, e-commerce systems, and Web services deployments for small and medium business. Tom can be reached at tom@tripledogdaremedia.com.
