DAILY NEWS AND INFORMATION
FOR THE GLOBAL GRID COMMUNITY / MAY 19, 2003: VOL. 2 NO. 20
Breaking News - Operating Systems & Middleware:
Grid Computing: Conceptual Flyover For Developers
First published by IBM developerWorks at http://www.ibm.com/developerWorks.
Grid computing is the "next big thing," and this article's goal is to provide
a "10,000-foot view" of key concepts. This article relates many Grid computing
concepts to known quantities for developers, such as object-oriented
programming, XML, and Web services. The author offers a reading list of white
papers, articles, and books where you can find out more about Grid
computing.
First came the mainframes: huge hulking computational devices that lived in
the rarefied atmospheres of big corporate and university labs, attended to by
a secluded priesthood of engineers. Later came the desktop machines, mini- and
microcomputers that gave computing power to an ever-expanding group of people
at work and home.
Then came the client-server and networking technologies and protocols to hook
all these machines together and allow them to communicate. Fast on the heels
of all that came the Internet, which expanded our ability to communicate and
share files and data with any networked machine on the planet.
Now we're turning the corner on the next big thing: Grid computing, and it has
as much potential for changing the way we do business as the Internet did.
You're probably already familiar with technologies such as Web services, XML,
and object-oriented programming. Grid computing is a lot like these, if only
conceptually.
This article shows you how this emerging technology borrows from past
technical concepts -- it won't take much for you to see the parallels between
the development of Grid computing and that of Web services, XML, and other
technical arenas. You'll also see how Grid services, and the very framework
they rest on, are very much like object-oriented programming.
What's Grid computing?
Sometimes it's easier to start defining Grid computing by telling you what it
isn't. For instance, it's not artificial intelligence, and it's not some kind
of advanced networking technology. It's also not some kind of
science-fictional panacea to cure all of our technology ailments.
If you can think of the Internet as a network of communication, then Grid
computing is a network of computation: tools and protocols for coordinated
resource sharing and problem solving among pooled assets. These pooled assets
are known as virtual organizations. They can be distributed across the globe;
they're heterogeneous (some PCs, some servers, maybe mainframes and
supercomputers); somewhat autonomous (a Grid can potentially access resources
in different organizations); and temporary.
Although Grid computing is firmly ensconced in the realm of academic and
research activities (and has been for the past decade), more and more
companies are starting to turn to it for solving hard-nosed, real-world
problems.
Consider this: most IT departments are being forced to do more with less.
Budgets are tight, resources are thin, and skilled human resources can be
scarce or expensive. To top it off, most corporate managers know that they
have a super-abundance of idle computing power. It's well known in industry
circles that most desktop machines only use 5% to 10% of their capacity, and
most servers barely peak out at 20%. No surprise then that many of the big
money people in corporate America balk at the thought of purchasing more
equipment to get the job done.
What these companies need is not more horsepower, but more efficient use of
existing horsepower. They need a way to tie all of these idle machines
together into a pool of potential labor, manage those resources, and provide
secure and reliable access to the number-crunching muscle. Imagine that a
corporation or organization could use all of its idle desktop PCs at night to
run memory- and processor-intensive tasks. It would get more work done
faster, possibly get to market faster, and at the same time cut down its IT
expenses.
Grid computing is emerging as a viable technology that businesses can use to
wring more profits and productivity out of IT resources -- and it's going to
be up to you developers and administrators to understand Grid computing and
put it to work.
A Closer Look
From a developer's perspective, Grids are composed of virtual organizations
using a common suite of protocols. These protocols allow Grid users and
applications to run services in a secure, controlled manner.
Virtual organizations, as explained above, can be a handful of servers or
desktop PCs in a single room, or a heterogeneous hodge-podge of systems
scattered around the world connected via the Internet. All of these systems
are able to work together because of certain protocols, which control
connectivity, resource allocation and management, and coordination of those
resources.
An effort is underway by the Global Grid Forum to organize these protocols
under the Open Grid Services Architecture, or OGSA, which has grown out of the
open standards-based Web Services world. It's called an architecture because
it is mainly about describing and building a well-defined set of interfaces
from which systems can be built, all based on open standards like the Web
Services Description Language (WSDL). Furthermore, OGSA builds on the
experience gained from building the Globus Toolkit, a valuable reference
implementation.
Open standards and protocols lead to the building of services, and services
are at the heart of the Grid. To put it simply, services allow users to do
things on the Grid. Services can include:
- Information queries
- Network bandwidth allocation
- Data management/extraction
- Processor requests
- Data session management
- Workload balancing
When Grid experts talk about an individual service being run (for example, an
information query), they call it a service instance (this usage is analogous
to a class instance in object-oriented programming). Services and service
instances can be lightweight and transient, or long-term tasks that require
deep and wide support from the Grid. Services and service instances can be
dynamic or interactive, or they can be batch-processed. They might run at
scheduled times, or at arbitrary times.
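If the class-instance analogy helps, here is a minimal Python sketch of it.
Every name in it (InformationQueryService, catalog, the node names) is made up
for illustration; it is not the OGSA or Globus API, just one way to picture a
service definition versus the individual instances that get run.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical names throughout; this is an analogy, not a real Grid API.
_ids = count(1)


@dataclass
class InformationQueryService:
    """One running instance of a made-up 'information query' service."""
    query: str
    transient: bool = True  # lightweight and short-lived vs. long-running
    instance_id: int = field(default_factory=lambda: next(_ids))

    def run(self, catalog):
        # Return the names of resources whose description mentions the query.
        return sorted(name for name, desc in catalog.items() if self.query in desc)


catalog = {
    "node-a": "linux cluster",
    "node-b": "windows desktop",
    "node-c": "linux desktop",
}

# Two instances of the same service, each with its own state and lifetime.
first = InformationQueryService("linux")
second = InformationQueryService("desktop", transient=False)

print(first.run(catalog))   # ['node-a', 'node-c']
print(second.run(catalog))  # ['node-b', 'node-c']
```

The class plays the role of the service; each object created from it plays
the role of a service instance, with its own identity and its own lifetime.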
Good services matter not just for what they can do for a Grid user, but for
what they enable: virtualization. A good set of services based on solid
protocols can hide the complexity of certain requests -- solid virtualization
can transform computing into a ubiquitous Grid that is more akin to our
current electric and water utilities.
Think about it: When you plug an appliance into the wall, you don't know or
care how the electricity flows into that appliance, nor do you know or care
where the electricity comes from. It just works, and you access that
electricity Grid to perform a task (such as toasting your bread or ironing
your clothes). On the computing front, imagine one day being able to lease
just a query tool from a Grid only when you need it, and not have to worry
about databases, browsers, and operating systems.
Proponents of Grid technology say the same thing will happen with computing
power -- but it's all about how we build protocols and services that allow
computers in a Grid to interact.
Ian Foster, the "Father of Grid computing," uses the term semantics to
describe the power of OGSA to define a service instance: how it is created and
controlled, how to communicate with it, and so on. The use of the term
"semantics," borrowed from linguistics and psychology, is a big clue that Grid
computing isn't just about data and processors and tasks -- it's about context
and meaning. Semantics in a computer programming environment is about more
than just applying computing power to process data and push out a result set.
It's really more about bringing a problem to the computer (or Grid) and
getting a solution to that problem.
Challenges
If you're thinking that anything as complex as Grid computing is bound to have
challenges associated with it, then you're right. For one thing, a Grid must
be able to quickly ascertain what resources are available on any computer that
joins it -- and more to the point, the Grid shouldn't be bogged down by a slow
or outdated system. Avoiding the "least common denominator" problem is
somewhat high on the technical "to-do" list.
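One way to picture the resource check described above is a simple filter over
what each machine reports when it joins. This is only a sketch under stated
assumptions: the Node fields, the thresholds, and the eligible() helper are
invented for illustration and do not come from any Grid toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """What a joining machine might report about itself (hypothetical fields)."""
    name: str
    cpu_mhz: int
    free_memory_mb: int


def eligible(nodes, min_cpu_mhz, min_memory_mb):
    """Keep only the nodes that meet a job's minimum requirements."""
    return [
        n for n in nodes
        if n.cpu_mhz >= min_cpu_mhz and n.free_memory_mb >= min_memory_mb
    ]


pool = [
    Node("old-desktop", cpu_mhz=300, free_memory_mb=64),
    Node("new-desktop", cpu_mhz=2400, free_memory_mb=512),
    Node("server", cpu_mhz=2000, free_memory_mb=4096),
]

# A slow or outdated machine can still join, but this job never lands on it.
chosen = eligible(pool, min_cpu_mhz=1000, min_memory_mb=256)
print([n.name for n in chosen])  # ['new-desktop', 'server']
```

Filtering per job, rather than slowing every job to match the weakest
machine, is one simple way around the "least common denominator" problem.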
Another huge issue has nothing to do with Grid technology specifically, but
will still impact successful Grid deployments: making applications work in a
Grid environment. Right now, most applications work in server or desktop
environments. One set of processors does the work. On a Grid, the work can be
parceled out to as many systems as are needed to do the work, and each system
contributes to the task. The results are then assembled and sent back to the
requesting system.
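The parcel-out-and-assemble pattern described above can be sketched in a few
lines of Python. Treat it as an illustration only: local threads stand in for
remote Grid systems, and split(), work(), and run_on_grid() are hypothetical
names, not part of any Grid toolkit.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into at most `parts` contiguous chunks of near-equal size."""
    if not data:
        return []
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def work(chunk):
    # Stand-in for the real computation each system would contribute.
    return sum(x * x for x in chunk)


def run_on_grid(data, nodes=4):
    chunks = split(data, nodes)
    # Threads stand in for remote Grid systems in this sketch.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(work, chunks))
    # Assemble the partial results for the requesting system.
    return sum(partials)


print(run_on_grid(list(range(1, 101))))  # 338350
```

This only works so neatly because a sum of squares splits cleanly into
independent pieces. Applications whose pieces depend on one another are the
hard part of porting to a Grid.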
Once those applications are ported over to a Grid environment, you have to
start worrying about how the data is shared, chopped, sifted, moved around,
secured, and managed. The user or application that requested the data needs to
be the only entity that gets the data back, and it has to be intelligible.
Security is definitely an enormous requirement -- after all, you don't want
just anyone getting access to Grid resources. And those who add their systems
to a Grid will want to control who has access to use their resources, and
when.
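To make the "who, and when" requirement concrete, here is a small sketch of
an owner-defined access policy. The AccessPolicy class, its fields, and the
user names are assumptions made up for this example; real Grid security
involves far more (authentication, delegation, encryption) than this check.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessPolicy:
    """An owner's rule for a resource: which users may use it, and when."""
    allowed_users: frozenset
    start: time
    end: time

    def permits(self, user, at):
        if user not in self.allowed_users:
            return False
        if self.start <= self.end:
            return self.start <= at < self.end
        # The window wraps past midnight (for example 19:00 to 07:00).
        return at >= self.start or at < self.end


# A desktop owner lends the machine to one user, but only overnight.
overnight = AccessPolicy(frozenset({"alice"}), time(19, 0), time(7, 0))

print(overnight.permits("alice", time(23, 0)))  # True
print(overnight.permits("alice", time(12, 0)))  # False
print(overnight.permits("bob", time(23, 0)))    # False
```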
Reliability and performance also remain important -- if the Grid doesn't
perform the job well and fast, then the business case for it certainly
diminishes.
Where To Go From Here
If you get the impression that Grid computing is where the Web was in 1993,
then you're right. A tremendous amount of territory is being mapped right now.
Some solid implementation foundations are being laid down. However, much of
Grid computing is undiscovered country, and many groups are turning their
attention to the emerging open standards. In many ways, the discussions about
Grid services parallel those around Internet and XML standards in the
mid-1990s.
Resources
Here is a starter list of white papers, books, and articles you can use to
learn more about Grid computing. Do you have your own list of favorites? Send
me an e-mail.
Anatomy of the Grid. This white paper by Ian Foster, Carl Kesselman, and
Steven Tuecke defines the field of Grid computing. As the title suggests, the
authors spend some time naming all of a Grid's constituent parts and defining
what they do. Their focus is on Grid architecture.
Physiology of the Grid. This white paper by Ian Foster, Carl Kesselman,
Jeffrey Nick, and Steven Tuecke explains how Grid computing can be put to work
in a Web services environment. This is the white paper that presents more
details about OGSA and Grid semantics (i.e., services). Together with "Anatomy
of the Grid," this paper provides a fairly detailed (albeit a tad academic)
overview of the world of Grid computing.
The Global Grid Forum (www.ggf.org) is a
community-initiated forum of researchers and practitioners working on Grid
computing. A number of its working groups are producing technical specs,
documenting user experiences, and writing implementation guidelines. See
GGF@WORK at www.ggf.org/home.php?div=atwork for a list of the working groups,
including the Open Grid Services Architecture (OGSA) Working Group at
www.ggf.org/ogsa-wg/.
The Globus Toolkit: Globus is an open-architecture, open standards tool for
building computational Grids. It is widely cited as a solid reference
implementation that will get your hands dirty in the world of building,
deploying, and managing Grids. Also, look at the Globus FAQ.
"The Grid: Computing without Bounds", by Ian Foster (April 2003 Scientific
American). This article by Ian Foster is an excellent read for laymen,
scientists, and techies alike. Ian does a terrific job of making most of the
abstract parts of Grid computing tangible, making them come alive. He also
gives you a peek at the future of Grid computing and how much change it can
bring to the way we do business.
IBM VP Wladawsky-Berger explains Grid Computing. A great overview of Grid
computing -- and IBM's role in it. He talks about how emerging standards and
access to greater bandwidth make the dream of commercially viable Grid
computing closer to reality than most people think.
Fundamentals of Grid Computing. A brief IBM Redpaper that offers a concise
technical overview of Grid computing.
Grid Computing: Making the Global Infrastructure a Reality, edited by Fran
Berman, Geoffrey Fox and Tony Hey. Published March 2003 by Wiley. This 1000+
page tome is filled with articles and essays that examine Grid computing from
a variety of science and technical angles, including: history of the Grid, the
semantic Grid, an overview of Grid architecture, Grid deployment models, OGSA,
peer-to-peer Grid databases, and a lot more. Find it on amazon.com at
www.amazon.com/exec/obidos/asin/0470853190/.
10 Emerging Technologies that will Change the World. Technology Review named
Grid computing as one of the ten technologies that will change the world.
Article includes photos of Ian Foster and Carl Kesselman.
Grid-dy Determination: When it comes to companies that need a lot of CPU
cycles to get their work done, Grid computing is definitely the way to go.
Provides some real-world cases in which Grid computing is making all the
difference: online gaming, financial number crunching, genome research,
aerospace, and more.
About the author
Thomas Myer is the co-founder of Triple Dog Dare Media, an Austin-based Web
applications development group. Triple Dog Dare Media specializes in content
management solutions, e-commerce systems, and Web services deployments for
small and medium business. Tom can be reached at tom@tripledogdaremedia.com.