Special Features:
PLATFORM CASE STUDY: SPACE SYSTEMS/LORAL
Background
Space Systems/Loral (SS/L), a subsidiary of Loral Space & Communications, is
one of the fastest growing full-service producers of commercial communications
and weather satellites. The company has an international base of customers
whose applications include broadband digital communications, wireless
telephony, direct-to-home broadcast, environmental monitoring, and air traffic
control.
Based in Palo Alto, California, with worldwide operations, SS/L employs over
2,800 people. In 2000, SS/L generated $1 billion in revenue. The company has
one of the world's premier facilities for making advanced satellites.
Comprising 29 buildings that encompass 1.3 million square feet, SS/L's
facilities house a variety of specialized development laboratories and modern
production environments.
SS/L's dedicated test networks, comprising 270 servers and workstations
running Solaris, NT, Linux, and various real-time operating systems, are
deployed to local and remote test sites and manage an intensive quality
assurance process. Its development networks include 42 Windows and Sun Solaris
workstations.
The Challenge
An important part of any process improvement effort (made even more
significant in the current economic climate) is learning to "do more with
less." For manufacturing
firms, shorter production schedules mean increased throughput and increased
profit.
SS/L identified two areas as candidates for improvement: spacecraft test
system validation and recurring administrative tasks.
For system validation, SS/L needed a solution that could monitor and test its
distributed computing environment, let users visualize system status, and
automatically prevent or correct system failures. To reduce recurring and
unnecessary labor costs, SS/L needed to automate its system administration
procedures, including host configuration and data management (test data, logs,
etc.). By streamlining the manual testing process and eliminating downtime,
SS/L engineers could ultimately increase their productivity and maximize
throughput.
To reduce recurring and unnecessary labor costs in maintaining its test and
development networks, SS/L needed a solution to automate its system
administration procedures, monitor its distributed computing environment, and
automatically prevent and handle potential system downtime or failures.
- The SiteAssure software provided SS/L with multi-platform support for its
heterogeneous technology infrastructure.
- With SiteAssure, SS/L has a simple solution for viewing complex resource
information.
- SiteAssure has drastically reduced the time spent by system administrators
for troubleshooting.
- SS/L has minimized developer downtime, increased productivity, and boosted
its throughput capability.
The Solution
SS/L evaluated a number of different open source and commercial software
applications to address its challenges. As Platform LSF was already in
production with SS/L's design and modeling systems, it was a natural choice to
include Platform SiteAssure in the evaluation process. SS/L's goal was to
implement a resource management solution in under a month.
SS/L very quickly narrowed the field to SiteAssure. In addition to enabling
SS/L to monitor its systems and automate system administration tasks, it was
the only solution able to probe and track the internal state of applications
that SS/L was developing.
Today, 50 SiteAssure agents are deployed across SS/L's test networks and
development environment. Platform SiteAssure monitors SS/L's host systems and
provides notifications of problems such as limited disk space, system
abnormalities, and excessive memory swapping or CPU utilization. It also
automates recurring tests that were previously done manually. Policies
established in SiteAssure handle any problems, deliver automated alerts, and
provide feedback on certain operational levels.
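
The policy-driven monitoring described above can be pictured with a minimal
Python sketch. This is not SiteAssure's API or configuration format; the
Policy class, the metric names, and every threshold below are hypothetical,
chosen only to show how per-host metrics might be checked against rules that
produce alerts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Policy:
    """A hypothetical monitoring rule: alert when a metric crosses a limit."""
    metric: str                        # key in the metrics dictionary
    breached: Callable[[float], bool]  # returns True when the value is a problem
    message: str                       # text of the alert to deliver


# Illustrative thresholds only; the case study does not state real limits.
POLICIES: List[Policy] = [
    Policy("disk_free_pct", lambda v: v < 10.0, "limited disk space"),
    Policy("swap_rate_pages_s", lambda v: v > 500.0, "excessive memory swapping"),
    Policy("cpu_util_pct", lambda v: v > 95.0, "high CPU utilization"),
]


def evaluate(host: str, metrics: Dict[str, float]) -> List[str]:
    """Return one alert string per policy that the host's metrics breach."""
    alerts = []
    for policy in POLICIES:
        value = metrics.get(policy.metric)
        # Metrics that were not collected are skipped rather than alerted on.
        if value is not None and policy.breached(value):
            alerts.append(f"{host}: {policy.message} ({policy.metric}={value})")
    return alerts


if __name__ == "__main__":
    sample = {"disk_free_pct": 4.2, "swap_rate_pages_s": 12.0, "cpu_util_pct": 97.5}
    for alert in evaluate("testhost01", sample):
        print(alert)
```

In this sketch a policy only reports a problem; a corrective action (for
example, purging old test logs) would be attached to the same rule in a
fuller design.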
Platform Value
While other solutions were either Windows-only or UNIX-based, the SiteAssure
solution provided much-needed multi-platform support for SS/L's heterogeneous
technology infrastructure. Platform also worked together with SS/L to develop
the binary port monitoring agent (now a part of the SiteAssure solution),
which allows arbitrary protocols to be used to probe services. SS/L uses the
agent to gather metrics on the internal state of SS/L-developed servers. The
agent was developed by Platform within 12 weeks, beating all time estimates.
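
The idea behind probing a service over an arbitrary protocol can be sketched
in a few lines of Python. This is an illustration only, not the SiteAssure
binary port monitoring agent: the probe function, its parameters, and the
request and reply bytes are all assumptions. It opens a TCP connection, sends
raw bytes, and checks whether the reply starts with an expected byte sequence.

```python
import socket


def probe(host: str, port: int, request: bytes, expected_prefix: bytes,
          timeout: float = 2.0) -> bool:
    """Send raw bytes to a TCP service and check how the reply begins.

    Returns False on any connection error or timeout.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(request)
            reply = b""
            # Read until there are enough bytes to compare or the peer closes.
            while len(reply) < len(expected_prefix):
                chunk = conn.recv(4096)
                if not chunk:
                    break
                reply += chunk
            return reply.startswith(expected_prefix)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A caller might run something like probe("testhost01", 9000, b"\x01STATUS",
b"\x01OK") on a schedule, where the host name, port, and byte strings stand in
for whatever status protocol the monitored server actually speaks.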
The Agent Interface contained in SiteAssure gave SS/L the flexibility to
develop its own custom monitoring interface to display the status and
health of its workstations and servers. Using color-coded data flows, the
monitoring interface brings together a combination of SiteAssure agent
information and additional metrics from SS/L's existing applications. This
provided a simple solution for visualizing complex information.
Because SiteAssure enables SS/L engineers to access and evaluate system
information from their own workstations, SS/L has dramatically reduced the
time that system administrators and developers spend troubleshooting. The
color-coding system enables users to immediately identify the source of any
problem and contact the responsible parties (system administrators for
hardware issues, and developers for software) to further isolate and resolve
it. Previously, it would take up to three developers two hours (as many as six
person-hours) to isolate, identify, and resolve a problem. SiteAssure has
narrowed this down to one person supporting a problem for a maximum of one
hour. This minimizes developer downtime, increases productivity, and boosts
SS/L's throughput capability.
SiteAssure also automatically notifies system administrators when workstations
and servers are offline for location changes, which helps streamline the
reconfiguration process and ensures minimal network interruptions.
Looking forward, SS/L has identified at least half of its development
workstations as candidates for SiteAssure, and plans to deploy an additional
85 SiteAssure licenses.
"Platform has increased our productivity tenfold. By monitoring and
troubleshooting problems with intelligent corrective actions, SiteAssure not
only minimizes disruptions for our primary engineering team, but also reduces
recurring system administration costs."
Jim Jaquet, Test Software Section Supervisor, Spacecraft Test and Operations
World Headquarters
About Platform
Platform is the world's leading distributed computing software provider, with
desktop-to-Grid solutions that allow organizations to dramatically improve
time to market and quality of results while maximizing their IT investment.
Platform has strategic relationships with industry leaders including Compaq,
HP, IBM, SGI, Cadence, and SAS Institute, and its open, scalable software
solutions are the choice of more than 1,500 results-driven organizations
around the world. Platform is a private company with 400 employees in 14
offices in North America, Europe, and Asia.
Web site: www.platform.com