Special Features:
BUILDING A DYNAMIC UTILITY COMPUTING ENVIRONMENT
By Kevin Hartig and Dennis Reedy, Sun Microsystems
Overview
Software-based dynamic resource allocation and provisioning mechanisms provide
an organization the opportunity to control hardware and software assets by
dynamically reorganizing software as required to optimize resource
utilization. Scaling up use of resources as needed during peak load
situations, scaling back during quiet periods and relocating resources can be
performed automatically as required. Metrical mechanisms can be attached to
application services to maximize the operational capabilities of observed
components while minimizing the intrusiveness of the observational mechanisms.
Resource capacity can be managed automatically and more economically.
Service Level Agreements (SLAs) can be defined based on acceptable,
agreed-upon constraints. Using the SLA definitions, a collection of
compute resources in a datacenter can be monitored for a desired Quality of
Service (QoS) and can provide self-healing and self-managing capabilities.
Utility Computing approaches build on these core capabilities, allowing
organizations to use their existing infrastructure to deliver beneficial and
economical solutions.
Project Rio provides a platform and the capabilities necessary to implement
the metering and monitoring of applications in a dynamic environment. This
paper describes an example of how Service Control Adapters (SCAs) built for
Rio can be used to meter, monitor and manage Tomcat container servers based on
declarative SLAs. Metrical components are used to observe system statistics
and to trigger actions that dynamically maintain the desired QoS. This example
demonstrates policy based utility computing using open source tools to create
a self-managed environment of load balanced web servers.
Policy Mechanisms
The ability to inject rules and policies into the service fabric allows
greater automation, scalability and controlled behavior. Ultimately, a system
can provide the advanced capabilities of self-healing, self-optimization and
self-configuration. The Rio platform provides mechanisms to declaratively
specify behavior (logic determining how to react to a set of stimuli) as
policies. The following policies are used in this example:
Platform policies determine whether operational requirements can be met by
the qualitative capabilities of a compute resource. A qualitative capability
indicates a specific type of mechanism or quality associated with a
compute resource, such as network capabilities (TCP, 802.11, Bluetooth,
etc.) or platform software (drivers, database, software versions, etc.).
Supportability is determined by Platform Capability objects, not by
administrative configurations. Selection policy is enabled through a
mobile code strategy, providing a decentralized collection of platform
capabilities. This approach fundamentally changes how a network of compute
resources makes its capabilities known through the network, allowing the
network of resources to grow organically instead of relying on a centralized
database.
Behavioral policies determine whether a service is operating to specific
service level objectives. This is accomplished by providing mechanisms to
declare and enforce SLAs. SLA policy handlers can be declaratively
attached to a service description, defining what actions to execute if
SLAs cannot be met. SLAs vary in scope and function. Some define
quantitative behavior measuring platform depletion characteristics such as
CPU utilization, CPU capacity, memory usage and network bandwidth. Others
define application specific metrical analysis such as transaction times,
number of connections, etc. Lastly, SLAs can define dependent service
associations such as "uses" or "requires."
Metrics
The Rio platform provides metrical mechanisms for observing, monitoring and
metering system and application resources. The Watch facilities provide
integrated threshold triggers, visualization facilities and data archival for
sophisticated analysis. The following types of extensible facilities are
provided:
- Stop Watch -- Used for tracking transaction and response times.
- Counter Watch -- Counts occurrences of an arbitrary event over time as a
monotonically increasing, non-negative value.
- Gauge Watch -- Records values that can go up and down and can be positive or
negative.
- Periodic Watch -- Sets a discrete value measured at specific time
intervals.
The information captured can be used to dynamically check that services are
executing within defined SLAs and can also be used to track compute resource
usage for utility accounting and billing purposes.
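
The differences between the four watch types can be pictured with a minimal
Java sketch. The class names below (WatchSketches and its nested classes) are
hypothetical stand-ins invented for illustration and are not the Rio Watch
API; thresholds, visualization and archival are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleSupplier;

/** Hypothetical stand-ins for the four watch types; not the Rio Watch API. */
class WatchSketches {

    /** Stop watch: elapsed time between start() and stop(), in milliseconds. */
    static class StopWatchSketch {
        private long startNanos;
        void start() { startNanos = System.nanoTime(); }
        long stop() { return (System.nanoTime() - startNanos) / 1_000_000L; }
    }

    /** Counter: a non-negative value that only ever increases. */
    static class CounterSketch {
        private long count;
        void increment() { count++; }
        long value() { return count; }
    }

    /** Gauge: a value that may rise or fall and may be negative. */
    static class GaugeSketch {
        private double value;
        void adjust(double delta) { value += delta; }
        double value() { return value; }
    }

    /** Periodic: one discrete sample per tick; a scheduler would call sample(). */
    static class PeriodicSketch {
        private final DoubleSupplier source;
        private final List<Double> samples = new ArrayList<>();
        PeriodicSketch(DoubleSupplier source) { this.source = source; }
        void sample() { samples.add(source.getAsDouble()); }
        List<Double> samples() { return samples; }
    }
}
```

In this sketch the periodic watch does not own a timer; the caller is assumed
to invoke sample() at the desired interval, which keeps the example
deterministic.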
Service Control Adapters
Service Control Adapters encapsulate the control and monitoring of services
running within the context of the Rio platform. The resources associated with
the external service are monitored and actions triggered when defined resource
thresholds are breached or cleared. For example: CPU utilization on a machine
executing the service can be monitored. If the CPU exceeds 90% (a defined
threshold) an action to start another service on a different machine can be
triggered. Actions are determined by policy objects which are declared as part
of a service's definition.
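
The breached/cleared behavior described above can be sketched as a small
state machine. ThresholdSketch and its Event enum are hypothetical names used
only for illustration, not Rio classes; the sketch assumes a single upper
threshold and records an event only when the state changes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical threshold monitor; names are illustrative, not Rio's API. */
class ThresholdSketch {
    enum Event { BREACHED, CLEARED }

    private final double high;
    private boolean breached;
    private final List<Event> events = new ArrayList<>();

    ThresholdSketch(double high) { this.high = high; }

    /** Feed one sample; an event is recorded only when the state changes. */
    void record(double value) {
        if (!breached && value > high) {
            breached = true;
            events.add(Event.BREACHED);   // a policy object would act here
        } else if (breached && value <= high) {
            breached = false;
            events.add(Event.CLEARED);
        }
    }

    List<Event> events() { return events; }
}
```

With a threshold of 90, feeding CPU samples of 50, 95, 97 and 80 would record
one BREACHED event followed by one CLEARED event, since repeated samples on
the same side of the threshold do not generate new events.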
SCAs can control the startup and shutdown of the service and associated
monitors. A SCA represents a process running on a machine that is external to
the Rio platform but which is monitored and controlled by it. A SCA starts or
stops the service process and initializes and destroys the associated Rio
software components and processes. Typically, monitors start automatically
when a SCA is started, but the monitors may be managed while the SCA is
active.
The SCAs, working in conjunction with the monitors and policy objects, ensure
that services or containers such as web, application and database servers
perform to the expectations defined by SLAs. If expectations are not being
met, the problem is automatically resolved by scaling up, scaling back,
restarting, etc. The utility environment is thus managed dynamically,
efficiently and effectively. This also allows for more accurate cost modeling
and tracking of resources used per user.
Tomcat Service Control Adapter
The Tomcat SCA is an example that demonstrates the basic capabilities of a
Service Control Adapter and how services external to the Rio platform can be
managed through the network to ensure SLAs are being met and can be scaled
dynamically as needed. The following diagram provides an overview of the
architecture:
Components of the Tomcat SCA
The Tomcat SCA consists of the following components:
- Service Bean files that implement the SCA
In the Rio architecture, service components are called Service Beans.
Service Beans can be easily developed and deployed as services
throughout a network. The Service Bean component model provides a level
of abstraction on top of the current Jini service development model
making it straightforward to develop and deploy Jini services. The
Service Bean approach allows developers to focus on the solution space
for their domain, not focus on the infrastructure level of developing
Jini services.
The Tomcat SCA implementation extends the Rio Service Bean Adapter and
provides the core of the SCA containing the logic to monitor and manage
a Tomcat server.
The ServiceUI enables a user interface to be associated with a Jini
service. The interface provides access to SCA administrative
capabilities to control the service.
The monitoring Servlet runs in the Tomcat Servlet/JSP container. It
reads Tomcat log file information used for monitoring usage and status
of the Tomcat server. It also polls the server to determine approximate
access response times to verify that the server is accessible.
- Tomcat Operational String
The operational string represents the aggregation of software assets
that make up the Tomcat Service Control Adapter. It includes the SLAs
which define acceptable operational and performance criteria for Tomcat
servers and it defines an association with an Apache HTTP service. The
'Requires' Association specifies that the Tomcat SCA requires the
Apache SCA to be executing. The Apache service manages the Apache web
server. This server is used to load balance HTTP requests across
multiple Tomcat servers.
Policy Definitions
The Tomcat and Apache SCAs use Platform Policies to control the
instantiation of Tomcat and Apache service instances on machines which have
the requisite software already installed or on machines that can have the
software automatically provisioned, installed and configured. Rio platform
components advertise the capabilities of the compute resource they have been
instantiated on by loading a capabilities file, which provides details on the
specific mechanisms and qualities. An entry in the Tomcat SCA's service
definition lists system components that are required for instantiation. The
following snippet of the Tomcat SCA's service description illustrates the
markup:
    <SystemComponent Class =
        "com.sun.rio.qos.capability.enterprise.WebServerSupport">
        <Attribute Name="manufacturer" Value="Tomcat"/>
        <Attribute Name="version" Value="4.1.24"/>
    </SystemComponent>
Support for this system component is determined by the platform policy object
loaded by Rio platform components.
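
One way to picture the matching step is the Java sketch below. It is an
assumption about how such a check could work, not the Rio platform policy
implementation: CapabilityMatchSketch and Component are hypothetical names,
and the sketch assumes that a requirement is supported when an advertised
capability has the same class name and contains every required attribute
with an identical value.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical capability matching; not the Rio platform policy code. */
class CapabilityMatchSketch {

    /** A capability or requirement: a class name plus name/value attributes. */
    static class Component {
        final String className;
        final Map<String, String> attributes = new HashMap<>();

        Component(String className) { this.className = className; }

        Component with(String name, String value) {
            attributes.put(name, value);
            return this;
        }
    }

    /** True if an advertised capability has the class and all required attributes. */
    static boolean supports(List<Component> advertised, Component required) {
        for (Component c : advertised) {
            if (c.className.equals(required.className)
                    && c.attributes.entrySet()
                                   .containsAll(required.attributes.entrySet())) {
                return true;
            }
        }
        return false;
    }
}
```

Exact string equality on the version attribute is a simplification made for
the sketch; a real policy might accept version ranges.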
Behavioral policies define an agreed upon level of performance for the system.
In this example the Tomcat SCA's service description defines an SLA requiring
the Tomcat server to handle no more than a certain maximum number of HTTP
requests/second. If this limit is breached, an event is generated and the SLA
Policy Handler for Tomcat determines what action to take. In this case, the
Rio platform tries to provision another Tomcat SCA to a machine with the
appropriate resources. If successful, the SCA starts another Tomcat server on
a platform with the necessary platform capabilities. The new server then
becomes part of a load balanced environment. The Apache service is notified of
the new Tomcat server and makes the necessary configuration changes on-the-fly
to begin load balancing.
The following XML in an operational string defines the desired Service Level
Agreement for the "Hit Counter" watch.
    <SLA ID="Hit Counter" Low="10" High="100" >
        <SLAPolicyHandler Class = "com.sun.rio.qos.ScalingPolicyHandler" />
        <Attribute Name="MaxServices" Value="2" />
    </SLA>
- Defines a Hit Counter watch.
- Uses the ScalingPolicyHandler to monitor thresholds.
- Indicates a server should be shut down if it is handling fewer than 10
hits/second.
- Indicates another server should be started if the current server is
handling more than 100 hits/second.
- Can start a maximum of 2 Tomcat servers.
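
The decision implied by these values can be sketched in Java as follows.
This is not the source of com.sun.rio.qos.ScalingPolicyHandler; ScalingSketch
is a hypothetical class that only mirrors the Low, High and MaxServices
semantics listed above. The rule that the last remaining server is never shut
down is an added assumption.

```java
/** Hypothetical scaling decision; not Rio's ScalingPolicyHandler. */
class ScalingSketch {
    enum Action { SCALE_UP, SCALE_DOWN, NONE }

    private final double low;
    private final double high;
    private final int maxServices;

    ScalingSketch(double low, double high, int maxServices) {
        this.low = low;
        this.high = high;
        this.maxServices = maxServices;
    }

    /** Decide on an action from the observed hit rate and the running count. */
    Action decide(double hitsPerSecond, int running) {
        if (hitsPerSecond > high && running < maxServices) {
            return Action.SCALE_UP;      // High breached and room to grow
        }
        if (hitsPerSecond < low && running > 1) {
            return Action.SCALE_DOWN;    // assumption: keep at least one server
        }
        return Action.NONE;
    }
}
```

With the values from the SLA (Low 10, High 100, MaxServices 2), a single
server seeing 150 hits/second yields SCALE_UP, while two servers at the same
rate yield NONE because the maximum has been reached.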
The following XML from a portion of the service definition specifies that a
Tomcat server needs an Apache server SCA to be available before starting up.
This defines a 'requires' relationship stating that the Tomcat service
requires an Apache service in order to execute.
    <Associations>
        <Association Type="requires" Name="Apache HTTP Service Control Adapter" />
    </Associations>
An Apache SCA must be executing; otherwise Tomcat Server SCAs will not be
publicly advertised to offer their services to clients on the network. This
definition enforces the defined load balancing architecture.
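
The effect of a 'requires' association on advertisement can be sketched as a
simple gate. RequiresSketch is a hypothetical class, not part of Rio; it
assumes that the platform tracks running services by name and that a service
may be advertised only while every service it requires is running.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical 'requires' gate; names are illustrative, not Rio's API. */
class RequiresSketch {
    private final Set<String> running = new HashSet<>();

    void serviceStarted(String name) { running.add(name); }

    void serviceStopped(String name) { running.remove(name); }

    /** A service may be advertised only if all required services are running. */
    boolean canAdvertise(String... required) {
        for (String name : required) {
            if (!running.contains(name)) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a Tomcat SCA would call
canAdvertise("Apache HTTP Service Control Adapter") before advertising, and
would get false until the Apache SCA has been reported as started.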
Dynamic Load Balancing Across Tomcat Containers
The Tomcat SCA demonstrates managing multiple Tomcat Servlet/JSP containers in
a network. The configuration uses an Apache web server to serve static web
pages. Dynamically generated pages are served by Tomcat. Multiple Tomcat
containers are configured to be load balanced through the Apache web server.
Initially, the Apache web server SCA and only one Tomcat container are started
to handle requests. The Tomcat service specifications define the Tomcat SCA
information including an Association to an Apache Web Server SCA. Both SCAs are
dynamically instantiated to available compute resources which support the
system requirements declared by the Tomcat and Apache SCAs. If system or
process resource thresholds defined in the service definitions are breached, a
second Tomcat server is started automatically by the Rio platform on a machine
which has Tomcat container capabilities and is configured to function in a
load balancing architecture using Apache. Requests are then automatically load
balanced across the two Tomcat servers through the Apache server.
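
The idea of adding a Tomcat server to the balanced set on the fly can be
sketched in Java. The article does not state which balancing algorithm the
Apache server uses, so round-robin is an assumption here, and
RoundRobinSketch and the backend names in the usage note are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin balancer; the algorithm choice is an assumption. */
class RoundRobinSketch {
    private final List<String> backends = new CopyOnWriteArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** Called when a new Tomcat server joins the balanced set. */
    void addBackend(String hostPort) { backends.add(hostPort); }

    /** Called when a Tomcat server is shut down. */
    void removeBackend(String hostPort) { backends.remove(hostPort); }

    /** Pick the backend for the next request. */
    String pick() {
        // Work on a snapshot so a concurrent add/remove cannot shift indices.
        String[] snapshot = backends.toArray(new String[0]);
        if (snapshot.length == 0) {
            throw new IllegalStateException("no Tomcat backends registered");
        }
        int i = Math.floorMod(next.getAndIncrement(), snapshot.length);
        return snapshot[i];
    }
}
```

Starting with one backend such as "tomcat-a:8009", every request goes to it;
once "tomcat-b:8009" is added, consecutive calls to pick() alternate between
the two without restarting the balancer.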
As new Tomcat servers are started, information about the new service instance
and the resources it uses can be logged and applied to the user or company
which required the additional servers to meet the SLAs defined in
operational strings. In this way, the resources used can be charged
appropriately.
Apache/Tomcat Metrics
The Tomcat and Apache services capture the following metrical elements:
- CPU, memory and storage statistics are logged to keep track of how much
of each resource the application or server is using.
- The Tomcat SCA uses Periodic Watches to periodically read
the access log files, retrieving statistics about the performance of the
Tomcat server. These include hits/sec and bytes/sec. A Stop Watch is
used to test response time when accessing pages from the Tomcat server.
It is also used to determine whether the service is running.
- Basic usage statistics are captured to track resource usage of the
servers and to track individual user usage. A Tomcat server may be
started on behalf of an ISP customer to maintain an agreed-upon level of
service. The client of an ISP using data center resources for the Tomcat
server can then be accurately charged for the resources used. Individual
user usage can be tracked to further increase the granularity of
resource usage.
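
The hits/sec and bytes/sec figures can be derived from access log lines as in
the sketch below. It assumes the access log uses the common log format, in
which the last field is the number of bytes sent ("-" when no body was
sent); the article does not specify the log format, and AccessLogSketch is a
hypothetical class, not part of the Tomcat SCA.

```java
import java.util.List;

/** Hypothetical access-log metrics; assumes common log format lines. */
class AccessLogSketch {

    /** Bytes sent is the last field; "-" means no response body. */
    static long bytesOf(String line) {
        String trimmed = line.trim();
        String last = trimmed.substring(trimmed.lastIndexOf(' ') + 1);
        return last.equals("-") ? 0L : Long.parseLong(last);
    }

    /** Returns {hits/sec, bytes/sec} for lines logged in one polling interval. */
    static double[] rates(List<String> newLines, double intervalSeconds) {
        long bytes = 0;
        for (String line : newLines) {
            bytes += bytesOf(line);
        }
        return new double[] {
            newLines.size() / intervalSeconds,
            bytes / intervalSeconds
        };
    }
}
```

A periodic watch would call rates() with only the lines appended since the
previous poll and the length of the polling interval, then record the two
values as its samples.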
Service Statements
Service statements contain a list of service records associated with a service
running on specified compute resources. Each service record maintains
information about the type and amount of resources used by the service.
Service statement information can be mapped to accounting systems for tracking
usage of specific resources by specific users. Service statements and
associated records are the core components for Utility Computing usage and
charging mechanisms.
The amount of resources used by SCAs can be tracked and recorded. If a small
business is using Tomcat servers to deliver company content and the number of
servers required is scaled up, the information is tracked and the business can
be charged for only the resources used.
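
A service statement and its records can be pictured with the following Java
sketch. The class and field names, the resource type strings and the flat
per-unit rate are all hypothetical and chosen for illustration; they do not
reflect Rio's actual service statement classes or any real pricing model.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical service statement; names and rates are illustrative only. */
class ServiceStatementSketch {

    /** One record: the type and amount of a resource a service consumed. */
    static class ServiceRecord {
        final String resourceType;
        final double amount;

        ServiceRecord(String resourceType, double amount) {
            this.resourceType = resourceType;
            this.amount = amount;
        }
    }

    private final List<ServiceRecord> records = new ArrayList<>();

    void add(String resourceType, double amount) {
        records.add(new ServiceRecord(resourceType, amount));
    }

    /** Total amount recorded for one resource type. */
    double total(String resourceType) {
        double sum = 0;
        for (ServiceRecord r : records) {
            if (r.resourceType.equals(resourceType)) {
                sum += r.amount;
            }
        }
        return sum;
    }

    /** Charge for one resource type at an assumed flat rate per unit. */
    double charge(String resourceType, double ratePerUnit) {
        return total(resourceType) * ratePerUnit;
    }
}
```

An accounting system could then map each resource type total to its own rate,
so that a customer whose Tomcat servers were scaled up is billed only for the
additional usage that was recorded.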
Summary
SCAs demonstrate the benefits of the Rio architecture and the capability it
provides to create a self-monitoring, self-healing network of services which
can be used in a utility computing environment. Using Apache and Tomcat
illustrates an example of policy based computing capabilities using open
source tools. It shows the flexibility of the design allowing:
- Definition of capabilities and service levels that correspond to a customer
business model.
- Custom configuration for a dynamic environment providing rapid provisioning
of services.
- Dynamic scaling corresponding to service usage.
This architecture and example also imply there is an ROI to be gained by:
- Better utilization of resources leading to lower Total Cost of Ownership.
- Assured service levels and QoS using declarative policies.
- Low upfront capital outlay through the use of open source tools and no need
to overbuild or overbuy resources.
- Reduction of risk and easier adoption of new technologies by adding
discrete components to the Rio architecture.
Expanding on these capabilities gives organizations the ability to change how
compute resources are utilized. The utility model can be applied where pricing
is based only on the resources used rather than on paying upfront for an
entire configuration or set of services. A company can save money by paying
only for resources used. The alternative, and currently prevailing, model is
to buy hardware that can handle peak loads but typically ends up using only a
small portion of the available resources. Utility computing is the
evolutionary step up from Application Service Providers.
Additional exploration of Rio usage to develop specific Utility computing
solutions is currently underway. New Service Control Adapters can be easily
written to represent, control and monitor additional applications and services
to provide a utility computing implementation.
More information about Rio and Service Control Adapters can be found at
rio.jini.org.