GRIDtoday

DAILY NEWS AND INFORMATION FOR THE GLOBAL GRID COMMUNITY

BUILDING A DYNAMIC UTILITY COMPUTING ENVIRONMENT
By Kevin Hartig and Dennis Reedy, Sun Microsystems

Overview

Software-based dynamic resource allocation and provisioning mechanisms provide an organization the opportunity to control hardware and software assets by dynamically reorganizing software as required to optimize resource utilization. Scaling up use of resources as needed during peak load situations, scaling back during quiet periods and relocating resources can be performed automatically as required. Metrical mechanisms can be attached to application services to maximize the operational capabilities of observed components while minimizing the intrusiveness of the observational mechanisms. Resource capacity can be managed automatically and more economically.

Service Level Agreements (SLAs) can be defined based on acceptable and agreed-upon constraints. Using the SLA definitions, a collection of compute resources in a datacenter can be monitored for a desired Quality of Service (QoS) and can provide self-healing and self-managing capabilities. Utility Computing approaches build on these core capabilities, allowing organizations to use their existing infrastructure to deliver beneficial and economical solutions.

Project Rio provides a platform and the capabilities necessary to implement the metering and monitoring of applications in a dynamic environment. This paper describes an example of how Service Control Adapters (SCAs) built for Rio can be used to meter, monitor and manage Tomcat container servers based on declarative SLAs. Metrical components are used to observe system statistics and to trigger actions that dynamically maintain the desired QoS. This example demonstrates policy based utility computing using open source tools to create a self-managed environment of load balanced web servers.

Policy Mechanisms

The ability to inject rules and policies into the service fabric allows greater automation, scalability and controlled behavior. Ultimately, a system can provide the advanced capabilities of self-healing, self-optimization and self-configuration. The Rio platform provides mechanisms to declaratively specify behavior (logic determining how to react to a set of stimuli) as policies. The following policies are used in this example:

  • Platform Policies

Platform policies determine if operational requirements can be met by the qualitative capabilities of a compute resource. A qualitative capability indicates a specific type of mechanism or quality associated with a compute resource, such as network capabilities (TCP, 802.11, Bluetooth, etc.) or platform software (drivers, database, software versions, etc.). Supportability is determined by Platform Capability objects, not by administrative configurations. Selection policy is enabled through a mobile code strategy, providing a decentralized collection of platform capabilities. This approach fundamentally changes how a network of compute resources makes its capabilities known through the network, allowing the network of resources to grow organically instead of relying on a centralized database.

  • Behavioral Policies

Behavioral policies determine whether a service is operating to specific service level objectives. This is accomplished by providing mechanisms to declare and enforce SLAs. SLA policy handlers can be declaratively attached to a service description, defining what actions to execute if SLAs cannot be met. SLAs vary in scope and function. Some define quantitative behavior, measuring platform depletion characteristics such as CPU utilization, CPU capacity, memory usage and network bandwidth. Others define application-specific metrical analysis such as transaction times, number of connections, etc. Lastly, SLAs can define dependent service associations such as "uses" or "requires."

Metrics

The Rio platform provides metrical mechanisms for observing, monitoring and metering system and application resources. The Watch facilities provide integrated threshold triggers, visualization facilities and data archival for sophisticated analysis. The following types of extensible facilities are provided:

  • Stop Watch -- Used for tracking transaction and response times.
  • Counter Watch -- Counts occurrences of an arbitrary event over time as a monotonically increasing, non-negative value.
  • Gauge Watch -- Records values that can go up and down and can be positive or negative.
  • Periodic Watch -- Sets a discrete value measured at specific time intervals.

The information captured can be used to dynamically check that services are executing within defined SLAs and can also be used to track compute resource usage for utility accounting and billing purposes.
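
The four watch types can be pictured with a small sketch. It is written in Python purely for illustration and is not the Rio Watch API: the class names mirror the list above, but every method name (`record`, `increment`, `set`, `start`, `stop`, `poll`) is an assumption made for this example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Watch:
    """Hypothetical base class: stores (timestamp, value) samples by name."""
    name: str
    samples: list = field(default_factory=list)

    def record(self, value, now=None):
        self.samples.append((time.time() if now is None else now, value))

    def last(self):
        return self.samples[-1][1] if self.samples else None


class CounterWatch(Watch):
    """Monotonically increasing, non-negative count of occurrences."""
    def increment(self, amount=1, now=None):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.record((self.last() or 0) + amount, now)


class GaugeWatch(Watch):
    """Value that may rise or fall and may be negative."""
    def set(self, value, now=None):
        self.record(value, now)


class StopWatch(Watch):
    """Elapsed time between start() and stop(), in seconds."""
    def start(self, now=None):
        self._started = time.monotonic() if now is None else now

    def stop(self, now=None):
        ended = time.monotonic() if now is None else now
        self.record(ended - self._started, ended)


class PeriodicWatch(Watch):
    """Samples a supplier function at most once per interval."""
    def __init__(self, name, supplier, interval):
        super().__init__(name)
        self.supplier, self.interval = supplier, interval

    def poll(self, now):
        if not self.samples or now - self.samples[-1][0] >= self.interval:
            self.record(self.supplier(), now)
```

Each watch keeps its own sample history, so the same data could feed both an SLA check and a usage record. The optional `now` argument exists only so the sketch can be exercised with fixed timestamps.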

Service Control Adapters

Service Control Adapters encapsulate the control and monitoring of services running within the context of the Rio platform. The resources associated with the external service are monitored, and actions are triggered when defined resource thresholds are breached or cleared. For example, CPU utilization on a machine executing the service can be monitored. If CPU utilization exceeds 90% (a defined threshold), an action to start another service on a different machine can be triggered. Actions are determined by policy objects which are declared as part of a service's definition.
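
The breached/cleared behavior can be sketched as a monitor that fires a callback only when the observed value crosses the threshold, not on every sample. This is an illustrative Python sketch, not Rio code; `ThresholdMonitor` and its callbacks are hypothetical names, and the 90% figure is taken from the example above.

```python
class ThresholdMonitor:
    """Fires on_breach when a value first exceeds `high`,
    and on_clear when it first drops back to `high` or below."""

    def __init__(self, high, on_breach, on_clear):
        self.high = high
        self.on_breach = on_breach
        self.on_clear = on_clear
        self.breached = False

    def observe(self, value):
        if value > self.high and not self.breached:
            self.breached = True
            self.on_breach(value)
        elif value <= self.high and self.breached:
            self.breached = False
            self.on_clear(value)


events = []
cpu = ThresholdMonitor(
    high=90,
    on_breach=lambda v: events.append(("breach", v)),
    on_clear=lambda v: events.append(("clear", v)),
)
for utilization in (50, 95, 97, 80):
    cpu.observe(utilization)
print(events)  # [('breach', 95), ('clear', 80)]
```

Tracking the breached state means a sustained overload produces one event rather than one per sample, which keeps a policy handler from starting a new service on every reading.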

SCAs can control the startup and shutdown of the service and associated monitors. An SCA represents a process running on a machine that is external to the Rio platform but which is monitored and controlled by it. An SCA starts or stops the service process and initializes and destroys the associated Rio software components and processes. Typically, monitors start automatically when an SCA is started, but the monitors may also be managed while the SCA is active.
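
The lifecycle described above, an adapter that owns an external process and the monitors attached to it, might look roughly like the following. This is a Python sketch under stated assumptions, not the Rio Service Bean Adapter: `ServiceControlAdapter`, `RecordingMonitor` and their methods are hypothetical names, and the external service is stood in for by any command line.

```python
import subprocess
import sys


class RecordingMonitor:
    """Hypothetical monitor that only remembers whether it is active."""
    def __init__(self):
        self.active = False

    def start(self):
        self.active = True

    def stop(self):
        self.active = False


class ServiceControlAdapter:
    """Starts and stops an external process together with its monitors."""

    def __init__(self, command, monitors=()):
        self.command = list(command)
        self.monitors = list(monitors)
        self.process = None

    def start(self):
        self.process = subprocess.Popen(self.command)
        for monitor in self.monitors:   # monitors start with the service
            monitor.start()

    def stop(self):
        for monitor in self.monitors:   # monitors are torn down first
            monitor.stop()
        if self.process is not None:
            self.process.terminate()
            self.process.wait()
            self.process = None

    def running(self):
        return self.process is not None and self.process.poll() is None
```

For example, `ServiceControlAdapter([sys.executable, "-c", "import time; time.sleep(30)"], [RecordingMonitor()])` wraps a placeholder process; a real adapter would launch the server's own startup script instead.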

The SCAs working in conjunction with the monitors and policy objects ensure that services or containers such as web, application and database servers perform to expectations defined by SLAs. If expectations are not being met, the problem is automatically resolved by scaling up, scaling back, restarting, etc. The utility environment is dynamically managed efficiently and effectively. This also allows for more accurate cost modeling and tracking of resources used per user.

Tomcat Service Control Adapter

The Tomcat SCA is an example that demonstrates the basic capabilities of a Service Control Adapter. It shows how services external to the Rio platform can be managed through the network to ensure SLAs are being met, and how those services can be scaled dynamically as needed. The following diagram provides an overview of the architecture:

Components of the Tomcat SCA

The Tomcat SCA consists of the following components:

  • Service Bean files that implement the SCA

In the Rio architecture, service components are called Service Beans. Service Beans can be easily developed and deployed as services throughout a network. The Service Bean component model provides a level of abstraction on top of the current Jini service development model making it straightforward to develop and deploy Jini services. The Service Bean approach allows developers to focus on the solution space for their domain, not focus on the infrastructure level of developing Jini services.

The Tomcat SCA implementation extends the Rio Service Bean Adapter and provides the core of the SCA containing the logic to monitor and manage a Tomcat server.

  • Service User Interface

The ServiceUI enables a user interface to be associated with a Jini service. The interface provides access to SCA administrative capabilities to control the service.

  • Servlet code

The monitoring Servlet runs in the Tomcat Servlet/JSP container. It reads Tomcat log file information used for monitoring usage and status of the Tomcat server. It also polls the server to determine approximate access response times to verify that the server is accessible.

  • Tomcat Operational String

The operational string represents the aggregation of software assets that make up the Tomcat Service Control Adapter. It includes the SLAs which define acceptable operational and performance criteria for Tomcat servers, and it defines an association with an Apache HTTP service. The 'Requires' Association specifies that the Tomcat SCA requires the Apache SCA to be executing. The Apache service manages the Apache web server. This server is used to load balance HTTP requests across multiple Tomcat servers.

Policy Definitions

The Tomcat and Apache SCAs use Platform Policies to control the instantiation of Tomcat and Apache service instances on machines which have the requisite software already installed, or on machines that can have the software automatically provisioned, installed and configured. Rio platform components advertise the capabilities of the compute resource they have been instantiated on by loading a capabilities file, which provides details on the specific mechanisms and qualities. An entry in the Tomcat SCA's service definition lists system components that are required for instantiation. The following snippet of the Tomcat SCA's service description illustrates the markup:

<SystemComponent Class="com.sun.rio.qos.capability.enterprise.WebServerSupport">
    <Attribute Name="manufacturer" Value="Tomcat"/>
    <Attribute Name="version" Value="4.1.24"/>
</SystemComponent>

Support for this system component is determined by the platform policy object loaded by Rio platform components.
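
The matching step can be sketched as a comparison between the capabilities a compute resource advertises and the system component a service requires. The Python below is illustrative only: the dictionary layout, the host names, and the exact-match rule for attributes such as the version are assumptions, not a description of how Rio's platform policy objects decide support.

```python
def supports(advertised, required):
    """True if some advertised capability has the required class
    and carries every required attribute with the same value."""
    return any(
        cap["class"] == required["class"]
        and all(cap["attributes"].get(name) == value
                for name, value in required["attributes"].items())
        for cap in advertised
    )


def eligible_resources(resources, required):
    """Names of the compute resources able to host the service."""
    return [name for name, caps in resources.items() if supports(caps, required)]


WEB_SERVER = "com.sun.rio.qos.capability.enterprise.WebServerSupport"

# The requirement from the service description shown above.
required = {"class": WEB_SERVER,
            "attributes": {"manufacturer": "Tomcat", "version": "4.1.24"}}

# Hypothetical hosts and the capabilities each one advertises.
resources = {
    "node-a": [{"class": WEB_SERVER,
                "attributes": {"manufacturer": "Tomcat", "version": "4.1.24"}}],
    "node-b": [{"class": WEB_SERVER,
                "attributes": {"manufacturer": "Apache", "version": "2.0"}}],
}

print(eligible_resources(resources, required))  # ['node-a']
```

Because each resource carries its own capability list, no central inventory is consulted; the decision is made from what the resource itself advertises.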

Behavioral policies define an agreed upon level of performance for the system. In this example the Tomcat SCA's service description defines an SLA requiring the Tomcat server to handle no more than a certain maximum number of HTTP requests/second. If this limit is breached, an event is generated and the SLA Policy Handler for Tomcat determines what action to take. In this case, the Rio platform tries to provision another Tomcat SCA to a machine with the appropriate resources. If successful, the SCA starts another Tomcat server on a platform with the necessary platform capabilities. The new server then becomes part of a load balanced environment. The Apache service is notified of the new Tomcat server and makes the necessary configuration changes on-the-fly to begin load balancing.

The following XML in an operational string defines the desired Service Level Agreements for the "Hit Counter" watch.

<SLA ID="Hit Counter" Low="10" High="100" >
    <SLAPolicyHandler Class = "com.sun.rio.qos.ScalingPolicyHandler" />
    <Attribute Name="MaxServices" Value="2" />
</SLA>

This definition:

  • Defines an SLA for the Hit Counter watch.
  • Uses the ScalingPolicyHandler to monitor thresholds.
  • Indicates a server should be shut down if it is handling fewer than 10 hits/second.
  • Indicates another server should be started if the current server is handling more than 100 hits/second.
  • Can start a maximum of 2 Tomcat servers.
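
The scaling rule expressed by this SLA can be sketched in a few lines. The Python below parses the same XML and turns a hits/second reading into a decision; it illustrates the rule only and is not the implementation of `com.sun.rio.qos.ScalingPolicyHandler`. The function names are hypothetical, and keeping at least one server running is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

SLA_XML = """<SLA ID="Hit Counter" Low="10" High="100">
    <SLAPolicyHandler Class="com.sun.rio.qos.ScalingPolicyHandler" />
    <Attribute Name="MaxServices" Value="2" />
</SLA>"""


def load_sla(xml_text):
    """Extract the thresholds and the service limit from an SLA element."""
    root = ET.fromstring(xml_text)
    attributes = {a.get("Name"): a.get("Value") for a in root.findall("Attribute")}
    return {
        "id": root.get("ID"),
        "low": float(root.get("Low")),
        "high": float(root.get("High")),
        "max_services": int(attributes["MaxServices"]),
    }


def scaling_decision(sla, hits_per_second, running):
    """Return 'scale_up', 'scale_down' or 'hold' for one reading."""
    if hits_per_second > sla["high"] and running < sla["max_services"]:
        return "scale_up"
    # Assumption: never scale below a single running server.
    if hits_per_second < sla["low"] and running > 1:
        return "scale_down"
    return "hold"


sla = load_sla(SLA_XML)
print(scaling_decision(sla, 150, running=1))  # scale_up
print(scaling_decision(sla, 150, running=2))  # hold (MaxServices reached)
print(scaling_decision(sla, 5, running=2))    # scale_down
```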

The following XML in a portion of the service definition specifies that a Tomcat server needs an Apache server SCA to be available before starting up. This defines a 'requires' relationship stating that the Tomcat service requires an Apache service in order to execute.

<Associations>
    <Association Type="requires" Name="Apache HTTP Service Control Adapter" />
</Associations>

An Apache SCA must be executing; otherwise, Tomcat SCAs will not be publicly advertised to offer their services to clients on the network. This definition enforces the defined load balancing architecture.
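
The effect of a 'requires' association can be sketched as a gate on advertisement: a service is offered to clients only when every service it requires is running. The Python below reads the same XML; the function names and the idea of passing in a set of running service names are assumptions made for illustration, not Rio's association handling.

```python
import xml.etree.ElementTree as ET

ASSOCIATIONS_XML = """<Associations>
    <Association Type="requires" Name="Apache HTTP Service Control Adapter" />
</Associations>"""


def required_services(xml_text):
    """Names of the services listed with a 'requires' association."""
    root = ET.fromstring(xml_text)
    return [a.get("Name") for a in root.findall("Association")
            if a.get("Type") == "requires"]


def may_advertise(xml_text, running_services):
    """True only when every required service is currently running."""
    return all(name in running_services for name in required_services(xml_text))


print(may_advertise(ASSOCIATIONS_XML, set()))  # False
print(may_advertise(ASSOCIATIONS_XML,
                    {"Apache HTTP Service Control Adapter"}))  # True
```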

Dynamic Load Balancing Across Tomcat Containers

The Tomcat SCA demonstrates managing multiple Tomcat Servlet/JSP containers in a network. The configuration uses an Apache web server to serve static web pages. Dynamically generated pages are served by Tomcat. Multiple Tomcat containers are configured to be load balanced through the Apache web server. Initially, the Apache web server SCA and only one Tomcat container are started to handle requests. The Tomcat service specifications define the Tomcat SCA information, including an Association to an Apache Web Server SCA. Both SCAs are dynamically instantiated to available compute resources which support the system requirements declared by the Tomcat and Apache SCAs. If system or process resource thresholds defined in the service definitions are breached, a second Tomcat server is started automatically by the Rio platform on a machine which has Tomcat container capabilities and is configured to function in a load balancing architecture using Apache. Requests are then automatically load balanced across the two Tomcat servers through the Apache server.
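
How requests spread once a second server joins can be pictured with a simple round-robin sketch. In the example described here the balancing is done by the Apache web server itself; the Python below only models the idea, and `RoundRobinBalancer`, its methods and the backend names are hypothetical. Round robin is an assumed distribution strategy; the article does not state which one Apache was configured to use.

```python
class RoundRobinBalancer:
    """Hands out backends in rotation; new backends can join at any time."""

    def __init__(self, backends=()):
        self.backends = list(backends)
        self._next = 0

    def add_backend(self, backend):
        if backend not in self.backends:
            self.backends.append(backend)

    def route(self):
        if not self.backends:
            raise RuntimeError("no Tomcat backends registered")
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1
        return backend


balancer = RoundRobinBalancer(["tomcat-1"])
first = [balancer.route() for _ in range(2)]
balancer.add_backend("tomcat-2")      # second server provisioned after a breach
later = [balancer.route() for _ in range(4)]
print(first)  # ['tomcat-1', 'tomcat-1']
print(later)  # ['tomcat-1', 'tomcat-2', 'tomcat-1', 'tomcat-2']
```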

As new Tomcat servers are started, information about the new service instance and the resources it uses can be logged and attributed to the user or company that required the additional servers to meet the SLAs defined in operational strings. In this way, the resources used can be charged for appropriately.

Apache/Tomcat Metrics

The Tomcat and Apache services capture the following metrical elements:

  • System

CPU, memory and storage statistics are logged to keep track of how many resources the application or server is using.

  • Application

The Tomcat SCA uses Periodic Watches to read the access log files at regular intervals, retrieving statistics about the performance of the Tomcat server, including hits/sec and bytes/sec. A Stop Watch is used to test response time when accessing pages from the Tomcat server. It is also used to determine if the service is running.

  • Usage

Basic usage statistics are captured to track resource usage of the servers and to track individual user usage. A Tomcat server may be started on behalf of an ISP customer to maintain an agreed upon level of service. The client of an ISP using data center resources for the Tomcat server can then be accurately charged for used resources. Individual user usage can be tracked to further increase the granularity of resource usage.
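
The application metrics above, hits/sec and bytes/sec read from access logs, can be derived as in the following sketch. It assumes the log uses the Common Log Format (host, identity, user, timestamp, request line, status, bytes); the article does not say which log pattern the monitoring Servlet reads, and the sample lines and the `rates` function are invented for illustration.

```python
import re
from datetime import datetime

# host ident user [timestamp] "request" status bytes
LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "[^"]*" \d{3} (?P<bytes>\d+|-)'
)

SAMPLE = [
    '127.0.0.1 - - [10/Oct/2003:13:55:36 -0700] "GET /index.jsp HTTP/1.1" 200 100',
    '127.0.0.1 - - [10/Oct/2003:13:55:36 -0700] "GET /a.jsp HTTP/1.1" 200 200',
    '127.0.0.1 - - [10/Oct/2003:13:55:37 -0700] "GET /b.jsp HTTP/1.1" 200 300',
    '127.0.0.1 - - [10/Oct/2003:13:55:38 -0700] "GET /c.jsp HTTP/1.1" 304 -',
]


def rates(lines):
    """Return (hits per second, bytes per second) over the span of the lines."""
    times, total_bytes = [], 0
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        times.append(datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"))
        total_bytes += 0 if match["bytes"] == "-" else int(match["bytes"])
    if not times:
        return 0.0, 0.0
    # Treat anything shorter than one second as a one-second window.
    window = max((max(times) - min(times)).total_seconds(), 1.0)
    return len(times) / window, total_bytes / window


print(rates(SAMPLE))  # (2.0, 300.0)
```

The hits/sec figure is the value an SLA such as a hit-counter threshold would be compared against, while bytes/sec is more naturally a usage figure for accounting.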

Service Statements

Service statements contain a list of service records associated with a service running on specified compute resources. Each service record maintains information about the type and amount of resources used by the service. Service statement information can be mapped to accounting systems for tracking usage of specific resources by specific users. Service statements and associated records are the core components for Utility Computing usage and charging mechanisms.

The amount of resources used by SCAs can be tracked and recorded. If a small business is using Tomcat servers to deliver company content and the number of servers required is scaled up, the information is tracked and the business can be charged for only the resources used.
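
The relationship between service statements, service records and a charge can be sketched as follows. This is illustrative Python, not Rio's data model: the field names, the resource names (`cpu_hours`, `gb_transferred`) and the rates are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    resource: str   # type of resource used, e.g. "cpu_hours"
    amount: float   # quantity of that resource


@dataclass
class ServiceStatement:
    customer: str
    service: str
    records: list = field(default_factory=list)

    def totals(self):
        """Sum the recorded amounts per resource type."""
        out = {}
        for record in self.records:
            out[record.resource] = out.get(record.resource, 0) + record.amount
        return out

    def charge(self, rates):
        """Price the totals with a per-unit rate for each resource type."""
        return sum(amount * rates[resource]
                   for resource, amount in self.totals().items())


statement = ServiceStatement("Example Co", "Tomcat SCA", [
    ServiceRecord("cpu_hours", 2),
    ServiceRecord("cpu_hours", 3),        # second server added under load
    ServiceRecord("gb_transferred", 10),
])
rates = {"cpu_hours": 0.5, "gb_transferred": 0.25}   # hypothetical prices
print(statement.totals())       # {'cpu_hours': 5, 'gb_transferred': 10}
print(statement.charge(rates))  # 5.0
```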

Summary

SCAs demonstrate the benefits of the Rio architecture and the capability it provides to create a self-monitoring, self-healing network of services which can be used in a utility computing environment. Using Apache and Tomcat illustrates an example of policy based computing capabilities using open source tools. It shows the flexibility of the design allowing:

  • Definition of capabilities and service levels that correspond to a customer business model.
  • Custom configuration for a dynamic environment providing rapid provisioning of services.
  • Dynamic scaling corresponding to service usage.

This architecture and example also imply there is an ROI to be gained by:

  • Better utilization of resources leading to lower Total Cost of Ownership.
  • Assured service level and QoS using declarative policies.
  • Low upfront capital outlay through the use of open source tools and by not having to overbuild or overbuy resources.
  • Reduction of risk and easier adoption of new technologies by adding discrete components to the Rio architecture.

Expanding on these capabilities gives organizations the ability to change how compute resources are utilized. The utility model can be applied where pricing is based only on the resources that are used, rather than paying upfront for an entire configuration or set of services. A company can save money by paying only for resources used. The alternative, and the model currently in use, is to buy hardware that can handle peak loads but that typically ends up using only a small portion of the available resources. Utility computing is the evolutionary step up from Application Service Providers.
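
The difference between the two models comes down to simple arithmetic, sketched below in Python. The demand figures and the rate are invented for illustration, and charging both models at the same cost per server-hour is a simplifying assumption; in practice a utility provider's rate may differ from the cost of owned hardware.

```python
def peak_provisioned_cost(hourly_demand, cost_per_server_hour):
    """Cost when enough servers for the peak are kept running all the time."""
    return max(hourly_demand) * len(hourly_demand) * cost_per_server_hour


def utility_cost(hourly_demand, cost_per_server_hour):
    """Cost when only the servers actually needed each hour are paid for."""
    return sum(hourly_demand) * cost_per_server_hour


# Hypothetical demand: servers needed in each of four hours.
demand = [1, 1, 2, 1]
print(peak_provisioned_cost(demand, 1.0))  # 8.0
print(utility_cost(demand, 1.0))           # 5.0
```

The gap between the two figures grows as demand becomes more uneven, which is the case the peak-provisioning model handles least efficiently.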

Additional exploration of Rio usage to develop specific Utility computing solutions is currently underway. New Service Control Adapters can be easily written to represent, control and monitor additional applications and services to provide a utility computing implementation.

More information about Rio and Service Control Adapters can be found at rio.jini.org.
