On the Road to Web Service-Level Management
By: Mark Potts
Jan. 21, 2003 12:00 AM
Web services are now delivering on the promise of interconnecting systems, within and between organizational boundaries. But the open interoperability of such distributed resources also increases the complexity of the computing environment that has to be managed.
Earlier this year Gartner published a report defining a Web services management platform as one of four platforms required for successful Web service deployments. In this article we'll look at the role traditional systems management has played in the enterprise, requirements specific to managing a Web services environment, and how Web services in a managed environment can improve service levels while reducing the overhead and costs associated with managing complex distributed environments.
Monitoring includes collecting metrics and events relevant to the application from the application's execution environment (hardware, operating system, etc.), as well as the application itself. Control could include installation, configuration, startup, shutdown, and general health and status, as well as real-time tuning to ensure optimal performance. In order for an application to be "monitorable" and controllable, it needs to expose basic management information including identification, status, metrics, configuration, operations, and events. This set of information is referred to as the manageability model for the application. The information populating this model may be supplied explicitly by the application, implicitly through the application's environment, or through both channels.
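To make the idea of a manageability model concrete, here is a minimal sketch of a structure holding the six kinds of information listed above. The class name, field names, and the "OrderService" example are hypothetical and chosen only for illustration; they do not come from any standard.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ManageabilityModel:
    """Hypothetical container for the six kinds of management information."""
    identification: Dict[str, str]                 # e.g. name and version
    status: str = "unknown"                        # e.g. "up", "down", "degraded"
    metrics: Dict[str, float] = field(default_factory=dict)
    configuration: Dict[str, str] = field(default_factory=dict)
    operations: Dict[str, Callable[[], None]] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)

    def record_metric(self, name: str, value: float) -> None:
        """Store the latest value of a named metric."""
        self.metrics[name] = value

    def emit_event(self, description: str) -> None:
        """Append an event so a manager can collect it later."""
        self.events.append(description)

    def invoke(self, operation: str) -> None:
        """Run one of the control operations the application exposes."""
        self.operations[operation]()


def build_example_model() -> ManageabilityModel:
    """Build a model for an imaginary service with start/stop operations."""
    model = ManageabilityModel(
        identification={"name": "OrderService", "version": "1.0"},
        configuration={"max_connections": "50"},
    )

    def start() -> None:
        model.status = "up"
        model.emit_event("started")

    def stop() -> None:
        model.status = "down"
        model.emit_event("stopped")

    model.operations["start"] = start
    model.operations["stop"] = stop
    return model
```

In this sketch the application supplies everything explicitly; in practice some of the fields could just as well be filled in by the application's environment.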
The typical architectural model for enterprise systems management is the manager-agent. In this model, the management system communicates with an agent using a predetermined protocol that may be proprietary or based on standards. The agent is usually local to the application and responsible for communicating with the managed applications and the management system. The agent forwards events from the application to the management system, and forwards requests from the management system to the application.
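The manager-agent relationship can be sketched in a few lines. This is an illustration only: the class names and the three request strings ("start", "stop", "status") are assumptions, and a real agent would speak a management protocol rather than make in-process calls.

```python
from typing import Callable, List


class ManagementSystem:
    """Stand-in for the management system; it simply records events."""

    def __init__(self) -> None:
        self.received_events: List[str] = []

    def on_event(self, event: str) -> None:
        self.received_events.append(event)


class ManagedApplication:
    """Stand-in for the managed application."""

    def __init__(self) -> None:
        self.running = False
        self._listener: Callable[[str], None] = lambda event: None

    def set_event_listener(self, listener: Callable[[str], None]) -> None:
        self._listener = listener

    def handle_request(self, request: str) -> str:
        if request == "start":
            self.running = True
            self._listener("application started")
            return "ok"
        if request == "stop":
            self.running = False
            self._listener("application stopped")
            return "ok"
        if request == "status":
            return "up" if self.running else "down"
        return "unsupported"


class Agent:
    """Sits next to the application and relays traffic in both directions."""

    def __init__(self, application: ManagedApplication, manager: ManagementSystem) -> None:
        self._application = application
        self._manager = manager
        application.set_event_listener(self.forward_event)

    def forward_event(self, event: str) -> None:
        # Application -> management system
        self._manager.on_event(event)

    def forward_request(self, request: str) -> str:
        # Management system -> application
        return self._application.handle_request(request)
```

The manager never touches the application directly; every request and every event passes through the agent.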
SNMP, from the IETF, was developed as a stopgap solution for accessing management information on a device. SNMP, however, lacks native support for operations, secure authorization, and relationships. At the same time, it became obvious that the information model needed to be standardized independently of how the information was expressed or accessed.
The Distributed Management Task Force (DMTF) has been defining a standard manageability model for IT resources, the Common Information Model (CIM), for over five years. One of the most important benefits of CIM is the common vocabulary for simple concepts like status and description. The information in a CIM model can be accessed using Web-Based Enterprise Management (WBEM), which defines a protocol using XML-encoded CIM meta-schema, classes, and instances over HTTP. While many of the systems, devices, and network models are very mature and complete, the application model is still being developed.
Nearly all the existing management technologies, protocols, and models have been focused on managing the configuration and status of specific resources rather than business views of distributed applications and systems, in essence, business services. End-to-end views of resources used by a business system are hard to develop and understand and the status of those systems even harder to infer. The traditional coarse granularity of management status (up, down, degraded, etc.) does not provide enough context to tell a busy operator or business administrator whether a system, or service, needs attention. More importantly, it doesn't tell them if the system is behaving as expected by its users, namely their business partners, suppliers, customers, or other internal personnel.
Web Service Management
Gartner defines the Web Services Management Platform (WSMP) as "a set of software services that is designed to help coordinate the activities of services while they are being used." Different service provider platforms (e.g., WebSphere, .NET, etc.) will not be able to manage services deployed on other platforms due to disparate management components. Gartner views a WSMP as the bridge to enable and provide interoperability for managing services across platforms.
In order to achieve optimal IT investment, there must be strategic alignment between business requirements and IT investment and management to support that alignment, i.e., service-level management (SLM). SLM is achieved through the proper definition of services, relationships between services, and their correlation and representation as business processes. Traditional operations management platforms have been narrowly focused on specific systems and applications as opposed to a service-based dynamic environment.
Service-based management requires the provision and consumption of services in a nonintrusive manner while maintaining the loosely coupled nature of SOA. The WSMP does precisely this by acting as a transparent intermediary, or broker, between consumers and providers of services. The broker handles requests and manages the runtime provisioning of service endpoints to the requests dynamically. It finds the most appropriate service for service requests on demand. The broker then supports the interaction between consumer and provider with management facilities for availability, versioning, provisioning, configuration management, logging, auditing, alerting, error management, transformation, and integration with security facilities for authentication and authorization.
A WSMP that transparently mediates between Consumers and Providers to resolve service requests on demand offers two important advantages:
1. Messages received can be intercepted or inspected for additional information about the consumer or the request (for example, identity, geography, time, or price). That information can then be taken into consideration, offering opportunities for differentiated service offerings or intelligent routing based on context.
A WSMP should not be seen as a replacement for systems management, but rather as a conduit and extension to external, broader management facilities, such as those offered by Tivoli, CA, and BMC, that can correlate and add management to the infrastructure used to implement the Web service (hardware, networks, etc.).
Web Service-Level Management
For systems management to really meet the needs of the organization, both the resources being managed (Web services) and the manager (WSMP) need to take on certain responsibilities. The resources must provide enough information and operational interfaces that they can be centrally monitored and controlled. The Manager must be capable of analyzing the information provided by individual resources, correlating information from multiple resources, and acting on that information to better manage the Quality of Service (QoS) being achieved by services and offered to consumers. The Manager, where appropriate, should also be able to manage the environment proactively so that every attempt is made to meet the declared QoS, whether that is an internal service-level objective (SLO) or a more formal service-level agreement (SLA).
The major roles involved in interactions are the Provider and the Consumer; therefore, management should be addressed from the perspective of both roles. A Manager role, when seen from the Provider's perspective, has management capabilities and visibility beyond the Web service and into the service instance or implementation. The Provider, therefore, has management capabilities (visibility and control) over elements of the architecture that a Consumer would not, e.g. the hosting environment. The separation of service from its implementation and environment means that the lower-level elements that support the service are important to the Provider so they can manage at both levels. For example, service Providers may want to replicate service instances, launch new service instances to meet SLA at peak load times, or perhaps failover to alternative service instances hosted elsewhere. In this case, being able to manage the service as it is exposed to the Consumer and the elements of the architecture supporting that service is critical to meeting business objectives.
From the Consumer's perspective, only the service as defined by the service definition they have consumed can be managed. The Consumer acting as a Manager will have visibility and control of the requesters, but in most cases will not be offered management beyond visibility (metering and monitoring) for the service. For example, the Consumer (or a third party) may want to, and be allowed to, look at the performance and availability metrics or measurements offered by a service it consumes to ensure adherence to the SLA. There are three interrelated arenas of responsibility for Web services:
1. Service monitoring and reporting: Monitoring and reporting on the usage, health, and QoS being delivered by services
Service Monitoring and Reporting
Collected metrics are used to calculate measurements of performance, availability, and usage. With the Manager defining and recording measurements, services can be monitored to ensure they are meeting agreed-upon service levels. Breach conditions and early warning of potential breach conditions can then be monitored and managed. The SLAs that define measurements should include the formulas used to calculate the parameters of the SLA (average response time, throughput, and availability). SLAs are defined for specific customers, so metric collection needs to be consumer-aware and tie directly into the policies for authentication and authorization for the service. SLAs also define the periods of time for which service-level objectives are applicable. Again, this definition affects the configuration of the services in terms of warning events and metrics.
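As a rough sketch of how such measurements might be derived from raw metrics, the following code computes the three SLA parameters named above per consumer and classifies the result as ok, warning, or breach. The formulas are assumptions (a real SLA would spell out its own), as are the type names and the 80 percent early-warning ratio.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestMetric:
    consumer: str              # metrics are collected per consumer
    response_time_ms: float
    succeeded: bool


@dataclass
class ServiceLevelObjective:
    consumer: str
    max_avg_response_ms: float
    warning_ratio: float = 0.8  # assumed: warn at 80% of the limit


def average_response_ms(metrics: List[RequestMetric], consumer: str) -> float:
    """Mean response time over all of the consumer's requests."""
    times = [m.response_time_ms for m in metrics if m.consumer == consumer]
    return sum(times) / len(times) if times else 0.0


def availability(metrics: List[RequestMetric], consumer: str) -> float:
    """Fraction of the consumer's requests that succeeded."""
    own = [m for m in metrics if m.consumer == consumer]
    if not own:
        return 1.0
    return sum(1 for m in own if m.succeeded) / len(own)


def throughput_per_second(metrics: List[RequestMetric], consumer: str,
                          period_seconds: float) -> float:
    """Requests handled per second over the measurement period."""
    count = sum(1 for m in metrics if m.consumer == consumer)
    return count / period_seconds


def evaluate(slo: ServiceLevelObjective, metrics: List[RequestMetric]) -> str:
    """Return "breach", "warning" (early warning), or "ok"."""
    average = average_response_ms(metrics, slo.consumer)
    if average > slo.max_avg_response_ms:
        return "breach"
    if average >= slo.max_avg_response_ms * slo.warning_ratio:
        return "warning"
    return "ok"
```

Only the response-time objective is evaluated here; availability and throughput objectives could be checked the same way, and a fuller version would also restrict the metrics to the time period the objective covers.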
Service Execution Management
Web services that are provisioned may well be supported by many instances of the service so that anticipated capacity can be met within the agreed service levels defined for the service. This means that requests for service need to be intercepted or inspected so they can be routed to the most appropriate service instance available. In Grid computing, this can also mean life cycle management, where more service instances can be launched (capacity on demand) to support the Consumers and meet SLAs. Routing may also be affected by policies concerning the identity of the requester. Important Consumers may be shown greater consideration when routing their requests, or Consumers with a more stringent SLA may take priority over others with looser agreements, such that all SLAs are met.
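One possible reading of "most appropriate instance" and of SLA-based priority is sketched below. The selection rule (lowest relative load), the ordering rule (tightest response-time limit first), and all names are assumptions made for the example, not a prescribed algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ServiceInstance:
    name: str
    active_requests: int
    capacity: int              # assumed to be greater than zero
    available: bool = True


def choose_instance(instances: List[ServiceInstance]) -> Optional[ServiceInstance]:
    """Pick the available instance with the lowest relative load.

    Returns None when every instance is down or full, which a manager
    could treat as the signal to launch another instance.
    """
    candidates = [
        i for i in instances
        if i.available and i.active_requests < i.capacity
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i.active_requests / i.capacity)


def order_by_sla(consumers: List[str],
                 sla_max_response_ms: Dict[str, float]) -> List[str]:
    """Order waiting consumers so the most stringent SLA is served first.

    Consumers without an SLA entry are served last.
    """
    return sorted(consumers,
                  key=lambda c: sla_max_response_ms.get(c, float("inf")))
```

Because Python's sort is stable, consumers with equally stringent agreements keep their arrival order.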
Web services can also evolve over time, especially where Consumer requirements drive changes into existing services. Routing must therefore be cognizant of versions and compatibility between services, such that rolling upgrades as well as side-by-side versions of services can be managed appropriately.
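Version-aware routing can be illustrated with a deliberately simple compatibility rule. The rule used here is an assumption for the sake of the example: versions are (major, minor) pairs, and a deployed version can serve a request if it has the same major number and an equal or higher minor number.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int]  # (major, minor), a hypothetical versioning scheme


def select_version(requested: Version, deployed: List[Version]) -> Optional[Version]:
    """Return the newest deployed version compatible with the request.

    Assumed rule: same major number, minor number at least as high.
    Returns None when no side-by-side version can serve the request.
    """
    compatible = [
        v for v in deployed
        if v[0] == requested[0] and v[1] >= requested[1]
    ]
    return max(compatible) if compatible else None
```

With versions 1.0, 1.3, and 2.0 deployed side by side, a Consumer built against 1.2 would be routed to 1.3, while a Consumer built against 2.0 would be routed to 2.0.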
Managing the interactions between Consumers and Providers includes security and is part of managing the overall QoS. A managed environment provides security services that enable applications to enforce Access Control, Identity Management, and Entitlement Management. Requirements that define Quality of Protection (QoP) may well be defined in an SLA, and the policies that enforce that QoP are part of the configuration of a service and need to be enforced and managed.
Errors and failures are inevitable in any environment but should be managed (resolved) wherever possible so that Consumers are unaware of problems and can continue undisturbed. Errors and fault conditions need to be intercepted by the Manager before being returned to the Consumer, to see whether the faults can be resolved. This is, again, based on configuration metadata and driven by policies that govern retry attempts with the requested service, routing to alternate services, back-off times, and overall timeouts for requests.
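The four policy elements mentioned above (retries, alternate services, back-off times, and an overall timeout) can be combined in a short sketch. The policy fields, the function name, and the choice to model each service as a plain callable are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FaultPolicy:
    max_retries: int                # retries per service after the first attempt
    backoff_seconds: float          # pause between attempts on the same service
    overall_timeout_seconds: float  # budget for the whole request


def invoke_with_policy(services: List[Callable[[], str]],
                       policy: FaultPolicy,
                       sleep: Callable[[float], None] = time.sleep,
                       clock: Callable[[], float] = time.monotonic) -> str:
    """Try the requested service first, then each alternate in order.

    A fault is only surfaced to the caller when every service has failed
    or the overall timeout has been used up.
    """
    deadline = clock() + policy.overall_timeout_seconds
    last_error: Optional[Exception] = None
    for service in services:
        for attempt in range(policy.max_retries + 1):
            if clock() >= deadline:
                raise TimeoutError("overall timeout exceeded") from last_error
            try:
                return service()
            except Exception as error:  # intercept the fault before the Consumer sees it
                last_error = error
                if attempt < policy.max_retries:
                    sleep(policy.backoff_seconds)
    raise RuntimeError("all services failed") from last_error
```

The `sleep` and `clock` parameters exist only so the back-off and timeout can be replaced in tests; by default the standard library functions are used.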
Service Environment Management
Information specific to a service definition (functional) can also be related to other information concerning the service, for example the generic SLA that applies to this service, or other service definitions (operational) that can be utilized at runtime for monitoring and configuring services.
The Manager uses these declared relationships and associated meta information at deployment time and runtime. Once deployed, the Manager is responsible for publishing the service to the appropriate discovery mechanism, making management and operational controls available to management applications and consoles, and accommodating non-disruptive evolution of the service. Evolution includes managing side-by-side versions of services and routing appropriately between them based on Consumer requests and rolling upgrades, where service implementations can be changed and extended without disruption to the Consumers.
All three areas of management are interrelated and need to work collaboratively within a WSMP to deliver real business value to an organization. For example, performance and availability information needs to be made available to the execution management facilities so that they can effectively manage QoS; service-level objectives need to be considered for reporting, monitoring, and execution management; security and identity information needs to be utilized for differentiated service offerings; and service relationships need to be used to correlate management information and metrics and to support nondisruptive change management.
Moving to Web Service-Level Management
The first of the building blocks, the minimum, basic information required for managing Web services and their environment, is being developed and standardized in the W3C's Web Services Architecture Working Group's Management Task Force. The Web Services Architecture includes a requirement that implementations of the architecture must be manageable. A task force was initiated to satisfy this requirement and is working to publish a manageability model for each of the components of the Web services architecture, i.e., services, hosting environment, and discovery agency. The manageability model must include identification, configuration, metrics, and events for the components.
The second and third of the building blocks, access to and discovery of the management information, are being developed and standardized in the OASIS Management Protocol Technical Committee (MPTC). The MPTC is defining how to access manageability information for any managed resource using Web services. The same specification should also work for Web services as a specific type of managed resource.
These building blocks have been discussed in terms of how to manage Web services in particular, but you can see that the same principles can be applied to managing IT resources in general. The use of Web services to expose IT resources on Grid systems is being developed and standardized by GGF at Globus. They are also defining how to manage these IT resources using Web services and Grid services. It is logical that they should be able to leverage the same foundation building blocks being developed by the W3C and the OASIS MPTC.
Web service management platforms and Grid computing are critical components on the path to creating dynamic, service-centric networks of self-managing computing resources, and as Gartner rightly points out, "enterprises that do not embrace the producer and management platforms will fail to deliver any Web services beyond trivial initiatives through 2004."