SOA, Web Services and Mass Data Movement via ETL
The art of preserving legacy systems in the world of SOA solutions

On the one hand, there's extreme pressure on businesses to deliver new customer offerings and innovative business capabilities, match increased competition, and deal with new partners and providers of niche services to offer cheaper service.

On the other hand, most of us have legacy systems that prevent us from delivering to the business at the speed of business opportunities. Each legacy application houses only part of a business entity, and the data is stored in proprietary structures. The data is also owned by proprietary application logic (packaged or custom-built) that represents very narrowly defined business functions, where the business rules might apply to only a single business area or line of business.

The industry is looking to SOA as a mechanism to help businesses become more agile. The question, however, is how this can be achieved in an enterprise that has numerous critical-path legacy systems with the limitations mentioned above.

This article provides one possible solution to this conundrum. The architecture requires the transport of legacy data on-demand or in a scheduled manner to a common staging repository that lets the service layer apply non-legacy, non-line-of-business-specific business logic for enterprise-wide information sharing. Further, the service layer has logic that works on the base business data captured in disparate sources across the enterprise and transforms this into information.

This architecture pattern, displayed in Figure 1, offers a possible alternative to insulate the legacy systems from changes while promoting the reuse of the data collected by the legacy system. The solution also enables IT to react to the new business rules that need to be applied to one or more aspects of a business entity.

The core idea of the pattern is that an enterprise can use powerful ETL tools and processes to gather, reconcile, and finally populate enterprise repositories, reusing information stored in multiple legacy data structures and sources. The consumer is unaware of how the service provider gathers the relevant data or what the sources of that data might be. The consumer's only view of the business entity and the business rules governing it is through an enterprise-worthy service provider interface.

Standardized ETL processes are leveraged to transport, reconcile, and transform the granular data that represents various aspects of the business entity into enterprise business information, while services are leveraged to apply enterprise-wide business rules to make business information and business functions available to the business in a consistent manner.
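As a rough illustration of the gather, reconcile, and transform work described above, here is a minimal Python sketch. Everything in it is hypothetical: the two legacy extracts, their field names, and the `CustomerRecord` shape are assumptions made for the example, not details from the article. It only shows the idea that each legacy source holds part of a business entity and that one ETL process merges them into a single record.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical legacy extracts: each system holds only part of the
# business entity and uses its own field names and formats.
LEGACY_ORDERS = [{"CUST_NO": "0042", "ORD_TOTAL": "19.90"}]
LEGACY_CRM = [{"customer_id": 42, "name": "  Ada Lovelace "}]


@dataclass
class CustomerRecord:
    """Reconciled, enterprise-level view of one customer (assumed shape)."""
    customer_id: int
    name: str
    order_total_cents: int


def extract():
    """Gather raw rows from the (stand-in) legacy sources."""
    return LEGACY_ORDERS, LEGACY_CRM


def transform(orders, crm):
    """Reconcile keys and formats, then merge into one record per customer."""
    names = {row["customer_id"]: row["name"].strip() for row in crm}
    totals = {}
    for row in orders:
        customer_id = int(row["CUST_NO"])  # "0042" -> 42
        cents = int(Decimal(row["ORD_TOTAL"]) * 100)
        totals[customer_id] = totals.get(customer_id, 0) + cents
    return [
        CustomerRecord(cid, names[cid], totals.get(cid, 0))
        for cid in sorted(names)
    ]


def load(records, repository):
    """Populate the staging repository, keyed by the reconciled id."""
    for record in records:
        repository[record.customer_id] = record


def run_etl(repository):
    orders, crm = extract()
    load(transform(orders, crm), repository)


staging_repository = {}
run_etl(staging_repository)
```

A real ETL tool would replace the in-memory lists and dictionary with connectors to the legacy stores and the enterprise repository; the point of the sketch is only the separation of extract, transform, and load.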

The pattern also demonstrates how an enterprise can extend the use of specialized or silo legacy data to service-enable all of the various legacy applications. The cleansed and reconciled data is populated into an enterprise repository that is then accessed using enterprise rules. This insulates the consumer from having to know or care about the details of invoking these legacy batch processes. The service provider offers a clean layer of indirection between the consumers and the sources of enterprise-worthy information that is locked up in legacy repositories.

The following steps trace the sequence of calls shown in Figure 1 and illustrate how SOA and ETL can work together in practice. Figure 1 also shows how the architectural components are wired together to achieve the goals of the pattern:

1.  The Service Consumer calls the Enterprise Service to submit a request (scheduled) or invoke a service in real time.
2.  The Enterprise Service checks whether this information exists in its Enterprise Staging Repository.
3.  If the information isn't available, standardized ETL processes are executed (on demand or scheduled) that gather the relevant legacy data, transform it, and load the complete enterprise-worthy business entity information into the Enterprise Staging Repository. The ETL process also applies enterprise-worthy transformation and data reconciliation rules to the legacy data before loading it into the Enterprise Staging Repository.
4.  The Enterprise Service executes enterprise-worthy business rules and processing logic to perform a commonly used business function on behalf of the consumer.
5.  The Service Consumer is notified of the result of the execution of the business function.
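The five steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation behind Figure 1: the class name `EnterpriseService`, the in-memory dictionary standing in for the Enterprise Staging Repository, and the `fake_etl` and `needs_reorder` functions are all hypothetical.

```python
from typing import Any, Callable, Dict

Entity = Dict[str, Any]


class EnterpriseService:
    """Front-ends legacy data with a staging repository (assumed design)."""

    def __init__(self,
                 run_etl: Callable[[str], Entity],
                 business_rule: Callable[[Entity], Any]) -> None:
        self._staging: Dict[str, Entity] = {}  # stand-in for the staging repository
        self._run_etl = run_etl
        self._business_rule = business_rule

    def handle(self, entity_id: str) -> Any:
        # Step 2: check whether the entity is already staged.
        if entity_id not in self._staging:
            # Step 3: run the on-demand ETL and load the result.
            self._staging[entity_id] = self._run_etl(entity_id)
        # Step 4: apply the enterprise-level business rule.
        result = self._business_rule(self._staging[entity_id])
        # Step 5: return the result to the consumer.
        return result


# Hypothetical ETL job and business rule, used only to exercise the flow.
etl_calls = []


def fake_etl(entity_id: str) -> Entity:
    etl_calls.append(entity_id)
    return {"id": entity_id, "units_on_hand": 7}


def needs_reorder(entity: Entity) -> bool:
    return entity["units_on_hand"] < 10


# Step 1: the consumer calls the service; it never sees the ETL or the sources.
service = EnterpriseService(fake_etl, needs_reorder)
first = service.handle("sku-1")
second = service.handle("sku-1")  # served from staging, no second ETL run
```

The consumer depends only on `handle`, so the ETL job or the legacy sources behind it could change without affecting callers. A scheduled variant would populate the staging repository ahead of time rather than on the first request.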

I applied this pattern at a large retailer to deal with "collecting" and "populating" merchandise assortment information from various vendor merchandise assortment repositories and private brand merchandise assortment repositories. The ETL process transformed the relevant information about the various types of merchandise assortments before loading it into the enterprise repository. This enabled the service provider layer to apply enterprise-level "merchandise assortment rules" to satisfy the "optimize merchandise assortment by region" requests.

In conclusion, this architectural pattern shows how an enterprise can extend its legacy information assets while insulating the consumer layer from the details of the process. The legacy systems continue to apply localized rules when capturing the data, while the enterprise service adds only the enterprise rules layer, thus avoiding redundant application of business rules. In addition, front-ending the legacy data sources with a SOA-style service lets the SOA service management infrastructure be leveraged to manage and monitor consumer SLAs without affecting the fragile and customized legacy application code base. Thus, an enterprise that has legacy assets can still move ahead and apply the core principles of SOA, such as loose coupling and modular design, without sacrificing the stability of the legacy systems or decommissioning them.

The result is SOA-style services that can now expose enterprise information to satisfy any business process whether it's from within an enterprise context or from an extended enterprise business context.

Finally, the key value provided by this pattern is to demonstrate how enterprises can embark on the SOA path feeling empowered by the portfolio of legacy assets they have at their disposal instead of looking at them as a liability.

About Surekha Durvasula
Surekha Durvasula is the Manager of the corporate Enterprise Architecture Group for Kohl's Department Stores in Wisconsin.

Reader Feedback

First off, I want to thank you all for your comments. I also agree with you, John (Jones), that leveraging a real-time pub/sub interface is a great alternative for keeping the Enterprise Repository in sync. However, here are some of the reasons I introduced an ETL interface into the mix for keeping the Enterprise Repository in sync.

A) For Systems of Record that are legacy systems and/or packaged vendor products, it may not be feasible to extend them to create a publication process.

B) ETL holds the cross-business-domain and enterprise business concept transformation logic, and this logic is kept separate from the System of Record. Also, all of the relevant extraction and transformation logic required to create a single business concept from various Systems of Record is included in a single ETL process instead of in multiple pub/sub messaging processes.

C) Scheduled ETL is also effective in ensuring that only the finalized, stable state of long-running business transactions across the various Systems of Record is applied to the Enterprise Repository.

D) The ETL process includes information filter logic: it filters and transforms select System of Record transactions by correlating this information across business domains based on enterprise worthiness, relevance, and priority.

E) Finally, the enterprise could choose a scheduled ETL process as the primary synchronization mechanism for the Enterprise Repository. This could be run during the SoR transaction downtimes so as not to affect the operational and transactional systems, and also to increase the responsiveness of the business service.

Surekha.

John Jones wrote: Why not set up a real-time pub/sub interface between the legacy systems and the Enterprise Data Store? Then there is no data latency waiting for the ETL jobs to run when the data required by the requesting service hasn't been populated yet.
Fernando Labastida wrote: I'm glad to see an article about the use of ETL within a SOA environment. At our company we're seeing the dichotomy between the hype of SOA, the ability for everything to be encapsulated as a service via loose coupling with an application's web services API, and the fact that a huge chunk of corporate data out there is still in legacy formats, such as VSAM, ISAM, COBOL, and other hard-to-get-to formats with no APIs. Great article, and very informative!