Five Ways Data Virtualization Improves Data Warehousing
Data virtualization fills the EDW agility gap
By: Robert Eve
May. 26, 2011 03:00 PM
Business intelligence (BI), predictive analytics, data and content mining, portals and other applications tap a growing volume of information sourced from enterprise data warehouses (EDW). However, significant volumes of business-critical enterprise data reside outside the enterprise data warehouse. To deliver the most comprehensive information to business decision-makers, IT teams are implementing data virtualization to preserve and extend their existing enterprise data warehouse investments.
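This article discusses five integration patterns that combine enterprise data warehouses and data virtualization to solve real business and IT problems, along with examples from Composite Software's data virtualization customers. The five patterns are data warehouse augmentation, data warehouse federation, data warehouse hub and virtual spoke, complementing the ETL process, and data warehouse prototyping.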
Maximizing Value from Enterprise Data Warehouse Investments
This inexorable pressure has driven, and will continue to drive, the demand for enterprise data warehouses, as an array of BI, predictive analytics, data and content mining, portals and other key applications rely on data sourced from them.
However, business change often outpaces enterprise data warehouse evolution. And while the warehouse is useful for physically consolidating and transforming a large portion of enterprise data, significant volumes of enterprise data reside outside its confines. Further, enterprise data warehouses themselves require support throughout their lifecycles, driving demand for solutions that prototype, migrate, extend, federate and leverage enterprise data warehouse assets.
Data virtualization middleware, an advanced version of earlier data federation or enterprise information integration (EII) middleware, complements enterprise data warehouses by providing a range of flexible data integration techniques that preserve, extend and thereby drive greater business value from existing enterprise data warehouse investments.
1. Data Warehouse Augmentation
Data virtualization federates data warehouse information with additional sources, thereby extending existing data warehouse schemas and data. These complementary views are well suited to adding current data to historical warehouse data, detailed data to summarized warehouse data, and external data to internal warehouse data.
Energy Company Combines Up-to-the-minute and Historical Data - To optimize deployment of repair crews and equipment across more than 10,000 production oil wells, an energy company uses data virtualization to federate real-time crew, equipment and well status data from their wells and SAP's maintenance management system with historical surface, subsurface and business data from their enterprise data warehouse. The net result is faster repairs for more uptime and thus more revenue.
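The augmentation pattern can be sketched in a few lines. This is a minimal illustration, not the energy company's implementation: SQLite stands in for both systems, the attached database `ops` plays the live operational source, `main` plays the warehouse, and every table, column and function name is hypothetical.

```python
import sqlite3

# Hypothetical stand-ins: "main" plays the warehouse, "ops" a live operational system.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS ops")

conn.execute("CREATE TABLE main.well_history (well_id TEXT, avg_daily_output REAL)")
conn.execute("CREATE TABLE ops.well_status (well_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO main.well_history VALUES (?, ?)",
    [("W-1", 120.0), ("W-2", 45.5), ("W-3", 300.0)],
)
conn.executemany(
    "INSERT INTO ops.well_status VALUES (?, ?)",
    [("W-1", "running"), ("W-2", "down"), ("W-3", "down")],
)

# A TEMP view may reference tables in any attached database.
# It stores only the query, so no data is copied into the warehouse.
conn.execute("""
    CREATE TEMP VIEW repair_priority AS
    SELECT s.well_id, s.status, h.avg_daily_output
    FROM ops.well_status AS s
    JOIN main.well_history AS h ON h.well_id = s.well_id
    WHERE s.status = 'down'
""")


def wells_to_repair(connection):
    """Return wells that are currently down, highest historical output first."""
    return connection.execute(
        "SELECT well_id, avg_daily_output FROM repair_priority "
        "ORDER BY avg_daily_output DESC"
    ).fetchall()
```

Because the view is evaluated at query time, a status change in the operational source shows up in the next call to `wells_to_repair` without any reload of the warehouse.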
2. Data Warehouse Federation
Optimizing business performance requires data from across the various warehouses and marts an enterprise maintains. But physically combining multiple marts and warehouses into a single, complete enterprise-wide data warehouse is often too costly and time-consuming.
Data virtualization federates multiple physical warehouses. Two examples include combining data from the sales and financial warehouses, or combining two sales data warehouses after a corporate merger. This approach achieves logical consolidation of warehouses by creating an integrated view across them, using abstraction to rationalize the different schema designs.
Investment Bank Federates Financial Trading Data Warehouses - To enable more flexible customer self-service reporting and meet SEC compliance reporting mandates, a prime brokerage uses data virtualization to federate equity, fixed income and other investment position and trade information from siloed trading data warehouses. The net result is higher customer satisfaction and lower reporting costs.
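As a rough sketch of how an integrated view can rationalize different schema designs, the example below federates two warehouses whose tables name the same concepts differently. SQLite stands in for both warehouses, and all schema names, column names and sample values (including the placeholder instrument identifiers) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical siloed warehouses with different schema designs.
conn.execute("ATTACH DATABASE ':memory:' AS equity_dw")
conn.execute("ATTACH DATABASE ':memory:' AS bond_dw")

conn.execute("CREATE TABLE equity_dw.positions (acct TEXT, ticker TEXT, shares INTEGER)")
conn.execute("CREATE TABLE bond_dw.holdings (account_id TEXT, cusip TEXT, face_value INTEGER)")
conn.executemany(
    "INSERT INTO equity_dw.positions VALUES (?, ?, ?)",
    [("A1", "XYZ", 100), ("A2", "ABC", 50)],
)
conn.executemany(
    "INSERT INTO bond_dw.holdings VALUES (?, ?, ?)",
    [("A1", "000000AA0", 10000)],
)

# The view maps both schemas onto one canonical shape:
# (account, asset_class, instrument, quantity).
conn.execute("""
    CREATE TEMP VIEW all_positions AS
    SELECT acct AS account, 'equity' AS asset_class,
           ticker AS instrument, shares AS quantity
    FROM equity_dw.positions
    UNION ALL
    SELECT account_id, 'fixed_income', cusip, face_value
    FROM bond_dw.holdings
""")


def positions_for(connection, account):
    """Return one account's positions across both warehouses."""
    return connection.execute(
        "SELECT asset_class, instrument, quantity FROM all_positions "
        "WHERE account = ? ORDER BY asset_class",
        (account,),
    ).fetchall()
```

A reporting tool that queries `all_positions` sees a single logical warehouse and never needs to know that `acct` and `account_id` live in separate systems.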
3. Data Warehouse Hub and Virtual Spoke
Data virtualization provides virtual data marts that eliminate, or at least significantly reduce, the need for physical data marts around the data warehouse hubs. This approach abstracts the warehouse data to meet specific consuming tool and user query requirements, while still preserving the quality and controls inherent in the data warehouse.
Mutual Fund Manager Eliminates "Rogue" Financial Data Marts - A mutual fund company uses data virtualization to enable more than 150 financial analysts to build portfolio analysis models with MATLAB® and other analysis tools leveraging a wide range of equity financial data from a 10 terabyte financial research data warehouse. Prior to introducing data virtualization, analysts frequently spawned new satellite data marts with useful data subsets for every new project. To accelerate and simplify data access and to stop the proliferation of costly, unnecessary physical marts, the firm instead used data virtualization to create virtual data marts formed from a set of robust, reusable views that directly accessed the financial warehouse on demand. This enables analysts to spend more time on analysis and less on access, thereby improving portfolio returns. The IT team has also eliminated extra, unneeded marts and all the costs that go with maintaining them.
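The difference between a physical satellite mart and a virtual one can be illustrated with a small sketch. It assumes a single SQLite database as the warehouse; the table `equity_prices`, the copy `tech_mart_copy`, the view `tech_mart` and the sample rows are all hypothetical.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.execute(
    "CREATE TABLE equity_prices (ticker TEXT, sector TEXT, trade_date TEXT, close REAL)"
)
wh.executemany(
    "INSERT INTO equity_prices VALUES (?, ?, ?, ?)",
    [
        ("AAA", "tech", "2011-05-02", 10.0),
        ("AAA", "tech", "2011-05-03", 11.0),
        ("BBB", "energy", "2011-05-02", 20.0),
    ],
)

# Physical mart: a copied subset that must be stored and refreshed.
wh.execute(
    "CREATE TABLE tech_mart_copy AS "
    "SELECT ticker, trade_date, close FROM equity_prices WHERE sector = 'tech'"
)

# Virtual mart: a reusable view that reads the warehouse on demand.
wh.execute(
    "CREATE VIEW tech_mart AS "
    "SELECT ticker, trade_date, close FROM equity_prices WHERE sector = 'tech'"
)

# New data arrives in the warehouse after both marts were created.
wh.execute("INSERT INTO equity_prices VALUES ('AAA', 'tech', '2011-05-04', 12.0)")

ALLOWED = {"tech_mart", "tech_mart_copy"}


def row_count(connection, name):
    """Count rows in one of the two marts (names are checked against an allowlist)."""
    if name not in ALLOWED:
        raise ValueError(f"unknown mart: {name}")
    return connection.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
```

Here `row_count(wh, "tech_mart")` reflects all three tech rows, while `row_count(wh, "tech_mart_copy")` still reports the two rows captured when the copy was made, which is the staleness and maintenance cost the virtual mart avoids.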
4. Complementing the ETL Process
ETL tools can leverage data virtualization views and data services as inputs to their batch processes, where each view or service appears as just another data source. This integration pattern also integrates data source types that ETL tools cannot easily access and reuses existing views and services, saving time and costs. Further, these abstractions do not require ETL developers to understand the structure of, or interact directly with, actual data sources, significantly simplifying their work and reducing time to solution.
Energy Company Preprocesses SAP Data - To provide the SAP financial data required for their financial data warehouse, an energy company uses data virtualization to access and abstract SAP R/3 FICO data. This replaces an error-prone, SAP data-expert-intensive, flat-file-extraction process that would not scale across a complex SAP landscape. The results include more complete and timely data in the financial data warehouse enabling better performance management.
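A batch load that reads from an abstraction rather than from the raw source might look like the following sketch. It is not an SAP integration: `src_fin_docs` and its terse column names are invented stand-ins for a hard-to-read source, `gl_postings` is a hypothetical view, and amounts are assumed to be stored as integer cents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw source with terse column names and amounts stored in cents.
conn.execute("CREATE TABLE src_fin_docs (c_code TEXT, doc_no TEXT, amt_lc INTEGER)")
conn.executemany(
    "INSERT INTO src_fin_docs VALUES (?, ?, ?)",
    [("1000", "D1", 12550), ("1000", "D2", 450), ("2000", "D3", 9900)],
)

# The view hides the source structure behind business-friendly names and units.
conn.execute("""
    CREATE VIEW gl_postings AS
    SELECT c_code AS company_code,
           doc_no AS document_number,
           amt_lc / 100.0 AS amount
    FROM src_fin_docs
""")

# Target table in the (hypothetical) financial data warehouse.
conn.execute(
    "CREATE TABLE dw_company_totals (company_code TEXT PRIMARY KEY, total_amount REAL)"
)


def run_batch_load(connection):
    """Full-refresh batch step that treats the view as an ordinary input table."""
    with connection:  # commit on success, roll back on error
        connection.execute("DELETE FROM dw_company_totals")
        connection.execute(
            "INSERT INTO dw_company_totals "
            "SELECT company_code, SUM(amount) FROM gl_postings GROUP BY company_code"
        )
    return connection.execute(
        "SELECT company_code, total_amount FROM dw_company_totals ORDER BY company_code"
    ).fetchall()
```

The load step refers only to `gl_postings`, so a change in the underlying source layout would be absorbed by redefining the view rather than by rewriting the batch job.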
5. Data Warehouse Prototyping
Data virtualization middleware can serve as the prototype development environment for a new data warehouse. In this prototype stage, a virtual data warehouse is built rather than a physical one, saving the time needed to build the physical warehouse. This virtual warehouse includes a full schema that is easy to iterate on, as well as a complete functional testing environment. Performance testing is somewhat constrained at this stage, however.
Once the actual warehouse is deployed, the views and data services built during the prototype stage still have value. These are useful for prototyping and testing subsequent warehouse schema changes that arise as business needs or underlying data sources change.
Government Agency Prototypes New Data Warehouses - To reduce data warehousing time-to-solution for new data warehouse projects and changes to existing ones, a government agency uses data virtualization. Getting the data right this way has proven to be four times faster than directly building the ETL and warehouse, even when the subsequent translation of these working views into ETL scripts and physical warehouse schemas is factored in.
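The prototype-then-build workflow can be sketched as follows, under the assumption that a SQLite view plays the virtual warehouse table. The source table `src_orders`, the view `fact_sales` and both helper functions are hypothetical, and the final step simply materializes the view rather than generating real ETL scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO src_orders VALUES (?, ?, ?)",
    [(1, "east", 100.0), (2, "west", 250.0), (3, "east", 50.0)],
)


def define_prototype(connection, select_sql):
    """(Re)define the virtual fact table; each schema iteration is a view swap.

    select_sql is trusted developer input in this sketch, not user input.
    """
    connection.execute("DROP VIEW IF EXISTS fact_sales")
    connection.execute(f"CREATE VIEW fact_sales AS {select_sql}")


def materialize(connection):
    """Freeze the agreed view into a physical table with the same name."""
    connection.execute("CREATE TABLE fact_sales_physical AS SELECT * FROM fact_sales")
    connection.execute("DROP VIEW fact_sales")
    connection.execute("ALTER TABLE fact_sales_physical RENAME TO fact_sales")


REPORT = "SELECT region, total FROM fact_sales ORDER BY region"

# First attempt at the schema, then a revised one after feedback.
define_prototype(conn, "SELECT region, amount FROM src_orders")
define_prototype(
    conn, "SELECT region, SUM(amount) AS total FROM src_orders GROUP BY region"
)
virtual_result = conn.execute(REPORT).fetchall()

# Once the schema is settled, build the physical table and rerun the same report.
materialize(conn)
physical_result = conn.execute(REPORT).fetchall()
```

The same `REPORT` query runs unchanged against the virtual and the physical `fact_sales`, which is what lets functional testing done during the prototype stage carry over to the deployed warehouse.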