SOA to the Rescue, When Drug Discovery Needs Data Fast!
Information is key to drug discovery
By: Daniel Eng
Dec. 1, 2007 06:30 PM
As the demand for new medicines grows, so does the need for better information to manage and execute the R&D processes. There is huge pressure to make informed decisions, especially during the project's early stages when the risk is high and before downstream costs are added.
Pfizer spends billions on research projects annually. At Pfizer Global R&D, where the company's drug discovery takes place, research scientists and managers require vast amounts of up-to-the-minute information on lab results, submission status, and project schedules to move new research forward quickly. Management must constantly analyze the entire portfolio of new medicines in discovery to look for opportunities, trends, and areas where attention is needed. Researchers and managers strive to bring together the best ideas, practices, and policies, as well as the best use of information.
At Pfizer's Research Informatics Division within Global Research and Development, we seek to provide the best information possible to our R&D customers. Meeting this mission requires constant innovation. Over the past several years, we have faced a number of challenges that caused us to evolve our information delivery methods and technologies significantly. These changes include a new approach to real-time data integration, one that uses SOA data services and lets us build new solutions more rapidly and in alignment with our SOA strategies.
Data Integration Is a Critical Requirement
Through innovative use of analytics, reporting, and portal technology, we have made great strides toward improving how this information is presented internally. However, data integration remains the biggest challenge in effectively providing information to our researchers and managers.
Why is this critical? To properly assess a portfolio of discovery projects, Pfizer managers must pull data from sources such as packaged applications, historical data from data warehouses, document repositories, and custom systems. Each source has its own access mechanisms, syntax, and security. Few are structured properly for consumption, let alone reuse. These combined factors slow down new application development projects.
Time Is of the Essence!
For new IT projects, time is of the essence. Business agility requires IT agility. Pfizer's researchers and managers, like their business user counterparts, constantly ask IT for new information systems that help the business perform more efficiently, effectively, and competitively. This means we must build new systems quickly, so rapid application development (RAD) techniques are highly desirable. In fact, we continuously evaluate our Enterprise Development Life Cycle (EDLC) processes with the primary objective of reducing time-to-solution and responding faster to business needs.
SOA-Compliance Is an Important Requirement
With respect to SOA and data integration, we've found that SOA helps break down silo-type data gathering and integration processes by standardizing how data is promoted and reused. The ability to virtualize and abstract via data services helps groups to easily understand and consume data confidently, reliably, and quickly without having to hunt for these sources or rely on manual processes for gathering and integrating them.
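To make the idea of virtualizing and abstracting data behind a data service more concrete, here is a minimal sketch in Python. It is an illustration only, not a description of Pfizer's systems: the source functions, the ProjectDataService class, the field names, and the sample records are all hypothetical. The point it shows is that consumers call one service and receive a combined view assembled at request time, without needing to know where each piece of data lives or how each source is accessed.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins for two separate source systems.
# In practice these would be a packaged application, a warehouse, etc.
_LAB_RESULTS: List[Dict[str, str]] = [
    {"project_id": "P-001", "assay": "binding", "status": "complete"},
    {"project_id": "P-002", "assay": "toxicity", "status": "pending"},
]
_SCHEDULES: Dict[str, Dict[str, str]] = {
    "P-001": {"milestone": "lead selection"},
    "P-002": {"milestone": "screening"},
}


def fetch_lab_results(project_id: str) -> List[Dict[str, str]]:
    """Hypothetical adapter for a lab results source."""
    return [r for r in _LAB_RESULTS if r["project_id"] == project_id]


def fetch_schedule(project_id: str) -> Dict[str, str]:
    """Hypothetical adapter for a project scheduling source."""
    return _SCHEDULES.get(project_id, {})


class ProjectDataService:
    """Single access point that federates sources at request time.

    Nothing is copied into a separate store; each call reads the
    sources directly, so the view reflects their current contents.
    """

    def __init__(
        self,
        lab_source: Callable[[str], List[Dict[str, str]]],
        schedule_source: Callable[[str], Dict[str, str]],
    ) -> None:
        self._lab_source = lab_source
        self._schedule_source = schedule_source

    def get_project_view(self, project_id: str) -> Dict[str, object]:
        """Return one combined record for a project."""
        return {
            "project_id": project_id,
            "lab_results": self._lab_source(project_id),
            "schedule": self._schedule_source(project_id),
        }


service = ProjectDataService(fetch_lab_results, fetch_schedule)
view = service.get_project_view("P-001")
print(view["schedule"]["milestone"], len(view["lab_results"]))
```

Because the service takes its sources as parameters, a source can be swapped or added without changing the code that consumes the combined view, which is the kind of reuse the data services approach is meant to encourage.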
Old Extract & Mart-based Approaches Can't Meet New Requirements
After custom coding, our second approach to data integration has been replicated file extracts. File extracts handle data silos more efficiently than custom coding. For example, application teams that need data receive periodic file extracts from the application teams that manage the source data applications. This arms-length batch approach minimizes the impact on source systems and is useful for daily transaction summaries, shared reference data, etc. However, data integration beyond simple access (abstraction, transformation, federation, and more) requires extra work by the consuming team. This method also proliferates replicated data without any controls on quality, security, or scalability.
Extract, Transform, and Load (ETL) with data marts or warehouses is our third approach to data integration. This kind of physical data replication has several advantages in terms of rationalizing and combining heterogeneous data from multiple sources. For large-scale multi-dimensional analysis, we find data warehouses are effective solutions given their ability to support the large volumes and significant schema transformations typically required. To date, this has been the data integration approach of choice for our medium and large-scale data integration projects.
Unfortunately, these three approaches may not be entirely effective with our customers. Because our customers must make decisions based on near real-time data, they often can't afford the extra development time required for building and testing custom coding, file extracts, and data marts. Forcing our business users to wait extra months for new solutions to be developed has a huge impact on how quickly we get new drugs to market.
Further, typical data mart/replication architectures don't easily fit into our new SOA strategy. New data integration projects must be SOA-enabled from the start, so they can deliver value moving forward.
Given the accelerating business demand for new systems from the R&D groups we support, my team decided to find a new approach to data integration that lets us build additional real-time solutions more rapidly, and in alignment with our SOA strategies, while avoiding the replication downside.
To do this, we launched a project with the goal of identifying and adopting a new approach to data integration that meets the following criteria: