How do we solve Pharma’s Data Dysfunction Conundrum?

Alexandra Moens, Kristof Geentjens, John Coritz and Richie Kahn
Anju Software

The COVID-19 pandemic was an unforeseen event in 2020 that significantly affected the pace of drug research and development, as well as new drug market launches and uptake. One key factor that may have mitigated that impact was data: both access to it and the ability to leverage it to inform clinical development and the education of healthcare stakeholders and patient groups. Ironically, that same factor, in the form of relatively weak available data and, in particular, weak data intelligence capabilities, may be equally responsible for millions of dollars in losses to pharma. These losses came from drugs targeted for launch in 2020 that never made it to market, for reasons ranging from ill-informed clinical studies and trials to a poor understanding of the key influencers who can make or break market adoption.

In previous years, the paucity of data, and of access to it, had been a real challenge for pharma in bringing a promising drug to market. In recent years, however, the explosive growth of publicly sourced medical, clinical and scientific data and content available online has put the pharmaceutical industry in a better position than ever to expand its medicine portfolios while also optimizing every aspect of new drug-to-market costs.

A key barrier to making this a reality is that public data types and sources are still often incomplete, unrevised, untrustworthy or unstructured. Combined with old, manual processes for aggregating, analyzing and ranking this data to reach the best clinical development and go-to-market decisions, this is an obstacle to drug development. Imprecision in decision-making is something pharma can no longer afford. In fact, it should not even be an issue, given the technologies available to interpret complex data sets and reasonably predict clinical study outcomes as well as healthcare marketplace adoption of new therapies.

Why is much of pharma still stuck in this data dysfunction conundrum and still at a decidedly “data disadvantage,” and what steps can it take to remove itself from this morass? 

Why Big Data Didn’t Bring Nirvana for Pharma 

With the evolution of technology and the internet in recent years, the volumes of data at pharma’s fingertips should have given the industry a huge advantage in accelerating critical new therapies to market for patients. Unfortunately, it has never been about the amount of data, which continues to grow at exponential rates every day.  It has always been about the quality of that data and how quickly it is used and interpreted for pharma to reap the full value of its impact.

To be sure, pharma has been doing its best to get its arms around the explosive growth of big data to improve processes ranging from every level of clinical drug research and development to building relationships with top medical influencers who can make or break a medicine’s success. However, the first thing the industry needs to focus on is optimizing the data already available. While some pharma organizations are far more advanced in making “big” data “smaller,” much of the industry still relies on time-consuming data analysis techniques and/or underutilizes technology that can speed the extraction of insights from largely unstructured data sets. This means a significant number of the time-pressured clinical and market strategy outcomes seen today – good and not so good – are based on incomplete, suboptimal information vetting.

Dated Data & Deficient Data Gathering Further Challenge Pharma

Again, the rich panoply of data accessible to pharma is not the issue. Pharma would rather make decisions based on more data than less. But the mere existence of big data, and access to it, does not automatically translate into knowing which new drug candidates are most likely to succeed, planning clinical studies and trials optimally, or knowing which healthcare providers will champion a certain therapy.

Furthermore, while there is an endless sea of new data growing each day, many of the key data sets pharma needs for clinical drug development, such as public trial registries, abstract databases and publication libraries, are often inaccurate and dated. This data is purposely kept incomplete or not regularly updated by pharma clinical trial sponsors to keep information secret from competitors. As a result, organizations that rely primarily on this publicly sourced “old” data, without analyzing it alongside other data sets, are making high-risk and potentially high-cost decisions if anticipated outcomes fall short.

Additionally, today, it is frequently challenging for pharma companies to get a good picture of what competitors are doing.  This can impact everything from knowing if there is a large enough or right patient profile pool for trials in a particular geography, to knowing whether the right investigators are available to influence drug adoption rates.

Another big blind spot for pharma is knowledge about the data it already holds internally: where it is located and whether it is being repurposed efficiently. Today, it is conceivable that companies use more than 50% of the data they research for one-time projects only, and then erase it from their systems or never use it again. In addition, much of the data that pharma collects is segregated into silos by business units that use separate databases, or is recorded only in Excel spreadsheets. This data is never shared across the organization to determine whether it may be relevant for other projects and initiatives.

An additional challenge comes with merging data, and similar data types, from different sources. This comes on top of managing an array of data fields, naming conventions, terminologies, update frequencies, data accuracy, data rules, and constantly changing public domain data. It becomes extremely challenging for pharma to keep track of all these data elements and still provide a high-quality end product. All of these data process inefficiencies are holding back medical advances and the timely delivery of medicines to the patients who need them most.
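The merging problem described above can be made concrete with a small sketch. It assumes two hypothetical sources that describe the same trials under different field names, and it keeps the most recently updated record for each trial. The field names, the alias table and the sample records are all illustrative assumptions, not the schema of any real registry or vendor platform.

```python
from typing import Any, Dict, List

# Hypothetical alias table: source-specific field names -> one shared schema.
FIELD_ALIASES = {
    "nct_id": "trial_id",
    "NCTNumber": "trial_id",
    "status": "status",
    "OverallStatus": "status",
    "last_update": "last_updated",
    "LastUpdatePostDate": "last_updated",
}


def normalize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known fields to the shared schema and trim stray whitespace."""
    out: Dict[str, Any] = {}
    for key, value in record.items():
        canonical = FIELD_ALIASES.get(key)
        if canonical is not None:  # unknown fields are dropped in this sketch
            out[canonical] = value.strip() if isinstance(value, str) else value
    return out


def merge_sources(*sources: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge records by trial_id, keeping the most recently updated one."""
    merged: Dict[str, Dict[str, Any]] = {}
    for source in sources:
        for raw in source:
            rec = normalize(raw)
            trial_id = rec.get("trial_id")
            if trial_id is None:
                continue  # cannot be matched without an identifier
            current = merged.get(trial_id)
            # ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
            if current is None or rec.get("last_updated", "") > current.get("last_updated", ""):
                merged[trial_id] = rec
    return merged


# Illustrative sample data (made up for this sketch).
public_registry = [
    {"NCTNumber": "NCT00000001", "OverallStatus": "Recruiting",
     "LastUpdatePostDate": "2020-03-01"},
]
internal_db = [
    {"nct_id": "NCT00000001", "status": "Completed ", "last_update": "2020-11-15"},
    {"nct_id": "NCT00000002", "status": "Recruiting", "last_update": "2020-06-30"},
]

merged = merge_sources(public_registry, internal_db)
```

Here the stale public record for the first trial is superseded by the fresher internal one. A real pipeline would also need to reconcile terminologies and conflicting values rather than simply preferring the newest record.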

Automating Data Analysis for Faster, Smarter Decisions

Medical and clinical data that can be utilized for pharma’s benefit will only continue to grow, even if it comes from incomplete or unvalidated data sources. The mission for pharma is to identify the right technology tools that can quickly and efficiently sift through both the good data and the good nuggets in the bad data to make therapy development and delivery as precise and certain as possible. There are intelligent and predictive analytics that can centralize, integrate and cross-compare, in advanced visual formats, private, public and third-party aggregated data across a pharma organization. These same analytics can also analyze, rank and score that data to optimize clinical trial decision-making and outcomes, and to help understand the changing HCP and influencer dynamics that can affect therapy adoption. Ultimately, this analytics technology will maximize the work of every function within a pharma company. Unfortunately, social media and health insurance claims data, which can provide the freshest insights into a fast-changing healthcare environment, are two examples of categories where pharma is not necessarily using advanced data intelligence analysis platforms to extract full value.

Today’s available technologies help pharma achieve a new level of performance by linking internal and external data sets to build predictive machine-learning models that more precisely identify the drivers of clinical trial site performance (for example, site congestion, protocol parameters and excitement around the target). In fact, predictive models can sometimes help enroll patients three to six months faster and reduce the default rate for clinical trials.* New machine-learning technologies can also create new standards of data trustworthiness and accountability for pharma. With these technologies, pharma companies can use internal tools to run trial feasibility simulations and share that data with CROs to identify the best PIs for a clinical trial, saving time and money.
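As a rough illustration of ranking sites on the kinds of drivers mentioned above, the sketch below scores each site from its past enrollment rate and a penalty for congestion. This is a hand-weighted heuristic standing in for a trained model, not a machine-learning method. The `Site` fields, the `CONGESTION_PENALTY` value and the sample sites are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    past_enrollment_rate: float  # patients per month in earlier trials (assumed metric)
    competing_trials: int        # simple proxy for site congestion (assumed metric)


# Hypothetical weight: each competing trial costs half a patient per month.
CONGESTION_PENALTY = 0.5


def score(site: Site) -> float:
    """Higher is better: historical enrollment minus a congestion penalty."""
    return site.past_enrollment_rate - CONGESTION_PENALTY * site.competing_trials


def rank_sites(sites: List[Site]) -> List[Site]:
    """Return sites ordered from most to least promising under this heuristic."""
    return sorted(sites, key=score, reverse=True)


# Illustrative sample data (made up for this sketch).
sites = [
    Site("Site A", past_enrollment_rate=4.0, competing_trials=5),
    Site("Site B", past_enrollment_rate=3.0, competing_trials=1),
    Site("Site C", past_enrollment_rate=1.0, competing_trials=0),
]

ranked = rank_sites(sites)
ranked_names = [s.name for s in ranked]
```

With these made-up numbers, Site B outranks Site A despite a lower historical enrollment rate, because Site A is heavily congested. A predictive model would learn such trade-offs from linked internal and external data rather than from a fixed weight.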

The end result of pharma increasing its use of intelligent and predictive analytics tools is faster, smarter data-driven decisions. This is critical because, for every day that a new drug or device is delayed in entering the market, sponsors stand to lose $600,000-$8,000,000 in revenue, depending on market share and indication.* Reducing these delays is a win for pharma and the patients it serves.
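To put the per-day figures above in perspective, the short sketch below multiplies them by a delay length. It assumes, for simplicity, that the daily loss stays constant over the whole delay, which the article does not state. The function name and the 30-day example are illustrative.

```python
from typing import Tuple

# Per-day revenue loss range quoted in the article, in US dollars.
DAILY_LOSS_LOW = 600_000
DAILY_LOSS_HIGH = 8_000_000


def delay_cost_range(days_delayed: int) -> Tuple[int, int]:
    """Return (low, high) estimated revenue loss for a launch delay.

    Assumes a constant daily loss, which is a simplification.
    """
    if days_delayed < 0:
        raise ValueError("days_delayed must be non-negative")
    return days_delayed * DAILY_LOSS_LOW, days_delayed * DAILY_LOSS_HIGH


low, high = delay_cost_range(30)
print(f"30-day delay: ${low:,} to ${high:,}")
```

Under that assumption, a one-month delay corresponds to roughly $18 million to $240 million in lost revenue.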

Ultimately, the goal of combining new approaches to data handling with advances in clinical and medical data intelligence technologies is for pharma to achieve the highest possible level of data optimization. If reached, this, according to a McKinsey study, “can bring medicines to market 500 days faster, which would create a competitive advantage within increasingly crowded asset classes and bring much-needed therapies to patients sooner.” The study adds that, to transform drug development, “this acceleration can be combined with improved quality and compliance, enhanced patient and healthcare professional experience, better insights and decision-making, and a reduction in development costs of up to 25%.”*

Clearly, pharma has a mandate to find solutions to the data dysfunction conundrum that has existed for far too long. This can be achieved by changing mindsets about how to truly make data an asset and by adopting technologies that can turn data assets into robust data insights.

