

Unlock clinical insights with research-critical NLP technology

Learn about natural language processing (NLP) challenges and opportunities in the real-world data (RWD) space from Optum® NLP experts

September 2024 | 7-minute read

The nuanced, messy, exciting behind-the-scenes of NLP models

Typically, manually reviewing unstructured data from electronic health records (EHRs) isn’t feasible due to the time commitment required to sort through thousands or millions of clinical notes in a patient cohort. But these data provide valuable information about the patient journey, including details about symptom severity, patient lifestyle factors and other variables that can help support the commercialization of your product.

That’s where natural language processing technology comes in. NLP models are crucial for helping life sciences researchers extract insights from free-text EHR data. That’s why Optum® Life Sciences curates EHR data using proprietary NLP systems to help you understand the progression of diseases with details often not readily available in structured formats.

Dive deeper into the challenges and importance of using innovative NLP models when working with clinical data in this interview with two Optum experts: Vikash Verma, BPharm, MBA, Director, Data Science and Anne-Marie Currie, PhD, Senior Director, Data Science.

Why is it difficult to build NLP models?

Vikash: Building NLP models using clinical data and unstructured notes presents several challenges for us as researchers.

The contextual ambiguity of clinician notes makes things tricky because we must develop NLP algorithms that can correctly recognize and differentiate clinical elements such as medications, diseases, procedures and tests. In addition, the algorithms must consider each element’s timing, its presence or absence and whether it relates to the patient or a family member. These models must also handle broader contexts such as idiomatic expressions, cultural references and domain-specific jargon. This requires advanced algorithms and diverse training data.
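To make the context problem concrete, here is a minimal rule-based sketch in Python. It is an illustration only, not the system described in this interview: the trigger word lists, the `Mention` class and the `classify_mention` function are all hypothetical, and production systems rely on much larger validated lexicons or trained models.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical trigger lists, kept tiny for illustration.
NEGATION = re.compile(r"\b(no|denies|denied|without|negative for)\b", re.I)
FAMILY = re.compile(r"\b(mother|father|sister|brother|family history)\b", re.I)
HISTORICAL = re.compile(r"\b(history of|previous|prior|years ago)\b", re.I)


@dataclass
class Mention:
    term: str
    negated: bool       # presence: was the finding ruled out?
    experiencer: str    # relation: "patient" or "family"
    historical: bool    # timing: past rather than current?


def classify_mention(sentence: str, term: str) -> Optional[Mention]:
    """Tag one clinical term using trigger words that precede it in the sentence."""
    match = re.search(re.escape(term), sentence, re.I)
    if match is None:
        return None
    # Assumption: only the text before the term, in the same sentence, matters.
    window = sentence[:match.start()]
    return Mention(
        term=term,
        negated=bool(NEGATION.search(window)),
        experiencer="family" if FAMILY.search(window) else "patient",
        historical=bool(HISTORICAL.search(window)),
    )


if __name__ == "__main__":
    print(classify_mention("Patient denies chest pain.", "chest pain"))
    print(classify_mention("Mother had breast cancer at age 50.", "breast cancer"))
```

Even this toy version shows why the task is hard: a trigger word anywhere before the term changes the label, so a sentence that mixes a past condition with a current symptom would be tagged incorrectly without more careful scoping.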

Language complexity is a factor too, given that notes may deviate from standard English since clinicians often write hastily in patient records. Notes may include irregular grammar, medical jargon, acronyms, abbreviations and languages other than English. We can’t forget that these notes exist for reimbursement purposes and clinical care documentation, not life sciences research.

Then there’s the diversity and quality of unstructured data. Notes from EHRs differ greatly in structure and format across individual clinicians, health systems and EHR vendors. There’s no set “standard” for the quality of data contained in clinical notes — there may be documentation gaps for certain clinical elements. And the data extracted from notes can significantly differ from academic and publicly available data sets, with real-world data (RWD) being more variable and noisier.

The lack of standardized guidelines for training NLP models on RWD, coupled with the variability in medical nomenclature, poses significant challenges. NLP models need extensive training to work effectively with clinical data. But there’s a scarcity of relevant data for training these models.

Scalability can also be a concern because more advanced models — particularly those employing deep learning — are resource-intensive. The computational resource requirements limit the scalability and accessibility of advanced NLP models, especially for smaller organizations. The process of annotating and curating data for NLP is labor-intensive and requires expert knowledge, making the development of robust NLP systems even more costly and complex.

Ethical concerns and the risk of bias are also relevant, especially if NLP models are trained on a small fraction of actual clinical data. These models may underperform in real-world applications and may replicate existing biases from training data. This can lead to discriminatory effects in areas such as designing health insurance plans and determining treatment outcomes. This highlights the importance of developing and using NLP technologies in a compliant, legal and responsible manner.

Broadly speaking, how do you tackle these challenges?

Anne-Marie: We can’t stress enough how important it is to take a multidisciplinary approach to building NLP models. The collective expertise at your organization is invaluable. Addressing these difficulties requires input from technical, clinical, scientific, legal and compliance partners. There are also ethical and cultural aspects to consider. The published literature can help too, alongside subject matter experts at your organization with niche clinical and legal backgrounds.

To build models that can overcome these challenges, you need deep domain expertise driving the development of your NLP models. This knowledge is vital for answering research questions and training accurate models that consider the subtleties and nuances in clinical notes.

Some challenges require specific solutions. For example, to mitigate the risk of bias, we require bias testing in our evaluation practices.
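One simple form that bias testing can take is comparing a model’s recall across patient subgroups. The sketch below assumes that approach for illustration; the interview does not say which tests are used, and the function names and sample records are hypothetical.

```python
from collections import defaultdict


def recall_by_group(records):
    """Compute recall per subgroup.

    records: iterable of (group, gold_label, predicted_label), with boolean labels.
    Only gold-positive records count toward recall.
    """
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for group, gold, pred in records:
        if gold:
            if pred:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    groups = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in groups}


def max_recall_gap(recalls):
    """Largest difference in recall between any two subgroups."""
    return max(recalls.values()) - min(recalls.values())


if __name__ == "__main__":
    # Hypothetical evaluation records: (subgroup, gold, predicted)
    sample = (
        [("A", True, True)] * 3 + [("A", True, False)] * 1
        + [("B", True, True)] * 2 + [("B", True, False)] * 2
    )
    recalls = recall_by_group(sample)
    print(recalls, max_recall_gap(recalls))
```

A large gap between subgroups is a signal to investigate further, for example by checking whether one group is underrepresented in the training notes.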

As NLP continues to evolve, input from each of these domains will play a critical role in shaping the future of how we understand and use clinical data in life sciences research. 

Why do you need and use multiple models?

Vikash: We carefully select and fine-tune the most appropriate model for each specific task or use case. And while our models share some similarities, their architectural differences, training objectives and capabilities can significantly affect their performance and suitability for certain tasks.

For example, we may use a BERT-based model, such as BioBERT, for named entity recognition and relation extraction tasks involving clinical notes. A GPT-based model may be better for text generation and summarization tasks. And we may employ domain-specific models like PubMedBERT for tasks involving biomedical literature.

With a diverse portfolio of NLP models that incorporate the latest advancements in this field, our team is prepared to deliver tailored solutions that address the unique challenges and requirements of the health care industry.

How do you make decisions when designing models?

Anne-Marie: Our team makes several key decisions as they design clinical data-based NLP models. These decisions are driven by our strong understanding of the clinical domain and our commitment to developing models that directly address real-world patient care needs.

First and foremost is protecting patient privacy. We can’t advance research without protecting patients through practices that promote responsible use and adhere to legal and privacy compliance guidelines. Knowing that enterprise privacy guardrails are in place gives us peace of mind when working with real-world patient records.

Data selection and pre-processing of the data used for training our models is one of the first big steps. Our teams prioritize data sets that are diverse in terms of clinical scenarios, specialties and writing styles, so that the model can generalize to data encountered in a larger batch of notes. Data cleanliness is important too; our teams meticulously clean the data to remove errors, noise and inconsistencies that could negatively affect how the model performs.

Annotation strategy is another key decision. We employ clinical subject matter experts to meticulously annotate the data for accurate labeling of entities, relations and other relevant information. We measure inter-rater reliability and follow the best practice of double-blind annotation to ensure the reliability of our ground-truth data, which contributes to a more robust NLP model. Using domain-specific ontologies and knowledge bases helps standardize and unify the representation of clinical concepts across different data sources.
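Inter-rater reliability is often summarized with Cohen’s kappa, which corrects the raw agreement between two annotators for the agreement expected by chance. The sketch below assumes that metric for illustration; the interview does not specify which statistic is used, and the labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: based on how often each annotator used each label.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_1 = ["DRUG", "DRUG", "DISEASE", "DISEASE"]
    annotator_2 = ["DRUG", "DISEASE", "DISEASE", "DISEASE"]
    print(cohens_kappa(annotator_1, annotator_2))  # prints 0.5
```

In this example the annotators agree on three of four items (0.75), but half of that agreement is expected by chance (0.5), so kappa is 0.5.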

Model architecture and training is the next critical juncture. We use a variety of approaches that combine the strengths of generative models and traditional deep learning, as well as discriminative models tailored to specific medical conditions or problem statements. By fine-tuning these models on our annotated data, we can capitalize on the domain-specific knowledge embedded in these representations, which results in robust and accurate performance.

Finally, we emphasize rigorous evaluation and deployment strategies. Our team uses effective sampling mechanisms and comprehensive pattern analysis to ensure that models can handle the full spectrum of variations present in clinical data. 

What’s it been like creating NLP models with Optum data?

Vikash: Optum clinical data have some unique attributes that support us in building accurate, reliable NLP models. The diversity and complexity of the data, including the wide range of clinical scenarios, medical specialties, writing styles, document structures and linguistic complexities, help models better generalize to real-world populations.

Plus, with wide domain coverage, our teams know the data asset will adequately cover the target domain for a given study. This is crucial to cover all relevant medical terminology, abbreviations, procedures, treatments and other domain-specific knowledge. We need the model to effectively understand and process clinical language relevant to our research.

These properties ensure that our resulting NLP models will be better equipped to handle the complexities of clinical language, leading to more accurate and reliable performance in real-world health care applications.

Why is it important to pair deep clinical expertise with AI?

Anne-Marie: Combining artificial intelligence (AI) capabilities with clinical expertise enables the models to reason more like clinicians, meaning they can comprehend not just the literal text but also the underlying intent and implications.

Plus, health care applications demand a high degree of accuracy and trustworthiness. Incorrect interpretations or decisions can have consequences for patient safety and outcomes.

By incorporating deep clinical knowledge into the model development process, subject matter experts can confirm the clinical validity of the models’ outputs and resolve ambiguities. This helps ensure that the extracted information is grounded in medical evidence and best practices.

What’s complicated about companies doing this work themselves?

Vikash: Researchers may encounter annotation challenges. Annotating large data sets for NLP model training is a time-consuming and resource-intensive process, regardless of the underlying data source.

Then there’s the computational resources required. Training NLP models, especially large language models (LLMs), requires significant computational resources, including powerful hardware (GPUs/TPUs) and infrastructure. This computational overhead can be a bottleneck for organizations with limited resources.

The maintenance and updates needed to keep NLP models accurate and relevant also require significant time and resources. Clinical practices evolve over time, so models will quickly become obsolete if maintenance isn’t prioritized.

While we’ve seen life sciences companies successfully complete this work in-house, it requires careful data curation, robust annotation practices and significant computational resources to develop and maintain accurate and generalizable NLP models for clinical applications. That’s why the Optum NLP team is dedicated to making these projects possible. We want researchers to get the enriched clinical details they need to move the needle on impactful research studies.

Infuse your research with expertly curated clinical details

NLP technology helps empower life sciences companies to unlock patient insights previously hidden in clinician notes. With expanding NLP applications, researchers can access more holistic patient stories across therapeutic areas, beyond what was previously possible.

However, developing, maintaining and using these models is challenging. Researchers must be prepared to regularly refine NLP models, evaluate them rigorously, adhere to stringent privacy and data quality standards and commit to ongoing learning.

Want to skip the model development process and dive straight into the research? Optum Life Sciences offers off-the-shelf data, pre-enriched via NLP — powered by our foundational clinical data. If you need additional clinical details to address a specific question, custom projects are always an option too.

There's a lot to consider, but beginning to understand what makes sense for your organization is the first step on the path to unlocking a new realm of research possibilities. 
