Thursday, October 23, 2025

Extracting COVID-19 Long-Haul Symptoms from Clinical Notes Using Hybrid NLP

Institutional Review Board Approval and Its Role in Biomedical Research

Introduction to Ethics Oversight

Ethical oversight in biomedical research protects the safety of participants and the integrity of studies. The Institutional Review Board (IRB) provides the framework for evaluating research protocols against strict guidelines. A recent study that analyzed clinical notes within the RECOVER network was reviewed under this framework by the Biomedical Research Alliance of New York (BRANY), protocol #21-08-508.

Protocol and Ethics Approval

The study received IRB approval from BRANY with a waiver of consent and of Health Insurance Portability and Accountability Act (HIPAA) authorization. The waiver allowed existing patient data to be used at scale while the study remained under ethical oversight. All procedures followed the principles of the Declaration of Helsinki.

Study Cohort

The study drew on 47,814 clinical notes from 11 medical institutions within the RECOVER network, including Weill Cornell Medicine, the Medical College of Wisconsin, Cincinnati Children’s Hospital Medical Center, and others.

Data Curation and NLP Pipeline Evaluation

A core aim of the study was to curate data for evaluating the extraction of symptoms related to Post-Acute Sequelae of SARS-CoV-2 infection (PASC). Researchers sampled intake progress notes from 60 ambulatory patients at Weill Cornell Medicine and from another 100 patients across 10 additional sites. The data construction workflow produced multiple test sets so that the models could be validated on more than one dataset.

Annotation Process: Ensuring Quality and Accuracy

The study used an annotation pipeline to extract symptom information from clinical notes. MedText first extracted candidate symptom mentions from the notes, and Screen-Tool, an open-source application, let annotators review the context of each mention. Two independent annotators reviewed the symptom mentions. Inter-rater agreement, measured by Cohen’s Kappa, was 0.98 on the internal validation set and 0.99 on the external validation set.
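As a minimal sketch of how an agreement score like the one above is computed, the snippet below implements Cohen's kappa for two annotators in plain Python. The function name and the toy labels are invented for illustration and are not taken from the study.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labelled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Toy labels, invented for illustration only.
annotator_1 = ["present", "present", "absent", "absent", "present", "absent"]
annotator_2 = ["present", "present", "absent", "present", "present", "absent"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over a simple percentage when one label dominates.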

Lexicon Construction for PASC Symptoms

The study built a lexicon for identifying PASC-related symptoms. Terms were first compiled from expert input and literature reviews, then organized into 798 symptom sub-concepts across 25 categories. This lexicon was the foundation for the Natural Language Processing (NLP) workflows, giving them a standardized vocabulary for consistent symptom extraction.
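To illustrate how a category-based lexicon can drive symptom matching, here is a small sketch using regular expressions. The three categories and their terms are hypothetical stand-ins, not entries from the study's lexicon, and the helper names are assumptions made for this example.

```python
import re

# Hypothetical mini-lexicon (category -> surface terms). The study's lexicon
# holds 798 sub-concepts in 25 categories; these entries are invented.
LEXICON = {
    "fatigue": ["fatigue", "tiredness", "exhaustion"],
    "dyspnea": ["shortness of breath", "dyspnea"],
    "cognitive impairment": ["brain fog", "memory loss"],
}

def build_matcher(lexicon):
    """Compile one case-insensitive pattern covering every lexicon term."""
    term_to_category = {
        term.lower(): category
        for category, terms in lexicon.items()
        for term in terms
    }
    # Longest terms first so multi-word phrases win over shorter overlaps.
    ordered = sorted(term_to_category, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    pattern = re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)
    return pattern, term_to_category

def find_symptoms(text, pattern, term_to_category):
    """Return (offset, matched text, category) for each lexicon hit."""
    return [
        (m.start(), m.group(0), term_to_category[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]

pattern, mapping = build_matcher(LEXICON)
note = "Patient reports brain fog and shortness of breath; denies fatigue."
mentions = find_symptoms(note, pattern, mapping)
for mention in mentions:
    print(mention)
```

Note that "fatigue" is matched even though the patient denies it. Lexicon matching only finds mentions; deciding whether a mention is affirmed is a separate step.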

Development of the Hybrid NLP Pipeline

The study developed a hybrid NLP pipeline that combines rule-based methods with deep learning to identify symptoms in clinical notes. The pipeline is built on MedText, which provides modules for text preprocessing, named entity recognition (NER), and assertion detection. Because the stages are modular, notes can be processed in a structured way while the pipeline stays adaptable to different data types and settings.
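The three stages named above (preprocessing, NER, assertion detection) can be sketched as a chain of small functions. This is not MedText's API: every function name here is hypothetical, and the assertion step is a simple negation-cue rule swapped in as a stand-in for the study's fine-tuned BERT classifier.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Mention:
    sentence: str
    term: str
    status: str  # "present" or "absent"

def split_sentences(note: str) -> List[str]:
    # Naive preprocessing: split after ., ;, ? or ! followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.;?!])\s+", note) if s.strip()]

def rule_based_ner(sentence: str, terms: List[str]) -> List[str]:
    # Rule-based NER: whole-word match of each (lowercase) term.
    lowered = sentence.lower()
    return [t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered)]

def cue_based_assertion(sentence: str, term: str) -> str:
    # Stand-in for a learned classifier (assumption): a negation cue that
    # appears before the term in the same sentence marks it as absent.
    prefix = sentence.lower().split(term.lower())[0]
    return "absent" if re.search(r"\b(no|denies|without)\b", prefix) else "present"

def run_pipeline(
    note: str,
    terms: List[str],
    assert_fn: Callable[[str, str], str] = cue_based_assertion,
) -> List[Mention]:
    mentions = []
    for sentence in split_sentences(note):
        for term in rule_based_ner(sentence, terms):
            mentions.append(Mention(sentence, term, assert_fn(sentence, term)))
    return mentions

results = run_pipeline(
    "Reports persistent cough. Denies chest pain; no fever.",
    ["cough", "chest pain", "fever"],
)
for r in results:
    print(r.term, r.status)
```

Passing the assertion step in as `assert_fn` mirrors the hybrid idea: the rule-based stages stay fixed while the assertion component can be replaced by a model-backed function.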

Assertion Detection Module: Fine-tuning with BERT Models

To determine the assertion status of each extracted symptom, the study fine-tuned three pretrained BERT models: BioBERT, ClinicalBERT, and BiomedBERT. The training data merged a publicly available i2b2 dataset with the Weill Cornell model training set to improve the accuracy of assertion detection. The multi-status prediction task was also recast as binary classification, so the models only had to distinguish "present" from "absent" symptom mentions.
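The recasting of a multi-status task as a binary one can be sketched as a label mapping applied before training. The source does not say how each non-"present" status was mapped, so the rule below (everything other than "present" becomes "absent") is an assumption, as are the exact label spellings, which are modelled on the i2b2 2010 assertion categories.

```python
# Assumed label spellings, modelled on the i2b2 2010 assertion categories.
MULTI_STATUS_LABELS = {
    "present",
    "absent",
    "possible",
    "conditional",
    "hypothetical",
    "associated_with_someone_else",
}

def to_binary(status: str) -> str:
    """Collapse a multi-status assertion label to 'present' or 'absent'.

    Assumption: every status other than 'present' maps to 'absent'.
    The study's actual mapping is not given in the text.
    """
    normalized = status.strip().lower()
    if normalized not in MULTI_STATUS_LABELS:
        raise ValueError(f"unknown assertion status: {status!r}")
    return "present" if normalized == "present" else "absent"

# Toy training examples, invented for illustration.
examples = [
    ("cough", "present"),
    ("fever", "absent"),
    ("chest pain", "possible"),
    ("fatigue", "hypothetical"),
]
binary_examples = [(term, to_binary(status)) for term, status in examples]
print(binary_examples)
```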

Evaluation of Model Performance

The NLP models were evaluated in two ways: numerical performance on the validation sets, measured by precision, recall, and F1 score, and prevalence studies across a larger population of clinical notes from the RECOVER network. The reported scores indicated robust symptom detection. Demographic data for the patient population gave context to the prevalence studies and to the broader picture of PASC impacts.
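For reference, precision, recall, and F1 can be computed from the overlap between gold-standard and predicted mentions. The sketch below treats each mention as a (note id, symptom) pair, which is an assumption about the unit of evaluation, and the data is invented.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and F1 over two collections of mentions."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy (note id, symptom) pairs, invented for illustration.
gold = {("note1", "fatigue"), ("note1", "dyspnea"),
        ("note2", "cough"), ("note2", "headache")}
predicted = {("note1", "fatigue"), ("note2", "cough"), ("note2", "fever")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three predictions are correct and two of the four gold mentions are found, so precision is 2/3, recall is 1/2, and F1, their harmonic mean, is 4/7.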

Integration of Large Language Models

The study also tested whether a large language model, GPT-4, could extract symptoms and detect their assertion status without being given an explicit lexicon. Researchers ran it on a subset of intake notes and compared its output with that of the rule-based methods. The comparison showed that the two approaches have complementary strengths in symptom identification.

Summary of Findings

The study shows how clinical notes can be used at scale to extract and analyze PASC symptoms for public health insight. Its methods, from IRB oversight to the lexicon and the hybrid NLP pipeline, illustrate how large-scale data analysis can improve understanding of post-acute COVID-19 outcomes. The work also offers a model for future studies in multi-site collaboration, data sharing, ethical oversight, and methodological rigor.
