Thursday, October 23, 2025

Assessing the Impact of Tuberculosis Preventive Therapy on ART Adherence Using Causal Forest Double Machine Learning

Data Source and Study Population: Insights into HIV Care in Ethiopia

Overview

In the fight against HIV/AIDS, understanding the dynamics of patient care and treatment adherence is crucial. A recent study conducted at the University of Gondar Comprehensive and Specialized Hospital in Ethiopia drew its data from the hospital's electronic medical record system, EMR-ART. The analysis covers individuals diagnosed with HIV who initiated antiretroviral therapy (ART) between March 2005 and December 2024.

Source Population

The source population comprised all patients attending the hospital's ART clinic during the specified period. Spanning nearly two decades, the dataset captures how HIV treatment and patient management evolved over time.

Inclusion Criteria

To ensure the reliability of the findings, specific inclusion criteria were established:

  • Age: Patients aged 15 years and older at the initiation of ART.
  • Health Status: Confirmed HIV-positive diagnosis.
  • Data Availability: Complete baseline clinical and laboratory information documented at the start of treatment.

Conversely, patients were excluded from the analysis if they fell into any of the following categories:

  • Missing baseline data.
  • Referred from other facilities without a complete clinical record.
  • Lost to follow-up within the first month after starting ART.

After applying these criteria, 4,152 patients remained for the final analysis of the factors influencing ART adherence.
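The eligibility rules above can be sketched as a simple filter. This is a minimal illustration, not the study's code: the `Patient` fields and the `is_eligible` helper are hypothetical names, and "lost to follow-up within the first month" is approximated here as less than one month of recorded follow-up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical; the article does not list EMR-ART columns.
    age_at_art_start: Optional[float]
    hiv_positive: bool
    baseline_complete: bool          # baseline clinical and lab data all recorded
    transferred_in_incomplete: bool  # referred in without a complete clinical record
    months_followed: float           # recorded follow-up time after ART start


def is_eligible(p: Patient) -> bool:
    """Return True if the inclusion criteria hold and no exclusion applies."""
    if p.age_at_art_start is None or p.age_at_art_start < 15:
        return False
    if not p.hiv_positive or not p.baseline_complete:
        return False
    if p.transferred_in_incomplete:
        return False
    # Assumption: less than one month of follow-up stands in for
    # "lost to follow-up within the first month".
    if p.months_followed < 1:
        return False
    return True


cohort = [
    Patient(34, True, True, False, 48),
    Patient(12, True, True, False, 60),   # younger than 15
    Patient(41, True, False, False, 24),  # missing baseline data
    Patient(29, True, True, True, 36),    # incomplete transfer record
    Patient(52, True, True, False, 0.5),  # lost within the first month
]
analysis_set = [p for p in cohort if is_eligible(p)]
print(len(analysis_set))
```

Only the first toy record passes every rule, which mirrors how the criteria narrow the clinic population down to the analysis cohort.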

Analytical Framework

Treatment and Outcome Variables

At the heart of the study were two essential variables:

  • Treatment Variable: TPT Started – This variable indicated whether a patient had been prescribed tuberculosis preventive therapy (TPT) as part of their HIV care regimen. It was coded as:

    • 1 = Yes (TPT initiated)
    • 0 = No (TPT not initiated)
  • Outcome Variable: ART Adherence – This binary variable was derived from routine follow-up data, using either the clinician's recorded adherence assessment or the percentage of doses taken:
    • 1 = Good adherence (≥95% of doses taken)
    • 0 = Poor adherence (below the 95% threshold, based on missed doses or the clinician's assessment)
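As a rough sketch of how the two binary variables could be coded, the snippet below applies the 95% threshold and falls back to a clinician rating when no dose percentage is recorded. The function names, the "yes"/"no" input format, and the "good" rating label are assumptions made for illustration; the article does not describe the raw field values.

```python
from typing import Optional

ADHERENCE_THRESHOLD = 95.0  # percent of doses taken, per the definition above


def code_tpt(tpt_started: str) -> int:
    """Map a yes/no TPT field to the 1/0 treatment indicator."""
    return 1 if tpt_started.strip().lower() == "yes" else 0


def code_adherence(pct_doses_taken: Optional[float],
                   clinician_rating: Optional[str] = None) -> Optional[int]:
    """Return 1 for good adherence, 0 for poor, None if nothing is recorded."""
    if pct_doses_taken is not None:
        return 1 if pct_doses_taken >= ADHERENCE_THRESHOLD else 0
    if clinician_rating is not None:
        # Assumption: clinicians record a categorical rating such as
        # "good", "fair", or "poor"; only "good" counts as good adherence.
        return 1 if clinician_rating.strip().lower() == "good" else 0
    return None


print(code_tpt("Yes"), code_adherence(96.5), code_adherence(None, "fair"))
```

Returning `None` when neither source is available keeps missing outcomes visible instead of silently counting them as poor adherence.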

Covariates: Potential Confounders

A broad range of potential confounders was included to refine the analysis and understand the factors affecting both TPT initiation and ART adherence.

Demographic Characteristics

  1. Age: Continuous variable reflecting age in years at ART initiation.
  2. Sex: Categorical (Male or Female).
  3. Marital Status: Categorized as Single, Married, Divorced, or Widowed.
  4. Education Level: An ordinal measure indicating the highest level of education completed.
  5. Residence: Identified as Urban or Rural.
  6. Religion: Self-reported religious affiliation.

Clinical Characteristics

  1. WHO Clinical Stage: Categorized based on WHO guidelines at baseline (Stages I–IV).
  2. Duration on ART: Measured in months from starting ART to final follow-up.
  3. Functional Status: Working, ambulatory, or bedridden, indicating patient capacity.
  4. BMI: Calculated from documented weight and height measurements.
  5. Laboratory Data:
    • Baseline CD4 Count: Count measured at ART initiation.
    • Recent CD4 Count: Most recent follow-up CD4 result.
  6. Regimen Line: Indicating whether the patient received a first-line or second-line ART regimen.
  7. CPT Use: A binary indicator for Cotrimoxazole Preventive Therapy initiation (1 = Yes, 0 = No).

These covariates were selected to adjust for selection bias in who receives TPT and to clarify how each factor relates to both TPT initiation and ART adherence.

Analytical Approach

To elucidate the causal impact of TPT initiation on ART adherence, three distinct causal inference models were employed:

  1. Adjusted Logistic Regression
  2. Propensity Score Matching
  3. Causal Forest Double Machine Learning (DML)

Each model was applied to the same dataset, and the Average Treatment Effect (ATE) was estimated with its corresponding confidence interval. Comparing the three estimates served as a robustness check and helped identify the most reliable method for evaluating the treatment effect.
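For reference, the ATE can be written in standard potential-outcomes notation. The article does not print a formula, so the symbols here are assumptions: Y(1) and Y(0) denote a patient's adherence outcome with and without TPT. The interval shown is the usual normal-approximation form, which may differ from what the study computed.

```latex
\[
\mathrm{ATE} = \mathbb{E}\big[\,Y(1) - Y(0)\,\big]
             = \Pr\big(Y(1) = 1\big) - \Pr\big(Y(0) = 1\big),
\qquad
\text{95\% CI: } \widehat{\mathrm{ATE}} \pm 1.96 \cdot \widehat{\mathrm{SE}}\big(\widehat{\mathrm{ATE}}\big)
\]
```

Because adherence is binary, the ATE reads as a difference in the probability of good adherence, for example an ATE of 0.05 would mean a five percentage point increase attributable to TPT.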

Causal Relationships and Graphical Representation

Causal inference requires stating the hypothesized causal relationships between variables explicitly. A causal graph was developed to show how the treatment (TPT initiation), the outcome (ART adherence), and the confounders interact. The graph highlights that clinical factors such as WHO stage and CD4 count influence both TPT initiation and ART adherence.
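A causal graph of this kind can be represented as a plain adjacency mapping. The edges below are a hypothetical subset chosen for illustration, not the graph published by the study; `common_causes` is likewise an illustrative helper that lists variables pointing into both the treatment and the outcome.

```python
# Hypothetical edges: each key points to the variables it is assumed to affect.
causal_graph = {
    "who_stage": ["tpt_started", "art_adherence"],
    "baseline_cd4": ["tpt_started", "art_adherence"],
    "age": ["tpt_started", "art_adherence"],
    "residence": ["tpt_started", "art_adherence"],
    "tpt_started": ["art_adherence"],
    "art_adherence": [],
}


def common_causes(graph, treatment, outcome):
    """Return variables with a direct edge into both treatment and outcome."""
    return sorted(
        node for node, children in graph.items()
        if treatment in children and outcome in children
    )


confounders = common_causes(causal_graph, "tpt_started", "art_adherence")
print(confounders)  # ['age', 'baseline_cd4', 'residence', 'who_stage']
```

Variables returned by `common_causes` are the ones that need adjustment, since leaving them out would mix their influence into the estimated TPT effect.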

Data Processing and Model Training

Prior to analysis, thorough data preprocessing was performed, including:

  • Label Encoding: Transforming categorical variables for machine learning compatibility.
  • Missing Value Processing: Addressing gaps in the dataset either by imputation or exclusion.
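The two preprocessing steps can be sketched in a few lines. This is an illustration under assumptions: the article does not say which imputation rule was used, so median imputation is a stand-in, and `label_encode` and `impute_median` are hypothetical helper names.

```python
from statistics import median


def label_encode(values):
    """Map each distinct non-missing category to an integer code (sorted order)."""
    categories = sorted({v for v in values if v is not None})
    mapping = {cat: code for code, cat in enumerate(categories)}
    encoded = [mapping[v] if v is not None else None for v in values]
    return encoded, mapping


def impute_median(values):
    """Replace missing numeric entries with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


residence = ["Urban", "Rural", "Urban", "Rural"]
codes, mapping = label_encode(residence)
print(codes, mapping)  # [1, 0, 1, 0] {'Rural': 0, 'Urban': 1}

cd4 = [350, None, 120, 510]
cd4_filled = impute_median(cd4)
print(cd4_filled)  # [350, 350, 120, 510]
```

Label encoding assigns an arbitrary order to nominal categories such as religion. Tree-based learners tolerate this reasonably well, whereas a linear model would usually call for one-hot encoding instead.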

The DML model was trained with non-parametric learners: Random Forest models were used for both the outcome model and the treatment model. Relying on flexible machine learning models for these nuisance functions makes the estimate more robust to model misspecification than a fixed parametric form would be.
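The core DML idea can be sketched with scikit-learn alone. This is a simplified illustration, not the study's pipeline: it runs on simulated data, uses cross-fitted Random Forest classifiers for the treatment and outcome models, and estimates a single average effect by regressing outcome residuals on treatment residuals. It does not fit a causal forest, so it does not produce patient-level effects. The article does not name the software used; libraries such as EconML offer a `CausalForestDML` estimator for the heterogeneous version.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000

# Simulated stand-ins for covariates, treatment, and outcome (NOT study data).
X = rng.normal(size=(n, 4))
propensity = 1 / (1 + np.exp(-0.8 * X[:, 0]))   # covariate 0 drives TPT uptake
T = rng.binomial(1, propensity)
p_good = np.clip(0.6 + 0.10 * T + 0.10 * X[:, 0], 0.01, 0.99)
Y = rng.binomial(1, p_good)                     # simulated effect of about 0.10


def forest():
    return RandomForestClassifier(
        n_estimators=200, min_samples_leaf=20, random_state=0
    )


# Cross-fitting: every row is predicted by a model that never saw it in training.
t_hat = cross_val_predict(forest(), X, T, cv=5, method="predict_proba")[:, 1]
y_hat = cross_val_predict(forest(), X, Y, cv=5, method="predict_proba")[:, 1]

# Residual-on-residual regression gives the constant-effect DML estimate.
t_res = T - t_hat
y_res = Y - y_hat
ate = float(np.sum(t_res * y_res) / np.sum(t_res ** 2))

# Standard error from the estimator's influence function.
psi = t_res * (y_res - ate * t_res)
se = float(np.sqrt(np.mean(psi ** 2) / n) / np.mean(t_res ** 2))
ci = (ate - 1.96 * se, ate + 1.96 * se)

print(f"ATE estimate: {ate:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Removing the part of T and Y that the covariates explain before comparing them is what guards against confounding by those covariates, and cross-fitting keeps the forests from overfitting their own predictions.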

Estimation of Treatment Effects

The final step in the analytical journey involved estimating τ(X), representing conditional average treatment effects for individual patient profiles. This granular approach provided insights into how patient-specific characteristics modulated the influence of TPT on ART adherence.
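In standard notation, the conditional average treatment effect for a patient profile x is usually defined as follows. The potential-outcome symbols Y(1) and Y(0) are assumptions, since the article does not give the formula.

```latex
\[
\tau(x) = \mathbb{E}\big[\,Y(1) - Y(0) \mid X = x\,\big]
\]
```

Averaging τ(X) over all patient profiles in the cohort recovers the overall average effect, while differences in τ(x) between profiles indicate which groups benefit more or less from TPT.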

Mathematical Modeling Framework

The study employed a sophisticated mathematical framework to encapsulate the relationship between treatment initiation and ART adherence, facilitating a nuanced understanding of treatment effects on diverse patient populations.
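The article does not spell this framework out. A common formulation for DML with heterogeneous effects, offered here only as an assumed sketch, is the partially linear model below, where g and m are the nuisance functions learned by the outcome and treatment models and T is the TPT indicator.

```latex
\begin{align*}
Y &= \tau(X)\,T + g(X) + \varepsilon, & \mathbb{E}[\varepsilon \mid X, T] &= 0,\\
T &= m(X) + \eta, & \mathbb{E}[\eta \mid X] &= 0,\\
Y - \mathbb{E}[Y \mid X] &= \tau(X)\,\big(T - m(X)\big) + \varepsilon .
\end{align*}
```

The third line follows from the first two because E[Y | X] = τ(X) m(X) + g(X). It shows why the method works on residuals: once the covariate-driven parts of Y and T are subtracted, the remaining association identifies τ(X).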

Robustness in Evaluation

Keys to robust evaluation included:

  • Cross-validation: Leveraging training and test datasets to minimize overfitting.
  • Feature Importance Analysis: Utilizing permutation-based importance scores and SHAP values to elucidate which covariates most significantly influence treatment effects.
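Permutation importance can be illustrated without any library. The sketch below shuffles one feature at a time and records how much the prediction error grows. The toy effect model and all names are hypothetical, and SHAP values are not shown.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y_true, n_features, n_repeats=10, seed=0):
    """Average increase in error when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y_true, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            score = mean_squared_error(y_true, [predict(r) for r in shuffled])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical fitted effect model: the predicted effect depends on feature 0 only.
def toy_cate_model(row):
    return 0.05 + 0.02 * row[0]


data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
targets = [toy_cate_model(r) for r in rows]
scores = permutation_importance(toy_cate_model, rows, targets, n_features=2)
print(scores)
```

Shuffling feature 1 leaves the predictions unchanged, so its importance is exactly zero, while shuffling feature 0 degrades them. Real covariates are scored by the same logic.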

This analytical rigor ensured that the findings were not only statistically significant but also clinically relevant, providing actionable insights into enhancing ART adherence through TPT in the Ethiopian context.

By illuminating the interconnections between TPT initiation, ART adherence, and various demographic and clinical factors, this research contributes to advancing our understanding of effective HIV treatment strategies and ultimately aims to improve health outcomes for people living with HIV.
