Thursday, July 17, 2025

Predicting Disease Progression Risk in Cutaneous Squamous Cell Carcinoma with Explainable Federated Deep Learning

Understanding Patient Cohorts in cSCC Research

Introduction to cSCC and Cohorts

Cutaneous squamous cell carcinoma (cSCC) is one of the most common forms of skin cancer and often arises after prolonged sun exposure. Because of its clinical significance, understanding how the disease progresses and how patients fare is critical. To study this, researchers assembled patient cohorts at three institutions: the University Hospital Cologne, the University Hospital Bonn, and the Technical University Munich. This article describes how each cohort was assembled and how the resulting data were processed and analyzed.

The Cologne Cohort: A Retrospective Analysis

From January 2009 to May 2019, the Department of Dermatology at the University Hospital Cologne examined all patients diagnosed with primary cSCC and treated by excision. This initial training cohort consisted of 219 annotated tumors from 166 patients. Clinico-pathological parameters were collected from medical records and pathology reports. Patients were actively followed up for disease progression, defined as local recurrence, lymph-node metastasis, or distant metastasis within two years of diagnosis.

Among the 219 tumors analyzed, 63 demonstrated disease progression, while 156 remained non-progressive. Tumor samples were stained with Hematoxylin-Eosin (HE) for histological examination.

The Bonn Cohort: A Multi-Departmental Approach

The University Hospital Bonn cohort included patients treated for cSCC from March 2012 to September 2021. Tumors were excised in both the Department of Dermatology and the Department of Oral and Maxillo-facial Surgery. Pathological assessment followed standard procedures and identified 23 primary cSCC cases with eventual disease progression.

For comparison, a random selection of 35 primary cSCC cases without disease progression was drawn. The final Bonn cohort comprised 291 whole-slide images (WSIs) from 35 patients, with progression annotated in 21 tumors and non-progression in 14. Drawing on two departments was intended to make the data more robust.

The Munich Cohort: Extending the Research Horizons

At the Technical University Munich, researchers focused on assembling a cohort of patients with primary cSCC while carefully controlling for disease progression outcomes. They identified patients with progression and formed a reference group of non-progressive cases. A total of 51 tumors were analyzed, producing 129 WSIs, distributed into 22 tumors with progression and 29 without.

This approach was designed to clarify the relative risk factors associated with progression, employing a similar methodology to that used in the Cologne and Bonn cohorts. By maintaining rigorous collection and follow-up procedures, the Munich cohort adds valuable information to the overall dataset concerning cSCC.

Ethical Considerations and Data Collection

The research was conducted under ethical oversight and in compliance with the Declaration of Helsinki. Institutional Review Boards at Bonn, Cologne, and Munich reviewed and approved the studies in multiple votes and confirmed that the need for informed consent was waived because anonymized retrospective data were used.

The collated clinicopathological parameters across all three cohorts provide an in-depth understanding of relevant factors influencing cSCC’s progression, as outlined in their data tables.

Advanced Data Processing and Classification Techniques

The slides were digitized with a NanoZoomer Slide Scanner at 40x magnification. Slides lacking sufficient tumor tissue were then filtered out to ensure data quality.

The final dataset for training the federated deep learning model consisted of 214 slides from the Cologne cohort, 133 from Bonn, and 113 from Munich. With slides indicating progression and non-progression equally balanced, the data were split at the patient level in a stratified fashion into training, validation, and test sets.
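A patient-level stratified split can be sketched as follows. This is a minimal illustration rather than the study's code: the 60/20/20 fractions, the seed, and the function name are assumptions (the article does not state them), and each patient is assumed to carry a single progression label.

```python
import random
from collections import defaultdict


def stratified_patient_split(patient_labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split patient IDs into train/val/test, stratified by label.

    patient_labels: dict mapping patient ID -> 0 (no progression) or 1 (progression).
    fractions: hypothetical train/val/test proportions (not given in the article).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for pid, label in sorted(patient_labels.items()):
        by_label[label].append(pid)

    splits = {"train": [], "val": [], "test": []}
    for pids in by_label.values():
        rng.shuffle(pids)
        n_train = round(fractions[0] * len(pids))
        n_val = round(fractions[1] * len(pids))
        # Whole patients go to exactly one split, so no patient's slides leak
        # between training and evaluation.
        splits["train"] += pids[:n_train]
        splits["val"] += pids[n_train:n_train + n_val]
        splits["test"] += pids[n_train + n_val:]
    return splits


# Toy example: 20 hypothetical patients, half with progression.
patients = {f"P{i:02d}": i % 2 for i in range(20)}
splits = stratified_patient_split(patients)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting by patient rather than by slide matters because several slides can come from the same patient; all of them then end up on the same side of the split.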

Pre-processing and Feature Extraction

Pre-processing involved tiling each WSI into patches so that the slides could be analyzed at high magnification. Patches containing less than 50% tissue were discarded.
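The tiling and tissue filter can be illustrated with a small sketch. It assumes a binary tissue mask (1 = tissue, 0 = background) is already available and uses non-overlapping square patches; the mask, the patch size, and the function names are hypothetical, since the article specifies only the 50% threshold.

```python
def tile_coordinates(height, width, patch_size):
    """Yield the top-left (row, col) of each full, non-overlapping patch."""
    for r in range(0, height - patch_size + 1, patch_size):
        for c in range(0, width - patch_size + 1, patch_size):
            yield r, c


def tissue_fraction(mask, r, c, patch_size):
    """Fraction of pixels in the patch that are marked as tissue."""
    tissue = sum(
        mask[i][j]
        for i in range(r, r + patch_size)
        for j in range(c, c + patch_size)
    )
    return tissue / (patch_size * patch_size)


def keep_tissue_patches(mask, patch_size, min_tissue=0.5):
    """Return coordinates of patches with at least `min_tissue` tissue."""
    height, width = len(mask), len(mask[0])
    return [
        (r, c)
        for r, c in tile_coordinates(height, width, patch_size)
        if tissue_fraction(mask, r, c, patch_size) >= min_tissue
    ]


# Toy 4x4 mask tiled into 2x2 patches.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
kept = keep_tissue_patches(mask, patch_size=2)
print(kept)  # [(0, 0), (2, 2)]
```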

Each patch was converted into a feature vector using a pre-trained EfficientNet-v2-L model, yielding approximately 11,330 patch-level feature vectors on average.

Classification Using Multiple Instance Learning

The classification model used a multiple instance learning approach. Each WSI was treated as a sequence of feature vectors derived from its non-empty image patches. For efficiency, a transformer model initialized with pre-trained weights was fine-tuned.

This choice reduced compute and memory usage while still enabling effective classification. The model was trained for 50 rounds using a Federated Averaging strategy, and the final model was selected based on validation performance.
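The aggregation step at the heart of Federated Averaging can be sketched in a few lines: after each round, the server averages the clients' model parameters, weighting each client by its number of local samples. The sketch below is an illustration under assumptions. Parameters are flattened into plain lists, the toy weight values are made up, and the slide counts from the article are used as client weights purely for illustration (the actual per-site training set sizes are not stated).

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical two-parameter models from three sites after one local round.
client_weights = [
    [1.0, 0.0],  # Cologne (toy values)
    [0.0, 1.0],  # Bonn (toy values)
    [1.0, 1.0],  # Munich (toy values)
]
client_sizes = [214, 133, 113]  # slide counts, used here as illustrative weights

global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```

In a full training loop, the averaged parameters would be sent back to every site, each site would train locally again, and the cycle would repeat for the chosen number of rounds. Only model parameters leave each site, not the slides themselves.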

Explaining Classification Outcomes

Beyond simple classification, understanding what drives the classifier’s decisions can provide invaluable insights into the biological underpinnings of cSCC. Integrated Gradients (IGs), a method of deep learning explainability, were employed to identify key regions within WSIs contributing to the model’s predictions.

This multifaceted approach not only elucidates the areas of interest in the tumor microenvironment but also enables researchers to compute community-defined features of cellular composition, opening doors to potential biological discoveries in cSCC.
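To make the idea behind Integrated Gradients concrete: the attribution for input feature i is (x_i - x'_i) multiplied by the average gradient of the model output along the straight path from a baseline x' to the input x. The sketch below applies this to a tiny logistic model with an analytic gradient. The model, its weights, the all-zeros baseline, and the step count are assumptions chosen for illustration; the study applied the method to its WSI classifier, not to this toy model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy logistic model standing in for a real classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradient(x, w, b):
    """Analytic gradient of the logistic model output with respect to x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad_sum = [s + g for s, g in zip(grad_sum, gradient(point, w, b))]
    return [(xi - bi) * s / steps for xi, bi, s in zip(x, baseline, grad_sum)]


w, b = [2.0, -1.0, 0.5], 0.1       # hypothetical model parameters
x = [1.0, 0.5, -1.0]               # hypothetical input features
baseline = [0.0, 0.0, 0.0]         # all-zeros baseline (an assumption)

attributions = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions sum to f(x) - f(baseline), up to
# numerical integration error.
gap = sum(attributions) - (model(x, w, b) - model(baseline, w, b))
print(attributions, gap)
```

The completeness property checked at the end is what makes the attributions interpretable as a decomposition of the prediction: each feature (or, for a slide, each patch) receives a share of the difference between the model's output and its output on the baseline.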

Innovative Measures of Spatial Analysis

To characterize how cells are arranged within tumors, spatial autocorrelation was assessed using Join Count analysis, and the Average Nearest Neighbor Ratio (ANNR) was computed. By quantifying cell clustering and neighbor relations, this analysis describes how different cell types are distributed relative to one another within the tumor microenvironment.
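Both measures have simple textbook definitions, sketched below. The ANNR divides the observed mean nearest-neighbor distance by the distance expected under complete spatial randomness, 0.5 / sqrt(n / A) for n points in an area A; values below 1 suggest clustering and values above 1 suggest dispersion. Join counts tally how many neighboring cell pairs share a type versus differ. The coordinates, labels, neighbor pairs, and area are toy values, and the article does not say how the study defined neighborhoods or the study area.

```python
import math


def average_nearest_neighbor_ratio(points, area):
    """Observed mean nearest-neighbor distance over its expectation under randomness."""
    n = len(points)
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


def join_counts(labels, edges):
    """Count neighbor pairs with the same label and with different labels."""
    same = sum(1 for i, j in edges if labels[i] == labels[j])
    return same, len(edges) - same


# Four cells packed into one corner of a 10 x 10 region (toy data).
cells = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
annr = average_nearest_neighbor_ratio(cells, area=100.0)
print(annr)  # below 1, consistent with clustering

# Hypothetical cell types and neighbor pairs (indices into `labels`).
labels = ["tumor", "tumor", "immune", "immune"]
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
same, different = join_counts(labels, edges)
print(same, different)
```

A full Join Count analysis would compare these counts against their expectation under random labeling to decide whether same-type neighbors occur more often than chance.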

Statistical Analysis and Prognostic Factors

Survival analysis and associations with clinico-pathological variables were explored extensively across all cohorts. Logistic regression models calculated odds ratios associated with disease progression risk, shedding light on crucial prognostic factors that may influence patient outcomes.
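For a single binary risk factor, the odds ratio from a logistic regression equals the cross-product ratio of the corresponding 2x2 table, which makes the quantity easy to illustrate. The sketch below computes that ratio with a Wald 95% confidence interval. All counts are invented for illustration and are not taken from any of the cohorts; the function name is likewise hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: factor present, progression      b: factor present, no progression
    c: factor absent, progression       d: factor absent, no progression
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts for a hypothetical binary risk factor.
odds_ratio, lower, upper = odds_ratio_with_ci(a=30, b=20, c=10, d=40)
print(odds_ratio, lower, upper)
```

An interval that excludes 1 indicates that the association between the factor and progression is unlikely to be due to chance alone at the chosen confidence level.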

Kaplan-Meier methods and Cox proportional hazard models further illustrated the relationships between various variables and survival rates, thus enhancing the overall understanding of cSCC prognosis.
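The Kaplan-Meier estimator itself is short enough to write out. At each time where an event occurs, the running survival probability is multiplied by (1 - events / number at risk); censored observations leave the curve unchanged but reduce the number at risk afterwards. The follow-up times and event flags below are invented for illustration, with 1 marking an observed event (for example, progression) and 0 marking censoring.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations that share this time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


# Invented follow-up times in months; 1 = event observed, 0 = censored.
durations = [6, 12, 12, 18, 24]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 3))
```

In this toy example the estimate drops to 0.8 at 6 months, 0.6 at 12 months, and 0.3 at 18 months; the patient censored at 24 months does not change the curve.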

Conclusion: The Future of cSCC Research

As we leverage the robust data collected from these varied cohorts, the potential for groundbreaking discoveries in cSCC treatment and prognosis grows. Through meticulous research methodologies and advanced computational approaches, the understanding of this prevalent skin cancer is not only becoming clearer but also potentially guiding the development of targeted therapeutic strategies.
