Understanding Axillary Lymph Node Metastasis in Breast Cancer Patients: Insights from Recent Research
Ethical Standards and Patient Recruitment
This study adhered to the ethical standards of the Declaration of Helsinki and was approved by the institutional review board of Nanjing Drum Tower Hospital. Because the study was retrospective, the requirement for informed consent was waived.
The research involved 820 individuals diagnosed with histologically confirmed primary breast cancer from three different hospitals. A structured approach was taken to recruit these patients, as illustrated in the recruitment flowchart. The training cohort was made up of 621 breast cancer patients from Nanjing Drum Tower Hospital, recruited between April 2016 and June 2022. Validation Cohort 1 included 112 patients from Jinling Hospital, enrolled between December 2017 and November 2021, while Validation Cohort 2 comprised 87 patients from Jiangbei Hospital, recruited between December 1, 2019, and June 30, 2022.
Patient Evaluation Criteria
The study considered clinicopathological features and ultrasound (US) findings for both the breast and the axilla. The demographic and clinical characteristics examined included:
- Age
- Body Mass Index (BMI)
- Progesterone Receptor (PR) status
- Estrogen Receptor (ER) status
- Ki-67 expression
- Human Epidermal Growth Factor Receptor 2 (HER2) expression
- Tumor categorization
- Nuclear grade
- Surrogate subtype
Ultrasound findings included tumor location, breast lesion size, Breast Imaging Reporting and Data System (BI-RADS) category, and axillary lymph node (ALN) status on axillary ultrasound.
Ultrasound Examination Methodology
Every patient underwent a standard two-dimensional ultrasound examination performed by radiologists with at least five years of experience. The ultrasound machines included the Siemens S3000, Philips IU22, and GE Healthcare LOGIQ E9, among others. Patients were positioned so that both the breast and the axilla could be examined thoroughly. All images were stored in a Picture Archiving and Communication System (PACS).
This protocol was intended to produce artifact-free images of the anatomy for subsequent analysis.
Data Preprocessing and Feature Extraction Techniques
The imaging repository was searched for breast cancer ultrasound images, and each image was matched to the patient’s axillary lymph node metastasis (ALNM) status. Images were converted to grayscale to remove redundant channels, and a radiologist annotated bounding boxes. The images were then normalized and resized for further processing.
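The preprocessing steps described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the study's pipeline: the luminance weights, the min-max normalization, the nearest-neighbor resizing, the 4x4 target size, the bounding-box layout, and every function name are assumptions made for brevity.

```python
def to_grayscale(rgb_image):
    """Collapse an RGB image (rows of (r, g, b) tuples) into one channel."""
    # ITU-R BT.601 luminance weights: an assumed, commonly used choice.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def crop(image, box):
    """Crop to a bounding box (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def normalize(image):
    """Min-max scale pixel values into [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in image]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize to out_h x out_w."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]


def preprocess(rgb_image, box, size=(4, 4)):
    """Grayscale, crop to the annotated box, normalize, then resize."""
    roi = crop(to_grayscale(rgb_image), box)
    return resize_nearest(normalize(roi), *size)


# Synthetic 6x6 RGB image standing in for an ultrasound frame.
image = [[(r * 40, r * 40, c * 40) for c in range(6)] for r in range(6)]
patch = preprocess(image, box=(1, 1, 5, 5))
```

In practice the target size would match the input expected by the feature extractor; the tiny size here only keeps the example readable.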
ResNet50 was used for feature extraction because it is a well-established architecture for producing deep learning features. Its residual blocks use identity mappings (shortcut connections) to mitigate problems that arise in deeper networks, such as vanishing gradients.
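The idea behind identity mapping can be shown with a toy residual block. This is only a conceptual sketch: `residual_block` and `zero_transform` are hypothetical names, and real ResNet50 blocks use convolutions and batch normalization rather than a plain Python function.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(x, transform):
    """Return relu(F(x) + x); the '+ x' term is the identity shortcut."""
    fx = transform(x)
    return relu([f + xi for f, xi in zip(fx, x)])


def zero_transform(x):
    """A stand-in for a learned transformation F that outputs zeros."""
    return [0.0 for _ in x]


# With F(x) = 0 the block passes its non-negative input through unchanged,
# so adding more blocks does not have to degrade the signal.
x = [0.5, 1.0, 2.0]
out = residual_block(x, zero_transform)  # [0.5, 1.0, 2.0]
```

Because the pre-activation output is F(x) + x, its gradient with respect to x contains an additive identity term, which helps gradients survive through many stacked layers.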
Preprocessing was implemented in Python with Keras, and ResNet50 was used to extract 2048 deep learning features from each breast cancer ultrasound image.
Recursive Feature Elimination for Enhanced Predictive Power
Feature selection used Recursive Feature Elimination (RFE), which iteratively removes the features least relevant to the prediction target. The features were normalized beforehand so that differences in scale would not bias the ranking, which helps the model focus on the most informative features and limits overfitting.
Spearman’s rank correlation was used to identify redundant features, so that only non-redundant ones were retained for training. Recursive Feature Elimination with Cross-Validation (RFECV), with a linear support vector machine (SVM) as the estimator, then selected the final set of features most predictive of ALNM.
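A redundancy filter based on Spearman's rank correlation might look like the sketch below. The 0.9 cutoff, the greedy keep-first strategy, the toy feature names, and the function names are all assumptions; the source does not state how redundant features were chosen for removal.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def drop_redundant(features, threshold=0.9):
    """Keep a feature only if |rho| with every kept feature is below threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(spearman(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [1.0, 4.0, 9.0, 16.0, 25.0],  # monotone in f1, so rho = 1
    "f3": [2.0, 5.0, 1.0, 4.0, 3.0],
}
selected = drop_redundant(features)  # ["f1", "f3"]
```

The surviving features would then go to the RFECV step. In scikit-learn this is commonly done with `RFECV` wrapped around a linear-kernel SVM, though the source does not say which implementation was used.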
Graph Construction to Represent Patient Data
A feature table with 24 columns was built for each testing group. Each row was one data record, treated as a node of a graph, and relationships (edges) between nodes were established from the cosine similarity of their attributes.
The resulting graph encodes how closely patient records are related to one another in terms of the selected features. A subset of the graph was visualized to aid analysis.
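One simple way to turn a feature table into such a graph is to connect every pair of rows whose cosine similarity exceeds a cutoff. The sketch below assumes a 0.9 threshold and an undirected edge list; both are illustrative choices, and `build_edges` is a hypothetical helper, not the study's code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_edges(rows, threshold=0.9):
    """Connect each pair of nodes whose similarity meets the threshold."""
    edges = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            sim = cosine_similarity(rows[i], rows[j])
            if sim >= threshold:
                edges.append((i, j, sim))
    return edges


rows = [
    [1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0],  # same direction as node 0
    [0.0, 1.0, 0.0],  # orthogonal to nodes 0 and 1
]
edges = build_edges(rows)  # a single edge, between nodes 0 and 1
```

Cosine similarity ignores vector magnitude, so nodes 0 and 1 are linked even though their raw values differ by a factor of two.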
Utilizing Graph Convolutional Networks for Model Creation
The model was a Graph Convolutional Network (GCN), a neural network that learns from graph-structured data. Its architecture stacks multiple graph convolutional layers, which use the relationships between nodes to extract relevant features.
The GCN included dropout layers to reduce overfitting and fully connected layers to produce the output predictions. ALNM status labels were one-hot encoded for classification during training.
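To make the graph convolution concrete, here is a minimal sketch of a single layer together with one-hot label encoding. It assumes the widely used Kipf and Welling propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W); the source does not specify the exact layer formulation, and the dropout and fully connected layers are omitted. All names and the toy graph are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, with self-loops added to A."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot vector."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


# Three nodes in a chain: 0 - 1 - 2
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only
hidden = gcn_layer(adj, features, weights)
targets = [one_hot(y, 2) for y in [0, 1, 1]]  # assumed: 0 = negative, 1 = ALNM
```

Each node's new representation is a degree-weighted average of its own features and those of its neighbors, which is how relationships between patients enter the prediction.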
Training and Evaluation Metrics
The model was trained with the Adam optimizer, and early stopping based on validation loss was used to limit overfitting. Model performance was evaluated with the following metrics:
- Area Under the Curve (AUC)
- Confusion Matrices
- Precision, Sensitivity, and Specificity
- Negative Predictive Value (NPV) and Positive Predictive Value (PPV)
These metrics provided insights into the model’s robustness and accuracy in predicting ALNM in breast cancer patients.
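The metrics listed above all derive from the confusion matrix or from the ranking of predicted scores. The sketch below computes them with standard definitions on made-up labels and scores; the 0.5 decision threshold and the function names are assumptions, and none of the numbers come from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = ALNM positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # identical to precision
        "npv": tn / (tn + fn),
    }


def auc(y_true, scores):
    """AUC as the chance a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels and predicted probabilities only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)  # every value is 0.75 here
area = auc(y_true, scores)  # 0.9375
```

Note that precision and PPV are the same quantity, so reporting both adds no new information.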
Statistical Analysis and Comparative Evaluations
Statistical comparisons used tests suited to the data type, including chi-square tests and analysis of variance (ANOVA). Differences in model performance were examined, and Decision Curve Analysis (DCA) was used for further validation.
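Decision curve analysis compares a model's net benefit with simple reference strategies across a range of threshold probabilities. The sketch below uses the standard net benefit formula, TP/n - (FP/n) * pt / (1 - pt), on made-up data; the thresholds, labels, and function names are illustrative assumptions rather than the study's analysis.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    pairs = list(zip(y_true, probs))
    tp = sum(1 for t, p in pairs if p >= threshold and t == 1)
    fp = sum(1 for t, p in pairs if p >= threshold and t == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * (threshold / (1 - threshold))


# Illustrative labels and predicted probabilities only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
curve = [(pt, net_benefit(y_true, probs, pt), net_benefit_treat_all(y_true, pt))
         for pt in (0.2, 0.5)]
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero, the net benefit of treating no one.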
Heatmaps showed which differences between the models were statistically significant, making their relative strengths and weaknesses easier to compare.
This study contributes to machine learning in medical diagnostics, specifically the prediction of axillary lymph node metastasis in breast cancer patients, by combining a graph-based approach and deep learning features with an extensive set of clinical evaluation metrics.