Patient samples, clinical data, and study cohorts
To ensure robust development, external validation, and clinical applicability of the model, we systematically integrated multiple distinct cohorts representing a range of treatment scenarios and clinical settings, ultimately including a total of 1,249 patients with HER2-positive breast cancer for analysis (Fig. 1a and Table 1).
Fig. 1
Cohort description and HER2-LADDER construction workflow. a Overview of the study cohorts, including patient distribution, treatment regimens, and stratification criteria of the model development cohort and validation sets. b Workflow for HER2-LADDER model development. Baseline core-needle biopsies (CNBs) were subjected to digital scanning of paired hematoxylin and eosin (H&E) and HER2 immunohistochemistry (IHC) slides. The HoVer-Net and D-PathAI algorithms segmented tumor–microenvironment cells (cancer cells, stromal cells, lymphocytes, neutrophils, and macrophages) and tumor cell HER2 expression (strong/weak/null with integrity of membrane staining), respectively. Spatial features were extracted via single-cell morphological and topological profiling (sc-MTOP). Selected features and clinicopathological factors (age, clinical stage, and hormone receptor status) were integrated through multimodal ensemble voting to establish the HER2-LADDER predictive score. Created with BioRender.com
Table 1 Baseline clinicopathological characteristics of the cohorts included in the study
The core dataset for model development and validation comprised 372 patients with HER2-positive breast cancer who completed standard-of-care neoadjuvant taxane, carboplatin, trastuzumab, and pertuzumab (TCbHP/PCbHP) regimens at Fudan University Shanghai Cancer Center (FUSCC) between 2020 and 2024. After 14 patients without paired baseline H&E and HER2 IHC slides were excluded, 358 patients were eligible for computational pathological analysis. Among them, 276 cases treated from 2020 to 2022 constituted the model construction set, while 82 cases treated from 2023 onward formed a temporal validation set to assess the reproducibility of HER2-LADDER in more recent real-world patients. The FASCINATE-N trial (NCT05582499) HER2-positive cohort comprised 88 patients treated with PCbHP between December 2022 and February 2024.19 After three cases without available slides of the primary tumor were excluded, 85 patients were included as a trial-based validation set for independent evaluation of HER2-LADDER in a prospective clinical trial setting. In addition, an independent multicenter external validation cohort from Chongqing University Cancer Hospital (CQUCH) included 40 HER2-positive patients treated with neoadjuvant TCbHP/PCbHP, with paired baseline H&E and HER2 IHC slides available for analysis. Across these cohorts, patients exhibited broadly similar baseline characteristics, with median ages between 50.2 and 52.5 years.
To further explore the applicability of the model in nonstandard or adaptive therapeutic scenarios, 490 cases were retrospectively collected from patients treated at FUSCC. A total of 484 patients in these cohorts had paired baseline H&E and HER2 IHC slides available for spatial feature extraction. These included 18 patients who received dose-dense doxorubicin–cyclophosphamide followed by THP (ddAC-THP) and 46 who received THP directly under dual-targeted regimens; 301 patients who received single-targeted regimens with TCbH/PCbH; 80 patients who received SHR-A1811 (trastuzumab rezetecan), a next-generation anti-HER2 ADC composed of a humanized HER2-directed monoclonal antibody, a cleavable tetrapeptide linker, and a DNA topoisomerase I inhibitor payload (FASCINATE-N trial, NCT05582499)19,33; and 39 patients who received docetaxel, trastuzumab plus the pan-ErbB TKI pyrotinib (PHEDRA trial, NCT03588091).20
Finally, to assess the prognostic value of the model in the adjuvant setting, we analyzed a separate cohort of patients from FUSCC who received dual-targeted chemotherapy after surgery between 2018 and 2020. Among them, 157 received TCbHP and 125 received ddAC-THP, with follow-up exceeding 5 years, allowing evaluation of long-term outcomes, including overall survival (OS) and disease-free survival (DFS).
Establishment of the HER2-LADDER model
The HER2-LADDER model was developed using real-world HER2-positive breast cancer cohorts treated with standard neoadjuvant TCbHP/PCbHP. Paired H&E and HER2 IHC slides from these biopsies underwent digital scanning. Two specialized deep-learning algorithms were subsequently employed for detailed single-cell segmentation (Fig. 1b): HoVer-Net distinguished components of the tumor–microenvironment—including tumor cells, stromal cells, lymphocytes, neutrophils, and macrophages—on H&E slides, while D-PathAI characterized tumor cells on HER2 IHC slides according to HER2 membrane staining intensity and completeness.
Following cellular segmentation, a single-cell morphological and topological profiling (sc-MTOP) framework was applied to systematically extract spatially resolved single-cell features. This procedure generated a total of 69 features from H&E images and 70 from HER2 IHC images, encompassing cell proportions, spatial interactions, and cellular distributions. As summarized in Supplementary Table 1, these spatial features were broadly classified into four functional categories: (i) proportion, which represents the compositional ratios of specific cell types within the tissue; (ii) Nsubgraph, which indicates the number of cells within defined local cellular clusters, reflecting aggregation tendencies; (iii) edge-length metrics (MinEdgeLength and MeanEdgeLength), which quantify the spatial distance between specific cell types; and (iv) degree, which captures the connectivity and centrality of individual cells within their immediate cellular neighborhoods.
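To make the four feature families concrete, the following minimal sketch computes toy analogues of proportion, Nsubgraph, MinEdgeLength, and degree from cell centroids. It is not the sc-MTOP implementation: the rule that connects cells lying within a fixed `radius`, the function names, and the coordinates and labels are all illustrative assumptions.

```python
import math
from collections import Counter, deque

# Each cell is (x, y, label); coordinates and labels are made-up toy values.
cells = [
    (0.0, 0.0, "lymph"),
    (1.0, 0.0, "neutro"),
    (2.0, 0.0, "lymph"),
    (10.0, 10.0, "tumor"),
    (10.0, 11.5, "tumor"),
]

def build_graph(cells, radius):
    """Connect every pair of cells whose centroids lie within `radius` (assumed rule)."""
    edges = {i: set() for i in range(len(cells))}
    for i, (xi, yi, _) in enumerate(cells):
        for j in range(i + 1, len(cells)):
            xj, yj, _ = cells[j]
            if math.hypot(xi - xj, yi - yj) <= radius:
                edges[i].add(j)
                edges[j].add(i)
    return edges

def proportion(cells, label):
    """Fraction of all cells carrying `label`."""
    counts = Counter(lbl for _, _, lbl in cells)
    return counts[label] / len(cells)

def degree(edges, i):
    """Number of graph neighbours of cell i."""
    return len(edges[i])

def min_edge_length(cells, edges, i, target_label):
    """Shortest edge from cell i to a neighbour of `target_label`; None if there is none."""
    xi, yi, _ = cells[i]
    lengths = [
        math.hypot(xi - cells[j][0], yi - cells[j][1])
        for j in edges[i]
        if cells[j][2] == target_label
    ]
    return min(lengths) if lengths else None

def n_subgraph(cells, edges, i, labels):
    """Size of the connected cluster containing cell i, restricted to cells in `labels`."""
    if cells[i][2] not in labels:
        return 0
    seen = {i}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in edges[u]:
            if v not in seen and cells[v][2] in labels:
                seen.add(v)
                queue.append(v)
    return len(seen)

edges = build_graph(cells, radius=1.5)
lymph_fraction = proportion(cells, "lymph")
neutro_degree = degree(edges, 1)
neutro_to_lymph = min_edge_length(cells, edges, 1, "lymph")
neutro_lymph_cluster = n_subgraph(cells, edges, 1, {"neutro", "lymph"})
```

In this toy layout the neutrophil sits between two lymphocytes, so it has two neighbours and belongs to a three-cell neutrophil and lymphocyte cluster, while the two tumor cells form a separate component.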
These digital pathology-derived spatial features were subsequently integrated with critical clinical-pathological variables, including patient age, clinical stage, and hormone receptor (HR) status. Through systematic data splitting, feature engineering, multimodel hyperparameter tuning, and ensemble voting, we finalized the HER2-LADDER predictive scoring model (Supplementary Fig. 1). Model performance, generalizability, and clinical utility were subsequently validated across multiple real-world and trial-based patient cohorts.
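The ensemble voting step can be pictured with a minimal sketch. The text does not specify whether voting was soft or hard, which submodels were used, or how they were weighted, so the version below assumes equal-weight soft voting (averaging predicted probabilities); the submodel names, probabilities, and 0.5 cutoff are hypothetical.

```python
def soft_vote(probabilities, weights=None):
    """Weighted mean of submodel probabilities for one patient (assumed soft voting)."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total

# Hypothetical submodel outputs: predicted probability of non-pCR for one patient.
submodel_outputs = {
    "logistic_regression": 0.62,
    "random_forest": 0.71,
    "gradient_boosting": 0.55,
}

score = soft_vote(list(submodel_outputs.values()))
# Illustrative cutoff only; the study's decision threshold is not stated here.
label = "non-pCR" if score >= 0.5 else "pCR"
```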
Predictive performance of the HER2-LADDER model in neoadjuvant settings
Following the modeling workflow detailed in Fig. 1b, the HER2-LADDER model demonstrated robust and consistent predictive performance across both the construction and validation sets. Specifically, the area under the curve (AUC) of the model was 0.944 (95% CI, 0.921–0.962) in the model construction set (Fig. 2a), 0.903 (95% CI, 0.829–0.955) in the independent temporal validation set (Fig. 2b), and 0.869 (95% CI, 0.814–0.929) in the FASCINATE-N PCbHP set (Fig. 2c). Similarly, the average precision (AP) scores, which reflect the model’s discriminative power in imbalanced classification settings, were 0.917 (95% CI, 0.877–0.946) for model construction (Fig. 2d), 0.861 (95% CI, 0.745–0.929) for independent temporal validation (Fig. 2e), and 0.815 (95% CI, 0.684–0.905) for FASCINATE-N PCbHP validation (Fig. 2f). Decision curve analysis (DCA) further demonstrated that the voting ensemble model consistently outperformed its individual submodel components in all three cohorts, highlighting the superior net clinical benefit and the advantage of integrative modeling approaches (Fig. 2g–i). Confusion matrix analyses demonstrated consistent classification accuracy and reproducibility across cohorts, with Cohen’s kappa coefficients ≥0.76 (Supplementary Fig. 2a–c).
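For readers less familiar with these metrics, the sketch below shows how AUC, AP, and the net benefit used in decision curve analysis can be computed from labels and scores. The data are invented toy values, the function names are illustrative, and the AP function assumes no tied scores; it is not the study's evaluation code.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, y_score):
    """Mean precision at the rank of each positive, scanning from the highest score down."""
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def net_benefit(y_true, y_score, threshold):
    """Net benefit at a probability threshold: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy labels (1 = non-pCR) and scores, invented for illustration.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

auc = roc_auc(y_true, y_score)
ap = average_precision(y_true, y_score)
nb = net_benefit(y_true, y_score, threshold=0.5)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing the result with the treat-all and treat-none strategies.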
Fig. 2
Comprehensive evaluation of the efficacy prediction performance of the HER2-LADDER model. a–c Receiver operating characteristic (ROC) curve illustrating HER2-LADDER model performance in the model construction (a) (AUC = 0.944), temporal validation (b) (AUC = 0.903), and FASCINATE-N PCbHP (trial-based validation) (c) (AUC = 0.869) cohorts. d–f Average precision (AP) curve of the HER2-LADDER model in the model construction (d) (AP = 0.917), temporal validation (e) (AP = 0.861), and FASCINATE-N PCbHP validation (f) (AP = 0.815) cohorts. g–i Decision curve analysis (DCA) highlighting the clinical net benefit of the ensemble voting model compared with that of individual submodels across the model construction (g), temporal validation (h), and FASCINATE-N PCbHP (i) cohorts. j Forest plot of logistic regression analysis showing odds ratios (ORs) for predicting treatment resistance for HER2-LADDER score (per 0.1-unit increment) and traditional clinical-pathological factors; the multivariate model was performed on the combined model construction set (N = 276) and temporal validation set (N = 82) and was adjusted for age and clinical stage
In addition, HER2-LADDER achieved a sensitivity of 0.8716 and specificity of 0.8933 in the model construction set, 0.8966 and 0.8600 in the temporal validation set, and 0.8173 and 0.8947 in the trial-based validation set, indicating balanced discrimination across cohorts. Calibration curve analysis further confirmed the agreement between the predicted and observed probabilities across all cohorts (Supplementary Fig. 2d). The Brier scores were 0.117 for the model construction set, 0.150 for the temporal validation set, and 0.155 for the FASCINATE-N PCbHP cohort, all of which are indicative of reliable probabilistic calibration. In an independent real-world cohort treated with the neoadjuvant ddAC-THP regimen, the discriminative performance of the model was stable, with an AUC of 0.852 and a Cohen’s kappa of 0.66 (Supplementary Fig. 2e, f). Furthermore, in the independent CQUCH cohort, HER2-LADDER achieved an AUC of 0.798, demonstrating preserved discriminative performance when extrapolated to an external medical center (Supplementary Fig. 3). These results indicate stable model behavior and support the generalizability of the predictive framework.
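The threshold-based and calibration metrics reported here follow standard definitions, sketched below on invented toy data. The 0.5 cutoff, variable names, and values are assumptions for illustration and do not reproduce the study's analysis.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity_specificity(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement between predicted and observed classes, corrected for chance."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (observed - expected) / (1 - expected)

# Toy outcomes (1 = non-pCR) and predicted probabilities, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.3, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 cutoff

sens, spec = sensitivity_specificity(y_true, y_pred)
brier = brier_score(y_true, y_prob)
kappa = cohen_kappa(y_true, y_pred)
```

Lower Brier scores indicate better agreement between predicted probabilities and outcomes; a model that always predicts 0.5 scores 0.25.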
The predictive strength of the HER2-LADDER score was further supported by logistic regression analyses conducted on the combined set of 358 real-world neoadjuvant TCbHP/PCbHP patients (Fig. 2j). Each 0.1-unit increase in the HER2-LADDER score was associated with nearly sevenfold higher odds of not achieving a pathological complete response (non-pCR) after TCbHP/PCbHP treatment (OR 6.628, 95% CI: 4.445–9.883, P = 1.706 × 10⁻²⁰), outperforming conventional clinicopathologic predictors such as HER2 IHC 3+ status (OR 0.112, 95% CI: 0.05–0.249, P = 7.867 × 10⁻⁸), HR negativity (OR 0.217, 95% CI: 0.126–0.347, P = 1.540 × 10⁻¹⁰), and TILs score (OR 0.839, 95% CI: 0.686–1.027, P = 0.089). This association remained robust in the multivariable logistic regression after adjusting for age and clinical stage (adjusted OR 6.923, 95% CI: 4.537–10.563; P = 2.837 × 10⁻¹⁹). We further compared the discrimination ability of HER2-LADDER with that of conventional clinicopathologic variables using AUC and DeLong tests (Supplementary Fig. 4). Compared with individual clinicopathologic predictors, HER2-LADDER consistently achieved higher AUCs, with DeLong tests confirming that the difference was statistically significant across all cohorts. Together, these results highlight the strong predictive performance, clinical robustness, and potential of the model to inform neoadjuvant treatment decisions involving trastuzumab- and pertuzumab-based dual HER2 blockade combined with chemotherapy.
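As a reading aid for the per-0.1-unit odds ratio, the sketch below back-calculates the log-odds coefficient, its standard error, and a Wald P value from the reported OR and 95% CI, and rescales the OR to a different score increment. It assumes the interval is a Wald interval symmetric on the log-odds scale, which the text does not state; the function name is illustrative.

```python
import math

def wald_from_or(or_value, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, z, and a two-sided normal P value from an OR and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return beta, se, z, p

# Reported univariable estimate per 0.1-unit increase in the HER2-LADDER score.
beta, se, z, p = wald_from_or(6.628, 4.445, 9.883)

# The same coefficient expressed per 1.0 unit, and the implied OR for a 0.2-unit increase.
beta_per_unit = beta / 0.1
or_per_0_2 = math.exp(beta_per_unit * 0.2)  # equals 6.628 squared
```

Under this assumption the recovered P value is on the order of 10⁻²⁰, in line with the reported value, and a 0.2-unit increase corresponds to the square of the per-0.1-unit OR because odds ratios multiply across increments.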
Layered treatment recommendation and optimization informed by the HER2-LADDER model
To examine the potential clinical applicability of the HER2-LADDER model across distinct anti-HER2 therapeutic scenarios, we evaluated its predictive associations in neoadjuvant cohorts receiving different treatment regimens.
With respect to monoclonal antibody-based regimens, logistic regression analyses revealed that HER2-LADDER scores were significantly correlated with treatment responses in both dual- and single-targeted anti-HER2 regimens (Fig. 3a). Notably, the TCbHP/PCbHP treatment cohort, comprising patients from the model construction set (N = 276), temporal validation set (N = 82), and trial-based validation set (N = 85), demonstrated a strong association between HER2-LADDER scores and treatment response (OR 5.349, 95% CI 3.899–7.338; P = 2.620 × 10⁻²⁵). Significant associations were also observed for dual HER2 blockade with taxane monochemotherapy (THP) (OR 5.224, 95% CI 1.597–17.084; P = 0.006) and for single-targeted TCbH/PCbH regimens (OR 2.623, 95% CI 2.033–3.385; P = 1.177 × 10⁻¹³). Consistent discrimination was observed across these groups, with an AUC = 0.842 for THP and an AUC = 0.785 for TCbH/PCbH (Fig. 3b). These results indicate that HER2-LADDER consistently predicts treatment response across monoclonal antibody-based anti-HER2 regimens, with notably stable performance in the de-escalated THP regimen.
Fig. 3
Layered stratification by the HER2-LADDER model informs treatment de-escalation or alternative strategies. a Logistic regression analysis demonstrating significant correlations between the HER2-LADDER score and therapeutic outcomes across diverse dual-targeted (TCbHP/PCbHP, ddAC-THP, and THP), single-targeted (TCbH/PCbH), and alternative anti-HER2 regimens (SHR-A1811, TH plus pyrotinib). b ROC curves showing HER2-LADDER predictive performance for single-agent chemotherapy with dual-targeted therapy (THP) and for dual-agent chemotherapy with single-targeted therapy (TCbH/PCbH). c Tertile-based stratification of HER2-LADDER scores defining Low, Medium, and High groups aligned with therapeutic strategies. d Comparative analysis of pathological complete response (pCR) rates across HER2-LADDER-defined groups and different anti-HER2 treatment regimens (TCbHP/PCbHP, THP, TCbH/PCbH, SHR-A1811, TH plus pyrotinib), with statistical analysis by the chi-square test and Bonferroni adjustment. e Summary schematic illustrating treatment planning informed by HER2-LADDER scores: HER2-LADDER-Low patients are appropriate for de-escalation (THP or TCbH/PCbH), HER2-LADDER-Medium patients benefit from standard dual-targeted therapy (TCbHP/PCbHP), and HER2-LADDER-High patients may require alternative options such as ADCs or TKIs
To explore whether HER2-LADDER retains predictive relevance beyond monoclonal antibody-based regimens, we examined its association with treatment outcomes in cohorts receiving alternative anti-HER2 therapies. In the SHR-A1811 (novel anti-HER2 ADC) cohort, the HER2-LADDER score was moderately but significantly correlated with treatment outcomes (OR = 1.741, 95% CI 1.175–2.579; P = 0.006). In contrast, no significant association was detected in the TH plus pyrotinib cohort (OR = 1.107, 95% CI 0.549–2.234; P = 0.776). These exploratory findings suggest that while the model retains partial predictive relevance in ADC-treated tumors, its applicability may not extend to TKI-based therapies. This discrepancy likely reflects intrinsic biological and mechanistic differences between ADCs and small-molecule TKIs relative to standard monoclonal antibody-based regimens.
To explore whether HER2-LADDER scores could facilitate response interpretation across different anti-HER2 regimens, we stratified patients into three groups on the basis of score distribution within the model construction cohort. Using a tertile-based division, the entire score range was objectively partitioned into Low, Medium, and High groups. This segmentation provided a reproducible and consistent framework for exploratory comparisons of treatment outcomes across different regimens (Fig. 3c). Comparisons of the pCR rates across these groups revealed distinct treatment response patterns (Fig. 3d). In the HER2-LADDER-Low group, the pCR rates were similarly high for TCbHP/PCbHP (96.2%), THP (85.7%, Bonferroni-adjusted P = 1 vs. TCbHP/PCbHP), and TCbH/PCbH (84.9%, P = 0.143), indicating potential appropriateness for regimen de-escalation. In the HER2-LADDER-Medium group, TCbHP/PCbHP significantly outperformed THP (pCR 80.0% vs. 7.7%, P = 6.382 × 10⁻⁷) and TCbH/PCbH (47.1%, P = 6.856 × 10⁻⁵), supporting maintenance of the standard regimen. In the HER2-LADDER-High group, patients responded poorly to TCbHP/PCbHP (pCR 19.6%), whereas those who received SHR-A1811 (54.2%, P = 6.914 × 10⁻³) or TH plus pyrotinib (50.0%, P = 0.085) had relatively better outcomes.
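The tertile-based grouping can be sketched as follows. The text does not give the cut points or say whether tertiles were taken over patients or over the score range, so this sketch assumes quantile-based tertiles of a reference (construction) cohort that are then applied unchanged to other cohorts; the scores and function names are invented for illustration.

```python
import statistics

def tertile_cutoffs(reference_scores):
    """Two cut points splitting the reference cohort into three equal-sized groups."""
    low_cut, high_cut = statistics.quantiles(reference_scores, n=3)
    return low_cut, high_cut

def assign_group(score, low_cut, high_cut):
    """Map a score to Low / Medium / High using fixed reference cut points."""
    if score <= low_cut:
        return "Low"
    if score <= high_cut:
        return "Medium"
    return "High"

# Invented scores standing in for the model construction cohort.
reference_scores = [0.05, 0.10, 0.15, 0.30, 0.45, 0.50, 0.70, 0.85, 0.90]
low_cut, high_cut = tertile_cutoffs(reference_scores)

# Cut points are frozen and reused for patients from any other cohort.
new_patient_groups = [assign_group(s, low_cut, high_cut) for s in (0.12, 0.40, 0.80)]
```

Freezing the cut points on the reference cohort, rather than recomputing them per cohort, is what keeps group membership comparable across regimens.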
Collectively, these observations suggest that HER2-LADDER scores may provide a framework to assist in stratified therapeutic planning, supporting potential de-escalation regimens (THP, TCbH/PCbH) for HER2-LADDER-Low patients, standard-of-care TCbHP/PCbHP in HER2-LADDER-Medium patients, and alternative regimens (ADCs or TKIs) for HER2-LADDER-High patients (Fig. 3e). Together, HER2-LADDER represents a conceptually innovative and clinically translatable framework for precision-guided optimization of anti-HER2 therapies.
Prognostic value of HER2-LADDER in the adjuvant treatment setting
Given the central role of dual HER2-targeted therapy combined with chemotherapy in the adjuvant management of early-stage HER2-positive breast cancer, we further evaluated the prognostic utility of the HER2-LADDER model beyond the neoadjuvant context. Stratification on the basis of baseline HER2-LADDER scores revealed marked differences in long-term clinical outcomes. Patients in the High group had significantly worse overall survival (OS; hazard ratio (HR) 7.17; 95% CI: 1.52–33.83; log-rank P = 0.013; Fig. 4a) and worse DFS (HR 2.94; 95% CI: 1.14–7.61; P = 0.026; Fig. 4b) than those in the HER2-LADDER-Low group did. Notably, subgroup analysis of patients receiving adjuvant TCbHP further confirmed the prognostic relevance of HER2-LADDER, with the High group showing significantly inferior DFS (HR 4.71, 95% CI: 1.21–24.28; P = 0.024; Supplementary Fig. 5). These results indicate that HER2-LADDER scores, derived solely from pretreatment core-needle biopsy slides, can serve as powerful prognostic indicators to predict long-term outcomes, even in the adjuvant setting.
Fig. 4
Prediction performance of the HER2-LADDER model in adjuvant-treated HER2-positive breast cancer. a Kaplan–Meier analysis comparing overall survival (OS) between the HER2-LADDER-High and HER2-LADDER-Low groups; hazard ratios and P value derived from the Cox proportional hazards model and log-rank test. b Kaplan‒Meier curve for disease-free survival (DFS) comparing the HER2-LADDER-High and HER2-LADDER-Low groups; hazard ratios and P value determined by Cox regression and log-rank tests. The extended subgroup analysis is provided in Supplementary Fig. 5
Biological interpretability of the predictive features
Building on the strong predictive performance of HER2-LADDER in both neoadjuvant and adjuvant settings, we next explored the biological basis of the key features of the model to better understand the mechanisms underlying the response to TCbHP/PCbHP treatment. The final feature set, derived through systematic engineering, integrated clinicopathological variables (e.g., HR status) with spatially resolved features extracted from digital pathology images. These spatial features were categorized according to staining modality and biological relevance (Fig. 5a).
Fig. 5
Interpretability deconvolution of the HER2-LADDER model. a Categorization of the key predictive spatial variables integrated into HER2-LADDER, derived from H&E (spatial interactions of tumor microenvironmental cells) and HER2 IHC (spatial interactions of HER2-expressing tumor cells). Variables include cell proportions (Pi), aggregation scale (Nsubgraph), edge-length metrics (minEdgeLength), and spatial centrality (Degrees). b Shapley additive explanations (SHAP) analysis quantifying the relative contributions of each feature and clinicopathological variable within the submodels and the final ensemble HER2-LADDER model. Features ranked by predictive importance (mean [|SHAP value|]), reflecting their overall importance
From H&E-stained images, selected variables primarily reflected tumor microenvironment interactions, including the spatial distribution and colocalization patterns of lymphocytes, neutrophils, macrophages, and stromal cells relative to tumor cells. Representative features included Neutro_Lymph_Nsubgraph (cellular aggregation), Lymph_Lymph_Nsubgraph (cellular aggregation), Pi_Lymph_in_all_cells (proportion), and Neutro_Lymph_minEdgeLength (intercellular distance). Moreover, features derived from HER2 IHC-stained images described the spatial organization and expression heterogeneity of HER2-positive tumor cells, encompassing the proportions of HER2-expressing cell subtypes and their spatial arrangements. Representative variables included HER2strong_complete_HER2strong_incomplete_Nsubgraph (cellular aggregation), Pi_HER2strong_complete_in_HER2strong (proportion), Pi_HER2strong_incomplete_in_tumor_cells (proportion), and HER2strong_complete_HER2strong_complete_minEdgeLength (intercellular distance).
To assess the relative contribution of these features, we applied Shapley additive explanations (SHAP) analysis across individual submodels and the ensemble voting classifier (Fig. 5b). SHAP confirmed that both H&E-derived microenvironmental cell interaction features and HER2 IHC-derived tumor cell spatial characteristics substantially influenced the model predictions. These findings underscore that the most influential features identified by HER2-LADDER are not only predictive but also biologically meaningful, offering insights into the spatial determinants of therapeutic response.
Spatial determinants of treatment responsiveness suggested by the HER2-LADDER model
To further elucidate the biological mechanisms underlying the predictive ability of HER2-LADDER, we systematically investigated spatial determinants associated with treatment responsiveness, categorizing them into HER2 expression characteristics and tumor immune microenvironment features, as outlined in Fig. 6a. By integrating digital pathology profiling with spatial transcriptomics validation via Xenium in situ analyses, we closely examined key variables identified through SHAP analyses.
Fig. 6
Spatial multiomics characterization of HER2 expression heterogeneity determinants in HER2-LADDER groups. a Analytical framework illustrating the integration of digital pathology profiling and Xenium in situ validation of HER2-expressing tumor cell spatial characteristics. b, c Comparative analysis of the aggregation scale (b, HER2strong_complete_HER2strong_incomplete_Nsubgraph) and minimum intercellular distance (c, HER2strong_complete_HER2strong_incomplete_minEdgeLength) between HER2-strong complete and HER2-strong incomplete cells across HER2-LADDER groups; statistical significance was assessed by Wilcoxon tests. d, e Comparison of the proportions of HER2-strong complete (d) and HER2-weak complete (e) cells between the HER2-LADDER groups; statistical significance was assessed by Wilcoxon tests. f Representative HER2 IHC whole-slide images and tumor cell-type mapping illustrating spatial heterogeneity in HER2 membrane expression patterns between the HER2-LADDER groups. g Differential gene expression analysis from Xenium spatial transcriptomics (21 samples; 11 HER2-LADDER-Low and 10 HER2-LADDER-High), highlighting increased ERBB2 pathway gene expression in HER2-LADDER-Low tumors. h Single-cell PAM50 subtyping demonstrating significantly higher proportions of HER2-enriched subtype cells in HER2-LADDER-Low tumors and of luminal subtype cells in HER2-LADDER-High tumors; statistical significance was assessed by the Wilcoxon test. i, j Spatially resolved single-cell mappings of PAM50 subtypes showing distinct spatial distribution patterns for unified HER2-enriched cells (i, HER2-LADDER-Low tumors) versus mixed luminal and HER2-enriched patterns (j, HER2-LADDER-High tumors)
In terms of HER2 expression characteristics, marked spatial differences emerged between tumors classified into the HER2-LADDER-High group and those classified into the HER2-LADDER-Low group. Specifically, HER2-LADDER-High tumors exhibited significantly greater heterogeneity in HER2 membrane staining, characterized by mosaic-like distribution patterns. The average aggregation scale (HER2strong_complete_HER2strong_incomplete_Nsubgraph) was significantly greater in HER2-LADDER-High tumors (Wilcoxon P = 4.2 × 10⁻⁷; Fig. 6b). Additionally, these tumors displayed notably shorter average minimal intercellular distances between HER2-strong complete and incomplete membrane-expressing tumor cells (HER2strong_complete_HER2strong_incomplete_minEdgeLength; P = 2.9 × 10⁻⁵; Fig. 6c). Quantitative analysis further revealed that HER2-LADDER-High tumors contained significantly lower proportions of tumor cells with complete membrane HER2-strong positivity (Pi_HER2strong_complete_in_HER2strong; P = 7 × 10⁻⁸; Fig. 6d) and higher proportions of tumor cells with HER2-weak complete membrane staining (Pi_HER2weak_complete_in_tumor_cells; P = 2.6 × 10⁻⁵; Fig. 6e). These findings were visually confirmed via HER2 IHC WSIs, underscoring the pronounced mosaic pattern of HER2 expression as a potential determinant of reduced sensitivity to dual HER2-targeted treatments (Fig. 6f).
Complementary Xenium in situ analysis validated these observations at the transcriptomic level, demonstrating significantly elevated expression of ERBB2 signaling-related genes and downstream effectors in tumors from the HER2-LADDER-Low group, indicating robust HER2 pathway activation (Fig. 6g). Single-cell PAM50 subtyping supported these findings, as HER2-LADDER-Low tumors exhibited a significantly greater proportion of HER2-enriched subtype cells (P = 2.6 × 10⁻³), whereas HER2-LADDER-High tumors had greater proportions of luminal A (P = 6.5 × 10⁻³) and luminal B subtype cells (P = 0.05; Fig. 6h). Spatial mapping further confirmed a relatively unified composition of HER2-enriched cells within HER2-LADDER-Low tumors (Fig. 6i), which contrasted sharply with the mosaic-like distribution observed in HER2-LADDER-High tumors dominated by cells of luminal subtypes (Fig. 6j).
When the tumor immune microenvironment was examined, significant differences in spatial immune cell organization were observed. Tumors with better therapeutic responses demonstrated characteristics consistent with an “immune-hot” phenotype, including increased aggregation scales (Neutro_Lymph_Nsubgraph; P = 2.4 × 10⁻⁴; Fig. 7a) and reduced minimal intercellular distances (Neutro_Lymph_minEdgeLength; P = 5.8 × 10⁻⁶; Fig. 7b) between neutrophils and lymphocytes. Similar immune-hot patterns were observed for lymphocyte‒lymphocyte interactions, including larger aggregation scales (Lymph_Lymph_Nsubgraph; P = 7.3 × 10⁻⁷; Fig. 7c) and shorter intercellular distances (Lymph_Lymph_minEdgeLength; P = 7.9 × 10⁻⁵; Fig. 7d). Spatial mappings from H&E-stained WSIs visually validated these findings, highlighting dense immune cell clustering within tumors exhibiting favorable responses (Fig. 7e).
Fig. 7
Spatial multiomics characterization of immune microenvironment determinants associated with HER2-LADDER groups. a, b Comparative analyses of neutrophil–lymphocyte aggregation scales (a, Neutro_Lymph_Nsubgraph) and minimum intercellular distances (b, Neutro_Lymph_minEdgeLength) between HER2-LADDER groups; statistical significance was assessed by Wilcoxon tests. c, d Aggregation scales (c, Lymph_Lymph_Nsubgraph) and minimum intercellular distances (d, Lymph_Lymph_minEdgeLength) within lymphocyte populations compared between HER2-LADDER groups; statistical significance was assessed by Wilcoxon tests. e Representative spatial mappings from H&E WSIs depicting differential immune cell clustering patterns (neutrophil‒lymphocyte and lymphocyte‒lymphocyte interactions) between the HER2-LADDER groups. f Comparative Xenium-derived cell composition analysis, highlighting significantly increased helper T-cell proportions in HER2-LADDER-Low tumors, particularly within 30 μm of neutrophils; statistical significance was assessed by Wilcoxon tests. g Pathway enrichment analysis of differentially expressed genes in neutrophils derived from Xenium spatial transcriptomic profiling of 21 tumor samples (11 HER2-LADDER-Low and 10 HER2-LADDER-High cases). h Cell–cell ligand–receptor interaction analysis identifying significant differences in interactions between antigen-presenting cells (neutrophils, dendritic cells, and B cells) and helper T cells, as well as subsequent interactions between helper T cells and effector immune cells (M1-like macrophages, natural killer T cells, and plasma cells), between the HER2-LADDER groups; statistical significance is indicated. i–l Spatial connectivity analyses using sc-MTOP quantitatively confirm stronger interactions in the HER2-LADDER-Low group between neutrophils and helper T cells (i) and subsequently between helper T cells and effector immune populations (j), suggesting that coordinated immune responses drive enhanced treatment sensitivity. Statistical significance was assessed by the Wilcoxon test. Representative spatial interaction schematics are illustrated in (k, l)
Building upon these observations, the results of the Xenium analysis further revealed that the abundance of helper T cells differed most markedly between the HER2-LADDER-Low and HER2-LADDER-High groups (P = 3.8 × 10⁻³; Fig. 7f). Further spatial analysis revealed an enrichment of helper T cells within 30 μm of neutrophils in HER2-LADDER-Low tumors, suggesting critical spatially mediated immunological interactions (Fig. 7f). In HER2-LADDER-Low tumors, differential expression analysis revealed significant enrichment of antigen receptor-mediated signaling pathways in neutrophils (Fig. 7g). These findings suggest that neutrophils may engage in antigen presentation, prompting further examination of their interactions with antigen-presenting cells (APCs).
Ligand‒receptor analyses subsequently revealed significant differences in cellular communication between APC populations (neutrophils, dendritic cells, and B cells) and helper T cells, as well as downstream interactions between helper T cells and effector immune populations, including M1-like macrophages, natural killer T cells, and plasma cells (Fig. 7h). Quantitative spatial interaction assessments indicated notably stronger connections between neutrophils and helper T cells (Wilcoxon P = 0.01; Fig. 7i), with subsequent robust interactions extending from helper T cells to effector immune cells (M1-like macrophages: P = 2.8 × 10⁻³; natural killer T cells: P = 0.024; plasma cells: P = 6.2 × 10⁻³; Fig. 7j). Collectively, these findings suggest that a spatial immunological cascade initiated by neutrophil-helper T-cell interactions enhances effector immune responses, thereby potentiating antitumor immunity and improving therapeutic responses (Fig. 7k, l).
In summary, the integration of digital pathology profiling with Xenium spatial in situ analyses provided mechanistic insights into the distinct spatial characteristics of HER2 expression and tumor microenvironmental interactions that underpin HER2-LADDER predictions. These findings reinforce the biological interpretability of HER2-LADDER, highlighting the pivotal role of spatially resolved tumor phenotypes in predicting therapeutic responsiveness.

