Outlining the tumour-immune microenvironments of the Oncotype Dx RS
Considering the consensus view of estrogen receptor-positive, human epidermal growth factor receptor 2-negative (ER+HER2−) breast cancer being an immune-cold subtype, we firstly investigated the immune composition of the tumour microenvironment (TME) and epithelia across the Oncotype Dx recurrence score (RS) (Fig. 2a). Using 5-marker multiplex immunofluorescence-based spatial proteomic data (n = 440), we discovered that within TME/stroma (i.e. stromal tumour-infiltrating lymphocytes: sTIL) no sTIL phenotype density was significantly different across the RS except for sTIL T-regulatory cells, that significantly increased with risk (Fig. 2b. Kruskal–Wallis p < 0.001. Dunn’s test High RS vs. Low RS p < 0.0001, High RS vs. Int RS p = 0.016, Int RS vs. Low RS p = 0.042. FDR < 0.05). Within epithelia (i.e. intraepithelial tumour-infiltrating lymphocytes: iTIL) (Fig. 2c), likewise only iTIL T-regulatory cells significantly increased (Kruskal–Wallis q = 0.018), with High RS disease contributing the majority of this significant difference (Dunn’s test High RS vs. Low RS p < 0.001, FDR < 0.05; High RS vs. Int RS p = 0.049, FDR > 0.05). We next explored with hierarchical clustering whether any immune pattern would emerge in both the stroma (TME, Fig. 2d) and epithelia (EPI, Fig. 2e). We found four TME and three EPI states. One state each was truly immune cold (TME-4: n = 90, 20%. EPI-1: n = 121, 28%. Fig. 2f) and one pan-immune hot (TME-3: n = 130, 30%. EPI-3: n = 123, 28%. Fig. 2f). Two TME and one EPI state were variably immune-infiltrated, with one more enriched in T-helper/cytotoxic T-cells (TME-2: n = 92, 21%. Fig. 2f) and the other more T-helper scarce (TME-1: n = 128, 29%, Fig. 2f), whereas EPI-2 appeared as an intermediary between hot and cold states (n = 196, 45%. Fig. 2f).
Fig. 2: Outlining the immune milieu.
a Representative ROIs (regions of interest) showing proteomic 7plex data across the tumour microenvironment (TME) and epithelia (EPI). Boxplots of log-transformed immune phenotype density across b. The TME (Low RS [recurrence score] n = 167, Int RS n = 183, High RS n = 90), c. Epithelia (Low RS n = 167, Int RS n = 183, High RS n = 90). Box plots show the median (centre line), with the box spanning the interquartile range (25th to 75th percentiles). Whiskers extend to the most extreme values within 1.5 × the interquartile range from the box. Outliers beyond the whiskers were not displayed. Opacity encodes significance on the Kruskal–Wallis test, and q shows the Benjamini–Hochberg-adjusted p-value from the Kruskal–Wallis test. Asterisks denote significant pairwise comparisons of two-sided Dunn’s post hoc test, adjusted for multiple comparisons, with values as follows: p < 0.05 (*), 0.01 ≥ p > 0.001 (**), p ≤ 0.001 (***). Exact adjusted p-values: b CD4+FOXP3+ Low RS vs. High RS p = 0.0000873; Low RS vs. Int RS p = 0.0424; Int RS vs. High RS p = 0.0155. c CD4+FOXP3+ Low RS vs. High RS p = 0.00253. d–f Heatmaps of unsupervised clustering on immune phenotypes from spatial proteomic data, within the d TME, and e epithelia. f Pie charts of the frequency of patients with each TME and EPI state. Bar charts of gene set enrichment analysis (GSEA) for each unsupervised cluster versus the baseline, immune-cold state, for g TME States versus TME-4, h EPI states versus EPI-1.
We next performed differential gene expression for each cluster relative to the immune-cold state (TME-4 or EPI-1), and gene-set enrichment analysis (GSEA) was subsequently performed to identify functionally enriched terms and pathways (Fig. 2g, h). We performed this in the matched spatial proteomic-transcriptomic dataset (n = 359). All TME states were dominated by immune activation and cytokine response programmes (e.g. enriched interferon-α and interferon-γ hallmarks), though differing in the type and intensity of inflammation. TME-1 (T-cell low) was enriched in epithelial-to-mesenchymal transition and P53 pathway, TGF-β signalling, apoptosis and oxidative phosphorylation. TME-2 (T-cell infiltrating) was enriched for allograft rejection, TNF signalling via NFκB, P53 pathway and complement hallmarks, whereas TME-3 (immune-hot) was uniquely enriched in PI3K-AKT-mTOR, mTORC1 signalling, and DNA damage checkpoint G2M, among enriched inflammatory pathways. Altogether, compared with immune cold TME-4, GSEA suggests TME-1 is a more activated/stressed stromal state with interferon-driven immune signalling; TME-2 more classically inflammatory; TME-3 is immune-hot but exhibits strong growth, metabolic and DNA damage responses indicative of a milieu with marked cell-cycle pressure and survival pathways (Fig. 2g). Comparatively, EPI-2 (immune-intermediary) was also independently associated with downregulation of TNF signalling, hypoxia, TGF-β signalling, Notch signalling and androgen response hallmarks, indicative of a less inflamed, less proliferative, and transcriptionally quiet epithelium. On the contrary, EPI-3 (immune-hot) was associated with downregulation of glycolysis, fatty acid metabolism, MYC targets and DNA repair, with enrichment of interferon-α/ interferon-γ hallmarks, inflammatory response pathways, IL6-JAK-STAT3 signalling and complement pathways, suggestive of an inflamed, immune-engaged state versus an immunologically cold epithelium (Fig. 2h).
Immune-hot epithelia co-occur with immunoregulatory stroma
We next investigated the prevalence of TME and EPI states across the cohort (proteomic data, n = 440). We found no significant enrichment of any state across categories of the Oncotype Dx RS (Fig. 3a, TME χ2 = 6.5, p = 0.37. EPI χ2 = 8.22, p = 0.084), implying that immune composition is not captured wholly by the genomic context. Subsequently, we explored the relationship between TME and EPI states. Immune-hot states in both TME (TME-3) and epithelia (EPI-3) tended to co-occur (Fig. 3b, n = 84, 64.6%). However, immune-cold epithelia (EPI-1, Fig. 3b) were observed in milieu exhibiting both a cold TME (Fig. 3b, TME-4: n = 52, 57.8%), and in milieu defined by T-cell infiltration/T-reg sparsity (Fig. 3b, TME-2: n = 49, 53.3%). These data suggest the presence of distinct immune-epithelial programmes that are partly, though not reliably, coupled. To investigate this further, using the matched spatial transcriptomic-proteomic dataset (n = 359), we subsequently employed cluster-wise differential expression (DEG) analysis of an OmniPath-curated25 ligand-receptor gene set, treating each spatial compartment independently. Comparing DEGs across each immune state (Fig .3c, d) revealed a significantly divergent transcriptomic landscape between states. While TME-2 (T-cell infiltrating) and TME-3 (immune-hot) both contain relatively higher densities of T-cells, TME-3 surprisingly exhibited no significant DEGs of ligands versus TME-1 (T-cell low), suggesting that the differences may be driven more by receptor programmes (Fig. 3c). TME-2, however, diverged from TME-1 in that ligand genes relating to angiogenesis (VEGFA), M2-like macrophage (SPP1), and extracellular matrix (ECM) remodelling (ADAM15) were significantly under-expressed, whereas ligand and receptor genes related to immune recruitment (CCL5, CXCL9), costimulation (CD27, CD48), activation (CD247, IL7R and IL2RB/G) and inhibition (TIGIT) were significantly overexpressed. These data are indicative of TME-2 being an immune-recruiting and activation-competent niche with concurrent inhibitory signalling. Comparatively to TME-1 (T-cell low), only receptor genes were overexpressed in TME-3 (immune-hot). These were related to immune trafficking (CCR5), effector presence (CD8A, CD247), cytokine response (IL7R, IL2RB/G) and checkpoints (CTLA4, TIGIT and CD96). Versus TME-2 (T-cell infiltrating, Supplementary Fig. 5), receptors relating to cell polarity/junction formation (PARD3), and ligands to ECM remodelling (ADAM15, SPINT1) and Wnt-antagonism (DKK1), were significantly overexpressed in TME-3 (immune-hot). Taken together, the TME of immune-warm states appear transcriptionally wired toward more adaptive lymphoid responses (TME-2) or immunoregulatory, stromal-remodelling niches (TME-3).
Fig. 3: Outlining TME and EPI immune states.
a Stacked barcharts of the proportion of each TME (tumour microenvironment) and EPI (epithelial) state across the Oncotype Dx RS. Bottommost text outlines the result of an omnibus Chi-squared test for the association of Oncotype Dx RS (recurrence score) categories across TME and EPI states. b Confusion matrix encoding the frequency and percentage of patient-level overlap between TME and EPI states. c, d Volcano plots of differentially expressed ligand-receptor genes derived from the OmniPath database. Δ-z is the difference in mean z-scored expression between clusters. Dashed lines denote Δ-z thresholds and p-thresholds derived from two-sided empirical Bayes moderated t-tests from Limma, encoded by blue point colour. Those passing adjustments for multiple comparisons with the Benjamini–Hochberg method are in red (FDR [false discovery rate] <0.05), for c TME, and d EPI states. Bar charts showing significantly up- and down-regulated hallmark pathways across e TME and f EPI states, adjusted for multiple comparisons with the Benjamini–Hochberg method FDR < 0.05. NES normalised enrichment score, FC fold change, NS non-significant.
We also recognised patterns in the transcriptomic landscape of epithelia (Fig. 3d). EPI-3 (immune-hot) states significantly upregulated ligand genes related to chemokines and recruitment (CXCL9, CCL5 and CX3CL1), inflammatory regulation (TNF, IL1RN and C1QA) and cytotoxicity effectors (GZMB), indicating a more chemokine-active, inflamed interface versus both EPI-1 and EPI-2 states. In EPI-3, overexpressed receptor genes were likewise distinguished by immune trafficking (ITGAL, ITGB2 and CXCR4), interferon response (IFNAR2) and T-cell signalling (CD3G, CD247, IL2RA, IL2RG and IL15RA). Notably, in EPI-3 we also saw a consistent and significant underexpression of ESR1 (encoding ERα), suggesting that relatively immune-inflamed epithelia may be marked by both immune signalling, chemotaxis and de-differentiation from luminal-like biology. These data were further confirmed in GSEA analysis, wherein TME-3 (Fig. 3e) and EPI-3 (Fig. 3f) exhibited significant upregulation of inflammation-related, and downregulation of estrogen response early hallmarks. Immune-hot epithelia therefore appear as chemokine-active and IFN-responsive, with evidence of immune trafficking and signalling transcripts. Simultaneously, the inflamed stromal compartment is enriched for trafficking yet also checkpoints (CTLA4, TIGIT and CD96) and overexpresses additional remodelling-associated programmes. As TME-3 and EPI-3 are moderately coupled, this reveals a greater immune milieu where pathways related to immune activity and engagement within epithelia occur predominantly within a constrained, suppressive stromal environment.
We hypothesised that the presence of constrained stroma and chemokine-upregulated epithelia may provide useful prognostic information across the cohort. We subsequently explored the prognostic potential of immune states across genomic risk categories, using the proteomic data set (n = 440). We found no TME type or EPI type to be independently prognostic across any Oncotype Dx RS category overall (Supplementary Fig. 6). However, across randomised treatment arms of the Intermediate RS we found EPI-3 (immune-hot) to be an indicator of significantly poorer outcome on chemoendocrine therapy (Supplementary Fig. 6 HR: 4.09, 95% CI: 1.45–11.53, p = 0.008, FDR < 0.10). Interestingly, EPI-3 patients with an Intermediate RS trended to improved outcome when receiving randomised endocrine therapy alone (Supplementary Fig. 7 HR:0.092, 95% CI: 0.01–0.83, p = 0.032, FDR > 0.10).
Cytotoxic T-cell density is a predictive marker in the Intermediate RS
Immune-hot states appear to circumscribe a potentially constrained, T-cell-dysfunctional stromal-immune hub (TME-3). This hub is moderately coupled to epithelia primed for immune chemotaxis and T-cell signalling (EPI-3), though we saw Intermediate RS patients with this EPI state had significantly poorer outcomes on chemotherapy. We next investigated whether any individual immune variable could recapitulate the finding of poorer outcome on chemotherapy seen in patients with an EPI-3 state. We therefore modelled continuous immune counts and densities across the Oncotype Dx RS, using the proteomic data set (n = 440). We found that an accumulation of CD8+ cells in epithelia trended toward poorer iDFS in the High RS (Supplementary Fig. 8. iTIL CD8% HR:1.53, 95% CI: 1.02–2.3, p = 0.041, FDR > 0.10). sTIL density of CD4+ and CD8+ cells, however, were both significantly prognostic (Supplementary Fig. 8. sTIL CD4+ density HR:1.04, 95% CI: 1.01–1.08, p = 0.010, FDR < 0.10. sTIL CD8+ density HR: 1.15, 95% CI: 1.02–1.29, p = 0.016, FDR < 0.10), suggesting that a greater T-cell infiltrate may parse tumours at greater risk of recurrence on chemotherapy. We also recognised trends to poorer prognosis in Low RS patients with high macrophage (CD68+) infiltrates (Supplementary Fig. 8. TME CD68% HR:1.08, 95% CI: 1.06–1.16, p = 0.032, FDR > 0.10), though we saw no prognostic trends in the Intermediate RS overall.
However, for patients with an Intermediate RS, the underlying association of immune density with outcome could be modified and disguised by the type of treatment regimen administered. We therefore investigated by randomised treatment arm (Fig. 4). This analysis (Fig. 4a) revealed that patients receiving chemotherapy had significantly poorer 15-year iDFS if their tumours contained e.g. greater iTIL cytotoxic T-cell % (HR: 2.10, 95% CI: 1.22–1.3.64, p = 0.008, FDR < 0.10) or higher sTIL cytotoxic T-cell density (HR: 1.18, 95% CI: 1.03–1.37, p = 0.020, FDR < 0.10). This was likewise observed for sTIL T-helper cell density (HR: 1.08, 95% CI: 1.00–1.15, p = 0.034, FDR < 0.10) and iTIL T-helper cell density (HR: 1.19, 95% CI: 1.04–1.37, p = 0.012, FDR < 0.10). Using a median cutoff as an illustration for survival analysis, high sTIL CD8+ density (>median, ~30 cells/mm2) had significantly poorer survival than low density in the chemoendocrine arm (Fig. 4b. HR:0.26, p = 0.0057, 15-year iDFS high: 68% vs. 91.3% low), likewise for high iTIL CD8+ % (>median, 0.2%) (Fig. 4b HR:0.20 p = 0.0017, 15-year iDFS high: 68.2% vs. 91.2% low). Moreover, both high stromal density and epithelial count of cytotoxic T-cells provided significant additional prognostic information to nested models of clinical covariates (Fig. 4b). Investigating the prognostic power versus EPI states revealed that neither sTIL CD8+ density nor iTIL CD8% provided additional prognostic information to EPI states and vice versa (Supplementary Table 9). This importantly highlights substantial overlap in the prognostic signal found in EPI-3 and CD8+ density/percentage within the Intermediate RS. Together, these data underscore the prognostic utility of immune phenotyping above canonical risk variables in the intermediate RS and show that these data cannot be fully captured by spatially agnostic bulk phenotype scoring.
Fig. 4: Immune phenotyping in the Intermediate RS.
a Forest plot of multivariable Cox models across randomised treatment arms (HT n = 123, HT + CT n = 118), for immune densities/proportions (per step in the label) with clinical covariates (age, menopausal status, histological grade, tumour size and luminal status). Each row centre point encodes the hazard ratio (HR), and bars represent the 95% CI of the immune variable taken from multivariable models, independently of the other variables shown. White points denote p < 0.05, gold points denote variables passing adjustment for multiple comparison with a Bejamini–Hochberg FDR [false discovery rate] <0.10. b Kaplan–Meier curves for two exemplar biomarkers, dichotomised at the cohort median (left). Right panels show paired bar plots of model performance (ΔLR χ² [likelihood ratio-chi squared] and ΔC-index). Light green denotes the baseline clinical variable, dark green denotes bivariable models. Shaded regions show 95% CI (confidence interval). c, d Inverse-analysis Kaplan–Meier curves for the same biomarkers, stratified by received adjuvant treatment (HT = endocrine therapy, HT + CT = chemoendocrine). Annotations report the unadjusted Cox model HR with shaded regions representing the 95% CI and estimated 15-year invasive disease-free survival (iDFS) for each strata. e, f Predicted difference in 15-year iDFS (HT + CT − HT) from a spline-interaction Cox model as a function of the continuous biomarker (bottom), with a histogram and density curve of the biomarker distribution (percentage: bins 0.5%. Density: bins 50 cells/mm2) (top). Shaded regions show 95% CI. Curves below the horizontal line indicate poorer outcomes on chemoendocrine therapy. Adjusted interaction HR and 95% CI come from the rescaled Wald test values. P-values are derived from the likelihood-ratio test of interaction, q-values show FDR adjustment using the Benjamini–Hochberg method. g Stacked bar plots showing distributions of menopausal status and received treatment under trial criteria, current guidelines and the proposed new biomarker. Labels indicate patient counts. h Sankey diagram illustrating potential treatment reclassification from trial criteria to proposed biomarker, with flows coloured by treatment and biomarker strata.
To further explore the possible differences in outcome by received treatment when scoring cytotoxic T-cells, we subsequently performed an inverse analysis (Fig. 4c, d). We divided the Intermediate RS into high vs low by median iTIL CD8% (Fig. 4c) and sTIL CD8+ density (Fig. 4d), and investigated the 15-year iDFS difference by received treatment. Low density CD8+ patient survival was not significantly different when receiving either endocrine therapy alone or additional chemotherapy (Fig. 4c, iTIL% p = 0.36. Figure 4d, sTIL density p = 0.96). Those with high iTIL CD8% (Fig. 4c) and high sTIL CD8+ density (Fig. 4d), however, had significantly poorer outcomes when receiving additional chemotherapy (Fig. 4c, iTIL CD8% p = 0.0074, 15-year iDFS HTCT: 69.8% vs. 90.1% HT-only. Figure 4D, sTIL CD8+ density p = 0.025, 15-year iDFS HTCT: 68% vs. 86.7% HT-only). Interaction plots (Fig. 4e, f), adjusted for clinical covariates, demonstrated that additional chemotherapy did not confer a significant benefit over endocrine therapy alone in our cohort when iTIL CD8% increased continuously (Fig. 4e, ΔLR χ²: 4.59, p = 0.032, q = 0.032. HR:4.35, 95% CI: 0.79–0.037). We saw this likewise with sTIL CD8+ density (Fig. 4f, ΔLR χ²: 7.03, p = 0.008, q = 0.016. HR:2.06, 95% CI: 0.95–4.48). Interaction plot curves shifted toward a poorer outcome on chemoendocrine therapy above median sTIL CD8+ density, though 95%CI excluding 1 was only seen beyond the ~70th percentile (~200 cells/mm2). The confidence intervals being wide across much of both biomarker ranges is likely due to the relatively small size of the Intermediate RS in our cohort. Thus, these results should be interpreted with caution and require external validation in a larger cohort of chemotherapy-randomised patients with an Intermediate RS before optimum thresholds can be generalised to practice.
As patients dropped out of analyses due to insufficient tissue, presence of artefactual cores, or withdrawal from the study, we next examined the proportion of clinical variables across dropout subsets, finding no significant difference in any strata (Supplementary Fig. 10). Secondly, as menopausal status is the current treatment stratifier for patients with an Intermediate RS (RS 16–25), we next wanted to examine the potential treatment change when using sTIL CD8+ density over menopausal status. Current clinical guidelines suggest that pre-/peri-menopausal patients may benefit from chemoendocrine therapy (RS 16–25), whereas postmenopausal women would not significantly benefit (RS 0–25) (Fig. 4g: current guidelines). Since our cohort was derived from patients previously enrolled in TAILORx, before these guidelines were updated, a large number of pre- and post-menopausal patients received either therapy (Fig. 4g: trial). Using high sTIL CD8+ density (>median) as a predictor of response, we see a potential treatment change of 50% (n = 56/111) of postmenopausal women (RS 16–25) from endocrine therapy alone to chemoendocrine therapy. Of pre-/peri-menopausal women (RS 16–25), 49% (n = 36/73) would have a change in treatment from chemoendocrine therapy to endocrine therapy alone. While the distribution is similar to the working trial cohort (Fig. 4g: trial, new), the Sankey diagram (Fig. 4h) demonstrates that a significant proportion of those patients would experience a change in treatment despite a similar split in menopausal status within treatment strata to the trial distribution. Moreover, testing this effect across subsets of menopausal status and randomised treatment groups, we found high sTIL CD8+ density was only statistically significantly prognostic in postmenopausal patients (Supplementary Fig. 11, premenopausal: low sTIL CD8 density HR:0.27, 95% CI: 0.05–1.41, p = 0.097, postmenopausal: low sTIL CD8 density HR:0.27, 95% CI: 0.07–1.0, p = 0.035), likely due to few reduced events after stratification in the premenopausal group, as the directionality of the effect was consistent across menopausal statuses.
Internal orthogonal validation on whole tissue resections
To mitigate potential TMA sampling bias and in lieu of an accessible validation cohort with similar 1:1 randomisation of the Intermediate RS, we next performed an internal orthogonal validation of cytotoxic T-cell density on whole resection specimens (WSIs) only within the randomised Intermediate RS group (n = 151) (Fig. 5a, b). Using an illustrative cut-off of median sTIL CD8+ density (median ~125 cells/mm2) showed significantly poorer iDFS of patients receiving additional chemoendocrine therapy (p = 0.0062, 15-year iDFS HTCT: 61.8% vs. 89% HT-only, Fig. 5c). We crucially discovered a treatment-biomarker interaction, indicating poorer outcomes of Intermediate RS patients prescribed additional chemotherapy as sTIL CD8+ density increased (adjusted for clinical covariates. Up to −18% survival difference at 15-years. ΔLR-χ2: 8.90, p = 0.0029. HR:1.45, 95% CI: 1.01–2.05) (Fig. 5d). Absolute survival differences (ASDs) across prespecified percentiles (10th–90th, Fig. 5e) showed a monotonic trend toward no benefit from chemotherapy with high (>50th percentile) WSI sTIL CD8+ density. At the 90th percentile of density, a 3.9% (95 CI: 0.6–15.8%, p = 0.008, FDR < 0.10), 9.0% (95% CI: 2.23–29.9%, p = 0.006, FDR < 0.10) and 15.3% (95% CI: 3.66–48.5%, p = 0.006, FDR < 0.10) higher iDFS was observed at 5 years, 10 years and 15 years, respectively, for patients receiving endocrine therapy only (Fig. 5e).
Fig. 5: Orthogonal validation of the proposed biomarker.
a Example CD8-IHC WSI (whole-slide image), from the CD8 WSI dataset (n = 437). Scale bar shows 2 mm. Yellow inset shows a region of interest (ROI). b ROI panels: raw image (top left), positive cell detection mask (top right), epithelia segmentation mask (bottom left), and CD8 density by nearest-neighbour (NN) of 50 µm (bottom right). Scale bars show 100 µm. c Kaplan–Meier curves of stromal CD8 density stratified by received adjuvant therapy (HT = endocrine therapy, HT + CT = chemoendocrine therapy) and dichotomised at the cohort median. Shaded regions show 95% CI. Survival curves are unadjusted. d Predicted difference in 15-year invasive disease-free survival (iDFS) (HT + CT − HT) from the spline-interaction Cox model as a function of continuous stromal CD8 density (bottom), adjusted for all clinical covariates (age, menopausal status, histological grade, tumour size, luminal status), with a histogram and density curve of the biomarker distribution (top). Shaded areas correspond to 95% confidence interval (CI). P-values are taken from the likelihood ratio test of interaction (ΔLR χ²). Adjusted interaction hazard ratio (HR) and 95% CI come from the rescaled Wald test values. e Absolute survival difference (HT − HT + CT) for biomarker deciles (d) across 5, 10 and 15 years (point colour encodes the time horizon). Points encode the survival difference, error bars show 95% CI (bootstrap, 1000 resamples). Colour opacity shows significance. HT n = 80, HT + CT n = 72. f Difference in restricted mean survival time at 15 years (Δ-RMST, HT − HT + CT) at the same biomarker percentiles. Points encode mean survival difference, error bars show 95% CI (bootstrap, 1000 resamples). Colour opacity shows significance. HT n = 80, HT + CT n = 72. g Decision curve analysis comparing treatment policies across minimum chemotherapy harm thresholds. Curves show standardised benefit (higher is better). Teal lines show the proposed biomarker policy, purple lines show the menopausal status policy. Red and green lines show treat-all (red) versus treat-none (green) policies. Shaded areas show 95% CI (bootstrap, 1000 iterations). For interpretation, at a chosen threshold (e.g. 5%), the vertical distance between curves is the gain in net benefit, approximating the additional correct treatment decisions per 100 patients compared with the alternative policies.
Over a 15-year horizon, the delta in restricted mean survival time was up to 12 months in favour of endocrine therapy-only (90th percentile, 95% CI: 2.16–37.4, p = 0.008, FDR < 0.10. Fig. 5F). Decision curve analysis demonstrated that sTIL CD8+ density provided higher net benefit than menopausal status across most thresholds. Particularly 0–3% and 8–30% (bootstrap p = 0.004, FDR < 0.10, p = 0.006, FDR < 0.10, respectively). To contextualise, a harm threshold of 15% indicates an intention to treat when the probability of recurrence is at least 15%, otherwise spare chemotherapy. Using sTIL CD8+ density at this threshold yielded a net benefit corresponding to 6 more patients per 100 correctly managed over the 15-year period compared with menopausal status (95% CI: 2.65–8.62, p = 0.002, FDR < 0.10). Compared with a chemo-for-all policy, sTIL CD8+ density yielded 11 more per 100 correctly managed (95% CI: 7.25–14.19, p = 0.002, FDR < 0.10). While these results require external validation to increase discriminatory power at lower percentiles, our data suggest that sTIL CD8+ density is a clinically translatable predictive biomarker that may help identify Intermediate RS patients (RS 16–25) more likely to be harmed by, or less likely to benefit from, additive chemotherapy regimens. Crucially, this data is independent of menopausal status. Though these data require extensive external validation, our data suggest this could result in up to 50% of all Intermediate RS patients (RS 16–25) experiencing a change in adjuvant treatment from the existing paradigm (49% pre-/peri-menopausal de-escalated from, and 50% postmenopausal escalated to, chemotherapy).
Spatial and transcriptomic signatures link immune states to exclusion, checkpoint activity and T-cell exhaustion
Having previously defined discrete immune ecologies in the TME and epithelium, we next wanted to better understand the transcriptomic and spatial landscape accompanying EPI and TME states, and whether these are associated with features that could plausibly influence adjuvant chemotherapy benefit. Using the matched proteomic-transcriptomic dataset (n = 359), we firstly modelled transcriptional immune programmes as a function of immune densities, spatial proximities and immune states, by fitting regression models with module scores as outcomes. This was performed for cytotoxicity (GZMA, GZMB, IFNG, PRF1 and TCF7), checkpoints (TIGIT, PDCD1, LAG3, PDCD1LG2, CTLA4, CD274 and VSIR) and exhaustion (TOX, CXCL13 and HAVCR2) modules, to find which variable most significantly contributed to each gene programme (Fig. 6a).
Fig. 6: Spatial-immune context and gene readouts.
a Forest plot of regression coefficients for predicting gene module scores within TME ([tumour microenvironment] left, patient n = 359), with bar charts of partial R2 variance (i.e. proportion of variance explained, right. Error bars show 95% confidence interval [CI] from 5000 bootstrapped resamples. Gene modules are defined as the mean expression of constituent genes in the TME/stroma: cytotoxic = GZMA, GZMB, PRF1, IFNG. Checkpoints = TIGIT, LAG3, CTLA4, CD274, PDCD1, PDCD1LG2, VSIR. Exhaustion = TOX, CXCL13, HAVCR2. All predictors were z-scored. Models include stromal immune cell densities and proximity metrics, with TME/EPI (epithelial) state and Oncotype Dx RS as covariates, and were adjusted for variables except the predictor of interest. Points show the estimated regression coefficient with error bars denoting the 95% CI. White points indicate p < 0.05, gold points and bold labels denote FDR (false discovery rate) <0.10. Raw p-values correspond to type II ANOVA tests for each term in the model. b Boxplots showing phenotype distances to a tumour mask (µm), stratified by TME states 1–4. Patient TME-1 n = 161, TME-2 n = 100, TME-3 n = 125, TME-4 n = 51. Box plots show the median (centre line), with the box spanning the interquartile range (25th–75th percentiles). Whiskers extend to the most extreme values within 1.5 × the interquartile range from the box. Outliers beyond the whiskers were not displayed. Asterisks denote p-values of two-sided Dunn’s test for pairwise comparisons. Opacity encodes significant differences. Significance is as follows: p < 0.05 (*), 0.01 ≥ p > 0.001 (**), p ≤ 0.001 (***). Exact p-values are: T-helper cell TME-1 vs. TME-2 p = 0.00747, TME-2 vs. TME-4 p = 0.00747. Cytotoxic T-cell TME-1 vs. TME-4 p = 0.031, TME-2 vs. TME-4 p = 0.00162, TME-3 vs. TME-4 p = 0.00162. Macrophage TME-1 vs. TME-2 p = 0.00747, TME-2 vs. TME-4 p = 0.0747. B-cells TME-1 vs. TME-2 p = 0.0398, TME-1 vs. TME-3 p = 0.00131, TME-3 vs. TME-4 p = 0.0398. c Dot and whisker plots showing stromal-to-epithelial coupling of immune infiltration across TME states. Dotted line indicates proportional coupling. TME-4 was dropped, as the relative lack of immune cellularity resulted in unstable slope estimates. d Illustrative examples of immune infiltrate in TME states, from spatial proteomics analyses, showing PanCK+ tumour epithelium (magenta), CD68+ macrophages (red), CD4 + FOXP3+ T-regulatory cells (green-orange), CD8+ cytotoxic T-cells (yellow), CD4+ T-helper cells (green), CD20+ B-cells (white). e–h Heatmaps of regression slopes for immune module genes by cell phenotype density (Δ-gene z-score, i.e. for each gene, how much does its expression change as the predictor variable density increases by 50 cells/mm2, adjusting for other stromal/epithelial-immune densities). Black outlines show genes passing a discovery FDR < 0.10. e Increasing epithelial (iTIL) cytotoxic T-cell density. f, g Increasing TME (stromal) macrophage density. h Increasing TME (sTIL) cytotoxic T-cell density.
We found TME and EPI states to not be significantly associated with any immune programme, though trends of increased cytotoxicity and exhaustion in TME, and checkpoints in EPI states, were seen in all but the immune-cold (i.e. TME-4, EPI-1) state (Fig. 6a). On the contrary, we found that increasing stromal macrophage density (+1 SD, ~220 cells/mm2) showed a significant negative trend for exhaustion module (β = −0.65, 95% CI: −1.08 to −0.21, p = 0.0045, FDR < 0.10), and cytotoxicity module genes (β = −0.74, 95% CI: −1.16 to −0.33, p < 0.0001, FDR < 0.10), with trends observed in the checkpoint module (β = −0.58, 95% CI: −1.02 to −0.13, p = 0.011, FDR > 0.10). This negative association is consistent with an immunomodulatory macrophage context. Moreover, spatial proximity of cell phenotypes revealed that T-helper and macrophage distances, particularly from B-cells (CD20+) and cytotoxic T-cells (CD8+), provided additional information on gene module activity to densities and TME/EPI states. As example, increasing macrophage separation from cytotoxic T-cells was associated with increased cytotoxicity (Fig. 6a. β = 5.34, 95% CI: 2.46–8.22, p < 0.001, FDR < 0.10) and exhaustion module activity (Fig. 6a, β = 4.90, 95% CI: 1.85–7.93, p < 0.001, FDR < 0.10), again consistent with an immunomodulatory role of macrophages across our cohort. Similarly, increasing T-helper cell separation from cytotoxic T-cells was associated with a marked decrease in cytotoxicity module activity (Fig.6A, β = −5.87, 95% CI: −8.78 to −2.96, p < 0.001, FDR < 0.10), but also in the exhaustion module (Fig. 6A, β = −5.37, 95% CI: −8.44 to −2.30, p < 0.001, FDR < 0.10), a finding that is compatible with an immune-engaged niche wherein activation and chronic stimulation co-occur. These data suggest that immune density and spatial organisation provide granularity beyond TME/EPI states in explaining the immune transcriptional programmes active within stromal regions.
To explore spatial organisation further, we computed at the patient level the signed distance of each cell from an epithelial (PanCK+) tumour mask in samples stratified by their assigned TME state (negative distances indicate cells within a tumour mask). We observed systematic differences across TME states for all phenotypes except T-regs (Fig. 6b, T-reg Kruskal–Wallis p = 0.989, other p < 0.01). For macrophages, cytotoxic T-cells and T-helper cells, TME-2 (T-cell infiltrating) tended to exhibit the largest median distance from a tumour boundary than other TME states (Fig. 6b), whereas in TME-3 (immune-hot) and TME-1 (T-cell low), the distances of T-helper cells and macrophages were not significantly different from immune-cold (TME-4) tumours (Fig. 6b). We next investigated the relationship between iTILs and sTILs across all phenotypes and TME states, by modelling iTIL densities as a function of sTIL densities. Across cores, increases in sTIL densities were expectedly strongly associated with increases in iTIL densities (logStroma χ2 = 3063.97, p < 0.0001). However, this relationship differed by TME state (2-way interaction TME state:logStroma: χ2 = 14.59, p = 0.0007), and was further dependent on cell phenotype (3-way interaction TME state:phenotype:logStroma: χ2 = 69.45, p < 0.0001). For each phenotype and TME state, we subsequently plotted the log-log slope (Fig. 6c). This plot shows the relationship in density between iTILs and sTILs as sTIL densities increase: values < 1 indicate that an accrual of sTILs does not translate proportionally to an accrual of iTILs, therefore suggestive of restricted epithelial entry. Patients with TME-2 (T-cell infiltrating) exhibited the highest scaling across all phenotypes, suggesting a greater ability of accruing sTILs to penetrate the epithelia (Fig. 6c). This was most pronounced in T-helper cells (β = 1.08, 95% CI: 0.99–1.17) and macrophages (β:1.01, 95% CI: 0.90–1.12). In immune-hot TME-3, however, stromal-epithelial scaling was consistently sub-proportional (β < 1), the most strongly coupled phenotype being T-regulatory cells (β = 0.91, 95% CI: 0.82–0.99). We also observed the lowest scaling relationship in TME-3 of T-helper cells (β = 0.95, 95% CI: 0.86–1.03), and cytotoxic T-cells (β = 0.87, 95% CI: 0.78–0.95) were similarly coupled to TME-1 (T-cell low), suggesting greater similarities in the spatial organisation of TILs between T-cell low (TME-1) and immune-hot (TME-3) states than between immune-hot and T-cell infiltrating milieu (TME-2). Taken together, our data suggest an enrichment of peri-tumoural TILs across TME states, wherein immune-hot and T-cell low environments exhibit more pronounced restriction in epithelial entry of immune infiltrates than those with intermediary infiltration of TILs (Fig. 6d).
To understand whether an increasing density of immune infiltrates could be associated with increasing or decreasing immune responses and exclusionary mechanisms, we next performed a regression analysis for macrophage and cytotoxic T-cell density (steps of 50 cells/mm2), against individual genes used in previous gene modules (Fig. 6e–h), as these density variables were most frequently associated with gene module activity outlined previously (Fig. 6a). In EPI-1 (immune-cold) states, increasing iTIL CD8+ density was associated with significantly upregulated antigen presentation (B2M, CD74, HLA.DRB1 and HLA.DRA), and cytotoxic effector genes (GZMA, GZMB), indicating that, if CD8+ cells infiltrate, they are more likely to reflect active effector engagement in this state. EPI-2 (intermediate infiltrate) was associated with the most significant upregulation of antigen presentation genes, yet cytotoxicity genes were not significantly coupled, implying that immune pressure is not consistently translated to effector killing in this state. EPI-3 (immune-hot) states were likewise associated with strong antigen presentation gene upregulation (CIITA, B2M, CD74, HLA.DRB1 and HLA.DRA), though weaker cytotoxicity (GZMA, PRF1), and significant upregulation of CTLA4 and CXCL13. EPI-3, therefore, likely reflects an environment with adaptive inhibition and chronic stimulation26, further implicating the potential presence of a dysfunctional and constrained anti-tumour immune response.
Within the stroma, increasing macrophage density (Fig. 6f, g) was not coupled to antigen presentation across any TME state (Fig. 6f). In TME-1 (T-cell low), macrophage density was markedly associated with extracellular matrix and fibrosis-associated gene upregulation (COL1A1, COL3A1 and TIMP2), reduced M1-like features (CXCL9, NOS2), and global trends to downregulation in checkpoints (significantly TIGIT and PDCD1 – encoding PD-1), cytotoxicity (significantly GZMA, GZMB), and exhaustion genes (significantly TOX, CXCL13). Increasing macrophage density in state TME-2 (T-cell infiltrating) indicated a more permissive stroma. ECM/fibrosis genes were frequently significantly downregulated (COL1A1, COL3A1, TIMP2 and MMP2), coupled to significant upregulation of cytotoxicity module genes (GZMA, GZMB and PRF1), indicative of more canonical adaptive inflammation. Increasing macrophage density in TME-3 (immune-hot) was, however, associated with M2-like (SPP1) and ECM regulation features (TIMP2; trends in MMP9), and a trend of downregulated cytotoxicity genes. TIMP2 is an inhibitor of matrix metalloproteinases in the TME27, implying that TME-3 stroma exhibits strict control over ECM degradation, and may be more functionally constrained in its immune response by SPP1+ macrophages28. These data mirror our spatial findings that state TME-1 and TME-3 have significantly reduced sTIL-iTIL coupling compared with TME-2.
We subsequently investigated changes in cytotoxicity, checkpoints and exhaustion module genes as sTIL CD8+ density increased across TME states. TME-3 (immune-hot) exhibited significant upregulation of cytotoxicity genes (GZMA, GZMB and PRF1), indicating enhanced killing potential; however, it also significantly upregulated TCF7 (encoding TCF-1, a CD8+ cell transcriptional regulator that maintains a progenitor-exhaustion state29). This suggests the presence of T-exhausted cells with progenitor-like capacity, which are known to be crucial in an optimal response to ICB30. In this vein, TME-3 milieu were also coupled to trending increases in most checkpoints (LAG3, TIGIT, PDCD1, PDCD1LG2 and CTLA4). We also recognised significant upregulation of VSIR (encoding VISTA), an emerging immune checkpoint shown to suppress T-cell immune responses31,32, the expression of which strongly associates with SPP1+ macrophages, and ICB resistance, in colorectal cancer33.
Altogether, our analyses support a model in early-stage ER+HER2− disease in which inflamed microenvironments exhibit peri-tumoural immune enrichment with relatively excluded epithelia. This arrangement coincides with transcriptional features consistent with adaptive inhibitory signalling, chronic stimulation and ECM remodelling. While these observational data cannot establish causality, we hypothesise that in immune-dense tumours with engaged stromal remodelling and immunoregulatory programmes, the addition of chemotherapy could amplify existing wound-healing and suppressive pathways rather than enhance effective tumour cell killing. This may help to explain the observed pattern in which patients with higher densities of sTIL cytotoxic T-cells have poorer outcomes on chemoendocrine therapy: the existing infiltrate may not be fully capable of responding to and clearing the residual cancer, as the microenvironment may be promoting a dysfunctional/exhausted milieu.

