Data source
This population-based, retrospective observational study extracted all patient data from the Nationwide Inpatient Sample (NIS) database, the largest all-payer, continuous inpatient care database in the United States (US), which contains data on about 8 million hospital stays each year [16]. The database is administered by the Healthcare Cost and Utilization Project (HCUP) of the Agency for Healthcare Research and Quality (AHRQ) (https://www.hcup-us.ahrq.gov/db/nation/nis/NIS_Introduction_2020.jsp). This annually updated database derives patient data from about 1,050 hospitals in 44 US states, representing a 20% stratified sample of US community hospitals as defined by the American Hospital Association. The data contain primary and secondary diagnoses, primary and secondary procedures, admission and discharge status, patient demographics, expected payment source, duration of hospital stay, and hospital-related characteristics (i.e., bed size, location/teaching status, and hospital region).
Study population
Data on hospitalised older adults aged 60 years or older who were admitted with a CRC diagnosis between 2005 and 2018 were extracted from the NIS database. CRC diagnoses were identified using the International Classification of Diseases, Ninth and Tenth Revisions, Clinical Modification (ICD-9-CM and ICD-10-CM) codes: 153, 154.0, 154.1, V10.05, V10.06, C18-20, Z85.03, and Z85.04. Hospitalisations were eligible if a CRC code appeared in the principal or any secondary diagnosis field. Patients with missing information on key outcomes and variables (i.e., mortality status, length of stay [LOS], discharge destination, total hospital costs, sex, household income, primary payer, admission type, hospital bed size, and race/ethnicity), as well as those missing sample weight values, were excluded. The patients were further grouped into non-metastatic and metastatic CRC (ICD-9-CM: 197-198 or ICD-10-CM: C78-79), and then stratified by the presence or absence of frailty.
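The code-based cohort selection described above can be illustrated with a minimal Python sketch. It is not the study's SAS code. It assumes that diagnosis codes are compared as prefixes with dots removed and that the caller knows whether a record is coded in ICD-9-CM or ICD-10-CM; the function names (normalise, has_any, classify) are hypothetical.

```python
# Code prefixes taken from the study text. Dots are removed because claims
# data commonly store ICD codes without them (an assumption of this sketch).
CRC_ICD9 = ("153", "1540", "1541", "V1005", "V1006")
CRC_ICD10 = ("C18", "C19", "C20", "Z8503", "Z8504")
MET_ICD9 = ("197", "198")
MET_ICD10 = ("C78", "C79")


def normalise(code: str) -> str:
    """Upper-case a code and strip dots and surrounding whitespace."""
    return code.replace(".", "").strip().upper()


def has_any(diagnoses, prefixes) -> bool:
    """True if any diagnosis code starts with one of the given prefixes."""
    return any(normalise(d).startswith(prefixes) for d in diagnoses)


def classify(diagnoses, icd10: bool):
    """Return None (no CRC code), 'non-metastatic', or 'metastatic'.

    `diagnoses` holds the principal and all secondary diagnosis codes of
    one hospitalisation; `icd10` selects the coding system of the record.
    """
    crc = CRC_ICD10 if icd10 else CRC_ICD9
    met = MET_ICD10 if icd10 else MET_ICD9
    if not has_any(diagnoses, crc):
        return None
    return "metastatic" if has_any(diagnoses, met) else "non-metastatic"
```

For example, classify(["153.3", "197.7"], icd10=False) would place an ICD-9-CM record with a colon cancer code and a secondary malignancy code in the metastatic group.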
Study variables
Study outcomes
Primary study outcomes were: (1) in-hospital mortality; (2) prolonged LOS, defined as a LOS above the 75th percentile; (3) discharge to long-term care facilities, defined as discharge to a nursing home or long-term care facility; and (4) total hospital costs.
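As an illustration of the prolonged-LOS definition, the following Python sketch flags stays above the 75th percentile. It is unweighted and uses the "inclusive" quantile method of the standard library; both choices are assumptions made for illustration, and the text does not state how the percentile was computed. The function name flag_prolonged_los is hypothetical.

```python
from statistics import quantiles


def flag_prolonged_los(los_days):
    """Return the 75th percentile of LOS and a per-stay prolonged flag.

    Unweighted illustration only; a survey-weighted percentile may differ.
    """
    # quantiles(..., n=4) returns the three quartile cut points; index 2
    # is the 75th percentile.
    q3 = quantiles(los_days, n=4, method="inclusive")[2]
    return q3, [los > q3 for los in los_days]
```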
Assessment of frailty
Frailty was assessed using the HFRS, a validated claims-based frailty measure derived from a predefined set of ICD diagnostic codes that serve as surrogates for frailty-related conditions, such as volume depletion, chronic pulmonary disease, and heart failure [15]. The HFRS has been widely validated and applied across diverse clinical settings and health systems internationally [17]. Because the NIS is a hospitalisation-level database and does not allow longitudinal linkage across admissions, frailty-related ICD diagnostic codes were identified exclusively from diagnoses recorded during the index hospitalisation. Hospitalisations occurring before October 1, 2015 were coded using ICD-9-CM, whereas those on or after this date used ICD-10-CM, in accordance with the nationwide transition in coding systems.
The HFRS was calculated by applying the original weighting scheme, in which each frailty-related ICD diagnosis code is assigned a predefined weight reflecting its relative contribution to frailty, and the total HFRS is obtained by summing the weights of all eligible diagnosis codes recorded during the index admission. Consistent with prior studies, patients with an HFRS ≥ 5 were classified as frail, whereas those with an HFRS < 5 were classified as non-frail [18]. To ensure methodological transparency and reproducibility, the complete lists of ICD-9-CM and ICD-10-CM diagnostic codes and their corresponding weights used to derive the HFRS are provided in Supplementary Tables 5 and 6.
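The scoring rule can be sketched in Python as a weighted sum followed by a threshold. The weight table below is a placeholder with hypothetical code prefixes and weights, not the HFRS weights themselves, which are listed in Supplementary Tables 5 and 6. The sketch also assumes that each weighted code category is counted at most once per admission; the function names are hypothetical.

```python
FRAILTY_CUTOFF = 5.0  # threshold stated in the text: HFRS >= 5 is frail

# PLACEHOLDER prefixes and weights for illustration only; these are NOT
# real ICD codes or real HFRS weights.
EXAMPLE_WEIGHTS = {"AAA": 3.0, "BBB": 2.5, "CCC": 1.0}


def hfrs(diagnoses, weights):
    """Sum the weights of all weighted code categories present.

    Each category contributes once, even if several recorded diagnoses
    fall under the same prefix (an assumption of this sketch).
    """
    codes = [d.replace(".", "").strip().upper() for d in diagnoses]
    matched = {p for p in weights if any(c.startswith(p) for c in codes)}
    return sum(weights[p] for p in matched)


def is_frail(diagnoses, weights):
    """Classify one index admission as frail (True) or non-frail (False)."""
    return hfrs(diagnoses, weights) >= FRAILTY_CUTOFF
```

With the placeholder table, an admission carrying codes under "AAA" and "BBB" scores 5.5 and is classified as frail, whereas one carrying only "CCC" scores 1.0 and is non-frail.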
Covariates
Data on patients’ demographic characteristics, including age, race/ethnicity, household income, insurance status (primary payer), admission type (elective or emergent), and severity of illness, were extracted from the NIS database. Patients’ clinical characteristics, including obesity, active tobacco use, major comorbidities (ischaemic heart disease, congestive heart failure, diabetes, cerebrovascular disease, chronic pulmonary disease, severe liver disease, moderate-to-severe renal disease, and systemic connective tissue disorders), and severity of comorbidities assessed by the Charlson Comorbidity Index (CCI), were also identified through ICD-9 and ICD-10 codes [19]. The detailed ICD codes used are provided in Supplementary Table 1. Hospital-related characteristics (bed size, location/teaching status, and hospital region) were extracted from the database for all participants, in accordance with other studies that have used NIS data.
To account for treatment-related differences, CRC-directed surgical procedures were identified using ICD-9-CM procedure codes and ICD-10-PCS codes and classified as: (1) open surgery, (2) laparoscopic/minimally invasive surgery, (3) liver resection for metastatic disease, and (4) no CRC-related surgery (i.e., no colorectal surgery or liver metastasectomy performed during the admission). Additionally, the five most common principal diagnosis codes of the hospitalisations were reviewed and are presented in Supplementary Table 7.
Statistical analysis
Because the NIS database is a 20% sample of annual US inpatient admissions, sample weights (TRENDWT for years before 2012 and DISCWT from 2012 onwards), strata (NIS_STRATUM), and clusters (HOSPID) were used to produce national estimates in all analyses. The SURVEY procedures in SAS were used to analyse the sample survey data. Descriptive statistics are presented as numbers (n) and weighted percentages (%), or as means and standard errors (SE). Group comparisons were made using the standardised mean difference (SMD). Propensity-score matching (PSM) was performed with the SAS OneToManyMTCH macro to balance the baseline characteristics of patients with and without frailty. The macro makes the “best” matches first and then proceeds with “next-best” matches until no more can be made. The patient cohort was matched at a case (frail) to control (non-frail) ratio of 1:1 based on variables with an SMD ≥ 0.1 in Supplementary Table 2 (except for HFRS).
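The "best matches first, then next-best" idea can be sketched in Python as greedy 1:1 matching without replacement, in which propensity scores are compared at progressively coarser precision. This is a simplified illustration written for this section, not the OneToManyMTCH macro itself; the function name greedy_match, the max_digits parameter, and the rounding-based comparison are assumptions.

```python
from collections import defaultdict


def greedy_match(case_scores, control_scores, max_digits=8):
    """Greedy 1:1 matching on propensity score without replacement.

    `case_scores` and `control_scores` map patient ids to propensity
    scores. Matching starts at `max_digits` decimal places (the "best"
    matches) and relaxes one digit at a time (the "next-best" matches).
    """
    unmatched_cases = dict(case_scores)
    unmatched_controls = dict(control_scores)
    pairs = []
    for digits in range(max_digits, 0, -1):
        # Group the still-available controls by score rounded to `digits`.
        buckets = defaultdict(list)
        for control_id, score in unmatched_controls.items():
            buckets[round(score, digits)].append(control_id)
        for case_id, score in list(unmatched_cases.items()):
            candidates = buckets.get(round(score, digits))
            if candidates:
                control_id = candidates.pop()  # each control is used once
                pairs.append((case_id, control_id))
                del unmatched_cases[case_id]
                del unmatched_controls[control_id]
    return pairs
```

Cases left without a control at one-decimal precision remain unmatched, which mirrors the statement that matching continues until no more matches can be made.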
Univariate and multivariable logistic regression analyses were used to determine associations between the study variables and binary outcomes, while linear regression was used to assess associations with the continuous outcome (i.e., total hospital costs). In addition, stratified analyses of the impact of frailty on in-hospital mortality were performed according to age, sex, and race/ethnicity. An SMD < 0.1 was considered to indicate a balanced distribution of a variable between groups. Variables with an SMD ≥ 0.1 after PSM were additionally adjusted for in the multivariable model. All p-values were two-sided, and p < 0.05 was considered statistically significant. All statistical analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).
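The balance criterion can be illustrated with a short Python sketch of the SMD for continuous and binary variables. It uses the common pooled-standard-deviation form and ignores the survey weights, so it is an unweighted approximation rather than the exact computation performed in SAS; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def smd_continuous(frail, non_frail):
    """Absolute SMD for a continuous variable (unweighted sketch)."""
    pooled_sd = sqrt((variance(frail) + variance(non_frail)) / 2)
    return abs(mean(frail) - mean(non_frail)) / pooled_sd


def smd_binary(p_frail, p_non_frail):
    """Absolute SMD for a binary variable, given the two proportions."""
    pooled_sd = sqrt(
        (p_frail * (1 - p_frail) + p_non_frail * (1 - p_non_frail)) / 2
    )
    return abs(p_frail - p_non_frail) / pooled_sd


def is_balanced(smd, threshold=0.1):
    """Apply the SMD < 0.1 balance rule stated in the text."""
    return smd < threshold
```

For instance, proportions of 0.30 and 0.28 in the two groups give an SMD of roughly 0.04, which falls below the 0.1 threshold and would be treated as balanced.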

