Background
Surgical removal remains the primary curative treatment for early-stage breast cancer, with over 95% of women diagnosed with primary breast cancer receiving some form of surgery. The unique tissue microenvironment and cosmetic challenges of breast surgery call for advanced visualization and quantitative techniques that help surgeons assess each patient’s anatomy before the operation. We therefore developed an artificial intelligence (AI) algorithm (TumorSight Viz, version 1.3) that reconstructs a 3D digital twin of a patient from their dynamic contrast-enhanced MRI (DCE-MRI), enabling visualization and quantitation of patient anatomy prior to surgery.
Methods
We performed a retrospective, multi-institutional clinical trial of the AI algorithm in 267 patients with early-stage breast cancer to assess the accuracy of its landmark identification and measurement compared with radiologist assessment of disease size and location. Specifically, we compared the algorithm’s deviation from ground truth with the deviation among landmark measurements made by 3 US board-certified, fellowship-trained breast radiologists (13-15 years of experience). Briefly, ground truth was established via a multi-case, multi-reader study in which radiologists (1) measured quantities of interest, including the tumor dimension along each of the 3 anatomical axes and the closest approach of disease to the chest, nipple, and skin; and (2) approved tumor regions hand-segmented by trained annotators. Algorithm-generated segmentations were compared directly to ground truth.
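The comparison described above can be illustrated with a minimal Python sketch. It is not the study's implementation. It assumes that tumor segmentations are available as binary NumPy arrays with known voxel spacing, that "deviation" means a mean absolute difference, and that inter-radiologist deviation is averaged over all reader pairs. All function names are hypothetical.

```python
import numpy as np


def axis_extents_cm(mask, spacing_mm):
    """Extent of a binary segmentation along each array axis, in cm.

    Assumes the array axes are aligned with the anatomical axes.
    """
    coords = np.argwhere(mask)
    if coords.size == 0:
        return np.zeros(np.ndim(mask))
    # Number of voxels spanned along each axis (inclusive bounding box).
    span_voxels = coords.max(axis=0) - coords.min(axis=0) + 1
    return span_voxels * np.asarray(spacing_mm, dtype=float) / 10.0


def mean_abs_deviation(measured, reference):
    """Mean absolute difference between paired per-case measurements."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(measured - reference)))


def mean_pairwise_reader_deviation(reader_values):
    """Mean absolute difference averaged over all pairs of readers.

    reader_values has shape (n_readers, n_cases). Pairwise averaging is an
    assumption of this sketch, not a detail stated in the abstract.
    """
    r = np.asarray(reader_values, dtype=float)
    pairs = [(i, j) for i in range(len(r)) for j in range(i + 1, len(r))]
    return float(np.mean([np.mean(np.abs(r[i] - r[j])) for i, j in pairs]))
```

Under these assumptions, the algorithm's `mean_abs_deviation` against ground truth would be read alongside `mean_pairwise_reader_deviation` for the same measurement, which mirrors the comparison reported in the Results.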
Results
The variation between the AI algorithm and ground-truth measurements was on the same order of magnitude as the variation between independent radiologists for all landmark measurements (Table). Of particular interest, the AI algorithm had a mean deviation of 1.27 cm in the longest tumor dimension (compared with a mean deviation of 1.02 cm among radiologists), a mean absolute error of 4.13 cc in tumor volume, and a mean surface Dice coefficient of 0.92 (indicating strong spatial agreement between the AI algorithm and radiologists). Algorithm performance was stable across clinical and imaging subgroups, including T stage (T1-T4), histology (hormone receptor [HR]-positive/HER2-, HR-positive/HER2+, HR-negative/HER2+, triple-negative breast cancer), MRI manufacturer (GE, Siemens, Philips), and magnetic field strength (1.5T, 3T). Additionally, the algorithm generated these features in an average of 8 minutes per patient following MRI upload into the cloud-based software.
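For readers unfamiliar with the overlap and volume metrics, the following Python sketch shows how a Dice coefficient and a volume error could be computed from two binary segmentations. It is an illustration under stated assumptions, not the study's code. Note that it computes the standard volumetric Dice coefficient, which is simpler than the surface Dice coefficient reported above (surface Dice compares segmentation boundaries within a distance tolerance). The function names and the voxel-spacing input are hypothetical.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Volumetric Dice overlap between two binary masks (1.0 = identical)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement in this sketch.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def volume_cc(mask, spacing_mm):
    """Segmented volume in cubic centimeters (1 cc = 1000 mm^3)."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return np.asarray(mask, dtype=bool).sum() * voxel_mm3 / 1000.0


def volume_abs_error_cc(pred, truth, spacing_mm):
    """Absolute difference in segmented volume for one case, in cc."""
    return abs(volume_cc(pred, spacing_mm) - volume_cc(truth, spacing_mm))
```

Averaging `volume_abs_error_cc` over all cases would give a mean absolute error in tumor volume of the kind quoted above, assuming both masks share the same voxel grid.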
Conclusions
Taken together, these findings show that the AI algorithm is clinically viable software that robustly identifies landmark features and measurements within the range of inter-radiologist variability. These detailed 3D depictions of tumors offer both qualitative and quantitative assessment of cancer topology and may aid in the management of patient disease.

