But the results are ‘not enough on their own’ to hand over screening to AI.
Breast cancer screenings supported by AI are safe and generally accurate, according to a Swedish study of 80,000 women.
Among the 39,996 participants in the AI-supported group, the cancer detection rate surpassed the lowest acceptable limit for safety (three diagnoses per 1000 participants), without affecting factors like the rate of recalls, false positives, and consensus meetings. AI-supported screening also detected 20% more cancers than screening in the control group.
There were 46,365 screen readings conducted across the experimental group, which resulted in 244 detected cancers and 861 recalls.
By comparison, the 40,024 participants in the control group – who underwent standard double screenings – had 203 screen-detected cancers, 817 recalls, and a total of 83,231 screen readings. The results indicated that AI had a similar detection rate to standard screenings.
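The figures for the two groups can be put on a common footing with simple arithmetic. The sketch below is illustrative only: it uses the counts reported above, and the names (`per_thousand`, `AI_ARM`, `CONTROL_ARM`) are invented for this example rather than taken from the study.

```python
def per_thousand(events: int, participants: int) -> float:
    """Return the number of events per 1000 participants."""
    return 1000 * events / participants


# Counts as reported in the article (names are hypothetical).
AI_ARM = {"participants": 39_996, "cancers": 244, "recalls": 861}
CONTROL_ARM = {"participants": 40_024, "cancers": 203, "recalls": 817}

# Cancer detection rate per 1000 participants in each group.
ai_detection = per_thousand(AI_ARM["cancers"], AI_ARM["participants"])
control_detection = per_thousand(CONTROL_ARM["cancers"], CONTROL_ARM["participants"])

# Recall rate as a percentage of participants in each group.
ai_recall_pct = 100 * AI_ARM["recalls"] / AI_ARM["participants"]
control_recall_pct = 100 * CONTROL_ARM["recalls"] / CONTROL_ARM["participants"]

# Relative difference in the raw number of cancers detected.
relative_increase_pct = 100 * (AI_ARM["cancers"] / CONTROL_ARM["cancers"] - 1)

print(f"Detection per 1000: AI {ai_detection:.1f}, control {control_detection:.1f}")
print(f"Recall rate: AI {ai_recall_pct:.1f}%, control {control_recall_pct:.1f}%")
print(f"Cancers detected, AI vs control: {relative_increase_pct:+.0f}%")
```

On these counts, both detection rates sit above the safety floor of three per 1000, and the recall rates differ by only a fraction of a percentage point.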
AI provided an “examination-based malignancy risk” score out of 10. Scores from 1-7 were categorised as “low risk”, 8-9 were classed as “intermediate risk”, and a score of 10 was “high risk”. Participants who had a score of 10 were given another screening.
AI was used in the study to triage patients into single or double reading groups; risk scores were then provided to radiologists.
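The score bands described above amount to a simple threshold rule. As a minimal sketch, assuming whole-number scores from 1 to 10 (the function name `risk_category` is hypothetical and is not the trial's software):

```python
def risk_category(score: int) -> str:
    """Map a 1-10 malignancy risk score to the band described in the article."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 7:
        return "low risk"
    if score <= 9:
        return "intermediate risk"
    return "high risk"  # only a score of 10 falls in this band


for example_score in (3, 8, 10):
    print(example_score, "->", risk_category(example_score))
```

The sketch covers only the banding; how each band was routed to single or double reading is a matter of trial protocol and is not modelled here.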
One strength of the AI was its ability to identify and prioritise those at risk of cancer. Because only 30% of women had a score over 8, this has strong implications for reducing the workload on clinicians. The report found a reduction of 44.3% in workload.
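The 44.3% figure is consistent with the screen-reading totals quoted earlier for the two groups. The check below assumes that "workload" means the total number of screen readings, which the article does not state outright; the helper name `percent_reduction` is invented for this example.

```python
def percent_reduction(baseline: float, observed: float) -> float:
    """Return the percentage drop from baseline to observed."""
    return 100 * (baseline - observed) / baseline


# Screen-reading totals as reported in the article.
control_readings = 83_231  # standard double screenings
ai_readings = 46_365       # AI-supported screenings

workload_reduction = percent_reduction(control_readings, ai_readings)
print(f"Reduction in screen readings: {workload_reduction:.1f}%")
```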
“Cancer prevalence increases sharply in the group with a risk score of 10, and retrospective studies using the same AI version as in this trial have reported 87-90% of screen detected cancers and 45% of interval cancers to be in this group,” the researchers wrote.
Further, “contrary to expectations”, the number of screenings that led to resource-intensive consensus meetings – where inconclusive scans are reassessed by two radiologists – was not affected by AI.
The researchers suggested that in countries where double screenings are standard practice, AI could replace one of the readers.
Additionally, AI could be used to “force examinations with high AI risk scores to a consensus meeting or to arbitration, or to automatically recall cases above a specific threshold”.
However, the increased number of in situ cancers – abnormal cells that may develop into cancer – detected may give rise to overdiagnosis. That said, the researchers stated that the performance of AI is “improved” compared to conventional computer-aided diagnosis (CAD), citing a study which found CAD showed a 34% increase in the diagnosis of in situ cancers but “without a parallel increase in the detection of invasive cancer”.
But the researchers cautioned against over-reliance on AI, saying it could “cause an increased risk of detrimental automation bias over time”.
They underlined the importance of the radiologist having the final say, which “reduces false positives” and ensures that “established medicolegal requirements” are met.
Additionally, they emphasised the potential of AI to improve detection rates of interval cancers at routine screenings, which will be properly assessed in a two-year follow-up (December 2024). Interval cancers are cancers diagnosed in between scheduled screenings, and typically have a worse prognosis than screen-detected cancers.
“We … need to investigate whether the higher detection of small invasive cancers will lead to a subsequent reduction of prognostically significant cancers and whether the frequency of in situ cancers detected will be reduced at subsequent screenings.
“These promising interim safety results should be used to inform new trials and programme-based evaluations to address the pronounced radiologist shortage in many countries. But they are not enough on their own to confirm that AI is ready to be implemented in mammography screening,” cautioned lead author Dr Kristina Lang from Lund University, Sweden.
“We still need to understand the implications on patients’ outcomes, especially whether combining radiologists’ expertise with AI can help detect interval cancers that are often missed by traditional screening, as well as the cost-effectiveness of the technology.”
The authors also acknowledged the limitations of the study – that it only measured patients at one centre, and that it was “limited to the combination of one type of mammography device and one AI system”. Further, the results may not be generalisable to less skilled radiologists, as the radiologists in the trial were “moderately to highly experienced” in breast imaging and were given high levels of discretion.