Imaging Informatics: New Generation
By Keith Loria
Radiology Today
Vol. 26 No. 1 P. 10

Generative AI’s Role in Radiology

Generative AI (GenAI) is a growing subset of AI that focuses on creating new content or data that resembles existing data. It relies on algorithms, particularly deep learning techniques such as generative adversarial networks (GANs) and transformers, to generate new images, text, or other forms of data based on training on real-world datasets. The technology is starting to make some noise in the radiology world: it can be used to enhance image quality, facilitate accurate diagnoses, generate synthetic medical images for training purposes, improve segmentation and localization of anomalies, and support clinical decision-making. Josh Miller, CEO of Gradient Health, notes that while there’s some great early work in the space, it’s still far from what it will eventually become.
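For readers who want a concrete picture of the adversarial setup behind GANs, here is a minimal sketch in PyTorch. Everything in it (the toy one-dimensional "data," the layer sizes, the training constants) is an illustrative assumption; real imaging GANs train convolutional networks on pixel data.

```python
# Minimal GAN sketch (illustrative only): a generator learns to mimic a
# toy "real data" distribution while a discriminator learns to tell the
# two apart. All sizes and the data distribution are stand-in assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64  # assumed toy sizes

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),  # logit: real vs. generated
)

bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(1000):
    real = torch.randn(32, data_dim) * 0.5 + 1.0   # toy "real" samples
    fake = generator(torch.randn(32, latent_dim))  # generated samples

    # Discriminator update: label real as 1, generated as 0.
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: try to make the discriminator call fakes real.
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

The adversarial pressure is the whole idea: as the discriminator improves at spotting fakes, the generator is forced to produce samples ever closer to the training distribution.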

“We’re starting to see the seeds of true generative AI be planted, but we’ve got a ways to go before generative AI is trusted to diagnose patients,” he says. “We’re starting to see it be useful in summarization today, but we’re yet to jump into where it’ll really start to shine.”

Novel Capabilities
Bradley J. Erickson, MD, PhD, with the Mayo Clinic’s radiology informatics laboratory, says one widely used form of GenAI creates improved images from noisier input images, whether the noise comes from reduced X-ray dose or reduced acquisition time.

“While this is not what most people think of when they hear ‘GenAI,’ it is actually one important use of GenAI for images,” he says. “GenAI is starting to be used for generating draft reports for radiology exams. The performance is similar to a good resident. I think it is still important to have a radiologist review the images and report, but there does appear to be an efficiency gain in creating this draft report.”
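The low-dose enhancement Erickson describes is, at its core, a supervised image-to-image mapping: train a network on pairs of noisy and clean images so it learns to recover the clean version. A minimal sketch, with a toy convolutional denoiser and a simulated noise model standing in for real paired acquisitions:

```python
# Image-denoising sketch: train a small conv net to map a noisy image
# back to its clean version, the supervised setup behind many low-dose
# enhancement models. Architecture and noise model are illustrative.
import torch
import torch.nn as nn

denoiser = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
mse = nn.MSELoss()

for step in range(500):
    clean = torch.rand(8, 1, 64, 64)               # stand-in for full-dose images
    noisy = clean + 0.1 * torch.randn_like(clean)  # simulated low-dose noise
    loss = mse(denoiser(noisy), clean)             # learn noisy -> clean mapping
    opt.zero_grad(); loss.backward(); opt.step()
```

In production, the "clean" targets would come from full-dose or full-length acquisitions, and the network would be far larger; the training loop itself looks much the same.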

In the last six months, Miller has seen models that look at a chest X-ray and write the radiology report from scratch. There is also a growing interest in generating synthetic images to help address under-represented groups in a training set, with the hope that it will improve AI performance for those groups.

More companies are getting involved in the GenAI space. One of the main values Miller believes these systems will deliver is making free or inexpensive care available to patients who may not otherwise have access.

“We often talk about human vs AI quality in the case of reads, but a well-trained radiologist is not universally available to everyone in the world,” he says. “GenAI systems offer an alternative to those who may not have an alternative for receiving care.”

Miller also sees GenAI playing a role in the triaging of urgent exams. “Prioritizing exams in terms of urgency can bring a lot of value to patients,” he says. “For example, if radiology has a stack of reads to do for the day, moving the patients that are perhaps at risk for a stroke to the top of their list.”

It will also help in the sorting of normal vs abnormal exams. “Quite a bit of time is spent looking at scans without issues, which is arguably a waste of an expert radiologist’s time,” Miller says. “Even if we don’t rely on true AI-based diagnostics, this represents a huge potential for efficiency.”
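Miller’s two workflow ideas, urgency triage and normal/abnormal sorting, both reduce to reordering a worklist by a model score. A schematic sketch in plain Python, where the urgency scores, threshold, and exams are all hypothetical; a real system would take its scores from a regulator-cleared triage model:

```python
# Worklist triage sketch: sort exams by a model-estimated urgency score
# and batch likely-normal studies separately. All data are hypothetical.
worklist = [
    {"exam": "CT head, pt A",  "urgency": 0.92},  # e.g., possible stroke
    {"exam": "CXR, pt B",      "urgency": 0.08},
    {"exam": "CT chest, pt C", "urgency": 0.55},
]

NORMAL_THRESHOLD = 0.10  # assumed cutoff for "likely normal"

prioritized = sorted(worklist, key=lambda e: e["urgency"], reverse=True)
needs_review  = [e for e in prioritized if e["urgency"] >= NORMAL_THRESHOLD]
likely_normal = [e for e in prioritized if e["urgency"] < NORMAL_THRESHOLD]

print("Read first:", [e["exam"] for e in needs_review])
print("Batch for later review:", [e["exam"] for e in likely_normal])
```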

Addressing Setbacks
The main concern about generative methods is hallucinations, which occur when a large language model (LLM) perceives patterns or objects that are nonexistent, creating nonsensical or inaccurate outputs. “This is reduced when there are large data sets,” Erickson says. “But one can also apply uncertainty quantification methods to identify predictions/creations that should not be fully trusted. Multimodal input may also be a way to reduce hallucinations, since there is less chance of correlation between disparate data types.”
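One widely used uncertainty quantification technique of the kind Erickson mentions is Monte Carlo dropout: run the same input through the network repeatedly with dropout left on, and treat the spread of the predictions as an uncertainty signal. A minimal sketch, where the model, input, and flagging threshold are illustrative assumptions:

```python
# Monte Carlo dropout sketch: repeated stochastic forward passes give a
# distribution of predictions; high variance flags outputs that should
# not be fully trusted. Model, input, and threshold are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(32, 1), nn.Sigmoid(),  # e.g., probability of "abnormal"
)

x = torch.randn(1, 64)   # stand-in for extracted image features
model.train()            # keep dropout active at inference time

with torch.no_grad():
    preds = torch.stack([model(x) for _ in range(50)])

mean, std = preds.mean().item(), preds.std().item()
if std > 0.15:           # assumed uncertainty threshold
    print(f"Low confidence (p={mean:.2f} +/- {std:.2f}); route to radiologist")
else:
    print(f"Prediction p={mean:.2f} within tolerance")
```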

Another challenge, Erickson notes, is that methods such as GANs are challenging to train—sometimes they work but sometimes they suffer from mode collapse and fail. “DDPMs [denoising diffusion probabilistic models] seem to be largely immune to this problem, but they require much more computation for both training and, more importantly, for inference,” he says. “Many are working on this challenge but, today at least, it is a problem.”
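The cost gap Erickson points to is visible in the sampling procedure itself: a GAN produces an image in one forward pass, while a DDPM must call its denoising network once per timestep, often a thousand times. A schematic of the standard DDPM reverse loop, where the linear noise schedule is conventional but the network is a placeholder (a real denoiser is a U-Net that also conditions on the timestep):

```python
# DDPM sampling sketch: generating one sample requires T sequential
# calls to the denoising network eps_theta, versus a single generator
# pass for a GAN. Network and sizes are illustrative placeholders.
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)  # standard linear schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

eps_theta = nn.Linear(64, 64)  # placeholder for a trained U-Net denoiser

x = torch.randn(1, 64)  # start from pure noise
with torch.no_grad():
    for t in reversed(range(T)):  # T network evaluations per sample
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        eps = eps_theta(x)        # predicted noise at step t
        x = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) \
            / torch.sqrt(alphas[t]) + torch.sqrt(betas[t]) * z
```

Much current research (distillation, reduced-step samplers) aims at shrinking that loop, which is the computational bottleneck Erickson flags.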

GenAI for medical text also has challenges associated with veracity. Most language models were trained on large text corpora that are not medical in nature. Furthermore, medical text often contains numbers that are critical but not well handled by current LLM technology.

“AI methods in general tend to promote confirmation bias where the radiologist finds that the AI is usually reliable and, if they are not given some sort of signal that an instance of AI output is less trustworthy, they will tend to trust all outputs,” Erickson says. “For that reason, I think uncertainty quantification is going to be a critical element of successful AI adoption.”

Miller says the reliance on GenAI in radiology can lead to diagnostic complacency among radiologists, potentially undermining their expertise and critical reasoning skills. “Training data for foundation models often has a bias towards the most common disease,” he says. “If you train a GenAI on a random large pool of data, most of that pool of data will be common issues by default. This means that less common diseases may be underdiagnosed given the impossibility of giving an AI enough data on rare diseases to recognize it to the same quality as common diseases.”

Setting Standards
GenAI, Miller says, must be held to the same standard as typical medical devices, if not a higher one. “If a GenAI device harms a patient, the liability must clearly fall to the AI company that developed it,” he says. “This helps enforce safety-focused thinking, not just in deployment, but in development. Furthermore, there are a number of reports of physicians using ChatGPT to aid in patient treatment. ChatGPT is definitely not a regulated, approved medical device, so the liability is with the physician for any patient harm that comes from using it.”

Today’s medical records are often difficult to consume efficiently, and if GenAI can summarize medical notes well, Erickson believes radiologists and others may be more motivated to actually see what other clinicians are saying about a patient. “In particular, if the GenAI output can be tuned into what a specific radiologist wants—for example, a neurointerventional radiologist almost certainly wants a different summary than a breast radiologist—the value of these tools will go up and interaction between specialists will also go up,” he says.
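Erickson’s tuning idea maps naturally onto specialty-specific prompt templates. A hypothetical sketch, where the templates and the llm_summarize() call are invented placeholders rather than any real product’s API:

```python
# Specialty-tuned summarization sketch: the same chart notes get a
# different summarization prompt depending on who is reading. The
# templates and llm_summarize() are hypothetical placeholders.
SPECIALTY_FOCUS = {
    "neurointerventional": "prior strokes, anticoagulation, vascular anatomy",
    "breast": "prior breast imaging, biopsies, family history, hormone therapy",
}

def build_prompt(notes: str, specialty: str) -> str:
    focus = SPECIALTY_FOCUS.get(specialty, "key clinical history")
    return (
        f"Summarize these clinical notes for a {specialty} radiologist, "
        f"emphasizing {focus}. Quote numbers exactly as written.\n\n{notes}"
    )

# prompt = build_prompt(chart_notes, "breast")
# summary = llm_summarize(prompt)  # hypothetical LLM call
```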

In Miller’s opinion, GenAI is not ready to independently diagnose patients without some manner of radiologist in the loop, and it’s probably at least a decade away from reliably and legally diagnosing patients without physician or radiologist oversight. “That being said, the risks are plenty, but I view them as more social/political than technical,” Miller says. “Anyone who’s been treated for anything can tell you that a physician’s job goes beyond treatment and into the realm of care, which AI won’t ever be capable of. So call me excited for the long term but somewhat skeptical in the short term.”

While AI is never going to be perfect, neither are humans, and many people in the field feel that if GenAI can reduce error rates, that should be an acceptable condition for clinical use. “I think uncertainty quantification can also help,” Erickson says. “I think that if we can develop a framework for incorporating UQ values into a practice and accept better-than-human system performance—even if not perfect—that should be acceptable from a regulatory perspective.”
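A framework like the one Erickson describes could be as simple as routing each AI output on its UQ value: accept confident drafts, flag borderline ones, withhold the rest. A toy sketch, with thresholds a practice would have to calibrate against its own error rates:

```python
# UQ-gated workflow sketch: route each AI output by its uncertainty
# value. The thresholds and labels are assumed values for illustration.
def route(prediction: str, uncertainty: float) -> str:
    if uncertainty < 0.05:
        return f"auto-draft: {prediction}"  # radiologist signs off
    if uncertainty < 0.20:
        return f"flagged for close review: {prediction}"
    return "withheld: uncertainty too high, read unaided"

print(route("no acute findings", 0.03))
print(route("possible nodule, RLL", 0.17))
```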

Keith Loria is a freelance writer based in Oakton, Virginia. He is a frequent contributor to Radiology Today.