February 2012
In CAD We Trust? — Researcher Seeks Ways to Build Radiologists’ Comfort With Computer-Aided Detection Software
By David Yeager
Radiology Today
Vol. 13 No. 2 P. 6
We humans may be wired to develop tools that help us do our work more efficiently, but sometimes it takes a while for our understanding to catch up with a tool’s capability. As the Bud Light commercial with the cavemen and the wheel illustrates, a tool is only as good as the people who use it. A 21st-century example of this maxim can be found in the use of computer-aided detection (CAD) with screening mammography.
In July 2011, a study published online in the Journal of the National Cancer Institute found that CAD use during film-screen screening mammography in the United States decreased screening specificity without improving detection rates or prognostic characteristics such as stage, size, or lymph node status in patients with invasive breast cancer. Because the study looked at 1.6 million film-screen mammograms from 684,956 women, it drew the attention of people in the field. But whether CAD’s shortcomings are a technological or a personnel matter is open to debate.
“I think evaluating CAD clinically is difficult, one, because the prevalence [of missed cancers] is low but, two, because to interpret the results, you need to think carefully about how CAD is supposed to work clinically,” says Robert M. Nishikawa, PhD, FAAPM, an associate professor of radiology and the director of the Carl J. Vyborny Translational Laboratory for Breast Imaging Research at the University of Chicago.
Seeing Is Not Necessarily Believing
Nishikawa says several complex factors influence how radiologists interact with CAD. One of the most significant is trust: Because it requires a tremendous amount of time to become proficient at interpreting screening mammograms, let alone interpreting CAD findings, experienced radiologists often view CAD with suspicion. He says the sheer volume of screening mammograms that a radiologist reads combined with the actual number of cancers that are found, with or without CAD, works against any perceived benefit that CAD provides.
In a typical screening population, there will be approximately four cancers per 1,000 women. Of those four, a radiologist may miss one. That’s one cancer in 1,000 that CAD may be able to help a radiologist find. To complicate matters, CAD has a false-positive rate of roughly 0.5 marks per image, which means the average four-image screening mammogram processed with CAD carries two false-positive marks; any individual mammogram could have more or fewer. So for every 1,000 cases, there will be about 2,000 false-positive CAD marks against roughly one mark that finds a cancer the radiologist would otherwise have missed, a ratio that makes it difficult for radiologists to trust the technology.
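That ratio is easy to check. The back-of-the-envelope sketch below, written in Python, works through the same arithmetic; the variable names and rates are only the article’s rough approximations, not figures from any particular study.

    # Approximate rates quoted above; all are rough averages.
    screens = 1000              # screening mammograms read
    cancer_rate = 4 / 1000      # about four cancers per 1,000 women
    miss_rate = 1 / 4           # a radiologist may miss one of those four
    fp_per_image = 0.5          # CAD false-positive marks per image
    images_per_exam = 4         # standard four-view screening exam

    # Cancers CAD could plausibly add vs. the false marks that come with them
    helpful_marks = screens * cancer_rate * miss_rate       # -> 1
    false_marks = screens * images_per_exam * fp_per_image  # -> 2000

    print(f"Potentially helpful CAD marks per {screens} screens: {helpful_marks:.0f}")
    print(f"Expected false-positive CAD marks: {false_marks:.0f}")

Per 1,000 screens, that works out to roughly 2,000 false marks for every one mark that catches an otherwise missed cancer.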
Radiologists often believe CAD makes too many marks on an image and, more importantly, misses suspicious areas. Nishikawa says CAD probably isn’t missing many cancers, but it does sometimes miss areas that get radiologists’ attention. Part of the problem is that CAD isn’t designed to find every questionable area on an image.
“When you’re designing CAD, you’re designing it to find cancer, so you don’t really want it to mark all those other things. That’s sort of the dilemma in developing it. But if you don’t mark those things, the radiologist will think it’s missing cancers and won’t have confidence that it can ever find cancer when it needs to,” says Nishikawa. “If they don’t really pay serious attention to the marks, then I can almost guarantee you they’re not going to derive much benefit because they’re not really using it optimally.”
Improving the Odds
With so many marks to consider, reducing the number of false positives would go a long way toward boosting radiologists’ confidence in CAD. No one wants to spend extra time reviewing images without reasonable assurance that the effort will benefit patients. Nishikawa says CAD vendors are aware of the issue, and improvements are a work in progress.
Another difficulty is that proficiency with CAD, as with screening mammogram interpretation, is achieved only through long hours of practice. Nishikawa thinks CAD training programs, similar to ACR and private courses for reading screening mammograms, may be beneficial to some clinicians. However, he notes that there is no consensus on how best to train radiologists to read screening mammograms and even less agreement about how to do it with CAD.
One solution that could provide immediate, concrete benefits is for facilities to audit all their false-negative screening mammograms. By reviewing these cases, doctors will almost certainly find instances where CAD marked a cancer but the mark was ignored. Such annual reviews would give clinicians an evidence-based measure of CAD’s performance.
“I’m almost certain they’ll find cases that were marked by the computer [in which] they basically ignored the prompt,” says Nishikawa. “So now, if you do that, they’ll get feedback—real clinical feedback—that the computers can actually find things that could help them, if they pay attention to the marks.”
Another development that could allow more efficient use of CAD is an improved computer-human interface. Nishikawa says several versions are possible, but a few interest him in particular. One, called interactive CAD, is currently in development at Radboud University Nijmegen in the Netherlands. Rather than employing CAD as a second reader, the radiologist clicks on any area of interest on an image and receives feedback from the software. If there’s a CAD mark at that spot, the feedback may include details about the lesion, but the radiologist sees the mark only if he or she asks for it; the idea is that it would be used only for borderline cases.
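A minimal sketch of that on-demand model follows, assuming a hypothetical interface in which marks are computed up front but displayed only where the reader clicks; the class and function names are illustrative, not Radboud’s actual software.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class CadMark:
        x: float
        y: float
        note: str  # lesion details the software attaches to the mark

    def feedback_at(marks: list[CadMark], x: float, y: float,
                    radius: float = 25.0) -> Optional[CadMark]:
        """Return the CAD mark near a clicked point, if any.

        Unlike a second reader, nothing is shown unprompted: the
        radiologist sees a mark only by querying a region.
        """
        for mark in marks:
            if (mark.x - x) ** 2 + (mark.y - y) ** 2 <= radius ** 2:
                return mark
        return None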
Similarly, another type of interface can be incorporated into a second reader. If a clinician wants to know more about a mark, he or she can click on it to launch a pop-up window showing images similar to the area of interest, along with data on the likelihood that the lesion is benign or malignant. CAD as a second reader may also be improved by color-coding its marks according to the probability that a suspicious lesion is cancerous.
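One way to picture that color-coding is as a simple mapping from the software’s estimated malignancy probability to a display color. The sketch below is hypothetical; the thresholds and colors are illustrative assumptions, not any vendor’s actual scheme.

    def mark_color(malignancy_probability: float) -> str:
        """Map a CAD mark's estimated probability of malignancy to a color."""
        # Cutoffs here are assumptions for illustration; a real system
        # would calibrate them against clinical outcome data.
        if malignancy_probability >= 0.50:
            return "red"     # high suspicion: demands attention
        if malignancy_probability >= 0.10:
            return "yellow"  # intermediate: worth a second look
        return "green"       # low suspicion: likely benign

    print(mark_color(0.12))  # -> yellow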
Nishikawa says the key to making the best use of CAD lies in examining how clinicians interact with it. It will probably take some time before that is well understood, but, regardless of any significant changes to CAD interfaces or training, he believes screening mammography interpretation with CAD is better than without.
— David Yeager is a freelance writer and editor based in Royersford, Pennsylvania.