Radiology’s Future in Big Data

February 2014

Radiology’s Future in Big Data — Radiology Today Interview With Eliot Siegel, MD
Radiology Today
Vol. 15 No. 2 P. 22

Eliot Siegel, MD, is a professor and the vice chair of information systems at the University of Maryland and the chief of imaging for the VA Maryland Health Care System in Baltimore. He was a driving force in helping the Baltimore VA become the first hospital in the world to go filmless and digital throughout the medical enterprise. He has worked with medical institutions, government agencies, and vendors on numerous projects to advance the field of imaging informatics, including applying IBM’s Watson technology to mining radiology data.

Radiology Today recently spoke with him about the present and future of imaging informatics, and this is what he had to say.

Radiology Today (RT): What are some ways that radiology is connecting to the rest of the medical enterprise?

Siegel: The main change in radiology’s interface with the enterprise since the era of film-based imaging has been a transition from one-on-one direct personal communication to remote digital communication made possible by ubiquitous access to images and reports on the PACS and EMR, e-mail, texting, and other forms of electronic messaging.

Ironically, despite greater theoretical access to patient history in this age of Meaningful Use and the emerging electronic medical record, the lack of interpersonal contact with our clinical colleagues has probably resulted in interpretations being rendered with less clinical information.

As things exist currently, the electronic medical record is not easy for most radiologists to access or consume because it is a separate application from the PACS. It is not integrated in a way that lets us look up relevant history for the patient we are reviewing on PACS, either with a single mouse click or automatically. This has resulted in a paradoxical decrease in the interchange of clinical information and radiology interpretation among the respective specialties despite the fact that the data are online and digital.

In addition to access to the electronic medical record, radiologists are increasingly working in health care enterprises in which there is access to other types of images, ranging from digitized medical documents to medical images from cardiology, pathology, ophthalmology, dermatology, GI medicine, and other specialties. This gives radiology the opportunity to become a nexus for medical images in general by sharing our image management systems and our expertise in image acquisition, quality, safety, distribution, analysis, and other areas.

We’re also connecting to the enterprise by making our images and reports available to patients and their health care providers via hospital and imaging facility portals. This brings a greater level of connectivity and relevance of diagnostic imaging studies to patients and their providers. Pioneering efforts such as the NIBIB [National Institute of Biomedical Imaging and Bioengineering]-funded RSNA Image Sharing project provide direct connectivity between the imaging department and a patient’s personal health record.

However, as we become increasingly digitally connected to the health care enterprise in radiology, we need to try to regain or preserve our direct interpersonal connections with our referring clinicians and patients. This may require special efforts to connect with patients when they are in the radiology department and to encourage clinicians to physically come to the imaging department more frequently for direct consultations, teaching sessions, and other relevant discussions.

RT: What do you see as the biggest challenges in imaging informatics today?

Siegel: Two major challenges come to mind. The first involves communication of information, which includes the important concept of true closure of the loop with regard to radiology findings. We radiologists make recommendations all the time that don’t get followed up either by us or by clinicians. Our current information systems in radiology do not allow us to monitor the recommendations that we make and follow up on them. For medico-legal reasons, I want acknowledgment that the clinician has received that information, which is where most people close the loop. But I also believe that we in radiology have a continuing responsibility to our patients to ascertain whether that information ended up being acted upon and whether the imaging findings impacted patient care. I believe that providing this continuum of care and follow-up in a responsible yet practical way that does not constrain efficiency is the most important “big” challenge in informatics.
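Editor's illustration: the gap between "acknowledged" and "acted upon" described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any existing radiology information system. The `Recommendation` record, its field names, and the `open_loops` helper are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Recommendation:
    """One follow-up recommendation from a report (hypothetical schema)."""
    report_id: str
    text: str
    made_on: date
    due_in_days: int                        # time frame stated in the report
    acknowledged_on: Optional[date] = None  # clinician confirmed receipt
    completed_on: Optional[date] = None     # follow-up actually performed

    @property
    def due_date(self) -> date:
        return self.made_on + timedelta(days=self.due_in_days)

    def is_open(self, today: date) -> bool:
        """Overdue and never acted upon, whether or not it was acknowledged."""
        return self.completed_on is None and today > self.due_date


def open_loops(recs: List[Recommendation], today: date) -> List[Recommendation]:
    """Return overdue, unacted-upon recommendations, oldest due date first."""
    return sorted((r for r in recs if r.is_open(today)), key=lambda r: r.due_date)
```

The design choice worth noting is that acknowledgment alone does not close the loop here: a recommendation stays open until a completion date is recorded, which mirrors the distinction drawn in the answer above.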

The second major challenge for imaging informatics is figuring out how to tag and index the incredible amount of data that we routinely acquire in diagnostic imaging. Our counterparts in astronomy, chemistry, and other disciplines have created means of structuring, tagging, and indexing their information in a logical and machine-readable fashion. Our colleagues in cardiology and other disciplines in medicine have found ways to make their results available in a highly structured fashion, which allows them to be discovered and utilized by the algorithms that represent clinical pathways and decision-support systems. In the near future, computers will assist much more than today in ensuring patient safety, minimizing disparity of treatment by different physicians in different areas, and in day-to-day clinical decision support.

In order for us to continue being relevant, we must make our data in diagnostic imaging similarly discoverable by those intelligent algorithms. Mining the incredible wealth of data from radiology reports—both structured and unstructured—and the actual image pixel data is a fascinating challenge that is just beginning to be undertaken. Much of the work that I have done with the NCI [National Cancer Institute] imaging informatics initiative and the IBM Watson team in diagnostic imaging has been in determining ways in which we can tag and index imaging reports and the images themselves.

RT: Decision support is something that people discuss as a way that radiology could provide added value. Is that something that is being done more frequently or is it being done in very limited cases?

Siegel: I think that most of us in imaging informatics had expected that by 2014, radiology would be utilizing data-driven decision-support tools to a substantially greater extent. There has been some great pioneering work in clinical decision support by Keith Dreyer at Massachusetts General, Ramin Khorasani at Brigham and Women’s, and others in the area of feedback on the selection of an appropriate radiology examination. This is designed for decision support for our clinical colleagues much more than for radiologists; decision-support tools for radiologists are in their relative infancy.

However, more and more clinicians are using guidelines and decision-support algorithms to help them make decisions about the treatment of medical conditions such as deep venous thrombosis, as well as decisions about which patients should be admitted from the ER, which patients should be on specific types of chemotherapy, and things of that nature. These are increasingly becoming personalized to account for the specific attributes of a particular patient.

As far as decision support in radiology, the technology exists for decision support, but until we have the capability to mine our own data—whether that’s local data or on a regional or national level or even repurposing data from large clinical trials—then I don’t think we’ll have decision-support tools that are as sophisticated and robust as we really need them to be. It really relates to the question, to what extent is radiology metadata mineable today?

The model that I’d like to see would be that when every report is completed, the reporting system would provide feedback as to its interpretation of what the radiologist is trying to say. This would typically be in the form of two, three, or four bulleted items. It may be that the clinician sees the radiologist’s report as the free text dictated by the radiologist, but the reporting system also generates and stores metadata about the report in a structured format using natural language processing. This might include whether the study is essentially positive or negative for the clinical indication, such as pulmonary embolism. It would include whether there was a recommendation and the time frame for that. It would also include the major diagnostic findings and conclusions in a structured format.

Consequently, for each imaging study, there would be the traditional radiology report that would be sent to the electronic medical record. In addition to that, there would be metadata, which would include a structured version of the major findings, recommendations, and information about the exam, such as contrast injection, and specific CT or MRI sequences utilized. The metadata would be available to clinicians who wanted to review it but would otherwise be stored in a searchable format so that clinical decision-support systems could utilize that information. It would make it relatively unambiguous whether an imaging report was positive or negative for specific findings such as pulmonary embolism or intracranial hemorrhage, for example, to a computer system that needed to know the high-level results of a study.
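Editor's illustration: one way to picture the structured metadata described above is as a small record stored alongside the free-text report. The sketch below is an assumption for illustration only. The `ReportMetadata` class, its field names, and the `positive_studies` helper are hypothetical and do not correspond to any reporting system or standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ReportMetadata:
    """Structured companion to a free-text report (hypothetical schema)."""
    accession: str
    indication: str                    # clinical question, e.g. "pulmonary embolism"
    positive_for_indication: bool      # high-level result of the study
    findings: List[str] = field(default_factory=list)   # major findings
    recommendation: Optional[str] = None
    follow_up_days: Optional[int] = None                 # time frame, if any
    contrast_given: Optional[bool] = None
    sequences: List[str] = field(default_factory=list)  # CT or MRI sequences used

    def to_json(self) -> str:
        """Serialize to a searchable, machine-readable form."""
        return json.dumps(asdict(self), sort_keys=True)


def positive_studies(records: List[ReportMetadata], indication: str) -> List[str]:
    """Accession numbers of studies that were positive for the given indication."""
    return [r.accession for r in records
            if r.indication == indication and r.positive_for_indication]
```

With records like these, a decision-support system could ask "which studies were positive for pulmonary embolism?" without reading any free text, while the clinician still receives the traditional dictated report.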

In the vast majority of cases, our clinical colleagues are not interested in all of the different acquisition parameters for an MRI scan, for example, and those would be included in the metadata but not in the typical imaging report in the EMR. This would also be true of radiation dose information and, in some cases, quantitative data.

You can also imagine, in the future, being able to take advantage of some of the innovative work that’s being done with computer-aided detection, where the computer is able to look through a CT scan of the thorax, for example, and determine not only whether there’s a lung nodule but also hundreds of other parameters, such as whether there’s loss of height of a vertebral body, or to create a quantitative score of the degree of interstitial lung disease in a patient with COPD. There are a lot of quantitative measurements that could be made in an automated or semiautomated way, and they could be stored in the electronic medical record and radiology information systems.

RT: In your estimation, how much radiology metadata are mineable today?

Siegel: I think there’s a tremendous opportunity for improvement in making our metadata mineable for those decision-support and quality and safety software programs. At this point, the information that we have in radiology reports is being accessed mainly by looking up and reading those reports, per se, not by parsing out the information from the radiology reports in a structured, searchable format. So, as far as the metadata that are mineable, those decision-support systems need to be able to obtain quantitative measurement information from automated analysis or from manual measurements made by radiologists.

It is critical for us to be able to tag different findings on the images in a manner analogous to the ways in which people tag their friends on Facebook. For the most part, we’re not able to do that because vendor workstations lack a standard means of image annotation. A standard called AIM (Annotation and Image Mark-Up) was developed by the National Cancer Institute’s caBIG imaging workspace for tagging images and regions of an image using either DICOM structured reporting or XML, but it has not been widely adopted by workstation and PACS vendors. Wider adoption would make it possible to tag imaging findings across different vendors and make human and computer measurements and interpretations more easily “consumable” by computer systems.

Another major challenge with radiology decision-support systems is the lack of quality radiology databases. Most institutions do not collect and organize their imaging data. The few academic institutions that do typically use the data internally and don’t share their deidentified data with other practices. There is, however, a gold mine of clinical trial data sponsored by the ACR Imaging Network, the NCI, and many other funding sources that could be a source of data for decision-support systems. In most cases, though, these data are not available except for approved research purposes.

One of the projects that I’ve been working on with the principal investigators of the National Lung Screening Trial and the ACR and the NIH [National Institutes of Health] has been to make the data from their more than 50,000 patients available for decision-support tools. This has the potential to create a supplement to the Fleischner criteria that would personalize the assessment of lung nodules. Using personalized data available from that screening trial, one can create a decision-support tool that takes into account patient smoking history, nodule shape, patient age, geographic location, and other data in addition to nodule size to create a personalized index of the likelihood that a nodule is benign or malignant. This repurposing of data from a clinical trial designed to evaluate the impact of screening on patient survival shows exciting promise as a model for reusing other large, well-documented imaging datasets.

RT: What about data that have been stored in an unstructured format?

Siegel: Radiology reports utilize a relatively limited vocabulary and even free-text reports are relatively structured in comparison with other unstructured medical data. This has resulted in radiology reports being a favorite source of data for researchers working on natural language processing over the past 25 years.

What I’m really encouraged to see is that speech recognition vendors are beginning to automatically parse out meaning from reports. For example, is this a positive or negative study? Were there recommendations made, yes or no? Are there critical findings, such as the mention of a subarachnoid hemorrhage on a head CT or the mention of a pneumothorax on a chest radiograph or a chest CT? Or is there mention of lung nodules? I’m really pleased and excited to increasingly see tools being developed that allow the computer to do that.
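Editor's illustration: the questions listed above (are there critical findings, lung nodules, or recommendations?) can be pictured with a deliberately simple keyword sketch. This is a toy under stated assumptions, not how any speech recognition vendor's product works. The term list, the negation cues, and the `extract_tags` function are hypothetical, and real natural language processing handles abbreviations, uncertainty, and context far more carefully.

```python
import re
from typing import Dict, List, Union

# Hypothetical, deliberately tiny vocabularies for illustration only.
CRITICAL_TERMS = ["subarachnoid hemorrhage", "pneumothorax"]
NEGATION_CUES = re.compile(r"\b(no|without|negative for)\b", re.IGNORECASE)
RECOMMENDATION_CUES = re.compile(r"\b(recommend\w*|follow[- ]up|suggest\w*)\b",
                                 re.IGNORECASE)


def mentions(term: str, text: str) -> bool:
    """True if some sentence contains the term with no negation cue before it."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        idx = sentence.lower().find(term)
        if idx != -1 and not NEGATION_CUES.search(sentence[:idx]):
            return True
    return False


def extract_tags(report: str) -> Dict[str, Union[bool, List[str]]]:
    """Derive a few coarse, machine-readable tags from a free-text report."""
    return {
        "critical_findings": [t for t in CRITICAL_TERMS if mentions(t, report)],
        "lung_nodule": mentions("nodule", report),
        "recommendation": bool(RECOMMENDATION_CUES.search(report)),
    }
```

Even this crude approach shows why the relatively limited vocabulary of radiology reports makes them attractive for automated parsing: a handful of terms and negation cues already separates "no evidence of pneumothorax" from "small pneumothorax."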

Greater adoption of the AIM standard and the RSNA’s Radiology Lexicon (RadLex) would allow diagnostic imaging data to become more easily interoperable with parallel efforts to utilize natural language processing and structured reporting in clinical practice. Each imaging study would then have associated with it a fair amount of discoverable structured data in addition to the free-text report that the radiologist generates. At that point, radiology becomes a specialty that can participate in Big Data.

So as we talk about looking at genomic data and correlating that with lab data and clinical data, having discoverable radiology data using tags that can be generated and indexed would make radiology much more relevant than it would be otherwise in the era of Big Data.

RT: How do you think medical institutions might use information gained from mining radiology data?

Siegel: There’s an incredible wealth of data in every single study that we do, and we only report on a small subset of it. We might do a CT of the chest to evaluate for pulmonary embolism, and we may report that there’s no evidence of pulmonary embolism and, otherwise, we’re not seeing any active cardiopulmonary disease, for example. But in the image, there’s information about coronary artery calcification, bone mineralization, the texture of the lungs, whether the patient has gallstones or vascular calcification, and a tremendous amount of additional information. There’s an incredibly large amount of data in the pixel data from an MR scan, a CT scan, or a conventional radiograph. That’s why we keep the images for reference.

So, as time goes on, for Big Data, there are really two things that we want to mine. We want to mine the information in the radiology report as it exists today along with the metadata—things like radiation dose, the amount of contrast that was administered, or data about how the image was acquired. Then there’s also the pixel data, the information in the image that either may be commented on or may not be, that can be tagged using this now-available mechanism called AIM that allows us to tag findings in the images.

Radiology is currently perceived by many in the biomedical informatics field as “just a set of pictures” that you really can’t quantify and can’t put into the paradigm of Big Data the way you can with genomic and proteomic data. I actually believe that images have content as rich as, or richer than, the genomic data that we have and that, over the next several years, we’re going to figure out ways to index and tag both the pixel data and the radiology reports and make radiology discoverable.

Many different algorithms are being written by EMR vendors and third parties that are beginning to look at intelligent ways to allow clinicians to practice more safely and more efficaciously. Once radiology is able to solve some of its challenges, it will be part of that discoverable Big Data, and I think we’ll maintain the role that we have currently, where radiology is a critical component of making patient diagnosis and treatment decisions.

With regard to the future of decision support in radiology, there is much work to be done, but I believe it will help us to deal with the tremendous information overload that we experience in radiology today.

As an academic radiologist at the University of Maryland, I have the advantage of having radiology residents and fellows. When I read my PET/CT scans or other types of imaging studies, I have a resident or fellow in radiology who has talked with the patient, read through the chart, talked with the clinicians, and looked at the old examinations. By the time I go over the study, they may have spent quite a while on that case. I can take the important elements and the tentative diagnosis that the resident or fellow made from the images and correlate them with the data presented to me, resulting in an intelligent synthesis of the data. Then I can provide my added value by determining whether I agree or disagree with the tentative diagnosis and by contributing my own analysis, insights, and recommendations.

Having the residents and fellows makes me much more efficient. It makes me, I think, safer and more accurate for having that collaboration, even though, as the attending radiologist, I’m the final arbiter of the diagnosis. Being able to create the computer equivalent of some of the functions of the residents and fellows with regard to synthesis of available information and a preliminary list of observations and thoughts is, in my opinion, the most exciting challenge for medical imaging informatics for the future. ■
