Image: A doctor studies a lung X-ray. A chest X-ray can help identify a number of lung diseases.
  • Artificial intelligence (AI) and machine learning tools are becoming increasingly common in healthcare and beyond.
  • A new study compared AI tools to human radiologists and found that radiologists were superior to machines at identifying conditions from X-rays.
  • The researchers reported that the more complicated the diagnosis, the more strongly human experts performed compared to AI tools.

Artificial intelligence (AI) is already transforming the way we interact with the world, from helping forecast hurricanes better than ever to providing financial tips.

But when it comes to reading your X-rays at the doctor’s office, AI may not be ready to replace the radiologists of the world.

That’s according to a new study published in the journal Radiology.

In the study, Danish researchers pitted a pool of 72 radiologists against four commercial AI tools in interpreting 2,040 chest X-rays from older adults (average age 72).

About a third of the X-rays displayed at least one of three diagnosable conditions: airspace disease, pneumothorax (collapsed lung), or pleural effusion (also known as “water on the lung”).

Researchers report that AI tools were reasonably sensitive, diagnosing airspace disease 72% to 91% of the time among positive cases, 63% to 90% of the time for pneumothorax, and 62% to 95% of the time for pleural effusion.

However, researchers said these AI tools also produced a high number of false positives, with their accuracy declining as the diagnosis became more complicated. This was especially true in cases of multiple concurrent conditions or when the X-ray evidence was subtler.

For pneumothorax, for instance, once these false positives were factored in, the positive predictive values for the AI systems ranged from 56% to 86%. Radiologists, on the other hand, got it right 96% of the time.

Positive predictive values for pleural effusion were similar to those for pneumothorax, ranging from 56% to 84%.

AI fared even worse on airspace disease, with positive predictive values of only 40% to 50%.
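For readers unfamiliar with these metrics: sensitivity is the share of true cases a system catches, while positive predictive value (PPV) is the share of its positive calls that are actually correct. A minimal sketch of both calculations (the counts below are hypothetical, for illustration only, not data from the study):

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity = TP / (TP + FN): fraction of real cases that are caught."""
    return true_positives / (true_positives + false_negatives)

def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): fraction of positive calls that are correct."""
    return true_positives / (true_positives + false_positives)

# Hypothetical example: a tool flags 100 X-rays as pneumothorax,
# of which 56 truly have it, while missing 14 real cases.
print(f"PPV: {positive_predictive_value(56, 44):.0%}")   # 56%
print(f"Sensitivity: {sensitivity(56, 14):.0%}")         # 80%
```

This is why a system can look impressive on sensitivity alone yet still be unusable in practice: many false positives drag the PPV down even when few real cases are missed.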

“In this difficult and elderly patient sample, the AI predicted airspace disease where none was present five to six out of 10 times. You cannot have an AI system working on its own at that rate,” said Dr. Louis Plesner, lead study author and a resident radiologist in the Department of Radiology at Herlev and Gentofte Hospital in Copenhagen, Denmark, in a press release. “AI systems seem very good at finding disease, but they aren’t as good as radiologists at identifying the absence of disease, especially when the chest X-rays are complex.”

Another issue, Plesner said, is that a high rate of false positives would be costly in time, unnecessary testing, and increased radiation exposure for patients.

“This study doesn’t surprise me and is exactly what would be expected of an AI system,” said Zee Rizvi, the co-founder and president of Odesso Health, an AI-assisted service for automating electronic medical records.

“At best, AI augments human skills in a complementary fashion,” he told Medical News Today. “To view AI and human capability as mutually exclusive will always lead to disappointing results. We are not far along enough in the AI and deep learning space to entirely remove humans from the equation of productivity and patient outcomes. It’s just that simple.”

Dr. Fara Kamanger, a dermatologist and chair of the San Francisco Dermatological Society as well as the founder of AI skin health tool DermGPT, responded positively to the study while noting its limitations.

“The design of this study is robust, as it incorporates multiple AI tools and involves two radiologists to confirm the diagnosis. In cases of disagreement, a third radiologist is consulted,” Kamanger told Medical News Today. “The potential of AI in healthcare is vast and encompasses various applications, including drug development, research, patient care, practice management, prescription and insurance management, and more. It is encouraging to see physicians taking a proactive role in leading the development of AI in healthcare.”

Kamanger did agree with Rizvi that it was unlikely AI would be replacing human experts in healthcare any time soon.

“Human physicians have the advantage of conducting a 360-degree clinical evaluation, which includes assessing the patient’s physical appearance, vital signs, and clinical correlation. This holistic approach enables physicians to consider various factors and make accurate diagnoses,” she said. “To further enhance AI systems, it is important to incorporate this comprehensive approach into their development. By integrating data from various sources and considering multiple aspects of patient evaluation, AI systems can strive to mimic the clinical practice of human physicians more effectively.”

“However, it is crucial to recognize that human clinical judgment and experience will continue to be invaluable in providing comprehensive patient care,” Kamanger added.

One thing Rizvi said he would like to see is a follow-up study combining the human and machine camps.

“This study is predicated on the binary assumption that outcomes rely on either AI or radiologists,” he said. “If the study was conducted to examine cooperation between AI and radiologists, the outcome would most definitely be stronger than its parts.”