Some doctors don’t trust a second opinion when it comes from an AI system.
What’s new: A team at MIT and Regensburg University investigated how physicians responded to diagnostic advice they received from a machine learning model versus a human expert.
How it works: The authors recruited doctors to diagnose chest X-rays.
- The physicians fell into two groups: 138 radiologists highly experienced in reading X-rays and 127 internal or emergency medicine specialists with less experience in that task.
- For each case, the doctors received either accurate or inaccurate advice and were told that it came from either a model or a human expert.
- The physicians rated the advice and offered their own diagnosis.
Results: The radiologists generally rated advice they believed was generated by AI as lower in quality, while the less experienced physicians rated AI and human advice as roughly equal in quality. Both groups made more accurate diagnoses when given accurate advice, regardless of its source. However, 27 percent of radiologists and 41 percent of the less experienced physicians offered an incorrect diagnosis when given inaccurate advice.
Behind the news: AI-powered diagnostic tools are proliferating and becoming more widely accepted in the U.S. and elsewhere. These tools may work about as well as traditional methods at predicting clinical outcomes. However, those that work well may do so only for certain populations due to biased training data.
Why it matters: It’s not enough to develop AI systems in isolation. It’s important also to understand how humans use them. The best diagnostic algorithm in the world won’t help if people don’t heed its recommendations.
We’re thinking: While some doctors are skeptical of AI, others may trust it too much, which can also lead to errors. Practitioners in a wide variety of fields will need to cultivate a balance between skepticism and trust in machine learning systems. We welcome help from the computer-human interface community in wrestling with these challenges.