U.S. government approval of medical AI products is on the upswing — but information about how such systems were built is largely unavailable.

What’s new: The U.S. Food and Drug Administration (FDA) has approved a plethora of AI-driven medical systems. But unlike drugs, these products reach the market with little publicly available information about how well they work, according to an investigation by the health-news website Stat News.

What they found: The FDA doesn’t require makers of AI systems to provide systematic documentation of their development and validation processes, such as the composition of training and test datasets and the populations involved. The data actually provided by manufacturers varies widely.

  • Stat News compiled a list of 161 products that were approved between 2012 and 2020. Most are imaging systems trained to recognize signs of stroke, cancer, or other conditions. Others monitor heartbeats, predict fertility status, or analyze blood loss.
  • The makers of only 73 of those products disclosed the number of patients in the test dataset. In those cases, the number ranged from fewer than 100 patients to more than 15,000.
  • The manufacturers of fewer than 40 products revealed whether the data they used for training and testing had come from more than one facility — an important factor in showing that a product generalizes beyond the site where it was developed. Makers of 13 products broke down their study population by gender. Seven did so by race.
  • A few companies said they had tested and validated their product on a large, diverse population, but that information was not publicly available.

Behind the news: The rate at which the FDA approves medical AI products is rising and could reach 600 products annually by 2025, according to Stat News.

  • Most such products are currently approved under a standard that requires demonstrating “substantial equivalence” in safety and efficacy to similar, already-approved systems. This standard, known as 510(k), was established in 1976 without medical AI in mind.
  • A recent FDA action plan for regulating AI aims to compel manufacturers to evaluate their products more rigorously.

Why it matters: Without consistent requirements for testing and reporting, the FDA can’t ensure that AI systems will render accurate diagnoses, recommend appropriate treatments, or treat minority populations fairly. This leaves health care providers to figure out for themselves whether a product works as advertised with their particular equipment and patients.

We’re thinking: If you don’t know how an AI system was trained and tested, you can’t evaluate its risk of concept or data drift as real-world conditions and data distributions change. This is a problem even in drug testing: A vaccine validated against the dominant Covid-19 variant may become less effective as the virus mutates. Researchers are developing tools to detect and combat such drift in AI systems; a minimal sketch appears below. Let’s make sure they’re deployed in medical AI.
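
To make the drift issue concrete, here is a minimal sketch of how a deployment team might flag data drift by comparing feature distributions in a training-time reference sample against a recent batch of production data, using a two-sample Kolmogorov–Smirnov test per feature. The feature setup, threshold, and synthetic data are purely illustrative assumptions, not a description of any FDA-approved product or of Stat News’ methodology.

```python
# Minimal data-drift check: compare each feature's production distribution
# against a training-time reference sample with a two-sample KS test.
# All names, thresholds, and data here are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference: np.ndarray, production: np.ndarray,
                 alpha: float = 0.01) -> list[int]:
    """Return indices of features whose production distribution differs
    significantly from the training reference."""
    drifted = []
    for i in range(reference.shape[1]):
        _, p_value = ks_2samp(reference[:, i], production[:, i])
        if p_value < alpha:
            drifted.append(i)
    return drifted

# Example with synthetic data: feature 1 shifts after deployment.
rng = np.random.default_rng(0)
reference = rng.normal(size=(5000, 3))   # stand-in for training features
production = rng.normal(size=(1000, 3))  # stand-in for incoming data
production[:, 1] += 0.5                  # simulated distribution shift
print(detect_drift(reference, production))  # likely prints [1]
```

A check like this only tells you that the input distribution has moved, not whether the model’s accuracy has degraded, so flagged features would still need clinical review.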
