Algorithms used in medicine are trained on data from only a few states


34 states aren’t represented in any medical AI training sets

Illustration by Ana Kova

Most medical algorithms were developed using information from people treated in Massachusetts, California, or New York, according to a new study. Those three states dominate patient data, and 34 other states were not represented at all, the research, published this week in the Journal of the American Medical Association, found. The narrow geographic distribution of the data used for these algorithms may be an unrecognized bias, the study authors argue.

The algorithms the researchers examined are designed to make medical decisions based on patient data. When researchers build an algorithm to guide diagnosis, such as one that examines a chest X-ray and decides whether it shows signs of pneumonia, they feed it real-world examples of patients with and without the condition they want it to detect. It’s well-recognized that gender and racial diversity are important in those training sets: if an algorithm only gets men’s X-rays during training, it may not work as well when it’s given an X-ray from a woman who is hospitalized with difficulty breathing. But while researchers have learned to watch for some forms of bias, geography hasn’t been highlighted.

“There are all these things that end up getting baked into the dataset and become implicit assumptions in the data, which may not be valid assumptions nationwide,” study author and Stanford University researcher Amit Kaushal told Stat News.

Kaushal and his team examined the data used to train 56 published algorithms, which were designed to be used in fields like dermatology, radiology, and cardiology. It’s not clear how many are actually in use at clinics and hospitals. Of the 56 algorithms, 40 used patient data from at least one of Massachusetts, California, or New York. No other state contributed data to more than five algorithms.
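For readers who want to see what a tally like this involves, here is a minimal sketch in Python. It is not the study's code or its data: the algorithm names, the state lists, and the helper functions `count_algorithms_per_state` and `share_using_any` are all hypothetical, made up purely to illustrate counting how many algorithms draw on each state.

```python
from collections import Counter

# Hypothetical toy records: each algorithm maps to the set of states whose
# patients appear in its training data. These are NOT the study's data.
training_states = {
    "algo_a": {"CA"},
    "algo_b": {"CA", "MA"},
    "algo_c": {"NY"},
    "algo_d": {"TX"},
    "algo_e": {"MA", "NY"},
}


def count_algorithms_per_state(records):
    """Count, for each state, how many algorithms used its patient data."""
    counts = Counter()
    for states in records.values():
        # Each entry is a set, so a state is counted once per algorithm.
        counts.update(states)
    return counts


def share_using_any(records, hubs):
    """Fraction of algorithms that drew on at least one of the given states."""
    hits = sum(1 for states in records.values() if states & hubs)
    return hits / len(records)


per_state = count_algorithms_per_state(training_states)
hub_share = share_using_any(training_states, {"MA", "CA", "NY"})

print(per_state.most_common())
print(f"Share using MA, CA, or NY data: {hub_share:.0%}")
```

A state that never appears in any record simply has a count of zero, which is how unrepresented states would show up in a tally of this kind.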

It’s not clear whether or exactly how geography might skew an algorithm’s performance, though coastal hubs like New York have different demographics and underlying health issues than states in the South or Midwest. Researchers do know, in general, that algorithms that work under one set of circumstances sometimes don’t work as well under others. Some studies show that algorithms can work better at the institutions where they’re created than they do at other hospitals.

Many academic research centers that do artificial intelligence and machine learning research are in health care hubs like Massachusetts, California, and New York. Data from California, home to Silicon Valley, was included in about 40 percent of the algorithms. It’s difficult for researchers to get access to data from institutions other than the ones where they work. That may be why the data clusters in this way. Broadening the datasets may be challenging, but identifying the disparity shows that geography is another factor worth tracking in medical algorithms.