In 2011, IBM's Watson supercomputer got an unusually public proof-of-concept, competing on Jeopardy! and beating its human competitors hands-down. It was a powerful public win for IBM, and for artificial intelligence at large, but the computer at the center of all that publicity was still basically a prototype. If Watson can do this, IBM wanted to say, imagine what it can do in the real world.

Watson will point doctors to crucial data and likely diagnoses

Now, Watson is getting its chance. For the past year, the Watson team has been building up the supercomputer's medical skills, scanning through exam books to learn the basic principles of diagnosis and learning to parse the often-confusing mess of data in electronic health records. Watson has already served on the business side of Sloan-Kettering hospital, where there are fewer malpractice concerns, but a new three-year program will usher the supercomputer into the examination rooms of the Cleveland Clinic. The goal is to create a digital assistant that can point doctors to crucial data and likely diagnoses based on a patient's medical history. If IBM can get the system working, it could be a lifeline to overworked doctors and overcrowded hospitals — but first, the company will have to navigate an unusually tangled web of data, and an industry that's proven particularly resistant to digitization.

The US will be short as many as 45,000 primary care doctors by 2020

At a glance, American hospitals seem ripe for a tool like Watson. The country is facing a major shortage of primary care physicians, the all-purpose doctors working on the front lines of medicine, and the shortage will only get worse in the coming decade. The Association of American Medical Colleges estimates the US will be short as many as 45,000 primary care doctors by 2020. The result will be longer wait times, more crowded hospitals and fewer doctors to handle the same number of patients. At the same time, more digitized hospitals have meant more comprehensive medical records, offering more reports to sift through for each patient even as there's less and less time to work through each one.

"It doesn't work every time, but it's getting better."

To IBM, this looks like an information processing problem: too much data, and not enough doctors to manage it. To solve the issue, it’s putting Watson to work summarizing medical records, giving doctors a quick summary of a patient's medical history. In theory, that should help them to treat more patients, more effectively. If doctors are curious about a particular diagnosis or piece of data, they can drill down, tracking the information back to its source.

An on-screen visualization of Watson's concept map

Since the program is still in a trial phase, it's only used after doctors have made their initial diagnosis, but the doctors involved say it's already showing promise. "I've had a couple of patients where Watson found things that I had missed," says Dr. Neil Mehta, the staff physician who's leading the Cleveland Clinic's end of the Watson project. "It doesn't work every time, but it's getting better." One example is a patient suffering from sleep apnea-like symptoms. Years earlier, this patient had a blood gas test that would have confirmed the diagnosis, but the test results were hidden in a hard-to-find section of the medical record. Without Watson, Mehta says he never would have seen the result.

The system looks for competing theories that might explain the patient's symptoms

The process for finding that crucial test turns out to be remarkably similar to finding the right answer to a Jeopardy question. Having built a basic concept map from studying medical exams, Watson parses the medical records for facts and test results, then knits them together into competing theories that might explain the patient's symptoms. Mike Barborak, a natural language engineer at IBM Research, describes it as a step towards a more intelligent kind of machine. "It's language processing but it's also using this notion of combining those different salient factors to see if one supports or contradicts the other and coming up with some conclusion of what's actually being stated," Barborak says. That means looking beyond the words of the diagnosis for a basic sense of how the words fit together, and what each possible diagnosis means.

So far, the biggest roadblock isn't machine intelligence, but human organization. Watson works best with clean, unambiguous data sources, and at the moment, electronic health records don't quite fit the bill. "It does not come in a clean format at all," Barborak says. "It comes in a very noisy format." Doctors will frequently leave out the end date of a prescription, or the related causes of a particular ailment — omissions that rarely confuse human doctors, but throw Watson for a loop. The structure of medical language can also be confusing. A simple condition like high cholesterol can be diagnosed by multiple different names, each with a subtly different meaning.

"It's a tough problem, and I don't know if anyone has a good answer right now."

Selling doctors on the process may be an even bigger problem. The same overworked conditions that make doctors ideal candidates for Watson also leads them to cut corners in record-keeping. For many doctors, electronic health records are just one more kind of paperwork, a distraction from the primary goals of medicine — and so far, research hasn't given them much reason to change their mind. "I'll tell you, I love electronic health records, I can't live without them," Mehta says, "But in all the data you look at, we have not shown improvement in key outcomes in patient care or cost in spite of using them. And it's because we don't use them right and we don't have the time."

For Watson and other systems like it to work, doctors at large will have to buy in, giving up precious time with no immediate payoff, and it will have to happen across the entire health system. "It's a tough problem and I don't know if anyone has a good answer right now," Mehta says. Unfortunately for IBM, that’s one problem machine intelligence can't solve.