Online news has made it easier than ever to follow current events, but what if it could also be used to predict future ones? Eric Horvitz of Microsoft Research and Kira Radinsky of the Technion-Israel Institute of Technology have built a system that mines over 20 years of New York Times articles for events that could point to other, later developments. For their test model, Horvitz and Radinsky designed the system to draw connections between events, then set it loose on the Times database.
The system was able to "learn" correlations between events by looking at sequences of stories about particular places: if an article was published about a drought in one place, for example, there was an 18 percent chance of a drought being reported there later. Both droughts and storms, in turn, can lead to cholera outbreaks. Not every event (like, say, "cholera outbreak in Rwanda") had enough data to be useful on its own, so the system also needed to find and connect patterns across different kinds of events that shared characteristics.
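To make the idea of learning from story sequences concrete, here is a minimal sketch of how a follow-up rate like the 18 percent figure could be counted. It is not the researchers' actual method. The records in `REPORTS`, the function names `follow_up_rate` and `estimate`, the one-year window, and the `min_samples` threshold are all hypothetical, chosen only for illustration. Pooling across all places when one place has too few reports is a crude stand-in for the richer generalization over shared characteristics described above.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical toy records: (publication date, place, event type).
# These are invented for illustration, not taken from the Times archive.
REPORTS = [
    (date(2000, 1, 10), "Angola", "drought"),
    (date(2000, 6, 2), "Angola", "cholera"),
    (date(2001, 3, 5), "Bangladesh", "storm"),
    (date(2001, 8, 20), "Bangladesh", "cholera"),
    (date(2002, 2, 1), "Rwanda", "drought"),
    (date(2003, 4, 15), "Kenya", "drought"),
]


def follow_up_rate(reports, cause, effect, window_days=365, place=None):
    """Return (rate, n): the share of `cause` reports that were followed by
    an `effect` report in the same place within `window_days`, and the
    number of `cause` reports counted. `rate` is None when n is 0."""
    by_place = defaultdict(list)
    for day, where, kind in reports:
        by_place[where].append((day, kind))

    window = timedelta(days=window_days)
    total = hits = 0
    for where, events in by_place.items():
        if place is not None and where != place:
            continue
        for day, kind in events:
            if kind != cause:
                continue
            total += 1
            # A hit is any later `effect` report in the same place
            # that falls inside the time window.
            if any(k == effect and day < d <= day + window for d, k in events):
                hits += 1

    rate = hits / total if total else None
    return rate, total


def estimate(reports, cause, effect, place, min_samples=3):
    """Use the place-specific rate when there are enough reports;
    otherwise fall back to the rate pooled across all places."""
    rate, n = follow_up_rate(reports, cause, effect, place=place)
    if n >= min_samples:
        return rate, "place-specific"
    rate, _ = follow_up_rate(reports, cause, effect)
    return rate, "pooled across places"


if __name__ == "__main__":
    print(follow_up_rate(REPORTS, "drought", "cholera"))
    print(estimate(REPORTS, "drought", "cholera", place="Rwanda"))
```

In this toy data Rwanda has only one drought report, so `estimate` falls back to the pooled rate, which mirrors the problem of a single event not having enough data on its own.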
With enough data and a good model, the system could provide early warning of disasters by mining individual reports and finding larger patterns in them. The general idea of finding ways to predict diseases isn't new, nor is the concept of data mining for prediction, but the wide scope of this project makes it potentially very useful. As long as the system can successfully draw correlations between events, and generalize them enough to be useful, it could be applied to any number of situations.