Data from a private insurance company has given scientists a new way to study whether nature or nurture matters more when it comes to staying healthy in the face of disease. Though the answer isn’t definitive or exact — it varies according to each of the 560 diseases that were studied — the technique holds promise for bringing more insights in the future.
The traditional way of studying nature versus nurture relies on twins. Because identical twins share the same genetic code, comparing the health of twins can help determine whether genetic or environmental factors play more of a role in their health. Problem is, it can be hard to find many pairs of twins, so most twin studies use small datasets and look at one disease at a time. The new study, published this week in Nature Genetics, uses a database of 45 million people, including over 56,000 pairs of twins.
Some conditions, like Huntington’s, are 100 percent influenced by genetics, meaning that if you inherit the genetic mutation that causes the disease, there’s a 100 percent chance you’ll get it, no matter how wealthy you are, where you live, or what you eat. The chances of getting other diseases, like asthma, are far more influenced by factors in a person’s environment, like climate and wealth, than by their genetic code. The Nature Genetics study found that genes played a role in at least 40 percent of the 560 diseases, with cognitive disorders being the most influenced by genetics. About a quarter of the diseases were at least partly caused by environment, with eye diseases having the largest environmental influence.
The data in question comes from the private health insurer Aetna, which shared it (stripped of identifying information) with Harvard University’s Department of Biomedical Informatics. The researchers had the idea for this study when they noticed that the data included the dependents of the primary subscriber. Basically, they could see the insurance information of children who were on their parents’ Aetna insurance, explains study co-author Chirag Patel, a Harvard University professor of biomedical informatics. Next, the researchers looked at the children’s birth dates to figure out which of them were twins. Then, they used a statistical technique to estimate the likelihood that each pair of twins was identical or fraternal. (Fraternal twins share, on average, only half of their DNA.)
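To make the birth-date step concrete, here is a minimal Python sketch of how candidate twin pairs might be pulled out of a table of dependents. It is an illustration only, not the researchers' code: the record layout (`subscriber_id`, `member_id`, `birth_date`) and the function name `find_candidate_twin_pairs` are hypothetical, and the sketch assumes that two dependents on the same plan who share a birth date are twins.

```python
from collections import defaultdict
from datetime import date


def find_candidate_twin_pairs(dependents):
    """Return pairs of dependents who share a subscriber and a birth date.

    `dependents` is an iterable of (subscriber_id, member_id, birth_date)
    tuples. This layout is an assumption made for illustration.
    """
    groups = defaultdict(list)
    for subscriber_id, member_id, birth_date in dependents:
        # Dependents only count as twins if they are on the same plan
        # AND were born on the same day.
        groups[(subscriber_id, birth_date)].append(member_id)

    pairs = []
    for members in groups.values():
        # Keep exact pairs only; triplets and larger sets are skipped here
        # to keep the sketch simple.
        if len(members) == 2:
            pairs.append(tuple(sorted(members)))
    return sorted(pairs)


# Made-up records: two siblings born the same day, one older sibling,
# and an unrelated child who happens to share the birthday.
records = [
    ("sub-A", "a1", date(2005, 3, 14)),
    ("sub-A", "a2", date(2005, 3, 14)),
    ("sub-A", "a3", date(2002, 7, 1)),
    ("sub-B", "b1", date(2005, 3, 14)),
]
print(find_candidate_twin_pairs(records))
```

Grouping on both the subscriber and the birth date is what keeps unrelated children with the same birthday from being matched. Telling identical pairs from fraternal ones takes a separate statistical step that this sketch does not attempt.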
The database included zip codes (which the scientists used to extrapolate factors like socioeconomic status and air pollution), a record of doctor visits, and diagnosis codes from the International Classification of Diseases, explains first author Chirag Lakhani, a research fellow in biomedical informatics at Harvard. By combining and analyzing all of this data, the scientists were able to tease out the relative contribution of genetic versus environmental factors for those 560 conditions, which include everything from heart disease to connective tissue disease to blood disease.
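The article does not spell out the study's statistical model. As a rough illustration of the general twin-study logic, though, the textbook shortcut known as Falconer's formula compares how similar identical twins are to how similar fraternal twins are. The sketch below assumes you already have a correlation for each group; the function name and the example numbers are hypothetical, and this is the classic approximation, not necessarily what the Harvard team used.

```python
def falconer_estimates(r_identical, r_fraternal):
    """Split variation in a trait into rough genetic and environmental shares.

    r_identical: correlation of the trait between identical twins.
    r_fraternal: correlation of the trait between fraternal twins.
    Returns (heritability, shared_environment, unique_environment).
    """
    # Identical twins share all of their DNA and fraternal twins about half,
    # so doubling the gap in similarity approximates the genetic share.
    heritability = 2 * (r_identical - r_fraternal)
    # Similarity that genetics does not explain is attributed to the
    # environment the twins share, such as household or neighborhood.
    shared_environment = 2 * r_fraternal - r_identical
    # Whatever makes even identical twins differ is unique to each person.
    unique_environment = 1 - r_identical
    return heritability, shared_environment, unique_environment


# Made-up correlations for illustration only.
print(falconer_estimates(0.75, 0.5))
```

The three shares add up to 1 by construction. With small samples the formula can return values below 0 or above 1, which is one reason large datasets like this one are attractive.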
The hope, Lakhani says, is that the results will help guide further research into the causes of various conditions. “For example, if you’re interested in lead poisoning, genetics plays a very small role and we need to think about the environment,” he says. “But for other cases like ADHD that are more likely to have a hard genetic component, we can think about other ways to interrogate the disease.”
Patel and Lakhani point out that their study has limitations. For one, they didn’t look at ultra rare diseases, and because they were looking at twins young enough to still be on their parents’ health insurance, the analysis excludes diseases like Parkinson’s or Alzheimer’s that develop in old age.
Dan Belsky, a professor of epidemiology at Columbia University’s Mailman School of Public Health who was not involved in the study, said that the study method helps solve a big problem in medical research: that the people who sign up to participate in studies could be fundamentally different from the people who don’t. That makes the results incomplete and not representative. “Nobody is going to go out and collect data from this many people, but you can partner with the people who hold this data to leverage the extraordinary data capture that’s occurring in our lives to advance science,” he says. “This is a very carefully done study and I think it’s exciting to see this scale of data put to this question.”
That said, today’s study doesn’t perfectly circumvent that problem either. As Patel points out, the people who have private insurance like Aetna have different circumstances from those who have, for example, Medicare. The next step is to try to use the method on many different databases. “I think there’s tremendous possibility of leveraging these same methods in those populations to better get a grasp on other factors that matter that weren’t showing up as strong,” he says.
Correction Jan. 16th, 2019 6PM EST: An earlier version of this article incorrectly stated that Dan Belsky was a professor at Duke University. He is now a professor at Columbia University.