Scott Pruitt, the head of the Environmental Protection Agency, claims he’s on a mission to make science at the EPA more transparent — but scientists have their own ideas about what kind of transparency leads to trustworthy science.
Last week, Pruitt proposed a rule that would require the EPA to release the data behind any studies used for crafting policies to protect the environment and public health, like keeping air and water clean. While that sounds reasonable, in reality the rule would do the opposite of promoting transparency: It would force the government to spend time and money to redact confidential information like medical records, or, more likely, disqualify research that includes sensitive data from being used in policy-making.
We’ve seen politicians claim they want to “fix” transparency in science before. For years, Rep. Lamar Smith (R-TX), the chairman of the House Science, Space, and Technology Committee, has tried to pass legislation that would call for full disclosure of datasets. (These bills, including one called the HONEST Act, have never become law.) According to the scientific community, these proposals are just a ploy to prevent scientific evidence from informing policy decisions. If the EPA really wants open science, transparency should focus on research funding, not data; the costs of making data public should not fall on the scientists; and third parties who want to validate the findings should be held to the same ethical standards that the study participants agreed to.
“I think many in the scientific community would be perfectly pleased to engage in a good faith discussion about how to make things as transparent as possible and improve the quality of the science used in regulatory decisions,” says Jeremy Berg, editor-in-chief of Science magazine. “But I don’t see this as that sort of good faith effort.” The Verge emailed the EPA to ask whether scientists helped the agency craft the proposed rule. In response, EPA spokesperson Molly Block sent a link to Pruitt’s testimony at the House Energy and Commerce Committee hearing on April 26th.
Many scientific journals, like Science, already require study authors to publish their data in online databases. At Science, though, the policy allows exceptions for studies that contain sensitive details like medical information, Berg tells The Verge. The National Institutes of Health, the biggest funder of biomedical research, also calls for data to be made publicly available: Policies vary based on the type of research and the confidential information researchers gather, but the NIH maintains several online databases, like the National Cancer Institute Data Catalog, that qualified researchers can access for free.
Pruitt’s proposed rule would change nothing about how the scientific community does science — it would just hamstring the EPA’s ability to rely on science to craft policy. “The issues around transparency are best tackled within the scientific community,” says Gretchen Goldman, the research director for the Center for Science and Democracy at the Union of Concerned Scientists. “The proposal only dictates what kind of science can be used in EPA decision-making.”
So, if scientists had to write a “science transparency” rule, what would it look like? Here are a few recommendations.
Data can be made available — but there are caveats
Sharing data is key for catching mistakes and ensuring that scientific findings aren’t just flukes. But when it comes to confidential health data, research subjects need to be protected at all costs, says David Savitz, a professor of epidemiology at the Brown University School of Public Health. Researchers have legal and ethical obligations to keep personal information private. Plus, if people’s privacy were breached, far fewer people would agree to participate in scientific studies.
The rule makes vague allusions to protecting privacy and preventing re-identification of study subjects. And at two congressional hearings on April 26th, Pruitt repeatedly said that people’s personal and confidential information would be “redacted and protected” under the proposed rule.
But that’s not how data-sharing works, says Peter Thorne, director of the Environmental Health Sciences Research Center at the University of Iowa College of Public Health. When researchers disclose their datasets, they don’t black out personal health information; instead, they group the results, Thorne says. So if you enrolled 100 people in a specific zip code, and 70 developed asthma and 30 didn’t, you disclose data on those groups rather than the individuals. That way, you protect people’s privacy.
“If I had to take one of my data files — which we hold in a locked cabinet in a locked room in a locked segment of the building — and try to redact it, it would have far less value than it has when we aggregate it,” Thorne says. “We would redact so much of it, there would be nothing left.”
The problem is, sometimes even aggregating isn’t possible. If one person in a specific zip code got a particularly rare type of disease, and that person is of a particular ethnicity, that person’s identity is impossible to keep private. At the House Appropriations Committee hearing last week, Rep. Betty McCollum (D-MN) brought this concern up with Pruitt. “If you’re from a small rural town, you can be identified,” McCollum told the EPA administrator. “All of us who have worked on public health records know that.”
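As a rough illustration of the grouping Thorne describes, and of the small-group problem McCollum raises, here is a minimal Python sketch. Everything in it is hypothetical: the zip codes, the records, and the cutoff of five people per group are assumptions chosen for illustration, not anything specified by the EPA, the proposed rule, or the researchers quoted here.

```python
from collections import Counter

# Hypothetical individual-level records: (zip_code, developed_asthma).
# The first zip code mirrors the 100-person example above (70 with asthma,
# 30 without); the second holds a single participant.
records = (
    [("52240", True)] * 70
    + [("52240", False)] * 30
    + [("52327", True)]
)


def aggregate(rows):
    """Count participants in each (zip_code, outcome) group."""
    return Counter(rows)


def small_groups(counts, threshold=5):
    """Return groups with fewer than `threshold` people.

    The threshold of 5 is an assumed cutoff for illustration only.
    """
    return {group: n for group, n in counts.items() if n < threshold}


counts = aggregate(records)
risky = small_groups(counts)

print(counts[("52240", True)], counts[("52240", False)])
print(risky)
```

The grouped counts for the first zip code say nothing about any one person. The second zip code shows where grouping breaks down: a group of one describes a single, potentially identifiable participant.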
The burden should not be on the scientists
Sharing raw data is not as easy as it sounds — it takes time and money. Researchers can’t just dump massive troves of data in an online database and call it a day; they have to erase any information that could identify the study participants and they have to make sure the data isn’t confusing. Publicly available datasets from the National Center for Health Statistics, for instance, can come with 100-page manuals, says Savitz. “It’s a lot of work,” he says. And it can require hiring an extra programmer or research assistant to do the job.
A 2015 report from the Congressional Budget Office estimated that it would cost the EPA $250 million a year in the first few years to disclose all the data of studies used in rule-making. The cost per study would be between $10,000 and $30,000, the report says. The extra costs could result in the EPA relying on fewer studies when drafting regulations, which means “the quality of the agency’s work could be compromised,” according to the report.
The proposed rule doesn’t specify who would be responsible for all the extra work, but it should fall on the oil and gas companies, or any other third parties, that have doubts about the research, Savitz says. “It shouldn’t be out of my funds,” Savitz says. “I’ve done the work. It should be something that, if it’s important enough to do, should be supported by others.”
Transparency on funding
More than transparency around data, a serious “science transparency” rule would require scientists to say who’s funding their research, says Goldman at the Union of Concerned Scientists. Industry-financed research isn’t necessarily bad — but people should know who’s paying for a study. Sometimes the results can seem suspiciously convenient. Just last month, for example, The New York Times revealed that the alcohol industry was behind a large study meant to show that a drink a day is just fine.
“That’s the bigger challenge around transparency to me,” Goldman says. “We don’t have consistent procedures for what researchers have to disclose when.” Plus, it’s not at all clear how the policies are enforced or whether researchers who break the rules are punished, she says. If funding sources are completely open, then it’d be much easier for the public to interpret a study’s findings. “That’s an issue that needs the most attention around the issue of transparency,” Goldman says. And the current EPA rule doesn’t address that at all.
Third parties should be held to the same standards
If scientific data is made available, there should still be some restriction on who can access that data to validate the findings. Ethically, scientists have to tell research subjects how their data will be used and how their privacy will be protected. Those agreements are important and have to be approved by major oversight groups: ethics committees and institutional review boards. So if outside parties want to get their hands on that data, they have to be held to the same standards that the study authors agreed to when they recruited those subjects.
“We cannot share medical records and medical information with just about anybody who comes along,” says Beate Ritz, an environmental epidemiologist at the University of California, Los Angeles. “We have to vet these people.” One way to do that is to share the data with another independent research group that signs on to the same confidentiality agreements as the study authors. That way, researchers can vet the people who get their hands on the data, instead of just putting it out into the world for anyone who wants it, for whatever purpose. And the EPA should provide the money needed for re-analyzing the data. “It’s better to have carrots rather than sticks,” she says.
Data privacy expert Yaniv Erlich suggests putting shared data in some sort of database that researchers would only gain access to after identifying themselves and signing legal forms. “Not too horrendous, but something that will create a contract,” says Erlich, a computer science professor at Columbia University who is taking a leave of absence to serve as chief science officer at the genealogy website MyHeritage. Researchers would have to agree to not, say, hand the data over to the police, or try to figure out the identities of anonymized study participants. And if the study subjects wanted to, they could get deleted from the database. “In this way, you create a legal binding relationship about what the data can be used [for],” he says. (Of course, you’d still need a way to enforce these agreements — and consequences that are frightening enough to keep people from breaking them. Ideally, an EPA rule would spell out what those might be.)
Ultimately, most scientists object to the very premise of a “science transparency” rule, whether enacted by the EPA or by Congress through legislation. The peer-review process is already very strict, and “the data is scrutinized dramatically,” says Judith Zelikoff, a toxicologist at New York University. Ritz agrees: “There’s nothing secret about the science,” she says.
This is why the proposed rule is widely seen by the scientific community as yet another attempt by the Trump administration to limit the use of sound science in rule-making. Last year, Pruitt prohibited scientists getting EPA money from advising the agency on policy, while still allowing people from the fossil fuel industry to do so. The rule opened for public comment today, and it will probably be challenged in the courts. But if it’s approved and finalized, the rule could change how clean air and water regulations are written, at the expense of the American public, as Rep. Raul Ruiz (D-CA) pointed out at the hearing last week.
“You are making lives more difficult for everyday American families,” Ruiz told Pruitt. “This is disgraceful. The American people deserve better.”