For months, Facebook has been shaken by a steady leak of documents from whistleblower Frances Haugen, beginning in The Wall Street Journal but spreading to government officials and nearly any outlet with an interest in the company. Now, those documents are going much more public, giving us the most sweeping look at the operations of Facebook anyone not directly involved with the company has ever had.
The documents stem from disclosures made to the SEC and provided to Congress by Haugen in redacted form. The Verge and a consortium of other news organizations have obtained redacted versions of the documents, which we’re calling The Facebook Papers. In order to receive the documents, we agreed to begin publishing our reporting on them this week, but we are not beholden to report on them in a certain way or coordinate what we cover with other outlets.
We’ve already done some reporting on the files, and we’ll keep reporting on them over the coming weeks. (There are a lot of pages.) This isn’t a comprehensive list of everything in the documents, and you should fully expect more to come out. Instead, think of it as a summary of the files that, so far, have stood out to us the most. Hopefully, it will help you make sense of the sheer volume of Facebook news coming out this morning.
Facebook was caught off guard by vaccine misinformation in comments
Facebook has taken a lot of criticism for its handling of COVID misinformation — including from President Biden, who accused the platform of “killing people” by letting anti-vaxx sentiment run amok — but the leaks show just how chaotic the effort was inside the company. One document dated March 2021 shows an employee raising the alarm about how unprepared the platform was. “Vaccine hesitancy in comments is rampant,” the memo reads. “Our ability to detect vaccine-hesitant comments is bad in English, and basically non-existent elsewhere. We need Policy guidelines specifically aimed at vaccine hesitancy in comments.”
(Editor’s note: We are not publishing the document itself since it contains the full name of a Facebook employee.)
“Comments are a significant portion of misinfo on FB,” says another employee in an internal comment, “and are almost a complete blind spot for us in terms of enforcement and transparency right now.”
The document makes clear that Facebook already had a “COVID-19 lockdown defense” project dedicated to the platform dynamics created by the pandemic, including a workstream dedicated entirely to vaccine hesitancy. That team had also created significant automated flagging systems for misinformation — but according to the files, those simply weren’t being used to downrank anti-vaccine comments. As of the March memo, there were no plans to develop moderation infrastructure like labeling, guidelines, and classifier systems to identify anti-vaccine statements in comments.
The failure to meaningfully moderate comments was noticed outside the company. A First Draft News study in May looked at the comments for a handful of major news outlets with more than a million followers and found one in five comments on vaccine-related stories contained some kind of anti-vaccine misinformation.
The documents show Facebook was aware of the problem, and First Draft may have even underestimated the issue. One comment on the post reads, “for English posts on Vaccines, vaccine hesitancy prevalence among comments is 50 percent.”
Other aspects of the issue remained completely unstudied since the company had yet to dig in on the comment problem. “We currently don’t understand whether Group comments are a serious problem,” the post argues. “It’s clear to us that the ‘good post, bad comment’ problem is a big deal, but it’s not necessarily as clear that [vaccine hesitant] comments on [vaccine hesitant] posts are additive to the harm.”
Another document published shortly after April 2021 shows the company still coming to terms with vaccine misinformation. The good news was that there was no evidence of a foreign influence campaign driving anti-vaxx sentiment, as had stung the company in 2016. But the vast majority of content was being sent by a relatively small portion of accounts, suggesting that Facebook had not yet taken the easiest measures to address the issue.
Facebook has taken some steps recently to address misinformation in comments — including new downranking rules from just last week. But these documents show those changes came more than six months after the alarm had been raised internally and after Facebook had publicly expressed anxiety about the impending leak. Given the huge surge of COVID deaths among vaccine-skeptical populations over the past three months, it’s easy to see the recent changes as too little, too late.
Apple threatened to ban Facebook over online “slave markets”
Facebook scrambled to address human trafficking content after Apple threatened to kick its apps off the iOS App Store, a leaked SEV (or Site Event) report shows. The report, referenced briefly by The Wall Street Journal’s Facebook Files reporting, indicates that Apple threatened to pull Facebook and Instagram from iOS on October 23rd of 2019.
Apple had been tipped off by a BBC News Arabic report that found domestic workers being sold via Instagram and other apps, where sellers encouraged buyers to mistreat the workers by doing things such as confiscating their passports.
Facebook had been aware of the problem but hadn’t understood the scope of it, apparently because very little content (less than 2 percent of the material it found) had been reported by users. “Removing our applications from Apple platforms would have had potentially severe consequences to the business,” the Facebook report notes.
After Apple escalated the issue, Facebook moderators swept the platforms for keywords and hashtags mentioned in the BBC reporting, ultimately disabling 1,021 accounts and removing 129,121 pieces of content. Facebook also removed a policy exception letting established brick and mortar businesses like recruitment agencies post ads about domestic workers. Facebook determined that even if the businesses were legal, the policy was “highly likely leading to exploitation of domestic servants.”
Apple was apparently satisfied with the mitigation measures, and the incident was closed within a week.
The Civic Integrity team was often directly blocked by Zuckerberg
One of the most alarming incidents uncovered by the papers was that Mark Zuckerberg personally intervened to ensure Facebook would comply with a repressive law instituted in Vietnam, agreeing to moderate more aggressively against “anti-state” content on the platform. The anecdote leads The Washington Post’s report on the papers and plays into a much more troubling dynamic described by Haugen before Congress. Facebook’s Integrity team had lots of ideas for how to make Facebook less harmful, but they were usually overruled, sometimes by Zuckerberg himself.
Bloomberg News explores the issue in more detail, showing how the company often found its own efforts to downrank harmful content overwhelmed by the content’s inherent virality. As one employee put it, “I worry that Feed is becoming an arms race.”
Politico highlights another employee quote showing just how demoralized the team had become. “If we really want to change the state of our information ecosystem, we need a revolution, not an evolution of our systems,” an employee wrote in October 2019. “If you don’t have enough good content to boost, it doesn’t matter how much you downrank the bad.”
Facebook used a German anti-vaccine movement as a test case for more aggressive moderation
Another document details Facebook’s so-called “Querdenken experiment,” in which the company’s moderators tested out a more aggressive moderation approach on a German conspiracy movement. Facebook’s Dangerous Content team was already developing a new classification — a “harmful topic community” — and the growing Querdenken was chosen as an experiment on how the classification would work in practice.
As a Facebook employee writes in the document, “this could be a good case study to inform how we tackle these problems in the future.”
Querdenken has become one of the leading anti-lockdown and anti-vaccination groups in Germany, with similarities to more extreme groups like QAnon. As the Facebook proposal framed it, the Querdenken movement had potential for violence but wasn’t yet linked to extreme enough activity that would justify banning followers from the platform entirely.
The documents give few details about how the experiment proceeded, although clearly, some version of the Querdenken plan was implemented. (A later report says “results from some initial samples look promising.”) And judging by the company’s public statements, it did result in a meaningful change to moderation policy: in September, Facebook announced a new policy on “coordinated social harm,” specifically citing Querdenken as an example of the new approach in action.
Unlike many of the other documents, the Querdenken experiment shows Facebook’s moderation system as relatively effective. The company identified the group before it caused significant harm, took action with an eye towards long-term consequences and was transparent about the policy shift after it took place. But the incident shows how complex the interplay of policy and enforcement can be, with broader policies often rewritten with an eye towards specific groups. And for supporters of Querdenken, it may be alarming to learn that the rules of the world’s largest social platform were rewritten specifically to keep their movement from gaining public support.
Facebook’s January 6th response was shaped by glitches and delays
Facebook discussed developing extreme “break-glass measures” to limit misinformation, calls to violence, and other material that could disrupt the 2020 presidential election. But when former President Donald Trump and his supporters tried to stop successor Joe Biden from being declared president on January 6th, 2021, Facebook employees complained these measures were implemented too late or stymied by technical and bureaucratic hangups.
Reports at Politico and The New York Times outline Facebook’s struggle to handle users delegitimizing the elections. Internally, critics said Facebook didn’t have a sufficient game plan for “harmful non-violating narratives” that toed the line between misinformation and content Facebook wants to preserve as free speech. And some plans, like a change that would have prevented Groups from changing their names to terms like “Stop the Steal,” apparently got held up by technical problems.
Facebook was trying to rebalance its News Feed for “civic health”
The Wall Street Journal first revealed that news outlets and political parties had complained about users favoring negative and hyperbolic content. Facebook was considering ways to fix the problem, and one method involved re-weighting the News Feed to optimize for “civic health” instead of primarily focusing on meaningful social interactions or session time.
In a product briefing called “Ranking For Civic Health,” Facebook acknowledged that “people think that political content on Facebook is low quality, untrustworthy, and divisive,” and the current ranking system was “not creating a wholly valuable civic experience for users.” (Based on document comment dates, the document was produced around January and February of 2020.)
The document says Facebook’s ranking algorithm recommended civic content that users themselves didn’t report finding valuable, something earlier leaks have indicated was a problem with meaningful social interactions (MSI) and other engagement-based metrics. “Our current ranking objectives do not optimize for integrity outcomes, which can have dangerous consequences,” it says. For instance, MSI optimization was “contributing hugely to Civic misinfo,” and Facebook estimated that removing MSI optimization from Civic posts would decrease that misinformation by 30 to 50 percent.
Facebook tried to improve civic health by asking users explicitly what they thought constituted good civic content. This sometimes revealed even bigger problems — since Facebook apparently found that 20 to 30 percent of respondents “may say that known Civic hate is ‘good for the community’” in surveys. Facebook settled on a strategy to “prioritize the reduction of policy-violating content — such as Civic Hate or Civic Misinfo — even if the individual user finds it valuable” and aimed to reduce the prevalence of civic hate speech, misinformation, and inflammatory content by at least 10 percent.
The company ran rebalancing experiments in February 2020 by slightly increasing the amount of “civic” News Feed content that a random set of users saw, then optimizing that content through different metrics (including MSI and whether people thought content was “worth your time”). It also ran a survey to collect how users felt about civic News Feed content. The company aimed to have a new optimization system chosen by March of 2020 — although it’s not clear how the coronavirus pandemic may have changed those plans.
Why likes were never hidden on Facebook and Instagram
A highly publicized plan from early last year to hide like counts on Instagram never happened because testing the change hurt ad revenue and led to people using the app less. A quiet test of the experience on the Facebook app was also killed after leadership told Zuckerberg that it wasn’t a “top barrier to sharing nor a top concern for people.”
A lengthy internal presentation to Zuckerberg about the plan, dubbed Project Daisy, shows that there were concerns among leadership about how the Facebook app would have been perceived if Instagram hid like counts and Facebook did not, which is something employees who were involved in the project have told The Verge. Employees working on Instagram wanted to bill it as a way to depressurize the app for young people, but the team working on the Facebook app wasn’t into the idea. If Instagram went through with hiding likes, the presentation details how leadership at Facebook wanted to “minimize blowback to the Facebook app” for not hiding them and still “ensure credit ladders up to the Facebook company.”
Instead of hiding likes for all users by default as was originally planned, Instagram earlier adopted a half measure by letting people opt into hiding their likes.
Facebook’s “civic groups” policy stumbled over a simple design flaw
In October 2020, Facebook announced that it would stop recommending civic and political groups to users in the US as part of a broader effort to avoid the mistakes of the 2016 election. (The policy was made permanent shortly after the January 6th riot.) But actually keeping those groups out of Facebook recommendation feeds has been a huge challenge for the company — and an internal document gives us new insight into why.
The document shows Facebook employees grappling with a public article, flagged by the PR team, which found 30 separate groups still appearing in recommendation feeds in apparent violation of the policy. Escalated on January 19th, the document says many of the groups named in the report had been labeled as civic groups at one point but were somehow still being recommended as part of Facebook’s Groups You Should Join feature.
“Leakage has been there since Nov 2020 at the least,” the Facebook document reads.
It’s not clear which article the document is referring to, but there were a number of reports spotting enforcement failures at the time. The Markup found a bunch of groups slipping through that January and again in June. Even now, it’s not clear the civic groups policy is being enforced as intended. At the time, most observers focused on the conceptual difficulty: it’s a hard philosophical problem to draw a clear line between which groups count as “civic” and even harder to scale it across a platform of Facebook’s size.
But the internal report shows the real problem was much simpler. Facebook’s monitoring system (referred to in the report as “Laser”) had been trained to only look at the past seven days of content when determining whether a page fell into the “civic groups” category, which meant pages were constantly filtering in and out as the period viewed by the algorithm changed. In practice, that meant a pro-Trump or pro-Biden group could easily dodge the label by posting a few days’ worth of less obviously political content. The report estimates that a full 12 percent of labeled groups would churn out of the category from day to day.
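To make the churn mechanism concrete, here is a minimal Python sketch of a label that depends only on a trailing seven-day window. It is an illustration under stated assumptions, not a description of how Laser actually works: the 50 percent threshold, the one-flag-per-day activity model, the function names, and the "sticky" alternative are all hypothetical. Only the seven-day lookback comes from the report.

```python
from collections import deque


def rolling_labels(daily_political, window=7, threshold=0.5):
    """Label a group "civic" on each day using only the trailing window.

    daily_political: list of 0/1 flags, one per day (hypothetical model:
    1 means the group's posts that day were clearly political).
    threshold: hypothetical share of political days needed for the label.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` days
    labels = []
    for flag in daily_political:
        recent.append(flag)
        labels.append(sum(recent) / len(recent) >= threshold)
    return labels


def sticky_labels(daily_political, window=7, threshold=0.5):
    """Hypothetical alternative: once labeled civic, a group stays labeled."""
    labels, ever_civic = [], False
    for label in rolling_labels(daily_political, window, threshold):
        ever_civic = ever_civic or label
        labels.append(ever_civic)
    return labels


def count_flips(labels):
    """Number of day-to-day changes in the label."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


# Hypothetical group: 10 political days, 5 quiet days, then 5 political days.
activity = [1] * 10 + [0] * 5 + [1] * 5

rolling = rolling_labels(activity)
sticky = sticky_labels(activity)

print("rolling flips:", count_flips(rolling))
print("days unlabeled (rolling):", rolling.count(False))
print("sticky flips:", count_flips(sticky))
```

In this toy model, a short run of quiet days is enough to drop the group out of the category for a stretch, during which it would again be eligible for recommendation. A longer lookback or a label that persists once applied removes that gap, at the cost of being slower to release groups that have genuinely changed.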
According to the report, 90 percent of the groups highlighted by the article had been caught by Facebook’s existing “civic groups” classifier — but they’d filtered out as part of the churn. So reducing churn alone could have solved almost all of the problems spotted by the report.
There have been lots of stories like this over the past five years: Facebook sets a policy that seems measured and responsible, but a cursory test (usually from a journalist) shows banned content is still easily slipping through. From the outside, it’s often unclear how much of the problem is incompetence at Facebook and how much is just the inherent difficulty of managing a platform of that size.
But in this case, the documents put the blame squarely on Facebook as a company. This was a high-profile policy with huge stakes for the country at large, with obvious delicacy in how it was implemented. A churn rate that high made it inevitable that targeted groups would slip through the cracks, and the company simply didn’t notice until reporters called them out.