Facebook explains why it’s bad at catching hate speech

As part of Facebook’s promise to answer “hard questions,” the company has published a long explanation of how it finds and removes hate speech — or at least, why it’s often not very good at it. The post runs through the difficulties of defining hate speech across different countries, teaching AI to handle its nuances, and separating intentionally hateful posts from ones that describe hate speech to critique it.

Facebook lays out ambiguous scenarios that could flummox automated tools, including insulting terms that communities have reclaimed. It also describes some cases where it clearly got things wrong. For example, it removed a piece of hate mail that activist Shaun King had posted in order to condemn it, a mistake Facebook acknowledges can be “deeply upsetting.” (It later restored the post.) It also lists occasions where it thinks it made the right call on a difficult issue. But it doesn’t delve into some of the thorniest hate speech questions, like semantic tweaks that turn ugly sentiments into acceptable opinions: “migrants are dirt” versus “migrants are dirty,” to cite one example from last year.

This is a problem that goes beyond hate speech; as leaked moderation guidelines showed, there’s a frustratingly fine line between serious and non-serious threats. Moderation also means responding to several different sets of legal requirements, because a hateful post could be acceptable in one country and banned in another.

There’s one major, unstated background question: can Facebook ever come up with a system that can handle its nearly 2 billion users? The company says it removes around 66,000 hate speech posts per week, and it relies heavily on user flagging to catch them. Facebook is committing to adding 3,000 more members to its 4,500-strong moderation team, but that’s still minuscule for a platform so big. And if Facebook ever wants to really solve its moderation problems, it will have to find its purpose first.