Machine learning is helping computers spot arguments online before they happen


‘Hey there. It looks like you’re trying to rile someone up for no good reason?’


It’s probably happened to you. You’re having a chat with someone online (on social media, via email, in Slack) when things take a nasty turn. The conversation starts out civil, but before you know it, you’re trading personal insults with a stranger / co-worker / family friend. Well, we have some good news: scientists are looking into it, and with a little help from machine learning, they could help us stop arguments online before they even happen.

The work comes from researchers at Cornell University, Google Jigsaw, and Wikimedia, who teamed up to create software that scans a conversation for verbal tics and predicts whether it will end acrimoniously or amiably. Notably, the software was trained and tested on a hotbed of high-stakes discussion: the “talk pages” of Wikipedia articles, where editors discuss changes to phrasing, the need for better sources, and so on.

Friendly conversations tend to create a buffer zone of pleasantries

The software was preprogrammed to look for certain features that past research has shown correlate with a conversation’s mood. For example, signs that a discussion will go well include gratitude (“Thanks for your help”), greetings (“How’s your day going?”), hedges (“I think that”), and, of course, the liberal use of the word “please.” All this combines to create not only a friendly atmosphere, but an emotional buffer between the two participants. It’s essentially a no-man’s-land of disagreement, where someone can admit they’re wrong without losing face.

On the other hand, warning signs include repeated, direct questioning (“Why is there no mention of this? Why didn’t you look at that?”) and the use of sentences that start with second-person pronouns (“Your sources don’t matter”), especially when they appear in the first reply, which suggests someone is trying to make the matter personal. On top of these signals, the researchers also gauged the general “toxicity” of conversations using Google’s Perspective API, an AI tool that tries to measure how friendly, neutral, or aggressive any given text is.
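To make the idea of conversational cues concrete, here is a minimal sketch of how such signals might be counted in a single comment. It is not the researchers’ code: the word lists, the `CUES` table, and the `extract_features` function are hypothetical stand-ins built on crude pattern matching, and the Perspective API toxicity score is left out entirely.

```python
import re

# Hypothetical cue patterns: rough stand-ins for the friendly signals
# described above, not the feature set used in the actual research.
CUES = {
    "gratitude": re.compile(r"\b(thanks|thank you)\b", re.I),
    "greeting": re.compile(r"\b(hi|hello|hey)\b", re.I),
    "hedge": re.compile(r"\b(i think|i believe|it seems|perhaps|maybe)\b", re.I),
    "please": re.compile(r"\bplease\b", re.I),
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def extract_features(comment):
    """Return a dict of cue counts for one comment."""
    sentences = split_sentences(comment)
    features = {name: len(pattern.findall(comment)) for name, pattern in CUES.items()}
    # Direct questions: sentences ending in "?" that open with a question word.
    features["direct_question"] = sum(
        1 for s in sentences
        if s.endswith("?") and re.match(r"(why|what|how|where|who|when)\b", s, re.I)
    )
    # Sentences that open with a second-person pronoun ("You...", "Your...").
    features["second_person_start"] = sum(
        1 for s in sentences if re.match(r"(you|your|yours)\b", s, re.I)
    )
    return features

friendly = extract_features("Thanks for your help. I think that is fine, please go ahead.")
hostile = extract_features(
    "Why is there no mention of this? Why didn't you look at that? "
    "Your sources don't matter."
)
print(friendly)
print(hostile)
```

Counts like these only capture surface patterns. A real system would need far more careful language processing, which is part of why the researchers leaned on features identified in earlier work.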

Using a statistical method known as logistic regression, the researchers worked out how to best balance these factors when their software made its judgments. At the end of the training period, when given a pair of conversations that started friendly but where one ended in personal insults, the software was able to predict which was which just under 65 percent of the time. That’s pretty good, although some major caveats apply: first, the test was done on a limited data set (Wikipedia talk pages, where, unusually for online discussions, participants have a shared goal: improving the quality of an article). Second, humans still performed better at the same task, making the right call 72 percent of the time.
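For readers curious how logistic regression weighs such factors, the sketch below fits a tiny model by gradient descent and then scores it the way the article describes: given a pair of conversations, does the one that derailed get the higher risk score? Everything here is an assumption made for illustration. The three features, the toy numbers, and the helper names (`train_logistic_regression`, `paired_accuracy`) are invented, and none of it reproduces the researchers’ data or results.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Estimated probability that a conversation with features x derails."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic_regression(rows, labels, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def paired_accuracy(weights, bias, pairs):
    """Share of (derailed, civil) pairs where the derailed one scores higher."""
    correct = sum(
        1 for derailed, civil in pairs
        if predict_proba(weights, bias, derailed) > predict_proba(weights, bias, civil)
    )
    return correct / len(pairs)

# Made-up toy data. Columns: politeness cues, direct questions,
# second-person sentence starts. Label 1 means the conversation derailed.
train_rows = [[3, 0, 0], [2, 1, 0], [4, 0, 1], [0, 2, 1], [1, 3, 2], [0, 1, 2]]
train_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(train_rows, train_labels)

# Made-up held-out pairs: (derailed conversation, civil conversation).
test_pairs = [([0, 3, 2], [4, 0, 0]), ([1, 2, 1], [3, 0, 0])]
accuracy = paired_accuracy(weights, bias, test_pairs)
print("weights:", weights)
print("paired accuracy:", accuracy)
```

The learned weights are the “balance” the article refers to: a negative weight means a cue lowers the estimated risk of derailment, a positive one raises it. On toy data this clean the task is easy; the real talk-page data is far noisier, which is why the reported figure is just under 65 percent.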

But for the scientists, the work shows that we’re on the right path to creating machines that can intervene in online arguments. “Humans have nagging suspicions when conversations will eventually go bad, and this [research] shows that it’s feasible for us to make computers aware of those suspicions, too,” Justine Zhang, a PhD student at Cornell University who worked on the project, tells The Verge.

Research like this is particularly interesting, as it’s part of an emerging body of work that uses machine learning to analyze online discussions. Tech giants like Facebook and Google, which operate huge, influential platforms full of angry commenters, are in dire need of tech like this. Recent outcry over Russian political ads on Facebook and horrific children’s content on YouTube suggests what the stakes are. These companies hope that AI will be able to do a better job (and cost less) than human moderators.

AI is going to be vital for moderating online spaces

However, while AI can certainly handle the scale of the internet, it has proven time and time again that it can’t cope with the nuances of human language. Google’s Perspective API, for example, which was used in this research, is pitched as a tool that can help remove toxic comments from websites. But its judgment is often flawed; for example, it ranks the sentence “I am a man” as much less toxic than “I am a gay, black woman.” (Google, for its part, stresses that Perspective is a work-in-progress, and it’s only one tool in a developer’s arsenal for automating comment moderation.)

In the case of this specific research, you can imagine it being used to intervene in online discussions, giving users a nudge when things look like they’re about to get heated. Such a bot would be welcome in many arenas (who hasn’t dived into an argument without thinking at least once in their life?), but it’s worth considering potential downsides, too. Beyond the wider problems of AI moderation, what if a bot like this were adapted to weed out political dissent? Or what if stopping humans from having arguments online is actually a bad thing? No one is claiming that the internet is a model of productive debate, but when a conversation goes wrong, it’s at least an opportunity to find out how not to do things. If we’re too insulated, we might never learn.

Points like these were “in the back of the team’s heads” while doing this work, says Zhang. She said they found that “some disagreements are inherently useful,” and if the software was too quick to judge, it could “dissuade potentially constructive discussions.” “There are cases of people managing to recover from bad conversations, so deciding when a machine should step in and mediate [is] an interesting question.”

Above all, says Zhang, the work showed how unpredictable and dynamic human conversation can be. When looking over the data, she said, she’d sometimes find a discussion with all the hallmarks of a row in the making, and would read on only to find that the participants had somehow pulled back from the brink and ended things amicably. “It was interesting to see people managing to do this, using some weird combination of being firm and polite,” she says.

Zhang hopes future work will focus on similar examples: exchanges that show humanity redeeming itself at the last minute. “Our focus was looking at conversations that derail, but it was nice to see that that’s not all that is out there.”