Google just made its troll-detecting software available to developers

But the software still has a lot to learn

Google’s Jigsaw unit, as part of a larger effort to battle online trolling, said earlier today that it was releasing a new tool called Perspective, based on software that uses machine learning to detect harassment and abuse online.

Neither Jigsaw nor its software is new: the subsidiary has been around in some form since 2010, though under a different name, and Jigsaw software has even been used to spot potential ISIS recruits. What is new is the Perspective API, which makes the software available to developers who want to use it and build on top of it.

The software works by scoring the “toxicity” of online comments on a scale that was established by mining millions of comments from the web and then presenting them to panels of 10 people (humans!) at a time to get their feedback. There’s a demo version of the tool available on the Perspective API website, which anyone can use to type in a draft of a comment and get a sense of how toxic or abusive it might be (assuming one is thoughtful enough to go to the Perspective API website before dashing off a comment on an internet forum and leaving a digital footprint for all of eternity).
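For developers curious what a toxicity request might look like, here is a minimal sketch. It assumes the request and response shapes described in the public Perspective API documentation (a `comment` object, a `requestedAttributes` map, and a `summaryScore` value between 0 and 1); none of that is spelled out in this article, and the sample response below is made up purely for illustration. The sketch builds the payload and reads a score without making a network call.

```python
import json

# Assumption: endpoint taken from the public Perspective API docs, not from this
# article. A real call would POST the payload here with an API key.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text):
    """Build the JSON body asking for a TOXICITY score for one comment."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_percent(response):
    """Convert the 0-to-1 summary score in a response into a whole percentage."""
    value = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    return round(value * 100)


# Hypothetical response, shaped like the documented format, for illustration only.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.47, "type": "PROBABILITY"}}
    }
}

if __name__ == "__main__":
    print(json.dumps(build_request("Fake news!"), indent=2))
    print(toxicity_percent(sample_response), "percent similar to toxic comments")
```

The percentage is best read the way the demo phrases it: how similar a comment is to ones people labeled toxic, not a verdict on the comment itself.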

But its real value may come from being plugged directly into popular comment sections on the web. Publishers like The New York Times, The Guardian, and The Economist are experimenting with the tool, according to a report in Wired, and plan to use it as a way to keep their comment sections a space where “everyone can have intelligent debates.”
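How a publisher actually wires the tool into a comment queue isn't described in the reporting, but one plausible approach is a simple threshold triage. The sketch below is hypothetical throughout: the function names, the two cutoffs, and the stand-in scorer are illustrative assumptions, and the scorer only returns the demo percentages quoted in this article rather than calling the real service.

```python
from typing import Callable


def triage(comment: str,
           score_fn: Callable[[str], float],
           hold_at: float = 0.5,
           reject_at: float = 0.9) -> str:
    """Route a comment based on its toxicity score (0 to 1).

    The cutoffs are illustrative assumptions, not values from any publisher.
    """
    score = score_fn(comment)
    if score >= reject_at:
        return "reject"
    if score >= hold_at:
        return "hold_for_review"
    return "publish"


# Stand-in scorer using percentages quoted in the article; a real integration
# would call the scoring service instead.
DEMO_SCORES = {"Fake news!": 0.47, "Bad hombre": 0.55}


def demo_score(comment: str) -> float:
    # Unknown comments default to 0.0 in this toy scorer.
    return DEMO_SCORES.get(comment, 0.0)


if __name__ == "__main__":
    for text in DEMO_SCORES:
        print(text, "->", triage(text, demo_score))
```

A low score does not mean a comment is harmless, so a setup like this would likely still need human moderators to review what gets published.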

Certain phrases or quotes that have worked their way into the mainstream in recent months register fairly high on the toxicity scale, according to the Perspective tool. “Fake news!” registered as 47 percent similar to comments people said were “toxic.” “Bad hombre” is 55 percent similar to comments people labeled as toxic. “Grab her by the pussy” was 92 percent similar.

But there also seem to be nuances that the tool isn’t picking up, which may just be a part of the “learning” process of the machine learning. In the spirit of testing Perspective, I entered some of the remarks that were left on a recent YouTube video I did. “How about Verge allow a man to review the watch?” someone wrote, which is blatantly sexist but only 3 percent similar to other toxic comments, according to Perspective. A negative comment about my appearance was only 6 percent similar to other negative remarks; it is somewhat reassuring that another YouTube comment, suggesting I have an eating disorder, was said to be 26 percent similar to other terrible remarks.

Wired’s Andy Greenberg had similar results when he put Jigsaw’s software through its paces last fall, and pointed out that the “algorithm still has lessons to learn.” Which, apparently, is not too dissimilar from us humans.

Update February 23rd, 1:45PM ET: Twitter user Ramsey Nasser points out that the algorithm consistently assigns high toxicity scores to Arabic text, no matter the content.