Facebook automatically scans posts and chat logs for criminal activity, using big data processing techniques similar to those used in targeted advertising to determine the most vulnerable users, according to a new Reuters report that explores efforts to combat pedophilia on social media. The social network's scanning tools use factors such as mutual friends, past interaction, distance and age difference, alongside simple phrase searches, to flag potentially nefarious conversations for human moderators. They also rely on archives of previous conversations that are known to have led to sexual assaults, identifying patterns and searching for similar ones.
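To make the idea of combining such factors concrete, here is a minimal sketch of a rule-based score. It is not Facebook's system, whose details the report does not disclose: the `Conversation` fields, the weights, the thresholds and the phrase list are all hypothetical, chosen only to show how several weak signals can add up to a flag for human review.

```python
from dataclasses import dataclass

# Hypothetical phrase list; purely illustrative.
WATCH_PHRASES = ("keep this secret", "don't tell your parents")


@dataclass
class Conversation:
    mutual_friends: int   # number of friends the two users share
    prior_messages: int   # rough proxy for past interaction
    distance_km: float    # distance between the two users
    age_gap_years: int    # difference between stated ages
    text: str             # message text to search for phrases


def risk_score(c: Conversation) -> float:
    """Add up hypothetical weights for each signal that looks unusual."""
    score = 0.0
    if c.mutual_friends == 0:
        score += 1.0
    if c.prior_messages < 5:
        score += 1.0
    if c.distance_km > 100:
        score += 0.5
    if c.age_gap_years >= 10:
        score += 1.5
    lowered = c.text.lower()
    if any(phrase in lowered for phrase in WATCH_PHRASES):
        score += 2.0
    return score


def needs_review(c: Conversation, threshold: float = 4.0) -> bool:
    """Flag for a human moderator only when the combined score is high."""
    return risk_score(c) >= threshold


stranger = Conversation(0, 1, 500.0, 20, "Keep this secret, ok?")
relative = Conversation(30, 400, 5.0, 25, "See you at dinner on Sunday")
print(needs_review(stranger), needs_review(relative))
```

In this sketch a large age gap on its own stays below the threshold, so a long-standing contact with many mutual friends is not flagged, while a stranger who trips several signals at once is.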
Apparently keen to pre-empt the sorts of privacy concerns that have dogged Facebook in recent months, Chief Security Officer Joe Sullivan tells Reuters that "it's really important that we use technology that has a very low false-positive rate." He explains that "[w]e've never wanted to set up an environment where we have employees looking at private communications," stressing that the company's systems attempt to avoid flagging long-standing personal relationships. Much of the activity that Facebook refers to law enforcement is identified through its user reporting system, detailed in a recent infographic.
Privacy issues aside, it would be practically impossible for human moderators to effectively trawl through the vast amount of data generated each day by Facebook's more than 900 million users. Even much smaller sites such as Habbo Hotel are unable to provide effective human monitoring. Lacking Facebook's automatic flagging technology, the site became embroiled in an embarrassing pedophile scandal last month, when a journalist posing as a 13-year-old girl was bombarded with sexually explicit messages.
Facebook limits interactions with under-18s, removing their profiles from public searches while restricting messaging to friends-of-friends and chat to friends only. Unfortunately, this doesn't solve the issue of users providing false ages, a problem which cuts both ways: while predatory adults are known to impersonate teens, children younger than 13 also frequently lie about their age to gain access to the site.
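The restrictions described above can be read as a small set of rules keyed on a user's stated age and the relationship between two accounts. The sketch below is an illustration only: the `Relation` enum and the function names are hypothetical, and treating adult accounts as unrestricted is an assumption made for simplicity, not something the report states.

```python
from enum import Enum


class Relation(Enum):
    FRIEND = 1
    FRIEND_OF_FRIEND = 2
    STRANGER = 3


ADULT_AGE = 18


def appears_in_public_search(stated_age: int) -> bool:
    # Under-18 profiles are removed from public searches.
    return stated_age >= ADULT_AGE


def can_message(recipient_age: int, relation: Relation) -> bool:
    # Assumption: adults are unrestricted in this sketch.
    if recipient_age >= ADULT_AGE:
        return True
    # Messaging a minor is limited to friends and friends-of-friends.
    return relation in (Relation.FRIEND, Relation.FRIEND_OF_FRIEND)


def can_chat(recipient_age: int, relation: Relation) -> bool:
    # Assumption: adults are unrestricted in this sketch.
    if recipient_age >= ADULT_AGE:
        return True
    # Chat with a minor is limited to friends only.
    return relation is Relation.FRIEND
```

Every check here depends on the stated age, which is why a false age in either direction defeats the rules.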