Spam is a problem, and any Twitter user knows well that the social network has its fair share of it. A team of researchers is working to fight back, however. In a paper released this week, they detailed how they've been working with Twitter to take a close look at how fraudulent accounts are made and how they can be stopped — and they've already had promising results.
The team, made up of researchers from George Mason University, the International Computer Science Institute, and the University of California, Berkeley, worked with Twitter to purchase over 127,000 fraudulent, automatically generated accounts from 27 different merchants over a ten-month period starting in June 2012. Their goal? To develop a way to stop spam accounts before they're made or before they're used to spread malware, phishing attempts, scams, and more across the web.
According to their results, it appears the team's been successful. With Twitter's assistance, they were able to work up a set of identifiers that could be used to sniff out when accounts were generated automatically. Specifically, the researchers were able to flag usernames made using specific patterns, and they used information on the signup procedure — like how long it took to fill out forms — to hone their results. Twitter ultimately used the data to wipe out "several million" of the fraudulent accounts that came from the 27 merchants the researchers studied, and it worked at catching new accounts, too. When the researchers attempted to buy 14,000 additional accounts after Twitter implemented the changes, 90 percent were dead on arrival. One merchant told the researchers (who were posing as scammers) that "All of the stock got suspended... Not just mine... Don’t know what Twitter has done."
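To make the approach concrete, here is a minimal sketch of the kind of check described above: a username pattern flags a candidate, and signup timing narrows the result. It is not the researchers' or Twitter's actual classifier. The regular expression, the five-second threshold, and the names `Signup` and `looks_automated` are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: letters followed by a long run of digits.
# The real patterns used by the researchers are not given in the article.
USERNAME_PATTERN = re.compile(r"[A-Za-z]+\d{4,}")

# Hypothetical threshold: forms completed faster than this look scripted.
MIN_FORM_SECONDS = 5.0


@dataclass
class Signup:
    username: str
    form_seconds: float  # time taken to fill out the signup form


def looks_automated(signup: Signup) -> bool:
    """Flag a signup only when both signals agree."""
    pattern_hit = USERNAME_PATTERN.fullmatch(signup.username) is not None
    too_fast = signup.form_seconds < MIN_FORM_SECONDS
    return pattern_hit and too_fast
```

Requiring both signals reflects the article's description of timing data being used to hone the pattern-based results. A real system would presumably learn its patterns and thresholds from purchased accounts rather than hard-code them.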
Unfortunately, that doesn't mean Twitter spam is gone for good. The paper (PDF) estimates that the accounts stemming from those merchants represented only about 10 to 20 percent of Twitter spam. That means it'd take a lot of time and energy to keep on top of it all. As one of the researchers told Brian Krebs for his security blog, "We would love to keep doing this, but the hard part is you kind of have to keep doing the buys, and that’s a lot of work." And that's compounded by the fact that merchants can counteract the algorithms, meaning they have to be updated constantly. Indeed, two weeks after Twitter started using the team's work, only 54 percent of new accounts purchased by the researchers were immediately suspended.
There are other ways to counteract spam, however, and the team's made a few recommendations from their research. It currently costs just about four cents per Twitter account (typically purchased by the thousand), but Twitter could deter spammers with higher costs by making it more difficult to generate fraudulent accounts.
Twitter currently requires an email address to sign up, but the researchers found that merchants often didn't pass along the email credentials used to make spammy accounts, meaning periodic requests for email confirmation after signup could catch some spammers. Additionally, stronger requirements from free webmail providers would go a long way. Gmail accounts, according to the research, cost up to 150 times more than their Hotmail and Yahoo equivalents because Google requires phone verification to sign up. It's perhaps unsurprising, then, that over 60 percent of the fraudulent accounts studied were tied to Hotmail addresses and 11 percent to Yahoo addresses, while only 1.89 percent used Gmail.
Ever-despised CAPTCHAs could also make a difference. Merchants use digital "sweatshops" in places like China to have humans solve the irritating codes, raising costs. The researchers say that the codes stopped 92 percent of accounts from being made, but only 35 percent of accounts the researchers purchased required a CAPTCHA. Merchants were able to get around the codes by using infected PCs to avoid the IP blacklists that Twitter uses to guess whether a signup comes from a machine rather than a person. With more adaptive IP blacklists, or stricter CAPTCHA requirements, Twitter could stem the tide of spam.
Some significant challenges certainly remain then, but the researchers' work at breaking down the patterns used to automatically generate Twitter accounts has had some promising results. The team notes that they "are now working with Twitter to integrate our findings and existing classifier into their abuse detection infrastructure," and any attempt to whittle down the number of spammy @replies is very welcome — even if that means entering more CAPTCHAs.