Facebook has a link problem. Earlier this week, a security researcher named Inti De Ceukelaire detailed a curious fact about how Facebook Messenger treats privately shared links. Through the right API call, De Ceukelaire was able to summon links shared by specific users in private messages. The links had been collected by Facebook’s crawler, and De Ceukelaire discovered they were easily accessible to anyone running a Facebook app. Those links could be anything from a popular news story to directions to an abortion clinic. As long as they’re shared in private messages, they’re logged in Facebook’s database and accessible to API calls.
It would be hard to exploit that bug at scale for a few different reasons. De Ceukelaire was only able to make the API call because he's registered as a Facebook developer, and if he started pulling those links en masse, Facebook would quickly catch on and pull his credentials. Still, the bug points to a number of lingering problems with the conflicting way web services treat URLs, and how those conflicts can put private information into public view.
The biggest problem is a simple legal one. Facebook is already facing allegations that it scanned private messages, leading to a lawsuit that was certified for non-monetary damages last month. After examining the company’s source code, one expert described a database called Titan that displayed the date, time, and recipients of any private messages sent through the system. Facebook insists its link practices were entirely lawful and that many of them have since been discontinued.
The practice of scanning links is larger than just Facebook. URLs are a common place for sites to collect data, either by routing the link through an intermediary or by appending query parameters to the end of the URL. That's a great way to keep track of where people are coming from, but it can cause real privacy concerns, as Facebook is now discovering. Twitter was hit with a similar lawsuit last month, alleging that link-shortening measures in direct-messaged links constituted a violation of privacy. If Bit.ly knows which links to shorten, it knows which links are being sent to you.
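To make the tagging mechanism concrete, here is a minimal Python sketch of how a service might append tracking tags to a shared link, and how a reader could strip them off again. The parameter names (`utm_source`, `ref`) and the example.com address are illustrative assumptions, not anything taken from Facebook's or Twitter's systems.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative tracking parameter names (assumptions, not any one site's real list).
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"ref"}


def add_tracking(url, tags):
    """Append tracking tags to a URL's query string, as a sharing service might."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(tags.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


def strip_tracking(url):
    """Drop any query parameter that looks like a tracking tag."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


tagged = add_tracking("https://example.com/story?id=7", {"utm_source": "messenger"})
clean = strip_tracking(tagged)
```

Stripping the tags only removes what is visible in the address. A link routed through an intermediary is still seen by that intermediary, whatever the query string says.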
But while some systems are using URLs as public data points, other systems are using them as passwords. If you’re sharing a Dropbox folder or login-free Google document, that URL is as much of a password as an address, a system that also plays a central role in Google Photos. Scooping up those URLs in transit is a genuine security risk, exposing potentially sensitive documents to third-party intermediaries. Of course, it’s not likely that Bit.ly will start harvesting Google Docs en masse, and more than a few people would notice if it did. But the fact that it’s possible at all should give us pause.
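The "URL as password" idea can be sketched in a few lines of Python. This is a hypothetical illustration, not Google's or Dropbox's actual scheme: the base address, the token length, and the function names are all assumptions. The point is that the random token in the address is the only credential, so anyone who logs the URL in transit holds the same key as the intended recipient.

```python
import hmac
import secrets

# Hypothetical base address for shared documents (an assumption for illustration).
BASE = "https://docs.example.com/d/"


def new_share_url():
    """Create an unguessable share link; the token is the only credential."""
    token = secrets.token_urlsafe(30)  # 30 random bytes encode to 40 URL-safe characters
    return BASE + token, token


def can_open(url, stored_token):
    """Grant access to whoever presents a URL carrying the stored token."""
    presented = url[len(BASE):] if url.startswith(BASE) else ""
    # Constant-time comparison, so the check itself does not leak the token.
    return hmac.compare_digest(presented.encode(), stored_token.encode())


url, token = new_share_url()
logged_copy = url  # what an intermediary would hold after logging the link
```

Here `can_open(logged_copy, token)` succeeds just as it would for the intended recipient, because the check has no way to tell the two apart.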
That leaves consumers in a tricky place. When Google gives you a private 40-character URL, how are you meant to share it without allowing it to be scraped? Even if you know enough to seek out a privacy-minded messaging service, how are users supposed to tell who’s storing URL logs and who isn’t? Both systems are invisible and they combine in unpredictable ways. But the end result for users is a private system that isn’t nearly as private as it seems.