Google, Facebook, Microsoft, and Twitter partner for ambitious new data project

Today, Google, Facebook, Microsoft, and Twitter joined to announce a new standards initiative called the Data Transfer Project, designed as a new way to move data between platforms. In a blog post, Google described the project as letting users “transfer data directly from one service to another, without needing to download and re-upload it.”

The current version of the system supports data transfer for photos, mail, contacts, calendars, and tasks, drawing from publicly available APIs from Google, Microsoft, Twitter, Flickr, Instagram, Remember the Milk, and SmugMug. Many of those transfers could already be accomplished through other means, but participants hope the project will grow into a more robust and flexible alternative to conventional APIs. In its own blog post, Microsoft called for more companies to sign onto the effort, adding that “portability and interoperability are central to cloud innovation and competition.”

The existing code for the project is available open-source on GitHub, along with a white paper describing its scope. Much of the codebase consists of “adapters” that can translate proprietary APIs into an interoperable transfer, making Instagram data workable for Flickr and vice versa. Between those adapters, engineers have also built a system to encrypt the data in transit, issuing forward-secret keys for each transaction. Notably, that system is focused on one-time transfers rather than the continuous interoperability enabled by many APIs.
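
To make the adapter idea concrete, here is a minimal sketch of how two services with different native formats could meet at a shared data model. It is an illustration only and is not taken from the project's codebase: `Photo`, `ServiceAExporter`, `ServiceBImporter`, `transfer`, and every field name are hypothetical. The per-transfer encryption described above is left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Photo:
    """Hypothetical service-neutral photo record shared by all adapters."""
    title: str
    url: str
    album: str


class Exporter(Protocol):
    """Adapter that reads a service's native data into the common model."""
    def export_photos(self) -> list[Photo]: ...


class Importer(Protocol):
    """Adapter that writes common-model records into a service's native form."""
    def import_photos(self, photos: list[Photo]) -> int: ...


class ServiceAExporter:
    """Translates Service A's (invented) native records into Photo records."""

    def __init__(self, native_items: list[dict]) -> None:
        self._items = native_items

    def export_photos(self) -> list[Photo]:
        return [
            Photo(
                title=item["caption"],
                url=item["media_url"],
                album=item.get("collection", "default"),
            )
            for item in self._items
        ]


class ServiceBImporter:
    """Translates Photo records into Service B's (invented) native records."""

    def __init__(self) -> None:
        self.stored: list[dict] = []

    def import_photos(self, photos: list[Photo]) -> int:
        for photo in photos:
            self.stored.append(
                {"name": photo.title, "source": photo.url, "set": photo.album}
            )
        return len(photos)


def transfer(exporter: Exporter, importer: Importer) -> int:
    """One-time transfer: export everything, then import it; returns item count."""
    return importer.import_photos(exporter.export_photos())
```

Because each service only has to translate to and from the shared model, adding a new service in this sketch means writing one exporter and one importer rather than a converter for every other service.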

“The future of portability will need to be more inclusive, flexible, and open,” reads the white paper. “Our hope for this project is that it will enable a connection between any two public-facing product interfaces for importing and exporting data directly.”

The bulk of the coding so far has been done by Google and Microsoft engineers who have long been tinkering with the idea of a more robust data transfer system. According to Greg Fair, product manager for Google Takeout, the idea arose from a frustration with the available options for managing data after it’s downloaded. Without a clear way to import that same data to a different service, tools like Takeout were only solving half the problem.

“When people have data, they want to be able to move it from one product to another, and they can’t,” says Fair. “It’s a problem that we can’t really solve alone.”

Most platforms already offer some kind of data-download tool, but those tools rarely connect with other services. Europe’s new GDPR legislation requires these tools to provide all available data on a given user, which means a download is far more comprehensive than what you’d get from an API. Along with emails or photos, you’ll find thornier data like location history and facial recognition profiles that many users don’t even realize are being collected. There are a few projects trying to make use of that data (most notably Digi.me, which is building an entire app ecosystem around it), but for the most part, it ends up sitting on users’ hard drives. Download tools are presented as proof that users really do own their data, but owning your data and using it have turned into completely different things.

The project was envisioned as an open-source standard, and many of the engineers involved say a broader shift in governance will be necessary if the standard is successful. “In the long term, we want there to be a consortium of industry leaders, consumer groups, government groups,” says Fair. “But until we have a reasonable critical mass, it’s not an interesting conversation.”

This is a delicate time for a data-sharing project. Facebook’s API was at the center of the Cambridge Analytica scandal, and the industry is still feeling out exactly how much users should be trusted with their own data. Google has struggled with its own API scandal, facing outcry over third-party email apps mishandling Gmail users’ data. In some ways, the proposed consortium would be a way to manage that risk, spreading the responsibility out among more groups.

Still, the specter of Cambridge Analytica puts a real limit on how much data companies are willing to share. When I asked about the data privacy implications of the new project, Facebook emphasized the importance of maintaining API-level controls.

“We always want to think about user data protection first,” says David Baser, who works on Facebook’s data download product. “One of the things that’s nice about an API is that, as the data provider, we have the ability to turn off the pipeline or impose conditions on how they can use it. With a data download tool, the data leaves our hands, and it’s truly out there in the wild. If someone wants to use that data for bad purposes, Facebook truly cannot do anything about it.”

At the same time, tech companies are facing more aggressive antitrust concerns than ever before, many of them centering on data access. The biggest tech companies have few competitors. And as they face new questions about federal regulation and monopoly power, sharing data could be one of the least painful ways to rein themselves in.

It’s an unlikely remedy for companies that are reeling from data privacy scandals, but it’s one that outsiders like Open Technology Institute director Kevin Bankston have been pushing as more important than ever, particularly for Facebook. “My primary goal has been to make sure that the value of openness doesn’t get forgotten,” Bankston says. “If you’re concerned about the power of these platforms, portability is a way to balance that out.”

Update 7/20/2018 12:00PM EST: This piece was updated to include reference to Microsoft’s announcement of the Data Transfer Project.

Comments

As always, IF this works and IF this is only used as it seems to be intended, it’s a great idea and a big step forward.

Also as always, there’s a major name missing from that list that will limit its success.

The missing major name doesn’t want you to move beyond its walls, so it’s understandable they’re missing.

I hope this will be encrypted by default, backed by open-source technology, and able to circumvent any government-sanctioned data relay, so that in transfers between data centers the government has no means or right to get a copy of it too. Also I hope this will make switching ecosystems easier and less user-hostile than ever before. Also where the hell is Apple’s participation?!?

Hope. It is the quintessential human delusion, simultaneously the source of your greatest strength and your greatest weakness.

Already I can see the chain reaction: the chemical precursors that signal the onset of an emotion, designed specifically to overwhelm logic and reason.

Also where the hell is Apple’s participation?!?

Apple can make hardware, but their services…

I’d expect plugins from Yahoo and AOL before Apple could work it out.

Does Apple ever play along? It always feels like Apple is more focused on locking you into their ecosystem/walled garden.

To be fair, though, most companies would love that; Apple just happens to be exceedingly good and successful at doing so on account of its hardware prowess.
Whereas a company like Google also tends to make money in a different way so they wouldn’t care as much as long as you use their search and maps. MS is heading towards the same model (or is already there in a sense).

The idea itself is great, but the risks associated with it make it a hard no for me.

Especially with Facebook and Google involved. To me this seems more like those companies finally being able to fulfill their dream of knowing absolutely everything about you.

As it is, I refuse to let those apps raid my phone for information when I can avoid it (thank you, iOS); no way in hell would I let Facebook see my Google data and vice versa. Yes, I know they can still get some of that information other ways, but it’s not a complete picture (and that’s another issue).

As far as Apple is concerned, with how much they have been pushing privacy as a major selling point of their products for the last few years, there’s no way they would open up their services to something like this. I could see Apple working with Microsoft or Amazon… but not Google or Facebook in this area.

What are you up to that you’re so worried about this?

The simple fact that I don’t enjoy Google and Facebook treating me as the product?
What is so hard to understand about not wanting a private company (or anyone for that matter) to know all of that information about me for the sole purpose of selling me advertising?

That is why I stopped using "free" services as much as possible and switched to paid services (email, RSS feeds, etc).

I don’t have to be doing anything that I am worried about to not want my information mined in this manner (I mean, I do have some photos that I would rather not end up on Facebook’s or Google’s servers).

Nobody’s selling you anything unless you decide to make a purchase. I guess I look at it the other way: What damage is done if I use their products and they know more about me? In my case, zero damage.

I never said they were selling me anything, I am the one being sold.
Google and Facebook treat you as the product and their advertisers as the customers.

What happened with Facebook is enough reason to not trust that these companies are actually going to do what is necessary.
When Google or Facebook put profits of the data they know about you above actually making things that benefit you there is a problem. That is exactly what they do though, their business is making money off of your data in some form.
That is the only way their services are able to remain "free". Instead of paying them directly, you are paying them with your data.

The damage comes from what happened with Facebook and Cambridge Analytica, a new API (like the one in this article) that leaks more than it should, a company having access to data they never should have had.
Can you imagine what would have happened if this was in place during the Facebook scandal and it also somehow allowed Cambridge Analytica to get your emails or other private data that you thought were safe on Google?

Honestly the inclusion of Facebook has me skeptical of the entire project.
I can trust Google and Microsoft as they’ve never given me a reason to doubt them, but Twitter and Facebook are dangerous and terrifying.

I’m glad I never joined Facebook as I saw the dangers of it right out of high school. That was 15 years ago and I’ve only become increasingly displeased with them.

As far as Apple is concerned, with how much they have been pushing privacy as a major selling point of their products for the last few years, there’s no way they would open up their services to something like this.

Yeah, Apple care so much about your privacy that they won’t even allow you to get access to your own information.

If people care about protecting their privacy they need to realise they are better off just letting Apple control their data.

that they won’t even allow you to get access to your own information.

Not entirely sure what you mean by that, since they are opening up the same things all the other companies are for downloading your data.

I don’t agree with letting Apple control all of your data if protection is what you care about most. For me they are just the lesser of the evils, since I don’t need to worry about them mining my data for ads or anything (but I still have to worry about a leak or something with the data I have stored with them).

Facebook and Twitter? Yeah, no…yeah…..no….
