The shady data-gathering tactics used by Cambridge Analytica were an open secret to online marketers. I know, because I was one.

Illustration by William Joel / The Verge

The recently revealed Facebook data “breach” that allowed Cambridge Analytica to get access to millions of users’ worth of Facebook data has been greeted as a shocking scandal. Reporters and readers have been surprised to learn about the ability to gather personal data on the friends of people who install a Facebook app, the conversion of a personality quiz into a source of political data, the idea that you can target marketing messages based on individual psychographic profiles, and the surreptitious collection of data under the guise of academic research, later used for political purposes. But there is one group of people who are mostly unsurprised by these revelations: the market researchers and digital marketers who have known about (and in many cases, used) these tactics for years. I’m one of them.

Back when the Cambridge Analytica data was getting collected by an enterprising academic, I was the vice president of social media for Vision Critical, a customer intelligence software company that powers customer feedback for more than a third of the Fortune 100 companies. Our enterprise clients wanted to know how social media data could complement the insights they were getting from their customer surveys, and it was my job to come up with a way of integrating social media data with survey data.

Vision Critical’s blue chip customer roster made us an appealing target for the many social media vendors hawking data-gathering solutions. In 2012, the data analytics firm Microstrategy pitched me on their “Wisdom” tool for Facebook, which the company had touted as a data source based on “12 million anonymous, opted-in Facebook users.” But when I spoke with an analyst at Microstrategy that December, he told me that the company’s data set (by then, nearly 17.5 million strong) was based on just 52,600 actual installs, each of which provided access to an average of 332 friends.
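The arithmetic behind that gap can be sketched in a few lines of Python. This is only an illustration: the function name `estimated_reach` is hypothetical, and it assumes no overlap between installers' friend lists, so it gives an upper bound rather than a count of unique people.

```python
def estimated_reach(installs: int, avg_friends: int) -> int:
    """Upper bound on friend profiles reachable through app installs.

    Assumes (for illustration only) that no two installers share a friend.
    """
    return installs * avg_friends


# Figures quoted by the Microstrategy analyst in the paragraph above.
reach = estimated_reach(installs=52_600, avg_friends=332)
print(f"{reach:,} friend profiles from 52,600 installs")
```

The product, 17,463,200, lines up with the "nearly 17.5 million" figure: each install contributed hundreds of profiles belonging to people who never opened the app.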

Nor was Microstrategy doing something unusual. The tactic of collecting friend data, which has been featured prominently in the Cambridge Analytica coverage, was a well-known way of turning a handful of app users into a goldmine.

“We were all conscious that friend data was accessible,” says Sam Weston, a communications consultant who has been working in digital marketing and market research for nearly two decades. “I don’t think that anybody had perspective on the potential consequences until it was slotted into this news story, where the consequence may have been the election of Donald Trump.”

Mary Hodder, a longtime privacy consultant who is now the product developer for the Identity Ecosystem Steering Group, was equally unsurprised. “I knew 10 years ago that Facebook’s API allowed an entity to gather friend data,” Hodder told me. “But I wasn’t surprised that the 95 percent of the population that didn’t understand this were shocked. They thought if Facebook was going to sell you out, it would just be you. They didn’t know you would take all your friends with you.”

If Facebook’s generous access to friend data was known to many marketers and software developers, so was the tactic of disguising data grabs as fun apps, pages, or quizzes. Another company I spoke with back in 2012 was LoudDoor, a Facebook advertising company that offered enhanced ad targeting based on the data they were gathering from millions of Facebook users. The company ran a network of Facebook pages that were essentially content farms, like the Diving Wrecks and Reefs page that consisted of pretty underwater photos. Interspersed with all the photos were occasional come-ons for fan surveys that would enter you in a contest; taking the survey meant installing a “fan satisfaction” app that gave LoudDoor access to all your data. Pages and apps like this might have seemed innocent to the average Facebook user, but people in the marketing community were hardly deceived.

“It was pretty common knowledge among people who understood the internet that if you were taking a quiz to find out what kind of cheese you are, somebody on the other end is very interested in getting that data,” says Susanne Yada, a Facebook ad strategist. “I wish I could say I was more surprised and more alarmed. I just assumed that if you take a quiz, someone would know who you are because you are signed into Facebook.”

Among the ubiquitous data-gathering apps of that period, the personality quiz that Cambridge Analytica created was nothing special. “It is actually stunning to think, with the clarity that perspective brings, that you could stand up the kind of ridiculous quiz or survey that they did and then walk away with psychographic profiles on 50 million Americans,” Weston muses now. “Even for someone who worked in the field, [the Cambridge Analytica story] was a moment that gave you real pause to reflect on the business that we walked away from, but that was a massive part of the industry for a long time.”

And yes, these “fun” apps gathered your friends’ data too: the LoudDoor salesperson I spoke with at the time told me that they had close to 12 million users, which gave them access to data on 85 million Americans. But that friend data grab was far from clear to fans of the page, even if the company’s disclosure notice explained that the purpose of its app was to ensure that “brands can make better decisions about which content they should promote to you.”

As for the idea that the purpose of gathering data is to target ads — well, Cambridge Analytica is scarcely an outlier there, either. Mary Hodder recalls working with a company called Apisphere that used location data for ad targeting back in 2008 and 2009.

“We did this project for the Hard Rock Cafe Casino in Las Vegas,” Hodder told me. “They wanted to put wands in the ceiling to collect the IMEI [identification] numbers of every phone that went by, map everywhere they went in the casino or on the property, and map them in the hallways up to their rooms. And then they could do a reverse lookup on IMEI numbers because there are companies that aggregate IMEI numbers, and as soon as they figured out who the person was, they could send them offers, text them offers, and the people had not opted in. So they were basically just intercepting your phone, and figuring out how to send messages to you in one form or another.”

Hodder remembers objecting to this at a meeting where the rest of her colleagues saw nothing amiss with the practice: that’s how normal it was to harvest data and use it to target individual ads, long before Cambridge Analytica got in on the action. For those of us who were witness to the “look what we can do!” explosion of data-driven marketing tactics, it takes some reflection to understand why the practices of Cambridge Analytica have surprised so many people.

“When you say ‘We’re creating psychological profiles to sway people,’ marketers have been doing that since marketing existed,” Yada observes. “But I think there’s a difference between actually representing what your services are and how they can help people, versus being really clandestine and trying to sway people with fake news.”

“The fundamental problem is the gap in understanding about what Facebook’s business model actually is,” Weston says. “You know how Target’s business model works, or how Apple’s business model works, but nobody understands how these folks [digital marketers] actually make money. That’s not just true for Facebook but for every ad-supported business and every data-supported business, which is just about every tech company… [Facebook] did a good job talking to Wall Street about how their business works, but at no point did they actually talk to their users.”

Given the widespread normalization of deceptive data gathering and marketing tactics, I count myself lucky that the company I worked with didn’t buy into the frenzy of the social media data gold rush. Because Vision Critical had its roots in the market research industry, where there are norms and codes of practice around how you handle respondent data, the idea of grabbing up friend data was utterly anathema: the company’s founder dismissed it as a non-starter the very first time it came up, and at every stage in developing our own Facebook app, we disclosed that we were using it to gather data.

But the whole time, it felt like we were swimming against the tide by following old-school standards for transparency and accountability in how we handled data. I hate to admit how many times I pitched my colleagues on some clever way of incentivizing people to connect to Facebook, based on some scheme or app I’d just stumbled across, only to be reminded that it would violate our data or privacy policies. If I’d been working in a digital marketing agency where gamifying data requests was the norm, I can easily see how I might have yielded to the temptation of disguising a data grab with a recreational app, or scooping up friend data just because it was there.

That experience points to how difficult it will be to reform not just Facebook, but the larger industry of data collectors and marketing shops that have evolved to maximize the amount of data collected and the precision of ad targeting. Social networks and other advertising platforms may set up various processes that notionally screen out data aggregators or manipulative advertisers, but as long as these companies run on advertising revenue, they have little incentive to promote transparency among data brokers and advertisers. And those industries, in turn, have little motivation to place ethics ahead of profit.

The outrage now directed at Cambridge Analytica and Facebook suggests there might be an appetite for an online ecosystem based on a different compact between consumers, platforms and advertisers. But we won’t build that ecosystem by pretending that this is a matter of a few bad actors. It’s time for us to face up to what online marketers and researchers have known for more than a decade: the contemporary Internet runs on the exploitation of user data, and that fact won’t change until consumers, regulators and businesses commit to a radically different model.

Alexandra Samuel is the former VP of Social Media for Vision Critical. She is now an independent technology writer and a regular contributor to The Wall Street Journal, JSTOR Daily, and The Harvard Business Review.

Comments

Oh wow. This is gonna get much bigger now. I hope more media picks up this expose article by The Verge. Good job Verge for this

I’m done. When the Cambridge Analytica scandal popped up, I wasn’t sure I could really leave Facebook, bc no matter how much I hate it because of what I now know about this controversy, that’s where my family is, my access to local businesses that don’t have their own websites, and the biggest address book in the world, especially since I’m at a point in my life right now where I need networks of people for professional purposes. They’re just too big to leave. It’s such a huge mess, and it takes courage to really do it. I don’t even know how many apps I’ll lose access to if I really do it, bc I do really love their one-click sign-in too (I hate passwords), or even whether I’ll still have access to Facebook Messenger, bc that’s the only thing I really need from them; there’s no greater chat app than it here (I assume they still function separately). But no matter how hard it is, I’m gonna strive to do it, bc if no one does, these companies will stay business as usual. This company, this menace, finally needs to feel the heat from those who know, esp. for those who can’t bc they’re too comfortable to care (esp. most people here in Asia).

I’m gonna download my data now and look through it, trusting that that’s all of it; either that’s really it, or there’s still some data they won’t give you. The same goes for whether they even delete your data when you really delete your Facebook account. I’m expecting to find something I’ll be surprised they know about, but I won’t be surprised either.

Going back to this article’s context, I’m not surprised either, but the more thought you put into it, the scarier it is that this is what we live in now. I’m the kind of person who back then was carefree about privacy, bc you put so much trust in them that they won’t really do anything bad with it but only learn more about you, serve ads bc that’s their money, and in the end work better for you. But now I don’t know, bc it’s so much more complicated than that. Even if I never sign up for any of those shoddy survey apps, bc I’m smart enough to know what spam/scam is, I’m pretty sure I’m already compromised, as I’m seeing those things on my newsfeed too, and if you’re seeing it on your newsfeed, it’s more than likely that someone in your social circle is using it, and you are already compromised bc that’s how their system works. I hope this movement against Facebook sparks a new important debate, just like Time Well Spent, and pushes for changes on the other platforms we use too. I know Facebook is borderline evil with this data in that they enabled these things to happen, but I’m pretty sure this happened on other platforms as well (Google, Twitter, and YouTube), not exactly the same but one way or another. Maybe not as massively as this, bc business model, intent and leadership really play a part here, but I’m pretty sure it happens. Even if they close the loopholes these smart people found, people are still smart enough to outwit the AI, use it in the wrong way and find a way.

I hope not only Facebook but the whole of Silicon Valley will reflect on this crisis and try to draw the line now on privacy. And I hope the US government really cares, with good intent, and will really investigate this and work on something to finally give the US some great and proper digital privacy laws, solely to protect its people and not bc of some other agenda as well.

I was able to use Messenger without an active Facebook account for a while, but then it would keep me logged in but would stop people from receiving my messages until I reactivated Facebook. Broken by design, I suspect.

Immediate articles on The Verge and elsewhere have focused on limiting permissions or disengaging from Facebook, but what are the alternatives? What apps provide a good user experience and respect user privacy, while being popular enough to be adopted by friends, family and professional contacts? Is there any practical way that I can move away from Facebook, Messenger, WhatsApp and Instagram, without becoming anti-social?

Fantastic article, one of the best ever written on this site. Everyone who is up in arms about Cambridge Analytica needs to read this.

Anyone working in the field of big data or consumer research has known this is going on for over a decade, as stated. This Cambridge Analytica story really isn’t news. Particularly in light of the fact that for Obama’s 2012 campaign, everyone was bragging about how Facebook and other social networks invited the Democrats right in and handed them the data. Not some limited amount, either – I assume the DNC has at minimum a snapshot of the social media graph from that era for the US and beyond, if it hasn’t been further updated since (of course it has). Oddly, there wasn’t the same sort of outrage/coverage about that. I leave reasons why as an exercise for the reader.

As an aside, the fact that your friends being susceptible to the clickfarm games/quizzes/etc. completely compromises your own privacy is a significant part of the reason why I deleted my Facebook account 8 years back. No regrets. I invite you to do the same.

Yeah, invite others to do the same too. Just like their social bubble and how this breach happened, there’s a social effect to this: if you delete your Facebook and successfully teach others about these issues too, this will hopefully create a ripple effect where others delete theirs too. Compromise their mentality through Facebook. Not their data.

Hopefully so that they will finally be hurt, or this is the beginning of their end.

How is it marketing companies can gather all this data and then Facebook turns up such irrelevant ads?

Quantity /= quality.

But, but, algorithms! And…AI!

By and large because the people paying for Facebook ads and sponsored posts in your timeline are junk. The A list advertisers aren’t wasting their time putting a post in front of you, they get much more out of learning everything Facebook knows about you.

This is a good article. But it’s not just marketers who have known this all along: anyone with common sense and a modicum of skepticism should have known not to volunteer this information. I never played Farmville or any number of the other apps that have popped up over the past decade. I also NEVER log into another site through my Facebook account, and I never put up information that I wouldn’t be comfortable with perfect strangers knowing. All of this just seemed obvious to me, so I can’t understand why everyone is so shocked (shocked!) to learn that this is how your information was used.

Also, I am not sure I understand the real harm that was caused to any particular Facebook user. After all, the information that was bought/shared was info that the user himself or herself decided to make available in some capacity, and was then used to tailor political and commercial ads to that user. So…that user was ultimately the recipient of more effective advertising? Do we really believe that users are such sheep that they lack the willpower to resist such advertising? If so, then I think they have no one to blame but themselves.

Do we really believe that users are such sheep that they lack the willpower to resist such advertising? If so, then I think they have no one to blame but themselves.

It’s America looking for any excuse for having elected Trump. This and the Russian ‘Troll farms’ before it.

I wonder how much of an effect this is actually having worldwide. Nobody I know (in the UK) is actually considering deleting their account or anything at the moment. Anecdotal data I know, but still.

I’m also in the camp of feeling like I knew this was happening and assumed most other people did too, but then we are in the section of society that reads tech blogs and so are way more likely to be aware of this kinda stuff and not log in via facebook, or by default be wary whenever the box pops up asking for "app X" to have permission to access your profile data.

Not sure about Facebook, but having friends is still ok, right?

As someone who developed some non professional Facebook applications I knew I could access that data and always figured that it probably was for sale somewhere on the black and grey market. Only thing that I don’t and didn’t know is how commonly it is used.

Anyway, thank you EU for giving us GDPR in two months, which makes all of these types of things illegal.

Luckily I’ve never used an app inside Facebook nor have I ever logged into any service by using Facebook credentials.

I’m confused as to why would Facebook hand these stupid apps all that data on a platter? What’s in it for Facebook?

Another question, they’re saying people gave these apps their Facebook data, but also their friends’ data. Is it only their friends’ names, or also their friends’ details such as names, sex, age, likes, friend list, etc?

What’s in it for Facebook?

Isn’t it fairly obvious that it received a lot of money for the information?

No, it’s not obvious. I was asking for reliable first hand information.

Ok. Per a 30-second Google search:

http://www.latimes.com/business/technology/la-fi-tn-facebook-third-parties-20180320-story.html

https://www.nytimes.com/2017/11/19/opinion/facebook-regulation-incentive.html

The NYT article is actually from before the current crisis, but is still relevant and raises the question in my mind of whether any of the Cambridge Analytica stuff presents anything we didn’t already know/suspect about Facebook.

In other words, Facebook didn’t make any money from these apps but these apps possibly helped make their site a bit more interesting to users, which possibly ended up marginally beneficial to Facebook.

Personality quizzes, FarmVille and other useless apps don’t sound like something that’s indispensable to Facebook. They’ve made a horribly stupid decision to hand over their crown jewels to others for questionable gain. Idiots.

My other question still stands: if they really shared people’s friends’ data with outside apps even when people agreed to share only their own data, they’re even more irresponsible than I thought.

Facebook didn’t make any money from these apps but these apps possibly helped make their site a bit more interesting to users, which possibly ended up marginally beneficial to Facebook.

Facebook primarily makes money off of advertising. The more people that are on Facebook, and the more data that Facebook has on its users, the more money that advertisers will spend in advertising on Facebook. So Facebook’s model is all about growing its membership.

In May 2007, at the time when Facebook opened its platform, it had about 20 million members. Games, dating apps, quizzes, video sharing services, etc. all became plugged in, and members could use Facebook credentials to log in to many other sites. All of these things spurred massive growth in Facebook’s membership, reaching about 1 billion users by 2012 when it held its IPO that raised $16 billion for the company (apparently it was the third-largest in history). It now has 2 billion users, and the continued growth increases its attractiveness for advertisers. The stock price has grown to 4x what it was in 2012. I think it is quite an understatement to dismiss Facebook’s profit from the third-party developers as "marginally beneficial." None of the apps/services were indispensable to Facebook, but they encouraged users to share even more data, which in turn made the platform even more attractive to advertisers. It was likely a violation of their privacy policy, but they knew exactly what they were doing and were gambling on not getting caught. Again, users volunteered this information in the first place and probably should have been more circumspect, so at least part of the blame lies with them as well.

Just because two things happened at the same time doesn’t mean one caused another. Not to mention that games, quizzes, video sharing services did not have to have access to personal information AT ALL, yet Facebook stupidly gave it to them. Why would a quiz need my gender, age or my friend list? They could just have access to my user ID and absolutely nothing else.

Excellent job Verge!

More accurately:

Alexandra Samuel is the former VP of Social Media for Vision Critical. She is now an independent technology writer and a regular contributor to The Wall Street Journal, JSTOR Daily, and The Harvard Business Review.

I didn’t really find out anything new, except for the IMEI grabbing scheme. That was pretty much bordering on illegal gathering of data that only intelligence agencies are usually allowed to get legally.
Some people should go to prison if they did IMEI capture, in my book.
