Illustration by William Joel / The Verge

Cambridge Analytica’s Facebook data abuse shouldn’t get credit for Trump

‘I think Cambridge Analytica is a better marketing company than a targeting company’

Over the weekend, reports from The New York Times and The Observer confirmed that voter-profiling company Cambridge Analytica had amassed data on over 50 million Facebook users. This information had been collected legitimately by an academic researcher, Aleksandr Kogan, who passed it on to the profiling firm. (This is why Facebook insists what happened wasn’t a “data breach.” Instead, the failure was Facebook’s own: it didn’t closely supervise how its data was being used.)

Cambridge Analytica gathered this information to develop “psychographic” profiling tools, which it claimed could tailor political ads to users’ personality traits. “We exploited Facebook to harvest millions of people’s profiles,” whistleblower Christopher Wylie told The Observer. “And built models to exploit what we knew about them and target their inner demons. That was the basis the entire company was built on.”

It’s a great quote. But this weekend’s reports suggest these methods might not have actually been used in the 2016 US election. (In March 2017, a New York Times article said psychographics weren’t used; recent articles offer a somewhat more muddled picture.) Still, is it even possible to target a person’s inner demons using Facebook data? How afraid should we be of sophisticated psy-ops being deployed at scale, thanks to both the data we willingly give to Facebook and Facebook’s apparent inability to protect the people who use it? We asked Facebook how many other researchers had access to this data and if Facebook was reviewing those projects to see if misuse occurred elsewhere. Facebook hasn’t responded.

Taken together, the reports suggest Facebook was duped by a shady firm that misused data and lied about it. When Facebook found out, it did nothing. And making matters worse, we can’t even point at Cambridge Analytica’s deception as the reason Trump was elected: a closer look at its methods suggests they might not even work.

This isn’t the first time Facebook’s protections for its users have been called into question. In January 2012, experimenters were allowed to manipulate what about 700,000 Facebook users saw when they logged in. The study was meant to assess “emotional contagion” — the idea that if you were shown sad things, you’d become sadder, and if you were shown happy things, you’d become happier. The study, published in 2014, almost immediately kicked up a fuss — though it had been legal, it might not have been ethical. Lost in the noise was something very interesting to the discussion around psychographics: the psychological effect was very small.

Small is not the same as “no effect.” About 340,000 extra people voted in the 2010 US elections, thanks to a Facebook message, according to a study in Nature in 2012. But that’s because the message used the power of real-life social networks to get them to go: an “I voted” message showing the names of up to six friends who voted got more people to the polls. People who saw this message were 0.3 percent more likely to seek information about their local polling place than those who just saw an informational message about voting. They were about 0.4 percent more likely to get to the polls, too. This only worked, the study found, if close friends had clicked “I voted.” This wasn’t an effect of advertising or psychological profiling, though. It was just peer pressure.

Misuse of data is bad, but some coverage of Cambridge Analytica suggests that knowing what someone liked on Facebook is enough leverage to transform elections. Even Wylie seems to think so, calling his work “Steve Bannon’s psychological warfare mindfuck tool.” That is almost certainly overstating Cambridge Analytica’s power. Gathering Facebook data to predict personality and then using that to craft a message that would sway an election is a very tricky process, says Eitan Hersh, a professor of political science at Tufts University and author of Hacking the Electorate. To understand why, it helps to know a little about the backstory of two ideas: microtargeting and psychographics.

Microtargeting means analyzing data to predict the behavior, interests, and opinions held by specific groups of people and then serving them the messages they’re most likely to respond to. The Obama campaign used this technique in different ways in both the 2008 and 2012 elections, mining data from publicly available voter files as well as social media like Facebook, according to Mother Jones. Many people say that microtargeting played a major role in Obama’s re-election in 2012. That is possible, but it’s not proven, says Frederik Zuiderveen Borgesius, a legal researcher at the Free University of Brussels. The campaign did get a lot of attention for the way it used the new social media outlets to target voters, though.

One way to target voters, in particular, is relevant to Cambridge Analytica: collecting information to predict people’s personality and psychology — known as psychographics — and then using that information to try to influence behavior. Most commonly, psychographics focuses on predicting attributes measured by the Big Five personality scale: openness, conscientiousness, extraversion (or introversion), agreeableness, and neuroticism. And it’s mostly used to sell products. Traditional demographic-based targeting will show a cleaning products ad to, say, white middle-aged women who stay at home. That’s the population most likely to buy the company’s sponge. Psychographic-based targeting, on the other hand, will show a home alarm ad to people who are neurotic because these people are more likely to be worried about safety.

This works — to an extent. In a paper published last year in the journal Proceedings of the National Academy of Sciences, researchers used psychographic targeting on 3.5 million Facebook users. Facebook didn’t explicitly give companies personality information, but it did let advertisers target users based on their Likes, says study co-author Sandra Matz, a business professor at Columbia Business School who studies big data and marketing. (Likes are anonymized by default, so a company can target, for instance, 34-year-old white women in California who like “Yosemite National Park,” but it won’t know who they are.) Previous research shows that there’s a decent correlation between Facebook Likes and Big Five personality traits, adds Matz, and so her team chose to target Likes that correlated with high extraversion (“making people laugh”) and high introversion (“computers”).

Then, they created ads that either aligned with or contradicted someone’s personality profile. For example, the beauty ad for extroverts told them to “dance like no one is watching (but they totally are)” and showed a woman at a crowded party. The ad for introverts showed a woman with a makeup brush and read “beauty doesn’t have to shout.” Ads that matched a personality profile got 40 percent more clicks and 50 percent more purchases than ads that didn’t match.

This is the type of research that inspired Cambridge Analytica. (One of the co-authors of that study is Michal Kosinski, who pioneered a lot of the research that the firm draws upon.) The founders were also influenced by a 2013 paper, also by Kosinski, that showed that Facebook Likes could predict sexual orientation, ethnicity, personality, IQ, and more. The research, based on over 58,000 participants, found that Facebook Likes could correctly predict whether a man was gay or straight 88 percent of the time and whether someone was a Democrat or a Republican 85 percent of the time. Some results are striking: Liking “Hello Kitty” on Facebook suggests that the user is more likely to be a Democrat, of African-American origin, and predominantly Christian, the study says.

The paper concluded that predicting personality traits from people’s Facebook Likes could be used “to improve numerous products and services,” like insurance advertising. It also warned against using this kind of online data without people’s consent because it could end up deterring people from using digital technology altogether. The authors write: “It is our hope, however, that the trust and goodwill among parties interacting in the digital environment can be maintained by providing users with transparency and control over their information, leading to an individually controlled balance between the promises and perils of the Digital Age.”

Now, companies like Cambridge Analytica want to use psychographics and microtargeting to influence political decisions instead of consumer ones. Take the example of gun rights, says Tom Dobber, a doctoral candidate studying political microtargeting at the University of Amsterdam. Extroverts might respond well to a pro-gun ad that talks about hunting as a family tradition and an adventure. But neurotic people might prefer a message emphasizing that the Second Amendment will protect us. “You say the same thing but with two very different messages,” he says.

Even if Cambridge Analytica did affect Donald Trump’s election in 2016, everything we know about political microtargeting suggests that its role was insignificant. “We don’t really know much about the effects of microtargeting, let alone targeting on the basis of someone’s psyche,” says Dobber. “I think Cambridge Analytica is a better marketing company than a targeting company.”

There’s good reason to believe Dobber, and that reason comes from Cambridge Analytica’s previous client. Before the company worked for Donald Trump, it worked with Ted Cruz during the 2016 Republican primary. Former Cruz aide Rick Tyler told The New York Times that the psychographic models proved unreliable. According to the Times’ reporting, “more than half the Oklahoma voters whom Cambridge had identified as Cruz supporters actually favored other candidates.” Cruz’s campaign quit using Cambridge Analytica after a primary in South Carolina, the Times reported.

So why are the effects of microtargeting so limited?

First, using digital data to decide who to target can easily go wrong. Often, this extra information doesn’t tell you anything you couldn’t get from a public voter database, and it becomes less useful over time as people’s preferences change. Plus, messages that work for one campaign might not work for another. Microtargeting might be more effective in settings with less information, like a state legislative race, says Hersh from Tufts, but there’s a lot of information being shared in an American presidential election. “The idea that some additional piece of information in this overwhelming wave of data going into people’s head is going to trick them. It doesn’t give people enough credit,” he says.

The 2013 paper that inspired Cambridge Analytica isn’t wrong, says Dobber. “It is not rocket science to infer someone’s political preferences on the basis of someone’s Likes if those people have actually Liked a lot of pages,” Dobber tells The Verge in an email. But it’s a huge leap to say that being able to infer some things from Facebook means swaying voter behavior.

Personality traits are correlated with political values, but the correlation is generally weak, says Hersh. Being conservative is weakly correlated with preferring authoritarianism, but plenty of liberals like authoritarianism, too. That means it’s easy to mistarget messages, and that can be very alienating, he says. For example, one model predicting whether someone is Hispanic based on factors like last name and location was only correct about two-thirds of the time. And Hersh’s research suggests that the people who wrongly received the ads intended for Latinos really don’t like them.
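To see why a model that is right about two-thirds of the time is risky, a rough back-of-the-envelope sketch helps. The audience size below is made up, and the code assumes "correct two-thirds of the time" means two-thirds of the people the model labels are labeled correctly. Neither figure comes from Hersh's research.

```python
from fractions import Fraction


def mistargeted(audience_size: int, accuracy: Fraction) -> int:
    """Number of people in a targeted audience whom the model labeled wrongly.

    Assumes `accuracy` is the share of the targeted audience that really
    belongs to the intended group (an interpretation, not the study's definition).
    """
    return int(audience_size * (1 - accuracy))


# Hypothetical campaign: 9,000 people receive an ad written for one group.
audience = 9_000
accuracy = Fraction(2, 3)  # "correct about two-thirds of the time"

wrong = mistargeted(audience, accuracy)
print(f"{wrong} of {audience} recipients get an ad meant for someone else")
```

Under those assumptions, one in three recipients sees a message aimed at a group they don't belong to, which is the audience Hersh says reacts badly.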

Second, a lot of this data doesn’t give us anything that we don’t already know. Hersh uses a simple case: who owns a boat? Someone who accesses that data will learn that boat owners are likely to be Republican. “That’s a totally useless data point,” he says. “If I have the demographic data — if I know that there is a white man in a Republican town near Virginia Beach who’s rich — I already know they’re Republican regardless of the boat. Boat ownership doesn’t provide any more information.”
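Hersh's boat example can be illustrated with a toy calculation. The records below are entirely invented, and the "model" is just a majority guess within each group, scored on the same records it was built from. It is a sketch of the idea that a signal can look predictive on its own yet add nothing once demographics are known, not a reconstruction of any real campaign model.

```python
from collections import Counter, defaultdict

# Made-up records: (demographic profile, owns a boat, party).
VOTERS = (
    [("rich_coastal", True, "R")] * 40
    + [("rich_coastal", False, "R")] * 40
    + [("rich_coastal", True, "D")] * 10
    + [("rich_coastal", False, "D")] * 10
    + [("urban_renter", True, "R")] * 2
    + [("urban_renter", False, "R")] * 18
    + [("urban_renter", True, "D")] * 8
    + [("urban_renter", False, "D")] * 72
)


def majority_accuracy(records, key):
    """Share of records guessed right when each group defined by `key`
    is assigned its most common party."""
    groups = defaultdict(Counter)
    for record in records:
        groups[key(record)][record[2]] += 1
    correct = sum(max(parties.values()) for parties in groups.values())
    return correct / len(records)


boat_only = majority_accuracy(VOTERS, lambda r: r[1])
demographics_only = majority_accuracy(VOTERS, lambda r: r[0])
demographics_and_boat = majority_accuracy(VOTERS, lambda r: (r[0], r[1]))

print(boat_only, demographics_only, demographics_and_boat)
```

In this invented data set, boat ownership alone beats a coin flip, but adding it to the demographic profile leaves the accuracy unchanged, because the boat signal is already implied by who tends to own boats.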

In fact, Hersh spent a week trying to create a microtargeting model to find people who were interested in climate change and, he says, “you can’t do better than party affiliation.” If you don’t have access to that information, it’s very hard to figure out who’s interested. If you do, nothing else matters.

Another problem is that the predictive power of these Facebook Likes weakens over time. Many people forget about their Likes and may not feel so enthusiastic about them five years later, even if they didn’t bother to click “Unlike,” Dobber says. Plus, the signal that “liking” something sends may change. For example, says Matz from Columbia, a few years ago, “liking” Game of Thrones might have meant that you were an introvert who watched TV instead of going out. Now that the show is so popular, liking it might mean that you’re an extrovert. Liking Bernie Sanders five years ago is different from liking him right before the 2016 election, Dobber says.

Plus, as Matz points out, even if people take the same personality questionnaire, they rarely answer the same way twice. Self-reported information is often unreliable, and we generally have very little insight into our own personalities. That might further account for why microtargeting runs into so many obstacles.

Finally, the insights might not hold because what will persuade people depends so much on context. What worked for the Obama campaign might not have worked when Clinton was the messenger, and what works in the summer might not work in the fall. Information from campaigns shows that even if you have data about individuals, it’s a waste of effort to try persuading people, says Hersh. It’s so much easier to mobilize likely voters than it is to change people’s minds.

Psychographics and microtargeting might sway consumer behavior — do you buy Crest or Colgate? — but politics are a core part of many people’s identity. It’s so hard to persuade people anyway that a cleverly tweaked message is unlikely to have a big effect.

There’s a long history of personality tests being used for political purposes, says Merve Emre, a McGill University literature professor and author of The Personality Brokers, a forthcoming book on the history of personality testing. “People are agog at the politicization of personality or the use of it in these really political ways, but a lot of personality testing began for explicitly political purposes,” she says. The Big Five personality test began as a series of experiments on Air Force officers to figure out the ideal personality factors that made someone competent. (Of course, there was no follow-up, so we have no idea if it worked even a little.)

The Myers-Briggs test was developed by two women at a consulting firm and the first institution to use it was the Office of Strategic Services, the precursor to the Central Intelligence Agency. The OSS used it during World War II, in addition to other personality tests, to show how to match secret agents to covert operations. And immediately after WWII, psychologists developed the F-scale to figure out who might have fascist leanings. “Cambridge Analytica doing this with Facebook is more sophisticated than what we’ve seen before, but the impulse behind it — to try to figure out people’s political leanings through personality — is by no means new,” says Emre.

And personality testing, at least originally, was less than rigorous, Emre adds. Early versions of the Myers-Briggs, for example, had separate scoresheets for men and women based on the idea that women were supposed to be more naturally emotional.

But while the problems with personality testing go way back, the concerning part of the Cambridge Analytica story is not the idea that psychographics and microtargeting changed the election. It is the abuse of data. “People are right that Google and Facebook do have a lot of data and there are nuances where they can use this data to be manipulative,” says Hersh. “The thing I’m calling BS on is this story that requires all these connections to work between the personality and the data and the political values and the messaging.”

It’s possible that in the future, microtargeting will become more accurate, realizing people’s current fears. As we give more and more of our data to social media companies and as the tools for sifting through those huge troves of data get better, measuring personality and targeting specific political messages might become easier. The positive and negative implications amount to the same thing: manipulation, encouraging people to vote or to stay home. Unless governments in the US and Europe start regulating this type of activity and imposing stricter privacy rules, political microtargeting is likely to increase, says Dobber, the doctoral candidate from Amsterdam.

But for now, there are plenty of substantive critiques of big tech companies and how they handle our data. Leaks of data can make us vulnerable — and not just to political campaigns. The thing to fear is not a few shadowy data brokers targeting your “inner demons.” It’s how little Facebook appears to be doing to protect our privacy.

We have emailed Facebook for comment and will update when it responds.

James Vincent contributed to this report.