Last week, Facebook made public a huge trove of data about its users’ interests as part of a new tool called Audience Optimization. For the first time it revealed not only the hundreds of thousands of categories into which Facebook divides its users, but also the number of people who belong to each one.
Here’s how it works: the tool allows any official page manager to identify the "Preferred Audience" for a post by searching for and selecting interests relevant to the story. To help make sure these interests are neither too niche nor too broad, Facebook auto-completes interests and displays the total audience size for each one — not as a subset of your page’s followers, but as a subset of all Facebook users. These categories run from the obvious — Beyoncé — to the more perplexing: emotions, illegal activities, and other identifiers people likely wouldn’t publicly post.
Because Facebook’s interest categories are publicly available, accessing this data was relatively simple. By programmatically emulating searches of all possible letters and numbers until no more results were returned, we were able to pull a list of 282,002 interests — which Facebook says may not even constitute the entire dataset. Most interests are sorted into broad categories like Lifestyle and Culture, People, or News and Entertainment. Each has a very precise number for audience size, ranging from zero all the way up to 1,466,365,990, the number of people interested in Facebook itself. You may have already glimpsed a few of these tags in your advertising preferences, but this is the closest we’ve come to a complete, ranked list of every interest on Facebook.
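The search-everything approach described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the article. It assumes the autocomplete returns only a capped number of suggestions per query, so a prefix is extended by one character at a time until a search comes back empty. The names `crawl_interests`, `fake_autocomplete`, and `SAMPLE` are hypothetical, and the real Audience Optimization endpoint is replaced by an in-memory stand-in seeded with a few figures quoted in this article.

```python
from collections import deque
from string import ascii_lowercase, digits

# Characters tried at each position: letters, numbers, and a space.
ALPHABET = ascii_lowercase + digits + " "

# Stand-in data using figures quoted in the article ("crying" is rounded).
SAMPLE = {
    "Beyoncé": 80_634_320,
    "Kanye West": 74_589_850,
    "crying": 41_000_000,
    "cat communication": 4_663_340,
}


def fake_autocomplete(prefix, limit=1):
    """Hypothetical stand-in for the autocomplete: top `limit` matches by audience."""
    matches = [(name, size) for name, size in SAMPLE.items()
               if name.lower().startswith(prefix)]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches[:limit]


def crawl_interests(search, alphabet=ALPHABET, max_prefix_len=6):
    """Collect every interest reachable by extending prefixes until a search is empty."""
    found = {}
    queue = deque(alphabet)  # start with every single-character query
    while queue:
        prefix = queue.popleft()
        results = search(prefix)
        if not results:
            continue  # nothing matches this prefix, so do not extend it
        for name, audience in results:
            found[name] = audience
        if len(prefix) < max_prefix_len:  # safety cap on query length
            queue.extend(prefix + ch for ch in alphabet)
    return found


interests = crawl_interests(fake_autocomplete)
for name, audience in sorted(interests.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{audience:>13,}  {name}")
```

With the one-result cap in this stand-in, a search for "c" surfaces only "crying"; "cat communication" turns up once the prefix is extended to "ca", which is why the crawl keeps lengthening any prefix that still returns results. The `max_prefix_len` cap is an assumption added to keep the sketch bounded, and a real crawl would also need to respect rate limits.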
Many of these interests look a lot like Pages you would ordinarily follow — celebrities, hobbies, brands, etc. — although their relative audience sizes can be surprising. Japanese pop duo Puffy AmiYumi (139,218,340) beats Beyoncé (80,634,320). The Minions (75,372,780) beat Kanye West (74,589,850). Disney on Ice (36,144,060) beats Game of Thrones (34,527,750). And the hobby "cat communication" (4,663,340), whatever that is, beats Sarah Palin (4,645,190). On the other extreme, the Power Macintosh 7100, Amazon MP3s, and the Applebee’s in Amman, Jordan all have audiences of fewer than 30.
But there are about 282,000 categories in our data set, and many of them are more nuanced or abstract. For instance, Facebook says there are 88 million people interested in sin, 81 million in boredom, 41 million in crying, and 28 million in envy. Those numbers are tiny, though, compared to the 839 million interested in love or the 571 million in happiness. The interests get weirder the further down the list you go, like "narcissistic parent," with an audience of 41,660.
So where did these categories come from? A Facebook spokesperson explained that interests are formulated algorithmically from popular Facebook Open Graph pages (the articles, music, and videos being shared), Facebook Ads tags, and other Facebook data sets. The sheer randomness of the list suggests that the algorithms are scraping keywords from your posts (Facebook says Messenger was excluded). For example, the "comitative case," a fancy grammar term for words like "with," supposedly has an audience of 58 million, who may just be heavy users of the word "with." MS-DOS, which is made up of two common letter strings, shows a bigger audience than PlayStation. Dog fighting somehow snuck in as a "sport" with an audience of 7.2 million, and arson as an "event" with 1.9 million (though both of these are likely to disappear soon, as Facebook allows you to report interests as inappropriate).
We’ve broken out the 10 largest audiences in just a few categories: celebrities, 2016 presidential candidates, positive and negative emotions, gadgets, and a sampling of the most bespoke hipster interests, those with audiences of fewer than 30. Keep in mind that audience size does not take sentiment into account, so Donald Trump’s 20 million lead may not be flattering. We have extracted the top 2,001 interests here, or you can download the exhaustive 18.2 MB list if you want to take a closer look. We even made an interactive Facebook popularity quiz. See something interesting? Discuss it in the comments.
These lists are curious and funny, but they also show us what Facebook is learning about us. Facebook was clear that Preferred Audiences are not necessarily the same as its advertising tags, but they both rely on similar algorithms to sort users and target us with content. And while these categories obviously aren’t perfect now, they’re already a lot more extensive and sophisticated than we imagined.