After yesterday’s mass shooting in Las Vegas, Google briefly gave its “Top Stories” stamp of approval to two 4chan threads identifying (and triumphantly smearing) the wrong man as the shooter. Google apologized for including “inaccurate” web pages in its top results, saying that its algorithm had spotted a burst of activity around a little-used search term (the name of 4chan’s so-called suspect), created a Top Stories carousel, and favored “fresh” content there above more authoritative sources.
This is far from the first time Google’s search results have purveyed misinformation. In March, it finally instructed human quality raters — who manually evaluate web pages to train the Search algorithm — to flag offensive and factually incorrect material, which Search could then downgrade for users seeking general information about a topic. As the 4chan incident shows, though, it still has blind spots. And that’s not really because of a problem with Google’s algorithm. It’s happening because Google’s core business has never been about defining truth — yet that’s what Top Stories is implicitly promising.
Google publishes detailed guidelines for website quality ratings, where it outlines many ranking factors that include originality of content and “expertise, authoritativeness, and trustworthiness.” But it won’t go into detail about how various factors intersect, a crucial question that it says is too complicated to answer. Among other things, Google won’t explain just what makes a “Top Stories” carousel appear for a particular search term, except that it uses a special set of signals to detect whether users might be interested in seeing fresh or “breaking” links. Once the carousel appears, we don’t know how its stories are chosen compared to Google’s normal search results — except, again, that there’s an added focus on freshness.
I’m sure there are complicated answers to these questions, but there are also basic principles that Google could publicly commit to following, if it wanted. Are Top Stories supposed to be held to higher trustworthiness standards than average search results? Does the carousel only appear if there’s a baseline general-interest newsworthiness, or is any internet micro-controversy supposed to trigger one? What is a “Top Story” even supposed to be?
“Top Stories” has only been part of desktop Google Search since the end of 2016, when it replaced similar “In the News” boxes. While the new name justifies a much broader and more flexible range of content, it leaves the overarching purpose unclear. If good Top Stories are defined by the same standards as good generic search results, they should just be the top-ranked links for a query. If the point is to showcase fresh content, they should be called something like “Recent Stories.” If they’re the highest-quality and most definitive results, Google needs to explain its standards, and why they’re different from those of the larger ranking system.
A Google spokesperson told us that Top Stories could be valuable for immediately presenting a range of different types of useful information on a search query, especially when it’s newsworthy. But Google already has a News box, which sets search algorithms loose on a smaller list of approved sites. It seems easy to offer an expanded version of this with a larger list of general-purpose websites, exclude sites with low “authoritative” rankings, or otherwise provide special guidance for these sections. Conversely, if Google can’t define why “Top Stories” are special, then it might as well abolish them — there’s no reason to give a few arcanely selected web pages special treatment.
Google’s original PageRank algorithm was built to deliver the most popular and influential results for any search query, whether or not the content was true or good in a more philosophical sense. It’s gotten far more complex since then, but being the web’s best directory is still very different from being its ultimate arbiter of quality. As it puts more and more focus on its AI assistant, Google wants to be both — but it refuses to acknowledge that they’re different things, and that the tools that work for one might be bad for the other. As long as that’s true, Top Stories will stay like it is now: a vaguely named, poorly defined system that conflates popularity with value.
Comments
Facebook spreads false news faster than any platform. I think we can cut Google a little slack.
By nighthawk1986 on 10.04.17 10:23am
And people have to be responsible themselves. I won’t click a source unless I know it is from a well-known news source. 4Chan is famous, but it is not a well-known or respected news source.
By Miguel Martinez on 10.04.17 10:25am
The problem is that Google is choosing to promote 4Chan in a way that suggests it’s a reliable source of information. They’ve chosen an editorial role, they have editorial responsibilities.
By Marc Love on 10.05.17 2:15am
Yes, but that’s a lot of users that are spreading it. This is an algorithm grabbing news stories from a crappy source.
By Disdain on 10.04.17 10:33am
Algorithms are written by people. Google doesn’t get to hide behind the algorithm as if they didn’t explicitly create the rules of the algorithm. Anybody who knows anything about 4Chan would know that it doesn’t belong in the promoted news section alongside the Washington Post and the New York Times.
By Marc Love on 10.05.17 2:20am
I don’t buy this algorithm story. I’ve used Google News Top Stories for years and there is little doubt that somebody is pulling strings in the background. There is a clear political bent and often a political statement disguised as a headline will rank in the top section for days, while everything around it constantly changes.
By P Roppo on 10.04.17 10:26am
Of course when you search for "geary danley" those pages would show, and I would expect them to – they were the top hits for that search term. Exactly what that feature claims to do.
The real question is if you searched for "Vegas Shooter" or similar, what would Google Top Stories show?
By prittjr on 10.04.17 10:39am
Nah.. I like top stories.. Blacklisting 4chan would be a better idea IMO
By Svem26 on 10.04.17 11:47am
Something’s gone awry with the formerly useful Google News, which was a handy, one-stop-shopping aggregate that often saved me the time of digging elsewhere if no further digging was warranted. Now it’s often date-stale or just plain bad (as in, inaccurate) articles that are worse than a waste of my time. The only improvement seems to be better elimination of sources I don’t want to hear from, which used to be about as ignored as the "Door Close" button in an elevator.
More and more I’m just making my first news stop Reuters after barely giving Google News a glance.
By TiredOldCynic on 10.04.17 5:54pm
"It’s happening because Google’s core business has never been about defining truth — yet that’s what Top Stories is implicitly promising."
Ah — no, that’s not it at all. It clearly states the stories are picked by a computer. I’ve always assumed since this was Google, I was looking at stories found by a search engine — good, bad, the usual mix of Internet content.
When did they promise you truth or even vetting?
By Doc Memory on 10.05.17 2:23am
Or, you know, they could just stop letting anybody who declares themselves to be a news site be included in a Google-promoted news section. Establish some standards. If you’re going to do editorial like that, then you need editors and editorial standards.
By Marc Love on 10.05.17 2:24am
Why not just have a small team of people curate the news to keep the robots in check? It would prevent these things from ever happening. Why must we rely solely on tech to do everything? Toss a little humanity back in the mix.
By MrSparkl3z on 10.05.17 12:13pm
Personally, I’d like a REAL Kill File for Google News instead of the slider which only has the ‘Rarely’ setting. I can curate well enough myself.
By mathue taxion on 10.05.17 11:13pm