Google is using machine learning to sort good apps from bad on the Play Store

The company’s latest technique compares app with app to sniff out the wrong ‘uns

Security on Android has always been a challenge for Google due to the operating system’s open nature. But in recent years, the company has been gaining ground in its fight against malware and exploits, thanks in part to the use of machine learning and AI to spot problem apps before users install them. Today, the company has described in detail how it’s using one technique — known as peer grouping — to help keep the Play Store purely playful.

Peer grouping is a pretty simple idea. By comparing data about apps that perform similar tasks, say Google’s engineers, they can identify the ones with something to hide. If you’re looking at a group of 20 calculator apps, for example, the app that is asking for permission to access your microphone, location, and phone book is probably up to no good. Google’s new system flags it automatically, and security engineers then swoop in for a closer look.
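The calculator example above can be sketched in code. This is a minimal illustration with made-up app data, not Google's actual system: it flags any permission requested by only a small fraction of a peer group.

```python
# Sketch of peer-group outlier flagging (hypothetical data and threshold;
# Google's real pipeline is not public beyond the article's description).
from collections import Counter

def flag_outliers(peer_group, rarity_threshold=0.1):
    """Return (app, permission) pairs where a permission is requested
    by fewer than `rarity_threshold` of the apps in the peer group."""
    n = len(peer_group)
    counts = Counter(p for app in peer_group for p in set(app["permissions"]))
    flagged = []
    for app in peer_group:
        for perm in app["permissions"]:
            if counts[perm] / n < rarity_threshold:
                flagged.append((app["name"], perm))
    return flagged

# Nineteen ordinary calculators plus one asking for far more than it needs.
calculators = [{"name": f"Calc{i}", "permissions": ["INTERNET"]} for i in range(19)]
calculators.append({"name": "ShadyCalc",
                    "permissions": ["INTERNET", "RECORD_AUDIO",
                                    "ACCESS_FINE_LOCATION", "READ_CONTACTS"]})

print(flag_outliers(calculators))
# flags ShadyCalc's microphone, location, and contacts permissions
```

In the real system, of course, a flagged app goes to a human reviewer rather than being rejected outright.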

Google is using machine learning to group apps by function and spot the bad apples.
Image: Google

With machine learning, Google can use peer grouping to scan apps as they’re uploaded to the Play Store en masse. A range of metrics is used to group apps into clusters, including their description, their metadata (file size, for example), and statistics like how many times they’ve been installed. A new peer group is created for each app, because, as Google says, set categories — like “productivity” and “games” — are “too coarse and inflexible” to track the changing distinctions of the app world, and grouping apps by hand would take too long. Once grouped, the bad apples can be picked out of the barrel.
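Google hasn’t published the clustering algorithm itself, but the idea of forming a peer group from app descriptions can be sketched with a simple bag-of-words similarity measure. Everything here — the app names, descriptions, and the greedy threshold clustering — is a hypothetical stand-in for whatever Google actually runs.

```python
# Sketch of grouping apps into peers by description similarity
# (hypothetical data; illustrative greedy clustering, not Google's method).
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def peer_groups(apps, threshold=0.5):
    """Greedily assign each app to the first group whose representative
    description is similar enough, or start a new group."""
    vecs = {name: Counter(desc.lower().split()) for name, desc in apps.items()}
    groups = []
    for name, vec in vecs.items():
        for group in groups:
            if cosine(vec, vecs[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

apps = {
    "CalcA": "simple calculator with scientific functions",
    "CalcB": "scientific calculator simple and fast",
    "TorchX": "bright flashlight torch for your phone",
    "TorchY": "flashlight torch bright light phone",
}
print(peer_groups(apps))
# the two calculators and the two flashlights end up in separate groups
```

A production system would fold in the other signals the article mentions — install counts and file-size metadata — as extra features, and would recluster as the app landscape shifts rather than relying on fixed categories.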

“We focus on signals that can negatively affect user privacy, such as permission requests that are not related to core app functionality, and the actual, observed behaviors,” explains Martin Pelikan of Google’s security and privacy team over email. “For example, a flashlight app might not need access to address book of the user or the precise hardware identifier of a user’s phone. The same might hold for many other apps, such as ‘mirror’ apps that turns on a device’s front-facing camera.”

Techniques like this seem to be making a difference for Google. According to its most recent annual Android security review, the percentage of users who installed harmful apps from the official Play Store fell from 0.15 percent in 2015 to 0.05 percent in 2016.

However, data from that same review highlights the fact that Google has to watch more vectors of attack than just the official channels. Many users — particularly those in China — install Android apps from alternative app stores, which the company doesn’t have control over. And when taking these into account, the number of individuals installing bad apps actually rose slightly, from 0.5 percent in 2015 to 0.7 percent in 2016. Machine learning, it seems, can only do so much.