Date: 2019-03-08

“I hear a lot that algorithms are powerful technologies, and the companies are monopolies, and we are helpless against them, and Google is the only option,” says Kartik Hosanagar, a professor of technology and digital business at the University of Pennsylvania. “I’m trying to say that we aren’t.”

Hosanagar is the author of A Human’s Guide to Machine Intelligence: How Algorithms Are Shaping Our Lives and How We Can Stay in Control, which comes out March 12th from Penguin Random House. The book takes the reader through all of the ways that daily lives are touched by algorithms. The Verge spoke to Hosanagar about the last part of the book: four suggestions, or “pillars,” to implement to prevent harmful, unanticipated consequences.

This interview has been lightly edited for clarity.

I was interested in the question of how we can “control” algorithms. But before we get there, can you tell me about why they need to be controlled?

I think most people use technology in a very passive way, not necessarily taking note of how technology is working behind the scenes. That would be okay if all that’s changing is the efficiency with which decisions get made. But algorithms also come with certain risks and unanticipated consequences, like biases that are not explicitly programmed. And because they make decisions at scale, meaning for thousands or even millions of people, the risk is that these biases, like race bias in sentencing algorithms, might get almost institutionalized.

That makes sense. Now let’s talk about your four suggestions — or, as you call them, “pillars” — for reining in algorithms.

The first one is related to transparency with regard to the data being used. But before I go there, transparency starts with letting the user know that an algorithm is making a decision. Often, we don’t know that. You apply for a loan and get rejected, and you don’t even know who made the decision. So you need to start with letting the user know that an algorithm is working behind the scenes.

More than that, you need transparency with regard to the data being used. What were the different variables used in, for example, a mortgage application? Then, the second thing is that you need transparency with regard to the models, meaning explanations of not just the data used, but how things were weighted. Maybe something weighs more heavily than employment history. Or maybe a top variable is address, and we might wonder why that is most important. It allows us to first learn what’s being used, and if there’s a problem with any of that, we can question it. I think that helps a lot in addressing trust and unanticipated consequences.

What might the costs be of too much transparency? I know that with algorithms, a lot of this information is trade secrets, so they’re not exactly going to give up that data.

This is an area where I feel like the efforts of regulators are misdirected. When we talk about transparency, I see efforts to reveal the source code, and that violates the intellectual property of the company. On the other hand, revealing the source code is not going to tell us a lot about the algorithm anyway.

So I don’t think transparency necessarily means revealing the source code. It should be high-level, just about data and variables. Plus, the user is going to be overwhelmed if you have a whole lot of detail about how the algorithm works. In fact, research shows that very high levels of transparency actually hurt user trust in algorithms.

Can you tell me more about that research?

There’s some research done at Stanford University on this topic. One problem with grading in college courses is that different TAs are more or less lenient. So they used an algorithm to normalize or modify the grade so that the level of leniency was consistent. Then, one group received minimal information about how the algorithm worked, a second group got some high-level data, and the third got all of the information on how the algorithm worked and the raw data and all the changes made. The result was that the level of trust in the third group was back down to the same level as the group that didn’t receive any information. So it goes to show that if you reveal that much information, it’s as if you reveal nothing. The good news for the user is that you don’t need the inner workings of the algorithms, both from a trust standpoint and from the standpoint of understanding what the algorithm is doing.

Then there is the other question of what kind of transparency you need in an audit process. An audit needs more transparency, and it needs to be done by a team that’s independent of the team that made the models.

The third idea is that there needs to always be some feedback loop between the user and algorithm. How might this work?

Again, oftentimes, the user can do nothing but use the algorithm and has no control, and that forces us to be passive. I advocate that there should be a feedback loop, like being able to say, “I don’t like this post.” Or in cases where it’s more important, like with mortgages, that’s where transparency comes in. If we know what variables are being used, we can protest, say, using address or ZIP code as a variable, since address correlates with race.

What’s the last pillar?

The last pillar is just that people need to take responsibility for how users are impacted by algorithmic decisions. We have our dollars and votes and we should be looking for elected representatives who are aware of these issues, who take them seriously and are advocating for consumer protections. We should be voting with our wallets and be aware of how data is being used. That line might look different for different people, but we have to be more deliberate with our choices.

We should have more regulation, perhaps a federal algorithmic safety board that is providing best practices to the industry and providing some sort of oversight. There’s a need for an independent body to look into complaints and issues.