Last week, Apple, without very much warning at all, announced a new set of tools built into the iPhone designed to protect children from abuse. Siri will now offer resources to people who ask for child abuse material or who ask how to report it. iMessage will now flag nudes sent or received by kids under 13 and alert their parents. Images backed up to iCloud Photos will now be matched against a database of known child sexual abuse material (CSAM) and reported to the National Center for Missing and Exploited Children (NCMEC) if more than a certain number of images match. And that matching process doesn’t just happen in the cloud — part of it happens locally on your phone. That’s a big change from how things normally work.
Apple claims it designed what it says is a much more private process that involves scanning images on your phone. And that is a very big line to cross — basically, the iPhone’s operating system now has the capability to look at your photos and match them up against a database of illegal content, and you cannot remove that capability. And while we might all agree that adding this capability is justifiable in the face of child abuse, there are huge questions about what happens when governments around the world, from the UK to China, ask Apple to match up other kinds of images — terrorist content, images of protests, pictures of dictators looking silly. These kinds of demands are routinely made around the world. And until now, no part of that happened on your phone in your pocket.
To unpack all of this, I asked Riana Pfefferkorn and Jennifer King to join me on the show. They’re both researchers at Stanford: Riana specializes in encryption policies, while Jen specializes in privacy and data policy. She’s also worked on child abuse issues at big tech companies in the past.
I think for a company with as much power and influence as Apple, rolling out a system that changes an important part of our relationship with our personal devices deserves thorough and frequent explanation. I hope the company does more to explain what it’s doing, and soon.
The following transcript has been lightly edited for clarity.
Jen King and Riana Pfefferkorn, you are both researchers at Stanford. Welcome to Decoder.
Jen King: Thanks for having us.
Riana Pfefferkorn: Thank you.
Let’s start with some introductions. Riana, what’s your title and what do you work on at Stanford?
RP: My name is Riana Pfefferkorn. I’m a research scholar at the Stanford Internet Observatory. I’ve been at Stanford in various capacities since late 2015, and I primarily focus on encryption policies. So this is really a moment in the sun for me, for better or for worse.
Welcome to the light. Jen, what about you? What’s your title, what do you work on?
JK: I am a fellow on privacy and data policy at the Stanford Institute for Human-Centered Artificial Intelligence. I’ve been at Stanford since 2018, and I focus primarily on consumer privacy issues. And so, that runs the gamut across social networks, AI, you name it. If it involves data and people and privacy, it’s kind of in my wheelhouse.
I asked both of you to come on the show because of a very complicated new set of tools from Apple, designed to protect children from harm. The announcement of those tools, the tools themselves, how they’ve been announced, how they’ve been communicated about, have generated a tremendous amount of confusion and controversy, so I’m hoping you can help me understand the tools, and then understand the controversy.
There’s three of them. Let’s go through them from simplest to most complicated. The simplest one actually seems totally fine to me. Correct me if I’m wrong. If you ask Siri on the iPhone for information on how to report child abuse, or much more oddly, if you ask it for child abuse material, it will give you resources to help you report it, or tell you to get support for yourself. This does not seem very controversial at all. It also frankly seems very strange that Apple realized that it was getting this many inquiries to Siri. But, there it is.
That seems fine to me.
JK: It doesn’t really raise any red flags for me, I don’t know about you, Riana.
RP: This seems like something that I’m not sure if this was part of their initial announcement, or if they’d hurriedly added this after the fact, once people started critiquing them or saying, oh my God, this is going to have such a terrible impact on trans and queer and closeted youth.
As it stands, I don’t think it’s controversial, I just am not convinced that it’s going to be all that helpful. Because what they are saying is, if you ask Siri, “Siri, I’m being abused at home, what can I do?” Siri will basically tell you, according to their documentation, go report it somewhere else. Apple still doesn’t want to know about this.
Note that they are not making any changes to the abuse reporting functionality of iMessage, which, as I understand it, is limited basically to like, spam. They could’ve added that directly in iMessage, given that iMessage is the tool where all of this is happening. Instead, they’re saying, if you just happen to go and talk to Siri about this, we will point you to some other resources that are not Apple.
I think that question about overall effectiveness pervades this entire conversation. But in terms of, here’s the thing, the controversy is pretty small. This one to me feels simple and seemingly the least important to focus on.
The next one does have some meaningful controversy associated with it, which is, if you are a child who is [12 years old] or younger, and you’re on your family’s iCloud plan, and you send or receive nudes in iMessage, the Messages app on your phone will detect it, and then tell your parents if you view it. And if you’re sending it, it will detect it, say, “do you really want to send it?” and then tell your parents if you choose to send it. This has a wide variety of privacy implications for children; a wide variety of implications particularly for queer youth, and transgender youth.
At the same time, it feels to me like the controversy around this one is just: how is this deployed? Who will get to use it? Will they always be operating with their children’s best interests at heart? But there’s no technical controversy here. This is a policy controversy, as near as I understand. Is that right, Jen?
JK: I think so. I say that with a small hesitation, because I am not sure, and Riana may know the answer to this, where they’re doing that real-time scanning to determine, I guess, the proportion of skin the image probably contains. I assume that’s happening on the client side, on the phone itself. And I don’t know if Riana has any particular concerns about how that’s being done.
Most of the criticisms I’ve heard raised about this are some really good normative questions around what type of family and what type of parenting structure does this really seek to help? I’m a parent, I have my kid’s best interests at heart. But not every family operates in that way. And so I think there’s just been a lot of concerns that just assuming that reporting to parents is the right thing to do won’t always yield the best consequences for a wide variety of reasons.
Riana, do you have any concerns on the technical side that are not policy concerns? That’s how I keep thinking about it. There’s a bunch of technical stuff: we’re creating capabilities. And there’s a bunch of policy stuff: how we’re using those capabilities. And obviously the third one, which is the scanning iCloud photos, contains both of those controversies. This one, it really feels like, as Jen called it, a normative controversy.
RP: So, yeah — their documentation is clear that they are analyzing images on the device, and I know that there has been some concern that because it’s not transparent from their documentation exactly how this is happening, how accurate is this image analysis going to be. What else is going to get ensnared in this, that might not actually be as accurate as Apple is saying it’s going to be? That’s definitely a concern that I’ve seen from some of the people who work on the issue of trying to help people who have been abused, in their family life or by intimate partners.
And it’s something that honestly, I don’t understand the technology well enough, and I also don’t think that Apple has provided enough documentation to enable reasoned analysis, and thoughtful analysis. That seems to be one of the [things] they’ve tripped over, is not providing sufficient documentation to enable people to really inspect and test out their claims.
That is absolutely a theme that runs right into the third announcement, which is this very complicated cryptographic system to check images that are uploaded to iCloud photos for known child sexual abuse material. I’m not even going to try to explain this one. Riana, I’m just going to defer to you. Explain how Apple says this system works.
RP: This will be done on the client baked into the operating system and deployed for every iPhone running iOS 15, once that comes out around the world. But this will only be turned on within the United States at least, so far. There is going to be an on-device attempt to try and make a hash of the photos you have uploaded to iCloud Photos, and check the hash against the hash database that is maintained by the National Center for Missing and Exploited Children, or NCMEC, that contains known child sex abuse material, or CSAM for short.
There is not going to be actual CSAM on your phone. There’s not going to be a search of everything so far on your camera roll, only if [the photos] are going into iCloud photos. If you have one image that is in the NCMEC database, that will not trigger review by Apple, where they will have a human in the loop to take a look. It will be some unspecified threshold number of images that have to be triggered by their system, which is more complex than I want to try and explain.
So, if there is a collection of CSAM material sufficient to cross the threshold, then there will be the ability for a human reviewer at Apple to review and confirm that these are images that are part of the NCMEC database. They’re not going to be looking at unfiltered, horrific imagery. There is going to be some degraded version of the image, so that they aren’t going to be exposed to this. Really, it’s very traumatic for people who have to review this stuff.
And then if they confirm that it is, in fact, known CSAM, then that report goes to NCMEC, pursuant to Apple’s duties under federal law, and then NCMEC will involve law enforcement.
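[Editor’s illustration: the flow Riana describes, hash each upload, compare it against a database of known hashes, and escalate to human review only once enough matches accumulate, can be sketched in a few lines of Python. This is a rough sketch under loud assumptions, not Apple’s design: it uses an ordinary SHA-256 digest as a stand-in for Apple’s image hash, it leaves out the cryptographic layer entirely, and the names `digest`, `count_matches`, `needs_human_review` and the value of `REVIEW_THRESHOLD` are all hypothetical.]

```python
import hashlib

# Hypothetical value: the real threshold is described only as "unspecified".
REVIEW_THRESHOLD = 3


def digest(data: bytes) -> str:
    """Stand-in hash (SHA-256). An assumption, not Apple's image hash."""
    return hashlib.sha256(data).hexdigest()


def count_matches(uploads, known_hashes):
    """Count how many uploaded items hash to an entry in the known set."""
    return sum(1 for item in uploads if digest(item) in known_hashes)


def needs_human_review(uploads, known_hashes, threshold=REVIEW_THRESHOLD):
    """Escalate only when the number of matches reaches the threshold."""
    return count_matches(uploads, known_hashes) >= threshold


# Toy data: byte strings standing in for image files.
known_hashes = {digest(b"known-1"), digest(b"known-2"), digest(b"known-3")}
one_match = [b"holiday", b"known-1", b"cat"]
many_matches = [b"known-1", b"known-2", b"known-3", b"cat"]

print(needs_human_review(one_match, known_hashes))
print(needs_human_review(many_matches, known_hashes))
```

[In this toy version a single match does nothing on its own, which mirrors the point that one matching image does not trigger review.]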
One of the things that’s very challenging to understand here is that Apple has built it this way so they’re not scanning iCloud data in the cloud, from what I understand. What they don’t want to do is have people upload their photo libraries to iCloud, and then scan a bunch of information in the cloud.
That other way of doing it, which is in the cloud, is what the other major tech companies do, and that is kind of our expectation of what they do.
JK: Right, although I think the use case is potentially quite different. It’s one of the interesting questions why Apple is doing this in such an aggressive and public way, given that they were not a major source of child sexual violence imagery reporting to begin with. But when you think about these different products, in the online ecosystem, a lot of what you’re seeing are pedophiles who are sharing these things on these very public platforms, even if they carve out little small spaces of them.
And so they’re usually doing it on a platform, right? Whether it’s something like Facebook, WhatsApp, Dropbox, whatever it might be. And so, yes, in that case, you’re usually uploading imagery to the platform provider, it’s up to them whether they want to scan it in real time to see what you are uploading. Does it match one of these known images, or known videos that NCMEC maintains a database of?
That they’re doing it this way is just a really interesting, different use case than what we often see. And I’m not sure if Riana has any kind of theory behind why they’ve decided to take this particular tactic. I mean, when I first heard about it, the idea that I was going to have the entire NCMEC hash database sitting on my phone — I mean, obviously, hashes are extremely small text files, so we’re talking about just strings of characters that to the human eye, it just looks like garbage, and they don’t take up a lot of memory, but at the same time, the idea that we’re pushing that to everybody’s individual devices was kind of shocking to me. I’m still kind of in shock about it. Because it’s just such a different use case than what we’ve seen before.
RP: One of the concerns that has been raised with having this kind of client-side technology being deployed is that once you’re pushing it to people’s devices, it is possible — this is a concern of researchers in this space — for people to try and reverse-engineer that, basically, and figure out what is in the database. There’s a lot of research that’s done there. There are fears on one side about, well what if something that is not CSAM gets slipped into this database?
The fear on the other side is, what if people who have really strong motivations to continue trading CSAM try to defeat the database by figuring out what’s in it, figuring out how they can perturb an image, so that it slips past the hash matching feature.
And that’s something that I think is a worry, that once this is put onto people’s devices — rather than happening server-side as currently happens with other technologies such as PhotoDNA — that you are opening up an avenue for malicious reverse engineering to try and figure out how to continue operating, unimpeded and uncaught.
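[Editor’s illustration: the worry about perturbed images comes down to how tolerant the hash is. The Python sketch below contrasts an exact cryptographic hash, which changes completely when one pixel changes, with a toy "average hash" that compares each pixel with the image’s mean brightness. It is an illustration only: the toy hash is not PhotoDNA or Apple’s algorithm, the eight-value "images" are made up, and `exact_hash`, `average_hash` and `hamming` are hypothetical helper names.]

```python
import hashlib


def exact_hash(pixels):
    """Cryptographic hash of the raw pixel bytes: any change alters it."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Number of positions where two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy 8-pixel grayscale "images" (values 0-255), invented for illustration.
original = [10, 200, 30, 220, 15, 210, 25, 230]
tweaked = [11, 200, 30, 220, 15, 210, 25, 230]    # one pixel nudged by 1
unrelated = [200, 10, 220, 30, 210, 15, 230, 25]  # a different picture

print(exact_hash(original) == exact_hash(tweaked))             # False
print(hamming(average_hash(original), average_hash(tweaked)))  # 0
print(hamming(average_hash(original), average_hash(unrelated)))  # 8
```

[A matcher built on a hash like this would treat small distances as a match, so how much an image can change before it stops matching depends on the hash and the tolerance chosen, details that are not public for the real systems.]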
I read some strident statements from the EFF (Electronic Frontier Foundation) and Edward Snowden, and others, calling this a backdoor into the iPhone. Do you think that is a fair characterization, Riana?
RP: I don’t like using the word backdoor because it’s a very loaded term and it means different things to different people. And I don’t know that I agree with that because this is all still happening on the client. Right? Apple is very careful to not mention that there is end-to-end encryption for iMessage. And I agree this gives an insight into what people are doing on their phone that was not there before. But I don’t know whether that means that you could characterize it as a backdoor.
I’ve heard a lot of people talking about, like, “Does this mean it’s not end-to-end encryption anymore? Does this mean it’s a backdoor?” I don’t care. I don’t care what we’re calling it. That’s a way of distracting from the main things that we’re actually trying to talk about here, which I think are: what are the policy and privacy and free expression and data security impacts that will result from Apple’s decision here? And how will that go out beyond the particular CSAM context? And will what they’re doing work to actually protect children better than what they’ve been doing to date? So quibbling over labels is just not very interesting to me, frankly.
This comes back to that efficacy question that we’re talking about with Siri. Right now, in order to detect CSAM material, you have to A, be somebody who has it, B, be putting it into your camera roll, and then C, uploading that to iCloud photos. I feel like if criminals are dumb, maybe they’re going to get caught. But it seems very easy for anybody with even a moderate amount of interest to avoid this system, thus reducing the need for this controversy at all.
JK: There’s a couple things here. One is that you could take the position that Apple’s being extremely defensive here and saying, essentially, “Hey, pedophile community, we don’t want you here, so we’re going to, in a very public way, work to defeat your use of our products for that purpose.” Right? And that might be quite effective.
I want to actually add a little context here for why I’m in this conversation. Before I worked in academia, I used to work in [the tech] industry. I worked for about two years building a tool to review CSAM material and detect it. And when I worked on this project, it was very clear from the beginning that the goal was to get it off the servers of the company I was working for. Like — there was no higher goal. We were not going to somehow solve the child pornography problem.
That’s where I have a particular insight. One of the reasons Apple could be taking this stand could be a moral issue — it could be that they’ve decided that they just simply do not want their products associated with this type of material, and in a very public way they’re going to take a stand against it. I think you’re right. I think that there are people for whom, if you’re going to get caught using an Apple product, it’s probably because you weren’t necessarily well-versed in all the ways to try to defeat this type of thing.
[But] I think it’s really important to remember [that] when you talk about these issues and you think about this group of people, that they are a community. And there are a lot of different ways that you can detect this content. I would feel a lot better about this decision if I felt like what we were hearing is that all other methods have been exhausted, and this is where we are at.
And I am in no way of the belief that all other methods have been exhausted, by Apple or by kind of the larger tech community et al, who I think has really failed on this issue, given I worked on it from 2002 to 2004 and it’s gotten tremendously worse since that time. A lot more people have joined the internet since then, so it is kind of a question of scale. But I would say industry across the board has really been bad at really trying to defeat this as an issue.
What are the other methods?
JK: It’s important to understand that this is a community of users, and different communities use different products in different ways. When you’re in product design, you’re designing a product with particular users in mind. You kind of have your optimal user groups that you want to privilege the product for, who you want to attract, how you want to design the features for.
The kind of work I did to try to understand this community, it became very clear that this group of users know what they’re doing is illegal. They don’t want to get caught, and they use things very materially different than other users. And so if you’re willing to put in the time to understand how they operate and put in the resources to detect them, and to really see how they differ from other users — because they don’t use these products the same way that you and I probably do. Right? They’re not loading up photos to share with friends and family. They’re operating under subterfuge. They know what they’re doing is highly illegal.
There’s often a great deal of pressure in terms of timing, for example. One of the things I witnessed in the work I did was that people often would create accounts and basically have an upload party. They would use the service at an extremely high rate for an extremely short amount of time and then ditch it, ditch whatever product they were working in. Because they knew that they only had a limited amount of time before they would get caught.
To just assume that you can’t potentially put in more work to understand how these people use your product, and that they may be detectable in ways that don’t require the types of work that we’re seeing Apple do — if I had more reassurance they’d actually kind of done that level of research and really exhausted their options I would probably feel more confident about what they’re doing.
I don’t want to just point the finger at Apple. I think this is an industry-wide problem, with a real lack of devotion to resources behind it.
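[Editor’s illustration: the "upload party" pattern Jen describes, a very high rate of use over a very short time, is the kind of behavioral signal that can be checked without inspecting content at all. The Python sketch below finds the largest number of uploads inside any sliding time window and flags an account when that count exceeds a limit. It is a guess at the general idea, not any company’s method: the window length, the limit, and the names `max_uploads_in_window` and `looks_like_burst` are all hypothetical.]

```python
from collections import deque


def max_uploads_in_window(timestamps, window_seconds):
    """Largest number of uploads falling inside any window of this length."""
    window = deque()
    best = 0
    for t in sorted(timestamps):
        window.append(t)
        # Drop uploads that are too old to share a window with this one.
        while t - window[0] > window_seconds:
            window.popleft()
        best = max(best, len(window))
    return best


def looks_like_burst(timestamps, window_seconds=3600, limit=500):
    """Flag an account whose peak hourly upload count exceeds the limit.

    Both defaults are hypothetical values chosen for illustration.
    """
    return max_uploads_in_window(timestamps, window_seconds) > limit


# Toy upload times in seconds since the account was created.
casual_account = [0, 86_400, 172_800, 259_200]  # one upload per day
burst_account = list(range(1000))               # 1,000 uploads in ~17 minutes

print(looks_like_burst(casual_account))
print(looks_like_burst(burst_account))
```

[A signal like this would only ever be a starting point for further review, since plenty of legitimate users also upload in bulk.]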
RP: The trouble with this particular context is how extremely unique CSAM is compared to any other kind of abusive content that a provider might encounter. It is uniquely opaque in terms of how much outside auditability or oversight or information anybody can have.
I mentioned earlier that there’s a risk that people might be able to try and reverse-engineer what’s in the database of hashed values to try and figure out how they could subvert and sneak CSAM around the database.
The other thing is that it’s hard for us to know exactly what it is that providers are doing. As Jen was saying, there’s a bunch of different techniques that they could take and different approaches that they can employ. But when it comes to what they are doing on the backend about CSAM, they are not very forthcoming because everything that they tell people to explain what it is they’re doing is basically a roadmap to the people who want to abuse that process, who want to evade it.
So it is uniquely difficult to get information about this on the outside, as a researcher, as a user, as a policymaker, as a concerned parent, because of this veil of secrecy that hangs over everything to do with this whole process, from what is in the database, to what are different providers doing. Some of that sometimes comes out a little bit in prosecutions of people who get caught, by providers, for uploading and sharing CSAM on their services. There will be depositions and testimony and so forth. But it’s still kind of a black box. And that makes it hard to critique the suggested improvements, to have any kind of oversight.
And that’s part of the frustration here, I think, is that it’s very difficult to say, “You just have to trust us and trust everything all the way down from every point, from NCMEC on down,” and simultaneously, “Just know that what we’re doing is not something that has other collateral harms,” because for anything outside of CSAM, you have more ambiguity and legitimate use cases and context where it matters.
When it comes to CSAM, context does not matter. Something that I’ve been saying in recent days is: there’s no fair use for CSAM the way that there is for using copyrighted work. There’s this lack of information that makes it really difficult for folks like Jen or me or other people in civil society, other researchers, to be able to comment. And Jen, I’m so glad that you have this background, that you at least have both the privacy and the understanding from working on this from the provider’s side.
If you take that and you view it from Apple’s side, most charitably: well, at least Apple announced something. Right? They are being transparent, to a degree. We went and asked Google, “Hey, do you do this scanning in Google Photos?” And there’s no way to know. We just don’t know the answer to that question.
I think if you went to Dropbox and asked them they would just not tell you. We assume that they are. But at least here, Apple is saying, “We’re doing it. Here’s the method by which we’re doing it.” That method, that addition of capability to the iPhone, is problematic in various ways. But they’re copping to it and they’re explaining how it works. Do they get points for that?
RP: They certainly learned that they won’t get any plaudits for that. You’ve identified that. This might be a point where they say other organizations scan using PhotoDNA in the cloud, and they do so over email. And I don’t know how well understood that is by the general public, that, for most of the services that you use, if you are uploading photos, they are getting scanned to look for CSAM for the most part. If you’re using webmail, if you’re using a cloud storage provider — Dropbox absolutely does.
But you’re right that they are not necessarily that forthcoming about it in their documentation. And that’s something that might kind of redound to the benefit of those who are trying to track and catch these offenders, is that there may be some misunderstanding or just lack of clarity about what is happening. That trips up people who trade in this stuff and share and store this stuff because they don’t realize that.
I guess there’s almost some question about whether Apple is kind of ensuring that there will be less CSAM on iCloud Photos three months from now than there is today, because they’re being more transparent about this and about what they are doing.
JK: There is a really complicated relationship here between the companies and law enforcement that I think bears mentioning, which is that, the companies, broadly, are the source of all this material. You know? Hands down. I don’t even know if you see offline CSAM these days. It’s all online, and it’s all being traded on the backs of these large organizations.
Holding CSAM is illegal. Every copy the platforms hold is a felony, essentially, a criminal felony. At the same time that they are the source of this material and law enforcement wants to crack down, law enforcement needs the platforms to report it. So there’s this tension at play that I think is not necessarily well understood from the outside.
There’s a bit of a symbiotic relationship here where, if the companies crack down too much and force it all off their services, it all ends up on the dark web, completely out of the reach of law enforcement without really heavy investigative powers. In some ways, that disadvantages law enforcement. One could argue that they need the companies to not crack down so much that it completely disappears off their services because it makes their job much harder. So there is a very weird tension here that I think needs to be acknowledged.
It feels like one enormous aspect of this entire controversy is the fact that the scanning is being done on the device. That’s the Rubicon that’s been crossed: up until now, your local computer has not scanned your local storage in any way. But once you hit the cloud, all kinds of scanning happens. That’s problematic, but it happens.
But we have not yet entered the point where law enforcement is pushing a company to do local scanning on your phone, or your computer. Is that the big bright line here that’s causing all the trouble?
RP: I view this as a paradigm shift, to take where the scanning is happening from in the cloud, where you are making the choice to say, “I’m going to upload these photos into iCloud.” It’s being held in third parties’ hands. You know, there’s that saying that “it’s not the cloud; it’s just somebody else’s computer,” right?
You’re kind of assuming some level of risk in doing that: that it might be scanned, that it might be hacked, whatever. Whereas moving it down onto the device — even if, right now, it’s only for photos that are in the cloud — I think is very different and is intruding into what we consider a more private space that, until now, we could take for granted that it would stay that way. So I do view that as a really big conceptual shift.
Not only is it a conceptual shift in how people might think about this, but also from a legal standpoint. There is a big difference between data that you hand over to a third party and assume the risk that they’re going to turn around and report to the cops, versus what you have in the privacy of your own home or in your briefcase or whatever.
I do view that as a big change.
JK: I would add that some of the dissonance here is the fact that we just had Apple come out with the “Ask App Not to Track” feature, which was already in existence before, but they actually made that dialog box prominent to ask you when you were using an app if you want the app to track you. It seems a bit dissonant that they just rolled out that feature, and then suddenly, we have this thing that seems almost more invasive on the phone.
But I would say, as someone who’s been studying privacy in the mobile space for almost a decade, there is already an extent to which these phones aren’t ours, especially when you have third-party apps downloading your data, which has been a feature of this ecosystem for some time. This is a paradigm shift. But maybe it’s a paradigm shift in the sense that we had areas of the phone that we maybe thought were more off-limits, and now they are less so than they were before.
The illusion that you’ve been able to control the data on your phone has been nothing more than an illusion for most people for quite a while now.
The idea that you have a local phone that has a networking stack, that then goes to talk to the server and comes back — that is almost a 1990s conception of connected devices, right? In 2021, everything in your house is always talking to the internet, and the line between the client and the server is extremely blurry to the point where we market the networks. We market 5G networks, not just for speed but for capability, whether or not that’s true.
But that fuzziness between client and server and network means that the consumer might expect privacy on local storage versus cloud storage, but I’m wondering if this is actually a line that we crossed — or if just because Apple announced this feature, we’re now perceiving that there should be a line.
RP: It’s a great point because there are a number of people who are kind of doing the equivalent of “If the election goes the wrong way, I’m going to move to Canada” by saying “I’m just going to abandon Apple devices and move to Android instead.” But Android devices are basically just a local version of your Google Cloud. I don’t know if that’s better.
And at least you can fork Android, [although] I wouldn’t want to run a forked version of Android that I sideloaded from some sketchy place. But we’re talking about a possibility that people just don’t necessarily understand the different ways that the different architectures of their phones work.
A point that I’ve made before is that people’s rights, people’s privacy, people’s free expression, that shouldn’t depend upon a consumer choice that they made at some point in the past. That shouldn’t be path-dependent for the rest of time on whether or not their data that they have on their phone is really theirs or whether it actually is on the cloud.
But you’re right that, as the border becomes blurrier, it becomes both harder to reason about these things from arm’s length, and it also becomes harder for just average people to understand and make choices accordingly.
JK: Privacy shouldn’t be a market choice. I think it’s a market failure, for the most part, across industry. A lot of the assumptions we had going into the internet in the early 2000s was that privacy could be a competitive value. And we do see a few companies competing on it. DuckDuckGo comes to mind, for example, on search. But bottom line, privacy shouldn’t be left up to... or at least many aspects of privacy shouldn’t be left up to the market.
There’s another tension that I want to explore with both of you, which is the sort of generalized surveillance tension around encryption and Apple specifically. Apple famously will not unlock iPhones for law enforcement, or at least they say they won’t do it here. They say they don’t do it in other countries like China. They have wanted to encrypt the whole of iCloud, and famously the FBI talked them out of it. And in China, they’ve handed over the iCloud data centers to the Chinese government. The Chinese government holds those keys.
I believe what they want to do is encrypt everything and just wash their hands of it, and walk away, and say, “It’s our customers’ data. It’s private. It’s up to them.” They cannot, for various reasons. Do you think that tension has played into this system as it is currently architected, where they could just say, “We’re scanning all the data in the cloud directly and handing it over to the FBI or NCMEC or whoever,” but instead they want to encrypt that data, so they’ve now built this other ancillary system that does a little bit of local hashing comparison against the table in the cloud, it generates these complicated safety vouchers, and then it reports to NCMEC if you pass a threshold?
All of that seems like at some point they’re going to want to encrypt the cloud, and this is the first step towards a deal with law enforcement, at least in this country.
RP: I have heard that idea from someone else I talked to about this and mentioned it to my colleague at SIO, Alex Stamos. Alex is convinced that this is a prelude to announcing end-to-end encryption for iCloud later on. It seems to be the case that, however it is that they are encrypting iCloud data for photos, that they have said it is “too difficult to decrypt everything that’s in the cloud, scan it for CSAM, and do that at scale.” So it’s actually more efficient and, in Apple’s opinion, more privacy-protective, to do this on the client side of the architecture instead.
I don’t know enough about the different ways that Dropbox encrypts their cloud, that Apple encrypts their cloud, that Microsoft encrypts its cloud, versus how iCloud does it, to know whether Apple is in fact doing something different that makes it uniquely hard for them to scan in the cloud the way that other entities do. But certainly, I think that looming over all of this is that there have been several years’ worth of encryption fights, not just here in the US, but around the world, primarily focused in the last couple of years on child sex abuse material. Prior to that, it was terrorism. And there are always concerns about other types of material as well.
One thing that’s a specter looming over this move by Apple is that they may see this as something where they can provide some kind of a compromise and hopefully preserve the legality of device encryption and of end-to-end encryption, writ large, and maybe try and rebuff efforts that we have seen, including in the US, even just last year, to effectively ban strong encryption. This might be, “If we give an inch, maybe they won’t take a mile.”
I’ve seen a lot of pushback against that idea. Just to be honest, personally, if the outcome is the same — there’s scanning done of stuff you put on the cloud — I think that is the consumer expectation. Once you upload something to somebody else’s server, they can look at it. They can, I don’t know, copyright strike it. They can scan it for CSAM. That stuff is going to happen once you give your data away to a cloud provider. That does feel like a consumer expectation in 2021, whether that is good or bad. I just think it’s the expectation.
It seems like this is a very complicated mechanism to accomplish the same goal of just scanning in the cloud. But because it is this very complicated mechanism, that is “give an inch so they won’t take a mile,” the controversy seems to be they’re not just going to take the inch.
Governments around the world will now ask you to expand this capability in various ways that maybe the United States government won’t do, but certainly the Chinese government or the Indian government or other more oppressive governments would certainly take advantage of. Is there a backstop here for Apple to not expand the capability beyond CSAM?
RP: This is my primary concern. The direction I think this is going is that we don’t have, ready to go, hashed databases or hashes of images of other types of abusive content besides CSAM, with the exception of terrorist and violent extremist content. There is a database called GIFCT that is an industry collaboration, to collaboratively contribute imagery to a database of terror and violent extremist content, largely arising out of the Christchurch shooting a few years back, which really woke up a new wave of concern around the world about providers hosting terrorist and violent extremist material on their services.
So my prediction is that the next thing that Apple will be pressured to do will be to deploy the same thing for GIFCT as they are currently doing for the NCMEC database of hashes of CSAM. And from there on, I mean, you can put anything you’d like into a hashed image database.
Apple just said, “If we’re asked to do this for anything but CSAM, we simply will not.” And, that’s fine, but why should I believe you? Previously, their slogan was, “What happens on your iPhone stays on your iPhone.” And now that’s not true, right?
They might abide by that, where they think that the reputational trade-off is not worth the upside. But if the choice is between either you implement this hashed database of images that this particular government doesn’t like, or you lose access to our market, and you will never get to sell a Mac or an iPhone in this country again? For a large enough market, like China, I think that they will fold.
India is one place that a lot of people have pointed to. India has a billion people. They actually are not that big of a market for iPhones, at least commensurate with the size of the market that currently exists in China. But the EU is. The European Union is a massive market for Apple. And the EU just barely got talked off the ledge from having an upload filter mandate for copyright-infringing material pretty recently. And there are rumblings that they are going to introduce a similar plan for CSAM at the end of this year.
For a large enough market, basically, it’s hard to see how Apple, thinking of their shareholders, not just of their users’ privacy or of the good of the world, continues taking that stand and says, “No, we’re not going to do this,” for whatever it is they’re confronted with. Maybe if it’s lese majeste laws in Thailand that say, “You are banned from letting people share pictures of the king in a crop top” — which is a real thing — maybe they’ll say, “Eh, this market isn’t worth the hit that we would take on the world stage.” But if it’s the EU, I don’t know.
Let’s say the EU was going to implement this upload filter. If they say, “We need an upload filter for CSAM,” and Apple’s already built it, and it preserves encryption, isn’t that the correct trade-off?
RP: I think that there are absolutely a lot of folks that you could talk to who would quietly admit that they might think — if this really did get limited only ever to CSAM for real — that that might be a compromise that they could live with. Even though we’re talking about moving surveillance down into your device. And, really, there’s no limitation on them for only doing this for iCloud photos. It could be on your camera roll next. If we really believe that this would not move beyond CSAM, there are a lot of folks who might be happy with that trade-off.
Going back to your question about what a backstop might be, though, to keep it from going up beyond CSAM, this goes back to what I mentioned earlier about how CSAM is really unique among types of abuse. And once you’re talking about literally any other type of content, you’re necessarily going to have an impact on free expression values, on news, commentary, documentation of human rights abuses, all of these things.
And that’s why there’s already a lot of criticism of the GIFCT database that I mentioned, and why it would be supremely difficult to build out a database of images that are hate speech, whatever that means. Much less something that is copyright infringing. There is nothing that is only ever illegal and there’s no legitimate context, except for CSAM.
So I think that this is a backstop that Apple could potentially try to point to. But just because it would trample free expression and human rights to deal with this for anything else — I don’t necessarily know that that’s something that’s going to stop governments from demanding it.
For CSAM, there is a database of images that exist that are just illegal. You can’t have them, you can’t look at them. And there’s no value towards even pointing them out and saying, “look at this” for things like scholarship or research.
But a database of images of terrorism, the video of the Christchurch shooting, there are fuzzier boundaries there. Right? There are legitimate reasons for some people to have that video or to have other terrorism-related content: to report on it, to talk about it, to analyze it. And because that is a fuzzier set, it’s inherently more dangerous to implement these kinds of filters.
JK: I would argue that your example points to one of the easiest examples of that whole genre, and that it’s much harder from those extreme examples to work backwards to “what is terrorism” versus “what are groups engaging in rightful protests on terrorism-related issues,” for example? The line-drawing becomes much, much harder.
To kind of add some context to what Riana was saying, we are very much talking about the US and the fact that this content is illegal in the US. In Europe, those boundaries, I think, are much broader because they’re not operating under the First Amendment. I’m not a lawyer, so I’m definitely speaking a little bit outside my lane, but there isn’t the same free speech absolutism in the EU because they don’t have the First Amendment we have here in the US. The EU has been much more willing to try to draw lines around particular content that we don’t do here.
RP: I think that there are different regimes in different countries for the protection of fundamental rights that look a little different from our Constitution. But they exist. And so, when there have been laws or surveillance regimes that would infringe upon those, there are other mechanisms, where people have brought challenges and where some things have been struck down as being incompatible with people’s fundamental rights as recognized in other countries.
And it’s very difficult to engage in that line-drawing. I have a side hustle talking about deepfakes. There is absolutely a lot of interest in trying to figure out, okay, how do we keep mis- and disinformation from undermining democracy, from hurting vaccine rollout efforts, and also from having deepfakes influence an election. And it would be real easy — this is what law professors Danielle Citron and Bobby Chesney call “the liar’s dividend” — for a government that does not like evidence of something that actually happened, something that is true and authentic but inconvenient for them, to say, “That’s fake news. That is a deepfake. This is going in our database of hashes of deepfakes that we’re gonna make you implement in our country.”
So there’s all of these different issues that get brought up on the free expression side once you’re talking about anything other than child sex abuse material. Even there, it takes a special safe harbor under the federal law that applies to make it okay for providers to have this on their services. As Jen was saying, otherwise that is just a felony, and you have to report it. If you don’t report it, you don’t get the safe harbor, and that provider is also a felon.
The National Center for Missing and Exploited Children is the only entity in America that is allowed to have this stuff. There are some debates going on in different places right now about whether there are legitimate applications for using CSAM to train AI and ML models. Is that a permissible use? Is that re-victimizing the people who are depicted? Or would it have an upside in helping better detect other images? Because the more difficult side of this is detecting new imagery, rather than detecting known imagery that’s in a hashed database.
So even there, that’s a really hot button issue. But it gets back to Jen’s point: if you start from the fuzzy cases and work backwards, Apple could say “We’re not going to do this for anything other than CSAM because there’s never going to be agreement on anything else other than this particular database.”
Apple has also said they are not compiling the hashed databases, the image databases themselves. They’re taking what is handed to them, with the hashes, that NCMEC provides or that other child safety groups in other countries provide. If they don’t have visibility into what is in those databases, then again, it’s just as much of a black box to them as it is to anybody else. Which has been a problem with GIFCT: we don’t know what’s in it. We don’t know if it contains human rights documentation or news or commentary or whatever. Rather than just something that everybody can agree nobody should ever get to look at ever, not even consenting adults.
So you’re saying the danger there is, there’s a child safety organization in some corrupt country. And, the dictator of that country says, “There’s eight photos of me sneezing, and I just want them to not exist anymore. Add them to the database.” Apple will never know that it’s being used in that way, but the photos will be detected and potentially reported to the authorities.
RP: Well, Apple is saying one of the protections against non-CSAM uses of this is that they have a human in the loop who reviews matches, if there is a hit for a sufficiently large collection of CSAM. They will take a look and be like, “Yep, that matches the NCMEC databases.” If what they’re looking at is the Thai king in a crop top, then they can say, “What the heck? No, this isn’t CSAM.” And supposedly, that’s going to be another further layer of protection.
I think that I have already started seeing some concerns, though, about, “Well, what if there’s a secret court order that tells NCMEC to stick something in there? And then NCMEC employees have to just go along with it somehow?” That seems like something that could be happening now, given that PhotoDNA is based off of hashes that NCMEC provides even now for scanning Dropbox and whatever.
This is really highlighting how it’s just trust all the way down. You have to trust the device. You have to trust the people who are providing the software to you. You have to trust NCMEC. And it’s really kind of revealing the feet of clay that I think is kind of underpinning the whole thing. We thought our devices were ours, and Apple had taken pains during Apple v. FBI to say, “Your device is yours. It doesn’t belong to us.” Now it looks like, well, maybe the device really is still Apple’s after all, or at least the software on it.
This brings me to just the way they’ve communicated about this, which we were talking about briefly before we started recording. You both mentioned big meaty debates happening in civil society organizations, with policymakers, with academics, with researchers, about how to handle these things, about the state of encryption, about the various tradeoffs.
It does not appear that Apple engaged those debates in any substantive way before rolling this out. Do you think if they had, or if they had been more transparent with members of that community, that the reaction wouldn’t have been quite so heated?
RP: The fact that Apple rolled this out with maybe a one day’s heads up to some people in civil society orgs and maybe some media, isn’t helpful. Nobody was brought into this process while they were designing this, to tell them, “Here are the concerns that we have for queer 12-year-olds. Here are the concerns for privacy. Here are the civil liberties and the human rights concerns,” all of that. It looks like this was just rolled out as a fait accompli with no notice.
With, I have to say, really confusing messaging, given that there are these three different components and it was easy to conflate two of them and get mixed up about what was happening. That has further caused a lot of hammering and wailing and gnashing of teeth.
But if they had involved elements of civil society other than, presumably, NCMEC itself and probably law enforcement agencies, maybe some of the worst could have been averted. Or maybe they would have ignored everything that we would have said and just gone forth with the thing that they’re doing as-is.
But, as Jen and I can tell you — Jen and I have both been consulted before by tech companies who have something that impacts privacy. And they’ll preview that for us in a meeting and take our feedback. And that’s standard practice for tech companies, at least at some points. If you don’t really care what people’s feedback is, then you roll out where you get feedback from people later and later in the process.
But if they had really wanted to minimize the free expression and privacy concerns, then they should have consulted with outsiders, even if there are voices that they thought would be “too screechy,” as the executive director of NCMEC called everybody who expressed any kind of reservation about this. Even if they didn’t want to talk to what Apple might think is somehow the lunatic fringe or whatever, they could have talked to more moderate voices. They could have talked to academics. They could have talked to me, although I’m probably too screechy for them, and at least taken those concerns back and thought about them. But they didn’t.
We’ve heard about the controversy, we’ve heard about the criticism. Do you think Apple responds to that in any meaningful way? Do you think they back off this plan, or is this just shipping in iOS 15, as they’ve said?
JK: I think image hash matching ships. I don’t know about the “nanny cam,” again, for lack of a better word.
I predict that they will double down on the CSAM image scanning for all of the different reasons we’ve talked about today. I think Riana really hit the nail on the head — I think there’s some kind of political strategizing going on behind the scenes here. If they are trying to take a bigger stand on encryption overall, then this was the piece that they had to give up to law enforcement in order to do so.
RP: I think certainly for the stuff about Siri that is uncontroversial, they’ll keep rolling that out. I’m not certain, but it seems like the iMessage stuff either wasn’t messaged clearly at the beginning, or maybe they really did change over the course of the last few days in terms of what they said they were going to do. If that’s true, and I’m not sure whether it is, that then indicates that maybe there is some room to at least make some tweaks.
However, the fact that they rolled out this whole plan as a fait accompli, that’s going to be put into iOS 15 at the very end, without any consultations, suggests to me that they are definitely going to go forward with these plans. With that said, there may be some silver lining in the fact that civil society was not consulted at any point in this process, that now, maybe there’s an opportunity to use this concerted blowback as a way to try and get pushback in that might not have been possible, had civil society been looped in all along the way, and incorporated and neutralized, almost.
So, I’m not sanguine about the odds of them just not deploying this CSAM thing at all. Don’t get me wrong, I would love to be wrong about the slippery slope arguments, that the next thing will be demanding this for GIFCT, and then it’ll be deepfakes and copyright infringement. I would love to be proved wrong about that, even as silly as it would make me look. But I’m not sure that that’s going to be the case.
Update August 10th, 5:53PM ET: Added full transcript.