Newly declassified documents reveal that as of 2011, the NSA believed it was collecting thousands of emails and other communications with no connection to international terrorism every year, but it apparently had almost no way to get precise numbers or reliably weed out domestic data. In files released today by the Director of National Intelligence, an estimate shows that up to 56,000 communications a year were "wholly domestic," a kind of collection explicitly banned by surveillance laws. The FISA court, however, believed that actual numbers would be impossible to get. "The sheer volume of transactions acquired by the NSA through its upstream collection is such that any meaningful review of the entire body of transactions is not feasible," the court wrote in a memorandum reviewing the NSA's procedures.

To figure out how many communications involved only US citizens or people in the US, a sample of about 50,000 pieces of data was collected. Reviewers tried to determine how many were communications like emails and, of those, how many involved only people within the US. Then, the results were extrapolated to the millions of pieces of information actually gathered. But identifying purely domestic communications proved difficult, for the simple reason that email contents, metadata, and user data can't always accurately identify where someone is or where they hold citizenship. The NSA's procedures for examining communications, the court wrote, also made it likely that some purely domestic data would be swept up in the search for international terrorists.

The results of this study were troubling. The court believed that overall, the NSA was successfully minimizing the data that it actually collected, but dealing with that information was a different story. After gathering a set of communications, analysts often didn't look closely enough to catch individual messages that should have been discarded. Instead, those were stored with the other data for up to five years. "It appears that the NSA could do substantially more to minimize the retention of information concerning United States persons," the court wrote. "The government has not, for instance, demonstrated why it would not be feasible to limit access ... to a smaller group of specially trained analysts" who could better remove domestic communications.

The court allowed the NSA to keep collecting information in the same way, saying it would be too difficult to reduce how much US citizen data was swept up. But the agency was forced to review how it examined and stored that information. The current procedures, the court said, violated the Fourth Amendment. It also worried that the NSA had all but lied by giving misleading reports about US citizen data. "The court is troubled that the government's revelations ... mark the third instance in less than three years in which the government has disclosed a substantial misrepresentation regarding the scope of a major collection program."

A month later, the NSA put forward another proposal, promising that trained analysts would segregate and examine anything that could possibly involve US citizens and that data would be kept for two years instead of five. That proposal was accepted, and a 2012 letter says that the NSA later purged all data from before these reforms took place. Nonetheless, audits from that year still revealed privacy violations. And despite assurances that the NSA has always been strictly accountable for its actions, it seems to have operated for years without even recognizing that it was storing US citizens' data. A footnote in the document says that "until NSA's manual review of a six-month sample ... revealed the acquisition of wholly domestic communications, the government asserted that NSA had never found a wholly domestic communication in its upstream collection."