A couple of weeks ago the Google Books Settlement fairness hearing took place in New York City, where Judge Denny Chin heard dozens of oral arguments addressing the settlement’s implications for competition, copyright law, and privacy. The settlement raises a number of very challenging legal questions, and Judge Chin’s decision, expected to come down later this spring, is sure to be a page-turner no matter how he rules.
My work on the Google Books Settlement has focused on reader privacy concerns, which have been a major point of contention between Google and civil liberties groups like EFF, ACLU, and CDT. While I agree with these groups that existing legal protections for sensitive user information stored by cloud computing providers are inadequate, I do not believe that reader privacy should factor into the court’s decision on whether to approve or reject the settlement.
I elaborated on reader privacy in an amicus curiae brief I submitted to the court last September. I argued that because Google Books will likely earn a sizable portion of its revenues from advertising, placing strict limits on data collection (as EFF and others have advocated) would undercut Google’s incentive to scan books, ultimately hurting the very authors whom the settlement is supposed to benefit. While the settlement is not free from privacy risks, such concerns aren’t unique to Google Books, nor are they any more serious than the risks surrounding popular Web services like Google search and Gmail. Comparing Google Book Search to brick-and-mortar libraries is inapt, and like all cloud computing providers, Google has a strong incentive to safeguard user data and use it only in ways that benefit users and advertisers.
It’s worth noting that while Google has a reasonably strong track record of preventing data breaches and accidental disclosure of data to untrustworthy parties, it generally does not challenge court-approved criminal or civil subpoenas for data associated with its users. I didn’t properly articulate this in my amicus brief, in which I stated incorrectly that “Google has a history of vigorously resisting government data requests if it deems them invalid.” In fact, Google usually does not attempt to quash subpoenas, although it has done so at least once before (in 2006, Google successfully fought a request from the U.S. Department of Justice seeking logs containing millions of user search queries).
Upon receiving a subpoena for a user’s data, Google typically informs the user that his or her data will be handed over in 20 days unless the user successfully moves to quash the subpoena. Most other cloud computing providers have similar policies. In certain rare circumstances, however, subpoenas are issued in secret. In such cases, Google is barred from telling the user about the subpoena, so the user doesn’t have a chance to challenge it in court.
While Google’s policy for disclosing user data is perhaps not as protective of privacy as it could be, it’s still quite reasonable in light of the economic realities of cloud computing. Sure, Google could challenge every subpoena it receives as a matter of course (as CDT and others have urged), but such a policy would be prohibitively expensive given that Google likely processes tens of thousands of subpoenas each year (unfortunately, Google does not disclose exactly how many it receives). Remember, the vast majority of Google users aren’t even paying customers! Expecting Google to bear the legal burden of defending its users — some of whom actually are criminals — from legal proceedings is hardly fair.
Instead of trying to persuade Congress, regulatory agencies, and the courts to regulate Google and other online providers, privacy advocates should focus on the underlying deficiencies in U.S. privacy laws. Under the 1986 Electronic Communications Privacy Act (ECPA), many kinds of potentially sensitive user data can be obtained by government authorities with a mere subpoena, rather than a search warrant. Compounding this problem is the refusal of courts to extend Fourth Amendment protections to sensitive information stored in the cloud, on the basis of the seriously flawed “third-party doctrine.” To remedy this, Congress should amend ECPA to strengthen privacy protections for sensitive data stored by remote computing service providers. Just as authorities are required to obtain a search warrant if they wish to get hold of files stored in one’s home, warrants should also be necessary to compel cloud computing providers to disclose personal information that users clearly expect to remain private.
In the meantime, let’s not impose burdensome new regulations on online data collection. As Berin, Adam, and others have documented with incredible thoroughness (1, 2, 3, 4), smart data mining has myriad benefits for consumers, and targeted advertising is among the most promising avenues for financing future content production.