In case you’ve been in a pre-holiday daze this week, the blogosphere has been atwitter (not to mention a-twittering) with the news that the Hon. Louis L. Stanton, the federal district judge presiding over Viacom’s massive copyright infringement suit against YouTube, has ordered Google, which owns YouTube, to turn over its viewership records (12 terabytes). Most notably, TechCrunch’s Michael Arrington has called Judge Stanton a “moron” for failing to appreciate that “handing over user names and a list of videos they’ve watched to a highly litigious copyright holder is extremely likely to result in lawsuits against those users that have watched copyrighted content on YouTube.” Whatever one thinks of the Viacom v. YouTube/Google case, Arrington’s concern is misplaced (if not hysterical) and his logic betrays his ignorance of how litigation actually works.
Judge Stanton’s July 2 order (PDF) explains:
[YouTube and Google's] “Logging” database contains, for each instance a video is watched, the unique “login ID” of the user who watched it, the time when the user started to watch the video, the internet protocol address other devices connected to the internet use to identify the user’s computer (“IP address”), and the identifier for the video. That database (which is stored on live computer hard drives) is the only existing record of how often each video has been viewed during various time periods. Its data can “recreate the number of views for any particular day of a video.” Plaintiffs [primarily Viacom] seek all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website. They need the data to compare the attractiveness of allegedly infringing videos with that of non-infringing videos. A markedly higher proportion of infringing-video watching may bear on plaintiffs’ vicarious liability claim, and defendants’ substantial non-infringing use defense.
While Stanton denied other requests made by Viacom for the search code that powers YouTube and Google Video on the grounds that Viacom had not made a sufficient showing of need for such records, he rejected Google’s arguments that turning over the large amount of viewer data would be unduly burdensome (given today’s cheap and convenient storage). Also rejecting Google’s “speculative privacy concerns,” the judge agreed with Viacom that the “‘login ID is an anonymous pseudonym that users create for themselves when they sign up with YouTube’ which without more ‘cannot identify specific individuals.’” The judge noted that Google, hoist by its own petard, had elsewhere taken the position that IP addresses alone are not personally identifying:
We . . . are strong supporters of the idea that data protection laws should apply to any data that could identify you. The reality is though that in most cases, an IP address without additional information cannot.
The Electronic Frontier Foundation raises the valid question of whether the release of viewer data would violate the Video Privacy Protection Act (VPPA), passed in 1988 after a newspaper disclosed Supreme Court nominee Robert Bork’s video rental records during his controversial and abortive nomination. While Judge Stanton’s order dismisses this law as inapplicable in a footnote, EFF argues that the law does in fact apply because (i) the law covers “prerecorded video cassette tapes or similar audio visual materials,” which should include YouTube and (ii) some user names do identify users (e.g., “berinszoka”). If EFF is correct, the VPPA would preclude the kind of comprehensive data production ordered by Judge Stanton.
Whether or not EFF is correct as a legal matter, this is certainly the kind of question privacy advocates should ask. Those of us who argue that government should generally address concerns about user privacy by enforcing privacy policies (rather than dictating to companies how they should treat data through regulation) must be especially vigilant whenever the government forces companies to turn over potentially identifying user data, either to other companies in lawsuits such as this one or to law enforcement, lest the threat of the real “Big Brother” (government) completely obscure the fact that companies like Google live and die by their reputation, and thus have strong incentives to protect user privacy.
But Arrington is not engaging in such thoughtful analysis, merely name-calling:
I can understand why Judge Stanton, who graduated from law school in 1955, may be completely and utterly clueless when it comes to online video services. But perhaps one of his bright young clerks or interns could have told him that (1) handing over user names and a list of videos they’ve watched to a highly litigious copyright holder is extremely likely to result in lawsuits against those users that have watched copyrighted content on YouTube, and (2) YouTube’s source code is about as valuable as the hard drive it would be delivered on, since the core Flash technology is owned by Adobe and there are countless YouTube clones out there, most of which offer higher quality video. … Judge Stanton doesn’t seem to care much about [the Video Privacy Protection Act] for now. And he clearly doesn’t understand that far more data is being transferred than is necessary to comply with Viacom’s core stated concern, which is to understand the popularity of copyright infringing v. non-infringing material. Viacom has asked for far more data than that, and there’s only one use for that data: to sue individual users (or shake them down via the threat of lawsuit, which has been perfected by the RIAA) who have watched a few music videos or television shows on YouTube. I say this with the utmost respect, but Judge Stanton is a moron. And Google simply cannot hand this data over without facing a class action lawsuit of staggering proportions.
Is Arrington unfamiliar with the concept of a protective order? A standard feature of any major lawsuit, protective orders allow parties to limit use of sensitive information they may be required to provide in the “discovery” process during litigation. In this case, the protective order grants access to the viewership data Google is required to provide to a very limited number of individuals on plaintiffs’ litigation team, and requires that they “maintain that information … in confidence and use it only for the purposes of this litigation.” Thus, as noted by CNET, the viewership records could not be used in copyright infringement lawsuits against users, such as those pursued by the Recording Industry Association of America (RIAA).
For those interested, here is the most recent protective order in the case (see paragraphs 3 and 10):
Of course, there is always the possibility that such records, once released, might be accidentally disclosed, though the fact that the plaintiffs in this case are subject to criminal sanctions for violation of the protective order does create a rather strong incentive for them to avoid such disclosures. (EFF provides the example of the 2006 “AOL search data scandal.”) Accidental disclosure, or hacking, certainly is a valid concern. Indeed, this is exactly the kind of concern that always weighs in the balance when courts make decisions about whether to order the production of documents. In this particular case, Stanton noted, in rejecting Viacom’s demands for the YouTube and Google Video search code, that “the protections set forth in the stipulated confidentiality order are careful and extensive, but nevertheless not as safe as nondisclosure.” That is, Viacom had not shown sufficient need for the search code to overcome the risk of accidental disclosure, while he reached the opposite conclusion on viewership records.
The real question here is a difficult one of balancing the plaintiff’s need for certain data to make its case against concerns about the sensitivity of the data at issue. Such questions are highly fact-dependent, and it is always difficult for those outside the case to evaluate the claims, since we lack all the key details, many of which have been redacted from court filings.
Setting aside (but not trivializing) EFF’s arguments about the VPPA, the most one can say here is that Google’s response–to request that Viacom “respect users’ privacy and allow us to anonymize the logs before producing them under the court’s order”–seems eminently reasonable, at least from the outside. Certainly such a solution would set a valuable precedent for future disclosures that would allow plaintiffs like Viacom access to data where necessary while minimizing the risks to users’ privacy. Such a solution would be akin to the March 2006 court order that required Google to disclose only a sample of Google’s search index, rather than individual user search terms, in response to the Justice Department’s broader demands for data it said it needed to test software intended to block access to child pornography.