July 4, 2008

Privacy Falls into YouTube’s Data Tar Pit

Filed under: Internet, privacy, trade secret — wseltzer @ 3:53 pm

As a big lawsuit grinds forward, its parties engage in discovery, a wide-ranging search for information “reasonably calculated to lead to the discovery of admissible evidence.” (FRCP Rule 26(b)) And so Viacom has calculated that scouring YouTube’s data dumps would help provide evidence for Viacom’s copyright lawsuit.

According to a discovery order released Wednesday, Viacom asked for discovery of YouTube source code and of logs of YouTube video viewership; Google refused both. The dispute came before Judge Stanton, in the Southern District of New York, who ordered the video viewing records — but not the source code — disclosed.

The order shows the difficulty we have protecting personally sensitive information. The court could easily see the economic value of Google’s secret source code for search and video ID, and so it refused to compel disclosure of that “vital asset,” the “product of over a thousand person-years of work.”

But the user privacy concerns proved harder to evaluate. Viacom asked for “all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website,” including users’ viewed videos, login IDs, and IP addresses. Google contended it should not be forced to release these records because of users’ privacy concerns, an argument the court rejected.

The court erred in its assessment of both the personally identifying nature of these records and the scope of the harm. It makes no sense to discuss whether an IP address is or is not “personally identifying” without considering the context with which it is connected. It may not be a name, but it is often one search step from it. Moreover, even “anonymized” records often provide sufficiently deep profiles that they can be traced back to individuals, as researchers armed with the AOL and Netflix data releases showed.

Viewers “gave” their IP address and username information to YouTube for the purpose of watching videos. They might have expected the information to be used within Google, but would not have anticipated that it would be shared with a corporation busily prosecuting copyright infringement. Viewers may not be able to quantify economic harm, but if communications are chilled by the disclosure of viewing habits, we’re all harmed socially. The court failed to consider these third party interests in ordering the disclosure.

Trade secret wins, privacy loses. Google has said it will not appeal the order.

Is there hope for the end users here, concerned about disclosure of their video viewing habits? First, we see the general privacy problem with “cloud” computing: by conducting our activities at third-party sites, we place a great deal of information about our activities in their hands. We may do so because Google is indispensable, or because it tells us its motto is “don’t be evil.” But discovery demands show that it’s not enough for Google to follow good precepts.

Google, like most companies, indicates that it will share data where “We have a good faith belief that access, use, preservation or disclosure of such information is reasonably necessary to (a) satisfy any applicable law, regulation, legal process or enforceable governmental request.” Its reputation as a good actor is important, but the company is not going to face contempt charges over user privacy.

I worry that this discovery demand is just the first of a wave, as more litigants recognize the data gold mines that online service providers have been gathering: search terms, blog readership and posting habits, video viewing, and browsing might all “lead to the discovery of admissible evidence” — if the privacy barriers are as low as Judge Stanton indicates, won’t others follow Viacom’s lead? A gold mine for litigants becomes a tar pit for online services’ users.

Economic concerns, the cost of producing the data in response to a wave of subpoenas, or reputational concerns, the fear that users will be driven away from a service that leaves their sensitive data vulnerable, may exercise some constraint, but they’re unlikely to be enough to match our privacy expectations.

We need the law to supply protection against unwanted data flows, to declare that personally sensitive information — or the profiles from which identity may be extracted and correlated — deserves consideration at least on par with “economically valuable secrets.” We need better assurance that the data we provide in the course of communicative activities will be kept in context. There is room for that consideration in the “undue burden” discovery standard, but statutory clarification would help both users and their Internet service providers to negotiate privacy expectations better.

Is there a law? In this particular context, there might actually be law on the viewers’ side. The Video Privacy Protection Act, passed after reporters looked into Judge Bork’s video rental records, gives individuals a cause of action against “a video tape service provider who knowingly discloses, to any person, personally identifiable information concerning any consumer of such provider.” (“Video tape” includes similar audio visual materials.) Will any third parties intervene to ask that the discovery order be quashed?

Further, Bloomberg notes the concerns of Europeans, whose privacy regime is far more user-protective than that of the United States. Is this one case where “harmonization” can work in favor of individual rights?


  1. I agree that if this decision establishes a precedent it will unleash a wave of subpoenas that will dwarf what we saw when Iran-Contra taught everyone that emails never really get deleted.

    The Video Privacy Protection Act is promising, if the courts hold that video-on-demand is “similar” to “prerecorded video cassette tapes.” Since that act also requires the provider to “destroy personally identifiable information” within a year, that would seem to limit Google’s ability to retain log information. What does “destroy” mean in a world of backup tapes anyway?

    Comment by Christopher Herot — July 4, 2008 @ 5:00 pm

  2. Is YouTube Similar to Videotape?…

    Wendy Seltzer points out that Wednesday’s discovery order in Viacom v. YouTube, which requires Google to release the login names and IP addresses of every person who has ever viewed or embedded a YouTube video may be in contradiction to…

    Trackback by Christopher Herot's Weblog — July 4, 2008 @ 5:31 pm

  3. [...] Wendy’s Blog: Legal Tags » Privacy Falls into YouTube’s Data Tar Pit - excellent analysis from Wendy Seltzer of the YouTube data disaster in the making [...]

    Pingback by My del.icio.us bookmarks for July 5th through July 7th » the billblog — July 7, 2008 @ 2:01 am

  4. Wendy — great post and good quote on Sat in the Wall Street Journal.

    Comment by Auren Hoffman — July 7, 2008 @ 11:21 am

  5. Interestingly enough, Judge Stanton had the Video Privacy Protection Act staring him in the face. Google certainly brought it to his attention, as witnessed in Footnote 5 on page 13 of the order. Yet he seemingly ignored the following section of the VPPA:

    (2) A video tape service provider may disclose personally identifiable information concerning any consumer—

    (F) pursuant to a court order, in a civil proceeding upon a showing of compelling need for the information that cannot be accommodated by any other means, if—
    (i) the consumer is given reasonable notice, by the person seeking the disclosure, of the court proceeding relevant to the issuance of the court order; and
    (ii) the consumer is afforded the opportunity to appear and contest the claim of the person seeking the disclosure.

    If an order is granted pursuant to subparagraph (C) or (F), the court shall impose appropriate safeguards against unauthorized disclosure.

    The judge said that Viacom “need(s) the data to compare the attractiveness of allegedly infringing videos with that of non-infringing videos.” But nowhere in the order does he explain why they need the IP AND the username, much less why this need is compelling or why Viacom can’t prove its case without this information.

    And even if the need is compelling, I do not remember Viacom giving me reasonable notice of this disclosure or affording me an opportunity to contest this claim.

    Judges hold people in contempt of Court. It’s too bad that people can’t hold the Judge in contempt of the People.

    Comment by themaskedanalyst — July 8, 2008 @ 9:08 pm

  6. [...] Wendy Seltzer at the Citizen Media Law Project summarizes the bifurcated outcome of the case: “trade secret wins; privacy loses.” Kurt Opsahl of the Electronic Frontier Foundation calls this a “setback to privacy rights,” and argues that some of the login names and IP address information, which the court states are anonymous, can in fact be used to identify individual users. The most contentious portion of the 25-page opinion from Judge Louis Stanton concerns YouTube’s logging database. Each time a video is watched on youtube.com, this database records the YouTube account name of the viewer (if he or she has one), the IP address of the viewer’s computer, an identifier for the video, and the time. [...]

    Pingback by JOLT Digest » Viacom v. YouTube — July 12, 2008 @ 8:02 pm

  7. [...] Why does Viacom get a record of every legal video I’ve watched? What right do they have? Wendy Seltzer writes about the dangerous precedent being set: “I worry that this discovery demand is just the first of a wave, as more litigants recognize [...]

    Pingback by Unit Structures – Ongoing Analysis of YouTube-Viacom — August 22, 2008 @ 11:44 am
