personal data – Technology Liberation Front

600 Billion Data Points Per Day? It’s Time to Restore the Fourth Amendment

Jim Harper — Mon, 17 Aug 2009 19:04:14 +0000

Jeff Jonas has published an important post: “Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food!”

More than you probably realize, your mobile device is a digital sensor, creating records of your whereabouts and movements:

Mobile devices in America are generating something like 600 billion geo-spatially tagged transactions per day. Every call, text message, email and data transfer handled by your mobile device creates a transaction with your space-time coordinate (to roughly 60 meters accuracy if there are three cell towers in range), whether you have GPS or not. Got a Blackberry? Every few minutes, it sends a heartbeat, creating a transaction whether you are using the phone or not. If the device is GPS-enabled and you’re using a location-based service your location is accurate to somewhere between 10 and 30 meters. Using Wi-Fi? It is accurate below 10 meters.

The process of deploying this data to markedly improve our lives is underway. A friend of Jonas’ says that space-time travel data used to reveal traffic tie-ups shaves two to four hours off his commute each week. When it is put to full use, “the world we live in will fundamentally change. Organizations and citizens alike will operate with substantially more efficiency. There will be less carbon emissions, increased longevity, and fewer deaths.”

This progress is not without cost:

A government not so keen on free speech could use such data to see a crowd converging towards a protest site and respond before the swarm takes form — detected and preempted, this protest never happens. Or worse, it could be used to understand and then undermine any political opponent.

Very few want government to be able to use this data as Jonas describes, and not everybody wants to participate in the information economy quite so robustly. But the public can’t protect itself against what it can’t see. So Jonas invites holders of space-time data to reveal it:

[O]ne way to enlighten the consumer would involve holders of space-time-travel data [permitting] an owner of a mobile device the ability to also see what they can see:

(a) The top 10 places you spend the most time (e.g., 1. a home address, 2. a work address, 3. a secondary work facility address, 4. your kids school address, 5. your gym address, and so on);

(b) The top three most predictable places you will be at a specific time when on the move (e.g., Vegas on the 215 freeway passing the Rainbow exit on Thursdays 6:07 – 6:21pm — 57% of the time);

(c) The first name and first letter of the last name of the top 20 people that you regularly meet-up with (turns out to be wife, kids, best friends, and co-workers – and hopefully in that order!)

(d) The best three predictions of where you will be for more than one hour (in one place) over the next month, not counting home or work.
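
Item (a) in Jonas's list is simple to compute once the location records exist. The following is a minimal sketch, not a description of how any carrier actually processes its data. It assumes a hypothetical input format of `(timestamp, place_label)` pings, where raw coordinates have already been resolved to place labels, and a hypothetical `max_gap_minutes` cutoff for deciding when two pings belong to the same visit.

```python
from collections import defaultdict
from datetime import datetime


def top_places(pings, n=10, max_gap_minutes=30):
    """Rank place labels by estimated dwell time in minutes.

    pings: list of (datetime, place_label) tuples (hypothetical format).
    The time between two consecutive pings counts as dwell time only if
    both pings are at the same place and no more than max_gap_minutes apart.
    """
    dwell = defaultdict(float)
    ordered = sorted(pings)  # sort chronologically
    for (t0, p0), (t1, p1) in zip(ordered, ordered[1:]):
        gap = (t1 - t0).total_seconds() / 60
        if p0 == p1 and gap <= max_gap_minutes:
            dwell[p0] += gap
    # Longest total dwell time first
    ranked = sorted(dwell.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]
```

With a heartbeat every few minutes, as Jonas describes, even this crude tally would separate home and work from everything else within days.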

Google’s Android and Latitude products are candidates to take the lead, he says, and I agree. Google collectively understands both openness and privacy, and it’s nimble enough still to execute something like this. Other mobile providers would be forced to follow this innovation.

What should we do to reap the benefits while minimizing the costs? The starting point is you: It is your responsibility to deal with your mobile provider as an adult. Have you read your contract? Have you asked them whether they collect this data, how long they keep it, whether they share it, and under what terms?

Think about how you can obscure yourself. Put your phone in airplane mode when you are going someplace unusual – or someplace usual. (You might find that taking a break from being connected opens new vistas in front of your eyes.) Trade phones with others from time to time. There are probably hacks on mobile phone systems that could allow people to protect themselves to some degree.

Privacy self-help is important, but obviously it can be costly. And you shouldn’t have to obscure yourself from your mobile communications provider, giving up the benefits of connected living, to maintain your privacy from government.

The emergence of space-time travel data begs for restoration of Fourth Amendment protections in communications data. In my American University Law Review article, “Reforming Fourth Amendment Privacy Doctrine,” I described the sorry state of the Fourth Amendment as to modern communications.

The “reasonable expectation of privacy” doctrine that arose out of the Supreme Court’s 1967 Katz decision is wrong; it isn’t even founded in the majority holding of the case. The “third-party doctrine,” following Katz in a pair of 1970s Bank Secrecy Act cases, denies individuals Fourth Amendment claims on information held by service providers. Smith v. Maryland brought it home to communications in 1979, holding that people do not have a “reasonable expectation of privacy” in the telephone numbers they dial. (Never mind that they actually have privacy; the doctrine trumps it.)

Concluding, apropos of Jonas’ post, I wrote:

These holdings were never right, but they grow more wrong with each step forward in modern, connected living. Incredibly deep reservoirs of information are constantly collected by third-party service providers today.

Cellular telephone networks pinpoint customers’ locations throughout the day through the movement of their phones. Internet service providers maintain copies of huge swaths of the information that crosses their networks, tied to customer identifiers. Search engines maintain logs of searches that can be correlated to specific computers and usually the individuals that use them. Payment systems record each instance of commerce, and the time and place it occurred.

The totality of these records are very, very revealing of people’s lives. They are a window onto each individual’s spiritual nature, feelings, and intellect. They reflect each American’s beliefs, thoughts, emotions, and sensations. They ought to be protected, as they are the modern iteration of our “papers and effects.”

A Response to Jonathan Zittrain in The New York Times

Ryan Radia — Mon, 27 Jul 2009 18:52:45 +0000

In response to Professor Jonathan Zittrain’s op-ed in The New York Times last Monday about online privacy and open platforms (which Adam thoroughly refuted last week), I have a letter to the editor in today’s New York Times:

To the Editor: Re “Lost in the Cloud” (Op-Ed, July 20): In discussing the privacy risks that have accompanied the growth of the Internet, Prof. Jonathan Zittrain rightly bemoans the willingness of governments to violate individuals’ privacy rights. Unfortunately, he proposes new legal restrictions that would stifle online innovation while doing little to enhance consumer privacy. Mr. Zittrain proposes a “fair practices law” that would require companies to release personal data back to users upon request. Such a rule may sound workable, but purging specific data across globally dispersed server farms is no simple endeavor. Who is to pay for the implementation of such privacy procedures — especially for free services like Facebook or Twitter that have yet to turn a profit? A better approach to online privacy is to educate users on safeguarding personal information. Ultimately, however, the only foolproof approach to protecting sensitive data online is to simply not disclose it.

To clarify my last point, I don’t think that universal nondisclosure of sensitive data online is necessarily a wise approach to privacy. Rather, my point is that it’s important to remember that transmitting data on the Internet — a very public network — entails some degree of risk, no matter how strong the encryption or how diligent the party at the other end. And free services like Facebook and Twitter are all about making personal information public — they simply aren’t designed to provide ironclad data security or anything remotely resembling it. Other online services, like bank websites or enterprise-grade Web collaborative tools, are able to offer far stronger privacy assurances backed by strong terms of service. Privacy is not a black and white matter. It involves shades of gray, which is one reason why legislation is such an ineffective means of dealing with privacy challenges.

How much do we really care about protecting our personal information?

Ryan Radia — Sun, 17 Aug 2008 23:10:08 +0000

Over on Techdirt, Mike Masnick discusses an interesting new survey that highlights the sharp disconnect between how much we claim privacy matters to us and how far we’re willing to go to safeguard it. America Online polled 1,000 users in the United Kingdom, and the results further reinforce what other recent studies have suggested:

The study found 84% of users say they carefully guard their info online — but when tested, 89% of people actually did give away info in the same exact survey.

The AOL survey brings to mind security guru Bruce Schneier’s insightful quip on privacy from back in 2001:

If McDonald’s in the United States would give away a free hamburger for a DNA sample they would be handing out free lunches around the clock. So people care about their privacy, but they don’t care to pay for it.

When presented with the option of sacrificing a bit of privacy for something of value, like a chocolate bar or a free gift certificate, many users are surprisingly willing to dole out data to third parties for commercial use. And the value of personal details to marketers is massive. As social networking sites and ad-serving networks amass ever greater knowledge of our hobbies, political views, and even our favorite music, these sites are getting better at mining data to tailor ads with pinpoint precision, commanding high click rates while sustaining server farms and original content publishers.

Online ads are often irrelevant, and sometimes even downright annoying, but they don’t have to be. Just ask my colleague Christine Hall, who recently discovered a new band thanks to a Facebook ad that was presumably targeted to her individual preferences:

You see, I’m on Facebook. As I surf around on the site, little targeted ads appear on the left side of the screen. Clearly the ads are accessing, directly or indirectly, information I’ve shared with Facebook – even information that I’ve made “private” from regular viewing. The ads I usually get reference my age or the fact that I am married, but they are generally useless – ‘you’re married? click on this link to win $500.’ Riiiiggghhht. Well, finally, one of these ads caught my interest and attention! It was an ad for a band…one I discovered I actually like – Velvet Code! I surmise that the band submitted an ad to Facebook with a search criteria that included “goofy people who fancy electronic music,” because, well, it found me.

Of course, finding a desirable new product via an ad isn’t quite the same as receiving a free chocolate bar in exchange for personal data. Still, it’s a sign that in the future, we may start to realize more concrete benefits made possible by “smart” ads.

Many people rightly value privacy, but it doesn’t exist in a vacuum. There are often tradeoffs between privacy and targeted marketing, and we often underestimate the importance of advertising as a vehicle for wealth creation in the online world.

Google vs. Google

Ryan Radia — Tue, 08 Jul 2008 18:47:12 +0000

Google has found itself stuck between a rock and a hard place in its legal battle with Viacom over the question of whether IP addresses constitute “personally identifiable information,” as Jim pointed out yesterday. It’s worth noting, however, that EU regulators have left Google little choice but to stake out uncharted territory in order to defend its data collection practices.

Under the European Union’s strict privacy directive, websites are prohibited from retaining “personal data” for more than six months. What exactly constitutes personal data is up for debate. Google, which retains IP addresses for 18 months, has taken the position that IP addresses don’t constitute personal data and therefore are not subject to EU data retention limits.

That argument has placed Google in a double-bind in its legal proceedings with Viacom. In his recent ruling, Judge Stanton specifically referenced Google’s recent blog post which argued that IP addresses should not be considered personally identifiable information. If IP addresses aren’t private, Stanton reasoned, then what’s the harm in Google handing them over to Viacom?

Whether an IP address can identify an individual is a matter of context. Google stated recently, “Based on our own analysis, we believe that whether or not an IP address is personal data depends on how the data is being used.” That makes sense; an IP address alone is generally not enough information to identify an individual, absent a court order.

Yet while IP addresses are not capable of overtly identifying individuals in the same way as phone numbers and addresses, IP addresses combined with other details often make it possible to positively identify individuals with a high degree of accuracy. Anybody can run a reverse DNS lookup on an IP address, which usually returns a hostname that names the service provider and often hints at the city and state in which the user of that IP address is located. The YouTube logs that Google has been ordered to produce include not just IP addresses but also usernames and specific viewing times, so it’s all but guaranteed that quite a few individuals could be personally identified given enough man-hours of data mining.
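
A reverse DNS lookup of the kind mentioned above takes only a few lines. The following is a minimal sketch using Python's standard `socket.gethostbyaddr`. The helper `provider_hint` is a hypothetical illustration that naively takes the last two labels of the hostname; it will misreport names under multi-part suffixes such as `.co.uk`, and many addresses have no reverse record at all.

```python
import socket


def reverse_dns(ip):
    """Return the reverse (PTR) hostname for an IP address, or None."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No reverse record, or the address could not be resolved
        return None


def provider_hint(hostname):
    """Guess the provider's domain from a hostname (naive: last two labels)."""
    labels = hostname.rstrip(".").split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else hostname
```

ISP-assigned hostnames frequently embed a regional code alongside the provider's domain, which is where the location hint comes from. On its own, the result narrows a user down to a provider and an area, not to a person.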

Surprisingly, Google is not arguing that usernames are personally identifiable. Sure, they’re self-selected and often completely pseudonymous, as Berin noted. But it’s fairly common for people to use the same username across online forums and instant messaging services. The same LobsterBoy1922 who spends his evenings watching Rick Astley clips on YouTube is probably the same LobsterBoy1922 who often posts using his real name on the AVS Forum.

Some people even use their real name as their username, but that still doesn’t mean that they’ve sacrificed their expectation of privacy. While I believe that users have no inherent expectation of privacy online, Google has a robust privacy policy governing user data. Website privacy policies can go a long way towards establishing an expectation of privacy for users. Google admits its privacy policy is legally binding, so it seems reasonable for people to watch YouTube under the assumption that their viewing habits won’t be exposed to third parties.

I don’t see why Judge Stanton felt it necessary to grant such a broad discovery order in the first place. Why does Viacom need usernames or IP addresses of YouTube viewers to accomplish its objective of determining whether YouTube is “capable of substantial non-infringing uses”? Google’s (unanswered) request to hand over partially redacted viewing logs would protect user privacy without impeding Viacom’s ability to compare viewing statistics between infringing and non-infringing videos.
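
One way partially redacted logs could still support that comparison is keyed pseudonymization. This is an assumption for illustration only, since the details of the redaction Google proposed are not given here. In the sketch below, usernames and IP addresses are replaced with HMAC-SHA256 tokens, so the same viewer maps to the same token across rows while the original identifiers stay with whoever holds the key. The field names (`username`, `ip`, `video_id`, `viewed_at`) are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace an identifier with a stable keyed token (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def redact_log(rows, secret_key):
    """Return copies of log rows with username and IP replaced by tokens.

    rows: iterable of dicts with hypothetical keys
    'username', 'ip', 'video_id', and 'viewed_at'.
    """
    redacted = []
    for row in rows:
        redacted.append({
            "user_token": pseudonymize(row["username"], secret_key),
            "ip_token": pseudonymize(row["ip"], secret_key),
            "video_id": row["video_id"],    # kept: needed for per-video statistics
            "viewed_at": row["viewed_at"],  # kept as-is; could be coarsened further
        })
    return redacted
```

Per-video view counts and unique-viewer counts come out the same on the redacted rows as on the originals. Exact viewing times remain a possible re-identification vector, so a more cautious version would round them to the hour or the day.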

Some have even argued that Google should simply refuse to comply with the court order. Such a bold move would inevitably trigger contempt penalties, but would also give Google a lot of favorable press and positive blogosphere cred. And Google might even overturn the court order on appeal, although any appeal wouldn’t be heard until after the discovery deadline passes.

While Judge Stanton’s ruling is a blow to the privacy of YouTube users, many in the media are vastly exaggerating the privacy implications of the court order. Viacom won’t have free rein to do as it pleases with the YouTube logs, despite what some bloggers have suggested. Remarkably, journalists continue to omit any discussion of the protective order that places narrow conditions on Viacom’s access to the YouTube logs.

As Berin explained last week, worries about the YouTube records being used to file lawsuits against viewers are similarly overblown. Suing viewers would likely run afoul of the protective order, possibly resulting in legal penalties, and there’s no legal precedent for taking people to court merely because they viewed infringing video files. And due to the strict confidentiality provisions in the protective order, a data breach or accidental leak would expose Viacom (or its outside experts) to serious civil liability. Still, information is volatile, as Jim warns, and Viacom’s analysis will only endanger the secrecy of YouTube’s logs.