“De-identified”? Sometimes You Can Disagree With Yourself

Posted on May 28, 2009

Recall a couple of years ago when I lauded Google – and also picked on them – for making customer data “more anonymous”?

“‘Anonymous’ is correctly regarded as an absolute condition,” I wrote. “Like pregnancy, anonymity is either there or it’s not. Modifying the word with a relative adjective like ‘more’ is a curious use of language.”

The challenge of these concepts – “anonymized” or “de-identified” data – is still around, and it’s still a difficult one.

Here’s a sophisticated take on the question:

Information is increasingly difficult to classify as “identified” or “de-identified,” particularly as it is copied, exchanged, or recombined with other information. With rapidly evolving technologies and databases, it is more appropriate to describe a spectrum of “identifiability,” rather than a binary classification of information as identifiable or not. The question could then become not whether deidentified information might be made re-identifiable, but rather which entities would be able to re-identify the information, how much effort they would have to expend, and what limits are placed on their doing so.

And here’s an advocacy group apparently lacking that sophistication. They treat information as flatly “de-identified” in a legal filing about a New Hampshire law that bans the sale of prescription drug data for marketing purposes:

[T]he Prescription Information Law does not implicate patient privacy. While it purports to protect privacy interests, the statute regulates patient de-identified information.

Here’s the thing: Both quotes were issued by the Center for Democracy and Technology.

The first is from CDT’s filing with the Department of Health and Human Services about the circumstances under which HHS should require health providers to notify patients about a data breach. CDT wants more breach notices, so it argues that information might be pieced together and that the concept of “de-identification” is weak.

The second quote is from a CDT legal brief asking the Supreme Court to review (and I believe they would argue to reject) the New Hampshire law. CDT wants the data to be shared, so it argues that the data is “de-identified.”

However, as data is copied, exchanged, or recombined with other information such as payment claims to Medicare and Medicaid, it’s easy to imagine records of doctors’ prescribing practices being used to help piece together patients’ drug-taking habits and health conditions.

Is this mendacity on the part of CDT? I don’t think so. It illustrates how difficult these issues are, even for sophisticated parties. Until more intellectual groundwork is laid, information policy arguments before regulators, lawmakers, and courts will not rest on solid footing. Everyone’s trying their best!

You’re dying to know the right answers, of course: Government-mandated data breach notifications are part of a growing trend toward command-and-control data security. Giving injured parties common law remedies and letting the legal incentives sort things out would be much better. HHS, of all agencies, should not be doing data security.

The New Hampshire law weakens drug companies’ ability to market to doctors, which deprives doctors of information that could help them serve patients better. The remote privacy risk to patients when doctors’ prescribing practices are shared should also be handled by common law remedies rather than the state’s regulation, with its attenuated privacy claims. Rather than the U.S. Supreme Court finding a federal trump card, the legislature in New Hampshire should correct its error and maximize the flow of information in the state’s health care system.
