New Google Tool Discloses Censorship & User Data Requests from Governments

Google has just launched a new tool that lets users view the total number of requests received “from government agencies around the world to remove content from our services, or provide information about users of our services and products.” As the FAQ explains, the tool overlays on a map the requests received over the last six months, except for countries like China that prohibit the release of such numbers. For each country, it shows totals for data requests if over 30 (criminal-related but not civil) and for removal requests if over 10 (not including requests from private parties, like DMCA copyright take-down notices). Google makes a few important observations about the data, especially that Brazil’s and India’s numbers are skewed by the popularity there of Orkut, Google’s answer to Facebook.

This tool represents the beginning of a new era in transparency into how governments censor the Internet and violate users’ privacy. I very much look forward to seeing Google improve this tool to provide greater granularity of disclosure, and to seeing other companies improve upon what Google has started. Over time, this transparency could do wonders to advance Internet freedom for users by promoting positive competition among countries.

To illustrate the kinds of things one could do with this data given a more robust interface, I put together the following spreadsheet by scraping Google’s request numbers and mashing them up with total Internet user numbers I found here (which are mostly from late 2009):

Note that I broke out separate lines for India and Brazil that exclude Orkut from the removal request columns, but did not do so for data requests (because Google doesn’t break down data requests by service).

In particular, I’d like to see Google include XML export functionality in the next iteration of this tool, so that users like me can more easily mash up the data with other data sets, as I’ve done above, or with similar request numbers from other companies. This would allow the creation of a single repository for censorship and user data requests by governments for the entire Internet ecosystem. Google has already blazed a path in making censorship transparent to users when it is required to censor, such as by notifying users that a particular video has been removed from YouTube or that a particular page has been removed from censored search results, as it has done in China. But this tool takes that transparency to a much higher level.

While I appreciate that Google doesn’t want to get too specific about these requests, lest they individually identify anyone or compromise law enforcement investigations, I would very much like to see Google (and other companies in the future) break down the data more granularly—not just by country and by data request v. removal request, but to tell us something about what kind of data request or removal request we’re talking about in whatever categories make sense (e.g., hate speech, indecency, etc.). The current tool does tell users what service the request pertained to and what percentage of the overall requests were fully or partially complied with (if a user clicks on a particular country). But since the current data counts only distinct requests, the most important improvement would be to find a way to indicate how many users a single data request applied to or how many pieces of content a removal request applied to.

There’s a long history of using quantitative measures as indicia of how seriously a country takes freedom, and to rank countries according to the results. Perhaps most famous is the Heritage Foundation’s Index of Economic Freedom. When done properly, these rankings could highlight both the good and bad, and thus motivate countries to reduce their meddling with content and users’ privacy.

Rigorous comparisons will be tougher in this context because other variables are at play, like the popularity of Google’s (or any other company’s) services in each country, that make cross-border comparisons somewhat difficult. If Google would also include the number of users for each of its services in each country, one could properly normalize the request data to make an accurate assessment of requests per user, instead of per capita as I’ve done here (normalizing for total Internet users). But over time, I’m confident a good tool for making censorship and governmental privacy invasions transparent (if only in aggregate form) will emerge through a process of iterative innovation.
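
For readers who want to try this kind of normalization themselves, here is a minimal Python sketch of the arithmetic. It is only an illustration under stated assumptions: the names `CountryRow`, `per_million`, and `rank` are my own, the `service_users` field is hypothetical (Google does not publish per-service user counts), and the figures are placeholders, not Google’s actual numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CountryRow:
    country: str
    requests: int                        # government requests in the reporting period
    internet_users: int                  # total Internet users in the country
    service_users: Optional[int] = None  # hypothetical: not currently published


def per_million(count: int, population: int) -> float:
    """Return the number of requests per one million people in `population`."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 1_000_000


def rank(rows: List[CountryRow], use_service_users: bool = False) -> List[CountryRow]:
    """Sort countries by requests per million users, highest rate first.

    By default the denominator is total Internet users. With
    use_service_users=True every row must carry a service_users value,
    so that all countries are compared on the same basis.
    """
    def rate(row: CountryRow) -> float:
        if use_service_users:
            if row.service_users is None:
                raise ValueError(f"no service user count for {row.country}")
            return per_million(row.requests, row.service_users)
        return per_million(row.requests, row.internet_users)

    return sorted(rows, key=rate, reverse=True)


# Placeholder figures for illustration only, not taken from Google's tool.
rows = [
    CountryRow("Country A", requests=300, internet_users=60_000_000),
    CountryRow("Country B", requests=100, internet_users=5_000_000),
]

for row in rank(rows):
    rate = per_million(row.requests, row.internet_users)
    print(f"{row.country}: {rate:.1f} requests per million Internet users")
```

The raw totals and the normalized rates can point in opposite directions: in this made-up example, Country A receives three times as many requests as Country B, yet Country B has the higher rate once the size of its online population is taken into account.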

Finally, I’m hopeful this kind of tool will increase public awareness of just what a threat government is to our digital liberties and thus lead to public support for legal reforms to limit censorship and to strike a better balance between the legitimate needs of law enforcement and users’ privacy. For example, in the United States, The Progress & Freedom Foundation recently joined Google and a number of other companies, trade associations and think tanks in the Digital Due Process coalition, which is pushing for heightened protections for users’ privacy when government demands data from “cloud” service providers, such as email and document hosting.

