cookie management – Technology Liberation Front

Abandoning the Dumb Federal Cookie Policy

Jim Harper — Tue, 11 Aug 2009 19:32:59 +0000

Today’s Washington Post has a story entitled U.S. Web-Tracking Plan Stirs Privacy Fears. It’s about the reversal of an ill-conceived policy adopted nine years ago to limit the use of cookies on federal Web sites.

In case you don’t already know this, a cookie is a short string of text that a server sends a browser when the browser accesses a Web page. Cookies allow servers to recognize returning users so they can serve up customized, relevant content, including tailored ads. Think of a cookie as an eyeball – who do you want to be able to see that you visited a Web site?

Your browser lets you control what happens with the cookies offered by the sites you visit. You can issue a blanket refusal of all cookies, you can accept all cookies, and you can decide which cookies to accept based on who is offering them. Here’s how:

Internet Explorer: Tools > Internet Options > “Privacy” tab > “Advanced” button: Select “Override automatic cookie handling” and choose among the options, then hit “OK,” and next “Apply.”

I recommend accepting first-party cookies – offered by the sites you visit – and blocking third-party cookies – offered by the content embedded in those sites, like ad networks. (I suspect Berin disagrees!) Or ask to be prompted about third-party cookies just to see how many there are on the sites you visit. If you want to block or allow specific sites, select the “Sites” button to do so. If you selected “Prompt” in cookie handling, your choices will populate the “Sites” list.

Firefox: Tools > Options > “Privacy” tab: In the “cookies” box, choose among the options, then hit “OK.”

I recommend checking “Accept cookies from sites” and leaving unchecked “Accept third party cookies.” Click the “Exceptions” button to give site-by-site instructions.

There are many other things you can do to protect your online privacy, of course. Because you can control cookies, a government regulation restricting cookies is needless nannying. It may marginally protect you from government tracking – they have plenty of other methods, both legitimate and illegitimate – but it won’t protect you from tracking by others, including entities who may share data with the government.

The answer to the cookie problem is personal responsibility. Did you skip over the instructions above? The nation’s cookie problem is your fault.

If society lacks awareness of cookies, Microsoft (Internet Explorer), the Mozilla Foundation (Firefox), and producers of other browsers (Apple/Safari, Google/Chrome) might consider building cookie education into new browser downloads and updates. Perhaps they should set privacy-protective defaults. That’s all up to the community of Internet users, publishers, and programmers to decide, using their influence in the marketplace. (I suspect Berin is against it!)

Artificially restricting cookies on federal Web sites needlessly hamstrings federal Web sites. When the policy was instituted it threatened to set a precedent for broader regulation of cookie use on the Web. Hopefully, the debate about whether to regulate cookies is over, but further ‘Net nannying is a constant offering of the federal government (and other elitists).

By moving away from the stultifying limitation on federal cookies, the federal government acknowledges that American grown-ups can and should look out for their own privacy.

Ends, Means, and One Man’s War on Advertising

Jim Harper — Tue, 24 Mar 2009 15:02:18 +0000

Chris Soghoian has responded to my recent post lauding his Targeted Advertising Cookie Opt-Out (or “TACO” – documented and downloadable here). We’re agreed in the main on user empowerment. The interesting stuff is on the margin: He disagrees with me that blocking third party cookies as I do (and he does too) is a satisfactory approach to suppressing tracking by advertisers.

There are a couple of points worth making about the discussion.

The first has to do with our slightly differing objectives. Chris is deeply focused on advertisers and his dislike of being tracked by advertisers. Though it is not absolute, I have a preference against tracking by anyone other than sites that I know, like, and trust. I’m no more worried about advertisers than any entity that would track my surfing – and there are many.

Again, TLF readers, I ask you to try setting your browser to query you before setting cookies. It’s a real insight into the dozens of entities getting a look at you as you surf, including a bunch of social networks and news sites.

If “advertisers” are what you seek to harness, that seems like a group that can be captured through some kind of centralized control mechanism. (I don’t think it actually is.) But if your goal is privacy as against all comers, you don’t attempt to centrally plan or decide who is good and who is bad. Responsibility rests with the end user.

Let the goal be “advertisers,” though. And I ask: Those social networks and news aggregators – are they “advertisers”? If you’re going to require a subset of Web communicators to obey opt-out cookies, you have to be able to define that subset – a problem Chris doesn’t seem to have thought about yet.

Lots of different publishers, sites, and networks have data that is entirely fungible with the tracking data advertisers collect. What do you get if you push down on the “officially advertisers” part of the balloon? Workarounds.

But I’ve backed into the second point – the means to these ends. Chris soft-pedals how he would get at tracking, but as far as I can tell it’s a law that says “advertisers” have to obey opt-out cookies.

Unlike all of the previous anti-advertising technologies, the opt-out mechanism provides users with a way to positively affirm that they do not wish to be tracked and targeted. This opt-out cookie is something that advertisers cannot ignore.

Is it by magic that they “cannot ignore” opt-out cookies? No, it’s by law.

With the right law in place, Chris appears to believe, “[t]he Federal Trade Commission and Congress would likely take an interest” when advertisers tried to skirt opt-out cookies, using other technologies to glean information about Web surfers’ interests.

His hope is to end the “arms race” in which users have to constantly chase the shifting tactics advertisers use to track them. It’s a fair point: There is a constant, rolling change in how the Web is used by publishers, advertisers, and consumers to interact and trade the data each produces.

That is an “arms race” only if you’ve adopted the rigid, war-like stance that tracking by advertisers is inherently wrong. It’s not. Berin and Adam, who have done a lot more work than me on this lately, have done a good write-up of the subtleties. What Chris calls an “arms race” is better thought of as a constantly unfolding negotiation among all parties about the terms of the content-for-advertising bargain.

I believe, as a person who dislikes third-party cookies, that offering them to my computer in the hopes of gleaning some information is not wrong. Some people think it’s horribly wrong. Most people are indifferent.

Who’s right? Everyone and nobody. There doesn’t have to be one answer.

But should the terms of use for the Web be written by a vociferous minority (i.e. Chris) that can’t persuade the public to refuse tracking using the tools available to them? Perhaps the demand for control comes because the public won’t be persuaded.

Now that would be wrong – regulating cookies to force “protection” on a public that could seek it for itself, but won’t. That would deprive “advertisers” – we still don’t know who they are – of freedom and communications channels, it would deny publishers revenues, and it would deny consumers content they want and enjoy.

But let’s talk about arms races. Chris seeks exit from the so-called arms race on the technical and user side in favor of an arms race in the legislative and regulatory world. The law he imagines – so perfect as it resides there in his head – would have to be passed by Congress and implemented by a regulatory agency like the Federal Trade Commission.

Each of these regulatory bodies is under constant, well, “siege” by phalanxes of lobbyists, paid to advocate the views of their clients, including “advertisers.” There is no realistic hope that Chris’ opt-out cookie law would make it through that in the form he wants. Defining what one means by “advertisers” is a gruesome task, with likely First Amendment problems. Instead of the clean bill Chris imagines, it would be perverted (from Chris’ perspective) by lobbying and special-interest influence. Remember when Congress passed a law alleging it would prevent spam?

Chris would transfer the arms race we’re in now – where consumers are in control, if apathetic – to a field where consumers are not in control and very apathetic, believing that they are protected by the government. This is the approach preferred by victims of the fatal conceit, who think that they can design society better than society can design itself. (Berin has done a terrific job of lambasting the Center for Democracy and Technology for its similarly conceited, blindly pro-regulatory armchair quarterbacking on the online advertising issue.)

Plenty of people dream about regulation that works, of course. The SEC’s failure to protect investors in the Madoff case provides one more example among many where law and regulation failed utterly to protect consumers – and by its existence encouraged their irresponsibility.

It is damaging folly to try protecting consumers from the tracking advertisers do when consumers can just as well protect themselves.

Chris Soghoian’s Cool Opt-Out Plugin

Jim Harper — Thu, 19 Mar 2009 18:40:28 +0000

What a victory for privacy and personal responsibility is Chris Soghoian’s Targeted Advertising Cookie Opt-Out (or “TACO” – documented and downloadable here). It signals to the 27 ad networks with well-configured opt-out cookies that you don’t want them to track you.

It’s a technical solution that empowers (and places responsibility with) the user to exercise dominion over his or her personal information. No need for law and regulation. No need to go pleading to politicians and bureaucrats for help.

It’s also a little more efficient than my method of controlling tracking, which is to take a glance at cookies as Web sites ask to set them on my computer.

(The answer is usually “no,” but it’s very interesting to see who all wants to get a glance at me when I visit any site. It’s a lot more than just ad networks, btw. I have no idea why people think ad-network tracking is bad and tracking by others is a matter of indifference.)

Now, Chris and I always find something to disagree about, so for good measure I’ll note that I disagree with his goal of switching targeted advertising from opt-out to opt-in.

Cookies are the wrong mechanism for universal opt-out, he correctly notes, and an opt-out HTTP header, were one adopted, would be switched on by default, so the big players won’t go there. “The only way we will get an easy to use, built-into the browser solution,” he concludes, “will be if government regulators get involved. FTC staffers — are you listening?”

Actually, an easy to use, built-into-the-browser solution is right there. In Firefox, it’s Tools > Options > Privacy > uncheck “Accept cookies from sites” or “Accept third-party cookies” (or further define what you want done with cookies). In Internet Explorer, it’s Tools > Internet Options > Privacy > Advanced > select “Override automatic cookie handling” and define what you want done.

A lot of folks think it’s jaw-droppingly difficult to look at cookies as they’re offered. It’s not. It’s easy to give cookies a quick skim as they come in. (Sometimes exercising responsibility for yourself is difficult. Walk it off.)

Now, should everyone do as I do? No. Should everyone do as Chris wants (and be untracked unless they request it)? Also, no.

The default on the street and on the Internet is for information to be available to others. If you don’t like it, you cover up your nakedness with clothes, or you figure out how to block cookies offered by sites you don’t want a relationship with. Kudos to Chris for giving people a cloak to wear, even though he advocates that regulators should tut-tut Web site operators for using their eyes to see.

Nuts & Bolts: Everything You Wanted To Know About Cookies But Were Afraid To Ask

Adam Marcus — Tue, 27 Jan 2009 12:25:06 +0000

As a means of introducing myself to TLF readers, this is an article that I wrote for the PFF blog in September that has not been previously mentioned on the TLF. Most of my other PFF blog posts have been cross-posted by Adam Thierer or Berin Szoka, but I’ve taken ownership of those posts so they appear on my TLF author page.

This is the first in a series of articles that will focus directly on technology instead of technology policy. With an average age of 57, most members of Congress were at least 30 when the IBM PC was introduced in 1981. So it is not surprising that lawmakers have difficulty with cutting-edge technology. The goal of this series is to provide a solid technical foundation for the policy debates that new technologies often trigger. No prior knowledge of the technologies involved is assumed, but no insult to the reader’s intelligence is intended.

This article focuses on cookies–not the cookies you eat, but the cookies associated with browsing the World Wide Web. There has been public concern over the privacy implications of cookies since they were first developed. But to understand them, you must know a bit of history.

According to Tim Berners-Lee, the creator of the World Wide Web, “[g]etting people to put data on the Web often was a question of getting them to change perspective, from thinking of the user’s access to it not as interaction with, say, an online library system, but as navigation th[r]ough a set of virtual pages in some abstract space. In this concept, users could bookmark any place and return to it, and could make links into any place from another document. This would give a feeling of persistence, of an ongoing existence, to each page.”[1. Tim Berners-Lee, Weaving The Web: The Original Design and Ultimate Destiny of the World Wide Web. p. 37. Harper Business (2000).] The Web has changed quite a bit since the early 1990s.

Today, websites are much more dynamic and interactive, with every page being customized for each user. Such customization could include automatically selecting the appropriate language for the user based on where they’re located, displaying only content that has been added since the last time the user visited the site, remembering a user who wants to stay logged into a site from a particular computer, or keeping track of items in a virtual shopping cart. These features are simply not possible without the ability for a website to distinguish one user from another and to remember a user as they navigate from one page to another. Today, in the Web 2.0 era, instead of Web pages having persistence (as Berners-Lee described), we have dynamic pages and “user-persistence.”

This paper describes the various methods websites can use to enable user-persistence and how this affects user privacy. But the first thing the reader must realize is that the Web was not initially designed to be interactive; indeed, as the quote above shows, the goal was the exact opposite. Yet interactivity is critical to many of the things we all take for granted about web content and services today.

Stateful Sessions

On the original World Wide Web designed by Berners-Lee (Web 1.0), Web servers responded to each client request without relating that request to previous requests. There was no need to remember what other pages the user had requested because the requests were for static pages. But if you’ve used a Web-based email system like Gmail, Hotmail, Yahoo! Mail, etc., you know that once you log in, the service remembers who you are as you click from message to message. When a website can keep track of a user as they move from page to page within a site, it is called a “stateful session.” The website doesn’t necessarily need to know anything about the user; it just needs to be able to distinguish that particular user from all other users. For example, if you go to an online store and place a few items in your virtual shopping cart, the site still does not know your name, email address, or billing information. But it does know what you’ve placed in your cart–or more precisely, it knows what someone using your browser has placed in a particular cart. If you leave the site before buying anything and then go back an hour later, it’s possible that the site will have completely forgotten about you. In that case, the unique identifier persists during your “session” on the site, but it doesn’t persist between sessions.
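To make the shopping-cart example concrete, here is a minimal sketch in Python. The `SessionStore` class and its method names are hypothetical, invented for illustration rather than taken from any real site. The point is only that the server can tell two browsers apart by a random identifier without knowing anything about the people using them.

```python
import secrets


class SessionStore:
    """Hypothetical in-memory store mapping opaque session IDs to carts."""

    def __init__(self):
        self._carts = {}

    def new_session(self):
        # A random token distinguishes this browser from all others
        # without revealing anything about the person using it.
        session_id = secrets.token_hex(16)
        self._carts[session_id] = []
        return session_id

    def add_to_cart(self, session_id, item):
        self._carts[session_id].append(item)

    def cart(self, session_id):
        # An unknown ID means the site has forgotten this session.
        return list(self._carts.get(session_id, []))

    def end_session(self, session_id):
        self._carts.pop(session_id, None)


store = SessionStore()
first_browser = store.new_session()
second_browser = store.new_session()
store.add_to_cart(first_browser, "book")
store.add_to_cart(first_browser, "lamp")
store.add_to_cart(second_browser, "mug")
```

Once `end_session` is called (or the server simply discards old entries), the identifier means nothing anymore, which is the "forgotten an hour later" case described above.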

URLs and HTTP Requests

Web 1.0 sites achieve Web page persistence by having a unique address or Uniform Resource Locator (URL) for each Web page, which is displayed in the address bar at the top of your browser as you browse the web. For example, http://www.pff.org/about/ is a simple URL pointing to a specific Web page. Every user that visits the PFF site at www.pff.org and clicks on the “About” link will be taken to the exact same page.

URLs can also store information about the user. For example, if you search for “test” on Google, the URL of the resulting page may look like the following: http://www.google.com/search?q=test&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a.[2. http://googlesystem.blogspot.com/2006/07/meaning-of-parameters-in-google-query.html] The URL contains a number of different pieces of data, separated by ampersands. There is the search query (“q=test”), the character encoding of the input (“ie=utf-8”), the character encoding of the output (“oe=utf-8”), the type and language of the client (“rls=org.mozilla:en-US:official”), and the Web browser used (“client=firefox-a”). None of this information can be used to uniquely identify the user, but this basic example illustrates how URLs can be used to specify more than simply static Web pages–and how some information can be remembered as a user navigates a website even without using cookies. Knowing how this works, you can create your own advanced searches or change the way the results are formatted (e.g., changing the language).
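If you want to pick such a URL apart yourself, Python's standard `urllib.parse` module will do it. The sketch below splits the example search URL into its parameters and then builds a modified copy with a different query. The replacement search term is just an illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

url = ("http://www.google.com/search?q=test&ie=utf-8&oe=utf-8&aq=t"
       "&rls=org.mozilla:en-US:official&client=firefox-a")

parts = urlsplit(url)
# parse_qs returns a list of values for every parameter name.
params = parse_qs(parts.query)
for name, values in params.items():
    print(name, "=", values[0])

# Build a new URL that keeps every parameter but changes the search term.
flat = {name: values[0] for name, values in params.items()}
flat["q"] = "cookies"
new_url = urlunsplit(parts._replace(query=urlencode(flat)))
print(new_url)
```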

So how did Google know I speak English and use Firefox? That information is included in the HTTP request that my Web browser sends to the Google Web server when it requests a page. HTTP requests specify (among a few other more technical things) the desired language and a “User-Agent” field that includes the name of the browser and sometimes your operating system. This information allows websites to customize their content for different Web browsers (e.g., to ensure that it displays properly). Every request also arrives with your IP address so the Web server knows where to send its response, and IP geolocation allows Web servers to associate that address with a geographic area (though the area is rarely more accurate than the country or state). HTTP requests can also contain HTTP cookies.
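For readers who have never seen one, the sketch below parses the text of an HTTP request in Python. The request itself is made up for illustration: the User-Agent string and the cookie value are assumptions, not a capture of real traffic, and `parse_request` is a hypothetical helper rather than a library function.

```python
# An illustrative (invented) request, as a browser might send it.
RAW_REQUEST = (
    "GET /search?q=test HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/3.0\r\n"
    "Accept-Language: en-us,en;q=0.5\r\n"
    "Cookie: PREF=abc123\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request into its first line and a dict of headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


request_line, headers = parse_request(RAW_REQUEST)
print(request_line)
print("Language:", headers["accept-language"])
print("Browser:", headers["user-agent"])
print("Cookie:", headers.get("cookie"))
```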

HTTP Cookies

URLs can be used to uniquely identify individual users and allow stateful sessions, but unless a user bookmarks the URL containing their unique identifier, there is no way for the site to associate the same unique identifier with the same user on subsequent visits. Another option is to have users create an account and then log in each time they access the site. The website could then include the user’s unique ID in the URL on subsequent pages, so that the user only needs to log in once per session. Having to bookmark or create an account on every site you want to remember you would quickly become unmanageable. It would be nice if mapping and weather websites, for example, just remembered your location. It would be nice if the blogs you follow remembered what post you last read and displayed only unread posts when you next visit their site. What was needed at this point in the Web’s evolution was a way for websites to automatically store a unique identifier on the user’s computer and send it back to the website automatically[3. A site could also try to uniquely identify users by the IP address of their computer, but this is unreliable as there can be many computers behind a firewall sharing a single IP address.]—which is precisely what a cookie does.

To quote Wikipedia,

“HTTP cookies, or more commonly referred to as Web cookies, tracking cookies or just cookies, are parcels of text sent by a server to a Web client (usually a browser) and then sent back unchanged by the client each time it accesses that server. HTTP cookies are used for authenticating, session tracking (state maintenance), and maintaining specific information about users, such as site preferences or the contents of their electronic shopping carts.”

A cookie can contain one or more pieces of data, a description and/or URL for an online description of the cookie, how long the Web browser should store the cookie, and the domain, path, and port that the cookie should be limited to. Cookies can be set to expire after a specified interval, or can be “session cookies” that will expire when the Web browser is closed. When a cookie expires, it is deleted by the Web browser. Unexpired cookies are automatically sent back to the originating Web server when the Web browser makes any subsequent requests to the same server (the same domain, path, and port).
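As a rough illustration, Python's standard `http.cookies` module can produce the header a server would send to set a cookie. The cookie names, values, and the `example.com` domain below are assumptions chosen for the example. The first cookie carries a lifetime, while the second has none and so behaves as a session cookie.

```python
from http.cookies import SimpleCookie

# A persistent cookie: limited to a domain and path, kept for 30 days.
persistent = SimpleCookie()
persistent["visitor_id"] = "a1b2c3"
persistent["visitor_id"]["domain"] = "example.com"
persistent["visitor_id"]["path"] = "/"
persistent["visitor_id"]["max-age"] = 30 * 24 * 60 * 60  # seconds

# A session cookie: no lifetime given, so it lasts until the browser closes.
session = SimpleCookie()
session["cart"] = "xyz"
session["cart"]["path"] = "/shop"

print(persistent.output())
print(session.output())

# On later requests the browser sends the stored values back unchanged.
returned = SimpleCookie()
returned.load("visitor_id=a1b2c3; cart=xyz")
print(returned["visitor_id"].value, returned["cart"].value)
```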

Neither Web servers nor Web browsers are required to support cookies, but a server may refuse to work with a Web browser that does not return the cookie(s) it sends. Cookies do not contain any executable code and are extremely small in size. They only contain data sent by the website and the data is not changed by the client computer, so there generally should be no privacy concerns about sending a cookie back to the website that created it (“First-party cookies”).

First-Party and Third-Party Cookies

Cookies are normally only sent to the server setting them or a server in the same domain (e.g., a cookie set by mail.google.com could be shared with calendar.google.com). These are called first-party cookies because they’re set by the site displayed in the address bar of the Web browser. These cookies are typically used to tailor the website for the user. Third-party cookies, on the other hand, are typically used by advertising networks to track users across multiple Web sites where the networks have placed advertising–which allows the advertising network to target subsequent advertisements to the user’s presumed interests and also to limit the number of times a user is shown a particular ad. This targeting allows the delivery of “smarter” advertising that is less annoying and more informative to the user–and therefore more valuable to the advertiser, who will be willing to pay websites more for their ad space. However, this targeting also raises privacy concerns.
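The "same domain" rule can be sketched in a few lines of Python. This is a simplification under stated assumptions: `domain_matches` is a hypothetical helper, and real browsers apply additional rules (about paths, ports, and overly broad domains) that it ignores.

```python
def domain_matches(request_host, cookie_domain):
    """Return True if a cookie scoped to cookie_domain may go to request_host."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # Either the exact host, or any host that ends in ".<domain>".
    return host == domain or host.endswith("." + domain)


print(domain_matches("calendar.google.com", "google.com"))
print(domain_matches("ads.example.net", "google.com"))
```

Note that the check requires a dot before the domain, so a lookalike host such as `notgoogle.com` does not match `google.com`.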

It is trivial for a Web page to contain images or other components stored on servers in other domains (“third-party elements”). In fact, it is often easier to link to an image already hosted online elsewhere than it is to host an image on your own Website.

Examples:

Typical first-party embedded image:
Typical third-party embedded image:

Whenever a Web browser loads a Web page or component of a Web page, it will include in its request for that component any cookies already stored on the user’s computer that are associated with the domain hosting the content. The Web server, in turn, can send a cookie or update a cookie already existing on the user’s computer.

Although your Web browser will not send a third-party cookie to the first-party Web server (and it won’t send a first-party cookie to the third-party Web server), the first-party Web server can send information to the third-party Web server by embedding it in the URL for the third-party content. The most common form of this communication between the sites you visit and the sites they rely on for content or ads is called a “web bug”–a small (usually 1 pixel by 1 pixel) graphic not meant to be noticed by the user. Its purpose is to cause the user’s Web browser to load the third-party embedded content from the external Web server, which will allow the third party (usually an advertising network) to track the user.
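The mechanics are easier to see in a toy simulation. Everything in the Python sketch below is hypothetical: `ToyBrowser`, `ToyAdNetwork`, the `ads.example` host, and the `page` parameter are invented for illustration and do not describe any real ad network. The browser returns each cookie only to the host that set it, yet the third party still learns which pages were visited because each first-party site puts its own address into the web bug's URL.

```python
import itertools
from urllib.parse import parse_qs, urlencode, urlsplit


class ToyBrowser:
    """Keeps one cookie per host and only ever returns it to that host."""

    def __init__(self):
        self.jar = {}

    def fetch(self, url, server):
        host = urlsplit(url).hostname
        sent_cookie = self.jar.get(host)  # never another host's cookie
        new_cookie = server(url, sent_cookie)
        if new_cookie is not None:
            self.jar[host] = new_cookie


class ToyAdNetwork:
    """Hypothetical third party serving a 1x1 image on many sites."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.log = {}  # visitor id -> pages on which the pixel was loaded

    def serve_pixel(self, url, cookie):
        # Reuse the ID from the cookie, or assign one on first contact.
        visitor = cookie or f"visitor-{next(self._ids)}"
        page = parse_qs(urlsplit(url).query)["page"][0]
        self.log.setdefault(visitor, []).append(page)
        return visitor


def web_bug_url(first_party_page):
    # The first-party site embeds its own page address in the pixel's URL.
    return "http://ads.example/pixel.gif?" + urlencode({"page": first_party_page})


browser = ToyBrowser()
network = ToyAdNetwork()
for page in ["news.example/politics", "shop.example/guitars"]:
    browser.fetch(web_bug_url(page), network.serve_pixel)

print(network.log)
```

The network ends up with one identifier tied to both pages, but nothing in its log says who the visitor is.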

Example third-party embedded web bug:

While this all may seem scary and invasive, the fact that a website or ad network can uniquely identify your browser does not mean that they have any clue who you are. Even if you provide your name, email address, or other personally-identifiable information to the first-party Web site, most sites’ privacy policies state that they will not share this information with their advertising partners. To use a real-world analogy, third-party advertising is equivalent to a marketer in a mall watching you come out of a music store and then offering you a flyer for a concert: The marketer may know that you’re interested in music (because you were shopping at the music store), but they have no idea who you are. And as my colleagues Adam Thierer and Berin Szoka explained in their post on Adblock Plus, websites (especially smaller independent websites) depend on advertising as a source of revenue and to cover their overhead costs.

Alternatives to Cookies

Cookies are not the only way websites can do stateful sessions. As has already been mentioned, Websites can put unique identifiers in URLs. But custom URLs don’t last between sessions. Websites that need to remember users (e.g., websites that charge a fee for access) can require users to create an account and log into the site every time they use it.

But most websites do not require users to create an account and log in every time. And more and more users are configuring their Web browsers to delete all cookies when they close the browser. In response, Web site operators have found other methods to uniquely identify users by storing a unique identifier on users’ computers.

The cookie alternatives listed below are not any more or less invasive of privacy than cookies if the user is aware of them and manages them the same way they manage cookies. But most Web browsers don’t give users the same amount of control over cookie alternatives that they do over cookies, and few users know about these alternatives.

Per-session cookie alternatives – These cookie alternatives are not saved to disk and thus are not accessible after you close your Web browser.

Hidden form fields – Web pages can contain hidden Web forms that submit data back to the Web server when an on-screen button is pressed. This method is quite limited because it requires the user to click a specific button, and there is no method for saving data after you’ve navigated away from the site. Beyond these limitations, the only way to detect hidden form fields is to inspect the HTML code for a page. There is also no easy way to block hidden form fields.
window.name – JavaScript embedded in a Web page can set or read this internal value, which is not really used for anything else. The value can be up to 32 megabytes in size and, once set, can be accessed by any Web site. Although the only way to detect this is to inspect the HTML code for a page, you can disable JavaScript.
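Inspecting the HTML for hidden form fields need not be done by eye. As a sketch, Python's standard `html.parser` module can list them. The `HiddenFieldFinder` class and the sample page are assumptions made up for this example.

```python
from html.parser import HTMLParser


class HiddenFieldFinder(HTMLParser):
    """Collects the name and value of every hidden <input> on a page."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        field_type = (attributes.get("type") or "").lower()
        if tag == "input" and field_type == "hidden":
            self.hidden[attributes.get("name")] = attributes.get("value")


# An invented page: one hidden field rides along with the visible ones.
PAGE = """
<form action="/checkout" method="post">
  <input type="hidden" name="session" value="a1b2c3">
  <input type="text" name="quantity" value="1">
  <input type="submit" value="Buy">
</form>
"""

finder = HiddenFieldFinder()
finder.feed(PAGE)
print(finder.hidden)
```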

Persistent cookie alternatives – These cookie alternatives are like cookies in that they are saved on your computer and can be accessed even after you’ve closed your Web browser.

Flash Cookies – Also known as Local Shared Objects, Flash cookies require Adobe Flash to be installed on your computer. Whereas HTTP cookies are limited to 4 kilobytes, Flash cookies can contain up to 100 kilobytes by default and can contain an unlimited amount of data if the user desires. To view and delete the Flash cookies stored on your computer, go to this page (although accessed via a Web page, the Flash cookies shown are stored on your computer). You can also permanently disable Flash cookies on that page.
DOM Storage – DOM storage was designed specifically to allow Web 2.0 applications to work offline, saving data locally when they are unable to access the host website and to save data that would otherwise be lost if a page is accidentally reloaded. DOM storage is currently only implemented in Firefox (and Internet Explorer 8 Beta). If cookies are disabled, DOM storage is also disabled. Users can also manually disable DOM storage even when cookies are enabled.
userData behavior – The userData behavior does for Internet Explorer what DOM storage does for Firefox. Each “document” is limited to 128 kilobytes of storage, with a per-domain limit of 1024 kilobytes. The data is stored in Internet Explorer’s cache and is deleted when you delete cookies using the Delete Browsing History dialog box.

Conclusion

This article should give you a better sense of what cookies are used for and how they work. You should now see that per-session cookies and cookie alternatives are completely harmless. Persistent cookies (and cookie alternatives) can make your Web browsing a bit easier, but deleting them will not (in most cases) cause any problems. If you are concerned about your privacy, you will need to do a bit more than just delete cookies–you also need to delete or disable the above-mentioned cookie alternatives.