Cookies: the paparazzi of cyberspace

Picture this: you've just arrived home from the supermarket, you put the bag of shopping down and put the kettle on, sit down to read the newspaper you've just bought from the corner shop, and all of a sudden. . . ZZZZZZZINNTSCHHH!

A lightning streak - a flashbulb - a guy at the window! Someone has just had the nerve to take some potshots at you with a motorised Pentax.

"Get OUT!" you shout, shocked and angry at this incredible incursion into your private space. "Hold on mister," the whippersnapper replies. "We've every right. After all, you've just been to our supermarket, and we need to keep track of what you've been buying. This is all for our marketing department - we have to know what our customers are doing, if we're going to Customise Their Shopping Experience and make it as smooth and enjoyable as possible. Can't say fairer than that!"

"Right so," you say, still bewildered, as you follow him through your living room and upstairs to the bathroom and bedrooms, and down again and out the back door. "Oh, and by the way," he scratches his chin, "today was the eighteenth time this year you've gone from our supermarket straight to that newsagent's around the corner."

OK, the above might seem like an exaggeration, but on the World Wide Web something like this is already happening. Web sites are getting smarter by the day. They seem to "know" more and more about your browsing habits - about when you last visited them, what you looked at and how long you spent doing it. One of the most talked about technologies Web sites use to track their visitors is called "cookies". Since cookies are a largely invisible mechanism, many Web users are either uncomfortable about them - or unaware that they even exist. Opinions differ over whether they are a nuisance, a fact of life, a key feedback mechanism or a gross invasion of your privacy.

Yes, cookies have become the paparazzi of the Internet, snapping on the heels not only of the rich and famous but of ordinary users too. But are they harmful? And how do they work?

Netscape defines cookies as: "a general mechanism which server side connections (such as CGI scripts) can use to both store and retrieve information on the client side of the connection. The addition of a simple, persistent, client-side state significantly extends the capabilities of Web-based client/server applications." Er, yes, right. In simpler English, a cookie is a small nugget of information which is attached to a particular Web page. The Web server - the computer holding the pages you are viewing - sends the cookie down the line to your PC along with the page you have requested. Your Web browser then stores the cookie (as a text file) on your hard drive. If and when you return to that site, some of this stored information is sent back to the Web server, along with your new request.
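For the technically curious, that round trip can be sketched in a few lines of Python using the standard http.cookies module. This is only an illustration, not how any particular site does it: the cookie name "visitor_id", its value and the 30-day lifetime are all invented for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header that travels with the page.
# "visitor_id" and "abc123" are made-up values for illustration.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"
outgoing["visitor_id"]["path"] = "/"
outgoing["visitor_id"]["max-age"] = 60 * 60 * 24 * 30  # keep for about 30 days
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Browser side: on a later visit the stored name=value pair is sent back
# in a Cookie request header (simulated here as a plain string).
cookie_request_header = "visitor_id=abc123"

# Server side again: parse what the browser returned.
incoming = SimpleCookie()
incoming.load(cookie_request_header)
returned_id = incoming["visitor_id"].value

print(set_cookie_header)
print(returned_id)
```

Note that only the name and value come back to the server; attributes such as the path and lifetime are instructions to the browser about when to keep and return the cookie.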

This interchange of information could have significant benefits. Cookies can eliminate repetitive ID and password entry to specific sites. Some online newspapers, for example, are free but make first-time users register. With cookies, though, you don't have to re-register each time you visit (or type in an easily lost or forgotten password).

Online stores and catalogues are also using cookies more and more: as you browse from page to page you add items to your virtual shopping basket, and the cookies keep track of these. Later, you can view these items all together and present the list at the "checkout". Even if you click on another site and come back - or even shut down your PC altogether and return a couple of days later - the cookies ensure that the site still knows who you are, and what was in your basket.
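One way a basket like this might work is sketched below, on the assumption that the whole list of items lives in a single cookie. The cookie name "basket", the "|" separator and the function add_to_basket are hypothetical, and item codes are assumed to be simple words with no spaces. Many real stores may instead keep only an ID in the cookie and hold the basket itself on the server.

```python
from http.cookies import SimpleCookie

def add_to_basket(cookie_header: str, item: str) -> str:
    """Return a Set-Cookie header for the basket with `item` appended.

    `cookie_header` is the Cookie header the browser sent ("" if none).
    """
    jar = SimpleCookie()
    jar.load(cookie_header)
    # Recover the items already in the basket, if the cookie exists.
    items = jar["basket"].value.split("|") if "basket" in jar else []
    items.append(item)
    # Send the updated list back to the browser for safekeeping.
    out = SimpleCookie()
    out["basket"] = "|".join(items)
    return out.output(header="Set-Cookie:")

first = add_to_basket("", "tea")
second = add_to_basket("basket=tea", "biscuits")
print(first)
print(second)
```

Because the browser returns the cookie on every request to the site, the basket survives page changes; whether it survives a shutdown depends on the lifetime the site gives the cookie.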

But cookies have become even more than this - they are a Web marketing manager's dream, amassing data about anything from the "attractiveness" of a specific online advert or newspaper columnist to pinpointing frequent customers.

This is a form of market research on an unprecedented scale. These are "micro-surveys", in real-time, of the habits of millions of online buyers and readers. Market surveys used to take weeks or months to complete; now, in milliseconds, they can monitor where visitors are coming from, what links they have followed within your site, how long they've spent on a particular page, and what type of Web browser they are using.

They also allow Web sites to personalise content and to "wrap" it around individual users. Cookies can note what material you skip over and what you seem to focus on, and react to your number of visits. Unlike, say, those bland "virtual pubs" which give you a standard "Howdy stranger!" greeting every time you return, a cookie-intelligent site can say "Back so soon!" or "Long time no see!" and so on. . .
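A greeting of this kind needs nothing more than a timestamp. The sketch below assumes a hypothetical cookie called "last_visit" holding the time of the previous visit in seconds; the one-day and 30-day thresholds and the middle greeting are invented for illustration.

```python
from http.cookies import SimpleCookie

DAY = 86400  # seconds in a day

def greet(cookie_header: str, now: int) -> tuple[str, str]:
    """Pick a greeting from the returned cookie and refresh the timestamp.

    Returns (greeting, Set-Cookie header). `now` is the current time
    in seconds, passed in so the example is easy to test.
    """
    jar = SimpleCookie()
    jar.load(cookie_header)
    if "last_visit" not in jar:
        greeting = "Howdy stranger!"       # no cookie: first visit
    else:
        gap = now - int(jar["last_visit"].value)
        if gap < DAY:
            greeting = "Back so soon!"
        elif gap > 30 * DAY:
            greeting = "Long time no see!"
        else:
            greeting = "Welcome back!"
    out = SimpleCookie()
    out["last_visit"] = str(now)           # remember this visit for next time
    return greeting, out.output(header="Set-Cookie:")

print(greet("", 1000)[0])
print(greet("last_visit=1000", 1000 + 3600)[0])
```

A real site would also have to cope with a cookie that has been edited or corrupted; here a non-numeric value would simply raise an error.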

It all sounds so innocent. Even the very name "cookies" makes them seem cuddlesome, a sweet little reward for good behaviour, as it were. But don't you remember how, when you were a kid, you were told not to take sweets from strangers?

Many people are fearful of cookies because they believe they can carry viruses (they don't) or that they are an invasion of privacy (which they well could be). The issue isn't what cookies can do to your computer, but what information they can collect and pass on to Web servers. If you want to know how people use the Web, a good place to start is the Graphic Visualization and Usability Center (GVU) in the US. Its seventh survey, conducted last April and May, was based on 19,970 respondents. When they were asked about an identifier that would uniquely label users across sessions at a site, only one in five thought this ought to be possible.

Yet such identifiers - cookies - already exist. They already track your comings and goings, and some civil liberties groups see cookies as a major intrusion into ordinary people's personal domains. It's a level of snooping and surveillance that simply wouldn't be tolerated in the physical, "off-line" world. Three months ago the Electronic Privacy Information Center (EPIC) ran a survey of the top 100 Web sites (as listed in www.100hot.com). Of these 100 sites, it found that

49 collected personal information through online registrations, mailing lists, surveys, user profiles, and order fulfilment requirements.

only 17 had formal privacy policies, and few were easy to find.

at least 24 used cookies - but not one of them told users that information about them was being placed on their systems.

There are many other ways Webmasters could misuse cookies, even if this isn't intended. For example, a site could store a membership password within a cookie in an unencrypted form. This is very bad practice on shared computers, where anyone else might see the cookie and read the password.
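A common alternative, sketched below under stated assumptions, is to put a random token in the cookie and keep the link between token and member on the server. The names "session", login_cookie and member_for are hypothetical, and the in-memory dictionary stands in for whatever storage a real site would use.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side table mapping random tokens to member names (illustrative only).
sessions: dict[str, str] = {}

def login_cookie(member: str) -> str:
    """Issue a cookie holding a random token rather than the password."""
    token = secrets.token_urlsafe(16)
    sessions[token] = member
    out = SimpleCookie()
    out["session"] = token
    return out.output(header="Set-Cookie:")

def member_for(cookie_header: str):
    """Look up the member behind a returned cookie, or None if unknown."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    if "session" not in jar:
        return None
    return sessions.get(jar["session"].value)

header = login_cookie("alice")
token_value = header.split("session=", 1)[1]
print(member_for("session=" + token_value))
```

Someone who copies the cookie from a shared computer could still reuse the token until the site discards it, but the password itself never leaves the server.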

The Internet Engineering Task Force recently proposed a change in the default setting for third-party cookies, to give users greater control over the creation and collection of personal information. Major Web sites and advertisers have opposed the plan, saying they'd lose a major method of assessing the success of online advertising - and it would involve "hundreds of thousands of man-hours" on reprogramming Web sites.

Maybe that is the price to pay for our privacy. Do we have the right to remain anonymous? When you walk into a shopping centre, you don't expect to give away such personal information. Or if one of those market research clipboards walks up to you, at least you have the choice to answer some or none of their questions. If that kind of norm is good enough for the off-line world, why shouldn't it be for the online one?

Yet cookies are small cheese compared with the major source for tracking users' movements on the Web: log files.

Most Web servers track their visitors to some degree - with or without cookies - via these log files, which contain detailed information about every single request the server receives.

Any adept Web site operator can use the logs to identify what kind of computer and browser you are using, roughly how long you've spent on a particular page or picture (judging by the gaps between your requests), the address of your Internet service provider (ISP), the Web site you've just come from and, if the site routes its outbound links through its own server, even the next site you decide to go to.
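To give a flavour of what a log holds, the sketch below picks apart one line in the "combined" log format that many Web servers can be set up to write. The sample line, including the host name and addresses in it, is invented for illustration, and the pattern is a simplification that assumes well-formed entries.

```python
import re

# Simplified pattern for one line of a "combined" format access log.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# An invented log entry: who asked, when, for what, from where, with which browser.
sample = (
    'dialup42.example-isp.ie - - [12/Oct/1997:14:03:21 +0100] '
    '"GET /news/index.html HTTP/1.0" 200 5120 '
    '"http://www.example.com/links.html" "Mozilla/3.01 (Win95; I)"'
)

match = LOG_PATTERN.match(sample)
entry = match.groupdict() if match else {}

print(entry["host"])     # the machine at your ISP that made the request
print(entry["referer"])  # the page you came from
print(entry["agent"])    # your browser and operating system
```

None of this involves a cookie: every field comes from the request itself.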

Perhaps the greatest form of surveillance is much closer to home. For example, your own ISP is almost certain to keep routine copies of messages that pass through its network. No ground rules exist about whether commercial ISPs in Ireland can make routine inspections of these electronic transactions, from the contents of your email account to the Web pages you are downloading.

In workplaces, too, system administrators can keep copies of the email messages you send to colleagues or loved ones. The same system managers could even be looking at a monitor that shows an exact reflection of what's on your screen.

From the workplace to the home, we have already entered an era of unprecedented electronic surveillance. The symptoms that our online privacy is being eroded are everywhere. For example, the next time a big batch of junk email arrives, just ask yourself: where did they get my email address? Cookies aren't even the half of it.

Some privacy links

http://www.epic.org/reports/surfer-beware.html

The Electronic Privacy Information Center's survey of 100 high-profile sites.

http://www.vortex.com/privarch.htm

The Internet Privacy Forum

http://www.cookiecentral.com

Cookie Central.

http://home.netscape.com/newsref/std/cookie_spec.html

Netscape's technical synopsis about how to write scripts that send cookies.