Out-Law News
21 May 2010, 10:55 am
Internet users are often advised to control how cookies are used on their computers so that their browsing reflects their privacy preferences. Cookies are small files that websites leave on a user's computer so that the user can be identified on a return visit.
But digital rights group the Electronic Frontier Foundation (EFF) has published research showing that the technical details every browser discloses to the websites it visits can be almost as effective an identifier as a cookie. The finding could have serious consequences for individuals' ability to mask their identity or control the information that organisations gather about them. It could also influence policy makers and legislators.
The EFF studied almost half a million visitors to a website and found that 84% of them had a browser profile that was unique within the sample. Among browsers with Adobe Flash or a Java Virtual Machine installed, 94% were unique.
This information would not allow the website operator to identify the individual by name, but would allow an organisation to collate information on any activity undertaken by anyone using that machine, without any need to place a file on the user's computer.
"The website anonymously logged the configuration and version information from each participant's operating system, browser, and browser plug-ins – information that websites routinely access each time you visit – and compared that information to a database of configurations collected from almost a million other visitors," said an EFF statement. "EFF found that 84% of the configuration combinations were unique and identifiable, creating unique and identifiable browser 'fingerprints'. Browsers with Adobe Flash or Java plug-ins installed were 94% unique and trackable."
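As a rough illustration of the idea described in that statement, the Python sketch below reduces a set of configuration details to a single fingerprint and measures how many visitors in a sample have a fingerprint nobody else shares. It is not the EFF's code: the attribute names (`user_agent`, `plugins`, `os`), the sample visitors and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from collections import Counter


def fingerprint(config: dict) -> str:
    """Hash a configuration into a fixed-length fingerprint.

    Keys are sorted so that the same settings always give the same hash.
    """
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def unique_fraction(configs: list) -> float:
    """Return the share of visitors whose fingerprint appears only once."""
    prints = [fingerprint(c) for c in configs]
    counts = Counter(prints)
    return sum(1 for p in prints if counts[p] == 1) / len(prints)


# Hypothetical visitors; real browsers expose many more details than this.
visitors = [
    {"user_agent": "BrowserA/3.6", "plugins": "Flash 10.0", "os": "Windows XP"},
    {"user_agent": "BrowserA/3.6", "plugins": "Flash 10.0", "os": "Windows XP"},
    {"user_agent": "BrowserA/3.6", "plugins": "Flash 10.1; Java 1.6", "os": "Windows XP"},
    {"user_agent": "BrowserB/8.0", "plugins": "", "os": "Windows 7"},
]

print(unique_fraction(visitors))  # 0.5: two of the four visitors are unique
```

In this toy sample the first two visitors are indistinguishable from each other, while a single extra plug-in version is enough to make the third stand out.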
The EFF's report said that browser fingerprints were functioning as 'global identifiers' that users could not control in the same way that they could control cookies.
"Global identifier fingerprints are a worst case for privacy," said the report. "But even users who are not globally identified by a particular fingerprint may be vulnerable to more context-specific kinds of tracking by the same fingerprint algorithm, if the print is used in combination with other data."
"Browser fingerprinting is a powerful technique, and fingerprints must be considered alongside cookies and IP addresses when we discuss web privacy and user trackability," said EFF senior staff technologist Peter Eckersley. "We hope that browser developers will work to reduce these privacy risks in future versions of their code."
"We took measures to keep participants in our experiment anonymous, but most sites don't do that," said Eckersley. "In fact, several companies are already selling products that claim to use browser fingerprinting to help websites identify users and their online activities. This experiment is an important reality check, showing just how powerful these tracking mechanisms are."
The EFF report said that identification would become almost certain if other information were used alongside the browser data.
"A fingerprint that carries no more than 15-20 bits of identifying information will in almost all cases be sufficient to uniquely identify a particular browser, given its IP address, its subnet, or even just its Autonomous System Number," said the report. "If the user deletes their cookies while continuing to use an IP address, subnet or ASN that they have used previously, the cookie-setter could, with high probability, link their new cookie to the old one."
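The "bits" in that passage can be made concrete with a short calculation. A fingerprint shared by one browser in N carries log2(N) bits of identifying information, so each extra bit halves the group of browsers it could belong to. The Python sketch below works through the report's 15 to 20 bit range. The pool of 10,000 browsers behind one network is a made-up figure, and the last function assumes fingerprints are spread evenly, which real ones are not.

```python
import math


def surprisal_bits(share: float) -> float:
    """Bits of identifying information in a fingerprint seen in `share` of browsers."""
    return -math.log2(share)


def distinguishable(bits: int) -> int:
    """Number of equally likely fingerprints that `bits` bits can tell apart."""
    return 2 ** bits


def expected_lookalikes(pool_size: int, bits: int) -> float:
    """Expected number of browsers in a pool sharing one given fingerprint.

    Assumes an even spread of fingerprints, which is a simplification.
    """
    return pool_size / distinguishable(bits)


print(f"{surprisal_bits(1 / 32768):.1f}")       # 15.0
print(distinguishable(15))                      # 32768
print(distinguishable(20))                      # 1048576
# Hypothetical pool: 10,000 browsers sharing one subnet or ASN.
print(f"{expected_lookalikes(10_000, 15):.2f}")  # 0.31
```

On these assumptions, fewer than one other browser in the pool would be expected to share a given 15-bit fingerprint, which is why combining it with a network identifier is enough to single a browser out.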
The identifying information in a browser fingerprint changes over time, the EFF said, as users upgrade their browser, switch to a different one or alter its settings. Even then, a returning user can often still be identified, it said.
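One simple way to link a changed fingerprint to an earlier one is to look for a stored fingerprint that differs in only a single detail, and to make a guess only when exactly one such candidate exists. The Python sketch below illustrates that heuristic. It is an assumption made for illustration and should not be read as the algorithm the EFF used; the field names and sample values are hypothetical.

```python
from typing import Optional


def changed_fields(old: dict, new: dict) -> int:
    """Count the configuration fields that differ between two fingerprints."""
    keys = set(old) | set(new)
    return sum(1 for key in keys if old.get(key) != new.get(key))


def guess_previous(new: dict, known: list) -> Optional[dict]:
    """Return the one stored fingerprint that differs from `new` in a single field.

    Returns None (no guess) when there is no candidate or more than one.
    """
    candidates = [old for old in known if changed_fields(old, new) == 1]
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical stored fingerprints and a returning, upgraded browser.
known = [
    {"user_agent": "BrowserA/3.5", "plugins": "Flash 10.0", "fonts": "Arial,Verdana"},
    {"user_agent": "BrowserB/8.0", "plugins": "", "fonts": "Arial"},
]
returning = {"user_agent": "BrowserA/3.6", "plugins": "Flash 10.0", "fonts": "Arial,Verdana"}

print(guess_previous(returning, known) == known[0])  # True
```

Declining to guess when the match is ambiguous trades coverage for accuracy: fewer returning browsers are linked, but the links that are made are more likely to be right.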
When a user returned to the site with a changed browser, the EFF's matching technique identified them correctly in more than 99% of the cases in which it made a guess.
"[We] made a correct guess in 65% of cases, an incorrect guess in 0.56% of cases, and no guess in 35% of cases," said the report. "99.1% of guesses were correct, while the false positive rate was 0.86%. Our algorithm was clearly very crude, and no doubt could be significantly improved with effort."
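The 99.1% figure follows from the first two percentages in that quotation: it is the share of correct guesses among all guesses made, leaving aside the cases where no guess was offered. The short Python check below uses the rounded figures as published, so its results are approximate. The variable names are chosen for this example only.

```python
# Rounded figures as quoted from the report.
correct = 0.65      # share of cases with a correct guess
incorrect = 0.0056  # share of cases with an incorrect guess

guessed = correct + incorrect        # share of cases where any guess was made
accuracy = correct / guessed         # correct guesses as a share of all guesses
error_share = incorrect / guessed    # incorrect guesses as a share of all guesses

print(f"{accuracy:.1%}")     # 99.1%
print(f"{error_share:.2%}")  # 0.85%
```

The second result is close to, but not exactly, the 0.86% the report gives as its false positive rate. That gap is presumably a rounding effect in the published percentages, which also sum to slightly more than 100%.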