Do your documents contain hidden data?

Out-Law News | 14 Oct 2004 | 12:00 am | 1 min. read

With governments, large corporations and newspapers guilty of inadvertently disclosing information hidden in the content of Word and other electronic documents, a new web site has been set up to help ordinary companies and individuals avoid embarrassing disclosures.

The problem relates to metadata - the hidden but potentially damaging information that lives and travels within every type of content, from basic documents and spreadsheets to presentations and web copy.

Generally, the information retained in the document relates to its title, author, date of origin, content, size and location, but corrections, comments, deletions and other information recorded by the track changes feature can also be retained as metadata.
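As a rough illustration of how such information can be read back out of a file, the following Python sketch inspects a Word document. It assumes a modern .docx (Office Open XML) package, which is a ZIP archive, rather than the binary .doc format in use when this article was written. The function names `inspect_docx` and `make_sample_docx` are hypothetical, and the tracked-change counts are only approximate.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside Office Open XML (.docx) packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
}


def inspect_docx(source):
    """Return core properties and rough tracked-change counts for a .docx.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper for illustration only.)
    """
    report = {"title": None, "author": None, "last_modified_by": None,
              "insertions": 0, "deletions": 0}
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            report["title"] = core.findtext("dc:title", namespaces=NS)
            report["author"] = core.findtext("dc:creator", namespaces=NS)
            report["last_modified_by"] = core.findtext(
                "cp:lastModifiedBy", namespaces=NS)
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            # Tracked insertions and deletions are stored as w:ins / w:del.
            report["insertions"] = len(body.findall(".//w:ins", NS))
            report["deletions"] = len(body.findall(".//w:del", NS))
    return report


def make_sample_docx():
    """Build a tiny in-memory package with made-up metadata for testing."""
    core = (
        '<cp:coreProperties xmlns:cp="%s" xmlns:dc="%s">'
        "<dc:title>Draft complaint</dc:title>"
        "<dc:creator>A. Author</dc:creator>"
        "<cp:lastModifiedBy>B. Editor</cp:lastModifiedBy>"
        "</cp:coreProperties>" % (NS["cp"], NS["dc"])
    )
    document = (
        '<w:document xmlns:w="%s"><w:body><w:p>'
        '<w:ins w:id="1" w:author="B. Editor"><w:r><w:t>new</w:t></w:r></w:ins>'
        '<w:del w:id="2" w:author="B. Editor"><w:r><w:delText>old</w:delText></w:r></w:del>'
        "</w:p></w:body></w:document>" % NS["w"]
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
        package.writestr("word/document.xml", document)
    buffer.seek(0)
    return buffer
```

Calling `inspect_docx(make_sample_docx())` on the made-up sample returns the author, last editor and title, together with a count of one tracked insertion and one tracked deletion. A real document may hold further hidden material, such as comments or custom properties, that this sketch does not look at.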

Unless this information, which is often confidential or sensitive, is actively removed, it can leak to unintended audiences – as the UK government discovered to its cost last year when metadata revealed that an article, upon which part of the Iraq "dodgy dossier" was based, was actually written by a university student.

Software firm the SCO Group, locked in battle with members of the open-source community, was similarly embarrassed in March this year. Shortly after it filed suit against two Linux end-users, Memphis-based car parts retail giant AutoZone and DaimlerChrysler, a copy of the Word document of the DaimlerChrysler suit found its way into the hands of tech site CNET News.com.

The file proved to contain hidden prior versions of the document, showing that the Bank of America was the original target of the suit.

But Microsoft Word is not the only format from which hidden content can be gleaned.

According to a report by Planet PDF, in October 2002 the Washington Post published on its web site a scanned PDF file of the letter supposedly sent by the Washington Sniper to police.

Unfortunately, while the newspaper had blacked out certain details – including the bank account to which the police were supposed to send $10 million – savvy users were able to remove the blacked-out areas and read the information.

"The dangers of metadata are a growing concern as businesses continually rely on electronic communications," stated Joe Fantuzzi, president and CEO of Workshare, sponsor of the new web site.

"The numbers of people affected by an accidental slip of sensitive metadata are rapidly increasing, making content security a growing priority for content creators and IT teams alike. Metadatarisk.org will help identify the risks for today's workers, allowing them to do their job efficiently and keep their content secure without having to worry about the hidden information they may be sending out to co-workers, competitors or clients."

While there are existing tools that allow metadata to be scrubbed from documents before they are released, the new site is intended to be a comprehensive on-line resource for professionals concerned about the issue. It provides information to help individuals and organisations understand the consequences, liability issues and risks of sharing certain types of information.
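To give a sense of what a scrubbing tool does, here is a minimal Python sketch that blanks the author and last-editor fields of a Word file before release. It is not how Workshare's or any other vendor's product works. It assumes a modern .docx (Office Open XML) package, the function name `scrub_core_properties` is hypothetical, and it leaves tracked changes, comments and other hidden content untouched.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Keep the familiar prefixes when the XML is written back out.
ET.register_namespace("cp", CP_NS)
ET.register_namespace("dc", DC_NS)

CORE_PART = "docProps/core.xml"
# Fields that identify people; an assumption about what counts as sensitive.
PERSONAL_TAGS = {
    "{%s}creator" % DC_NS,
    "{%s}lastModifiedBy" % CP_NS,
}


def scrub_core_properties(data: bytes) -> bytes:
    """Return a copy of a .docx package with author fields emptied.

    Hypothetical helper: only docProps/core.xml is changed, and every
    other part of the package is copied across unmodified.
    """
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            payload = src.read(item.filename)
            if item.filename == CORE_PART:
                root = ET.fromstring(payload)
                for element in root.iter():
                    if element.tag in PERSONAL_TAGS:
                        element.text = ""
                payload = ET.tostring(
                    root, encoding="utf-8", xml_declaration=True)
            dst.writestr(item, payload)
    return output.getvalue()
```

The function takes the raw bytes of a file and returns the bytes of a cleaned copy, so the original is never overwritten. Anyone relying on such a step would still need to check the result, since names can also appear in comments, revision marks and the document text itself.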