MS BUGS



The Privacy Foundation has discovered that it is possible to add
"Web bugs" to Microsoft Word documents. A "Web bug" could
allow an author to track where a document is being read and how
often. In addition, the author can watch how a "bugged" document
is passed from one person to another or from one organization to
another.

Some possible uses of Web bugs in Word documents include:

- Detecting and tracking leaks of confidential documents
  from a company.
- Tracking possible copyright infringement of newsletters
  and reports.
- Monitoring the distribution of a press release.
- Tracking the quoting of text when it is copied from one
  Word document to a new document.

Web bugs are made possible by the ability in Microsoft Word of a
document to link to an image file that is located on a remote Web
server. Because only the URL of the Web bug is stored in a
document and not the actual image, Microsoft Word must fetch the
image from a Web server each and every time the document is
opened. This image linking feature then puts a remote server in
the position to monitor when and where a document file is being
opened. The server knows the IP address and host name of the
computer that is opening the document. A host name will typically
include a company name if a computer is located at a business.
The host name of a home computer usually has the name of a
user's Internet Service Provider (ISP).

An additional issue, and one that could magnify the potential
surveillance, is that Web bugs in Word documents can also read
and write browser cookies belonging to Internet Explorer. Cookies
could allow an author to match the person viewing a Word
document with that person's visits to the author's Web site.

Web bugs are used extensively today by Internet advertising
companies on Web pages and in HTML-based email messages
for tracking. They are typically 1-by-1 pixel in size, which makes
them invisible on the screen and disguises the fact that they are
used for tracking.

Although the Privacy Foundation has found no evidence that Web
bugs are being used in Word documents today, there is little to
prevent their use.

Short of removing the feature that allows linking to Web images in
Microsoft Word, there does not appear to be a good preventative
solution. However, the Privacy Foundation has recommended to
Microsoft that cookies be disabled in Microsoft Word through a
software patch.

In addition to Word documents, Web bugs can also be used in
Excel 2000 and PowerPoint 2000 documents.



Detailed Description

Microsoft Word from the beginning has supported the ability to
include picture files in Word documents. Originally the picture files
would reside on the local hard drive and then be copied into a
document as part of the Word .DOC file. However, beginning with Word
97, Microsoft provided the ability to copy images from the Internet.
All that is required to use this feature is to know the URL (Web
address) of the image. Besides copying the Web image into the
document, Word also allows the Web image to be linked to the
document via its URL. Linking to the image results in smaller Word
document files because only a URL needs to be stored in the file
instead of the entire image. When a document contains a linked
Web image, Word will automatically fetch the image each time the
document is opened. This is necessary to display the image on
the screen or to print it out as part of the document.

Because a linked Web image must be fetched from a remote Web
server, the server is in a position to track when a Word document
is opened and possibly by whom. Furthermore, it is possible to
include an image in a Word document solely for the purpose of
tracking. Such an image is called a Web Bug. Web bugs today
are already used extensively by Internet marketing companies on
Web pages and embedded in HTML email messages.

When a Web bug is embedded in a Word document, the following
information is sent to the remote Web server when the document
containing the bug is opened:

- The full URL of the Web bug image
- The IP address and the host name of the computer
  requesting the Web bug
- A Web browser cookie (optional)

This information is typically saved in an ordinary log file by Web
server software.
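
As a rough illustration of what one such log entry exposes, the
sketch below parses a single line in the widely used Common Log
Format. The sample line, the /bug.gif path, and the name parse_hit
are assumptions made for illustration; they are not taken from any
Privacy Foundation server. Cookies are not part of the Common Log
Format and would only appear if the server were configured to log
them.

```python
import re
from typing import NamedTuple, Optional

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


class Hit(NamedTuple):
    host: str  # IP address (or host name) of the requesting computer
    time: str  # timestamp exactly as written by the server
    path: str  # requested path, including any query string


def parse_hit(line: str) -> Optional[Hit]:
    """Return the fields of one log line, or None if it does not parse."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    return Hit(match.group("host"), match.group("time"), match.group("path"))


# Made-up sample entry using a documentation-only IP address.
SAMPLE = ('192.0.2.17 - - [30/Aug/2000:10:15:02 -0600] '
          '"GET /bug.gif?doc=1234 HTTP/1.0" 200 43')

if __name__ == "__main__":
    print(parse_hit(SAMPLE))
```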

Because the author of the document has control of the URL of the
Web bug, they can put whatever information they choose in this
URL. For example, a URL might contain a unique document ID
number or the name of the person to whom the document was
originally sent.
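
A minimal sketch of how such a URL could be assembled, and how
the same values could be read back from a logged request, is shown
below. The host bugs.example.com and the parameter names "doc" and
"to" are hypothetical choices for this example, not values used by
any real Web bug.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical location of the tracking image.
BUG_BASE = "http://bugs.example.com/bug.gif"


def make_bug_url(doc_id: str, recipient: str) -> str:
    """Encode a document ID and a recipient name in the query string."""
    return BUG_BASE + "?" + urlencode({"doc": doc_id, "to": recipient})


def read_bug_url(url: str) -> dict:
    """Recover the encoded values from a Web bug URL."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


if __name__ == "__main__":
    url = make_bug_url("1234", "Jane Doe")
    print(url)
    print(read_bug_url(url))
```

Anything placed in the query string travels with every request for
the image, so it ends up in the server log next to the IP address.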

These tracking abilities might be used in any number of ways. In
most cases, the reader of a particular document will not know that
the document is bugged, or that the Web bug is surreptitiously
sending identifying information back through the Internet.

One example of this tracking ability is to monitor the path of a
confidential document, either within or beyond a company's
computer network. The confidential document could be "bugged"
to "phone home" each time it is opened. If the company's Web
server ever received a "server hit" for the bug from an IP address
outside the organization, the company could learn immediately about
the leak. Because the server log would include the host name of the
computer where the document was opened, a company could
know that the organization that received the leaked document was
a competitor or media outlet.
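
The check described above could be sketched as follows. The
internal address ranges are assumptions; a real company would
substitute its own networks. The reverse lookup helper depends on
live DNS and may return nothing for many addresses.

```python
import ipaddress
import socket
from typing import Optional

# Assumed internal address ranges, for illustration only.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_external(ip_text: str) -> bool:
    """True if the address lies outside every internal network."""
    address = ipaddress.ip_address(ip_text)
    return not any(address in network for network in INTERNAL_NETWORKS)


def host_name_for(ip_text: str) -> Optional[str]:
    """Reverse DNS lookup; returns None when no name can be resolved."""
    try:
        return socket.gethostbyaddr(ip_text)[0]
    except OSError:
        return None


if __name__ == "__main__":
    for ip in ("10.1.2.3", "203.0.113.9"):
        print(ip, "external" if is_external(ip) else "internal")
```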

All original copies of a confidential document could also be
numbered so that a company could track the source of a leak. A
unique serial number could be encoded in the query string of the
Web bug URL. If the document is leaked, the server hit for the Web
bug will indicate which copy was leaked.

A serial number could be added to a Web bug in a document
either manually — right before a copy of a document is saved —
or automatically through a simple utility program. The utility
program would scan a document for the Web bug URL and add a
serial number in the query string. A Perl script of less than 20 lines
of code could easily be written to do this sort of serialization.
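
The sketch below shows the idea in Python rather than Perl. It is
an illustration under strong assumptions: it operates on a plain
text representation of a document, the bug URL is hypothetical, and
the parameter name "serial" is made up. A real .DOC file is a
binary format, and changing the length of a string inside it may
well corrupt the file, so this is not a working tool for Word
files.

```python
import re

# Hypothetical Web bug URL to look for in the document text.
BUG_URL = "http://bugs.example.com/bug.gif"

# The URL itself, plus any query string that may already follow it.
BUG_PATTERN = re.compile(re.escape(BUG_URL) + r"(\?[^\s\"'<>]*)?")


def stamp_serial(document_text: str, serial: int) -> str:
    """Give every occurrence of the bug URL the query ?serial=N.

    An existing query string on the bug URL is replaced, so a
    template can be stamped again for each outgoing copy.
    """
    return BUG_PATTERN.sub(f"{BUG_URL}?serial={serial}", document_text)


if __name__ == "__main__":
    template = "picture link: http://bugs.example.com/bug.gif end"
    for serial in (1, 2, 3):
        print(stamp_serial(template, serial))
```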

Another use of Web bugs in Word documents is to detect
copyright infringement. For example, a publishing company could
"bug" all outgoing copies of its newsletter. The Web bugs in a
newsletter could contain unique customer ID numbers to detect
how widely an individual newsletter is copied and distributed.
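
One way to estimate how widely each customer's copy has spread is
to count the distinct addresses that requested its Web bug. The
sketch below assumes the hits have already been extracted from the
server log as (IP address, request path) pairs and that the
customer ID travels in a query parameter named "cust"; both are
assumptions for this example. The count is only a rough indicator,
since one reader may appear under several addresses and several
readers may share one.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit


def hosts_per_customer(hits):
    """Count distinct requesting addresses for each customer ID.

    hits is an iterable of (ip_address, request_path) pairs.
    Requests without a "cust" parameter are ignored.
    """
    seen = defaultdict(set)
    for ip, path in hits:
        ids = parse_qs(urlsplit(path).query).get("cust")
        if ids:
            seen[ids[0]].add(ip)
    return {customer: len(ips) for customer, ips in seen.items()}


if __name__ == "__main__":
    sample_hits = [
        ("192.0.2.1", "/bug.gif?cust=A17"),
        ("198.51.100.4", "/bug.gif?cust=A17"),
        ("203.0.113.8", "/bug.gif?cust=B02"),
    ]
    print(hosts_per_customer(sample_hits))
```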

A third possible use of Web bugs is for market research purposes.
For example, a company could place Web bugs in a press release
distributed as a Word document. The server log hits for the Web
bugs would then tell the company what organizations have actually
viewed the press release. The company could also observe how a
press release is passed along within an organization, or to other
organizations.

In an academic setting, Web bugs might be used to detect
plagiarism. A document could be bugged before it is distributed.
An invisible Web bug could be placed within each paragraph in
the document. If text were to be cut and pasted from the document,
it is likely that a Web bug would be picked up also and copied into
the new document.

To place a Web bug in a Word document is relatively simple.
These are the steps in Word 2000:

1. Select the Insert | Picture | From File... menu command
2. Type in the URL of the Web Bug in the "File Name" field of
the Insert Picture dialog box.
3. Select the "Link to File" option of the "Insert" button.

Access to the sender's server logs is required to monitor the
movement of such Web bugs.

The Privacy Foundation ran simple experiments with Excel and
PowerPoint files and found that these files can also be "bugged" in
Office 2000. The Privacy Foundation continues to investigate this
issue with regard to other software programs.

The Privacy Foundation has set up a demonstration of a Web bug
in a Word document. The demo document can be downloaded
from the University of Denver Privacy Center Web site at this URL:
http://www.privacycenter.du.edu/demos/bugged.doc

The document contains a visible Web bug. When the document is
opened, the Web bug will show the host name of the computer that
fetched the image. In addition, a non-identifying Web browser
cookie will be set on your computer. The cookie is non-identifying
because everyone gets the same cookie value, which is a simple
test string.

Demonstrations of "bugged" Excel and PowerPoint files are also
available for download from the Privacy Center Web site:

http://www.privacycenter.du.edu/demos/bugged.xls
http://www.privacycenter.du.edu/demos/bugged.ppt

The use of Web bugs in Word does point to a more general
problem. Any file format that supports automatic linking to Web
pages or images could lead to the same problem. Software
engineers should take this privacy issue into consideration when
designing new file formats.

This issue is potentially critical for music file formats such as MP3
files where piracy concerns are high. For example, it is easy to
imagine an extended MP3 file format that supports embedded
HTML for showing song credits, cover artwork, lyrics, and so on.
The embedded HTML with embedded Web bugs could also be
used to track how many times a song is played and by which
computer, identified by its IP address.



Vendor Contact and Response

Microsoft was contacted about this issue on 8/4/00, and again on
8/25/00. They confirmed that Microsoft Word will access the
Internet in order to fetch Web images that are linked to in a Word
document. They went on to say that Word uses Internet Explorer to
fetch images and therefore standard Web browser cookies can be
both read and set from inside a Word document. However, the
company claims that Word users can mitigate the use of cookies.

Regarding the potential use of Web bugs to track Word
documents, Microsoft said that there is no evidence that such
activities are occurring.



Recommendations

Short of removing the ability to link to Web images from Word
documents, there is no real way to prevent Word documents from
being tracked with Web bugs. Because this linking ability is a useful
feature, the Privacy Foundation does not recommend its removal.

However, the Foundation does believe that the Web browser
cookies should be disabled inside of Word documents. There
appears to be very little need for cookies outside of a Web
browser. In general, the Foundation believes that cookies should
be disabled by default any time Internet Explorer is reused inside
of other applications such as Word, Excel, or Outlook. We would
like to see Microsoft make this change in the next release of
Internet Explorer.

Users concerned about being tracked can use a program such as
ZoneAlarm (www.zonelabs.com) to warn about Web bugs in Word
documents. ZoneAlarm monitors all software and warns if an
unauthorized program is attempting to access the Internet.
ZoneAlarm is designed to catch Trojan Horses and Spyware.
However, because Word typically does not access the Internet,
ZoneAlarm can also be used to catch "bugged" Word
documents.



Acknowledgements

The Privacy Foundation would like to thank Barry Shell, research
communications editor at the Centre for Systems Science, Simon
Fraser University, Burnaby, BC, Canada. His tip that Microsoft
Word will access the Internet when pasting HTML text into a Word
document led to our investigation to see if Web bugs could be
embedded in a Microsoft Word document.