MS-Word is Not a document exchange format
Typically you are getting this because you sent someone an email
message using MS-Word or some other attachment specific to an
operating system or word processor. Alternatively, you may
have placed MS-Word files on the web as the only means for getting at
the document content.
Contents
1 What's wrong with sending MS-Word files?
1.1 Requires proprietary software
1.2 Version problems
1.3 Proprietary data format
1.4 Viruses and security
1.5 Size
1.6 Prior version info
1.7 Typically attached "wrong" to email
1.8 Word is not device independent
1.9 Word isn't even good at what it is designed for
2 Alternatives
3 Where MS-Word is appropriate
4 Response to the "it's the emergent standard" refrain
5 History and related documents
5.1 Similar documents
5.2 Rants about MS-Word
5.3 Reaction so far
5.4 How you can help
5.5 About this document and copyright notice
5.6 Shameless plug
5.7 Acknowledgement
1 What's wrong with sending MS-Word files?
1.1 Requires proprietary software
You are basically assuming that everyone has on their desktop the same
software that you have. That often goes against the spirit of the
Internet which is supposed to be about inter-operability of
heterogeneous systems. The fact that one "persistently predatory
monopoly"1 attempts to
subvert that goal doesn't mean that you should go along with it.
Someone who sends me such mail is perfectly welcome to purchase for me
a machine and software specifically so that I can read mail in that
proprietary system. But I will still have the inconvenience of having
to forward the file to a system I wouldn't normally use.
1.2 Version problems
Even for those who choose to use MS-Word, there are compatibility
problems between various versions. Foreshadowing the next topic, it
appears that Microsoft is unwilling to provide fixes for very
substantial security problems in older versions.
An article on CNN's website (September 13, 2002) reports such an
instance.
1.3 Proprietary data format
The above two problems are closely tied to the question of proprietary
data formats. When you store your work in MS-Word format, you are
betting that you will always have access to some licensed software that
will be able to read that format. The Open Data
Format Initiative has more information on what is wrong with closed
formats.
1.4 Viruses and security
MS-Word allows full macro-scripting. It is now the most common
carrier for viruses. What this means is that embedded within a Word
file can be a program which runs silently (or otherwise) on the
recipient's computer whenever they view the file. Are you happy with
letting other people run programs on your machine?
In one instance that I know of, a substantial portion of an MBA
graduating class sent out résumés with a Word macro virus. I
don't think that this helped their job prospects. But the particular
business school had an official MS-Word policy.
1.5 Size
Often what would be just a few kilobytes of plain text is hundreds of
kilobytes as a Word file. I find it interesting that Microsoft's file
browsers and email clients don't make it obvious to the sender how large
particular files are.
1.6 Prior version info
Because of Word's system of doing version control, it is possible that
recipients may see prior drafts of your document (which may contain
confidential information).
I've heard a number of "friend of a friend" stories about this sort
of thing. In one case, a potential customer was given a quote for
some product, and the quote was sent in an MS-Word file. When the
customer viewed the version history, they found that a previous
version of the document had been used for a quote to other customers,
with much lower numbers. But since initially writing this, I have
heard a number of first-hand accounts, some of which are below.
Since I almost never read MS-Word documents sent to me, I will have to
rely on the accounts of others.
Probably one of the most spectacular instances of information
inadvertently leaked because someone (the British Prime Minister's
office) used MS-Word for document exchange is described in an article by
Richard M. Smith, "Microsoft Word bytes Tony Blair in the butt". The edit
history of the "February
dossier" has become a matter of contention to say the least. Smith's
article provides links and details.
Other, more mundane, accounts of meta-data leaking from MS-Word documents
follow.
In a Usenet news article, Alan Frame describes some of his experiences
with this:
In the past, I've received MS Word documents from an agency,
describing a job vacancy where they've refused to name the client -
lo and behold, the document properties reveals all.
And also
Indeed, I've also seen an internal business proposal which appears
to have originated at the supplier that the proponent was err,
proposing.
I have also received word from others saying,
This regularly happens to me because I deal with public relations
companies who always use the very latest spiffy version of Word and
Powerpoint and seem to be totally unaware that not everyone does the
same.
Normally I junk these docs, but if I need them I view them ... and
often see where corrections have been made...
I have never seen anything really sensitive as a result of this,
probably because most press releases aren't on very sensitive
subjects. Usually I see comments like "CLAIRE: should we describe
what the possible treatment options might be?", plus minor
word-changes. But I live in hope.
Charles Wankel posted a message concerning this to the E-Media list of
the Academy of Management, saying,
I received a paper for an effort that I was an editor for from
someone who had used a ghostwriter. The ghostwriter had had
embedded her name in such a way that when I looked at the document
in a source view I could see it with the dates that wrote, edited,
and re-edited drafts of the document.
1.7 Typically attached "wrong" to email
While this is not strictly speaking a problem with MS-Word files, it
is a related problem. People and systems that think that it is right
to just send such things seem to think that it is OK to send
everything with the MIME Content-type of
application/octet-stream and let the recipient work things
out from the filename info that is also sent. That is a violation of
the intent of the MIME standards, and indicates broken design for
exchange of information.
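As a rough illustration of what labeling an attachment properly looks like, here is a minimal sketch in Python using only the standard library's email and mimetypes modules. The function name build_message, the example.org addresses, the file name report.pdf and the placeholder payload are all hypothetical and chosen only for this example. The idea is simply to state a specific Content-type when one can be determined, and to fall back to application/octet-stream only when nothing better is known.

```python
import mimetypes
from email.message import EmailMessage


def build_message(sender, recipient, subject, body, filename, payload):
    """Build a plain-text message with one specifically labeled attachment.

    Hypothetical helper for illustration; not taken from any mail program.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # the message itself stays text/plain

    # Guess the media type from the file name. Use the generic
    # application/octet-stream only when no specific type is known
    # (or when the file is compressed/encoded, e.g. "x.pdf.gz").
    ctype, encoding = mimetypes.guess_type(filename)
    if ctype is None or encoding is not None:
        ctype = "application/octet-stream"
    maintype, subtype = ctype.split("/", 1)

    msg.add_attachment(payload, maintype=maintype, subtype=subtype,
                       filename=filename)
    return msg


# Example addresses, file name and payload are placeholders.
msg = build_message(
    "author@example.org", "reader@example.org",
    "Draft report", "The report is attached as PDF.\n",
    "report.pdf", b"%PDF-1.4 placeholder bytes\n",
)
attachment = next(msg.iter_attachments())
print(attachment.get_content_type())
```

Run as written, this should report application/pdf for the attachment, so the recipient's software does not have to guess from the file name alone.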
1.8 Word is not device independent
I have been told that MS-Word documents will format differently
depending on the specifics of the printer. This is not merely an issue
of printer resolution or color depth, but the actual formatting of the
document will differ. I was surprised to learn this. I had assumed
that Word was "What You See Is What You Get", but it appears that I
was mistaken about that. So it won't even achieve the goal of
ensuring that your recipient sees things with all the formatting you
see things with even if the recipient also uses MS-Word.
1.9 Word isn't even good at what it is designed for
As an aside, I feel that MS-Word produces probably the worst output
and is the slowest and most tedious to work in of any document
preparation system in serious use I've seen in the past 15 years. I
find it remarkable that when people are presented a choice between a
structural mark-up system (what you mean is what you get) versus a visual
mark-up system (what you see is all you get) people opt for the
latter. For more on this point see section 5.2. Note
that the argument that MS-Word is an inappropriate exchange format
is independent of this point about its quality as a document
preparation system.
2 Alternatives
When talking about things sent by email it is important to distinguish
between document exchange and message exchange. Message exchange is
typically what one does by email: making announcements,
participating in a discussion, and many of the other things we
typically do with email. For these, plain text is the only reasonable
thing. It is the safest, most portable and by far the most compact.
It allows responses quoting portions, and has none of the dangers
mentioned above. The small added value of the formatting information
isn't worth all of the problems.
If you absolutely need to present the formatting information for
document exchange, then use a page description language like PDF.
Also consider using (standards compliant) HTML. Please note that I
am not in any way advocating the use of HTML in ordinary email. It is
grossly inappropriate for that for reasons that are beyond the scope
of this document.
In earlier versions of this document, I listed RTF (Rich Text Format) as
a more standards based way of exchanging word-processor documents. I
have been corrected on that point innumerable times. RTF is little
better than MS-Word format itself. It is a little better, but
it shares all of the problems of MS-Word. Although RTF was advertised as
a document exchange format, it never lived up to that. It appears to
have varying features, and the various versions of RTF that Microsoft products
create have elements which only Microsoft products can read. Note that
this is not because MS-Word is a better product, but because Microsoft
keeps elements of what it considers to be RTF secret.
3 Where MS-Word is appropriate
MS-Word is appropriate for document exchange among co-authors of a
document who are all developing it and have agreed beforehand to use
MS-Word. If you have been referred to the document you are now reading,
then the person who referred you to it probably doesn't consider
themselves party to such an agreement, and sending them an MS-Word
document was inappropriate.
4 Response to the "it's the emergent standard" refrain
Several people have responded with sophisticated "network analysis"
essays about MS-Word being a de facto standard, and pointing out that
even if the standard isn't the optimal one, it is better to go along
with the standard anyway. My counter argument is two-fold:
- Whether or not the argument about emergent standard holds for
authorship (e.g., "I use Word because it is what my potential
co-authors use") has little bearing on what you use for document
exchange. I use LaTeX for document preparation, but I distribute
the results as PDF.2 So there may be
an argument for using MS-Word even though it is inferior to other
options, but that in no way suggests that MS-Word should be used for
document exchange.
- The second argument is an ethical one, and I start with an
analogy.
Over the past few years it has become fashionable in the
US to drive some form of truck as a primary commuting/errands
vehicle. There are many issues regarding that fashion, but for this
analogy I would like to focus on two of them. When two vehicles
collide the occupants of the lighter one are far more likely to
suffer injury than they would if they had collided with an equally
light vehicle. So when someone drives a truck, they are
putting those in normal sized vehicles at an extra risk. The second
property is similar. The headlights of the trucks are much higher
off the ground than those of cars. Driving a car at night with one
of these trucks close behind you is extremely annoying and possibly
dangerous. In both of these cases, the drivers of the trucks don't
experience the disadvantage of others driving trucks. In the first
case, they too are in heavy vehicles, and in the second the driver
is high enough off the ground to not be impaired by the headlights
of other trucks.
By the logic of the "emergent standard" advocates, the only way to
deal with the truck problems I've described is to switch to driving
a truck oneself. The emergent standard argument might have some
validity if the standards were arbitrary, but if some are
particularly destructive to the community as a whole, they should be
opposed. Use of MS-Word for document exchange is simply bad network
citizenship. Paraphrasing Juhapekka Tolvanen: using MS-Word is like
smoking; using it for document exchange is like blowing your smoke
in everyone else's face.
- There is a third argument, closely related to the second: Do you
want to be part of Microsoft's marketing effort?
5 History and related documents
5.1 Similar documents
When I wrote the first version of this document in March, 2001,
it was not only because I was fed up with people sending me unwanted
MS-Word documents, but because I was tired of explaining repeatedly
why I objected to them. I wrote this to be part of a canned
response.
Being remarkably lazy, I didn't want to investigate and write this up
if someone else had already written something. So I did a little bit
of searching for documents like this. I knew from personal
communication that while I am in a minority there is a substantial
minority which feels exactly the same way. I expected that someone
would have already written something like this document.
I didn't find any when I looked, but clearly I didn't look carefully
enough. I have since been informed of others that I've missed. I
list them here, along with some which were written after my document.
- plaintext: In praise of practical e-mail hygiene
- This is Martin Vermeer's essay. It covers the same points as
mine but goes deeper into trying to persuade people to be
better network citizens.
http://www.netby.dk/Oest/Europa-Alle/vermeer/plain.html
- We can put an end to Word attachments
- This is an article by Richard M. Stallman advocating efforts like
mine to discourage people from sending MS-Word documents. The
article itself is aimed at those who already know that Word
attachments are wrong.
http://www.gnu.org/philosophy/no-word-attachments.html
- Sincere Choice
- This is the
home page of the Sincere Choice platform who say "We believe
that there should be a fair, competitive market for computer
software, both proprietary and Open Source."
http://sincerechoice.com/
The Sincere Choice principles of open standards and interoperability
underlie much of what has been stated here.
http://sincerechoice.com/Principles/Open_Standards.html
http://sincerechoice.com/Principles/Choice_Through_Interoperability.html
- Open Data Format Initiative
- This is an
attempt to encourage software companies to fully document the formats
of their data files. To paraphrase earlier words of the founder of
this initiative, if you own the data in the PowerPoint presentation
you created, why should you need a license from Microsoft to get at your
presentation?
http://odfi.org/
- Miksi on typerää postittaa sähköpostin...
- As you can see, this detailed essay and analysis by Juhapekka
Tolvanen is in Finnish. I don't read that language, but there are
some useful links from that. He comes up with a very useful
analogy, which I will rephrase more harshly: Using MS-Word is
like smoking; emailing those files is like blowing smoke into
other people's faces.
http://www.cc.jyu.fi/~juhtolv/mswordmail.html
- MS-Word? nom obrigado
- A similar document to mine, available in
Portuguese and Galician, by Ramón Flores d'as Seixas. While this
document is based on the others listed here, it also adds points
about what makes a good document exchange format. It also
discusses the values of standards of exchange in terms of
establishing a level playing field. The Galician is pretty much
readable to those who can read Spanish.
http://members.tripod.com.br/ramonflores/word/index.html
- Brave new Word
- A similar document in Norwegian, a language I can't read. Written
by Thomas Gramstad. It has some links at the end that might be
useful to people who don't read Norwegian.
http://www.efn.no/brave-new-word.html
- Avoid E-Mail attachments, especially Microsoft Word
- A similar document to this, but much shorter. It gives some brief
instructions to MS-Word users on alternatives they can use for
document exchange.
http://bcn.boulder.co.us/~neal/attachments.html
- Elektronische infomatieoverdracht binnen de VU-organisatie: Het gebruik
van e-mail en MS Word (PDF)
- A document in Dutch by Reinout van Schouwen. It is also
directed internally.
http://www.cs.vu.nl/~reinout/word-attachments.pdf
5.2 Rants about MS-Word
The focus of this document has been on the misuse of Word for document
exchange. It is geared toward MS-Word users to encourage them to send
documents in other formats, even if they continue to use Word for
document production. It should be noted, however, that those
individuals who are most annoyed by receiving MS-Word files for
document exchange are those who do not regularly use MS-Word.
Nonetheless, it is hoped that fans of MS-Word will recognize that
whatever its virtues, it is not a document exchange format.
The arguments I've presented stand even if MS-Word were a good tool
for document preparation. However, I'd also like to point to some
documents which argue (correctly in my view) why MS-Word is a bad
choice of document preparation system and not just a bad choice of
document exchange format.
- Word Processors: Stupid and Inefficient
- by Allin Cottrell
discusses what is wrong with What You See is All You Get systems
using visual mark-up, as opposed to the far more reasonable
structural system where you separate the tasks of controlling the
appearance from the task of writing the content.
http://www.ecn.wfu.edu/~cottrell/wp.html
- No Proprietary Binary Data Formats
- by Sam Steingold. This discusses the
dangers of keeping important data in formats which require
restrictive, licensed software to recover. MS-Word is a
proprietary and secret document format. You are trusting your
future access to your own documents to the whim of a persistent
monopolist.
http://www.podval.org/~sds/data.html
5.3 Reaction so far
As far as I can tell my campaign has met with little success so far
(January 2002) other than a few people taking some care to send me RTF
documents instead of MS-Word documents, with no change in their
general practice. If I get any response at all it is typically
"Well, you're right but I'm going to stick with my current
practices." I find that disappointing, particularly when people
acknowledge the correctness of the ethical argument I make.
On September 13, 2002 an opportunity fell into my lap during a
discussion of a newly reported security bug in MS-Word to shamelessly
plug this document in
http://slashdot.org/comments.pl?sid=39860&cid=4252157.
This generated a number of supportive email messages and a flurry of
typo corrections.
There has also been one, somewhat harsh, critique of version 1.27 of
this document. That critique and brief discussion can be found at
http://slashdot.org/comments.pl?sid=39860&cid=4264355. I have
modified the wording of section 1.9 and further
emphasized the point made at the beginning of
section 5.2 as a result.
5.4 How you can help
There are a number of ways you can help. These include, but are
hardly limited to:
- Don't use MS-Word for document exchange.
- Refer people who assume that you do use MS-Word for document
exchange to this or a similar document.
- Promote the ideas described in this document. You may do this
by linking to it or redistributing it. See section 5.5
for copyright notice and redistribution restrictions.
5.5 About this document and copyright notice
This document is available in several formats from
http://www.goldmark.org/netrants/no-word/.
Copyright (c) 2001-2002 by Jeffrey Goldberg. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0 or later (the latest version is
presently available at http://www.opencontent.org/openpub/).
Distribution of the work or derivative of the work in any standard
(paper) book form is prohibited unless prior permission is obtained
from the copyright holder.
Please note that if you wish to do something with this that requires
my explicit permission, just ask. I suspect that I'd grant it for
most requests. Note also that the Open Publication License does allow
you to do many things with this document without my permission.
5.6 Shameless plug
If you have found this interesting, you may wish to see
other netrants I have at http://www.goldmark.org/netrants/.
5.7 Acknowledgement
Among others, I would like to thank Jim Diamond, Alan Frame, Dave
Reader, Pete Mitchell and Juhapekka Tolvanen for their comments on an
earlier draft. Your name can be added here as well. Just provide
useful comments and suggestions. Other people are acknowledged in the
change log of this document.
Footnotes:
1 In the words of a U.S. federal judge.
2 Using LaTeX does have exactly the cost
described by those who raise the "de facto standard"
argument: I find myself limited in co-authors to a subset of clueful,
intelligent and network cooperative individuals.
File translated from TeX by TTH, version 3.60, on 14 Jun 2005, 19:26.