Microsoft Word 2011 Bloated PDFs

I’m documenting this here in case someone else has the same problem. Today I wrote a ten-page response to a government motion, and when I saved it (from MS Word 2011) to PDF format it was over 5 megabytes—too big to be filed via ECF. Poking around The Google, I found a suggestion that I save it first as a Postscript file, then open that with Preview.

After trying that, and other things in the same vein, I found a better solution: save the file as a .doc file (Word ’97-2004 format), then save that as a PDF, all within MS Word. Here’s how a smaller document sizes out:

[Screenshot: file sizes of the same document saved in each format]

The document (three and a half pages) saved in .DOC format takes up 37 kilobytes; the PDF of that document is even smaller, at 27 KB.

Save it as a .DOCX instead, and it bloats to 118 KB. Save that to PDF and three and a half pages of text take up a staggering 494 KB—more than eighteen times as big as it needs to be.
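For anyone who wants to check the arithmetic, here is a minimal sketch in Python that uses only the four sizes quoted above. The dictionary keys and the `ratio` helper are names made up for illustration; nothing here reads an actual file.

```python
# File sizes in kilobytes, exactly as quoted in the post.
# The key names are illustrative labels, not real filenames.
sizes_kb = {
    "doc": 37,
    "pdf_from_doc": 27,
    "docx": 118,
    "pdf_from_docx": 494,
}


def ratio(larger: str, smaller: str) -> float:
    """Return how many times bigger one quoted size is than another."""
    return sizes_kb[larger] / sizes_kb[smaller]


# PDF made from the .docx versus PDF made from the .doc
pdf_bloat = ratio("pdf_from_docx", "pdf_from_doc")
# The .docx source versus the .doc source
docx_bloat = ratio("docx", "doc")

print(f"PDF from .docx vs. PDF from .doc: {pdf_bloat:.1f}x")
print(f".docx vs. .doc: {docx_bloat:.1f}x")
```

So the source file roughly triples in size going from .doc to .docx, but the PDF made from it grows far more than that.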

Astounding. What the hell is wrong with Microsoft?

About Mark Bennett

Mark Bennett got his letter of marque from the Supreme Court of Texas in May 1995. He is famous for having no sense of humor when it comes to totalitarianism.

13 Responses to Microsoft Word 2011 Bloated PDFs

  1. Ric Moore says:

    You’re just finding out… Heh, I use LibreOffice for all of my PDF conversions. Nice, tight and small file sizes …right out of the box. At SOME point, all documents should be in ODF (Open Document Format), which is now standard in most European countries.

    It seems that they don’t want all of their documents hostage to a monopoly that they routinely have to haul into court to make them play nice. Microsoft tried to sabotage the Open Document standard, so they had to slap them around in court for that, too. Here in the states, it’s like there is no problem with one company running things, although Apple has been changing that tune.

    I haven’t owned a copy of Windows since 3.1 and never looked back. Right now, I’m running Ubuntu 12.10 and it’s sweet. Our website server is running on a headless version of Debian Linux and it hauls coal. Without the overhead for a Desktop, it’s pure text-mode on steroids, devoted to only serving Apache2 webpages. Oh yeah, wordpress runs without a hitch. So, find some old box, download Ubuntu, burn it to a DVD, and install it. Play with it. See if you find you can live nicely and never pay for software again. And, your PDF files will never be bigger than they need to be.

    I respect you too much to steer you wrong! :) Ric

    • Jeff Gamso says:

      I understand almost nothing about computers. I do my blog on Blogger because it requires even less skill than WordPress. And I switched to Mac 6 years ago because my tech guy made me when I needed a new computer and so basically had to give up WordPerfect. So pretty much everything Ric said is way beyond my level of competence/understanding.

      But I was curious and checked on the reply brief I filed ECF yesterday. 53 KB as saved as .doc in Word 2008, 160 KB for the PDF I sent the court (created through the Mac’s print function), 180 KB for the version the court sent with the notification of filing.

      Not sure what any of that means.

      (I do, however, hate everything Microsoft.)

    • I’ve used LibreOffice for years (because some lawyers still use WordPerfect, and send me files in that format). But for some reason LibreOffice doesn’t recognize my full set of fonts. So when I want to use Equity Text B Regular, I can’t, and if I open a document with text in Equity Text B Regular (which is my preferred typeface) LibreOffice displays it and prints it as Equity Text B Italic.

  2. Jeremy Kridel says:

    Another way of saying what Ric said: The switch from .doc to .docx came with a lot of plain-text markup designed to create something a bit more “open” and standards-compliant, while also doing a little of the co-opting. It had the result of adding a lot of extra stuff to the files underneath the hood. For most persons’ purposes it’s a kind of white noise–not bothersome, doesn’t cause problems with getting things done.

    Until it does. All that extra markup gets crammed into the PDF you tried to create.

    That said, I was a software developer for eight years before going into law. Know what? Using Linux distributions has been too painful for me to put up with–I know how and therefore do spend way too much time tweaking the system, and far too little time getting real work done. I say stick with the Mac if you’re happy with it–but maybe change your Word defaults to stick with the “old” .doc format if you can get away with it.

    • Ric Moore says:

      Jeremy, Ubuntu has made Linux somewhat easier on the user, as far as software goes. The downside is that yes, you do have to understand the computer a bit more than with the proprietary OS’s. The plus side is that you DO get a lot more choices to make.

      I install stock Ubuntu, which comes with the Unity desktop (think Windows 8), then immediately install the XFCE desktop on top, as it works the way I expect a desktop to work, almost like Win95 with the drop-down start menu. It also gives me 6 virtual desktops to use, kinda like stacking monitors into a 2x3 stack and panning among them with one physical monitor. I just move the mouse to the edge and there is another desktop window to run an app on. Meh, I like it. And the PDFs I create are a lot smaller than the same Word document converted to PDF using Word.

  3. Mark Lyon says:

    I had a similar issue: I was using a font that Office decided not to embed, so it was converting my doc to an image. If I used the ‘save to PDF’ option, it was multi-MB. If I printed to the PDF printer, it was just a few hundred KB.

    If you don’t have Adobe’s Acrobat Printer installed, check out PDF 995.

    • Mark Draughn says:

      Oh God, that would do it. The Microsoft Word document format has been through about 25 years of evolution and is famously confusing. Adobe’s PDF format has roots just as old, and it’s a hideous mishmash of cross references, version changes, and optional embedded content. Any piece of software that translates from one to the other is likely to make decisions that screw up the rendering, bloat the resulting document, or both. Add in the fact that Microsoft really hopes you’ll switch to their XPS alternative, and anything could happen.

  4. Ross says:

    I use PDF Creator set up as a printer choice. It seems to do a decent job of making PDFs.

  5. Keith says:

    Mark,

    Do you get the same bloat if you print to Adobe PDF rather than saving it as a PDF?

  6. Jeremy Kridel says:

    Hate to say it, but ripping and replacing an OS and a window manager for the sake of smaller PDFs is kind of user-unfriendly, no?

    If I were an economist I would probably try to argue MS should fix it or consumers will look elsewhere. But 1) mostly, consumers won’t, and will suffer instead; and 2) having been jerked around as an IT guy for years of platform changes, I’ll just blame MS for the problem. :-)

  7. Andrew Dillon says:

    I noticed the same problem with .doc/.docx conversion a long time ago. It happens in Word 2011 and it happens with Microsoft Word 2010 in Windows. With Apple, I’ve taken to Nisus Writer Pro. It plays really well with Equity (and Concourse) and I find it much more enjoyable to write and format in.

    I have Adobe Pro installed but I stopped using its PDF writer. I have a lot more success with embedding subsets and with controlling file size when I use the “Save As PDF” option from Apple’s print dialogue. (That’s for all programs, including MS Word.)

  8. bryan simmons says:

    If I ever meet Bill Gates, I am going to punch him in the nose. I just got Windows 8 and now I really hate the man. I really liked Windows XP and then my laptop died. Right in the nose. He’s going to get it right in the nose.
