Crushed by the Wheels of Industry

Being the geek in the family can get you in trouble. I was sat at a social function with my wife's friends, all happily chattering away in Mandarin. I can barely count to ten in the language so meaningful conversation is a little difficult unless it's about arithmetic. I got bored, which is always a dangerous thing.

“My daughter has a new computer, would you help set it up ?” asked one of my wife's friends. “I'd love to,” I replied. Finally, something to do that I would enjoy !

It was a new laptop with Windows Vista installed. I'm still amazed at how quickly the sales channel cleared of Windows XP computers, leaving Vista the only product on offer. I recently had to buy my brother a new laptop to replace the one that was stolen from his house (he lives in Sheffield in the UK where such things are not uncommon, unfortunately), and even at Fry's Electronics here in the valley I had to search in the “discontinued” bin to get him a machine with Windows XP installed. Although it was a brand new machine, this Vista laptop booted slowly.

“Can you put some useful software on there, stuff she can use to do her homework ?” he asked. His daughter is around eleven years old, so she was probably starting to have to write essays at school. “Sure. Does she have an Office suite installed ?” I asked. “There's a demo one on there, I think, but I think it's expensive to buy the real version,” he said. I checked: it was the trial version of Microsoft Office 2007, installed by default by the vendor. I knew just the thing to do and started to download OpenOffice.

I grabbed lots of freely available software from the Web. All of it useful, most of it fun and educational. Not all of it “Free Software” of course, some proprietary software available “free for personal use”. I may be dogmatic about freedom on my own machines, but when setting up for a child I'm not going to try and preach the Free Software religion, like some dotty old uncle rambling on in the corner. I set up OpenOffice to save by default in the Microsoft Office file formats so she wouldn't have to worry about swapping files with her friends and teachers.

Feeling very pleased with myself, I started to show off all the new things I'd set up on the machine. “... and you even have a free office suite for your homework, compatible with Microsoft Office,” I proudly announced at the end of the demo. “Great, can I see the homework I've already done ?” she asked. Her father had told me this was a brand new machine, and I didn't know she'd been using it for a while, but overconfident in my skills I navigated over to her “Documents” folder, and to my dismay saw several files ending not in .DOC as I expected, but in .DOCX. This is the new document format, OfficeOpenXML, introduced by Microsoft for Office 2007. I'd completely forgotten about this, not being a Microsoft Office user (I'm waiting until there's a Linux port). I tried opening them in OpenOffice. No deal. Now, being tidy by nature, I'd removed the trial version of Microsoft Office 2007 in order to save space. An eight-hundred megabyte download and lots of tedious registrations later (I used my email address instead of hers for the “helpful product messages” it promised to send; it seemed the least I could do), the trial version was reinstalled.

“I'll just convert these back to the .DOC format the older versions use. That's most likely the version you have at school,” I told her. But when I opened the files in Word 2007 and eventually found the “Save As” entry in the new menu system, I discovered to my horror that it was grayed out. “This feature is only available in the full version of Office 2007,” popped up a helpful little message. “Click here to purchase it.” Getting increasingly worried, I decided to try a more desperate measure. I selected the whole file and looked for the “Copy and Paste” option. I might lose the formatting this way, but at least I'd get the text of the essay she'd written. Copy and paste were disabled in the same way, and with the same message.

Copy and paste were disabled. Think about the fear and paranoia that led to that decision in the product design meeting for the trial version. “We want people to save in the new formats. The new formats are better.” So much so that all customer choice must be disabled by default. Choice is an optional extra, only available after purchase. The new Office .DOCX format happens to be incompatible with OpenOffice as well. Quite by accident I'm sure. What this means is that if you use the trial version of Office 2007 for thirty days, all documents you create will be completely unreadable by any other software unless you buy back access to your documents by purchasing the full version of the software. No easy way to get your documents out.

The story does have a happy ending. Being a geek I did get the data out. But I was very embarrassed about it. I got her to email the documents to me, and used the OfficeOpenXML (.DOCX format) translator that was created by my friend Michael Meeks at Novell. I still have one old SuSE Linux virtual machine around at home so I can fix bugs in Samba for my old company, and this was enough to install the translator into the Novell version of OpenOffice and retrieve the text. I emailed her back .DOC format files which she was happy with (and hopefully her homework got handed in on time). I also learned an important lesson about not making assumptions about what people need and expect, and I will be a little more careful next time.

Hopefully this is an instructive lesson in the usefulness of public, open standards. In collaboration with others, OpenOffice.org has created and standardized the Open Document Format (ODF) standard. Had both word processors supported Open Document Format as an option then it wouldn't have mattered if she had been using Microsoft Office, OpenOffice or any other common word processor. Possibly complex formatting or presentation information wouldn't have translated, but for most people all they're trying to do is write a document in such a way that it looks attractive so their management will like it. Or to get a good grade on homework. An open document format is such an obvious public benefit that I'm surprised it took so long to be standardized.

Please help support Open Document Format in your local or national government, community or school. As you may have noticed if you follow the press, there's a lot of lobbying going on for and against, and it needs all the popular support it can get. If you don't, you might end up left with the Tower of Babel of document formats we have now, where people who can't afford to buy proprietary office software can't communicate with their government or most businesses. This isn't good for them, and isn't good for society as a whole. Write to your representatives and tell them public money can be saved by standardizing on ODF. Once plug-ins are available for Microsoft Office they can continue to use the software they've already bought, so this doesn't even mean a wholesale switch of existing software or retraining cost.

Being a geek, I'll leave you with a puzzle. I did work out another way I could have retrieved her homework files without having to use a DOCX translator. Can you think of what I might have in mind ? Let me know what you would have done in this circumstance. In true Open Source fashion let's work out how to solve the problem in as many ways as we can think of !

Jeremy Allison



Comments

Errr...

Did you try drag and drop? :)))

-----
Ceterum censeo, Microsoft delenda est

I did work out another way

I did work out another way I could have retrieved her homework files without having to use a DOCX translator. Can you think of what I might have in mind ?

Screenshots and OCR? Using sed to strip out the XML markup? Using xslt to convert the document to something more useful? There are probably hundreds of ways of doing it. But all are painful, and wouldn't have been necessary had MS been using a standard document format.

getting the data out

print -> ps -> ps2ascii
Or is printing disabled too?

Yay !

That's exactly the way I was thinking of. I'd have used it if I was in my last desperate attempt to get the data out :-).

Jeremy.

Really?

Are you sure the print option was enabled? If something as basic as cut and paste was disabled, I'd be really surprised if print was working.

The answer to your puzzle

How about the Microsoft Word viewer utility? It is a free download from Microsoft and you can cut and paste from it, just not edit.

Or use some online file format conversion utility.

Was that the answer you were thinking of?

I didn't know about the Word viewer

That's a really good suggestion - I should have tried that one.

Jeremy.

MS Converter

Microsoft used to have a downloadable tool that would convert the new format to the old format. That has recently been removed and now they just have a compatibility pack that updates older versions of MS Office.

There are also online sites that will convert the format:

http://www.docx2doc.com/
http://docx-converter.com/

Re: MS Converter

Well, look at that. Microsoft creating a black market for converting office documents, the same way drug prohibition has created black markets in organic chemicals and plant extracts.

What astounds me is that, since the Microsoft management is not made up of morons or imbeciles (technically speaking, not figuratively), why do they think this is going to work for them?

Are they really taking PT Barnum so seriously, as to imagine they can fool enough people long enough to retrieve their stock options?

strings

will do it perhaps.

how about

Editing the raw xml?

Reply

How about emailing it ? Then you could copy/paste.

Jazzman

Puzzle Answer.

Python+scite+libxml2...

PDF route

Oh, well, another way is to print to a PDF printer like the one in KDE, then import the PDF into KWord, then save that as ODF.

Isn't it just a ZIP in disguise?

Just change the extension from .docx to .zip. Now open it and look inside at the XML content; you will find your text, and can copy and paste all you want.
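For anyone who would rather script that than rename files by hand, here is a minimal Python sketch of the same idea. It assumes the document body lives in `word/document.xml` inside the archive and uses the usual WordprocessingML namespace, which is what Office 2007 files normally contain; the function name `docx_to_text` is made up for illustration. It pulls out plain text only, so formatting, headers, footnotes and images are all ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML elements (w:p, w:t, ...).
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the plain text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: a sketch, not a full DOCX reader.
    """
    # A .docx file is an ordinary ZIP archive; the body text is assumed
    # to be stored in the member named word/document.xml.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each w:p element is a paragraph; its visible text sits in w:t elements.
    for para in root.iter(W_NS + "p"):
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

Calling `docx_to_text("essay.docx")` (the file name is just an example) should hand back the essay text as a string, which can then be pasted into any word processor and saved as .DOC or ODF.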

PrintToPDF

PrintToPDF (or is it PDFCreator?) - Anyway, one of the utilities on the "OpenCD" (a collection of Open Source software for Windows). Then use the PDF tools to turn them back into workable documents???

Wow... Disabled "cut and paste" - Microsoft must REALLY be getting desperate...

Wow! Microsoft products - free handcuffs included...

cut and paste

Cut and Paste the whole document from Word to OpenOffice?

Don't know why you felt embarrassed, you only did the right thing

The poor child and her parents would have had to shell out for the full version (or a student license at least) for her to be able to keep her own homework. All you did is notice and fix the problem before it actually happened !

DOCX burn

I'm feeling the burn of Microsoft's proprietary new "Open XML" document formats something bad now. Some Microsoft lover decided to push out Office 2007 to the PCs (thousands) where I work, and almost overnight DOCX and XLSX became the new standard for storing documents (barf! gag! ug!).

I never realized how dependent we had become on DOC files. This looks bad for OpenOffice because it suddenly can't read "standard" document files! Obviously Microsoft intended this. Annoyingly even some Microsoft stuff doesn't work with DOCX!

The world needs to ditch Microsoft now in a bad way. They keep pulling these kinds of stunts and it needs to stop.

My Favorite DOCX solution....

.... is the one that would probably annoy Redmond the most. Apple Pages reads DOCX files just fine, and can save in DOC or DOCX. I have become the default translator (well, my nine-year-old Pismo, actually) for a group of collaborators, one of whom can't seem to grasp the "Save As" concept.
