None of it's rocket science, but Joel Spolsky does a wonderful job explaining what 'opened Office' means for you.

Here's the most important take-home: The file format is more useful for academics and archivists than it is for practical programmers. I like the following "hint" from Joel best, where he explains what to do if you really think your company needs to do any Office translations using the Office binary format.

[You think you need Office's binary format, and] your web hosting environment is Linux. Buy one Windows 2003 server, install a fully licensed copy of Word on it, and build a little web service that does the work. Half a day of work with C# and ASP.NET.


There's absolutely no good reason to roll anything with the Office formats in-house. There's zero reason to ever open the formats other than curiosity. If you need to write to Office, you use, as Joel suggests, comma-separated values for Excel, and HTML or RTF for Word. If you need more complicated Office documents, you create them with Office, as he suggests above.
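To make the "just write CSV or HTML" advice concrete, here's a minimal Python sketch. The function names, file names, and sample data are all made up for illustration; none of it comes from Joel's article. It only shows that producing something Excel or Word will open takes a few lines of standard library code.

```python
import csv
import html
import io


def rows_to_csv(rows):
    """Serialize rows (lists of values) to CSV text that Excel can open."""
    buffer = io.StringIO()
    # The default "excel" dialect quotes fields that contain commas or quotes.
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def paragraphs_to_html(title, paragraphs):
    """Build a bare HTML page; Word can open a file like this directly."""
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body>\n<h1>{html.escape(title)}</h1>\n{body}\n</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical sample data, only here to exercise the two helpers.
    report = [["Region", "Sales"], ["North, East", 1200], ["South", 950]]
    # utf-8-sig writes a byte-order mark, which helps Excel detect UTF-8.
    with open("report.csv", "w", newline="", encoding="utf-8-sig") as f:
        f.write(rows_to_csv(report))
    with open("report.html", "w", encoding="utf-8") as f:
        f.write(paragraphs_to_html("Q1 Report", ["Sales rose.", "Costs < budget."]))
```

Anything fancier than this (formulas, styles, embedded charts) is where the "create it with Office" advice kicks in.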

Here's a key for those not familiar with Office: A web service means any app, anywhere, can give your Windows 2003 server a call. If you need to ship this functionality in a heavy client, let your clients know that, at least for complicated Office formats, they're going to need either Office installed on a Windows box or access to the Internet. It's not insane to invest $2,500 in a server with one license of Office, one license of Windows Server, and a C# coder for a day. In the long run, you're easily going to save the cash you would have poured into some geek geeking out over Office file formats.

So now that we're through with Joel and your business needs, another interesting set of questions remains. First, there are the XML Office formats, created in part for those governments that demanded an open file format in their RFPs. How easy is it to use XML Office files? Though they're certainly easier to use than binary (and I wonder if opening up the binary format isn't an attempt by Microsoft to more easily qualify for some of those same jobs), they're still a huge mess. Still, XML is a format that machines and humans, here aka programmers, can read at the same time. It's gotta be much easier to write out specific subsets of the Office formats using XML than using the binary formats. That is, if you have a specific need to produce very specific documents for Office, XML might be a good option, provided your users have recent versions of Office.
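As a rough idea of what "a specific subset" can look like, here's a hedged Python sketch that writes a bare-bones Office Open XML (.docx) file: a zip package with three small XML parts. I'm assuming the .docx flavor of the XML formats here, and the function names are my own. It handles plain paragraphs only, with no styles, tables, or images, so treat it as a starting point rather than a reference implementation.

```python
import zipfile
from xml.sax.saxutils import escape

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

# Points the package at its main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    "</Relationships>"
)

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_document_xml(paragraphs):
    """Wrap each string in a paragraph/run/text element, escaping XML characters."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )


def write_docx(target, paragraphs):
    """Write the three required parts into a zip; target is a path or file object."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", build_document_xml(paragraphs))


if __name__ == "__main__":
    write_docx("hello.docx", ["Hello from plain XML.", "Ampersands & such are escaped."])
```

The point is that the fixed scaffolding is small; the mess starts once you need anything beyond plain text.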

Second, sure, in the typical capitalist view, it's better for five thousand individual companies to spend twenty-five hundred dollars apiece on Microsoft products to create five thousand different but incredibly similar web services all over the US. But then what if those five thousand companies decide instead to form a co-op and remake Microsoft Office? Now where are we? What if they pitched in on OpenOffice to make it easier to automate via JSP? Would we save serious dough across the board now? What if we were already 80% feature complete with an Office replacement, and the binary information makes another 10% of that quite a bit easier for the co-op to finish up? What should they do now?

(Unfortunately I'm still a fan of the Windows 2003 server with Office.)