I've enjoyed a couple of Elliotte Rusty Harold's Java books published by O'Reilly, and though I'm certainly not an XML fan (exhibit A & exhibit B), I thought his online book, Processing XML with Java, would be worth a look. Even if I'm not the biggest fan of XML because of its overuse (or at least its proponents' attempts to have it overused), I do think it's great for the one purpose I've mentioned before (see exhibit A again), namely preparing your app to interface with anonymous applications.

At any rate, much of the first chapter deals with why you'd want to use XML in your Java apps. Here's his goal:

Would you rather write the code to send and receive orders that are formatted as nice, simple linefeed delimited files as shown in Example 1.1 or as complex, marked up XML documents such as Example 1.2? Both documents contain the same information. Most uninitiated developers prefer the first, simpler form. After all each piece of information is presented on a line by itself with no extraneous markup characters getting in the way. It's my goal to convince you that contrary to most developers' first intuition the second form is more robust, more extensible, and much easier to work with.

Well, there's no doubt XML is more robust than "rolling your own text files", though I'd tend to disagree with an argument that it's more robust and extensible than a good RDBMS -- or even an underpowered one like Access or hsqldb. It's always interesting to read these comparisons that prop up the straw man of a text file and then knock it down with XML, when an RDBMS runs rings around both in these "middle cases" where the users of the data store aren't anonymous, and chances are they never will be.

And when it comes to small-time apps with little data, I'll stick to my guns and say a home-rolled text file, well wrapped in good object-oriented code, will do you one better than XML every time. Harold admits it himself, I'm afraid, making the case a shut one in my mind with this quote:

One of the original ten goals for XML was that "It shall be easy to write programs which process XML documents." Originally, this was interpreted as meaning that a "Desperate Perl Hacker" could write an XML parser in a weekend. Later it became clear that XML was simply too complex, even in its simplest form, for this goal to be met. However, the understanding of this requirement changed to mean that a typical programmer could use any of a number of free tools and libraries to process XML. Given this interpretation, the goal has most certainly been met. [bold was mine, of course]

I'll spare you the point-by-point refutation of the first chapter for now, but thought I'd at least blog this much. XML has two advantages, neither of which wins it for me. First, it's extensible, and its design is quite easy to change after the fact. This has kept me coming back to XML now and again, hoping it was more elegant than I last remembered (so far, no). Second, XML is both human readable and machine readable, which Harold talks about at length. But if it's machine written, it's just as likely to be mis-written as the plain text file, and being human readable only matters if the mistake is one that's easily eyeballed and if the data is routinely spot-checked (which implies a big system, indeed!). Otherwise you've just added a maintenance nightmare waiting to happen, with overhead you simply don't need inside your application.
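
To make the "home-rolled text file, well wrapped in good object-oriented code" idea a bit more concrete, here's a minimal sketch of what I have in mind. The `Order` class, its three fields, and the one-field-per-line layout are all made up for illustration; they aren't taken from Harold's Example 1.1, and a real app would need more fields and more validation.

```java
/**
 * Hypothetical order stored as a linefeed-delimited text record:
 * customer on line 1, product on line 2, quantity on line 3.
 * Assumes no field ever contains a line break.
 */
final class Order {
    final String customer;
    final String product;
    final int quantity;

    Order(String customer, String product, int quantity) {
        this.customer = customer;
        this.product = product;
        this.quantity = quantity;
    }

    /** Writes the order as three lines, one field per line. */
    String toText() {
        return customer + "\n" + product + "\n" + quantity + "\n";
    }

    /** Reads the three-line format back, rejecting anything malformed. */
    static Order fromText(String text) {
        // "\\R" matches any line break; trailing empty strings are dropped.
        String[] lines = text.split("\\R");
        if (lines.length != 3) {
            throw new IllegalArgumentException(
                "expected 3 lines but found " + lines.length);
        }
        // parseInt throws NumberFormatException (an IllegalArgumentException)
        // when the quantity line isn't a whole number.
        int quantity = Integer.parseInt(lines[2].trim());
        return new Order(lines[0], lines[1], quantity);
    }
}
```

The rest of the app only ever sees `Order` objects, so the file layout stays a private detail of this one class. The flip side is the extensibility point above: adding a fourth field means touching both `toText` and `fromText`, and old files need a migration story.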

That said, I hope to inch my way through the online book. ;^)