Put the knife down and take a green herb, dude. (c) Ruffin Bailey 2001-2021

From Chapter 5 of Processing XML with Java, "Reading XML":

The main point is this: most programs you write are going to read documents written in a specific XML vocabulary. They are not going to be designed to handle absolutely any well-formed document that comes down the pipe. Your programs will make assumptions about the content and structure of those documents, just as they now make assumptions about the content and structure of external objects.

That's a really intelligent way of coming at reading XML, and I'd argue it's the way it should be done in about 75% of the real-world cases where XML is used. That Harold does it, even tongue in cheek, is far more important than my recommendation, of course. His books on Java are top-rate stuff, imo. The bottom line here is that you know darn well what the XML file you want to read is going to look like. Why pull in any parsing libraries that you don't absolutely need? KISS.

Here's some of the code where he's reading out the XML from his specific example.

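// Excerpt from a larger class; needs java.io.* and java.math.BigInteger.
private static BigInteger readFibonacciXMLRPCResponse(
   InputStream in) throws IOException, NumberFormatException,
   StringIndexOutOfBoundsException {
    
    // Read the whole response into memory, decoding the bytes as UTF-8.
    StringBuffer sb = new StringBuffer();
    Reader reader = new InputStreamReader(in, "UTF-8");
    int c;
    // Read through the Reader, not the raw stream, so the decoding is applied.
    while ((c = reader.read()) != -1) sb.append((char) c);
    
    String document = sb.toString();
    // The tag literals were stripped as markup when this page was rendered;
    // these values are presumed from the XML-RPC response in Harold's example.
    String startTag = "<value><double>";
    String endTag = "</double></value>";
    // Cut out whatever sits between the two tags and parse it as a number.
    int start = document.indexOf(startTag) + startTag.length();
    int end = document.indexOf(endTag);
    String result = document.substring(start, end);
    return new BigInteger(result);
    
  }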

I think you can see what's going on. Honestly, that's the right way to do it in what is a much more common, real-world situation than one might suspect. Now granted, Harold adds...

Straight text parsing is not the appropriate tool with which to navigate an XML document. The structure and semantics of an XML document is encoded in the document’s markup, its tags and its attributes; and you need a tool that is designed to recognize and understand this structure as well as reporting any possible errors in this structure. This tool is called an XML parser.
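For contrast, here's a rough sketch of what the parser route looks like using the DOM classes that ship with the JDK. The class and method names are mine, not Harold's, and I'm assuming the number arrives in a double element the way a typical XML-RPC response would carry it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical class name; a sketch, not the book's code.
final class ParserSketch {

    // Reads the text of the first <double> element using a real XML parser.
    // The element name "double" is an assumption about the response format.
    static BigInteger readWithParser(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // parse() throws SAXException if the document is not well-formed.
        Document doc = builder.parse(in);
        NodeList doubles = doc.getElementsByTagName("double");
        if (doubles.getLength() == 0) {
            throw new IOException("no <double> element in the response");
        }
        // trim() drops any whitespace around the number inside the element.
        return new BigInteger(doubles.item(0).getTextContent().trim());
    }

    public static void main(String[] args) throws Exception {
        // Made-up response; note the spaces between the tags, which would
        // trip up a search for the exact string "<value><double>".
        String xml = "<methodResponse><params><param>"
                + "<value> <double>28657</double> </value>"
                + "</param></params></methodResponse>";
        InputStream in =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readWithParser(in));
    }
}
```

The parser doesn't care about stray whitespace between tags and it refuses malformed input outright, which is the sort of thing Harold is getting at.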

But think of how many times you've seen XML used essentially as a text file. There's absolutely no reason to learn all the new APIs necessary to parse XML in these cases. You could just as easily have created your own file format, but I understand why one might use XML instead. It hands you structure that would take a typical programming team weeks to hammer out in conference. Why not lean on an over-engineered structure for your flat files? And if you ever do step past simple, easy-to-consume formats, your files are already XML, so you can move to the XML APIs without re-engineering the format.
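To make the flat-file case concrete, here's a minimal sketch of the text-only approach wrapped in a reusable helper. Everything in it (the class name, the textBetween method, the settings document) is made up for illustration. Unlike the excerpt above, it checks that both tags were actually found before cutting out the value.

```java
import java.util.Optional;

// Hypothetical helper for XML that is really just a flat text file.
final class FlatXmlSketch {

    // Returns the trimmed text between the first startTag and the next
    // endTag after it, or Optional.empty() if either tag is missing.
    static Optional<String> textBetween(String document,
                                        String startTag, String endTag) {
        int open = document.indexOf(startTag);
        if (open < 0) {
            return Optional.empty();
        }
        int start = open + startTag.length();
        // Search for the end tag only after the start tag.
        int end = document.indexOf(endTag, start);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(document.substring(start, end).trim());
    }

    public static void main(String[] args) {
        // Made-up settings file with a fixed, known layout.
        String settings =
                "<settings><user>example</user><retries>3</retries></settings>";
        // prints "example"
        System.out.println(
                textBetween(settings, "<user>", "</user>").orElse("(missing)"));
        // prints "(missing)" because there is no <timeout> element
        System.out.println(
                textBetween(settings, "<timeout>", "</timeout>").orElse("(missing)"));
    }
}
```

This only holds up while you control the file and its layout stays fixed. It knows nothing about attributes, entities, or nesting.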

XML's strength is to compose very large batches of structured data for anonymous consumption, but that doesn't mean that's the way it's most commonly used, nor the only way that XML's structure can be used. Let your approach, both for the file and the parsing, match your needs.

Labels: java, xml


Tuesday, May 27, 2008
Elliotte Rusty Harold's view on "Reading XML"
posted by ruffin at 5/27/2008 12:33:00 AM
