
Put the knife down and take a green herb, dude.

One feller's views on the state of everyday computer science & its application (and now, OTHER STUFF), from a guy who isn't rich enough to shell out for www.myfreakinfirst-andlast-name.com



FOR ENTERTAINMENT PURPOSES ONLY!!! Back-up your data and always wear white.
 
 
Thursday, July 02, 2015

TIL there's a ["conflict of interest" "behavioral guideline" at Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:Conflict_of_interest). Seems this would be easy to misuse, but it would've been useful back when I stumbled over the Midwest Book Review in 2010 and tried to make the [entry for the Midwest Book Review](https://en.wikipedia.org/w/index.php?title=Midwest_Book_Review&type=revision&diff=339443101&oldid=339436355) a little more informative. It's a site I suspect sells positive Amazon reviews.

Fwiw, I'd found an insanely positive review for a book I was considering buying, and the review didn't seem to include the sorts of specifics someone who'd really read the book would've used. I looked over the Midwest Book Review's history, and -- I'm doing this from memory; could be off a little -- the reviews were all very high, with the vast majority 5's. Turned out they took review solicitations.

There was a dude who wouldn't quit editing out some flavor of the following passage:

> Jordan Lapp, an author, asked Mr. Cox [Midwest Book Review's Editor in
> Chief] why Mr. Cox felt that Amazon's rating system was flawed, and why
> "all of the books [Midwest Book Review] rate merit a 5 star rating."
> Mr. Cox answered by saying, "So I instructed our webmaster (who does
> all the posting for reviews generated 'in-house' by the Midwest Book
> Review editorial staff) to use 5 if the book was given a positive
> recommendation." Mr. Cox continued by explaining that, "for a book to
> make it all the way through the Midwest Book Review process... it
> merited the highest recommendation available under the Amazon rating
> system. Inferior books, flawed books, substandard books are assumed to
> have been weeded out and never made it to the 'finish line' of
> publication in one of our book review magazines."

They do realize those less-than-5-star "reviews" would be just as useful as, if not more useful than, the ones they let out, right? That is, I'm going out on a limb to say that those sub-5 reviews don't exist. Or at least that the "in-house" team isn't paid for writing them.

Reminds you of the [Seinfeld episode about car reservations](https://www.youtube.com/watch?v=A7uvttu8ct0), doesn't it?

> "We know why we have the reviews."
>
> "I don't think you do. If you did, I'd have a clue about the book. See,
> you know how to *write* the review, you just don't know how to put
> *critique* in there. And that's really the most important part of
> the review. The reading and *critiquing*. Anybody can just *write* them..."

Sure, that's my take on the quote, but do note I didn't include anything from Seinfeld in my Wikipedia edit. You're welcome to draw your own conclusions. ;^)

And the controversial source for this potentially damning material from Mr. Cox? [The Midwest Book Review's website](http://www.midwestbookreview.com/jimcox/jun_07.htm). The page with that quote is still there.

Anyhow, I think that if you googled Cirt -- the anti-editor who kept taking out my changes -- long enough, you found a connection. If that connection was real, this conflict-of-interest flag would've really helped.

*sigh* I've probably detailed that here before. The strange thing to me is how much of what's on Wikipedia can be controlled by those with the most endurance for making edits. Not exactly a merit-based environment at its edges (core?).

Of course what's most interesting is that it'd be possible to algorithmically track places where folks invoked this guideline to influence Wikipedia's contents, and see if there are any obvious categories of COI usage.


posted by ruffin at 7/02/2015 10:20:00 AM
 
Wednesday, July 01, 2015

As evidenced by the last post, I've been boning up on my JSP & Java Servlet skillz this week as we prepare to migrate an app I prototyped in Node over to a WebLogic host, its eventual home.

I thought it'd be safe to develop against Tomcat, but wasn't absolutely sure, so I started googling around a little. So far, so good. Looks like vanilla Tomcat has, at worst, a subset of the features of every other major servlet container.

One of the most promising articles by title, [WebSphere vs. JBoss vs. WebLogic vs. Tomcat – presentation from the InterConnect 2015](http://whywebsphere.com/2015/02/27/websphere-vs-jboss-vs-weblogic-vs-tomcat-presentation-from-the-interconnect-2015/) (it's got everything! All the major servlet containers! It's from this year!) turned out to be a painful foray into marketing-by-blog from IBM, but there was one bonus: the slide below.

(image later goes here)

So 90% of TCO (total cost of ownership) for a software system *isn't* licensing. That's probably true to no-worse-than-trueish.

An interesting exercise, however, would be to put numbers -- okay, okay, *first* you have to recategorize ("Developer, admin and end-user training cost" falls in the same category? RLY? etc) -- beside each of the other categories.

Though it's worth saying that a 9% savings is significant any way you look at it, especially for garage-companies where developer and admin cost is paid in elbow grease.


posted by ruffin at 7/01/2015 01:57:00 PM
 
Tuesday, June 30, 2015

Well, all I needed to get *up and running* on JSP and Servlets (current project might be switching stacks) [is here](https://www.youtube.com/playlist?list=PLE0F6C1917A427E96). Very good video package so far. Very basic, but thorough, which makes remembering all this stuff that I haven't used in probably 10-11 years pretty simple.

It's all different dialects of the same language, but it's useful to have a primer before changing regions.

I just heard someone on a Mac podcast complain about an irrational hate of Java. I don't get it. Java is a good language, other than [the ivory tower syndrome that infests many of its stock objects](http://myfreakinname.blogspot.com/2004/02/watch-out-for-apis-link-muses-about.html). There's a reason Microsoft stole a lot from Java when they put together C#, to the point that I'm happy working in either for faceless code. Maybe Objective-C users are prone to another syndrome, Stockholm.


posted by ruffin at 6/30/2015 11:39:00 AM

From [CNet](http://www.cnet.com/news/siri-why-apple-musics-silliest-feature-may-be-the-crucial-one/):
> Also working in Apple's favor: iPhone owners already use Siri for other
> simple tasks. About 42 percent of iPhone owners turn to Siri at least
> once a month, according to ComScore.

Is it just me, or is that number waaaay under where you would've expected it'd be?

I try to tell folks to use Siri for directions, which it's pretty good at providing, and which seems to, surprisingly, be a difficult thing for folks to do on their own "by hand". I can almost get it to text for me too, especially when I'm plugged in in the car. "Hey, Siri. Text [pseudo Siri-phonetic pronunciation] blah blah some message blah."

But more than half of iPhone users not even *playing* with Siri once a month? That seems like a fail. I wonder how many iPhone users in that survey still have iPhones that can't use Siri. It can't be many. Forty-two percent, though a famous number, is here a real fail.

That said, the point of the CNet piece -- that Siri will be AppleMusic's differentiator -- is interesting. Their example "play the top 10 alternative songs now" is actually pretty compelling. It's Pandora stations with potentially static, user-defined rules on demand. That's pretty cool. Of course, see [why I think Apple's (naturally?) moving to streaming music](http://myfreakinname.blogspot.com/2015/06/apple-likes-streaming-because-phones.html) to keep your grain of salt handy.


posted by ruffin at 6/30/2015 10:16:00 AM
 
Saturday, June 27, 2015

Why does Apple like streaming music? Because they get to trade on-device storage for cellular bandwidth. And I don't say that because they want to skimp on device-based storage. I'm saying that because folks that can take advantage of unlimited streaming pay for high bandwidth plans, and high bandwidth plans are expensive. You want to keep phones high[ish] end devices that carry status, and right now being able to stream means you've got cash to burn on frivolous cellular bandwidth.

(Yes, I said frivolous. I'm old, I know, I know. GET OFF MY LAWN. But I do cringe when I listen to a few minutes of MLB on my iPhone, being a Ting user. These continuous bandwidth things just aren't necessities. And on-device music is pretty cheap. If you don't like buying, try etree.org)

By keeping high-bandwidth uses mainstream, Apple keeps phone use upmarket, and that's an important part of their company's future. Stay tuned for more high-bandwidth uses to come (see current exhibit: FaceTime over cellular...). It's not that there aren't awesome things bandwidth makes possible, but it's not coincidence or convergent evolution alone that put Apple on board.



posted by ruffin at 6/27/2015 01:54:00 PM

From a comment on a Lifehacker article:

> Also, if you open the Image Capture app, at the bottom left there’s a collapsible menu that allows you to select what application automatically opens whenever any media is mounted.

Yes, please, thank you. I had a real rough ride with Photos, and eventually gave up. I'm not sure why Apple keeps thinking a single "file" with all your pictures is a better paradigm than a real file system. And they can't write a good photo management tool to save their lives. I ran into import stalls, app freezes, and the inability to delete (or even cancel deleting) the giant photo file until I turned off the iMac, rebooted, and used rm -rf *. That shouldn't be necessary. NOTE: Don't type that command in your terminal unless you darn well know what you're doing.

Photos in OS X 10.10 stinks. iPhoto tried hard, but stunk. That's it. I'm done. Again. I hope.

So I'm back to using Image Capture, which I love, with Picasa, which I also have learned to love. Picasa is quick and defensive, sitting politely on top of your file system without really screwing with it at all. It's an easy paradigm, and it updates quickly when you use the Finder to move your files around instead of its own interface. Because, you know, files.

The only problem? Photos keeps trying to open when I attach an iOS device.

Beautiful. Not even a button for "create a new library", which I believe iPhoto had. I'm growing increasingly confident Apple's testing doesn't extend to edge cases and what I'll call "exploratory misuse". If you don't follow their oxymoronically private Flowchart of Proper Apple Software Usage, too bad. Why isn't there a "make new library" button? Well, because nobody ever needs to do that. Why won't the Photos library delete in 3 hours? Well, because we all know the first rule of Photos Club: Nobody deletes Photos Club. (rm -rf was much faster, in case you're wondering.)

This Image Capture trick does the trick (though you have to do it for each source) and makes me like Image Capture more. Simpler is often better.



posted by ruffin at 6/27/2015 10:27:00 AM
 
Friday, June 26, 2015

A good, thought-provoking post from Actively Lazy today:

> I think we massively underestimate the cost of coordination and communication when building software.

That's true. Full stop. This, the communication price, is exactly why folks are still quoting The Mythical Man-Month. Communication makes [hu]man-months mythical.

But what we do with this discovery is what's really important. It's easy to find one solution and think it's the only solution, as I believe Actively Lazy has done. Let's explore two.


Pair Programming

Here's Mr. Green's (Actively Lazy's author's) take home (emphasis mine):

> From the outside its very easy to miss: a quick 5 minute conversation laden with jargon. And yet… this is where the magic happens: this is where the design comes from. But if that 5 minute conversation interrupted someone’s work, the next 45 minutes could be lost while they try and reload into memory what they were working on. Pile up a few of these interruptions in your day, and no wonder it feels like you’re swimming upstream. [anybody else hear Joel Spolsky now? -mfn]
>
> Clearly, what we should have been doing but weren’t was pairing. That way there would only have been one narrative thread. One sequence of ideas being applied at a time. ... There is no cost of interruption when you’re both already there, immersed in the problem. ...
>
> So in fact: two people can be more productive than one. Two people pairing is definitely better than one person working on their own. ... It’s all based on a false assumption: that two people working on different machines are twice as productive as one person working alone. Once you realise that this assumption is fundamentally flawed, the “cost” of pairing evaporates. Instead pairing removes the cost of coordination between two developers: no interruptions, no divergent ideas, no merge conflicts.

My quoting makes the argument a little worse; you really should read the entire post in context. But you get the picture. He's arguing that pair programming pays the "coordination and communication" costs as-you-go so that you don't have to pay the piper in spades, to mix some metaphors, later.


Pair programming trades function for quality

When you work in pairs, you create less working code that is higher quality. This is nearly a truism. If you're somewhere code review's benefits aren't appreciated, run. I'm not saying you should require or even use a lot of code review -- my jury's still deliberating on its best use -- but reviewed code is of higher quality than code that isn't. It may not necessarily do the job better-qua-[characteristic X], but it's higher quality code.[1]

Just to get you started, here are some common Characteristic Xs (feels like we're making Powerpuff Girls):

  • Best practices
  • Efficiency/Scalability
  • Normalization/smart reuse
  • Standardization
  • Proper error handling


Quality vs. Functionality

So is pair programming the answer to removing communication problems? Sort of. Depends on why you think you need to communicate! One key here is that Green only has two developers:

> Me and one other developer have been working on a new, greenfield project.

If you only have two devs, of course pair programming kills the problem of deferred communication. But if you have 40 devs, you've now got 20 pairs that could need to sync back up.

Twenty pairs is still much, much better than 40 individuals. The key take-home is that pair programming makes for higher quality software, not more of it. There's more than a single metric for evaluating software's worth.

Another side of code is functionality, often reduced to the reasonably useful metric, lines of code (loc).

Aside: If used correctly, lines of code can give you a general idea of how productive your devs have been. Though insert the usual caveats:

  • loc is really only useful comparing a specific dev's output to her or his own
  • an increase in loc for a dev often means they're pumping out code too fast and feeling stressed
  • brilliant solutions don't always produce lots of code; usually it's the reverse

And there's a performance bar that tells us if code is good enough. That is, pair programmed code may be higher quality, but it may also not be significantly faster or use significantly less memory, etc. The biggest problem in evaluating how much quality you need is that often you don't know how significant a bad LINQ statement might be until you run it at scale. Yet I think the suggestion is that you have to bias your development culture so that it always strives for high quality with respect to best practices. You want good scalability? Don't overuse ORMs. Don't push logic into the client. Write good SQL. Take time to plan your schema.

You get the point. The bottom line: Your code is higher quality if it's reviewed for standard practices, errors, and efficiency/scalability first, even if that review time reduces how much functionality you can build. But remember that functionality is why you're in business: you shouldn't sacrifice function for idealistic quality. In other words, you need to discover...


The nasty truth: There is "good enough"

Imperfect code can still provide acceptable functionality. News flash: There is no perfect code. [Most] Any solution beyond a certain level of complexity is subjective.

That there are many subjective Right Answers, and that you can only reasonably select one of them, is precisely why you ask for advice before tackling particularly difficult problems. That is, this is why you design code before coding. Even when you're going to tackle a major problem alone, you "pre-[re]view" with someone else (don't you?!!) your selection from all the different ways you [both] can think of to solve it.

The reason you consult someone else first is because you know the problem you're working on needs an extra helping of quality mixed in with the quantity. Your solution -- or at least its design -- needs a second set of eyes. In a sense, though you may not have touched a keyboard [much], you are already pair programming.

The conflation of these two code metrics -- functionality & quality -- is what provokes Mr. Green to say...

> Unwinding a few days, we probably would have got more done with just one person working. That way there would only be one narrative thread through the code, one sequence of refactoring steps at a time.
>
> Wait, what – say that again: we would have got more done last week if only one person had been working on it.

Sure, one person working alone could bring the code back to a single narrative, but it'd be one person's narrative. You're right back into communication debt. Hopefully it's not as bad as it was earlier, but it exists. This single person's progress is another unreviewed revision. And perhaps that's good enough.


A different lesson: Separation of concerns

The real key is that you can't have X people, where X is determined by your company's management, working on the same code at the same time without factoring in the communication costs for getting X folk on the same page.

If you don't have time to factor in communication and code review, the new code will be worse in all of those standard ways -- best practices, efficiency, normalization/smart reuse, standardization, error handling. Worse, each person will find themselves coding around or being hamstrung by changes to their narrative made by X-1 other coders. A codebase shared by X devs without coordination is worse than having X coders doing their own thing. You do not get X-times (or "Xx" -- I regret my use of "X" at this point) the work, even if each dev is working on their own seemingly independent story!

To get "Xx" functionality (and that's what the company wants, man. When you feel pressure to go faster, and are considering adding more devs to finish sooner, it's because you want functionality), you have to separate concerns perfectly.

  • You must create interfaces ahead of time, at least one per "communication unit".
  • You must hit these interfaces.
    • Rather, if you change them, you have to change your schedule drastically.
    • You just changed two teams' narratives!
  • You must create test cases for these interfaces with realistic values.
  • The interfaces must be natural breaks; they must be truly separate concerns.
    • You can't have any interaction between them.
  • They require an emphasis on defensive code.
    • There can be no unintended interactions between the opposite sides of the interface.

The interface is a contract. And objective contracts (strangely possible in code; it's magic) are the most efficient means of communication for software projects.

Now you're coding as fast as possible. And if done correctly, you can get a lot closer to the myth. But there's a huge, obvious downside...


Communication Debt added to your Technical Debt

And remember, if X is large and your stories are done by one person per story, your "communication debt" will be just as huge. You will have technical debt, and you will have a huge learning curve for the new dev if the person working on the code changes. And if the meatware half of the cyborg leaves your company before the knowledge transfer/code review takes place, you're in trouble.

The good news? This debt will be firewalled by the interface. That's as far as the bad can go, if you have smart TDD.

If X is large and your stories are done by two people, the debt will be more than halved, but your functional output could be more than halved as well!

Quality code is hard, (c) 1842.


[1] See the constant complaining about JSLint rules wherever JSLint is used. Let me summarize two-thirds of the answers on the JSLint tag on StackOverflow: "If you don't like rule X, you should use JSHint and turn it off." JSLint is a form of code review that emphasizes a set of "best practices". If you're left to your own devices, you might not always follow them. Following them might not make for better code in every situation, but you will have good, standardized code with fewer errors. JSLinted code is higher quality code, though conforming takes more time than not.

As I've said before, "If I had to inherit legacy code sight unseen, and could only require that it be JSLinted or JSHinted (pick one), I'd pick JSLinted code every time," and that's because it's higher quality.



posted by ruffin at 6/26/2015 10:20:00 AM
 
Tuesday, June 23, 2015

Programmers are copying security flaws into your software, researchers warn - CNET:

> Working more as code assemblers than as writers, programmers are sourcing about 80 percent to 90 percent of the code in any given software application from third parties, many experts estimate.

Unless we're talking about third-party libraries (which I just finished lamenting earlier this month), there's no way that's true. Even then, there's no mature, custom codebase with only 20% original code. Gosh, I wish it were that easy. In other words, your WordPress programmer isn't [a programmer, to be clear].

I love the unsourced "many experts" too.



posted by ruffin at 6/23/2015 07:57:00 PM

Our IT guy just relayed a priceless message from Sharepoint that "explains" problems we're having accessing Sharepoint today. Honestly, it's classic.

> If you’re having problems accessing SharePoint today, we just got this:
>
> “Current Status: Engineers are analyzing the service infrastructure to identify the root cause of impact and formulate a remediation plan.
>
> User Experience: Affected users are experiencing latency when accessing or rendering SharePoint Online sites.”
>
> That “latency” can be several seconds to 10 minutes.


The bits in quotation marks (the blue stuff in the original email) are from Sharepoint, the rest from our IT guy (who is a bright, often funny, and thankfully competent dude).

You can't make this stuff up. I taught business and technical writing for four semesters while TAing in grad school. I'm not sure I could have taught folks to be this perfectly horrible if I'd tried my darnedest. Wow. In a strange, perverted sense, this was perfectly written.

Just for fun, let's translate:

> We don't know why Sharepoint is hanging for up to 10 minutes, but we're going to try and fix it as soon as we can.



posted by ruffin at 6/23/2015 11:51:00 AM

Got this message at my outlook account Friday, and just noticed it today:

> It's time to upgrade to an even better Skype experience on your Windows 8 device. Your current version of Skype is being replaced by an app called Skype for Windows desktop. It has more features to help you stay in touch like screen sharing and group video calling. Also, your chat conversations (from the last 30 days) and all your contacts will appear as normal after upgrading.

Ouch. I thought Win10 was going to be more "Metro" friendly, somehow maintaining backwards compatibility with apps written for Win8, but this is making me wonder if the migration path goes back through the desktop.

In other fun, whoever came up with the mbox format wasn't on top of their game. It's hard to imagine in the days where SGML and XML are passé that any non-clearly delimited storage format could gain so much practical acceptance. Wow.



posted by ruffin at 6/23/2015 08:56:00 AM
 
Wednesday, June 17, 2015

I've got a project where I've been "given"[1] code that has lots of display logic in sprocs on SQL Server, and that display logic (colors, in this case) is fairly inextricably tied in with the data I want to use too. That is, we have several tables that hold raw data, and these sprocs both tease out the data and put it, somewhat inextensibly, into what boils down to one giant "row" of data. In other words, the data in the sproc's output is not normalized.

That info is, in the original system, passed on to SSRS (SQL Server Reporting Services, which is, afaict, a sort of SQL Server-specific Crystal Reports equivalent) and translated, somewhat painfully, into SSRS's pseudo-html.

You can get those results into another table pretty easily, via building a table whose structure matches the output and using something like insert into spam EXEC crazySproc @p1 = 'crazy', @p2 = 'sproc'. And then you're off to the races. Sort of.
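Spelled out, that pattern looks something like the sketch below. To be clear, #sprocOutput and its columns are hypothetical stand-ins for whatever crazySproc actually returns; adjust to taste.

create table #sprocOutput
(
    -- Must match the sproc's result set exactly, column for column.
    someVal varchar(50),
    someOtherVal varchar(50)
);

-- INSERT ... EXEC pours the sproc's result set into the table.
insert into #sprocOutput
    exec crazySproc @p1 = 'crazy', @p2 = 'sproc';

select * from #sprocOutput;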

Because of the way we've got this set up, all the values are in a single row that extends until the end of time, with specialized display column value after specialized display column value interspersed with the raw data we want to operate on.[2]

The Right Thing To Do would be to rewrite the sproc to give us normalized data. But The Quick Thing would be to try and get each column name and value along with that value's display info into a single row we can easily JSON up and send, packaged with a little more display info that'll replace the static setup we have in SSRS, to the client.

So from this:

+------+--------+---------+-----------+-----------+--------+
| val1 | val2   | ...valn | colorVal1 | colorVal2 | ...etc |
+------+--------+---------+-----------+-----------+--------+
| 5    | 8.23   | ...     | red       | green     | ...etc |
+------+--------+---------+-----------+-----------+--------+

We want to have...

+------------+--------------+-------------+---------+--------------------------+-----------+
| metricName | metricAmount | metricColor | ordinal | label                    | isPercent |
+------------+--------------+-------------+---------+--------------------------+-----------+
| val1       |            5 |         red |       1 | This is the first value  |         0 |
| val2       |        8.23  |       green |       2 | This is the second value |         1 |
| ...        |              |             |         |                          |           |    
+------------+--------------+-------------+---------+--------------------------+-----------+

We can join on metricName pretty easily to get those extra fields on the right, though, as always, I hate joining on character strings. But desperate times and all that; we're joining to a sproc result that only expected to be used for one purpose, and didn't anticipate reuse.
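Once the sproc output has been unpivoted into rows (more on that below), that join is nothing fancy. Here's a rough sketch; #unpivotedVals and the metricMetadata lookup table are hypothetical stand-ins for however you actually stage the data:

-- #unpivotedVals: (metricName, metricAmount, metricColor), one row per metric.
-- metricMetadata: a made-up lookup table carrying ordinal, label, and isPercent.
select
    u.metricName,
    u.metricAmount,
    u.metricColor,
    m.ordinal,
    m.label,
    m.isPercent
from #unpivotedVals u
    inner join metricMetadata m
        on m.metricName = u.metricName  -- joining on a character string, alas
order by m.ordinal;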

So how to turn the first table into one that can be used to build the second? UNPIVOT it, turning columns into rows.

Generic case

I'm not sure why, but I had to stare at some examples for a while before I kinda got what was going on with UNPIVOT. Let's just go ahead and look at an extended but simplistic example:

create table #unpivotMe
(
    id int identity(1,1) not null,
    col1 varchar(50),
    col2 varchar(50),
    col3 varchar(50)
);
go

insert into #unpivotMe (col1, col2, col3)
values
    ('1v1', '1v2', '1v3'),
    ('2v1', '2v2', '2v3'),
    ('3v1', '3v2', '3v3');

select * from #unpivotMe

So far, so simple enough...

+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
| 1  |  1v1 |  1v2 |  1v3 |
| 2  |  2v1 |  2v2 |  2v3 |
| 3  |  3v1 |  3v2 |  3v3 |
+----+------+------+------+

Here's how you need (or how I needed) to think of what comes next -- Each column is a name and value pair. And we have a choice with each column. We can:

  1. ignore it,
  2. translate it into a single value in a row,
  3. ... or display it "as-is", without UNPIVOTing.

Let's start by unpivoting every coln column (option 2 across the board), turning each named column into a row value.

select * from #unpivotMe
unpivot
(
    colValue for col in (col1, col2, col3)
) as unpivoted

The result is reasonably neat. Remember that, in my use case, there are no ids and only one row, so it'd stop after the third row displayed below.

+----+----------+------+
| id | colValue | col  |
+----+----------+------+
| 1  |      1v1 | col1 |
| 1  |      1v2 | col2 |
| 1  |      1v3 | col3 |
| 2  |      2v1 | col1 |
| 2  |      2v2 | col2 |
| 2  |      2v3 | col3 |
| 3  |      3v1 | col1 |
| 3  |      3v2 | col2 |
| 3  |      3v3 | col3 |
+----+----------+------+

Just for fun, let's experiment with option 3, "display a column as is", by leaving col3 out of the unpivoting this time, but keeping it in the SELECT clause.

select * from #unpivotMe
unpivot
(
    colValue for col in (col1, col2)
) as unpivoted

Fewer rows, and an extra column.

+----+------+----------+------+
| id | col3 | colValue | col  |
+----+------+----------+------+
|  1 |  1v3 |      1v1 | col1 |
|  1 |  1v3 |      1v2 | col2 |
|  2 |  2v3 |      2v1 | col1 |
|  2 |  2v3 |      2v2 | col2 |
|  3 |  3v3 |      3v1 | col1 |
|  3 |  3v3 |      3v2 | col2 |
+----+------+----------+------+

This shows us that the columns explicitly listed in the unpivot clause are the only ones we're going to "explode" into row values, which allows some interesting uses.

There are some crazy caveats, however, like the fact that the columns in the UNPIVOT list apparently have to be the same type and length. Check out this link at "If we attempt to use the UNPIVOT operator to do this, we run into trouble". That's pretty painful, and requires some wacky casting to keep up the shortcut charade, below.

One row mash-up use case (kludges ahoy!)

Let's also create something closer to my original use case, so you can tell exactly how it's useful.

create table #valuesAndDisplayInfo
(
    val1 decimal(5,2),
    val2 decimal(5,2),
    val3 decimal(5,2),
    colorVal1 varchar(15),
    colorVal2 varchar(15),
    colorVal3 varchar(15)
)

insert into #valuesAndDisplayInfo
    values
    (5, 8.23, 98.85, 'red', 'green', 'yellow');

select * from #valuesAndDisplayInfo;

That gives us the giant single row of data that parallels what the sproc I talked about gives me.

+------+------+-------+-----------+-----------+-----------+
| val1 | val2 | val3  | colorVal1 | colorVal2 | colorVal3 |
+------+------+-------+-----------+-----------+-----------+
| 5.00 | 8.23 | 98.85 |       red |     green |    yellow |
+------+------+-------+-----------+-----------+-----------+

That's kind of nasty. There are three values with three colors for display, all in the same place. Am I supposed to just JSON that up and look for every label in the format val[0-9]*? Yuck.

So let's UNPIVOT.

select * from #valuesAndDisplayInfo
unpivot
(
    value for field in (val1, val2, val3, colorVal1, colorVal2, colorVal3)
) as valuesAsRows

Whoops!

The type of column "colorVal1" conflicts with the type of other columns specified in the UNPIVOT list.

Remember that we have to have all of the column types the same in our UNPIVOT list. Let's get cast crazy.

select field, value from (
    select
        cast(val1 as varchar(15)) as val1,
        cast(val2 as varchar(15)) as val2,
        cast(val3 as varchar(15)) as val3,
        colorVal1,
        colorVal2,
        colorVal3
    from #valuesAndDisplayInfo
) as castedValAndInfo
unpivot
(
    value for field in (val1, val2, val3, colorVal1, colorVal2, colorVal3)
) as valuesAsRows

Though the casting stinks, that's not puke out loud horrible, but I really wanted the colors to be on the same row as the raw values. Right now, they aren't.

+-----------+--------+
| field     | value  |
+-----------+--------+
| val1      | 5.00   |
| val2      | 8.23   |
| val3      | 98.85  |
| colorVal1 | red    |
| colorVal2 | green  |
| colorVal3 | yellow |
+-----------+--------+

Here's the bullheaded, inefficient, magic-string way around that, which we probably oughta integrate earlier.

-- Note that we're selecting into a "real" table now
-- Probably should use a temp table.
select field, value into unpivotedVals from (
    select
    cast(val1 as varchar(15)) as val1,
    cast(val2 as varchar(15)) as val2,
    cast(val3 as varchar(15)) as val3,
    colorVal1,
    colorVal2,
    colorVal3
    from #valuesAndDisplayInfo
) as castedValAndInfo
unpivot
(
    value for field in (val1, val2, val3, colorVal1, colorVal2, colorVal3)
) as valuesAsRows

select v1.field, v1.value, v2.value as color from unpivotedVals v1
    inner join unpivotedVals v2
        on v2.field = 'color' + v1.field

Success. I don't love it, but you can see how unpivoting helped us get here.

+-------+-------+--------+
| field | value | color  |
+-------+-------+--------+
| val1  |  5.00 |    red |
| val2  |  8.23 |  green |
| val3  | 98.85 | yellow |
+-------+-------+--------+

Pretty good introduction and further exploration of UNPIVOT at those links, which is also where I found out about the type and length restrictions for UNPIVOT. I haven't read all the way through, but it looks like he has a method of using APPLY that might make this less painful.
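I haven't tried his exact approach, but the usual APPLY-flavored alternative looks something like the sketch below (run against #valuesAndDisplayInfo from above, and not necessarily what the linked article does). Because each value/color pair is spelled out explicitly, it lands a value and its color on the same row in a single pass, no string-mangling self-join required:

-- Sketch: CROSS APPLY with a values() constructor emits one row per
-- (field, value, color) triple, pairing each value with its color column.
select ca.field, ca.value, ca.color
from #valuesAndDisplayInfo
cross apply
(
    values
        ('val1', cast(val1 as varchar(15)), colorVal1),
        ('val2', cast(val2 as varchar(15)), colorVal2),
        ('val3', cast(val3 as varchar(15)), colorVal3)
) as ca (field, value, color);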

Again, The Right Way is to rewrite the sproc, which I think I'm going to do, but this is a nice, if incredibly kludgey, defensive (as in it doesn't affect any of the "given" code) method to get from A to B.


[1] Hrm, not exactly "given". Inherited? What's it called when you have contractors who are hired before you start and pick a stack that's probably not what you would've picked?

[2] "... on which we wish to operate"?



posted by ruffin at 6/17/2015 04:12:00 PM

That tab separated values to ASCII (and Unicode) table-like structures generator is here. And here's the sauce of that link (for me, at least).

Particularly useful for StackOverflow.



posted by ruffin at 6/17/2015 02:19:00 PM
