Monday, March 29, 2010

Performance Issue: UNPIVOT with a Large Number of Columns

I love the PIVOT and UNPIVOT commands. They have saved me a ton of time on quite a few projects. UNPIVOT especially, since many times when working on BI projects I'm handed a monster spreadsheet by someone on the business side. Business people like seeing pivoted data. Database people like seing normalized data. I won't get into the keywords/commands here since there is plenty of material out there, most of which do a better job than I ever could of explaining their use.

However, I will tell you that SQL Server does not like "UNPIVOTing" a large number of columns as I recently found out. I should note that the query I have contains a few nested SELECTs (which I'm sure is part of the problem). I don't know the internals of what the database engine is doing when you using these commands but I do know they produce some pretty ugly execution plans. Since I just started this project and have a year of work to deliver in about 2 months (I'm not kidding) I haven't had the time to deconstruct the execution (nor should I really be writing about this topic).

What I have figured out is that if I break up a 280+ column UNPIVOT into 20 column chunks (an unpivot for every 20 columns - yes this is a maintenance nightmare) then I get reasonable performance (a couple of minutes to execute) whereas my original query never completed.

There's no way I'm going to be ok with having 15 queries instead of one to solve a performance issue but this is what I know for now and I'm able to present a working proof-of-concept. I hope to have a better understanding of the issue and as a result, the solution soon. I'll keep you posted. In the meantime, if you know what's going on please share . . .