Monday, August 27, 2012

Publish SSDT Projects Without Visual Studio Using SqlPackage.exe

If you'd like to publish your SSDT database project but don't have access to the target database from the machine where you run Visual Studio (with SSDT), and/or you don't have Visual Studio installed on the target server, you can still publish your database by using SqlPackage.exe.

For those familiar with Visual Studio Database Projects (VSDB), SqlPackage.exe is the replacement for VSDBCMD.exe.

Microsoft provides a Web Platform Installer (WPI) package for the Microsoft SQL Server 2012 Data-Tier Application Framework, which lays down everything required to use the SqlPackage command-line utility. You can also install the individual components from the Microsoft SQL Server 2012 Feature Pack if you prefer. They consist of the following:
  1. dacframework.msi
  2. SQLDOM.msi
  3. SQLLS.msi
  4. SQLSysClrTypes.msi
Once everything is installed you should be able to find SqlPackage.exe in a location similar to "C:\Program Files (x86)\Microsoft SQL Server\110\DAC\bin."

Below is a basic sample of how to execute SqlPackage.exe.

C:\Program Files (x86)\Microsoft SQL Server\110\DAC\bin\sqlpackage.exe /a:Publish /tsn:TargetServerName /tdn:TargetDatabaseName /sf:c:\YourDacPac.dacpac
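
If the target instance requires SQL authentication, credentials and publish options can be passed on the same command line. The variation below is illustrative only; the login, password, and property values are placeholders rather than anything from the original example:

C:\Program Files (x86)\Microsoft SQL Server\110\DAC\bin\sqlpackage.exe /a:Publish /tsn:TargetServerName /tdn:TargetDatabaseName /tu:SqlLogin /tp:SqlPassword /p:BlockOnPossibleDataLoss=False /sf:c:\YourDacPac.dacpac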

For an exhaustive list of available parameters see MSDN.

Wednesday, August 22, 2012

SSDT: unresolved reference to object [dbo].[sp_executesql].

Procedure: Some_Procedure has an unresolved reference to object [dbo].[sp_executesql].
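
Any procedure that uses dynamic SQL will trigger it. For illustration only (this procedure is not from the original post), something as simple as the following produces the warning when the project has no reference to master:

CREATE PROCEDURE dbo.Some_Procedure
    @TableName SYSNAME
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) = N'SELECT COUNT(*) FROM ' + QUOTENAME(@TableName) + N';';
    EXEC sp_executesql @sql; -- sp_executesql lives in master, hence the unresolved reference
END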

Look familiar? There's a simple solution. We just need to add a reference to the master database.

1) Right-click the References folder in Solution Explorer and select Add Database Reference...

2) In the Add Database Reference dialog, choose System database, select "master", and click OK.


3) Voila. The reference appears under the folder and the warning goes away.


Tuesday, August 14, 2012

Creating a CLR Function in SSDT

Creating a CLR function in SSDT (SQL Server Data Tools) is not very difficult, but if you've never done it before you might be wondering where to start. Here goes...

Enabling CLR
First things first. While not directly related to SSDT, it is necessary to enable CLR on your SQL Server instance, since whatever code you end up writing won't be able to execute until this is done. All we have to do is set the 'clr enabled' configuration option to 1. See below.
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'clr enabled', 1;
GO
RECONFIGURE;
GO

SSDT Project Properties
It is necessary to tell your SSDT project which version of the .NET Framework to use. If you're running SQL Server 2008 R2 you can use .NET Framework 3.5 or lower; SQL Server 2012 supports .NET Framework 4.0. To get to the screen below, right-click your project in Solution Explorer, select Properties, and then click the SQLCLR tab on the left-hand side. Set your Target framework appropriately.


Add a Function to the Project
Right-click within the project and select Add > New Item... 



Choose SQL CLR C# User Defined Function to add a function to the project.

Define the Function
Below is a sample function from MSDN. Obviously you can create a function to do whatever you're trying to accomplish.
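
(The original post showed the sample code in a screenshot.) As a stand-in, here is a minimal sketch of a phone-validation scalar function; the validatePhone name matches the function referenced later in this post, but the signature and regular expression are assumptions for illustration rather than the exact MSDN sample:

using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // Illustrative sketch: returns true when the input looks like a US-style
    // phone number such as (555) 123-4567 or 555-123-4567.
    [SqlFunction]
    public static SqlBoolean validatePhone(SqlString phoneNumber)
    {
        if (phoneNumber.IsNull)
            return SqlBoolean.False;

        return Regex.IsMatch(phoneNumber.Value, @"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$");
    }
}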


Publish
Publish your SSDT project as you normally would.

SSDT makes database development and deployment a breeze, but sometimes the approach can be less than intuitive. If you're used to creating CLR objects manually, you might have been wondering how to create the assembly and the T-SQL reference to the CLR object's method. SSDT is kind enough to do this work for us.

If you look at Object Explorer you'll see the new assembly (DBProj) and the new dbo.validatePhone function.


If you further explore the dbo.validatePhone function you'll see that it is a wrapper for your CLR function.
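
Assuming the validatePhone sketch above, the generated wrapper looks roughly like this (SSDT derives the exact parameter and return types from your C# signature, so treat this as an approximation):

CREATE FUNCTION [dbo].[validatePhone] (@phoneNumber NVARCHAR(4000))
RETURNS BIT
AS EXTERNAL NAME [DBProj].[UserDefinedFunctions].[validatePhone];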


Use it
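
For example (an illustrative call, assuming the validatePhone sketch above):

SELECT dbo.validatePhone('(555) 123-4567') AS IsValidPhone; -- returns 1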



Simple as that.

Conclusion
CLR objects can be as complex as the code that you write, but SSDT makes their development and deployment easier. Hopefully this post shows you just how easy it is.

Friday, August 10, 2012

SSIS Event Handlers: Behavior In Package Hierarchy


I recently stumbled onto what seemed to be an odd performance problem in an SSIS project I was working on. The problem was that I had a number of straightforward packages that ran extremely fast but sometimes, without explanation, ran really slowly.

Luckily the project included a robust logging system that provided, among other things, the execution duration of every package and Control Flow task. The logs showed that the Data Flows in the packages in question were running in 10 to 20 seconds, yet the packages were at times taking upwards of 20 minutes. Furthermore, I found that when called on its own, each package did not run much longer than its single Data Flow, yet when run as a child of another package (via an Execute Package task) the execution duration more than doubled. In one example I had a package that ran in 10 seconds on its own, 23 seconds as a child, 56 seconds as a child three levels deep, and almost 3 minutes when run as a child four levels deep. In that same example the Data Flow took only 7 seconds. So why was the package so slow when run as a child of one or more packages?

Time for SQL Server Profiler.

The logging mechanisms in place were stored procedures called by Execute SQL tasks, and they were being passed things like a batch Id, Package Id, Source Name, Source GUID, Start Time, End Time, etc. Once I began profiling it was clear that there were far too many Pre- and Post-Execute logging stored procedure calls.
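
(The actual logging procedures weren't shown in the original post; the sketch below is purely illustrative of the kind of call the OnPreExecute handler was making, with hypothetical object and parameter names.)

CREATE PROCEDURE dbo.LogTaskPreExecute -- hypothetical name
    @BatchId INT,
    @PackageId UNIQUEIDENTIFIER,
    @SourceName NVARCHAR(256),
    @SourceGuid UNIQUEIDENTIFIER,
    @StartTime DATETIME
AS
BEGIN
    -- dbo.TaskExecutionLog is a hypothetical log table
    INSERT INTO dbo.TaskExecutionLog (BatchId, PackageId, SourceName, SourceGuid, StartTime)
    VALUES (@BatchId, @PackageId, @SourceName, @SourceGuid, @StartTime);
END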

Let me take a step back and explain the project/package design. The project contained about 50 SSIS packages, some of which were called by Execute Package tasks within the project. For example, there was a Master package that called some staging packages, some of those staging packages called lower-level staging packages, and those in turn called some generic logging packages. There were as many as four packages in any given package execution hierarchy. In each package, regardless of where in the package hierarchy it fell, there were four event handlers:
  1. OnError: Handled exceptions that were then logged to the log database.
  2. OnVariableValueChanged: Handled changes to the variables we were interested in and logged them to the log database.
  3. OnPreExecute: Handled the logging of the pre-execution data (e.g. start time, source name, etc.) for all package tasks.
  4. OnPostExecute: Handled the logging of post-execution data (e.g. end time) for all package tasks.
Getting back to what I was seeing in Profiler (many calls to the Pre- and Post-Execute logging stored procedures), it only made sense that the OnPreExecute and OnPostExecute event handlers were firing each time those logging stored procedures were called. So why would those event handlers fire so many times?

I started to research event handler propagation to understand what happens in SSIS when we have package-level event handlers and a package design in which packages call other packages. I quickly ran into Jamie Thomson’s “SSIS: Understanding event propagation” blog post. In his post he talks about three things:
  1. Events being caught by more than one event handler.
  2. Event handlers firing events.
  3. The System::Propagate variable.

Numbers 1 and 2 were what I was most interested in. Number 3 is something I’ve written about in the past, in a blog post titled “SSIS Foreach Loop Container: Continue on Error”, where I explain how to continue a Foreach Loop container when a child task fails.

I took Jamie’s example and modified it to fit my situation: a four-package hierarchy with a single Control Flow task (an empty Script task, just to get the event handler to fire) in the lowest-level package and an OnPostExecute event handler in all packages. Instead of an empty Control Flow task in the parent packages, an Execute Package task was used to call the package one level down (e.g. Package_Level2 called Package_Level1, Package_Level3 called Package_Level2, etc.). The screenshots below depict the design.

Package_Level1 Control Flow

Package_Level1 OnPostExecute Event Handler

Package_Level2 Control Flow w/ Execute Package Edit Dialog


When running these sample packages I found that each subsequent level in the hierarchy more than doubles the number of events fired. When running Package_Level1 the OnPostExecute event handler is executed twice: once for the Level1_EmptyControlFlowTask task and then again for the OnPostExecute of the package itself. When running Package_Level2 the OnPostExecute event handler is fired a total of 8 times; see the table below. Package_Level1 bubbles up the Level1_EventHandler (twice), the Level1_EmptyControlFlowTask, and the Package_Level1 OnPostExecute events to the Package_Level2 event handler. In addition to those 4 events we still have the original two events we saw when executing Package_Level1 on its own. And we have the Package_Level2 OnPostExecute events for the Level2_ExecuteChildPackage task (the call to execute the Package_Level1 package) and the Package_Level2 package execution itself.

Sequence | Event Handler Package | Task                        | Event
---------|-----------------------|-----------------------------|--------------
1        | Package_Level1        | Level1_EmptyControlFlowTask | OnPostExecute
2        | Package_Level2        | Level1_EventHandler         | OnPostExecute
3        | Package_Level2        | Level1_EmptyControlFlowTask | OnPostExecute
4        | Package_Level1        | Package_Level1              | OnPostExecute
5        | Package_Level2        | Level1_EventHandler         | OnPostExecute
6        | Package_Level2        | Package_Level1              | OnPostExecute
7        | Package_Level2        | Level2_ExecuteChildPackage  | OnPostExecute
8        | Package_Level2        | Package_Level2              | OnPostExecute

When we execute Package_Level4 we see a total of 52 events fired, all triggered by the single Level1_EmptyControlFlowTask task in Package_Level1.
Below is the formula for determining how many additional events will be fired at each level of a package hierarchy, per Control Flow task, when there is a single OnPostExecute event handler in every package in the hierarchy.

y = 2^n + x

x = previous package level’s additional events
n = level in package hierarchy (starting at base package level 1)
y = total additional events fired at level n
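
Unrolling this recurrence gives a closed form: the additional events at level n are 2^(n+1) - 2, and the running total across levels 1 through n is 2^(n+2) - 2n - 4. For example, at level 4 that works out to 2^5 - 2 = 30 additional events and 2^6 - 8 - 4 = 52 total, which matches the table below.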

Package Level | Calculation | Additional Events Fired | Total Events Fired
--------------|-------------|-------------------------|-------------------
1             | 2^1 + 0     | 2                       | 2
2             | 2^2 + 2     | 6                       | 8
3             | 2^3 + 6     | 14                      | 22
4             | 2^4 + 14    | 30                      | 52
5             | 2^5 + 30    | 62                      | 114

As you can see, having nested packages with package-level event handlers can create quite a bit of overhead. I initially failed to see this in my logs because my logging stored procedures were ignoring the event handlers that were fired by other event handlers. However, the process still had the overhead of all the events firing and the stored procedure calls; the data just wasn't being persisted to the database.

Event handler behavior can be very tricky. Don't be like me: understand what's happening before implementing your event handler logic.