Sunday, March 11, 2018

Estimated Number of Rows = 1 Query Very Slow or Not Returning Data

A query that should be fast but isn't (e.g. the underlying tables are not enormous, it has been fast in the past, etc. - I'll let you determine what should or should not be fast) is often, but not always, slow because of a "bad" query plan. Determining what is and is not a good query plan is not something we're going to get into here, but if you're at your wits' end with a query that has run for hours and should be much faster, there's a good chance you don't have an optimal plan.

If you've landed on this page it's likely that you've already looked at your query plan and seen that some operators are showing Estimated Number of Rows = 1 when you know there are definitely more rows than that.

A key ingredient of a good query plan is accurate cardinality estimation. Cardinality refers to the uniqueness of the values contained within a particular column or index, and it has a strong influence on the SQL Server Query Optimizer. It drives decisions such as which join operator to use (e.g. hash match, nested loops, merge), which in turn significantly impacts query performance.

If you're encountering Estimated Number of Rows = 1, there's a good chance that your statistics need to be updated, the Cardinality Estimator is failing you, or both.

To resolve the issue, first try updating statistics on the table(s) in question. Use the WITH FULLSCAN option to cover your bases.

 UPDATE STATISTICS table_or_indexed_view_name WITH FULLSCAN 

After you've updated your statistics, take another look at your query plan. Has the Estimated Number of Rows value changed to something more accurate? If so, try running your query again. If not, it might be that you're having issues with the Cardinality Estimator.
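If you want to see when a statistics object was last updated and how values are distributed, DBCC SHOW_STATISTICS is a quick check (the table and index names here are hypothetical):

```sql
-- Shows the statistics header (including the last updated date),
-- the density vector, and the histogram for a hypothetical index
DBCC SHOW_STATISTICS ('dbo.MyTable', 'IX_MyTable_MyColumn');
```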

The Cardinality Estimator received an overhaul with SQL Server 2014 and, while it's better in most cases, it is not in all. In fact, I stumbled upon a bug in the CE just last year that Microsoft acknowledged and fixed. If you suspect the CE is not working as it should, there are still a few things you can try.

First, determine if you're having an issue that someone else has already encountered and that Microsoft has fixed. You can do this by enabling trace flag 4199, which enables all query optimizer hotfixes. You can first try this at the query level by using the following option.

 OPTION (QUERYTRACEON 4199); 
If this works and you want to apply this globally, you can execute the following.

 DBCC TRACEON (4199, -1); 

If you still haven't solved the problem you can have SQL Server use the old or "legacy" CE, as they call it. This can be done at the query level by adding the following hint after your query

 OPTION (QUERYTRACEON 9481); 
or, on SQL Server 2016 and later, at the database level using the following script.

 ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON; 
And if none of that works...
Take a look at your query plan. Do you have Merge Joins that require sorts on large data sets? Are there Nested Loops that are taking forever? (Try using Live Query Statistics to find your bottlenecks.) You can try using join hints to force a HASH join and see if that helps.
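As a rough sketch of the join-hint approach (the table and column names below are hypothetical, not from this post):

```sql
-- Force a hash match instead of whatever join the optimizer picked
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
INNER HASH JOIN dbo.Customers AS c
    ON c.CustomerID = o.CustomerID;

-- Alternatively, hint at the query level without rewriting the join syntax:
-- ... OPTION (HASH JOIN);
```

Keep in mind that a join hint in the FROM clause also forces the join order, so treat it as a diagnostic tool or last resort rather than a permanent fix.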

I know all too well how painful this type of situation can be. While we were very grateful that Microsoft fixed our problem, it took months to identify it and then months more before it was available in a service pack. Unfortunately for us, using the Legacy CE was not an option. Hopefully for you, one of these fixes will work!

Thursday, March 8, 2018

Visual Studio SQL Server Project (SSDT) OPENQUERY Warning

Using OPENQUERY in a stored procedure in a Visual Studio SQL Server project will result in the following warning.

 SQL70558: The column definition from this OPENQUERY function cannot be verified, and the operations involved with it might not be understood correctly.       
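For context, a pattern as simple as this will produce the warning (the linked server and remote query here are hypothetical):

```sql
-- SSDT cannot verify the column definitions coming back from OPENQUERY,
-- so this raises SQL70558 at build time
CREATE PROCEDURE dbo.GetRemoteOrders
AS
BEGIN
    SELECT OrderID, OrderDate
    FROM OPENQUERY([RemoteServer], 'SELECT OrderID, OrderDate FROM dbo.Orders');
END
```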

Depending on how your project and build are configured, this warning may or may not fail the build. Regardless, if you have to use OPENQUERY and don't want the warning, you can use the Suppress TSql Warnings property on the stored procedure. To view this property in Visual Studio, click on the .sql file in the Solution Explorer window. You should now see the property in the last row of the Properties window. Enter warning number 70558. The warning should now be suppressed and will no longer appear in the Error window or the build output.

The Suppress TSql Warnings property can be used to suppress other warnings as well. To suppress multiple warnings just delimit them with a comma.
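Behind the scenes, the property is stored on the file's Build entry in the .sqlproj file, so you can also set it there directly (the file path and warning numbers below are just examples):

```xml
<ItemGroup>
  <Build Include="Procedures\GetRemoteOrders.sql">
    <SuppressTSqlWarnings>70558,71502</SuppressTSqlWarnings>
  </Build>
</ItemGroup>
```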

Tuesday, July 12, 2016

Script to Copy Data Between Two SQL Server Databases

I recently had a requirement to "copy" the data between two identical database schemas. Actually, the requirement was to "rebuild" a database from the ground up (objects and data). I almost always use Visual Studio SQL Server projects (available after installing SSDT, as well as with the latest versions of Visual Studio) when doing database development, which means I can deploy an "empty" database with more or less the click of a mouse, so I won't get into creating the actual database, only populating it. If you need to copy the database objects to a new database there are tools built into SQL Server Management Studio that allow you to do this; you can check out this TechNet post for more information. You can also generate DML or data population scripts from SSMS, but that can result in massive files if you have a database with a decent number of rows. My databases were rather large, so I wanted to copy the data from one database to the other without having to explicitly write anything to disk.

This stored procedure handles tables with Identity columns as well as Computed columns. It also only attempts to populate tables that have zero rows. You might have different requirements so feel free to enhance it and post any enhancements that you think would be valuable to others.

CREATE PROCEDURE [dbo].[apUtilCopyDataFromDatabase] @SourceLinkedServerName     VARCHAR(128) = NULL,
                                       @SourceDatabaseName         VARCHAR(128),
                                       @SourceSchemaName           VARCHAR(128),
                                       @SourceTableNamePrefix      VARCHAR(128) = '',
                                       @TargetLinkedServerName     VARCHAR(128) = NULL,
                                       @TargetDatabaseName         VARCHAR(128) = NULL,
                                       @TargetSchemaName           VARCHAR(128) = 'dbo',
                                       @OnlyPopulateEmptyTablesInd BIT = 0
AS
BEGIN
      SET NOCOUNT ON;

      DECLARE @SQLCursorData       NVARCHAR(MAX),
              @SQLTruncate         NVARCHAR(MAX),
              @SQL                 NVARCHAR(MAX),
              @TableName           SYSNAME,
              @SchemaName          SYSNAME,
              @HasIdentity         BIT,
              @CurrentDatabaseName VARCHAR(128) = DB_NAME(),
              @TargetDatabase      VARCHAR(128) = @TargetDatabaseName; --ToDo: Cleanup confusing names

      -- Disable all constraints in the target database while we copy.
      DECLARE @SQLNoCheck NVARCHAR(4000) = 'USE ' + @TargetDatabase + '; EXEC sp_MSforeachtable @command1 = ''ALTER TABLE ? NOCHECK CONSTRAINT ALL''; USE ' + @CurrentDatabaseName + ';';
      EXEC sp_executesql @SQLNoCheck;

      -- Build fully qualified source and target name prefixes.
      SET @SourceDatabaseName = IIF(@SourceLinkedServerName IS NOT NULL, QUOTENAME(@SourceLinkedServerName) + '.', '')
                                + QUOTENAME(@SourceDatabaseName) + '.'
                                + QUOTENAME(@SourceSchemaName);

      SET @TargetDatabaseName = IIF(@TargetLinkedServerName IS NOT NULL, QUOTENAME(@TargetLinkedServerName) + '.', '')
                                + IIF(@TargetDatabaseName IS NOT NULL, QUOTENAME(@TargetDatabaseName) + '.', '')
                                + QUOTENAME(@TargetSchemaName);

      -- Cursor over the user tables in the target database; when
      -- @OnlyPopulateEmptyTablesInd = 1 only tables with zero rows qualify.
      SET @SQLCursorData = 'DECLARE TableCursor CURSOR GLOBAL FOR
        SELECT s.name, t.name
        FROM ' + @TargetDatabase + '.sys.tables t
          JOIN ' + @TargetDatabase + '.sys.schemas s
            ON t.schema_id = s.schema_id
          JOIN ' + @TargetDatabase + '.sys.objects AS o
            ON t.object_id = o.object_id
          JOIN ' + @TargetDatabase + '.sys.partitions AS p
            ON o.object_id = p.object_id
        WHERE t.type = ''U''
          AND t.name LIKE ''' + @SourceTableNamePrefix + '%''
        GROUP BY s.name, t.name
        HAVING SUM(p.rows) = 0
            OR ' + CAST(@OnlyPopulateEmptyTablesInd AS VARCHAR(1)) + ' = 0;';

      EXEC sp_executesql @SQLCursorData;

      OPEN TableCursor;

      FETCH NEXT FROM TableCursor INTO @SchemaName, @TableName;

      DECLARE @ColumnList VARCHAR(MAX);

      WHILE @@FETCH_STATUS = 0
        BEGIN
            BEGIN TRY
                PRINT '-----------------------';
                PRINT QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName);
                PRINT '-----------------------';

                --ToDo: Add logic to determine if there are FK constraints. TRUNCATE if not, DELETE if there are.
                IF (@OnlyPopulateEmptyTablesInd = 0)
                  BEGIN
                      SET @SQLTruncate = 'DELETE FROM '
                                         + QUOTENAME(@TargetDatabase) + '.'
                                         + QUOTENAME(@SchemaName) + '.'
                                         + QUOTENAME(@TableName) + ';';

                      EXEC sp_executesql @SQLTruncate;

                      PRINT CAST(@@ROWCOUNT AS VARCHAR) + ' rows deleted.';
                  END

                -- Does the target table have an identity column?
                SET @HasIdentity = 0;

                DECLARE @SQLHasIdentity NVARCHAR(1000) =
                        'USE ' + @TargetDatabase + ';
                         SELECT @val = MAX(CAST(is_identity AS INT))
                         FROM ' + @TargetDatabase + '.sys.columns
                         WHERE object_id = OBJECT_ID(QUOTENAME(''' + @SchemaName + ''') + ''.'' + QUOTENAME(''' + @TableName + '''));
                         USE ' + @CurrentDatabaseName + ';';

                EXEC sp_executesql @SQLHasIdentity, N'@val BIT OUTPUT', @val = @HasIdentity OUTPUT;

                -- Build the column list, skipping computed columns and audit columns.
                SET @ColumnList = NULL;

                DECLARE @SQLColumnList NVARCHAR(4000) =
                        'USE ' + @TargetDatabase + ';
                         SELECT @val = COALESCE(@val + '','', '''') + QUOTENAME(name)
                         FROM ' + @TargetDatabase + '.sys.columns
                         WHERE object_id = OBJECT_ID(QUOTENAME(''' + @SchemaName + ''') + ''.'' + QUOTENAME(''' + @TableName + '''))
                           AND is_computed = 0
                           AND name NOT IN ( ''CreatedDate'', ''CreatedUser'', ''LastUpdatedDate'', ''LastUpdatedUser'' );
                         USE ' + @CurrentDatabaseName + ';';

                EXECUTE sp_executesql @SQLColumnList, N'@val VARCHAR(MAX) OUTPUT', @val = @ColumnList OUTPUT;

                SET @SQL = IIF(@HasIdentity = 1, 'SET IDENTITY_INSERT ' + @TargetDatabaseName + '.' + QUOTENAME(@TableName) + ' ON;', '')
                           + 'INSERT INTO ' + @TargetDatabaseName + '.'
                           + QUOTENAME(@TableName) + ' (' + @ColumnList + ') SELECT '
                           + @ColumnList + ' FROM '
                           + @SourceDatabaseName + '.'
                           + QUOTENAME(@TableName) + ' '
                           + IIF(@HasIdentity = 1, 'SET IDENTITY_INSERT ' + @TargetDatabaseName + '.' + QUOTENAME(@TableName) + ' OFF ', '')
                           + ';';

                EXEC (@SQL);

                PRINT CAST(@@ROWCOUNT AS VARCHAR) + ' rows inserted.';
            END TRY
            BEGIN CATCH
                PRINT ERROR_MESSAGE();
            END CATCH

            FETCH NEXT FROM TableCursor INTO @SchemaName, @TableName;
        END

      CLOSE TableCursor;

      DEALLOCATE TableCursor;

      -- Re-enable constraints in the target database.
      DECLARE @SQLCheck NVARCHAR(4000) = 'USE ' + @TargetDatabase + '; EXEC sp_MSforeachtable @command1 = ''ALTER TABLE ? CHECK CONSTRAINT ALL''; USE ' + @CurrentDatabaseName + ';';
      EXEC sp_executesql @SQLCheck;

      RETURN 0;
END
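A hypothetical call, assuming a SourceDb/TargetDb pair on the same server with identical dbo schemas (the database names are placeholders):

```sql
-- Copy rows for every dbo table, but only into tables that are currently empty
EXEC dbo.apUtilCopyDataFromDatabase
     @SourceDatabaseName         = 'SourceDb',
     @SourceSchemaName           = 'dbo',
     @TargetDatabaseName         = 'TargetDb',
     @OnlyPopulateEmptyTablesInd = 1;
```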

Tuesday, June 21, 2016

Visual Studio, TFS, Git, & GitHub

For those unfamiliar with TFVC, it is one of the two version control engines shipped with Team Foundation Server, available with Visual Studio Team Services, and available on Microsoft's GitHub "rival", CodePlex. Many developers assume TFVC when they refer to using TFS, but this assumption went out the window with TFS 2013, which began shipping with the option to use either TFVC or Git as the version control repository.

There are many sites that do a great job of describing Git and what makes it so powerful, so I won't get into specifics here, but I will say that it excels over most other version control platforms when it comes to large, distributed projects. One such project is the Linux operating system. In fact, it was Linux's creator, Linus Torvalds, who developed Git to support the large, distributed nature of Linux development. Even with smaller projects, developers are embracing Git, partially due to applications like GitHub that easily allow sharing and collaboration across the globe.

Git does such a great job that Microsoft itself is moving away from TFVC in favor of Git for many of its own projects. It's even hosting many of its public (open source) projects on GitHub including the .NET Core project.

...which brings us to GitHub. GitHub is not Git and Git is not GitHub. Git is a standalone version control application whereas GitHub is a web-based Git repository hosting service. GitHub allows users to create and share Git repositories and interact with them via the website, APIs, command line tools, IDE plugins, etc.

Back to Microsoft and its integrations with Git and GitHub...

1) CodePlex offers a Git version control option (it is the default):

2) TFS and Visual Studio Team Services offer a Git option (also the default option):

3) Visual Studio 2015 provides local Git repository integration as well as
GitHub integration:

If you landed here it's because you spend at least part of your time developing on the Microsoft stack and are interested in Git. If you know and like Git and didn't know if or how it integrated with Microsoft development tools, now you know enough to find the resources you need to get started. If you've only just heard about Git in passing and work with Microsoft development tools, I would urge you to do some research on Git and GitHub prior to starting your next project so that you can make an informed decision as to which version control system to use and whether you want to use a hosted repository service like GitHub.

Useful Links
  1. Interesting Wired article that gives some history of Git and GitHub
  2. The Register article about Microsoft moving projects to Git
  3. Visual Studio GitHub extension for all versions of Visual Studio prior to Visual Studio 2015 Update 1.
  4. posh-git PowerShell Interface Install Overview which is also part of the GitHub Desktop app install
  5. Combining TFVC and Git repos in the same TFS project blog post

Tuesday, May 31, 2016

Multiple Rows of Tabs/Docs in SSMS

For a long time I felt like a crazy person, constantly closing documents in SSMS just because I could not easily access the documents that no longer had a visible tab. I would find myself resizing Object Explorer, maximizing SSMS, etc. just to be able to see all the tabs. I'm sure I'm not the only one. In fact, Sergey Vlasov has created a Visual Studio plugin to deal with exactly this. Since SSMS is built on top of Visual Studio, the plugin also works for those who work exclusively in SSMS.

Visual Studio Gallery Link

Thank you Sergey!

Wednesday, March 5, 2014

SSIS Lookup Transformation Using Salesforce as a Data Source

Pragmatic Works offers a Salesforce Source component as part of its Task Factory SSIS component library. Using this component along with the out-of-the-box SSIS Cache Transform, it is possible to populate a Lookup Transformation with Salesforce data.

*Note: this post assumes you have, at a minimum, installed the trial version of the Pragmatic Works Task Factory product.

Example scenario where this is useful: We have a data warehouse that contains customer account data. Each customer account can have a broker that is responsible for servicing the account (each account can have 0 to 1 brokers and each broker can have 0 to many accounts). We want to write this data warehouse account data to a Salesforce Account object. The Account object also has a relationship to a Salesforce Broker object. In order for us to maintain this relationship in Salesforce, it is necessary to pass the internal Salesforce Broker Id when writing the data to Salesforce. Since we do not maintain the Salesforce Broker Id in our data warehouse, we must look it up from Salesforce using a Broker Number that is a unique broker identifier in our data warehouse. Since we cannot use the Salesforce Connection Manager in a Lookup Transformation, we will use an approach that leverages the Cache Transform to give us a Lookup Transformation that uses Salesforce data.

Let's take a look at the Control Flow of our package. 

As you can see, this simple package contains two Data Flows. The first (PopulateLookupCache) populates the Cache Transform and the second (OleDBSourceToSalesforce) writes data to Salesforce. I've highlighted three of the components necessary to make this happen. The first is the PopulateLookupCache Data Flow and the second and third are Connection Managers of types CACHE and SALESFORCE.

Let's look at the PopulateLookupCache Data Flow.

This Data Flow uses the Pragmatic Works Salesforce Source component to retrieve data from Salesforce using a SOQL query and populates a Cache Transform (LookupCache) with the internal Salesforce Id and the external Id (BrokerNumber). Below is the Cache Connection Manager definition. Note the Index Position value of 1 for the BrokerNumber column. This means that BrokerNumber will be the column used to perform the lookup when we define our Lookup Transformation.
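The SOQL side of that first Data Flow might look something like this (the custom object and field names are hypothetical stand-ins for the ones in the screenshots):

```sql
-- SOQL query for the Task Factory Salesforce Source:
-- pull the internal Salesforce Id and the external BrokerNumber
SELECT Id, BrokerNumber__c FROM Broker__c
```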

Now that we've populated our Cache Transform we can move on to our primary Data Flow that will perform our lookup using our cached data.

In this data flow we retrieve Account data from a SQL Server database using our OleDBSource Connection Manager and write it to the Account object in Salesforce. In between our source and destination components we have 1) a Data Conversion transformation to convert our source system's BrokerNumber from non-Unicode (VARCHAR/STR) to Unicode (NVARCHAR/WSTR) and 2) our broker lookup, which uses Salesforce data to retrieve the Account's internal Salesforce Broker Id using our source system's broker number.

Let's take a look at our lookup. As you can see, we're using the Cache connection manager connection type as opposed to the OLE DB connection manager connection type that we're so used to. This allows us to use the data that we wrote to the Cache Transform in our previous data flow.

Next, we set the Connection to our LookupCache Connection Manager.

Once we've set our data source we can configure the lookup columns.

We join on our data warehouse BrokerNumber to retrieve the Salesforce Broker Id that we alias as Broker__c since this is the name of our Account object's broker field.

As you can see from this post, using the Pragmatic Works Task Factory Salesforce Source component and the out-of-the-box Cache Transform component, it is possible to populate a Lookup Transformation with Salesforce data even though the Lookup Transformation does not directly support a Salesforce Connection Manager.