SQL Server: Hiding your prod data and built-in function STUFF()

If you need to use production data for your testing and don’t want to expose any sensitive data on your development environment, you should look at obfuscating them.  This article shows the common obfuscation methods in use: character scrambling, repeating character masking, numeric variance, nulling, artificial data generation, truncating, encoding, and aggregating.

Also, if you are looking for a function in SQL Server that pretty much does like the function REPLACE() but only replaces one substring in a specific position and length, then STUFF() might be the one.

Nested Transactions in SQL Server

If you have a stored proc that executes a bunch of SQL statements inside a transaction because they are all meant to be executed as one atomic transaction and needs to be executed as quick as possible to avoid blocking too long others who want to call this stored proc, then you need to make sure this stored proc is not nested in another transaction.  Because nested transactions in SQL server is different than the nesting concept that you might have been accustomed to in a programming language, say in C#.

-- Say for example you need to write to a field in a table
CREATE TABLE [dbo].[TestTable](
	[TestField1] [nchar](10) NULL
) ON [PRIMARY]
GO

-- And it needs to be in a transaction, so you created a stored proc below
CREATE PROCEDURE [dbo].[usp_TestInnerTrans] 
AS
BEGIN
	BEGIN TRANSACTION
		INSERT INTO [dbo].[TestTable] ([TestField1]) VALUES ('Value1')
	COMMIT TRANSACTION
END
GO

-- When you call this stored proc directly, say using EXEC, 
--  you can get at most a 1 sec exec time

-- Now what if you have another stored proc that calls usp_TestInnerTrans 
--  nested inside another transaction

-- Below is that other stored proc and let's say it takes 10 secs to finish 
--  and commit the outer transaction
CREATE PROCEDURE [dbo].[usp_TestOuterTrans] 
AS
BEGIN
	BEGIN TRANSACTION
	EXEC usp_TestInnerTrans
	WAITFOR DELAY '00:00:10'
	COMMIT TRANSACTION
END
GO

-- What happens is that any other process that calls usp_TestInnerTrans 
--  will block and will wait until usp_TestOuterTrans finishes, 
--  which is 10 secs and not 1 sec.

As you can see inner transactions are ignored and the outermost transaction becomes the only one true transaction.  SQL Server Transaction Locking and Row Versioning Guide has a section on nested transactions that explains it better.  It's a great article so I recommend reading it from start to finish.

SQL Server: My Quick and Dirty Way of Debugging Stored Procedure

I use the below SQL script to debug a stored procedure in chunks, replacing or adding into it the rest of the stored procedure code until every code checks out fine.  Since I use transactions, changes are temporary and it rolls back the transaction at the end.  I also used the error handling in SQL Server to catch and print the error line, error number and error message.

BEGIN TRANSACTION
BEGIN TRY
	--
	-- your SQL code here
	--
END TRY
BEGIN CATCH
	PRINT 'Error at line # ' + CAST(ERROR_LINE() AS VARCHAR(MAX)) 
		+ ': ' + CAST(ERROR_NUMBER() AS VARCHAR(MAX)) 
		+ ' - ' + ERROR_MESSAGE()
END CATCH
IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION

SQL Server: Schemas, DELETE vs. TRUNCATE, Generate INSERTs, and NOT IN vs. NOT EXISTS

Schemas

Data Organization Using Schemas best explains with clear examples what schema in SQL Server is.  Important things to note are:

  • Schemas were first introduced in SQL Server 2005.
  • dbo in pre SQL Server 2005 was used as the default database owner but now it is used as the default schema.
  • Schema is analogous to namespace (as in namespace in C#), or a container used to store database objects.
  • Schema owner can be a Windows domain login, Windows local login, SQL Server login, Windows group, database role, server role, or application role.
  • Schemas are simpler to manage in terms of permissions and security.
  • Lastly, schemas can provide logical boundaries without the need to create multiple physical databases.
  • Additional readings:

 

DELETE vs. TRUNCATE

Sometimes deleting so many rows in a SQL Server table takes up a loooong time.  If rolling back data does not matter, use TRUNCATE instead.

TRUNCATE TABLE dbo.MyTable

 

Generate INSERTs

I used to use a script I got from the Internet that generates INSERT statements from a table so I can repopulate the table next time with same data, or populate a similar table in another database for example.  Recently I found another alternative in SQL Server Management Studio with the Generate Scripts… database task.  Just right-click on the database where the table you want to generate INSERTs from is, go to Tasks, and click Generate Scripts….  Go over the steps and make sure on the scripting options, by clicking the Advanced button, that you select Data only for the Types of data to script under the General options.  This will generate the INSERT statements.

 

NOT IN vs. NOT EXISTS

SELECT ProductID
FROM Products
WHERE ProductID NOT IN
(
	SELECT ProductID
	FROM OrderDetails
)

-- versus

SELECT ProductID
FROM Products p
WHERE NOT EXISTS
(
   	SELECT 1
	FROM OrderDetails od
	WHERE od.ProductID = p.ProductID
)

 

I am not talking about the performance difference between these two SQL operations but rather the functional difference between them.  It might be tempting to say they are the same but it is not.  NOT IN will not behave as one expects when the column used for comparison in the subquery contains NULL values (OrderDetails.ProductID in the example above).  The article NOT EXISTS vs NOT IN explains why.