2016-05-20

T-SQL: Tally Table

Recently a question from my colleague made me want to write something about tally table – it is not formally documented in MSDN, but you can find lots of references if you google this key word.

Firstly let’s see the differences between below two queries. All of these two queries are to count 1,000,000 rows and insert into a temporary table:
The first query:
 ; with cte as  
 (  
 select 1 as n  
 union all  
 select n + 1  
 from cte   
 where n < 1000000  
 )  
 select * from cte option(maxrecursion 0)  
 go  


The second query:
 ;with tally as  
 (  
 select ROW_NUMBER() over (order by (select null)) as n  
 from (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as a(n)  
 cross join   
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as b(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as c(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as d(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as e(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as f(n)  
 )  
 select * from tally  
 go  


If we turn on Statistics IO and Statistics Time, we will see these two queries make a big difference (result can be various depends on the environment when you run the queries)

Statistics IO - CTE


Statistics IO - Tally table


Statistics Time - CTE


Statistics Time - Tally table


Actually, in a production environment I would always create one single column table with sequence number populated. This kind of number table can do lots of help when you need to do certain loop task. But in a real world scenario, we do have ad hoc situations, for these situations, tally table will be the friend.

Here is a real scenario when applying tally table. Imagine we have an aggregated purchase history table, in which we have customer details and purchased item amount. Now we need to print out customer details with their purchased item row by row – that is, if purchased amount is X, we need to print out X rows with the item sequence 1 through X.

To demonstrate, run below script to have the sample data

 declare @t table  
 (  
 Amount int,  
 Fname varchar(50),  
 LastName varchar(50)  
 )  
 insert into @t   
 values  
 (3, 'J', 'Smith'),  
 (2, 'M', 'Tse'),  
 (5, 'I', 'Mila'),  
 (1, 'C', 'Qi')  


For the table above, we want final output looks like:



Now think about what we can do:
  • Cursor: well, you know you don’t want to touch it.
  • While loop: essentially it is a cursor
  • Recursive CTE: Yes it works
  • And Tally table: a much better solution.


I won’t go through cursor and while loop, but below is recursive CTE and Tally table solutions. When the size of table becomes large, you can try these two approaches and see how big the difference it could be.

CTE
 ;with t1 as   
 (  
 select Amount, Fname, lname, ROW_NUMBER() over (partition by fname, lname order by amount) cnt  
 from @t t  
 )  
 , t2 as  
 (  
 select amount, Fname, LName  
 from t1   
 where cnt = 1  
 union all  
 select amount - 1, Fname, LName  
 from t2  
 where Amount > 1  
 )  
 select * from t2  
 order by 2, 3  


Tally table
 ;with tally as  
 (  
 select ROW_NUMBER() over (order by (select null)) as n  
 from (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as a(n)  
 cross join   
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as b(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as c(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as d(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as e(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as f(n)  
 )  
 select tt.n, t.Fname, t.LName  
 from @t t left join tally tt on t.Amount >= tt.n  
 order by 2, 3  


So now we see how tally table can help us in a real business scenario. Enjoy the script :)

2016-05-16

SSIS Custom Control Flow Task - RS Execution Task

Long time ago I blogged the topic of how to create custom control flow component in SSIS.  In those posts I demonstrated how to build the control flow task by a very simple component but without any execution logic.


Recently I started to upgrade my old jobs from 2012 to 2014 version. So to demonstrate how to design the execution logic, I uploaded the source code of my report execution task onto the Github. You can access the C# project here.


The whole project was created by following the same routine I posted at part1, part2, part3, and part4. So hopefully it can give you some general ideas when you want to create your own control flow task. You do not need to follow steps listed in my posts: in a real world scenario, you can create the task by using just one class library project, simply add Winform into the project.


Feel free to modify the source code of the project if you want to adopt it in your own scenario: just follow the instruction in the readme and build the project into your SSIS task folder:


If you are using other version of SQL Server, e.g. 2008R2 or 2012, you can still use it. Because the task is built on top of ReportService2010 and ReportExecution2005 web services, you can build you own version by referencing correct Microsof.SqlServer.ManagedDTS and Microsoft.SqlServer.Dts.Design. By default installation, you should find them at the SDK folder: ...\Microsoft SQL Server\<SQL version>\SDK\Assemblies\.


Unfortunately there is no simple solution to convert the task to support SSRS2005 or SSRS2008, because they are built based on ReportingService2005/2006 web services. So to get the task fully support these two versions, you have to rebuild the web service reference and fix the conflict in the project.


One last word is, please always do the test before deploy it into your production, it is at your own risk if you decide to use it in your production environment.