2016-05-20

T-SQL: Tally Table

Recently a question from my colleague made me want to write something about tally table – it is not formally documented in MSDN, but you can find lots of references if you google this key word.

Firstly let’s see the differences between below two queries. All of these two queries are to count 1,000,000 rows and insert into a temporary table:
The first query:
 ; with cte as  
 (  
 select 1 as n  
 union all  
 select n + 1  
 from cte   
 where n < 1000000  
 )  
 select * from cte option(maxrecursion 0)  
 go  


The second query:
 ;with tally as  
 (  
 select ROW_NUMBER() over (order by (select null)) as n  
 from (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as a(n)  
 cross join   
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as b(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as c(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as d(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as e(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as f(n)  
 )  
 select * from tally  
 go  


If we turn on Statistics IO and Statistics Time, we will see these two queries make a big difference (result can be various depends on the environment when you run the queries)

Statistics IO - CTE


Statistics IO - Tally table


Statistics Time - CTE


Statistics Time - Tally table


Actually, in a production environment I would always create one single column table with sequence number populated. This kind of number table can do lots of help when you need to do certain loop task. But in a real world scenario, we do have ad hoc situations, for these situations, tally table will be the friend.

Here is a real scenario when applying tally table. Imagine we have an aggregated purchase history table, in which we have customer details and purchased item amount. Now we need to print out customer details with their purchased item row by row – that is, if purchased amount is X, we need to print out X rows with the item sequence 1 through X.

To demonstrate, run below script to have the sample data

 declare @t table  
 (  
 Amount int,  
 Fname varchar(50),  
 LastName varchar(50)  
 )  
 insert into @t   
 values  
 (3, 'J', 'Smith'),  
 (2, 'M', 'Tse'),  
 (5, 'I', 'Mila'),  
 (1, 'C', 'Qi')  


For the table above, we want final output looks like:



Now think about what we can do:
  • Cursor: well, you know you don’t want to touch it.
  • While loop: essentially it is a cursor
  • Recursive CTE: Yes it works
  • And Tally table: a much better solution.


I won’t go through cursor and while loop, but below is recursive CTE and Tally table solutions. When the size of table becomes large, you can try these two approaches and see how big the difference it could be.

CTE
 ;with t1 as   
 (  
 select Amount, Fname, lname, ROW_NUMBER() over (partition by fname, lname order by amount) cnt  
 from @t t  
 )  
 , t2 as  
 (  
 select amount, Fname, LName  
 from t1   
 where cnt = 1  
 union all  
 select amount - 1, Fname, LName  
 from t2  
 where Amount > 1  
 )  
 select * from t2  
 order by 2, 3  


Tally table
 ;with tally as  
 (  
 select ROW_NUMBER() over (order by (select null)) as n  
 from (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as a(n)  
 cross join   
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as b(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as c(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as d(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as e(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as f(n)  
 )  
 select tt.n, t.Fname, t.LName  
 from @t t left join tally tt on t.Amount >= tt.n  
 order by 2, 3  


So now we see how tally table can help us in a real business scenario. Enjoy the script :)

No comments :

Post a Comment