2016-11-30

TSQL Recipes: Generate Range from Granularity Values

Recently I am working on a few reporting tasks. Because the raw data from the data source is not formatted for reporting purpose, I have to prepare quite a lot scripts to transform the data. So in next few posts I will demonstrate some of them, hopefully they can help you in some ways.

Today's topic is about time range. So basically what happened is the data saved from front end application is in granularity level, but we need to combine then to form data range format. For example, John Smith works at 01:00 - 02:00, 02:00 - 03:00, 05:00 - 06:00, the output needs to be John Smith: 01:00 - 03:00, 05:00 - 06:00.

To demonstrate the script, firstly let's create a table variable to host some dummy data

use tempdb
go

--create a table variable to host dummy data
declare @tbl table (UserKey varchar(10), ScheduleDate date, StartTime time(0), EndTime time(0))

insert into @tbl
select 'JSmith', '20161201', '00:00', '01:00'
union all
select 'JSmith', '20161201', '01:00', '02:00'
union all
select 'JSmith', '20161201', '02:00', '03:00'
union all
select 'JSmith', '20161201', '04:00', '05:00'
union all
select 'JSmith', '20161201', '05:00', '06:00'
union all
select 'JSmith', '20161201', '06:00', '07:00'
union all
select 'JSmith', '20161201', '07:00', '08:00'
union all
select 'JSmith', '20161201', '17:00', '18:00'
union all
select 'JSmith', '20161201', '18:00', '19:00'
union all
select 'JSmith', '20161201', '19:00', '20:00'
union all
select 'JSmith', '20161202', '09:00', '10:00'
union all
select 'JSmith', '20161202', '10:00', '11:00'
union all
select 'JSmith', '20161202', '11:00', '12:00'
union all
select 'JSmith', '20161202', '13:00', '14:00'

select * from @tbl


What we have now is a batch of dummy data like this:


Now time to transform the dummy data into range form. Firstly we can use a left self join to get begin time for each range:

--find the start time, lets call it TS
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
   and t1.ScheduleDate = t2.ScheduleDate 
   and DATEADD(HOUR, -1, t1.StartTime) = t2.StartTime 
where t2.UserKey is null


Now we do the same left self join, but this time change t1 and t2 position. So what we have is the end time for each range

--find the end time, lets call it TE 
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
   and t1.ScheduleDate = t2.ScheduleDate 
   and t1.StartTime = DATEADD(HOUR, -1, t2.StartTime)
where t2.UserKey is null


If we run these two queries, we get below results:



From the result it is very clear what we need to do next:
  • Join these two result sets by UserKey and ScheduleDate
  • Use StartTime From ST result set as the begin time of the range
  • From all StartTime From ET result set which are greater than the StartTime of ST result set, we use the first EndTime as the end time of the range
Well, I guess you are confused. So I make life simple for you :p

;with TS as
(
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
  and t1.ScheduleDate = t2.ScheduleDate 
  and DATEADD(HOUR, -1, t1.StartTime) = t2.StartTime 
where t2.UserKey is null
),
TE as
(
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
  and t1.ScheduleDate = t2.ScheduleDate 
  and t1.StartTime = DATEADD(HOUR, -1, t2.StartTime)
where t2.UserKey is null
)

select ts.UserKey, ts.ScheduleDate, ts.StartTime, MIN(te.EndTime) as EndTime
from TS join TE on ts.UserKey = te.UserKey and ts.ScheduleDate = te.ScheduleDate and ts.StartTime < te.StartTime
group by ts.UserKey, ts.ScheduleDate, ts.StartTime


Run the script and we get what we want. Enjoy.


No comments :

Post a Comment