2016-12-20

TSQL Recipes: Distributing Time by Interval

This should be the last post of 2016, so before we start: Merry Christmas and Happy New Year 😊

In the previous post we built a time range by merging time segments. Today let's look at another typical business scenario: distributing time across different slots.

As usual, to keep the concept simple, let's look at the screenshot below. What I have here is a two-column result set: the first column is a datetime column indicating when the action started, and the second column shows the total duration in seconds from that start time.



Saving the data in such a format is perfectly fine. But what if the end user wants to do an analysis by hour, something like below?



Or even better, flexible output driven by a time interval setting (this time the interval is 15 minutes)?



So that is today's topic. First, let's declare a few variables:

 declare @BeginDateTime datetime = '20161201 00:02:06' -- begin datetime
 declare @Duration int = 17894 -- seconds
 declare @Interval int = 10 --minutes
 declare @Result TABLE 
 (
 BeginDateTime datetime,
 Duration int,
 OutputTimeWindow datetime,
 SecondsLeft int,
 SecondsInWindow int
 )


Hopefully the declarations above make sense to you. If not: basically we take @BeginDateTime as an anchor, split the duration into intervals, and write the output into the result table.

After the declarations, we need to derive a few helper variables. The logic is simple once we break the procedure down step by step:

declare @BaseDate datetime = CAST(CAST(@BeginDatetime as date) as datetime)  --base date derived from @BeginDateTime
declare @MinutesToBaseDate  int = DATEDIFF(MINUTE, @BaseDate, @BeginDateTime) --minutes between @BeginDateTime and the base date
declare @AmountOfIntervals int = @MinutesToBaseDate / @Interval --how many whole intervals have elapsed


Starting from here, we can build the initial time window:

declare @DtWindow datetime = DATEADD(MINUTE, @Interval * @AmountOfIntervals, @BaseDate)
declare @NextDtWindow datetime = dateadd(MINUTE, @interval * (@AmountOfIntervals + 1), @BaseDate)
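
With the sample values declared above ('20161201 00:02:06' and a 10-minute interval), the derived variables can be checked with a quick SELECT run in the same batch; the expected values in the comments are what the expressions should produce:

```sql
-- Sanity check of the derived variables for the sample inputs.
-- @MinutesToBaseDate: midnight to 00:02:06 is 2 whole minutes.
-- @AmountOfIntervals: 2 / 10 = 0, so the first bucket starts at midnight.
select @BaseDate          as BaseDate          -- 2016-12-01 00:00:00
     , @MinutesToBaseDate as MinutesToBaseDate -- 2
     , @AmountOfIntervals as AmountOfIntervals -- 0
     , @DtWindow          as DtWindow          -- 2016-12-01 00:00:00
     , @NextDtWindow      as NextDtWindow      -- 2016-12-01 00:10:00
```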


To keep the logic simple, I do an initial check to see whether the first time window is large enough to hold the whole duration:

IF DATEADD(SECOND, @Duration, @BeginDatetime) < @NextDtWindow 
BEGIN
  insert into @Result(BeginDateTime, Duration , OutputTimeWindow, SecondsLeft, SecondsInWindow)
  select @BeginDatetime, @Duration, DATEADD(MINUTE, @Interval * @AmountOfIntervals, @BaseDate), 0, @Duration
END


Now here is the important part: what if the initial time window is not big enough? What I did was create a recursive CTE that deducts the duration window by window:

;with cte as 
(
select @BeginDatetime as BeginDatetime, @Duration as Duration
    , @DtWindow as DtWindow --initial bucket (same value as the longer DATEADD expression)
    , @Duration as SecondsLeft
    , DATEDIFF(SECOND, @BeginDatetime, @NextDtWindow) as SecondsInWindow
union all
select BeginDatetime, Duration
    , DATEADD(MINUTE, @Interval, DtWindow)
    , SecondsLeft - SecondsInWindow
    , IIF(SecondsLeft - SecondsInWindow > @Interval * 60, @Interval * 60, SecondsLeft - SecondsInWindow)
from cte
where SecondsLeft - SecondsInWindow > 0
)

insert into @Result(BeginDateTime, Duration , OutputTimeWindow, SecondsLeft, SecondsInWindow)
select BeginDatetime, Duration, DtWindow, SecondsLeft, SecondsInWindow
from cte
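
One caveat worth noting: by default a recursive CTE errors out after 100 recursion levels. With the sample values (17,894 seconds in 10-minute buckets, about 30 windows) we stay well under that, but a long duration combined with a small interval could exceed it. In an ad-hoc script like this one, the limit can be lifted by appending a query hint to the statement that consumes the CTE, as in this sketch:

```sql
insert into @Result(BeginDateTime, Duration, OutputTimeWindow, SecondsLeft, SecondsInWindow)
select BeginDatetime, Duration, DtWindow, SecondsLeft, SecondsInWindow
from cte
option (maxrecursion 0) -- 0 removes the default 100-level cap
```

Note that query hints are not allowed inside a table-valued function, so for the function version at the end of this post the hint would have to be placed on the outer query that calls the function instead.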


This is all we need. At the end of this post there is a table-valued function I created by merging all the steps listed above. You can call the function like this:

declare @t table (DT datetime, Duration int)
insert into @t(DT, Duration)
values
('2016-12-01 00:02:06', 17894),
('2016-12-01 05:00:20', 3482),
('2016-12-01 18:00:08', 920),
('2016-12-01 18:15:28', 23)

declare @interval int = 30 --half hour

select x.*
from @t t cross apply dbo.ufnDistributeTimeByInterval(@interval, t.DT, t.Duration) x
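
As a quick sanity check (not part of the original recipe), the SecondsInWindow values distributed for each row should add back up to that row's original duration. A query like the following should return TotalDistributed equal to Duration on every row:

```sql
select t.DT, t.Duration, sum(x.SecondsInWindow) as TotalDistributed
from @t t
 cross apply dbo.ufnDistributeTimeByInterval(@interval, t.DT, t.Duration) x
group by t.DT, t.Duration
```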




Do some tests by changing the @interval variable value. Enjoy, and see you next year.



CREATE function [dbo].[ufnDistributeTimeByInterval](@Interval int, @BeginDateTime datetime, @Duration int)
RETURNS @Result TABLE 
(
 BeginDateTime datetime,
 Duration int,
 OutputTimeWindow datetime,
 SecondsLeft int,
 SecondsInWindow int
)
AS
BEGIN

 declare @BaseDate datetime = CAST(CAST(@BeginDatetime as date) as datetime)  --base date derived from @BeginDateTime
 declare @MinutesToBaseDate  int = DATEDIFF(MINUTE, @BaseDate, @BeginDateTime) --minutes between @BeginDateTime and the base date
 declare @AmountOfIntervals int = @MinutesToBaseDate / @Interval --how many whole intervals have elapsed

 declare @DtWindow datetime = DATEADD(MINUTE, @Interval * @AmountOfIntervals, @BaseDate)
 declare @NextDtWindow datetime = dateadd(MINUTE, @interval * (@AmountOfIntervals + 1), @BaseDate)
 
 IF DATEADD(SECOND, @Duration, @BeginDatetime) < @NextDtWindow 
 BEGIN
  insert into @Result(BeginDateTime, Duration , OutputTimeWindow, SecondsLeft, SecondsInWindow)
  select @BeginDatetime, @Duration, DATEADD(MINUTE, @Interval * @AmountOfIntervals, @BaseDate), 0, @Duration
 END
 ELSE
 BEGIN
  ;with cte as 
  (
   select @BeginDatetime as BeginDatetime, @Duration as Duration
     , @DtWindow as DtWindow --initial bucket (same value as the longer DATEADD expression)
    , @Duration as SecondsLeft
    , DATEDIFF(SECOND, @BeginDatetime, @NextDtWindow) as SecondsInWindow
   union all
   select BeginDatetime, Duration
    , DATEADD(MINUTE, @Interval, DtWindow)
    , SecondsLeft - SecondsInWindow
    , IIF(SecondsLeft - SecondsInWindow > @Interval * 60, @Interval * 60, SecondsLeft - SecondsInWindow)
   from cte
   where SecondsLeft - SecondsInWindow > 0
  )

  insert into @Result(BeginDateTime, Duration , OutputTimeWindow, SecondsLeft, SecondsInWindow)
  select BeginDatetime, Duration, DtWindow, SecondsLeft, SecondsInWindow
  from cte

 END --end of if else

 RETURN

END --end of function

2016-11-30

TSQL Recipes: Generate Range from Granularity Values

Recently I have been working on a few reporting tasks. Because the raw data from the data source is not shaped for reporting purposes, I have had to prepare quite a lot of scripts to transform it. So in the next few posts I will demonstrate some of them; hopefully they can help you in some way.

Today's topic is about time ranges. Basically, the data saved by the front-end application is at granularity level, but we need to combine the rows into range form. For example, if John Smith works 01:00 - 02:00, 02:00 - 03:00, and 05:00 - 06:00, the output needs to be John Smith: 01:00 - 03:00, 05:00 - 06:00.

To demonstrate the script, let's first create a table variable to hold some dummy data:

use tempdb
go

--create a table variable to host dummy data
declare @tbl table (UserKey varchar(10), ScheduleDate date, StartTime time(0), EndTime time(0))

insert into @tbl (UserKey, ScheduleDate, StartTime, EndTime)
values
 ('JSmith', '20161201', '00:00', '01:00'),
 ('JSmith', '20161201', '01:00', '02:00'),
 ('JSmith', '20161201', '02:00', '03:00'),
 ('JSmith', '20161201', '04:00', '05:00'),
 ('JSmith', '20161201', '05:00', '06:00'),
 ('JSmith', '20161201', '06:00', '07:00'),
 ('JSmith', '20161201', '07:00', '08:00'),
 ('JSmith', '20161201', '17:00', '18:00'),
 ('JSmith', '20161201', '18:00', '19:00'),
 ('JSmith', '20161201', '19:00', '20:00'),
 ('JSmith', '20161202', '09:00', '10:00'),
 ('JSmith', '20161202', '10:00', '11:00'),
 ('JSmith', '20161202', '11:00', '12:00'),
 ('JSmith', '20161202', '13:00', '14:00')
select * from @tbl


What we have now is a batch of dummy data like this:


Now it's time to transform the dummy data into range form. First, we use a left self join to find the begin time of each range:

--find the start times; let's call this result set TS
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
   and t1.ScheduleDate = t2.ScheduleDate 
   and DATEADD(HOUR, -1, t1.StartTime) = t2.StartTime 
where t2.UserKey is null


Now we do the same left self join, but with the positions of t1 and t2 swapped. This gives us the end time of each range:

--find the end times; let's call this result set TE
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
   and t1.ScheduleDate = t2.ScheduleDate 
   and t1.StartTime = DATEADD(HOUR, -1, t2.StartTime)
where t2.UserKey is null


If we run these two queries, we get the results below:



From the results it is very clear what we need to do next:
  • Join the two result sets by UserKey and ScheduleDate
  • Use the StartTime from the TS result set as the begin time of the range
  • Of all the rows in the TE result set whose StartTime is greater than or equal to the TS StartTime, use the earliest EndTime as the end time of the range
Well, I guess you are confused. So let me make life simple for you :p

;with TS as
(
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
  and t1.ScheduleDate = t2.ScheduleDate 
  and DATEADD(HOUR, -1, t1.StartTime) = t2.StartTime 
where t2.UserKey is null
),
TE as
(
select t1.*
from @tbl t1 
 left join @tbl t2 on t1.UserKey = t2.UserKey 
  and t1.ScheduleDate = t2.ScheduleDate 
  and t1.StartTime = DATEADD(HOUR, -1, t2.StartTime)
where t2.UserKey is null
)

select ts.UserKey, ts.ScheduleDate, ts.StartTime, MIN(te.EndTime) as EndTime
from TS join TE on ts.UserKey = te.UserKey and ts.ScheduleDate = te.ScheduleDate and ts.StartTime <= te.StartTime -- <= so a single-slot range can match its own end time
group by ts.UserKey, ts.ScheduleDate, ts.StartTime


Run the script and we get what we want. Enjoy.
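
As a side note, this is the classic "gaps and islands" problem, and the same merge can also be done in a single pass with a window function (an alternative to the recipe above, not part of it). Within one contiguous run of slots, the hour offset since midnight and the row number grow in step, so their difference is constant and identifies the run. A sketch against the same @tbl variable:

```sql
;with x as
(
 select UserKey, ScheduleDate, StartTime, EndTime
     -- constant within one contiguous run of hourly slots
     , DATEDIFF(HOUR, cast('00:00' as time(0)), StartTime)
       - ROW_NUMBER() over (partition by UserKey, ScheduleDate order by StartTime) as Island
 from @tbl
)
select UserKey, ScheduleDate, MIN(StartTime) as StartTime, MAX(EndTime) as EndTime
from x
group by UserKey, ScheduleDate, Island
order by UserKey, ScheduleDate, MIN(StartTime)
```

Like the self-join version, this assumes the slots are exactly one hour wide; for a different granularity the DATEDIFF unit would need to change accordingly.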


2016-11-20

G-NAF Bulk Import Script: build an Australia national address database

Address verification has never been a simple topic. It takes quite a lot of resources to achieve an acceptable result. Previously we tried to use the Australia PAF file for verification purposes. However, it was not as good as we expected: the PAF file is built for parcel delivery, so, for instance, you can identify a P.O. box through the PAF file, but that cannot be treated as a residential address.

Now the good news is that, after years and years of negotiation, PSMA finally opened its G-NAF (geocoded national address file) to the public this year. Additionally, PSMA has confirmed that it will keep the G-NAF refreshed every 3 months (May, August, November, and so on). But because data.gov.au is being rebuilt, the latest version I could find is the August release. Well, the purpose of this post is to help you import the address data into SQL Server quickly, so if you want to know more about G-NAF, simply go to their website.

The G-NAF package downloaded from data.gov.au includes the T-SQL script for database setup as well as the script for constraint setup. However, importing the raw data can be a bit of trouble due to the number of tables and raw data files.



To make life easier, you can try the bulk load statements below. I generated them by looping over tables and file names; all you need to do is replace "{G-NAF Folder}" with your own folder path. Hope they save you a bit of time. Enjoy.
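
If you ever need to regenerate these statements (say, for a newer G-NAF release), a cross join over the state codes and table names can produce them all at once. This is a sketch of the approach, with both lists taken from the file names in the generated statements:

```sql
declare @sql nvarchar(max) = N'';

-- build one BULK INSERT statement per (state, table) combination
select @sql = @sql
    + N'bulk insert G_NAF.dbo.' + t.TableName
    + N' From ''{G-NAF Folder}\' + s.StateCode + N'_' + t.TableName + N'_psv.psv'''
    + N' with(FieldTerminator = ''|'', RowTerminator = ''\n'', FirstRow = 2);'
    + CHAR(13) + CHAR(10)
from (values ('ACT'), ('NSW'), ('NT'), ('OT'), ('QLD'), ('SA'), ('TAS'), ('VIC'), ('WA')) s(StateCode)
cross join (values ('ADDRESS_ALIAS'), ('ADDRESS_DEFAULT_GEOCODE'), ('ADDRESS_DETAIL'),
    ('ADDRESS_MESH_BLOCK_2011'), ('ADDRESS_SITE_GEOCODE'), ('ADDRESS_SITE'),
    ('LOCALITY_ALIAS'), ('LOCALITY_NEIGHBOUR'), ('LOCALITY_POINT'), ('LOCALITY'),
    ('MB_2011'), ('PRIMARY_SECONDARY'), ('STATE'), ('STREET_LOCALITY_ALIAS'),
    ('STREET_LOCALITY_POINT'), ('STREET_LOCALITY')) t(TableName);

print @sql; -- note: PRINT truncates long text; SELECT @sql if you need the full script
```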


 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\ACT_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\ACT_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\ACT_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\ACT_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\ACT_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\ACT_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\ACT_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\ACT_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\ACT_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\ACT_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\ACT_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\ACT_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\ACT_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\ACT_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\ACT_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\ACT_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\NSW_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\NSW_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\NSW_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\NSW_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\NSW_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\NSW_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\NSW_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\NSW_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\NSW_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\NSW_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\NSW_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\NSW_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\NSW_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\NSW_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\NSW_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\NSW_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\NT_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\NT_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\NT_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\NT_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\NT_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\NT_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\NT_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\NT_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\NT_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\NT_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\NT_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\NT_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\NT_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\NT_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\NT_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\NT_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\OT_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\OT_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\OT_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\OT_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\OT_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\OT_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\OT_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\OT_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\OT_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\OT_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\OT_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\OT_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\OT_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\OT_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\OT_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\OT_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\QLD_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\QLD_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\QLD_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\QLD_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\QLD_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\QLD_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\QLD_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\QLD_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\QLD_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\QLD_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\QLD_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\QLD_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\QLD_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\QLD_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\QLD_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\QLD_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\SA_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\SA_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\SA_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\SA_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\SA_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\SA_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\SA_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\SA_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\SA_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\SA_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\SA_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\SA_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\SA_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\SA_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\SA_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\SA_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\TAS_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\TAS_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\TAS_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\TAS_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\TAS_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\TAS_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\TAS_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\TAS_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\TAS_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\TAS_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\TAS_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\TAS_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\TAS_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\TAS_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\TAS_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\TAS_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\VIC_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\VIC_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\VIC_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\VIC_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\VIC_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\VIC_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\VIC_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\VIC_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\VIC_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\VIC_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\VIC_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\VIC_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\VIC_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\VIC_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\VIC_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\VIC_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_ALIAS From '{G-NAF Folder}\WA_ADDRESS_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DEFAULT_GEOCODE From '{G-NAF Folder}\WA_ADDRESS_DEFAULT_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_DETAIL From '{G-NAF Folder}\WA_ADDRESS_DETAIL_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_MESH_BLOCK_2011 From '{G-NAF Folder}\WA_ADDRESS_MESH_BLOCK_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE_GEOCODE From '{G-NAF Folder}\WA_ADDRESS_SITE_GEOCODE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.ADDRESS_SITE From '{G-NAF Folder}\WA_ADDRESS_SITE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_ALIAS From '{G-NAF Folder}\WA_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_NEIGHBOUR From '{G-NAF Folder}\WA_LOCALITY_NEIGHBOUR_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY_POINT From '{G-NAF Folder}\WA_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.LOCALITY From '{G-NAF Folder}\WA_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.MB_2011 From '{G-NAF Folder}\WA_MB_2011_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.PRIMARY_SECONDARY From '{G-NAF Folder}\WA_PRIMARY_SECONDARY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STATE From '{G-NAF Folder}\WA_STATE_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_ALIAS From '{G-NAF Folder}\WA_STREET_LOCALITY_ALIAS_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY_POINT From '{G-NAF Folder}\WA_STREET_LOCALITY_POINT_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
 bulk insert G_NAF.dbo.STREET_LOCALITY From '{G-NAF Folder}\WA_STREET_LOCALITY_psv.psv' with(FieldTerminator = '|', RowTerminator = '\n', FirstRow = 2);  
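The statements above repeat one pattern per state prefix and table, so the whole batch can be generated rather than hand-written. A small Python sketch — the table list here is abbreviated for illustration, and '{G-NAF Folder}' is the same placeholder used in the statements above:

```python
# Generate the repetitive BULK INSERT statements, one per state prefix and table.
TABLES = ["ADDRESS_ALIAS", "ADDRESS_DETAIL", "LOCALITY", "STATE"]  # abbreviated; extend with the full G-NAF table list
STATES = ["VIC", "WA"]

def bulk_insert_sql(state: str, table: str, folder: str = "{G-NAF Folder}") -> str:
    # Mirrors the statement pattern used above, including the psv naming convention.
    return (
        f"bulk insert G_NAF.dbo.{table} "
        f"From '{folder}\\{state}_{table}_psv.psv' "
        "with(FieldTerminator = '|', RowTerminator = '\\n', FirstRow = 2);"
    )

script = "\n".join(bulk_insert_sql(s, t) for s in STATES for t in TABLES)
print(script)
```

Paste the generated batch into SSMS (after substituting the real folder) rather than maintaining hundreds of near-identical lines by hand.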

2016-10-04

Personal Template in SQL Server Management Studio

It has been a while since my last post — there are so many jobs that need to be done during this busy season. When we are busy, we need to find ways to improve our productivity, or simply free ourselves from repetitive jobs. So today let’s look at a very useful feature in SSMS: templates.

When we have SSMS installed, standard system templates are available by default. You can open the Template Explorer by using the hot key “Ctrl + Alt + T”, or via the menu View -> Template Explorer.



Applying a template is quite simple: find the template you want to use, double-click it, and then replace the variables in the template. Let’s see an example:

Assume I want to create a new function by applying the function template. To do so, I double click the “Create Inline Function (new menu)” template under the Function category.



At this stage a new query window with the default pattern should appear.



As you can see, the template pattern contains a comments section, a script section, and lots of “<>” placeholders. These are the parameters that can be quickly replaced. As specified in the instruction comments, we can use the hot key “Ctrl + Shift + M” to replace the parameters, as shown below.



After we input the values, click the OK button and the function will be populated in the query window.



This is a standard template provided in SSMS. Obviously it can save us lots of time. But we can make it even better: we can create our own parameterised template!

Firstly, we need a blank folder. In the screenshots below, I created a personal template folder by right-clicking the top-most node “SQL Server Templates”, selecting New -> Folder, and naming it “Personal Template”.



Right-click the newly created folder and this time select New -> Template. Now we have an empty template available in the folder.

To fill in our template, we need to right click the template, and then select Edit, as shown below.



Now the template should open in SSMS. As shown in the example below, to indicate a parameter we use the format “<name, type, example>”. To keep it simple, in this example I just created an “if exists, select” pattern.
 USE <database_name, sysname, master>  
 GO  
 IF EXISTS (SELECT 1 FROM sys.tables where name = '<table_name, sysname, dummy_table>')  
 BEGIN  
      SELECT * FROM <table_name, sysname, dummy_table>  
 END  
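As an aside, SSMS stores these templates as plain .sql files on disk, so a template can also be created or version-controlled programmatically. A minimal Python sketch — the SSMS template folder path varies by version and is an assumption here (e.g. a "...\SQL Server Management Studio\120\Templates\Sql" folder under %APPDATA% for SSMS 2014):

```python
import os

def save_template(template_dir: str, name: str, body: str) -> str:
    """Write an SSMS template as a plain .sql file and return its full path."""
    os.makedirs(template_dir, exist_ok=True)
    path = os.path.join(template_dir, name + ".sql")
    with open(path, "w") as f:
        f.write(body)
    return path

# The parameter placeholders use the same <name, type, example> format as above.
body = (
    "USE <database_name, sysname, master>\n"
    "GO\n"
    "IF EXISTS (SELECT 1 FROM sys.tables WHERE name = '<table_name, sysname, dummy_table>')\n"
    "BEGIN\n"
    "    SELECT * FROM <table_name, sysname, dummy_table>\n"
    "END\n"
)

# Hypothetical path -- substitute your own SSMS version's template folder:
# save_template(r"C:\...\Templates\Sql\Personal Template", "If Exists Select", body)
```

Restart Template Explorer (or refresh it) after dropping files in, and the new template should appear alongside the ones created through the UI.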




After we have done everything, save it with “Ctrl + S” or the Save command on the menu bar. Close the template query window, then double-click our personal template and press “Ctrl + Shift + M”; you should see the parameter window appear.



Fill in values for the parameters, close the popup window by pressing the OK button, and you should see the result as expected. Handy feature, isn’t it?

2016-07-05

Power BI & Telstra Wifi Hotspot API

A while ago I wrote a post showing how to call the Telstra Wi-Fi Hotspot API through C#. Today let’s see how easily we can achieve the same result (even better) by using Microsoft Power BI.


Power BI, in short, is a powerful BI tool for data extraction, transformation, and visualisation. I would highly recommend it if you haven’t tried it yet. At this stage you can sign up for free with a trial Office 365 licence.


Now back to the topic. Today my plan is to call the API using Power Query embedded in Power BI, and then display the result on the workspace. To do so, open Power BI Desktop and press the Edit Queries button on the ribbon.




In the Query Editor window (Power Query), open the dropdown menu by pressing the New Source button, then select the Blank Query option.




Now click the Advanced Editor button on the ribbon to open the editor window.




Now the Advanced Editor window pops up, and we can start to type in Power Query statements. To demonstrate how powerful Power Query is, I decided to split the procedure in two: create a function to get the OAuth token, and then call the Wi-Fi Hotspot API.


To create the function in Power Query, firstly let’s write the query below (replace the consumer key and consumer secret):




 let  
   authUrl = "https://api.telstra.com/v1/oauth/token",  
   authKey = "client_id={consumerkey}&client_secret={consumer secret}&grant_type=client_credentials&scope=WIFI",  
   getJson = Web.Contents  
       (authUrl,  
         [  
           Headers = [#"Content-Type" = "application/x-www-form-urlencoded"],  
           Content = Text.ToBinary(authKey)  
         ]  
       ),  
   jsonDoc = Json.Document(getJson)  
 in  
   jsonDoc  


Click the Done button and the OAuth result should be shown directly in the Power Query window, as shown below.




So far so good. What we need to do now is introduce parameters into the query, and then parameterise it. To do so, firstly create the new parameters.




For demonstration purposes, I treat the consumer key and consumer secret as parameters, as shown below.






Now change the query a bit; I have highlighted the changes in the screenshot attached below.




Once you have done everything and clicked the Done button, the function window appears.




Now you can test the function by entering consumer key and consumer secret then pressing the Invoke button.




The function should work as expected. Time to write the second script. Follow the same steps to open a blank query window and input the statements below.




 let  
   accessToken = fnGetToken("{consumerkey}", "{consumersecret}")[access_token],  
   latitude = "-37.8103713989",  
   longitude = "144.99530273",  
   radius = "1000",  
   wifiUrl = Text.Format("https://api.telstra.com/v1/wifi/hotspots?lat=#{0}&long=#{1}&radius=#{2}", {latitude, longitude, radius}),  
   accessTokenHeader = "Bearer " & accessToken,  
   getWifiJson = Web.Contents(wifiUrl, [Headers = [#"Authorization" = accessTokenHeader]]),  
   wifiJson = Json.Document(getWifiJson),  
   table = Table.FromList(wifiJson, Splitter.SplitByNothing(), null, null, ExtraValues.Error),  
   wifiTable = Table.ExpandRecordColumn(table, "Column1", {"lat", "long", "address", "city", "state"}, {"lat", "long", "address", "city", "state"})  
 in  
      wifiTable   


Now things can be a little bit tricky. Depending on your Power BI Desktop settings, you might get the result without any issue. But in most cases you will see two different types of warnings: a connection warning and a privacy warning. In short, this is because we obtain the OAuth token from a different URL (https://api.telstra.com/v1/oauth/token vs https://api.telstra.com/v1/wifi/hotspots). Let's see how we can resolve these conflicts:


The first warning you might see is the connection warning.




Clicking the “Edit Credential” button leads to a popup window. Because we retrieve our token by passing the consumer key and consumer secret into the function, we can simply select the Anonymous method and then select the URL https://api.telstra.com/v1/oauth/token as the target.




The next window is about data privacy.




On the Privacy Level screen, select the Url https://api.telstra.com/v1/oauth/token, and then select “Public” as the scope.




After all these steps, the data table should be generated automatically




Now we can go back to Power BI by clicking the “Close & Apply” button at the top left corner. Drag a map visual onto the workspace, drop the Lat and Long columns onto the corresponding map fields, and that is it.


2016-06-14

Telstra WiFi API Quick Demonstration

Update on 16/06/2016: VS 2015 project for this post can be accessed on my personal Github repository.


It has been a while since I blogged about the SMS API published by Telstra (you can find it here). After that post I spent quite a long time blogging about other stuff. Recently I revisited Telstra's Dev Centre and found this interesting Wi-Fi API. So let's try this API today.




Basically, the API accepts latitude, longitude, and radius as query parameters, and returns a list of available Telstra Wi-Fi hotspots around those coordinates.


The first step to use this API is to get the API key. Similar to the steps I described in my SMS post, it should be quite simple to get a new app created. Just be aware that the product to be ticked is the WiFi API, as demonstrated below.






After the registration, expand the app panel by pressing its name, and you should see that the consumer key and consumer secret are now available.






Time to look at a C# implementation of the API.


I created a WinForms project in Visual Studio with a simple form: three textboxes to accept latitude, longitude, and radius; a Calculate button to call the Wi-Fi API; and finally a list box to display the addresses of all returned Wi-Fi hotspots (keep in mind that at this stage the API returns a maximum of 10 hotspots).






Before starting the coding task, I added two references to my project: one is System.Net.Http, which is used to issue the request and receive the response; the other has been heavily used in my previous posts: Newtonsoft.Json. It helps us parse the JSON string returned from the API call.


One thing we can reuse from the previous SMS post is the authentication method: pass in the consumer key and consumer secret, and get back a token and its expiry in seconds:


     private async Task<TelstraToken> GetAccessToken(string consumerkey, string consumersecret)  
     {  
       TelstraToken Token = new TelstraToken();  
       string AccessUrl = @"https://api.telstra.com/v1/oauth/token";  
       HttpClient authClient = new HttpClient();  
       HttpContent httpContent = new FormUrlEncodedContent(new Dictionary<string, string>  
       {  
         {"client_id", consumerkey},  
         {"client_secret", consumersecret},  
         {"grant_type", "client_credentials"},  
         {"scope", "WIFI"}  
       });  
       HttpRequestMessage Request = new HttpRequestMessage()  
       {  
         Method = HttpMethod.Post,  
         RequestUri = new Uri(AccessUrl),  
         Content = httpContent  
       };  
       try  
       {  
         var ResponseMessage = await authClient.SendAsync(Request);  
         var Response = await ResponseMessage.Content.ReadAsStringAsync();  
         if (ResponseMessage.IsSuccessStatusCode)  
         {  
           var AuthToken = JsonConvert.DeserializeObject<object>(Response);  
           JObject jObj = JObject.Parse(AuthToken.ToString());  
           Token.AccessToken = jObj["access_token"].ToString();  
           Token.ExpiredDt = DateTime.Now.AddSeconds(double.Parse(jObj["expires_in"].ToString()));  
         }  
       }  
       catch (Exception ex)  
       {  
         MessageBox.Show(ex.Message);  
       }  
       return Token;  
     }  


You may wonder about the TelstraToken object in the method above: it is just a simple object to hold the token value and expiry date time. Besides this simple object, I also created another helper class for the hotspots returned from the API. Below are the definitions of these two classes:

     internal class TelstraToken  
     {  
       internal string AccessToken;  
       internal DateTime ExpiredDt;  
       internal TelstraToken() { }  
       internal TelstraToken(string _accessToken, DateTime _expiredDt)  
       {  
         this.AccessToken = _accessToken;  
         this.ExpiredDt = _expiredDt;  
       }        
     }  
     internal class TelstraWifiHotSpot  
     {  
       internal string latitude;  
       internal string longitude;  
       internal string address;  
       internal string city;  
       internal string state;  
       internal TelstraWifiHotSpot(string _latitude, string _longitude, string _address, string _city, string _state)  
       {  
         this.latitude = _latitude;  
         this.longitude = _longitude;  
         this.address = _address;  
         this.city = _city;  
         this.state = _state;  
       }  
       public override string ToString()  
       {  
         string fullAddress = string.IsNullOrWhiteSpace(address) ? string.Empty : address;  
         fullAddress += ", ";  
         fullAddress += string.IsNullOrWhiteSpace(city) ? string.Empty : city;  
         fullAddress += " ";  
         fullAddress += string.IsNullOrWhiteSpace(state) ? string.Empty : state;  
         return fullAddress;  
       }  
     }  


Now all preparations are done. The only missing part is the button click event:


     private async void btnCalculate_Click(object sender, EventArgs e)  
     {  
       List<TelstraWifiHotSpot> lst = new List<TelstraWifiHotSpot>();  
       if (_token == null || _token.ExpiredDt < DateTime.Now)  
       {  
         _token = await GetAccessToken("your consumer key", "your consumer secret");  
       }  
       string latitude = txtLatitude.Text;  
       string longitude = txtLongitude.Text;  
       string radius = txtRadius.Text;  
       string url = string.Format("https://api.telstra.com/v1/wifi/hotspots?lat={0}&long={1}&radius={2}", latitude, longitude, radius);  
       HttpClient client = new HttpClient();  
       client.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", _token.AccessToken);  
       HttpRequestMessage request = new HttpRequestMessage()  
       {  
         Method = HttpMethod.Get,  
         RequestUri = new Uri(url)  
       };  
       HttpResponseMessage responseMessage = await client.SendAsync(request);  
       var message = await responseMessage.Content.ReadAsStringAsync();  
       var jMsg = JsonConvert.DeserializeObject<object>(message);  
       foreach (var j in (JArray)jMsg)  
       {  
         string wifiLatitude = ((JObject)j)["lat"].ToString();  
         string wifiLongitude = ((JObject)j)["long"].ToString();  
         string wifiAddress = ((JObject)j)["address"].ToString();  
         string wifiCity = ((JObject)j)["city"].ToString();  
         string wifiState = ((JObject)j)["state"].ToString();  
         TelstraWifiHotSpot hotspot = new TelstraWifiHotSpot(wifiLatitude, wifiLongitude, wifiAddress, wifiCity, wifiState);  
         lst.Add(hotspot);  
       }  
       if (lst.Count > 0)  
       {  
         lstHotspots.DataSource = lst;  
          // ListBox renders each item via the TelstraWifiHotSpot.ToString() override; DisplayMember is not needed  
       }  
     }  


After we put everything together and run the project, you should be able to find the hotspots Telstra provides:





2016-05-20

T-SQL: Tally Table

Recently a question from a colleague made me want to write something about the tally table — it is not formally documented in MSDN, but you can find lots of references if you google this keyword.

Firstly, let’s look at the differences between the two queries below. Both queries generate a sequence of 1,000,000 numbers:
The first query:
 ; with cte as  
 (  
 select 1 as n  
 union all  
 select n + 1  
 from cte   
 where n < 1000000  
 )  
 select * from cte option(maxrecursion 0)  
 go  


The second query:
 ;with tally as  
 (  
 select ROW_NUMBER() over (order by (select null)) as n  
 from (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as a(n)  
 cross join   
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as b(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as c(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as d(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as e(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as f(n)  
 )  
 select * from tally  
 go  
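Why is the second query so much cheaper? The recursive CTE produces one row per recursion, while the tally version simply multiplies row counts: six cross joins of ten-row constant sets yield exactly 10^6 rows in a single set-based pass. The arithmetic, checked in Python:

```python
from itertools import product

# Each (values (0), (0), ...) derived table contributes 10 rows;
# six cross joins multiply the counts: 10 * 10 * 10 * 10 * 10 * 10.
one_set = range(10)
total = sum(1 for _ in product(one_set, repeat=6))
assert total == 10 ** 6
print(total)  # 1000000
```

Adding or removing one cross-joined set scales the row count by a factor of ten, so the tally can be sized to whatever sequence length you need.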


If we turn on Statistics IO and Statistics Time, we will see these two queries perform very differently (results may vary depending on the environment in which you run the queries).

Statistics IO - CTE


Statistics IO - Tally table


Statistics Time - CTE


Statistics Time - Tally table


Actually, in a production environment I would always create a single-column table populated with sequence numbers. Such a numbers table is a great help whenever you need to perform loop-like tasks. But in the real world we also face ad hoc situations, and for those the tally table is our friend.

Here is a real scenario for applying a tally table. Imagine we have an aggregated purchase history table holding customer details and the purchased item amount. Now we need to print out customer details with their purchased items row by row — that is, if the purchased amount is X, we need to print out X rows with the item sequence 1 through X.

To demonstrate, run the script below to create the sample data:

 declare @t table  
 (  
 Amount int,  
 Fname varchar(50),  
 LastName varchar(50)  
 )  
 insert into @t   
 values  
 (3, 'J', 'Smith'),  
 (2, 'M', 'Tse'),  
 (5, 'I', 'Mila'),  
 (1, 'C', 'Qi')  


For the table above, we want the final output to look like:



Now think about what we can do:
  • Cursor: well, you know you don’t want to touch it.
  • While loop: essentially it is a cursor.
  • Recursive CTE: yes, it works.
  • Tally table: a much better solution.
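Before the T-SQL, here is the expansion expressed procedurally — a small Python sketch of what joining the sample rows to a numbers table on Amount >= n produces (illustrative only, not the T-SQL itself):

```python
# The sample purchase rows from the script above: (Amount, Fname, LastName).
rows = [(3, "J", "Smith"), (2, "M", "Tse"), (5, "I", "Mila"), (1, "C", "Qi")]

# One output row per unit of Amount, numbered 1..Amount -- the same
# effect as joining to a tally table on Amount >= n.
expanded = [
    (n, fname, lastname)
    for amount, fname, lastname in rows
    for n in range(1, amount + 1)
]
print(len(expanded))  # 11 rows: 3 + 2 + 5 + 1
```

The tally solution achieves exactly this in one set-based join, with no row-by-row looping.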


I won’t go through the cursor and while loop, but below are the recursive CTE and tally table solutions. When the table becomes large, you can try both approaches and see how big the difference can be.

CTE
 ;with t1 as   
 (  
 select Amount, Fname, LastName, ROW_NUMBER() over (partition by Fname, LastName order by Amount) cnt  
 from @t t  
 )  
 , t2 as  
 (  
 select Amount, Fname, LastName  
 from t1   
 where cnt = 1  
 union all  
 select Amount - 1, Fname, LastName  
 from t2  
 where Amount > 1  
 )  
 select * from t2  
 order by 2, 3  


Tally table
 ;with tally as  
 (  
 select ROW_NUMBER() over (order by (select null)) as n  
 from (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as a(n)  
 cross join   
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as b(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as c(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as d(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as e(n)  
 cross join  
      (values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) as f(n)  
 )  
 select tt.n, t.Fname, t.LastName  
 from @t t left join tally tt on t.Amount >= tt.n  
 order by 2, 3  


So now we have seen how a tally table can help in a real business scenario. Enjoy the script :)

2016-05-16

SSIS Custom Control Flow Task - RS Execution Task

A long time ago I blogged about how to create a custom control flow component in SSIS. In those posts I demonstrated how to build a control flow task with a very simple component, but without any execution logic.


Recently I started to upgrade my old jobs from the 2012 to the 2014 version. So, to demonstrate how to design the execution logic, I uploaded the source code of my report execution task onto GitHub. You can access the C# project here.


The whole project was created by following the same routine I posted in part1, part2, part3, and part4, so hopefully it can give you some general ideas when you want to create your own control flow task. You do not need to follow the steps listed in my posts exactly: in a real-world scenario you can create the task with just one class library project, simply adding the WinForm to that project.


Feel free to modify the source code of the project if you want to adapt it to your own scenario: just follow the instructions in the readme and build the project into your SSIS task folder:


If you are using another version of SQL Server, e.g. 2008 R2 or 2012, you can still use it. Because the task is built on top of the ReportService2010 and ReportExecution2005 web services, you can build your own version by referencing the correct Microsoft.SqlServer.ManagedDTS and Microsoft.SqlServer.Dts.Design assemblies. With a default installation, you should find them in the SDK folder: ...\Microsoft SQL Server\<SQL version>\SDK\Assemblies\.


Unfortunately there is no simple way to convert the task to support SSRS 2005 or SSRS 2008, because those versions are based on the ReportingService2005/2006 web services. To make the task fully support these two versions, you would have to rebuild the web service references and fix the conflicts in the project.


One last word: please always test before deploying it into production; it is at your own risk if you decide to use it in your production environment.



2016-04-19

SSIS Case Study: Optimise the Data Process

When designing a data procedure to deal with a large amount of data, the white paper “We Loaded 1TB in 30 Minutes with SSIS, and So Can You” is always a good starting point. However, in a real-world scenario it might be a bit difficult to get an environment set up like that: gigabit fibre channel, independent physical servers, flat file data sources without data quality issues, stable control flows/data flows without checkpoints or error handling. Well, it is simply too good to be true.

Recently we migrated all our client activities to a new dialler platform. After that, I received a request to archive/back up all business-critical data from the old dialler platform to our local database server. Because of some business decisions, we could not ask for backup files directly, so we decided to use SSIS to pull all the data we needed from the remote site.

The old dialler system sits on a MySQL database hosted in a remote data centre. The connection between the remote data centre and our site is a 30MB fibre connection. The data needs to be downloaded into an archive database hosted on a virtual server. We need to avoid certain time points, as the full backup tasks could freeze IO. Additionally, we need to download recording files as well.

The packages are simple; here I just want to share some lessons we learnt from the design:


Allow Errors & Propagate = False


As in a common ETL framework, let the job skip the error, log it, and handle it at a later stage. In our case we created the tables below: DownloadHelper to store parameters for each small package, DownloadLog to store pre-execution and post-execution information, and ErrorLog to store the actual exceptions. So when an error occurs, we can adopt different approaches to handle it based on its severity level.






To allow the package to bypass the error, firstly bring up the Variable Grid Options window and check the option “Show system variables”.




Then go to the Event Handlers tab and add an OnError event handler for the selected task. In addition to logging the error on the screen, the most important setting is to change the system variable “Propagate” from True to False. After this change, the job will continue to the next step even if the task raises an error.




Simplify Data Flow


To improve the data flow process, we removed nearly all transformation components within the data flow, especially the data conversion component.

However, there are differences in data types between MySQL and SQL Server. To handle these differences, we built the destination table schema using OPENQUERY against the remote MySQL database:

 select * into [destination] from openquery([LinkServerName], 'select * from [MySQL table] limit 1')  

Eventually most of our data flow tasks looked like the screenshot below:





Optimize Data Flow


Loading the data into partitions or using a balanced loading approach can be a great help in many cases, but after evaluating our situation we did not adopt this approach, as we found the bandwidth was the major factor affecting our data flow. And obviously, we won’t increase the bandwidth just for an archive/backup job.

So within our environment we just did below configurations:

1. For the data flow task, increase DefaultBufferMaxRows and DefaultBufferSize;



2. For the data destination component, select Table or View – Fast Load; uncheck Keep Identity, Keep Nulls, and Check Constraints; and check the Table Lock option. Leave SSIS to determine the rows per batch and maximum commit size.




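A note on step 1: the two buffer properties interact — the pipeline fills a buffer up to DefaultBufferSize bytes, but never with more than DefaultBufferMaxRows rows, whichever limit is reached first. A rough sizing sketch in Python (the 500-byte row width is an assumed example; measure your actual row width from the data flow metadata):

```python
def rows_per_buffer(default_buffer_size: int, default_buffer_max_rows: int,
                    estimated_row_bytes: int) -> int:
    """Approximate how many rows fit in one SSIS pipeline buffer.

    The engine fills a buffer up to DefaultBufferSize bytes, capped at
    DefaultBufferMaxRows rows -- whichever limit is hit first wins.
    """
    by_size = default_buffer_size // estimated_row_bytes
    return min(by_size, default_buffer_max_rows)

# Defaults: 10 MB buffer, 10,000 max rows; assume a 500-byte row.
print(rows_per_buffer(10 * 1024 * 1024, 10_000, 500))  # size allows ~20971 rows, capped at 10000
```

With narrow rows the row cap binds first, so raising DefaultBufferMaxRows alone helps; with wide rows the byte cap binds, so DefaultBufferSize must grow too — which is why we increased both.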
After applying all these settings, below are the comparison results. As you can see, the total time spent dropped a lot. This result cannot compare with the case mentioned at the beginning of this post, but within our environment it is a very acceptable result.