2017-09-17

Power BI/Power Query: API Response Pagination Part 2 - Paginate by Page Cursor

This is the 2nd post of the topic API Response Pagination in Power BI/Power Query. You can find the first post here.

In the first post we have seen an API pagination example in Power Query, in which the API response provides us with the total record count, then we can calculate amount of pages we need to query. So today let's see another typical scenario in API response: the response contains an cursor pointing to the next page. (Here the cursor could be a direct link brings you to the next page, or it could be a page token you need to apply to your next API call).

The example I demonstrated today is Youtube Search API, and the channel I am going to search is VicRoad (as this post is focusing on result pagination, I am not going to explain how to prepare the API call :P). Below is a well-formatted Json response returned from the call. As you can see, the key field (the cursor) we are going to use is "nextPageToken".



Similar to the last post, the first step is preparing a base call function, as shown below. However, this time we return a list, which contains the result set of the current call, as well as the nextPageToken field we can use for drilling calls.



Now here is the core function of today's topic: List.Generate(). Basically you can treat this function as the FOR loop: we initialize a variable, give it a stop condition, specify the incremental action to the variable, and finally populate the return list result. Below is what I wrote:

So what we did is quite clear now: we create a helper function to call the API; Inside the List.Generate() function, we call the function to repeatedly populate records on different pages by using nextPageToken value. So what we need to do is just turning the result to table? Hold on for one second!

Let's review this piece of codes. So initially we call the API without page token, and the return result set contains records from page 1 and token for page 2, hence the second call gives us records from page 2 and token for page 3...


Initial TokenResultsetNext Token
""page 1page 2
page 2page 2page 3
page 3page 3page 4
page 4page 4page 5
.........
last pagelast pagenull

Now you see the problem: our condition for generating the list is "for each NOT NULL nextPageToken". We haven't gone through the last page!

Fixing the problem is quite simple, we can just manually query the last page, and then combine the result sets:

Put all together:

let    
    ufnGetList = (pgToken) =>
        let 
            result = Json.Document(
                        Web.Contents("https://www.googleapis.com/youtube/v3/search?part=snippet&channelId=UCkT7-rjW_Foojt0ZRtbXMdA&maxResults=50&type=video&key=[put your api key here]&pageToken=" & pgToken)
                        ),
            nextPageToken = try result[nextPageToken] otherwise null,
            items = result[items],
            record = [items = items, nextPageToken = nextPageToken]
        in 
            record,

    resultSet = List.Generate(
                    () => ufnGetList(""),
                    each _[nextPageToken] <> null,
                    each ufnGetList(_[nextPageToken]),
                    each [nextPageToken = _[nextPageToken], items = _[items]]
                    ),

    lastPageToken = List.Last(Table.FromRecords(resultSet)[nextPageToken]),
    lastResultSet = ufnGetList(lastPageToken)[items],

    fullResultSet = Table.Combine({Table.FromRecords(List.Combine(Table.FromRecords(resultSet)[items])), Table.FromRecords(lastResultSet)})
    
    
in
    fullResultSet


Now Run the query we should have all the records we need.



1 comment :