Multiple paginated GET API calls in parallel/async in Node

I am calling the Bitbucket API to get all the files in a repo. So far I can get the list of all the folders in the repo, make the first API call to every root folder in parallel, and get the first 1000 files for each folder.

But the problem is that the Bitbucket API returns at most 1000 files per folder per request.

I need to append the query parameter &start=nextPageStart and make the call again, until isLastPage is true (and nextPageStart is null) for each API. How can I achieve that with the code below?

I get nextPageStart from the first call to the API. See the API response below.

Below is the code that I have so far.

Any help or guidance is appreciated.

Response from the individual API that is called per folder:

{
    "values": [
        "/src/js/abc.js",
        "/src/js/efg.js",
        "/src/js/ffg.js",
        ...
    ],
    "size": 1000,
    "isLastPage": false,
    "start": 0,
    "limit": 1000,
    "nextPageStart": 1000
}

The function where I make the asynchronous calls to get the list of files:

export function getFilesList() {
  const foldersURL: any[] = [];
  getFoldersFromRepo().then((response) => {
    const values = response.values;
    values.forEach((value: any) => {
      // creating API URL for each folder in the repo
      const URL = 'https://bitbucket.abc.com/stash/rest/api/latest/projects/'
                   + value.project.key + '/repos/' + value.slug + '/files?limit=1000';
      foldersURL.push(URL);
    });
    return foldersURL;
  }).then((res) => {
    // console.log('Calling all the URLS in parallel');
    async.map(res, (link, callback) => {
      const options = {
        url: link,
        auth: {
          password: 'password',
          username: 'username',
        },
      };
      request(options, (error, response, body) => {

        // TODO: How do I make the GET call again so that I can paginate and append the response to the body till the last page?

        callback(error, body);
      });
    }, (err, results) => {
      console.log('In err, results function');
      if (err) {
        return console.log(err);
      }
      // Consolidated results after all API calls.
      console.log('results', results);
    });
  })
  .catch((error) => error);
}
Ignace answered 16/10, 2018 at 19:6 Comment(2)
Where will the URL for the next page come from? Or rather, how is it formed when isLastPage = false? – Rexfourd
I need to append the query parameter &start=nextPageStart to the URL if isLastPage is false. – Ignace

I was able to get it working by creating a function with a callback.

export function getFilesList() {
  const foldersURL: any[] = [];
  getFoldersFromRepo().then((response) => {
    const values = response.values;
    values.forEach((value: any) => {
      // creating API URL for each folder in the repo
      const URL = 'https://bitbucket.abc.com/stash/rest/api/latest/projects/'
                   + value.project.key + '/repos/' + value.slug + '/files?limit=1000';
      foldersURL.push(URL);
    });
    return foldersURL;
  }).then((res) => {
    // console.log('Calling all the URLS in parallel');
    async.map(res, (link, callback) => {
      const options = {
        url: link,
        auth: {
          password: 'password',
          username: 'username',
        },
      };
      const myarray = [];
      // This function will consolidate the response till the last page per API.
      consolidatePaginatedResponse(options, link, myarray, callback);
    }, (err, results) => {
      console.log('In err, results function');
      if (err) {
        return console.log(err);
      }
      // Consolidated results after all API calls.
      console.log('results', results);
    });
  })
  .catch((error) => error);
}

function consolidatePaginatedResponse(options, link, myarray, callback) {
  request(options, (error, response, body) => {
    // Stop on a request error: there is no body to parse in that case.
    if (error) {
      return callback(error);
    }
    const content = JSON.parse(body);
    content.link = options.url;
    myarray.push(content);
    if (content.isLastPage === false) {
      // Request the next page, starting where this one ended.
      options.url = link + '&start=' + content.nextPageStart;
      consolidatePaginatedResponse(options, link, myarray, callback);
    } else {
      // Final response after consolidation per API
      callback(null, JSON.stringify(myarray));
    }
  });
}
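
With this approach, the result for each folder is a JSON string holding an array of page objects rather than a flat list of files. A minimal sketch of flattening it, assuming every page has the shape shown in the question; `ConsolidatedPage` and `flattenPages` are hypothetical names used only for illustration:

```typescript
// One page as stored in myarray: the fields from the sample response in the
// question, plus the `link` field that the code above attaches.
interface ConsolidatedPage {
  values: string[];
  isLastPage: boolean;
  nextPageStart?: number | null;
  link?: string;
}

// Hypothetical helper: parse the JSON string produced for one folder and
// collect the file paths of all its pages, in page order.
function flattenPages(consolidatedJson: string): string[] {
  const pages: ConsolidatedPage[] = JSON.parse(consolidatedJson);
  const files: string[] = [];
  for (const page of pages) {
    files.push(...page.values);
  }
  return files;
}
```

Calling `flattenPages` on each entry of `results` in the final `async.map` callback would give one list of paths per folder.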
Ignace answered 25/10, 2018 at 17:1 Comment(0)

I think the best way is to wrap it in an old-school for loop (forEach doesn't work with async/await: it is synchronous, so it does not wait for each callback and all the requests are spawned at the same time).

What I understood is that you run some sort of bootstrap query to get the values array and then iterate over the pages. Here is some code. I didn't fully grasp the APIs, so I'll give a simplified (and hopefully readable) answer that you should be able to adapt:

export async function getFilesList() {

    logger.info(`Fetching all the available values ...`);

    await getFoldersFromRepo().then( async values => {

        logger.info("... Folders values fetched.");

        for (let i = 0; ; i++ ) {

            logger.info( `Working on page ${i}`);

            try {
                // with await, pageResult is the resolved value, not the promise
                const pageResult: PageResult = await yourPagePromise(i);
                if (pageResult.isLastPage) {
                    break;
                }
            } catch(err) {
                console.error(`Error on page ${i}`, err);
                break;
            }

        }

        logger.info("Done.");

    });

    logger.info(`All finished!`);

}

The logic is that getFoldersFromRepo() first returns a promise that resolves to the values, and then I iterate sequentially over all available pages through the yourPagePromise function (which returns a promise). The async/await construct allows more readable code than a waterfall of then() calls.

I'm not sure it respects your API's specs, but it's logic you can use as a foundation! ^^
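
To make the loop concrete for the paging scheme in the question, here is a minimal sketch that follows nextPageStart for a single folder. It assumes the response shape from the question; `FilesPage`, `FetchPage` and `fetchAllFiles` are hypothetical names, and the function that actually performs the HTTP request is passed in rather than assumed:

```typescript
// Only the fields of the response that the loop needs.
interface FilesPage {
  values: string[];
  isLastPage: boolean;
  nextPageStart: number | null;
}

// Hypothetical signature: performs one GET and resolves to the parsed page.
type FetchPage = (url: string) => Promise<FilesPage>;

// Requests the pages of one folder one after another, because each request
// needs the nextPageStart value returned by the previous one.
// Assumes baseUrl already has a query string (e.g. '...?limit=1000').
async function fetchAllFiles(baseUrl: string, fetchPage: FetchPage): Promise<string[]> {
  const files: string[] = [];
  let url = baseUrl;
  while (true) {
    const page = await fetchPage(url);
    files.push(...page.values);
    if (page.isLastPage || page.nextPageStart === null) {
      break;
    }
    url = baseUrl + '&start=' + page.nextPageStart;
  }
  return files;
}
```

Pages within a folder have to be sequential, but separate folders are independent, so several `fetchAllFiles` calls can still run side by side.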

Authentic answered 24/10, 2018 at 15:23 Comment(2)
Thanks, I will try this way too. I finally figured out a way to paginate. I will try your way to see which one performs better. – Ignace
Mine uses a bit of syntactic sugar to avoid nesting. Just be careful when mapping over promises or simply looping over them without await, because you'll spawn all of them at the same time (speaking from experience here: I bombed my service with 2000 requests and killed it! :P) – Authentic
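
One way to avoid spawning everything at once while still working in parallel is to cap the number of requests in flight. A minimal sketch, assuming a limit of at least 1; `mapWithLimit` is a hypothetical helper, not part of any library mentioned here:

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at a time.
// Results are returned in the same order as `items`.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly takes the next unclaimed index until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runners: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}
```

With the `async` library used in the question, `async.mapLimit` provides the same kind of cap in callback style.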
