The GitHub API provides activity events for users, orgs, and repos. These APIs support pagination up to 10 pages, for a total of 300 events at 30 events per page. Rate limiting is handled using ETag headers. I am trying to poll this API to get the latest activity. However, this scheme is very inefficient because of the design GitHub supports, as described here. Let's say I make a request for page 1 with
https://api.github.com/users/me/events/orgs/my-org?page=1
and I get an ETag for this page. Now I move on to page 2 with
https://api.github.com/users/me/events/orgs/my-org?page=2
and get the ETag for the second page. Similarly, I can pull events from all 10 supported pages.
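To make the per-page polling concrete, here is a minimal sketch of a conditional request using only the Python standard library. The helper names build_request and fetch_page are my own for illustration, authentication is left out for brevity, and I am assuming the server answers 304 Not Modified when the ETag sent in If-None-Match still matches.

```python
import json
import urllib.error
import urllib.request


def build_request(url, etag=None):
    """Build a GET request; add If-None-Match when an ETag is already known."""
    headers = {"Accept": "application/vnd.github+json"}
    if etag:
        headers["If-None-Match"] = etag
    return urllib.request.Request(url, headers=headers)


def fetch_page(url, etag=None):
    """Return (events, etag). events is None when the page is unchanged (304)."""
    try:
        with urllib.request.urlopen(build_request(url, etag)) as resp:
            return json.load(resp), resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for a 304 response instead of returning it
        if err.code == 304:
            return None, etag
        raise
```

Each page has to keep its own ETag, so polling all 10 pages means storing 10 ETags and making 10 requests per round.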
Now let's say that some activity was performed on my org's GitHub account. Assume that only 1 new event occurred. In this case, when I poll the API for page 1 with its ETag, it will return the changed page with the new event included in it. Similarly, polling page 2 with its previous ETag will also return a changed page. The change in page 2, however, is just the event that was previously the last event on page 1 and has now moved to the top of page 2. This "shift to next" will happen for all the pages. There is NO way to find out the number of NEW events that took place. The only solution is to keep polling page 1 to get the latest events. However, this approach has a serious flaw, explained below:
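The "shift to next" effect can be reproduced offline with a small simulation. This is only a sketch: the integer event IDs are made up, and I am assuming the feed is a newest-first list capped at 300 entries and split into pages of 30, as described above.

```python
PER_PAGE = 30


def paginate(events, per_page=PER_PAGE):
    """Split a newest-first event list into fixed-size pages."""
    return [events[i:i + per_page] for i in range(0, len(events), per_page)]


def changed_pages(before, after):
    """Return the 1-based numbers of pages whose content differs."""
    old, new = paginate(before), paginate(after)
    return [n + 1 for n, (a, b) in enumerate(zip(old, new)) if a != b]


feed = list(range(300, 0, -1))   # 300 made-up event IDs, newest first
updated = [301] + feed[:-1]      # one new event; the oldest falls off the cap

print(changed_pages(feed, updated))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

A single new event changes the content of all 10 pages, so every stored ETag is invalidated even though only one event is actually new.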
The situation gets worse when the number of new events between my poll rounds is greater than 30 (the maximum number of items on one page). In this case, the events prior to the latest 30 new events will slip directly to page 2. If I only poll page 1, I will lose the events that slipped to page 2. The only solution that comes to mind is to keep a cache of all the events and then sweep all the pages. This is, however, a very inefficient and undesirable way to do it, and it defeats the purpose of an events notification API.
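For reference, here is a rough sketch of a lighter variant of the cache-and-sweep idea: remember only the ID of the newest event already processed and walk the pages until that ID shows up. This is an assumption on my part, not something GitHub documents as the intended approach. It assumes every event carries a unique "id" field and that pages are ordered newest first, and fetch_page is a hypothetical callable that returns the list of events for a page number.

```python
def collect_new_events(fetch_page, last_seen_id, max_pages=10):
    """Walk pages newest-first and stop at the last event already processed."""
    fresh = []
    seen = set()
    for page in range(1, max_pages + 1):
        events = fetch_page(page)
        if not events:
            break
        for event in events:
            if event["id"] == last_seen_id:
                return fresh
            # Events can shift between page fetches, so skip repeats
            if event["id"] not in seen:
                seen.add(event["id"])
                fresh.append(event)
    return fresh


# Fake feed standing in for the API: 100 events, newest first
feed = [{"id": str(i)} for i in range(100, 0, -1)]


def fake_fetch(page):
    return feed[(page - 1) * 30:page * 30]


new = collect_new_events(fake_fetch, last_seen_id="55")
print(len(new))  # 45
```

This still needs more than one request whenever more than 30 events arrived between polls, and it cannot recover anything once more than 300 new events have accumulated.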
I hope a GitHub developer can answer this.