Stack Exchange API to get the Impact/Number of People Reached

Asked 25/10, 2020 at 19:26 Answered 19/3 at 16:22

I searched the entire documentation for the Stack Exchange API v2.2, but could not find any endpoint that returns the data shown in the Impact section of the user page.

I am interested in the Impact/Number of People Reached data for a specific user.

One way to solve this problem is to GET the entire user page at https://stackoverflow.com/users/${id} and extract the required data with document.getElementById().

But the problem is, fetching the entire user page is bulky and not the optimal solution.

Melloney answered 25/10, 2020 at 19:26 Comment(3)

"But the problem is, fetching the entire user page is bulky and not the optimal solution." And unreliable for longevity. – Seaman 27/10, 2020 at 17:58

Have you already cross-referenced all of the available fields on the User object to make sure none of them lined up with the Impact field? – Seaman 27/10, 2020 at 18:2

Yes sir @Taco. I have checked all the fields on the User object. No field for impact. – Melloney 28/10, 2020 at 6:26

This is only possible via scraping users' profiles. Neither the SE API nor SEDE provides a field with the number of people reached. You can only get the number of views a profile has received: it's the view_count field in the /users/{ids} method (not included in the default filter) and the Views column in the Users table in SEDE. See the database schema for more details.
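For the profile view count, a request could be assembled roughly as follows. This is only a sketch: the helper names are hypothetical, and filter_id stands for a custom filter that includes user.view_count, which has to be created beforehand (for example through the API's /filters/create method). No request is sent here; the code only builds the URL and reads a decoded response.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.2"


def build_users_url(user_ids, filter_id, site="stackoverflow"):
    """Build a /users/{ids} URL; ids are joined with semicolons.

    filter_id is assumed to name a custom filter that includes
    user.view_count (the default filter leaves that field out).
    """
    ids = ";".join(str(i) for i in user_ids)
    query = urlencode({"site": site, "filter": filter_id})
    return f"{API_ROOT}/users/{ids}?{query}"


def extract_view_counts(payload):
    """Map user_id -> view_count from a decoded API response wrapper."""
    return {u["user_id"]: u.get("view_count") for u in payload.get("items", [])}
```

Note that this yields profile views only, not the number of people reached.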

Neologism answered 29/10, 2020 at 13:41 Comment(0)

One way to do it is

w3m -dump https://stackoverflow.com/users/[[USER ID]]/[[USER NAME]]\?tab\=topactivity | grep -izoP "(?s)[0-9]+(\.[0-9]+)?[kmb](?=\s(people)?\sreached)"

I use w3m because it removes the need to parse HTML tags. The regex uses a lookahead to make sure the match is the number we're looking for, which, at the time of writing, is expected to be followed by the string "people reached" or "reached".
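The same lookahead can be tried on already-dumped text in Python. This is a sketch rather than a replacement for the command above: the function names are hypothetical, the sample string in the usage note is made up, and reading k/m/b as thousand/million/billion is an assumption.

```python
import re

# Same pattern as in the grep call. The lookahead (?=...) checks what
# follows the number without including it in the match. The original
# (?s) flag is omitted: the pattern has no "." wildcard, so it has no effect.
PEOPLE_REACHED = re.compile(
    r"[0-9]+(\.[0-9]+)?[kmb](?=\s(people)?\sreached)",
    re.IGNORECASE,
)

# Assumption: suffixes mean thousand, million, billion.
MULTIPLIERS = {"k": 10**3, "m": 10**6, "b": 10**9}


def find_people_reached(text):
    """Return the abbreviated number (e.g. '1.2m') or None if absent."""
    match = PEOPLE_REACHED.search(text)
    return match.group(0) if match else None


def to_number(abbreviated):
    """Convert an abbreviated value such as '1.2m' to an approximate integer."""
    value, suffix = abbreviated[:-1], abbreviated[-1].lower()
    return round(float(value) * MULTIPLIERS[suffix])
```

For example, find_people_reached("~1.2m people reached") gives "1.2m". Like the grep pattern, this only matches numbers that carry a k, m, or b suffix, so a value shown without a suffix would be missed.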

Hebraic answered 25/9, 2022 at 2:51 Comment(0)

At the time of writing, there is still no trace of people reached in the API specification; on paper, however, there is a workaround.

The computation of people reached is described on Meta and Meta Stack Exchange and is also implemented in the public prototype #1 SEDE query. Note that all the required components (question views, asked questions, answered questions, question answers, scores, and answer acceptance) are accessible via API endpoints. So it seems possible to retrieve all the necessary data and reproduce the calculation. A sample algorithm:

Fetch questions asked by the user /users/{user_id}/questions
For each question:
    Save question_id and view_count

Fetch answers provided by the user /users/{user_id}/answers
For each answer:
    Skip if the question answered is in the questions asked by the user
    Skip if the answer is not accepted and doesn’t have positive score
    Save answered question_id and answer score
    Mark useful if the answer is accepted or
        the score is equal to or above the threshold (5)

Fetch answered questions /questions/{ids}
For each question:
    Save view_count and answer_count
    Mark useful if answer_count is equal to or below the threshold (3)
    Queue for the inspection below if the question is not yet marked as useful

Fetch answers for questions marked for inspection /questions/{ids}/answers
For each answer:
    Skip if the answer has zero or negative score
    Add answer score to the answered question total score
    Keep track of the answered question top scores

For each inspected question:
    Mark useful if the user's answer score is among the top scores
    Mark useful if the user's answer score is equal to or
        above the threshold (20%) of the total score

Sum view_count over the asked questions and the answered questions marked useful

A sample Python implementation: StackAPI-Impact.
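Independently of the linked implementation, the steps above can be sketched on data that has already been fetched. This is a rough sketch, not the official calculation: the function and argument names are hypothetical, the inputs are lists of decoded API objects, and reading "top scores" as the three highest scores is an assumption.

```python
from collections import defaultdict

SCORE_THRESHOLD = 5         # answer score that counts on its own
ANSWER_COUNT_THRESHOLD = 3  # questions with this many answers or fewer count
TOP_N = 3                   # assumption: "top scores" = three highest scores
SHARE_THRESHOLD = 0.20      # share of the total positive score


def people_reached(asked, user_answers, answered_questions, all_answers):
    """Estimate people reached from already-fetched API objects.

    asked, answered_questions: question dicts (question_id, view_count,
    and, for answered questions, answer_count).
    user_answers: the user's answers (question_id, score, is_accepted).
    all_answers: every answer on the questions that need inspection.
    """
    asked_views = {q["question_id"]: q["view_count"] for q in asked}

    # The user's answers: skip self-answers and non-positive, unaccepted ones.
    my_score, useful = {}, set()
    for a in user_answers:
        qid = a["question_id"]
        if qid in asked_views:
            continue
        if not a["is_accepted"] and a["score"] <= 0:
            continue
        my_score[qid] = max(a["score"], my_score.get(qid, a["score"]))
        if a["is_accepted"] or a["score"] >= SCORE_THRESHOLD:
            useful.add(qid)

    # Answered questions: few answers means useful, otherwise inspect further.
    views, inspect = {}, set()
    for q in answered_questions:
        qid = q["question_id"]
        if qid not in my_score:
            continue
        views[qid] = q["view_count"]
        if q["answer_count"] <= ANSWER_COUNT_THRESHOLD:
            useful.add(qid)
        if qid not in useful:
            inspect.add(qid)

    # Collect the positive scores of all answers on inspected questions.
    scores = defaultdict(list)
    for a in all_answers:
        if a["question_id"] in inspect and a["score"] > 0:
            scores[a["question_id"]].append(a["score"])

    # Useful if the user's score is among the top ones or a large enough share.
    for qid in inspect:
        positive = sorted(scores[qid], reverse=True)
        total = sum(positive)
        mine = my_score[qid]
        if mine in positive[:TOP_N] or (total > 0 and mine / total >= SHARE_THRESHOLD):
            useful.add(qid)

    return sum(asked_views.values()) + sum(views[qid] for qid in useful)
```

Keeping the calculation separate from the fetching makes it possible to check the thresholds on small hand-made inputs before spending any API quota.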

Practical limitations, however, narrow the scope of this approach. Retrieving a large collection of a user's answers or questions runs into heavy throttling, regardless of API key usage. That is why calculating people reached for e.g. Jon Skeet takes tens of minutes.
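Much of that time goes into paging and waiting. A paging loop that honors the backoff and has_more fields of the API response wrapper might look like the sketch below; fetch_all and fetch_page are hypothetical names, and fetch_page is assumed to be a caller-supplied function that returns the decoded wrapper for a given page number.

```python
import time


def fetch_all(fetch_page, sleep=time.sleep):
    """Collect items from every page of a paged API method.

    fetch_page(page) is assumed to return the decoded response wrapper,
    a dict with "items", "has_more" and, when throttled, "backoff".
    """
    items, page = [], 1
    while True:
        wrapper = fetch_page(page)
        items.extend(wrapper.get("items", []))
        # "backoff" is the number of seconds to wait before the next call.
        if "backoff" in wrapper:
            sleep(wrapper["backoff"])
        if not wrapper.get("has_more"):
            return items
        page += 1
```

Passing sleep as a parameter keeps the loop testable without real waiting; with thousands of answers at a limited page size, the number of requests alone explains the long run time.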

In conclusion, there should be a feature request to make the number of people reached available in the API.

Cockneyfy answered 19/3 at 16:22 Comment(0)
