I have been spending a long time using R to try to scrape NBA data, so far I was doing it a little by trial and error, but finally I found this documentation. Some time ago I had some problems scraping the shotchartdetail, and I figured out the problem when I found this
This works
For that this is what I did:
shotURLtotal <- paste0("http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2016-17&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=0&PlusMinus=N&Position=&Rank=N&RookieYear=&Season=2016-17&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&showZones=0&PlayerPosition=")
Season <- rjson::fromJSON(file = shotURLtotal, method="C")
Names <- Season$resultSets[[1]][[2]]
Season <- data.frame(matrix(unlist(Season$resultSets[[1]][[3]]), ncol = length(Names), byrow = TRUE))
colnames(Season) <- Names
But this does not
but when I try to do the same with the shotchartlineupdetail, and it does not work, I suspect it has to do with the CFID, which I don't know what it means, this is what I tried.
shoturl <- "http://stats.nba.com/stats/shotchartlineupdetail/?leagueId=00&season=2016-17&seasonType=Regular+Season&teamId=0&outcome=&location=&month=0&seasonSegment=&dateFrom=&dateTo=&opponentTeamId=0&vsConference=&vsDivision=&gameSegment=&period=0&lastNGames=0&gameId=&group_id=0&contextFilter=&contextMeasure=FGA"
Season <- rjson::fromJSON(file = shoturl, method="C")
Names <- Season$resultSets[[1]][[2]]
Season <- data.frame(matrix(unlist(Season$resultSets[[1]][[3]]), ncol = length(Names), byrow = TRUE))
colnames(Season) <- Names
Expected Results
The expected result should be a dataframe with the following columns:
c("GRID_TYPE", "GAME_ID", "GAME_EVENT_ID", "GROUP_ID", "GROUP_NAME", "PLAYER_ID", "PLAYER_NAME", "TEAM_ID", "TEAM_NAME", "PERIOD", "MINUTES_REMAINING", "SECONDS_REMAINING", "EVENT_TYPE", "ACTION_TYPE", "SHOT_TYPE", "SHOT_ZONE_BASIC", "SHOT_ZONE_AREA", "SHOT_ZONE_RANGE", "SHOT_DISTANCE", "LOC_X", "LOC_Y", "SHOT_ATTEMPTED_FLAG", "SHOT_MADE_FLAG", "GAME_DATE", "HTM", "VTM")
which you can get by doing:
shoturl <- "http://stats.nba.com/stats/shotchartlineupdetail/?leagueId=00&season=2016-17&seasonType=Regular+Season&teamId=0&outcome=&location=&month=0&seasonSegment=&dateFrom=&dateTo=&opponentTeamId=0&vsConference=&vsDivision=&gameSegment=&period=0&lastNGames=0&gameId=&group_id=0&contextFilter=&contextMeasure=FGA"
Season <- rjson::fromJSON(file = shoturl, method="C")
Names <- Season$resultSets[[1]][[2]]
So Names would be the columns of the dataframe, the problem is that by not using the CFID you get that the list where the data for those columns should be are empty, the answer that @be_green gives are the league average, and I need the team specific data
?CFID=33&CFPARAMS=2016-17&
part it gives out an empty dataframe as well. so it seems that the CFID is what we need to figure out in order to get the data, unfortunately, CFID is not documented. And I am not sure how to get that parameter – Ansel"GROUP_ID", "GROUP_NAME"
– Ansel