I am learning the pandas package by replicating the output from some of the R
vignettes. Right now I am using the dplyr package from R as an example:
http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html
R script
planes <- group_by(hflights_df, TailNum)
delay <- summarise(planes,
count = n(),
dist = mean(Distance, na.rm = TRUE))
delay <- filter(delay, count > 20, dist < 2000)
Python script
planes = hflights.groupby('TailNum')
# Named aggregation; the dict-renaming form of agg() fails in recent pandas
delay = planes['Distance'].agg(count='count', dist='mean')
How can I state explicitly in Python that NA values need to be skipped?
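For reference, here is a minimal sketch of what the pandas equivalent might look like. The small `hflights` frame is a made-up stand-in for the real data, not the actual dataset. As far as I can tell, pandas' `mean` skips NaN by default (`skipna=True`), so the explicit lambda is only there to spell that out. Note also that `'count'` counts non-NaN values, while `'size'` counts all rows and so is closer to dplyr's `n()`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the hflights data (values invented for illustration).
hflights = pd.DataFrame({
    "TailNum": ["N1", "N1", "N1", "N2", "N2"],
    "Distance": [100.0, np.nan, 300.0, 500.0, 700.0],
})

planes = hflights.groupby("TailNum")

# Default behaviour: 'mean' ignores NaN; 'size' counts every row in the group.
delay = planes["Distance"].agg(count="size", dist="mean")

# Same aggregation with the NaN handling stated explicitly.
delay_explicit = planes["Distance"].agg(
    count="size",
    dist=lambda s: s.mean(skipna=True),
)

# Equivalent of filter(delay, count > 20, dist < 2000).
# delay["count"] is used because delay.count is a DataFrame method.
filtered = delay[(delay["count"] > 20) & (delay["dist"] < 2000)]
```

With the toy data above no group has more than 20 rows, so `filtered` comes out empty; on the real data the thresholds from the R script should apply unchanged.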
pandas include NaN? – Abreu