How to extract average ROC curve predictions using ROCR?

The ROCR library in R offers the ability to plot an average ROC curve (taken directly from the ROCR reference manual):

library(ROCR)
# plot ROC curves for several cross-validation runs (dotted
# in grey), overlaid by the vertical average curve and boxplots
# showing the vertical spread around the average.
data(ROCR.xval)
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred,"tpr","fpr")
plot(perf,col="grey82",lty=3)
plot(perf,lwd=3,avg="vertical",spread.estimate="boxplot",add=TRUE)

Averaged ROC plot with boxplot

Lovely. Unfortunately, there's seemingly no ability to obtain the average ROC curve itself as an object/dataframe/etc. for further statistical testing (say, with pROC). I did do some research (albeit perhaps after the fact), and I found this post:

Global variables in R

Looking through ROCR's code reveals the following lines, which compute the result that is passed to a plot:

performance_plots.R, (starting at line 451)

## compute average curve
 perf.avg <- perf.sampled
 perf.avg@x.values <- list( rowMeans( data.frame( perf.avg@x.values)))
 perf.avg@y.values <- list(rowMeans( data.frame( perf.avg@y.values)))
 perf.avg@alpha.values <- list( alpha.values )

So, using the trace function I looked up here (General suggestions for debugging in R):

trace(.performance.plot.horizontal.avg, edit=TRUE)

I added the following line to the performance_plots.R after the lines listed above:

perf.rocr.avg <<- perf.avg # note the double `<<`

A horrible hack, yet it works as I can plot perf.rocr.avg without a problem. Unfortunately, when using pROC, I can't compare my averaged ROC curve because it requires a pROC roc object. That's fine, but the catch is that the pROC roc object requires the original prediction and reference data to create. As far as I can tell, ROCR is averaging the ROC curves themselves and not the predictions, so it seems I can't get what I want out of ROCR.
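For the vertical average specifically, the curve can probably be rebuilt outside ROCR without patching its plotting code. The sketch below is a minimal base-R version, assuming the per-run FPR and TPR vectors are available as lists (as in the `x.values` and `y.values` slots of a ROCR performance object). The function name `vertical_average`, the grid size, and the `ties = mean` choice are assumptions of this sketch, not part of ROCR's API, so the result may differ slightly from what ROCR draws.

```r
# Vertically average several ROC curves on a shared FPR grid.
# x_list / y_list: lists of numeric vectors (FPR and TPR for each run).
# NOTE: vertical_average() is a hypothetical helper, not a ROCR function.
vertical_average <- function(x_list, y_list, n_grid = 101) {
  stopifnot(length(x_list) == length(y_list))
  fpr_grid <- seq(0, 1, length.out = n_grid)
  # Interpolate each run's TPR at the shared FPR values.
  # ties = mean collapses repeated FPR values; rule = 2 clamps at the ends.
  tpr_mat <- mapply(function(x, y) {
    approx(x, y, xout = fpr_grid, ties = mean, rule = 2)$y
  }, x_list, y_list)
  data.frame(fpr    = fpr_grid,
             tpr    = rowMeans(tpr_mat),      # the averaged curve
             tpr_sd = apply(tpr_mat, 1, sd))  # vertical spread across runs
}

# Toy input standing in for two cross-validation runs.
x_list <- list(c(0, 0, 0.5, 1), c(0, 0.5, 0.5, 1))
y_list <- list(c(0, 0.5, 1, 1), c(0, 0.5, 1, 1))
avg <- vertical_average(x_list, y_list, n_grid = 3)

# With ROCR loaded and `perf` built as in the question, the call would be:
# avg <- vertical_average(perf@x.values, perf@y.values)
```

The result is an ordinary data frame, so it can be plotted or exported, but it is still only a curve: it carries no predictions or labels, so it cannot stand in for a pROC roc object.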

Is there a way to reverse-engineer the predictions from the averaged ROC curve created by ROCR?
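One obstacle to any such reverse-engineering is that a ROC curve depends only on the rank order of the scores, so many different prediction vectors produce exactly the same curve. The base-R sketch below illustrates this on made-up data; `roc_points` is a hypothetical helper written for this example and assumes no tied scores and labels coded with 1 as the positive class.

```r
# Empirical ROC points from scores and binary labels.
# NOTE: roc_points() is a hypothetical helper; it assumes no tied scores.
roc_points <- function(scores, labels, positive = 1) {
  ord <- order(scores, decreasing = TRUE)   # sweep the threshold from high to low
  is_pos <- labels[ord] == positive
  data.frame(fpr = c(0, cumsum(!is_pos) / sum(!is_pos)),
             tpr = c(0, cumsum(is_pos) / sum(is_pos)))
}

scores <- c(0.9, 0.7, 0.6, 0.4, 0.2)  # made-up predictions
labels <- c(1, 0, 1, 1, 0)            # made-up reference labels

# A strictly increasing transformation changes the scores but not their order.
curve_original    <- roc_points(scores, labels)
curve_transformed <- roc_points(exp(5 * scores), labels)
same_curve <- identical(curve_original, curve_transformed)
```

Since two distinct score vectors map to one curve even for a single run, an averaged curve cannot determine a unique set of predictions.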

Isidore answered 24/4, 2016 at 23:14 Comment(5)
Have you looked to see if the predict command would work with ROC? - Fabio
@Fabio - I have, but I didn't make much headway. I've assigned a variable after the last line above perf.avg.rocr <<- perf.avg, which gives me a ROCR performance object, and the desired average ROC plot. Unfortunately, I now realize I can't use roc.test because it's not a prediction object. Any other advice welcomed... - Isidore
Have you looked at this answer: #11468355 or this hopstat.wordpress.com/2014/12/19/… I have not used the ROCR library, so I can't provide much more advice. - Fabio
@Fabio - Ya gotta love how that question on SO has been upvoted 16 times and is entirely RTFM, whereas I ask something programmatic in nature that has me honestly stumped and I get downvoted. Anyway, thanks! I'm (now) pretty versed in the usage of ROCR. It's just that it doesn't do what I need it to. To make matters worse, pROC only accepts a roc object for statistical testing, which itself requires the original prediction and reference data. I'll keep at it on my end. - Isidore
@Isidore By any chance were you able to figure out a solution? I am looking to be able to extract the data frame for individual ROC curves, so if you could guide me I would appreciate it. - Hengist

I ran into the same problem. As I see it, the average ROC generated by the ROCR package is just a set of numeric values; it lacks other statistical attributes (e.g. a confidence interval). That means statistics computed on the average ROC may not be meaningful, which is why a roc object can't be generated from a (tpr, fpr) list in the pROC package. However, I found a paper that addresses this problem, i.e., the comparison between average ROCs. The title is "The average area under correlated receiver operating characteristic curves: a nonparametric approach based on generalized two-sample Wilcoxon statistics". I hope that's helpful.
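As background for the Wilcoxon connection in that title: the AUC of a single run equals the normalized Wilcoxon/Mann-Whitney rank-sum statistic. The base-R sketch below computes it per run and takes a plain mean. This is not the method from the paper, which also accounts for the correlation between curves; `auc_wilcoxon` and the toy data are assumptions made for illustration, with labels assumed to be coded 1 for the positive class.

```r
# AUC of one run via the Wilcoxon/Mann-Whitney rank-sum identity.
# NOTE: auc_wilcoxon() is a hypothetical helper, not a ROCR or pROC function.
auc_wilcoxon <- function(scores, labels, positive = 1) {
  is_pos <- labels == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # midranks, so tied scores count as half a win
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data standing in for two cross-validation runs.
scores_by_run <- list(c(0.9, 0.8, 0.3, 0.1), c(0.6, 0.7, 0.4, 0.2))
labels_by_run <- list(c(1, 1, 0, 0), c(1, 0, 1, 0))

auc_by_run <- mapply(auc_wilcoxon, scores_by_run, labels_by_run)
mean_auc <- mean(auc_by_run)  # naive average; ignores correlation between runs

# With ROCR's example data the inputs would presumably be
# ROCR.xval$predictions and ROCR.xval$labels.
```

The plain mean gives a point estimate only; a test between two average AUCs still needs a variance estimate such as the one the paper derives.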

Susy answered 17/4, 2020 at 15:54 Comment(1)
Actually, I implemented the method proposed in that paper, and the results seem reasonable. It's a good choice if you want to run a statistical test between average ROCs. - Susy
