Given a trained contextual bandit model, how can I retrieve a prediction vector on test samples?
For example, let's say I have a training set named "train.dat" containing lines formatted as below:
1:-1:0.3 | a b c # <action:cost:probability | features>
2:2:0.3 | a d d
3:-1:0.3 | a b e
....
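To make the label format above concrete, here is a minimal Python sketch that splits one such line into its action, cost, probability, and features. `CBExample` and `parse_cb_line` are hypothetical names for illustration, not part of Vowpal Wabbit. The sketch assumes the simplified layout shown above (no namespaces, no feature values) and treats anything after `#` as an annotation rather than data.

```python
from typing import List, NamedTuple


class CBExample(NamedTuple):
    """One logged contextual bandit example (hypothetical helper type)."""
    action: int
    cost: float
    probability: float
    features: List[str]


def parse_cb_line(line: str) -> CBExample:
    """Parse 'action:cost:probability | features' as shown in the question."""
    # Assumption: text after '#' is an annotation, as in the first sample line.
    line = line.split("#", 1)[0]
    # Everything before the first '|' is the label, the rest are features.
    label, _, feature_part = line.partition("|")
    action, cost, probability = label.strip().split(":")
    return CBExample(int(action), float(cost), float(probability),
                     feature_part.split())


sample_lines = [
    "1:-1:0.3 | a b c # <action:cost:probability | features>",
    "2:2:0.3 | a d d",
    "3:-1:0.3 | a b e",
]
examples = [parse_cb_line(line) for line in sample_lines]
for example in examples:
    print(example)
```

The test lines, which start directly with `|`, carry no label, so a parser like this one would apply only to the training data.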
And I run the command below:
vw -d train.dat --cb 30 -f cb.model --save_resume
This produces a file, 'cb.model'. Now, let's say I have a test dataset as below
| a d d
| a b e
I'd like to see probabilities like the ones below:
0.2 0.7 0.1
The interpretation of these probabilities would be that action 1 should be picked 20% of the time, action 2 - 70%, and action 3 - 10% of the time.
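The interpretation above can be sketched in Python as sampling an action from the probability vector. This is only an illustration of what "picked 20% of the time" would mean in practice, not something Vowpal Wabbit produces; `sample_action` is a hypothetical helper, and actions are assumed to be numbered from 1 in the order of the vector.

```python
import random
from collections import Counter


def sample_action(probabilities, rng):
    """Draw one action (1-based) according to the given probability vector."""
    if abs(sum(probabilities) - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    actions = list(range(1, len(probabilities) + 1))
    # random.choices draws with replacement, weighted by `weights`.
    return rng.choices(actions, weights=probabilities, k=1)[0]


pmf = [0.2, 0.7, 0.1]      # the desired output from the question
rng = random.Random(0)     # fixed seed so repeated runs match
n_draws = 100_000
counts = Counter(sample_action(pmf, rng) for _ in range(n_draws))
frequencies = {action: counts[action] / n_draws for action in sorted(counts)}
print(frequencies)
```

Over many draws, the empirical frequencies should approach 0.2, 0.7, and 0.1 for actions 1, 2, and 3 respectively.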
Is there a way to get something like this?
--cb, but the vowpal-wabbit source tree on GitHub has several --cb examples in test/RunTests with data sets and results, so perhaps you should start there? Another trick that I often use is the option -a (aka --audit), which outputs the weights of features on stderr as vw runs. This can help gain deep visibility into the model in real-time. HTH. – Triolet