How to make vowpal wabbit use more observations

I am new to vowpal wabbit and have some questions about it.

I passed a dataset to vw, fit a model, got in-sample predictions, and saved the model with -f. So far so good. I know how to use the saved model to make predictions on a different dataset. What I want to know is how to add more observations to the model and update it.

Main objective: use an initial chunk of data to train vw online, then use that model to predict on new data. Then use the new data to update the model, use the updated model to predict on the next batch of observations, and so on.

As I said, I am a newbie, so please excuse the triviality of the question.

Podolsk answered 9/7, 2015 at 6:56 Comment(0)
vw -i existing.model -f new.model more_observations.dat

Mnemonics:

  • -i initial
  • -f final

You may even use the same model filename for -i and -f to update the model "in place". This is safe because the update is not truly in place: vw writes the new model to a temporary file and, at the end of the run, atomically renames it to the final file name, as the following strace output shows (comments added):

$ strace -e open,close,rename vw --quiet -i zz.model -f zz.model f20-315.tt.gz
# loading the initial (-i zz.model) model into memory
open("zz.model", O_RDONLY)              = 3
# done loading, so we can close it
close(3)                                = 0
# Now reading the data-set and learning in memory
open("f20-315.tt.gz", O_RDONLY)         = 3
# data read complete. write the updated model into a temporary file
open("zz.model.writing", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
close(4)                                = 0
# and rename atomically to the final (-f zz.model) model file 
rename("zz.model.writing", "zz.model")  = 0
...
close(4)                                = 0
close(3)                                = 0
+++ exited with 0 +++
Casaubon answered 9/7, 2015 at 17:13 Comment(4)
Is passing "more_observations.dat" on its own different from passing "-d more_observations.dat", or are they the same? - Podolsk
It should behave the same with or without -d. - Casaubon
I would add that for this use case it's better to save and load model files with the --save_resume option. Without it, vw assumes the model will be used only for prediction and omits some learning-algorithm-specific parameters that could be used to continue training later. So without --save_resume, subsequent training will be somewhat less effective. - Scotsman
Yes, thank you. --save_resume may be desirable in most scenarios. The difference shows at the start of the resumed run: without it, the (decayed) learning-rate parameters in the prior model get a jolt from being reset upwards. - Casaubon
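
Putting the answer and the comments together, the predict-then-update cycle from the question could be sketched roughly as below. This is only a minimal sketch under assumptions: the function name predict_then_update, the model name running.model, and the chunk_*.dat file names are hypothetical. The vw options used are -i, -f, -d, -t (test only, no learning), -p (write predictions to a file), --quiet, and --save_resume. Some newer vw releases may already save resume state by default, so check the help output of your version.

```shell
#!/bin/sh
# Sketch of one predict-then-update step. File names are hypothetical.

# predict_then_update MODEL CHUNK PREDICTIONS
predict_then_update() {
    model=$1
    chunk=$2
    preds=$3

    # 1. Predict on the new chunk with the current model.
    #    -t disables learning, -p writes the predictions to a file.
    vw --quiet -t -i "$model" -d "$chunk" -p "$preds" || return 1

    # 2. Learn from the same chunk and overwrite the model.
    #    Using the same name for -i and -f is safe, as explained in the answer.
    vw --quiet --save_resume -i "$model" -f "$model" -d "$chunk"
}

# Usage (hypothetical files):
#   vw --quiet --save_resume -f running.model -d chunk_000.dat
#   for chunk in chunk_001.dat chunk_002.dat; do
#       predict_then_update running.model "$chunk" "${chunk%.dat}.pred"
#   done
```

Predicting before updating means each chunk is scored by a model that has not yet seen it, which matches the workflow described in the question.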
