I have trained a Faster R-CNN model on a custom dataset using TensorFlow's Object Detection API. Over time I would like to keep updating the model with additional images (collected weekly). The goal is to optimize for accuracy and to give newer images more weight as time goes on.
Here are a few alternatives:
1. Add the new images to the previous dataset and train a completely new model
2. Add the new images to the previous dataset and continue training the previous model
3. Create a new dataset with just the new images and continue training the previous model
Here are my thoughts:
Option 1: This would be the most time-consuming, but all images would be treated "equally".
Option 2: This would likely take less additional training time, but one concern is that the model might end up weighting the earlier images more, since it has already trained on them.
Option 3: This seems like the best option to me: take the original model and simply continue training it on the new images only.
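To make "weighting newer images" concrete, here is a rough sketch of one way I could imagine doing it: sample the training images with a probability that decays with the image's age. This is only an illustration in plain Python, not something the Object Detection API does for me. The function names, the `half_life_weeks` parameter, and the exponential-decay scheme are all my own assumptions.

```python
import random


def recency_weights(ages_in_weeks, half_life_weeks=8.0):
    """Return one sampling weight per image (hypothetical helper).

    An image's weight halves every `half_life_weeks` weeks, so an image
    collected this week (age 0) gets weight 1.0.
    """
    return [0.5 ** (age / half_life_weeks) for age in ages_in_weeks]


def sample_training_images(image_ids, ages_in_weeks, k,
                           half_life_weeks=8.0, seed=0):
    """Draw k image ids with replacement, favouring newer images."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    weights = recency_weights(ages_in_weeks, half_life_weeks)
    return rng.choices(image_ids, weights=weights, k=k)


if __name__ == "__main__":
    ids = ["week0.jpg", "week8.jpg", "week16.jpg"]
    ages = [0, 8, 16]
    print(recency_weights(ages))
    print(sample_training_images(ids, ages, k=5))
```

With a scheme like this, options 1 and 2 would no longer treat all images equally; the half-life would control how quickly older images fade out of the training mix.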
Is one of these clearly better? What would be the pros/cons of each?
In addition, I'd like to know whether it's better to keep one fixed test set as a control for accuracy, or to create a new test set each time that includes the newer images. Perhaps I could split each new batch, adding one portion to the training data and another to the test set, and then either move older test set images back into the training data or throw them out?
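As a sketch of the splitting idea, this is roughly what I have in mind for each weekly batch: hold out a fraction of the new images for a rolling test set, and leave a separate control test set untouched. The function name `split_weekly_batch` and the 20% default are hypothetical choices of mine, used only for illustration.

```python
import random


def split_weekly_batch(new_image_ids, test_fraction=0.2, seed=0):
    """Split one weekly batch into (train_ids, test_ids).

    The batch is shuffled with a fixed seed so the split is reproducible,
    then the first `test_fraction` of it is held out for testing.
    """
    rng = random.Random(seed)
    shuffled = list(new_image_ids)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    control_test = ["old_001.jpg", "old_002.jpg"]  # frozen, never retrained on
    rolling_test = []                              # grows with each batch

    week_batch = [f"new_{i:03d}.jpg" for i in range(10)]
    train_ids, test_ids = split_weekly_batch(week_batch)
    rolling_test.extend(test_ids)

    print(len(train_ids), "new training images")
    print(len(rolling_test), "images in the rolling test set")
    print(len(control_test), "images in the control test set")
```

Evaluating on both sets would let me compare accuracy on the unchanged control set across model versions, while the rolling set shows how the model does on recent data.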