UPDATE
It looks like fine-tuning a fine-tuned model is no longer supported, as stated in the official OpenAI documentation:

> **Can I continue fine-tuning a model that has already been fine-tuned?**
>
> No, we do not currently support continuing the fine-tuning process once a job has finished. We plan to support this in the near future.
The original answer below was based on an earlier version of the official OpenAI documentation, which stated:
> If you have already fine-tuned a model for your task and now have additional training data that you would like to incorporate, you can continue fine-tuning from the model. This creates a model that has learned from all of the training data without having to re-train from scratch.
>
> To do this, pass in the fine-tuned model name when creating a new fine-tuning job (e.g., `-m curie:ft-<org>-<date>`). Other training parameters do not have to be changed, however if your new training data is much smaller than your previous training data, you may find it useful to reduce `learning_rate_multiplier` by a factor of 2 to 4.
Which option to choose?
You're asking about two options:

- Option 1: `ada` + `bigger-training-dataset.jsonl`
- Option 2: `ada:ft-acme-inc-2022-06-25` + `additional-training-dataset.jsonl`
The documentation says nothing about which option is better in terms of which would yield better results.
However...
Choose Option 2
Why?
> When training a fine-tuned model, the total tokens used will be billed according to our training rates.
If you choose Option 1, you'll pay for some tokens in your training dataset twice: first when fine-tuning with the initial training dataset, and again when fine-tuning with the bigger training dataset (i.e., `bigger-training-dataset.jsonl` = `initial-training-dataset.jsonl` + `additional-training-dataset.jsonl`).
It's better to continue fine-tuning from a fine-tuned model because you'll pay only for tokens in your additional training dataset.
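To make the difference concrete, here is a toy cost comparison. All numbers are illustrative assumptions (the per-1K-token rate and the billed-epochs multiplier are made up for this sketch; check OpenAI's pricing page for real rates):

```python
RATE_PER_1K = 0.0004  # hypothetical $ per 1K training tokens for ada
N_EPOCHS = 4          # hypothetical number of billed training epochs

initial_tokens = 100_000     # tokens in initial-training-dataset.jsonl (assumed)
additional_tokens = 50_000   # tokens in additional-training-dataset.jsonl (assumed)

# Option 1: retrain base ada on the combined dataset -> pay for all tokens again
option1_cost = (initial_tokens + additional_tokens) * N_EPOCHS / 1000 * RATE_PER_1K

# Option 2: continue from the fine-tuned model -> pay only for the new tokens
option2_cost = additional_tokens * N_EPOCHS / 1000 * RATE_PER_1K

print(f"Option 1: ${option1_cost:.2f}, Option 2: ${option2_cost:.2f}")
```

With these (made-up) numbers, Option 2 costs a third of Option 1 because the initial dataset's tokens are never re-billed.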
Read more about fine-tuning pricing calculation.