OpenAI GPT-3 API: Fine tune a fine tuned model? [closed]
The OpenAI documentation for the model attribute in the fine-tune API is a bit confusing:

model

The name of the base model to fine-tune. You can select one of "ada", "babbage", "curie", "davinci", or a fine-tuned model created after 2022-04-21.

My question: is it better to fine-tune a base model or a fine-tuned model?

I created a fine-tuned model from ada with file mydata1K.jsonl:

ada + mydata1K.jsonl --> ada:ft-acme-inc-2022-06-25

Now I have a bigger file of samples, mydata2K.jsonl, that I want to use to improve the fine-tuned model. In this second round of fine-tuning, is it better to fine-tune ada again or to fine-tune my fine-tuned model ada:ft-acme-inc-2022-06-25? I'm assuming the latter is possible because my fine-tuned model was created after 2022-04-21.

ada + mydata2K.jsonl --> better-model

or

ada:ft-acme-inc-2022-06-25 + mydata2K.jsonl --> even-better-model?
Drida answered 26/6, 2022 at 0:35
I found the answer on the OpenAI forum: community.openai.com/t/continuous-fine-tuning-best-practices/… "If you have already fine-tuned a model for your task and now have additional training data that you would like to incorporate, you can continue fine-tuning from the model. This creates a model that has learned from all of the training data without having to re-train from scratch." – Drida

UPDATE

It looks like fine-tuning a fine-tuned model is no longer supported, as stated in the official OpenAI documentation:

Can I continue fine-tuning a model that has already been fine-tuned?

No, we do not currently support continuing the fine-tuning process once a job has finished. We plan to support this in the near future.


As stated in the official OpenAI documentation:

If you have already fine-tuned a model for your task and now have additional training data that you would like to incorporate, you can continue fine-tuning from the model. This creates a model that has learned from all of the training data without having to re-train from scratch.

To do this, pass in the fine-tuned model name when creating a new fine-tuning job (e.g., -m curie:ft-<org>-<date>). Other training parameters do not have to be changed, however if your new training data is much smaller than your previous training data, you may find it useful to reduce learning_rate_multiplier by a factor of 2 to 4.
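A minimal sketch of what that looked like with the legacy (pre-1.0) OpenAI Python SDK. The model name and file ID below are hypothetical, and the learning_rate_multiplier value is an illustrative choice, not a documented default:

```python
# Sketch: continuing fine-tuning from an already fine-tuned model.
# Assumes the legacy openai<1.0 SDK; the file ID and model name are made up.

def continued_finetune_params(finetuned_model, training_file,
                              learning_rate_multiplier=None):
    """Build the kwargs for a fine-tune job that starts from a fine-tuned model."""
    params = {
        "training_file": training_file,  # ID of an uploaded JSONL file
        "model": finetuned_model,        # a fine-tuned model, not a base model
    }
    if learning_rate_multiplier is not None:
        # Per the docs, consider reducing this by a factor of 2-4 when the
        # new dataset is much smaller than the original one.
        params["learning_rate_multiplier"] = learning_rate_multiplier
    return params

params = continued_finetune_params("ada:ft-acme-inc-2022-06-25", "file-abc123",
                                   learning_rate_multiplier=0.05)
# import openai
# openai.FineTune.create(**params)  # legacy endpoint; removed in newer SDKs
```

The actual API call is commented out; the point is that the only required change versus a fresh fine-tune is passing the fine-tuned model name instead of a base model name.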

Which option to choose?

You're asking about two options:

  • Option 1: ada + bigger-training-dataset.jsonl
  • Option 2: ada:ft-acme-inc-2022-06-25 + additional-training-dataset.jsonl

The documentation says nothing about which option yields better results.

However...

Choose Option 2

Why?

When training a fine-tuned model, the total tokens used will be billed according to our training rates.

If you choose Option 1, you'll pay for some tokens in your training dataset twice: first when fine-tuning with the initial training dataset, and again when fine-tuning with the bigger one (i.e., bigger-training-dataset.jsonl = initial-training-dataset.jsonl + additional-training-dataset.jsonl).

It's better to continue fine-tuning from a fine-tuned model because you'll pay only for tokens in your additional training dataset.
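To make the billing difference concrete, here is a back-of-the-envelope comparison. The token counts are made up, and the rate is the historical ada fine-tuning price; other models were billed at different rates:

```python
# Illustrative cost comparison; token counts are hypothetical, and the rate
# is the historical ada fine-tuning price (USD per 1K tokens).
ADA_TRAINING_RATE = 0.0004

initial_tokens = 1_000_000     # tokens in the first training dataset (already trained)
additional_tokens = 1_000_000  # new tokens you want to incorporate

# Option 1: re-train ada on the combined dataset -> pay for ALL tokens again.
option1_cost = (initial_tokens + additional_tokens) / 1000 * ADA_TRAINING_RATE

# Option 2: continue from the fine-tuned model -> pay only for the new tokens.
option2_cost = additional_tokens / 1000 * ADA_TRAINING_RATE

print(option1_cost, option2_cost)  # Option 1 costs twice as much here
```

With equal-sized initial and additional datasets, Option 1 doubles the training bill; the gap grows the larger the already-trained dataset is relative to the new data. (Epoch count multiplies both sides equally, so it doesn't change the comparison.)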

Read more about fine-tuning pricing calculation.

Scintillator answered 4/1, 2023 at 13:21
Hello, I have a question: if I find a few errors in the training data I gave to the first model (say 5% of the dataset), should I retrain a new model, or extend the training of this one with only that 5% of the data that was wrong and is now corrected? – Favata

@Favata Hi! Fine-tuning might not be the best option for your case. First, fine-tuning is not even the right approach if you want an exact answer (i.e., a fact) to a specific question; to see why, read my previous answers (this and this). Second, this is probably one of the biggest downsides of fine-tuning: you can't just "replace" wrong or outdated data. – Scintillator

@Favata [continuing] You have two options, as you've already figured out: either continue fine-tuning the fine-tuned model, or fine-tune from scratch. If you chose fine-tuning over semantic search (i.e., the embeddings approach) in the first place, then I suggest you pick the second option (fine-tuning from scratch). Why? Because if you fine-tune the fine-tuned model, it will have learned from two sets of data (the wrong or outdated one plus the new one), which will probably make it hallucinate even more! – Scintillator

Thank you for your detailed response. I'm taking a first pass at a classification model using an LLM, as shown in the OpenAI documentation. – Favata

@Favata Then follow the example in the documentation. I've never done classification; I just know that the Classifications API endpoint is deprecated. As for your question, I don't have hands-on experience there, so I can't suggest whether you should continue fine-tuning a fine-tuned model or fine-tune from scratch for classification purposes. – Scintillator

© 2022 - 2024 — McMap. All rights reserved.