Saving and reloading a fine-tuned Hugging Face transformer
2

28

I am trying to reload a fine-tuned DistilBertForTokenClassification model. I am using transformers 3.4.0 and PyTorch 1.6.0+cu101. After training the downloaded model with the Trainer, I save it with trainer.save_model(); while troubleshooting, I also save it to a different directory with model.save_pretrained(). I am using Google Colab and saving the model to my Google Drive. After training, I evaluated the model on my test set and got great results. However, when I return to the notebook (or factory-reset the Colab notebook) and try to reload the model, the predictions are terrible. Upon checking the directories, config.json is there, as is pytorch_model.bin. Below is the full code.

from transformers import DistilBertForTokenClassification

# load the pretrained model from huggingface
#model = DistilBertForTokenClassification.from_pretrained('distilbert-base-cased', num_labels=len(uniq_labels))
model = DistilBertForTokenClassification.from_pretrained('distilbert-base-uncased', num_labels=len(uniq_labels)) 

model.to('cuda');

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir = model_dir +  'mitmovie_pt_distilbert_uncased/results',          # output directory
    #overwrite_output_dir = True,
    evaluation_strategy='epoch',
    num_train_epochs=3,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir = model_dir +  'mitmovie_pt_distilbert_uncased/logs',            # directory for storing logs
    logging_steps=10,
    load_best_model_at_end = True
)

trainer = Trainer(
    model = model,                         # the instantiated 🤗 Transformers model to be trained
    args = training_args,                  # training arguments, defined above
    train_dataset = train_dataset,         # training dataset
    eval_dataset = test_dataset             # evaluation dataset
)

trainer.train()

trainer.evaluate()

model_dir = '/content/drive/My Drive/Colab Notebooks/models/'
trainer.save_model(model_dir + 'mitmovie_pt_distilbert_uncased/model')

# alternative saving method and folder
model.save_pretrained(model_dir + 'distilbert_testing')

Coming back to the notebook after restarting...

from transformers import DistilBertForTokenClassification, DistilBertConfig, AutoModelForTokenClassification

# retrieve the saved model
model = DistilBertForTokenClassification.from_pretrained(model_dir + 'mitmovie_pt_distilbert_uncased/model', 
                                                        local_files_only=True)

model.to('cuda')

Model predictions are now terrible when loading from either directory. The model does load and outputs the number of classes I would expect, so it appears that the trained weights were either not saved or are not being loaded.
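One way to rule out an incomplete write or sync to Google Drive is to record a checksum of every file in the saved model directory right after saving and compare it after the restart. The sketch below uses only the standard library; `checksum_dir` is a hypothetical helper name, not part of transformers, and the path in the usage comment is the one from the question.

```python
import hashlib
from pathlib import Path


def checksum_dir(directory):
    """Return {relative file path: SHA-256 hex digest} for every file under directory."""
    root = Path(directory)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            sha = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large weight files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    sha.update(chunk)
            digests[path.relative_to(root).as_posix()] = sha.hexdigest()
    return digests


# Usage (path from the question): run once right after saving and keep the result,
# then run again after the restart and compare the two dictionaries.
# before = checksum_dir(model_dir + 'mitmovie_pt_distilbert_uncased/model')
```

If the digests differ, or a file is missing after the restart, the problem is in how the files reached Drive and not in how the model is loaded.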

Maltzman answered 3/11, 2020 at 13:3 Comment(4)
There are two other questions regarding saving and loading transformer models leading to worse accuracy (here and here). I think you should open a bug report. – Duckling
Did you find a workaround to this issue? – Sonora
Try comparing the layers before saving and after restarting Colab and reloading the model. Start with the last layer; the weights should be identical. – Varsity
Could you also share how you run the testing? Note that model.generate and trainer.predict do not always produce the same outputs. – Penetrate
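The weight comparison suggested in the comments can be sketched as follows: capture the weights before saving, reload, and list every entry that differs. `diff_state_dicts` is a hypothetical helper, not a transformers or PyTorch function. It assumes both arguments are mappings like the ones returned by `model.state_dict()` and falls back to `torch.equal` when no comparison function is given.

```python
def diff_state_dicts(before, after, equal=None):
    """Return sorted names of entries that are missing on one side or differ in value."""
    if equal is None:
        # Imported lazily so the helper can also be used with plain Python values.
        import torch
        equal = torch.equal
    changed = []
    for name in sorted(set(before) | set(after)):
        if name not in before or name not in after:
            changed.append(name)
        elif not equal(before[name], after[name]):
            changed.append(name)
    return changed


# Usage with the model from the question (save_dir is a placeholder for the saved folder):
# before = {k: v.detach().cpu().clone() for k, v in model.state_dict().items()}
# reloaded = DistilBertForTokenClassification.from_pretrained(save_dir)
# after = {k: v.cpu() for k, v in reloaded.state_dict().items()}
# print(diff_state_dicts(before, after))  # an empty list means the weights match
```

A non-empty result that names only the classifier layer would point to the classification head being re-initialized on load, while differences across all layers would suggest the fine-tuned weights never reached the saved file.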
8

Try to use the following code:

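from transformers import AutoTokenizer, AutoModelForTokenClassification
# model_path: directory holding the saved config.json and weights (token classification, to match the question's model)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForTokenClassification.from_pretrained(model_path)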
Cognoscenti answered 26/8, 2022 at 10:7 Comment(2)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. – Manx
Please clarify whether model_path is the destination of .save_model() or .save_pretrained(), if you could. – Clatter
4

Have you tried loading the model that the Trainer saved in this folder:

mitmovie_pt_distilbert_uncased/results

The Hugging Face Trainer saves the model directly to the output_dir defined in TrainingArguments.
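During training the Trainer also writes intermediate checkpoints into subfolders of output_dir named checkpoint-<step>. Below is a minimal sketch for locating the most recent one, using only the standard library; `latest_checkpoint` is a hypothetical helper written for illustration, not part of transformers.

```python
import re
from pathlib import Path


def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> subfolder with the highest step, or None."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = pattern.match(child.name)
        # Keep only directories whose name matches and whose step is the largest so far.
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path
```

The returned path, if any, could then be passed to DistilBertForTokenClassification.from_pretrained() to check whether a checkpoint written during training behaves differently from the folder written by trainer.save_model().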

Adventitia answered 16/9, 2021 at 9:50 Comment(1)
Has anyone found an answer? I am having the same issue. The only difference is that I am using TensorFlow to fine-tune the model. I have tried save_pretrained and save_weights with no luck. I think the weights are not being loaded. It worked well for a moment, but it seems the order of loading, compiling, and evaluating matters. – Tippets
