I am trying to reload a fine-tuned DistilBertForTokenClassification model. I am using transformers 3.4.0 and pytorch version 1.6.0+cu101. After using the Trainer to train the downloaded model, I save the model with trainer.save_model() and in my trouble shooting I save in a different directory via model.save_pretrained(). I am using Google Colab and saving the model to my Google drive. After testing the model I also evaluated the model on my test getting great results, however, when I return to the notebook (or Factory restart the colab notebook) and try to reload the model, the predictions are terrible. Upon checking the directories, the config.json file is there as is the pytorch_mode.bin. Below is the full code.
from transformers import DistilBertForTokenClassification
# load the pretrained model from huggingface
#model = DistilBertForTokenClassification.from_pretrained('distilbert-base-cased', num_labels=len(uniq_labels))
model = DistilBertForTokenClassification.from_pretrained('distilbert-base-uncased', num_labels=len(uniq_labels))
model.to('cuda');
from transformers import Trainer, TrainingArguments
training_args = TrainingArguments(
output_dir = model_dir + 'mitmovie_pt_distilbert_uncased/results', # output directory
#overwrite_output_dir = True,
evaluation_strategy='epoch',
num_train_epochs=3, # total number of training epochs
per_device_train_batch_size=16, # batch size per device during training
per_device_eval_batch_size=64, # batch size for evaluation
warmup_steps=500, # number of warmup steps for learning rate scheduler
weight_decay=0.01, # strength of weight decay
logging_dir = model_dir + 'mitmovie_pt_distilbert_uncased/logs', # directory for storing logs
logging_steps=10,
load_best_model_at_end = True
)
trainer = Trainer(
model = model, # the instantiated π€ Transformers model to be trained
args = training_args, # training arguments, defined above
train_dataset = train_dataset, # training dataset
eval_dataset = test_dataset # evaluation dataset
)
trainer.train()
trainer.evaluate()
model_dir = '/content/drive/My Drive/Colab Notebooks/models/'
trainer.save_model(model_dir + 'mitmovie_pt_distilbert_uncased/model')
# alternative saving method and folder
model.save_pretrained(model_dir + 'distilbert_testing')
Coming back to the notebook after restarting...
from transformers import DistilBertForTokenClassification, DistilBertConfig, AutoModelForTokenClassification
# retreive the saved model
model = DistilBertForTokenClassification.from_pretrained(model_dir + 'mitmovie_pt_distilbert_uncased/model',
local_files_only=True)
model.to('cuda')
Model predictions are terrible now from either directory, however, the model does work and outputs the number of classes I would expect, it appears that the actual trained weights have not been saved or are somehow not getting loaded.