Given a simple neural net in PyTorch like:
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = nn.Sequential(
    nn.Linear(3, 4),
    nn.Sigmoid(),
    nn.Linear(4, 1),
    nn.Sigmoid()
).to(device)
How do I convert it into a Hugging Face PreTrainedModel object?
The goal is to convert the PyTorch nn.Module object from nn.Sequential into the Hugging Face PreTrainedModel object, then run something like:
import torch
import torch.nn as nn
from transformers.modeling_utils import PreTrainedModel

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = nn.Sequential(
    nn.Linear(3, 4),
    nn.Sigmoid(),
    nn.Linear(4, 1),
    nn.Sigmoid()
).to(device)

# Do something to convert the PyTorch nn.Module to the PreTrainedModel object.
# (do_some_magic, some_args and some_kwargs are placeholders for the missing step.)
shiny_model = do_some_magic(net, some_args, some_kwargs)

# Save the shiny model that is a `PreTrainedModel` object.
shiny_model.save_pretrained("shiny-model")

# Load the saved model back.
PreTrainedModel.from_pretrained("shiny-model")
It seems that building or converting any native PyTorch model into a Hugging Face one requires a configuration object: https://huggingface.co/docs/transformers/main_classes/configuration
There are many how-tos to train models "from scratch", e.g.
[Using BertLMHeadModel, not really from scratch] https://www.kaggle.com/code/mojammel/train-model-from-scratch-with-huggingface/notebook (this is also fine-tuning from BERT, not training from scratch)
[Not really from scratch, using RoBERTa as a template] https://huggingface.co/blog/how-to-train (this is fine-tuning from RoBERTa, not really training from scratch)
[Sort of uses a config template] https://www.thepythoncode.com/article/pretraining-bert-huggingface-transformers-in-python (this is more or less from scratch, but it uses the BERT template to generate the config; if we want to change how the model works, what should the config look like?)
[Defines a kind of template, but using RobertaForMaskedLM] https://skimai.com/roberta-language-model-for-spanish/ (this seems to define a template, but restricts it to the RobertaForMaskedLM architecture)
Questions in parts:
If we have a much simpler PyTorch model like the one in the code snippet above, how do we create a PreTrainedModel from scratch in Hugging Face?
How do we create the pretrained model config that Hugging Face needs to make the conversion from a native PyTorch nn.Module work?
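For reference, here is a minimal sketch of what I imagine the conversion could look like, assuming the intended route is to subclass PretrainedConfig and PreTrainedModel. The names TinyConfig, TinyModel, the "tiny-mlp" model type and the config fields are all hypothetical, made up for this example. It also assumes the model has to be reloaded through the subclass rather than through PreTrainedModel.from_pretrained directly, and that a no-op _init_weights is acceptable because the weights are copied from the existing net.

```python
import tempfile

import torch
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel


class TinyConfig(PretrainedConfig):
    # Hypothetical identifier for this sketch; not a registered model type.
    model_type = "tiny-mlp"

    def __init__(self, in_features=3, hidden_features=4, out_features=1, **kwargs):
        # Store the layer sizes so the architecture can be rebuilt from config.json.
        self.in_features = in_features
        self.hidden_features = hidden_features
        self.out_features = out_features
        super().__init__(**kwargs)


class TinyModel(PreTrainedModel):
    # Tells from_pretrained which config class to instantiate.
    config_class = TinyConfig

    def __init__(self, config):
        super().__init__(config)
        # Same layout as the nn.Sequential in the question, sized from the config.
        self.net = nn.Sequential(
            nn.Linear(config.in_features, config.hidden_features),
            nn.Sigmoid(),
            nn.Linear(config.hidden_features, config.out_features),
            nn.Sigmoid(),
        )
        self.post_init()

    def _init_weights(self, module):
        # Assumption: keep PyTorch's default initialization untouched.
        pass

    def forward(self, x):
        return self.net(x)


# The plain PyTorch model from the question (kept on CPU for simplicity).
net = nn.Sequential(
    nn.Linear(3, 4),
    nn.Sigmoid(),
    nn.Linear(4, 1),
    nn.Sigmoid(),
)

# "do_some_magic": build the wrapper and copy the existing weights into it.
shiny_model = TinyModel(TinyConfig())
shiny_model.net.load_state_dict(net.state_dict())

# Round-trip through save_pretrained / from_pretrained in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    shiny_model.save_pretrained(tmp_dir)
    reloaded = TinyModel.from_pretrained(tmp_dir)

# Check that the reloaded model reproduces the original net's outputs.
x = torch.rand(2, 3)
with torch.no_grad():
    outputs_match = torch.allclose(net(x), reloaded(x))
```

If this is roughly the right direction, the remaining question is whether the config needs anything beyond the layer sizes, and whether there is a way to do this without writing a wrapper class for every nn.Sequential.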