How to convert a PyTorch nn.Module into a HuggingFace PreTrainedModel object?

Asked 4/10, 2022 at 12:56 Answered 18/10, 2022 at 10:50

python machine-learning pytorch huggingface-transformers pre-trained-model

Given a simple neural net in PyTorch like:

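import torch
import torch.nn as nn

# Use the GPU when available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)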

How do I convert it into a Huggingface PreTrainedModel object?

The goal is to wrap the PyTorch nn.Module object built with nn.Sequential in a Huggingface PreTrainedModel object, and then run something like:

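import torch
import torch.nn as nn
from transformers.modeling_utils import PreTrainedModel

# Use the GPU when available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
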
net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)

# Do something to convert the Pytorch nn.Module to the PreTrainedModel object.
shiny_model = do_some_magic(net, some_args, some_kwargs)

# Save the shiny model that is a `PreTrainedModel` object.
shiny_model.save_pretrained("shiny-model")

PreTrainedModel.from_pretrained("shiny-model")

It also seems that building or converting a native PyTorch model into a Huggingface one requires a configuration: https://huggingface.co/docs/transformers/main_classes/configuration

There are many how-tos to train models "from scratch", e.g.

[Using BertLMHeadModel, not really from scratch] https://www.kaggle.com/code/mojammel/train-model-from-scratch-with-huggingface/notebook (this is also fine-tuning from BERT, not from scratch)
[Not really from scratch, using RoBERTa as a template] https://huggingface.co/blog/how-to-train (this is fine-tuning from RoBERTa, not really training from scratch)
[Sort of uses a Config template] https://www.thepythoncode.com/article/pretraining-bert-huggingface-transformers-in-python (this is more or less from scratch, but it uses the BERT template to generate the config; if we want to change how the model works, what should the config look like?)
[Sort of defines a template, but using RobertaForMaskedLM] https://skimai.com/roberta-language-model-for-spanish/ (this seems to define a template, but it is restricted to the RobertaForMaskedLM template)

Questions in parts:

If we have a much simpler PyTorch model like the one in the code snippet above, how do we create a PreTrainedModel from scratch in Huggingface?
How do we create the PretrainedConfig that Huggingface needs for the conversion from a native PyTorch nn.Module to work?

Sidwel answered 4/10, 2022 at 12:56

You will need to define a custom configuration class and a custom model class. It is important to set the model_type attribute on the configuration class and the config_class attribute on the model class:

import torch.nn as nn
from transformers import PreTrainedModel, PretrainedConfig
from transformers import AutoModel, AutoConfig

class MyConfig(PretrainedConfig):
    model_type = 'mymodel'
    def __init__(self, important_param=42, **kwargs):
        super().__init__(**kwargs)
        self.important_param = important_param

class MyModel(PreTrainedModel):
    config_class = MyConfig
    def __init__(self, config):
        # PreTrainedModel.__init__ already stores the config as self.config.
        super().__init__(config)
        self.model = nn.Sequential(
                          nn.Linear(3, self.config.important_param),
                          nn.Sigmoid(),
                          nn.Linear(self.config.important_param, 1),
                          nn.Sigmoid()
                          )
    def forward(self, input):
        return self.model(input)

Now you can create (and, of course, train) a new model, save it, and then load it locally:

config = MyConfig(4)
model = MyModel(config)
model.save_pretrained('./my_model_dir')

new_model = MyModel.from_pretrained('./my_model_dir')
new_model
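To connect this back to the question's `do_some_magic` step, here is a minimal sketch of one way to carry the weights of an existing nn.Sequential into the wrapper. It assumes the plain network and the wrapper's inner `model` have identical layer shapes, so their state dicts line up; the names `shiny_model` and `reloaded` are only illustrative, and a temporary directory stands in for './my_model_dir'.

```python
import tempfile

import torch
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel


class MyConfig(PretrainedConfig):
    model_type = "mymodel"

    def __init__(self, important_param=42, **kwargs):
        super().__init__(**kwargs)
        self.important_param = important_param


class MyModel(PreTrainedModel):
    config_class = MyConfig

    def __init__(self, config):
        super().__init__(config)
        self.model = nn.Sequential(
            nn.Linear(3, config.important_param),
            nn.Sigmoid(),
            nn.Linear(config.important_param, 1),
            nn.Sigmoid(),
        )

    def forward(self, inputs):
        return self.model(inputs)


# An existing plain PyTorch network (imagine it has already been trained).
net = nn.Sequential(nn.Linear(3, 4), nn.Sigmoid(), nn.Linear(4, 1), nn.Sigmoid())

# Build a wrapper with matching layer sizes, then copy the weights across.
# Assumption: the layer shapes of `net` and `shiny_model.model` are identical.
shiny_model = MyModel(MyConfig(important_param=4))
shiny_model.model.load_state_dict(net.state_dict())

# Round trip through disk and check that the reloaded model agrees with `net`.
with tempfile.TemporaryDirectory() as tmp_dir:
    shiny_model.save_pretrained(tmp_dir)
    reloaded = MyModel.from_pretrained(tmp_dir)

reloaded.eval()
x = torch.randn(2, 3)
with torch.no_grad():
    outputs_match = torch.allclose(net(x), reloaded(x))
print(outputs_match)
```

The design choice here is that the config only describes the architecture (the layer width), while the trained values travel through the state dict.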

If you wish to use AutoModel, you will have to register your classes:

AutoConfig.register("mymodel", MyConfig)
AutoModel.register(MyConfig, MyModel)

new_model = AutoModel.from_pretrained('./my_model_dir')
new_model
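As far as I understand it, the Auto classes find the right class through the `model_type` string that `save_pretrained` writes into config.json, which is why the registration above is needed. The sketch below is a hedged illustration of that: it uses a cut-down one-layer `MyModel` and a temporary directory (both my own simplifications), prints a few fields of the saved config.json, and then loads the model back through AutoModel.

```python
import json
import os
import tempfile

import torch.nn as nn
from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel


class MyConfig(PretrainedConfig):
    model_type = "mymodel"

    def __init__(self, important_param=42, **kwargs):
        super().__init__(**kwargs)
        self.important_param = important_param


class MyModel(PreTrainedModel):
    config_class = MyConfig

    def __init__(self, config):
        super().__init__(config)
        # Simplified to a single layer for this illustration.
        self.model = nn.Linear(3, config.important_param)

    def forward(self, inputs):
        return self.model(inputs)


# Tell the Auto classes which config and model belong to "mymodel".
AutoConfig.register("mymodel", MyConfig)
AutoModel.register(MyConfig, MyModel)

with tempfile.TemporaryDirectory() as tmp_dir:
    MyModel(MyConfig(important_param=4)).save_pretrained(tmp_dir)

    # Inspect what was written next to the weights.
    with open(os.path.join(tmp_dir, "config.json")) as f:
        saved = json.load(f)

    # AutoModel resolves the class from the saved model_type.
    auto_loaded = AutoModel.from_pretrained(tmp_dir)

print(saved["model_type"])
print(saved["important_param"])
print(saved.get("architectures"))
print(type(auto_loaded).__name__)
```

Note that the registration only lives in the current Python process, so it has to run again in every script that calls AutoModel.from_pretrained on this directory.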

Fawcett answered 18/10, 2022 at 10:50

One way to do this is to put the model inside a class that inherits from PreTrainedModel. The wrapped model could be, for example, a pretrained resnet34, a timm model, or your "net" model. I recommend reading the documentation for more details about configs; I'll use an example from this link: https://huggingface.co/docs/transformers/custom_models#sharing-custom-models

Configs (Note: you can add extra fields of your own, for example version, and read them back from config.json later.)

from transformers import PretrainedConfig
from typing import List

class ModelConfig(PretrainedConfig):
    model_type = "mymodel"
    def __init__(
        self,
        version = 1,
        layers: List[int] = [3, 4, 6, 3],
        num_classes: int = 1000,
        input_channels: int = 3,
        stem_type: str = "",
        **kwargs,
    ):
        if stem_type not in ["", "deep", "deep-tiered"]:
            raise ValueError(f"`stem_type` must be '', 'deep' or 'deep-tiered', got {stem_type}.")

        self.version = version
        self.layers = layers
        self.num_classes = num_classes
        self.input_channels = input_channels
        self.stem_type = stem_type
        super().__init__(**kwargs)
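To illustrate the note about reading custom fields back later, here is a small sketch that saves a config on its own and restores it. It uses a trimmed-down version of the config above (only `version`, `layers` and `stem_type`), a temporary directory, and an invalid `stem_type` value of "wide" that I made up to show the validation firing.

```python
import tempfile
from typing import List, Optional

from transformers import PretrainedConfig


class ModelConfig(PretrainedConfig):
    model_type = "mymodel"

    def __init__(
        self,
        version: int = 1,
        layers: Optional[List[int]] = None,
        stem_type: str = "",
        **kwargs,
    ):
        if stem_type not in ["", "deep", "deep-tiered"]:
            raise ValueError(
                f"`stem_type` must be '', 'deep' or 'deep-tiered', got {stem_type}."
            )
        self.version = version
        # Avoid sharing one mutable default list between instances.
        self.layers = [3, 4, 6, 3] if layers is None else layers
        self.stem_type = stem_type
        super().__init__(**kwargs)


config = ModelConfig(version=2, layers=[2, 2], stem_type="deep")

# A config can be saved and restored without any model weights.
with tempfile.TemporaryDirectory() as tmp_dir:
    config.save_pretrained(tmp_dir)  # writes config.json
    restored = ModelConfig.from_pretrained(tmp_dir)

print(restored.version, restored.layers, restored.stem_type)

# The validation in __init__ rejects unknown stem types.
try:
    ModelConfig(stem_type="wide")  # "wide" is a made-up invalid value
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```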

Your net model which, as I said, could just as well be a resnet34:

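import torch
from transformers import PreTrainedModel
from torch import nn

# Fall back to the CPU when no GPU is available.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)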
      
class MyModel(PreTrainedModel):
    config_class = ModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.model = net
        
    def forward(self, tensor):
        return self.model(tensor)
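One thing worth knowing about this pattern: because `__init__` assigns the module-level `net` directly, every instance of the class points at the same underlying module and therefore the same weights. The sketch below demonstrates that, and shows `copy.deepcopy` as one possible way to give each instance its own copy. The class names `SharedModel` and `CopiedModel` and the bare-bones `ModelConfig` are illustrative only, and everything stays on the CPU.

```python
import copy

import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel

# Module-level network, as in the snippet above (kept on the CPU here).
net = nn.Sequential(nn.Linear(3, 4), nn.Sigmoid(), nn.Linear(4, 1), nn.Sigmoid())


class ModelConfig(PretrainedConfig):
    model_type = "mymodel"


class SharedModel(PreTrainedModel):
    config_class = ModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.model = net  # every instance points at the same module

    def forward(self, tensor):
        return self.model(tensor)


class CopiedModel(PreTrainedModel):
    config_class = ModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.model = copy.deepcopy(net)  # each instance gets its own weights

    def forward(self, tensor):
        return self.model(tensor)


config = ModelConfig()
shares_weights = SharedModel(config).model is SharedModel(config).model
copies_weights = CopiedModel(config).model is not CopiedModel(config).model
print(shares_weights, copies_weights)
```

Sharing is harmless when only one instance exists, but training one instance of the shared variant would also change every other instance.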

Test the model

import torch

config = ModelConfig()
model = MyModel(config)
# Create the input on whichever device the model's weights live on.
dummy_input = torch.randn(1, 3).to(model.device)
with torch.no_grad():
    output = model(dummy_input)
print(output.shape)

Push to the Hugging Face Hub (note: you need to log in with a token first, and you can push more than once to update the model)

model.push_to_hub("mymodel-test")

Download the model (Note: this loads through your own MyModel class. If you want a model that lives inside the library, like ..bert.modeling_bert.BertModel, I think you need to follow the library's structure.)

my_model = MyModel.from_pretrained("User/mymodel-test")

Broyles answered 14/10, 2022 at 3:51
