How to convert a PyTorch nn.Module into a HuggingFace PreTrainedModel object?
Given a simple neural net in PyTorch like:

import torch
import torch.nn as nn

# Run on GPU if available.
device = "cuda" if torch.cuda.is_available() else "cpu"

net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)

How do I convert it into a HuggingFace PreTrainedModel object?

The goal is to convert the PyTorch nn.Module built with nn.Sequential into a HuggingFace PreTrainedModel object, so that I can run something like:

import torch
import torch.nn as nn
from transformers.modeling_utils import PreTrainedModel

device = "cuda" if torch.cuda.is_available() else "cpu"

net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)

# Do something to convert the Pytorch nn.Module to the PreTrainedModel object.
shiny_model = do_some_magic(net, some_args, some_kwargs)

# Save the shiny model that is a `PreTrainedModel` object.
shiny_model.save_pretrained("shiny-model")

PreTrainedModel.from_pretrained("shiny-model")

And it seems that building/converting any native PyTorch model into a HuggingFace one requires some configuration: https://huggingface.co/docs/transformers/main_classes/configuration

There are many how-tos for training models "from scratch", e.g.

My questions, in parts:

  • If we have a much simpler PyTorch model like the one in the code snippet above, how do we create a PreTrainedModel from scratch in HuggingFace?

  • How do we create the pretrained-model config that HuggingFace needs, so that the conversion from a native PyTorch nn.Module works?

Sidwel asked 4/10, 2022 at 12:56
You will need to define a custom configuration class and a custom model class. It is important to define the model_type attribute on the configuration class and the config_class attribute on the model class:

import torch.nn as nn
from transformers import PreTrainedModel, PretrainedConfig
from transformers import AutoModel, AutoConfig

class MyConfig(PretrainedConfig):
    model_type = 'mymodel'
    def __init__(self, important_param=42, **kwargs):
        super().__init__(**kwargs)
        self.important_param = important_param

class MyModel(PreTrainedModel):
    config_class = MyConfig
    def __init__(self, config):
        super().__init__(config)
        self.config = config
        # Build the network from the config, so that from_pretrained()
        # can reconstruct it from the saved config.json.
        self.model = nn.Sequential(
                          nn.Linear(3, self.config.important_param),
                          nn.Sigmoid(),
                          nn.Linear(self.config.important_param, 1),
                          nn.Sigmoid()
                          )
    def forward(self, inputs):
        return self.model(inputs)

Now you can create (and, obviously, train) a new model, then save it and load it back locally:

config = MyConfig(4)
model = MyModel(config)
model.save_pretrained('./my_model_dir')

new_model = MyModel.from_pretrained('./my_model_dir')
new_model
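
As a quick sanity check, you can verify that the round trip restored the weights. This is a minimal sketch; depending on your transformers version, the weights land in pytorch_model.bin or model.safetensors next to config.json.

import torch

# Compare every parameter before and after save_pretrained/from_pretrained.
for (name, p_old), (_, p_new) in zip(model.named_parameters(),
                                     new_model.named_parameters()):
    assert torch.equal(p_old, p_new), f"mismatch in {name}"
print("round trip OK")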

If you wish to use AutoModel, you will have to register your classes:

AutoConfig.register("mymodel", MyConfig)
AutoModel.register(MyConfig, MyModel)

new_model = AutoModel.from_pretrained('./my_model_dir')
new_model
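
To actually convert the already-trained net from the question instead of training a new model, you can copy its weights into the wrapper, since MyModel.model has the same nn.Sequential structure. A minimal sketch, assuming the layer sizes match:

config = MyConfig(4)
shiny_model = MyModel(config)
# The wrapped nn.Sequential mirrors the original `net`,
# so their state dicts line up one-to-one.
shiny_model.model.load_state_dict(net.state_dict())
shiny_model.save_pretrained("shiny-model")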
Fawcett answered 18/10, 2022 at 10:50
One way to do this is to put the model inside a class that inherits from PreTrainedModel. The wrapped model could be, for example, a pretrained resnet34, a timm model, or your "net" model. I recommend reading the documentation for more details about configs; I'll use an example from this link: https://huggingface.co/docs/transformers/custom_models#sharing-custom-models

Configs (note: you can add extra config fields, for example version, and read them back from config.json later):

from transformers import PretrainedConfig
from typing import List

class ModelConfig(PretrainedConfig):
    model_type = "mymodel"
    def __init__(
        self,
        version = 1,
        layers: List[int] = [3, 4, 6, 3],
        num_classes: int = 1000,
        input_channels: int = 3,
        stem_type: str = "",
        **kwargs,
    ):
        if stem_type not in ["", "deep", "deep-tiered"]:
            raise ValueError(f"`stem_type` must be '', 'deep' or 'deep-tiered', got {stem_type}.")

        self.version = version
        self.layers = layers
        self.num_classes = num_classes
        self.input_channels = input_channels
        self.stem_type = stem_type
        super().__init__(**kwargs)
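
To see the note above in action: saving this config writes all of those fields to config.json, and you can read them back. A small sketch; './cfg_dir' is an arbitrary local path.

cfg = ModelConfig(version=2)
cfg.save_pretrained('./cfg_dir')   # writes ./cfg_dir/config.json
print(ModelConfig.from_pretrained('./cfg_dir').version)  # -> 2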

Your net model, which, as I said, could be a resnet34:

import torch
from torch import nn
from transformers import PreTrainedModel

# Fall back to CPU if no GPU is available.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)

class MyModel(PreTrainedModel):
    config_class = ModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.model = net
        
    def forward(self, tensor):
        return self.model(tensor)

Test the model

config = ModelConfig()
model = MyModel(config)
dummy_input = torch.randn(1, 3).to(device)
with torch.no_grad():
    output = model(dummy_input)
print(output.shape)

Push to the HuggingFace Hub (note: you need to log in with a token first, and you can push more than once to update the model):

model.push_to_hub("mymodel-test")
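
If you also want others to load it through the Auto classes, the custom-models guide linked above registers the classes for auto-loading before pushing. A minimal sketch; "User/mymodel-test" below stands in for your own username and repo name.

# Tag the custom classes so their code is pushed alongside the weights.
ModelConfig.register_for_auto_class()
MyModel.register_for_auto_class("AutoModel")
model.push_to_hub("mymodel-test")

# Anyone can then load the model with:
# from transformers import AutoModel
# loaded = AutoModel.from_pretrained("User/mymodel-test", trust_remote_code=True)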

Download the model (note: this uses the MyModel class directly; if you want a model laid out like ..bert.modeling_bert.BertModel, I think you need to follow the library's module structure):

my_model = MyModel.from_pretrained("User/mymodel-test")
Broyles answered 14/10, 2022 at 3:51
