Difference between AutoModelForSeq2SeqLM and AutoModelForCausalLM

As per the title, how are these two Auto classes in Hugging Face Transformers different from each other? I tried reading the documentation but did not find differentiating information.
Intuitively, AutoModelForSeq2SeqLM is used for language models with an encoder-decoder architecture, like T5 and BART, while AutoModelForCausalLM is used for auto-regressive (decoder-only) language models, like all the GPT models.

These two classes are convenience APIs that automatically infer the specific model class for a given checkpoint, e.g., AutoModelForCausalLM.from_pretrained('gpt2') returns a GPT2LMHeadModel. You can see which concrete classes each Auto class can resolve to in the source code (the MODEL_FOR_CAUSAL_LM_MAPPING and MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING dictionaries).
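As a quick sanity check, here is a minimal sketch (assuming the transformers library and a PyTorch backend are installed; the 'gpt2' and 't5-small' checkpoints are just illustrative choices) showing how each Auto class resolves a checkpoint name to its concrete model class:

```python
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM

# Decoder-only (auto-regressive) checkpoint resolves to its causal-LM class.
causal_model = AutoModelForCausalLM.from_pretrained("gpt2")
print(type(causal_model).__name__)   # GPT2LMHeadModel

# Encoder-decoder checkpoint resolves to its seq2seq (conditional generation) class.
seq2seq_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
print(type(seq2seq_model).__name__)  # T5ForConditionalGeneration
```

Passing a checkpoint of the wrong architecture (e.g., loading 'gpt2' with AutoModelForSeq2SeqLM) raises an error, since GPT-2 has no entry in the seq2seq mapping.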