Using the encoder part only from T5 model

I want to build a classification model that needs only the encoder part of a language model. I have tried BERT, RoBERTa, and XLNet, and so far I have been successful.

I now want to test the encoder-only part of T5. So far, I have found EncT5 (https://github.com/monologg/EncT5)

and T5EncoderModel from Hugging Face Transformers.

Can anyone help me understand if T5EncoderModel is what I am looking for or not?

It says in the description: "The bare T5 Model transformer outputting encoder’s raw hidden-states without any specific head on top."

This is slightly confusing to me, especially since EncT5 mentions that they implemented the encoder-only part because it didn't exist in Hugging Face, which is what makes me more doubtful here.

Please note that I am a beginner in deep learning, so please go easy on me; I understand that my questions can seem naive to most of you.

Thank you

Taconite answered 7/4, 2022 at 20:56 Comment(1)
Just try it: to use the hidden states for classification, you only need to add an additional classification layer, and that is all. To make things easy for you, note that both the BERT and T5 encoders are Transformer encoders, which you clearly understand according to your question. So yes, you can use it. Homebred
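
A minimal sketch of that idea, assuming PyTorch and Hugging Face Transformers (the wrapper class, num_labels, and the pooling choice are illustrative, not part of the comment above):

import torch.nn as nn
from transformers import T5EncoderModel

class T5EncoderClassifier(nn.Module):
    # Hypothetical wrapper: T5 encoder + a single linear classification layer
    def __init__(self, model_name="t5-base", num_labels=2):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask=None):
        # Encoder hidden states: (batch, seq_len, d_model)
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Simple mean over the token dimension; with padded batches you would
        # mask out the padding first (see the pooling note in the answer below)
        pooled = hidden.mean(dim=1)
        return self.classifier(pooled)  # (batch, num_labels) logits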

Load only the T5 encoder part of the checkpoint:

from transformers import T5EncoderModel

# Silence warnings about decoder weights in the checkpoint; the encoder-only model ignores them
T5EncoderModel._keys_to_ignore_on_load_unexpected = ["decoder.*"]
auto_model = T5EncoderModel.from_pretrained("t5-base")
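
As a quick sanity check (a sketch; the example sentence is arbitrary), you can run the loaded encoder and look at the shape of its output:

import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
inputs = tokenizer("This movie was great!", return_tensors="pt")
with torch.no_grad():
    outputs = auto_model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768) for t5-base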

Note that T5 doesn't have a CLS token, so you should use another pooling strategy (mean pooling, etc.) for your classification task.
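
A minimal mask-aware mean pooling sketch (my own illustration, reusing inputs and outputs from the example above); the result can then be fed to a classification layer:

def mean_pool(last_hidden_state, attention_mask):
    # Zero out padding positions, then average over the real tokens only
    mask = attention_mask.unsqueeze(-1).float()        # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)     # (batch, d_model)
    counts = mask.sum(dim=1).clamp(min=1e-9)           # avoid division by zero
    return summed / counts

sentence_embedding = mean_pool(outputs.last_hidden_state, inputs["attention_mask"])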

Guimar answered 6/2, 2023 at 8:15 Comment(0)
