transformer-model Questions
5
Solved
I am developing a language model like https://pytorch.org/tutorials/beginner/transformer_tutorial.html.
It is not clear to me whether positional encoding is necessary here.
As far as I unders...
Messenger asked 26/4, 2020 at 11:54
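For context, a minimal sketch of the fixed sinusoidal encoding used in that tutorial (function and variable names here are illustrative); without it, self-attention treats the input as an unordered set:
import math
import torch

def sinusoidal_positional_encoding(max_len, d_model):
    # Fixed (max_len, d_model) table of sine/cosine encodings.
    position = torch.arange(max_len).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

# Without this addition, self-attention has no notion of token order.
x = torch.randn(10, 512)  # 10 token embeddings of width 512
x = x + sinusoidal_positional_encoding(10, 512)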
5
I am having a hard time understanding the position-wise feed-forward network in the transformer architecture.
Let's take a machine translation task as an example, where the inputs are sentences. From the figu...
Votive asked 2/1, 2023 at 5:59
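For context, the position-wise feed-forward network is just the same two-layer MLP applied to every position independently; a minimal sketch with the paper's default dimensions:
import torch
from torch import nn

class PositionwiseFeedForward(nn.Module):
    # The same two-layer MLP applied to each position independently.
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))

    def forward(self, x):
        # x: (batch, seq_len, d_model); nn.Linear acts on the last dim,
        # so every position is transformed separately with shared weights.
        return self.net(x)

out = PositionwiseFeedForward()(torch.randn(2, 7, 512))  # -> (2, 7, 512)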
4
Solved
I've been looking to use Hugging Face's Pipelines for NER (named entity recognition). However, it is returning the entity labels in inside-outside-beginning (IOB) format but without the IOB labels....
Dahl asked 30/3, 2020 at 18:58
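For context, a hedged sketch: recent transformers releases can merge the B-/I- pieces into whole entity spans via aggregation_strategy (older releases used grouped_entities=True); the model name below is only an example:
from transformers import pipeline

ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")
print(ner("Hugging Face is based in New York City."))  # whole spans, not B-/I- pieces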
13
Solved
For example, I want to download bert-base-uncased on https://huggingface.co/models, but can't find a 'Download' link. Or is it not downloadable?
Swank asked 19/5, 2021 at 0:34
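For context, one common approach is to let from_pretrained fetch the weights into the local cache and then write them out with save_pretrained (the huggingface_hub package also offers snapshot_download for raw files); a minimal sketch:
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")    # downloads into the cache
model = AutoModel.from_pretrained("bert-base-uncased")
tok.save_pretrained("./bert-base-uncased")                  # writes the files locally
model.save_pretrained("./bert-base-uncased")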
4
I'm currently studying the code of the Transformer, but I cannot understand the masked multi-head attention in the decoder. The paper says it is there to prevent the model from seeing the word being generated, but I cannot unders...
Bonbon asked 27/9, 2019 at 2:40
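For context, the decoder's mask is usually an upper-triangular matrix of -inf, so position i can attend only to positions up to i; a minimal sketch:
import torch

seq_len = 5
# Position i may attend only to positions <= i, so the decoder
# cannot "see" the word it is currently generating.
mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
print(mask)  # usable as attn_mask in nn.MultiheadAttention / nn.Transformer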
2
I've been experimenting with stacking language models recently and noticed something interesting: the output embeddings of BERT and XLNet are not the same as the input embeddings. For example, this...
Atropos asked 2/4, 2020 at 17:23
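For context, a sketch of the distinction: the input embeddings are a static lookup table, while the model's outputs are contextualized vectors of the same shape but different values:
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tok("hello world", return_tensors="pt")
static = model.get_input_embeddings()(inputs["input_ids"])  # plain lookup table
with torch.no_grad():
    contextual = model(**inputs).last_hidden_state          # after all encoder layers
print(static.shape, contextual.shape)  # same shape, different values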
1
https://colab.research.google.com/drive/11u6leEKvqE0CCbvDHHKmCxmW5GxyjlBm?usp=sharing
The setup.py file is in the transformers folder (the root directory), but this error occurs when I run
!git clone https://gi...
Hirohito asked 3/5, 2023 at 17:21
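For context, a typical from-source install in Colab, assuming the repo being cloned is huggingface/transformers (one command per cell):
!git clone https://github.com/huggingface/transformers.git
%cd transformers
!pip install -e .   # editable install picks up setup.py in the repo root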
2
I understand that WordPiece is used to break text into tokens. And I understand that, somewhere in BERT, the model maps tokens into token embeddings that represent the meaning of the tokens. But wh...
Meill asked 27/9, 2023 at 18:3
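For context, a minimal sketch of the two steps: WordPiece turns text into token ids, and a plain embedding lookup inside BERT turns those ids into vectors:
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

ids = tok("transformers are great", return_tensors="pt")["input_ids"]
print(tok.convert_ids_to_tokens(ids[0]))       # WordPiece pieces plus [CLS]/[SEP]
vectors = model.get_input_embeddings()(ids)    # lookup table: (1, seq_len, 768)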
2
In this part of TensorFlow's tutorial here, they mention that they are training with teacher forcing. To my knowledge, teacher forcing involves feeding the target output into the model so that it...
Under asked 18/7, 2019 at 17:12
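For context, a minimal sketch of teacher forcing: the decoder input at step t is the ground-truth token from step t-1, not the model's own previous prediction:
import torch

target = torch.tensor([[1, 5, 9, 4, 2]])   # <start> ... <end> (illustrative ids)
decoder_input = target[:, :-1]             # fed into the decoder
labels = target[:, 1:]                     # what the decoder must predict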
1
Solved
Is there any built-in positional encoding in PyTorch? Basically, I want to be able to specify the dimension of the encoding and then get the i-th encoding for every i.
Dendriform asked 8/11, 2023 at 9:57
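For context: to my knowledge core PyTorch ships no such module, but the sinusoidal table is a few lines; a sketch with the requested per-index interface (even d_model assumed):
import torch

def positional_encoding(i, d_model):
    # Sinusoidal encoding for position i; assumes d_model is even.
    k = torch.arange(0, d_model, 2, dtype=torch.float)
    angle = i / torch.pow(10000.0, k / d_model)
    enc = torch.zeros(d_model)
    enc[0::2] = torch.sin(angle)
    enc[1::2] = torch.cos(angle)
    return enc

print(positional_encoding(3, 16))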
2
Solved
I was recently reading the BERT source code from the Hugging Face project. I noticed that the so-called "learnable position encoding" seems to refer to a specific nn.Parameter layer when ...
Torpor asked 25/7, 2022 at 17:37
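For context, a minimal sketch of what such a layer amounts to: one trainable vector per position, stored in an nn.Parameter and updated by backprop like any other weight:
import torch
from torch import nn

class LearnedPositionalEncoding(nn.Module):
    def __init__(self, max_len=512, d_model=768):
        super().__init__()
        # One trainable vector per position, trained like any other weight.
        self.pe = nn.Parameter(torch.zeros(max_len, d_model))
        nn.init.normal_(self.pe, std=0.02)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        return x + self.pe[: x.size(1)]

out = LearnedPositionalEncoding()(torch.randn(2, 7, 768))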
3
Solved
I am trying to run XLNet on Google Colaboratory, but I get the following issue.
ImportError:
XLNetTokenizer requires the SentencePiece library but it was not found in your environment. ...
Gratulate asked 4/1, 2021 at 5:9
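For context, the usual fix is simply to install the missing dependency in the Colab runtime (and restart the runtime afterwards):
!pip install sentencepiece   # XLNetTokenizer is built on the SentencePiece library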
1
I'm trying to install the guanaco language model (https://arxiv.org/abs/2305.14314) with pip install guanaco for a text classification model, but I am getting an error.
Failed to build guanaco
ERROR: Could n...
Chazan asked 31/5, 2023 at 9:26
2
Solved
BERT's output is not deterministic.
I expect the output values to be deterministic when I give it the same input, but with my BERT model the values keep changing. Awkward as it sounds, the same value is returned twice...
Statolatry asked 17/6, 2019 at 23:17
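For context, the usual culprit is dropout, which stays active until the model is put in evaluation mode; a minimal sketch:
import torch
from transformers import BertModel, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()   # disables dropout, the usual source of run-to-run variation

inputs = tok("same input, same output", return_tensors="pt")
with torch.no_grad():
    a = model(**inputs).last_hidden_state
    b = model(**inputs).last_hidden_state
print(torch.equal(a, b))  # True once dropout is off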
2
Solved
I am following this blog on transformers
http://jalammar.github.io/illustrated-transformer/
The only thing I don't understand is why there needs to be a stack of encoders or decoders. I understan...
Overbalance asked 18/12, 2019 at 0:57
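For context, in code the "stack" is just N identical layers applied in sequence, each refining the previous layer's output; a minimal sketch:
import torch
from torch import nn

layer = nn.TransformerEncoderLayer(d_model=512, nhead=8)
encoder = nn.TransformerEncoder(layer, num_layers=6)   # the stack: 6 identical layers

x = torch.randn(10, 2, 512)   # (seq_len, batch, d_model)
out = encoder(x)              # each layer refines the previous one's output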
1
I want to build a classification model that needs only the encoder part of language models. I have tried BERT, RoBERTa, and XLNet, and so far I have been successful.
I now want to test the encoder part...
Taconite asked 7/4, 2022 at 20:56
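For context, a hedged sketch of an encoder-only classifier: any model that returns last_hidden_state can be wrapped with a small linear head (class and variable names here are illustrative):
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

class EncoderClassifier(nn.Module):
    def __init__(self, name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(name)
        self.head = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, **inputs):
        hidden = self.encoder(**inputs).last_hidden_state   # (batch, seq, hidden)
        return self.head(hidden[:, 0])                      # first token's vector

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
logits = EncoderClassifier()(**tok("a test sentence", return_tensors="pt"))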
1
The OpenAI documentation for the model attribute in the fine-tune API states a bit confusingly:
model
The name of the base model to fine-tune. You can select one of "ada", "ba...
Drida asked 26/6, 2022 at 0:35
2
Solved
I am learning the Transformer. Here is the PyTorch documentation for MultiheadAttention. In their implementation, I saw there is a constraint:
assert self.head_dim * num_heads == self.embed_dim, "...
Yardage asked 26/2, 2021 at 16:45
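For context, the constraint exists because the embedding is split across heads: head_dim = embed_dim // num_heads, so embed_dim must divide evenly; a minimal sketch:
import torch
from torch import nn

mha = nn.MultiheadAttention(embed_dim=512, num_heads=8)   # 512 / 8 = 64 per head
x = torch.randn(10, 2, 512)                               # (seq, batch, embed)
out, weights = mha(x, x, x)

# nn.MultiheadAttention(embed_dim=512, num_heads=7) would fail: 512 % 7 != 0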
5
I am following this tutorial to learn about the Trainer API.
https://huggingface.co/transformers/training.html
I copied the code as below:
from datasets import load_dataset
import numpy as np
from...
Farflung asked 20/5, 2021 at 17:31
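For context, since the copied code is cut off above, here is a hedged reconstruction in the spirit of that tutorial (dataset, model name, and subset sizes are assumptions, not the asker's exact code):
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")
tok = AutoTokenizer.from_pretrained("bert-base-cased")
tokenized = dataset.map(lambda ex: tok(ex["text"], truncation=True), batched=True)

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(1000)),  # small subset
    eval_dataset=tokenized["test"].shuffle(seed=42).select(range(1000)),
    tokenizer=tok,
)
trainer.train()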
1
Getting this error: AttributeError: 'GPT2Tokenizer' object has no attribute 'train_new_from_iterator'
My code is very similar to the Hugging Face documentation; I changed only the input, and that's it (shouldn't affe...
Harlen asked 22/4, 2022 at 20:43
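For context, train_new_from_iterator exists only on the fast (Rust-backed) tokenizer classes, which explains the AttributeError on the slow GPT2Tokenizer; a minimal sketch with placeholder data:
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", use_fast=True)   # GPT2TokenizerFast
corpus = ["some text", "more text"]                          # placeholder data
new_tok = tok.train_new_from_iterator(iter(corpus), vocab_size=1000)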
2
error: "Unknown category '2' encountered. Set add_nan=True to allow unknown categories" while creating time series dataset in pytorch forecasting.
training = TimeSeriesDataSet(
train,
tim...
Buckshot asked 13/2, 2022 at 6:47
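For context, a hedged sketch of what the message points at: passing a NaNLabelEncoder with add_nan=True for the offending column lets unseen categories map to an unknown bucket (the tiny dataframe and column names are invented for illustration):
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data import NaNLabelEncoder

train = pd.DataFrame({
    "time_idx": [0, 1, 2, 3],
    "series": ["a", "a", "a", "a"],
    "category": ["0", "1", "0", "1"],
    "target": [1.0, 2.0, 3.0, 4.0],
})

training = TimeSeriesDataSet(
    train,
    time_idx="time_idx",
    target="target",
    group_ids=["series"],
    max_encoder_length=2,
    max_prediction_length=1,
    time_varying_known_categoricals=["category"],
    time_varying_unknown_reals=["target"],
    # add_nan=True allows categories (like '2') that were never seen
    # when the encoder was fitted on the training data.
    categorical_encoders={"category": NaNLabelEncoder(add_nan=True)},
)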
2
Solved
I have roughly 2 million sentences that I want to turn into vectors using Facebook AI's RoBERTa-large, fine-tuned on NLI and STSB for sentence similarity (using the awesome sentence-transformers pac...
Underpinnings asked 4/5, 2020 at 8:50
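For context, a minimal sketch with sentence-transformers: encode() batches internally, so a large corpus is processed in GPU-sized chunks (the model name below is the NLI+STSB RoBERTa-large from the question):
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("roberta-large-nli-stsb-mean-tokens")
sentences = ["first sentence", "second sentence"]   # stand-in for the ~2M sentences
embeddings = model.encode(sentences, batch_size=64, show_progress_bar=True)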
1
I am struggling to mask my input for the MultiHeadAttention Layer. I am using the Transformer Block from Keras documentation with self-attention. I could not find any example code online so far and...
Mcardle asked 2/6, 2021 at 12:29
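For context, a hedged sketch: Keras's MultiHeadAttention accepts a boolean attention_mask of shape (batch, target_len, source_len), where True means "may attend":
import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=32)

x = tf.random.normal((1, 4, 64))                         # (batch, seq_len, features)
# Last position is treated as padding: no query may attend to it.
mask = tf.constant([[[True, True, True, False]] * 4])    # (1, 4, 4)
out = mha(query=x, value=x, attention_mask=mask)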
1
Solved
I have several masked language models (mainly BERT, RoBERTa, ALBERT, and ELECTRA). I also have a dataset of sentences. How can I get the perplexity of each sentence?
From the huggingface documentation ...
Haskel asked 23/12, 2021 at 15:50
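For context, one common approach for masked LMs is the "pseudo-perplexity": mask each token in turn and average the model's negative log-likelihood of the original token; a sketch (not the only definition in the literature):
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_perplexity(sentence):
    ids = tok(sentence, return_tensors="pt")["input_ids"]
    nll = []
    for i in range(1, ids.size(1) - 1):            # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[0, i] = tok.mask_token_id           # hide one token at a time
        with torch.no_grad():
            logits = model(masked).logits
        log_probs = logits[0, i].log_softmax(dim=-1)
        nll.append(-log_probs[ids[0, i]].item())
    return float(torch.tensor(nll).mean().exp())

print(pseudo_perplexity("The cat sat on the mat."))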
1
Solved
I'm using PyTorch to run a Transformer model. When I want to split the (tokenized) data, I use this code:
train_dataset, test_dataset = torch.utils.data.random_split(
tokenized_datasets,
[train...
Tripinnate asked 8/12, 2021 at 12:27
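For context, since the snippet is cut off above, a hedged completion of the usual pattern (sizes and seed are illustrative; newer PyTorch also accepts fractions like [0.8, 0.2]):
import torch
from torch.utils.data import random_split

tokenized_datasets = list(range(100))   # stand-in for the real tokenized dataset
train_size = int(0.8 * len(tokenized_datasets))
test_size = len(tokenized_datasets) - train_size

train_dataset, test_dataset = random_split(
    tokenized_datasets, [train_size, test_size],
    generator=torch.Generator().manual_seed(42),   # reproducible split
)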