huggingface-transformers Questions
3
I'm trying to use the Hugging Face pretrained transformers model bert-base-uncased, but I want to increase dropout. There isn't any mention of this in the from_pretrained method, but colab ran the object i...
Flowerless asked 21/11, 2020 at 19:14
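A sketch of one way to do this, relying on from_pretrained forwarding unrecognized keyword arguments to the model config (the 0.3 value below is just an example):
from transformers import BertModel
# Config attributes passed to from_pretrained override the defaults,
# so dropout can be raised without touching the pretrained weights.
model = BertModel.from_pretrained(
    "bert-base-uncased",
    hidden_dropout_prob=0.3,           # config default is 0.1
    attention_probs_dropout_prob=0.3,  # config default is 0.1
)
print(model.config.hidden_dropout_prob)  # 0.3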
3
When importing pipeline from Hugging Face in a Kaggle notebook,
from transformers import pipeline
it throws this error:
/opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/__init__.py:98: U...
Mckay asked 30/5, 2023 at 9:29
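The warning comes from transformers probing for TensorFlow (which drags in tensorflow_io) at import time. A hedged workaround for a PyTorch-only notebook is to switch that probe off before the import:
import os
os.environ["USE_TF"] = "0"  # tell transformers to skip the TensorFlow backend
from transformers import pipeline  # tensorflow_io is never touched now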
3
Solved
I am trying to implement XLNet on Google Colaboratory, but I get the following issue.
ImportError:
XLNetTokenizer requires the SentencePiece library but it was not found in your environment. ...
Gratulate asked 4/1, 2021 at 5:9
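The fix the error message asks for, assuming a Colab runtime: install sentencepiece, then restart the runtime so the already-imported transformers picks it up.
!pip install sentencepiece
# After Runtime > Restart runtime:
from transformers import XLNetTokenizer
tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")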
2
Solved
I've followed this tutorial (colab notebook) in order to finetune my model.
Trying to load my locally saved model
model = AutoModelForCausalLM.from_pretrained("finetuned_model")
yields K...
Tourist asked 12/6, 2023 at 17:34
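The error is cut off, but if the tutorial saved only a PEFT/LoRA adapter into finetuned_model (an assumption here), AutoModelForCausalLM cannot read that folder on its own; loading through peft is one hedged alternative:
from peft import AutoPeftModelForCausalLM
# Reads adapter_config.json, fetches the base model it references,
# and attaches the saved adapter weights on top.
model = AutoPeftModelForCausalLM.from_pretrained("finetuned_model")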
1
Solved
I am training a sequence-to-sequence model using HuggingFace Transformers' Seq2SeqTrainer. When I execute the training process, it reports the following warning:
/path/to/python3.9/site-packages/t...
Volkman asked 13/6, 2023 at 13:12
2
Solved
In the Tokenizer documentation from Hugging Face, the call function accepts List[List[str]] and says:
text (str, List[str], List[List[str]], optional) — The sequence or batch of sequences to be enco...
Billon asked 7/6, 2023 at 10:15
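A minimal sketch of what List[List[str]] means in practice: each inner list is one example that is already split into words, which must be flagged with is_split_into_words.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = [["Hello", "world"], ["Tokenizers", "accept", "pre-split", "words"]]
enc = tok(batch, is_split_into_words=True, padding=True)
print(tok.convert_ids_to_tokens(enc["input_ids"][0]))
# ['[CLS]', 'hello', 'world', '[SEP]', '[PAD]', ...] -> padded to the batch max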
1
Solved
I need to use pipeline in order to get the tokenization and inference from the distilbert-base-uncased-finetuned-sst-2-english model over my dataset.
My data is a list of sentences, for recreation ...
Gosser asked 8/6, 2023 at 17:26
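A sketch, assuming the data is a plain Python list of strings; the pipeline accepts the whole list and returns one dict per sentence.
from transformers import pipeline

clf = pipeline("sentiment-analysis",
               model="distilbert-base-uncased-finetuned-sst-2-english")
sentences = ["I love this.", "This is awful.", "It was fine, I guess."]
print(clf(sentences, batch_size=8))
# [{'label': 'POSITIVE', 'score': 0.99...}, {'label': 'NEGATIVE', ...}, ...]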
1
Solved
HuggingFace offers training_args like below. When I use the HF Trainer to train my model, I found that cuda:0 is used by default.
I went through the HuggingFace Docs, but still don't know how to specify whi...
External asked 7/6, 2023 at 21:20
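The Trainer grabs whatever CUDA device the process can see, so the common approach is to restrict visibility before anything CUDA-related is imported; a sketch assuming GPU 1 is the target:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # must run before torch initializes CUDA

from transformers import TrainingArguments

args = TrainingArguments(output_dir="out")
print(args.device)  # cuda:0, which is now physical GPU 1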
2
I have the following code:
import transformers
from transformers import pipeline
# Load the language model pipeline
model = pipeline("text-generation", model="gpt2")
# Input se...
Gurge asked 3/6, 2023 at 20:23
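For reference, a complete version of that snippet with the generation call filled in (the prompt and lengths are arbitrary examples):
from transformers import pipeline

model = pipeline("text-generation", model="gpt2")
outputs = model("The meaning of life is",
                max_new_tokens=30,        # generate up to 30 fresh tokens
                num_return_sequences=2,   # two independent samples
                do_sample=True)
for out in outputs:
    print(out["generated_text"])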
1
I'm trying to install the Guanaco language model (https://arxiv.org/abs/2305.14314) using pip install guanaco for a text classification model, but I am getting an error.
Failed to build guanaco
ERROR: Could n...
Chazan asked 31/5, 2023 at 9:26
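The guanaco package on PyPI is unrelated to the Guanaco model from that paper, which is distributed as LoRA adapter weights on the Hugging Face Hub rather than as a pip package. A hedged sketch, assuming the timdettmers/guanaco-7b adapter and a LLaMA-7B base checkpoint:
!pip install transformers peft accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")
model = PeftModel.from_pretrained(base, "timdettmers/guanaco-7b")  # adapter on top
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")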
2
Solved
Here is the code block which caused the error
training_args = TrainingArguments(
    output_dir="my_awesome_mind_model",
    evaluation_strategy="epoch",
    save_strategy="epoch"...
Danzig asked 11/5, 2023 at 8:28
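The traceback is cut off, so this is only a guess: one frequent failure when constructing TrainingArguments in this period was a missing or outdated accelerate dependency. If the error text differs, the fix will too.
!pip install -U accelerate
# Restart the kernel, then re-run:
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="my_awesome_mind_model",
    evaluation_strategy="epoch",
    save_strategy="epoch",
)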
3
Solved
In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting the ex...
Quent asked 11/5, 2022 at 13:52
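A quick check of what max_length counts, using bert-base-uncased as an example: the limit applies to the total sequence including the special tokens, so max_length-2 content tokens survive.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
ids = tok.encode("one two three four five six seven", max_length=5, truncation=True)
print(tok.convert_ids_to_tokens(ids))
# ['[CLS]', 'one', 'two', 'three', '[SEP]'] -> 5 tokens total, 3 of them content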
1
Solved
I have a pre-trained model, facebook/bart-large-mnli. I used the Trainer in order to train it on my own dataset.
model = BartForSequenceClassification.from_pretrained("facebook/bart-large-m...
Ingratitude asked 4/5, 2023 at 7:32
2
Solved
How is it possible to initialize BERT with random weights? I want to compare the performance of multilingual vs monolingual vs randomly initialized BERT in a masked language modeling task. While in...
Hospitalize asked 20/6, 2021 at 17:57
6
Solved
I simply tried the sample code from the Hugging Face website: https://huggingface.co/albert-base-v2
from transformers import AlbertTokenizer, AlbertModel
tokenizer = AlbertTokenizer.from_pretrained('al...
Flee asked 23/1, 2021 at 1:0
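The excerpt cuts off mid-snippet; for reference, the model-card example continues roughly as below, and (as with XLNet above) AlbertTokenizer needs the sentencepiece package installed first.
from transformers import AlbertTokenizer, AlbertModel
tokenizer = AlbertTokenizer.from_pretrained('albert-base-v2')
model = AlbertModel.from_pretrained('albert-base-v2')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state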
1
Solved
I need a model that is able to classify text for an unknown number of classes (i.e. the number might grow over time). The entailment approach for zero-shot text classification seems to be the solut...
Coming asked 9/5, 2023 at 23:1
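A sketch of the entailment-based approach: candidate labels are passed at call time, so the label set can grow without retraining anything.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
labels = ["billing", "shipping", "refund"]  # extend this list whenever needed
print(classifier("My package never arrived.", candidate_labels=labels))
# {'labels': [...ranked labels...], 'scores': [...], 'sequence': ...}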
4
I'm trying to train a model using a Trainer, according to the documentation (https://huggingface.co/transformers/master/main_classes/trainer.html#transformers.Trainer) I can specify a tokenizer:
t...
Lankester asked 24/9, 2020 at 13:13
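A minimal sketch of passing the tokenizer: the Trainer then uses it for padding-aware collation and saves it next to the model checkpoints. The two-example toy dataset is only illustrative.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
train_dataset = [
    {**tokenizer("great movie", truncation=True), "labels": 1},
    {**tokenizer("terrible movie", truncation=True), "labels": 0},
]
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # enables dynamic padding and checkpoint saving
)
trainer.train()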
3
I'm trying to get the sentiments for comments with the help of a Hugging Face sentiment analysis pretrained model. It returns an error like Token indices sequence length is longer than the specified...
Kaufmann asked 5/4, 2021 at 14:33
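The warning means some comments exceed the model's 512-token limit. A hedged sketch for recent transformers versions: pass truncation through the pipeline call so long inputs are cut instead of overflowing.
from transformers import pipeline

clf = pipeline("sentiment-analysis",
               model="distilbert-base-uncased-finetuned-sst-2-english")
long_comment = "really very long comment " * 200
# Tokenizer kwargs passed at call time truncate the input to the model limit.
print(clf(long_comment, truncation=True, max_length=512))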
3
I am using Huggingface BERT for an NLP task. My texts contain names of companies which are split up into subwords.
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
tokenizer.encod...
Haploid asked 3/11, 2020 at 19:29
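One common remedy is to register the company names as whole tokens and resize the embedding matrix; the new rows start randomly initialized, so some further training helps. The names below are placeholders.
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

tokenizer.add_tokens(["initech", "megacorp"])     # hypothetical company names
model.resize_token_embeddings(len(tokenizer))     # adds randomly initialized rows
print(tokenizer.tokenize("initech shares rose"))  # ['initech', 'shares', 'rose']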
1
How to add new tokens to an existing Huggingface AutoTokenizer?
Canonically, there's this tutorial from Huggingface https://huggingface.co/learn/nlp-course/chapter6/2 but it ends on the note of "...
Ametropia asked 8/5, 2023 at 6:41
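For adding tokens to an existing tokenizer (rather than training a new one, as in that chapter), the add_tokens / add_special_tokens pair is the standard route; a sketch on GPT-2:
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.add_tokens(["neuralnet"])  # plain vocabulary additions
tokenizer.add_special_tokens({"additional_special_tokens": ["<|sep|>"]})
model.resize_token_embeddings(len(tokenizer))  # keep embeddings in sync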
1
Solved
I'm trying to fine-tune the Facebook BART model. I'm following this article in order to classify text using my own dataset.
And I'm using the Trainer object in order to train:
training_args = Traini...
Comity asked 25/4, 2023 at 8:36
3
I am not able to import LLaMATokenizer.
Any solution for this problem?
I am using the code from this repo:
https://github.com/zphang/transformers/tree/llama_push
and am trying to load the models and toke...
Lachance asked 1/4, 2023 at 17:51
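The zphang branch predates the merge into mainline transformers, where the class landed with different capitalization; assuming transformers >= 4.28 and already-converted weights, the import is:
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/converted-llama")  # placeholder path
model = LlamaForCausalLM.from_pretrained("path/to/converted-llama")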
1
Solved
Following this HuggingFace Anonymisation Tutorial.
Using pytorch 2.0.0 and transformers-4.28.1
Running the code as it is, I get an error from the custom pipeline:
def anonymize(text):
    ents = pipe(...
Coif asked 19/4, 2023 at 15:17
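The excerpt cuts off inside anonymize, so the sketch below is a guess at the intent rather than the tutorial's exact code: an NER pipeline with aggregation, replacing detected spans from right to left so character offsets stay valid. The model name is an assumption.
from transformers import pipeline

pipe = pipeline("token-classification",
                model="dslim/bert-base-NER",     # assumed NER checkpoint
                aggregation_strategy="simple")   # merge word pieces into entities

def anonymize(text):
    # Sort spans right to left so earlier offsets are unaffected by edits.
    ents = sorted(pipe(text), key=lambda e: e["start"], reverse=True)
    for ent in ents:
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text

print(anonymize("Angela Merkel met Emmanuel Macron in Paris."))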
1
I have a train dataset of size 4107.
DatasetDict({
    train: Dataset({
        features: ['input_ids'],
        num_rows: 4107
    })
    valid: Dataset({
        features: ['input_ids'],
        num_rows: 498
    })
})
In my training ...
Superstition asked 13/4, 2023 at 7:6
3
Solved
I currently have my trainer set up as:
training_args = TrainingArguments(
    output_dir=f"./results_{model_checkpoint}",
    evaluation_strategy="epoch",
    learning_rate=5e-5,
    per_de...
Heterodox asked 18/6, 2022 at 19:46