huggingface-transformers Questions

3

I'm trying to use the Hugging Face pretrained transformers model bert-base-uncased, but I want to increase dropout. There is no mention of this in the from_pretrained method, but Colab ran the object i...
Flowerless asked 21/11, 2020 at 19:14
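
A minimal sketch of one way to raise dropout at load time, relying on from_pretrained forwarding unknown keyword arguments to the model config (the 0.3 value is an arbitrary example):

    from transformers import BertModel

    # Extra kwargs that match config fields override them at load time,
    # so the dropout probabilities can be raised without editing files
    model = BertModel.from_pretrained(
        "bert-base-uncased",
        hidden_dropout_prob=0.3,           # config default is 0.1
        attention_probs_dropout_prob=0.3,  # config default is 0.1
    )
    print(model.config.hidden_dropout_prob)  # 0.3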

3

When importing pipeline from Hugging Face in a Kaggle notebook, from transformers import pipeline throws this error: /opt/conda/lib/python3.10/site-packages/tensorflow_io/python/ops/__init__.py:98: U...
Mckay asked 30/5, 2023 at 9:29
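
The traceback is truncated, but if the failure comes from transformers probing the TensorFlow stack at import time, one hedged workaround is to tell it to skip TF entirely; USE_TF is an environment variable transformers checks, though whether it cures this particular Kaggle error is an assumption:

    import os

    # Must be set before transformers is imported, so the library
    # never tries to import the broken tensorflow_io package
    os.environ["USE_TF"] = "0"

    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")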

3

Solved

I am trying to implement XLNet on Google Colaboratory. But I get the following issue. ImportError: XLNetTokenizer requires the SentencePiece library but it was not found in your environment. ...
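
The error names the missing dependency directly; a sketch of the usual fix on Colab (restart the runtime after installing so the new package is picked up):

    # In a Colab cell: !pip install sentencepiece, then restart the runtime
    from transformers import XLNetTokenizer

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")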

2

Solved

I've followed this tutorial (colab notebook) in order to finetune my model. Trying to load my locally saved model model = AutoModelForCausalLM.from_pretrained("finetuned_model") yields K...
Tourist asked 12/6, 2023 at 17:34
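
The error message is cut off, but many fine-tuning notebooks save only a PEFT/LoRA adapter rather than a full model, in which case AutoModelForCausalLM.from_pretrained on the adapter folder fails. A hedged sketch of loading such an adapter ("base_model_name" is a placeholder for whatever checkpoint the notebook started from; that the tutorial used PEFT is an assumption):

    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Load the original base checkpoint first, then attach the saved adapter
    base = AutoModelForCausalLM.from_pretrained("base_model_name")
    model = PeftModel.from_pretrained(base, "finetuned_model")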

1

Solved

I am training a sequence-to-sequence model using HuggingFace Transformers' Seq2SeqTrainer. When I execute the training process, it reports the following warning: /path/to/python3.9/site-packages/t...
Volkman asked 13/6, 2023 at 13:12
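
The warning text is truncated, so its cause can't be pinned down here; for reference, a minimal self-contained Seq2SeqTrainer setup looks like the sketch below (t5-small and the toy data are placeholders, not from the question):

    from datasets import Dataset
    from transformers import (
        AutoModelForSeq2SeqLM,
        AutoTokenizer,
        DataCollatorForSeq2Seq,
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    # Tiny toy corpus so the example runs end to end
    raw = Dataset.from_dict(
        {"src": ["translate English to French: hello"], "tgt": ["bonjour"]}
    )

    def preprocess(example):
        enc = tokenizer(example["src"], truncation=True)
        enc["labels"] = tokenizer(text_target=example["tgt"], truncation=True)["input_ids"]
        return enc

    train_dataset = raw.map(preprocess, remove_columns=["src", "tgt"])

    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="out", predict_with_generate=True),
        train_dataset=train_dataset,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()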

2

Solved

In the Tokenizer documentation from Hugging Face, the call function accepts List[List[str]] and says: text (str, List[str], List[List[str]], optional) — The sequence or batch of sequences to be enco...
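
A sketch showing what the List[List[str]] form is for: each inner list is one pre-tokenized sequence, signalled with is_split_into_words=True:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    # A batch of two pre-tokenized sequences
    batch = [["Hello", "world"], ["Hugging", "Face", "tokenizers"]]

    # is_split_into_words=True tells the tokenizer the inner lists are
    # already-split words of one sequence each
    enc = tokenizer(batch, is_split_into_words=True, padding=True)
    print(enc["input_ids"])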

1

Solved

I need to use pipeline in order to get the tokenization and inference from the distilbert-base-uncased-finetuned-sst-2-english model over my dataset. My data is a list of sentences, for recreation ...
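
A sketch of running that checkpoint over a list of sentences through the pipeline API; the pipeline accepts the list directly and returns one result per sentence:

    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )

    sentences = ["I love this movie.", "The service was terrible."]

    # One {"label": ..., "score": ...} dict comes back per input sentence
    for result in classifier(sentences):
        print(result)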

1

Solved

HuggingFace offers training_args like below. When I use the HF Trainer to train my model, I found that cuda:0 is used by default. I went through the HuggingFace Docs, but still don't know how to specify whi...
External asked 7/6, 2023 at 21:20
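
One common approach, sketched under the assumption that a single specific GPU is wanted: restrict visibility with CUDA_VISIBLE_DEVICES before anything initializes CUDA, and the Trainer will treat that card as cuda:0:

    import os

    # Hide all GPUs except physical device 1; must run before
    # torch/transformers touch CUDA
    os.environ["CUDA_VISIBLE_DEVICES"] = "1"

    from transformers import TrainingArguments

    training_args = TrainingArguments(output_dir="out")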

2

I have the following code import transformers from transformers import pipeline # Load the language model pipeline model = pipeline("text-generation", model="gpt2") # Input se...
Gurge asked 3/6, 2023 at 20:23
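
The snippet's code, completed into a runnable sketch (the prompt and generation settings are placeholders):

    from transformers import pipeline

    # Load the language model pipeline
    generator = pipeline("text-generation", model="gpt2")

    outputs = generator(
        "Once upon a time",
        max_new_tokens=30,       # cap the length of the continuation
        do_sample=True,          # sample instead of greedy decoding
        num_return_sequences=2,  # produce two alternative continuations
    )
    for out in outputs:
        print(out["generated_text"])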

1

I'm trying to install the guanaco language model https://arxiv.org/abs/2305.14314 using pip install guanaco for a text classification model, but I am getting an error. Failed to build guanaco ERROR: Could n...
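
The Guanaco language model from that paper isn't installed via pip; its weights are released on the Hugging Face Hub as QLoRA adapters over a LLaMA base. A hedged sketch (the huggyllama/llama-7b base and timdettmers/guanaco-7b adapter ids are assumptions about which release is wanted):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Load the LLaMA base model, then attach the Guanaco adapter
    base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")
    model = PeftModel.from_pretrained(base, "timdettmers/guanaco-7b")
    tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")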

2

Solved

Here is the code block which caused the error training_args = TrainingArguments( output_dir="my_awesome_mind_model", evaluation_strategy="epoch", save_strategy="epoch"...
Danzig asked 11/5, 2023 at 8:28
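
The error itself is cut off; two frequent causes with this exact setup are an outdated accelerate install (recent Trainer versions require accelerate>=0.20.1) and mismatched strategies when load_best_model_at_end=True. A hedged sketch of a consistent configuration:

    # pip install "accelerate>=0.20.1"  # assumption: often the missing piece
    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="my_awesome_mind_model",
        evaluation_strategy="epoch",  # must match save_strategy whenever
        save_strategy="epoch",        # load_best_model_at_end=True is set
        load_best_model_at_end=True,
    )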

3

Solved

In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting the ex...
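
A quick check of that behavior: with truncation=True and max_length=8, BERT's two special tokens occupy two of the eight slots, leaving six content tokens:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    enc = tokenizer(
        "a very long sentence that will certainly be truncated here",
        truncation=True,
        max_length=8,
    )
    print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
    # ['[CLS]', 'a', 'very', 'long', 'sentence', 'that', 'will', '[SEP]']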

1

Solved

I have a pre-trained model from facebook/bart-large-mnli I used the Trainer in order to train it on my own dataset. model = BartForSequenceClassification.from_pretrained("facebook/bart-large-m...
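
The question body is truncated, but a common stumbling block when fine-tuning this particular checkpoint is that its classification head carries the three MNLI labels; if one's own dataset has a different label count, the head must be re-initialized (num_labels=5 below is a placeholder):

    from transformers import BartForSequenceClassification

    model = BartForSequenceClassification.from_pretrained(
        "facebook/bart-large-mnli",
        num_labels=5,                  # placeholder for the dataset's label count
        ignore_mismatched_sizes=True,  # drop the 3-way MNLI head and re-init
    )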

2

Solved

How is it possible to initialize BERT with random weights? I want to compare the performance of multilingual vs monolingual vs randomly initialized BERT in a masked language modeling task. While in...
Hospitalize asked 20/6, 2021 at 17:57
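
A sketch of the usual pattern: build the model from a config object instead of from_pretrained, which yields the same architecture with randomly initialized weights:

    from transformers import BertConfig, BertForMaskedLM

    # BertConfig() carries bert-base defaults; no pretrained weights are loaded
    config = BertConfig()
    model = BertForMaskedLM(config)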

6

Solved

I simply tried the sample code from the Hugging Face website: https://huggingface.co/albert-base-v2 from transformers import AlbertTokenizer, AlbertModel tokenizer = AlbertTokenizer.from_pretrained('al...
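
The snippet stops before the error appears; for reference, the model-card example completed into a runnable form (a missing sentencepiece install is a guess at the usual failure with ALBERT's tokenizer):

    # pip install sentencepiece  # the ALBERT tokenizer depends on it
    from transformers import AlbertTokenizer, AlbertModel

    tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
    model = AlbertModel.from_pretrained("albert-base-v2")

    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)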

1

Solved

I need a model that is able to classify text for an unknown number of classes (i.e. the number might grow over time). The entailment approach for zero-shot text classification seems to be the solut...
Coming asked 9/5, 2023 at 23:1
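
A sketch of why the entailment approach fits a growing label set: the candidate labels are supplied per call, so new classes need no retraining (the labels below are placeholders):

    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

    # The label list is an inference-time argument and can be extended freely
    labels = ["billing", "technical issue", "account"]
    print(classifier("My invoice shows the wrong amount", candidate_labels=labels))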

4

I'm trying to train a model using a Trainer, according to the documentation (https://huggingface.co/transformers/master/main_classes/trainer.html#transformers.Trainer) I can specify a tokenizer: t...
Lankester asked 24/9, 2020 at 13:13
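
A minimal sketch of passing the tokenizer to the Trainer; doing so lets it pad batches dynamically and saves the tokenizer alongside checkpoints (dataset arguments are omitted for brevity):

    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out"),
        tokenizer=tokenizer,  # used for padding and saved with the model
    )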

3

I'm trying to get the sentiments for comments with the help of a Hugging Face sentiment analysis pretrained model. It's returning an error like Token indices sequence length is longer than the specified...
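
That message means a comment exceeds the model's 512-token window. One hedged fix is to forward truncation to the tokenizer through the pipeline call:

    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )

    very_long_comment = "great product, would buy again " * 200

    # truncation=True is passed through to the tokenizer, clipping the
    # input to the model's maximum length instead of erroring
    print(classifier(very_long_comment, truncation=True))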

3

I am using Huggingface BERT for an NLP task. My texts contain names of companies which are split up into subwords. tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased') tokenizer.encod...
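
A sketch of inspecting how a company name is split and mapping the pieces back to words via the fast tokenizer's word_ids():

    from transformers import BertTokenizerFast

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

    enc = tokenizer("Accenture hired me")
    print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
    # e.g. ['[CLS]', 'acc', '##ent', '##ure', 'hired', 'me', '[SEP]']

    # word_ids() maps each subword back to its source word index,
    # so company-name pieces can be regrouped downstream
    print(enc.word_ids())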

1

How to add new tokens to an existing Huggingface AutoTokenizer? Canonically, there's this tutorial from Huggingface https://huggingface.co/learn/nlp-course/chapter6/2 but it ends on the note of "...
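
A sketch of the add-to-existing-vocabulary route (as opposed to training a new tokenizer from scratch): add_tokens extends the vocab in place, and the model's embedding matrix must then be resized to match:

    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # Returns how many of the proposed tokens were actually new
    num_added = tokenizer.add_tokens(["deeplearning", "huggingface"])

    # The new embedding rows are randomly initialized and learned
    # during fine-tuning
    model.resize_token_embeddings(len(tokenizer))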

1

Solved

I'm trying to finetune the Facebook BART model, I'm following this article in order to classify text using my own dataset. And I'm using the Trainer object in order to train: training_args = Traini...
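
A sketch of the dataset side that the Trainer consumes in such a setup, with placeholder texts and labels standing in for the questioner's data:

    from datasets import Dataset
    from transformers import BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")

    # Placeholder rows; the real dataset supplies its own texts and labels
    raw = Dataset.from_dict({"text": ["good film", "dull film"], "label": [1, 0]})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

    train_dataset = raw.map(tokenize, batched=True)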

3

I am not able to import LLaMATokenizer. Any solution for this problem? I am using the code of this repo: https://github.com/zphang/transformers/tree/llama_push and trying to load the models and toke...
Lachance asked 1/4, 2023 at 17:51
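
In released transformers (4.28 and later, which merged that fork's work upstream) the class uses Llama casing rather than LLaMA, so the import spelling is usually the culprit:

    # LLaMATokenizer only exists in the pre-merge fork; upstream uses:
    from transformers import LlamaTokenizer

    # huggyllama/llama-7b is a placeholder for whichever converted
    # checkpoint is actually available locally or on the Hub
    tokenizer = LlamaTokenizer.from_pretrained("huggyllama/llama-7b")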

1

Solved

Following this HuggingFace Anonymisation Tutorial, using PyTorch 2.0.0 and transformers 4.28.1. Running the code as it is, I get an error over the custom pipeline: def anonymize(text): ents = pipe(...
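
The error is truncated, but for reference, a self-contained version of that anonymization pattern, assuming a token-classification pipeline with aggregated entities (dslim/bert-base-NER is a placeholder model choice):

    from transformers import pipeline

    pipe = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

    def anonymize(text):
        ents = pipe(text)
        # Replace spans right-to-left so earlier offsets stay valid
        for ent in sorted(ents, key=lambda e: e["start"], reverse=True):
            text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
        return text

    print(anonymize("John Smith works at Accenture in Berlin."))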

1

I have a train dataset of size 4107. DatasetDict({ train: Dataset({ features: ['input_ids'], num_rows: 4107 }) valid: Dataset({ features: ['input_ids'], num_rows: 498 }) }) In my training ...
Superstition asked 13/4, 2023 at 7:6
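
The question body is cut off, but a frequent point of confusion with a 4107-row train split is how the Trainer derives its step count; a sketch of the arithmetic, with assumed hyperparameters:

    import math

    num_rows = 4107                   # from the DatasetDict above
    per_device_batch_size = 8         # assumption
    gradient_accumulation_steps = 4   # assumption
    num_epochs = 3                    # assumption

    effective_batch = per_device_batch_size * gradient_accumulation_steps
    steps_per_epoch = math.ceil(num_rows / effective_batch)  # 129
    print(steps_per_epoch * num_epochs)                      # 387 total steps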

3

Solved

I currently have my trainer set up as: training_args = TrainingArguments( output_dir=f"./results_{model_checkpoint}", evaluation_strategy="epoch", learning_rate=5e-5, per_de...
Heterodox asked 18/6, 2022 at 19:46
