Hugging Face Questions

1

SBERT's (https://www.sbert.net/) sentence-transformer library (https://pypi.org/project/sentence-transformers/) is the most popular library for producing vector embeddings of text chunks in the Pyt...
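A minimal sketch of what that library does, assuming the commonly used all-MiniLM-L6-v2 checkpoint (any SentenceTransformer model name works the same way):

from sentence_transformers import SentenceTransformer

# Encode a few text chunks into dense vectors; "all-MiniLM-L6-v2" is just a
# popular, small example checkpoint (384-dimensional embeddings).
model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["SBERT produces sentence embeddings.", "They can be compared with cosine similarity."]
embeddings = model.encode(chunks)
print(embeddings.shape)   # (2, 384)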

2

Solved

I have a custom data set with custom table entries and wanted to handle it with a custom collate function. But it didn't work when I passed a collate function I wrote (one that DOES work on an individual dat...
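A minimal sketch of how a custom collate function is usually wired into a PyTorch DataLoader; the toy rows and field names below are hypothetical stand-ins for the custom table entries:

from torch.utils.data import DataLoader

# Toy items standing in for the custom table entries
rows = [{"text": "row one", "label": 0}, {"text": "row two", "label": 1}]

def my_collate(batch):
    # batch is a list of individual dataset items; merge them into one dict of lists
    return {key: [item[key] for item in batch] for key in batch[0]}

loader = DataLoader(rows, batch_size=2, collate_fn=my_collate)
print(next(iter(loader)))   # {'text': ['row one', 'row two'], 'label': [0, 1]}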

1

I cannot seem to find an explanation of how the validation and training losses are calculated when we fine-tune a model using the Hugging Face Trainer. Does anyone know where to find this information?...
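As far as the Trainer's documented behaviour goes, the training loss comes straight from the loss the model itself returns in its forward pass (e.g. cross-entropy for classification heads), and the reported validation loss is the same quantity averaged over the evaluation set. A sketch, assuming a model that returns a loss when labels are passed, of overriding compute_loss to make that visible:

from transformers import Trainer

class VerboseTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Roughly what the default implementation does: forward the batch and
        # take the loss computed by the model's own head.
        outputs = model(**inputs)
        loss = outputs.loss
        return (loss, outputs) if return_outputs else loss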

1

Currently my custom data set gives None indices in the data loader, but NOT in the pure data set. When I wrap it in a PyTorch data loader it fails. The code is in Colab, but I will put it here in case Colab...
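For reference, a minimal custom Dataset that plays nicely with the DataLoader; the usual culprits for None items are __len__ reporting the wrong size or __getitem__ not returning a value for every index. Field names here are hypothetical:

import torch
from torch.utils.data import Dataset, DataLoader

class TableDataset(Dataset):
    def __init__(self, rows):
        self.rows = rows

    def __len__(self):
        return len(self.rows)          # must match the real number of items

    def __getitem__(self, idx):
        row = self.rows[idx]           # must return an item for every valid idx, never None
        return {"input": torch.tensor(row["input"]), "label": torch.tensor(row["label"])}

ds = TableDataset([{"input": [1, 2], "label": 0}, {"input": [3, 4], "label": 1}])
print(next(iter(DataLoader(ds, batch_size=2))))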

1

Solved

When I interleave data sets, get a tokenized batch, and feed the batch to the PyTorch data loader, I get errors: # -*- coding: utf-8 -*- """issues with dataloader and custom data sets A...
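A sketch of one way the interleave-then-load pattern can be wired up, assuming a padding collator is acceptable; the model and column names are placeholders:

from datasets import Dataset, interleave_datasets
from transformers import AutoTokenizer, DataCollatorWithPadding
from torch.utils.data import DataLoader

tok = AutoTokenizer.from_pretrained("bert-base-uncased")

ds_a = Dataset.from_dict({"text": ["first source sentence"]})
ds_b = Dataset.from_dict({"text": ["second source sentence"]})
mixed = interleave_datasets([ds_a, ds_b])

# Tokenize and drop the raw text column so every remaining column can be batched
mixed = mixed.map(lambda ex: tok(ex["text"], truncation=True), remove_columns=["text"])

loader = DataLoader(mixed, batch_size=2, collate_fn=DataCollatorWithPadding(tok))
print(next(iter(loader))["input_ids"].shape)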

2

Solved

I am creating a very simple question and answer app based on documents using llama-index. Previously, I had it working with OpenAI. Now I want to try it without any external APIs, so I'm trying the Huggin...

1

I was running the Falcon 7B tutorial locally on my RTX A6000 but got an error with an odd mismatch in a matrix multiplication: File "/lfs/hyperturing1/0/brando9/miniconda/envs/data_quality/lib/python3.10/...

0

The Llama 2 7B model on Hugging Face (meta-llama/Llama-2-7b) has a PyTorch .pth file, consolidated.00.pth, that is ~13.5 GB in size. The Hugging Face Transformers-compatible model meta-llama/Llama-2-7b-...
Epagoge asked 19/7, 2023 at 14:4
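A rough sanity check on that size, assuming the checkpoint stores weights in 16-bit precision (the -hf repo shards the same weights, so the totals should be comparable):

# Back-of-the-envelope: ~6.74 billion parameters at 2 bytes each
params = 6.74e9
bytes_per_param = 2                      # float16 / bfloat16
print(params * bytes_per_param / 1e9)    # ≈ 13.5 GB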

1

Solved

I need to use pipeline in order to get the tokenization and inference from the distilbert-base-uncased-finetuned-sst-2-english model over my dataset. My data is a list of sentences, for recreation ...
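A minimal sketch of running that checkpoint through the pipeline API over a list of sentences (the sentences are placeholders):

from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
sentences = ["I love this movie.", "The service was terrible."]
print(classifier(sentences))
# [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}]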

1

from langchain import PromptTemplate, HuggingFaceHub, LLMChain import os os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'token' # initialize HF LLM flan_t5 = HuggingFaceHub( repo_id="google/flan...
Lowbrow asked 16/5, 2023 at 17:34
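A hedged completion of that snippet, following the pre-0.1 LangChain API the excerpt uses; the repo id is assumed to be google/flan-t5-xl, and the token and model kwargs are placeholders (newer LangChain releases moved these classes into langchain_community):

import os
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "token"     # placeholder, not a real token

# initialize HF LLM; the exact repo id in the question is truncated, flan-t5-xl is a guess
flan_t5 = HuggingFaceHub(
    repo_id="google/flan-t5-xl",
    model_kwargs={"temperature": 0.1, "max_length": 64},
)

template = PromptTemplate(
    input_variables=["question"],
    template="Question: {question}\nAnswer:",
)
chain = LLMChain(llm=flan_t5, prompt=template)
print(chain.run(question="Which city is the capital of France?"))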

2

Solved

I'm trying to load a tokenizer and a seq2seq model from pretrained checkpoints. from transformers import AutoTokenizer, AutoModelForSeq2SeqLM tokenizer = AutoTokenizer.from_pretrained("ozcangundes/mt5...
Liquorish asked 7/1, 2023 at 16:56
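The checkpoint name is cut off above, so the sketch below substitutes google/mt5-small purely as a stand-in; the loading pattern is the same for any seq2seq checkpoint:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "google/mt5-small"          # stand-in for the truncated ozcangundes/mt5-* id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)   # mT5 tokenizers need sentencepiece
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

inputs = tokenizer("A short test sentence.", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(generated[0], skip_special_tokens=True))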

1

Solved

I have a pre-trained model, facebook/bart-large-mnli, and I used the Trainer to train it on my own dataset. model = BartForSequenceClassification.from_pretrained("facebook/bart-large-m...
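One detail worth noting when reusing bart-large-mnli for a different label set: the checkpoint ships a 3-label NLI head, so the head usually has to be re-sized. A sketch, with num_labels=4 as a purely hypothetical label count:

from transformers import AutoTokenizer, BartForSequenceClassification

model = BartForSequenceClassification.from_pretrained(
    "facebook/bart-large-mnli",
    num_labels=4,                    # hypothetical label count for "my own dataset"
    ignore_mismatched_sizes=True,    # re-initialise the classification head
)
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")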

1

Solved

I'm trying to fine-tune the Facebook BART model; I'm following this article in order to classify text using my own dataset. And I'm using the Trainer object to train: training_args = Traini...

3

Solved

I currently have my trainer set up as: training_args = TrainingArguments( output_dir=f"./results_{model_checkpoint}", evaluation_strategy="epoch", learning_rate=5e-5, per_de...
Heterodox asked 18/6, 2022 at 19:46
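A hedged completion of that excerpt; the truncated argument is assumed to be per_device_train_batch_size, and the checkpoint name and concrete values are placeholders:

from transformers import TrainingArguments

model_checkpoint = "distilbert-base-uncased"     # hypothetical checkpoint name
training_args = TrainingArguments(
    output_dir=f"./results_{model_checkpoint}",
    evaluation_strategy="epoch",
    learning_rate=5e-5,
    per_device_train_batch_size=16,              # assumed continuation of "per_de..."
    num_train_epochs=3,
)
print(training_args.evaluation_strategy)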

1

Solved

I want to train the "flax-community/t5-large-wikisplit" model with the "dxiao/requirements-ner-id" dataset (just for some experiments). I think my general procedure is not corre...
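A rough outline of the first steps, assuming the dataset and model ids are exactly as quoted; column names and preprocessing depend on what the dataset actually contains, so it is worth inspecting the features before mapping anything:

from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

ds = load_dataset("dxiao/requirements-ner-id")
print(ds)                          # splits and row counts
print(ds["train"].features)        # column names and label scheme

tokenizer = AutoTokenizer.from_pretrained("flax-community/t5-large-wikisplit")
# If the repo only ships Flax weights, add from_flax=True here.
model = AutoModelForSeq2SeqLM.from_pretrained("flax-community/t5-large-wikisplit")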

1

Solved

I am trying to load a large Hugging Face model with code like the following: model_from_disc = AutoModelForCausalLM.from_pretrained(path_to_model) tokenizer_from_disc = AutoTokenizer.from_pretrained(path_t...
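A sketch of memory-friendlier loading options that are often used for large checkpoints; device_map="auto" requires the accelerate package, and the path is a placeholder:

from transformers import AutoModelForCausalLM, AutoTokenizer

path_to_model = "/path/to/model"             # placeholder local path
model_from_disc = AutoModelForCausalLM.from_pretrained(
    path_to_model,
    torch_dtype="auto",         # keep the checkpoint's dtype instead of upcasting to fp32
    low_cpu_mem_usage=True,     # avoid materialising the full model twice in CPU RAM
    device_map="auto",          # spread layers across the available devices
)
tokenizer_from_disc = AutoTokenizer.from_pretrained(path_to_model)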

1

Solved

When I try to run the quick start notebook of this repo, I get the error ModuleNotFoundError: No module named 'huggingface_hub.snapshot_download'. How can I fix it? I already installed huggingface_...
Milliard asked 24/11, 2022 at 6:14
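For current huggingface_hub releases the function is imported from the package top level; the submodule-style import that some older notebooks use (from huggingface_hub.snapshot_download import ...) is what typically raises this error, so either patch that import or pin an older huggingface_hub:

from huggingface_hub import snapshot_download

# Download (or reuse the cached copy of) a repo and return the local path;
# the repo id here is just an example.
local_dir = snapshot_download("distilbert-base-uncased")
print(local_dir)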

2

Solved

How do I write a HuggingFace dataset to disk? I have made my own HuggingFace dataset using a JSONL file: Dataset({ features: ['id', 'text'], num_rows: 18 }) I would like to persist the dataset to...
Goines asked 26/4, 2022 at 23:57
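A minimal sketch of persisting and reloading a dataset with the library's own Arrow format:

from datasets import Dataset, load_from_disk

ds = Dataset.from_dict({"id": [1, 2], "text": ["first example", "second example"]})

ds.save_to_disk("my_dataset")            # writes Arrow files plus metadata to a folder
reloaded = load_from_disk("my_dataset")
print(reloaded)

If a plain JSONL file is preferred instead, Dataset.to_json serves that purpose.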

1

Solved

If this is not the best place to ask this question, please point me to a more suitable one. I am planning to use one of the Hugging Face summarization models (https://huggingface.co/models?pipeline...
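A minimal sketch of the pipeline route; sshleifer/distilbart-cnn-12-6 is just one commonly used summarization checkpoint, not necessarily the right choice for the question:

from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
text = ("Hugging Face hosts many pre-trained summarization models that can be used "
        "through the pipeline API without training anything yourself.")
print(summarizer(text, max_length=30, min_length=5, do_sample=False))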

2

I'm trying to execute the example code from the Hugging Face website: from transformers import GPTJTokenizer, TFGPTJModel import tensorflow as tf tokenizer = GPTJTokenizer.from_pretrained("Eleut...
Saratov asked 5/10, 2022 at 12:15
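As far as I can tell, transformers does not expose a GPTJTokenizer class (GPT-J reuses the GPT-2 tokenizer), which alone would make that import fail; a sketch of the documented pattern, bearing in mind that the full 6B model needs a lot of memory:

from transformers import AutoTokenizer, TFGPTJModel

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = TFGPTJModel.from_pretrained("EleutherAI/gpt-j-6B")

inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
outputs = model(inputs)
print(outputs.last_hidden_state.shape)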

1

Solved

I am fine-tuning a BERT model for a multiclass classification task. My problem is that I don't know how to add "early stopping" to those Trainer instances. Any ideas?
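A sketch of wiring EarlyStoppingCallback into an existing Trainer setup; the model and tokenised datasets are assumed to already exist in the asker's script, and the patience value is arbitrary:

from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,            # required for early stopping
    metric_for_best_model="eval_loss",
)
trainer = Trainer(
    model=model,                            # the BERT classifier already being fine-tuned
    args=training_args,
    train_dataset=train_ds,                 # tokenised datasets from the existing setup
    eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()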
