bert-language-model Questions

4

I want to fine-tune LaBSE for question answering using the SQuAD dataset, and I got this error: ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,p...
Barbital asked 9/8, 2022 at 10:43
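
One likely cause, sketched below as an assumption: the checkpoint was loaded as a bare encoder, which returns only hidden states, so the Trainer has no loss to compute; loading it with a question-answering head fixes that (the checkpoint name is assumed).

    from transformers import AutoTokenizer, AutoModelForQuestionAnswering

    # A bare encoder returns only last_hidden_state etc., so Trainer finds
    # no loss; the QA head adds start/end span logits and a loss.
    model_name = "sentence-transformers/LaBSE"  # assumed checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)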

3

Solved

I am trying to fine-tune the BERT language model on my own data. I've gone through their docs, but their tasks don't seem to be quite what I need, since my end goal is embedding text. Here's my code:...
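
A minimal sketch of one common route when the end goal is embeddings rather than a standard task head: fine-tune through the sentence-transformers library on scored text pairs (the model name, pair texts, and label are illustrative assumptions, not the asker's data).

    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, InputExample, losses

    # Wraps bert-base-uncased with a mean-pooling layer to produce one
    # vector per text; the single example stands in for real training data.
    model = SentenceTransformer("bert-base-uncased")
    train_examples = [InputExample(texts=["first text", "a similar text"], label=0.9)]
    loader = DataLoader(train_examples, shuffle=True, batch_size=16)
    loss = losses.CosineSimilarityLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)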

3

I am using the SentenceTransformers library (here: https://pypi.org/project/sentence-transformers/#pretrained-models) for creating embeddings of sentences using the pre-trained model bert-base-nli-...
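
For reference, a minimal usage sketch of that library (the full model name is assumed from the truncated snippet):

    from sentence_transformers import SentenceTransformer

    # encode() returns one fixed-size vector per input sentence.
    model = SentenceTransformer("sentence-transformers/bert-base-nli-mean-tokens")
    embeddings = model.encode(["This is a sentence.", "This is another one."])
    print(embeddings.shape)  # (2, 768)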

5

I am trying to install bertopic and I got this error: pip install bertopic Collecting bertopic > Using cached bertopic-0.11.0-py2.py3-none-any.whl (76 kB) > Collecting hdbscan>=0.8.2...
Deficient asked 29/7, 2022 at 22:16

2

I have installed PyTorch 1.7.1, and it works very well. However, when I try to run this code: import transformers from transformers import BertTokenizer from transformers.models.bert.modeling_bert ...
Limitation asked 31/8, 2023 at 6:36

11

I got the following error when I ran my PyTorch deep learning model in Google Colab /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias) 1370 ret = torch.ad...
Seagoing asked 28/4, 2020 at 5:39

2

Solved

I'm fine-tuning a pre-trained BERT model and I have a weird problem: when I'm fine-tuning using the CPU, the code saves the model like this: with the "pytorch_model.bin". But when I use ...

3

I am trying to use a huggingface model (CamelBERT), but I am getting an error when loading the tokenizer: Code: from transformers import AutoTokenizer, AutoModelForMaskedLM tokenizer = AutoTokenize...

6

I am facing the below issue while loading the pretrained BERT model from Hugging Face due to an SSL certificate error. Error: SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries ex...
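
One common workaround behind a corporate proxy, as a sketch (the certificate path is hypothetical): point the HTTP stack at the proxy's CA bundle instead of disabling verification.

    import os
    from transformers import AutoModel

    # requests, which huggingface_hub uses for downloads, honors this
    # variable when verifying TLS certificates.
    os.environ["REQUESTS_CA_BUNDLE"] = "/path/to/corporate-ca.pem"  # hypothetical path
    model = AutoModel.from_pretrained("bert-base-uncased")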

1

https://colab.research.google.com/drive/11u6leEKvqE0CCbvDHHKmCxmW5GxyjlBm?usp=sharing The setup.py file is in the transformers folder (root directory), but this error occurs when I run !git clone https://gi...

2

I understand that WordPiece is used to break text into tokens. And I understand that, somewhere in BERT, the model maps tokens into token embeddings that represent the meaning of the tokens. But wh...
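
The token-to-embedding map is a learned lookup table inside the model; a small sketch of where it lives in the Hugging Face implementation:

    import torch
    from transformers import AutoTokenizer, BertModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    # WordPiece ids index a (vocab_size, hidden_size) embedding matrix.
    ids = tokenizer("unaffable", add_special_tokens=False)["input_ids"]
    print(tokenizer.convert_ids_to_tokens(ids))  # e.g. ['una', '##ffa', '##ble']
    vectors = model.embeddings.word_embeddings(torch.tensor(ids))
    print(vectors.shape)  # (number_of_pieces, 768)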

10

We know that BERT has a max token length limit of 512, so if an article is much longer than 512 tokens, such as 10,000 tokens of text, how can BERT be used?
Invariant asked 31/10, 2019 at 3:34
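
One common approach, sketched below (a sliding window, not the only option): split the long text into overlapping 512-token chunks, run BERT on each chunk, and pool the per-chunk outputs downstream.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    long_text = " ".join(["word"] * 10000)  # stand-in for a 10,000-token article

    # return_overflowing_tokens yields every window; stride sets the
    # overlap between consecutive windows.
    enc = tokenizer(long_text, max_length=512, stride=128, truncation=True,
                    padding="max_length", return_overflowing_tokens=True,
                    return_tensors="pt")
    print(enc["input_ids"].shape)  # (num_chunks, 512)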

1

I am using a pre-trained BERT sentence transformer model, as described here https://www.sbert.net/docs/training/overview.html , to get embeddings for sentences. I want to fine-tune these pre-traine...
Alehouse asked 13/10, 2021 at 21:38

2

Solved

I'd like to fix the random seed from the BERTopic library to get reproducible results. Looking at the code of BERTopic I see it uses numpy. Will using np.random.seed(123) be enough, or do I also need t...
Envelop asked 2/3, 2022 at 9:19
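
np.random.seed alone is not enough, because the UMAP step inside BERTopic keeps its own random state; a sketch of pinning it (the other UMAP values mirror BERTopic's defaults, as an assumption):

    from umap import UMAP
    from bertopic import BERTopic

    # Fixing random_state makes runs reproducible; it also disables UMAP's
    # parallelism, so fitting becomes slower.
    umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0,
                      metric="cosine", random_state=123)
    topic_model = BERTopic(umap_model=umap_model)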

2

Solved

I was recently reading the bert source code from the hugging face project. I noticed that the so-called "learnable position encoding" seems to refer to a specific nn.Parameter layer when ...
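
In the Hugging Face implementation it is an ordinary nn.Embedding whose weight (an nn.Parameter) is trained like any other; a quick way to inspect it:

    from transformers import BertModel

    model = BertModel.from_pretrained("bert-base-uncased")
    pos = model.embeddings.position_embeddings
    print(pos)                       # Embedding(512, 768)
    print(pos.weight.requires_grad)  # True: learned, not a fixed sinusoid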

6

I tried to load a pre-trained model by using the BertModel class in PyTorch. I have _six.py under torch, but it still shows module 'torch' has no attribute '_six'. import torch from pytorch_pretrained_b...
Dreg asked 21/5, 2019 at 15:41
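
pytorch_pretrained_bert is the long-deprecated predecessor of the transformers package, and its reliance on torch._six breaks against newer PyTorch; one plausible fix, sketched as an assumption since the snippet is truncated, is switching to the maintained library:

    # The maintained replacement for pytorch_pretrained_bert:
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")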

5

I am working on a machine learning project on Google Colab, and it seems there has recently been an issue when trying to import packages from transformers. The error message says: ImportError: cannot import...

6

I'm trying to make a classifier for the sentiments of texts with a BERT model but am getting ValueError: too many dimensions 'str'. That is the DataFrame for the values of the train data, so they are train_labels 0 not...
Tailback asked 20/1, 2021 at 7:12
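
That error usually means torch.tensor() was handed string labels; mapping them to integers first avoids it (the label names below are guesses from the truncated snippet):

    import torch

    label2id = {"not offensive": 0, "offensive": 1}  # hypothetical mapping
    train_labels = ["not offensive", "offensive", "not offensive"]
    labels = torch.tensor([label2id[l] for l in train_labels])
    print(labels)  # tensor([0, 1, 0])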

3

I'm trying to use the Hugging Face transformers pretrained model bert-base-uncased, but I want to increase dropout. There isn't any mention of this in the from_pretrained method, but Colab ran the object i...
Flowerless asked 21/11, 2020 at 19:14
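
Config keyword arguments can in fact be passed straight to from_pretrained, which forwards anything it does not recognize to the model config; a sketch (the dropout values are illustrative):

    from transformers import BertModel

    # Both dropout probabilities default to 0.1 in bert-base-uncased.
    model = BertModel.from_pretrained(
        "bert-base-uncased",
        hidden_dropout_prob=0.3,
        attention_probs_dropout_prob=0.3,
    )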

2

Solved

I want to build a multi-class classification model for which I have conversational data as input for the BERT model (using bert-base-uncased). QUERY: I want to ask a question. ANSWER: Sure, ask aw...
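
A minimal multi-class setup sketch (the number of labels is illustrative): setting num_labels on the sequence-classification head yields a cross-entropy objective over the classes.

    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=4)  # 4 conversation classes, illustrative

    inputs = tokenizer("QUERY: I want to ask a question.", return_tensors="pt")
    logits = model(**inputs).logits  # shape (1, 4), one score per class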

3

Solved

In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting the ex...
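
That reading is easy to check: with truncation=True, max_length counts the special tokens too, so a single sequence keeps max_length - 2 content tokens between [CLS] and [SEP].

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    ids = tokenizer("a very long sentence " * 100,
                    max_length=16, truncation=True)["input_ids"]
    tokens = tokenizer.convert_ids_to_tokens(ids)
    print(len(tokens), tokens[0], tokens[-1])  # 16 [CLS] [SEP]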

1

I know that GPT uses Transformer decoder, BERT uses Transformer encoder, and T5 uses Transformer encoder-decoder. But can someone help me understand why GPT only uses the decoder, BERT only uses en...
Latinalatinate asked 7/5, 2021 at 1:21

6

Solved

For ELMo, FastText and Word2Vec, I'm averaging the word embeddings within a sentence and using HDBSCAN/KMeans clustering to group similar sentences. A good example of the implementation can be see...
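
A self-contained sketch of that pipeline with toy vectors (the lookup table below is a hypothetical stand-in for a real Word2Vec/FastText/ELMo model):

    import numpy as np
    from sklearn.cluster import KMeans

    # Toy word-vector lookup standing in for trained embeddings.
    rng = np.random.default_rng(0)
    word_vectors = {w: rng.normal(size=50)
                    for w in ["the", "cat", "dog", "sat", "ran", "mat"]}

    def sentence_vector(sentence, dim=50):
        # Average the vectors of the in-vocabulary words.
        vecs = [word_vectors[t] for t in sentence.split() if t in word_vectors]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    sentences = ["the cat sat", "the dog ran", "the cat ran"]
    X = np.stack([sentence_vector(s) for s in sentences])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(labels)  # cluster id per sentence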

2

Solved

How is it possible to initialize BERT with random weights? I want to compare the performance of multilingual vs monolingual vs randomly initialized BERT in a masked language modeling task. While in...
Hospitalize asked 20/6, 2021 at 17:57
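
The usual trick is to build the model from a config instead of from_pretrained, keeping the architecture but loading no weights; a sketch:

    from transformers import BertConfig, BertModel

    config = BertConfig.from_pretrained("bert-base-uncased")  # architecture only
    random_bert = BertModel(config)  # randomly initialized weights
    pretrained_bert = BertModel.from_pretrained("bert-base-uncased")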

3

I am using Huggingface BERT for an NLP task. My texts contain names of companies which are split up into subwords. tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased') tokenizer.encod...
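
One common remedy, sketched with hypothetical company names: register the names as whole tokens and resize the embedding matrix, so the tokenizer stops splitting them into subwords.

    from transformers import BertTokenizerFast, BertModel

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    tokenizer.add_tokens(["acmecorp", "exampleco"])  # hypothetical names
    model.resize_token_embeddings(len(tokenizer))  # new rows start randomly initialized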
