text-classification Questions
2
Solved
We can create a model with the AutoModel (TFAutoModel) class:
from transformers import AutoModel
model = AutoModel.from_pretrained('distilbert-base-uncased')
On the other hand, a model is created by Aut...
Aeolic asked 10/11, 2021 at 3:33
10
We know that BERT has a maximum input length of 512 tokens. So if an article is much longer than that, say 10,000 tokens of text,
how can BERT be used?
Invariant asked 31/10, 2019 at 3:34
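One common way to handle inputs longer than the limit is to split the token sequence into overlapping windows and classify each window separately. The sketch below is a minimal illustration of that idea, not necessarily the approach taken in the answers to this question. The function name `sliding_windows`, the `stride` value, and the dummy integer ids are assumptions made for illustration. A real BERT input also needs room for the special [CLS] and [SEP] tokens, so the usable window is slightly smaller than 512.

```python
from typing import List


def sliding_windows(token_ids: List[int], max_len: int = 512, stride: int = 256) -> List[List[int]]:
    """Split a long token sequence into overlapping windows of at most max_len tokens."""
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return windows


# Hypothetical 10,000-token article, represented by dummy integer ids.
article = list(range(10_000))
chunks = sliding_windows(article)
```

Per-window predictions then have to be combined somehow, for example by averaging the class probabilities; which aggregation works best depends on the task.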
6
I am trying to build a sentiment classifier for texts with a BERT model, but I am getting ValueError: too many dimensions 'str'
This is the DataFrame holding the training data values, i.e. the train_labels:
0 not...
Tailback asked 20/1, 2021 at 7:12
2
I am fine-tuning the HuggingFace facebook/bart-large-mnli model to suit my needs. I use the following parameters:
training_args = TrainingArguments(
output_dir=model_directory, # output directory
n...
Armendariz asked 18/5, 2023 at 5:30
1
Solved
I have a pre-trained facebook/bart-large-mnli model, and I used the Trainer to train it on my own dataset.
model = BartForSequenceClassification.from_pretrained("facebook/bart-large-m...
Ingratitude asked 4/5, 2023 at 7:32
1
Solved
I'm trying to fine-tune the Facebook BART model, following this article, in order to classify text using my own dataset.
I'm using the Trainer object to train:
training_args = Traini...
Comity asked 25/4, 2023 at 8:36
2
I have a dataset and am trying to convert it to topics using BERTopic, but the problem is that I can't get all the documents of a topic. BERTopic only returns 3 documents per topic.
topic_model =...
Heritable asked 27/10, 2021 at 14:52
2
Solved
I'm playing around with sklearn and NLP for the first time, and thought I understood everything I was doing up until I didn't know how to fix this error. Here is the relevant code (largely adapted ...
Murrah asked 31/8, 2018 at 21:59
4
Are stopword removal, stemming, and lemmatization necessary for text classification when using spaCy, BERT, or other advanced NLP models to get the vector embedding of the text?
text="The ...
Bilbao asked 28/8, 2020 at 12:10
2
I want to implement a classifier using the sklearn library. Is there a way to save the model, or convert the file into a saved TensorFlow file, so that I can convert it to TensorFlow Lite later?
Unpopular asked 13/1, 2020 at 20:47
4
Solved
I'm doing different text classification experiments. Now I need to calculate the AUC-ROC for each task. For the binary classifications, I already made it work with this code:
scaler = StandardScal...
Biotite asked 26/7, 2017 at 16:16
4
Solved
I have a one-dimensional array with large strings in each of the elements. I am trying to use a CountVectorizer to convert text data into numerical vectors. However, I am getting an error saying:
...
Egest asked 14/10, 2014 at 17:48
2
I am doing a text classification and I have very imbalanced data like
Category | Total Records
Cate1 | 950
Cate2 | 40
Cate3 | 10
Now I want to oversample Cate2 and Cate3 so they at least have 40...
Purusha asked 23/6, 2018 at 9:0
2
As my classifier yields about 99% accuracy on test data, I am a bit suspicious and want to gain insight into the most informative features of my NB classifier to see what kind of features it is learn...
Tillage asked 25/4, 2015 at 15:51
1
I have a few words without labels, and I need to classify them into 4-5 categories.
I can see that this test set can be classified, but I do not have training data, so I need to use a p...
Phyto asked 12/12, 2020 at 8:5
1
Solved
Being new to the "Natural Language Processing" scene, I am experimentally learning and have implemented the following segment of code:
from transformers import RobertaTokenizer, RobertaFo...
Disallow asked 9/12, 2020 at 16:43
3
Solved
I am using Scikit-learn for text classification. I want to calculate the Information Gain for each attribute with respect to a class in a (sparse) document-term matrix.
The Information Gain is def...
Gadmann asked 15/10, 2017 at 7:17
1
I have a dataset with paragraphs that I need to classify into two classes. These paragraphs are usually 3-5 sentences long. The overwhelming majority of them are less than 500 words long. I would l...
Underbelly asked 17/11, 2020 at 18:50
3
Solved
I am using GloVe as part of my research. I've downloaded the models from here. I've been using GloVe for sentence classification. The sentences I'm classifying are specific to a particular domain, ...
Grosmark asked 25/4, 2017 at 18:15
1
Solved
I am wondering whether I can use OpenAI GPT-3 for transfer learning in a text classification problem.
If so, how can I get started with it using TensorFlow and Keras?
Dill asked 9/8, 2020 at 2:17
1
Solved
I've read a post that explains how the sliding window works, but I cannot find any information on how it is actually implemented.
From what I understand, if the input is too long, a sliding window can ...
Demmy asked 19/7, 2020 at 10:18
3
Solved
I have 3 questions:
1)
The confusion matrix for sklearn is as follows:
TN | FP
FN | TP
While when I'm looking at online resources, I find it like this:
TP | FP
FN | TN
Which one should I co...
Raber asked 10/5, 2019 at 12:57
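The two layouts in this question differ only in label ordering. sklearn's `confusion_matrix` puts actual classes on the rows and predicted classes on the columns, with labels in sorted order, which for 0/1 labels gives the TN | FP / FN | TP layout. The sketch below reproduces that convention in plain Python so the cell positions are explicit; the helper name `binary_confusion_matrix` and the sample labels are made up for illustration.

```python
from typing import List, Sequence


def binary_confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> List[List[int]]:
    """Rows are actual classes (0, 1); columns are predicted classes (0, 1)."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Made-up labels: 1 is the positive class, 0 the negative class.
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0, 1]

# Unpack in the same order as the row-major layout: TN, FP, FN, TP.
(tn, fp), (fn, tp) = binary_confusion_matrix(y_true, y_pred)
```

Resources that show TP in the top-left corner usually list the positive class first; the counts are the same, only the row and column order changes.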
1
I am writing a classifier for web pages, so I have a mixture of numerical features, and I also want to classify the text. I am using the bag-of-words approach to transform the text into a (large) n...
Cowbell asked 12/9, 2016 at 7:12
1
Solved
I trained a supervised model in FastText using the Python interface and I'm getting weird results for precision and recall.
First, I trained a model:
model = fasttext.train_supervised("train.t...
Miche asked 14/5, 2020 at 0:21
1
I am doing text classification using a linear SVC model from sklearn. Now I want to visualize which words/tokens have the highest impact on the classification decision by using SHAP (https://github...
Mcelroy asked 26/4, 2019 at 12:39