text-classification

2

Solved

What are the differences between AutoModelForSequenceClassification and AutoModel?

We can create a model with the AutoModel (TFAutoModel) class: from transformers import AutoModel model = AutoModel.from_pretrained('distilbert-base-uncased') On the other hand, a model is created by Aut...

nlp text-classification huggingface-transformers

Aeolic asked 10/11, 2021 at 3:33

10

How to use BERT for long text classification?

We know that BERT has a maximum input length of 512 tokens. If an article is much longer than that, say 10,000 tokens, how can BERT be used?

nlp text-classification bert-language-model

Invariant asked 31/10, 2019 at 3:34

6

With BERT Text Classification, ValueError: too many dimensions 'str' error occurring

Trying to make a sentiment classifier for texts with a BERT model, but getting ValueError: too many dimensions 'str'. That is the DataFrame for values of train data; so they are train_labels 0 not...

python tensor text-classification bert-language-model mlp

Tailback asked 20/1, 2021 at 7:12

2

HuggingFace Evaluate a Fine-tuned Zero-Shot Model

I am fine-tuning the HuggingFace facebook/bart-large-mnli model to suit my needs. I use the following parameters: training_args = TrainingArguments( output_dir=model_directory, # output directory n...

python deep-learning huggingface-transformers text-classification evaluation

Armendariz asked 18/5, 2023 at 5:30

1

Solved

Huggingface - Pipeline with a fine-tuned pre-trained model errors

I have a pre-trained model from facebook/bart-large-mnli. I used the Trainer to train it on my own dataset. model = BartForSequenceClassification.from_pretrained("facebook/bart-large-m...

python pipeline huggingface-transformers text-classification huggingface

Ingratitude asked 4/5, 2023 at 7:32

1

Solved

Hugging Face Transformers BART CUDA error: CUBLAS_STATUS_NOT_INITIALIZED

I'm trying to fine-tune the Facebook BART model, following this article, to classify text using my own dataset. I'm using the Trainer object to train: training_args = Traini...

python pytorch huggingface-transformers text-classification huggingface

Comity asked 25/4, 2023 at 8:36

2

How to get all documents per topic in BERTopic modeling

I have a dataset and am trying to convert it to topics using BERTopic modeling, but the problem is that I can't get all the documents of a topic. BERTopic only returns 3 documents per topic. topic_model =...

nlp text-classification bert-language-model topic-modeling

Heritable asked 27/10, 2021 at 14:52

2

Solved

Sklearn Pipeline ValueError: could not convert string to float

I'm playing around with sklearn and NLP for the first time, and thought I understood everything I was doing up until I didn't know how to fix this error. Here is the relevant code (largely adapted ...

python scikit-learn nlp text-classification

Murrah asked 31/8, 2018 at 21:59

4

Is it necessary to do stopword removal and stemming/lemmatization for text classification while using spaCy or BERT?

Are stopword removal, stemming, and lemmatization necessary for text classification while using spaCy, BERT, or other advanced NLP models for getting the vector embedding of the text? text="The ...

nlp spacy text-classification bert-language-model

Bilbao asked 28/8, 2020 at 12:10

2

How to convert a saved model from sklearn into TensorFlow/Lite

I want to implement a classifier using the sklearn library. Is there a way to save the model or convert the file into a saved TensorFlow file in order to convert it to TensorFlow Lite later?

tensorflow machine-learning scikit-learn text-classification tensorflow-lite

Unpopular asked 13/1, 2020 at 20:47

4

Solved

ROC for multiclass classification

I'm doing different text classification experiments. Now I need to calculate the AUC-ROC for each task. For the binary classifications, I already made it work with this code: scaler = StandardScal...

python scikit-learn text-classification roc multiclass-classification

Biotite asked 26/7, 2017 at 16:16

4

Solved

CountVectorizer: AttributeError: 'numpy.ndarray' object has no attribute 'lower'

I have a one-dimensional array with large strings in each of the elements. I am trying to use a CountVectorizer to convert text data into numerical vectors. However, I am getting an error saying: ...

python numpy scikit-learn text-classification

Egest asked 14/10, 2014 at 17:48

2

SMOTE, Oversampling on text classification in Python

I am doing text classification and I have very imbalanced data, like this: Category | Total Records; Cate1 | 950; Cate2 | 40; Cate3 | 10. Now I want to oversample Cate2 and Cate3 so they at least have 40...

python machine-learning nlp text-classification resampling

Purusha asked 23/6, 2018 at 9:00

2

SkLearn Multinomial NB: Most Informative Features

As my classifier yields about 99% accuracy on test data, I am a bit suspicious and want to gain insight into the most informative features of my NB classifier to see what kind of features it is learn...

python machine-learning scikit-learn classification text-classification

Tillage asked 25/4, 2015 at 15:51

1

Pre-trained models for text classification

I have a few words without labels, but I need to classify them into 4-5 categories. I can see that this test set can be classified. However, I do not have training data, so I need to use a p...

python machine-learning keras text-classification pre-trained-model

Phyto asked 12/12, 2020 at 8:05

1

Solved

What do the logits and probabilities from RobertaForSequenceClassification represent?

Being new to the "Natural Language Processing" scene, I am experimentally learning and have implemented the following segment of code: from transformers import RobertaTokenizer, RobertaFo...

python nlp pytorch text-classification huggingface-transformers

Disallow asked 9/12, 2020 at 16:43
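
For a sequence-classification head like the one named in the title above, logits are generally unnormalised per-class scores, and for single-label tasks they are usually turned into probabilities with a softmax. The following is a minimal sketch in plain Python, not the asker's code: the `softmax` helper and the `example_logits` values are made up for illustration.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large logits.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a two-class model (assumed values, not real output).
example_logits = [2.0, -1.0]
probs = softmax(example_logits)

# The predicted class is the index with the highest probability,
# which is also the index with the highest logit.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

Because softmax preserves ordering, taking the argmax of the logits directly gives the same predicted class; the probabilities only matter when a calibrated-looking score is needed.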

3

Solved

Information Gain calculation with Scikit-learn

I am using Scikit-learn for text classification. I want to calculate the Information Gain for each attribute with respect to a class in a (sparse) document-term matrix. The Information Gain is def...

python machine-learning scikit-learn text-classification feature-selection

Gadmann asked 15/10, 2017 at 7:17

1

Passing multiple sentences to BERT?

I have a dataset with paragraphs that I need to classify into two classes. These paragraphs are usually 3-5 sentences long. The overwhelming majority of them are less than 500 words long. I would l...

nlp text-classification bert-language-model huggingface-transformers

Underbelly asked 17/11, 2020 at 18:50

3

Solved

Improving on the basic, existing GloVe model

I am using GloVe as part of my research. I've downloaded the models from here. I've been using GloVe for sentence classification. The sentences I'm classifying are specific to a particular domain, ...

nlp text-classification glove

Grosmark asked 25/4, 2017 at 18:15

1

Solved

How can I use GPT-3 for my text classification?

I am wondering whether I can use OpenAI GPT-3 for transfer learning in a text classification problem. If so, how can I get started on it using TensorFlow and Keras?

keras text-classification transfer-learning openai-api gpt-3

Dill asked 9/8, 2020 at 2:17

1

Solved

Sliding window for long text in BERT for Question Answering

I've read a post that explains how the sliding window works, but I cannot find any information on how it is actually implemented. From what I understand, if the input is too long, a sliding window can ...

nlp text-classification huggingface-transformers nlp-question-answering bert-language-model

Demmy asked 19/7, 2020 at 10:18
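
The sliding-window idea asked about above can be sketched without any library: split the token sequence into overlapping windows so that each window fits the model's length limit. This is only an illustration under stated assumptions. The helper name `sliding_windows` and the defaults `max_len=512` and `stride=256` are hypothetical, and a real BERT pipeline would also need to reserve room for special tokens such as [CLS] and [SEP] inside the 512-token limit.

```python
from typing import List, Sequence


def sliding_windows(
    tokens: Sequence[int], max_len: int = 512, stride: int = 256
) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most max_len tokens.

    Consecutive windows start `stride` tokens apart, so they overlap by
    max_len - stride tokens.
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    if stride > max_len:
        raise ValueError("stride must not exceed max_len, or tokens would be skipped")

    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + max_len]))
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(tokens):
            break
        start += stride
    return windows


# Stand-in for token ids produced by a tokenizer (assumed data).
token_ids = list(range(1000))
chunks = sliding_windows(token_ids)
```

Each chunk would then be passed through the model separately, and the per-chunk outputs combined (for example by averaging scores or taking the best-scoring span); which combination rule is appropriate depends on the task.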

3

Solved

Why is the scikit-learn confusion matrix reversed?

I have 3 questions: 1) The confusion matrix for sklearn is as follows: TN | FP / FN | TP. However, when I look at online resources, I find it like this: TP | FP / FN | TN. Which one should I co...

scikit-learn text-classification confusion-matrix performance-measuring

Raber asked 10/5, 2019 at 12:57
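
The layout quoted in the question above follows from a simple convention: rows are true labels, columns are predicted labels, and labels are taken in sorted order, so with labels 0 and 1 the negative class comes first. The sketch below reproduces that convention in plain Python rather than calling scikit-learn; the helper `confusion_counts` and the sample labels are hypothetical.

```python
from typing import List, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> List[List[int]]:
    """Count (true, predicted) pairs; rows are true labels, columns are predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


# Made-up binary labels: 0 is the negative class, 1 is the positive class.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# With sorted labels [0, 1], the first row is the negative class,
# so the matrix unpacks as [[TN, FP], [FN, TP]].
(tn, fp), (fn, tp) = confusion_counts(y_true, y_pred)
```

Resources that show TP in the top-left corner simply list the positive class first; the counts are the same, only the row and column order differs.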

1

How do I properly combine numerical features with text (bag of words) in scikit-learn?

I am writing a classifier for web pages, so I have a mixture of numerical features, and I also want to classify the text. I am using the bag-of-words approach to transform the text into a (large) n...

python scikit-learn classification text-classification

Cowbell asked 12/9, 2016 at 7:12

1

Solved

FastText 0.9.2 - why is recall 'nan'?

I trained a supervised model in FastText using the Python interface and I'm getting weird results for precision and recall. First, I trained a model: model = fasttext.train_supervised("train.t...

python-3.x nlp text-classification precision-recall fasttext

Miche asked 14/5, 2020 at 0:21

1

How to use SHAP with a linear SVC model from sklearn using Pipeline?

I am doing text classification using a linear SVC model from sklearn. Now I want to visualize which words/tokens have the highest impact on the classification decision by using SHAP (https://github...

scikit-learn pipeline text-classification svc shap

Mcelroy asked 26/4, 2019 at 12:39
