text-classification Questions

1

I am trying to do binary text classification on custom data (which is in csv format) using different transformer architectures that Hugging Face 'Transformers' library offers. I am using this Tenso...
Flowerdeluce asked 30/1, 2020 at 4:10

3

I want to add a few more words to stop_words in TfidfVectorizer. I followed the solution in Adding words to scikit-learn's CountVectorizer's stop list. My stop word list now contains both ...

2

Solved

I am working on a binary classification problem with a TensorFlow BERT language model. Here is the link to Google Colab. After saving and loading the trained model, I get an error while doing the pre...

1

I have a list of Twitter users (screen_names) and I need to categorise them into 7 pre-defined categories - Education, Art, Sports, Business, Politics, Automobiles, Technology based on their intere...
Representative asked 23/2, 2020 at 6:5

1

In the last few layers of sequence classification by HuggingFace, they took the first hidden state of the sequence length of the transformer output to be used for classification. hidden_state = d...

2

I am trying to get code working from the following repo, which is based off this paper. It had a lot of errors, but I mostly got it working. However, I keep getting the same problem and I really do...
Freeboard asked 22/1, 2020 at 20:42

1

I do binary text classification with BERT from the Simpletransformer. I work in Colab with GPU runtime type. I have generated train and test set with the sklearn StratifiedKFold Method. I have t...

1

I am using the bert-for-tf2 library to do a Multi-Class Classification problem. I created the model but training throws the following error: -------------------------------------------------------...
Garnetgarnett asked 3/12, 2019 at 11:4

2

Solved

I'm reasonably new to machine learning, I've done a few projects in python. I'm looking for advice on how to approach the below problem which I believe could be automated. A user in a data quality...

1

I've been thinking about 0-padding of word sequence and how that 0-padding is then converted to the Embedding layer. At first glance, one would think that you want to keep the embeddings = 0.0 as w...

3

I am working with TFIDF sparse matrices for document classification and want to retain only the top n (say 50) terms for each document (ranked by TFIDF score). See EDIT below. import numpy as np i...
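
A minimal sketch of the "keep only the top n terms per document" idea described in this excerpt. It is not the asker's code (that is truncated above) and it does not use SciPy sparse matrices; as a simplifying assumption, each document is a plain dict mapping term to TF-IDF score. The function name `keep_top_n_terms` and the sample scores are hypothetical.

```python
import heapq

def keep_top_n_terms(doc_scores, n=50):
    """Return, for each document, a dict holding only its n highest-scoring terms.

    doc_scores is assumed to be a list of {term: tfidf_score} dicts,
    one per document (a stand-in for the rows of a sparse TF-IDF matrix).
    """
    pruned = []
    for scores in doc_scores:
        # nlargest returns at most n (term, score) pairs, ranked by score.
        top = heapq.nlargest(n, scores.items(), key=lambda item: item[1])
        pruned.append(dict(top))
    return pruned

# Hypothetical scores, for illustration only.
docs = [
    {"bert": 0.61, "model": 0.12, "text": 0.33, "the": 0.01},
    {"padding": 0.48, "keras": 0.40, "zero": 0.22},
]
top2 = keep_top_n_terms(docs, n=2)
```

With a real sparse matrix the same ranking would be applied row by row to the stored non-zero entries; documents with fewer than n terms are returned unchanged.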

1

I want to train a 21 class text classification model using Bert. But I have very little training data, so I downloaded a similar dataset with 5 classes with 2 million samples. And finetuned downl...

2

Solved

I'm trying to use the packages quanteda and caret together to classify text based on a trained sample. As a test run, I wanted to compare the built-in naive bayes classifier of quanteda with the on...
Invalidism asked 29/1, 2019 at 17:57

4

It has been proved that CNN (convolutional neural network) is quite useful for text/document classification. I wonder how to deal with the length differences as the lengths of articles are differen...
Secondary asked 2/6, 2016 at 1:40

3

Solved

I am trying to develop a text classifier that will classify a piece of text as Private or Public. Take medical or health information as an example domain. A typical classifier that I can think of c...
Kristikristian asked 4/3, 2019 at 22:0

1

Solved

I am trying to do multi-class classification with textual data. The problem I am facing is that I have unstructured textual data. I'll explain the problem with an example. Consider this image for example:...

1

Solved

I am working on a data set of approximately 3000 questions and I want to perform intent classification. The data set is not labelled yet, but from the business perspective, there's a requirement of...
Insurrectionary asked 24/2, 2019 at 9:50

3

Solved

I have over 15000 text docs of a specific topic. I would like to build a language model based on the former so that I can present to this model new random text documents of various topics and the a...
Nipa asked 23/10, 2013 at 20:40

1

Solved

I use reuters dataset in Keras. And I want to know the 46 topics' names. How can I show topics of reuters dataset in Keras? https://keras.io/datasets/#reuters-newswire-topics-classification
Burnoose asked 17/7, 2017 at 7:27

1

Solved

I'm trying to do some text classification using MultinomialNB, but I'm running into problems because my data is unbalanced. (Below is some sample data for simplicity. In actuality, mine is much lar...
Hamitosemitic asked 9/1, 2019 at 20:45

1

Solved

So I have a text classification model built with Keras. I've been trying to pad my varying length sequences but the Keras function pad_sequences() has just returned zeros. I've figured out that if...
Atone asked 3/1, 2019 at 23:21

1

Solved

I am working on a text classification problem with multiple text features and need to build a model to predict salary range. Please refer to the sample dataset. Most of the resources/tutorials deal wi...
Darton asked 26/12, 2018 at 7:56

2

Solved

I have a question regarding GridSearchCV: by using this: gs_clf = GridSearchCV(pipeline, parameters, n_jobs=-1, cv=6, scoring="f1") I specify that k-fold cross-validation should be used with 6 ...
Pathological asked 11/11, 2016 at 10:37

2

I want to perform text classification using word2vec. I got vectors of words. ls = [] sentences = lines.split(".") for i in sentences: ls.append(i.split()) model = Word2Vec(ls, min_count=1, size ...
Vandyke asked 4/4, 2018 at 6:10

4

I am trying to solve a text classification problem. I have a limited number of labels that capture the category of my text data. If the incoming text data doesn't fit any label, it is tagged as 'Ot...
