LSTM Text Classification Bad Accuracy Keras

I'm going crazy with this project. This is multi-label text classification with an LSTM in Keras. My model is this:

model = Sequential()

model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len,
                    mask_zero=True, weights=[embedding_weights]))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))

adam = keras.optimizers.Adam(lr=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])

The problem is that my accuracy is too low. With binary_crossentropy I get good accuracy, but the results are wrong!!!!! Changing to categorical_crossentropy, I get very low accuracy. Do you have any suggestions?

Here is my code: GitHubProject - Multi-Label-Text-Classification

Thresathresh asked 22/8, 2018 at 7:49

In the last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. In case you want to use categorical_crossentropy, then use softmax as the activation function in the last layer.
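For instance, a minimal sketch of the corrected final layer and compile step; it reuses model and num_classes from the question and is only meant to show the pairing:

# Pair softmax with categorical_crossentropy (expects one-hot labels).
from keras.layers import Dense, Activation

model.add(Dense(num_classes))
model.add(Activation('softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])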

Now, coming to the other part of your model: since you are working with text, I would suggest using tanh as the activation function in the LSTM layers.

And you can try using the LSTM's own dropouts as well, i.e. dropout and recurrent_dropout:

LSTM(units, dropout=0.2, recurrent_dropout=0.2, activation='tanh')

You can define units as 64 or 128. Start from a small number and, after testing, take it up to 1024.

You can try adding a convolution layer as well for extracting features, or use a Bidirectional LSTM, but models based on Bidirectional layers take time to train.
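For example, a rough sketch of what such a model could look like; the layer sizes, kernel width, and pooling factor are illustrative assumptions, and the variables (max_features, embeddings_dim, max_sent_len, num_classes) are the ones from the question:

from keras.models import Sequential
from keras.layers import (Embedding, Conv1D, MaxPooling1D,
                          Bidirectional, LSTM, Dense)

model = Sequential()
# no mask_zero here, since Conv1D does not support masking
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len))
model.add(Conv1D(64, 5, activation='relu'))        # local n-gram features
model.add(MaxPooling1D(pool_size=4))               # downsample the sequence
model.add(Bidirectional(LSTM(128, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(num_classes, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])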

Moreover, since you are working on text, pre-processing of the text and the size of the training data always play a much bigger role than expected.
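As a minimal example of such cleaning, assuming train_texts is a list of raw strings (what else helps, e.g. stopword removal or stemming, depends on the corpus):

import re
import string

def clean_text(text):
    # lowercase, strip punctuation, collapse whitespace
    text = text.lower()
    text = re.sub(r'[%s]' % re.escape(string.punctuation), ' ', text)
    return re.sub(r'\s+', ' ', text).strip()

train_texts = [clean_text(t) for t in train_texts]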

Edited

Add class weights in the fit parameters:

from sklearn.utils import class_weight
import numpy as np

# 'balanced' gives each class the weight n_samples / (n_classes * class_count)
class_weights = class_weight.compute_class_weight('balanced',
                                                  classes=np.unique(labels),
                                                  y=labels)
# `le` is the LabelEncoder fitted on the labels
class_weights_dict = dict(zip(le.transform(list(le.classes_)),
                              class_weights))

model.fit(x_train, y_train, validation_split=0.1,
          class_weight=class_weights_dict)
Raguelragweed answered 22/8, 2018 at 8:02
Thank you!! I use categorical_crossentropy because I have multiple classes to predict.. is that correct??? Can I use binary for this purpose??? Now I use softmax and tanh but accuracy is still low. How can I use the LSTM dropouts, i.e. dropout and recurrent_dropout??? For the pre-processing I use embeddings = dict() and embeddings = gensim.models.KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin.gz", binary=True). Is that correct?? – Thresathresh
@angelocurtigiardina You can use binary_crossentropy if you are using softmax, and check the answer, I have edited it. – Raguelragweed
Have you tried using pre-trained GloVe and fastText? @angelocurtigiardina – Raguelragweed
Really, thank you! With binary the accuracy is high!!!! Now I'm testing.. then I'll try GloVe and fastText!!!! Thanks again! @UpasanaMittal – Thresathresh
Really bad :( my code is this.. I tried with word2vec and fastText, binary and categorical.. with binary the accuracy is high but the results are incorrect.. what am I doing wrong? ...... I'm not able to post my code.. – Thresathresh
@angelocurtigiardina Can you tell me the size of your data and the number of samples per class? And did you add the conv layer? – Raguelragweed
If you can post your code on GitHub then let me know. I will check it. – Raguelragweed
Here is my code: github.com/ancileddu/multi-label-text-classification .. I can't add the conv layer because the teacher wants only LSTM :( Really, thank you! You're very kind! It's my last university exam and I'm going crazy!!!! – Thresathresh
Please explain one thing to me. Why are you training word2vec on negative words and not on the data you are training your model on? – Raguelragweed
That was a mistake.. now I've changed it to embeddings = gensim.models.Word2Vec(train_texts, min_count=1, size=300).. is that correct??? – Thresathresh
You haven't done any text cleaning. Any particular reason? – Raguelragweed
Is it necessary? Practically, I have to remove the useless words and the punctuation from the trainset?? – Thresathresh
Sorry, I'm a noob.. 'tokenizer = Tokenizer(num_words=max_features, filters='!"#$%&()*+,-./:;<=>?@[\]^_`{|}~\t\n', lower=True, split=" ")' this code cleans the data, no? – Thresathresh
What is the total number of samples per class? – Raguelragweed
class 0: 808 - class 1: 1652 - class 2: 1220 - class 3: 1708 - class 4: 969 – Thresathresh
There is class imbalance. I am editing the answer to pass class_weights as a dict in the fit parameters. Please do that as well. – Raguelragweed
The labels are labels = ["0","1","2","3","4"], right? Now the error is: class_weight must contain all classes in the data. The classes {0, 1, 2, 3, 4} exist in the data but not in class_weight.. I'm going crazy.. model.fit(train_sequences, train_labels, validation_split=0.1, class_weight=class_weights_dict) – Thresathresh
I checked your code and made the changes accordingly again. Use it. And I am trying the model with the provided data at my end as well. – Raguelragweed
I'm very grateful for your precious help.. thank you very much!!!!!! But my model doesn't give an accurate prediction :( how many epochs should I use? I've tried with 1 and 5.. – Thresathresh
Use an EarlyStopping callback in the fit parameters and try at least 50 epochs; see the sketch after these comments. – Raguelragweed
@angelocurtigiardina Can you please check this: github.com/upasana-mittal/stackoverflow/blob/master/… – Raguelragweed
Please try changing the GloVe embedding to another one, as I have tried with the Twitter one. You can try the Wikipedia one: nlp.stanford.edu/projects/glove and use 100 dimensions only. – Raguelragweed
This is really really cool!!!!!!!!! Thank you very much.. now I'll commit all the changes.. you are the best!!!!!!!! Now I'm trying the model with 50 epochs.. good!!!! – Thresathresh
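
A minimal sketch of the EarlyStopping setup suggested in the comments; the monitored metric and patience value are illustrative assumptions:

from keras.callbacks import EarlyStopping

# stop once validation loss stops improving, keeping the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)

model.fit(x_train, y_train, epochs=50, validation_split=0.1,
          class_weight=class_weights_dict, callbacks=[early_stop])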

change:

model.add(Activation('sigmoid'))

to:

model.add(Activation('softmax'))
Booboo answered 22/8, 2018 at 7:50
