LSTM Text Classification Bad Accuracy Keras

I'm going crazy with this project. This is multi-label text classification with an LSTM in Keras. My model is this:

model = Sequential()

model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len,
                    mask_zero=True, weights=[embedding_weights]))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid',
               recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))

adam = keras.optimizers.Adam(lr=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])

The problem is that my accuracy is too low. With binary_crossentropy I get good accuracy, but the results are wrong!!!!! Changing to categorical_crossentropy, I get very low accuracy. Do you have any suggestions?

Here is my code: GitHubProject - Multi-Label-Text-Classification

Thresathresh asked 22/8, 2018 at 7:49

In the last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. In case you want to use categorical_crossentropy, then use softmax as the activation function in the last layer.
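For instance, a minimal sketch of the corrected final layer and compile step; it reuses model and num_classes from the question and is only meant to show the pairing:

# Pair softmax with categorical_crossentropy (expects one-hot labels).
from keras.layers import Dense, Activation

model.add(Dense(num_classes))
model.add(Activation('softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])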

Now, coming to the other part of your model: since you are working with text, I would suggest using tanh as the activation function in the LSTM layers.

And you can try using the LSTM's own dropouts as well, i.e. dropout and recurrent_dropout:

LSTM(units, dropout=0.2, recurrent_dropout=0.2, activation='tanh')

You can define units as 64 or 128. Start from a small number and, after testing, take it up to 1024.

You can try adding a convolution layer as well for extracting features, or use a Bidirectional LSTM, but models based on Bidirectional layers take time to train.
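For example, a rough sketch of what such a model could look like; the layer sizes, kernel width, and pooling factor are illustrative assumptions, and the variables (max_features, embeddings_dim, max_sent_len, num_classes) are the ones from the question:

from keras.models import Sequential
from keras.layers import (Embedding, Conv1D, MaxPooling1D,
                          Bidirectional, LSTM, Dense)

model = Sequential()
# no mask_zero here, since Conv1D does not support masking
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len))
model.add(Conv1D(64, 5, activation='relu'))        # local n-gram features
model.add(MaxPooling1D(pool_size=4))               # downsample the sequence
model.add(Bidirectional(LSTM(128, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(num_classes, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])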

Moreover, since you are working on text, pre-processing of the text and the size of the training data always play a much bigger role than expected.
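As a minimal example of such cleaning, assuming train_texts is a list of raw strings (what else helps, e.g. stopword removal or stemming, depends on the corpus):

import re
import string

def clean_text(text):
    # lowercase, strip punctuation, collapse whitespace
    text = text.lower()
    text = re.sub(r'[%s]' % re.escape(string.punctuation), ' ', text)
    return re.sub(r'\s+', ' ', text).strip()

train_texts = [clean_text(t) for t in train_texts]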

Edited

Add class weights in the fit parameters:

from sklearn.utils import class_weight
import numpy as np

# 'balanced' gives each class the weight n_samples / (n_classes * class_count)
class_weights = class_weight.compute_class_weight('balanced',
                                                  classes=np.unique(labels),
                                                  y=labels)
# `le` is the LabelEncoder fitted on the labels
class_weights_dict = dict(zip(le.transform(list(le.classes_)),
                              class_weights))

model.fit(x_train, y_train, validation_split=0.1,
          class_weight=class_weights_dict)
Raguelragweed answered 22/8, 2018 at 8:02
Thank you!! I use categorical_crossentropy because I have multiple classes to predict.. is that correct??? Can I use binary for this purpose??? Now I use softmax and tanh but accuracy is still low. How can I use the LSTM dropouts, i.e. dropout and recurrent_dropout??? For the pre-processing I use embeddings = dict() and embeddings = gensim.models.KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin.gz", binary=True). Is that correct?? – Thresathresh
@angelocurtigiardina You can use binary_crossentropy if you are using softmax, and check the answer, I have edited it. – Raguelragweed
Have you tried using pre-trained GloVe and fastText? @angelocurtigiardina – Raguelragweed
Really, thank you! With binary the accuracy is high!!!! Now I'm testing.. then I'll try GloVe and fastText!!!! Thanks again! @UpasanaMittal – Thresathresh
Really bad :( my code is this.. I tried with word2vec and fastText, binary and categorical.. with binary the accuracy is high but the results are incorrect.. what am I doing wrong? ...... I'm not able to post my code.. – Thresathresh
@angelocurtigiardina Can you tell me the size of your data and the number of samples per class? And did you add the conv layer? – Raguelragweed
If you can post your code on GitHub then let me know. I will check it. – Raguelragweed
Here is my code: github.com/ancileddu/multi-label-text-classification .. I can't add the conv layer because the teacher wants only LSTM :( Really, thank you! You're very kind! It's my last university exam and I'm going crazy!!!! – Thresathresh
Please explain one thing to me. Why are you training word2vec on negative words and not on the data you are training your model on? – Raguelragweed
That was a mistake.. now I've changed it to embeddings = gensim.models.Word2Vec(train_texts, min_count=1, size=300).. is that correct??? – Thresathresh
You haven't done any text cleaning. Any particular reason? – Raguelragweed
Is it necessary? Practically, I have to remove the useless words and the punctuation from the trainset?? – Thresathresh
Sorry, I'm a noob.. 'tokenizer = Tokenizer(num_words=max_features, filters='!"#$%&()*+,-./:;<=>?@[\]^_`{|}~\t\n', lower=True, split=" ")' this code cleans the data, no? – Thresathresh
What is the total number of samples per class? – Raguelragweed
class 0: 808 - class 1: 1652 - class 2: 1220 - class 3: 1708 - class 4: 969 – Thresathresh
There is class imbalance. I am editing the answer to pass class_weights as a dict in the fit parameters. Please do that as well. – Raguelragweed
The labels are labels = ["0","1","2","3","4"], right? Now the error is: class_weight must contain all classes in the data. The classes {0, 1, 2, 3, 4} exist in the data but not in class_weight.. I'm going crazy.. model.fit(train_sequences, train_labels, validation_split=0.1, class_weight=class_weights_dict) – Thresathresh
I checked your code and made the changes accordingly again. Use it. And I am trying the model with the provided data at my end as well. – Raguelragweed
I'm very grateful for your precious help.. thank you very much!!!!!! But my model doesn't give an accurate prediction :( how many epochs should I use? I've tried with 1 and 5.. – Thresathresh
Use an EarlyStopping callback in the fit parameters and try at least 50 epochs; see the sketch after these comments. – Raguelragweed
@angelocurtigiardina Can you please check this: github.com/upasana-mittal/stackoverflow/blob/master/… – Raguelragweed
Please try changing the GloVe embedding to another one, as I have tried with the Twitter one. You can try the Wikipedia one: nlp.stanford.edu/projects/glove and use 100 dimensions only. – Raguelragweed
This is really really cool!!!!!!!!! Thank you very much.. now I'll commit all the changes.. you are the best!!!!!!!! Now I'm trying the model with 50 epochs.. good!!!! – Thresathresh
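
A minimal sketch of the EarlyStopping setup suggested in the comments; the monitored metric and patience value are illustrative assumptions:

from keras.callbacks import EarlyStopping

# stop once validation loss stops improving, keeping the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)

model.fit(x_train, y_train, epochs=50, validation_split=0.1,
          class_weight=class_weights_dict, callbacks=[early_stop])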

change:

model.add(Activation('sigmoid'))

to:

model.add(Activation('softmax'))
Booboo answered 22/8, 2018 at 7:50
