How can I improve the classification accuracy of LSTM and GRU recurrent neural networks?

Binary Classification Problem in TensorFlow:

I have gone through the online tutorials and am trying to apply them to a real-time problem using a gated recurrent unit (GRU). I have tried everything I know of to improve the classification:

1) Added stacked RNN (GRU) layers
2) Increased the hidden units per RNN layer
3) Added "sigmoid" and "ReLU" activation functions for the hidden layer
4) Normalized the input data
5) Changed the hyperparameters

Please find the link to the dataset: https://github.com/madhurilalitha/Myfirstproject/blob/master/ApplicationLayerTrainingData1.txt

If you go through the dataset, you will see it has the labels "normal" and "other than normal". I have encoded "normal" as '1 0' and abnormal as '0 1'. I have also reshaped the dataset into 3D, with the shapes below:

Shape of new train X: (7995, 5, 40)
Shape of new train Y: (7995, 2)
Shape of new test X: (1994, 5, 40)
Shape of new test Y: (1994, 2)
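
For reference, the label encoding and the 3D layout described above can be sketched as follows. This is an illustration under assumptions, not the exact preprocessing used here: `encode_label` and `make_windows` are hypothetical helper names, the windows are assumed to be non-overlapping groups of 5 consecutive rows, and the toy rows have 3 features instead of 40.

```python
def encode_label(label):
    """One-hot encode a label: "normal" -> [1, 0], anything else -> [0, 1]."""
    return [1, 0] if label.strip().lower() == "normal" else [0, 1]


def make_windows(rows, time_steps):
    """Group consecutive rows into non-overlapping windows of `time_steps` rows.

    Trailing rows that do not fill a whole window are dropped (an assumption).
    """
    n_windows = len(rows) // time_steps
    return [rows[i * time_steps:(i + 1) * time_steps] for i in range(n_windows)]


# Toy data: 12 rows with 3 features each (the real data has 40 features).
rows = [[float(r), float(r) + 0.5, float(r) * 2] for r in range(12)]
windows = make_windows(rows, time_steps=5)

# Prints the number of windows, time steps per window, and features per step.
print(len(windows), len(windows[0]), len(windows[0][0]))
print(encode_label("normal"), encode_label("abnormal"))
```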

I am not sure where my logic goes wrong. Could someone help me find the fault in my code?

The classification accuracy on the test data is 52.3%, and the model reaches the same accuracy even on the training data. Please find the code below:

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.contrib)

#Hyper Parameters for the model
learning_rate = 0.001   
n_classes = 2    
display_step = 100    
input_features = train_X.shape[1] #No of selected features(columns)    
training_cycles = 1000    
time_steps = 5 # No of time-steps to backpropagate
hidden_units = 50 #No of GRU units in a GRU Hidden Layer   

#Input Placeholders
with tf.name_scope('input'):
    x = tf.placeholder(tf.float64,shape = [None,time_steps,input_features], name = "x-input")
    y = tf.placeholder(tf.float64, shape = [None,n_classes],name = "y-input")
#Weights and Biases    
with tf.name_scope("weights"):
    W = tf.Variable(tf.random_normal([hidden_units,n_classes]),name = "layer-weights")

with tf.name_scope("biases"):
    b = tf.Variable(tf.random_normal([n_classes]),name = "unit-biases")     


# Unstack to get a list of 'time_steps' tensors of shape (batch_size, input_features)
x_ = tf.unstack(x,time_steps,axis =1)    

#Defining a single GRU cell
gru_cell = tf.contrib.rnn.GRUCell(hidden_units)    

#GRU Output
with tf.variable_scope('MyGRUCel36'):   
    gruoutputs,grustates = tf.contrib.rnn.static_rnn(gru_cell,x_,dtype=tf.float64)

#Linear Activation , using gru inner loop last output
output = tf.add(tf.matmul(gruoutputs[-1],tf.cast(W,tf.float64)),tf.cast(b,tf.float64))

#Defining the loss function
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits = output))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

#Training the Model
sess = tf.InteractiveSession()    
sess.run(tf.global_variables_initializer())    
for i in range (training_cycles):   
    _,c = sess.run([optimizer,cost], feed_dict = {x:newtrain_X, y:newtrain_Y})

    if (i) % display_step == 0:
        print ("Cost for the training cycle : ",i," : is : ",sess.run(cost, feed_dict ={x :newtrain_X,y:newtrain_Y}))
correct = tf.equal(tf.argmax(output, 1), tf.argmax(y,1))    
accuracy = tf.reduce_mean(tf.cast(correct, 'float'))    
print('Accuracy on the overall test set is :',accuracy.eval({x:newtest_X, y:newtest_Y}))    
Patin answered 10/7, 2017 at 16:4 Comment(0)

It sounds like you're on the right track. I would try plotting your training loss to make sure it's decreasing as you expect.
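
One way to do that without extra tooling is to collect the cost value from each training step into a list and look at a smoothed version of it. The sketch below assumes a list `losses` filled inside the training loop (for example with the `c` returned by `sess.run`). The helper names are made up for illustration, and the numbers are toy values, not results from this dataset.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def is_trending_down(values, window=3):
    """True if the smoothed curve ends lower than it starts."""
    smoothed = moving_average(values, window)
    return smoothed[-1] < smoothed[0]


# Toy loss values; in the real loop you would append `c` after each sess.run call.
losses = [0.71, 0.70, 0.69, 0.66, 0.67, 0.61, 0.58, 0.55]

# A two-class cross-entropy loss stuck near ln(2) ~ 0.693 means the model is
# still predicting roughly 50/50 for every sample.
flat = [0.693] * 8

print(is_trending_down(losses))
print(is_trending_down(flat))
```

If the curve stays flat near 0.693 for the whole run, the problem is more likely in the optimization or the inputs than in the amount of data.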

Is there a reason that you think you should be getting higher accuracy? That could just be the best you can do with this amount of data. One of the best ways to increase your model performance is to get more data; is it possible to get more data?
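
A quick sanity check for whether 52.3% is meaningful is to compare it against the accuracy of always predicting the most frequent class. A minimal sketch, assuming the labels are available as a plain list of strings; `majority_baseline` is a hypothetical helper and the toy list below is made up, not taken from the dataset:

```python
from collections import Counter


def majority_baseline(labels):
    """Return (most common label, accuracy of always predicting it)."""
    if not labels:
        raise ValueError("labels must not be empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Toy labels for illustration only.
labels = ["normal"] * 11 + ["abnormal"] * 9
label, baseline = majority_baseline(labels)
print(label, baseline)
```

If the model's accuracy sits close to this baseline on both the training and test sets, it is probably predicting a single class and has not learned much yet.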

Hyperparameter optimization is a good way to proceed; I would try playing with different learning rates, different numbers of hidden layers, and different sizes of hidden layers.
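
A simple way to organize that search is a small grid search. The sketch below is only an outline: `train_and_evaluate` is a hypothetical function you would write yourself to build the GRU model with the given settings and return a validation score. The stand-in used here returns a made-up score so the example runs on its own.

```python
import itertools


def grid_search(train_and_evaluate, grid):
    """Try every combination in `grid` and return (best_params, best_score).

    `train_and_evaluate` is a hypothetical callable that takes keyword
    arguments matching the grid keys and returns a validation score
    (higher is better).
    """
    best_params, best_score = None, float("-inf")
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = train_and_evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def fake_train_and_evaluate(learning_rate, hidden_units, num_layers):
    """Stand-in for real training: a made-up score, NOT a real result."""
    return (-abs(learning_rate - 0.01)
            - abs(hidden_units - 100) / 1000
            - abs(num_layers - 2) / 10)


grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "hidden_units": [25, 50, 100],
    "num_layers": [1, 2, 3],
}
best_params, best_score = grid_search(fake_train_and_evaluate, grid)
print(best_params, best_score)
```

Score each combination on a held-out validation split rather than on the test set, so the final test accuracy stays an honest estimate.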

Circumpolar answered 10/7, 2017 at 17:41 Comment(4)
Thanks a lot, Mr. fbt. You are right, the accuracy is valid for the dataset I used. I went a step further and shuffled the samples while training my first model (case 1). When I used the original dataset (unshuffled, with its original size), the model performed better, with high valid accuracy (case 2). I made sure that I was not considering accuracy alone when measuring the performance of my model in these two cases. – Patin
If you're looking for other metrics, F1, recall, and precision are good ones in addition to accuracy; Kappa is another strong one. – Circumpolar
This is my new account. When I tried to upvote your answer, it showed the message "the votes are recorded but are not shown publicly as I have less reputation". – Patin
That's legit. I think you can click the checkmark to mark the answer as accepted without reputation. – Circumpolar
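
The metrics mentioned in the comments above (precision, recall, F1, and Cohen's kappa) can all be derived from the four confusion-matrix counts. A minimal sketch in plain Python, assuming labels are given as class indices such as those produced by `argmax`; `binary_metrics` is a hypothetical helper and the toy labels are made up:

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, F1 and Cohen's kappa for binary labels."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one sample")

    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: agreement beyond what chance alone would give.
    expected = (((tp + fn) / n) * ((tp + fp) / n)
                + ((tn + fp) / n) * ((tn + fn) / n))
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Toy labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A kappa near 0 means the predictions agree with the labels no better than chance, which is a useful complement when accuracy hovers around 50%.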
