How can I apply the SMOTE algorithm before the word-embedding layer in an LSTM?
I have a binary text-classification problem (Good vs. Bad reviews) with an imbalanced training set: 9,500 Good and 500 Bad samples out of 10,000 total. I am using an LSTM with pre-trained word embeddings (a 100-dimensional vector for each word), so each training input is a sequence of 50 word-dictionary ids (zero-padded when a description has fewer than 50 words, and trimmed to 50 when it exceeds 50 words).
Below is my general flow:
- Input: 1000 (batch) X 50 (sequence length)
- Word embedding: 200 (unique vocabulary words) X 100 (representation per word)
- After the embedding layer (new input for the LSTM): 1000 (batch) X 50 (sequence) X 100 (features)
- Final state from the LSTM: 1000 (batch) X 100 (units)
- Final layer: the 1000 (batch) X 100 (units) state multiplied by a 100 (units) X 2 (output classes) weight matrix, giving 1000 X 2
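The shape transformations above can be traced with a minimal NumPy sketch (the sizes are the hypothetical ones from my flow; the LSTM state is stood in for by a zero array, since only the shapes matter here):

```python
import numpy as np

batch, seq_len, vocab, embed_dim, units, classes = 1000, 50, 200, 100, 100, 2

# Input: a batch of zero-padded word-id sequences
x = np.random.randint(0, vocab, size=(batch, seq_len))      # (1000, 50)

# Embedding lookup: a vocab x embed_dim table indexed by the ids
emb_table = np.random.randn(vocab, embed_dim)               # (200, 100)
embedded = emb_table[x]                                     # (1000, 50, 100)

# Final hidden state of the LSTM (stand-in, shown by shape only)
final_state = np.zeros((batch, units))                      # (1000, 100)

# Final layer: multiply the state by a units x classes weight matrix
w_out = np.random.randn(units, classes)                     # (100, 2)
logits = final_state @ w_out                                # (1000, 2)
print(logits.shape)  # (1000, 2)
```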
All I want is to generate more training data for the Bad reviews with the help of SMOTE.
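What I am attempting can be sketched as follows. `smote_oversample` is a hypothetical minimal SMOTE helper (interpolating each synthetic sample between a minority point and one of its k nearest minority neighbours); note that running it directly on the id matrix produces fractional values, which are not valid word ids, and that is exactly the difficulty of applying SMOTE before the embedding layer:

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Minimal SMOTE sketch: each synthetic row is an interpolation between
    a random minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # Pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]          # k nearest-neighbour indices
    samples = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = nn[i, rng.integers(k)]
        lam = rng.random()                     # interpolation factor in [0, 1)
        samples.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(samples)

# Hypothetical Bad-review id rows: 500 samples of 50 word ids each
X_bad = np.random.randint(0, 200, size=(500, 50)).astype(float)
X_new = smote_oversample(X_bad, n_new=9000)    # balance 500 Bad up to 9500
print(X_new.shape)  # (9000, 50)  -- but the rows contain fractional "ids"
```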