I am working on a data set of approximately 3000 questions and I want to perform intent classification. The data set is not labelled yet, but from the business perspective, there's a requirement of identifying approximately 80 various intent classes. Let's assume my training data has approximately equal number of each classes and is not majorly skewed towards some of the classes. I am intending to convert the text to word2vec or Glove and then feed into my classifier.
I am familiar with cases in which I have a smaller number of intent classes, such as 8 or 10 and the choice of machine learning classifiers such as SVM, naive bais or deeplearning (CNN or LSTM).
My question is that if you have had experience with such large number of intent classes before, and which of machine learning algorithm do you think will perform reasonably? do you think if i use deep learning frameworks, still large number of labels will cause poor performance given the above training data?
We need to start labelling the data and it is rather laborious to come up with 80 classes of labels and then realise that it is not performing well, so I want to ensure that I am making the right decision on how many classes of intent maximum I should consider and what machine learning algorithm do you suggest?
Thanks in advance...