Intent classification with large number of intent classes
Asked Answered
I

1

5

I am working on a data set of approximately 3000 questions and I want to perform intent classification. The data set is not labelled yet, but from the business perspective, there's a requirement of identifying approximately 80 various intent classes. Let's assume my training data has approximately equal number of each classes and is not majorly skewed towards some of the classes. I am intending to convert the text to word2vec or Glove and then feed into my classifier.

I am familiar with cases in which I have a smaller number of intent classes, such as 8 or 10 and the choice of machine learning classifiers such as SVM, naive bais or deeplearning (CNN or LSTM).

My question is that if you have had experience with such large number of intent classes before, and which of machine learning algorithm do you think will perform reasonably? do you think if i use deep learning frameworks, still large number of labels will cause poor performance given the above training data?

We need to start labelling the data and it is rather laborious to come up with 80 classes of labels and then realise that it is not performing well, so I want to ensure that I am making the right decision on how many classes of intent maximum I should consider and what machine learning algorithm do you suggest?

Thanks in advance...

Insurrectionary answered 24/2, 2019 at 9:50 Comment(0)
P
6

First, word2vec and GloVe are, almost, dead. You should probably consider using more recent embeddings like BERT or ELMo (both of which are sensitive to the context; in other words, you get different embeddings for the same word in a different context). Currently, BERT is my own preference since it's completely open-source and available (gpt-2 was released a couple of days ago which is apparently a little bit better. But, it's not completely available to the public).

Second, when you use BERT's pre-trained embeddings, your model has the advantage of seeing a massive amount of text (Google massive) and thus can be trained on small amounts of data which will increase it's performance drastically.

Finally, if you could classify your intents into some coarse-grained classes, you could train a classifier to specify which of these coarse-grained classes your instance belongs to. Then, for each coarse-grained class train another classifier to specify the fine-grained one. This hierarchical structure will probably improve the results. Also for the type of classifier, I believe a simple fully connected layer on top of BERT would suffice.

Plusch answered 24/2, 2019 at 18:35 Comment(1)
Hi, can you please explain more in your last paragraph? For example, I have 3 parent categories. Each has 5 children category. Then, I would need to train 1 BERT model to classify among the parent categories. And, I would need to train 3 more different BERT models to classify the child category of each parent. Am I understanding you correctly? Many thanks.Clairvoyance

© 2022 - 2024 — McMap. All rights reserved.