Bias towards negative sentiments from Stanford CoreNLP

I'm experimenting with deriving sentiment from Twitter using Stanford's CoreNLP library, following https://www.openshift.com/blogs/day-20-stanford-corenlp-performing-sentiment-analysis-of-twitter-using-java - see that post for the code I'm implementing.
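
For illustration, here is a minimal sketch of the kind of call involved - phrased against CoreNLP's HTTP server from Python, since the linked post uses the Java API directly; the server address and the helper function are just assumptions for the example:

```python
# Rough sketch: querying the CoreNLP sentiment annotator through its HTTP server
# (start the server separately, e.g.
#   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000).
import json
import requests

CORENLP_URL = "http://localhost:9000"  # assumed local server address

def corenlp_sentiment(text):
    """Return one sentiment label per sentence (Very negative ... Very positive)."""
    props = {"annotators": "tokenize,ssplit,parse,sentiment", "outputFormat": "json"}
    resp = requests.post(CORENLP_URL,
                         params={"properties": json.dumps(props)},
                         data=text.encode("utf-8"))
    resp.raise_for_status()
    return [s["sentiment"] for s in resp.json()["sentences"]]

print(corenlp_sentiment("I love the new phone, but the battery dies way too fast."))
```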

I am getting results, but I've noticed an apparent bias towards 'negative' labels, both in my target dataset and in another dataset I use that has ground truth - the Sanders Analytics Twitter Sentiment Corpus (http://www.sananalytics.com/lab/twitter-sentiment/) - even though the ground-truth labels do not show this bias.

I'm posting this question on the off chance that someone else has experienced this and/or may know if this is the result of something I've done or some bug in the CoreNLP code.

(edit - sorry it took me so long to respond) I am posting links to plots showing what I mean. I don't have enough reputation to post the images, and can only include two links in this post, so I'll add the links in the comments.

Isia answered 8/9, 2014 at 16:49 Comment(2)
Can you show how much bias? E.g. can you show the results you get, and what you expected to get. – Grater
Here is a histogram of the ground truth sentiments from the Sanders corpus. Note the majority of tweets have neutral sentiment. I get similar distributions from other tools: AlchemyAPI, LIWC, and Wilson. But I get a different distribution from Stanford CoreNLP - there are many more negative sentiments. – Isia

I'd like to suggest this is simply a domain mismatch. The Stanford RNTN is trained on movie review snippets and you are testing on Twitter data. Beyond the topic mismatch, tweets also tend to be ungrammatical and to use abbreviated ("creative") language. If I had to suggest a more concrete reason, I would start with a lexical mismatch: perhaps negative emotions are expressed in a domain-independent way, e.g. with common adjectives, whereas positive emotions are more domain-dependent or more subtle.

It's still interesting that you're getting a negative bias. The Pollyanna hypothesis would suggest a positive bias, IMHO.

Going beyond your original question, there are several approaches to sentiment analysis designed specifically for microblogging data. See e.g. "The Good, The Bad and the OMG!" by Kouloumpis et al.

Siding answered 1/12, 2014 at 7:44 Comment(2)
Perhaps - could you suggest how to train Stanford NLP? – Randalrandall
Dear @SameerThigale, perhaps you will find my previous answer interesting: https://mcmap.net/q/2033302/-stanford-nlp-sentiment-analysis-for-chinese-language – Siding

Michael Haas correctly points out that there is a domain mismatch, which Richard Socher also notes in the comments section.

Sentences with a lot of unknown words and imperfect punctuation get flagged as negative.

If you are using Python, VADER is a great tool for Twitter sentiment analysis. It is a rule-based tool with only ~300 lines of code and a custom-made lexicon tuned for Twitter, containing ~8,000 entries including slang and emoticons.

It is easy to modify the rules as well as the lexicon, without any need for re-training. It is fully free and open source.
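
A minimal usage sketch, assuming the vaderSentiment package from PyPI (the thresholds mentioned in the comment follow VADER's own documentation):

```python
# Minimal VADER example; install with `pip install vaderSentiment`.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

tweets = [
    "OMG the new update is sooo good :)",   # slang + emoticon, covered by the lexicon
    "ugh, battery drains in 2 hrs :(",
]

for tweet in tweets:
    scores = analyzer.polarity_scores(tweet)
    # scores contains 'neg', 'neu', 'pos' and a normalized 'compound' score in [-1, 1];
    # a common convention: compound >= 0.05 -> positive, <= -0.05 -> negative, else neutral.
    print(tweet, scores)
```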

Eldaelden answered 9/9, 2015 at 9:37 Comment(0)
