Best Algorithmic Approach to Sentiment Analysis [closed]

Asked 16/11, 2010 at 21:57 Answered 23/9, 2013 at 19:9

My requirement is to take in news articles and determine whether they are positive or negative about a subject. I am taking the approach outlined below, but I keep reading that NLP may be of use here. Everything I have read points to NLP distinguishing opinion from fact, which I don't think would matter much in my case. I'm wondering two things:

1) Why wouldn't my algorithm work and/or how can I improve it? (I know sarcasm would probably be a pitfall, but again I don't see that occurring much in the type of news we will be getting.)

2) How would NLP help, why should I use it?

My algorithmic approach (I have dictionaries of positive, negative, and negation words):

1) Count number of positive and negative words in article

2) If a negation word is found within 2 or 3 words of a positive or negative word (e.g. "NOT the best"), negate the score.

3) Multiply the scores by weights that have been manually assigned to each word. (1.0 to start)

4) Add up the totals for positive and negative to get the sentiment score.
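The four steps above can be sketched in Python roughly as follows. This is only a minimal illustration: the word lists, the weights, and the `score_article` name are placeholders invented for the example, and scanning the 3 preceding words is just one reading of "within 2 or 3 words".

```python
import re

# Tiny placeholder dictionaries (assumed for illustration); real ones would be far larger.
# Each word maps to its manually assigned weight, 1.0 to start.
POSITIVE = {"best": 1.0, "good": 1.0, "gain": 1.0}
NEGATIVE = {"worst": 1.0, "bad": 1.0, "loss": 1.0}
NEGATIONS = {"not", "no", "never"}
WINDOW = 3  # how many preceding words to scan for a negation


def score_article(text):
    """Return (positive_total, negative_total, net_score) for the text."""
    words = re.findall(r"[a-z']+", text.lower())
    pos_total = neg_total = 0.0
    for i, word in enumerate(words):
        # Step 1: find the sentiment words.
        if word in POSITIVE:
            weight, is_positive = POSITIVE[word], True
        elif word in NEGATIVE:
            weight, is_positive = NEGATIVE[word], False
        else:
            continue
        # Step 2: a nearby negation flips the polarity of the word.
        if any(w in NEGATIONS for w in words[max(0, i - WINDOW):i]):
            is_positive = not is_positive
        # Steps 3 and 4: add the word's weight to the matching total.
        if is_positive:
            pos_total += weight
        else:
            neg_total += weight
    return pos_total, neg_total, pos_total - neg_total
```

For example, `score_article("This is not the best result, but it is a good gain.")` treats "best" as negated and counts "good" and "gain" as positive.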

Widget answered 16/11, 2010 at 21:57 Comment(3)

Sentiment analysis is definitionally a form of NLP; you're processing natural language text. The only way to know exactly how well your approach is going to work is to try it. Conveniently, that will also tell you if it works well enough for your purpose, which is actually the part that matters. – Finella 16/11, 2010 at 22:48

See this question and its answers for a simple algorithm that works well in practice: #3921259 – Drabeck 17/11, 2010 at 11:33

My algorithm is the best algorithm. Because I'm a grad student doing research in sentiment analysis, and I have a big ego :) – Doubletime 3/12, 2010 at 20:24

I don't think there's anything particularly wrong with your algorithm. It's a fairly straightforward and practical way to go, but there are a lot of situations where it will make mistakes:

1) Ambiguous sentiment words - "This product works terribly" vs. "This product is terribly good"
2) Missed negations - "I would never in a million years say that this product is worth buying"
3) Quoted/indirect text - "My dad says this product is terrible, but I disagree"
4) Comparisons - "This product is about as useful as a hole in the head"
5) Anything subtle - "This product is ugly, slow and uninspiring, but it's the only thing on the market that does the job"

I'm using product reviews for examples instead of news stories, but you get the idea. In fact, news articles are probably harder because they will often try to show both sides of an argument and tend to use a certain style to convey a point. The final example is quite common in opinion pieces, for example.
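To see how plain word counting trips over the first two cases, here is a small sketch. The lexicon, the `naive_score` helper, and the three-word negation window are all assumptions made for illustration, not a reference implementation.

```python
import re

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative ones.
LEXICON = {"good": 1, "worth": 1, "terribly": -1, "terrible": -1}
NEGATIONS = {"not", "never"}


def naive_score(sentence, window=3):
    """Sum lexicon polarities, flipping a word if a negation closely precedes it."""
    words = re.findall(r"[a-z]+", sentence.lower())
    total = 0
    for i, word in enumerate(words):
        polarity = LEXICON.get(word, 0)
        if any(w in NEGATIONS for w in words[max(0, i - window):i]):
            polarity = -polarity
        total += polarity
    return total


# "terribly" acts as an intensifier here, yet it cancels out "good".
print(naive_score("This product is terribly good"))  # prints 0
# "never" sits far outside the window, so "worth" still counts as positive.
print(naive_score("I would never in a million years say that this product is worth buying"))  # prints 1
```

Widening the window would catch the second sentence but would also start negating unrelated words, which is where syntactic information becomes attractive.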

As far as NLP helping you with any of this: word sense disambiguation (or even just part-of-speech tagging) may help with (1), syntactic parsing might help with the long-range dependencies in (2), and some kind of chunking might help with (3). It's all research-level work though; there's nothing that I know of that you can directly use. Issues (4) and (5) are a lot harder; I throw up my hands and give up at this point.

I'd stick with the approach you have and look at the output carefully to see if it is doing what you want. Of course, that then raises the issue of what you understand the definition of "sentiment" to be in the first place...

Raynaraynah answered 17/11, 2010 at 9:39 Comment(2)

My back-of-the-envelope estimate (based on 20 documents from a corpus of opinionated text I'm working on annotating) is that about 3% of positive/negative opinions are comparative, so #4 is probably not such a big issue. Long-range dependencies are a big issue, so syntactic parsing is a good idea, though the number of different patterns connecting product features with their opinions is huge. – Doubletime 22/11, 2010 at 4:4

great name and a wonderful answer – Crying 26/2, 2012 at 19:13

My favorite example is "just read the book". It contains no explicit sentiment word and depends heavily on context. If it appears in a movie review, it means that the-movie-sucks-it's-a-waste-of-your-time-but-the-book-is-good. However, if it is in a book review, it delivers a positive sentiment.

And what about "this is the smallest [mobile] phone in the market"? Back in the '90s, it was great praise. Today it may indicate that the phone is way too small.

I think this is the place to start in order to get the complexity of sentiment analysis: http://www.cs.cornell.edu/home/llee/opinion-mining-sentiment-analysis-survey.html (by Lillian Lee of Cornell).

Encomium answered 17/11, 2010 at 20:20 Comment(2)

Sentiment analysis is not a magic lamp. It is not meant to provide insight based on a single isolated instance. A human could not even provide useful output based on a single contextless instance of the sentences you give. This is why it must be taken in aggregate, over a scenario of interest, with tens/hundreds/thousands of utterances analysed to get an idea of sentiment (or sentiment flow) on a topic (over time). – Crying 10/1, 2013 at 10:24
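As a rough illustration of taking scores in aggregate, the sketch below averages per-document scores by month. The `(date, score)` pairs are made-up sample data and `monthly_sentiment` is a hypothetical helper; the per-document scores could come from any scorer.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_sentiment(scored_docs):
    """Map (year, month) -> mean score over all documents from that month."""
    buckets = defaultdict(list)
    for day, score in scored_docs:
        buckets[(day.year, day.month)].append(score)
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Made-up scores for articles about a single topic.
docs = [
    (date(2012, 11, 3), -1.0),
    (date(2012, 11, 20), -3.0),
    (date(2012, 12, 5), 1.0),
    (date(2012, 12, 18), 2.0),
    (date(2012, 12, 30), 3.0),
]
trend = monthly_sentiment(docs)
```

With this sample data, the November average is negative and the December average is positive, which is the kind of shift that individual noisy scores would not show reliably.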

Cris - yes and no. You are right for some practical applications (depending on measurable and quantifiable error in the domain of interest) but I think that NLP/CL researchers are also interested in the semantic meaning of a specific instance (sentence w/out context). Humans are pretty good at that for most instances. Still much better than state of the art algorithms. – Encomium 14/4, 2016 at 20:14

Machine-learning techniques are probably better.

Whitelaw, Garg, and Argamon report 92% accuracy with an approach that deals with negation in a way similar to yours and uses support vector machines for text classification.

Doubletime answered 3/12, 2010 at 20:33 Comment(1)

The original link was broken for me, but I believe I found the paper you were meaning to point to so I edited that into your post. – Chlorohydrin 8/3, 2016 at 18:42

You may find the OpinionFinder system and the papers describing it useful. It is available at http://www.cs.pitt.edu/mpqa/ with other resources for opinion analysis.

It goes beyond polarity classification at the document level and tries to find individual opinions at the sentence level.

Malarkey answered 25/5, 2011 at 15:43 Comment(0)

I believe the best answer to all of the questions you mentioned is to read the book "Sentiment Analysis and Opinion Mining" by Professor Bing Liu. This book is the best of its kind in the field of sentiment analysis. It is amazing. Just take a look at it and you will find the answer to all your 'why' and 'how' questions!

Stucco answered 23/9, 2013 at 19:9 Comment(0)

Why don't you try something similar to how the SpamAssassin spam filter works? There's really not much difference between intention mining and opinion mining.

Babul answered 16/11, 2010 at 22:53 Comment(1)

-1. There's a lot of difference in practice. Opinion mining is a lot harder than spam detection. – Drabeck 17/11, 2010 at 11:30
