Java text analysis libraries

3

13

I'm looking for a Java-based solution for analysing sentences and logging whether a keyword was used positively or negatively.

For example, the keyword might be 'cabbages' and the sentence:

'I like cabbages but not peas'

I'd like a Java text analyser of some kind to log this as positive. Can the Lucene (Hibernate Search) libraries be used for this?

Any thoughts?

Klos answered 23/9, 2010 at 12:33 Comment(0)
16

You're looking for "sentiment analysis". One possibility is LingPipe, whose site also kindly links to its competitors. Jeff Dalton also has a great list of natural language processing tools on his blog.

Folberth answered 23/9, 2010 at 12:57 Comment(2)
There is a wealth of stuff here. It is going to take some time to sift through it. I shall report back on my findings - but many thanks for the pointers. - Klos
Yes, please do report back if you find anything useful. - Folberth
1

I doubt there's anything like that. Lucene definitely can't do it out of the box.

How do you even define "whether a key word was used positively or negatively" in a way that can be evaluated programmatically? To do it properly, you'd have to analyse the text for its actual meaning, which is an AI problem that is not even remotely solved.

I suppose you could solve it approximately by just doing a statistical analysis of whether the keyword appears more often close to positive (like, good, great, wonderful) or negative (bad, hate, crappy, damn) keywords, but even there, negations, sarcasm and complex sentence structures will be problematic.
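To make the approximate approach concrete, here is a minimal sketch of it, not a real sentiment analyser. The class name `KeywordSentiment`, the window size of 3 tokens and the tiny word lists are all assumptions made for illustration; the lists just reuse the example words above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: counts positive/negative lexicon words near a keyword.
public class KeywordSentiment {
    // Tiny illustrative lexicons (assumed); a usable system would need far larger lists.
    private static final Set<String> POSITIVE =
            new HashSet<>(Arrays.asList("like", "good", "great", "wonderful"));
    private static final Set<String> NEGATIVE =
            new HashSet<>(Arrays.asList("bad", "hate", "crappy", "damn"));

    // Assumed proximity window: how many tokens on each side of the keyword to inspect.
    private static final int WINDOW = 3;

    /** Returns (positive hits - negative hits) within WINDOW tokens of each keyword occurrence. */
    public static int score(String sentence, String keyword) {
        // Naive tokenisation: lower-case, split on anything that is not a letter or apostrophe.
        String[] tokens = sentence.toLowerCase(Locale.ROOT).split("[^a-z']+");
        String target = keyword.toLowerCase(Locale.ROOT);
        int score = 0;
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(target)) {
                continue;
            }
            int from = Math.max(0, i - WINDOW);
            int to = Math.min(tokens.length - 1, i + WINDOW);
            for (int j = from; j <= to; j++) {
                if (POSITIVE.contains(tokens[j])) {
                    score++;
                } else if (NEGATIVE.contains(tokens[j])) {
                    score--;
                }
            }
        }
        return score;
    }

    public static void main(String[] args) {
        String sentence = "I like cabbages but not peas";
        System.out.println(score(sentence, "cabbages")); // prints 1
        System.out.println(score(sentence, "peas"));     // prints 0
    }
}
```

Note that 'peas' scores 0 rather than negative: the sketch knows nothing about "not", which is exactly the negation problem mentioned above. With a wider window, 'like' would even be counted in favour of 'peas'.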

Goidelic answered 23/9, 2010 at 12:44 Comment(1)
I am reminded of a translator that started with "the spirit is willing, but the flesh is weak", and came back with "the wine is good, but the meat is rotten". - Catalog
0

Take a look at Mahout Taste, which builds on Lucene but adds a lot of what you need out of the box. (edit) I should add, Mahout Taste is merely related to what you're looking for and not a 100% match.

Lexicon answered 23/9, 2010 at 12:43 Comment(1)
(I'm the author.) Taste is a collaborative filtering engine. The encapsulating project, Mahout, concerns more general data mining but does not include sentiment analysis. - Syllepsis
