How can I start building a WordNet for the Turkish language to use in sentiment analysis?

Although I have an EE background, I never had the chance to take natural language processing classes.

I would like to build a sentiment analysis tool for the Turkish language. I think it is better to create a Turkish wordnet database than to translate the text to English and analyze the error-prone translation with the existing English tools. (Is it?)

So what do you recommend I do? Should I first take NLP classes from an open course website? I really don't know where to start. Could you help me, and maybe provide a step-by-step guide? I know this is an academic-scale project, but I am interested in building skills in this area as a hobby.

Thanks in advance.

Pray answered 27/12, 2011 at 5:33 Comment(5)
You might be interested in following the proposals for potential new SE sites, including Turkish Language & Usage and StackOverflow in Turkish. - Wyn
I think you're right about needing to build a Turkish word and phrase database for this rather than translating. However, I'm not sure this is the best place for this question. It might be appropriate to migrate it to the Programmers.SE site, as it is a conceptual issue rather than a coding issue. Thoughts? - Wyn
You might be right. I'm looking for a way to migrate it to that site. - Pray
You can migrate by using the "flag" link and leaving a flag for moderator attention that asks for the question to be migrated. I have also filed one for you, but if you file one as the owner of the question, it is more likely to be approved, and sooner :) - Wyn
I noticed this in one of the deleted answers: dblab.upatras.gr/balkanet/index.htm It is a project that ran from 2001 to 2004 to build a wordnet for Balkan languages, and Turkish is included. - Franky

Here is the process I have used before (making Japanese, Chinese, German and Arabic semantic networks):

  1. Gather at least two English/Turkish dictionaries. They must be independent, not derived from each other. You can use Wikipedia to auto-generate one of your dictionaries. If you need to publish your network, you may need open-source dictionaries, license fees, or a lawyer.
  2. Use those dictionaries to translate the English WordNet, producing a confidence rating for each synset.
  3. Keep the translations with strong confidence, and manually approve or fix those with medium or low confidence.
  4. Finish it off manually.
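
As a rough illustration of steps 2 and 3, here is a minimal Python sketch. It is not the code from the paper: the synset ids, the two toy dictionaries, and the three-tier rating rule (strong when every dictionary proposes the word, medium when one dictionary proposes it for more than one English lemma, low otherwise) are all assumptions made up for this example. It also rates each candidate Turkish word rather than the synset as a whole.

```python
from collections import defaultdict

# Hypothetical stand-ins for English WordNet synsets: id -> English lemmas.
SYNSETS = {
    "syn-dog": ["dog"],
    "syn-bank-finance": ["bank", "depository"],
    "syn-happy": ["happy", "glad"],
}

# Two independent English -> Turkish dictionaries (toy entries, assumed).
DICT_A = {
    "dog": ["köpek"],
    "bank": ["banka", "kıyı"],
    "depository": ["banka"],
    "happy": ["mutlu", "memnun"],
    "glad": ["memnun"],
}
DICT_B = {
    "dog": ["köpek", "it"],
    "bank": ["banka"],
    "happy": ["mutlu"],
}


def candidate_support(lemmas, dictionaries):
    """Map each Turkish candidate to the (dictionary index, lemma) pairs proposing it."""
    support = defaultdict(set)
    for idx, dictionary in enumerate(dictionaries):
        for lemma in lemmas:
            for turkish in dictionary.get(lemma, []):
                support[turkish].add((idx, lemma))
    return support


def rate(evidence, n_dicts):
    """Assumed rating rule: agreement across dictionaries beats agreement across lemmas."""
    dicts = {idx for idx, _ in evidence}
    lemmas = {lemma for _, lemma in evidence}
    if len(dicts) == n_dicts:
        return "strong"
    if len(lemmas) > 1:
        return "medium"
    return "low"


def translate_wordnet(synsets, dictionaries):
    """Return {synset id: {Turkish candidate: confidence label}}."""
    result = {}
    for synset_id, lemmas in synsets.items():
        support = candidate_support(lemmas, dictionaries)
        result[synset_id] = {
            turkish: rate(evidence, len(dictionaries))
            for turkish, evidence in support.items()
        }
    return result


ratings = translate_wordnet(SYNSETS, [DICT_A, DICT_B])
for synset_id, candidates in ratings.items():
    print(synset_id, candidates)
```

In this toy data, "kıyı" (the river-bank sense) gets a low rating for the financial synset because only one dictionary proposes it for one lemma. That is the kind of wrong-sense translation the manual review in step 3 is meant to catch.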

I expanded on this in the "Automatic Translation Of WordNet" section of my 2008 paper: http://dcook.org/mlsn/about/papers/nlp2008.MLSN_A_Multilingual_Semantic_Network.pdf

(For your stated goal of a Turkish sentiment dictionary, there are other approaches that do not involve a semantic network. E.g. "Sentiment Analysis and Opinion Mining", by Bing Liu, is a good round-up of the research. But a semantic network approach will, IMHO, always give better results in the long run, and it has so many other uses.)
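
To make the link between a semantic network and sentiment concrete, one common way to use such a network is to start from a few hand-labelled seed words and spread their polarity along synonym links. The sketch below is only an illustration under assumptions: the tiny Turkish synonym graph, the seed scores, and the decay factor are invented for the example, and a real system would need Turkish morphological analysis before looking words up.

```python
from collections import deque

# Hypothetical toy synonym network: word -> synonym neighbours.
SYNONYMS = {
    "iyi": ["güzel", "hoş"],
    "güzel": ["iyi", "hoş"],
    "hoş": ["iyi", "güzel"],
    "kötü": ["berbat"],
    "berbat": ["kötü", "korkunç"],
    "korkunç": ["berbat"],
}

# Hand-labelled seed words (assumed): positive = 1.0, negative = -1.0.
SEEDS = {"iyi": 1.0, "kötü": -1.0}


def propagate(graph, seeds, decay=0.5):
    """Breadth-first spread of seed polarity; each hop multiplies the score by `decay`."""
    scores = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for neighbour in graph.get(word, []):
            if neighbour not in scores:  # first (shortest) path wins
                scores[neighbour] = scores[word] * decay
                queue.append(neighbour)
    return scores


def score_text(text, scores):
    """Sum the polarity of known tokens; unknown tokens count as neutral.

    Note: str.lower() does not apply Turkish casing rules (I/ı, İ/i),
    and plain whitespace splitting ignores suffixes, so this is a toy.
    """
    return sum(scores.get(token, 0.0) for token in text.lower().split())


lexicon = propagate(SYNONYMS, SEEDS)
print(lexicon)
print(score_text("film berbat", lexicon))
```

The decay factor encodes the assumption that a word further from a seed is a weaker indicator of sentiment. Antonym links, which a full wordnet also provides, could flip the sign instead.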

Franky answered 6/11, 2013 at 4:1 Comment(0)
