Snowball Stemmer Usage

1

5

I'd like to use the stemmer here for merging word counts:
http://snowball.tartarus.org/download.html
The page has a download link, but I'm not sure how to integrate the files into my Eclipse project.
It's not just a jar to drop into my lib folder; it's a directory tree. Does anyone know of documentation explaining this? I didn't see any on the website.
(As in, what do I import, how do I call it, etc.)

Rurik answered 30/7, 2013 at 19:56 Comment(2)
The Snowball manual and the Snowball "how to run it" page. Creese
I've read both of those, but the second one explains how to run it standalone with Java, not how to import it into a project, and the first one doesn't touch on real setup. Rurik
16

Build the jar file and add it to your Build Path.

Details:

  • Download the tgz with the code from http://snowball.tartarus.org/download.php
  • Uncompress it.
  • Go to the libstemmer_java directory and read the README.
  • Follow the instructions to compile (using javac).
  • You might have to correct or remove java/org/tartarus/snowball/ext/frenchStemmer.java, because it has an error and doesn't compile.
  • Create the jar file: go to the libstemmer_java/java directory, then run jar cvf libstemmer.jar *
  • Add libstemmer.jar to your Build Path (in Eclipse: Project > Properties > Java Build Path > Libraries tab).

Then you can use the stemmers with something like:

import org.tartarus.snowball.ext.spanishStemmer;
...
spanishStemmer stemmer = new spanishStemmer();
// Load the word to be stemmed.
stemmer.setCurrent("torero");
// Run the stemmer and print the result if it reports success.
if (stemmer.stem()) {
    System.out.println(stemmer.getCurrent());
}
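
For the word-count merging mentioned in the question, a minimal sketch might look like the following. The class name StemCounts, the method mergeByStem, and the toy "strip a trailing s" stemmer are all hypothetical and exist only so the example runs without the Snowball jar; in a real project the function passed in would presumably wrap the setCurrent / stem / getCurrent calls shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class StemCounts {

    /**
     * Merges the counts of all words that map to the same stem.
     * The stem function is supplied by the caller, so this class
     * does not depend on any particular stemmer implementation.
     */
    public static Map<String, Integer> mergeByStem(Map<String, Integer> wordCounts,
                                                   UnaryOperator<String> stem) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : wordCounts.entrySet()) {
            // Add this word's count to the running total for its stem.
            merged.merge(stem.apply(e.getKey()), e.getValue(), Integer::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("toreros", 2);
        counts.put("torero", 3);
        counts.put("casa", 1);

        // Stand-in stemmer for this demo only (NOT Snowball): drops a trailing "s".
        // With the jar on the Build Path, one could instead use something like:
        //   w -> { stemmer.setCurrent(w); stemmer.stem(); return stemmer.getCurrent(); }
        UnaryOperator<String> toyStem =
                w -> w.endsWith("s") ? w.substring(0, w.length() - 1) : w;

        System.out.println(mergeByStem(counts, toyStem)); // prints {torero=5, casa=1}
    }
}
```

Passing the stemmer in as a function keeps the merging logic testable on its own, and swapping languages only means constructing a different stemmer class.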
Multicellular answered 27/4, 2014 at 8:38 Comment(3)
I ran into a compile error, so I took out all but the English language packs and it compiled perfectly. Thank you for actually answering this question and not telling someone to RTFM. :) Astrakhan
I am having an unusual problem. I have my string in a variable called "word", like word="torero"; and when I pass this variable to the stemmer it won't work, e.g. stemmer.setCurrent(word); stemmer.stem(); System.out.println(stemmer.getCurrent()); It won't get stemmed. Tell me what I am doing wrong here. Donelladonelle
No, @JunaidShirwani. It sometimes returns wrong output. The same thing is happening to me. Knockabout
