OK, I have the following code to train the name finder (NER) from OpenNLP:
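// Read the training file line by line
FileReader fileReader = new FileReader("train.txt");
ObjectStream<String> fileStream = new PlainTextByLineStream(fileReader);
// Parse each line into a NameSample (<START:type> ... <END> annotations)
ObjectStream<NameSample> sampleStream = new NameSampleDataStream(fileStream);
// Train the model and build the name finder from it
TokenNameFinderModel model = NameFinderME.train("pt-br", "train", sampleStream, Collections.<String, Object>emptyMap());
nfm = new NameFinderME(model);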
I don't know if I'm doing something wrong or if something is missing, but the classification is not working. I suspect that train.txt is wrong.
The problem is that all tokens are classified as a single type.
My train.txt data looks like the following example, but with many more entries and much more variation. One more detail: I'm classifying the text one word at a time, rather than passing all of its tokens at once.
<START:distance> 8000m <END>
<START:temperature> 100ºC <END>
<START:weight> 50kg <END>
<START:name> Renato <END>
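As a sanity check on a file in this format, the number of annotated spans per type can be counted with a few lines of plain Java. This is only a sketch under stated assumptions: TrainingDataCheck and countSpansByType are hypothetical names that are not part of OpenNLP, and it assumes every span opens with a <START:type> tag as in the sample above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP: counts annotated spans per type.
final class TrainingDataCheck {

    // Matches the opening tag of a span, e.g. "<START:distance>",
    // and captures the type name ("distance").
    private static final Pattern START_TAG = Pattern.compile("<START:([^>\\s]+)>");

    // Returns a map from span type to the number of times it is opened.
    static Map<String, Integer> countSpansByType(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = START_TAG.matcher(line);
            while (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The four sample lines from the question ("\u00ba" is the "º" sign).
        List<String> sample = Arrays.asList(
            "<START:distance> 8000m <END>",
            "<START:temperature> 100\u00baC <END>",
            "<START:weight> 50kg <END>",
            "<START:name> Renato <END>");
        // Prints the span count for each type found in the sample.
        System.out.println(countSpansByType(sample));
    }
}
```

Running the same count over the real train.txt (for example by reading it with Files.readAllLines) would show whether one type dominates the training data.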
Can somebody show me what I'm doing wrong?