I have an annotated corpus in the CoNLL-2002 format, i.e. a tab-separated file with one token per line: the token, its POS tag, and an IOB tag followed by the entity tag. Example:
John NNP B-PERSON
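To make the format concrete, a file like this could be read into sentences of (token, POS, IOB) triples roughly as follows. This is only a sketch: `read_conll` is a hypothetical helper name, every sample line other than the first is made up for illustration, and it assumes that blank lines separate sentences and that tokens contain no whitespace.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str, str]  # (word, POS tag, IOB-entity tag)


def read_conll(lines: Iterable[str]) -> List[List[Token]]:
    """Group three-column lines into sentences of (word, pos, iob) triples."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # Assumption: a blank line marks a sentence boundary.
            if current:
                sentences.append(current)
                current = []
            continue
        # split() accepts tabs or spaces between the three columns.
        word, pos, iob = line.split()
        current.append((word, pos, iob))
    if current:
        sentences.append(current)
    return sentences


# Illustrative input; only the first line comes from the corpus example.
sample = ["John\tNNP\tB-PERSON", "runs\tVBZ\tO", "", "Mary\tNNP\tB-PERSON"]
sentences = read_conll(sample)
```

For a real corpus, the same function could be passed an open file handle instead of the `sample` list.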
I want to train a Portuguese NER model in NLTK, preferably with the MaxEnt model. I do not want to use the "built-in" Stanford NER in NLTK, since I have already been able to use the stand-alone Stanford NER. I want the MaxEnt model as a comparison against the Stanford NER.
I found NLTK-trainer, but I wasn't able to use it.
How can I achieve this?