How to do POS tagging using SVM in Python?

I want to do POS tagging with an SVM on a non-English corpus in Python. It looks like NLTK does not provide an SVM-based tagger yet (http://www.nltk.org/_modules).

scikit-learn has an SVM module, so I installed scikit-learn and used it in Python, but I cannot find any tutorials on POS tagging with an SVM.

I really have no clue what to do. Any help would be appreciated.

Estrone answered 5/9, 2015 at 9:53 Comment(0)

Does it have to be an SVM? NLTK has built-in tools for POS tagging: see "Categorizing and Tagging Words" in the NLTK book.

If you want to use a custom classifier, look at http://www.nltk.org/api/nltk.classify.html and search (Ctrl+F) for "svm": NLTK provides a wrapper for scikit-learn algorithms called SklearnClassifier. Then look at http://www.nltk.org/api/nltk.tag.html and search for "classifier": there is a class, nltk.tag.sequential.ClassifierBasedPOSTagger, which apparently can use wrapped classifiers from sklearn.

I haven't tried this but it might work.

EDIT: It should work like this:

import nltk
from nltk.classify import SklearnClassifier
from sklearn.svm import SVC

# train_sents: a list of sentences, each a list of (word, tag) tuples
clf = SklearnClassifier(SVC(), sparse=False)
cpos = nltk.tag.sequential.ClassifierBasedPOSTagger(
    train=train_sents,
    classifier_builder=lambda train_feats: clf.train(train_feats))

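One thing to keep in mind is that sklearn classifiers work on numerical features only, so symbolic features such as words, suffixes and previous tags have to be converted to numbers. SklearnClassifier does this by passing the feature dictionaries through scikit-learn's DictVectorizer, which one-hot encodes string values.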
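
As a rough illustration of that conversion step, here is a minimal sketch that uses scikit-learn directly instead of the NLTK wrapper. The feature set in token_features, the toy corpus and the tag helper are all made up for this example, and LinearSVC is swapped in for SVC only because it is a common choice for sparse one-hot features. Treat it as a starting point under those assumptions, not a tuned tagger.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def token_features(words, i):
    """Hypothetical feature set for the word at position i (an assumption)."""
    word = words[i]
    return {
        "word": word.lower(),
        "suffix2": word[-2:],
        "is_title": word.istitle(),
        "prev": words[i - 1].lower() if i > 0 else "<s>",
        "next": words[i + 1].lower() if i < len(words) - 1 else "</s>",
    }


# Toy training data, invented for illustration: lists of (word, tag) pairs.
train_sents = [
    [("The", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("A", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("The", "DET"), ("bird", "NOUN"), ("sings", "VERB")],
]

# Flatten the corpus into one feature dict and one tag per token.
X, y = [], []
for sent in train_sents:
    words = [w for w, _ in sent]
    for i, (_, tag_label) in enumerate(sent):
        X.append(token_features(words, i))
        y.append(tag_label)

# DictVectorizer one-hot encodes string values and keeps numeric/boolean
# values as numbers; the SVM then trains on the resulting matrix.
model = make_pipeline(DictVectorizer(), LinearSVC())
model.fit(X, y)


def tag(words):
    """Tag a tokenized sentence, one independent prediction per token."""
    feats = [token_features(words, i) for i in range(len(words))]
    return list(zip(words, model.predict(feats)))


print(tag(["The", "cat", "barks"]))
```

Here fit plays the role of training and tag the role of tagging. Unlike ClassifierBasedPOSTagger, this sketch does not feed previously predicted tags back in as features; each token is classified on its own.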

Peashooter answered 5/9, 2015 at 12:55 Comment(2)
Thanks, hellpanderrr. I tried other taggers (CRF, TBL, HMM, ...) and wanted to use an SVM. I used this wrapper but still cannot do any POS tagging. - Estrone
With the other taggers in Python, you only need training data to train them, and then you can use the tag and evaluate methods. But when I use the SVM from scikit-learn or SklearnClassifier, I cannot find any method to train or tag. - Estrone
