Getting additional information (Active/Passive, Tenses ...) from a Tagger

I'm using the Stanford Tagger to determine parts of speech. However, I want to get more information out of the text. Is it possible to get further information, such as the tense of a sentence or whether it is in active or passive voice?

So far, I'm using a very basic POS-tagging approach:

// `tagger` is a MaxentTagger that has already been loaded with a model.
List<List<TaggedWord>> taggedUnits = new ArrayList<List<TaggedWord>>();

String input = "This sentence is going to be future. The door was opened.";
// Split the input into sentences and tag each one separately.
for (List<HasWord> sentence : MaxentTagger.tokenizeText(new StringReader(input)))
{
    taggedUnits.add(tagger.tagSentence(sentence));
}
Polis answered 21/10, 2013 at 13:31 Comment(0)

You can get tense information for individual verbs from the Penn Treebank verb tags:

VB   Verb, base form
VBD  Verb, past tense
VBG  Verb, gerund or present participle
VBN  Verb, past participle
VBP  Verb, non-3rd person singular present
VBZ  Verb, 3rd person singular present
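
As a rough illustration of how these tags could be turned into a tense guess, here is a minimal sketch. The class `VerbTagTense` and its method `guessTense` are hypothetical names, not part of the Stanford tagger API; the sketch only assumes that you have already collected the tag strings for one sentence (for example via `TaggedWord.tag()`). It looks at the first finite verb or modal tag and ignores everything else.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Stanford tagger API.
class VerbTagTense {

    // Penn Treebank verb tags and the verb form each one marks.
    static final Map<String, String> VERB_FORMS = Map.of(
        "VB",  "base form",
        "VBD", "past tense",
        "VBG", "gerund or present participle",
        "VBN", "past participle",
        "VBP", "present, non-3rd person singular",
        "VBZ", "present, 3rd person singular");

    // Coarse guess based on the first finite verb or modal tag in the sentence.
    // Assumption: the first such tag belongs to the main clause.
    static String guessTense(List<String> tags) {
        for (String tag : tags) {
            switch (tag) {
                case "VBD": return "past";
                case "VBP":
                case "VBZ": return "present";
                case "MD":  return "modal (possibly future)";
                default:    break; // not a finite verb, keep scanning
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Assumed tags for "The door was opened ."
        System.out.println(guessTense(List.of("DT", "NN", "VBD", "VBN", ".")));
    }
}
```

Note that the tags only describe the form of each verb. If the tagger labels "is" as VBZ in "This sentence is going to be future.", this sketch reports "present", so constructions like "going to" or "will" need extra rules on top of the tags.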

For active versus passive voice, you can use the typed dependencies included in Stanford CoreNLP.

  1. If the sentence is in active voice, an 'nsubj' dependency should exist.
  2. If the sentence is in passive voice, an 'nsubjpass' dependency should exist.
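
The two rules above can be sketched as a small check over the dependency relation names of one sentence. This is only a sketch under assumptions: `VoiceHeuristic` and `classify` are hypothetical names, the relation names are expected as plain strings that you have already extracted from the parser output, and the extra spelling `nsubj:pass` is included on the assumption that your parser emits Universal Dependencies style labels. A sentence with several clauses can contain both relations, so the sketch reports that case separately instead of picking one.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical helper for illustration; not part of Stanford CoreNLP.
class VoiceHeuristic {

    enum Voice { ACTIVE, PASSIVE, MIXED, UNKNOWN }

    // Classifies a sentence from the names of its typed dependency relations.
    static Voice classify(Collection<String> relations) {
        boolean active = false;
        boolean passive = false;
        for (String rel : relations) {
            if (rel.equals("nsubjpass") || rel.equals("nsubj:pass")) {
                passive = true;   // passive nominal subject found
            } else if (rel.equals("nsubj")) {
                active = true;    // active nominal subject found
            }
        }
        if (active && passive) return Voice.MIXED;
        if (passive) return Voice.PASSIVE;
        if (active) return Voice.ACTIVE;
        return Voice.UNKNOWN;     // no subject relation at all
    }

    public static void main(String[] args) {
        // Assumed relations for "The door was opened."
        System.out.println(classify(List.of("det", "nsubjpass", "auxpass", "root")));
    }
}
```

For finer results you could run the same check per clause, grouping relations by their governing verb, rather than per sentence.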

Hope this helps.

Anthotaxy answered 22/10, 2013 at 8:31 Comment(3)
Thank you very much for your help! However, I got stuck when using German for "active/passive detection" -> #19531708 - Octosyllabic
Been reading the docs on this, and this nsubjpass relationship seems to be a feature of all passive sentences - nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/… - Apostle
This is very useful, but it isn't the full story because both can turn up. For example, in "They spoke no more until camp was made." I get nsubjpass for 'camp' and nsubj for 'They'. Would it be reasonable to assume the earlier one in the sentence is more important? - Glassman
