Speed up CoreNLP Sentiment Analysis
Can anybody think of a way to speed up my CoreNLP Sentiment Analysis (below)?

I initialize the CoreNLP pipeline once on server startup:

// Initialize the CoreNLP text processing pipeline
public static Properties props = new Properties();
public static StanfordCoreNLP pipeline;

static {
    // Set the text processing pipeline's annotators
    props.setProperty("annotators", "tokenize, ssplit, pos, parse, sentiment");
    // Use Shift-Reduce Constituency Parsing (O(n),
    // http://nlp.stanford.edu/software/srparser.shtml) instead of CoreNLP's
    // default Probabilistic Context-Free Grammar Parsing (O(n^3))
    props.setProperty("parse.model", "edu/stanford/nlp/models/srparser/englishSR.ser.gz");
    pipeline = new StanfordCoreNLP(props);
}
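One knob not set above is annotator-level threading. CoreNLP accepts a global `threads` property (and the parser a `parse.maxlen` cutoff for skipping very long sentences); the sketch below shows both, but the thread count of 4 and the 60-token cutoff are assumptions to tune, not recommended values, and any speedup depends on your workload and hardware.

```java
import java.util.Properties;

public class PipelineConfig {
    // Builds the same pipeline configuration as above, plus two
    // hedged tuning knobs: multithreaded annotation and a parse
    // length cutoff. Both values are placeholders to benchmark.
    public static Properties buildProps() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, parse, sentiment");
        props.setProperty("parse.model",
                "edu/stanford/nlp/models/srparser/englishSR.ser.gz");
        // Let annotators that support it run multithreaded;
        // "4" is an assumed core count, not a default.
        props.setProperty("threads", "4");
        // Skip parsing sentences longer than 60 tokens; the cutoff
        // is an assumption, and skipped sentences get no parse tree.
        props.setProperty("parse.maxlen", "60");
        return props;
    }
}
```

Note that `parse.maxlen` trades coverage for speed: sentences over the cutoff won't get a sentiment tree at all, so it only makes sense if very long sentences are rare in your data.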

Then I call the pipeline from my Controller:

String text = "A sample string.";
Annotation annotation = pipeline.process(text);
List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
for (CoreMap sentence : sentences) {
    Tree tree = sentence.get(SentimentCoreAnnotations.SentimentAnnotatedTree.class);
    int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
    ...
}
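Since each pipeline.process call carries fixed per-document overhead, one idea (not from the original post, just a sketch) is to batch a request's texts into a single document: with CoreNLP's `ssplit.eolonly` property set, sentences are split only on newlines, so newline-joined inputs come back as one sentence per input and a single process call replaces 100. The helper below only prepares the batch; the actual CoreNLP call is left as a comment, and it assumes each input is a single sentence containing no newlines.

```java
import java.util.List;
import java.util.Properties;

public class BatchPrep {
    // Join the texts so that one pipeline.process call annotates all
    // of them. Assumes each input is one sentence with no embedded
    // newlines; results come back one sentence per input, in order.
    public static String batch(List<String> texts) {
        return String.join("\n", texts);
    }

    // Same pipeline configuration as above, plus newline-only
    // sentence splitting so the batch boundaries are preserved.
    public static Properties batchProps() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, parse, sentiment");
        props.setProperty("parse.model",
                "edu/stanford/nlp/models/srparser/englishSR.ser.gz");
        props.setProperty("ssplit.eolonly", "true"); // one sentence per line
        return props;
    }

    // Usage (sketch): Annotation a = pipeline.process(batch(texts));
    // then iterate the SentencesAnnotation list as in the loop above,
    // reading one sentiment score per input text.
}
```

This doesn't make the annotation itself faster, but it amortizes the per-call overhead across the whole request, which is where the gap between 7ms of annotation and 10.7ms per call appears to live.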

I've profiled the code: the line Annotation annotation = pipeline.process(text), which is CoreNLP's main processing call, is the bottleneck. A request with 100 calls to my controller takes 1.07 seconds on average, and the annotation alone takes ~7ms per call. I need to reduce that to ~2ms.

I can't remove any of the annotators because sentiment relies on all of them. I'm already using the Shift-Reduce Constituency Parser because it is much faster than the default Context-Free Grammar Parser.

Are there any other parameters I can tune to significantly speed this up?

Enosis answered 18/1, 2016 at 22:30
I assume you're using the default models. It's most likely unfeasible without a big annotated corpus, but you could probably retrain smaller models specific to your domain. – Mahdi

I'm having the same issue. I've also tried the SR Beam parser, which was even slower than the PCFG! That's surprising, since according to Stanford's benchmarks SR Beam should be much faster than PCFG and only slightly slower than SR.

Other than using the SR parser instead of the PCFG, I guess the only remaining way to improve the speed might be playing with the tokenizer options.
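For what it's worth, tokenizer options are passed through the `tokenize.options` property as a comma-separated list of PTBTokenizer flags. The sketch below turns off a couple of normalization steps; the flag names are real PTBTokenizer options, but whether disabling them yields any measurable speedup is an assumption to benchmark, not something Stanford documents.

```java
import java.util.Properties;

public class TokenizerOpts {
    // Disables some per-token normalization work in PTBTokenizer.
    // americanize and normalizeParentheses are documented options;
    // the speed benefit of turning them off is unverified.
    public static Properties withTokenizerOptions() {
        Properties props = new Properties();
        props.setProperty("tokenize.options",
                "americanize=false,normalizeParentheses=false,untokenizable=noneDelete");
        return props;
    }
}
```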

Acanthaceous answered 4/12, 2017 at 22:4