Example of NLTK's Vader Scoring Text
Asked Answered
B

1

4

I would like someone to correct my understanding of how VADER scores text. I've read an explanation of this process here, however I cannot match the compound score of test sentences to Vader's output when recreating the process it describes. Lets say we have the sentence:

"I like using VADER, its a fun tool to use"

The words VADER picks up are 'like' (+1.5 score), and 'fun' (+2.3). According to the documentation, these values are summed (so +3.8), and then normalized to a range between 0 and 1 using the following function:

(alpha = 15)
x / x2 + alpha 

With our numbers, this should become:

3.8 / 14.44 + 15 = 0.1290

VADER, however, outputs the returned compound score as follows:

Scores: {'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'compound': 0.7003}

Where am I going wrong in my reasoning? Similar questions have been asked several times, however an actual example of VADER classifying has not yet been provided. Any help would be appreciated.

Bravo answered 6/8, 2018 at 12:7 Comment(0)
L
8

It's just your normalization that is wrong. From the code the function is defined:

def normalize(score, alpha=15):
"""
Normalize the score to be between -1 and 1 using an alpha that
approximates the max expected value
"""
norm_score = score/math.sqrt((score*score) + alpha)
return norm_score

So you have 3.8/sqrt(3.8*3.8 + 15) = 0.7003

Letter answered 18/9, 2018 at 21:46 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.