Variation in BLEU Score

2

11

I have a question about BLEU score calculation for machine translation. I realized there may be several different BLEU metrics. The code I found reports five values for BLEU, namely BLEU-1, BLEU-2, BLEU-3, BLEU-4 and finally BLEU, which seems to be an exponential average of the previous four. Still, it is not clear to me what the difference between them is. Do you have any ideas? Thanks

P.S. At first I thought this question was more theoretical and posted it on Meta Stack Exchange. A moderator closed it and commented that it is a Stack Overflow type of question. So please don't punish me again. =)

Dagan answered 2/6, 2017 at 8:52
14

source: http://www.statmt.org/book/slides/08-evaluation.pdf

I haven't come across BLEU-1 and BLEU-2 before, but I guess they refer to the 1-gram, 2-gram, 3-gram and 4-gram precisions in the BLEU formula. That is, BLEU-i in your question would correspond to precision[i] in this formula:
[image: BLEU formula, apparently taken from the slides linked above]
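
For reference, here is the commonly cited definition of BLEU from Papineni et al. (2002). This is a sketch of the standard form and may differ in notation from the formula in the slides; the symbols p_i (clipped i-gram precision), w_i (weights), c (candidate length) and r (reference length) are the usual textbook names, not something taken from the question:

```latex
\mathrm{BLEU} = \mathrm{BP} \cdot \exp\!\left( \sum_{i=1}^{N} w_i \log p_i \right),
\qquad
\mathrm{BP} =
\begin{cases}
1 & \text{if } c > r \\
e^{\,1 - r/c} & \text{if } c \le r
\end{cases}
```

With N = 4 and uniform weights w_i = 1/4 this reduces to BP * (p_1 * p_2 * p_3 * p_4)^(1/4), i.e. a geometric mean of the four precisions, which is probably the "exponential average" mentioned in the question.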

Eijkman answered 10/6, 2017 at 14:6
6

Actually, BLEU-n doesn't use only the n-gram score. It computes the 1-gram through n-gram scores and combines them with equal weights into a final score. See the "Cumulative N-Gram Scores" section at this link for more info.
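
To make the individual vs. cumulative distinction concrete, here is a minimal pure-Python sketch. It assumes a single reference, whitespace-tokenized input, no smoothing, and the brevity penalty from the original BLEU paper; the function names (`modified_precision`, `cumulative_bleu`, etc.) are made up for illustration and are not the API of any particular library:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Individual n-gram score: clipped n-gram precision for one order n."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # A candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def brevity_penalty(candidate, reference):
    """1 if the candidate is at least as long as the reference, else exp(1 - r/c)."""
    c, r = len(candidate), len(reference)
    if c == 0:
        return 0.0
    return 1.0 if c >= r else math.exp(1.0 - r / c)


def cumulative_bleu(candidate, reference, n):
    """Cumulative BLEU-n: geometric mean of the 1..n precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, reference, i) for i in range(1, n + 1)]
    if min(precisions) == 0.0:
        # Without smoothing, a single zero precision makes the whole score zero.
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / n  # equal weights 1/n
    return brevity_penalty(candidate, reference) * math.exp(log_mean)


if __name__ == "__main__":
    reference = "the cat is on the mat".split()
    candidate = "the cat sat on the mat".split()
    for n in range(1, 5):
        individual = modified_precision(candidate, reference, n)
        cumulative = cumulative_bleu(candidate, reference, n)
        print(f"n={n}  individual={individual:.4f}  cumulative BLEU-{n}={cumulative:.4f}")
```

For this toy pair the individual precisions are 5/6, 3/5, 1/4 and 0, while the cumulative scores are about 0.8333, 0.7071, 0.5 and 0. So the 2-gram precision alone is 0.6, but cumulative BLEU-2 is sqrt(5/6 * 3/5), which is what the equal weighting above means in practice.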

Logging answered 14/4, 2018 at 20:59
