This is an implementation of "BLEU: a Method for Automatic Evaluation of Machine Translation" as described by Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. The paper can be found here: https://dl.acm.org/doi/10.3115/1073083.1073135
In short, BLEU provides a way to determine how "good" a machine translation is using a set of reference sentences and a modified n-gram precision metric. As with most academic work, the equations in the paper make little sense without context, so here are the key variables: n is the maximum n-gram order, usually set to 4, which this library uses by default; r refers to a reference set; c refers to a machine translation, or "candidate" sentence; and w refers to a set of weights, which default to 1. The weights can be parameterized with calls to Score in the library. The log in the scoring equation is the natural logarithm, with Euler's number e as its base.
Many open-source implementations are available, such as the one found in NLTK. However, I did not find one written in C#, and figured that an easy-to-use implementation would be helpful to me and to other developers doing machine learning in C#.
```csharp
var bleu = new BleuScore();
var reference = new List<string> {"the quick brown fox jumped over the lazy dog"};
var candidate = "the fast brown fox jumped over the lazy dog";
var score = bleu.Score(reference, candidate);
```
```csharp
var bleu = new BleuScore();
var reference = new List<string> {"the quick brown fox jumped over the lazy dog"};
var candidate = "the fast brown fox jumped over the lazy dog";
var score = bleu.RoundedScore(reference, candidate);
Assert.AreEqual(0.75, score);
```
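As a sanity check on that expected value: the candidate differs from the reference only in "fast" vs. "quick", so the modified precisions are p1 = 8/9, p2 = 6/8, p3 = 5/7, and p4 = 4/6. Assuming uniform weights of 1/4 and a brevity penalty of 1 (candidate and reference have the same length), the score works out to:

```latex
\exp\!\left(\tfrac{1}{4}\left(\log\tfrac{8}{9} + \log\tfrac{6}{8} + \log\tfrac{5}{7} + \log\tfrac{4}{6}\right)\right)
= \left(\tfrac{8}{9}\cdot\tfrac{6}{8}\cdot\tfrac{5}{7}\cdot\tfrac{4}{6}\right)^{1/4}
\approx 0.7507
```

which rounds to the asserted 0.75.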
```csharp
var testString = "this is a test string";
var expected = new []{"this is a", "is a test", "a test string"};
var collector = new NGramCollector(testString, 3);
var result = collector.Collect();
Assert.AreEqual(expected, result);
```