matrix_positions in manual_bigram_penalty

Question

Closed this issue 2 years ago · 5 comments

When evaluating the "manual_bigram_penalty" score, the key positions seem to be interpreted in reverse order, as [to_position, from_position], rather than as [from_position, to_position], which is the intended order mentioned in the comments.

I think the culprit is this line in manual_bigram_penalty.rs:
.map(|(((x1, y1), (x2, y2)), w)| (((*x2, *y2), (*x1, *y1)), *w)),

Shouldn't it be this instead?
.map(|(((x1, y1), (x2, y2)), w)| (((*x1, *y1), (*x2, *y2)), *w)),

Please let me know if I'm wrong,
Thanks and have a nice day

Answer 1 · 2022-10-17T06:26:25.000Z

In the case of the manual bigram penalties, the "mirrored" versions of the configured bigrams are added automatically.
This is because usually both directions of a bigram are bad, so you only need to configure one direction. The other is included automatically with the same weight.

We could consider making this automatic addition optional, though.
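The automatic mirroring described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Pos` alias, the `with_mirrored` function and its `mirror` argument are hypothetical names chosen for the example. Only the `.map(...)` closure is taken from the line quoted in the question.

```rust
use std::collections::HashMap;

/// Matrix position of a key. Hypothetical alias for this sketch.
type Pos = (u8, u8);

/// Returns the configured bigram weights. If `mirror` is true, the reverse
/// direction of every configured bigram is added with the same weight.
fn with_mirrored(
    configured: &HashMap<(Pos, Pos), f64>,
    mirror: bool,
) -> HashMap<(Pos, Pos), f64> {
    let mut all = configured.clone();
    if mirror {
        all.extend(
            configured
                .iter()
                // Swap the two positions; the weight stays the same.
                .map(|(((x1, y1), (x2, y2)), w)| (((*x2, *y2), (*x1, *y1)), *w)),
        );
    }
    all
}
```

With a single configured bigram, `with_mirrored(&configured, true)` yields two entries with the same weight, while `with_mirrored(&configured, false)` returns the configuration unchanged.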

Answer 2 · 2022-10-17T08:05:01.000Z

OK.
In my case, I was using it for easy-to-type bigrams, with a negative score, and I made sure to configure both directions manually (usually with different values).
When I artificially raised the score of a single bigram, it was its inverse that showed up under "worst", and the evolution algorithm favored the inverse bigram. That's why I thought the positions were being read in reverse.

Answer 3 · 2022-10-17T09:00:15.000Z

I will make the auto-inclusion of the mirrored versions configurable when I find the time.

One thing to note (as it may not be obvious): The "worst" elements are in fact those with the highest absolute value of the cost. So in the case of negative costs, it would actually be "best". I decided to do it that way because otherwise, you would only see elements with zero score in that case, which would not give much insight.
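To illustrate the ranking by absolute value, here is a small sketch. The function name `worst_by_abs_cost` and the `(label, cost)` representation are assumptions made for this example and are not taken from the project's code.

```rust
/// Sorts (label, cost) pairs by descending absolute cost and keeps the first `n`.
/// Large negative costs therefore rank alongside large positive ones.
fn worst_by_abs_cost(mut items: Vec<(&str, f64)>, n: usize) -> Vec<(&str, f64)> {
    items.sort_by(|a, b| b.1.abs().total_cmp(&a.1.abs()));
    items.truncate(n);
    items
}
```

Given the costs 0.0, -3.0 and 1.5, the top two entries are the ones with -3.0 and 1.5. Sorting by the signed value instead would put 1.5 and then 0.0 on top, so with only negative or zero costs the list would contain just the zero-cost entries.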

Answer 4 · 2022-10-17T10:05:42.000Z

Using the absolute value is better, as it works in both directions.
For my use case, I just changed the global weight to a negative value while keeping each matrix_position's weight positive.
I was aware that "worst" can actually mean "best" :D, but for some reason, when I was testing, it was the mirrored version that kept being shown (probably some values were overwritten by the automatic mirroring of positions I had already mirrored manually).
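One way such an overwrite could happen is sketched below. It assumes the weights end up in a map keyed by the position pair and that the mirrored entry is inserted unconditionally. That is an assumption for this example, not something confirmed from the project's code, and all names here are hypothetical. The second function shows an alternative that keeps a manually configured direction.

```rust
use std::collections::HashMap;

type Pos = (u8, u8);
type Bigram = (Pos, Pos);

/// Inserts every bigram together with its mirror.
/// A later insert replaces an earlier one for the same key.
fn mirror_overwriting(configured: &[(Bigram, f64)]) -> HashMap<Bigram, f64> {
    let mut all = HashMap::new();
    for &((from, to), w) in configured {
        all.insert((from, to), w);
        all.insert((to, from), w);
    }
    all
}

/// Inserts the configured bigrams first, then adds a mirror only
/// where that direction has not been configured explicitly.
fn mirror_keeping_configured(configured: &[(Bigram, f64)]) -> HashMap<Bigram, f64> {
    let mut all: HashMap<Bigram, f64> = configured.iter().copied().collect();
    for &((from, to), w) in configured {
        all.entry((to, from)).or_insert(w);
    }
    all
}
```

If both directions of a bigram are configured with weights 1.0 and 5.0 in that order, `mirror_overwriting` ends up with 5.0 for both directions, whereas `mirror_keeping_configured` preserves 1.0 and 5.0.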

Anyway, I have commented out that part of the code for now.
An option would be more than helpful.

and thanks again for the great work

Answer 5 · 2022-10-18T07:08:30.000Z

There is now a configuration parameter add_mirrored in the manual_bigram_penalty metric.