Code and dataset for "SilverAlign: MT-Based Silver Data Algorithm For Evaluating Word Alignment" (https://arxiv.org/abs/2210.06207).

Lang | Gold (Size - \|A\|) | Silver_Small (Size - \|A\|) | Silver_Large (Size - \|A\|) |
---|---|---|---|
ENG-CES | 2,501 - 67K | 1,507 - 3,852 | 26K - 57K |
ENG-DEU | 508 - 11K | 227 - 480 | 31K - 77K |
ENG-FAS | 400 - 12K | 137 - 242 | 16K - 27K |
ENG-FRA | 447 - 17K | 216 - 359 | 32K - 74K |
ENG-HIN | 90 - 1,409 | 46 - 87 | 26K - 58K |
ENG-RON | 199 - 5,034 | 69 - 161 | 28K - 64K |
ENG-TUR | 100 - 2,670 | 50 - 80 | 27K - 60K |
FIN-ELL | 7,909 - 161K | 1,668 - 2,230 | - |
FIN-HEB | 22,291 - 405K | 4,522 - 6,396 | - |