Helsinki-NLP/OpusFilter

Is it possible to generate a score file while training an alignment model?

BrightXiaoHan opened this issue · 6 comments

Is it possible to generate a score file while training an alignment model?

Not at the moment; the forward and backward score files generated by eflomal are saved as temporary files that are removed afterwards. It should be easy to add an option to save them in a permanent file, but they would be in eflomal's original score format, not in the OpusFilter score file format. Which one are you looking for?

An "OpusFilter score file" is what I want.

Implemented in #36

Thanks

I tried it with the following config:

  - type: train_alignment
    parameters:
      src_data: zh.rules
      tgt_data: en.rules
      parameters:
        model: 3
        src_tokenizer: [jieba, zh]
        tgt_tokenizer: [moses, en]
        scores: align_score.jsonl
      output: align.priors

but the scores file align_score.jsonl was not generated.

This confused me for a while before I noticed that it's a problem in the documentation: The scores file is a top-level option, not under inner parameters. Thanks for noticing! I fixed the README.
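
For reference, going by the explanation above, the corrected step would presumably look like the sketch below: `scores` is moved up one level so that it sits next to `output`, and everything else is copied unchanged from the config posted earlier. Treat it as an illustration of the key placement, not as a verified configuration.

```yaml
- type: train_alignment
  parameters:
    src_data: zh.rules
    tgt_data: en.rules
    parameters:
      # inner parameters: aligner and tokenizer options only
      model: 3
      src_tokenizer: [jieba, zh]
      tgt_tokenizer: [moses, en]
    output: align.priors
    # top-level option of the step, a sibling of output
    scores: align_score.jsonl
```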

The issue also revealed that many OpusFilter methods do not warn about extra parameters... Something to be improved in the future.