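This section compares the performance of common models on machine reading comprehension tasks. It mainly covers the following models: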

| Dataset | Embed Rand | Embed Pretrained | Embed Fix | Embed Rand + Bigram | Embed Pretrained + Bigram | Embed Fix + Bigram |
| --- | --- | --- | --- | --- | --- | --- |
| dureader | 0.582 | 0.583 | 0.588 | 0.538 | 0.587 | 0.582 |
| cmrc2018 | 0.114 | 0.154 | 0.147 | 0.062 | 0.232 | 0.229 |

| Dataset | Embed Rand | Embed Pretrained | Embed Fix | Embed Rand + Bigram | Embed Pretrained + Bigram | Embed Fix + Bigram |
| --- | --- | --- | --- | --- | --- | --- |
| dureader | 0.659 | 0.604 | 0.6 | 0.536 | 0.462 | 0.479 |
| cmrc2018 | 0.7 | 0.555 | 0.547 | 0.545 | 0.354 | 0.359 |

| Dataset | Embed Rand | Embed Pretrained | Embed Fix | Embed Rand + Bigram | Embed Pretrained + Bigram | Embed Fix + Bigram |
| --- | --- | --- | --- | --- | --- | --- |
| dureader | 0.596 | 0.172 | 0.172 | 0.533 | 0.173 | 0.174 |
| cmrc2018 | 0.349 | 0.109 | 0.11 | 0.302 | 0.109 | 0.113 |

| Dataset | finetune | freeze |
| --- | --- | --- |
| dureader | 0.77 | 0.353 |
| cmrc2018 | 0.773 | 0.077 |

| Dataset | Embed Rand | Embed Pretrained | Embed Fix |
| --- | --- | --- | --- |
| dureader | 0.247 | 0.314 | 0.314 |
| cmrc2018 | 0.025 | 0.198 | 0.177 |

| Dataset | Embed Rand | Embed Pretrained | Embed Fix |
| --- | --- | --- | --- |
| dureader | 0.299 | 0.319 | 0.326 |
| cmrc2018 | 0.37 | 0.391 | 0.388 |

| Dataset | Embed Rand | Embed Pretrained | Embed Fix |
| --- | --- | --- | --- |
| dureader | 0.279 | 0.305 | 0.3 |
| cmrc2018 | 0.193 | 0.249 | 0.245 |