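For text generation models, what methods help keep the outputs from being dominated by whatever is most frequent in the training data?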
Closed this issue · 1 comments
guotong1988 commented
Could I ask the expert for advice on this?
@tuzhaopeng Many thanks.
tuzhaopeng commented
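This situation is uncommon in translation, so I don't have a particularly clear understanding of the problem either.
A common approach is to process the data, for example: 1) over-sample the under-represented examples so that the two groups end up in roughly equal proportion; 2) add noise, or apply data dropout, so that the over-represented examples do not look exactly the same to the model in every iteration, which also helps prevent overfitting.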
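
To make the two suggestions concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions, not code from this repository: the helper names `oversample` and `word_dropout`, the `key` grouping function, the `<unk>` placeholder token, and the default dropout rate are all hypothetical choices.

```python
import random
from collections import defaultdict


def oversample(examples, key, seed=0):
    """Repeat examples from smaller groups until every group is as
    large as the largest one. `key` maps an example to its group
    (hypothetical: e.g. a label or an output pattern)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for ex in examples:
        groups[key(ex)].append(ex)
    target = max(len(g) for g in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap (k may be 0).
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced


def word_dropout(tokens, p=0.1, unk="<unk>", rng=random):
    """Replace each token with `unk` with probability p, so that a
    frequent example looks slightly different each time it is seen."""
    return [unk if rng.random() < p else tok for tok in tokens]


if __name__ == "__main__":
    data = [("a common sentence", "common")] * 8 + [("a rare sentence", "rare")] * 2
    balanced = oversample(data, key=lambda ex: ex[1])
    print(len(balanced))
    print(word_dropout("a common sentence".split(), p=0.3))
```

In a training loop, `oversample` would be applied once when building the training set, while `word_dropout` would be applied again each time an example is drawn, so the noise differs across iterations.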
…--
Zhaopeng Tu
Principal Researcher
Tencent AI Lab
http://www.zptu.net/
On Fri, Sep 18, 2020 at 2:56 PM Tong Guo ***@***.***> wrote:
Could I ask the expert for advice on this?
@tuzhaopeng <https://github.com/tuzhaopeng> Many thanks.