How exactly did you pre-train on the NCBI abstract data?
zhouyunyun11 opened this issue · 5 comments
zhouyunyun11 commented
In your manuscript, you describe it like this:
"We initialized BERT with pre-trained BERT provided by (Devlin et al., 2019). We then continue to pre-train the model, using the listed corpora".
Did you use the BERT code to re-train on the NCBI abstract corpora completely from scratch? Or did you start from the initial BERT model and its WordPiece strategy, as in the BioBERT method?
yfpeng commented
We used the initial BERT model and its WordPiece strategy.
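For illustration only, a minimal sketch of this kind of continued pre-training, written with the Hugging Face `transformers`/`datasets` API rather than the original TensorFlow BERT code used here; the file name, hyperparameters, and MLM-only objective are placeholders/assumptions (the original BERT code also trains next-sentence prediction):

```python
# Sketch: initialize from the released general-domain checkpoint and its default
# WordPiece vocabulary, then continue masked-LM pre-training on PubMed text.
# "pubmed_abstracts.txt" is a placeholder: one abstract (or sentence) per line.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")  # Google's default vocab.txt
model = BertForMaskedLM.from_pretrained("bert-base-uncased")        # weights from the released checkpoint

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard BERT masking: 15% of tokens are selected for the masked-LM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert_pubmed_continued",
    per_device_train_batch_size=32,
    num_train_epochs=1,       # illustrative; real continued pre-training runs much longer
    learning_rate=5e-5,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
```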
zhouyunyun11 commented
Do you mean you used the same strategy as Bio_BERT?
zhouyunyun11 commented
Did you create your own vocab.txt file or use the Google default one?
yfpeng commented
We used the Google default vocab.txt
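As a quick way to see what using the general-domain vocab.txt means in practice, here is a small check (again using the Hugging Face tokenizer as a stand-in for the Google vocabulary; the example phrase is arbitrary):

```python
# Inspect how the default general-domain WordPiece vocabulary tokenizes biomedical text.
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
print(tokenizer.tokenize("acute myocardial infarction"))
# Biomedical terms missing from the general-domain vocabulary are split into
# WordPiece subwords (pieces prefixed with "##") rather than kept as single tokens.
```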
yfpeng commented
I am not sure what you meant by "same strategy as Bio_BERT"