Format of Input Dataset
Closed this issue · 2 comments
shoron-dutta commented
What is the expected input format for "train.txt" (assuming "test.txt" and "valid.txt" follow the same format), "entity2id.txt", and "relation2id.txt"?
Are there any restrictions on the values accepted as entity labels?
ZichaoHuang commented
Please check the "data.zip" under this repo.
shoron-dutta commented
Hi,
I have created my own dataset following the structure in the repository you shared, but I'm getting errors in the "Constructing Training Batches" part of the code.
Have you tried a custom dataset, or do you have any suggestions for preparing one? Is there anything to look out for (for example, "don't keep duplicates in the entity2id file")?
Thanks a lot!
Best,
Shoron
On Thu, Oct 1, 2020 at 3:48 AM Zichao Huang ***@***.***> wrote:
Please check the "data.zip" under this repo
<https://github.com/thunlp/KB2E>.
--
Thanks and best regards
Sharmishtha Dutta
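
For anyone hitting similar errors with a custom dataset, a quick consistency check before training may help narrow things down. The sketch below is not taken from the repository. It assumes a KB2E-style layout that this thread does not confirm: "entity2id.txt" and "relation2id.txt" hold one tab-separated `label<TAB>id` pair per line, and each line of "train.txt", "valid.txt" and "test.txt" is a tab-separated `head<TAB>tail<TAB>relation` triple. The column order in particular is an assumption, so compare it against the files in "data.zip" and adjust. The helper names `read_id_map` and `check_dataset` are hypothetical.

```python
from collections import Counter
from pathlib import Path


def read_id_map(path):
    """Read 'label<TAB>id' lines (assumed format); return labels and ids in file order."""
    labels, ids = [], []
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for line_no, line in enumerate(lines, 1):
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            raise ValueError(f"{path}:{line_no}: expected 2 tab-separated fields")
        labels.append(parts[0])
        ids.append(int(parts[1]))
    return labels, ids


def check_dataset(data_dir, triple_files=("train.txt", "valid.txt", "test.txt")):
    """Return a list of human-readable problems found in the dataset directory."""
    data_dir = Path(data_dir)
    problems = []

    ent_labels, ent_ids = read_id_map(data_dir / "entity2id.txt")
    rel_labels, rel_ids = read_id_map(data_dir / "relation2id.txt")

    # Duplicate labels or ids in the mapping files are a likely source of lookup errors.
    for fname, labels, ids in (
        ("entity2id.txt", ent_labels, ent_ids),
        ("relation2id.txt", rel_labels, rel_ids),
    ):
        dup_labels = [x for x, n in Counter(labels).items() if n > 1]
        dup_ids = [x for x, n in Counter(ids).items() if n > 1]
        if dup_labels:
            problems.append(f"{fname}: duplicate labels {dup_labels}")
        if dup_ids:
            problems.append(f"{fname}: duplicate ids {dup_ids}")

    entities, relations = set(ent_labels), set(rel_labels)

    for fname in triple_files:
        path = data_dir / fname
        if not path.exists():
            problems.append(f"{fname}: file not found")
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_no, line in enumerate(lines, 1):
            if not line.strip():
                continue
            parts = line.split("\t")
            if len(parts) != 3:
                problems.append(f"{fname}:{line_no}: expected 3 tab-separated fields")
                continue
            head, tail, relation = parts  # ASSUMED column order: head, tail, relation
            for label in (head, tail):
                if label not in entities:
                    problems.append(f"{fname}:{line_no}: unknown entity {label!r}")
            if relation not in relations:
                problems.append(f"{fname}:{line_no}: unknown relation {relation!r}")

    return problems
```

Calling `check_dataset("path/to/your/data")` and printing each returned string lists duplicate mapping entries, malformed lines, and triples that reference labels missing from the mapping files. An empty list only means these particular checks passed under the assumed format.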