Data creation scripts for markdown
shubhamagarwal92 opened this issue · 0 comments
Hi!
Thank you for open-sourcing the code. Could you please also provide the scripts for the modified GROBID library and for storing documents in markdown format, as mentioned in the Appendix of the paper:
We use a modified version of the GROBID library for converting PDFs to text, as well as obtaining titles,
authors and citations.
The final paper documents are stored in a markdown format, as opposed to full LaTeX. We use markdown as
the standard format for all documents in the corpus to support knowledge blending between sources. Papers
are citation processed, following the title-based approach
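
For context, here is a rough sketch of the kind of script I have in mind. It is only my guess at the general shape, not the authors' pipeline: it assumes GROBID has already been run and produced a TEI XML string, and the element paths (`titleStmt/title`, `sourceDesc/biblStruct/analytic/author`, `text/body/div`, `listBibl/biblStruct`) are my assumptions about the stock GROBID TEI layout rather than the modified version mentioned in the paper. The function name `tei_to_markdown` is hypothetical, and the reference list it emits is a plain list of cited titles, not the title-based citation processing the paper describes.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output (assumed layout, stock GROBID).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def _text(elem):
    """Flatten an element's text content and collapse whitespace."""
    if elem is None:
        return ""
    return " ".join("".join(elem.itertext()).split())


def tei_to_markdown(tei_xml: str) -> str:
    """Hypothetical helper: turn a GROBID-style TEI string into markdown."""
    root = ET.fromstring(tei_xml)
    lines = []

    # Title becomes the top-level heading.
    title = _text(root.find(".//tei:titleStmt/tei:title", TEI_NS))
    if title:
        lines += [f"# {title}", ""]

    # Authors: join the name parts (forename, surname, ...) with spaces.
    author_path = (
        ".//tei:sourceDesc/tei:biblStruct/tei:analytic/tei:author/tei:persName"
    )
    authors = []
    for pers in root.findall(author_path, TEI_NS):
        name = " ".join(part for part in (_text(c) for c in pers) if part)
        if name:
            authors.append(name)
    if authors:
        lines += [", ".join(authors), ""]

    # Body sections: each <div> has an optional <head> and some <p> children.
    for div in root.findall(".//tei:text/tei:body/tei:div", TEI_NS):
        head = _text(div.find("tei:head", TEI_NS))
        if head:
            lines += [f"## {head}", ""]
        for para in div.findall("tei:p", TEI_NS):
            body = _text(para)
            if body:
                lines += [body, ""]

    # References: list cited titles only (not the paper's citation processing).
    ref_titles = []
    for bibl in root.findall(".//tei:listBibl/tei:biblStruct", TEI_NS):
        ref = _text(bibl.find("tei:analytic/tei:title", TEI_NS)) or _text(
            bibl.find("tei:monogr/tei:title", TEI_NS)
        )
        if ref:
            ref_titles.append(ref)
    if ref_titles:
        lines += ["## References", ""]
        lines += [f"- {ref}" for ref in ref_titles]
        lines.append("")

    return "\n".join(lines)
```

Something along these lines, together with the changes made to GROBID itself, would be enough for me to reproduce the corpus format.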