Many tables do not have the correct column headers
UtkarshGarg-UG opened this issue · 1 comment
Thank you for your kind words and valuable feedback on the dataset!
In our approach, we used regular expressions alongside heuristic rules to convert the raw text of tables into structured formats. Despite thorough reviews and continuous improvements to our scripts, some tables remain imperfectly parsed.
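For readers unfamiliar with this kind of pipeline, here is a minimal sketch of how a pattern-plus-heuristics table parser might work. It is only an illustration under stated assumptions, not the dataset's actual scripts: the names `split_row`, `parse_table`, `CELL_SPLIT`, `NUMERIC`, and `SEPARATOR`, the cell delimiters (pipes, tabs, or runs of two or more spaces), and the "a header row contains no numeric cells" rule are all hypothetical.

```python
import re

# Hypothetical patterns for illustration; not taken from the dataset's scripts.
CELL_SPLIT = re.compile(r"\s*\|\s*|\t+|\s{2,}")   # pipes, tabs, or 2+ spaces
NUMERIC = re.compile(r"^[-+]?[$€£]?\d[\d,]*(\.\d+)?%?$")
SEPARATOR = re.compile(r"^[-:=]+$")               # e.g. Markdown "---" rules


def split_row(line):
    """Split one raw text line into trimmed, non-empty cells."""
    cells = CELL_SPLIT.split(line.strip().strip("|"))
    return [c.strip() for c in cells if c.strip()]


def parse_table(raw):
    """Convert raw table text into {"header": [...], "rows": [[...], ...]}."""
    rows = [split_row(line) for line in raw.splitlines() if line.strip()]
    # Drop empty rows and pure separator rows such as |---|---|.
    rows = [r for r in rows if r and not all(SEPARATOR.match(c) for c in r)]
    if not rows:
        return {"header": [], "rows": []}

    first = rows[0]
    # Heuristic (an assumption): a real header row has no numeric cells.
    if any(NUMERIC.match(c) for c in first):
        header = [f"col_{i}" for i in range(len(first))]
        body = rows
    else:
        header, body = first, rows[1:]
    return {"header": header, "rows": body}


if __name__ == "__main__":
    print(parse_table("Name  Age  City\nAlice  30  Paris\nBob  25  Rome"))
```

A heuristic like this fails in predictable ways. For example, a header row made up of years (`2019  2020`) would be treated as data and replaced with placeholder column names, which is one way a table can end up with incorrect headers.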
We have observed that these less accurately parsed tables, especially those with incorrect headers, do not significantly affect the models' comprehension. Moreover, as they represent a minor fraction of the total dataset, we have chosen to retain them in the current annotations. This approach aims to maintain a consistent evaluation framework for both previously tested models and those anticipated in the near future.