mczhuge/Kaleido-BERT

How to generate input_schema format data?


Hi,
I find your work very interesting, and it is aligned with my project requirements.
I want to fine-tune it on a custom dataset consisting of raw images and text with labels; the task is similar to "Category/SubCategory Recognition". How can I get the data into the input_schema format?
Please share the code if you have any.

Sorry for the late reply. The preprocessing procedure runs on Alibaba ODPS (a private SQL tool), so the code would not be useful for others' implementations.

That said, you can refer to #3. Writing your own multiprocess Python script for the preprocessing is also fine.
All the best!
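
For anyone taking the multiprocess Python route, here is a minimal sketch of what such a script could look like. It is not the authors' ODPS pipeline, and the exact columns required by input_schema are not spelled out in this thread, so the row layout (base64 image, cleaned text, label) and the names `encode_record` and `write_rows` are assumptions for illustration only. Adapt the columns to whatever the repository's config and #3 describe.

```python
import base64
import csv
from multiprocessing import Pool
from pathlib import Path


def encode_record(record):
    """Turn one (image_path, text, label) triple into a flat row of strings.

    The three-column layout is a hypothetical placeholder, not the real
    input_schema of the repository.
    """
    image_path, text, label = record
    # Store the raw image bytes as base64 so they fit in a text column.
    image_b64 = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    # Tabs and newlines would break the TSV layout, so collapse whitespace.
    clean_text = " ".join(text.split())
    return [image_b64, clean_text, str(label)]


def write_rows(records, out_path, processes=4):
    """Encode records in parallel and write them as tab-separated rows."""
    with Pool(processes=processes) as pool, \
            open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        # imap preserves input order while streaming results to disk.
        for row in pool.imap(encode_record, records, chunksize=64):
            writer.writerow(row)
```

To use it, build a list of `(image_path, text, label)` triples from your dataset and call `write_rows(records, "train.tsv")` from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that spawn worker processes. If the model expects extracted image features rather than raw bytes, the feature extraction would replace the base64 step inside `encode_record`.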