yashkant/spad

Total No. of meta_filtered_uids is 296k instead of 150k+50k+50k mentioned in the paper.

Closed this issue · 3 comments

Hey Yash,

I really liked your paper. The general idea is interesting and I am trying out some personal experiments on objaverse.
I noticed that the data filtering code produces 296k uids instead of 250k.

Could you please tell me why that is the case?

@yashkant apologies for writing 150k earlier; the paper mentions 150k+50k+50k. Is the extra 46k due to overlap?

hey @charchit7, thanks for checking out spad!

we did not clarify in the paper text that we take a union of our filtered objects, which leads to a final dataset size of ~296K (as you noticed). i will add this clarification to our readme.
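
as a rough sketch of what "taking a union" means here (this is not the actual spad filtering code; `union_size` and the toy uid lists below are hypothetical names made up for illustration), the final count is the number of distinct uids across all filtered lists, which can be checked like this:

```python
def union_size(*uid_lists):
    """Return the number of distinct uids across all given lists."""
    merged = set()
    for uids in uid_lists:
        merged.update(uids)  # duplicates across lists are stored only once
    return len(merged)


# hypothetical toy uids, not real objaverse entries
filter_a = ["a1", "a2", "a3", "shared"]
filter_b = ["b1", "shared"]
filter_c = ["c1", "a2"]

total = union_size(filter_a, filter_b, filter_c)
summed = len(filter_a) + len(filter_b) + len(filter_c)
print(f"distinct uids: {total}, sum of list sizes: {summed}")
```

running the same kind of check on the real filtered uid lists should show how the final dataset size relates to the size of each individual filter.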

in my experience, using a high quality subset of objaverse and a strong base model (SDXL) is much more important than total samples. for example, instant3d only trained on 10K assets, and still generates good results.

hope this helps!

Thank you, Yash, for the clarification!