
Fitzpatrick-17K file list?



I was wondering if the revised file list for Fitzpatrick-17K is available somewhere? I'd love to train + evaluate some models on the cleaned version of the dataset.

And thank you for this work, I appreciate it!

Dear Divya,

Thanks a lot for reaching out to us!
Unfortunately, for Fitzpatrick17k we have only performed cleaning using the automatic mode of SelfClean.
Because we do not recommend automatic cleaning for high-stakes domains such as medicine, we use the cleaned version solely to showcase a significant performance difference. We refrain from open-sourcing the revised list, as the resulting dataset might still contain inaccuracies.

We are, however, currently working with the creators of Fitzpatrick17k on a revised version in which data cleaning is performed as proposed in the data cleaning protocol of SelfClean.


Hi Fabian, thanks for your reply!! That makes sense - thanks again for all this work!

I'm closing this issue and we'll stay in contact for the updated Fitzpatrick17k list.