This repository contains the code accompanying the preprint "On some elusive aspects of databases hindering AI based discovery: A case study on superconducting materials" (https://arxiv.org/abs/2311.09891). The dataset and the trained pipeline are published in our Zenodo repository: https://doi.org/10.5281/zenodo.8269370.