AI Feynman datasets
I am trying to fetch a dataset from AI Feynman but I receive the following error:
from pmlb import fetch_data
name = "feynman_III_12_43"
dataset = fetch_data(name)
ValueError: Dataset not found in PMLB.
Hi @aminravanbakhsh
Which version of PMLB are you running?
I managed to fetch this dataset without problems. I'm using python==3.8.19 and pmlb==1.0.2a.
Two possible solutions:
- Install pmlb from source: clone this repo and run pip install . from its root. That's how I installed it here; I'm using a conda environment specifically for building PMLB at its latest version.
- Download the dataset folder from this repo (https://github.com/EpistasisLab/pmlb/tree/master/datasets/feynman_III_12_43), put it into a local folder, and use fetch_data(name, local_cache_dir='<path to the folder>'). It should work as long as the dataset folder and the .tsv.gz file inside it have the same name.
I tried creating a local copy manually and it worked:
from pmlb import fetch_data
name = "feynman_III_12_43_copy"
dataset = fetch_data(name, local_cache_dir="./datasets/")
dataset
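In case it helps with debugging the local copy, here is a minimal sketch that checks the folder layout before calling fetch_data. It assumes the layout described above (a folder named after the dataset, containing a .tsv.gz file with the same name); the helper names expected_cache_file and check_local_copy are made up for this example and are not part of pmlb.

```python
from pathlib import Path


def expected_cache_file(cache_dir, name):
    """Path where the local copy is assumed to live: <cache_dir>/<name>/<name>.tsv.gz."""
    return Path(cache_dir) / name / f"{name}.tsv.gz"


def check_local_copy(cache_dir, name):
    """Return a list of problems with the local copy (empty if none were found)."""
    problems = []
    folder = Path(cache_dir) / name
    if not folder.is_dir():
        problems.append(f"missing folder: {folder}")
        return problems
    target = expected_cache_file(cache_dir, name)
    if not target.is_file():
        # List what is actually there, to make a renamed folder or file easy to spot.
        found = sorted(p.name for p in folder.glob("*.tsv.gz"))
        problems.append(f"expected {target.name} in {folder}, found {found}")
    return problems
```

For the copy above, check_local_copy("./datasets/", "feynman_III_12_43_copy") should return an empty list if the folder and file names match.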
Hi @gAldeia
Thank you for your reply.
I am using:
pmlb==1.0.1.post3
Python 3.12.4
@aminravanbakhsh Did you try downloading the dataset locally and using local_cache_dir to load it? It seems that your version, 1.0.1.post3, was released on Sep 10, 2020, and the Feynman datasets were added just after July 2021, so that release does not know about them. Installing from source by cloning the repo and running pip install . should also solve your problem.
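To confirm which release is actually installed in the active environment, something like the following should work. It is only a sketch using the standard library's importlib.metadata; the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # A PyPI install from this thread would report 1.0.1.post3 here.
    print("pmlb:", installed_version("pmlb"))
```

Running it again after installing from source is a quick way to check that the environment picked up the newer build.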
While this may be a workaround, ideally the PMLB package on PyPI should be updated to its latest version.
Right now I am trying to submit new datasets, and there is a GitHub Actions issue that is keeping me from doing it. If the local cache works, I think we can close this issue and open a new one to update the PyPI package to its latest version.