Add ability to load and inspect individual datasets
npatki opened this issue · 1 comments
npatki commented
Problem Description
The SDGym library currently allows you to list the datasets available for benchmarking. However, it does not offer any way to inspect them. Users may want to see what the columns, data types, or values look like before using a dataset in a benchmarking run.
Expected behavior
Add a download_demo method that is similar to the one in the SDV library. This method would return the data and metadata so that SDGym users can inspect the dataset.
Workaround
The SDV library is a dependency of SDGym, so as a workaround you can access the demo datasets through SDV directly.
from sdv.datasets.demo import download_demo

# Download the 'adult' demo dataset; returns the data and its metadata
data, metadata = download_demo(
    modality='single_table',
    dataset_name='adult'
)
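Once the data is loaded, a quick per-column summary covers the inspection described above (columns, data types, example values). The following is only a rough sketch: `summarize_columns` is a hypothetical helper and not part of SDV or SDGym, and it assumes the single-table `data` returned by the workaround is a pandas DataFrame. A small stand-in table is used here so the snippet runs without downloading anything.

```python
import pandas as pd


def summarize_columns(data: pd.DataFrame, n_examples: int = 3) -> pd.DataFrame:
    """Hypothetical helper: one row per column with dtype, missing count,
    unique count, and a few example values."""
    rows = []
    for name in data.columns:
        column = data[name]
        rows.append({
            'column': name,
            'dtype': str(column.dtype),
            'n_missing': int(column.isna().sum()),
            'n_unique': int(column.nunique(dropna=True)),
            # First few distinct non-missing values, in order of appearance
            'examples': column.dropna().unique()[:n_examples].tolist(),
        })
    return pd.DataFrame(rows).set_index('column')


# Stand-in table; in practice, pass the `data` returned by download_demo.
toy = pd.DataFrame({
    'age': [39, 50, 38, None],
    'workclass': ['State-gov', 'Private', 'Private', 'Private'],
})
summary = summarize_columns(toy)
print(summary)
```

With the real dataset, the call would be `summarize_columns(data)`. The metadata object returned alongside it can be inspected separately.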