huggingface/dataset-viewer
Backend that powers the dataset viewer on Hugging Face dataset pages through a public API.
Python · Apache-2.0
Issues
- 1
A space in the column name breaks the assets URLs
#2762 opened by severo - 4
Handling dataset redirects
#2688 opened by PeterAJansen - 5
persisting CreateCommitError
#2766 opened by severo - 1
Use `DisabledRepoError`
#2761 opened by severo - 0
Store which splits are partial and which are complete
#2809 opened by severo - 7
Differentiate between `NaN` and `null` in the viewer
#2828 opened by polinaeterna - 3
Align CI and production environments
#2819 opened by albertvillanova - 2
- 4
Show the proportion of image/audio formats in stats?
#2806 opened by severo - 0
Transfer script-based datasets in allowlist to data-only
#2804 opened by severo - 3
Empty list of splits on config-split-names
#2803 opened by AndreaFrancis - 3
Return partial dataset-hub-cache instead of error?
#2754 opened by severo - 2
Duckdb Con - Error Invalid Input Error: Cannot change configuration option "extension_directory" - the configuration has been locked
#2682 opened by AndreaFrancis - 2
Help dataset owner to choose between configs and splits?
#2721 opened by severo - 1
Convert downloads cleaning task into a parallel process in search service
#2718 opened by AndreaFrancis - 1
Do not convert Opus audio files to WAV
#2818 opened by albertvillanova - 1
Presidio scan
#2745 opened by lhoestq - 0
an e2e test is broken since #2821
#2824 opened by severo - 0
expose metrics on another port
#2812 opened by severo - 1
Delete service "storage-admin"?
#2715 opened by severo - 4
- 0
Truncate all the logs
#2780 opened by severo - 0
Set `library_name='dataset-viewer'` in hfh requests
#2800 opened by severo - 2
- 2
use the `ROW_IDX_COLUMN` constant name instead of copying the value everywhere
#2798 opened by severo - 0
Improve robustness of column names containing non-alphanumeric characters
#2794 opened by albertvillanova - 2
parquet-and-info worker fails if a parquet file is empty
#2709 opened by severo - 2
Rows Post Processing Error
#2782 opened by Helw150 - 5
Column name wrongly contains data
#2779 opened by severo - 0
- 0
Update datasets to 2.19.1
#2777 opened by albertvillanova - 1
Support LeRobot datasets?
#2775 opened by severo - 4
Improve the message for DatasetWithScriptNotSupportedError
#2742 opened by severo - 8
FineWeb: Unexpected end of stream: Page was smaller (1862094) than expected (2055611)
#2768 opened by lhoestq - 1
Suggest using datasets CLI to convert to parquet in script-dataset error message
#2692 opened by albertvillanova - 1
- 1
Upgrade pyarrow to 16?
#2756 opened by severo - 5
Increase in timeouts from the Hub?
#2690 opened by severo - 5
Update datasets to 2.19.0
#2739 opened by albertvillanova - 1
services/worker tests are failing
#2759 opened by severo - 0
Use `.__cause__` when possible when raising an exception
#2753 opened by severo - 2
/search and /filter are currently broken
#2732 opened by severo - 0
- 1
- 1
Annotate rows in a dataset (an editable new column?)
#2694 opened by severo - 0
Update numpy
#2700 opened by albertvillanova - 2
Update pandas patch version from 2.2.0 to 2.2.2
#2695 opened by albertvillanova - 3
Update numba
#2698 opened by albertvillanova - 3
Heavy jobs for split-descriptive-statistics failing
#2679 opened by AndreaFrancis - 0
IndexError in parquet-and-info job runner
#2672 opened by severo