
Error on loading audio data

bagustris opened this issue · 2 comments

The current code shows that sliceguard's from_huggingface function only supports image data:

def from_huggingface(dataset_identifier: str):
    # Simple utility method to support loading of huggingface datasets
    # Currently only supports image data. Use custom load function if you need something else.

However, the example on the following page states that sliceguard now supports audio data.

Following that example, I got the RuntimeError below (Audio is not supported):

In [7]: from renumics import spotlight
   ...: from sliceguard import SliceGuard
   ...: from import from_huggingface
   ...: from sklearn.metrics import accuracy_score
   ...: # Load an Example Dataset as DataFrame
   ...: df = from_huggingface("renumics/emodb")
RuntimeError                              Traceback (most recent call last)
Cell In[7], line 7
      4 from sklearn.metrics import accuracy_score
      6 # Load an Example Dataset as DataFrame
----> 7 df = from_huggingface("renumics/emodb")

File ~/miniconda3/envs/spotlight/lib/python3.9/site-packages/sliceguard/, in from_huggingface(dataset_identifier)
     29 for fname, ftype in cur_split.features.items():
     30     if (
     31         not isinstance(ftype, Image)
     32         and not isinstance(ftype, ClassLabel)
     33         and not isinstance(ftype, Value)
     34         and not isinstance(ftype, Sequence)
     35     ):
---> 36         raise RuntimeError(
     37             f"Found unsupported datatype {ftype}. Use custom load function."
     38         )
     39     # Run transformations for specific data types if needed.
     40     if isinstance(ftype, ClassLabel):

RuntimeError: Found unsupported datatype Audio(sampling_rate=None, mono=True, decode=True, id=None). Use custom load function.

PR #58 may solve this issue.

@bagustris Sliceguard does indeed support audio data, but the load function does not yet. However, if you need a solution immediately, you can simply point to the WAV files in the data frame supplied to the find_issues function. You can base your code on this Example.
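As a rough illustration of the workaround described above, the sketch below collects WAV file paths into records that can be turned into a data frame. It is only a minimal sketch under stated assumptions: the helper name collect_wav_records, the folder layout (one subfolder per label), and the column names "audio" and "label" are all hypothetical and not taken from sliceguard.

```python
from pathlib import Path


def collect_wav_records(root):
    """Return one record per WAV file found under root.

    Assumes a hypothetical layout of root/<label>/<file>.wav, where the
    subfolder name is used as the label. Column names are illustrative.
    """
    records = []
    for wav_path in sorted(Path(root).glob("*/*.wav")):
        records.append(
            {
                "audio": str(wav_path),          # path to the WAV file
                "label": wav_path.parent.name,   # subfolder name as the label
            }
        )
    return records


# Usage sketch (not executed here; assumes pandas is installed):
#   import pandas as pd
#   df = pd.DataFrame(collect_wav_records("path/to/wavs"))
# The resulting df has an "audio" column of file paths and can be
# supplied to find_issues in place of the from_huggingface output.
```

The exact arguments find_issues expects are not shown here; the linked Example is the reference for those.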

And you are right, PR #58 aims to solve this issue. I will approve it as soon as it has passed review.

Let me know if you need any more support with your use case or encounter any more issues!

@bagustris I just merged PR #58 and released v0.0.31.
Hopefully, this solves your issue. If any issues remain, feel free to open an issue again or comment under this issue. Closing this for now.