ImportError: "attempted relative import with no known parent package" when attempting to locally test rwth_phoenix2014_t
Closed this issue · 19 comments
I thought I'd try the instructions for testing from https://tensorflow.google.cn/datasets/add_dataset?hl=en#unit-test_your_dataset on a known-correct dataset, `rwth_phoenix2014_t`. However, I keep getting errors like:
> tfds build
(log truncated)
sign-language\datasets\sign_language_datasets\datasets\rwth_phoenix2014_t\rwth_phoenix2014_t.py", line 9, in <module>
from ..warning import dataset_warning
ImportError: attempted relative import with no known parent package
Basically I've been trying to figure out how to locally test a dataset, for #29. The README for this repo at https://github.com/sign-language-processing/datasets?tab=readme-ov-file#adding-a-new-dataset describes the basic idea of how to create and register a new dataset. tl;dr you follow this guide: https://tensorflow.google.cn/datasets/add_dataset?hl=en#default_template_tfds_new. But I wanted to try and find a way to test locally, so I thought I'd try it on a dataset I know to be correct, which is when I encountered this error.
If I adjust the code to remove the relative imports, replacing them with absolute ones, e.g. `from sign_language_datasets.datasets.warning import dataset_warning`, then I am able to run the `tfds build` command without getting this error. Not yet sure whether `python rwth_phoenix2014_t_test.py` will work, as the `tfds build` command estimates it will take 4 hours to finish.
The same issue occurs if I attempt this in `dgs_corpus`.
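The relative-import failure can be reproduced in isolation. A minimal sketch (using a hypothetical package `mypkg`, not this repo's actual layout) of why the import fails when a file is run directly but works when it is loaded as part of its package:

```python
# Minimal reproduction (hypothetical package "mypkg"): a relative import
# resolves only when the module is loaded as part of its parent package.
import pathlib
import subprocess
import sys
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
pkg = tmp / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helper.py").write_text("VALUE = 42\n")
(pkg / "mod.py").write_text("from .helper import VALUE\nprint(VALUE)\n")

# Running the file directly: Python sees no parent package -> ImportError.
direct = subprocess.run([sys.executable, str(pkg / "mod.py")],
                        capture_output=True, text=True)
print("ImportError" in direct.stderr)  # True

# Running it as a module of the package: the relative import resolves.
as_module = subprocess.run([sys.executable, "-m", "mypkg.mod"],
                           capture_output=True, text=True, cwd=tmp)
print(as_module.stdout.strip())  # 42
```

This is why `tfds build` and direct `python some_dataset.py` invocations behave differently: only the former imports the dataset file through the package.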
`tfds --version` gives:
TensorFlow Datasets: 4.9.4+nightly
`pip list` shows:
Package Version
---------------------------- ---------------------
absl-py 1.4.0
astunparse 1.6.3
cachetools 5.3.2
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
dill 0.3.8
dm-tree 0.1.8
docopt 0.6.2
etils 1.6.0
flatbuffers 23.5.26
fsspec 2023.12.2
gast 0.5.4
google-auth 2.27.0
google-auth-oauthlib 1.2.0
google-pasta 0.2.0
googleapis-common-protos 1.62.0
grpcio 1.60.1
h5py 3.10.0
idna 3.6
importlib-resources 6.1.1
keras 2.15.0
langcodes 3.3.0
language-data 1.1
libclang 16.0.6
marisa-trie 0.7.8
Markdown 3.5.2
MarkupSafe 2.1.4
ml-dtypes 0.2.0
numpy 1.26.3
oauthlib 3.2.2
opencv-python 4.5.5.64
opt-einsum 3.3.0
packaging 23.2
pandas 2.2.0
pillow 10.2.0
pip 23.3.1
pose_format 0.3.2
promise 2.3
protobuf 3.20.3
psutil 5.9.8
pyarrow 15.0.0
pyasn1 0.5.1
pyasn1-modules 0.3.0
pympi-ling 1.70.2
python-dateutil 2.8.2
python-dotenv 1.0.1
pytz 2023.4
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
scipy 1.12.0
setuptools 68.2.2
sign-language-datasets 0.2.0
six 1.16.0
tensorboard 2.15.1
tensorboard-data-server 0.7.2
tensorflow 2.15.0
tensorflow-estimator 2.15.0
tensorflow-intel 2.15.0
tensorflow-io-gcs-filesystem 0.31.0
tensorflow-metadata 1.14.0
termcolor 2.4.0
tfds-nightly 4.9.4.dev202402010044
toml 0.10.2
tqdm 4.66.1
typing_extensions 4.9.0
tzdata 2023.4
urllib3 2.2.0
webvtt-py 0.4.6
Werkzeug 3.0.1
wheel 0.41.2
wrapt 1.14.1
zipp 3.17.0
I am on Windows 11, using Anaconda, and pip is installed within that environment.
Looks like you are running the tests using `python` instead of `pytest`. Please try using `pytest` instead.
I will say that in this repository, tests have been widely neglected, due to frequent changes in how datasets are created and the need to store a small dummy file for each dataset's features.
In our other repositories, you will see tests are running in CI on every commit.
I'll try `pytest`!
Commands tried so far are based on the instructions in https://tensorflow.google.cn/datasets/add_dataset?hl=en#unit-test_your_dataset, which gives two commands that I noticed:
- `tfds build`
- `python my_dataset_test.py`
Attempted using `pytest`; pip installed pytest-8.0.1 into my conda env. I'm rusty on pytest, so I tried a few commands and looked at https://realpython.com/pytest-python-testing/#how-to-install-pytest.
All of these gave me an identical error:
- `pytest rwth_phoenix2014_t_test.py` (in `datasets\sign_language_datasets\datasets\rwth_phoenix2014_t`, under the parent to where I cloned the repo)
- `pytest .` in that folder
- `pytest .` in the parent of that folder
OK, digging in a bit deeper, here's an example stack trace:
_____________________________ ERROR collecting sign_language_datasets/datasets/rwth_phoenix2014_t/rwth_phoenix2014_t_test.py ______________________________
..\..\..\miniconda3\envs\jw_sign_create\Lib\importlib\__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1204: in _gcd_import
???
<frozen importlib._bootstrap>:1176: in _find_and_load
???
<frozen importlib._bootstrap>:1126: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
<frozen importlib._bootstrap>:1204: in _gcd_import
???
<frozen importlib._bootstrap>:1176: in _find_and_load
???
<frozen importlib._bootstrap>:1126: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
<frozen importlib._bootstrap>:1204: in _gcd_import
???
<frozen importlib._bootstrap>:1176: in _find_and_load
???
<frozen importlib._bootstrap>:1147: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:690: in _load_unlocked
???
<frozen importlib._bootstrap_external>:940: in exec_module
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
sign_language_datasets\datasets\__init__.py:1: in <module>
from .aslg_pc12 import AslgPc12
sign_language_datasets\datasets\aslg_pc12\__init__.py:3: in <module>
from .aslg_pc12 import AslgPc12
sign_language_datasets\datasets\aslg_pc12\aslg_pc12.py:26: in <module>
class AslgPc12(tfds.core.GeneratorBasedBuilder):
<frozen abc>:106: in __new__
???
..\..\..\miniconda3\envs\jw_sign_create\Lib\site-packages\tensorflow_datasets\core\registered.py:209: in __init_subclass__
raise ValueError(f'Dataset with name {cls.name} already registered.')
E ValueError: Dataset with name aslg_pc12 already registered.
Apparently that originates from this file: https://github.com/sign-language-processing/datasets/blob/master/sign_language_datasets/datasets/__init__.py.
When this gets imported, it triggers a "register" process of some kind?
Possibly related to tensorflow/datasets#552
Another possible issue: I have `sign_language_datasets` installed via pip, not from source. There may be some interference going on between the two installations.
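One quick way to check which copy Python actually imports is to print the module's origin path. Shown here with a stdlib package so it runs anywhere; substituting `sign_language_datasets` would reveal whether `sys.path` resolves to the pip-installed copy in `site-packages` or the local source checkout:

```python
# Print where a package is imported from; swap "json" for
# "sign_language_datasets" to diagnose pip-vs-source shadowing.
import importlib.util

spec = importlib.util.find_spec("json")  # stdlib example
print(spec.origin)  # path of the copy that "import json" would load
```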
Trying with source installation:
conda create -n sign_language_datasets_source pip python=3.10 # if I do 3.11 on Windows then there's no compatible tensorflow
# navigate to the repo
git pull # to make sure it's up to date
python -m pip install . #python -m pip ensures we're using the pip inside the conda env
python -m pip install pytest pytest-cov
pytest
Ran this and got a bunch of errors complaining about the missing "dill" package.
python -m pip install dill # https://pypi.org/project/dill/
pytest
Submitted a pull request to fix the errors preventing PyTest tests from running.
I believe the original question is answered; I will separately make an issue/pull request with instructions on how to test a new dataset.
OK, I tried doing the following and I'm getting this again: #53 (comment)
conda create -n sign_language_datasets_source pip python=3.10 # if I do 3.11 on Windows then there's no compatible tensorflow
conda activate sign_language_datasets_source
# navigate to the repo
git pull # to make sure it's up to date
python -m pip install . #python -m pip ensures we're using the pip inside the conda env
python -m pip install pytest pytest-cov dill
pytest .
How come I'm getting the "already registered" again? I'm very confused.
An out-of-the-blue guess, without actually testing it: if one dataset imports from another dataset, the class could end up being declared (and thus registered) twice.
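That guess can be simulated: when one file is imported under two different module names (which a cross-dataset import, or a pip-installed copy alongside a source checkout, can produce), its top-level code runs twice, so any module-level registration fires twice. A sketch with hypothetical names:

```python
# Loading one file under two module names executes its top-level code
# twice -- module-level "registration" side effects fire once per load.
import builtins
import importlib.util
import pathlib
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
src = tmp / "dataset_mod.py"
# The module "registers" itself by appending to a shared list on import.
src.write_text("import builtins\nbuiltins.REGISTRATIONS.append('aslg_pc12')\n")

builtins.REGISTRATIONS = []

for name in ("pkg_a.dataset_mod", "pkg_b.dataset_mod"):  # two import paths
    spec = importlib.util.spec_from_file_location(name, src)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(builtins.REGISTRATIONS)  # ['aslg_pc12', 'aslg_pc12']
```

Two entries for one file: exactly the double-registration pattern behind the "already registered" `ValueError`.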