ImportError: "attempted relative import with no known parent package" when attempting to locally test rwth_phoenix2014_t
Closed this issue · 19 comments
I thought I'd try the instructions for testing from https://tensorflow.google.cn/datasets/add_dataset?hl=en#unit-test_your_dataset on a known-correct dataset, `rwth_phoenix2014_t`. However, I keep getting errors like:
> tfds build
(log truncated)
sign-language\datasets\sign_language_datasets\datasets\rwth_phoenix2014_t\rwth_phoenix2014_t.py", line 9, in <module>
from ..warning import dataset_warning
ImportError: attempted relative import with no known parent package
Basically I've been trying to figure out how to locally test a dataset, for #29. The README for this repo at https://github.com/sign-language-processing/datasets?tab=readme-ov-file#adding-a-new-dataset describes the basic idea of how to create and register a new dataset. tl;dr you follow this guide: https://tensorflow.google.cn/datasets/add_dataset?hl=en#default_template_tfds_new. But I wanted to try and find a way to test locally, so I thought I'd try it on a dataset I know to be correct, which is when I encountered this error.
If I adjust the code to remove the relative imports, replacing them with absolute ones, e.g. `from sign_language_datasets.datasets.warning import dataset_warning`, then I am able to run the `tfds build` command without getting this error. Not yet sure whether `python rwth_phoenix2014_t_test.py` will work, as the `tfds build` command estimates it will take 4 hours to finish.
The same issue occurs if I attempt this in `dgs_corpus`.
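The relative-import failure can be reproduced in isolation. A minimal sketch (using a hypothetical package `mypkg`, not this repo's actual layout) of why the import fails when a file is run directly but works when it is loaded as part of its package:

```python
# Minimal reproduction (hypothetical package "mypkg"): a relative import
# resolves only when the module is loaded as part of its parent package.
import pathlib
import subprocess
import sys
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
pkg = tmp / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helper.py").write_text("VALUE = 42\n")
(pkg / "mod.py").write_text("from .helper import VALUE\nprint(VALUE)\n")

# Running the file directly: Python sees no parent package -> ImportError.
direct = subprocess.run([sys.executable, str(pkg / "mod.py")],
                        capture_output=True, text=True)
print("ImportError" in direct.stderr)  # True

# Running it as a module of the package: the relative import resolves.
as_module = subprocess.run([sys.executable, "-m", "mypkg.mod"],
                           capture_output=True, text=True, cwd=tmp)
print(as_module.stdout.strip())  # 42
```

This is why `tfds build` and direct `python some_dataset.py` invocations behave differently: only the former imports the dataset file through the package.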
`tfds --version` gives:
TensorFlow Datasets: 4.9.4+nightly
`pip list` shows:
Package Version
---------------------------- ---------------------
absl-py 1.4.0
astunparse 1.6.3
cachetools 5.3.2
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
dill 0.3.8
dm-tree 0.1.8
docopt 0.6.2
etils 1.6.0
flatbuffers 23.5.26
fsspec 2023.12.2
gast 0.5.4
google-auth 2.27.0
google-auth-oauthlib 1.2.0
google-pasta 0.2.0
googleapis-common-protos 1.62.0
grpcio 1.60.1
h5py 3.10.0
idna 3.6
importlib-resources 6.1.1
keras 2.15.0
langcodes 3.3.0
language-data 1.1
libclang 16.0.6
marisa-trie 0.7.8
Markdown 3.5.2
MarkupSafe 2.1.4
ml-dtypes 0.2.0
numpy 1.26.3
oauthlib 3.2.2
opencv-python 4.5.5.64
opt-einsum 3.3.0
packaging 23.2
pandas 2.2.0
pillow 10.2.0
pip 23.3.1
pose_format 0.3.2
promise 2.3
protobuf 3.20.3
psutil 5.9.8
pyarrow 15.0.0
pyasn1 0.5.1
pyasn1-modules 0.3.0
pympi-ling 1.70.2
python-dateutil 2.8.2
python-dotenv 1.0.1
pytz 2023.4
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
scipy 1.12.0
setuptools 68.2.2
sign-language-datasets 0.2.0
six 1.16.0
tensorboard 2.15.1
tensorboard-data-server 0.7.2
tensorflow 2.15.0
tensorflow-estimator 2.15.0
tensorflow-intel 2.15.0
tensorflow-io-gcs-filesystem 0.31.0
tensorflow-metadata 1.14.0
termcolor 2.4.0
tfds-nightly 4.9.4.dev202402010044
toml 0.10.2
tqdm 4.66.1
typing_extensions 4.9.0
tzdata 2023.4
urllib3 2.2.0
webvtt-py 0.4.6
Werkzeug 3.0.1
wheel 0.41.2
wrapt 1.14.1
zipp 3.17.0
I am on Windows 11, using Anaconda, and pip is installed within that environment.
Looks like you are running the tests using `python` instead of `pytest`. Please try using `pytest` instead.
I will say that in this repository, tests have been widely neglected, due to frequent changes in how datasets are created and the need to store a small dummy file for each dataset's features.
In our other repositories, you will see tests are running in CI on every commit.
I'll try `pytest`!
Commands tried so far are based on the instructions in https://tensorflow.google.cn/datasets/add_dataset?hl=en#unit-test_your_dataset, which gives two commands that I noticed:
- `tfds build`
- `python my_dataset_test.py`
Attempted using `pytest`; pip installed pytest-8.0.1 into my conda env. I'm rusty on pytest, so I tried a few commands and looked at https://realpython.com/pytest-python-testing/#how-to-install-pytest.
All of these gave me an identical error:
- `pytest rwth_phoenix2014_t_test.py` (in `datasets\sign_language_datasets\datasets\rwth_phoenix2014_t`, under the parent to where I cloned the repo)
- `pytest .` in that folder
- `pytest .` in the parent of that folder
OK, digging in a bit deeper, here's an example stack trace:
_____________________________ ERROR collecting sign_language_datasets/datasets/rwth_phoenix2014_t/rwth_phoenix2014_t_test.py ______________________________
..\..\..\miniconda3\envs\jw_sign_create\Lib\importlib\__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1204: in _gcd_import
???
<frozen importlib._bootstrap>:1176: in _find_and_load
???
<frozen importlib._bootstrap>:1126: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
<frozen importlib._bootstrap>:1204: in _gcd_import
???
<frozen importlib._bootstrap>:1176: in _find_and_load
???
<frozen importlib._bootstrap>:1126: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
<frozen importlib._bootstrap>:1204: in _gcd_import
???
<frozen importlib._bootstrap>:1176: in _find_and_load
???
<frozen importlib._bootstrap>:1147: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:690: in _load_unlocked
???
<frozen importlib._bootstrap_external>:940: in exec_module
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
sign_language_datasets\datasets\__init__.py:1: in <module>
from .aslg_pc12 import AslgPc12
sign_language_datasets\datasets\aslg_pc12\__init__.py:3: in <module>
from .aslg_pc12 import AslgPc12
sign_language_datasets\datasets\aslg_pc12\aslg_pc12.py:26: in <module>
class AslgPc12(tfds.core.GeneratorBasedBuilder):
<frozen abc>:106: in __new__
???
..\..\..\miniconda3\envs\jw_sign_create\Lib\site-packages\tensorflow_datasets\core\registered.py:209: in __init_subclass__
raise ValueError(f'Dataset with name {cls.name} already registered.')
E ValueError: Dataset with name aslg_pc12 already registered.
Apparently that originates from this file: https://github.com/sign-language-processing/datasets/blob/master/sign_language_datasets/datasets/__init__.py.
When this gets imported, it triggers a "register" process of some kind?
Possibly related to tensorflow/datasets#552
Another possible issue: I have `sign_language_datasets` installed via pip, not from source. There may be some interference going on between the two installations.
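One quick way to check which copy Python actually imports is to print the module's origin path. Shown here with a stdlib package so it runs anywhere; substituting `sign_language_datasets` would reveal whether `sys.path` resolves to the pip-installed copy in `site-packages` or the local source checkout:

```python
# Print where a package is imported from; swap "json" for
# "sign_language_datasets" to diagnose pip-vs-source shadowing.
import importlib.util

spec = importlib.util.find_spec("json")  # stdlib example
print(spec.origin)  # path of the copy that "import json" would load
```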
Trying with source installation:
conda create -n sign_language_datasets_source pip python=3.10 # if I do 3.11 on Windows then there's no compatible tensorflow
# navigate to the repo
git pull # to make sure it's up to date
python -m pip install . #python -m pip ensures we're using the pip inside the conda env
python -m pip install pytest pytest-cov
pytest
Ran this and got a bunch of errors complaining about the missing "dill" package.
python -m pip install dill # https://pypi.org/project/dill/
pytest
Submitted a pull request to fix the errors preventing PyTest tests from running.
I believe the original question is answered; I will separately make an issue/pull request with instructions on how to test a new dataset.
OK, I tried doing the following and I'm getting this again: #53 (comment)
conda create -n sign_language_datasets_source pip python=3.10 # if I do 3.11 on Windows then there's no compatible tensorflow
conda activate sign_language_datasets_source
# navigate to the repo
git pull # to make sure it's up to date
python -m pip install . #python -m pip ensures we're using the pip inside the conda env
python -m pip install pytest pytest-cov dill
pytest .
How come I'm getting the "already registered" again? I'm very confused.
An out-of-the-blue guess, without actually testing it: if one dataset imports from another dataset, the class could end up being declared (and thus registered) twice.
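That guess can be simulated: when one file is imported under two different module names (which a cross-dataset import, or a pip-installed copy alongside a source checkout, can produce), its top-level code runs twice, so any module-level registration fires twice. A sketch with hypothetical names:

```python
# Loading one file under two module names executes its top-level code
# twice -- module-level "registration" side effects fire once per load.
import builtins
import importlib.util
import pathlib
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
src = tmp / "dataset_mod.py"
# The module "registers" itself by appending to a shared list on import.
src.write_text("import builtins\nbuiltins.REGISTRATIONS.append('aslg_pc12')\n")

builtins.REGISTRATIONS = []

for name in ("pkg_a.dataset_mod", "pkg_b.dataset_mod"):  # two import paths
    spec = importlib.util.spec_from_file_location(name, src)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(builtins.REGISTRATIONS)  # ['aslg_pc12', 'aslg_pc12']
```

Two entries for one file: exactly the double-registration pattern behind the "already registered" `ValueError`.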