cldf-datasets/doreco

SQL Tutorial: Unrecognized Features during CLDF conversion


Following the new Usage tutorial, I am running into the following error when running `makecldf`:

(doreco) blum@lingn45 doreco % cldfbench makecldf cldfbench_doreco.py --glottolog-version=v4.7
/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
INFO    running _cmd_makecldf on doreco ...
Path to clts data: /Users/blum/Library/Application Support/cldf/clts
Traceback (most recent call last):
  File "/Users/blum/Projects/venv/doreco/bin/cldfbench", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/cldfbench/__main__.py", line 89, in main
    return args.main(args) or 0
           ^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/cldfbench/commands/makecldf.py", line 32, in run
    with_dataset(args, 'makecldf')
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/cldfbench/cli_util.py", line 161, in with_dataset
    res = func(*arg, args)
          ^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/cldfbench/dataset.py", line 206, in _cmd_makecldf
    self.cmd_makecldf(args)
  File "/Users/blum/Projects/doreco/cldfbench_doreco.py", line 170, in cmd_makecldf
    bipa = clts.bipa[row['IPA']] if row['IPA'] else None
           ^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/clldutils/misc.py", line 241, in __get__
    result = instance.__dict__[self.__name__] = self.fget(instance)
                                                ^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pyclts/api.py", line 23, in bipa
    return self.transcriptionsystem('bipa')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pyclts/api.py", line 80, in transcriptionsystem
    if key in self.transcriptionsystem_dict:
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/clldutils/misc.py", line 241, in __get__
    result = instance.__dict__[self.__name__] = self.fget(instance)
                                                ^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pyclts/api.py", line 77, in transcriptionsystem_dict
    return {ts.id: ts for ts in self.iter_transcriptionsystem()}
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pyclts/api.py", line 77, in <dictcomp>
    return {ts.id: ts for ts in self.iter_transcriptionsystem()}
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pyclts/api.py", line 69, in iter_transcriptionsystem
    yield TranscriptionSystem(
          ^^^^^^^^^^^^^^^^^^^^
  File "/Users/blum/Projects/venv/doreco/lib/python3.11/site-packages/pyclts/transcriptionsystem.py", line 77, in __init__
    raise ValueError(
ValueError: Unrecognized features (duration: ultra-long, line 129))

I am using a fresh venv with the most recent CLTS. @xrotwang Can you spot what I am doing wrong?

You probably have to upgrade pyclts, too.

$ pip freeze | grep pyclts
pyclts==3.1.1

pyclts is already on version 3.1.1, since I've installed all packages from the requirements.txt

Running cldfbench catinfo:

local clone: /Users/blum/Library/Application Support/cldf/clts
config at: /Users/blum/Library/Application Support/cldf/catalog.ini
versions: 
  v2.2.0 release candidate for v2.2.0 (#128)
  v2.1.0 added CI badge
  v2.0.0 2.0.0 release
  v1.4.1 CLTS recreated with pyclts 2.0
  v1.4   CLTS recreated with pyclts 2.0
API: pyclts 3.1.1

Since the doreco dataset is not a lexibank dataset, the CLTS data isn't looked up via the catalog, but read from the path you pass here:

doreco/cldfbench_doreco.py

Lines 164 to 166 in a883a9d

clts_data = pathlib.Path('cldf-clts-clts-6dc73af')
if not clts_data.exists():
    clts_data = pathlib.Path(input('Path to clts data: '))

It also doesn't do any git checkout, so if your clone of CLTS is checked out to a version other than v2.2.0, that might be the problem.
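One way to check this before running `makecldf` is to ask git which tag the clone is sitting on. The following is only a minimal sketch, not part of `cldfbench_doreco.py`: the helper names `describe_checkout` and `is_at_release` are made up for illustration, and it assumes that `git` is on your PATH and that the directory you pass is a git clone of CLTS.

```python
import pathlib
import subprocess


def describe_checkout(repo: pathlib.Path) -> str:
    """Return what `git describe --tags --always` reports for the clone.

    Hypothetical helper: assumes `git` is installed and `repo` is a git clone.
    """
    result = subprocess.run(
        ["git", "-C", str(repo), "describe", "--tags", "--always"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return result.stdout.strip()


def is_at_release(described: str, expected_tag: str) -> bool:
    """True only if HEAD sits exactly on the expected tag.

    When HEAD is not on a tag, git describe prints something like
    'v2.1.0-5-g6dc73af' (nearest tag, commits since, abbreviated hash),
    which will not compare equal to a plain tag name.
    """
    return described == expected_tag
```

With the path from the `catinfo` output above, `is_at_release(describe_checkout(pathlib.Path('/Users/blum/Library/Application Support/cldf/clts')), 'v2.2.0')` should only be true if the clone is checked out at the v2.2.0 tag. If it is not, running `git checkout v2.2.0` inside the clone is the likely fix.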

Thanks! That did the trick