BONSAMURAIS/ystafdb

Extraction of ystafdb triples shows AttributeError

Closed this issue · 10 comments

I was running ystafdb-cli and encountered the following error:
AttributeError: term 'license' not in namespace 'http://purl.org/dc/elements/1.1/'

We managed to work around the error by using an older rdflib version (5.0). So we need to either pin the old rdflib version in the setup or find out why it doesn't work with the current version of rdflib.

I can confirm that a "clean installation" (see below) with Python 3.9 and rdflib 6 produces the above-mentioned error:

conda create -n ystafdb python=3.9
conda activate ystafdb
pip install -e .
ystafdb-cli

yields:

Traceback (most recent call last):
  File "/home/xmasgrinch/miniconda3/envs/ystafdb/bin/ystafdb-cli", line 33, in <module>
    sys.exit(load_entry_point('ystafdb', 'console_scripts', 'ystafdb-cli')())
  File "/home/xmasgrinch/workspaces/samurai/ystafdb/ystafdb/bin/ystafdb.py", line 42, in main
    generate_ystafdb(args)
  File "/home/xmasgrinch/workspaces/samurai/ystafdb/ystafdb/__init__.py", line 15, in generate_ystafdb
    generate_foaf_uris(args)
  File "/home/xmasgrinch/workspaces/samurai/ystafdb/ystafdb/foaf.py", line 44, in generate_foaf_uris
    g.add((node, DC.license, URIRef("https://creativecommons.org/licenses/by/3.0/")))
  File "/home/xmasgrinch/miniconda3/envs/ystafdb/lib/python3.9/site-packages/rdflib/namespace/__init__.py", line 206, in __getattr__
    return cls.__getitem__(name)
  File "/home/xmasgrinch/miniconda3/envs/ystafdb/lib/python3.9/site-packages/rdflib/namespace/__init__.py", line 197, in __getitem__
    raise AttributeError(f"term '{name}' not in namespace '{cls._NS}'")
AttributeError: term 'license' not in namespace 'http://purl.org/dc/elements/1.1/'

Here are the installed dependencies:

> conda list                          
# packages in environment at /home/xmasgrinch/miniconda3/envs/ystafdb:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             4.5                       1_gnu  
appdirs                   1.4.4                    pypi_0    pypi
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
ca-certificates           2021.10.8            ha878542_0    conda-forge
certifi                   2021.10.8        py39hf3d152e_1    conda-forge
decorator                 5.1.0              pyhd8ed1ab_0    conda-forge
docopt                    0.6.2                    pypi_0    pypi
ipython                   7.30.1           py39hf3d152e_0    conda-forge
isodate                   0.6.0                    pypi_0    pypi
jedi                      0.18.1           py39hf3d152e_0    conda-forge
ld_impl_linux-64          2.35.1               h7274673_9
libffi                    3.3                  he6710b0_2
libgcc-ng                 9.3.0               h5101ec6_17
libgomp                   9.3.0               h5101ec6_17
libstdcxx-ng              9.3.0               hd4cf53a_17
matplotlib-inline         0.1.3              pyhd8ed1ab_0    conda-forge
ncurses                   6.3                  h7f8727e_2
numpy                     1.21.4                   pypi_0    pypi
openssl                   1.1.1l               h7f8727e_0
pandas                    1.3.4                    pypi_0    pypi
parso                     0.8.3              pyhd8ed1ab_0    conda-forge
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pip                       21.2.4           py39h06a4308_0
prompt-toolkit            3.0.22             pyha770c72_0    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pygments                  2.10.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.6                    pypi_0    pypi
python                    3.9.7                h12debd9_1
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.9                      2_cp39    conda-forge
pytz                      2021.3                   pypi_0    pypi
rdflib                    6.0.2                    pypi_0    pypi
readline                  8.1                  h27cfd23_0
setuptools                58.0.4           py39h06a4308_0
six                       1.16.0                   pypi_0    pypi
sqlite                    3.36.0               hc218d9a_0
tk                        8.6.11               h1ccaba5_0
traitlets                 5.1.1              pyhd8ed1ab_0    conda-forge
tzdata                    2021e                hda174b7_0
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
wheel                     0.37.0             pyhd3eb1b0_1
xz                        5.2.5                h7b6447c_0
ystafdb                   0.5.2                     dev_0    <develop>
zlib                      1.2.11               h7b6447c_3

I pushed a quick fix that pins rdflib to a version lower than 6. This lets development of ystafdb continue, but at some point we need to document why rdflib 6 is complaining.
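To make the intent of the pin concrete, here is a minimal sketch of what "lower than 6" means for the two versions seen in this thread. The requirement string `rdflib<6` and the helper names are assumptions for illustration, not the contents of the actual fix, and the parser is deliberately simplified (it is not a full PEP 440 implementation).

```python
from typing import Tuple

# Hypothetical form of the pin; the real setup file may spell it differently.
PINNED_REQUIREMENT = "rdflib<6"


def parse_version(version: str) -> Tuple[int, ...]:
    """Keep only the purely numeric dot-separated parts, e.g. '6.0.2' -> (6, 0, 2)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def satisfies_pin(version: str, upper_bound: str = "6") -> bool:
    """Return True if `version` is strictly below `upper_bound`."""
    return parse_version(version) < parse_version(upper_bound)


# rdflib 5.0 worked for us, rdflib 6.0.2 raised the AttributeError.
for candidate in ("5.0.0", "6.0.2"):
    print(candidate, "allowed" if satisfies_pin(candidate) else "rejected")
```

For anything beyond a quick check, a proper version-specifier library would be a safer choice than this hand-rolled comparison.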

I looked for the license property in the Dublin Core terms and elements.
In the table below, the column "uri includes license" shows what I found on the Dublin Core website: DC (~ elements) does not include license, while DCTERMS (~ dcmi-terms) does.

| rdflib version | rdflib.namespace | Namespace URL | in rdflib source | rdflib includes license? | uri includes license |
|---|---|---|---|---|---|
| 5 | DC Namespace | http://purl.org/dc/elements/1.1/ | rdflib.namespace.DC | not explicitly | NO |
| 5 | DCTERMS Namespace | http://purl.org/dc/terms/ | rdflib.namespace.DCTERMS | not explicitly | yes |
| 6 | DC Namespace | http://purl.org/dc/elements/1.1/ | rdflib.namespace._DC | NO | NO |
| 6 | DCTERMS Namespace | http://purl.org/dc/terms/ | rdflib.namespace._DCTERMS | YES | yes |

I think that rdflib 5 did not restrict the terms included in DC (that's why adding DC.license worked fine), but in rdflib 6 the lists of properties of DC and DCTERMS are restricted; notably, they match what the Dublin Core website says.
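The difference can be illustrated with a toy model. This is not rdflib's implementation: the class names `OpenNamespace` and `ClosedNamespace` are made up for this sketch, and the term lists are only partial excerpts of the Dublin Core vocabularies.

```python
class OpenNamespace(str):
    """Toy unrestricted namespace: any attribute name is turned into a URI."""

    def __getattr__(self, name: str) -> str:
        if name.startswith("_"):
            raise AttributeError(name)
        return str(self) + name


class ClosedNamespace:
    """Toy restricted namespace: only explicitly listed terms are allowed."""

    def __init__(self, base: str, terms):
        self._base = base
        self._terms = frozenset(terms)

    def __getattr__(self, name: str) -> str:
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._terms:
            raise AttributeError(f"term '{name}' not in namespace '{self._base}'")
        return self._base + name


DC_URL = "http://purl.org/dc/elements/1.1/"
DCTERMS_URL = "http://purl.org/dc/terms/"

# Behaviour we observed with rdflib 5: nothing is checked.
DC_OPEN = OpenNamespace(DC_URL)
# Behaviour we observed with rdflib 6: terms are checked (partial lists here).
DC_CLOSED = ClosedNamespace(DC_URL, ["title", "creator", "rights"])
DCTERMS_CLOSED = ClosedNamespace(DCTERMS_URL, ["title", "creator", "license"])

print(DC_OPEN.license)          # accepted, even though DC elements has no 'license'
print(DCTERMS_CLOSED.license)   # accepted, 'license' is a DCTERMS term
try:
    DC_CLOSED.license
except AttributeError as err:
    print("AttributeError:", err)
```

In the toy model, as in our traceback, the unrestricted variant silently builds a URI for a term that does not exist in the vocabulary, while the restricted variant fails early.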

Great catch! We will need to fix that

I'm on it ;)

@kuzeko & @IKnowLogic & @agneta20 :

If you agree, we can

  • first accept the PR for the fix to pin rdflib to '< 6' and produce version 0.5.3 of ystafdb
  • then accept the other PR to fix the issues with using DC/DCTERMS and produce version 0.6.0 of ystafdb (because we have something relatively "new", we jump to a higher minor version).

Sounds very reasonable.

@IKnowLogic please confirm

I agree, good work.

Version 0.6.0 has a fix for this.

Releases 0.5.3 and 0.6.0 have been published to PyPI (I manually created a release, and the GitHub Action popped up to do the automatic PyPI publishing).