Extraction of ystafdb triples shows AttributeError
Closed this issue · 10 comments
I was running ystafdb-cli and encountered the following error:
AttributeError: term 'license' not in namespace 'http://purl.org/dc/elements/1.1/'
We managed to work around the error by using an older rdflib version (5.0). So we either need to pin the old rdflib version in the setup or find out why it doesn't work with the current version of rdflib.
I can confirm that a "clean installation" (see below) with Python 3.9 and rdflib 6 produces the above-mentioned error:
conda create -n ystafdb python=3.9
conda activate ystafdb
pip install -e .
ystafdb-cli
yields:
Traceback (most recent call last):
  File "/home/xmasgrinch/miniconda3/envs/ystafdb/bin/ystafdb-cli", line 33, in <module>
    sys.exit(load_entry_point('ystafdb', 'console_scripts', 'ystafdb-cli')())
  File "/home/xmasgrinch/workspaces/samurai/ystafdb/ystafdb/bin/ystafdb.py", line 42, in main
    generate_ystafdb(args)
  File "/home/xmasgrinch/workspaces/samurai/ystafdb/ystafdb/__init__.py", line 15, in generate_ystafdb
    generate_foaf_uris(args)
  File "/home/xmasgrinch/workspaces/samurai/ystafdb/ystafdb/foaf.py", line 44, in generate_foaf_uris
    g.add((node, DC.license, URIRef("https://creativecommons.org/licenses/by/3.0/")))
  File "/home/xmasgrinch/miniconda3/envs/ystafdb/lib/python3.9/site-packages/rdflib/namespace/__init__.py", line 206, in __getattr__
    return cls.__getitem__(name)
  File "/home/xmasgrinch/miniconda3/envs/ystafdb/lib/python3.9/site-packages/rdflib/namespace/__init__.py", line 197, in __getitem__
    raise AttributeError(f"term '{name}' not in namespace '{cls._NS}'")
AttributeError: term 'license' not in namespace 'http://purl.org/dc/elements/1.1/'
Here are the installed dependencies:
> conda list
# packages in environment at /home/xmasgrinch/miniconda3/envs/ystafdb:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 4.5 1_gnu
appdirs 1.4.4 pypi_0 pypi
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
ca-certificates 2021.10.8 ha878542_0 conda-forge
certifi 2021.10.8 py39hf3d152e_1 conda-forge
decorator 5.1.0 pyhd8ed1ab_0 conda-forge
docopt 0.6.2 pypi_0 pypi
ipython 7.30.1 py39hf3d152e_0 conda-forge
isodate 0.6.0 pypi_0 pypi
jedi 0.18.1 py39hf3d152e_0 conda-forge
ld_impl_linux-64 2.35.1 h7274673_9
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5101ec6_17
libgomp 9.3.0 h5101ec6_17
libstdcxx-ng 9.3.0 hd4cf53a_17
matplotlib-inline 0.1.3 pyhd8ed1ab_0 conda-forge
ncurses 6.3 h7f8727e_2
numpy 1.21.4 pypi_0 pypi
openssl 1.1.1l h7f8727e_0
pandas 1.3.4 pypi_0 pypi
parso 0.8.3 pyhd8ed1ab_0 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pip 21.2.4 py39h06a4308_0
prompt-toolkit 3.0.22 pyha770c72_0 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pygments 2.10.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.6 pypi_0 pypi
python 3.9.7 h12debd9_1
python-dateutil 2.8.2 pypi_0 pypi
python_abi 3.9 2_cp39 conda-forge
pytz 2021.3 pypi_0 pypi
rdflib 6.0.2 pypi_0 pypi
readline 8.1 h27cfd23_0
setuptools 58.0.4 py39h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.36.0 hc218d9a_0
tk 8.6.11 h1ccaba5_0
traitlets 5.1.1 pyhd8ed1ab_0 conda-forge
tzdata 2021e hda174b7_0
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
wheel 0.37.0 pyhd3eb1b0_1
xz 5.2.5 h7b6447c_0
ystafdb 0.5.2 dev_0 <develop>
zlib 1.2.11 h7b6447c_3
I pushed a quick fix that goes in the direction of pinning rdflib to a version lower than 6. This can help to continue the development of ystafdb, but at some point we need to document why version 6 of rdflib is complaining.
I looked for the license property in the Dublin Core terms and elements.
In the table below, the column "uri includes license" shows what I found on the Dublin Core website: DC (~ elements) does not include license, while DCTERMS (~ dcmi-terms) includes license.
rdflib version | rdflib.namespace | Namespace URL in rdflib | source | rdflib includes license? | uri includes license |
---|---|---|---|---|---|
5 | DC Namespace | http://purl.org/dc/elements/1.1/ | rdflib.namespace.DC | not explicitly | NO |
5 | DCTERMS Namespace | http://purl.org/dc/terms/ | rdflib.namespace.DCTERMS | not explicitly | YES |
6 | DC Namespace | http://purl.org/dc/elements/1.1/ | rdflib.namespace._DC | NO | NO |
6 | DCTERMS Namespace | http://purl.org/dc/terms/ | rdflib.namespace._DCTERMS | YES | YES |
I think that rdflib 5 did not restrict the terms included in DC (that's why adding a DC.license worked fine), but in rdflib 6 the lists of properties of DC and DCTERMS are more restricted; notably, they match what the Dublin Core website says.
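To make the difference concrete, here is a small stdlib-only sketch of the two behaviours. `OpenNamespace` and `ClosedNamespace` are hypothetical toy classes written for this illustration, not rdflib's actual implementation, and `DCTERMS_SUBSET` lists only a few of the DCMI terms:

```python
DC_URI = "http://purl.org/dc/elements/1.1/"
DCTERMS_URI = "http://purl.org/dc/terms/"

# The fifteen properties of the Dublin Core element set; note there is no "license".
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}
# Illustrative subset of the DCMI terms namespace, which does define "license".
DCTERMS_SUBSET = {"creator", "license", "rights", "title"}


class OpenNamespace:
    """Toy model of an unrestricted namespace: any attribute becomes a URI."""

    def __init__(self, base):
        self._base = base

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return self._base + name


class ClosedNamespace:
    """Toy model of a restricted namespace: unknown terms raise AttributeError."""

    def __init__(self, base, terms):
        self._base = base
        self._terms = frozenset(terms)

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._terms:
            raise AttributeError(f"term '{name}' not in namespace '{self._base}'")
        return self._base + name


dc_open = OpenNamespace(DC_URI)
dc_closed = ClosedNamespace(DC_URI, DC_ELEMENTS)
dcterms_closed = ClosedNamespace(DCTERMS_URI, DCTERMS_SUBSET)

print(dc_open.license)         # accepted, although DC defines no such term
try:
    dc_closed.license
except AttributeError as err:
    print(err)                 # same kind of message as in the traceback
print(dcterms_closed.license)  # accepted, the term is defined in DCTERMS
```

The open variant silently produces a URI that the Dublin Core element set does not define, which would explain why the old code appeared to work.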
Great catch! We will need to fix that
I'm on it ;)
@kuzeko & @IKnowLogic & @agneta20 :
If you agree, we can
- first accept the PR for the fix to pin rdflib to '< 6' and produce version 0.5.3 of ystafdb
- then accept the other PR to fix the issues with using DC/DCTERMS and produce version 0.6.0 of ystafdb (because we have something relatively "new", we jump to a higher minor version).
Sounds very reasonable,
@IKnowLogic please confirm
I agree, good work.
version 0.6.0 has a fix for this.