Problems with application/rdf+xml
GeorgFerdinandSchneider opened this issue · 4 comments
Dear perma-id team,
again many thanks for providing this web service!
We have updated the w3id.org/bot namespace to provide access to version IRIs (cf. #1995 by @attadanta).
Our test script works fine, but the Protege tool still cannot retrieve a version IRI, e.g. w3id.org/bot-0.3.2
We observed some irregular behaviour: Protege sends out text/html requests. Since we cannot test more deeply on the w3id.org server, we were wondering whether you have any suggestions, perhaps from other projects with a similar naming structure.
Thanks and BR
Georg
I'm not sure how that test suite script works and I'm not a protege user, but I'll try to help. Can you list one failing URL you are requesting, the Accept header being used, and the expected result? I assume the URL in the text above should have a slash? "/bot-0.3.2" won't work. With a slash, the request does appear to follow through to the ttl content. The w3id server does output HTML in the body of the redirect response, but tools should ignore that.
curl -v -L -H "Accept: text/turtle" https://w3id.org/bot/0.3.2
If you want to test locally you can serve up the w3id repo in a local apache instance and map w3id.org to localhost. There are some https certificate issues to deal with though. I usually test redirects with a non-https http://w3id.localhost/ alias.
Hi @davidlehn,
I ran protégé behind a proxy and captured the following exchange. (I followed this and trimmed the bodies and some of the headers for brevity)
==================================================
GET /bot/0.3.2 HTTP/1.1
------------------------- request headers -------------------------
ACCEPT : application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, text/plain; q=0.1, */*; q=0.09
ACCEPT-ENCODING : xz,gzip,deflate
USER-AGENT : Java/1.8.0_121
HOST : w3id.org
------------------------- response headers -------------------------
LOCATION : https://w3c-lbd-cg.github.io/bot/bot-0.3.2.ttl
CONTENT-LENGTH : 305
CONTENT-TYPE : text/html; charset=iso-8859-1
127.0.0.1:50640: GET https://w3id.org/bot/0.3.2
<< 302 Found 305b
127.0.0.1:50640: clientdisconnect
127.0.0.1:50642: clientconnect
==================================================
GET /bot/bot-0.3.2.ttl HTTP/1.1
------------------------- request headers -------------------------
ACCEPT : application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, text/plain; q=0.1, */*; q=0.09
ACCEPT-ENCODING : xz,gzip,deflate
USER-AGENT : Java/1.8.0_121
HOST : w3c-lbd-cg.github.io
------------------------- response headers -------------------------
CONTENT-LENGTH : 114214
CONTENT-TYPE : text/turtle; charset=utf-8
127.0.0.1:50642: GET https://w3c-lbd-cg.github.io/bot/bot-0.3.2.ttl
<< 200 OK 111.54k
127.0.0.1:50644: clientconnect
==================================================
GET /bot/0.3.2 HTTP/1.1
------------------------- request headers -------------------------
ACCEPT : application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, text/plain; q=0.1, */*; q=0.09
ACCEPT-ENCODING : xz,gzip,deflate
USER-AGENT : Java/1.8.0_121
HOST : w3id.org
------------------------- response headers -------------------------
LOCATION : https://w3c-lbd-cg.github.io/bot/bot-0.3.2.ttl
CONTENT-LENGTH : 305
CONTENT-TYPE : text/html; charset=iso-8859-1
127.0.0.1:50644: GET https://w3id.org/bot/0.3.2
<< 302 Found 305b
127.0.0.1:50644: clientdisconnect
==================================================
GET /bot/bot-0.3.2.ttl HTTP/1.1
------------------------- request headers -------------------------
ACCEPT : application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, text/plain; q=0.1, */*; q=0.09
ACCEPT-ENCODING : xz,gzip,deflate
USER-AGENT : Java/1.8.0_121
HOST : w3c-lbd-cg.github.io
------------------------- response headers -------------------------
CONTENT-LENGTH : 114214
CONTENT-TYPE : text/turtle; charset=utf-8
127.0.0.1:50642: GET https://w3c-lbd-cg.github.io/bot/bot-0.3.2.ttl
<< 200 OK 111.54k
127.0.0.1:50646: clientconnect
==================================================
GET /bot/0.3.2 HTTP/1.1
------------------------- request headers -------------------------
ACCEPT : application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, text/plain; q=0.1, */*; q=0.09
ACCEPT-ENCODING : xz,gzip,deflate
USER-AGENT : Java/1.8.0_121
HOST : w3id.org
------------------------- response headers -------------------------
LOCATION : https://w3c-lbd-cg.github.io/bot/bot-0.3.2.ttl
CONTENT-LENGTH : 305
CONTENT-TYPE : text/html; charset=iso-8859-1
127.0.0.1:50646: GET https://w3id.org/bot/0.3.2
<< 302 Found 305b
127.0.0.1:50646: clientdisconnect
==================================================
GET /bot/bot-0.3.2.ttl HTTP/1.1
------------------------- request headers -------------------------
ACCEPT : application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, text/plain; q=0.1, */*; q=0.09
ACCEPT-ENCODING : xz,gzip,deflate
USER-AGENT : Java/1.8.0_121
HOST : w3c-lbd-cg.github.io
------------------------- response headers -------------------------
CONTENT-LENGTH : 114214
CONTENT-TYPE : text/turtle; charset=utf-8
127.0.0.1:50642: GET https://w3c-lbd-cg.github.io/bot/bot-0.3.2.ttl
<< 200 OK 111.54k
127.0.0.1:50648: clientconnect
==================================================
GET /bot/0.3.2 HTTP/1.1
------------------------- request headers -------------------------
USER-AGENT : Java/1.8.0_121
HOST : w3id.org
ACCEPT : text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
------------------------- response headers -------------------------
LOCATION : https://w3c-lbd-cg.github.io/bot/0.3.2
CONTENT-LENGTH : 297
KEEP-ALIVE : timeout=5, max=100
CONTENT-TYPE : text/html; charset=iso-8859-1
127.0.0.1:50648: GET https://w3id.org/bot/0.3.2
<< 302 Found 297b
127.0.0.1:50648: clientdisconnect
==================================================
GET /bot/0.3.2 HTTP/1.1
------------------------- request headers -------------------------
USER-AGENT : Java/1.8.0_121
HOST : w3c-lbd-cg.github.io
ACCEPT : text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
------------------------- response headers -------------------------
127.0.0.1:50642: GET https://w3c-lbd-cg.github.io/bot/0.3.2
<< 404 Not Found 9.12k
protégé does four requests with the original URL, gets 200's on each, but ignores them. Afterwards, it tries with text/html, which fails since the redirection rule does not fire for that content type.
Removing the text/html condition in the rule would probably help. However, I think it would be worthwhile to understand what causes the behavior so that more reliable test tools could be built.
My test suite works exactly like your test snippet. It compares the location response header against an expected URL and complains if those don't match. This alone doesn't ensure that the redirection would work in protégé.
Adding to the mystery, protégé opens the ontology when pointed to my test setup at http://bot.mischung.net.
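To reproduce the exchange above outside protégé, a small script can replay both Accept headers from the capture and follow the redirects by hand. This is only a minimal sketch: the names `fetch_once`, `trace_redirects` and `main` are made up for illustration, and it assumes the Accept values in the trimmed capture are complete.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Accept headers copied from the proxy capture above.
ACCEPT_RDF = ("application/rdf+xml, application/xml; q=0.7, text/xml; q=0.6, "
              "text/plain; q=0.1, */*; q=0.09")
ACCEPT_HTML = "text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url, accept):
    """Issue one GET without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"Accept": accept})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, accept, fetch=fetch_once, max_hops=5):
    """Follow redirects manually and return the list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url, accept)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops


def main():
    # Performs real network requests against w3id.org.
    for label, accept in (("rdf+xml", ACCEPT_RDF), ("text/html", ACCEPT_HTML)):
        print(label)
        for hop_url, status in trace_redirects("https://w3id.org/bot/0.3.2", accept):
            print("  ", status, hop_url)
```

Calling `main()` prints one redirect chain per Accept header, which makes it easy to see which header ends in a 404. The `fetch` parameter exists so the tracing logic can be exercised against a stub instead of the live server.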
In case it helps, I handle the access to particular versions with a separate rule. Example: https://github.com/perma-id/w3id.org/blob/master/okn/o/sd/.htaccess
It loads in Protege correctly. Maybe they are expecting a 303 instead of a 302, but in theory it should not matter.
@dgarijo: Thank you! Your hint with the 303 did it for me 👍
I had the same problem with version IRIs and Protege failing to resolve imports. After wasting some hours testing my .htaccess and looking through issues in this repo and in Protege's, I finally came across this issue.
What is strange is that "plain" ontology IRIs without a version work with a 301...
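Since a check that only compares the Location header would not have flagged the 302 versus 303 difference reported above, a redirect test could assert the status code as well. The helper below is a hypothetical sketch (the name `check_redirect` and the default of 303 are assumptions for illustration, based only on the reports in this thread, not on documented Protege behaviour).

```python
def check_redirect(status, location, expected_location, expected_status=303):
    """Return a list of problems for one redirect hop; an empty list means it matches."""
    problems = []
    if status != expected_status:
        problems.append(f"expected status {expected_status}, got {status}")
    if location != expected_location:
        problems.append(f"expected Location {expected_location!r}, got {location!r}")
    return problems
```

For plain ontology IRIs that are known to load with a 301, the same helper can be called with `expected_status=301`.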