https == http for purposes of URI identification
Closed this issue · 3 comments
From caru...@gmail.com on September 02, 2011 09:27:13
What capability do you want added or improved? Resolve HTTPS-based URIs as if they were identical to the corresponding non-secure HTTP based URIs.
Example: assume an entry with URI http://mmisw.org/ont/foo/baz . Now assume the site also supports HTTP Secure access. Now, given ORR's emphasis on self-resolvability, the ontology should also be accessible with https://mmisw.org/ont/foo/baz .
Technically, supporting this is not too complicated. However, since the URIs are strictly different, it would be interesting to examine what "semantic" ramifications this functionality may have.
Original issue: http://code.google.com/p/mmisw/issues/detail?id=292
From jgrayb...@ucsd.edu on September 02, 2011 09:44:31
Interesting question. There are two models: 1) It is a distinct term identifier and therefore a distinct concept, so a sameAs relationship should be generated and maintained. 2) It is actually just a way, on the web, to access the original URI, and has no semantic import.
My first reaction is that the second approach makes life simpler. This will not be the case, though, if people start using the https URL as a semantic URI. Then those people who look for the term using https won't find it. (And how annoying is that!?)
To account for that possibility, following approach 1 seems more bulletproof. The only downside is that doubling the number of terms, and making every term into 2 terms and a relationship between them, has some noticeable impact on the set of inferred relations. (Suppose https://A = http://A and https://B = http://B. Now if http://A = http://B, there is not 1 new relation but 4: A = B, s.A = B, s.A = s.B, A = s.B, where s.X stands for https://X.) Hmm, I guess it just multiplies the number by 4, so that isn't a crisis. Just noticeable.
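The factor of 4 can be checked with a small sketch. It assumes sameAs is treated as symmetric and transitive, and it uses the placeholder URIs http://A, https://A, http://B and https://B from the comment above; the function name same_as_closure is made up for illustration and is not part of ORR.

```python
from itertools import combinations


def same_as_closure(nodes, asserted):
    """Return every unordered pair implied by the asserted sameAs links.

    sameAs is treated as symmetric and transitive, so nodes are grouped
    into equivalence classes with a small union-find structure.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the class representative, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in asserted:
        parent[find(a)] = find(b)

    return {
        frozenset(pair)
        for pair in combinations(nodes, 2)
        if find(pair[0]) == find(pair[1])
    }


nodes = ["http://A", "https://A", "http://B", "https://B"]
# Approach 1: every term gets an https twin linked by sameAs.
scheme_links = [("https://A", "http://A"), ("https://B", "http://B")]

before = same_as_closure(nodes, scheme_links)
after = same_as_closure(nodes, scheme_links + [("http://A", "http://B")])
new_pairs = after - before

for pair in sorted(sorted(p) for p in new_pairs):
    print(" = ".join(pair))
```

Under these assumptions, the single assertion http://A = http://B adds four pairs to the closure, matching the count given above.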
This would be a good question for the ont list, but I'm not following it much any more.
I think I will just make "ont" handle https- and http-based URI requests as interchangeable.
I think that is tolerable, and beneficial from a usability standpoint.
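As a rough illustration of what "interchangeable" could mean at lookup time, here is a minimal sketch that maps an incoming https URI to its http form before consulting the registry. The names lookup_key, resolve and REGISTRY are hypothetical and do not reflect how the "ont" service is actually implemented.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical registry keyed by the http form of each URI.
REGISTRY = {
    "http://mmisw.org/ont/foo/baz": "registered entry for foo/baz",
}


def lookup_key(uri):
    """Map an https URI to its http counterpart; leave other URIs alone."""
    parts = urlsplit(uri)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


def resolve(uri):
    """Return the registered entry for either scheme, or None if unknown."""
    return REGISTRY.get(lookup_key(uri))


print(resolve("https://mmisw.org/ont/foo/baz"))
print(resolve("http://mmisw.org/ont/foo/baz"))
```

Only the lookup key is rewritten here; the stored identifier stays the http URI, so no second term or sameAs relation is created.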
It creates an unfortunate situation, in that string parsing of URIs can't detect the 'real' matches (i.e., https will not match http, so a term could end up with two IDs that are not identical). Our best bet may be to run that risk, and at some point declare the official URI to be one or the other.
In fact, I propose we declare the http URI the default identifier in all cases UNLESS the user and system have identified https URIs as the default for a particular vocabulary. So for now, the 'proper' URI is always the http one; we resolve https as a customer service.
At some point in the future, if someone needs https as the primary URI, we can see if they want us to pay for any additional implementation they may deem necessary. It may be a kind of backward-incompatible behavior if we have to point out https URIs are not the real ones, but I think we'll be OK.
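The proposed default (http is the 'proper' URI unless a vocabulary has been explicitly identified as https-primary) could be sketched as follows. The function canonical_uri and the https_primary prefix set are assumptions made for illustration; the thread does not specify how such an opt-in would be recorded.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_uri(uri, https_primary=frozenset()):
    """Return the 'proper' identifier for a term URI.

    https_primary is a hypothetical set of https URI prefixes for
    vocabularies that have opted in to https as their primary form.
    Everything else defaults to the http form.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        return uri  # not a web URI; leave untouched

    http_form = urlunsplit(parts._replace(scheme="http"))
    https_form = urlunsplit(parts._replace(scheme="https"))

    for prefix in https_primary:
        if https_form.startswith(prefix):
            return https_form
    return http_form


# Default policy: the http URI is the official one.
print(canonical_uri("https://mmisw.org/ont/foo/baz"))

# Hypothetical opt-in for a single vocabulary.
opted_in = {"https://example.org/vocab/"}
print(canonical_uri("http://example.org/vocab/term1", opted_in))
```

Keeping the opt-in as data rather than code would let a vocabulary switch to https later without changing the resolver itself.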