Domain contains subdomain.
z639 opened this issue · 2 comments
Hi,
Am I missing something, or are these results erroneous?
Version: 1.1.23
https://i.imgur.com/950xpqU.png
tested in Chrome 63 and Firefox 58 (Debian)
https://codepen.io/anon/pen/ddRrqo?editors=1010
Hi @z639,
Thanks for the feedback 😉
I think this issue is simply a misunderstanding of what this library (`psl`) does, and of the naming of the parsed properties (`tld`, ...). The library parses domains based on the Public Suffix List, and the parsed `tld` doesn't necessarily match the actual top-level domain (from ICANN's standpoint). In this case, the parsed `tld` represents the public suffix.
A "public suffix" is one under which Internet users can (or historically could) directly register names. Some examples of public suffixes are .com, .co.uk and pvt.k12.ma.us. The Public Suffix List is a list of all known public suffixes.
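To make that definition concrete, here is a minimal sketch of longest-suffix matching. The three-entry list and the `findPublicSuffix` name are illustrative assumptions, not psl's implementation; the real Public Suffix List is far larger and also has wildcard and exception rules, which this sketch ignores.

```javascript
// Tiny stand-in for the Public Suffix List (assumption: the real list is far larger).
const SUFFIXES = new Set(["com", "co.uk", "pvt.k12.ma.us"]);

// Return the longest entry of SUFFIXES that the hostname ends with, or null.
function findPublicSuffix(hostname) {
  const labels = hostname.toLowerCase().split(".");
  // Try the longest candidate first: "www.example.co.uk", "example.co.uk", "co.uk", ...
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (SUFFIXES.has(candidate)) return candidate;
  }
  return null;
}

console.log(findPublicSuffix("www.example.co.uk")); // co.uk
console.log(findPublicSuffix("example.com"));       // com
```

Note that the matched suffix can span several labels, which is why a parsed `tld` may be longer than the last label of the hostname.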
The behaviour you are seeing is actually the expected behaviour:
https://github.com/wrangr/psl/blob/master/test/psl.parse.spec.js#L111
I believe the underlying issue is mainly the confusing property names in the parsed object. By this I mean that a better name for `parsed.tld` could have been something like `parsed.publicSuffix`, perhaps while keeping `parsed.tld` as well, but holding the actual top-level domain.
I'm afraid this design issue was inherited from publicsuffix-ruby, and it hasn't been a major issue as long as you understand the parser's intention. However, I can see that they have since added support for what they call private domains, which gives you the option to switch off the private (non-ICANN) part of the list; with that switched off, parsing would behave as you expected. Would something like this do the trick? It might be worth exploring...
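A rough sketch of what such an option could look like is below. Everything in it is hypothetical: the `ignorePrivate` option, the `parseDomain` function and the two-rule list are illustrations only, not part of psl's API. `github.io` is used as the example because it sits in the private section of the Public Suffix List.

```javascript
// Hypothetical two-rule list; each rule records which section of the list it is in.
const RULES = [
  { suffix: "io", section: "icann" },
  { suffix: "github.io", section: "private" },
];

// Split a hostname into tld / domain / subdomain, optionally skipping private rules.
function parseDomain(hostname, { ignorePrivate = false } = {}) {
  const labels = hostname.toLowerCase().split(".");
  const usable = RULES.filter((r) => !(ignorePrivate && r.section === "private"));
  // Scan from the longest candidate suffix to the shortest.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (usable.some((r) => r.suffix === candidate)) {
      return {
        tld: candidate,
        // Registrable domain: the suffix plus one label to its left, if any.
        domain: i > 0 ? labels.slice(i - 1).join(".") : null,
        // Everything left of the registrable domain, if any.
        subdomain: i > 1 ? labels.slice(0, i - 1).join(".") : null,
      };
    }
  }
  return { tld: null, domain: null, subdomain: null };
}

// Private rules on: tld github.io, domain example.github.io, subdomain www
console.log(parseDomain("www.example.github.io"));
// Private rules off: tld io, domain github.io, subdomain www.example
console.log(parseDomain("www.example.github.io", { ignorePrivate: true }));
```

With private rules ignored, the result lines up with the ICANN view of the top-level domain, at the cost of treating every `github.io` site as a subdomain of a single registrable domain.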
Thoughts?
Thanks for the info and the fast reply.
Yes, I think that addition would do what I'm looking for.
I found your GitHub repository through this Stack Overflow page: https://stackoverflow.com/questions/9752963/get-domain-name-without-subdomains-using-javascript. I'm basically looking for a way to check that a domain is valid and then remove any subdomains/prefixes, so I can see whether the main domain itself is in an array of whitelisted domains.
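The check described above could be sketched roughly as follows. The `getRegistrableDomain` helper is a hypothetical stand-in that only knows the ICANN suffixes `com` and `co.uk`; in practice a lookup against the full Public Suffix List would take its place. The whitelist entries are made up.

```javascript
// Hypothetical stand-in for a full Public Suffix List lookup (ICANN suffixes only).
const ICANN_SUFFIXES = new Set(["com", "co.uk"]);

// Return the registrable domain (suffix plus one label), or null if none is found.
function getRegistrableDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  // Reject empty labels, as in "example..com" or ".com".
  if (labels.some((label) => label.length === 0)) return null;
  // Start at 1 so that a bare suffix such as "com" is never accepted.
  for (let i = 1; i < labels.length; i++) {
    if (ICANN_SUFFIXES.has(labels.slice(i).join("."))) {
      return labels.slice(i - 1).join(".");
    }
  }
  return null;
}

// True when the hostname's registrable domain is on the whitelist.
function isWhitelisted(hostname, whitelist) {
  const domain = getRegistrableDomain(hostname);
  return domain !== null && whitelist.includes(domain);
}

const whitelist = ["example.com", "example.co.uk"];
console.log(isWhitelisted("a.b.example.com", whitelist));      // true
console.log(isWhitelisted("example.com.evil.com", whitelist)); // false
```

Comparing registrable domains rather than testing whether the hostname merely contains a whitelisted string is what rejects `example.com.evil.com`.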
I'll try to figure out how they're doing that in the Ruby repository.
Thanks again.