FQDN extraction error for some domains
Closed this issue · 2 comments
danruggi commented
Error in extraction for
- subdomain.domain.com.de
- domain.com.de
- subdomain.domain.com.se
- domain.com.se
com.de and com.se are valid suffixes on the Public Suffix List, but the extractor does not recognize them.
Example:
import tldextract
tldextract.extract('http://forums.test123.com.de/')
ExtractResult(subdomain='forums.test123', domain='com', suffix='de')
A website that uses one of these suffixes, e.g.:
https://herbalife(dot)com(dot)se/
(hidden to avoid backlinks)
john-kurkowski commented
danruggi commented
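Not an issue, this is the intended behaviour. com.de and com.se are recognized only when the extractor is created with the PSL private domains enabled: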
extractor = tldextract.TLDExtract(include_psl_private_domains=True)
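To illustrate why the flag changes the result, here is a minimal, self-contained sketch of longest-suffix matching. It is not tldextract's implementation: the two suffix sets are a hand-picked, hypothetical excerpt, and `split_host` is an illustrative helper, not a library function. It assumes com.de and com.se belong to the private part of the list, which is what the fix above implies.

```python
# Hypothetical, hand-picked excerpt of a suffix list, split into a public
# (ICANN) part and a private part. Not the real Public Suffix List.
ICANN_SUFFIXES = {"com", "de", "se"}
PRIVATE_SUFFIXES = {"com.de", "com.se"}


def split_host(host, include_private=False):
    """Return (subdomain, domain, suffix) using the longest known suffix."""
    if include_private:
        suffixes = ICANN_SUFFIXES | PRIVATE_SUFFIXES
    else:
        suffixes = ICANN_SUFFIXES
    labels = host.lower().strip(".").split(".")
    # Try the longest candidate suffix first, then progressively shorter ones.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: i - 1]) if i > 1 else ""
            return subdomain, domain, candidate
    # No known suffix: treat the last label as the domain.
    return ".".join(labels[:-1]), labels[-1], ""


# Private suffixes ignored: same split as reported in the issue.
print(split_host("forums.test123.com.de"))
# Private suffixes included: com.de is treated as the suffix.
print(split_host("forums.test123.com.de", include_private=True))
```

With the private set ignored, the longest match is `de`, so `com` ends up as the domain. With it included, `com.de` matches first and `test123` becomes the domain.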