john-kurkowski/tldextract

"hanoi.vn" not parsed correctly

Closed · 1 comment

Hello,

"registered_domain" is not parsed correctly for "hanoi.vn" specifically: the whole name is returned as the suffix and the domain is empty.

Versions:
tldextract: 3.4.4
Python: 3.11.3

Actual behavior

>>> tldextract.extract("hanoi.vn")
ExtractResult(subdomain='', domain='', suffix='hanoi.vn')

Expected behavior

>>> tldextract.extract("hanoi.vn")
ExtractResult(subdomain='', domain='hanoi', suffix='vn')

Interestingly, "hanoia.vn" works as expected.

>>> tldextract.extract("hanoia.vn")
ExtractResult(subdomain='', domain='hanoia', suffix='vn')

Issue solved.
"hanoi.vn" was recently added to https://publicsuffix.org/list/public_suffix_list.dat, so tldextract correctly identifies "hanoi.vn" as a suffix. The "actual behavior" above is therefore the expected one.
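
For anyone hitting the same surprise, here is a rough sketch of why the result changes once a name is added to the suffix list. This is not tldextract's actual implementation: `split_domain` and the two tiny suffix sets are made up for illustration, and wildcard and exception rules from the real Public Suffix List are ignored. It only shows longest-suffix matching.

```python
def split_domain(host, suffixes):
    """Split host into (subdomain, domain, suffix) by longest-suffix match.

    Hypothetical helper for illustration only; `suffixes` is a plain set
    of suffix strings standing in for the real Public Suffix List.
    """
    labels = host.lower().strip(".").split(".")
    # Candidates are tried from longest (all labels) to shortest (last label).
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: i - 1]) if i > 1 else ""
            return subdomain, domain, candidate
    # No known suffix: treat the last label as the domain.
    return ".".join(labels[:-1]), labels[-1], ""


# Assumed stand-ins for the list before and after "hanoi.vn" was added.
before = {"vn", "com.vn"}
after = before | {"hanoi.vn"}

print(split_domain("hanoi.vn", before))   # ('', 'hanoi', 'vn')
print(split_domain("hanoi.vn", after))    # ('', '', 'hanoi.vn')
print(split_domain("hanoia.vn", after))   # ('', 'hanoia', 'vn')
```

With the updated list, a name registered under the new suffix, e.g. "example.hanoi.vn", splits into domain "example" and suffix "hanoi.vn", while the bare "hanoi.vn" has no domain part left.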