Strange get_fld behavior
ninoseki opened this issue · 3 comments
ninoseki commented
First of all, thank you for creating a great tool!
I bumped into an issue with the get_fld function, so let me report it.
Environment
- Python: 3.9.7
- tld: 0.12.6
Current Behavior
from tld import get_fld
print(get_fld("foo.bar.amazonaws.com", fix_protocol=True)) # amazonaws.com
print(get_fld("bar.amazonaws.com", fix_protocol=True)) # amazonaws.com
print(get_fld("eu-west-3.amazonaws.com", fix_protocol=True)) # amazonaws.com
print(get_fld("amazonaws.com", fix_protocol=True)) # amazonaws.com
print(
    get_fld("s3.eu-west-3.amazonaws.com", fix_protocol=True)
)  # s3.eu-west-3.amazonaws.com
Expected behavior
from tld import get_fld
print(
    get_fld("s3.eu-west-3.amazonaws.com", fix_protocol=True)
)  # amazonaws.com
Yomguithereal commented
@ninoseki this behavior is correct as per this line in the tld listings: https://github.com/barseghyanartur/tld/blob/master/src/tld/res/effective_tld_names.dat.txt#L10754
barseghyanartur commented
Check this:
In [1]: from tld import get_fld
In [2]: get_fld("s3.eu-west-3.amazonaws.com", fix_protocol=True)
Out[2]: 's3.eu-west-3.amazonaws.com'
In [3]: get_fld("s3.eu-west-3.amazonaws.com", fix_protocol=True, search_private=False)
Out[3]: 'amazonaws.com'
What you need is to set search_private to False. This works for both get_fld and get_tld.
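To illustrate why the flag changes the result, here is a simplified sketch of longest-suffix matching with separate public and private rule sets. It is not the tld library's actual implementation: the function name first_level_domain and the two tiny rule sets are hypothetical stand-ins for the public and private sections of the suffix list, and the host name with the bucket label is made up for the example.

```python
# Hypothetical, hand-picked rules; the real suffix list is far larger.
PUBLIC_SUFFIXES = {"com"}
PRIVATE_SUFFIXES = {"s3.eu-west-3.amazonaws.com"}


def first_level_domain(hostname, search_private=True):
    """Return the longest matching suffix plus one label to its left."""
    rules = set(PUBLIC_SUFFIXES)
    if search_private:
        rules |= PRIVATE_SUFFIXES
    labels = hostname.lower().rstrip(".").split(".")
    # Try candidate suffixes from longest to shortest, always leaving
    # at least one label to the left of the suffix.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in rules:
            return ".".join(labels[i - 1:])
    return None  # no rule matched


host = "bucket.s3.eu-west-3.amazonaws.com"
print(first_level_domain(host))                        # private rule wins
print(first_level_domain(host, search_private=False))  # falls back to "com"
```

With the private rules enabled, the longer private suffix matches first, so everything up to and including the label before it is returned. With them disabled, only "com" matches and the result collapses to amazonaws.com.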
ninoseki commented
Thanks🙏