[BUG] - Parsing error on URLs ending in ca.com (e.g: geteduca.com)
Closed this issue · 2 comments
mdolr commented
Hello,
My scraper encountered a bug with what seemed like a normal URL.
Actual behavior
URL: https://www.geteduca.com
tldextract.extract(url) failed to parse it and returned ExtractResult(subdomain='', domain='', suffix='').
Expected behavior
I would expect to receive ExtractResult(subdomain='www', domain='geteduca', suffix='com')
Thank you for looking into it, I'll try to submit a fix if I have the time 😄
elliotwutingfeng commented
I'm getting the correct results on CPython 3.10.11 and PyPy 3.9.16.
import tldextract; tldextract.extract("https://www.geteduca.com")
Can you let us know your Python version and verify if you are using tldextract >=3.4.4?
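In case it helps, here is a minimal sketch for collecting those details in one go. The helper names environment_report and try_extract are made up for this comment and are not part of tldextract; the only tldextract call used is tldextract.extract, and the installed version is read through the standard library's importlib.metadata.

```python
import sys
from importlib import metadata


def environment_report():
    """Hypothetical helper: gather the interpreter and tldextract versions."""
    try:
        tld_version = metadata.version("tldextract")
    except metadata.PackageNotFoundError:
        tld_version = None  # tldextract is not installed in this environment
    return {
        "python": sys.version.split()[0],           # e.g. "3.10.11"
        "implementation": sys.implementation.name,  # e.g. "cpython" or "pypy"
        "tldextract": tld_version,
    }


def try_extract(url):
    """Hypothetical helper: run the extraction from the report above.

    tldextract is imported lazily so environment_report() still works
    when the package is missing.
    """
    import tldextract

    return tldextract.extract(url)


print(environment_report())
```

Pasting the printed dictionary together with the result of try_extract("https://www.geteduca.com") into the issue should be enough to compare environments.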
mdolr commented
Hey, sorry. I've updated my packages and Python and can no longer reproduce it. I don't know what happened :/