BUG: reading CSV from online taking a long time
Closed this issue · 3 comments
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
url = 'https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/terms/terms.csv'
df = pd.read_csv(url)
I have also tried
import pandas as pd
import requests
import io
url = 'https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/terms/terms.csv'
response = requests.get(url)
df = pd.read_csv(io.StringIO(response.content.decode('utf-8')))
In this case, the line response = requests.get(url) is what is taking all the time. requests version 2.28.2
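To put numbers on where the time goes, the download and the parse can be timed separately. The following is only a minimal sketch, not part of the original report: the helper names time_call and profile_csv_load are hypothetical, and the fetch and parse steps are passed in as callables so either one can be swapped out or stubbed.

```python
import io
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def profile_csv_load(url, fetch, parse):
    """Time the download step and the parse step independently.

    fetch(url) is assumed to return the CSV contents as a str;
    parse(buffer) is assumed to accept a file-like text object.
    Both names are hypothetical hooks, not pandas or requests APIs.
    """
    text, download_s = time_call(fetch, url)
    table, parse_s = time_call(parse, io.StringIO(text))
    return table, {"download_s": download_s, "parse_s": parse_s}
```

With the script above, this could be called as profile_csv_load(url, lambda u: requests.get(u).content.decode('utf-8'), pd.read_csv). If download_s accounts for nearly all of the total, the delay is in the HTTP request rather than in pandas.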
Issue Description
When I run this simple script, the run time can vary dramatically. Sometimes it takes minutes to run, sometimes seconds.
Expected Behavior
This should run in much less than a second.
Installed Versions
1.5.2
Hi, thanks for your report. If requests takes all the time in the second example, then this looks like a problem on your side? Internet connection or similar? Your script runs in under a second on my machine.
Hi @phofl, I don't think this is an internet connection issue, but I have been unable to reproduce this issue on another person's machine, so perhaps it is something specific to my setup.
Thanks for the report
In this case, the line response = requests.get(url) is what is taking all the time
This should run in much less than a second.
If you think requests.get should take less than a second, please report this to requests.
Closing for now then