xmunoz/sodapy

Issue with upsert/replace

tarmangue opened this issue · 5 comments

I keep getting the same error when using upsert or replace:

Traceback (most recent call last):
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 384, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\http\client.py", line 1321, in getresponse
    response.begin()
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\http\client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\http\client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\contrib\pyopenssl.py", line 307, in recv_into
    raise timeout('The read operation timed out')
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\retry.py", line 367, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\packages\six.py", line 686, in reraise
    raise value
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 386, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "C:\Users\tristanya\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 306, in _raise_timeout
    raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='mydata.iadb.org', port=443): Read timed out. (read timeout=10)

I have tried using both CSV and JSON as the data format; neither works. Any idea what is going on?

This error message indicates that it's probably a network issue. Looks like the request times out after 10 seconds. Have you tried to perform this action manually or with curl to see if it works?

Also, if you could provide the code that caused this exception that would be helpful for debugging.

I have tried with longer timeouts too, same result. Is there a limit to the upload size? Anyway, below is the code:

from sodapy import Socrata

domain = "example.domain.org"
dataset = "abcd-1234"
client = Socrata(domain, "aBcDeF123455969", username="example@email.com", password="password")
# Pass the open file handle as the payload; the with block closes it afterwards
with open("data.json", encoding='utf-8') as data:
    print(client.replace(dataset, data))
client.close()

Update: if I create a fictitious row for the dataset that I am trying to update, and upsert it like so:

client = Socrata(domain, token, username=user, password=pwd)
data = [{'col1': 'AAA', 'col2': 'BBB'}]
print(client.upsert(dataset, data))
client.close()

I get the expected behaviour, which makes me think the problem might be that I am trying to push a 600k-row dataset?

Yes, that is almost certainly the cause. Try splitting the upsert into several smaller, more manageable operations.
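
For illustration, here is a minimal sketch of what splitting the upload could look like. The helper names `chunked` and `upsert_in_batches` and the default batch size of 10,000 rows are assumptions made for this example, not part of sodapy. The only thing it expects from `client` is an `upsert(dataset, rows)` method like the one used above.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def upsert_in_batches(client, dataset, rows, batch_size=10000):
    """Call client.upsert once per batch and collect the responses.

    `batch_size` is an arbitrary starting point (an assumption); tune it
    until each request finishes comfortably within the client's timeout.
    """
    results = []
    for batch in chunked(rows, batch_size):
        results.append(client.upsert(dataset, batch))
    return results
```

With the rows loaded as a list of dicts (for example via `json.load`), this would be called as `upsert_in_batches(client, dataset, rows)`. It does not carry over directly to `replace`, since each `replace` call is expected to overwrite the whole dataset. One possible approach there would be to `replace` with the first batch and `upsert` the rest.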