MG-RAST/Shock

Uploading a file in parts with Python requests


I'm trying to upload a file in parts to Shock from Python using the requests library, and I can't figure out how to get it to work. I've tried a number of incantations, and this is the closest I've gotten:

In [5]: import requests
In [6]: from requests_toolbelt.multipart.encoder import MultipartEncoder
In [7]: requests.get('http://localhost:40317').json()
Out[7]: 
{u'anonymous_permissions': {u'delete': False, u'read': False, u'write': False},
 u'attribute_indexes': [u''],
 u'auth': [u'globus'],
 u'contact': u'shock-admin@kbase.us',
 u'documentation': u'http://localhost:40317/wiki/',
 u'id': u'Shock',
 u'resources': [u'node'],
 u'server_time': u'2016-07-06T11:59:34-07:00',
 u'type': u'Shock',
 u'url': u'http://localhost:40317/',
 u'version': u'0.9.14'}

In [8]: mpdata = MultipartEncoder(fields={'attributes_str': '{"foo": "bar"}', 'parts': 'unknown'})
In [9]: headers = {'Authorization': 'OAuth ' + token}
In [10]: mpheaders = dict(headers)
In [11]: mpheaders['Content-Type'] = mpdata.content_type
In [12]: res = requests.post('http://localhost:40317/node/', headers=mpheaders, data=mpdata)

In [13]: j = res.json()
In [14]: j
Out[14]: 
{u'data': {u'attributes': {u'foo': u'bar'},
  u'created_on': u'2016-07-06T12:00:36.86892903-07:00',
  u'expiration': u'0001-01-01T00:00:00Z',
  u'file': {u'checksum': {},
   u'created_on': u'0001-01-01T00:00:00Z',
   u'format': u'',
   u'name': u'',
   u'size': 0,
   u'virtual': False,
   u'virtual_parts': None},
  u'id': u'0b25ef24-368d-4c78-87e3-8bb3643e2b3b',
  u'indexes': {},
  u'last_modified': u'2016-07-06T12:00:36.874577795-07:00',
  u'linkage': None,
  u'parts': {u'compression': u'',
   u'count': 0,
   u'length': 0,
   u'parts': [],
   u'varlen': True},
  u'tags': None,
  u'type': u'parts',
  u'version': u'7d913c698cfceb9ffaca04c83ed9cb23',
  u'version_parts': {u'acl_ver': u'022c1343f7631cecc0e61bb1809dfdd9',
   u'attributes_ver': u'9bb58f26192e4ba00f01e2e7b136bbd8',
   u'file_ver': u'cbe3da041b769ef292a3441ea9c5a205',
   u'indexes_ver': u'99914b932bd37a50b983c5e7c90ae93b'}},
 u'error': None,
 u'status': 200}

In [15]: mpdata = MultipartEncoder(fields={'1': 'whee'})
In [16]: mpheaders['Content-Type'] = mpdata.content_type
In [17]: res = requests.put('http://localhost:40317/node/' + j['data']['id'], headers=mpheaders, data=mpdata)

In [18]: res.text
Out[18]: u'{"status":400,"data":null,"error":["err@node_ParseMultipartForm: invalid param: 1"]}'

Based on the docs, 1 is definitely a valid multipart form parameter, so I'm not quite sure what I'm doing wrong. Do you have any idea?

The value of the numeric parameter needs to be the file part to upload, sent as a file field rather than as a plain text value. The error message doesn't make that clear. The relevant documentation is here:

with file upload in N parts where N is unknown at node creation time (part uploads may be done in parallel and out of order)
curl -X POST -F "parts=unknown" -F "file_name=<file_name>" http://<host>[:<port>]/node
curl -X PUT -F "1=@<file_part_1>" http://<host>[:<port>]/node/<node_id>
curl -X PUT -F "2=@<file_part_2>" http://<host>[:<port>]/node/<node_id>
...
curl -X PUT -F "parts=close" http://<host>[:<port>]/node/<node_id>
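
For anyone wondering what the `@` in the curl commands above changes: in multipart/form-data, a file field carries a `filename` in its Content-Disposition header, while a plain field does not. The sketch below renders both forms by hand so the difference is visible. It is a simplified, standard-library-only illustration: `render_part` and the boundary `XYZ` are made-up names, and that Shock tells the two cases apart by the presence of a filename is an assumption based on the error message, not something confirmed in this thread.

```python
def render_part(boundary, name, value, filename=None):
    """Render one multipart/form-data part as bytes (simplified, no escaping)."""
    disposition = 'form-data; name="%s"' % name
    if filename is not None:
        # A filename is what marks the part as a file upload.
        disposition += '; filename="%s"' % filename
    lines = [
        b'--' + boundary.encode('ascii'),
        b'Content-Disposition: ' + disposition.encode('utf-8'),
    ]
    if filename is not None:
        lines.append(b'Content-Type: application/octet-stream')
    # A blank line separates the part headers from the part body.
    lines += [b'', value]
    return b'\r\n'.join(lines) + b'\r\n'


# Roughly what -F "1=whee" sends: a plain field, no filename.
text_field = render_part('XYZ', '1', b'whee')

# Roughly what -F "1=@fn" sends: a file field with a filename.
file_field = render_part('XYZ', '1', b'whee', filename='fn')

print(text_field.decode('ascii'))
print(file_field.decode('ascii'))
```

The first form corresponds to the request that was rejected with "invalid param: 1"; the second is the shape the numeric parameter needs.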

Thanks. For future reference for others that may have the same question, here's the magic:

In [5]: import requests
In [6]: from requests_toolbelt.multipart.encoder import MultipartEncoder

In [7]: headers = {'Authorization': 'OAuth ' + token}

In [8]: mpdata = MultipartEncoder(fields={'attributes_str': '{"foo": "bar"}', 'parts': 'unknown'})
In [9]: mpheaders = dict(headers)
In [10]: mpheaders['Content-Type'] = mpdata.content_type
In [11]: res = requests.post('https://ci.kbase.us/services/shock-api/node/', headers=mpheaders, data=mpdata)

In [12]: j = res.json()
In [13]: j
Out[13]: 
{u'data': {u'attributes': {u'foo': u'bar'},
  u'created_on': u'2016-07-06T14:39:34.880535369-07:00',
  u'file': {u'checksum': {},
   u'format': u'',
   u'name': u'',
   u'size': 0,
   u'virtual': False,
   u'virtual_parts': None},
  u'id': u'f87442dc-2e68-4f87-be12-476749285c9c',
  u'indexes': {},
  u'last_modified': u'2016-07-06T14:39:34.890962385-07:00',
  u'linkages': None,
  u'tags': None,
  u'type': u'parts',
  u'version': u'2c16be6b7e2ab257f11b67be425dd13a'},
 u'error': None,
 u'status': 200}

In [14]: mpdata = MultipartEncoder(fields={'1': ('fn', b'whooptywhoop')})
In [15]: mpheaders['Content-Type'] = mpdata.content_type
In [16]: res = requests.put('https://ci.kbase.us/services/shock-api/node/' + j['data']['id'], headers=mpheaders, data=mpdata)

In [19]: mpdata = MultipartEncoder(fields={'parts': 'close'})
In [20]: mpheaders['Content-Type'] = mpdata.content_type
In [21]: res = requests.put('https://ci.kbase.us/services/shock-api/node/' + j['data']['id'], headers=mpheaders, data=mpdata)

In [24]: requests.get('https://ci.kbase.us/services/shock-api/node/' + j['data']['id'] + '/?download', headers=headers).text
Out[24]: u'whooptywhoop'
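
The session above uploads a single part. Below is a rough sketch of how the same three steps (create with `parts=unknown`, PUT each numbered part, PUT `parts=close`) might be wrapped for a file split into fixed-size chunks. Everything named here is hypothetical: `iter_parts`, `upload_in_parts`, and the `partN` filenames are made up for illustration. It also swaps `MultipartEncoder` for the `files=` argument of requests, where a `(None, value)` tuple produces a plain multipart field and a `(filename, bytes)` tuple produces a file field. I have not run this against a Shock server, so treat it as a starting point rather than a tested client.

```python
import json


def iter_parts(fileobj, chunk_size):
    """Yield (part_number, chunk) pairs, numbering parts from 1."""
    number = 1
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield number, chunk
        number += 1


def upload_in_parts(session, base_url, fileobj, chunk_size,
                    attributes=None, headers=None):
    """Create a parts node, upload each chunk, close the node, return its id.

    `session` is anything with requests-style post()/put() methods,
    for example requests.Session(). `headers` should not set Content-Type;
    requests fills in the multipart boundary itself when `files` is given.
    """
    node_url = base_url.rstrip('/') + '/node'

    # (None, value) tuples are sent as plain multipart fields (no filename).
    fields = {'parts': (None, 'unknown')}
    if attributes is not None:
        fields['attributes_str'] = (None, json.dumps(attributes))
    res = session.post(node_url, headers=headers, files=fields)
    res.raise_for_status()
    node_id = res.json()['data']['id']

    part_url = node_url + '/' + node_id
    for number, chunk in iter_parts(fileobj, chunk_size):
        # (filename, bytes) makes this a file field, as the numeric
        # parameter requires.
        part = {str(number): ('part%d' % number, chunk)}
        res = session.put(part_url, headers=headers, files=part)
        res.raise_for_status()

    # Tell the server that no more parts are coming.
    res = session.put(part_url, headers=headers,
                      files={'parts': (None, 'close')})
    res.raise_for_status()
    return node_id
```

With the real library this would be called along the lines of `upload_in_parts(requests.Session(), 'http://localhost:40317', open(path, 'rb'), 64 * 1024 * 1024, headers={'Authorization': 'OAuth ' + token})`. Taking the session as an argument also makes the function easy to exercise with a stub in tests.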

Thanks. I've saved this as a documentation issue.