ekalinin/nodeenv

download_node_src does not properly handle multipart downloads

kcdodd opened this issue · 8 comments

The method download_node_src fails if the download doesn't complete in a single part. In my case this led to a seemingly unrelated error, AttributeError: 'bytes' object has no attribute 'tell', deep in the tarfile module. The exception handler,

    try:
        dl_contents = io.BytesIO(urlopen(node_url).read())
    except IncompleteRead as e:
        logger.warning('Incomplete read while reading'
                       'from {}'.format(node_url))
        dl_contents = e.partial

assigned a bytes object to dl_contents instead of a BytesIO. However, updating the exception handler still did not work, because "partial" really does mean partial: it is not the complete file, so there is no way to use it. Simply calling read() again also does not appear to be the way to handle multipart downloads.
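To illustrate the type mismatch described above, here is a minimal sketch. The `partial` value is a made-up stand-in for `e.partial`, and `try_open` is a hypothetical helper, not part of nodeenv. It assumes Python 3 and that the archive is opened with `tarfile.open(fileobj=...)` in the default read mode.

```python
import io
import tarfile

# Stand-in for e.partial: some bytes that are not a complete archive.
partial = b"not a complete archive"


def try_open(fileobj):
    """Return the name of the exception tarfile.open raises, or None on success."""
    try:
        tarfile.open(fileobj=fileobj)
    except Exception as exc:
        return type(exc).__name__
    return None


# Raw bytes have no tell(), so tarfile fails before it even inspects the data.
bytes_error = try_open(partial)

# Wrapping in BytesIO provides tell()/read()/seek(), but the data is still incomplete.
wrapped_error = try_open(io.BytesIO(partial))

print(bytes_error, wrapped_error)
```

Wrapping the partial data in a BytesIO removes the AttributeError, but opening the archive still fails because the content is truncated.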

I got this to work by using requests, which appears to handle this properly:

    import requests
    dl_contents = io.BytesIO(requests.get(node_url).content)

We are being affected by this issue as well. Are the maintainers OK with switching from urllib to requests? An alternative may be to make multiple attempts to download the file in case of IncompleteRead errors.
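The retry alternative could look roughly like the sketch below. This is not nodeenv's code: `download_with_retries`, its parameters, and the injectable `opener` argument (added only so the function can be exercised without network access) are all hypothetical, and it assumes Python 3, where the object returned by `urlopen` can be used as a context manager.

```python
import io
import time
from http.client import IncompleteRead
from urllib.request import urlopen


def download_with_retries(url, attempts=3, delay=1.0, opener=urlopen):
    """Return the full response body as a BytesIO, retrying on IncompleteRead.

    The partial data from a failed attempt is discarded; each retry
    starts the download again from the beginning.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as response:
                return io.BytesIO(response.read())
        except IncompleteRead as exc:
            last_error = exc
            # Wait briefly before the next attempt, but not after the last one.
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: surface the last IncompleteRead to the caller.
    raise last_error
```

A call such as `dl_contents = download_with_retries(node_url)` would then either yield a complete BytesIO or raise, rather than passing truncated data on to tarfile. Whether a few retries are enough depends on how persistent the underlying network problem is.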

For future ref - potentially related issue here

fruch commented

We are getting hit by the same issue, inside pre-commit.

Switching to requests sounds reasonable to me.

hynek commented

Is there anything that makes requests preferable over urllib3 (that requests depends on)?

fruch commented

Seems like #329 wasn't enough to fix this issue.

We are still getting hit by it from time to time:
scylladb/scylla-cluster-tests#6559 (comment)

@hynek if requests handles downloading a multipart file out of the box better than urllib3, that would be a good enough reason if it were my code.

I could not reproduce the issue consistently, so I can't confirm that it is related to multipart downloads; I would expect that to be easily reproducible. So it's unclear whether requests would actually fix it. I believe network glitches are causing this, as explained here

FYI nodejs/build#1993
Issue is closed but it's still receiving comments

jaklan commented

We are also affected by this when using node hooks in pre-commit.

Maybe there is a different place the node binary can be retrieved from? Mirrors or something like that?