Download: HTTP Range Request Not Supported
iwinux opened this issue · 4 comments
Hi,
I've been trying to download the 83GB TSV dataset. The connection keeps getting interrupted, and each time I have to start over because the server always responds with Content-Range: bytes 0-89432430895/89432430896.
Is it possible to fix this or is there any alternative way to fetch this dataset?
No Accept-Ranges: bytes header is shown in the response to a HEAD request:
$ curl -I 'https://datasets.clickhouse.com/github_events/tsv/github_events_v2.tsv.xz'
HTTP/2 200
date: Tue, 13 Dec 2022 03:12:21 GMT
content-type: text/tab-separated-values
content-length: 89432430896
x-amz-id-2: G6Yi4dq3k83WF2oziDrxLZkhMHCDZ+80h0XoxdhYsCJCFq284b2y9jbVcYI9QOGbTEbC2qbd8rQ=
x-amz-request-id: V2XBQ3HC5QHKTTAB
last-modified: Mon, 07 Feb 2022 02:06:46 GMT
etag: "e5d93b8c838cfdd9a2a1010680d6a942-5331"
cache-control: max-age=31536000
cf-cache-status: MISS
strict-transport-security: max-age=0; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 778b84842b8f0cf3-LAX
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400
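The manual check above can be scripted. The following is a minimal sketch, not something from the thread: `supports_ranges` is a hypothetical helper name, and it only looks for the `Accept-Ranges: bytes` header. That header is optional in HTTP, so its absence is a strong hint rather than proof that a server ignores Range requests.

```shell
# Hypothetical helper (not part of the original report): reads HTTP response
# headers on stdin and succeeds only if the server advertises byte ranges.
supports_ranges() {
  # Strip carriage returns, then look for the header case-insensitively.
  # Output is discarded; only the exit status matters.
  tr -d '\r' | grep -i -E '^accept-ranges:[[:space:]]*bytes[[:space:]]*$' >/dev/null
}

# Usage (performs a network request, so it is left commented out):
#   url='https://datasets.clickhouse.com/github_events/tsv/github_events_v2.tsv.xz'
#   if curl -sI "$url" | supports_ranges; then
#     echo "server advertises range support"
#   else
#     echo "no Accept-Ranges: bytes header"
#   fi
```

Fed the headers shown above, the helper would return a non-zero status, which matches the behavior described in this issue.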
This is a limitation of Cloudflare, which we use to proxy these links.
It is the same kind of restriction that prevents its use for video hosting on free accounts.
Here is another link:
https://clickhouse-public-datasets.s3.amazonaws.com/github_events/tsv/github_events_v2.tsv.xz
You can use it instead.
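For anyone else hitting interrupted downloads, here is a rough sketch of a resumable fetch against the direct S3 link. It assumes the S3 endpoint honors Range requests, which the follow-up comment in this thread suggests. `resume_download`, the default of 20 attempts, and the 5-second pause are illustrative choices, not anything prescribed by the maintainers.

```shell
# Hypothetical helper (not from the thread): keep resuming a download until
# curl succeeds or the attempt limit is reached.
resume_download() {
  url=$1
  max_attempts=${2:-20}   # assumed default, adjust as needed
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # -C - : continue from the size of the existing local file
    # -f   : treat HTTP error responses as failures
    # -L   : follow redirects
    # -O   : save under the remote file name in the current directory
    if curl -C - -f -L -O "$url"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 5   # short pause before the next attempt
  done
  return 1
}

# Usage (starts a very large download, so it is left commented out):
#   resume_download 'https://clickhouse-public-datasets.s3.amazonaws.com/github_events/tsv/github_events_v2.tsv.xz'
```

The attempt limit is there so the loop cannot spin forever if the server starts rejecting the request, for example when the local file is already complete.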
Thank you! The alternative link is working.
I have uploaded the updated dataset:
https://clickhouse-public-datasets.s3.amazonaws.com/github_events/tsv/github_events_v3.tsv.xz
Good for analysis.
I would be interested to hear about your research if there is something to share.