BlockchainCommons/spotbit

SpotBit API returns massively redundant historical data

wolfmcnally opened this issue · 1 comment

NOTE: As of yesterday, the URLs below either worked or failed as described. When I rechecked them today, several returned no data at all.

Here is a URL that queries for a 90-second interval near the start of 2022: http://h6zwwkcivy2hjys6xpinlnz2f74dsmvltzsd4xb42vinhlcaoe7fdeqd.onion/hist/USD/bitstamp/1640995440000/1640995530000
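For anyone reproducing this, here is a minimal sketch of how the start and end path segments map to UTC times. The path layout `/hist/<currency>/<exchange>/<start_ms>/<end_ms>` is read off the URL above; the helper names `to_ms` and `hist_url` are my own and not part of SpotBit.

```python
from datetime import datetime, timedelta, timezone

# Onion host copied from the URL in this issue.
HOST = "http://h6zwwkcivy2hjys6xpinlnz2f74dsmvltzsd4xb42vinhlcaoe7fdeqd.onion"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ms(dt: datetime) -> int:
    """Whole milliseconds since the Unix epoch for a timezone-aware datetime."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def hist_url(currency: str, exchange: str, start: datetime, end: datetime) -> str:
    """Build a /hist query URL (hypothetical helper; layout inferred from the URL above)."""
    return f"{HOST}/hist/{currency}/{exchange}/{to_ms(start)}/{to_ms(end)}"


# The 90-second window used in this report: 00:04:00 to 00:05:30 UTC on 2022-01-01.
start = datetime(2022, 1, 1, 0, 4, 0, tzinfo=timezone.utc)
end = datetime(2022, 1, 1, 0, 5, 30, tzinfo=timezone.utc)
url = hist_url("USD", "bitstamp", start, end)
```

The resulting `url` should match the Bitstamp URL quoted above, which confirms the timestamps are milliseconds and the window is 90 seconds wide.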

The call returns 28 records, all of which are identical except for the record ID (the first column), which tells me that they are unintentional duplicates.

image

Changing to a different exchange, Kraken, with the same time range, we get 20 records with a completely different range of record IDs but otherwise entirely identical: http://h6zwwkcivy2hjys6xpinlnz2f74dsmvltzsd4xb42vinhlcaoe7fdeqd.onion/hist/USD/kraken/1640995440000/1640995530000

image

This explains why I'm getting so much data back (when I get any at all): the vast majority of it is redundant.

When I query for 10 minutes of data, I get back 252 records. They generally fall within the requested 10-minute period, but they are not sorted by time: http://h6zwwkcivy2hjys6xpinlnz2f74dsmvltzsd4xb42vinhlcaoe7fdeqd.onion/hist/USD/bitstamp/1640995440000/1640996040000

When I deduplicate the data, I arrive at 9 distinct candles that fall within the requested time range:

image
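The deduplication step can be sketched roughly as follows. This assumes each record is a list whose first element is the record ID (as observed above) and that the timestamp sits at index 1; that index is an assumption, so it is a parameter. The function name `dedupe_rows` and the sample rows are invented for illustration and are not real API output.

```python
def dedupe_rows(rows, ts_index=1):
    """Drop rows that differ only in the record ID (column 0), then sort by time.

    ts_index is an assumed position of the timestamp column; adjust as needed.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[1:])  # compare everything except the record ID
        if key not in seen:
            seen.add(key)
            unique.append(row)
    # The API does not return rows in time order, so sort after deduplicating.
    unique.sort(key=lambda r: r[ts_index])
    return unique


# Made-up rows: [id, timestamp_ms, price]; ids 1 and 2 duplicate each other.
sample = [
    [3, 2000, 10.5],
    [1, 1000, 10.0],
    [2, 1000, 10.0],
    [4, 2000, 10.5],
]
distinct = dedupe_rows(sample)
```

The first occurrence of each duplicate group is kept, so `distinct` retains one row per distinct candle in ascending time order.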

Nine candles for 10 minutes seems reasonable, but that is still a lot of data if I want to produce charts that cover longer periods.

When I combine those candles, I get one aggregate candle for the 10-minute period:

{"close":46321.54,"end":"2022-01-01T00:13:00Z","high":46500,"low":46269.76,"open":46305.36,"start":"2022-01-01T00:05:00Z","volume":9.549817579999997952}

Fixed in master.