support either nanosecond timestamp precision or composite primary keys
Opened this issue · 1 comments
Describe This Problem
my use case is trade data analytics for signal or strategy indicators. a trade has:
- timestamp: the time (in nanosecond precision) of the execution of the trade
- base: the tag/symbol of the commodity or security that was bought or sold
- quote: the tag/symbol of the currency, commodity or security that the price of
base
is given in - exchange: the tag/symbol of the exchange where the trade was executed
- id: the (uint64) id used by the exchange to uniquely identify the trade. due to the volume of trades, most exchanges use a composite key of (id, base, quote), so there are always collisions if
base
,quote
,exchange
andid
are not considered together - price: the price (in
quote
) for 1 unit ofbase
- size: the quantity of
base
in the trade. a positivesize
indicates a buy. a negativesize
indicates a sell
i am struggling to document a table definition in horaedb that does not lose trades. since there are many trades which occur within the same millisecond. nanosecond precision appears to be unavailable.
i have attempted to work around the millisecond precision limitation by using a composite primary key (time, exchange, quote, id). each base gets it's own table which allows for multi-exchange series and combining quotes where their underlying values are equivalent (ie: usdt == usdc). however it appears that horaedb accepts the definition but ignores any value in the composite key that is not the timestamp. this results in all but the first trade in a given millisecond being discarded as a duplicate.
Proposal
i would appreciate ideas about how to utilise nanosecond timestamp precision or how to correctly define a composite key that will actually work.
Additional Context
here's the table definition i have tried:
CREATE TABLE IF NOT EXISTS {base} (
time timestamp NOT NULL,
id uint64,
exchange string,
quote string,
price double,
size double,
TIMESTAMP KEY(time),
PRIMARY KEY(time, id, exchange, quote)
) ENGINE=Analytic with (enable_ttl='false')
here's an example of source data where base
is btc, quote
is usd and exchange
is coinbase:
https://api.exchange.coinbase.com/products/BTC-USD/trades?limit=10&after=1000
Hi, nanoseconds is not supported in horaedb now.
For your case, I think you could add another column to bypass this limit,
CREATE TABLE IF NOT EXISTS {base} (
time timestamp NOT NULL,
time2 uint64 NOT NULL,
id uint64,
exchange string,
quote string,
price double,
size double,
TIMESTAMP KEY(time),
PRIMARY KEY(time, time2, id, exchange, quote)
) ENGINE=Analytic with (enable_ttl='false')
Here we add another column time2
to primary keys, and it's defined as time(nanoseconds) % 1e6
, so rows with same time
value won't be overwritten.