Crypto-toolbox/HFT-Orderbook

Understanding how to query the book efficiently

kjgd opened this issue · 3 comments

kjgd commented

Let's consider the following book:

import lob as LOB
lob = LOB.LimitOrderBook()
orders = [
    LOB.Order(uid=1, is_bid=True, size=5, price=100),
    LOB.Order(uid=2, is_bid=True, size=5, price=95),
    LOB.Order(uid=3, is_bid=True, size=5, price=90),
    LOB.Order(uid=4, is_bid=False, size=5, price=200),
    LOB.Order(uid=5, is_bid=False, size=5, price=205),
    LOB.Order(uid=6, is_bid=False, size=5, price=210),
    ]
for order in orders:
     lob.process(order)

I'm looking at the limit level dictionary:
lob._price_levels
Since no limit level is aware of whether it is a bid or an ask level, it looks like determining this means querying:
lob._price_levels[100].orders.head.is_bid
(for an order book containing a price level of 100)

And therefore the obvious way to separate out bids and asks is:

bids = []
asks = []
for k, level in lob._price_levels.items():
    if level.orders.head.is_bid:
        bids.append(level)
    else:
        asks.append(level)

Now, since the query function should also be O(1), I would like to avoid iterating through every price level in the book where this is not necessary -- a book may have 10000 levels but only the first 20 may be relevant.

One way to do this would be:

levels_sorted = sorted(lob._price_levels.keys())
indices = {}
indices['bid'] = list(reversed(range(0, levels_sorted.index(lob.best_bid.price) + 1)))
indices['ask'] = list(range(levels_sorted.index(lob.best_ask.price), len(levels_sorted)))

def top_N_levels(n):
    levels = {}
    for side in ['bid', 'ask']:
        levels[side] = []
        for x in range(n):
            index = indices[side][x]
            level_price = levels_sorted[index]
            level = lob._price_levels[level_price]
            levels[side].append(level)
    return levels

top_N_levels(2)

It seems like there must be a better way of doing it...

Another option would involve starting at the top of the lob.bids or lob.asks price level tree, and iterate down it only as far as necessary.

Unfortunately, we are dealing with a binary tree, so trying to find the "next lowest" or "next highest" price might mean going all the way down to a leaf node... hardly efficient either...

Hey @kjgd ,
sorry for the delayed answer, the holidays are a busy time at my house :)

Just to clarify - you want to query the top X price levels on the bid and ask sides, right?

try this:

levels_sorted = sorted(lob._price_levels.keys())
bids = [price_level for price_level in levels_sorted if price_level < lob.best_ask]
asks = [price_level for price_level in levels_sorted if price_level > lob.best_bid]

That gives you the order book's limit levels as sorted dict keys.

Unfortunately, we are dealing with a binary tree, so trying to find the "next lowest" or "next highest" price might mean going all the way down to a leaf node... hardly efficient either...

this Implementation is meant to keep track of an order book in an efficient way - not display it efficiently. You're therefore correct that it doesn't do that very well.

kjgd commented

Hey @nlsdfnbch no worries :)

Yes, exactly. Your way is considerably neater, thanks.

this Implementation is meant to keep track of an order book in an efficient way - not display it efficiently. You're therefore correct that it doesn't do that very well.

I see. It seems an unusual design choice... how is the implementation meant to be used, then?

It was initially meant as the core component of a trading engine. The main goal was to be able to quickly determine top levels and execute trades. We never really thought about visualizing the book.