bitrich-info/xchange-stream

[bitfinex] Corrupted orderbooks

Opened this issue · 11 comments

I am using most recent version from develop, today I've again ran into issues with corrupted orderbooks mentioned in #269. I can see clearly in my logs that top value in orderbook was stuck there for several hours, while orders were constantly changing.

I've tried to enable websocket sequencing, result is here: https://github.com/davidjirovec/xchange-stream/tree/sequencing-test - just run BitfinexManualExample.

Logs like this start appearing:

[153,[4014,2,0.28261943],593]
[153,[4008.3,2,2.40802452],594]
[153,[4008.2,2,2.3670261],595]
[153,[4054.4,2,-2.52],596]
[153,[4033.3,0,-1],597]
[153,[4033.2,1,-0.49633561],598]
[153,[4029.8,2,-1.72184484],599]
[153,[4008.2,0,1],600]
[153,[4025.8,1,0.3866601],601]
[153,[4025.6,1,0.2],602]
[153,[4024.8,0,1],603]
[153,[4008.2,2,2.3670261],604]
[153,[4029.8,3,-2.22184484],606]
Wrong sequence 606 vs 605
[153,[4047.9,2,-0.62546047],607]
Wrong sequence 607 vs 606
[153,[4008.2,0,1],608]
Wrong sequence 608 vs 607
[153,[4024.8,1,0.39387914],609]
Wrong sequence 609 vs 608

It seems message 605 got lost somehow.

Maybe I am doing something wrong, but compared to official nodejs sample https://github.com/bitfinexcom/bitfinex-api-node/blob/bf00a353d90578797a23fac996eed5e9fee81b91/examples/ws2/sequencing.js I don't see anything important.

Some time ago I've also implemented orderbook checksums, these sometimes fail too, but it usually takes days to happen and I prefer not to use it to make my bot fast as possible.

Any ideas?

Most exchanges don't guarantee that there won't be gaps in websocket feeds (binance certainly doesn't and neither does Coinbase Pro). The binance order book implementation does what they (and CBP) recommend, which is to detect gaps and rebase from the Rest API. Not sure if this definitely applies to Bitfinex's order books.

fwiw, I am planning to implement guaranteed safety on the new authenticated channels (see my open issue) by applying whatever exchange specific processing is required to do so. This means making rest calls, which means we will want another ExchangeSpecification parameter to allow rate limiting of these, as we do for socket reconnections.

Maybe this needs the same approach.

Bitfinex does not mention any way to handle missing numbers in sequence. Actually they mention just flag that turns sequences on and information that it is beta feature :)

I guess here really could be useful the official api implementation in nodejs, which does not have issues with missing sequence numbers and exactly this line https://github.com/bitfinexcom/bitfinex-api-node/blob/b597c4d615879620de5552af7a6db08d663d7fdc/lib/transports/ws2.js#L335 says sequence numbers should be increasing by one.

To me it seems it could be some concurrency issue, I guess before entering rx and exchange specific orderbook handling, which could mean it is in common code for more exchanges.

Maybe @pchertalev could be also interested in this?

There's another issue open on here about concurrency issues with Orderbook. That class is from XChange which has no reason to provide thread safety, and it's mutable (has to be to provide the update capability). That'd be the first place I'd look if it is indeed a concurrency issue.

I also care about this. I may have noticed it once or twice but I currently only use order books for visual information rather than driving anything off them so hadn't paid it much attention.

Happy to help if I can.

Sure, thanks. Any ideas can help.

Well, this is core of my test case: develop...davidjirovec:sequencing-test#diff-24429d40711776b7bfbbf1d0a497fd6eR344

Just checking the sequences on NettyStreamingService level, so thread unsafe Orderbook should not be the cause. I just hope I am not creating any concurrency issue myself this way :)

If handleMessage is called concurrently, you do indeed have a race condition. Message 2 might hit getAndIncrement before message 1. However, if handleMessage is called concurrently, that's a problem in itself. It shouldn't be. If it is, there's nothing you can do to enforce ordering. Even if you made the method synchronized you'd still get the same problem.

Maybe put another atomic counter on the method to check if you're getting concurrent calls?

Oh, sorry, my bad. Message was not lost. It was hb message, which was consumed by NettyStreamingService's child - BitfinexStreamingService.

It seems handleMessage should not be called concurrently https://github.com/netty/netty/wiki/New-and-noteworthy-in-4.0#well-defined-thread-model so this should be safe.

Still, there is definitely something wrong - parts from yesterday's log: https://gist.github.com/davidjirovec/207ef0c5882105cdfa020cff39d899e0
still same order stuck on the top of orderbook, but my order was never able to buy it.

I guess these stuck orders happen deeper in orderbooks, but there it might not be noticeable unless price moves so that the stuck order would get to the top of orderbook.

Maybe try a test that randomly adds and removes orders from the book and validates the end position? Run that long enough and it will recreate any bug. Record the orders as it goes so you'll be able to replay.

Or just log your random seed. Both work.

@davidjirovec, I came across this which may be of interest: https://medium.com/bitfinex/bitfinex-api-order-books-checksums-5a39628c8b94 and this discussion: Crypto-toolbox/btfxwss#12

I'm trying to work out how I can tell whether I have missed a balance or order update for #274 and while it looks like there's a good solution for order books, there's no good solution for the authenticated streams. Surprising and frustrating.

As I've mentioned in #269, I've already had the checksums implemented, but it was very ugly and slow. But after this issue, I've done a few improvements so it is not so ugly and it is definitely much faster https://github.com/davidjirovec/xchange-stream/blob/7f20afbbce8854a3910f1772b1b5b405dc5cf5f4/xchange-bitfinex/src/main/java/info/bitrich/xchangestream/bitfinex/BitfinexStreamingMarketDataService.java#L78

I've also tried to solve possible concurrency hazards I've found. I am running the code for a while now and I get a few checksum mismatches a week reported. Not sure if it is problem with my code or Bitfinex, but better to have false alarms than trade with wrong data :)

I would not recommend to merge the code yet, it definitely could use some refactoring.

Interesting note: I've mosty recycled the official javascript example for the checksum calculation. I had big issue to correctly format my java BigDecimals same way as javascript would do it. None of BigDecimal's toString methods gives the same results. If you know about any formatter which would give the same result, great ;) Finally I've used 🎺 Nashorn 🎺. Initial experiments were painfully slow, but my current version does not seem to noticeably slow the app down. And to make sure it is really fast I calculate the checksums only once a minute and hope there will not be much harm done in the meantime.

About the ordering you've mentioned: I guess websocket sequencing - exactly the one I was trying in the initial post in this issue - should be working for all Bitfinex ws streams, not only orderbooks.

was this issue ever resolved?