Transactions still dropping (intermittent)

Question

Transactions still dropping (intermittent)

Closed this issue 5 years ago · 31 comments

Tx sent to node via API spend, tx id is generated. Tx lost /not included in a block.

Answer 1 · 2018-08-08T07:10:41.000Z

We had issues where broadcasting Amoveo transactions from our hot wallet, one immediately succeeded by the next one, resulted in only the last one being registered by the node.

All the transactions were responded with 200 status, but they were not included in the next blocks.

This has happened 3 times.

A workaround of waiting for 30 seconds between the broadcasts solved the issue.

Answer 2 · 2018-08-08T21:56:18.000Z

I can verify that it is happening again every now and then, i pause around 2s between every spend.

Answer 3 · 2018-08-09T05:18:28.000Z

There was a bug in the "max spend" feature on the light node. I pushed a fix to github, but haven't updated my nodes yet. That should fix some cases of this problem.

Can I get more information about the other cases?
How badly are transaction filling up the mining pools? How many txs can fit in a block easily?

Answer 4 · 2018-08-09T07:14:28.000Z

As @hantuzun stated we came across this issue at amoveo.exchange, this wasn't an issue up until recently.
Whenever we broadcast transactions one after another without any time delays between broadcasts, even though we get a 200 status from the node for every single transaction, 1 in x transactions actually reach the mempool where x isn't a number that would even come close to filling up an entire block.

PS: we are not using "max spend"

Answer 5 · 2018-08-11T18:41:14.000Z

Yes, it seems this problem went into "remission" for some time, and has reared its ugly face again.

It reduces confidence in the network, exchanges, and mining pools, because users are unsure who to blame. It affects all of us, we should get this solved ASAP.

Answer 6 · 2018-08-11T18:46:01.000Z

So is this only happening when you try to broadcast more than 1 tx?
If we switch to using multi-tx, does that fix it?

Answer 7 · 2018-08-11T20:00:47.000Z

even if multi tx works, spends shouldn’t have this behavior,

Zack - do you have enough information to recreate bug on your end?

Answer 8 · 2018-08-11T20:02:54.000Z

I have not seen this bug happen at all.

Answer 9 · 2018-08-11T20:36:14.000Z

When running tests to try to recreate, it may be good to have some test nodes not run 'stock'. Some nodes on our network may be modified, and the network should be robust if it's someone's modified node causing the issue.

One example (but not the only example) is this number in potential_block.erl https://github.com/zack-bitcoin/amoveo/blob/master/apps/amoveo_core/src/consensus/chain/potential_block.erl#L7 it should be assumed that some block-winning nodes have changed that number, and we should test to see if the bug can be reproduced in testing when it is changed

Answer 10 · 2018-08-12T01:41:33.000Z

It seems to be affecting my first broadcasted txs if it happens.

The pool always pays from big to small yet if i didnt mine the block myself so far only the bottom of my list got included even tho the other ones were added and thus (hopefully) broadcasted first.

But ofc it doesn't happen all the time :( plus i switched to a more irregular payout style now so we might not be able to track this anymore if it only happens to pools...

Answer 11 · 2018-08-12T08:16:18.000Z

I second this bug be fixed first. I haven't been able to reproduce it constantly but I'll let you know if/when I do. Having this bug decreases trust in the system resulting in avoidance type behavior in sending txs.

This bug -> Less tx
This bug fixed -> More tx

Answer 12 · 2018-08-12T17:28:01.000Z

What percentage of txs are being lost? Does this only happen if you make a tx immediately before a block is found?

Answer 13 · 2018-08-16T11:08:45.000Z

In my experience, this bug happens whenever we broadcast transactions in batch. It's usually a batch of 4 to 6 transactions, and only the last one is being registered by the nodes.

Answer 14 · 2018-08-27T10:12:35.000Z

I have a 2s pause between my pool spends, they still do fail a few times per day.

On the other hand my node fails to spend anything 100%, not sure if it is related but it is something we can work with.

I tried to send VEO out via http://78.46.149.239:8080/wallet.html and it never happened, tried 3-4 times on different block heights but its perfectly in sync...

Answer 15 · 2018-09-04T01:13:31.000Z

Just wanted to chime in here with my experience from qTrade. This has given us plenty of headaches in our deposit sweeping and withdraw flows. I've encountered and manually fixed this problem 10-20 times since we added VEO. Symptoms are:

We generate several transactions in rapid succession spending from the same address, incrementing our own nonce counter (bulk processing 10 deposits at once and sweeping, for instance).
Submission to our own node, and several other public nodes via txs method succeeds.
In the next block a random subset of the sent transactions get included, and this frequently nonce-orphans some or all of the remaining transactions.
Our software now waits 2 blocks to ensure that the orphan-causing transaction has confirmed and re-initiates the sweep, effectively just re-sending the orphaned tx with a new higher nonce.

My best guess is the block assembler is randomly picking transactions and incidentally orphaning others in the process. I think this is why ethereum requires account nonce's to be consecutive, although there may be other ways to solve it. For now at least preferring the lowest known nonce for a given account would help us a lot, since I believe in most cases the pools have all our transactions in their mempool before block solve. Apologies if my assumptions are off, I haven't spend much time in the mining code.

I will spend some time trying to create a python replication script to demonstrate the problem.

Answer 16 · 2018-09-04T03:29:18.000Z

@icook
It is a bad idea to make several txs at the same time.
It is better to use a multi-tx to bundle all your payments into one tx.
This way the network can't get mixed up and only include some of your txs.

Answer 17 · 2018-09-04T05:43:45.000Z

tbh this is a poor workaround no solution to the problem...consdering frequent discord messagers even single txs get dropped

Answer 18 · 2018-09-04T15:25:28.000Z

you also save money on tx fees by grouping txs together with multi-tx.

Answer 19 · 2018-09-04T17:13:21.000Z

Unless you changed that later, no you dont.

Thats usually the main reason to use it and it was the main reason why i tested it on day 0, fee is very much the same.

Answer 20 · 2018-09-04T17:30:29.000Z

governance fee is the same.
miner fee is less.

multi-tx uses less bytes per payment, so you can fit more payments in a block.

Answer 21 · 2018-09-04T17:31:05.000Z

I made an api for generating multi-tx, there is no advantage in avoiding this nice feature.

Answer 22 · 2018-09-04T17:50:40.000Z

We tested it so we know the api...and you keep saying that its cheaper but it isnt
http://explorer.veopool.pw/?input=27399

Answer 23 · 2018-09-04T18:18:05.000Z

If you don't use multi-tx, then a malicious miner could purposefully only include your high-nonce txs, to make all your low-nonce txs invalid.
It is in your interest to use multi-tx every time you are publishing more than one tx during a single block period.

@Sykh you are the one who decides how expensive miner fees are for including txs in your mining pool's blocks.
By default I think it is 1000 satoshis.

Answer 24 · 2018-09-04T18:19:02.000Z

I heard that in ethereum all the nonces have to come sequentially, this it is impossible for this kind of error to happen.
Maybe we should do a hard update for this type of behavior.

Answer 25 · 2018-09-04T21:31:15.000Z

A malicous node / pool (not the miner) can drop your multi tx aswell, there is no connection here.

Stop dancing around the issue pls

1. The node IS dropping txs, it doesnt matter if you think multi tx is better or what could happen from malicous sources, its a bug, it shouldnt happen
2. multi-tx isnt cheaper
3. i can decide jack and 1000 satoshi mining fee isnt really the issue when you charge over 60k for every multi-tx entry

Answer 26 · 2018-09-05T19:55:38.000Z

@zack-bitcoin We would like to take advantage of multi-tx, but the implementation work required for us to use it is a really bad time/value trade off. Fundamentally operations between deposits and withdraws are isolated, and there's no good way to integrate them easily. You're correct that it would help alleviate this issue some, but it is impractical for us to implement.

We're also encountering this on the user side. A withdraw from amoveo.exchange got orphaned, and several of our users deposit transactions have as well, causing support headaches since this is confusing to users (they don't know why it failed, and think it's something broken on our end). I think it's quite common for users to send say, 10% of their VEO to each exchange. They rapidly send two deposits, but only one ever gets mined.

To say that to guarantee delivery of transactions you should only every send one per block is unworkably restrictive.

To shed some more light on our implementation, for sending transactions we currently:

Push the signed transaction to our local node, all public mining pools, and Zack's nodes.
Every 60 seconds we scan for a new block, and if it's not confirmed we re-push the transaction.

This means the average transaction we send is being pushed to multiple public nodes 10-30 times before its confirmed. It's honestly been on my todo list to scale it back because this is a bit more spammy than I'd like, but even still we get orphaned transactions. With bitcoin and similar currencies I've only ever needed to push it to a single node.

If you do decide to fork to consecutive nonces-only please try to give us at least 1 week to adapt our integration. There are certain changes we'll need to make and test, and they take time.

Answer 27 · 2018-09-05T20:45:54.000Z

Just put a bit more work into my reproduction script, figured I'd share the findings:

Sending 10 consecutive transactions (nonce 1-10) from a single node, then manually triggering tx exchange and mining a single block appears to work correctly every time, all 10 are mined in the next block.
Sending those same transactions in reverse order results in an empty mempool, and only the very lowest nonce transaction is mined into the next block.
Sending the same transactions in random order results in a subset of them being accepted.

Sending transactions in nonce order [6, 8, 4, 7, 2, 3, 1, 5, 9, 10]
Sending to node 0
Sending to node 1
Sending to node 2
All mempools ======
Node 0; [1, 5] nonces
Node 1; [] nonces
Node 2; [] nonces
2018-09-05 15:28:06,240 [node_0         ] [INFO ] Mined block, height at 11, balance at 110.02175686
2018-09-05 15:28:06,389 [network        ] [INFO ] Nodes consensus state is [10, 11], full blocks
2018-09-05 15:28:06,557 [network        ] [INFO ] Nodes consensus state is [11], full blocks
All mempools ======
Node 0; [] nonces
Node 1; [] nonces
Node 2; [] nonces

A two second pause between pushing sending txs doesn't seem to help, there's still only a subset being accepted when not pushed in ascending order.

My hunch is that sorting mempools in nonce order before sending in https://github.com/zack-bitcoin/amoveo/blob/master/apps/amoveo_core/src/consensus/sync.erl#L347 may reduce dropped transactions. You may also be able to create a nonce grace range, where nonces of up to account_nonce + 10 will be retained in the mempool.

Just some thoughts, I'm still exploring root cause.

Answer 28 · 2018-09-06T00:39:22.000Z

The nonces are already being sent in order.

Answer 29 · 2018-09-06T00:42:26.000Z

Sharing txs too fast could cause this problem.
If you share all your txs with Bob, up to tx n-1.
Then you make a new tx N, and you rapidly send it to everyone.
Alice hasn't received your old txs from Bob, if she receives your new tx first in the rapid fire, then she will ignore all your old txs.

Answer 30 · 2018-12-30T22:40:58.000Z

This has been fixed right?
I haven't had a problem or had anyone complain about this in a long time.

Answer 31 · 2019-01-01T11:19:38.000Z

Not completly, it became much more rare tho.