DB::flush_count overflow
Opened this issue ยท 72 comments
I got a case where the DB::flush_count
went over 65535 and a full database resync was needed.
Normally the flush_count
would overflow after about 455 days in Bitcoin if we were indexing a fully synced node. However this is not the case if the node is not fully synced and ElectrumX manages to catch up the full node. In that scenario ElectrumX will quickly exhaust the 16bit space of the flush_count
variable as it flushes to disk after each block.
Also several coins have shorter block intervals thus hitting the limit in as little as a couple of months.
I was surprised it took this long for a complaint. I did some work on this over a month ago but it is incomplete; I will pick it up again soon.
Is that this problem?
May 27 17:07:49 bohol electrumx_server.py[1755]: ERROR:Controller:Traceback (most recent call last):
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/controller.py", line 157, in on_future_done
May 27 17:07:49 bohol electrumx_server.py[1755]: future.result()
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/block_processor.py", line 201, in main_loop
May 27 17:07:49 bohol electrumx_server.py[1755]: await task()
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/block_processor.py", line 216, in first_caught_up
May 27 17:07:49 bohol electrumx_server.py[1755]: await self.controller.run_in_executor(self.flush, True)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/controller.py", line 137, in run_in_executor
May 27 17:07:49 bohol electrumx_server.py[1755]: return await self.loop.run_in_executor(None, func, *args)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 55, in run
May 27 17:07:49 bohol electrumx_server.py[1755]: result = self.fn(*self.args, **self.kwargs)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/block_processor.py", line 366, in flush
May 27 17:07:49 bohol electrumx_server.py[1755]: self.flush_history(self.history)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/db.py", line 437, in flush_history
May 27 17:07:49 bohol electrumx_server.py[1755]: flush_id = pack('>H', self.flush_count)
May 27 17:07:49 bohol electrumx_server.py[1755]: struct.error: 'H' format requires 0 <= number <= 65535
Yes.
I'm not a programmer. Practically what is the problem?
There is a risk to lose the wallet?
ElectrumX is not a wallet. It's a public blockchain index. Your question doesn't make a lot of sense.
I'm sorry, I made a mistake!
I have confused Electrumx with "Electrum Bitcoin Wallet" (electrum.org)
@erasmospunk - can you try the compact-hist script from 2f26e81. Once I have a couple of reports it works I'll have it run in the background of a running server.
@kyuupichan I didn't keep the databases that had this issue but I run it on Dash that is near the limit:
INFO:root:Starting history compaction...
INFO:DB:switching current directory to /path/to/dash/electrumxdb
INFO:DB:using rocksdb for DB backend
INFO:DB:opened DB for serving
INFO:DB:software version: ElectrumX 1.0.11
INFO:DB:DB version: 5
INFO:DB:coin: Dash
INFO:DB:network: mainnet
INFO:DB:height: 691,997
INFO:DB:tip: 000000000000128ef39c8088a08b53674c1fe26750a85ff64a8e51ef643590a1
INFO:DB:tx count: 3,032,454
INFO:DB:flush count: 63,405
INFO:DB:deleted 1,547 stale undo entries
INFO:DB:reorg limit is 200 blocks
INFO:DB:history compaction: wrote 568,616 rows (8.0 MB), removed 1,172,466 rows, largest: 4, 8.1% complete
INFO:DB:history compaction: wrote 584,118 rows (8.0 MB), removed 1,198,997 rows, largest: 4, 16.5% complete
INFO:DB:hashX 06dcc60fe4d28abe8abd33 is large: 97,931 entries across 8 rows
INFO:DB:history compaction: wrote 523,761 rows (8.0 MB), removed 1,093,185 rows, largest: 8, 23.9% complete
INFO:DB:history compaction: wrote 574,894 rows (8.0 MB), removed 1,176,844 rows, largest: 8, 32.1% complete
INFO:DB:hashX 988d8582fa6f328c94e95a is large: 63,793 entries across 6 rows
INFO:DB:hashX 5987058f7f58d339d76e5c is large: 97,607 entries across 8 rows
INFO:DB:history compaction: wrote 518,220 rows (8.0 MB), removed 1,079,053 rows, largest: 8, 39.5% complete
INFO:DB:history compaction: wrote 574,615 rows (8.0 MB), removed 1,180,103 rows, largest: 8, 47.7% complete
INFO:DB:history compaction: wrote 582,429 rows (8.0 MB), removed 1,188,457 rows, largest: 8, 56.0% complete
INFO:DB:history compaction: wrote 571,884 rows (8.0 MB), removed 1,172,848 rows, largest: 8, 64.2% complete
INFO:DB:hashX 527250e83e6e3a7452f5a6 is large: 51,791 entries across 5 rows
INFO:DB:hashX 6705914577688925e4a8b0 is large: 58,488 entries across 5 rows
INFO:DB:history compaction: wrote 520,474 rows (8.0 MB), removed 1,122,776 rows, largest: 8, 71.6% complete
INFO:DB:history compaction: wrote 574,083 rows (8.0 MB), removed 1,188,666 rows, largest: 8, 79.8% complete
INFO:DB:hashX ad744e1ecb43f49811decc is large: 89,738 entries across 8 rows
INFO:DB:history compaction: wrote 552,658 rows (8.0 MB), removed 1,143,296 rows, largest: 8, 87.7% complete
INFO:DB:history compaction: wrote 552,613 rows (8.0 MB), removed 1,150,632 rows, largest: 8, 95.5% complete
INFO:DB:history compaction: wrote 313,102 rows (4.3 MB), removed 641,720 rows, largest: 8, 100.0% complete
INFO:root:History compaction complete
The db was 743M before compaction and 589M after that.
I am also running Digibyte that will hit the limit in about a week.
@kyuupichan update for Dash and Digibyte that exceeded the 65535 flush count, the compaction worked correctly.
Thanks! I'll get something done so it isn't a separate script.
@kyuupichan how do I run compact-hist ?
do I run
python3 compact_history.py
in the db directory ?
@zilveer yes, but remember to set the same environment variables as when you run electrumx normally
Hi kyuupichan,
Running the compact_history script on a regular basis takes time and effort.
Since Fujicoin's block interval is 1 minute, it needs to be executed once every 1.5 months.
Can you improve the program to run automatically?
Or can you handle it in the exectrumX code?
Thanks,
motty
Same issue for Groestlcoin.
We have to run the compact_history script every 1.5 months (blocktime is 1 min).
@erasmospunk I heared from Jackie that you are adding Groestlcoin into Coinomi. Since that will need electrumx, it might affect Coinomi also.
@kyuupichan is there any automated process for this now?
@erasmospunk @kyuupichan how do I set same environment variables as I have for ElectrumX?
I have the environment variables located at /etc/electrumx.conf ,
I guess I should run something else then only?
python3 /db/compact_history.py
any help is appreciated.
regards
@zilveer no, you need to run the script manually
I just hit this on my Litecoin electrumx. Running compact_history.py resolved the issue.
INFO:root:Starting history compaction...
INFO:DB:switching current directory to /coins/ex-ltc
INFO:DB:using rocksdb for DB backend
INFO:DB:opened DB for serving
INFO:DB:software version: ElectrumX 1.2.1
INFO:DB:DB version: 6
INFO:DB:coin: Litecoin
INFO:DB:network: mainnet
INFO:DB:height: 1,361,447
INFO:DB:tip: 02460c78e710b3c6ab23befaaf655d676ede661cde441791598cb71312f352ca
INFO:DB:tx count: 20,614,241
INFO:DB:flush count: 65,535
INFO:DB:reorg limit is 800 blocks
INFO:DB:history compaction: wrote 482,955 rows (8.0 MB), removed 987,267 rows, largest: 3, 1.9% complete
INFO:DB:hashX a072489792368675f65305 is large: 56,801 entries across 5 rows
INFO:DB:history compaction: wrote 464,907 rows (8.0 MB), removed 932,056 rows, largest: 5, 3.8% complete
INFO:DB:history compaction: wrote 454,804 rows (8.0 MB), removed 937,715 rows, largest: 5, 5.6% complete
INFO:DB:history compaction: wrote 450,735 rows (8.0 MB), removed 929,326 rows, largest: 5, 7.4% complete
INFO:DB:hashX 2e3de1a540181006544315 is large: 70,983 entries across 6 rows
INFO:DB:hashX b1cd2adefcfb5489fe9e17 is large: 72,320 entries across 6 rows
INFO:DB:history compaction: wrote 456,680 rows (8.0 MB), removed 922,512 rows, largest: 6, 9.3% complete
INFO:DB:hashX 70a2ffae2ff6d4aab8811b is large: 53,440 entries across 5 rows
INFO:DB:history compaction: wrote 446,602 rows (8.0 MB), removed 903,667 rows, largest: 6, 11.1% complete
INFO:DB:history compaction: wrote 464,804 rows (8.0 MB), removed 925,790 rows, largest: 6, 12.9% complete
INFO:DB:history compaction: wrote 462,887 rows (8.0 MB), removed 931,290 rows, largest: 6, 14.8% complete
INFO:DB:hashX 14a49acbbfb25d8085e227 is large: 70,388 entries across 6 rows
INFO:DB:history compaction: wrote 473,612 rows (8.0 MB), removed 950,408 rows, largest: 6, 16.7% complete
INFO:DB:hashX 76682bce5f8d911cacf82d is large: 58,591 entries across 5 rows
INFO:DB:history compaction: wrote 471,765 rows (8.0 MB), removed 957,865 rows, largest: 6, 18.6% complete
INFO:DB:hashX 12c70f1c63c858b1600031 is large: 70,580 entries across 6 rows
INFO:DB:history compaction: wrote 465,127 rows (8.0 MB), removed 938,018 rows, largest: 6, 20.4% complete
INFO:DB:history compaction: wrote 471,832 rows (8.0 MB), removed 954,327 rows, largest: 6, 22.3% complete
INFO:DB:hashX 9e47242a86bd98676f0a3b is large: 54,213 entries across 5 rows
INFO:DB:hashX 612aed6e16ec8f7408ec3b is large: 58,731 entries across 5 rows
INFO:DB:hashX ab48cd40dc608fb60aad3c is large: 71,001 entries across 6 rows
INFO:DB:history compaction: wrote 453,687 rows (8.0 MB), removed 927,053 rows, largest: 6, 24.1% complete
INFO:DB:hashX 4345bbb80db07ad8f98041 is large: 96,875 entries across 8 rows
INFO:DB:history compaction: wrote 426,965 rows (8.0 MB), removed 859,309 rows, largest: 8, 25.9% complete
INFO:DB:history compaction: wrote 477,428 rows (8.0 MB), removed 971,735 rows, largest: 8, 27.8% complete
INFO:DB:history compaction: wrote 471,457 rows (8.0 MB), removed 946,487 rows, largest: 8, 29.7% complete
INFO:DB:hashX 70ae02791360ff114e194c is large: 55,089 entries across 5 rows
INFO:DB:hashX d4b2aaab343cda3f54f04d is large: 280,356 entries across 23 rows
INFO:DB:hashX f1d820315a973e4b43554f is large: 108,325 entries across 9 rows
INFO:DB:history compaction: wrote 385,106 rows (8.0 MB), removed 791,184 rows, largest: 23, 31.2% complete
INFO:DB:hashX 327bd393203f605ae97b51 is large: 122,202 entries across 10 rows
INFO:DB:hashX fe90f22034b04e8bf3c552 is large: 166,835 entries across 14 rows
INFO:DB:history compaction: wrote 407,680 rows (8.1 MB), removed 843,299 rows, largest: 23, 32.8% complete
INFO:DB:history compaction: wrote 483,171 rows (8.0 MB), removed 967,339 rows, largest: 23, 34.8% complete
INFO:DB:hashX aba8b9db4ce1cf087b4959 is large: 85,704 entries across 7 rows
INFO:DB:history compaction: wrote 439,544 rows (8.0 MB), removed 886,004 rows, largest: 23, 36.5% complete
INFO:DB:hashX f9ae44fd1e7bda3c656d5e is large: 63,673 entries across 6 rows
INFO:DB:hashX f4ee90d4c3b6669e315261 is large: 70,803 entries across 6 rows
INFO:DB:history compaction: wrote 422,720 rows (8.0 MB), removed 876,016 rows, largest: 23, 38.2% complete
INFO:DB:history compaction: wrote 498,671 rows (8.0 MB), removed 1,007,629 rows, largest: 23, 40.2% complete
INFO:DB:history compaction: wrote 463,089 rows (8.1 MB), removed 930,923 rows, largest: 23, 42.1% complete
INFO:DB:history compaction: wrote 486,407 rows (8.0 MB), removed 988,294 rows, largest: 23, 44.0% complete
INFO:DB:history compaction: wrote 486,905 rows (8.0 MB), removed 972,226 rows, largest: 23, 46.0% complete
INFO:DB:hashX 460e609124a1b7e6a9db76 is large: 98,251 entries across 8 rows
INFO:DB:history compaction: wrote 456,768 rows (8.0 MB), removed 935,789 rows, largest: 23, 47.8% complete
INFO:DB:history compaction: wrote 503,498 rows (8.0 MB), removed 1,011,865 rows, largest: 23, 49.9% complete
INFO:DB:history compaction: wrote 471,366 rows (8.0 MB), removed 952,125 rows, largest: 23, 51.7% complete
INFO:DB:hashX 0b53a2945e4ce710ed9284 is large: 55,680 entries across 5 rows
INFO:DB:hashX 9f3939d9123e4117a44b87 is large: 70,546 entries across 6 rows
INFO:DB:hashX 517d73a86d55f143c28487 is large: 70,879 entries across 6 rows
INFO:DB:history compaction: wrote 428,641 rows (8.0 MB), removed 856,422 rows, largest: 23, 53.5% complete
INFO:DB:history compaction: wrote 488,172 rows (8.0 MB), removed 986,213 rows, largest: 23, 55.4% complete
INFO:DB:history compaction: wrote 481,533 rows (8.0 MB), removed 973,207 rows, largest: 23, 57.4% complete
INFO:DB:history compaction: wrote 486,085 rows (8.0 MB), removed 992,084 rows, largest: 23, 59.3% complete
INFO:DB:history compaction: wrote 469,129 rows (8.0 MB), removed 944,652 rows, largest: 23, 61.2% complete
INFO:DB:history compaction: wrote 493,875 rows (8.0 MB), removed 1,004,869 rows, largest: 23, 63.2% complete
INFO:DB:history compaction: wrote 482,323 rows (8.0 MB), removed 979,647 rows, largest: 23, 65.1% complete
INFO:DB:history compaction: wrote 479,415 rows (8.0 MB), removed 972,942 rows, largest: 23, 67.0% complete
INFO:DB:history compaction: wrote 466,582 rows (8.0 MB), removed 928,960 rows, largest: 23, 68.9% complete
INFO:DB:history compaction: wrote 500,128 rows (8.0 MB), removed 1,007,263 rows, largest: 23, 70.9% complete
INFO:DB:history compaction: wrote 497,630 rows (8.0 MB), removed 999,766 rows, largest: 23, 72.9% complete
INFO:DB:history compaction: wrote 449,140 rows (8.0 MB), removed 910,120 rows, largest: 23, 74.7% complete
INFO:DB:history compaction: wrote 472,171 rows (8.0 MB), removed 952,386 rows, largest: 23, 76.6% complete
INFO:DB:hashX fdb2e553d5ac12c9f011c4 is large: 65,798 entries across 6 rows
INFO:DB:history compaction: wrote 474,040 rows (8.0 MB), removed 964,055 rows, largest: 23, 78.5% complete
INFO:DB:hashX bd32e44c16f1819dbffaca is large: 52,384 entries across 5 rows
INFO:DB:history compaction: wrote 459,667 rows (8.0 MB), removed 932,956 rows, largest: 23, 80.3% complete
INFO:DB:history compaction: wrote 472,210 rows (8.1 MB), removed 947,540 rows, largest: 23, 82.2% complete
INFO:DB:history compaction: wrote 502,901 rows (8.2 MB), removed 997,117 rows, largest: 23, 84.2% complete
INFO:DB:history compaction: wrote 478,412 rows (8.0 MB), removed 965,630 rows, largest: 23, 86.1% complete
INFO:DB:history compaction: wrote 450,143 rows (8.2 MB), removed 947,434 rows, largest: 23, 88.0% complete
INFO:DB:history compaction: wrote 470,232 rows (8.0 MB), removed 943,633 rows, largest: 23, 89.8% complete
INFO:DB:hashX 0af1de3cc97b0bbd0c6de7 is large: 98,155 entries across 8 rows
INFO:DB:history compaction: wrote 447,901 rows (8.0 MB), removed 907,270 rows, largest: 23, 91.6% complete
INFO:DB:hashX f0364777f30a9a9a3488ef is large: 70,782 entries across 6 rows
INFO:DB:history compaction: wrote 480,832 rows (8.2 MB), removed 982,766 rows, largest: 23, 93.6% complete
INFO:DB:hashX 990364d3177668d0d409f3 is large: 71,746 entries across 6 rows
INFO:DB:history compaction: wrote 462,264 rows (8.0 MB), removed 930,576 rows, largest: 23, 95.4% complete
INFO:DB:hashX de9d8433aecb5447d549f4 is large: 61,524 entries across 5 rows
INFO:DB:hashX d61462346b185bcc0f99f4 is large: 70,530 entries across 6 rows
INFO:DB:history compaction: wrote 454,200 rows (8.0 MB), removed 911,703 rows, largest: 23, 97.2% complete
INFO:DB:hashX 73a0cced83973163ac9dfd is large: 53,210 entries across 5 rows
INFO:DB:history compaction: wrote 488,369 rows (8.1 MB), removed 981,747 rows, largest: 23, 99.2% complete
INFO:DB:history compaction: wrote 197,921 rows (3.2 MB), removed 415,361 rows, largest: 23, 100.0% complete
INFO:root:History compaction complete
Is this safe to run while the server is operational, or does the service need to be stopped (which is what i did this time cause of the error anyways)
It has to be stopped as only one process can open the DB at a time.
Thanks for the quick reply.
Now the question is how can we streamline this so an admin doesn't have to do maint on coins every month or so.
I was seeing the CRITICAL:root:ElectrumX server terminated abnormally
error on Litecoin as well and running compact_history.py
resolved the issue for me. If it's unclear how to integrate this into daemon, it would be great to have at least mention in the docs...
The answer is to do history differently. I'm working on that, but no ETA.
Also ran into this for the first time and was quite confused. Good to hear you are working on this!
I encountered the same issue with Litecoin.
ERROR:root:task crashed: <Task finished coro=<Controller.serve() done, defined at /electrumx/electrumx/server/controller.py:79> exception=error("'H' format requires 0 <= number <= 65535")>
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/util.py", line 118, in check_task
task.result()
File "/electrumx/electrumx/server/controller.py", line 129, in serve
serve_externally_event.set()
File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/curio.py", line 241, in __aexit__
await self.join(wait=self._wait)
File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/curio.py", line 214, in join
raise task.exception()
File "/electrumx/electrumx/server/block_processor.py", line 663, in fetch_and_process_blocks
await self.flush(True)
File "/electrumx/electrumx/server/block_processor.py", line 334, in flush
await self.run_in_thread_with_lock(flush)
File "/electrumx/electrumx/server/block_processor.py", line 190, in run_in_thread_with_lock
return await asyncio.shield(run_in_thread_locked())
File "/electrumx/electrumx/server/block_processor.py", line 189, in run_in_thread_locked
return await run_in_thread(func, *args)
File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/curio.py", line 61, in run_in_thread
return await get_event_loop().run_in_executor(None, func, *args)
File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/electrumx/electrumx/server/block_processor.py", line 333, in flush
self.estimate_txs_remaining)
File "/electrumx/electrumx/server/db.py", line 200, in flush_dbs
self.flush_history()
File "/electrumx/electrumx/server/db.py", line 269, in flush_history
self.history.flush()
File "/electrumx/electrumx/server/history.py", line 129, in flush
flush_id = pack_be_uint16(self.flush_count)
struct.error: 'H' format requires 0 <= number <= 65535
WARNING:Controller:received SIGTERM signal, initiating shutdown
However, I didn't realise and restarted the server. No errors were displayed when the server was started manually after that, I only realised that my client couldn't connect. The logs were reading:
INFO:Prefetcher:catching up to daemon height 1,506,052 (2,525 blocks behind)
INFO:BlockProcessor:our height: 1,503,537 daemon: 1,506,052 UTXOs 0MB hist 0MB
INFO:BlockProcessor:processed 10 blocks in 0.1s
INFO:BlockProcessor:processed 500 blocks in 2.4s
INFO:BlockProcessor:processed 844 blocks in 2.3s
INFO:BlockProcessor:processed 469 blocks in 1.9s
INFO:BlockProcessor:processed 702 blocks in 2.3s
INFO:BlockProcessor:caught up to height 1506052
INFO:DB:flushed filesystem data in 0.00s
INFO:BlockProcessor:flushing to DB for a clean shutdown...
INFO:DB:flushed filesystem data in 0.00s
I would suggest to display an error message on startup when the server cannot continue until the database is compacted. The only hint in the logs above was that flushing took 0.00s, but that wasn't helpful identifying the cause.
Compaction was the solution, all I did was shutting down the server and running the following command with the same environment as the server:
./compact_history.py
The exception is being swallowed in the latter case; I will look into it
No, you're using leveldb
, as you can see on the screenshot.
If you want to use rocksdb, just set DB_ENGINE
to rocksdb
.
how to apply same environment variables while running compact_history.py ?
@kyuupichan maybe you can help? my config is in external file: /home/electrumx/myconf.conf
I tried to edit ~/.bash_profile but with no success
@vetal20600 this isn't the place for basic Unix sysadmin stuff. To pass environment variables you can just precede them on the command line:
$ A=B command
sets envvar A to B and runs command, for example. or use envdir
from daemontools like I mention in the docs.
wouldn't it be possible to run this function automatically instead of throwing an error message?
@kyuupichan Thanks for pointing me to this pre-existing reported issue! Was not sure what to search. What kind of support/help do you think you need to possibly have this automated or caught properly?
Interim, would you instead suggest some sort of OS-level CRON job maybe shut the server down, compact the history, and start it up again every other week?
Week? This happens once a year at most on sane coins. If your coin has 1 block every 5 seconds then ....
The blockchain I am currently running this with generates a new block approx. every minute, each block contains at minimum two transactions (one for generation and one for rewards). Not sure how I hit this so soon, though looking back in the comments here there are other coins with my similar blocktime (and estimated refresh requirement) so this seems to be an issue that could be discussed here and a solution determined :)
I do not see myself proficient in Python, but wouldn't mind attempting to help where possible if you had some context that could be provided as to what the flush count is about and what compact history does.
I run ltc testnet and have to compact every ~2 weeks.
In any case, the way it fails is sub-optimal. The server still runs, but does not sync, so clients don't even notice something is off
Would be nice to fix the issue, or at least fail critically with a good error message to point out how to fix it.
The answer is to do history differently. I'm working on that, but no ETA.
Any progress? :)
No. I doubt I will fix this except as part of a major rework. No guarantee on either. Given almost no one contributes anything to the codebase other than new crapcoins, don't hold your breath
๐ Yeah I see that. Like I said, I'd like to help and contribute but having no experience with Python makes it a little harder to start. I also figure this market doesn't seem very lucrative to want to invest more of your time into this.
All that said, why do you think this could only be fixed with major rework? I mentioned I'd like to help where I can, but would need some guidance on to where to start. The codebase looks nice, but there doesn't seem to be much commentary as to what various methods are doing or their purpose. While I could spend hours or days looking at things and interpreting what you might be doing in a scenario, if I had more details to go off of it would make contributing back easier
weird, happening to me here on testnet
Dec 31 00:51:13 bitcoin electrumx_server[12131]: INFO:BlockProcessor:processed 20 blocks in 0.7s
Dec 31 00:51:14 bitcoin electrumx_server[12131]: INFO:BlockProcessor:processed 14 blocks in 0.3s
Dec 31 00:51:16 bitcoin electrumx_server[12131]: INFO:BlockProcessor:our height: 1,381,416 daemon: 1,450,220 UTXOs 11MB hist 346MB
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:DB:flushed filesystem data in 0.69s
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:BlockProcessor:flushing to DB for a clean shutdown...
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:DB:flushed filesystem data in 0.00s
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:SessionManager:closing down RPC listening servers
Dec 31 00:51:17 bitcoin electrumx_server[12131]: ERROR:root:task crashed: <Task finished coro=<Controller.serve() done, defined at /opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/controller.py:81> exception=error("'H' format requires 0 <= number <= 65535")>
Dec 31 00:51:17 bitcoin electrumx_server[12131]: Traceback (most recent call last):
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/util.py", line 118, in check_task
Dec 31 00:51:17 bitcoin electrumx_server[12131]: task.result()
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/controller.py", line 134, in serve
Dec 31 00:51:17 bitcoin electrumx_server[12131]: await group.spawn(wait_for_catchup())
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/curio.py", line 231, in __aexit__
Dec 31 00:51:17 bitcoin electrumx_server[12131]: await self.join()
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/curio.py", line 200, in join
Dec 31 00:51:17 bitcoin electrumx_server[12131]: raise task.exception()
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 662, in fetch_and_process_blocks
Dec 31 00:51:17 bitcoin electrumx_server[12131]: await self.flush(True)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 335, in flush
Dec 31 00:51:17 bitcoin electrumx_server[12131]: await self.run_in_thread_with_lock(flush)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 191, in run_in_thread_with_lock
Dec 31 00:51:17 bitcoin electrumx_server[12131]: return await asyncio.shield(run_in_thread_locked())
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 190, in run_in_thread_locked
Dec 31 00:51:17 bitcoin electrumx_server[12131]: return await run_in_thread(func, *args)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/curio.py", line 61, in run_in_thread
Dec 31 00:51:17 bitcoin electrumx_server[12131]: return await get_event_loop().run_in_executor(None, func, *args)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
Dec 31 00:51:17 bitcoin electrumx_server[12131]: result = self.fn(*self.args, **self.kwargs)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 334, in flush
Dec 31 00:51:17 bitcoin electrumx_server[12131]: self.estimate_txs_remaining)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/db.py", line 200, in flush_dbs
Dec 31 00:51:17 bitcoin electrumx_server[12131]: self.flush_history()
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/db.py", line 269, in flush_history
Dec 31 00:51:17 bitcoin electrumx_server[12131]: self.history.flush()
Dec 31 00:51:17 bitcoin electrumx_server[12131]: File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/history.py", line 129, in flush
Dec 31 00:51:17 bitcoin electrumx_server[12131]: flush_id = pack_be_uint16(self.flush_count)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: struct.error: 'H' format requires 0 <= number <= 65535
That's not weird at all; it happens to any and all coins once they have enough blocks since the prior compaction
Can confirm the script works as advertised on Bitcoin mainnet. If anyone is interested, this is a oneliner I used to set the proper conf values and run it:
export $(cat /etc/electrumx.conf | xargs) && python3.7 compact_history.py
@whizz Do you have that script set to run on a cronjob every so often?
Do you have that script set to run on a cronjob every so often?
@Manbearpixel Not really, as this was the first time this happened to me. In Bitcoin, it's bound to happen every 15 months, so you could probably set it to 1 year schedule, but for these kinds of jobs that happen only very rarely, I rather set a reminder and run it manually, because a lot of things can happen in a year.
If you were to set it in cron, you'd have to stop electrumx before however and then start it again. This depends on your system of choice, in my case that would be something like:
systemctl stop electrumx && \
export $(cat /etc/electrumx.conf | xargs) && \
python3.7 compact_history.py && \
systemctl start electrumx
FYI: Given that #681 was merged, the script is now electrumx_compact_history
, instead of compact_history.py
. The script is installed alongside the main electrumx_server
when you do the install.
Here's how I run it.
su - <user> -s /bin/bash
export DB_DIRECTORY=/mnt/bitcoin/electrumx/testnet
export COIN=BitcoinSegwit
export NET=testnet
/opt/electrumx_venv/bin/python3.7 /opt/electrumx_compact_database.py
I've just started getting same error:
Nov 06 21:57:23 odroid-hc1 electrumx[26635]: ERROR:root:task crashed: <Task finished coro=<Controller.serve() done, defined at /opt/python3.6/lib/python3.6/site-packages/electrumX-1.13.0-py3.6.egg/electrumx/server/controller.py:81> exception=error("'H' format requires 0 <= number <= 65535",)>
I'll try that compacting stuff.
This is much harder to do if you are unfortunate enough to be using docker.
Why?
Hi,
Could it be possible to get an exhaustive, pretty simple and comprehensive explanation of the @erasmospunk 's sentence "However this is not the case if the node is not fully synced and ElectrumX manages to catch up the full node" ?
I'm new to the blockchain world, and need to depoy ElectrumX in a professionnal environment, so I really need to understand this limitation which happened twice on our side, so that I can explain it back to the guys I work with. If the explanation I give is clear, I'll know I'll know I understood correctly.
Here is what I understand:
- the ElectrumX node is "simply" a public index of the Blockchain, which permits faster exchanges through a cache DB (either pyLevel or RocksDB. Is one more stable than the other ? Does the choice impact the way indexation and flushes are executed ?).
- the ElectrumX server synchronises itself with the bitcoin node provided by the DAEMON_URL environment variable
- while the ElectrumX node constantly synchronizes with the bitcoin node available at DAEMON_URL, it tries to flush history and clean up things in the DB in the meantime (this is not very clear to me right now), and sometimes, it reaches a limit of 65k blocks flushed or so and crashes ?
This last part is not obvious for me, so if you can take some time to explain, I would be very grateful.
Anyway thanks a lot for you work on this project.
Have a good day,
A flush count is stored in keys in the database in 2 bytes, to save space, so when the number of flushes to the DB reaches 2^16 it fails. The script compact_history compacts the history to a smaller number of flushes.
Whilst it would be possible to increase the 2 bytes to 4 say, really the DB does need to be compacted on a regular basis, and so I have not done this.
Thank you very much for your (very fast) answer.
We're thinking about going for 2 instances of ElectrumX server so that one is still available while the other has its history compacted. A load-balancer would be in front of these 2 instances.
This way we could use the script on a regular basis, say every week or so, and the things could go smoothly.
Thanks again.
I get this problem when i run the script like this (Environment variables are exported from electrumx.conf):
python3 compact_history.py
Error:
Traceback (most recent call last):
File "compact_history.py", line 33, in
from server.env import Env
ModuleNotFoundError: No module named 'server'
In the script there are these both lines:
from server.env import Env
from server.db import DB
What am I doing wrong?
I'm now hit by this issue too (ElectrumX 1.14.0 running inside Docker). Is there now an easy way to fix this?
Answering to myself: I ran docker-compose run electrumx /bin/sh
, then inside the container I executed /electrumx/electrumx_compact_history
. It appears it has fixed the database problem.
@schildbach as stated before by other guys, you need to schedule this to run "often".
By "often", I mean: it will depend on the network you'll be using (testnet, mainnet...)
In an enterprise context, on testnet, we need to schedule a compact history every 2 weeks but we do this every week, to keep it consistent with mainnet, which needs to be compacted every week to be sure it's gonna keep on living without any crash.
This number may vary in the future, as it already varied. The biggest the blockchain the more often you need to run it ? That's what has been said previously in this thread. Let's hope we won't have to do this every day :p
Note that you can directly use docker-compose exec electrumx electrumx_compact_history
without the need to enter the service manually. This will help with scripting and cron.
Why?
Because if you have to do anything inside the container while the process is stopped, you have to do it in a side car launched with the exact same parameters, env, and volumes.
If you stop PID 1 in a docker container, the container dies.
Yet another reason I really dislike docker.
@nyetwurk just use the same docker run ...
but with a different entrypoint or cmd. You have to supply the same env/params with or without docker, so really there is no difference.
@nyetwurk just use the same
docker run ...
but with a different entrypoint or cmd. You have to supply the same env/params with or without docker, so really there is no difference.
For a complex container started with a deployment tool like ansible that becomes a total mess. Docker still inexpicably lacks a reliable way to convert docker inspect
to a docker run
command line though there are third party hack tools that do a passable job on occasion.
If you want to get rid of regular database compaction tasks, use the electrs server. Electrs implements automatic database compaction. It has been a month since it started up, but it is operating stably.
Electrs github:
https://github.com/romanz/electrs
https://github.com/romanz/electrs/blob/master/doc/usage.md
If you use Docker-based installation, you can compile it without fail, but it's a little troublesome. Manual installation may not compile depending on the OS version. Ubuntu 18.04 will compile without problems. If that doesn't work, there is a way to get the binaries compiled with docker out of the container.
Failed to do history compaction with same error that I got for electrumx daemon using 1.15.0
Any ways to fix without syncing from scratch?
INFO:root:Starting history compaction...
INFO:electrumx.server.db.DB:switching current directory to /data
INFO:electrumx.server.db.DB:using leveldb for DB backend
INFO:electrumx.server.db.DB:opened UTXO DB (for sync: True)
INFO:electrumx.server.db.DB:UTXO DB version: 8
INFO:electrumx.server.db.DB:coin: BitcoinSegwit
INFO:electrumx.server.db.DB:network: mainnet
INFO:electrumx.server.db.DB:height: 648,180
INFO:electrumx.server.db.DB:tip: 0000000000000000000821bfdee3af8ac2ba17183e7b0d7f39c527b43477e8c3
INFO:electrumx.server.db.DB:tx count: 568,016,657
INFO:electrumx.server.db.DB:flushing DB cache at 1,200 MB
INFO:electrumx.server.history.History:history DB version: 1
INFO:electrumx.server.history.History:flush count: 65,535
Traceback (most recent call last):
File "/home/electrumx/.local/bin/electrumx_compact_history", line 73, in main
loop.run_until_complete(compact_history())
File "/usr/local/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
return future.result()
File "/home/electrumx/.local/bin/electrumx_compact_history", line 51, in compact_history
await db.open_for_compacting()
File "/home/electrumx/.local/lib/python3.7/site-packages/electrumx/server/db.py", line 150, in open_for_compacting
await self._open_dbs(True, True)
File "/home/electrumx/.local/lib/python3.7/site-packages/electrumx/server/db.py", line 147, in _open_dbs
await self._read_tx_counts()
File "/home/electrumx/.local/lib/python3.7/site-packages/electrumx/server/db.py", line 115, in _read_tx_counts
assert len(tx_counts) == size
AssertionError
CRITICAL:root:History compaction terminated abnormally
@vlddm I think that AssertionError suggests that your DB got corrupted. Unfortuntely you probably need to sync from scratch.
Running electrumx in docker; and after upgrading to 1.15 I got bumped into this bug.
What I did to workaround is these steps:
- while container running I opened a shell into container
sudo docker exec -it <container-id> sh
(to get container id one can runsudo docker ps
) - edit /electrumx/electrumx_server file
vi /electrumx/electrumx_server
- added 3 lines of code prior to the line
if __name__ == ...
import os, time
while os.path.exists('/data/stop4service'):
time.sleep(10)
- stopped the container
- created the file named 'stop4service' on the host where container's /data folder is mapped
- started the container
- opened a shell into container
sudo docker exec -it <container-id> sh
- now that container is running but doing nothing, execute compact history command
python3 /electrumx/electrumx_compact_history
- after history compact command finished it's ok to remove the 'stop4service' file - the container will resume
If we're gonna live with this compact_history stuff it'd be great to have such a pause for service merged into the electrumx_server script.
I am considering making some changes to the db schema, and as part of that have been looking at this issue.
Currently the history db schema is:
# Key: address_hashX + flush_id
# Value: sorted "list" of tx_nums in history of hashX
I'm thinking it could be changed to:
# Key: address_hashX + tx_num
# Value: <null>
This would completely do away with flush_id and compaction; although I am not sure how it would affect performance.
I quite like it that items would be constant size.
Re db size on disk, I guess it would increase it somewhat, hard to tell by how much though as it depends on address reuse and whether the db is compacted:
- worst case for current schema is if all tx_nums are in separate items (without compaction): 18 bytes per tx
- "average" (guess) case for current schema - two tx_nums for each address (fund-then-spend): 11.5 bytes per tx
- best hypothetical case for current schema is if all txs in the chain touch the same address, compacted: 5 bytes per tx
- with new schema, there would be no variance: 16 bytes per tx
What do you think @kyuupichan ?
To be clear, this is about the history db, so for current reference size (depends on when last compaction was though!), see:
$ du -h .
18G ./meta
4.4G ./utxo
33G ./hist
55G .
Seems worth trying, let me know how it goes.
I get this problem when i run the script like this (Environment variables are exported from electrumx.conf):
python3 compact_history.pyError:
Traceback (most recent call last):
File "compact_history.py", line 33, in
from server.env import Env
ModuleNotFoundError: No module named 'server'In the script there are these both lines:
from server.env import Env
from server.db import DBWhat am I doing wrong?
I also get this error, not sure what to do.
Running in Debian w/ systemD, with LevelDb. Piping in environment with this command:
export $(cat /etc/electrumx.conf | xargs) && python3.7 /var/lib/electrumx/elect_compact.py
Also tried a similar thing in systemd with environment files, which resulted in the same error.
It seems like something is missing to define server.env, or possibly must I change server.env to something that matches my configuration?
The electrumx server operates and executes normally otherwise, outside of database compaction issues.
I was able to resolve this issue with the script mentioned in this post:
spesmilo/electrumx#88 (comment)
It seems that the script in use above has since changed and is no longer applicable to our environment.
I encountered this same struct 'H' format 65536 bug with a testnet BitcoinSegwit setup. The compact_history script completed, but I see the same error when restarting the service. EDIT: Re-running it a second time seems to have fixed all errors and the binary is running.
May 15, 2017
One possible terrible hack (since it is pretty clear this isn't going to be fixed): run compact on startup if the db exists. That way, if it crashes, it will just run the compact script when it is run agian.
Again, why not build compaction into electrumx itself? So at least it can be done regularly (and possibly even automatically on startup) so it can continue to be run in docker.
Do you have a patch?
Do you have a patch?
I would if spesmilo/electrumx#88 (comment) worked
And you wouldn't like it much, because it would likely be a hack to the docker container to just run it unconditionally on start up via a bash script