Hardware-accelerated implementation for IOTA PearlDiver, which utilizes multi-threaded SIMD, FPGA and GPU.
dcurl exploits SIMD instructions on CPU and OpenCL on GPU. Both CPU and GPU accelerations can be enabled in multi-threaded execuction fashion, resulting in much faster proof-of-work (PoW) for IOTA. In addition, dcurl supports FPGA-accelerated PoW, described in docs/fpga-accelerator.md. dcurl can be regarded as the drop-in replacement for ccurl. IOTA Reference Implementation (IRI) adaptation is available to benefit from hardware-accelerated PoW.
Check docs/build-n-test.md for details.
Check docs/naming-convention.md for details.
$ make doc
After integrating dcurl into IRI, performance of attachToTangle is measured as the following.
- Setting: MWM = 14, 200 attachToTangle API requests with each containing 2 transactions
- Local CPU:
- AMD Ryzen Threadripper 2990WX 32-Core Processor
- 2 PoW tasks at the same time
- Each task uses 32 CPU threads to find nonce
- SIMD enabled
- Remote worker:
- The board with Intel/Altera Cyclone V SoC
- 1 PoW task at the same time in a board
- FPGA acceleration enabled
- Connected with local network
Except the original IRI, the other instances use the DLTcollab/IRI instead of iotaledger/IRI.
IRI version | attachToTangle API behavior | Effect |
---|---|---|
IOTA IRI | One transaction bundle at the same time (Synchronized) | Transactions of a bundle are calculated one by one |
DLTCollab IRI | Multiple transaction bundles at the same time | Transactions of different bundles compete for the PoW calculation resources |
The original IRI should be the slowest one since it does not contain any PoW acceleration. However, the graph is different from the expectation. This is caused by 2 factors:
- The graph shows the execution time of each API request instead of the overall throughput.
- The table shows that there are competition of the PoW resources, which means the execution time would be longer than expected.
And from the graph we can see that 4 remote workers would be a good choice to accelerate PoW.
Modified IRI accepting external PoW Library Supported IRI version: 1.7.0
Load the external dcurl shared library from the installed JAR file
$ cd ~/iri && make check
$ java -jar target/iri.jar -p <port> --pearldiver-exlib dcurl
Here is a partial list of open source projects that have adopted dcurl. If you are using dcurl and would like your projects added to the list, please send pull requests to dcurl.
- iota-gpu-pow: IOTA PoW node
- You can construct a IOTA PoW node, which uses
ccurl
by default - Generate a drop-in replacement for
ccurl
and acquire performance boost.$ make BUILD_COMPAT=1 check
$ cp ./build/libdcurl.so <iota-gpu-pow>/libccurl.so
- iota.keccak.pow.node.js: IOTA PoW node using the iota.keccak.js implementation
- iota.keccak.js can be used to build high performance node-js spammers but is also capable of signing inputs for value bundles. Using a smart implementation and remote PoW, it is capable of performing > 100 TPS for IRI.
- IOTA PoW Hardware Accelerator FPGA for Raspberry Pi took dcurl for prototyping.
- Check shufps/dcurl and shufps/iota_vhdl_pow for tracking historical changes outperforming 200x speedup than a Raspberry Pi without hardware hccelerator.
- tangle-accelerator: caching proxy server for IOTA, which can cache API requests and rewrite their responses as needed to be routed through full nodes.
- An instance of Tangle-accelerator can serve thousands of Tangle requests at once without accessing remote full nodes frequently.
- As an intermediate server accelerateing interactions with the Tangle, it faciliates dcurl to perform hardware-accelerated PoW operations on edge devices.
-
What is binary encoded ternary?
It is a skill to transform the ternary trit value to two separate bits value.
Hence multiple trits can be compressed to two separate data of the same data type and fully utilize the space. -
Can the project Batch Binary Encoded Ternary Curl be applied to dcurl?
The answer is no.
They both use the same skill, binary encoded ternary.
However, their purpose are totally different.-
bct_curl:
Focus on hashing the multiple data of the same length at the same time.
It ends when the hashing is finished. -
dcurl:
Focus on trying the different values of one transaction at the same time to find the nonce value.
The procedure of finding the nonce value also does the hashing.
However, it ends when the nonce value is found, which means one of the values is acceptable.
-
dcurl
is freely redistributable under the MIT License.
Use of this source code is governed by a MIT-style license that can be
found in the LICENSE
file.