filecoin-project/lotus

Lotus-bench results thread (v24 params)

ytjoe opened this issue · 21 comments

ytjoe commented

This issue is a place to put lotus-bench results for v24 params. (testnet/3)

# Pull testnet/3 for compilation
FFI_BUILD_FROM_SOURCE=1 make clean all bench

# Maximize cache 
export FIL_PROOFS_MAXIMIZE_CACHING=1

# Run 32g sector test
./bench --sector-size=34359738368
./bench --sector-size=34359738368 --no-gpu

Additionally, please tell us what CPU, GPU, and memory (including speed) you have in your setup.
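
For anyone unsure how to gather those details, here is a minimal sketch for Linux. The function name `report_hw` is made up for this example and is not part of lotus. It assumes `lscpu`, `free`, `dmidecode`, and `nvidia-smi` may or may not be installed, so each one is guarded. Memory speed from `dmidecode` usually needs root.

```shell
# Hypothetical helper: print CPU, memory, and GPU details to paste into a report.
report_hw() {
  echo "== CPU =="
  if command -v lscpu >/dev/null 2>&1; then
    lscpu | grep -E 'Model name|Socket' || true
  fi

  echo "== Memory =="
  if command -v free >/dev/null 2>&1; then
    free -h || true
  fi
  # Module size and speed; prints nothing when not run as root.
  if command -v dmidecode >/dev/null 2>&1; then
    dmidecode -t memory 2>/dev/null | grep -E '^[[:space:]]*(Size|Speed):' | sort | uniq -c || true
  fi

  echo "== GPU =="
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader || true
  else
    echo "nvidia-smi not found"
  fi
  return 0
}

report_hw
```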

ytjoe commented
# 3700x
# no gpu
# memory 128G 
results (v24) (34359738368)
seal: addPiece: 6m29.658286168s (84.1 MiB/s)
seal: preCommit phase 1: 3h51m36.3388762s (2.36 MiB/s)
seal: preCommit phase 2: 4h3m46.195089422s (2.24 MiB/s)
seal: commit phase 1: 3.329454884s (9.61 GiB/s)
seal: commit phase 2: 1h35m36.694015932s (5.71 MiB/s)
seal: verify: 64.685677ms
unseal: 5.300121ms  (5.9 TiB/s)
generate candidates: 642.129991ms (49.8 GiB/s)
compute epost proof (cold): 10.123346465s
compute epost proof (hot): 9.211648893s
verify epost proof (cold): 41.600541ms
verify epost proof (hot): 15.11789ms
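
The throughput figures in parentheses appear to be the sector size divided by the elapsed time. The sketch below reproduces two of them from the numbers above. The helper names `dur_to_secs` and `throughput_mib` are invented for this example, and the parser only handles the `h`, `m`, `s`, and `ms` units seen in this thread.

```shell
# Convert a Go-style duration ("3h51m36.3388762s", "64.685677ms") to seconds.
dur_to_secs() {
  printf '%s\n' "$1" | awk '{
    s = $0; total = 0
    # Peel off one "<number><unit>" token at a time; "ms" is tried before "m".
    while (match(s, /^[0-9.]+(ms|h|m|s)/)) {
      tok = substr(s, 1, RLENGTH)
      s = substr(s, RLENGTH + 1)
      num = tok;  sub(/[a-z]+$/, "", num)
      unit = tok; sub(/^[0-9.]+/, "", unit)
      if (unit == "h")       total += num * 3600
      else if (unit == "m")  total += num * 60
      else if (unit == "s")  total += num
      else if (unit == "ms") total += num / 1000
    }
    printf "%.3f\n", total
  }'
}

# Throughput in MiB/s for a sector of $1 bytes processed in duration $2.
throughput_mib() {
  secs=$(dur_to_secs "$2")
  awk -v b="$1" -v s="$secs" 'BEGIN { printf "%.3g\n", b / 1048576 / s }'
}

throughput_mib 34359738368 6m29.658286168s    # addPiece, prints 84.1
throughput_mib 34359738368 3h51m36.3388762s   # preCommit phase 1, prints 2.36
```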

@ytQiao How did preCommit phase 1 complete in such a short time? My test took 23 hours. Is there any way to speed it up?

ytjoe commented

Compile on the machine you run the bench on, and use an AMD processor. The memory requirement is greater than 128G, and a swap partition larger than 128G is recommended (the exact size can be tested later). Add the following parameters when compiling and when running the bench:

Pull testnet/3 for compilation

FFI_BUILD_FROM_SOURCE=1 make clean all bench

Maximize cache

export FIL_PROOFS_MAXIMIZE_CACHING=1
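
To check the memory and swap recommendation above before starting a long run, something like the following sketch can help. `mem_report` is a made-up name, and the 128 GiB threshold is only the figure suggested in this comment, not a documented lotus requirement. It reads `/proc/meminfo`, so it is Linux-only.

```shell
# Hypothetical helper: print RAM and swap in GiB from a meminfo-style file.
mem_report() {
  meminfo="${1:-/proc/meminfo}"
  awk '
    /^MemTotal:/  { ram = $2 }    # values are in kB
    /^SwapTotal:/ { swap = $2 }
    END {
      printf "ram_gib=%.1f swap_gib=%.1f\n", ram / 1048576, swap / 1048576
      # Threshold taken from the recommendation in this thread.
      if (swap / 1048576 <= 128) print "note: swap is not above 128 GiB"
    }
  ' "$meminfo"
}

if [ -r /proc/meminfo ]; then
  mem_report
fi
```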

Do you add these two parameters when compiling and running? Or something else?

ytjoe commented

Yes. There's nothing else.

OK, I retested after compiling. Thank you for your answer.

results (v24) (34359738368)
seal: addPiece: 9m35.637211091s (56.9 MiB/s)
seal: preCommit phase 1: 4h52m16.234933759s (1.87 MiB/s)
seal: preCommit phase 2: 3h42m59.305512531s (2.45 MiB/s)
seal: commit phase 1: 1m50.349284613s (297 MiB/s)
seal: commit phase 2: 2h50m1.595939973s (3.21 MiB/s)
seal: verify: 298.305989ms
unseal: 418.540821ms  (76.5 GiB/s)
generate candidates: 1.525869374s (21 GiB/s)
compute epost proof (cold): 9.030044981s
compute epost proof (hot): 8.965704682s
verify epost proof (cold): 55.595438ms
verify epost proof (hot): 19.265659ms

Thank you for your suggestion. The efficiency has improved greatly, but with the same method there is still a difference of about 1 hour. Are there any other factors that affect it? Is there anything else that can be improved?

s1eke commented

WARN: sha-ni not available, falling back

It reported this WARN. Will it have any effect?

ytjoe commented

@Tylertest8 Can you show me your configuration information?

ytjoe commented

@s1eke More information is needed, but I think you need to compile on the tested machine.

@ytQiao

AMD Ryzen 3970X + RAM 128G + SWAP 300G + HDD + NOGPU

ytjoe commented

@Tylertest8 I used NVMe drives for storage, which may be one of the reasons. You can try it.
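
To confirm which kind of disk backs the bench storage directory, the kernel exposes a per-device `rotational` flag under `/sys/block`. The sketch below reads it. `disk_kind` is a name made up for this example, and the device name (for example `sda` or `nvme0n1`) has to be supplied by hand. `lsblk -d -o NAME,ROTA` shows the same flag for all disks.

```shell
# Hypothetical helper: report whether a block device is rotational.
# $1 = device name (e.g. sda, nvme0n1); $2 = sysfs root, overridable for testing.
disk_kind() {
  dev="$1"
  sysroot="${2:-/sys/block}"
  rota_file="$sysroot/$dev/queue/rotational"
  if [ ! -r "$rota_file" ]; then
    echo "$dev: unknown"
    return 0
  fi
  case "$(cat "$rota_file")" in
    1) echo "$dev: rotational (HDD)" ;;
    0) echo "$dev: non-rotational (SSD/NVMe)" ;;
    *) echo "$dev: unknown" ;;
  esac
}

# Example: disk_kind nvme0n1
```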

ytjoe commented

Test data source: magik6k

# TR 3970x + 2x 2080ti
results (v24) (34359738368)
seal: addPiece: 6m8.798820562s (88.9 MiB/s)
seal: preCommit phase 1: 3h59m13.609729554s (2.28 MiB/s)
seal: preCommit phase 2: 52m3.442064626s (10.5 MiB/s)
seal: commit phase 1: 7.536231307s (4.25 GiB/s)
seal: commit phase 2: 37m25.869552159s (14.6 MiB/s)
seal: verify: 57.648867ms
generate candidates: 573.01274ms (55.8 GiB/s)
compute epost proof (cold): 15.398034616s
compute epost proof (hot): 14.742154327s
verify epost proof (cold): 39.170784ms
verify epost proof (hot): 16.905623ms

@ytQiao OK, thank you. I need to keep trying.

s1eke commented

specifications of my computer:

CPU:Intel Xeon E5-2683 v4  @ 3.000GHz * 2
RAM:32G * 24
GPU:NVIDIA Tesla T4  

This is my order of operations:

# FFI_BUILD_FROM_SOURCE=1 make clean all bench
# export FIL_PROOFS_MAXIMIZE_CACHING=1
# export BELLMAN_CUSTOM_GPU="Tesla T4:2560"
# ./bench --storage-dir=/lotus/tmp --sector-size=34359738368

and output log:

2020-03-31T23:39:50.561-0400    INFO    lotus-bench     lotus-bench/main.go:213 Writing piece into sector...
2020-04-01T00:33:39.561-0400    INFO    lotus-bench     lotus-bench/main.go:227 Running replication(1)...
WARN: sha-ni not available, falling back

@ytQiao

ytjoe commented

@s1eke AMD Zen processors have the SHA instruction set (SHA-NI), but your Intel Xeon E5-2683 v4 does not.
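
On Linux, a CPU with the SHA extensions lists the `sha_ni` flag in `/proc/cpuinfo`, so the warning can be predicted before running the bench. A minimal sketch follows, with `has_sha_ni` being a name invented for this example:

```shell
# Hypothetical helper: report whether the CPU advertises the sha_ni flag.
# $1 = cpuinfo-style file, defaults to /proc/cpuinfo (overridable for testing).
has_sha_ni() {
  cpuinfo="${1:-/proc/cpuinfo}"
  if grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'sha_ni'; then
    echo "sha_ni: available"
  else
    echo "sha_ni: not available"
  fi
}

has_sha_ni
```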

s1eke commented

CPU:Intel Xeon E5-2683 v4 @ 3.000GHz * 2
RAM:32G * 24
GPU:NVIDIA Tesla T4

results (v24) (34359738368)
seal: addPiece: 54m59.980201531s (9.93 MiB/s)
seal: preCommit phase 1: 35h40m44.350375592s (261 KiB/s)
seal: preCommit phase 2: 2h26m25.74103585s (3.73 MiB/s)
seal: commit phase 1: 670.996849ms (47.7 GiB/s)
seal: commit phase 2: 2h7m21.05272074s (4.29 MiB/s)
seal: verify: 86.39903ms
unseal: 3.210476ms  (9.73 TiB/s)
generate candidates: 846.492758ms (37.8 GiB/s)
compute epost proof (cold): 12.787342364s
compute epost proof (hot): 11.96141447s
verify epost proof (cold): 49.16003ms
verify epost proof (hot): 30.577227ms

This is too slow😂

ytjoe commented

@s1eke Setting FIL_PROOFS_MAXIMIZE_CACHING=1 can also increase speed, but I think the improvement is not as big as on AMD.

s1eke commented

It has already increased the speed😂

AMD Ryzen 7 3700X 8-Core Processor + 128G RAM + Some nvme

results (v24) (34359738368)
seal: addPiece: 6m18.579275883s (86.6 MiB/s)
seal: preCommit phase 1: 4h8m30.565015104s (2.2 MiB/s) 
seal: preCommit phase 2: 3h5m10.332082143s (2.95 MiB/s)
seal: commit phase 1: 5.557466928s (5.76 GiB/s)

AMD Ryzen 5 3600X 6-Core Processor + 128G RAM + Some nvme

results (v24) (34359738368)
seal: addPiece: 6m18.215139138s (86.6 MiB/s)
seal: preCommit phase 1: 4h10m31.226008064s (2.18 MiB/s)
seal: preCommit phase 2: 4h9m33.291920304s (2.19 MiB/s)
seal: commit phase 1: 1.123879236s (28.5 GiB/s)

TR 3970x + 128G RAM + 2x 2080ti + Some nvme

results (v24) (34359738368)
seal: addPiece: 6m8.798820562s (88.9 MiB/s)
seal: preCommit phase 1: 3h59m13.609729554s (2.28 MiB/s)
seal: preCommit phase 2: 52m3.442064626s (10.5 MiB/s)
seal: commit phase 1: 7.536231307s (4.25 GiB/s)
seal: commit phase 2: 37m25.869552159s (14.6 MiB/s)
seal: verify: 57.648867ms
generate candidates: 573.01274ms (55.8 GiB/s)
compute epost proof (cold): 15.398034616s
compute epost proof (hot): 14.742154327s
verify epost proof (cold): 39.170784ms
verify epost proof (hot): 16.905623ms

v25 params are now out in testnet/3.

You can view and submit benchmarks to https://filecoin-benchmarks.on.fleek.co/