learning-at-home/hivemind
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
Python, MIT
Issues
[BUG] GradScaler does not work with torch 2.3.0
#610 opened by samsja - 1
question about how rpc works in the hivemind package
#605 opened by drimeF0 - 4
[Feature Request] Network Statistics
#520 opened by chavinlo - 0
Support for windows
#596 opened by ParisNeo - 2
Support for fully homomorphic encryption on training, finetuning, and inference
#584 opened by sirus20x6 - 0
forking before initialization of the MPFuture handler - server runtime not initialized in WSL --new_hive
#581 opened by poedator - 0
How well does it scale?
#575 opened by lonnietc - 2
hivemind.compression: TypedStorage is deprecated
#563 opened by borzunov - 1
Failed to close hivemind.P2P
#564 opened by borzunov - 1
Metaclasses for logging
#556 opened by StrangeTcy - 2
AttributeError in MPFuture
#552 opened by borzunov - 1
Failed to connect to bootstrap peers
#551 opened by amerfarooq - 1
[Feature Request] improve bfloat16 serialization when there is no compression
#550 opened by justheuristic - 0
[BUG][MINOR] relayFinder already running
#549 opened by justheuristic - 1
[BUG] Unable to train a bfloat16-compressed model
#545 opened by the-beee - 3
Mismatched protobuf versions in sub-dependencies
#539 opened by briansemrau - 4
[Feature Request] enable circuit relay v2
#536 opened by justheuristic - 1
hivemind.averaging.partition.AllreduceException: Averaging step failed: could not find a group
#519 opened by chavinlo - 0
[BUG] Cyclic references in TaskPool
#534 opened by justheuristic - 0
[chore] deprecations for v1.2.0
#526 opened by justheuristic - 2
Unable to decrease loss OR Unable to synchronize
#515 opened by chavinlo - 0
[BUG] stale gradients
#514 opened by elricwan - 2
[BUG] Failed to load_state_from_peers at the first time because of "list index out of range" error
#504 opened by alex-snd - 6
[Feature Request] Supporting RWKV (a RNN that can match transformer LM & zero-shot performance at 1B+ params)
#496 opened by BlinkDL - 2
On the fresh run with cifar10 on macos 11.5.2
#498 opened by stoneyang - 1
[Feature Request] MoE enhancements
#478 opened by GreenFatGuy - 1
[Feature Request] Create docker image for WSL2
#461 opened by kotenok2000 - 2
[BUG] Global connection not working
#472 opened by Lednik7 - 0
[Feature Request] fp16/bf16 gpu params with fp32 offloading in hivemind.Optimizer
#476 opened by justheuristic - 1
Set a wait time for other peers to join
#455 opened by elricwan - 0
[BUG] You current contribution: 0 samples
#451 opened by elricwan - 5