Issues
- 0
[Feature] Ethernet connection
#121 opened by saeedmaleki - 0
[Bug] Stronger correctness check in mscclpp-test
#198 opened by chhwang - 0
[Feature] Support python-based mscclpp-test
#187 opened by Binyang2014 - 1
[Feature] fp16 allreduce.
#192 opened by saeedmaleki - 1
MSCCL++ v0.4.0 Release Plan (Released)
#160 opened by chhwang - 0
[Feature] a warning for when CQ is about to be full and ask user to flush it.
#194 opened by saeedmaleki - 0
- 1
MSCCL++ v0.3.0 Release Plan
#89 opened by chhwang - 0
[Feature] `getPacket` arg list does not match the `get` function from `sm_channels`
#158 opened by saeedmaleki - 0
[feature] `ProxyChannel` should not be taking device handles in for constructors.
#155 opened by saeedmaleki - 2
[Bug] Bootstrap occasionally returns "Address in use" error during `initialize()`
#163 opened by chhwang - 1
- 0
[feature] `poll` instead of wait.
#176 opened by saeedmaleki - 0
[Bug] Compilation fails with `CMAKE_BUILD_TYPE=Debug`
#174 opened by chhwang - 2
[Performance] Improve single-node AllReduce latency
#164 opened by chhwang - 2
- 0
[Unit Test] Listing unit tests needed
#115 opened by chhwang - 1
[Feature] Support custom `RegisteredMemory::Impl`
#135 opened by chhwang - 0
- 2
[Feature] Connection without Bootstrap
#137 opened by olsaarik - 0
- 1
[Bug] Need to call cudaIpcCloseMemHandle to release remote registered memory
#165 opened by Binyang2014 - 0
[Unit Test] Add Python unit tests
#134 opened by chhwang - 0
[Feature] Support `POLL_PRINT_ON_STUCK`
#99 opened by chhwang - 3
- 2
[feature] Enable custom fifo size and tail flush
#138 opened by saeedmaleki - 0
- 1
[bug] `fifo` has 128bit atomic reading problems.
#154 opened by saeedmaleki - 1
- 1
[feature] add `.def_prop_ro` for all device components so that they can be passed to GPU kernels.
#144 opened by saeedmaleki - 1
[bug] change all `__CUDACC__` to `__CUDA_ARCH__`
#143 opened by saeedmaleki - 0
[feature] python binding for DeviceSyncer
#156 opened by saeedmaleki - 0
- 0
[Bug] SmChannel and ownership of semaphores
#126 opened by saeedmaleki - 1
- 0
[Feature] Add Python bindings
#120 opened by olsaarik - 1
- 2
[Bug] nanobind is leaking with connection
#130 opened by saeedmaleki - 1
pickle for uniq_id
#129 opened by saeedmaleki - 0
- 4
[Bug] Hanging bootstrap communication
#92 opened by chhwang - 5
[Performance] AllReduce performance debugging
#90 opened by chhwang - 2