Issues
Restrict CUB device histograms to `NUM_ACTIVE_CHANNELS <= NUM_CHANNELS`
#1792 opened by bernhardmgruber - 4
[Possible BUG]: compute-sanitizer initchecker reports uninitialized global memory reads from thrust::reduce_by_key
#1790 opened by ssadasivam1 - 0
[BUG]: CUB test "DeviceHistogram::Histogram* large levels" for float produces INF in test code
#1793 opened by bernhardmgruber - 6
[BUG]: Stale CUDA error causes CUB calls to fail
#1791 opened by dkolsen-pgi - 1
Gather benchmark results for each CUB algorithm using different offset types
#1787 opened by jrhemstad - 3
[FEA]: Investigate if NVTX ranges in CUB algorithms support graph capture
#1674 opened by gevtushenko - 2
CUB histogram API signatures are misleading
#1765 opened by bernhardmgruber - 5
CI testing for memory errors
#1775 opened by bernhardmgruber - 0
[BUG]: MSVC < 2022 doesn't properly handle thrust's member function detector.
#1731 opened by alliepiper - 0
Merge CUB and Thrust counting iterators
#1772 opened by bernhardmgruber - 0
Describe potential solutions for algorithm implementations and their pros/cons
#1771 opened by elstehle - 0
Fix 1B load/stores in atomic caused by refactoring
#1769 opened by jrhemstad - 0
A low level, untyped, uninitialized RAII abstraction over an allocation `cuda::buffer<Properties...>` (alternatively `cuda::unique_ptr<T[], Properties...>`?)
#1768 opened by jrhemstad - 1
[FEA]: Move support standard versions in matrix.yaml to per-compiler YAML anchors
#1758 opened by jrhemstad - 0
Port `thrust::merge` to CUB
#1763 opened by elstehle - 8
[BUG]: Intermittent wrong output from thrust::remove_if under heavy GPU loading
#1730 opened by ssadasivam1 - 0
CUB's NVTX ranges fail to compile when usercode uses explicitly versioned NVTX API
#1750 opened by bernhardmgruber - 0
Move to feature flag to guard for deduction guides
#1704 opened by miscco - 0
Specify qualifier order in `.clang-format`
#1748 opened by bernhardmgruber - 0
[BUG]: libcudacxx clashes w/ libc++: ambiguous overload resolution for `__swallow`
#1678 opened by Artem-B - 10
[DOC]: Update contributing guide to include information on how to use cmake presets
#1689 opened by jrhemstad - 0
Simplify test matrix spec/usage
#1700 opened by alliepiper - 0
Port Thrust docs to use Sphinx
#1742 opened by jrhemstad - 0
[DOC]: concepts library appears twice in TOC and is out of place (and ranges library page is missing)
#1728 opened by harrism - 0
[BUG]: PTXAS emits advisory regarding cp.async.bulk.*.multicast use on sm_90
#1733 opened by ahendriksen - 5
[FEA]: Upgrade Catch2 in CUB to version 3
#1724 opened by bernhardmgruber - 0
Remove old `meow(void)` function signatures
#1703 opened by miscco - 0
[BUG]: thrust::optional<T&>::emplace() does not compile
#1706 opened by Snektron - 0
Add tests for thrust::optional
#1713 opened by miscco - 0
[BUG]: thrust::count_if and copy_if performance on Grace and x86 10x+ / 20x+ slower than libstdc++
#1709 opened by gonzalobg - 0
Missing full qualification of namespace std in CUB
#1699 opened by miscco - 0
[BUG]: CUB test relies on deprecated error code
#1691 opened by gevtushenko - 1
[PERF][BUG]: Thrust uses cudaMemcpy for Device->Device copies (66% SoL on H200)
#1672 opened by ahendriksen - 0
[FEA]: Remove `cuda::mr::managed_memory` property until we've decided on more appropriate properties
#1680 opened by jrhemstad - 0
[PERF][BUG]: `thrust::transform` does not saturate bandwidth on newer hardware architectures (down to 62% SoL on H200 for int)
#1673 opened by ahendriksen - 0
Create generic floating-point wrapping type for <NumExponentBits, NumMantissaBits>
#1665 opened by jrhemstad - 0
Provide generic implementation of `floating_point<M, E>` limited to arithmetic operations
#1666 opened by jrhemstad