NVIDIA/cudnn-frontend
cudnn_frontend provides a C++ wrapper for the cuDNN backend API, along with samples showing how to use it
C++ · MIT
Issues
- 1
Bug in Flash with rng dropout sample
#80 opened by wfoy - 5
Why is graph::check_support really slow?
#70 opened by ZoroDerVonCodier - 2
Missing header files in the package
#76 opened by DEKHTIARJonathan - 4
How to use cudnn frontend in Jax?
#89 opened by MoFHeka - 0
- 0
Question About Reduce Node
#87 opened by xcwang1999 - 2
Question about calling MHA
#85 opened by GonChen - 1
question about memory layout of the convolution
#83 opened by jinz2014 - 0
- 19
What's the difference of flash attention implement between cudnn and Dao-AILab?
#52 opened by MoFHeka - 2
Matmul test failure
#78 opened by shiwenloong - 1
cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
#75 opened by ifromeast - 1
Error running Flash & BatchNormalization tests
#77 opened by nravic - 2
[ERROR] Exception CUDNN_BACKEND_TENSOR_DESCRIPTOR cudnnFinalize failed cudnn_status: CUDNN_STATUS_NOT_INITIALIZED
#71 opened by Tr-buaa - 1
Support "make install"
#64 opened by iskunk - 2
Support use of external/system Catch2 installation
#63 opened by iskunk - 2
CUDNN_FRONTEND_BUILD_UNIT_TESTS option is broken
#62 opened by iskunk - 2
- 3
Windows build error
#66 opened by tianleiwu - 4
- 1
INT8 sample didn't work?
#31 opened by vincentccc - 5
Many samples don't work for me
#30 opened by KarelPeeters - 5
Cudnn Error InstanceNormalizationPlugin
#58 opened by ninono12345 - 4
Is dgrad+relu with fp32 supported?
#40 opened by tangjicheng46 - 5
Forward conv1d + transposition + conv1d ?
#9 opened by touisteur - 2
Why `cudnnConvolutionBackwardData` call `cudnn::ops::convertTensor_kernel<__half, __half, float, 0>(float, __half const*` ?
#21 opened by strint - 8
Number of heuristic engine configs mismatch by calling getEngineConfigCount and getEngineConfig
#17 opened by infloop777 - 1
question about the fusion_sample
#42 opened by cheneeheng - 2
Update single header file for nlohmann json
#50 opened by ernestyalumni - 13
Execute matmul op faild
#39 opened by Gebixiaochen - 3
Why implicit_convolveNd_hhgemm consume too long
#22 opened by yc-gao - 2
- 0
About cudnn backend
#29 opened by pdd-vn - 2
- 2
how to map to the original algorithm
#20 opened by jackmsye - 1
need default return value for cudnn_frontend::PointWiseDesc_v8::getPortCount() const
#25 opened by azrael417 - 5
- 6
Lack of activation function LeakyReLU
#18 opened by akineeic - 2
CUDNN not working with RTX A4000
#16 opened by vsantosu - 8
Error: CUDNN_STATUS_EXECUTION_FAILED
#12 opened by akineeic - 5
Support for half2 type convolution
#13 opened by infloop777 - 5
Cannot build nvidia-tensorflow with v0.5
#15 opened by ziyuang