cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
ifromeast opened this issue · 2 comments
When running pytest test/python_fe on the latest version, it fails with:
graph.validate()
graph.build_operation_graph()
graph.create_execution_plans([cudnn.heur_mode.A, cudnn.heur_mode.FALLBACK])
> graph.check_support()
E cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
test/python_fe/test_matmul_bias_relu.py:278: cudnnGraphNotSupportedError
==================================================================================== warnings summary =====================================================================================
test/python_fe/test_apply_rope.py::test_apply_rope
/home/vipuser/miniconda3/envs/llm-env/lib/python3.10/site-packages/torch/random.py:159: UserWarning: CUDA reports that you have 8 available devices, and you have used fork_rng without explicitly specifying which devices are being used. For safety, we initialize *every* CUDA device by default, which can be quite slow if you have a lot of CUDAs. If you know that you are only making use of a few CUDA devices, set the environment variable CUDA_VISIBLE_DEVICES or the 'devices' keyword argument of fork_rng with the set of devices you are actually using. For example, if you are using CPU only, set device.upper()_VISIBLE_DEVICES= or devices=[]; if you are using device 0 only, set CUDA_VISIBLE_DEVICES=0 or devices=[0]. To initialize all devices and suppress this warning, set the 'devices' keyword argument to `range(torch.cuda.device_count())`.
warnings.warn(message)
test/python_fe/test_conv_genstats.py::test_conv_genstats
/mnt/zzd/llm.c/cudnn-frontend/test/python_fe/test_conv_genstats.py:14: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
conv_output = torch.nn.functional.conv2d(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================= short test summary info =================================================================================
FAILED test/python_fe/test_apply_rope.py::test_apply_rope - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_batchnorm.py::test_bn_relu_with_mask - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_batchnorm.py::test_drelu_dadd_dbn - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_bias.py::test_conv_bias_relu - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_bias.py::test_conv_relu - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_bias.py::test_conv3d_bias_leaky_relu - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_bias.py::test_leaky_relu_backward - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_bias.py::test_conv_int8 - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_genstats.py::test_conv_genstats - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_conv_reduction.py::test_reduction - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_matmul_bias_relu.py::test_matmul_bias_relu[param_extract0] - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_matmul_bias_relu.py::test_matmul_bias_relu[param_extract1] - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_matmul_bias_relu.py::test_matmul_bias_relu[param_extract4] - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
FAILED test/python_fe/test_matmul_bias_relu.py::test_matmul_bias_relu[param_extract5] - cudnn._compiled_module.cudnnGraphNotSupportedError: [cudnn_frontend] Error: No execution plans built successfully.
================================================================ 14 failed, 3514 skipped, 2 warnings in 100.57s (0:01:40) =================================================================
My environment: CUDA 12.4, cuDNN 9.1, driver version 550.54.15, on Ubuntu 22.04.
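For anyone trying to reproduce this outside of pytest: the failing sequence in the traceback above boils down to building a small graph and calling check_support(). Below is a minimal sketch in the style of test_matmul_bias_relu.py — the tensor shapes, strides, and data types are illustrative only, and the keyword names follow the python_fe samples, which may differ slightly across frontend versions.

import cudnn

# Build a minimal matmul graph mirroring the failing call sequence.
# Shapes/strides/dtypes below are illustrative, not the exact test parameters.
handle = cudnn.create_handle()
graph = cudnn.pygraph(
    io_data_type=cudnn.data_type.HALF,
    intermediate_data_type=cudnn.data_type.FLOAT,
    compute_data_type=cudnn.data_type.FLOAT,
    handle=handle,
)

a = graph.tensor(name="A", dim=[1, 128, 64], stride=[128 * 64, 64, 1],
                 data_type=cudnn.data_type.HALF)
b = graph.tensor(name="B", dim=[1, 64, 32], stride=[64 * 32, 32, 1],
                 data_type=cudnn.data_type.HALF)

c = graph.matmul(name="matmul", A=a, B=b)
c.set_output(True).set_data_type(cudnn.data_type.HALF)

graph.validate()
graph.build_operation_graph()
graph.create_execution_plans([cudnn.heur_mode.A, cudnn.heur_mode.FALLBACK])
graph.check_support()  # raises cudnnGraphNotSupportedError on the affected setup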
Thanks @ifromeast for following up on this from the llm.c repo.
I have added an experimental branch, issues/75_and_78, which prints cudaGetLastError().
Please run:
CUDNN_LOGLEVEL_DBG=3 CUDNN_LOGDEST_DBG=backend_api.log CUDNN_FRONTEND_LOG_FILE=fe.log CUDNN_FRONTEND_LOG_INFO=1 pytest -s test/python_fe
and attach both backend_api.log and fe.log so we can debug this.
Thanks
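If it is easier to collect these logs from a standalone script instead of the pytest run, the same logging can be enabled by setting the variables from Python before the libraries are loaded. A hypothetical sketch, assuming the variables only need to be in the environment before cudnn/torch initialize:

import os

# Set the cuDNN logging environment variables before cudnn (and torch) are
# imported, since the libraries read them at initialization time.
os.environ["CUDNN_LOGLEVEL_DBG"] = "3"
os.environ["CUDNN_LOGDEST_DBG"] = "backend_api.log"
os.environ["CUDNN_FRONTEND_LOG_FILE"] = "fe.log"
os.environ["CUDNN_FRONTEND_LOG_INFO"] = "1"

import cudnn  # import only after the environment is set

# ... build and check_support() the failing graph here; the backend calls
# land in backend_api.log and the frontend trace in fe.log.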
@Anerudhan I am seeing this error as well when trying to run the matmul example here. I am running it from my own script rather than from within the repo.
Here are the log files you requested.
Could you please take a look? Thanks.