[Issue]: Conversion of tiny-cuda-nn lib into HIP

Question

Vishal-S-P opened this issue 7 months ago · 4 comments

Problem Description

I am facing issues converting CUDA code to HIP using PyTorch's CUDAExtension approach. Please see the Steps to Reproduce section.

Operating System

OS: NAME="Ubuntu" VERSION="22.04.3 LTS (Jammy Jellyfish)"

CPU

AMD EPYC 7773X 64-Core Processor

GPU

AMD Instinct MI250X

ROCm Version

ROCm 6.0.0

ROCm Component

HIPIFY

Steps to Reproduce

I am trying to convert the CUDA code from https://github.com/NVlabs/tiny-cuda-nn into HIP and compile it as a PyTorch extension. Here is the setup.py I am using:

my_setup.txt

Additionally, I converted the header files in https://github.com/NVlabs/tiny-cuda-nn/tree/master/include/tiny-cuda-nn using the shell script below -

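#!/bin/bash
# Hipify every header under include/tiny-cuda-nn in place.
CUDA_DIR="../../include/tiny-cuda-nn"
HIP_DIR="../../include/tiny-cuda-nn"
# Export both so the inner sh started by find can see them.
export CUDA_DIR HIP_DIR
# Parentheses are escaped and the pattern quoted so the shell passes them to find unchanged.
find "$CUDA_DIR" -type f \( -iname '*.h' \) -exec sh -c '
for file; do
  hipfile="$HIP_DIR/${file#$CUDA_DIR/}"
  mkdir -p "$(dirname "$hipfile")"
  echo "Converting $file -> $hipfile"
  hipify-perl "$file" -print-stats -inplace
done
' sh {} +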

You can reproduce the following error -

/dockerx/Text-to-3D-Models-on-AMD-GPUs/tiny-cuda-nn/include/tiny-cuda-nn/vec.h:303:53: error: invalid input constraint 'l' in asm
303 | asm ("red.relaxed.gpu.global.add.f32 [%0], %1;" :: "l"(addr), "r"(in_int));
    | ^
/dockerx/Text-to-3D-Models-on-AMD-GPUs/tiny-cuda-nn/include/tiny-cuda-nn/vec.h:329:61: error: invalid input constraint 'l' in asm
329 | asm ("red.relaxed.gpu.global.add.noftz.f16x2 [%0], %1;" :: "l"(addr), "r"(in_int));

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

Answer 1 · 2024-06-15T00:14:00.000Z

@Vishal-S-P the HIP compiler is seeing the inline PTX at lines 303 and 329 of vec.h, and that certainly won't work on an AMD GPU. Apparently the guard "#if TCNN_MIN_GPU_ARCH >= 70" is evaluating to true, so the PTX path is being compiled. That needs to be fixed.
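
To see which other headers still contain inline assembly that the HIP compiler might reach, a quick scan can help. The following is a minimal sketch, not part of tiny-cuda-nn or HIPIFY; the helper name find_inline_asm and the regular expression are assumptions made for illustration, and the sample line is taken from the error log above.

```python
import re

# Matches `asm (` and `asm volatile (`; other spellings such as `__asm__` are not covered.
ASM_PATTERN = re.compile(r"\basm\s*(?:volatile\s*)?\(")


def find_inline_asm(source):
    """Return (line_number, stripped_line) for every line containing an asm statement."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), 1)
        if ASM_PATTERN.search(line)
    ]


# Sample input built from the statement reported at vec.h:303.
sample = (
    "int in_int = 0;\n"
    'asm ("red.relaxed.gpu.global.add.f32 [%0], %1;" :: "l"(addr), "r"(in_int));\n'
)
for number, text in find_inline_asm(sample):
    print(f"line {number}: {text}")
```

Running the same function over the contents of each hipified header would list the statements whose preprocessor guards are worth checking.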

Answer 2 · 2024-06-15T00:17:50.000Z

I am passing

definitions = base_definitions + [f"-DTCNN_MIN_GPU_ARCH={compute_capability}"]

with the compute capability hardcoded to 70.

Should I not use 70?

Answer 3 · 2024-10-07T16:34:30.000Z

Hi @Vishal-S-P, thanks for waiting.

Yes, try using a value below 70 in your setup.py since the PTX ISA is specific to NVIDIA GPUs.
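
A minimal sketch of that change, assuming setup.py builds its definitions the way Answer 2 describes. The helper name tcnn_definitions and the cap of 60 are hypothetical; any value below 70 should make the "#if TCNN_MIN_GPU_ARCH >= 70" guard false. This only addresses the guard quoted in Answer 1, and other guards in the library may need separate attention.

```python
def tcnn_definitions(base_definitions, compute_capability, is_rocm):
    """Return compiler definitions, capping TCNN_MIN_GPU_ARCH on ROCm builds."""
    if is_rocm:
        # Hypothetical cap: a value below 70 keeps the inline PTX path compiled out.
        compute_capability = min(compute_capability, 60)
    return base_definitions + [f"-DTCNN_MIN_GPU_ARCH={compute_capability}"]


# In setup.py, is_rocm could be derived from `torch.version.hip is not None`.
print(tcnn_definitions([], 70, is_rocm=True))
```

Passing is_rocm explicitly keeps the helper independent of PyTorch, so the same setup.py could still build the CUDA variant unchanged.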

Note that you will also have to hipify the cutlass library under the dependencies directory, since it is written for CUDA as well.
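
One way to approach this is to reuse the hipify-perl invocation from the question's shell script over the cutlass sources. The sketch below only builds the command lines as a dry run; the helper name hipify_commands, the set of file extensions, and the dependencies/cutlass path are assumptions, not something confirmed in this thread.

```python
from pathlib import Path

# Assumed set of extensions that may contain CUDA code; adjust as needed.
CUDA_SUFFIXES = {".h", ".hpp", ".cuh", ".cu", ".inl"}


def hipify_commands(root):
    """Build one in-place hipify-perl command per candidate file under root."""
    root = Path(root)
    return [
        ["hipify-perl", str(path), "-print-stats", "-inplace"]
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.suffix.lower() in CUDA_SUFFIXES
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in hipify_commands("dependencies/cutlass"):
        print(" ".join(command))
```

Each printed command could be handed to subprocess.run once the list looks right. Since -inplace overwrites the sources, working on a copy or a clean git checkout is the safer choice.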

Answer 4 · 2024-10-21T14:59:28.000Z

@Vishal-S-P I am closing this ticket due to inactivity. If the fix suggested above does not work, please feel free to re-open the ticket and we can look into it further.