To debug a GPU kernel with cuda-gdb, we add -G -O0 to the nvcc command line, which in the case of CMake would be:
enable_language(CUDA)
...
set(CMAKE_CUDA_FLAGS_DEBUG "-G -O0 ${CMAKE_CUDA_FLAGS_DEBUG}")
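If you prefer not to touch the global flags, the same idea can be expressed per target. The following is only a sketch: the project name, target name, and main.cu are hypothetical, and it adds just -G, leaving the host optimization level to CMake's Debug defaults.

```cmake
cmake_minimum_required(VERSION 3.18)
project(cuda_debug_demo LANGUAGES CXX CUDA)

# Hypothetical target and source file, for illustration only.
add_executable(cuda_debug_demo main.cu)

# Pass -G (device-side debug info) only to CUDA sources,
# and only when the build configuration is Debug.
target_compile_options(cuda_debug_demo PRIVATE
    $<$<AND:$<CONFIG:Debug>,$<COMPILE_LANGUAGE:CUDA>>:-G>)
```

The generator expression keeps the flag away from host-only C++ sources and from Release builds.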
In HIP this is done differently, because the compiler is a flavor of Clang and inherits its behavior. The HIP frontend compiler invokes the backend compilers (which are also clang++) by itself. Therefore, the frontend clang++ needs -Xclang to forward options down to the backends:
set(CMAKE_HIP_FLAGS_DEBUG "-ggdb -fstandalone-debug -Xclang -O0 -Xclang -gcodeview ${CMAKE_HIP_FLAGS_DEBUG}")
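For context, here is how that line might sit in a complete minimal CMakeLists.txt. This is a sketch under assumptions: the project name, target name, and main.hip are hypothetical, and keeping the existing debug flags inside the quoted string is a choice made for this example. The flags themselves are the ones given above.

```cmake
# HIP as a first-class language requires CMake 3.21 or newer.
cmake_minimum_required(VERSION 3.21)
project(hip_debug_demo LANGUAGES CXX HIP)

# Debug flags from the text, followed by whatever CMake already set.
# Keeping everything in one quoted string avoids turning the value into a list.
set(CMAKE_HIP_FLAGS_DEBUG
    "-ggdb -fstandalone-debug -Xclang -O0 -Xclang -gcodeview ${CMAKE_HIP_FLAGS_DEBUG}")

# Hypothetical source file; the .hip extension makes CMake compile it as HIP.
add_executable(hip_debug_demo main.hip)
```

These flags only take effect in a Debug configuration, for example when configuring with -DCMAKE_BUILD_TYPE=Debug.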
Note that debugging must always be performed with rocgdb, even for the host code, because HIP's Clang and GCC these days tend to use incompatible debugging formats. As a result, you should get nicely debuggable code in the TUI.
Dmitry Mikushin