First, make sure you tell CLion to treat .cu and .cuh files as C++ using the File Types settings menu.
CLion is not able to parse CUDA's language extensions, but it does provide a preprocessor macro, __JETBRAINS_IDE__, that is defined only when CLion is parsing the code. You can use this to implement almost complete CUDA support yourself.
Much of the problem is that CLion's parser is derailed by keywords like __host__ or __device__, causing it to fail to do things it otherwise knows how to do:
CLion has failed to understand Dtype in this example, because the CUDA stuff confused its parsing.
The most minimal solution to this problem is to give CLion preprocessor macros that make it ignore the new keywords, fixing the worst of the brokenness:
#ifdef __JETBRAINS_IDE__
#define __host__
#define __device__
#define __shared__
#define __constant__
#define __global__
#endif
With these macros defined, CLion parses the Dtype example above correctly.
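As a rough illustration of the effect, the sketch below applies the same macros to a made-up templated helper (clamp_to_unit is hypothetical and not taken from the code in the screenshots). The leading #define of __JETBRAINS_IDE__ is an assumption made only so the snippet can be fed to an ordinary C++ compiler, which approximates what CLion's parser sees. Real code should never define that macro itself.

```cpp
// Demo only: pretend to be CLion's parser so the macro block below is active
// when this file is compiled as plain C++. Do NOT define this in real code.
#define __JETBRAINS_IDE__

#ifdef __JETBRAINS_IDE__
#define __host__
#define __device__
#define __shared__
#define __constant__
#define __global__
#endif

// Hypothetical helper: with the qualifiers expanded to nothing, the parser
// sees an ordinary function template and can resolve Dtype normally.
template <typename Dtype>
__host__ __device__ Dtype clamp_to_unit(Dtype x) {
    const Dtype lo = Dtype(0);
    const Dtype hi = Dtype(1);
    return x < lo ? lo : (x > hi ? hi : x);
}
```

Because the qualifiers expand to nothing, the parser is left with standard C++, which is all the indexer needs to resolve the template parameter.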
However, CUDA functions like __syncthreads and __popc will still fail to index, as will CUDA builtins like threadIdx. One option is to provide endless preprocessor macros (or even struct definitions) for these, but that's ugly and sacrifices type-safety.
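For comparison, the hand-written stub approach might look something like the sketch below. Everything in it is a stand-in invented for illustration: ide_stub_uint3 is a hypothetical type (the real builtins use CUDA's own uint3 and dim3), and the leading #define of __JETBRAINS_IDE__ is again there only so the snippet compiles as plain C++.

```cpp
// Demo only: simulate CLion's parser. Do NOT define this in real code.
#define __JETBRAINS_IDE__

#ifdef __JETBRAINS_IDE__
// Hypothetical stand-in for CUDA's built-in index types.
struct ide_stub_uint3 { unsigned int x, y, z; };

// One hand-written stub per builtin variable...
static const ide_stub_uint3 threadIdx = {0, 0, 0};
static const ide_stub_uint3 blockIdx  = {0, 0, 0};
static const ide_stub_uint3 blockDim  = {1, 1, 1};

// ...and one per intrinsic. These exist only to give the indexer a
// declaration to resolve; they are never part of a real build.
inline void __syncthreads() {}

inline int __popc(unsigned int v) {
    int bits = 0;
    for (; v != 0; v &= v - 1) {  // clear the lowest set bit each iteration
        ++bits;
    }
    return bits;
}
#endif
```

Every further builtin or intrinsic needs another stub like these, and the stub types only loosely match the real ones, which is where the type-safety goes.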
If you're using Clang's CUDA frontend, you can do better. Clang implements the implicitly-defined CUDA builtins by defining them in headers, which it then includes when compiling your code. These provide definitions of things like threadIdx. By pretending to be the CUDA compiler's preprocessor and including device_functions.h, we can get __popc and friends to work, too:
#ifdef __JETBRAINS_IDE__
#define __host__
#define __device__
#define __shared__
#define __constant__
#define __global__
// This is slightly mental, but gets it to properly index device function calls like __popc and whatever.
#define __CUDACC__
#include <device_functions.h>
// These headers are all implicitly present when you compile CUDA with clang. CLion doesn't know that, so
// we include them explicitly to make the indexer happy. Doing this when you actually build is, obviously,
// a terrible idea :D
#include <__clang_cuda_builtin_vars.h>
#include <__clang_cuda_intrinsics.h>
#include <__clang_cuda_math_forward_declares.h>
#include <__clang_cuda_complex_builtins.h>
#include <__clang_cuda_cmath.h>
#endif // __JETBRAINS_IDE__
This will get you perfect indexing of virtually all CUDA code. CLion even gracefully copes with <<<...>>> syntax. It puts a little red line under one character on each end of the launch block, but otherwise treats it as a function call, which is perfectly fine.