Compiling .cu files with dynamic parallelism (CUDA)

I switched to a new GPU, a GeForce GTX 980 with compute capability (cc) 5.2, so it should support dynamic parallelism. However, I could not compile even a simple example from the programming guide. I won't post the code here since it isn't needed: it is just a __global__ kernel that launches another __global__ kernel.

1) I use VS2013. Under Property Pages -> CUDA C/C++ -> Device, I changed the Code Generation property to compute_35,sm_35, and here is the output:

1>------ Build started: Project: testCublas3, Configuration: Debug Win32 ------
1>  Compiling CUDA source file kernel.cu...
1>  
1>  C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3>"C:\Program      Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"  -G   --keep-dir Debug -maxrregcount=0  --machine 32 --compile -cudart static  -g   -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu" 
1>C:/programs/misha/cuda/Projects/test projects/testCublas3/testCublas3/kernel.cu(13): error : kernel launch from __device__ or __global__ functions requires separate compilation mode
1>  kernel.cu
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.targets(593,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"  -G   --keep-dir Debug -maxrregcount=0  --machine 32 --compile -cudart static  -g   -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu"" exited with code 2.

I guess I need another option for this compilation, -rdc=true, but I couldn't find where to set it in VS2013.

2) When I set the Code Generation property to compute_52,sm_52, I get an error: Unsupported gpu architecture 'compute_52'. But my cc is 5.2. So can I only compile for cc 3.5 at most?

Thanks

Arndt answered 9/1, 2015 at 0:8 Comment(3)
Regarding the unsupported architecture message, there was an update provided for CUDA 6.5 to support cc 5.2. It is here. If you install that, that particular error should go away. – Overweigh
Thanks, already downloading. How about the first question? Is it true that I have to include the option -rdc=true somehow? – Arndt
Yes, you have to enable separate compilation and linking, and you also need to include some extra libraries. Perhaps you should look at one of the VS project files that are in the CUDA samples, such as the device CUBLAS sample. – Overweigh

Regarding item 1, CUDA dynamic parallelism requires separate compilation and linking (-rdc=true), as well as linking against the device runtime library (-lcudadevrt). Dynamic parallelism that also uses CUBLAS additionally requires linking against the device CUBLAS library (-lcublas_device). Possibly the simplest way to see where all of these go in a Visual Studio project is to start from the Visual Studio project for the device CUBLAS sample.
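As a rough illustration of what those flags enable, here is a minimal sketch of a parent kernel launching a child kernel, with the matching command-line build in the header comment. The file name and kernel names are made up for this example. In the CUDA Visual Studio integration, I believe the equivalent settings are CUDA C/C++ -> Common -> Generate Relocatable Device Code = Yes (-rdc=true) and adding cudadevrt.lib under Linker -> Input -> Additional Dependencies, but treat those exact property locations as an assumption and compare against the sample project.

```cuda
// dp_demo.cu (hypothetical file name, for illustration only)
//
// Command-line build using the flags named in the answer:
//   nvcc -arch=sm_35 -rdc=true dp_demo.cu -o dp_demo -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: launched from device code by parentKernel.
__global__ void childKernel(int *out)
{
    out[threadIdx.x] = threadIdx.x * 2;
}

// Parent kernel: a single thread launches the child grid.
// This device-side launch is what needs -rdc=true and cudadevrt.
__global__ void parentKernel(int *out)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        childKernel<<<1, 4>>>(out);
    }
}

int main()
{
    const int n = 4;
    int h_out[n] = {0};
    int *d_out = NULL;

    cudaError_t err = cudaMalloc((void **)&d_out, n * sizeof(int));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    parentKernel<<<1, 1>>>(d_out);

    // The parent grid is not complete until its child grids finish,
    // so a host-side synchronize covers both kernels.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) {
        printf("out[%d] = %d\n", i, h_out[i]);
    }

    cudaFree(d_out);
    return 0;
}
```

Compiling the same file without -rdc=true should reproduce the "requires separate compilation mode" error shown in the question.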

Regarding item 2, your GTX 980's compute capability 5.2 is not recognized because you need the latest update for the CUDA 6.5 toolkit, which is available here.
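If it helps to confirm what the installed GPU actually reports before picking a code generation target, a small runtime query along these lines can be used. This is only a sketch; the file name is hypothetical, and it checks device 0 only.

```cuda
// query_cc.cu (hypothetical file name, for illustration only)
// Build: nvcc query_cc.cu -o query_cc
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp prop;

    // Query the properties of device 0 (assumed to be the GTX 980 here).
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);

    // Dynamic parallelism needs compute capability 3.5 or higher.
    bool supportsDP = (prop.major > 3) || (prop.major == 3 && prop.minor >= 5);
    printf("dynamic parallelism supported: %s\n", supportsDP ? "yes" : "no");
    return 0;
}
```

Note that the device reporting cc 5.2 is separate from the toolkit being able to target it: the "Unsupported gpu architecture" error comes from nvcc, not from the GPU.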

(Note that the cublas_device capability has been removed from recent versions of CUDA.)

Overweigh answered 9/1, 2015 at 23:14 Comment(0)
