How can I unroll a loop in a kernel and debug it using NSight?

Nov 22, 2012 at 7:16 AM


I want to unroll a loop in a CUDA kernel function. Is that supported by the current version, 1.12?

BTW, I'd also like to know how to debug the kernel function using NSight.

Thanks in advance.

Nov 22, 2012 at 11:17 AM


Here's a guide on how to enable NSight with CUDAfy.


As for unrolling loops, the answer is yes, but probably not in your older version of CUDAfy. I'm not sure whether the released binaries are up to date; if they aren't, you can always download the latest source code and build it. Inside your kernel implementation, simply include GThread.InsertCode("#pragma unroll SOMENUMBERHERE"); immediately before your loop.
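For reference, here is roughly what the directive looks like in plain CUDA C. This is only a sketch: it assumes the string passed to GThread.InsertCode is emitted verbatim at that point in the generated kernel source. The kernel name sumRows, the helper runSumRowsDemo, and the ROWS/COLS sizes are made up for illustration and are not part of CUDAfy.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical sizes, chosen only for illustration.
#define ROWS 8
#define COLS 4

// Each thread sums one row of a ROWS x COLS matrix stored row by row.
__global__ void sumRows(const int *in, int *out)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= ROWS) return;

    int acc = 0;
    // The pragma must sit directly before the loop it applies to.
    // The number is the requested unroll factor; since COLS is a
    // compile-time constant equal to 4, the loop can be unrolled fully.
    #pragma unroll 4
    for (int i = 0; i < COLS; ++i)
        acc += in[row * COLS + i];

    out[row] = acc;
}

// Host helper: fills the matrix with 0..31, runs the kernel, and
// checks every row sum. Returns false on any CUDA error or mismatch.
bool runSumRowsDemo()
{
    int hIn[ROWS * COLS], hOut[ROWS];
    for (int k = 0; k < ROWS * COLS; ++k) hIn[k] = k;

    int *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, sizeof(hIn)) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, sizeof(hOut)) != cudaSuccess) { cudaFree(dIn); return false; }
    cudaMemcpy(dIn, hIn, sizeof(hIn), cudaMemcpyHostToDevice);

    sumRows<<<1, ROWS>>>(dIn, dOut);

    // The blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return false;

    // Row r holds 4r, 4r+1, 4r+2, 4r+3, so its sum is 16r + 6.
    for (int r = 0; r < ROWS; ++r)
        if (hOut[r] != 16 * r + 6) return false;
    return true;
}

int main()
{
    bool ok = runSumRowsDemo();
    printf(ok ? "ok\n" : "failed\n");
    return ok ? 0 : 1;
}
```

Unrolling changes only the generated instructions, not the results, so the row sums are the same with or without the pragma. To see whether it actually took effect, compare the PTX produced by nvcc with and without the directive.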