Determining Shared Memory at Runtime

Apr 23, 2016 at 1:54 PM
I noticed that allocating shared memory requires the size to be known at compile time, and the best way to ensure it's there is to use a constant.

I have a situation where I may want to allocate more or less shared memory depending on how much the GPU has. To find this out, I need to query the GPU's properties. I'm fine with compiling the code after I've made the calculations, but that doesn't seem possible because the constants can't be changed.

If all my variables had been assigned a value by the time the CUDAfy module was compiled, it should be able to generate the code just the same, but I've had no success.

Is there a way?
Apr 23, 2016 at 3:14 PM
If I remember correctly, that's something CUDAfy doesn't do.
A dirty solution would be to move your code into an inline CUDAfy function that receives (among other things) a shared memory array allocated elsewhere as an argument.
Then write at least two kernels: one that allocates a small shared memory array, and another that allocates a large one. Both call the inline function, passing their shared array as an argument. The host code launches whichever kernel it finds better suited to the task.
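To make the idea concrete, here is a rough sketch in plain CUDA C++ rather than CUDAfy, since the pattern is the same once the code is translated. Everything named here (`processTile`, `kernelSmall`, `kernelLarge`, the two tile sizes, and the doubling "work") is hypothetical and only for illustration; the only real API calls are the standard CUDA runtime ones.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical tile sizes, fixed at compile time (assumption for this sketch).
constexpr int SMALL_TILE = 1024;  // 4 KiB of float
constexpr int LARGE_TILE = 8192;  // 32 KiB of float

// All the real logic lives here. The shared buffer is passed in by the caller,
// so this function does not care how large the allocation is.
__device__ __forceinline__ void processTile(float* tile, int tileSize,
                                            const float* in, float* out, int n)
{
    // Each block walks over the input one tile at a time.
    for (int base = blockIdx.x * tileSize; base < n; base += gridDim.x * tileSize) {
        // Stage the tile into shared memory.
        for (int i = threadIdx.x; i < tileSize && base + i < n; i += blockDim.x)
            tile[i] = in[base + i];
        __syncthreads();
        // Placeholder work: write back each staged value doubled.
        for (int i = threadIdx.x; i < tileSize && base + i < n; i += blockDim.x)
            out[base + i] = 2.0f * tile[i];
        __syncthreads();
    }
}

// Thin wrapper kernels: they differ only in the size of the shared array.
__global__ void kernelSmall(const float* in, float* out, int n)
{
    __shared__ float tile[SMALL_TILE];
    processTile(tile, SMALL_TILE, in, out, n);
}

__global__ void kernelLarge(const float* in, float* out, int n)
{
    __shared__ float tile[LARGE_TILE];
    processTile(tile, LARGE_TILE, in, out, n);
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("no usable CUDA device\n");
        return 1;
    }

    const int n = 1 << 16;
    std::vector<float> hostIn(n), hostOut(n);
    for (int i = 0; i < n; ++i) hostIn[i] = static_cast<float>(i);

    float *devIn = nullptr, *devOut = nullptr;
    cudaMalloc(&devIn, n * sizeof(float));
    cudaMalloc(&devOut, n * sizeof(float));
    cudaMemcpy(devIn, hostIn.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // The host picks the kernel at runtime from the device properties.
    const bool useLarge = prop.sharedMemPerBlock >= LARGE_TILE * sizeof(float);
    if (useLarge)
        kernelLarge<<<8, 256>>>(devIn, devOut, n);
    else
        kernelSmall<<<8, 256>>>(devIn, devOut, n);
    cudaDeviceSynchronize();

    cudaMemcpy(hostOut.data(), devOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("used %s kernel, out[10] = %f\n",
                useLarge ? "large" : "small", hostOut[10]);

    cudaFree(devIn);
    cudaFree(devOut);
    return 0;
}
```

The wrappers stay trivial, so supporting another size means adding one more small kernel and one more branch on the host, not duplicating the body.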
Apr 25, 2016 at 1:52 AM
Ah, I see what you're getting at, thanks. Not quite the solution I was hoping for, but at least I get some flexibility with lots of code reuse.