Error Unknown with large set of data

Feb 11, 2013 at 11:14 AM
Hi
I have more than 1000K records to process in my function. My GPU configuration is as follows:




--- General Information for device 0 ---
Name: GeForce GTX 650
Device Id: 0
Compute capability: 3.0
Clock rate: 1058500
Simulated: False

--- Memory Information for device 0 ---
Total global mem: 1073741824
Total constant Mem: 65536
Max mem pitch: 2147483647
Texture Alignment: 512

--- MP Information for device 0 ---
Shared mem per mp: 49152
Registers per mp: 65536
Threads in warp: 32
Max threads per block: 1024
Max thread dimensions: (1024, 1024, 1)
Max grid dimensions: (2147483647, 65535, 1)

When I ran it with 200K+ records, it ran perfectly. But with 1000K records, it throws an "Error Unknown" exception.
I am launching the kernel as follows:

dim3 grids = new dim3(1024 / 8);
dim3 threads = new dim3(16, 16);

_gpu.Launch(grids , threads, "GroupOutages", arrOutages_dev, arrOutages.Length, results_dev, arrOutages.Length, dev_r);

Please advise what the best kernel sizing would be in my case, as I am very new to this. An early reply would be highly appreciated.

Thanks
Sachin
Feb 11, 2013 at 11:19 AM
I have also tried with

_gpu.Launch(16 , 1024, "GroupOutages", arrOutages_dev, arrOutages.Length, results_dev, arrOutages.Length, dev_r);

It also gives me the same error.
Feb 12, 2013 at 6:07 AM
Any solutions or suggestions, please?
Coordinator
Feb 12, 2013 at 9:07 AM
Edited Feb 14, 2013 at 1:55 PM
Error unknowns are nasty because they are, erm, unknown. Actually, this post gives some insight: http://cudafy.codeplex.com/discussions/405279

Without your full code it is difficult to help, but try increasing the number of records gradually. Could you be running out of memory or going beyond initialized memory?
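
To rule out the memory question quickly, a back-of-the-envelope estimate can be compared with the 1073741824 bytes of total global memory from the device listing. The sketch below is plain C# with no Cudafy calls. The record size and buffer count are hypothetical placeholders, since the real struct layout is not shown in this thread, and the memory that is actually free on the card will be lower than the total.

```csharp
using System;

static class MemoryBudget
{
    // Total global memory reported for the GeForce GTX 650 in the listing above.
    public const long TotalGlobalMemBytes = 1073741824;

    // Bytes needed if every buffer holds one fixed-size element per record.
    // bytesPerRecord and bufferCount are assumptions supplied by the caller.
    public static long RequiredBytes(long recordCount, int bytesPerRecord, int bufferCount)
    {
        if (recordCount < 0 || bytesPerRecord <= 0 || bufferCount <= 0)
            throw new ArgumentOutOfRangeException();
        // checked: fail loudly on overflow instead of wrapping around.
        return checked(recordCount * bytesPerRecord * bufferCount);
    }

    // True when the estimate fits into the given number of available bytes.
    public static bool Fits(long requiredBytes, long availableBytes)
    {
        return requiredBytes <= availableBytes;
    }
}
```

For example, 1,000,000 records at an assumed 64 bytes each in two buffers come to 128,000,000 bytes, which fits comfortably. If an estimate like this fits, going beyond initialized memory becomes the more likely suspect.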
Feb 12, 2013 at 5:48 PM
You can find the full code in the following discussion:

http://cudafy.codeplex.com/discussions/429919

As an expert, after analyzing the code, could you please suggest what grid size and block size I should use to launch the Cudafy function?
Feb 14, 2013 at 10:54 AM
Edited Feb 14, 2013 at 10:57 AM
Hi

1 - Use multiples of 32 for the block size, to avoid partially filled warps.
2 - Depending on your card and the nature of your problem, a good block size could be 256 or 512, for example.
3 - Let the occupancy calculator tool help you determine a good block size, since full occupancy depends on register and shared memory use.
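
As a rough illustration of the points above, the grid size can be derived from the record count by ceiling division once a block size is picked. This is a minimal sketch in plain C#, assuming a 1-D launch with one thread per record. The class and method names are made up for this example, and the limits (warp size 32, 1024 threads per block) are taken from the device listing earlier in the thread.

```csharp
using System;

static class LaunchConfig
{
    public const int WarpSize = 32;             // "Threads in warp" from the listing
    public const int MaxThreadsPerBlock = 1024; // "Max threads per block" from the listing

    // Smallest number of blocks such that blocks * blockSize >= recordCount.
    public static int BlocksFor(int recordCount, int blockSize)
    {
        if (recordCount < 0)
            throw new ArgumentOutOfRangeException("recordCount");
        if (blockSize <= 0 || blockSize > MaxThreadsPerBlock || blockSize % WarpSize != 0)
            throw new ArgumentOutOfRangeException("blockSize");
        // Ceiling division, done in long so the addition cannot overflow.
        return (int)(((long)recordCount + blockSize - 1) / blockSize);
    }
}
```

With 1,000,000 records and a block size of 256 this gives 3907 blocks, far below the grid limit in the listing. The last block is then only partly used, so the kernel must check its index against the record count before touching any array.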

As for the "Error Unknown" exceptions, you should check this thread:

http://cudafy.codeplex.com/discussions/407193

But since you're getting them only once you exceed a certain record count, I suspect you're doing a buffer overrun somewhere.
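
One common way to guard against that kind of overrun is a grid-stride loop whose loop condition doubles as the bounds check. The sketch below is not the GroupOutages kernel from this thread, and it is not Cudafy code either. It is a CPU-only simulation in plain C# that replays the index arithmetic of every simulated thread, so the coverage can be verified without a GPU. In a Cudafy kernel the same values would, as far as I know, come from thread.threadIdx.x, thread.blockIdx.x, thread.blockDim.x and thread.gridDim.x on the GThread parameter.

```csharp
using System;

static class GridStrideSimulation
{
    // Replays a 1-D launch on the CPU and counts how often each record
    // index is touched by any simulated thread.
    public static int[] CountVisits(int gridSize, int blockSize, int recordCount)
    {
        var visits = new int[recordCount];
        int stride = gridSize * blockSize; // total number of threads in the launch
        for (int block = 0; block < gridSize; block++)
        {
            for (int thread = 0; thread < blockSize; thread++)
            {
                int start = block * blockSize + thread; // global thread index
                // The loop condition is the bounds guard: a thread whose index
                // is at or beyond recordCount never touches the array.
                for (int i = start; i < recordCount; i += stride)
                {
                    visits[i]++;
                }
            }
        }
        return visits;
    }
}
```

With the launch from the first post (128 blocks of 16 x 16 = 256 threads, so 32,768 threads in total) and 1,000,000 records, every index is visited exactly once and nothing past the end of the array is touched. The same holds when there are more threads than records.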