Why does this not throw an Out of Bounds Exception?

May 3, 2012 at 4:31 AM

Why does the following code not throw an out-of-bounds exception when I'm clearly indexing past the array bounds?

[Cudafy]
public static void SinglePassSort(GThread thread, int[] input, int[] destination)
{
    int[] cache = thread.AllocateShared<int>("inMemoryTree", N);
    int[] bitness = thread.AllocateShared<int>("bitNess", N);
    int[] totalFalse = thread.AllocateShared<int>("totalFalse", 1);

    var m = int.MaxValue;
    destination[destination.Length + 1] = 9939;
    destination[m] = 9393;
    destination[m + 1] = 939393;
    int thid = thread.threadIdx.x;
    //int offset = 1;

    int leftLeafIndex = 2 * thid;
    int rightLeafIndex = 2 * thid + 1;

    destination[leftLeafIndex] = leftLeafIndex;
    destination[rightLeafIndex] = rightLeafIndex;
}

May 3, 2012 at 4:36 AM

So actually, if you run the modified sample below, you will see that the returned destination array has the following entries:

destination[0] = 9939
destination[1] = 9393
destination[2] = 9939

[Cudafy]
public static void SinglePassSort(GThread thread, int[] input, int[] destination)
{
    int[] cache = thread.AllocateShared<int>("inMemoryTree", N);
    int[] bitness = thread.AllocateShared<int>("bitNess", N);
    int[] totalFalse = thread.AllocateShared<int>("totalFalse", 1);

    var m = int.MaxValue;
    destination[destination.Length + 1] = 9939;
    destination[m] = 939355;
    destination[m + 1] = 939393;

    int thid = thread.threadIdx.x;
    //int offset = 1;

    int leftLeafIndex = 2 * thid;
    int rightLeafIndex = 2 * thid + 1;

    destination[leftLeafIndex] = leftLeafIndex;
    destination[rightLeafIndex] = rightLeafIndex;

    destination[0] = destination[destination.Length + 1];
    destination[1] = destination[m];
    destination[2] = destination[m + 1];
}

Coordinator
May 3, 2012 at 11:27 AM

The reason is that CUDA does not detect this! Remember that the code above is first translated to CUDA C and then compiled with the NVIDIA CUDA compiler (nvcc), and the resulting GPU code performs no array bounds checks. Out-of-bounds accesses like these can lead to black screens, unexpected results, and general bad news. Thankfully, unless you've disabled the Windows time-out (e.g. for debugging in Parallel Nsight), the driver will recover. To benefit from all the .NET goodness that we are used to, you could run the code through the emulator first (pass the appropriate flag to GetDevice(...)).
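
For comparison, here is a minimal, self-contained sketch of what ordinary managed C# does with the same three writes. It does not use CUDAfy at all: `BoundsCheckDemo`, `WriteThrows`, and the array length of 8 are made up for illustration. On the CPU the runtime checks every index and throws `IndexOutOfRangeException`, which is presumably the kind of check running on the emulator gives you back.

```csharp
using System;

public static class BoundsCheckDemo
{
    // Returns true if writing destination[index] throws on the CPU.
    public static bool WriteThrows(int[] destination, int index)
    {
        try
        {
            destination[index] = 9939;
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int[] destination = new int[8];   // hypothetical size, chosen for the example
        int m = int.MaxValue;

        // One past the last valid index plus one, as in the kernel above.
        Console.WriteLine(WriteThrows(destination, destination.Length + 1)); // True

        // Far beyond the end of the array.
        Console.WriteLine(WriteThrows(destination, m));                      // True

        // m + 1 wraps around to int.MinValue, a negative index.
        Console.WriteLine(WriteThrows(destination, unchecked(m + 1)));       // True

        // A valid index does not throw.
        Console.WriteLine(WriteThrows(destination, 0));                      // False
    }
}
```

Note that `m + 1` does not produce a large positive index: with default (unchecked) integer arithmetic it wraps to `int.MinValue`, so the third write is a negative index rather than one past the end.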

Nick
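
Since the GPU will not check indices for you, one common defensive pattern is to compare each index against the array length before writing. The sketch below shows that check as plain C# so it runs without CUDAfy. `GuardedWrites` and `WriteLeaves` are hypothetical names, and the `thid` parameter stands in for `thread.threadIdx.x` from the kernel in the question. Treat it as an illustration of the guard, not as the recommended way to write this particular kernel.

```csharp
using System;

public static class GuardedWrites
{
    // Writes the two leaf slots for one thread, skipping any index
    // that falls outside the array. Returns the number of writes performed.
    public static int WriteLeaves(int thid, int[] destination)
    {
        int written = 0;
        int leftLeafIndex = 2 * thid;
        int rightLeafIndex = 2 * thid + 1;

        if (leftLeafIndex >= 0 && leftLeafIndex < destination.Length)
        {
            destination[leftLeafIndex] = leftLeafIndex;
            written++;
        }

        if (rightLeafIndex >= 0 && rightLeafIndex < destination.Length)
        {
            destination[rightLeafIndex] = rightLeafIndex;
            written++;
        }

        return written;
    }

    public static void Main()
    {
        // Four simulated threads, but only five slots: indices 5, 6 and 7 are skipped.
        int[] destination = new int[5];
        for (int thid = 0; thid < 4; thid++)
        {
            WriteLeaves(thid, destination);
        }

        Console.WriteLine(string.Join(", ", destination)); // 0, 1, 2, 3, 4
    }
}
```

The same comparison against `destination.Length` could be placed around the writes in the `[Cudafy]` method, at the cost of a branch per write.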