Load .cubin

Aug 8, 2013 at 12:43 PM
How do I load a .cubin? I tried GPGPU.LoadModule, but it fails with an XmlException.
Coordinator
Aug 10, 2013 at 6:18 AM
You cannot load a cubin or PTX file directly, only as part of a CUDAfy module. Please consult the user manual.
Aug 10, 2013 at 2:38 PM
I’m sorry, but I can’t see how I can convert or embed a .cubin into a .cdfy. Can you please give me another hint? Are dummy functions what I am looking for?
Coordinator
Aug 12, 2013 at 7:36 AM
Currently there is no way to import a PTX or cubin made elsewhere into CUDAfy, as that defeats much of the purpose of CUDAfy. I will keep it in mind, but so that I can gauge the priority, can you say where the cubin was made? Why do you not make the module in CUDAfy?
Aug 15, 2013 at 4:42 AM
Edited Aug 15, 2013 at 4:44 AM
Hello, related question: would there be a way to modify the GPU binary before launching? I want to do some very low-level coding in the binary before each launch (some hardware-level AMD GCN instructions while in binary form). I would like to use C# with OpenCL, but I have this one requirement.
Coordinator
Aug 15, 2013 at 7:26 AM
sunsetquest wrote:
Hello, related question: would there be a way to modify the GPU binary before launching? I want to do some very low-level coding in the binary before each launch (some hardware-level AMD GCN instructions while in binary form). I would like to use C# with OpenCL, but I have this one requirement.
Are you using OpenCL with CUDAfy? With OpenCL, CUDAfy does not precompile; it only stores the OpenCL source code and compiles it when loading modules.
Aug 16, 2013 at 5:40 AM
Thank you Nick, I was planning to use OpenCL. In C++, OpenCL allows a call to clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...) to fetch the binary, which can then be modified and reloaded with clCreateProgramWithBinary(). Basically, I was just hoping that CUDAfy would allow raw access to the binary before launch, or some other way I could do the same thing. I am planning to create a template containing several instructions that I can identify, and then use regular expressions to swap in the raw 32-bit AMD GCN instructions.
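For illustration, the template idea could be sketched roughly as follows in C, the language of the native OpenCL API. This is only a minimal sketch under stated assumptions, not anything CUDAfy provides: it swaps a fixed 32-bit marker word rather than using regular expressions, it assumes the markers sit on 4-byte boundaries and are stored little-endian, and it scans the whole buffer, so a marker value that also happens to occur in non-code bytes would be patched too. The names `patch_words`, `load_le32`, and `store_le32` are hypothetical helpers, and the marker and replacement values are arbitrary placeholders, not real GCN encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word from an arbitrary byte pointer. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit word in little-endian byte order. */
static void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Replace every 4-byte-aligned occurrence of `marker` in buf with
 * `replacement`. Returns the number of words that were patched.
 * Assumption: the template kernel was written so that `marker`
 * appears only where an instruction should be swapped. */
size_t patch_words(unsigned char *buf, size_t len,
                   uint32_t marker, uint32_t replacement)
{
    size_t patched = 0;
    for (size_t off = 0; off + 4 <= len; off += 4) {
        if (load_le32(buf + off) == marker) {
            store_le32(buf + off, replacement);
            patched++;
        }
    }
    return patched;
}
```

In the flow described above, something like this would run on each per-device buffer after the clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...) call and before clCreateProgramWithBinary(). Checking that the returned count matches the number of markers placed in the template is a cheap guard against patching the wrong bytes.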
Coordinator
Aug 16, 2013 at 1:59 PM
At what stage do you access the compiled binary? It sounds like we would need a discrete compile stage option and a new load-module step, in the same way as CUDAfy does with CUDA. This could fit into the CUDAfy model.
The more information you can supply about the OpenCL steps, the easier it is to gauge the effort required.
Aug 21, 2013 at 4:44 AM
Hi Nick, Sorry for my late reply. I was out of town for a couple days.

Here is a rough example I put together of how I was planning to access the raw binaries in OpenCL.
// load a kernel
string programSource = @"
__kernel void square(__global float* input, __global float* output) 
{ 
size_t i = get_global_id(0); 
output[i] = input[i] * input[i]; 
};";
CLN.Program program = Cl.CreateProgramWithSource(context, 1, new[] { programSource }, null, out err); 
err = Cl.BuildProgram(program, 0, null, string.Empty, null, IntPtr.Zero); 

// Extract the Binaries 
int deviceCt = Cl.GetProgramInfo(program, ProgramInfo.NumDevices, out err).CastTo<int>();
int[] binarySizes = Cl.GetProgramInfo(program, ProgramInfo.BinarySizes, out err).CastToArray<int>(deviceCt);
InfoBuffer[] tempInfoBuffers = new InfoBuffer[deviceCt];
for (int i = 0; i < deviceCt; i++)
    tempInfoBuffers[i] = new InfoBuffer(new IntPtr(binarySizes[i]/*size*/));
InfoBufferArray binarys = new InfoBufferArray(tempInfoBuffers);
IntPtr nbread = new IntPtr();
err = Cl.GetProgramInfo(program, ProgramInfo.Binaries, new IntPtr(IntPtr.Size /* sizeof(unsigned char*), 4 or 8 bytes */ * deviceCt), binarys, out nbread); 

// Modify Binaries as needed here

// Reload the modified binaries
InfoBufferArray<ErrorCode> errCodes = new InfoBufferArray<ErrorCode>(deviceCt);
// Pass one length per device rather than only the first device's size
program = Cl.CreateProgramWithBinary(context, (uint)deviceCt, devices, Array.ConvertAll(binarySizes, s => new IntPtr(s)), binarys, errCodes, out err);
err = Cl.BuildProgram(program, 0, null, string.Empty, null, IntPtr.Zero); //clBuildProgram(program, 0, NULL, NULL, NULL, NULL);

// Run the binaries here