My project involves mapping an algorithm written in C to OpenCL, for both CPU and GPU devices. It should run on a Linux machine as well as on Apple devices. I am trying to get it working on the CPU using Xcode on a MacBook, but my code crashes during the transpose stage. I know why it is crashing but can't resolve it.
I can provide the algorithm written in C. I want this done in a day or less. Please let me know ASAP whether this is possible and how quickly you think you can map it.
Thanks!