OpenCL - Creating sub-buffers returns error code -13







Hi, I am new to OpenCL and am using the C++ wrapper. I am trying to run the same kernel on two devices simultaneously. I create a buffer, split it into chunks using sub-buffers, and pass those chunks to the kernel, dispatching it twice: once to command queue 1 and once to command queue 2, each with a different chunk of the main buffer.



When I run it, it throws error -13. All the other sub-buffers are created successfully; only the one marked below fails.



Any guidance will be much appreciated.



Using OpenCL 1.1


//Creating main buffer
cl::Buffer zeropad_buf(openclObjects.context,CL_MEM_READ_ONLY| CL_MEM_COPY_HOST_PTR,(size+2)*(size+2)*cshape[level][1]*sizeof(float),zeropad);
cl::Buffer output_buf(openclObjects.context,CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR ,cshape[level][0]*size*size*sizeof(float),output_f);

//Creating sub_buffers for zeropad_buf
size_t zeropad_buf_size = (size+2)*(size+2)*cshape[level][1]*sizeof(float);
size_t output_buf_size = cshape[level][0]*size*size*sizeof(float);

cl_buffer_region zero_rgn_4core = {0, zeropad_buf_size/2};
cl_buffer_region zero_rgn_2core = {zeropad_buf_size/2, zeropad_buf_size/2}; //Throws error -13

cl_buffer_region output_rgn_4core = {0, output_buf_size/2};
cl_buffer_region output_rgn_2core = {output_buf_size/2, output_buf_size/2};



cl::Buffer zeropad_buf_4Core = zeropad_buf.createSubBuffer(CL_MEM_READ_ONLY,CL_BUFFER_CREATE_TYPE_REGION, &zero_rgn_4core);
cl::Buffer zeropad_buf_2Core = zeropad_buf.createSubBuffer(CL_MEM_READ_ONLY,CL_BUFFER_CREATE_TYPE_REGION, &zero_rgn_2core); //Throws error -13
std::cout<<"zero_pad sub-buffer created"<<std::endl;

cl::Buffer output_buf_4Core = output_buf.createSubBuffer(CL_MEM_READ_WRITE,CL_BUFFER_CREATE_TYPE_REGION, &output_rgn_4core);
cl::Buffer output_buf_2Core = output_buf.createSubBuffer(CL_MEM_READ_WRITE,CL_BUFFER_CREATE_TYPE_REGION, &output_rgn_2core);




1 Answer



From the documentation:



CL_MISALIGNED_SUB_BUFFER_OFFSET is returned in errcode_ret if there are no devices in context associated with buffer for which the origin value is aligned to the CL_DEVICE_MEM_BASE_ADDR_ALIGN value.





It looks like you might need to align your split region offsets and sizes to lie on integer multiples of the least common multiple (LCM) of the CL_DEVICE_MEM_BASE_ADDR_ALIGN properties of all of your devices.
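One detail worth keeping in mind: the OpenCL specification describes CL_DEVICE_MEM_BASE_ADDR_ALIGN as a value in bits, while the origin in cl_buffer_region is in bytes. Using the raw value as a byte count is therefore stricter than necessary, though still valid. Below is a minimal sketch of the unit conversion and the alignment check. The helper names align_bits_to_bytes and is_origin_aligned are hypothetical, not part of OpenCL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an OpenCL API): convert an alignment reported
// in bits, as CL_DEVICE_MEM_BASE_ADDR_ALIGN is, into bytes.
inline std::size_t align_bits_to_bytes(unsigned align_bits)
{
    assert(align_bits % 8 == 0);  // assumption: whole number of bytes
    return align_bits / 8;
}

// Hypothetical helper: true if a sub-buffer origin (in bytes) satisfies
// an alignment requirement given in bits.
inline bool is_origin_aligned(std::size_t origin_bytes, unsigned align_bits)
{
    const std::size_t align_bytes = align_bits_to_bytes(align_bits);
    return align_bytes == 0 || origin_bytes % align_bytes == 0;
}
```

For example, a device reporting 1024 would need origins that are multiples of 128 bytes, so an origin of 100 bytes would be rejected while 256 would be accepted.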





By this, I mean something like the following:



Assuming the devices you are using are in a variable


std::vector<cl::Device> devices;



Query the CL_DEVICE_MEM_BASE_ADDR_ALIGN property for each device:




// std::lcm is declared in <numeric> (C++17)
cl_uint total_alignment_requirement = 1;
for (cl::Device& dev : devices)
{
    cl_uint device_mem_base_align = 0;
    if (CL_SUCCESS == dev.getInfo(CL_DEVICE_MEM_BASE_ADDR_ALIGN, &device_mem_base_align))
        total_alignment_requirement = std::lcm(total_alignment_requirement, device_mem_base_align);
}


Then, when it comes to allocating zeropad, make sure the memory is aligned to total_alignment_requirement. For example, if you're currently allocating it with malloc(), use posix_memalign() instead. (Even better, if you can, don't create the buffer using CL_MEM_USE_HOST_PTR and let OpenCL allocate the memory.)
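As a rough sketch of the aligned host allocation, assuming a POSIX system: the wrapper name alloc_aligned_floats is hypothetical, and posix_memalign itself requires the alignment to be a power of two and a multiple of sizeof(void*).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX only)

// Hypothetical wrapper: allocate `count` floats whose base address is a
// multiple of `alignment_bytes`. Returns nullptr on failure.
// The caller releases the memory with free().
inline float* alloc_aligned_floats(std::size_t count, std::size_t alignment_bytes)
{
    // posix_memalign precondition: power of two, at least sizeof(void*)
    assert(alignment_bytes >= sizeof(void*) &&
           (alignment_bytes & (alignment_bytes - 1)) == 0);

    void* ptr = nullptr;
    if (posix_memalign(&ptr, alignment_bytes, count * sizeof(float)) != 0)
        return nullptr;
    return static_cast<float*>(ptr);
}
```

The returned pointer could then be passed as the host pointer when creating the buffer, and released with free() once the buffer no longer uses it.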





Finally, your regions need to be aligned too:


size_t zeropad_split_pos = zeropad_buf_size / 2;
zeropad_split_pos -= zeropad_split_pos % total_alignment_requirement;
cl_buffer_region zero_rgn_4core = {0, zeropad_split_pos};
cl_buffer_region zero_rgn_2core = {zeropad_split_pos, zeropad_buf_size - zeropad_split_pos};



This ensures that the first region starts and ends on an address that is a multiple of total_alignment_requirement, and the second region starts on an aligned address too.
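The split arithmetic can also be written as a small standalone function, which makes it easy to check with concrete numbers. This is only a sketch: SplitRegions and split_aligned are hypothetical names, and the struct mirrors the origin/size pair of cl_buffer_region without depending on the OpenCL headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a pair of cl_buffer_region values.
struct SplitRegions {
    std::size_t first_origin;
    std::size_t first_size;
    std::size_t second_origin;
    std::size_t second_size;
};

// Split `total_bytes` near the middle, rounding the split position down
// to a multiple of `alignment_bytes` so the second region starts aligned.
inline SplitRegions split_aligned(std::size_t total_bytes, std::size_t alignment_bytes)
{
    assert(alignment_bytes > 0);
    std::size_t split = total_bytes / 2;
    split -= split % alignment_bytes;  // round down to an aligned offset
    return SplitRegions{0, split, split, total_bytes - split};
}
```

With a 1000-byte buffer and a 128-byte alignment, the split lands at byte 384, so the regions are {0, 384} and {384, 616}. Note that if half the buffer is smaller than the alignment, the split rounds down to 0 and the first region is empty, so very small buffers would need separate handling.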





(I haven't tested this code, but it should be close to correct. Note that std::lcm was only added in C++17 (header <numeric>), so if it's not available in your toolchain, you'll need to supply your own lcm function.)
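If you do need your own version, something along these lines should work for a pre-C++17 toolchain. The names gcd_u64 and lcm_u64 are hypothetical; the approach is the usual Euclidean algorithm.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: greatest common divisor via the Euclidean algorithm.
inline std::uint64_t gcd_u64(std::uint64_t a, std::uint64_t b)
{
    while (b != 0) {
        const std::uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Hypothetical helper: least common multiple of two non-zero values.
// Divides before multiplying to reduce the risk of overflow.
inline std::uint64_t lcm_u64(std::uint64_t a, std::uint64_t b)
{
    assert(a != 0 && b != 0);
    return (a / gcd_u64(a, b)) * b;
}
```

Since device alignments are typically powers of two, the result is usually just the larger of the two values, for example lcm_u64(1024, 2048) gives 2048.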







Thanks for your response. But I don't understand what it means. Is there an example code or an explanation with an example perhaps?
– Sanjay Rakshit
Aug 6 at 21:30





I'm not aware of example code, but I've edited my answer to give you an idea of the kind of code that's required.
– pmdj
Aug 7 at 14:21





Thanks a ton. Just what I was looking for.
– Sanjay Rakshit
Aug 7 at 20:56





