OpenCL Cookbook: Hello World using C++ host binding

Last time in the OpenCL Cookbook series, I presented a hello world example using OpenCL with C as the host binding language. This time I present a very similar example, but with C++ as the host binding language. As you already know from previous posts, the host language that interfaces with an OpenCL device can be one of many, such as C, C++, Java, C# and Python.

So far I’ve been using the C API, but I’ve decided to switch to the C++ API for two reasons: (1) it takes considerably fewer lines of code, being more succinct, and (2) it supports exceptions, meaning that you do not have to check an error code for every line of binding code that you write. So here follows a brief primer on the C++ OpenCL binding. It’s a very simple example but trust me, we’ll be getting to more complex examples soon (time is the issue).
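To make reason (2) a little more concrete, here is a minimal sketch of how a binding can turn status codes into exceptions. It is not the actual OpenCL C++ binding code: StatusError, check, createThing, buildThing and runAll are all hypothetical names made up for this illustration, and the two "calls" are stand-ins that simply return a status code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an exception type that carries a status code.
class StatusError : public std::runtime_error {
public:
    StatusError(int code, const std::string& what)
        : std::runtime_error(what), code_(code) {}
    int err() const { return code_; }
private:
    int code_;
};

// Hypothetical helper: turn a non-zero status into an exception.
inline void check(int status, const char* callName) {
    assert(callName != nullptr);
    if (status != 0) throw StatusError(status, callName);
}

// Hypothetical C-style calls that report failure through a return code.
inline int createThing() { return 0; }   // succeeds
inline int buildThing()  { return -1; }  // fails with an arbitrary code

// One try/catch replaces a status check after every single call.
inline std::string runAll() {
    try {
        check(createThing(), "createThing");
        check(buildThing(), "buildThing");
        return "ok";
    } catch (const StatusError& e) {
        return std::string(e.what()) + " : " + std::to_string(e.err());
    }
}
```

Calling runAll() here reports the failing call and its code in one place, which is the same shape as the catch block in the host code later in this post.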

OpenCL kernel

__kernel void hello_world (__global char* message, int messageSize) {
	for (int i = 0; i < messageSize; i++) {
		printf("%c", message[i]);
	}
}

The kernel (OpenCL function) above receives a char array (in essence a string) from the host as well as the size of the char array, as there is no way to derive an array’s size from the array itself (Java programmers gasp in shock and disgust). The kernel simply iterates over all the letters in the char array and prints them one at a time to standard output, thereby printing the message: “Hello World!”. Now let’s look at the C++ code that interfaces with this kernel.
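Before the host code, a quick aside on why the size has to travel separately. The sketch below is plain C++ rather than OpenCL C, and the function names (sizeInOwner, sizeAfterDecay, collect) are made up for illustration. It shows that the size is only known where the array is declared, and that a loop over a bare pointer needs the length handed to it, just as the kernel needs messageSize.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Where the array is declared, sizeof still sees every element.
inline std::size_t sizeInOwner() {
    char message[] = "Hello World!";  // 12 characters plus the terminating '\0'
    return sizeof(message);
}

// Once passed on, only the address of the first element arrives,
// so sizeof reports the size of a pointer, not of the message.
inline std::size_t sizeAfterDecay(const char* message) {
    assert(message != nullptr);
    return sizeof(message);
}

// Same loop shape as the kernel: the length comes in as a second argument.
inline std::string collect(const char* message, int messageSize) {
    std::string out;
    for (int i = 0; i < messageSize; i++) {
        out += message[i];
    }
    return out;
}
```

Passing 12 to collect yields the whole message, while passing a smaller number silently yields a shorter one, so the host is responsible for sending the right size.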

C++ host binding


// must be defined before including cl.hpp so that cl::Error is available
#define __CL_ENABLE_EXCEPTIONS

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>
#include <CL/cl.hpp>
#include <CL/opencl.h>

using namespace std;

int main () {

    vector<cl::Platform> platforms;
    vector<cl::Device> devices;
    vector<cl::Kernel> kernels;

    try {

        // create platform
        cl::Platform::get(&platforms);
        platforms[0].getDevices(CL_DEVICE_TYPE_GPU, &devices);

        // create context
        cl::Context context(devices);

        // create command queue
        cl::CommandQueue queue(context, devices[0]);

        // load opencl source (pass the path to the kernel file here)
        ifstream cl_file("");
        string cl_string(istreambuf_iterator<char>(cl_file), (istreambuf_iterator<char>()));
        cl::Program::Sources source(1, make_pair(cl_string.c_str(),
            cl_string.length() + 1));

        // create program
        cl::Program program(context, source);

        // compile opencl source
        program.build(devices);

        // load named kernel from opencl source
        cl::Kernel kernel(program, "hello_world");

        // create a message to send to kernel
        char message[] = "Hello World!";
        int messageSize = 12;

        // allocate device buffer to hold message (copied from host memory)
        cl::Buffer buffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
            sizeof(char) * messageSize, message);

        // set message as kernel argument
        kernel.setArg(0, buffer);
        kernel.setArg(1, sizeof(int), &messageSize);

        // execute kernel
        queue.enqueueTask(kernel);

        // wait for completion
        queue.finish();

        cout << endl;

    } catch (const cl::Error& e) {
        cout << endl << e.what() << " : " << e.err() << endl;
    }

    return 0;
}


The above C++ host binding code is annotated to say what it’s doing at each step, but I’ll provide a brief overview. Initially it obtains a platform and its devices, then creates a context and a command queue, which are the basic OpenCL binding data structures required to interface with an OpenCL device. It then loads the OpenCL source from a separate file and creates a program from it. The program is built, which compiles the OpenCL source. It then loads a specific kernel (function) from that source by name. It creates a string message on the host side, but in order to send it to the device it must create a buffer of the same size as the message. The buffer is created and set as a kernel argument along with the size of the message we are sending.
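Two of the steps above are easy to get subtly wrong, so here is a small sketch of them in isolation, with no OpenCL involved. The helper names readSource and messageLength are my own and not part of any binding. The first reads a whole file into a string the same way the host code does, but throws if the file cannot be opened (an unopened ifstream would otherwise quietly produce an empty source string). The second derives the message size from the message instead of hard-coding it.

```cpp
#include <cassert>
#include <cstring>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Read an entire file into a string; fail loudly if it cannot be opened.
inline std::string readSource(const std::string& path) {
    std::ifstream file(path.c_str());
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}

// Derive the number of characters to send, excluding the terminating '\0'.
inline int messageLength(const char* message) {
    assert(message != nullptr);
    return static_cast<int>(std::strlen(message));
}
```

With these, the string returned by readSource could feed the program sources and messageLength(message) could replace the literal 12, assuming the message is a normal null-terminated string.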

The kernel is then executed and we wait for its completion on the host. The finish command flushes all outstanding tasks to the device and waits for them to finish. Note the clean exception handling using a single try/catch around the entire code instead of having to check the error code produced by each statement. I much prefer the C++ API to the C API. I think you’ll agree that it’s more concise and cleaner. Till next time.

4 thoughts on “OpenCL Cookbook: Hello World using C++ host binding”

  1. Calling strlen(message) in the for loop would be easier than sending the size of the message to the kernel function, because the setArg member function does not pass the size of the message to the kernel.

  2. > The kernel (OpenCL function) above receives a char array (in essence a string) from the host as well as the size of the char array (as there is no way to derive an array’s size from the array itself (Java programmers gasp in shock and disgust).

    Quick nitpicky correction – you aren’t passing an array, you’re passing a pointer. Those are two entirely different concepts. Of course you can’t derive an array’s size – you aren’t passing any arrays!

    Or in other words, the only thing you’re passing is the address of the first element of that array. Nothing more. Java programmer or not, there isn’t anything to gasp here at – it’s fairly obvious that it’s not possible to determine a memory block’s size just by knowing its address.
