Error code -11: what are the possible reasons for getting "CL_BUILD_PROGRAM_FAILURE" in OpenCL?

I am using an ATI RV770 graphics card, OpenCL 1.0 and ati-stream-sdk-v2.3-lnx64 on Linux.

While running my host code, which includes the following two sections to build the kernel program, I am getting error code -11, i.e. CL_BUILD_PROGRAM_FAILURE. Does this mean that the kernel program was compiled? If not, how is it compiled and debugged?

const char* KernelPath = "abc_kernel.cl";   //kernel program is in separate file but in same directory of host code..

/* Create Program object from the kernel source *******/

char* sProgramSource = readKernelSource(KernelPath);
size_t sourceSize =  strlen(sProgramSource) ;
program = clCreateProgramWithSource(context, 1,(const char **) &sProgramSource,&sourceSize, &err);
checkStatus("error while creating program",err);

/* Build (compile & Link ) Program *******/

char* options = (char* )malloc(10*sizeof(char));
strcpy(options, "-g");
err = clBuildProgram(program, num_devices, devices_id, options, NULL, NULL);
checkStatus("Build Program Failed", err); //This line throwing the error....

The function to read the kernel program is as follows:

/* read program source file*/

char* readKernelSource(const char* kernelSourcePath){
 FILE    *fp = NULL;
 size_t  sourceLength;
 char    *sourceString ;
 fp = fopen( kernelSourcePath , "r");
 if(fp == 0)
 {
        printf("failed to open file");
        return NULL;
 }
 // get the length of the source code
 fseek(fp, 0, SEEK_END);
 sourceLength = ftell(fp);
 rewind(fp);
 // allocate a buffer for the source code string and read it in
 sourceString = (char *)malloc( sourceLength + 1);
 if( fread( sourceString, 1, sourceLength, fp) !=sourceLength )
 {
          printf("\n\t Error : Fail to read file ");
          return 0;
 }
 sourceString[sourceLength+1]='\0';
 fclose(fp);
 return sourceString;

}// end of readKernelSource

Can anyone tell me how to fix it?

Does it mean that this is an OpenCL compilation error at runtime, or something else?

I am printing the build log using clGetProgramBuildInfo() as below, but why is it not printing anything?

char* build_log; size_t log_size;

// First call to know the proper size
        err = clGetProgramBuildInfo(program, devices_id, CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);
        build_log = (char* )malloc((log_size+1));

        // Second call to get the log
        err = clGetProgramBuildInfo(program, devices_id, CL_PROGRAM_BUILD_LOG, log_size, build_log, NULL);
        build_log[log_size] = '\0';
        printf("--- Build log ---\n ");
        fprintf(stderr, "%s\n", build_log);
        free(build_log);
Bunk answered 27/2, 2012 at 11:23 Comment(1)
Did you paste the kernel source into a 3rd party tool of some kind? A profiler maybe? When I get build errors, they are usually syntax within the program itself.Circlet

This error is typically caused by a syntax error in your kernel code. You can call the OpenCL function clGetProgramBuildInfo with the flag CL_PROGRAM_BUILD_LOG to access the log generated by the compiler. This log contains the output you are probably used to when compiling on the command-line (errors, warnings, etc.).

For example, you could add something similar to the following after you call clBuildProgram:

if (err == CL_BUILD_PROGRAM_FAILURE) {
    // Determine the size of the log
    size_t log_size;
    clGetProgramBuildInfo(program, devices_id[0], CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);

    // Allocate memory for the log
    char *log = (char *) malloc(log_size);

    // Get the log
    clGetProgramBuildInfo(program, devices_id[0], CL_PROGRAM_BUILD_LOG, log_size, log, NULL);

    // Print the log, then release the buffer
    printf("%s\n", log);
    free(log);
}

You can also see the function buildOpenCLProgram() in SDKCommon.cpp in the AMD APP SDK for a real example.

Bounty answered 27/2, 2012 at 15:11 Comment(8)
I did the same to print build_log info as shown above. But it is printing anything, even i used malloc() to create storage for build_log.Bunk
I'm guessing you mean "it is not printing anything"? Can you confirm that the printf is actually being called? You might need to initialize the log first to ensure that the string is null-terminated. Try adding "memset(log, 0, log_size);" after the call to malloc.Bounty
I tried "memset((void *)build_log, 0, sizeof(build_log));" but that line gives a segmentation fault...Bunk
sizeof(build_log) is wrong; build_log is a pointer, so sizeof(build_log) is the size of a pointer. What you really want to pass to memset is the number of bytes you allocated (log_size).Bounty
Well, after calling clGetProgramBuildInfo() I got the build status saying that no build has been performed on the specified program object for the device. Does that mean the kernel code was not compiled? Please help! How can we compile our kernel code successfully?Bunk
OK, I just noticed that you edited your original post to add your error checking code. Note that clGetProgramBuildInfo expects to be passed a single device as the second parameter, whereas clBuildProgram expects a list of devices. But you are passing the same variable to both (I'm actually surprised that this compiles). You need to modify your code to pass a single device, as in the example I gave.Bounty
@Bunk use calloc() instead of malloc() to zero it (very late contribution ;)).Shontashoo
passing &log_size instead of NULL will cause CL_INVALID_VALUE error due to keeping callback function pointer NULL as mentioned here: khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/…Expand
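
The sizeof point raised in the comments can be sketched as follows. This is a minimal illustration rather than code from the thread: alloc_log_buffer is a hypothetical helper name, and the extra byte reserved for a terminator is an assumption made for safety.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a zero-filled buffer large enough for a
 * log of log_size bytes plus a terminating '\0'. The caller frees it. */
char *alloc_log_buffer(size_t log_size)
{
    char *log = malloc(log_size + 1);
    if (log == NULL)
        return NULL;

    /* sizeof(log) would be the size of the pointer itself, not of the
     * allocation, so the byte count is passed explicitly. */
    memset(log, 0, log_size + 1);
    return log;
}
```

Calling calloc(log_size + 1, 1) instead would allocate and zero the buffer in one step, as suggested in the comments.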

The problem here is that the program buffer is unterminated.

How do I know this? I had an OpenCL program that was working just fine until I disabled all of the Linux kernel mitigations in my kernel. Trying to run the program several days later resulted in similar OpenCL compilation errors for things that clearly aren't in the CL source file. Rebuilding the program didn't fix the issue. If you puts() the string that fread() fills, you'll see the garbage at the end of the program buffer.

My solution was to set buffer[length] = '\0'; after allocating an extra byte for termination purposes. Another solution could be to use calloc(). I suspect that one of the mitigations was making sure that allocated memory is cleared before being handed to user space.
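
A minimal sketch of that fix, assuming the kernel source is read from a file as in the question. The name read_text_file and its out_len parameter are hypothetical, chosen only for illustration. Note that the question's reader writes the terminator to sourceString[sourceLength+1], which is one byte past the allocation and leaves sourceString[sourceLength] uninitialized.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file into a NUL-terminated buffer.
 * Returns NULL on failure; the caller frees the result. */
char *read_text_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Determine the file size. */
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* Allocate one extra byte for the terminator. */
    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    /* Terminate at the number of bytes actually read, inside the buffer. */
    buf[got] = '\0';
    if (out_len != NULL)
        *out_len = got;
    return buf;
}
```

The returned pointer and length could then be passed to clCreateProgramWithSource in the same way the question passes sProgramSource and sourceSize, without relying on strlen() over an unterminated buffer.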

Refraction answered 21/2, 2023 at 21:20 Comment(0)
