Embed resources (eg, shader code; images) into executable/library with CMake
M

9

48

I am writing an application in C++ which relies on various resources in my project. Right now, I have the relative path from the produced executable to each resource hard-coded in my sources, and that allows my program to open the files and read in the data in each resource. This works ok, but it requires that I start the executable from a specific path relative to the resources. So if I try to start my executable from anywhere else, it fails to open the files and cannot proceed.

Is there a portable way to have CMake embed my resources into the executables (or libraries) such that I can simply access them in memory at runtime instead of opening files whose paths are brittle? I have found a related question, and it looks like embedding resources can be done well enough with some ld magic. So my question is: how do I do this in a portable, cross-platform manner using CMake? I need my application to run on both x86 and ARM. I am ok with supporting only Linux (Embedded), but bonus points if anyone can suggest how to do this for Windows (Embedded) as well.

EDIT: I forgot to mention a desired property of the solution. I would like to be able to use CMake to cross-compile the application when I am building for ARM rather than have to compile it natively on my ARM target.

Mello answered 5/8, 2012 at 1:28 Comment(1)
see github.com/graphitemaster/incbinSoulier
A
41

One of the easiest ways to do this is to include a small, portable C program in your build that reads the resource and generates a C file that contains the length of the resource data and the actual resource data as an array of constant character literals. This will be entirely platform independent, but should only be used for resources that are reasonably small. For larger resources, you probably don't want to embed the files in your program.

For resource "foo", the generated C file "foo.c" would contain:

const char foo[] = { /* bytes of resource foo */ };
const size_t foo_len = sizeof(foo);

To access the resource from C++, you declare the following two symbols in either a header or the cpp file where they're used:

extern "C" const char foo[];
extern "C" const size_t foo_len;

To generate foo.c in the build, you need a target for the C program (call it embedfile.c), and you need to use the add_custom_command command to call this program:

add_executable(embedfile embedfile.c)

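# Custom commands run in the current binary directory by default, so the
# input is given by its full source path; foo.c is written to the binary dir.
add_custom_command(
  OUTPUT foo.c
  COMMAND embedfile foo ${CMAKE_CURRENT_SOURCE_DIR}/foo.rsrc
  DEPENDS embedfile ${CMAKE_CURRENT_SOURCE_DIR}/foo.rsrc)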

Then, include foo.c on the source list of a target that requires the "foo" resource. You now have access to the bytes of "foo".
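Once foo.c is linked in, the two symbols can be wrapped in whatever type the application prefers. The following is only a minimal sketch, not part of the original answer: the array is a hand-written stand-in for the generated foo.c so that the example compiles on its own, and load_resource is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for the generated foo.c (assumption: a 3-byte resource "hi\n").
// In a real build these two definitions come from foo.c and are only
// declared here with extern "C".
static const char foo[] = { 0x68, 0x69, 0x0a };
static const std::size_t foo_len = sizeof(foo);

// Hypothetical helper: copy an embedded resource into a std::string.
// The explicit length matters because the array is not null-terminated.
std::string load_resource(const char* data, std::size_t len)
{
    assert(data != nullptr);
    return std::string(data, len);
}
```

Calling load_resource(foo, foo_len) yields the resource contents as a string that can be passed to a shader compiler, a parser, and so on.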

The program embedfile.c is:

#include <stdlib.h>
#include <stdio.h>

FILE* open_or_exit(const char* fname, const char* mode)
{
  FILE* f = fopen(fname, mode);
  if (f == NULL) {
    perror(fname);
    exit(EXIT_FAILURE);
  }
  return f;
}

int main(int argc, char** argv)
{
  if (argc < 3) {
    fprintf(stderr, "USAGE: %s {sym} {rsrc}\n\n"
        "  Creates {sym}.c from the contents of {rsrc}\n",
        argv[0]);
    return EXIT_FAILURE;
  }

  const char* sym = argv[1];
  /* "rb": read in binary mode so bytes are not translated on Windows */
  FILE* in = open_or_exit(argv[2], "rb");

  char symfile[256];
  snprintf(symfile, sizeof(symfile), "%s.c", sym);

  FILE* out = open_or_exit(symfile,"w");
  fprintf(out, "#include <stdlib.h>\n");
  fprintf(out, "const char %s[] = {\n", sym);

  unsigned char buf[256];
  size_t nread = 0;
  size_t linecount = 0;
  do {
    nread = fread(buf, 1, sizeof(buf), in);
    size_t i;
    for (i=0; i < nread; i++) {
      fprintf(out, "0x%02x, ", buf[i]);
      if (++linecount == 10) { fprintf(out, "\n"); linecount = 0; }
    }
  } while (nread > 0);
  if (linecount > 0) fprintf(out, "\n");
  fprintf(out, "};\n");
  fprintf(out, "const size_t %s_len = sizeof(%s);\n\n",sym,sym);

  fclose(in);
  fclose(out);

  return EXIT_SUCCESS;
}
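The formatting step can also be written as a pure function, which makes it easy to test separately from the file I/O. This is a sketch under assumptions: to_c_source is a hypothetical name, and it mirrors the layout embedfile.c produces (ten bytes per line) rather than being the author's code. Like the original, it would emit an empty initializer for an empty input, which C compilers before C23 reject.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: render bytes as the text of a C source file that
// defines `sym` and `sym_len`, ten bytes per line like embedfile.c.
std::string to_c_source(const std::string& sym,
                        const std::vector<unsigned char>& bytes)
{
    assert(!sym.empty());
    std::string out = "#include <stdlib.h>\nconst char " + sym + "[] = {\n";
    std::size_t linecount = 0;
    for (unsigned char b : bytes) {
        char hex[8];  // "0x%02x, " needs 6 characters plus the terminator
        std::snprintf(hex, sizeof(hex), "0x%02x, ", b);
        out += hex;
        if (++linecount == 10) { out += "\n"; linecount = 0; }
    }
    if (linecount > 0) out += "\n";
    out += "};\nconst size_t " + sym + "_len = sizeof(" + sym + ");\n";
    return out;
}
```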
Ageold answered 5/8, 2012 at 6:58 Comment(9)
This looks like a very elegant, platform-independent solution, +1. I will try it out. However, there is one small downfall regarding something I forgot to mention originally (see my edit). If I incorporate this solution into the project and try to cross-compile it, I will run into trouble when trying to execute embedfile since that executable will have been compiled for ARM and I would be trying to execute it on x86. I suppose I can just compile embedfile separately for my host platform and install it to my path. Slightly inconvenient, but I only need to do it once.Mello
@SchighSchagh There are probably CMake flags for that, but I don't have a lot of experience with cross-compiling at the moment. embedfile would be easy to rewrite in a language like Python or Perl, if you know that the interpreter is available on the host platforms of interest.Ageold
@SchighSchagh: You might want to look at this: cmake.org/Wiki/…Ageold
I ended up porting your embedfile.c to Python, and I've been rather happy with the solution so far. :)Mello
Hey @SchighSchagh, would you be willing/able to release your python code for embedfile.c to the public/me? :)Laurent
I have a BSD python version here: gist.github.com/jlisee/5667e173bd2865a68f95 It can output read/write from stdin/stdout, allows you to set the output path and symbol name separately, and has embedded tests.Jemie
@Joe I forked and modified your version to then check timestamps and only re-embed the resource if it does not exist or if the new version has a different timestamp than the old. It then sets the atime and mtime of the resource to the source file (this is so if using it as a script with CMake, QMake, or another build tool, the compiler does not regenerate every resource file and then recompile it): gist.github.com/Alexhuszagh/f1b45c37455dee65cf436fd5fd9be9eePleurisy
It looks like this method has been rolled up into a nice CMake compatible package: github.com/cyrilcode/embed-resourceUmbilicate
why not use something like xxd directly?Behaviorism
G
43

As an alternative to sfstewman's answer, here's a small CMake (2.8) function that converts all files in a given folder to C data and writes them to the specified output file:

# Creates C resources file from files in given directory
function(create_resources dir output)
    # Create empty output file
    file(WRITE ${output} "")
    # Collect input files
    file(GLOB bins ${dir}/*)
    # Iterate through input files
    foreach(bin ${bins})
        # Get short filename
        string(REGEX MATCH "([^/]+)$" filename ${bin})
        # Replace filename spaces & extension separator for C compatibility
        string(REGEX REPLACE "\\.| |-" "_" filename ${filename})
        # Read hex data from file
        file(READ ${bin} filedata HEX)
        # Convert hex data for C compatibility
        # (quoted so that an empty input file does not break the command)
        string(REGEX REPLACE "([0-9a-f][0-9a-f])" "0x\\1," filedata "${filedata}")
        # Append data to output file
        file(APPEND ${output} "const unsigned char ${filename}[] = {${filedata}};\nconst unsigned ${filename}_size = sizeof(${filename});\n")
    endforeach()
endfunction()
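The generated arrays are not null-terminated, so consumers should always pass the size along. A minimal sketch of writing such a resource to a stream follows; the hello_txt array is a hand-written stand-in for what the function would emit for a file named hello.txt, and write_resource is a hypothetical helper.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for the generated output for "hello.txt" (assumed contents: "hello").
static const unsigned char hello_txt[] = { 0x68, 0x65, 0x6c, 0x6c, 0x6f };
static const unsigned hello_txt_size = sizeof(hello_txt);

// Hypothetical helper: write exactly `size` bytes, so no terminator is needed
// and embedded zero bytes are preserved.
void write_resource(std::ostream& os, const unsigned char* data, unsigned size)
{
    assert(data != nullptr);
    os.write(reinterpret_cast<const char*>(data),
             static_cast<std::streamsize>(size));
}
```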
Gullett answered 29/11, 2014 at 20:51 Comment(6)
string(MAKE_C_IDENTIFIER ${filename} filename) could replace string(REGEX REPLACE...)Aranyaka
I found that MAKE_C_IDENTIFIER didn't work with periods and hyphens in the file name. The original regex didn't work either. I modified it a bit so that the variable created uses underscores for those cases as well: string(REGEX REPLACE "\\.| |-" "_" filename ${filename})Coruscation
Also, another convenient addition, if you change the output for the data to be file(APPEND ${output} "const char ${filename}[] = {${filedata}0x00};\nconst unsigned ${filename}_size = sizeof(${filename});\n") , you can output the contents of the file directly to a stream like cout without needing to convert or cast. The 0x00 also automatically null terminates it so you don't start printing random values of memory when used in this fashion.Coruscation
It works perfectly for me, thanks. I'd only like to know if there's a way to re-run this function when resource files are modified without calling CMake explicitly, so that I don't forget to re-run it, for example during a git bisectAegisthus
@Aegisthus this should be made a custom command rather than a function.Butene
@Butene Any chance you made this a custom command?Officiary
P
12

I would like to propose another alternative. It uses the GNU linker (ld) to embed a binary file directly into the executable, with no intermediary source file, which in my opinion is simpler and more efficient.

set( RC_DEPENDS "" )

function( add_resource input )
    string( MAKE_C_IDENTIFIER ${input} input_identifier )
    set( output "${CMAKE_ARCHIVE_OUTPUT_DIRECTORY}/${input_identifier}.o" )
    target_link_libraries( ${PROJECT_NAME} ${output} )

    add_custom_command(
        OUTPUT ${output}
        COMMAND ${CMAKE_LINKER} --relocatable --format binary --output ${output} ${input}
        DEPENDS ${input}
    )

    set( RC_DEPENDS ${RC_DEPENDS} ${output} PARENT_SCOPE )
endfunction()

# Resource file list
add_resource( "src/html/index.html" )

add_custom_target( rc ALL DEPENDS ${RC_DEPENDS} )

Then in your C/C++ files all you need is:

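extern char index_html_start[] asm( "_binary_src_html_index_html_start" );
extern char index_html_end[]   asm( "_binary_src_html_index_html_end" );
// The size is encoded as the *address* of the _size symbol, not as a stored value,
// so declare it as an array and convert its address (or use index_html_end - index_html_start).
extern char index_html_size[]  asm( "_binary_src_html_index_html_size" );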
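A portable way to get the length is to subtract the start pointer from the end pointer instead of relying on the size symbol. The sketch below assumes nothing about the linker: blob, blob_start and blob_end are hand-written stand-ins for the linker-provided symbols, and resource_size is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the _binary_..._start and _binary_..._end symbols, so the
// example compiles without an embedded object file.
static const char blob[] = { '<', 'p', '>', 'h', 'i' };
static const char* const blob_start = blob;
static const char* const blob_end = blob + sizeof(blob);

// Hypothetical helper: number of bytes between the two markers.
std::size_t resource_size(const char* start, const char* end)
{
    assert(start <= end);
    return static_cast<std::size_t>(end - start);
}
```

With the real symbols this would be called as resource_size(index_html_start, index_html_end).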
Partin answered 6/5, 2019 at 13:13 Comment(4)
I think something like this is what I was originally looking for! How does MAKE_C_IDENTIFIER work/where does it come from?Mello
Also, do you know if the asm stuff works for non-x86 and/or cross compilation?Mello
Actually, is asm really needed? I'm guessing a bit at how this works still, but could you just do extern char _binary_src_html_index_html_start[]; et al and leave it at that? Then it's a matter of taste if it would be ok to use the "raw" identifier, or alias it somehow such as via an extra const variable, or via a macro, or whatever.Mello
@NicuStiurca I appreciate this given I am 7y late. 1. The MAKE_C_IDENTIFIER is a CMake language function that produces a C compatible variable name (underscored) from a string. It is a short, neat way to produce a file system friendly and descriptive filename from the input file path. 2. asm is compiler specific but most support it, including GCC, Intel, IBM etc. It is conditionally supported but from what I understand should work in most cases. 3. You are correct. It is just a convenient way to rename the long mangled name of the object file to make your code more readable.Partin
O
7

Pure CMake function to convert any file into C/C++ source code, implemented with only CMake commands:

####################################################################################################
# This function converts any file into C/C++ source code.
# Example:
# - input file: data.dat
# - output file: data.h
# - variable name declared in output file: DATA
# - data length: sizeof(DATA)
# embed_resource("data.dat" "data.h" "DATA")
####################################################################################################

function(embed_resource resource_file_name source_file_name variable_name)

    file(READ ${resource_file_name} hex_content HEX)

    string(REPEAT "[0-9a-f]" 32 column_pattern)
    string(REGEX REPLACE "(${column_pattern})" "\\1\n" content "${hex_content}")

    string(REGEX REPLACE "([0-9a-f][0-9a-f])" "0x\\1, " content "${content}")

    string(REGEX REPLACE ", $" "" content "${content}")

    set(array_definition "static const unsigned char ${variable_name}[] =\n{\n${content}\n};")

    set(source "// Auto generated file.\n${array_definition}\n")

    file(WRITE "${source_file_name}" "${source}")

endfunction()

https://gist.github.com/amir-saniyan/de99cee82fa9d8d615bb69f3f53b6004

Onward answered 29/6, 2020 at 19:11 Comment(0)
S
5

I'd say the most elegant way to embed resources in C++ is simply to use the Qt Resource System. It is portable across platforms, compatible with CMake, and essentially wraps up everything done in the answer above, while also providing compression and being thoroughly tested.

Create a Qt resource file - an XML listing the files to be embedded:

<RCC>
    <qresource prefix="/">
        <file>uptriangle.png</file>
        <file>downtriangle.png</file>
    </qresource>
</RCC>

Call the file qtres.qrc. The resource file above embeds the two png files (located in the same directory as qtres.qrc) in the final executable. You can easily add or remove resources in a qrc file using QtCreator (the Qt IDE).

Now in your CMakeLists.txt file add:

set(CMAKE_AUTOMOC ON)
find_package(Qt5Core)
qt5_add_resources(QT_RESOURCE qtres.qrc)
# QT_RESOURCE now lists the generated source file(s); add ${QT_RESOURCE} to your target's sources.

In your main.cpp, before you need to access the resource, add the following line:

Q_INIT_RESOURCE(qtres);

Now you can access any of the resources above using Qt classes compatible with the Qt Resource System, such as QPixmap and QImage, and, perhaps most importantly for the general case, the QResource wrapper class, which wraps an embedded Qt resource and gives access to it through a friendly interface. As an example, to access the data within downtriangle.png in the above resources, the following lines will do the trick:

#include <QtCore>
#include <QtGui>

// ...

int main(int argc, char **argv)
{

    // ...

    Q_INIT_RESOURCE(qtres);

    // ...
    QResource res("://downtriangle.png"); // Here's your data, anyway you like
    // OR
    QPixmap pm("://downtriangle.png");  // Use it with Qt classes already

    // ...

}

Here, res can be used to directly access the data using res.data(), res.size() ... To parse the image content of the file use pm. Use pm.size(), pm.width() ...

And you're good to go. I hope it helped.

Sparling answered 24/7, 2014 at 5:34 Comment(3)
Qt is a pretty heavy dependency for my taste, but +1 anyway for a good answer.Mello
I am at the moment developing an embedded application for the ARM/DSP platform which uses exclusively STL + Boost and have managed to use QtResources with it, which essentially wraps up what is done in the first answer under the hood. No dependencies (not at least heavy ones) besides the QResource (if you're using the file as a raw binary resource) and the Qt5Core cmake module (which comes with the framework and is also found standalone with default distro configurations). Take a look at the resource file that is generated by qt5_add_resources(QT_RESOURCE qtres.qrc) command.Sparling
The Qt resource system is great, tho this answer only applies if you want to and are allowed to use Qt. For example, a slim SDK lib file will unlikely be able to use this method.Indicant
E
3

I'm using this super easy library in my projects: embed

It can be used with C++20 and C++14.

Example of CMakeLists.txt:

cmake_minimum_required(VERSION 3.21)
project(Test)

include(FetchContent)
FetchContent_Declare(
  battery-embed
  GIT_REPOSITORY https://github.com/batterycenter/embed.git
  GIT_TAG        <latest-git-tag>
)
FetchContent_MakeAvailable(battery-embed)

add_executable(Test src/main.cpp)

b_embed(Test resources/message.txt)

And an example of src/main.cpp:

#include <iostream>
#include "battery/embed.hpp"

int main() {
    std::cout << b::embed<"resources/message.txt">() << std::endl;
    return 0;
}
Earthnut answered 5/12, 2023 at 15:33 Comment(1)
Thx. This is very easy for use.Unyoke
O
1

There is a single-file CMake script that allows you to embed data easily that's called cmrc.

Example usage:

include(CMakeRC.cmake)
cmrc_add_resource_library(foo-resources
        ALIAS foo::rc
        NAMESPACE foo
        shaders/trig.vert
        shaders/trig.frag)

target_link_libraries(foo foo::rc)

In the C++ source:

#include <cmrc/cmrc.hpp>

CMRC_DECLARE(foo); // It should be the NAMESPACE property you specified
                   // in your CMakeLists.txt

int main() {
    auto fs = cmrc::foo::get_filesystem();
    auto vert_shader = fs.open("shaders/trig.vert");
    auto frag_shader = fs.open("shaders/trig.frag");
    ...
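    // glShaderSource needs a pointer to an lvalue, so store begin() first;
    // pass the length explicitly rather than relying on a terminator.
    const char* vert_src = vert_shader.begin();
    const GLint vert_len = static_cast<GLint>(vert_shader.end() - vert_shader.begin());
    glShaderSource(vertexShader, 1, &vert_src, &vert_len);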
}

It's probably the easiest library to set up and use.

Often answered 23/1, 2022 at 11:35 Comment(2)
Looks very small and tidy, which is great. Do you know if it works with cross-compiling? This is long past when I first needed this, but cross-compiling was one of my original requirements. That may still be relevant to whoever else is reading this.Mello
It's cross-platform, yes. @NicuStiurcaOften
F
1

I used this CMake function to embed files into .lib files. It depends on nothing but cmake itself, and the data is re-generated every time the corresponding file is updated.

function(embed_resources target)
    set(script_path "${CMAKE_CURRENT_BINARY_DIR}/anything_to_c.cmake")
    file(WRITE  "${script_path}" "file(READ \${CMAKE_ARGV3} buf HEX)\n")
    file(APPEND "${script_path}" "string(REGEX REPLACE \"([0-9a-f][0-9a-f])\" \"0x\\\\1, \" buf \${buf})\n")
    file(APPEND "${script_path}" "file(WRITE \${CMAKE_ARGV4} \"const unsigned char \${CMAKE_ARGV5}[] = { \${buf}0x00 };\\n\")\n")
    file(APPEND "${script_path}" "file(APPEND \${CMAKE_ARGV4} \"const unsigned \${CMAKE_ARGV6} = sizeof(\${CMAKE_ARGV5}) - 1;\\n\")\n")
    foreach(res_path ${ARGN})
        string(MAKE_C_IDENTIFIER ${res_path} identifier)
        set(src_path "${CMAKE_CURRENT_SOURCE_DIR}/${res_path}")
        set(dst_path "${CMAKE_CURRENT_BINARY_DIR}/${identifier}.c")
        set(anything_to_c ${CMAKE_COMMAND} -P ${script_path} ${src_path} ${dst_path} ${identifier} ${identifier}_size)
        add_custom_command(OUTPUT ${dst_path} COMMAND ${anything_to_c} DEPENDS ${src_path} VERBATIM)
        target_sources(${target} PRIVATE ${src_path} ${dst_path} )
    endforeach()
endfunction()

For example, in CMakeLists.txt, copy and paste the function above and write:

embed_resources(${PROJECT_NAME}
    res/test.vert
    res/test.frag
)

And in C/C++ code:

extern "C" const char res_test_vert[];
extern "C" const unsigned res_test_vert_size;
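Because the generated array carries an extra trailing 0x00 and the size variable excludes it, text resources can be used as C strings, while binary resources still need the explicit size. The sketch below illustrates the difference; the two arrays are hand-written stand-ins for generated output, and usable_as_c_string is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-ins for generated output: payload, then the extra 0x00;
// the size excludes that terminator (sizeof - 1), as in the function above.
static const unsigned char text_res[] = { 0x6f, 0x6b, 0x00 };        // "ok"
static const unsigned text_res_size = sizeof(text_res) - 1;
static const unsigned char bin_res[] = { 0x01, 0x00, 0x02, 0x00 };   // has an interior zero
static const unsigned bin_res_size = sizeof(bin_res) - 1;

// Hypothetical helper: true when the payload has no interior zero byte,
// i.e. when strlen() sees the whole resource.
bool usable_as_c_string(const unsigned char* data, unsigned size)
{
    assert(data != nullptr);
    return std::strlen(reinterpret_cast<const char*>(data)) == size;
}
```

For shader sources and other text this check passes; for images or other binary data, always use the size variable instead of string functions.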

https://github.com/shir0areed/non-invasive-embed-resources.cmake

Familiar answered 4/5, 2023 at 14:54 Comment(0)
H
0

This is an improved version of @Itay Grudev's solution that appends a null terminator to the file, so that the symbol can be used directly as a null-terminated string: extern const char file[] asm("_binary_your_file_start"); cout << strlen(file);

This works nicely with text files. For binary files, it is better to stay with the original solution :)

set(RC_DEPENDS "")

# If you want to make the symbol name look prettier, just use relative path as the input
function(add_resource input)
  string(MAKE_C_IDENTIFIER ${input} input_identifier)
  set(res_intermediate_dir ${CMAKE_CURRENT_BINARY_DIR}/resources)
  set(res_with_null_output "${res_intermediate_dir}/${input}")
  set(output "${res_intermediate_dir}/${input_identifier}.o")

  # Add null-terminated character to the file
  add_custom_command(
      DEPENDS ${input}
      OUTPUT ${res_with_null_output}
      COMMAND ${CMAKE_COMMAND} -E copy ${input} ${res_with_null_output}
      # printf is used here because the behaviour of `echo -n` with escapes varies between shells
      COMMAND printf '\\0' >> ${res_with_null_output}
      WORKING_DIRECTORY ${CMAKE_CURRENT_LIST_DIR}
  )

  add_custom_command(
      DEPENDS ${res_with_null_output}
      OUTPUT ${output}
      COMMAND ${CMAKE_LINKER} --relocatable --format binary --output ${output} ${input}
      WORKING_DIRECTORY ${res_intermediate_dir}
  )

  set(RC_DEPENDS ${RC_DEPENDS} ${output} PARENT_SCOPE)
endfunction()

# Resource file list
add_resource( "src/html/index.html" )

add_custom_target( rc ALL DEPENDS ${RC_DEPENDS} )

Hedvah answered 2/2, 2023 at 22:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.