Aligning static string literals

I have a static array of structures:

struct CommandStruct
{
    const char* data;   // string literals are const; required for standard C++11
    unsigned ans_size;
};

static const CommandStruct commands[] =
{
    { "Some literal", 28 },
    { "Some other literal", 29 },
    { "Yet another literal", 8 },
};

I want the strings to be 16-byte aligned. Is it possible to achieve this directly? I could get away with defining each literal separately, like __declspec(align(16)) static const char some_command_id[] = "my literal", but that's a mess. I need all the initialization in a single block of code.
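
For reference, the per-literal workaround described above can also be written with the standard C++11 alignas specifier instead of the MSVC-specific __declspec(align(16)). This is only a sketch: the array names and the is_aligned / check_separate_literals helpers are made up for illustration, and data is declared as const char* because that is what standard C++11 requires for pointers to string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct CommandStruct
{
    const char* data;
    unsigned ans_size;
};

// One named, 16-byte aligned array per literal (hypothetical names).
alignas(16) static const char cmd_some[]    = "Some literal";
alignas(16) static const char cmd_other[]   = "Some other literal";
alignas(16) static const char cmd_another[] = "Yet another literal";

static const CommandStruct commands[] =
{
    { cmd_some,    28 },
    { cmd_other,   29 },
    { cmd_another, 8 },
};

// Hypothetical helper: true if p sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Asserts that every command string starts on a 16-byte boundary.
inline void check_separate_literals()
{
    for (const CommandStruct& c : commands)
        assert(is_aligned(c.data, 16));
}
```

It works, but it shows the mess in question: every string is mentioned twice, once where it is defined and once in the table.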

Zeeba answered 25/2, 2014 at 8:52 Comment(8)
Do you want the pointers to the character sequences aligned (the char* data) or the actual chars? - Wilen
Of course the chars. The first char must be at a 16-byte boundary. - Zeeba
User-defined literals? Alignment support is new in C++11 and needs a little verbosity, so you may need your own "aligned string" class. - Maurer
A practical solution is to copy them to 16-byte aligned storage. - Fatback
Is there a reason why you want them to be 16-byte aligned instead of using the platform-specific alignment? - Sent
@aks, to process them with SSE instructions. - Zeeba
Can you just align the struct, and then put each string into a char[16]? - Guarantee
@tenfour, you mean, put it into char[max_literal_size]? But then I'll have to calculate that value manually. - Zeeba
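
The "align the struct and store the characters inline" idea from the comments might look roughly like the sketch below. AlignedCommand, kMaxCommandSize, aligned_commands and check_aligned_commands are hypothetical names, and 32 is an arbitrary bound chosen for this example. The bound does not have to be exact: in C++ a literal that does not fit into the array (including its terminating '\0') is a compile error, so an undersized value cannot silently truncate a string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical upper bound on literal length, including the terminating '\0'.
constexpr std::size_t kMaxCommandSize = 32;

struct AlignedCommand
{
    alignas(16) char data[kMaxCommandSize]; // characters are stored inline
    unsigned ans_size;
};

// Still a single initialization block; unused bytes of data are zero-filled.
static const AlignedCommand aligned_commands[] =
{
    { "Some literal", 28 },
    { "Some other literal", 29 },
    { "Yet another literal", 8 },
};

// The member alignment propagates to the struct, and sizeof is padded to a
// multiple of it, so every array element starts on a 16-byte boundary.
static_assert(alignof(AlignedCommand) >= 16, "struct must be 16-byte aligned");
static_assert(sizeof(AlignedCommand) % 16 == 0, "array elements must stay aligned");

// Asserts at run time that each stored string is 16-byte aligned.
inline void check_aligned_commands()
{
    for (const AlignedCommand& c : aligned_commands)
        assert(reinterpret_cast<std::uintptr_t>(c.data) % 16 == 0);
}
```

The cost is wasted space (every entry occupies the full buffer), which may or may not matter for a small command table.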

With C++11, the following may help: https://ideone.com/IDEdY0

#include <cstddef> // std::size_t
#include <cstdint>

// Sequence of char
template <char...Cs> struct char_sequence
{
    template <char C> using push_back = char_sequence<Cs..., C>;
};

// Strip everything from the first '\0' onward from a char_sequence
template <typename, char...> struct strip_sequence;

template <char...Cs>
struct strip_sequence<char_sequence<>, Cs...>
{
    using type = char_sequence<Cs...>;
};

template <char...Cs, char...Cs2>
struct strip_sequence<char_sequence<'\0', Cs...>, Cs2...>
{
    using type = char_sequence<Cs2...>;
};

template <char...Cs, char C, char...Cs2>
struct strip_sequence<char_sequence<C, Cs...>, Cs2...>
{
    using type = typename strip_sequence<char_sequence<Cs...>, Cs2..., C>::type;
};

// struct to create an aligned char array
template <std::size_t Alignment, typename chars> struct aligned_string;

template <std::size_t Alignment, char...Cs>
struct aligned_string<Alignment, char_sequence<Cs...>>
{
    alignas(Alignment) static constexpr char str[sizeof...(Cs)] = {Cs...};
};

template <std::size_t Alignment, char...Cs>
alignas(Alignment) constexpr
char aligned_string<Alignment, char_sequence<Cs...>>::str[sizeof...(Cs)];

// helper to get the I-th character ('\0' if out of bounds)
template <std::size_t I, std::size_t N>
constexpr char at(const char (&a)[N]) { return I < N ? a[I] : '\0'; }

// helper to check if the c-string will not be truncated
template <std::size_t max_size, std::size_t N>
constexpr bool check_size(const char (&)[N])
{
    static_assert(N <= max_size, "string too long");
    return N <= max_size;
}

// Helper macros to build char_sequence from c-string
#define PUSH_BACK_8(S, I) \
    ::push_back<at<(I) + 0>(S)>::push_back<at<(I) + 1>(S)> \
    ::push_back<at<(I) + 2>(S)>::push_back<at<(I) + 3>(S)> \
    ::push_back<at<(I) + 4>(S)>::push_back<at<(I) + 5>(S)> \
    ::push_back<at<(I) + 6>(S)>::push_back<at<(I) + 7>(S)>

#define PUSH_BACK_32(S, I) \
        PUSH_BACK_8(S, (I) + 0) PUSH_BACK_8(S, (I) + 8) \
        PUSH_BACK_8(S, (I) + 16) PUSH_BACK_8(S, (I) + 24)

#define PUSH_BACK_128(S, I) \
    PUSH_BACK_32(S, (I) + 0) PUSH_BACK_32(S, (I) + 32) \
    PUSH_BACK_32(S, (I) + 64) PUSH_BACK_32(S, (I) + 96)

// Macro to create char_sequence from c-string (limited to 128 chars)
#define MAKE_CHAR_SEQUENCE(S) \
    strip_sequence<char_sequence<> \
    PUSH_BACK_128(S, 0) \
    >::type::template push_back<check_size<128>(S) ? '\0' : '\0'>

// Macro to return an aligned c-string
#define MAKE_ALIGNED_STRING(ALIGNMENT, S) \
    aligned_string<ALIGNMENT, MAKE_CHAR_SEQUENCE(S)>::str

And so you have:

static const CommandStruct commands[] =
{
    { MAKE_ALIGNED_STRING(16, "Some literal"), 28 },
    { MAKE_ALIGNED_STRING(16, "Some other literal"), 29 },
    { MAKE_ALIGNED_STRING(16, "Yet another literal"), 8 },
};
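
To see why this works, it may help to look at the core mechanism without the macros. The reduced sketch below is not part of the code above: aligned_chars, hi and ok are hypothetical names, and the characters are spelled out by hand, which is exactly the step that MAKE_CHAR_SEQUENCE automates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Each distinct <Alignment, chars...> combination is a distinct type and
// therefore owns its own static array with the requested alignment.
template <std::size_t Alignment, char... Cs>
struct aligned_chars
{
    alignas(Alignment) static constexpr char str[sizeof...(Cs)] = {Cs...};
};

// Out-of-class definition, needed before C++17 because str is odr-used.
template <std::size_t Alignment, char... Cs>
alignas(Alignment) constexpr
char aligned_chars<Alignment, Cs...>::str[sizeof...(Cs)];

// Hand-written character lists standing in for the macro-generated ones.
static const char* const hi = aligned_chars<16, 'H', 'i', '\0'>::str;
static const char* const ok = aligned_chars<16, 'O', 'K', '\0'>::str;
```

Since a string literal cannot be passed as a template argument in C++11, the macros index into the literal character by character with at<I>(S) and let strip_sequence drop the padding '\0' characters.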
Soursop answered 27/2, 2014 at 11:55 Comment(4)
Does it work in clang for you? It returns: warning: array index 125 is past the end of the array (which contains 6 elements) [-Warray-bounds] for me. - Boxhaul
@RushPL: the static_assert for the address is not supported, but otherwise it works (coliru.stacked-crooked.com/a/5dc1419e638a9776). - Soursop
It turns out that it was triggered by a different error and was a false positive. Thanks for the prompt comment. - Boxhaul
Since I consider your answer here a work of genius, would you have any idea about my question? https://mcmap.net/q/24019/-c-11-compile-time-format-string-literal-construction-for-invoking-printf/403571 - Boxhaul

Having browsed through Boost a bit, I've managed to cook up something that simply automates the construction of the separate literals (it also generates an enum of all the array's elements):

#include <boost/preprocessor.hpp>

#define CMD_TUPLE ( \
    (cmdCommandOne, "The first command", 1900),\
    (cmdCommandTwo, "The second one",    1),\
    (cmdAnother,    "Another command",   11))

#define CMD_SEQ BOOST_PP_TUPLE_TO_SEQ(CMD_TUPLE)

#define CMD_MAKE_ENUM(r, data, elem) BOOST_PP_TUPLE_ELEM(0, elem),
enum Commands { BOOST_PP_SEQ_FOR_EACH(CMD_MAKE_ENUM, , CMD_SEQ) cmdLast };

#define CMD_MAKE_STRING(r, data, elem) \
    __declspec(align(16)) static const char \
    BOOST_PP_CAT(cmd_, BOOST_PP_TUPLE_ELEM(0, elem))[] = BOOST_PP_TUPLE_ELEM(1, elem);
BOOST_PP_SEQ_FOR_EACH(CMD_MAKE_STRING, , CMD_SEQ)

#define CMD_MAKE_ARRAY(r, data, elem) \
    { BOOST_PP_CAT(cmd_, BOOST_PP_TUPLE_ELEM(0, elem)), \
    BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_REST_N(2, BOOST_PP_TUPLE_TO_SEQ(elem))) },
static const CommandStruct commands[] = {
    BOOST_PP_SEQ_FOR_EACH(CMD_MAKE_ARRAY, , CMD_SEQ)
};
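
If pulling in Boost.Preprocessor is not an option, the same "one list, three expansions" idea can probably be expressed with a plain X-macro. Note that this swaps in a different technique and is only a sketch: CMD_LIST and check_generated_commands are hypothetical names, alignas replaces __declspec(align(16)), and CommandStruct is assumed to have just the two members shown in the question (with data as const char*).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct CommandStruct
{
    const char* data;
    unsigned ans_size;
};

// Single source of truth: X(enumerator, literal, answer size).
#define CMD_LIST(X) \
    X(cmdCommandOne, "The first command", 1900) \
    X(cmdCommandTwo, "The second one",    1)    \
    X(cmdAnother,    "Another command",   11)

// 1) Enumeration of all commands, terminated by cmdLast.
#define CMD_MAKE_ENUM(id, text, size) id,
enum Commands { CMD_LIST(CMD_MAKE_ENUM) cmdLast };

// 2) One 16-byte aligned array per command, named cmd_<enumerator>.
#define CMD_MAKE_STRING(id, text, size) \
    alignas(16) static const char cmd_##id[] = text;
CMD_LIST(CMD_MAKE_STRING)

// 3) The table itself, pointing at the aligned arrays.
#define CMD_MAKE_ARRAY(id, text, size) { cmd_##id, size },
static const CommandStruct commands[] = { CMD_LIST(CMD_MAKE_ARRAY) };

static_assert(sizeof(commands) / sizeof(commands[0]) ==
              static_cast<std::size_t>(cmdLast),
              "one table entry per enumerator");

// Asserts at run time that every generated string is 16-byte aligned.
inline void check_generated_commands()
{
    for (const CommandStruct& c : commands)
        assert(reinterpret_cast<std::uintptr_t>(c.data) % 16 == 0);
}
```

Because the enum and the table come from the same list, commands[cmdAnother] always refers to the matching entry.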
Zeeba answered 25/2, 2014 at 19:50 Comment(0)
