Populate An Array Using Constexpr at Compile-time
Asked Answered
E

4

26

I would like to populate an array of enum using constexpr. The content of the array follows a certain pattern.

I have an enum separating ASCII character set into four categories.

enum Type {
    Alphabet,
    Number,
    Symbol,
    Other,
};

constexpr Type table[128] = /* blah blah */;

I would like to have an array of 128 Type. They can be in a structure. The index of the array will be corresponding to the ASCII characters and the value will be the Type of each character.

So I can query this array to find out which category an ASCII character belongs to. Something like

char c = RandomFunction();
if (table[c] == Alphabet) 
    DoSomething();

I would like to know if this is possible without some lengthy macro hacks.

Currently, I initialize the table by doing the following.

constexpr bool IsAlphabet (char c) {
    return ((c >= 0x41 && c <= 0x5A) ||
            (c >= 0x61 && c <= 0x7A));
}

constexpr bool IsNumber (char c) { /* blah blah */ }

constexpr bool IsSymbol (char c) { /* blah blah */ }

constexpr Type whichCategory (char c) { /* blah blah */ }

constexpr Type table[128] = { INITIALIZE };

where INITIALIZE is the entry point of some very lengthy macro hacks. Something like

#define INITIALIZE INIT(0)
#define INIT(N) INIT_##N
#define INIT_0 whichCategory(0), INIT_1
#define INIT_1 whichCategory(1), INIT_2
//...
#define INIT_127 whichCategory(127)

I would like a way to populate this array or a structure containing the array without the need for this macro hack...

Maybe something like

struct Table {
    Type _[128];
};

constexpr Table table = MagicFunction();

So, the question is how to write this MagicFunction?

Note: I am aware of cctype and likes, this question is more of a Is this possible? rather than Is this the best way to do it?.

Any help would be appreciated.

Thanks,

Empurple answered 9/11, 2012 at 18:38 Comment(13)
You do know that ASCII only ranges [0 .. 127]? And that char's signedness is implementation defined? Your current approach is very dangerous. Oh, and last but not least, the C++ standard doesn't demand ASCII encoding at all. It might aswell be EBCDIC.Charleencharlemagne
The good news is that because arrays can be initialized with pack expansions, what you ask for is indeed feasible. You just need to invoke the function plenty of times :pReconcilable
Most likely not possible because C++ doesn't require ASCII representation for characters. Also, in the strictest sense, ASCII character set comprises of only 128 characters.Eggers
@MatthieuM. can you elaborate a bit? :) thxEmpurple
I know this does not answer your question directly, but what's wrong with using functions from cctype? cplusplus.com/reference/clibrary/cctypeAqueduct
There's already something similar (though not entirely equivalent) in the Standard library: the ctype<char> facet of a locale (and its classic_table). It's not guaranteed to be ASCII (as Xeo pointed out) and is not constexpr.Icaria
@Charleencharlemagne Why is it significant here that C++ does not require any specific encoding? A char must be at least 8 bits wide (§1.7/1) and therefore can contain ASCII. I see no char literals in the OP's question. (I'm not completely sure about the signed/unsigned comparison, see §5/10 bullet 5 subbullet 3/4.)Icaria
@DyP: ((c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A)) from IsAlphabet -- this assumes the decimal ordering that is present in ASCII. The signedness is important since OP passes literals > 127, which may map to negative chars.Charleencharlemagne
The content or the uses of the array here really does not matter much here. For as far as I care, it doesn't even need to be a map between categories and ASCII characters. I just would like to see an efficient implementation to populate an array, which can be used as some sort of state machine, at compile time.Empurple
@Charleencharlemagne If c has been initialized by c = 0x41; // A why shouldn't this work? I only see problems with c = 'A';.Icaria
@Xeo, my questions is more of a is it possible instead of is this the best way to do it. Thanks for your pointing out the range for ASCII though. I always group ASCII and extended ASCII together but I guess not. I was also not aware that C++ does not require ASCII encoding.Empurple
@BeyondSora: I'm sorry, I did not have the time to properly address this question yesterday. Thankfully Xeo did (despite his grumpiness :p). The indices generation combined with pack expansion is a great trick to generate all kind of initialization lists (as you can see here), and thus I was pointing out at the fact that since arrays accept initialization lists, you were golden.Reconcilable
@DyP: With EBCDIC, 0x41 isn't mapped to any symbol at all, and letters have a strange place in the codepage. Just see here.Charleencharlemagne
C
32

Ignoring ALL the issues, indices to the rescue:

template<unsigned... Is> struct seq{};
template<unsigned N, unsigned... Is>
struct gen_seq : gen_seq<N-1, N-1, Is...>{};
template<unsigned... Is>
struct gen_seq<0, Is...> : seq<Is...>{};

template<unsigned... Is>
constexpr Table MagicFunction(seq<Is...>){
  return {{ whichCategory(Is)... }};
}

constexpr Table MagicFunction(){
  return MagicFunction(gen_seq<128>{});
}

Live example.

Charleencharlemagne answered 9/11, 2012 at 20:55 Comment(13)
@Steven: Added a link to the Lounge<C++> wiki entry. It basically builds a list [0 .. 127] and expands that, calling whichCategory(0), whichCategory(1), ..., whichCategory(127) and passes that as initializer arguments for list-initialization to Table._ (notice the double {} for initializing the inner array).Charleencharlemagne
I didn't know you can return something of this form { /*...*/ }, is this something new in C++11 or always there in the standard?Empurple
@BeyondSora: New to C++11, called list-initialization (also less formal and more commonly known as uniform initialization).Charleencharlemagne
Why do you use double curly brackets?Pino
This answer is obsolete in C++14 because C++14 allows you to build the whole array inside a constexpr function.Display
NB: seq and gen_seq are in the standard library since C++14, called std::index_sequence and std::make_index_sequence.Schwerin
@Schwerin - Why would you use C++14 specific features to solve a problem in a particular way when there is a much more compile-time (as in time taken to compile) and easier to understand way to solve the same problem that was also introduced in C++14?Display
@Display I'm afraid I don't quite get your question. I'm just saying that part of the proposed solution can be replaced by using analogues from the standard library, decreasing the amount of code and increasing readability.Schwerin
@Schwerin - Except, if you had a version of C++ new enough to have the features you mention, you wouldn't make a solution that looked like that. Because that version of C++ has other features that make solving this problem easier.Display
@Display Look, the question was asked in 2012, and was given a good 2012-ish answer. If you feel that the answer should be updated/removed/etc, contact the author of the answer. Again, the only thing I am saying is that C++14 has std::make_index_sequence. I am not saying that this answer is still the best solution, that your answer is bad, or anything like that.Schwerin
@Schwerin - I think the answer is great. It answers the question for C++11 and it's a fine answer. Your update to the answer is pointless. This answer doesn't need a tweak to use C++14 stuff in C++14. It needs to be completely re-written for C++14 to do things in a different way. So, if someone is using C++14s std::make_index_sequence to initialize a ::std::array at compile time, they're doing something wrong, C++14 has a much better way to initialize an ::std::array at compile time. But the answer itself is great. Your suggestion for an improvement is not.Display
@Display Thank you for sharing your opinion; you are absolutely free in your desire to consider my comment bad, useless, pointless, whatever. There is a special "flag" feature for comments: you may use that. However, from my perspective the comment is still a valuable one. If someone has already been using this solution, there may be no reason to rewrite it completely just for the sake of using new C++ features. On the other hand, simplifying it slightly and effortlessly may be of interest. The intent of my comment was just providing information. I now digress from further discussion.Schwerin
@Schwerin - I only flag comments that are actively abusive of people. You said: If someone has already been using this solution, there may be no reason to rewrite it completely just for the sake of using new C++ features. and I think that's a very valid point. Thank you.Display
D
17

In C++17 ::std::array has been updated to be more constexpr friendly and you can do the same as in C++14, but without some of the scary looking hacks to get around the lack of constexpr in crucial places. Here is what the code would look like there:

#include <array>

enum Type {
    Alphabet,
    Number,
    Symbol,
    Other,
};

constexpr ::std::array<Type, 128> MagicFunction()
{
   using result_t = ::std::array<Type, 128>;
   result_t result = {Other};
   result[65] = Alphabet;
   //....
   return result;
}

const ::std::array<Type, 128> table = MagicFunction();

Again MagicFunction still needs to obey the rather loose constexpr rules. Mainly, it may not modify any global variables or use new (which implies modifying global state, namely the heap) or other such things.

Display answered 16/10, 2017 at 21:44 Comment(1)
I came here trying to fill a std::array<std::pair<X,Y>> like this with result[65] = std::make_pair( x,y ). This won't work, but result[65].first = x; result[65].second = y; will.Mainstream
S
5

IMHO the best way to do this is simply write a tiny setup program that will generate table for you. And then you can either throw out the setup program, or check it in alongside the generated source code.

The tricky part of this question is just a duplicate of this other one: Is it possible to create and initialize an array of values using template metaprogramming?

The trick is, it's impossible to write anything like

Type table[256] = some_expression();

at file scope, because global arrays can be initialized only with literal (source-level) initializer-lists. You can't initialize a global array with the result of a constexpr function, even if you could somehow get that function to return a std::initializer_list, which you can't because its constructor isn't declared constexpr.

So what you have to do is get the compiler to generate the array for you, by making it a static const data member of a template class. After one or two levels of metaprogramming that I'm too confused to write out, you'll bottom out in a line that looks something like

template <int... Indices>
Type DummyStruct<Indices...>::table[] = { whichCategory(Indices)... };

where Indices is a parameter-pack that looks like 0,1,2,... 254,255. You construct that parameter-pack using a recursive helper template, or maybe just using something out of Boost. And then you can write

constexpr Type (&table)[] = IndexHelperTemplate<256>::table;

...But why would you do all that, when the table is only 256 entries that will never change unless ASCII itself changes? The right way is the simplest way: precompute all 256 entries and write out the table explicitly, with no templates, constexpr, or any other magic.

Steamship answered 9/11, 2012 at 21:12 Comment(3)
IMHO hardcode it is bad because its important to understand the logic on that values if that was complex (in a more generic case). If there is n pregenerated factorial values who ensure my that there is no copypaste mistake nor error on some value?Dynamism
If there are n lines of complicated metaprogramming to compute the values, who ensures that there is no copy-paste mistake or bug in that code? Our goal is always to reduce the possibility of error. For something like a three-line factorial function, maybe you'd minimize the possibility of error by writing code instead of data. For OP's actual problem — a 128-byte classification table — I still feel strongly that writing data instead of code is the best way to minimize the possibility of error. (But note that constexpr programming has gotten MUCH more powerful and natural since 2012.)Steamship
I prefer to see what is behind the data than hardcoded numbers.Dynamism
D
4

The way to do this in C++14 looks like this:

#include <array>

enum Type {
    Alphabet,
    Number,
    Symbol,
    Other,
};

constexpr ::std::array<Type, 128> MagicFunction()
{
   using result_t = ::std::array<Type, 128>;
   result_t result = {Other};
   const result_t &fake_const_result = result;
   const_cast<result_t::reference>(fake_const_result[65]) = Alphabet;
   //....
   return result;
}

const ::std::array<Type, 128> table = MagicFunction();

No clever template hackery required any longer. Though, because C++14 didn't really undergo a thorough enough review of what did and didn't have to be constexpr in the standard library, a horrible hack involving const_cast has to be used.

And, of course, MagicFunction had better not modify any global variables or otherwise violate the constexpr rules. But those rules are pretty liberal nowadays. You can, for example, modify all the local variables you want, though passing them by reference or taking their addresses may not work out so well.

See my other answer for C++17, which allows you to drop some of the ugly-looking hacks.

Display answered 30/6, 2017 at 2:0 Comment(13)
You can even use a for loop inside MagicFunction.Feliciafeliciano
I get an error about result being declared uninitialized, though. std::array<Type, 128> result{}; does work - I guess Type{} is Alphabet.Feliciafeliciano
@Feliciafeliciano oops, I'll fix that.Display
What does the :: at the beginning?Liv
@user1754322 - See this question: #1662412Display
Ummm... This does not compile with C++14. You need a constexpr reference operator[]( size_type pos ); which only exists in C++17: en.cppreference.com/w/cpp/container/array/operator_atPriory
@TimRae - I thought I'd passed the C++14 flag to the compiler when I tested that. :-/ Perhaps the standard library implementation I was using did not properly remove functions when the version was set to C++14. Oh, well. I updated my answer to reflect the need for C++17. One could, of course, do something like *(result.data() + index) = Other;, but that would be horrible.Display
@TimRae - Oh-ho! The .data function isn't constexpr either! sigh Well, I'll update the answer with two versions, a C++14 and a C++17 version.Display
C++17 is official now and supported by gcc 7 and clang 6, so I think you could make it more prominent in your answer with an actual code block.Priory
Or even better just make a second answerPriory
@TimRae - Done. :-)Display
Why do you need all the const stuff? Are you not done after the line result_t result = {Other};? The lines after that also don't seem used, since what is returned is return result;, and not the fake_const_result?Bladdernose
@Bladdernose - If you notice, I have a C++17 answer that does exactly that. The problem is that the C++14 standard neglected to mark the non-const version of operator [] for std::array as constexpr. So, I can't use it inside of a constexpr function.Display

© 2022 - 2024 — McMap. All rights reserved.